Pentagram Crusader Reference

Purpose

This note mines Pentagram's Ultima 8 / Crusader code and bundled docs for evidence that is useful to current Crusader reverse-engineering, especially the USECODE / VM lane.

It complements docs/scummvm-crusader-reference.md. Agreement between Pentagram and ScummVM usually strengthens provenance, but it does not always raise confidence. Several of the relevant ScummVM Ultima8 components appear to descend from the same Pentagram-era implementation ideas, so matching behavior between the two should not be treated as fully independent confirmation.

Highest-Value Findings

  1. Pentagram contains direct Crusader USECODE parser and VM support, not just generic U8 notes. Files: convert/crusader/ConvertUsecodeCrusader.h, usecode/UsecodeFlex.cpp, usecode/Usecode.cpp, usecode/UCMachine.cpp, usecode/remorseintrinsics.h, kernel/GUIApp.cpp.

  2. Pentagram's older U8 USECODE documentation is still useful as contrast material because it shows which parts of the object/event model stayed stable and which parts changed in Crusader. File: docs/u8usecode.txt.

  3. Pentagram preserves one practical caution that ScummVM does not show as clearly: Pentagram's own Crusader runtime support is incomplete. Files: FAQ, world/Item.cpp, games/RemorseGame.cpp.

  4. Pentagram also records a few engine-format deltas that are useful outside USECODE, including Crusader map coordinate scaling, larger map chunks, and a wider Crusader typeflag.dat record. Files: world/Map.cpp, world/CurrentMap.cpp, graphics/TypeFlags.cpp.

Direct Pentagram Crusader Evidence

USECODE class layout and event lookup

usecode/UsecodeFlex.cpp matches the broad Crusader model already noted from ScummVM:

  • class body object = classid + 2
  • class names come from object 1 at name_object + 4 + 13 * classid
  • Crusader class base offset is read from bytes 8..11 of the class object and decremented by 1
  • Crusader event count is computed as (get_class_base_offset(classid) + 19) / 6

usecode/Usecode.cpp then resolves Crusader event offsets from class data at 20 + 6 * eventid, using bytes +2..+5 of each 6-byte row as the code offset.
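The indexing rules above can be sketched in Python as follows. This is a minimal illustration, not Pentagram's code: the function names are hypothetical, the little-endian 32-bit reads are an assumption, and any bytes fed to it for experimentation would be synthetic rather than taken from a real EUSECODE file.

```python
import struct

EVENT_TABLE_START = 20  # first 6-byte event row, as described for Usecode.cpp
EVENT_ROW_SIZE = 6


def class_object_index(classid: int) -> int:
    """Flex object that holds the class body."""
    return classid + 2


def class_name_offset(classid: int) -> int:
    """Offset of the class name inside the name object (object 1)."""
    return 4 + 13 * classid


def class_base_offset(class_obj: bytes) -> int:
    """Bytes 8..11 of the class object, minus 1.

    ASSUMPTION: the field is an unsigned little-endian 32-bit value.
    """
    (raw,) = struct.unpack_from("<I", class_obj, 8)
    return raw - 1


def event_count(class_obj: bytes) -> int:
    """Shared Pentagram/ScummVM rule: (base + 19) / 6, integer division."""
    return (class_base_offset(class_obj) + 19) // EVENT_ROW_SIZE


def event_code_offset(class_obj: bytes, eventid: int) -> int:
    """Code offset stored in bytes +2..+5 of the event's 6-byte row.

    ASSUMPTION: little-endian, like the header field.
    """
    row = EVENT_TABLE_START + EVENT_ROW_SIZE * eventid
    (offset,) = struct.unpack_from("<I", class_obj, row + 2)
    return offset
```

A sketch like this is mainly useful as a comparison point: run it next to the binary-validated arithmetic on the same class record and note where the two disagree.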

Implication for current RE:

  • Pentagram independently preserves the same classid + 2 and 6-byte event-row reading used in the ScummVM note.
  • The shared (base + 19) / 6 event-count rule still needs care in current owner-loaded/raw EUSECODE work: local binary validation already showed that it is not a clean fit for sampled raw class records.
  • In other words, Pentagram is strong provenance for the implementation lineage, but not a reason to override validated binary-side arithmetic.

Crusader event-name table

convert/crusader/ConvertUsecodeCrusader.h provides a named Crusader event table for 0x00..0x1f:

  • clear names: look, use, anim, setActivity, cachein, hit, gotHit, hatch, schedule, release, combine, calledFromAnim, enterFastArea, leaveFastArea, justMoved, AvatarStoleSomething, animGetHit
  • weak placeholders remain for 0x0a, 0x0b, 0x0d, 0x11, and 0x15..0x1f

This is slightly rougher than the current ScummVM note in naming quality, but it is still useful because it shows which ordinals were already considered understood in the older Pentagram work and which ones remained unresolved.
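One way to carry this table into tooling is a label lookup that keeps the placeholder ordinals visibly separate from the named ones. The sketch below is hypothetical helper code, not Pentagram's table. It rests on one loud assumption: that the clear names listed above appear in ascending ordinal order and fill exactly the ordinals that are not placeholders. Check that assumption against ConvertUsecodeCrusader.h before relying on any individual mapping.

```python
# Ordinals the Pentagram table leaves as weak placeholders.
PLACEHOLDER_ORDINALS = {0x0A, 0x0B, 0x0D, 0x11, *range(0x15, 0x20)}

# Clear names, in the order given in this note.
CLEAR_NAMES = [
    "look", "use", "anim", "setActivity", "cachein", "hit", "gotHit",
    "hatch", "schedule", "release", "combine", "calledFromAnim",
    "enterFastArea", "leaveFastArea", "justMoved",
    "AvatarStoleSomething", "animGetHit",
]

# ASSUMPTION: the names fill the non-placeholder ordinals in ascending order.
_clear_ordinals = [n for n in range(0x20) if n not in PLACEHOLDER_ORDINALS]
assert len(_clear_ordinals) == len(CLEAR_NAMES)
EVENT_LABELS = dict(zip(_clear_ordinals, CLEAR_NAMES))


def event_label(ordinal: int):
    """Return (label, is_named); placeholders get a neutral event_XX label."""
    if ordinal in EVENT_LABELS:
        return EVENT_LABELS[ordinal], True
    return f"event_{ordinal:02x}", False
```

Returning the is_named flag alongside the label lets downstream annotations mark placeholder slots as hints rather than settled names.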

Crusader call opcode semantics inside the VM

usecode/UCMachine.cpp contains one especially useful comment-backed distinction:

  • U8 opcode 0x11 calls a function at an explicit class/code offset
  • Crusader opcode 0x11 calls function number yy yy of class xx xx, then translates that number through get_class_event()

That matters for current USECODE analysis because it reinforces the reading that Crusader bytecode is event-ordinal-driven in places where U8 was direct-offset-driven.
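The difference between the two dispatch styles can be sketched as below. This is an illustration of the distinction only: the function names are hypothetical, the operand order (class first, then function number or offset) and the little-endian 16-bit widths are assumptions, and get_class_event is passed in as a callable rather than reproduced from Pentagram.

```python
import struct
from typing import Callable, Tuple


def decode_call_operands(operands: bytes) -> Tuple[int, int]:
    """Split the operand bytes 'xx xx yy yy' into (classid, value).

    ASSUMPTION: two little-endian 16-bit fields, class first.
    """
    classid, value = struct.unpack("<HH", operands[:4])
    return classid, value


def resolve_call_target(
    is_crusader: bool,
    classid: int,
    value: int,
    get_class_event: Callable[[int, int], int],
) -> Tuple[int, int]:
    """Return (classid, code_offset) for a call opcode 0x11.

    U8: the operand already is a code offset.
    Crusader: the operand is an event/function number that must be
    translated to a code offset through the class event table.
    """
    if is_crusader:
        return classid, get_class_event(classid, value)
    return classid, value
```

For a disassembler, the practical consequence is that a Crusader call target cannot be labeled from the operand alone; the class event table has to be consulted first.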

Remorse intrinsic runtime table exists, but it is partial and sparse

kernel/GUIApp.cpp creates UCMachine(RemorseIntrinsics, 308) for Remorse, and usecode/remorseintrinsics.h holds that live runtime table.

What is useful:

  • it confirms a real Remorse-specific runtime intrinsic table, passed to UCMachine with a count of 308 entries
  • some entries are already mapped to concrete engine hooks such as frame/shape/status/quality accessors, item creation, movement helpers, egg helpers, and timer-tick access

What is not useful enough yet:

  • the table is far sparser and rougher than ScummVM's later Remorse/Regret intrinsic descriptions
  • many entries are still 0 or placeholder comments

Practical use:

  • treat Pentagram intrinsics as secondary hints or provenance for older naming work
  • prefer ScummVM for higher-coverage intrinsic labeling
  • prefer raw binary behavior over either table for actual renames

Version-sensitive global evidence

Pentagram's scratch notes add one useful wrinkle to the global-slot story:

  • docs/scratch/globals/remorse1.01.txt starts with global_address 003D
  • docs/scratch/globals/regret1.01.txt starts with global_address 001E

Cross-reference with ScummVM:

  • the existing ScummVM note records Remorse global 0x003c and Regret global 0x001e

Safest read:

  • Regret lines up cleanly at 0x001e
  • Remorse differs by one between sources: 0x003d in the Pentagram scratch output for Remorse 1.01 versus 0x003c in the ScummVM runtime initialization path. The gap may be version-sensitive or notation-sensitive; neither source settles which.

Implication for RE:

  • keep Remorse global-slot claims version-tagged when possible
  • do not collapse 0x003c and 0x003d into one unqualified global statement without checking game/version context
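A small provenance record makes the version tagging concrete. The sketch below is a hypothetical note-keeping structure, not part of either project. The addresses are the ones quoted in this section; the "unspecified" version tag on the ScummVM entries is an assumption that reflects only that this note does not state a version for them.

```python
from typing import List, NamedTuple


class GlobalSlotClaim(NamedTuple):
    game: str
    version: str   # "unspecified" when the source does not state one
    source: str
    address: int


CLAIMS = [
    GlobalSlotClaim("remorse", "1.01",
                    "pentagram docs/scratch/globals/remorse1.01.txt", 0x003D),
    GlobalSlotClaim("regret", "1.01",
                    "pentagram docs/scratch/globals/regret1.01.txt", 0x001E),
    GlobalSlotClaim("remorse", "unspecified", "scummvm note", 0x003C),
    GlobalSlotClaim("regret", "unspecified", "scummvm note", 0x001E),
]


def conflicting_games(claims: List[GlobalSlotClaim]) -> List[str]:
    """Games for which the recorded sources disagree on the address."""
    by_game = {}
    for claim in claims:
        by_game.setdefault(claim.game, set()).add(claim.address)
    return sorted(g for g, addrs in by_game.items() if len(addrs) > 1)
```

With the four claims above, only Remorse is reported as conflicting, which matches the reading that Regret lines up and Remorse needs version context.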

U8-Specific Documentation That Still Helps

docs/u8usecode.txt

This file is U8-specific, not direct Crusader evidence, but it is still useful in three ways.

First, it documents the older U8 class/object indexing model:

  • object 0 = global flag names
  • object 1 = usecode function names
  • object 2 + shape = shape-linked usecode body
  • object 1026 + npc = NPC-linked usecode body

Second, it records the classic U8 per-class layout:

  • 12-byte header prefix
  • 32 event pointers
  • code body after that table

Third, it preserves an older event-meaning list for ordinals 0x00..0x1f.

Why it still matters for Crusader:

  • many semantic event labels survive into the Crusader table: look, use, anim, cachein, hit, gotHit, hatch, schedule, release, combine, enterFastArea, leaveFastArea, AvatarStoleSomething
  • the document makes the Crusader deltas clearer: Crusader moved away from a fixed 32 x 4-byte event-pointer table and instead uses a 6-byte-per-event structure with event-number lookup in the VM
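The structural delta is easy to see as offset arithmetic. The sketch below uses only the sizes stated in this note (12-byte U8 header, 32 pointers of 4 bytes, Crusader rows of 6 bytes starting at offset 20). The function names are hypothetical, and the assumption that the U8 pointer table starts immediately after the 12-byte header follows the layout described above rather than a checked file.

```python
U8_HEADER_SIZE = 12
U8_EVENT_COUNT = 32
U8_POINTER_SIZE = 4

CRUSADER_TABLE_START = 20
CRUSADER_ROW_SIZE = 6


def u8_event_pointer_offset(eventid: int) -> int:
    """Offset of the fixed 4-byte event pointer in a U8 class object."""
    if not 0 <= eventid < U8_EVENT_COUNT:
        raise ValueError("U8 has a fixed table of 32 event pointers")
    return U8_HEADER_SIZE + U8_POINTER_SIZE * eventid


def u8_code_body_start() -> int:
    """First byte after the U8 header and pointer table: 12 + 32 * 4."""
    return U8_HEADER_SIZE + U8_EVENT_COUNT * U8_POINTER_SIZE


def crusader_event_row_offset(eventid: int) -> int:
    """Offset of the 6-byte event row in a Crusader class object."""
    return CRUSADER_TABLE_START + CRUSADER_ROW_SIZE * eventid
```

In U8 the code body always starts at a fixed offset of 140 bytes, while in Crusader the end of the event table depends on the per-class event count.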

Recommended use:

  • use u8usecode.txt as a contrast document for inherited VM concepts and event semantics
  • do not use it as direct proof of Crusader container layout or opcode contracts

Cross-Reference Against The Existing ScummVM Note

Where Pentagram and ScummVM clearly agree

Both references point to the same core Crusader USECODE model:

  • classid + 2 class lookup
  • class names in object 1
  • bytes 8..11 as the class header field used for Crusader code/event addressing
  • 6-byte Crusader event rows
  • named event ordinals 0x00..0x1f
  • a Crusader-specific VM/global path rather than a straight U8 reuse

This agreement is useful because it shows the model is not a one-off local interpretation.

Where Pentagram adds something materially useful

Pentagram contributes a few things the ScummVM note did not emphasize as strongly:

  • older U8 documentation that makes Crusader structural deltas easier to isolate
  • explicit confirmation in UCMachine.cpp that Crusader opcode 0x11 is event-number dispatch, not raw offset dispatch
  • scratch global dumps that expose version-sensitive Remorse versus Regret behavior
  • explicit incompleteness warnings in the project itself, which help calibrate how much authority to assign to runtime behavior

Where Pentagram should not increase confidence much

For the current header/count dispute in owner-loaded/raw EUSECODE parsing, Pentagram and ScummVM agreeing with each other does not settle the question.

Reason:

  • the relevant Pentagram and ScummVM Crusader USECODE code paths are very close in structure
  • that makes them best treated as one implementation lineage, not two independent external confirmations

Current rule for RE remains:

  • use Pentagram/ScummVM to anchor object indexing, row size, event labels, and VM intent
  • keep the local binary-validated class-header arithmetic as the authority when the shared engine code disagrees with sampled Crusader records

Non-USECODE Engine Findings Worth Keeping

These are lower priority than the USECODE sections, but still useful for future binary-side work.

Map loading

world/Map.cpp shows that Crusader on-disk map records are still read as 16-byte records, but Pentagram doubles x and y after loading when GAME_IS_CRUSADER.

Implication:

  • if a raw loader appears to scale map coordinates or if current external-map tooling sees a factor-of-two mismatch, Pentagram provides a concrete engine-side reason to test that path

Current map chunking

world/CurrentMap.cpp sets mapChunkSize = 1024 for Crusader versus 512 for U8.

Implication:

  • this matches the broader cross-project pattern that Crusader is not just U8 data with renamed files; some world/grid assumptions are materially different
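Both map deltas (coordinate doubling on load and the larger chunk size) can be captured in a few lines when testing a factor-of-two hypothesis. This is a hedged sketch with hypothetical function names. Computing a chunk index by integer division of a coordinate by the chunk size is an assumption made for illustration, not something this note confirms from CurrentMap.cpp.

```python
from typing import Tuple


def loaded_map_xy(x: int, y: int, is_crusader: bool) -> Tuple[int, int]:
    """Apply the post-load doubling described for world/Map.cpp."""
    if is_crusader:
        return x * 2, y * 2
    return x, y


def map_chunk_size(is_crusader: bool) -> int:
    """Chunk size described for world/CurrentMap.cpp."""
    return 1024 if is_crusader else 512


def chunk_index(x: int, y: int, is_crusader: bool) -> Tuple[int, int]:
    """ASSUMPTION: chunk index = world coordinate // chunk size."""
    size = map_chunk_size(is_crusader)
    return x // size, y // size
```

When external map tooling and a raw loader disagree by a factor of two, feeding the same on-disk coordinates through loaded_map_xy with both settings is a quick way to see which side already applied the doubling.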

Crusader typeflag.dat

graphics/TypeFlags.cpp switches Crusader to 9-byte records instead of U8's 8-byte records, with extended family-bit handling and multiple Crusader-only flag placeholders.

Implication:

  • Crusader typeflag.dat should continue to be treated as its own format family
  • any local parser or reverse-engineered structure should not inherit the U8 8-byte layout blindly
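A local parser can keep the record width game-specific from the first step. The sketch below only splits the file into fixed-size records and does not interpret any fields, since the field layout is not described here. The function names are hypothetical, and treating a length that is not a multiple of the record size as an error is an assumption intended as a sanity check.

```python
from typing import List


def typeflag_record_size(is_crusader: bool) -> int:
    """9-byte records for Crusader, 8-byte records for U8."""
    return 9 if is_crusader else 8


def split_typeflag_records(data: bytes, is_crusader: bool) -> List[bytes]:
    """Split raw typeflag.dat bytes into fixed-size records.

    ASSUMPTION: the file is a plain array of records with no header or
    trailer, so a leftover tail suggests the wrong layout was chosen.
    """
    size = typeflag_record_size(is_crusader)
    if len(data) % size != 0:
        raise ValueError(
            f"length {len(data)} is not a multiple of {size}-byte records"
        )
    return [data[i:i + size] for i in range(0, len(data), size)]
```

Running the splitter with the U8 width on Crusader data is a cheap check: if the length only divides evenly by 9, the U8 layout is already ruled out before any field decoding.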

Confidence Limits

Pentagram is valuable, but only in bounded ways.

Direct reasons for caution:

  • FAQ says Crusader support was a future goal, not a completed feature
  • games/RemorseGame.cpp is clearly incomplete compared with the ScummVM Crusader startup path
  • world/Item.cpp explicitly disables all Crusader usecode events except use()

So for current Crusader RE, the best weighting is:

  • high confidence: parser/disassembler layout clues, event ordinals, VM intent, container/indexing models, file-format deltas
  • medium confidence: sparse Remorse intrinsic names and scratch global artifacts
  • low confidence: full runtime behavior, startup semantics, and any absence-based conclusion from Pentagram's Crusader execution path

Most Useful Pentagram Files

  • convert/crusader/ConvertUsecodeCrusader.h
  • usecode/UsecodeFlex.cpp
  • usecode/Usecode.cpp
  • usecode/UCMachine.cpp
  • docs/u8usecode.txt
  • docs/scratch/globals/remorse1.01.txt
  • world/Item.cpp
  • graphics/TypeFlags.cpp
  • world/Map.cpp
  • world/CurrentMap.cpp

Practical RE Follow-Ups

  1. Keep using Pentagram and ScummVM event names as slot-label hints only, especially for 0x0a, 0x0b, 0x0d, 0x11, and the still-placeholder high ordinals 0x15..0x1f.
  2. When documenting Crusader USECODE VM behavior, cite Pentagram's opcode 0x11 = class/event dispatch distinction alongside the existing ScummVM reference.
  3. Keep local owner-loaded/raw EUSECODE arithmetic authoritative over the shared Pentagram/ScummVM (base + 19) / 6 rule until a direct main USECODE sample proves otherwise.
  4. Tag Remorse global-slot references with version context when using Pentagram scratch outputs.
  5. Reuse Pentagram's map/typeflag deltas when a future binary pass returns to world loaders or shape/type metadata.
  6. Treat missing behavior in Pentagram's Crusader runtime as non-evidence unless ScummVM or raw binary analysis supports the same absence.