Pentagram Crusader Reference
Purpose
This note mines Pentagram's Ultima 8 / Crusader code and bundled docs for evidence that is useful to current Crusader reverse-engineering, especially the USECODE / VM lane.
It complements docs/scummvm-crusader-reference.md. Where Pentagram and ScummVM agree, that usually strengthens provenance, but not always confidence: several of the relevant ScummVM Ultima8 components appear to descend from the same Pentagram-era implementation ideas, so matching behavior between the two should not be treated as fully independent confirmation.
Highest-Value Findings
- Pentagram contains direct Crusader USECODE parser and VM support, not just generic U8 notes. Files: convert/crusader/ConvertUsecodeCrusader.h, usecode/UsecodeFlex.cpp, usecode/Usecode.cpp, usecode/UCMachine.cpp, usecode/remorseintrinsics.h, kernel/GUIApp.cpp.
- Pentagram's older U8 USECODE documentation is still useful as contrast material because it shows which parts of the object/event model stayed stable and which parts changed in Crusader. File: docs/u8usecode.txt.
- Pentagram preserves one practical caution that ScummVM does not show as clearly: its Crusader runtime support is incomplete. Files: FAQ, world/Item.cpp, games/RemorseGame.cpp.
- Pentagram also records a few engine-format deltas that are useful outside USECODE, including Crusader map coordinate scaling, larger map chunks, and a wider Crusader typeflag.dat record. Files: world/Map.cpp, world/CurrentMap.cpp, graphics/TypeFlags.cpp.
Direct Pentagram Crusader Evidence
USECODE class layout and event lookup
usecode/UsecodeFlex.cpp matches the broad Crusader model already noted from ScummVM:
- class body object = classid + 2
- class names come from object 1 at name_object + 4 + 13 * classid
- Crusader class base offset is read from bytes 8..11 of the class object and decremented by 1
- Crusader event count is computed as (get_class_base_offset(classid) + 19) / 6
usecode/Usecode.cpp then resolves Crusader event offsets from class data at 20 + 6 * eventid, using bytes +2..+5 of each 6-byte row as the code offset.
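The arithmetic above can be restated as a minimal Python sketch. It is not Pentagram's code: the function names are hypothetical, and little-endian byte order for the 32-bit fields is an assumption that the note does not state. It mirrors the shared Pentagram/ScummVM rule only, so it inherits the same caveat about the event-count formula.

```python
import struct

EVENT_TABLE_START = 20  # first 6-byte event row, per the Usecode.cpp reading above
EVENT_ROW_SIZE = 6


def class_object_index(classid: int) -> int:
    """Flex object index holding the class body."""
    return classid + 2


def class_name_offset(classid: int) -> int:
    """Byte offset of the class name relative to the start of object 1."""
    return 4 + 13 * classid


def class_base_offset(class_obj: bytes) -> int:
    """Bytes 8..11 of the class object, decremented by 1."""
    # Assumption: the field is an unsigned little-endian 32-bit value.
    # Python ints do not wrap, so a raw value of 0 yields -1 here.
    (raw,) = struct.unpack_from("<I", class_obj, 8)
    return raw - 1


def event_count(class_obj: bytes) -> int:
    """Shared engine rule: (base + 19) / 6 with integer division."""
    return (class_base_offset(class_obj) + 19) // 6


def event_code_offset(class_obj: bytes, eventid: int) -> int:
    """Code offset stored in bytes +2..+5 of the 6-byte event row."""
    row = EVENT_TABLE_START + EVENT_ROW_SIZE * eventid
    (offset,) = struct.unpack_from("<I", class_obj, row + 2)
    return offset
```

Run against a sampled raw class record, a helper like this makes it easy to compare the shared rule's result with locally validated arithmetic.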
Implication for current RE:
- Pentagram preserves, in its own codebase, the same classid + 2 and 6-byte event-row reading used in the ScummVM note.
- The shared (base + 19) / 6 event-count rule should still be treated carefully in current owner-loaded/raw EUSECODE work, because local binary validation already showed that this shared Pentagram/ScummVM rule is not a clean fit for sampled raw class records.
- In other words, Pentagram is strong provenance for the implementation lineage, but not a reason to override validated binary-side arithmetic.
Crusader event-name table
convert/crusader/ConvertUsecodeCrusader.h provides a named Crusader event table for 0x00..0x1f:
- clear names: look, use, anim, setActivity, cachein, hit, gotHit, hatch, schedule, release, combine, calledFromAnim, enterFastArea, leaveFastArea, justMoved, AvatarStoleSomething, animGetHit
- weak placeholders remain for 0x0a, 0x0b, 0x0d, 0x11, and 0x15..0x1f
This is slightly rougher than the current ScummVM note in naming quality, but it is still useful because it shows which ordinals were already considered understood in the older Pentagram work and which ones remained unresolved.
Crusader call opcode semantics inside the VM
usecode/UCMachine.cpp contains one especially useful comment-backed distinction:
- U8 opcode 0x11 calls a function at an explicit class/code offset
- Crusader opcode 0x11 calls function number yy yy of class xx xx, then translates that number through get_class_event()
That matters for current USECODE analysis because it reinforces the reading that Crusader bytecode is event-ordinal-driven in places where U8 was direct-offset-driven.
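The following Python sketch illustrates the contrast. It is an illustration, not the UCMachine.cpp implementation: decode_call and resolve_call_target are hypothetical names, the operand layout (class word first, then the function or offset word, both little-endian 16-bit) is an assumption read off the xx xx / yy yy notation, and the event lookup is passed in as a callback standing in for get_class_event().

```python
import struct
from typing import Callable, Tuple

CALL_OPCODE = 0x11


def decode_call(code: bytes, pc: int) -> Tuple[int, int]:
    """Read the two 16-bit operands that follow opcode 0x11."""
    if code[pc] != CALL_OPCODE:
        raise ValueError("not a call opcode")
    # Assumption: operands are little-endian, class id first.
    classid, operand = struct.unpack_from("<HH", code, pc + 1)
    return classid, operand


def resolve_call_target(
    game: str,
    classid: int,
    operand: int,
    get_class_event: Callable[[int, int], int],
) -> Tuple[int, int]:
    """Return (classid, code offset) for a decoded call."""
    if game == "u8":
        # U8: the operand already is the code offset.
        return classid, operand
    if game == "crusader":
        # Crusader: the operand is an event/function number that
        # must be translated through the class event table.
        return classid, get_class_event(classid, operand)
    raise ValueError(f"unknown game: {game}")
```

The practical consequence for a disassembler is that Crusader call targets cannot be resolved without the target class's event table loaded.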
Remorse intrinsic runtime table exists, but it is partial and sparse
kernel/GUIApp.cpp creates UCMachine(RemorseIntrinsics, 308) for Remorse, and usecode/remorseintrinsics.h holds that live runtime table.
What is useful:
- it confirms a real Remorse-specific runtime intrinsic table with at least 308 entries
- some entries are already mapped to concrete engine hooks such as frame/shape/status/quality accessors, item creation, movement helpers, egg helpers, and timer-tick access
What is not useful enough yet:
- the table is far sparser and rougher than ScummVM's later Remorse/Regret intrinsic descriptions
- many entries are still 0 or placeholder comments
Practical use:
- treat Pentagram intrinsics as secondary hints or provenance for older naming work
- prefer ScummVM for higher-coverage intrinsic labeling
- prefer raw binary behavior over either table for actual renames
Version-sensitive global evidence
Pentagram's scratch notes add one useful wrinkle to the global-slot story:
- docs/scratch/globals/remorse1.01.txt starts with global_address 003D
- docs/scratch/globals/regret1.01.txt starts with global_address 001E
Cross-reference with ScummVM:
- the existing ScummVM note records Remorse global 0x003c and Regret global 0x001e
Safest read:
- Regret lines up cleanly at 0x001e
- Remorse appears version-sensitive or notation-sensitive between Pentagram artifacts and later ScummVM code (0x003d in the Pentagram scratch output for Remorse 1.01 versus 0x003c in the ScummVM runtime initialization path)
Implication for RE:
- keep Remorse global-slot claims version-tagged when possible
- do not collapse 0x003c and 0x003d into one unqualified global statement without checking game/version context
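One way to keep these claims version-tagged is to record each with its game, version, and source, and to flag games whose sources disagree. The record type and helper below are a hypothetical sketch; only the addresses and source names come from the evidence above, and "unspecified" marks a version this note does not state.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Set


@dataclass(frozen=True)
class GlobalSlotClaim:
    game: str
    version: str  # "unspecified" when the source gives no version
    source: str
    address: int


CLAIMS = [
    GlobalSlotClaim("remorse", "1.01",
                    "pentagram docs/scratch/globals/remorse1.01.txt", 0x003D),
    GlobalSlotClaim("regret", "1.01",
                    "pentagram docs/scratch/globals/regret1.01.txt", 0x001E),
    GlobalSlotClaim("remorse", "unspecified",
                    "scummvm note (runtime initialization path)", 0x003C),
    GlobalSlotClaim("regret", "unspecified", "scummvm note", 0x001E),
]


def games_with_conflicts(claims: Iterable[GlobalSlotClaim]) -> List[str]:
    """Games whose recorded sources disagree on the global address."""
    by_game: Dict[str, Set[int]] = {}
    for claim in claims:
        by_game.setdefault(claim.game, set()).add(claim.address)
    return sorted(g for g, addrs in by_game.items() if len(addrs) > 1)
```

With the four claims above, only Remorse is flagged, which matches the reading that Regret lines up and Remorse needs version context.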
U8-Specific Documentation That Still Helps
docs/u8usecode.txt
This file is U8-specific, not direct Crusader evidence, but it is still useful in three ways.
First, it documents the older U8 class/object indexing model:
- object 0 = global flag names
- object 1 = usecode function names
- object 2 + shape = shape-linked usecode body
- object 1026 + npc = NPC-linked usecode body
Second, it records the classic U8 per-class layout:
- 12-byte header prefix
- 32 event pointers
- code body after that table
Third, it preserves an older event-meaning list for ordinals 0x00..0x1f.
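For contrast with the Crusader layout, the U8 model described here reduces to a few constants. This is a sketch with hypothetical helper names; the 4-byte pointer width follows from the fixed 32 x 4-byte table this note attributes to U8, and treating the code body as starting immediately after the table is an assumption.

```python
U8_HEADER_SIZE = 12     # header prefix before the event table
U8_EVENT_COUNT = 32     # fixed number of event pointers
U8_EVENT_PTR_SIZE = 4   # bytes per event pointer


def u8_shape_object(shape: int) -> int:
    """Flex object index of the usecode body linked to a shape."""
    return 2 + shape


def u8_npc_object(npc: int) -> int:
    """Flex object index of the usecode body linked to an NPC."""
    return 1026 + npc


def u8_event_pointer_offset(eventid: int) -> int:
    """Byte offset of an event pointer inside a U8 class object."""
    if not 0 <= eventid < U8_EVENT_COUNT:
        raise ValueError("U8 event ordinals are 0x00..0x1f")
    return U8_HEADER_SIZE + U8_EVENT_PTR_SIZE * eventid


def u8_code_body_offset() -> int:
    """Offset just past the event table (assumed start of the code body)."""
    return U8_HEADER_SIZE + U8_EVENT_COUNT * U8_EVENT_PTR_SIZE
```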
Why it still matters for Crusader:
- many semantic event labels survive into the Crusader table: look, use, anim, cachein, hit, gotHit, hatch, schedule, release, combine, enterFastArea, leaveFastArea, AvatarStoleSomething
- the document makes the Crusader deltas clearer: Crusader moved away from a fixed 32 x 4-byte event-pointer table and instead uses a 6-byte-per-event structure with event-number lookup in the VM
Recommended use:
- use u8usecode.txt as a contrast document for inherited VM concepts and event semantics
- do not use it as direct proof of Crusader container layout or opcode contracts
Cross-Reference Against The Existing ScummVM Note
Where Pentagram and ScummVM clearly agree
Both references point to the same core Crusader USECODE model:
- classid + 2 class lookup
- class names in object 1
- bytes 8..11 as the class header field used for Crusader code/event addressing
- 6-byte Crusader event rows
- named event ordinals 0x00..0x1f
- a Crusader-specific VM/global path rather than a straight U8 reuse
This agreement is useful because it shows the model is not a one-off local interpretation.
Where Pentagram adds something materially useful
Pentagram contributes a few things the ScummVM note did not emphasize as strongly:
- older U8 documentation that makes Crusader structural deltas easier to isolate
- explicit confirmation in UCMachine.cpp that Crusader opcode 0x11 is event-number dispatch, not raw offset dispatch
- scratch global dumps that expose version-sensitive Remorse versus Regret behavior
- explicit incompleteness warnings in the project itself, which help calibrate how much authority to assign to runtime behavior
Where Pentagram should not increase confidence much
For the current header/count dispute in owner-loaded/raw EUSECODE parsing, Pentagram and ScummVM agreeing with each other does not settle the question.
Reason:
- the relevant Pentagram and ScummVM Crusader USECODE code paths are very close in structure
- that makes them best treated as one implementation lineage, not two independent external confirmations
Current rule for RE remains:
- use Pentagram/ScummVM to anchor object indexing, row size, event labels, and VM intent
- keep the local binary-validated class-header arithmetic as the authority when the shared engine code disagrees with sampled Crusader records
Non-USECODE Engine Findings Worth Keeping
These are lower priority than the USECODE sections, but still useful for future binary-side work.
Map loading
world/Map.cpp shows that Crusader on-disk map records are still read as 16-byte records, but Pentagram doubles x and y after loading when GAME_IS_CRUSADER.
Implication:
- if a raw loader appears to scale map coordinates or if current external-map tooling sees a factor-of-two mismatch, Pentagram provides a concrete engine-side reason to test that path
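A minimal sketch of that test path follows. The function name is hypothetical, and the field positions are an assumption: this note only states the 16-byte record size and the post-load doubling, so reading x and y as the first two little-endian 16-bit fields should be checked against world/Map.cpp before relying on it.

```python
import struct
from typing import Tuple

MAP_RECORD_SIZE = 16  # on-disk record size for both U8 and Crusader


def load_item_xy(record: bytes, is_crusader: bool) -> Tuple[int, int]:
    """Return in-engine (x, y) for one on-disk map record."""
    if len(record) != MAP_RECORD_SIZE:
        raise ValueError("expected a 16-byte map record")
    # Assumption: x and y are the first two little-endian 16-bit fields.
    x, y = struct.unpack_from("<HH", record, 0)
    if is_crusader:
        # Crusader coordinates are doubled after loading.
        x *= 2
        y *= 2
    return x, y
```

Comparing the doubled and undoubled results against coordinates reported by external map tooling is a quick way to see which convention that tooling uses.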
Current map chunking
world/CurrentMap.cpp sets mapChunkSize = 1024 for Crusader versus 512 for U8.
Implication:
- this matches the broader cross-project pattern that Crusader is not just U8 data with renamed files; some world/grid assumptions are materially different
Crusader typeflag.dat
graphics/TypeFlags.cpp switches Crusader to 9-byte records instead of U8's 8-byte records, with extended family-bit handling and multiple Crusader-only flag placeholders.
Implication:
- Crusader typeflag.dat should continue to be treated as its own format family
- any local parser or reverse-engineered structure should not inherit the U8 8-byte layout blindly
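A local parser can make the record width an explicit per-game parameter instead of a shared constant. The sketch below uses a hypothetical helper name and covers only record splitting; it does not decode any flag bits, since this note does not describe their layout.

```python
from typing import List, Tuple

U8_RECORD_SIZE = 8
CRUSADER_RECORD_SIZE = 9


def split_typeflag_records(
    data: bytes, is_crusader: bool
) -> Tuple[List[bytes], int]:
    """Split raw typeflag.dat bytes into fixed-size records.

    Returns the records plus the number of trailing bytes that do
    not fill a whole record.
    """
    size = CRUSADER_RECORD_SIZE if is_crusader else U8_RECORD_SIZE
    count = len(data) // size
    records = [data[i * size:(i + 1) * size] for i in range(count)]
    leftover = len(data) - count * size
    return records, leftover
```

A nonzero leftover after splitting is a cheap signal that the wrong record width was chosen for the file being parsed.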
Confidence Limits
Pentagram is valuable, but only in bounded ways.
Direct reasons for caution:
- FAQ says Crusader support was a future goal, not a completed feature
- games/RemorseGame.cpp is clearly incomplete compared with the ScummVM Crusader startup path
- world/Item.cpp explicitly disables all Crusader usecode events except use()
So for current Crusader RE, the best weighting is:
- high confidence: parser/disassembler layout clues, event ordinals, VM intent, container/indexing models, file-format deltas
- medium confidence: sparse Remorse intrinsic names and scratch global artifacts
- low confidence: full runtime behavior, startup semantics, and any absence-based conclusion from Pentagram's Crusader execution path
Most Useful Pentagram Files
- convert/crusader/ConvertUsecodeCrusader.h
- usecode/UsecodeFlex.cpp
- usecode/Usecode.cpp
- usecode/UCMachine.cpp
- docs/u8usecode.txt
- docs/scratch/globals/remorse1.01.txt
- world/Item.cpp
- graphics/TypeFlags.cpp
- world/Map.cpp
- world/CurrentMap.cpp
Practical RE Follow-Ups
- Keep using Pentagram and ScummVM event names as slot-label hints only, especially for 0x0a, 0x0b, 0x11, and the still-placeholder high ordinals.
- When documenting Crusader USECODE VM behavior, cite Pentagram's opcode 0x11 = class/event dispatch distinction alongside the existing ScummVM reference.
- Keep local owner-loaded/raw EUSECODE arithmetic authoritative over the shared Pentagram/ScummVM (base + 19) / 6 rule until a direct main USECODE sample proves otherwise.
- Tag Remorse global-slot references with version context when using Pentagram scratch outputs.
- Reuse Pentagram's map/typeflag deltas when a future binary pass returns to world loaders or shape/type metadata.
- Treat missing behavior in Pentagram's Crusader runtime as non-evidence unless ScummVM or raw binary analysis supports the same absence.