# Pentagram Crusader Reference
## Purpose
This note mines Pentagram's Ultima 8 / Crusader code and bundled docs for evidence that is useful to current Crusader reverse-engineering, especially the USECODE / VM lane.
It complements [docs/scummvm-crusader-reference.md](docs/scummvm-crusader-reference.md). Where Pentagram and ScummVM agree, that usually strengthens provenance, but not always confidence: several of the relevant ScummVM Ultima8 components appear to descend from the same Pentagram-era implementation ideas, so matching behavior between the two should not be treated as fully independent confirmation.
## Highest-Value Findings
1. Pentagram contains direct Crusader USECODE parser and VM support, not just generic U8 notes.
Files: `convert/crusader/ConvertUsecodeCrusader.h`, `usecode/UsecodeFlex.cpp`, `usecode/Usecode.cpp`, `usecode/UCMachine.cpp`, `usecode/remorseintrinsics.h`, `kernel/GUIApp.cpp`.
2. Pentagram's older U8 USECODE documentation is still useful as contrast material because it shows which parts of the object/event model stayed stable and which parts changed in Crusader.
File: `docs/u8usecode.txt`.
3. Pentagram preserves one practical caution that ScummVM does not show as clearly: Pentagram's own Crusader runtime support is incomplete.
Files: `FAQ`, `world/Item.cpp`, `games/RemorseGame.cpp`.
4. Pentagram also records a few engine-format deltas that are useful outside USECODE, including Crusader map coordinate scaling, larger map chunks, and a wider Crusader `typeflag.dat` record.
Files: `world/Map.cpp`, `world/CurrentMap.cpp`, `graphics/TypeFlags.cpp`.
## Direct Pentagram Crusader Evidence
### USECODE class layout and event lookup
`usecode/UsecodeFlex.cpp` matches the broad Crusader model already noted from ScummVM:
- class body object = `classid + 2`
- class names come from object `1` at `name_object + 4 + 13 * classid`
- Crusader class base offset is read from bytes `8..11` of the class object and decremented by `1`
- Crusader event count is computed as `(get_class_base_offset(classid) + 19) / 6`
`usecode/Usecode.cpp` then resolves Crusader event offsets from class data at `20 + 6 * eventid`, using bytes `+2..+5` of each 6-byte row as the code offset.
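The lookup arithmetic above can be sketched as follows. This is a minimal illustration, not Pentagram's code: every function name is hypothetical, little-endian byte order and NUL-terminated 13-byte name records are assumptions, and `event_count` reproduces the shared Pentagram/ScummVM rule only as stated here, not as binary-validated arithmetic.

```python
import struct

EVENT_TABLE_START = 20  # offset of the first event row in a class object
EVENT_ROW_SIZE = 6      # bytes per Crusader event row
NAME_STRIDE = 13        # bytes between class-name entries in object 1


def class_object_index(classid: int) -> int:
    """Flex object index that holds the body of a class."""
    return classid + 2


def class_name(name_object: bytes, classid: int) -> str:
    """Read a class name from object 1 (assumed NUL-terminated ASCII)."""
    start = 4 + NAME_STRIDE * classid
    raw = name_object[start:start + NAME_STRIDE]
    return raw.split(b"\0", 1)[0].decode("ascii", errors="replace")


def class_base_offset(class_object: bytes) -> int:
    """Bytes 8..11 of the class object (assumed little-endian), minus 1."""
    return struct.unpack_from("<I", class_object, 8)[0] - 1


def event_count(class_object: bytes) -> int:
    """Shared Pentagram/ScummVM rule; not validated against raw records."""
    return (class_base_offset(class_object) + 19) // 6


def event_code_offset(class_object: bytes, eventid: int) -> int:
    """Code offset stored in bytes +2..+5 of the event's 6-byte row."""
    row = EVENT_TABLE_START + EVENT_ROW_SIZE * eventid
    return struct.unpack_from("<I", class_object, row + 2)[0]
```

Keeping `event_count` as a separate function makes it easy to swap in locally validated arithmetic without touching the row lookup.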
Implication for current RE:
- Pentagram preserves the same `classid + 2` and 6-byte event-row reading used in the ScummVM note.
- Treat the shared `(base + 19) / 6` event-count rule with care in current owner-loaded/raw EUSECODE work: local binary validation already showed that it is not a clean fit for sampled raw class records.
- In other words, Pentagram is strong provenance for the implementation lineage, but not a reason to override validated binary-side arithmetic.
### Crusader event-name table
`convert/crusader/ConvertUsecodeCrusader.h` provides a named Crusader event table for `0x00..0x1f`:
- clear names: `look`, `use`, `anim`, `setActivity`, `cachein`, `hit`, `gotHit`, `hatch`, `schedule`, `release`, `combine`, `calledFromAnim`, `enterFastArea`, `leaveFastArea`, `justMoved`, `AvatarStoleSomething`, `animGetHit`
- weak placeholders remain for `0x0a`, `0x0b`, `0x0d`, `0x11`, and `0x15..0x1f`
This is slightly rougher than the current ScummVM note in naming quality, but it is still useful because it shows which ordinals were already considered understood in the older Pentagram work and which ones remained unresolved.
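For slot labeling, the table can be captured as a small lookup. The sketch below assumes the clear names listed above appear in ascending ordinal order, with the placeholder ordinals filling the gaps; that ordering is an inference from this note, not a verified reading of the header. `CLEAR_EVENT_NAMES` and `event_label` are hypothetical names.

```python
# Ordinal -> name, ASSUMING the clear names above are listed in ordinal order.
CLEAR_EVENT_NAMES = {
    0x00: "look",
    0x01: "use",
    0x02: "anim",
    0x03: "setActivity",
    0x04: "cachein",
    0x05: "hit",
    0x06: "gotHit",
    0x07: "hatch",
    0x08: "schedule",
    0x09: "release",
    0x0C: "combine",
    0x0E: "calledFromAnim",
    0x0F: "enterFastArea",
    0x10: "leaveFastArea",
    0x12: "justMoved",
    0x13: "AvatarStoleSomething",
    0x14: "animGetHit",
}


def event_label(ordinal: int) -> str:
    """Return a slot-label hint; unresolved ordinals get a marked placeholder."""
    if not 0 <= ordinal <= 0x1F:
        raise ValueError(f"event ordinal out of range: {ordinal:#x}")
    return CLEAR_EVENT_NAMES.get(ordinal, f"event_{ordinal:02x}?")
```

The trailing `?` on placeholder labels keeps unresolved slots visibly unresolved in any generated listing.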
### Crusader call opcode semantics inside the VM
`usecode/UCMachine.cpp` contains one especially useful comment-backed distinction:
- U8 opcode `0x11` calls a function at an explicit class/code offset
- Crusader opcode `0x11` calls function number `yy yy` of class `xx xx`, then translates that number through `get_class_event()`
That matters for current USECODE analysis because it reinforces the reading that Crusader bytecode is event-ordinal-driven in places where U8 was direct-offset-driven.
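The difference can be sketched as a single resolution step. This is an illustration of the distinction described above, not the VM's actual code; `resolve_call_target` is a hypothetical name, and `get_class_event` is passed in as a callable standing in for the real lookup.

```python
from typing import Callable


def resolve_call_target(
    is_crusader: bool,
    classid: int,
    operand: int,
    get_class_event: Callable[[int, int], int],
) -> int:
    """Turn the second operand of opcode 0x11 into a code offset.

    U8: the operand already is the code offset within the class.
    Crusader: the operand is a function/event number that must be
    translated through the class's event table first.
    """
    if is_crusader:
        return get_class_event(classid, operand)
    return operand
```

A disassembler that prints Crusader `0x11` operands as raw offsets would therefore mislabel every call target.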
### Remorse intrinsic runtime table exists, but it is partial and sparse
`kernel/GUIApp.cpp` creates `UCMachine(RemorseIntrinsics, 308)` for Remorse, and `usecode/remorseintrinsics.h` holds that live runtime table.
What is useful:
- it confirms a real Remorse-specific runtime intrinsic table with at least `308` entries
- some entries are already mapped to concrete engine hooks such as frame/shape/status/quality accessors, item creation, movement helpers, egg helpers, and timer-tick access
What is not useful enough yet:
- the table is far sparser and rougher than ScummVM's later Remorse/Regret intrinsic descriptions
- many entries are still `0` or placeholder comments
Practical use:
- treat Pentagram intrinsics as secondary hints or provenance for older naming work
- prefer ScummVM for higher-coverage intrinsic labeling
- prefer raw binary behavior over either table for actual renames
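One way to quantify "sparse" when comparing intrinsic tables is to model unmapped slots as `None` and measure coverage. The sketch below is a hypothetical model, not Pentagram's table type: `coverage` and `call_intrinsic` are illustrative names, and the real table stores `0` for unmapped entries.

```python
from typing import Callable, Optional, Sequence

Intrinsic = Optional[Callable[..., int]]


def coverage(table: Sequence[Intrinsic]) -> float:
    """Fraction of table slots that are mapped to a handler."""
    mapped = sum(1 for entry in table if entry is not None)
    return mapped / len(table)


def call_intrinsic(table: Sequence[Intrinsic], index: int, *args) -> Optional[int]:
    """Call a mapped intrinsic; return None for unmapped or out-of-range slots."""
    entry = table[index] if 0 <= index < len(table) else None
    if entry is None:
        return None
    return entry(*args)
```

Returning `None` for an unmapped slot keeps "no handler in this table" distinct from "handler returned 0", which matters when the table is used only as a naming hint.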
### Version-sensitive global evidence
Pentagram's scratch notes add one useful wrinkle to the global-slot story:
- `docs/scratch/globals/remorse1.01.txt` starts with `global_address 003D`
- `docs/scratch/globals/regret1.01.txt` starts with `global_address 001E`
Cross-reference with ScummVM:
- the existing ScummVM note records Remorse global `0x003c` and Regret global `0x001e`
Safest read:
- Regret lines up cleanly at `0x001e`
- Remorse appears version-sensitive or notation-sensitive between Pentagram artifacts and later ScummVM code (`0x003d` in the Pentagram scratch output for Remorse 1.01 versus `0x003c` in the ScummVM runtime initialization path)
Implication for RE:
- keep Remorse global-slot claims version-tagged when possible
- do not collapse `0x003c` and `0x003d` into one unqualified global statement without checking game/version context
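A small record type is enough to keep these claims source-tagged. The sketch below only restates the four values quoted above; `GlobalAddressClaim` and `conflicting_games` are hypothetical names, and the ScummVM entries carry no version string because this note does not give one.

```python
from typing import List, NamedTuple


class GlobalAddressClaim(NamedTuple):
    game: str
    source: str
    value: int


CLAIMS = [
    GlobalAddressClaim("remorse", "pentagram docs/scratch/globals/remorse1.01.txt", 0x003D),
    GlobalAddressClaim("regret", "pentagram docs/scratch/globals/regret1.01.txt", 0x001E),
    GlobalAddressClaim("remorse", "scummvm note", 0x003C),
    GlobalAddressClaim("regret", "scummvm note", 0x001E),
]


def conflicting_games(claims: List[GlobalAddressClaim]) -> List[str]:
    """Games whose sources disagree on the global address."""
    values = {}
    for claim in claims:
        values.setdefault(claim.game, set()).add(claim.value)
    return sorted(game for game, seen in values.items() if len(seen) > 1)
```

With the values above, only Remorse is flagged, which matches the safest read given here.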
## U8-Specific Documentation That Still Helps
### `docs/u8usecode.txt`
This file is U8-specific, not direct Crusader evidence, but it is still useful in three ways.
First, it documents the older U8 class/object indexing model:
- object `0` = global flag names
- object `1` = usecode function names
- object `2 + shape` = shape-linked usecode body
- object `1026 + npc` = NPC-linked usecode body
Second, it records the classic U8 per-class layout:
- 12-byte header prefix
- 32 event pointers
- code body after that table
Third, it preserves an older event-meaning list for ordinals `0x00..0x1f`.
Why it still matters for Crusader:
- many semantic event labels survive into the Crusader table: `look`, `use`, `anim`, `cachein`, `hit`, `gotHit`, `hatch`, `schedule`, `release`, `combine`, `enterFastArea`, `leaveFastArea`, `AvatarStoleSomething`
- the document makes the Crusader deltas clearer: Crusader moved away from a fixed 32 x 4-byte event-pointer table and instead uses a 6-byte-per-event structure with event-number lookup in the VM
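The structural delta can be stated as two addressing rules. The sketch assumes the U8 event-pointer table starts immediately after the 12-byte header, which is how the layout above reads but is not verified here; `event_pointer_position` is a hypothetical helper.

```python
from typing import Tuple

EVENT_SLOTS = 32  # ordinals 0x00..0x1f in both games


def event_pointer_position(is_crusader: bool, eventid: int) -> Tuple[int, int]:
    """Return (byte offset, width) of the code pointer for one event."""
    if not 0 <= eventid < EVENT_SLOTS:
        raise ValueError(f"event id out of range: {eventid}")
    if is_crusader:
        # 6-byte rows starting at offset 20; pointer is bytes +2..+5 of the row.
        return 20 + 6 * eventid + 2, 4
    # U8: 12-byte header, then a flat table of 4-byte pointers (assumed offset 12).
    return 12 + 4 * eventid, 4
```

Under these assumptions the U8 code body starts at `12 + 32 * 4 = 140`, while Crusader has no equally fixed boundary because the row count depends on the class header.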
Recommended use:
- use `u8usecode.txt` as a contrast document for inherited VM concepts and event semantics
- do not use it as direct proof of Crusader container layout or opcode contracts
## Cross-Reference Against The Existing ScummVM Note
### Where Pentagram and ScummVM clearly agree
Both references point to the same core Crusader USECODE model:
- `classid + 2` class lookup
- class names in object `1`
- bytes `8..11` as the class header field used for Crusader code/event addressing
- 6-byte Crusader event rows
- named event ordinals `0x00..0x1f`
- a Crusader-specific VM/global path rather than a straight U8 reuse
This agreement is useful because it shows the model is not a one-off local interpretation.
### Where Pentagram adds something materially useful
Pentagram contributes a few things the ScummVM note did not emphasize as strongly:
- older U8 documentation that makes Crusader structural deltas easier to isolate
- comment-backed confirmation in `UCMachine.cpp` that Crusader opcode `0x11` is event-number dispatch, not raw offset dispatch
- scratch global dumps that expose version-sensitive Remorse versus Regret behavior
- explicit incompleteness warnings in the project itself, which help calibrate how much authority to assign to runtime behavior
### Where Pentagram should not increase confidence much
For the current header/count dispute in owner-loaded/raw EUSECODE parsing, Pentagram and ScummVM agreeing with each other does not settle the question.
Reason:
- the relevant Pentagram and ScummVM Crusader USECODE code paths are very close in structure
- that makes them best treated as one implementation lineage, not two independent external confirmations
Current rule for RE remains:
- use Pentagram/ScummVM to anchor object indexing, row size, event labels, and VM intent
- keep the local binary-validated class-header arithmetic as the authority when the shared engine code disagrees with sampled Crusader records
## Non-USECODE Engine Findings Worth Keeping
These are lower priority than the USECODE sections, but still useful for future binary-side work.
### Map loading
`world/Map.cpp` shows that Crusader on-disk map records are still read as 16-byte records, but Pentagram doubles `x` and `y` after loading when `GAME_IS_CRUSADER`.
Implication:
- if a raw loader appears to scale map coordinates or if current external-map tooling sees a factor-of-two mismatch, Pentagram provides a concrete engine-side reason to test that path
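A quick way to test for that mismatch is to split the raw data into 16-byte records and apply the doubling separately. The sketch deliberately does not decode fields, since the record layout is not covered in this note; `split_map_records` and `engine_xy` are hypothetical names.

```python
from typing import List, Tuple

MAP_RECORD_SIZE = 16  # bytes per on-disk map record


def split_map_records(blob: bytes) -> List[bytes]:
    """Split raw map data into fixed-size records."""
    if len(blob) % MAP_RECORD_SIZE:
        raise ValueError("map data is not a whole number of 16-byte records")
    return [blob[i:i + MAP_RECORD_SIZE] for i in range(0, len(blob), MAP_RECORD_SIZE)]


def engine_xy(x: int, y: int, is_crusader: bool) -> Tuple[int, int]:
    """Convert on-disk x/y to engine coordinates (doubled for Crusader)."""
    return (x * 2, y * 2) if is_crusader else (x, y)
```

If external tooling and the engine disagree by exactly a factor of two on `x` and `y` but not on other fields, this post-load scaling is the first thing to rule in or out.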
### Current map chunking
`world/CurrentMap.cpp` sets `mapChunkSize = 1024` for Crusader versus `512` for U8.
Implication:
- this matches the broader cross-project pattern that Crusader is not just U8 data with renamed files; some world/grid assumptions are materially different
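To show why the chunk size matters, the sketch below maps a world coordinate to a chunk index by integer division. That mapping is an assumption made for illustration and is not quoted from `CurrentMap.cpp`; only the two size constants come from this note.

```python
from typing import Tuple


def map_chunk_size(is_crusader: bool) -> int:
    """Chunk edge length in world units, per the values quoted above."""
    return 1024 if is_crusader else 512


def chunk_index(x: int, y: int, is_crusader: bool) -> Tuple[int, int]:
    """Chunk containing a world coordinate (ASSUMED: plain integer division)."""
    size = map_chunk_size(is_crusader)
    return x // size, y // size
```

The same world position lands in a different chunk depending on the game, so any spatial index ported from U8 tooling needs the constant swapped, not just the file names.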
### Crusader `typeflag.dat`
`graphics/TypeFlags.cpp` switches Crusader to 9-byte records instead of U8's 8-byte records, with extended family-bit handling and multiple Crusader-only flag placeholders.
Implication:
- Crusader `typeflag.dat` should continue to be treated as its own format family
- any local parser or reverse-engineered structure should not inherit the U8 8-byte layout blindly
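A cheap sanity check for a local parser is to confirm that the file size divides evenly by the expected record width. The sketch assumes `typeflag.dat` is a flat array of records with no header, which is not established in this note; `typeflag_records` is a hypothetical helper.

```python
from typing import Tuple


def typeflag_record_size(is_crusader: bool) -> int:
    """Record width in bytes: 9 for Crusader, 8 for U8."""
    return 9 if is_crusader else 8


def typeflag_records(file_size: int, is_crusader: bool) -> Tuple[int, int]:
    """Return (record count, leftover bytes) for a headerless record array."""
    return divmod(file_size, typeflag_record_size(is_crusader))
```

A non-zero leftover under the expected width is a signal that either the wrong game layout is being applied or the headerless assumption does not hold.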
## Confidence Limits
Pentagram is valuable, but only in bounded ways.
Direct reasons for caution:
- `FAQ` says Crusader support was a future goal, not a completed feature
- `games/RemorseGame.cpp` is clearly incomplete compared with the ScummVM Crusader startup path
- `world/Item.cpp` explicitly disables all Crusader usecode events except `use()`
So for current Crusader RE, the best weighting is:
- high confidence: parser/disassembler layout clues, event ordinals, VM intent, container/indexing models, file-format deltas
- medium confidence: sparse Remorse intrinsic names and scratch global artifacts
- low confidence: full runtime behavior, startup semantics, and any absence-based conclusion from Pentagram's Crusader execution path
## Most Useful Pentagram Files
- `convert/crusader/ConvertUsecodeCrusader.h`
- `usecode/UsecodeFlex.cpp`
- `usecode/Usecode.cpp`
- `usecode/UCMachine.cpp`
- `docs/u8usecode.txt`
- `docs/scratch/globals/remorse1.01.txt`
- `world/Item.cpp`
- `graphics/TypeFlags.cpp`
- `world/Map.cpp`
- `world/CurrentMap.cpp`
## Practical RE Follow-Ups
1. Keep using Pentagram and ScummVM event names as slot-label hints only, especially for `0x0a`, `0x0b`, `0x11`, and the still-placeholder high ordinals.
2. When documenting Crusader USECODE VM behavior, cite Pentagram's `opcode 0x11 = class/event dispatch` distinction alongside the existing ScummVM reference.
3. Keep local owner-loaded/raw EUSECODE arithmetic authoritative over the shared Pentagram/ScummVM `(base + 19) / 6` rule until a direct main USECODE sample proves otherwise.
4. Tag Remorse global-slot references with version context when using Pentagram scratch outputs.
5. Reuse Pentagram's map/typeflag deltas when a future binary pass returns to world loaders or shape/type metadata.
6. Treat missing behavior in Pentagram's Crusader runtime as non-evidence unless ScummVM or raw binary analysis supports the same absence.