# Crusader Disasm Reference Corpus
## Purpose
This note records the reusable knowledge in the separate local project at `K:/ghidra/crusader-disasm` and how it should be consumed inside the main `Crusader_Decomp` workflow.
Treat that repo as an auxiliary evidence corpus, not as direct rename authority for `CRUSADER-RAW.EXE` or the NE-loaded `CRUSADER.EXE` program.
## Main Assets
### Handwritten notes
`misc_crusader_notes.txt` is a compact scratchpad with a few still-useful anchors:
- keyboard event/code notes: `CTRL-M = 0x432`, `CTRL-L = 0x426`, `CTRL-V = 0x42f`, `CTRL-Q = 0x410`
- shape/event examples: `STEAM2` (`shape 1297`) is noted with event `15` (`enterFastArea`) and 24 animation frames
- `SnapEgg` (`shape 0x4fe`) is noted as entering the fast area and being handed to a `SnapProcess`
- chest/monitor examples preserve concrete shape ids and field snapshots that can be cross-checked against map/object dumps
- old standalone function labels such as `FUN_1130_0896` are useful only as historical waypoints; they are not direct rename authority for the current raw or NE Ghidra databases
Safe use:
- event/id hinting
- shape-id cross-checks
- lead generation for later binary confirmation
Unsafe use:
- direct function renames based only on these handwritten notes
- assuming the old segment numbering matches the current `CRUSADER-RAW.EXE` or `CRUSADER.EXE` imports without a verified address mapping
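The old labels carry a segment and offset in their name, so a small helper can split them out for side-by-side comparison with live NE addresses. The following is a minimal Python sketch; `parse_legacy_label` and `SegOff` are hypothetical names, and the parsed pair still belongs to the old project's segment numbering, so it is a lookup key rather than a verified address.

```python
import re
from typing import NamedTuple, Optional


class SegOff(NamedTuple):
    segment: int
    offset: int


# Matches labels shaped like FUN_1130_0896 (four hex digits each).
_LABEL_RE = re.compile(r"^FUN_([0-9a-fA-F]{4})_([0-9a-fA-F]{4})$")


def parse_legacy_label(label: str) -> Optional[SegOff]:
    """Split an old FUN_ssss_oooo label into (segment, offset).

    Returns None for anything that is not in that form. The result is in
    the OLD numbering and must not be treated as a current address.
    """
    match = _LABEL_RE.match(label)
    if match is None:
        return None
    return SegOff(int(match.group(1), 16), int(match.group(2), 16))
```

For example, `parse_legacy_label("FUN_1130_0896")` yields segment `0x1130` and offset `0x0896`, which can then be looked up in a separately verified mapping.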
### Shape metadata tables
`shapedata.txt` and especially `shapedata_more_complete.txt` provide a broad `Shape info N data 4(top),5,7,8` table.
Current best reuse:
- cross-check `shapeinfo` bytes and flags against item/entity behavior already seen in the binary
- identify repeated per-shape flag/value families before promoting any field meaning
- support later naming of shape-related globals, cache entries, and trigger/object classes
This is one of the strongest data-side additions in the external corpus because it is broad, structured, and easy to re-test locally.
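One way to look for repeated per-shape flag/value families is to group shape ids by their field tuple once the table has been parsed. The sketch below assumes the parsing has already happened and that each shape maps to a tuple of integers; the function name `group_by_fields` and the sample values are hypothetical and are not taken from the `shapedata` files.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Fields = Tuple[int, ...]


def group_by_fields(shape_fields: Dict[int, Fields]) -> Dict[Fields, List[int]]:
    """Return field tuples shared by two or more shapes, with their shape ids."""
    families: Dict[Fields, List[int]] = defaultdict(list)
    for shape_id in sorted(shape_fields):
        families[shape_fields[shape_id]].append(shape_id)
    # A tuple seen on a single shape is not a "family" yet, so drop it.
    return {fields: ids for fields, ids in families.items() if len(ids) > 1}


# Placeholder values for illustration only; NOT real shapedata content.
demo = {0x100: (1, 0, 2, 0), 0x101: (1, 0, 2, 0), 0x102: (0, 0, 0, 3)}
repeated = group_by_fields(demo)
```

A family found this way is only a candidate: the field meaning still needs confirmation against behavior seen in the binary before it is promoted.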
### Static map/object dumps
`mapdump/map-item-dump.txt` is a large coordinate/event-style dump of placed world items.
Current best reuse:
- correlate static placements with `EVENT`, `NPCTRIG`, `CRUZTRIG`, and related USECODE families
- cross-check specific shapes noted in `misc_crusader_notes.txt` against real map presence
- build targeted map-level test cases for later `CRUSADER.EXE` naming and annotation work
This file is valuable because it gives evidence of the actual in-game world layout, which complements the owner-loaded USECODE work already documented in `docs/usecode-roundtrip-ir.md`.
### USECODE opcode list
`usecode_opcodes.txt` contains an extracted opcode table for the JP release of Crusader: No Remorse.
Useful properties:
- broad opcode coverage from `0x00` through `0x7b`
- readable names for assignment, local/member reference, control-flow, spawn/process, and search-family operations
- a concrete local opcode vocabulary independent of ScummVM/Pentagram naming layers
Reuse rule:
- use these names as parser and report hints only
- do not assume the textual names alone are enough to rename compiled handlers in the DOS binary
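The reuse rule can be made visible in report output by attaching a provenance tag to every opcode name. This is a minimal sketch, assuming the table has already been loaded into a dictionary keyed by opcode byte; `opcode_hint` and the tag strings are hypothetical, and no real opcode names are reproduced here.

```python
from typing import Dict, Tuple

LOCAL_TABLE = "usecode_opcodes.txt"


def opcode_hint(opcode: int, local_names: Dict[int, str]) -> Tuple[str, str]:
    """Return (display_name, provenance) for a USECODE opcode byte.

    Names from the local table are tagged as hints; opcodes without an
    entry fall back to a neutral positional placeholder.
    """
    name = local_names.get(opcode)
    if name is None:
        return (f"op_{opcode:02x}", "placeholder")
    return (name, f"hint:{LOCAL_TABLE}")


# Hypothetical entry, used only to show the two return shapes.
demo_names = {0x00: "example_name"}
```

Keeping the provenance next to the name means a parser report never presents a corpus-derived name as if it were confirmed against the DOS binary.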
### Intrinsic and function dumps
The `unkcoffs/` directory holds older cross-version name tables such as:
- `reg_functions.txt`
- `rem_functions.txt`
- `reg_intrinsic_dump.txt`
- `rem_intrinsic_dump.txt`
- `u8_intrinsic_dump.txt`
These dumps are useful because they preserve prior RE vocabulary for Remorse/Regret/U8 function and intrinsic families, including many item, actor, audio, palette, and world operations.
Current safe use:
- hint-only metadata for intrinsic ordinals, signatures, and broad subsystem labels
- cross-version comparison when a Remorse/Regret difference matters
- porting candidate generation for the NE `CRUSADER.EXE` project, where segment-based labels may be easier to reconcile than in the flat raw import
Current unsafe use:
- direct rename authority for modern Ghidra functions
- assuming one game's intrinsic numbering matches another without local confirmation
### Validated follow-up on `misc_crusader_notes.txt`
Several of the old scratchpad items are now tight enough to record as verified follow-up rather than open leads.
- `STEAM2` (`shape 1297`) is no longer just a handwritten event hint. Local extracted data now confirms non-empty `STEAM2::enterFastArea` (`event 0x0f`) and `STEAM2::gotHit` (`event 0x06`) bodies in `USECODE/EUSECODE_extracted/class_event_index.tsv`, which matches the old note's `event 15 (enter fast area)` and `Got Hit: spawn process` direction.
- The old standalone labels `FUN_1130_0896`, `FUN_1130_32af`, `FUN_1020_0000`, and `FUN_1128_026b` are now closed in the live NE database as `Key_HandleOptionKeys`, `NPC_ResetToStartOfAnim`, `Game_Start`, and `Item_ReceiveHit`. The historical notes were directionally useful, but the live names now capture their real scope: `NPC_ResetToStartOfAnim` resets an NPC/item frame to the first frame of a chosen animation and updates the stored last-anim state, while `Item_ReceiveHit` is the main damage/death handler rather than a loose "various deaths" helper.
- The `Item_ReceiveHit` decomp also closes several handwritten shape hints directly from code. It uses `0x576` for the burning human replacement path (`flaming guy running around`), `0x5a9` and `0x52b` for shield-zap sprites, and still routes controlled-NPC death through `Target_PutTargetingReticleOnItem(0)`, which keeps the old `0x59a` reticle note on the right subsystem.
- The old `ItemNPC_AnotherCreate` area-search TODO is now directly closed in the live NE program: `10e8:2710` (now renamed `NPC_CreateIfAreaSearchValid`) allocates/initializes an `AreaSearch` struct, checks `AreaSearch_IsValidPositionPt`, and only then runs `NPC_Create` plus `Item_PopToCoords`. So the practical role is `create NPC only if the requested point passes area-search placement validation`, not a second independent spawn path.
- The handwritten `Kernel_11d0_2491` gloss was incomplete. The live function does not merely "print kernel info"; it serializes the process table, timer/keyboard/mouse process-id lists, process sizes, and per-process state through file-like writer callbacks, then restores timer/interrupt state afterward. The current safest reading is `kernel/process snapshot writer` rather than a simple printf-style diagnostics helper.
- `FREE::ordinal3C` is still corpus-side evidence rather than live-NE proof, but the old disassembly now constrains it much better than the original note implied. It clears the alert state, checks avatar stasis, rolls random thresholds, and spawns `FREE::ordinal21` with a small set of ordinals (`0x0e`, `0x0f`, `0x00b6`, `0x00d2`). That makes `random global FREE event chooser after alert clear` a safer description than the narrower `random voices when alarm is disabled` guess.
- The old `high priority unknown intrinsic: 01E - fire` note is now resolved at the hint level. The old Remorse intrinsic tables map `Int01E` to `Actor::I_maybeFire(...)`, and the current live export map ties `Int01E` to `1128:11da`. The name remains hint-level until the compiled body is analyzed locally, but the ordinal is no longer unidentified.
- The `combat.dat` scratch note is also better grounded now. The external corpus already showed the extracted tactic files are identical between Remorse and Regret, and the live `CRUSADER.EXE` export map now has NPC fields such as `combatDatTacticPtr`, `combatDatTacticCurOffset`, `combatDatBlockNo`, and `tacticNo`. So the named tactic strings are good portable data labels, but they should still be attached to tactic records and NPC state fields, not promoted directly into compiled function names.
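The validate-then-create role described for `NPC_CreateIfAreaSearchValid` in the list above can be sketched in Python. This is an illustration of the control flow only, not a transcription of the decompiled body: the callables are hypothetical stand-ins for `AreaSearch_IsValidPositionPt`, `NPC_Create`, and `Item_PopToCoords`, the `AreaSearch` allocation is folded into the predicate, and returning `None` on failure is an assumption.

```python
from typing import Callable, Optional, Tuple, TypeVar

Point = Tuple[int, int, int]
T = TypeVar("T")


def create_if_position_valid(
    point: Point,
    is_valid_position: Callable[[Point], bool],  # stand-in for the area-search check
    create_npc: Callable[[], T],                 # stand-in for NPC creation
    pop_to_coords: Callable[[T, Point], None],   # stand-in for placement
) -> Optional[T]:
    """Create and place an NPC only when the requested point validates."""
    if not is_valid_position(point):
        # Assumption: a failed check creates nothing at all.
        return None
    npc = create_npc()
    pop_to_coords(npc, point)
    return npc
```

The point of the sketch is that creation sits behind the placement check, which is why the function reads as a guarded wrapper rather than a second spawn path.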
Two shape hints also have light compiled-side support now:
- `0x59a` is passed into `SpriteProcess::I_createSprite` from the cursor/targeting lane in segment `1130`, which fits the old `targetting reticle` note.
- `0x5a3` is passed into `SpriteProcess::I_createSprite` from the `13e8` gameplay/UI lane and stored at `0x6054`, which fits the old `use item crosshairs` note.
Remaining caution:
- these follow-ups make the scratchpad much more reusable, but `misc_crusader_notes.txt` is still an auxiliary corpus. The verified parts above should be cited with their live NE addresses or extracted-data files when reused elsewhere in the repo.
### Combat data note
`combat_dat/readme.txt` records that the extracted `combat.dat` tactic files are identical between No Remorse and No Regret.
That is small but useful: tactic names from the combat data are portable labels and should be treated as version-stable unless contradicted by later binary evidence.
## How This Fits The Existing Docs
This external corpus mainly strengthens four areas already active in `Crusader_Decomp`:
1. `docs/usecode-roundtrip-ir.md`
The opcode list, intrinsic dumps, and static trigger/map data provide local cross-checks for the USECODE parser and event-family work.
2. `docs/ne-segment1.md`
The handwritten note set preserves shape ids, keyboard/event codes, and gameplay object examples that can be matched against the segment-1 gameplay/input lane.
3. `docs/raw-porting-progress.md`
The external notes add candidate gameplay/object labels and map-backed test targets, but they should remain supporting evidence until verified in the raw full-EXE database.
4. `docs/overview.md`
The separate disasm repo is now part of the local evidence stack alongside ScummVM and Pentagram, but unlike those source ports it is a prior RE corpus tied directly to Crusader assets and old disassembly work.
## Safe Reuse Policy
Use the external disasm repo for:
- opcode-name hints
- intrinsic/signature hints
- shape-id and map-placement cross-checks
- event-code and key-code lead generation
- candidate subsystem vocabulary before binary confirmation
Do not use it for:
- speculative raw-function renames
- address mapping without an explicit verified translation
- replacing direct binary evidence from `CRUSADER-RAW.EXE` or `CRUSADER.EXE`
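The policy above can be expressed as a simple gate in tooling that proposes renames. The sketch below is one possible encoding, not an existing script in this repo; the `Source` enum and `may_rename` function are hypothetical names.

```python
from enum import Enum
from typing import Iterable


class Source(Enum):
    EXTERNAL_DISASM = "external-disasm"  # the auxiliary corpus described here
    SOURCE_PORT = "scummvm-pentagram"    # source-port derived evidence
    BINARY = "binary"                    # CRUSADER-RAW.EXE / CRUSADER.EXE analysis


def may_rename(evidence: Iterable[Source]) -> bool:
    """Allow a rename only when direct binary evidence is part of the case.

    Corpus and source-port evidence can support a rename but cannot
    justify one on their own.
    """
    return Source.BINARY in set(evidence)
```

With this gate, `may_rename([Source.EXTERNAL_DISASM])` is rejected, while the same hint combined with `Source.BINARY` passes.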
## Immediate Porting Frontier For `CRUSADER.EXE`
The next practical use of this corpus is not another raw-only note pass. It is a controlled porting pass into the NE-loaded `CRUSADER.EXE` project in Ghidra.
Best initial targets:
1. Port already-verified raw names that clearly correspond to NE-segment functions where the segment:offset identity can be confirmed directly.
2. Use `unkcoffs/` function and intrinsic dumps as hint-only comparison tables when the NE database exposes clearer segment-local call structure than the flat raw import.
3. Use `map-item-dump.txt` plus shape tables to annotate trigger-heavy or object-heavy NE lanes before promoting any names.
4. Use `usecode_opcodes.txt` to keep future USECODE parser/report output aligned with an additional local opcode vocabulary, especially where ScummVM and Pentagram leave placeholders.
## Recommended Order For The Next Porting Pass
1. Start with one small NE segment or subsystem that already has strong raw names or old disasm vocabulary.
2. Prefer functions with direct string, data-table, or caller-role evidence over unlabeled wrappers.
3. Use the map/shape corpus to explain data-driven objects first; use the intrinsic/function dumps only as secondary hints.
4. Record exact successful raw-to-NE name correspondences so later passes can reuse the mapping instead of re-deriving it.
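For step 4, a flat TSV file is one low-friction way to keep confirmed correspondences reusable. The sketch below is an assumption about layout, not an existing file in the repo: the `Correspondence` fields and the `to_tsv` helper are hypothetical, and the single sample row reuses the `ItemNPC_AnotherCreate` follow-up recorded earlier in this note.

```python
import csv
import io
from dataclasses import astuple, dataclass, fields
from typing import Iterable


@dataclass(frozen=True)
class Correspondence:
    source_label: str  # old or raw-side label
    ne_address: str    # segment:offset in the live NE database
    ne_name: str       # confirmed live name
    evidence: str      # where the confirmation came from


def to_tsv(rows: Iterable[Correspondence]) -> str:
    """Serialize correspondences as tab-separated text with a header row."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow([f.name for f in fields(Correspondence)])
    for row in rows:
        writer.writerow(astuple(row))
    return buf.getvalue()


rows = [
    Correspondence(
        "ItemNPC_AnotherCreate", "10e8:2710",
        "NPC_CreateIfAreaSearchValid", "live NE decomp",
    )
]
```

Keeping the evidence column mandatory makes it harder for a hint-only match to slip into the mapping unmarked.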
## Current High-Value Follow-Ups
- Build a shape-id crosswalk between `shapedata_more_complete.txt`, `map-item-dump.txt`, and the existing `EVENT` / `NPCTRIG` / `CRUZTRIG` families.
- Compare the handwritten key/event codes against the already-named cheat/input paths to see which parts of the old notes are now directly closed.
- Use `rem_functions.txt` / `reg_functions.txt` to identify conservative candidate names for still-positional NE functions, but only when the local caller/data evidence matches.
- Keep the external disasm corpus explicitly separated from ScummVM/Pentagram-derived evidence so provenance stays clear in future porting notes.
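The shape-id crosswalk in the first follow-up can start as a plain membership table: for each shape id, which sources mention it. The sketch below assumes each source has already been reduced to a set of shape ids; `crosswalk` is a hypothetical helper, and the sample membership is a placeholder rather than a real result from these files.

```python
from typing import Dict, List, Set


def crosswalk(sources: Dict[str, Set[int]]) -> Dict[int, List[str]]:
    """Map each shape id to the sorted list of sources that mention it."""
    result: Dict[int, List[str]] = {}
    for name in sorted(sources):
        for shape_id in sources[name]:
            result.setdefault(shape_id, []).append(name)
    return dict(sorted(result.items()))


# Placeholder membership for illustration; NOT derived from the actual files.
demo = {
    "notes": {1297, 0x4FE},
    "shapedata": {1297},
    "mapdump": {1297, 0x4FE},
}
table = crosswalk(demo)
```

Shapes that appear in only one source are the useful output here, since they mark either a gap in a dump or a lead that still needs map-level confirmation.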