Crusader Decompilation Mid-Project Plan
Purpose
This file is the live mid-project tracker for the Crusader decompilation effort.
Keep it focused on:
- current verified state,
- active blockers,
- next resume work,
- and the remaining path to a reasonably complete decompilation.
Detailed completed analysis belongs in the files under docs/, not in this plan.
Progress Snapshot
Latest verified batch: docs/combat-dat.md now closes the shipped combat-tactic data file as a documentation target instead of leaving it as a scratch-note reference. Current best read: all local Remorse/Regret variants share one identical 14-record COMBAT.DAT; the live NE database already has the right tactic/process field anchors (combatDatTacticPtr, combatDatTacticCurOffset, combatDatBlockNo, tacticNo) plus setup helpers; and the shipped opcode subset is decoded into a full human-readable tactic catalog, using direct binary parsing with the ScummVM Crusader attack-process interpreter as a reference model.
- Overall useful decompilation progress: about 58%
- Reasonable uncertainty band: about 55% to 63%
- Top 100 far-call target coverage: about 86%
- Segment spread with meaningful analysis: about 34% to 40%
- Tooling maturity for continued work: about 83%
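The "one identical COMBAT.DAT across variants" reading in the batch note above is the kind of claim that can be re-checked mechanically. A minimal sketch, assuming each variant's file has already been read into memory; the variant labels and byte contents below are placeholders, not real data:

```python
import hashlib


def digest_by_variant(blobs):
    """Map each variant label to the SHA-256 hex digest of its file bytes."""
    return {label: hashlib.sha256(data).hexdigest() for label, data in blobs.items()}


def all_identical(blobs):
    """True when every variant's bytes hash to the same digest."""
    return len(set(digest_by_variant(blobs).values())) <= 1


# Placeholder bytes standing in for COMBAT.DAT read from each local variant.
sample = {
    "variant-a": b"placeholder-combat-dat",
    "variant-b": b"placeholder-combat-dat",
    "variant-c": b"placeholder-combat-dat",
}
print(all_identical(sample))
```

In practice the dictionary would be filled from the local game directories; the hashing step is unchanged.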
Why The Estimate Moved
- The NE CRUSADER.EXE database now has materially more named functions, better caller-role coverage, and broader comment-backed provenance than when this tracker was first drafted.
- The startup/display lane is no longer a top active section. Its outer ownership and control flow are stable enough that it should stay closed unless new caller evidence changes the model.
- The cheat/debug lane is also much tighter: the jassica16 latch, the broader -laurie gate, the ~ runtime toggle, the F7-family overlays, the F10/Ctrl behavior, and the 0x410 CD-transfer-display branch are now separated well enough that this lane is mostly documentation and cleanup, not architecture recovery.
- The USECODE/VM lane has moved from broad structure guesses to a partial runtime model: core loader/runtime helpers are named, owner-loaded slot arithmetic is verified against extracted corpora, several masked-create helpers have real contracts, and the major remaining uncertainty is now the upstream selector/caller path rather than the storage format itself.
- The map-renderer crosswalk lane also removed a lot of lingering shape ambiguity by closing more controller/helper families directly from extracted corpora plus scene evidence.
- The combat-tactic data lane is also now materially tighter: COMBAT.DAT is no longer just a named-tactic hint source, but a documented bytecode archive with stable per-record names, verified block structure, a decoded shipped opcode subset, and a practical family-level behavior map for the Dumb, Pivot, Advance, Careful, marker-shuttle, and step-out-shoot tactics.
Current Verified State
Primary Tracking Assets
- crusader_segment_coverage_ledger.csv remains the main executable-wide coverage tracker and should be updated after each verified batch.
- crusader_decompilation_notes.md is an index, not the place for long-form analysis.
- CRUSADER.EXE remains the default live Ghidra target.
- Verified CRUSADER-RAW.EXE work remains a supporting evidence base for ports, naming provenance, and caller/context cross-checks.
Strong Or Stable Areas
- seg001 gameplay/input/projectile work is stable enough to support verified raw-name ports into live NE work.
- The raw 0007 rendering/camera/tile-visibility lane has a strong structural map and now acts more as supporting evidence than as a primary unknown.
- The 0008 dispatch-helper and 000c state/transition lanes have broad partial coverage, including enough caller-side structure to support practical NE naming work.
- The VM/USECODE lane now also has one earlier compiled-side producer anchored beyond the old direct Item_GetDamaged/StorageDataProcess_Run callers: AreaSearch_CollideMove is now verified as a paired 0x20b/0x20c collision-process producer, and the local seg031 queue helpers are named structurally in the live database.
- That same collision-storage producer surface is now wider too: current direct callers are all movement/physics/animation-side (Item_LegalMoveToPoint, Item_LegalMoveToPointWithCollisionInfo, gravity, animation, supersprite, and fast-area gravity cleanup), and no verified non-collision producer reaches the 0x236 queue yet.
- The movement/collision lane is tighter at the helper level too: the step-aware seg029 sweep wrappers, the seg031 release-side queue cleanup pair, and the adjacent seg090 directional cache-offset helper are now named in the live database, so the remaining uncertainty in this lane sits earlier in caller policy rather than in the local helper layer.
- The startup/display lane is materially closed. Shared dispatch-entry ownership, seg126 file-backed control flow, seg127 fade control, and the surrounding palette/presentation helpers are now understood well enough that they should not stay in the live critical path.
- The cheat/debug lane is mostly closed at the behavior level. The secret-sequence matcher, broader cheat gates, F7 overlays, F10 modifier path, Ctrl+L location popup, Ctrl+Q = 0x410 CD-transfer-display toggle, -debug, and -laurie are all separated far more cleanly than before.
- The hidden usecode-debugger lane is now structurally understood as a layered orphaned subsystem: seg109 UI pieces, seg1408 break-state helpers, and the seg1418 interpreter handoff are no longer conflated.
- The USECODE/VM lane now has a workable compiled-side model around entity_vm_runtime_create, entity_vm_runtime_owner_resource_create, entity_vm_context_create_from_slot_index, the masked-create hub at 000d:463a, the persistence/load helpers, and the owner-loaded slot/value arithmetic.
- The owner-loaded body/range model is no longer speculative. Class-selection uses class_id + 2, header/subentry math matches extracted corpus output, and concrete body windows for NPCTRIG, EVENT, and related families are now verified.
- The map-renderer/documentation lane now has a stronger shape/controller crosswalk. Recent closures include CRUMORPH, NPC_ONLY, WATCHNS, WATCHEW, CRYOBOX, CRAZYEW, CRAZYNS, VIDEOBOX, PANELEW, GENERATR, and cross-game DEATHBOX, with viewer-side links kept conservative where actor-side state is still runtime-only.
- The command-line/startup lane is much tighter across both games: -warp <mission> [x y z], -mapoff, -egg, startup teleporter selection, and the -u EUSECODE root override all now have practical behavior models instead of folklore-level descriptions.
- The PSX lane is no longer just side inventory. Retail/pre-alpha bundle loading, mission-briefing/passcode structure, and the reduced-content pre-alpha disc now have dedicated notes and enough stable naming to support future targeted passes.
- The Remorse class-lift preparation lane now has a usable document cluster: overall plan, candidate inventory, endpoint spec, ABI constraints, family notes for EntityDispatchEntry and SpriteNode, a conservative Entity family split, a VM runtime/owner-resource layout note, a compatibility-header draft, and one grouped resume index.
- The same class-lift prep lane is now more execution-ready: the 0x4588 broker family has its own focused object note, the toolchain story has a dedicated fingerprint-evidence note, and there is now a concrete first-batch class-authoring checklist ready for the first MCP-backed namespace/struct/vtable pass.
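The -warp <mission> [x y z] grammar mentioned in the command-line bullet above can be illustrated with a small parser. This is a sketch of the argument shape only, not a reconstruction of the game's own parser: the WarpRequest type and parse_warp helper are hypothetical, and treating the operands as decimal integers is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class WarpRequest:
    mission: int
    position: Optional[Tuple[int, int, int]] = None  # (x, y, z) when supplied


def parse_warp(argv: List[str]) -> Optional[WarpRequest]:
    """Return the -warp request found in argv, or None if the switch is absent."""
    if "-warp" not in argv:
        return None
    start = argv.index("-warp") + 1
    numbers = []
    # Collect up to four integer operands; stop at the first non-integer token.
    for token in argv[start:start + 4]:
        try:
            numbers.append(int(token))  # assumption: decimal operands
        except ValueError:
            break
    if len(numbers) == 1:
        return WarpRequest(numbers[0])
    if len(numbers) == 4:
        return WarpRequest(numbers[0], (numbers[1], numbers[2], numbers[3]))
    raise ValueError("-warp expects <mission> or <mission> x y z")


print(parse_warp(["-warp", "3", "100", "200", "0", "-mapoff"]))
```

Other switches such as -mapoff and -egg are left uninterpreted here because this plan does not state their operand shapes.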
Areas That Are No Longer Live Priorities
- Startup/display transition recovery is no longer a front-line blocker unless overlap repair becomes necessary for adjacent work.
- The general cheat/debug key matrix no longer needs broad exploratory work.
- The -debug switch is no longer an open mystery; remaining work there is mostly sink-side cleanup and documentation.
- The earlier executable-patch experiments around the hidden debugger are documented history, not a current decompilation priority unless new evidence changes the entry model.
Live Blockers
- The main remaining VM uncertainty is the real upstream selector/caller path into entity_vm_opcode_sequence_run and adjacent masked-create helpers. One earlier producer is now closed at AreaSearch_CollideMove for the 0x236 collision-storage family, but the owner-loaded class-family chooser and any broader non-collision producers are still upstream-dark.
- The dark masked-materializer wrappers still need caller-role recovery, especially the signed-additive slot-0x0a / slot-0x0b pair and the surrounding higher-slot wrapper ladder.
- The callback object rooted at 0x4588 still lacks a behaviorally safe subsystem name even though its allocation/finalize neighborhood is better constrained.
- A few hot or awkward function ranges still lack clean function objects or good boundaries, especially around 000c:db68, 000e:ffb0, and several caller-dense gaps in 0007, 000b, and 000e.
- Weakly covered resource/data-loader families and non-CALLF far-pointer relocations are still a second-pass blocker for some object/table recovery work.
- The segment ledger has improved, but it still trails the actual verified state in the notes and Ghidra database. Promoting known segments from documented evidence remains real work, not bookkeeping trivia.
Current Focus
- Keep the live NE CRUSADER.EXE lane as the default working surface, using raw/full-EXE and standalone-segment work only as supporting evidence.
- Keep the VM/USECODE lane focused on selector recovery, caller-role recovery, and record-shape confirmation rather than repeating storage-format validation that is already closed.
- Promote ledger coverage from existing verified notes before broadening into fresh executable-wide sweeps.
- Use overlap repair only where it unlocks an active high-payoff lane.
- Use the map-renderer/tooling lane to validate shape ids, map placements, and viewer semantics before promoting additional static-object names in Ghidra.
Next Resume Point
- Resume from docs/ne-hole-filling-priorities.md and pick one small NE cluster where the old disasm vocabulary, extracted corpus evidence, and live NE callers overlap cleanly.
- Stay on the VM lane and move one step earlier than the now-mapped movement/collision helper set around AreaSearch_CollideMove: the local seg029/031/090 helper layer is now named, so the next work is the policy/dispatch layer that decides when those legal-move, gravity, animation, or supersprite paths instantiate the local 0x236 collision-storage queue, plus verification of whether any non-collision producer feeds the same StorageDataProcess_Create/Run family.
- Recover caller roles for the remaining dark signed-additive masked wrappers, especially the slot-0x0a / slot-0x0b pair, and compare them against the now-anchored slot-0x12 caller pattern.
- Tighten the higher-slot wrapper ladder around 0005:3115..31da so future event-label promotion depends on compiled caller behavior instead of external tables.
- Tighten the seg006 masked-helper caller chains so the local state-selector/value family can be tied to concrete gameplay subsystems.
- Classify the paired seg070 loops behind entity_vm_runtime_owner_resource_create, especially which temporary buffers and record schemas each family populates.
- Promote additional ledger rows directly from already-verified docs and live comments, especially where segments already deserve Foothold, Partial, or Deep; the new seg029 step-aware sweep batch, seg031 queue-release batch, and seg090 movement-helper batch should be the immediate template.
- If the VM lane stalls, revisit 000e:ffb0 from the now-better-constrained video/audio caller windows and try to recover an adjacent non-overlapped helper before attempting broad boundary repair.
- Continue the map-renderer cross-check lane by building one conservative shape-id/map-placement crosswalk from shapedata_more_complete.txt, extracted corpora, and authored scene evidence before promoting more trigger-heavy classes in NE.
- Keep the PSX pre-alpha lane alive as a secondary target: classify the LoadExec callers, test whether the stale TALK1.XA path is still reachable, and compare the shipped LSET1 bundles against the retail extractor outputs.
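The resume items above use segment:offset addresses (000e:ffb0) and offset ranges (0005:3115..31da). A minimal sketch of a parser for that notation, useful when turning plan entries into tool input; the FarRange type and parse_far_range helper are hypothetical, and reading the range end as an inclusive offset in the same segment is an assumption:

```python
import re
from typing import NamedTuple


class FarRange(NamedTuple):
    segment: int
    start: int
    end: int  # assumed inclusive; equals start for a single address


_FAR = re.compile(
    r"^([0-9a-fA-F]{4}):([0-9a-fA-F]{1,4})(?:\.\.([0-9a-fA-F]{1,4}))?$"
)


def parse_far_range(text: str) -> FarRange:
    """Parse 'ssss:oooo' or 'ssss:oooo..oooo' (all hex) into a FarRange."""
    match = _FAR.match(text.strip())
    if match is None:
        raise ValueError(f"not a segment:offset reference: {text!r}")
    segment = int(match.group(1), 16)
    start = int(match.group(2), 16)
    end = int(match.group(3), 16) if match.group(3) else start
    if end < start:
        raise ValueError(f"range end precedes start: {text!r}")
    return FarRange(segment, start, end)


print(parse_far_range("0005:3115..31da"))
print(parse_far_range("000e:ffb0"))
```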
Remaining Work To Reach A Reasonably Complete Decompilation State
1. Coverage And Tracker Completion
- Keep turning the seeded 145-row ledger into a trustworthy whole-program dashboard.
- Sweep remaining lightly covered segment clusters by adjacency and call relationships rather than one-off function hunting.
- Keep the plan, the docs, the ledger, and the live Ghidra comments synchronized after each verified batch.
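One way to keep the ledger usable as a dashboard is to tally rows per coverage status after each batch. A minimal sketch, assuming the CSV has a status column; the column names and the sample rows below are illustrative placeholders, not the real ledger schema or contents (only the Foothold, Partial, and Deep labels come from this plan):

```python
import csv
import io
from collections import Counter


def status_counts(csv_text, status_column="status"):
    """Count ledger rows per coverage status (column name is an assumption)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[status_column].strip() for row in reader)


# Placeholder rows; segment labels and statuses here are made up for the demo.
SAMPLE = """segment,status
segA,Partial
segB,Partial
segC,Foothold
segD,Deep
"""

counts = status_counts(SAMPLE)
for status, count in sorted(counts.items()):
    print(f"{status}: {count}")
```

Comparing the tallies before and after a batch gives a quick check that promoted segments actually landed in the ledger.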
2. VM / USECODE / Scripting Lane
- Close the upstream selector/caller path into the sequencer and masked-create families.
- Finish separating owner-row-backed data from runtime-decoded control streams and dispatch-entry seed records.
- Expand caller-backed event-label promotion only where binary behavior and slot reuse agree.
- Keep maturing the tooling bridge from extracted corpora into compiled-side annotation/import workflows.
3. Callback / Allocator / Object-Role Lane
- Classify the 0x4588 callback object strongly enough for a real subsystem name.
- Separate generic cache/allocator mechanics from game-specific client behavior where caller evidence supports it.
- Keep low-level helper names conservative until behavior, not just structure, is clear.
4. Rendering / Animation / UI Support Lanes
- Keep the rendering/palette/animation lanes focused on caller-side semantics and cleanup, not exploratory renaming in isolation.
- Revisit 000e:ffb0 and adjacent overlap-heavy video helpers only when the payoff is clear.
- Use map-renderer evidence and extracted corpora to validate static-object and helper/controller naming before promoting it into live NE work.
5. Data / Resource / Relocation Coverage
- Tackle deferred non-CALLF far-pointer relocations when they are needed for active table/object recovery.
- Broaden weakly covered resource/data-loader families where they block real subsystem classification.
- Keep external references like ScummVM or older disasm corpora as evidence aids, not rename authority.
Priority Order
- VM / USECODE selector and caller recovery
- Coverage-ledger refinement from already-verified notes
- Callback-object classification around 0x4588
- High-value boundary repair when it unlocks active work
- Broader segment sweeps and second-pass data/relocation work
- Secondary map-renderer and PSX follow-up lanes
Evidence Anchors
Primary files backing this plan state:
- crusader_segment_coverage_ledger.csv
- crusader_decompilation_notes.md
- docs/overview.md
- docs/ne-hole-filling-priorities.md
- docs/crusader-disasm-reference.md
- docs/raw-porting-progress.md
- docs/raw-0008-000c.md
- docs/raw-000a-000d.md
- docs/raw-000e.md
- docs/far-call-targets.md
- docs/usecode-roundtrip-ir.md
Update Rule
Update this file when one of the following happens:
- the headline estimate changes materially,
- a live blocker is resolved,
- a subsystem moves from structural to behavioral understanding,
- a segment cluster is promoted materially in the ledger,
- or the next resume point changes enough that the current handoff would mislead the next pass.
Keep this file short. Move detailed completed analysis into the appropriate file under docs/ and leave only the current state, blockers, and forward path here.