Crusader_Decomp/plan-mid.md
MaddoScientisto de42fd1ea1 Add Crusader-specific USECODE data and documentation
- Introduced new file `vm_mask_ladder.tsv` containing detailed mappings for Crusader USECODE VM masks and their associated descriptors.
- Added comprehensive documentation in `scummvm-crusader-reference.md` outlining the structure, findings, and implications for reverse-engineering the Crusader engine within ScummVM.
- Created `usecode-roundtrip-ir.md` to document the plan for converting Crusader USECODE bytes into a human-readable format, detailing the container layout, event names, and intrinsic tables.
- Implemented a PowerShell script `temp_usecode_sample.ps1` for extracting and analyzing USECODE data from the Crusader FLX files, providing insights into class and event structures.
2026-03-22 17:26:39 +01:00

# Crusader Decompilation Mid-Project Plan
## Purpose
This file is the workspace-facing mid-project tracker for the Crusader decompilation effort.
It is intended to answer four questions clearly:
1. How far along is the project?
2. What is already solid?
3. What still blocks broader decompilation?
4. What should be implemented next?
The estimates below are intentionally conservative. They measure verified behavioral understanding, not just renamed symbols.
## Progress Snapshot
## Working Progress
### Last Confirmed State
- Priority 0 has started: `crusader_segment_coverage_ledger.csv` exists and contains a first-pass 145-row ledger.
- The currently seeded ledger rows are conservative and strongest around seg001, seg004, seg021, seg043, seg080, seg082/083/085, seg091, seg094, and seg095.
- Priority 1 has started on the cache/backend cluster: the seg082 allocator mechanics are now materially recovered (`allocator_head_try_alloc_block`, `allocator_head_free_block`, `allocator_free_block_by_ptr`, `allocator_try_alloc_from_head_table`, `allocator_phase_finalize_pass`), and the `0x4588` path now has named lifecycle helpers (`runtime_callback_object_init_once`, `runtime_callback_object_teardown_once`).
- The `0x4588` blocker is tighter than before: `000a:b988` boundary repair now includes both callback sync callsites (`000a:b9e5` / `000a:ba66`) inside one real function body, `000d:9d5e` / `000d:a3b7` are confirmed inside `entity_cleanup_resources_and_dispatch`, and adjacent helpers are now clarified as `allocator_head_finalize_sweep` (`0009:a961`), `video_bios_state_snapshot` (`000a:4a1f`), and `video_mode_set_and_record_state` (`000a:4972`). Concrete subsystem identity is still unresolved.
- A larger MCP rename batch completed for cleanup callees: `palette_buffer_alloc_and_init_256` (`0009:7853`), `file_handle_alloc_init_and_open` (`0009:1c3a`), `file_handle_open_with_mode` (`0009:1d6a`), `surface_release_internal` (`0009:8d7b`), `surface_release_and_maybe_free` (`0009:8e0a`), and `sprite_redraw_global_if_active` (`000d:9231`). This reduces `entity_cleanup_resources_and_dispatch` ambiguity on file/surface/palette teardown paths.
- The previously missing `000d:7e00` function object is now recovered and named `entity_dispatch_entry_init_runtime_state`, with paired destructor `entity_dispatch_entry_release_runtime_state` at `000d:8078`. Adjacent missing helpers `0003:a880` and `0003:b8e2` were also recovered, with `0003:b8e2` promoted to `far_buffer_alloc_with_mode_flags`.
- Additional helper stabilization now covers seg061/064/076: `vga_palette_read` (`0009:6ec7`) is confirmed alongside existing palette write/free paths, `timer_entity_enable_wrapper` (`0008:d3ba`) is named, and seg064 one-shot gate helpers around `0x3b72/0x3b73` are documented with conservative comments while keeping speculative naming deferred.
- Constructor-lane semantics tightened further: `entity_set_update_period_and_reschedule` (`0008:d27e`) and `palette_buffer_alloc_copy_from_source` (`0009:7905`) are now named, and both `0x4588` callback emit callsites (`000d:9d5e`, `000d:a3b7`) now have explicit payload-pair annotations in disassembly.
- The seg082 allocator table structure is now pinned down as the allocator head table at `0x8724` and active head count at `0x879c`, and the old structural helpers at `0009:b06b` / `0009:b1c3` are now promoted to `allocator_try_alloc_from_head_table` and `allocator_phase_finalize_pass`.
- New caller-side seg138 evidence now exists at `FUN_000d_938c` (`000d:938c-000d:9583`): it builds one scratch-palette dispatch entry (`kind 0x3c`) and one current-palette dispatch entry (`kind 0x14`) through `entity_dispatch_entry_init_runtime_state`, waits for each entry's active flag to clear, then redraws the global sprite path and dispatches through the input object's vtable slot `+0x08`. This narrows the open lane to presentation/dispatch semantics without yet justifying a concrete subsystem rename.
- seg137 is now promoted from `Foothold` to `Partial`: direct MCP recovery stabilized a coherent palette/dispatch-entry helper family with safe renames for all-black, all-white, arbitrary-RGB, grayscale, black-state, and solid-color state builders around the same `entity_dispatch_entry_init_runtime_state` lane. The remaining gap is the higher-level event/script meaning of those helpers, not the local mechanics.
- seg005 and seg136 now have new high-value footholds: `FUN_0004_60c0` is recovered as a startup/display orchestration handoff that drives the seg137 palette helper family, validates an object through vtable `+0x0c`, creates the default active dispatch entry, programs mouse state, and then hands off into `0004:1e00`; nearby seg136 helpers are now stabilized as `active_dispatch_entry_mark_enabled`, `active_dispatch_entry_mark_disabled`, and `active_dispatch_entry_create_default`.
- The downstream seg005 handoff body is now also classified further: `FUN_0004_1e00` (`0004:1e00-0004:2420`) is a non-return startup/display transition driver with confirmed use of `vga_palette_set_all_black`, `animation_ctor_variant_b`, `sprite_node_get_or_traverse`, seg064 gate helpers, the `0x2bd8` vtable lane, and the `0x4aa/0x7e22` resource/object lane. The remaining work is naming the exact state label, not repairing the structure.
- seg126 is now promoted from `Foothold` to `Partial`: `FUN_000c_7412`, `transition_preentry_setup_resources`, `transition_preentry_release_resources`, `transition_preentry_run_until_complete_or_abort`, `transition_preentry_step_script`, `thunk_callf_0000_ffff_000c_827d`, `thunk_callf_0000_ffff_000c_82f9`, and `FUN_000c_834a` now show a coherent pre-entry, guarded-entry, script/fade step, and post-transition control shell around the same `FUN_0004_1e00` startup/display state.
- seg127 is now promoted from `Foothold` to `Partial`: `palette_fade_begin_full_up`, `palette_fade_begin_full_down`, `transition_palette_fade_begin`, `transition_palette_fade_tick`, `transition_palette_fade_out_step`, and `transition_palette_fade_in_step` form a concrete local palette-fade controller with verified full-range wrappers and caller-side state gating immediately beside the same seg126/seg005 transition lane.
- seg049 is no longer blank: `watch_entity_controller_create_global`, `watch_entity_controller_create`, and `watch_entity_controller_dispatch_if_present` now show that `0x2bd8` is a real type-stamped watch/camera controller object lane rather than only a raw watched-entity pointer, and that same controller is exercised from `FUN_0004_1e00`.
- seg108 is no longer blank: `sprite_object_clear_flag40_if_present` and `sprite_object_set_flag40_if_present` now anchor the `0x4f38` global sprite/object lane as a real state-bit-controlled object path used beside the same `0x4588` callback sync and startup/display transition flow.
- Direct MCP follow-up on seg126 and seg127 now recovered the missing helper bodies after boundary repair: `transition_preentry_setup_resources` (`000c:c63a`), `transition_preentry_release_resources` (`000c:c890`), `transition_preentry_run_until_complete_or_abort` (`000c:c9f4`), `transition_preentry_step_script` (`000c:ca1d`), and the neighboring `transition_palette_fade_tick` / `transition_palette_fade_begin` / `transition_palette_fade_out_step` / `transition_palette_fade_in_step` chain are now named against verified behavior. The latest semantic pass also tightened the two main open globals: `0x8c5c` / `0x8c60` are now best understood as a paired temporary text-renderer lane, while `0x31a2` behaves like an external input/event break gate maintained by queue/interrupt-side code. The remaining structural cleanup is the separate oversized overlap rooted at `000c:db68`, not the seg126 helper family.
- Bonus cheat-lane cleanup is now visible in Ghidra too: `cheat_code_check` has recovered local names (`input_event_record`, `input_event_offset`, `new_cheat_enabled`, `cheat_status_display_root`) and a decompiler comment stating that it matches the six-byte event-code sequence `50 80 3e fd 27 00` before toggling the cheat-state bytes and taking one of two local notification paths.
- Point 8 cheat/input-lane pass is complete. `keyboard_input_cheat_dispatch` (`0007:04dc`) is renamed and has a full scan-code mapping decompiler comment. `cheat_entity_slot_cycle_and_update_sprite` (`000c:8072`) and `cheat_anim_type_cycle_and_refresh` (`000c:81c0`) are named. Three `DS:0x6050` gate helpers (`000c:8221/8227/822b`) are named. All seven cheat event case-handlers in the 000c dispatch function now have labels and disassembly comments (`event_0x141/0x241/0x441_cheat_debug_overlay_toggle`, `event_0x7e_cheat_latch_runtime_toggle`, `event_0x142/0x143_cheat_fullscreen_mode1/0_refresh`, `event_0x410_cheat_flag_604f_toggle`). The cheat-related string table in seg014 is documented (including the dev Easter-egg `"FART ...TRY... -laurie"`). HACK MOVER / Immortality strings confirmed present with no static code xrefs — attributed to USECODE scripting layer. `0x844` (master cheat flag) vs `0x6045` (live cheat latch) separation remains solid.
- User-directed JELYHACK producer tracing is now tightened one layer upstream of `000d:208b` / `000d:21ed`: the immediate stream producer is the embedded mini-VM object created at context `+0x36`. `entity_vm_context_create_from_slot_index` (`000d:46ec`) feeds that object through `entity_vm_context_setup` (`000c:f844`), which uses `entity_vm_stack_init_with_data` (`000c:f6e8`) and `entity_vm_state_copy` (`000c:f772`) semantics to seed or clone `[+0xcc..+0xd2]`. The actual source payload comes from the runtime owner table at `0x6611 -> +0x1315/+0x1317 -> +0x10/+0x12`, addressed as `base + 0x0d*slot + 4`, and the resulting per-slot source is mirrored into `0x39ca`. This still does not expose a direct `JELYHACK`-named producer object, but it strengthens the current reading that `JELYHACK` / `JELYH2` contribute referent identity while neighboring `REE_BOOT` / `SURCAMEW` / `SFXTRIG` descriptors remain better candidates for event-bearing attachments.
- The next USECODE/JELYHACK pass now resolves the immediate owner-object writer too. `entity_vm_runtime_create` (`000d:4c99`) is the only writer of runtime `+0x1315/+0x1317`, via newly recovered `entity_vm_runtime_owner_resource_create` (`000d:7000`), and the companion `entity_vm_runtime_owner_resource_destroy` (`000d:70fd`) releases that helper. The `000d:7000` body does not copy a caller-supplied table directly: it constructs one embedded seg069/070 helper object, queries that helper for the required table size via vtable `+0x04`, allocates child `+0x10/+0x12`, then populates the `0x0d`-stride per-slot producer records through vtable `+0x0c`. Wrapper classification around `entity_vm_context_try_create_masked_for_entity` is tighter too: local wrapper `0004:f033` uses slot mask `0x8000:0007`, `FUN_0004_f05c` uses `0x2000:0015` and is reached from `0004:f2b3` after overlap/proximity and entity byte `+0x32` state checks, and `FUN_0005_27a4` uses `0x0001:0000` from the `000c:a09e` entity `+0x5b` bit-`0x0004` branch. This is enough for a conservative owner/resource classification, but not yet for a source-format-specific or descriptor-specific rename beyond that partial role name.
- The seg091 `000a:44fd/454d/45fe` cluster is now corrected too: these are reentrancy-guarded fatal-report helpers, not a wrapper selector or live event-dispatch lane. In particular, `entity_vm_slot_load_value` (`000d:51fd`) reaches `000a:44fd` via `PUSH 0x410; PUSH DS; PUSH 0x6616; CALLF 000a:44fd`, which now reads as an error/assert report path rather than a gameplay immortality-event producer.
- The `000d:7000` owner/resource helper is one step tighter as well: its embedded seg069/070 object is file-backed rather than abstract. Construction starts with `dos_file_handle_init` (`0009:1c00`), then uses helper vtable slot `+0x04` as the size query for the child `+0x10/+0x12` allocation and helper vtable slot `+0x0c` as the callback that populates the `0x0d`-stride slot table.
- The caller side for that owner/resource path is now anchored too. A new function object at `000d:44df` is recovered and named `entity_vm_runtime_init_from_path_if_configured`: it checks configured global/string `0x65a`, builds a path through seg072 helper `0009:3600` using globals `0x6d6:0x6d8` plus `0x65a`, validates it through `000a:500a`, then calls `entity_vm_runtime_create(0,0,path)` and seeds post-init runtime state at `0x6611`, `0x6615`, and `0x8c7c`.
- Follow-on correlation for the `000d:21ed/22bc` lane is now tighter: the two bytes consumed at `000d:22d2` and `000d:22ee` are confirmed as compact signed shape/count metadata for the link-matrix walk (not descriptor ids), while the streamed words consumed through `000d:2324..237b` and `entity_link` (`0008:7d27`) behave as runtime entity/link ids with `0x0400` list-flag filtering on writeback. The source stream provenance remains slot-indexed owner-table data (`(+0x10/+0x12) + 0x0d*slot + 4`) mirrored to `0x39ca`, so descriptor-family correlation is strengthened only at the ecosystem level (generic active-event neighborhoods) and not as a direct `SURCAM*` or class-id keyed lane.
- Pass 4 dispatcher/xref follow-up is now partially closed: `FUN_000d_ebe3` sequencing is re-verified (`177c -> 1acb -> 0988 -> 22bc -> optional 1d4a -> 2104`) and the opcode selector local is pinned to `[BP-0x32]` inside `000d:0988` with concrete compares against `0x19/0x1a/0x1b` (`000d:099b`, `000d:09a1`, `000d:0a07`, `000d:0a0d`). The direct upstream selector edge into entry `000d:ebe3` remains unresolved in MCP xref output, so no additional wrapper-level opcode number is promoted yet beyond that internal family identity.
- Readable EUSECODE targeting moved one concrete step forward again. Fresh decompile on `entity_vm_context_create_from_slot_index` re-verifies that the slot-backed VM source copied from the runtime owner table (`(+0x10/+0x12) + 0x0d*slot + 4`) is also written into the per-slot mirror row addressed through `0x39ca[context_slot]` during context creation, which tightens the current source model without over-claiming unique ownership of the mirror. Combined with the extracted `jelyhack_*` and `event_*` reports, this now supports a first verified pseudo-script rendering of two descriptor-side forms: `referent anchor + event-bearing attachment` (`JELYHACK` / `JELYH2` beside `REE_BOOT` / `SFXTRIG` / `SURCAMEW`) and `event hub + trigger-side neighbors` (`EVENT` / `COR_BOOT` / `NPCTRIG` inside the `ROLL_NS..VMAIL` island). The direct descriptor-id bridge and the upstream selector into `FUN_000d_ebe3` both remain unresolved.
- The next wrapper-expansion pass is now also complete. The `0005:2867/2918/2ae2/2d30` family and its adjacent `2c06/2c35/2c68/2c9b/2cd2/2d01` siblings now form one verified mask ladder around `entity_vm_context_try_create_masked_for_entity`: `0x0002:0001`, `0x0020:0005`, `0x0004:0002`, `0x0200:0009`, `0x0400:000a`, `0x0800:000b`, `0x0010:0004`, `0x1000:000c`, `0x4000:000e`, and `0x8000:000f`, alongside the earlier seeds `0x0001:0000`, `0x8000:0007`, and `0x2000:0015`. The strongest new caller-side evidence is gameplay-state-oriented rather than descriptor-id-oriented: `0005:2867` feeds `000c` state helpers that store the result into entity field `+0x39`, `0005:2918` is only reached from a `+0x3c == 0x20b` object lane and carries caller fields `+0x36/+0x38` as an extra dword, and `0005:2d30` gates on entity id range, class word bit `0x8000`, class-record `0x7e46` flag bits, class nibble values `4/7/8`, and a `0x0f16` / `0x20f` dispatch-entry emission path before attempting mask `0x8000:000f`. That makes the wrapper family correlate more naturally with active-event ecosystems (`EVENT` / `NPCTRIG` / `_BOOT`) than with a direct `JELYHACK` referent-anchor lookup, while still stopping short of a hard descriptor-class switch.
- First concrete follow-on pass for the adjacent wrapper entries is now in: direct caller anchors are confirmed for `0005:2c06` (`0005:0292`), `0005:2cd2` (`0005:0fee`), `0005:2c9b` (`0005:5946/59e9`), and `0005:2d01` (`0007:814e/822e`). The old `0005:2c68` indirect-dispatch evidence is now rejected: `0007:e521` and `0007:e73c` push caller-local data `DAT_0000_2c68` into the fatal-report helper `000a:44fd`, not into wrapper `0005:2c68`. That leaves both `0005:2c35` and `0005:2c68` genuinely xref-dark and shifts the remaining selector work back to the true upstream edge into `FUN_000d_ebe3`.
- The current VM owner/path lane is now tighter too. Seg072 helper `0009:3600` is not a raw buffer-advance helper; it is a rotating slash-aware path composer that uses five `0x50`-byte temp buffers and inserts `\` only when adjacent path parts need it. In `entity_vm_runtime_init_from_path_if_configured` (`000d:44df`), `0x65a` therefore reads as the configured relative runtime-owner filename/path component, while `0x6d6:0x6d8` reads as the mutable base/resource-root path buffer. The still-xref-dark wrappers `0005:2c35` / `0005:2c68` are narrower too: their signed extra word is forwarded through `entity_vm_context_create_from_slot_index` into `entity_vm_slot_load_value_plus_offset` and stored in context field `+0x34`, so they are offset-specialized mask wrappers rather than plain duplicates. Direct CALL xrefs into `FUN_000d_ebe3` are also now confirmed from `animation_ctor_variant_a/b/c`, but no new wrapper-level opcode number is proven yet beyond internal `0x19/0x1a/0x1b`.
- The owner/resource helper is now tighter at the file-loader level too. The embedded seg070 methods rooted at raw windows `0009:67b6` and `0009:6916` iterate helper-owned tables at object `+0x10/+0x18`, format per-entry paths with the seg001 string helpers, then open/read/close through `0009:1c3a` / `0009:2034` / `0009:1e61`. That pushes `entity_vm_runtime_owner_resource_create` (`000d:7000`) beyond a generic file-backed label: it now looks like an indexed external file-set loader whose vtable `+0x0c` callback materializes the `0x0d`-stride owner records consumed by the VM runtime.
- The `0x39ca` mirror follow-up is narrower now too. The newly checked `0008:709c/70cb`, `0008:7309/7338`, and `0008:85f9/8617` windows only save/restore or allocate the global `0x39ca:0x39cc` base pointer and zero the backing table. No new competing per-slot row writer is verified there; `entity_vm_context_create_from_slot_index` (`000d:46ec`) still remains the only confirmed writer of concrete `0x39ca[slot] = {source_off, source_seg}` mirror rows.
- The descriptor-side tooling is now one step closer to readable USECODE too. `tools/extract_eusecode_flx.py` widens the JELYHACK local window to include the nearby event-bearing `REE_BOOT` / `SURCAMEW` / `SFXTRIG` records and now emits `readable_descriptor_templates.md` / `.tsv`, which render conservative pseudo-script sketches for the current anchor, event-hub, environmental, and callback lanes rather than only raw descriptor indexes.
- The script-facing bridge artifact is now tighter too. `tools/extract_eusecode_flx.py` emits `runtime_descriptor_family_rankings.md` / `.tsv`, which rank descriptor families against the verified VM/runtime lanes: `EVENT` is the strongest active-event payload fit, `_BOOT` cores and `NPCTRIG` remain strong active-event satellites, `SFXTRIG` and the environmental classes stay moderate active-event fits, `JELYHACK` / `JELYH2` now occupy a dedicated referent-anchor/payload-owner lane, and `SURCAMNS` / `SURCAMEW` stay in a weaker callback/attachment lane until callback-specific opcode or mask evidence appears.
- External reference pass completed against ScummVM's Ultima 8 / Crusader engine. New note `docs/scummvm-crusader-reference.md` records concrete Crusader-specific evidence for `usecode/*.cpp`, `convert/crusader/*.h`, FLEX parsing, sound/speech/movie handling, map record layout, Crusader-only shape/typeflag handling, HUD gumps, and startup/world-state differences. Highest immediate payoff is on USECODE parsing: `usecode/usecode_flex.cpp` gives ScummVM's Crusader class-header/event-count interpretation, while `convert_usecode_crusader.h` provides named event ids `0x00..0x1f` and a large intrinsic-signature table that can be cross-checked against current VM/runtime work.
- The first ScummVM-guided USECODE cross-walk batch is now complete. `usecode/usecode_flex.cpp`, `usecode/usecode.cpp`, `usecode/uc_machine.cpp`, and the Crusader/Regret conversion tables now externally anchor the Crusader class/event model (`classid + 2`, names from object `1`, base offset from bytes `8..11` minus `1`, six-byte event records, event-number call translation through opcode `0x11`) and confirm that Remorse uses a `ByteSet(0x1000)` VM with the shared Crusader event-name table but version-specific intrinsic dispatch. New note `docs/usecode-roundtrip-ir.md` records the safe annotation policy plus a reversible IR v0 that preserves raw class/event bytes, intrinsic ordinals, inline-versus-indirect payload distinctions, and opaque-opcode fallback. Headline estimate is intentionally unchanged because this batch tightened interpretation more than direct binary coverage.
- The next binary-side validation step for that ScummVM cross-walk is now complete too. Sampled owner-loaded EUSECODE class records (`EVENT`, `NPCTRIG`, `SURCAMNS`, `JELYHACK`, `REE_BOOT`, `SURCAMEW`, `SFXTRIG`) now confirm the object-1 name-table and `classid + 2` body lookup locally: deriving `object_index = (table_offset - 0x80) / 8`, `class_id = object_index - 2`, and then reading object `1` at `4 + 13 * class_id` yields the expected class names. The same samples confirm a real header dword at bytes `8..11` and a 6-byte event-slot table at `+20` with `u16 unknown + u32 code/payload` structure.
- The current USECODE-header mismatch is now narrowed further and has a conservative working resolution. `uc_machine.cpp` uses ScummVM's decremented `get_class_base_offset()` as the live code-stream base, while the local owner-loaded records still fit bytes `8..11 = first code-byte offset` with 1-based event code offsets. Under that reading the local event-count rule is `(base_offset - 19) / 6`, equivalently `(raw_u32_at_8_11 - 20) / 6`, which matches the validated `32/33/35` slot tables from the `0x00d4/0x00da/0x00e6` headers. The `000d:44df -> 000d:4c99 -> 000d:7000 -> 000d:46ec` runtime path still shows indexed file loading and slot-table consumption but no verified per-class header rewrite, so the mismatch currently looks best explained by a ScummVM interpretation/detail issue rather than a proven owner-loader transform. No safe new event-label-to-runtime correlation was promoted from this pass, and the headline estimate remains unchanged.
- The conservative owner-loaded class rule is now implemented in `tools/extract_eusecode_flx.py` and refreshed on the current EUSECODE sample. New outputs `class_layout_index.tsv` and `class_event_index.tsv` now expose object index, class id, class-name hint, raw bytes `8..11`, derived `code_base_minus_one`, conservative event counts, and raw 6-byte event rows with ScummVM slot-name hints, giving the round-trip work a concrete parser baseline instead of only prose notes.
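
The conservative owner-loaded class rule described above can be sketched in a few lines. This is a minimal illustration, not the code in `tools/extract_eusecode_flx.py`: the function and constant names are hypothetical, little-endian field encoding is assumed, and the `u16` + `u32` split of each event row follows the sampled records rather than a confirmed specification.

```python
import struct

# Constants taken from the notes above; the names themselves are hypothetical.
TABLE_BASE = 0x80         # start of the EUSECODE entry table
TABLE_ENTRY_SIZE = 8      # bytes per table entry
NAME_TABLE_SKIP = 4       # bytes before the first name record in object 1
NAME_RECORD_SIZE = 13     # stride of the object-1 name table
EVENT_TABLE_OFFSET = 20   # first event slot inside a class record
EVENT_SLOT_SIZE = 6       # u16 unknown + u32 code/payload


def class_id_from_table_offset(table_offset):
    """object_index = (table_offset - 0x80) / 8, class_id = object_index - 2."""
    delta = table_offset - TABLE_BASE
    if delta < 0 or delta % TABLE_ENTRY_SIZE:
        raise ValueError(f"not a table entry boundary: {table_offset:#x}")
    return delta // TABLE_ENTRY_SIZE - 2


def name_record_offset(class_id):
    """Offset of the class-name record inside object 1."""
    return NAME_TABLE_SKIP + NAME_RECORD_SIZE * class_id


def event_count_from_header(raw_u32_at_8_11):
    """Local rule: (raw_u32_at_8_11 - 20) / 6."""
    return (raw_u32_at_8_11 - EVENT_TABLE_OFFSET) // EVENT_SLOT_SIZE


def event_rows(class_record):
    """Return (leading_word, payload) for every 6-byte event slot.

    Assumes little-endian fields; the leading word stays uninterpreted.
    """
    (raw,) = struct.unpack_from("<I", class_record, 8)
    return [
        struct.unpack_from(
            "<HI", class_record, EVENT_TABLE_OFFSET + EVENT_SLOT_SIZE * slot
        )
        for slot in range(event_count_from_header(raw))
    ]
```

With the sampled raw header values `0x00d4`, `0x00da`, and `0x00e6`, `event_count_from_header` gives 32, 33, and 35, matching the validated slot tables.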
### Current Focus
1. User-directed USECODE/JELYHACK lane: use the updated parser outputs to recover a small safe set of non-zero slot semantics, then tighten the reversible script IR around the already verified `000d` VM families before returning to deeper producer/dispatcher mapping.
2. Finish Priority 0 refinement by promoting more exact segment rows where notes already support a verified foothold.
3. Continue the Priority 1 pass by tracing the higher-level startup/display callers, branch outcomes, pre-entry object lanes, palette-fade ownership, watch/camera controller ownership, and active sprite/object ownership that stitch the seg137 palette helper family into the wider `0x4588` / dispatch-entry object-role lane.
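
The mask ladder recorded in Last Confirmed State for `entity_vm_context_try_create_masked_for_entity` is easier to cross-check when held as data. The sketch below parses each `word:word` pair as written in these notes and reports which pairs have a left word equal to `1 << right word`. The names are hypothetical, and the check is purely mechanical: it does not establish what either word means.

```python
# Pairs copied from the notes above; list and function names are hypothetical.
LADDER = [
    "0x0002:0001", "0x0020:0005", "0x0004:0002", "0x0200:0009",
    "0x0400:000a", "0x0800:000b", "0x0010:0004", "0x1000:000c",
    "0x4000:000e", "0x8000:000f",
]
EARLIER_SEEDS = ["0x0001:0000", "0x8000:0007", "0x2000:0015"]


def parse_pair(text):
    """Split 'left:right' into two integers, both read as hexadecimal."""
    left, right = text.split(":")
    return int(left, 16), int(right, 16)


def split_by_bit_pattern(entries):
    """Separate pairs where left == 1 << right from pairs where it is not."""
    fits, outliers = [], []
    for text in entries:
        left, right = parse_pair(text)
        (fits if left == 1 << right else outliers).append((left, right))
    return fits, outliers


if __name__ == "__main__":
    fits, outliers = split_by_bit_pattern(LADDER + EARLIER_SEEDS)
    for left, right in outliers:
        print(f"does not fit 1 << n: {left:#06x}:{right:04x}")
```

On the thirteen pairs listed in this tracker, eleven fit that pattern and the two earlier seeds `0x8000:0007` and `0x2000:0015` do not, which may make them worth re-checking at the callsite before the ladder is mapped to gameplay classes.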
### Next Resume Point
1. Continue the user-directed USECODE/JELYHACK follow-on from the recovered producer chain, now with the ScummVM class/event cross-walk in place, especially by:
- validating the new conservative rule against any future main USECODE container sample that becomes available, to decide whether the current mismatch is EUSECODE-specific or whether ScummVM's `get_class_event_count()` arithmetic should be treated as the outlier for Crusader,
- mining the new `class_layout_index.tsv` / `class_event_index.tsv` outputs for repeat non-zero slot patterns before doing more ad hoc byte inspection,
- broadening the local owner-loaded spot-check beyond the first named cluster when convenient, especially across additional `_BOOT`, environmental-event, and callback-eventtrigger classes, while treating the present object-1 / `classid + 2` indexing as the current working model,
- determining whether the owner/resource helper behind `entity_vm_runtime_owner_resource_create` (`000d:7000`) exposes original object indices through its helper `+0x18` table or only slot-local file ids, now that the `+0x10/+0x14/+0x18` table contract is verified,
- re-tracing `0005:2c35` / `0005:2c68` through real caller-role recovery now that they are narrowed to signed slot-offset wrappers, while keeping the disproven `000a:44fd` selector hypothesis retired,
- mapping only the now-verified non-zero low slot ids from sampled classes (`JELYHACK` slot `1`; `EVENT`/`SFXTRIG` slot `10`; `NPCTRIG` slots `10/32`; `REE_BOOT` slots `10/15/16`; `SURCAMNS`/`SURCAMEW` slots `1/10/32/33/34`) onto ScummVM event labels where binary behavior actually matches, and otherwise keeping them as numeric slot annotations,
- separating safe Remorse annotations from Regret-only intrinsic numbering by treating ScummVM intrinsic names as ordinal/signature hints rather than rename authority,
- turning the new conservative parser rule into tooling/tests first, while preserving raw bytes `8..11`, raw 6-byte event rows, and the unresolved leading event word in emitted IR artifacts,
- resolving the remaining selector/opcode path inside `FUN_000d_ebe3` by lifting the write/read path for opcode-local `[BP-0x32]` and any hidden jump/call-table case entry from the now-confirmed `animation_ctor_variant_a/b/c` caller lane,
- hardening the first reversible script IR around preserved class headers, raw event-entry words, intrinsic ordinals, inline-versus-indirect payload forms, and opaque-opcode fallback,
- identifying the concrete seg069/070 helper class behind `entity_vm_runtime_owner_resource_create` (`000d:7000`) now that the helper is narrowed to an indexed external file-set loader around raw windows `0009:67b6` / `0009:6916`,
- tracing the remaining caller roles for the verified ladder entries `0005:2c06/2c35/2c68/2c9b/2cd2/2d01` and the larger `0005:2d30` gate so the slot-mask groups can be mapped to concrete gameplay object/state classes rather than only mask numbers,
- and checking whether any runtime-owner helper besides `entity_vm_context_create_from_slot_index` writes per-slot mirror rows through the far array rooted at `0x39ca`, now that the currently checked `0008:709c/70cb`, `0008:7309/7338`, and `0008:85f9/8617` sites are constrained to base-pointer save/restore or table allocation rather than row writes.
2. Keep classifying the seg126 pre-entry text-renderer lane around `transition_preentry_setup_resources`, `transition_preentry_step_script`, and `transition_preentry_release_resources`, especially by:
- comparing more preset `0x10` / `0x11` text-renderer callsites,
- tracing who owns the rendered buffer loaded into `0x6301:0x6303`,
- mapping the control bytes `0x21` / `0x23` / `0x24` / `0x26` / `0x2a` / `0x40` / `0x5e` to concrete display behavior,
- and deciding whether the paired `0x8c5c` / `0x8c60` lane is a title/body pair, normal/highlight pair, or another fixed UI pairing.
3. Finish the `0x31a2` gate pass as one batch:
- classify the read sites at `0004:c24d`, `000c:ca11`, `000c:e4d8`, `000c:e546`, `000c:e5c6`, `000d:9304`, `000d:b6b1`, and `000d:c0ee`,
- relate them back to interrupt-side updates at `0008:a283` / `0008:a314`,
- and decide whether `0x31a2` is best described as user-acknowledge, queued-input depth, or a broader event-break gate.
4. Tighten the `DS:0x6341` to `0x6828` relationship:
- compare the seg126 `animation_ctor_variant_a` call with the other raw callsites at `0005:3c4f`, `0005:3c74`, `000c:6176`, and `000c:619c`,
- map who owns `g_active_dispatch_entry_farptr[+0x40]`,
- and classify whether seg126 is constructing a transition-local animation payload for the shared active dispatch entry or only toggling an owner-side state bit after setup.
5. Identify which higher-level transition states own the seg127 fade-controller inputs at `0x630a-0x6316` and how that fade state is chosen from the seg005/seg126 startup path.
6. Repair the still-oversized overlap rooted at `000c:db68` only if it blocks follow-on analysis or decompiler visibility in the same transition lane.
7. Clarify the relationship between the seg049 watch/camera controller at `0x2bd8`, the seg108 sprite/object lane at `0x4f38`, and the object validated through `FUN_0004_60c0` vtable slot `+0x0c`.
8. Continue caller-role classification inside `entity_cleanup_resources_and_dispatch` (contains both `000d:9d5e` and `000d:a3b7`) and map how it relates to `FUN_000d_938c`, `FUN_0004_60c0`, `FUN_000c_7412`, `transition_preentry_release_resources`, and the seg136/seg137 active-dispatch helper family.
8. ~~Cheat/input side lane~~ **COMPLETED** this pass. All point-8 sub-items are now resolved:
- `keyboard_input_cheat_dispatch` renamed; full scan-code table documented in decompiler comment.
- `cheat_entity_slot_cycle_and_update_sprite` and `cheat_anim_type_cycle_and_refresh` named.
- `DS:0x287b` / `DS:0x2892` success-path presentation: confirmed as opaque near-code discriminator values stored at `+0x49` in the display notification object; the cheat-on/off display is built via `display_null_check_dispatch` + `sprite_node_get_or_traverse`.
- All seven cheat event case-handlers in the 000c dispatch function labeled and commented.
- `0x844` (master) vs `0x6045` (live latch) separation confirmed solid; `0x604b` / `0x604f` / `0x6050` also documented.
- HACK MOVER: no static code xrefs; attributed to USECODE scripting layer. Cheat string table in 000e fully documented.
- Remaining open: exact user-facing identity of events `0x141/0x241/0x441` overlays (strings suggest targeting-reticle / CD-transfer-display), exact DS:0x6087/6091 notification objects, and any further depth on the `0x4f38` / `0x2bd8` vtable path taken by the overlay events.
- The Immortality cheat mechanics are now fully traced at the C level: event `0x410` toggles `DS:0x604f`; the sole read site is `player_receive_damage_and_dispatch_effects` (`0004:c055`) at `0004:c205`, which divides all incoming 32-bit damage by `0x40000` (262,144) when the flag is set, making HP loss negligible while the hit-stagger animation still plays. No static C keyboard dispatch generates event `0x410` — confirmed USECODE/ASYLUM scripting layer only. `DS:0x60d2` / `DS:0x60ee` are the "Immortality enabled." / "Immortality disabled." notification pointers. A parallel handler at `000b:b62c` sets the associated USECODE process state to `0xe` when the event arrives.
- `tools/extract_eusecode_flx.py` now parses the validated full EUSECODE table (`count @ 0x54`, table @ `0x80`) rather than the old heuristic header scan. Current run extracts all `403` non-zero entries and emits a searchable `entry_index.tsv` with `primary_label` and `field_names` summaries.
- The extractor now also emits `descriptor_index.tsv` and `descriptor_neighborhoods.tsv`, which summarize per-class field-tag patterns and the local neighborhoods around trigger/event-related classes.
- Current EUSECODE split is now clearer: the `000e` parser lane plausibly covers text-heavy records like `DATALINK` and `TEXTFIL1`, while the binary descriptor lane exposes object classes such as `EVENT`, `NPCTRIG`, `CRUZTRIG`, `TRIGPAD`, `SPECIAL`, `SURCAMNS`, `SURCAMEW`, `JELYHACK`, and `JELYH2`.
- The descriptor lane now has a real structural foothold too: field-name strings are preceded by short tagged metadata records (`69 xx 00 <name>`, `24 xx 02 <name>`, etc.) in multiple classes. This looks like compact field-definition encoding rather than arbitrary string spill.
- That tag grammar is now useful enough to search semantically: `69:0A00 -> event` is stable across `EVENT`, `NPCTRIG`, `SFXTRIG`, and several `*_BOOT` classes, while `24:0A02 -> eventTrigger` shows up in `SURCAMNS` / `SURCAMEW`.
- Immortality-specific follow-on is now narrowed but not closed: `JELYHACK` and `JELYH2` are confirmed as real referent-only EUSECODE descriptors; `NPCTRIG` is confirmed as an event-capable trigger descriptor; `CRUZTRIG` / `TRIGPAD` expose `referent,item,elev`; but no extracted record has yet been tied directly to binary event value `0x410`.
- The clustering pass tightened the local candidate set around `JELYHACK`: the immediate neighborhood now includes `SPECIAL`, `TRIGPAD`, `DATALINK`, `HOFFMAN`, `REE_BOOT`, `SURCAMEW`, and `SFXTRIG`, which is a plausible map/object island rather than random sparse table order.
- The strongest `record_table_parse_buffer` caller evidence (`000e:1b9f..1d49`) now appears to belong to the animation-object field lane, because the surrounding setup manipulates the already-mapped animation fields at `+0x117/+0x11b/+0x11f/+0x123` and `+0xeaf/+0xeb1`. That weakens the earlier assumption that `000e:3639` is the primary EUSECODE loader and shifts the likely binary-descriptor consumer search back toward the `000d` VM/object path.
- The first concrete `000c` to `000d` bridge in that direction is now visible at `entity_vm_set_value_from_slot_plus_offset` (`000c:f95f`): it calls `entity_vm_slot_load_value_plus_offset` (`000d:5572`) and stores the return pair into object fields `+0xd6/+0xd8`. Supporting slot helpers in the same lane are now named too (`entity_vm_slot_find_or_select`, `entity_vm_slot_decrement_use_count`, `entity_vm_slot_release_value`). The previously noted `000d:51fd` `PUSH 0x410` site is now reclassified as a fatal-report call into `000a:44fd` with `DS:6616`, so it no longer supports a direct compiled-code immortality-event bridge.
- The adjacent `000d:45xx..4exx` island is now promoted out of `FUN_*` placeholders as one coherent VM runtime/context family. Newly named helpers include `entity_vm_runtime_create` / `entity_vm_runtime_init_slots` / `entity_vm_runtime_release_slots` / `entity_vm_runtime_destroy`, `entity_vm_slot_index_from_entity`, `entity_vm_context_try_create_masked_for_entity`, `entity_vm_context_create_from_slot_index`, `entity_vm_context_sync_global_value_and_dispatch`, and the context save/load/destroy helpers. The runtime global at `0x6611` now reads as a real owner for this lane rather than an opaque far pointer.
- Two large caller bodies at `000d:208b` and `000d:21ed` now stand out as concrete context-construction sites: both feed per-object stream/data state from `+0xcc/+0xce` into `entity_vm_context_create_from_slot_index`, then continue by reading from the seeded `+0xd6/+0xd8` bytecode/value lane. This is the clearest current evidence that the `000d` interpreter/object family, not the `000e` text parser, is the near-runtime consumer to keep following for the immortality trigger.
- A second supporting lane is now named too: `entity_vm_referent_registry_init` / `destroy` / `alloc` / `release_by_id` / `free_node` show that `0x8c8c/0x8c8e/0x8c90/0x8c94` form a free-list-backed referent registry. `entity_vm_set_field_da_to_global` writes `0x8c94` from the context `+0xda` lane before entering the still-misaligned `000c:3350` body, which is the first concrete runtime mechanism explaining how referent-only descriptors such as `JELYHACK` can still participate in script state.
- That referent-registry lane is now better structured too: `entity_vm_referent_chain_copy`, `entity_vm_referent_chain_append_unique_from`, `entity_vm_referent_chain_remove_matching_from`, `entity_vm_referent_chain_contains_entry`, `entity_vm_referent_chain_get_entry_data_at`, `entity_vm_referent_chain_set_entry_data_at`, and `entity_vm_referent_chain_get_indirect_data` show that the runtime can build, subtract, and mutate payload chains hanging off one referent anchor. This is the first runtime shape that looks directly useful for a future human-readable / modifiable script IR.
- `entity_vm_opcode_finish` (`000d:3350`) is now identified as the shared opcode epilogue for this family rather than an opaque helper: it writes `0x8c94` from frame-local state, unwinds the temporary slot-array state at `0x659c/0x659e` when present, and returns the current opcode result.
- The runtime/context half of that lane is now named too. The `0x6611` global is managed by `entity_vm_runtime_create` / `entity_vm_runtime_init_slots` / `entity_vm_runtime_release_slots` / `entity_vm_runtime_destroy`, while `entity_vm_slot_index_from_entity`, `entity_vm_context_try_create_masked_for_entity`, and `entity_vm_context_create_from_slot_index` now show how gameplay entities are tested against one owner-side slot-mask table before a `0x6714` VM context is created.
- That context family is no longer anonymous either: `entity_vm_context_sync_global_value_and_dispatch`, `entity_vm_context_save`, `entity_vm_context_load`, `entity_vm_context_destroy`, and `entity_vm_context_free_buffer` now pin down the lifecycle around the same `+0xd6/+0xd8`, `+0x102`, `+0x10c/+0x10e`, and `+0x11b/+0x11d` fields.
- Current best near-runtime callsites for further immortality work are the large `000d:208b` and `000d:21ed` bodies, which both build one VM context from caller stream/data state and then continue by consuming bytes from the seeded context value lane.
- The first opcode family under that lane is also less anonymous now: `000d:0988` can either append unique payload entries or remove matching ones depending on the opcode id (`0x1a/0x1b` taking the removal path), and both branches return through `entity_vm_opcode_finish`.
- That opcode family is now classified one step further: `0x19` = append-unique indirect/string-like payloads, `0x1a` = remove-matching indirect/string-like payloads, `0x1b` = remove-matching inline payloads, and the same helper body strongly implies `0x18` as the missing append-unique inline sibling.
- The first stable `+0xd6/+0xd8` byte-lane semantics are now visible in the two large caller bodies too. The `000d:208b` block is a simple materialize-or-forward path after `entity_vm_context_create_from_slot_index`, while `000d:21ed` copies a caller-owned inline blob into the context `+0x102` buffer and then consumes two stream bytes as compact shape/count metadata before building an `entity_link` closure matrix from the following caller-stream words.
- EUSECODE readability moved one concrete step forward in this pass: decompile output now supports a first verified IR vocabulary for the same lane — `APPEND_UNIQUE_INLINE` (implied `0x18` sibling), `APPEND_UNIQUE_INDIRECT` (`0x19`), `REMOVE_MATCHING_INDIRECT` (`0x1a`), `REMOVE_MATCHING_INLINE` (`0x1b`), `MATERIALIZE_OR_FORWARD_VALUE` (`000d:208b`), `PREPEND_INLINE_PAYLOAD` (`000d:21ed`), and `BUILD_ENTITY_LINK_MATRIX` (`000d:22bc` with `entity_link` at `0008:7d27`). The `000d:22bc` tail also confirms a pushback filter where non-`0x0400` results are written back to the caller stream before `entity_vm_opcode_finish`.
- Current best JELYHACK reading is tighter than before: the extracted chunks still only expose `referent`, but the new referent-registry work means that does not relegate them to inert map labels. The most defensible present model is `JELYHACK/JELYH2 = referent anchors`, with the actual immortality/event behavior carried by neighboring event-capable descriptors in the same local island (`REE_BOOT`, `SURCAMEW`, `SFXTRIG`, or a nearby generic event/trigger record).
- That readability step now has a first concrete artifact: `tools/extract_eusecode_flx.py` emits `referent_anchor_event_graph.tsv` plus a focused `jelyhack_island_graph.md`, which turns the local table neighborhood into a first readable anchor-to-event view instead of only raw descriptor rows.
- The extractor now also emits `jelyhack_descriptor_compare.tsv`, and its first result is useful: `JELYHACK` and `JELYH2` have identical first 16 header words as referent-only sibling descriptors, while `REE_BOOT`, `SURCAMEW`, and `SFXTRIG` show materially richer header/state patterns consistent with the event-bearing side of the island.
- Latest opcode-side refinement: `entity_vm_opcode_finish` (`000d:3350`) is now the shared epilogue for the chain-mutating handlers, while `entity_vm_referent_chain_remove_matching_from` (`000d:6a9a`) and `entity_vm_referent_chain_set_entry_data_at` (`000d:6cf6`) show that the VM can subtract and rewrite payload chains in place, not just append/copy them.
- The `000d:21ed` follow-on is now better anchored semantically too: its nested callee `0008:7d27` is `entity_link`, so the `22bc..2433` block is building a bidirectional entity-link closure matrix from streamed entity ids rather than only emitting an opaque table. A conservative disassembly comment is now in place at `000d:22bc`; rename deferred until the bad outer function split is repaired.
- The extractor work now scales beyond the JELYHACK case: reusable focused-report helpers emit both `jelyhack_*` and `event_*` cluster artifacts, and the first new result is strong. The `EVENT` island (`ROLL_NS`, `COR_BOOT`, `EVENT`, `NPCTRIG`, `CRUZTRIG`, `NPC_ONLY`, `VMAIL`) contains a compact three-node event-bearing core (`COR_BOOT`, `EVENT`, `NPCTRIG`) surrounded by referent/link/text satellites.
- That second island materially improves the EUSECODE model: instead of one special-case `JELYHACK` anchor plus neighbors, we now have a broader pattern of `event-bearing core embedded in referent-neighbor island`, with `EVENT` acting as a large hub descriptor (`source/dest/door/link/time/counter/post1/post2/floor/flicMan`) and `ROLL_NS` / `CRUZTRIG` / `NPC_ONLY` / `VMAIL` reading as attached state or trigger-side records rather than peer event hubs.
- The descriptor-side taxonomy is now wider too: `event_family_index.tsv` / `event_family_summary.md` classify all current event-tagged descriptors into reusable families. The active `69:0A00 -> event` lane now breaks cleanly into one `EVENT` hub, five `_BOOT` event cores, one NPC trigger core, one minimal event core (`SFXTRIG`), and three environmental event classes (`FLAMEBOX`, `NOSTRIL`, `STEAMBOX`), while the surveillance pair `SURCAMNS` / `SURCAMEW` is now cleanly separated as `callback-eventtrigger` rather than generic event-bearing descriptors.
- The `_BOOT` family is now better constrained too. `boot_family_compare.tsv` shows that `AND_BOOT`, `BRO_BOOT`, `COR_BOOT`, `VAR_BOOT`, and `REE_BOOT` all share one common header/template shape, so the family now reads as repeated instantiations of the same event-core descriptor rather than structurally different boot subclasses.
- The best remaining `_BOOT` frontier is now explicit in extractor output as well: `boot_frontier_graph.md` shows `AND_BOOT` / `BRO_BOOT` embedded in a compact referent-heavy neighborhood (`OFFWORK`, `GUARD`, `GDOOR_*`, `BIGCAN`, `CRUMORPH`, `GUARDSQ`, `CARD_*`, wall variants), which is the cleanest unresolved object-side context for the boot-event template.
- The environmental event lane is now promoted out of a generic family label into a clearer structural pattern. `environmental_family_compare.tsv` shows `FLAMEBOX` and `STEAMBOX` as close hazard-event siblings with the same active-event backbone plus direction/count, while `NOSTRIL` is the smaller fire-specific variant that keeps the dual-hazard references and counters but drops the direction/newType side.
- The callback-trigger lane is also more defensible now: `callback_trigger_compare.tsv` confirms that `SURCAMNS` and `SURCAMEW` are effectively one shared callback template, differing only in one `therma` slot tag offset. That keeps the active `event` lane and callback `eventTrigger` lane separated by more than just naming convention.
- Runtime follow-through has resumed too: `000d:ebe3` is now backed by direct instruction evidence as one ordered VM/opcode driver body that calls `000d:177c`, `000d:1acb`, `000d:0988`, internal block `000d:22bc`, then `000d:1d4a` and `000d:2104` in sequence. `000d:ec31` is confirmed as only the internal `CALL 000d:22bc` site inside that body, so the inner block is still not a safe standalone rename target.
- Payload-shape reuse inside that same `FUN_000d_ebe3` sequencer is now partially classified: `000d:177c` behaves as a word-literal stream push, `000d:1acb` consumes one streamed dword pair and pushes a boolean word, `000d:21ed/22bc` remains the signed-byte metadata plus word-id matrix lane, `000d:1d4a` is still a boundary-suspect trap island, and `000d:2104` is a mixed scalar/handle out-pointer finalizer. This is now documented as a compact opcode-to-payload-shape matrix in docs.
- `entity_vm_context_try_create_masked_for_entity` (`000d:463a`) is now pinned down one step further: it first checks the runtime-disable byte at `0x6610`, computes the entity slot, tests the owner-side slot mask in the runtime owner table, and only then creates a context. On success it reports either an immediate result (success with cleared output word) or an object-backed result (success with the created object's low word), which is the clearest current typed boundary between gameplay entities and VM-backed object results.
- The immediate owner-object writer is now identified too. `entity_vm_runtime_create` (`000d:4c99`) stores the only verified runtime `+0x1315/+0x1317` value by calling the newly recovered `entity_vm_runtime_owner_resource_create` (`000d:7000`), whose helper-managed body allocates child `+0x10/+0x12` from a vtable `+0x04` size query and fills the `0x0d`-stride slot table through vtable `+0x0c`. The paired release path is `entity_vm_runtime_owner_resource_destroy` (`000d:70fd`).
- The first wrapper-side mask families are now anchored by direct instruction evidence as well: local wrapper `0004:f033` passes `0x8000:0007`, `FUN_0004_f05c` passes `0x2000:0015` from the `0004:f2b3` overlap/proximity branch with entity byte `+0x32` state toggling, and `FUN_0005_27a4` passes `0x0001:0000` from the `000c:a09e` entity `+0x5b` bit-`0x0004` branch. This is enough to distinguish at least three gameplay-side mask lanes without yet claiming descriptor-specific ownership such as `JELYHACK` versus `REE_BOOT`.
- One exact `0x410` collision that could have reopened the wrong lane is now ruled out: `000e:0953` pushes literal `0x410` into imported `ASYLUM.27` from the animation/audio path after setting the `+0xef1` audio-completion byte. Because `ASYLUM.DLL` is the `ASS_*` audio/media library, this is not evidence for a second gameplay or USECODE event source; the other previously suspected compiled-code bridge at `000d:51fd` is now ruled out too because that site calls the seg091 fatal-report helper `000a:44fd` with `DS:6616`, not gameplay dispatch.
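
The chain-mutating opcode family classified in the bullets above (`0x19`, `0x1a`, `0x1b`, plus the implied `0x18` sibling) can be modelled as a small table plus two list operations. This is a minimal sketch for IR discussion only: the function name `apply_chain_opcode` and the list-based chain model are hypothetical, and `0x18` is included only because the notes imply it, not because it has been observed.

```python
# Opcode ids and IR names as listed in the notes; the tuple layout is illustrative.
CHAIN_OPCODES = {
    0x18: ("APPEND_UNIQUE_INLINE", "append", "inline"),      # implied sibling, not yet observed
    0x19: ("APPEND_UNIQUE_INDIRECT", "append", "indirect"),
    0x1A: ("REMOVE_MATCHING_INDIRECT", "remove", "indirect"),
    0x1B: ("REMOVE_MATCHING_INLINE", "remove", "inline"),
}


def apply_chain_opcode(opcode, chain, payload):
    """Return (ir_name, new_chain) for one chain-mutating opcode (hypothetical model)."""
    name, action, _kind = CHAIN_OPCODES[opcode]
    if action == "append":
        # Append-unique: keep existing order, add only entries not already present.
        result = list(chain)
        for entry in payload:
            if entry not in result:
                result.append(entry)
        return name, result
    # Remove-matching: drop every chain entry that also appears in the payload.
    return name, [entry for entry in chain if entry not in payload]
```

The inline/indirect distinction is carried in the table but not acted on here, because the notes do not yet pin down how indirect payloads are dereferenced.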
9. Revisit `allocator_phase_finalize_pass` only where it intersects the same callback object semantics, rather than broad allocator mechanics that are already sufficiently constrained.
10. ~~Continue `ASYLUM.24`~~ Resolved as `_ASS_StopAllSFX` (see Priority 2); the remaining cheap wins are in the `0x4588` / dispatch-entry lane and the `0004:1e00` transition path.
11. User-directed USECODE/JELYHACK side lane (next actionable IR step): map the new sequencer-local payload-shape matrix to concrete opcode numbers by recovering the upstream opcode dispatcher lane that selects `FUN_000d_ebe3`, then test whether those opcode numbers correlate better with active-event families (`EVENT`/`NPCTRIG`/`*_BOOT`/`SFXTRIG`) than with callback-trigger (`SURCAM*`) descriptors.
12. Use the new ScummVM reference note as a focused cross-check batch before deeper parser or VM work:
- compare local USECODE/EUSECODE container assumptions against ScummVM's Crusader `UsecodeFlex` class-header parsing (`classid + 2`, class-name table at object `1`, base offset from bytes `8..11`, event-count formula `(base + 19) / 6`),
- import the conservative Crusader event-name table from `convert_usecode_crusader.h` (`look/use/anim/cachein/hit/gotHit/hatch/schedule/release/equip/unequip/combine/calledFromAnim/enterFastArea/leaveFastArea/avatarStoleSomething/animGetHit/unhatch`) into the current USECODE annotation workflow where they match verified behavior,
- compare current weapon/ammo and item-family reads against ScummVM's `WeaponInfo`, `ShapeInfo`, and `ItemFactory` structures so quality/ammo/clip semantics are kept aligned with evidence,
- and prioritize local parsers or validators for the ScummVM-loaded Crusader data files that are still weakly covered here: `dtable.flx`, `damage.flx`, `glob.flx`, `wpnovlay.dat`, `sound.flx`, and per-shape speech FLEX archives.
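
The class-header cross-check in item 12 can be sketched as follows. This is a minimal sketch under stated assumptions: it reads the base offset as a little-endian 32-bit value from bytes `8..11` of a class object and applies the `(base + 19) / 6` formula quoted above with integer division. The helper names are hypothetical, and any adjustment ScummVM applies to the raw base value is not modelled here and should be checked against its source before relying on the result.

```python
import struct


def class_object_index(classid):
    """FLEX object index for a class, per the `classid + 2` rule in the notes."""
    return classid + 2


def crusader_class_header(class_object):
    """Return (base_offset, event_count) for one raw class object (hypothetical helper)."""
    if len(class_object) < 12:
        raise ValueError("class object too short to hold bytes 8..11")
    # Assumption: bytes 8..11 are a little-endian unsigned 32-bit base offset.
    (base,) = struct.unpack_from("<I", class_object, 8)
    # Event-count formula as quoted in the notes, using integer division.
    return base, (base + 19) // 6
```

Comparing the event count from this formula with the number of event slots the local extractor finds per class would be one cheap way to validate the container assumptions.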
### Headline Estimate
- Overall useful decompilation progress: about 37%
- Reasonable uncertainty band: about 31% to 40%
This is the best single-number estimate for the full game right now.
### Supporting Metrics
| Metric | Estimate | Meaning |
|---|---:|---|
| Top 100 far-call target coverage | about 80% | Roughly 80 of the top 100 most-called far-call targets have been named or materially classified |
| Whole-program behavioral coverage | about 37% | Verified subsystem and function understanding across the executable |
| Segment spread with meaningful analysis | about 20% to 26% | Segments with more than a trivial foothold or isolated note |
| Tooling maturity for continued work | about 75% | Core repair, lookup, and fallback automation needed for continued progress |
### Why These Numbers Differ
- The hot-target metric is much higher because the project has already focused on the most shared and most-called helpers.
- The whole-program metric is lower because most of the 145 NE segments still have not had systematic coverage passes.
- The segment-spread metric is lower still because only a subset of segments have coherent subsystem-level treatment.
## What Is Already In Place
### Workflow and Tooling
- Raw full-EXE Ghidra target is established and in active use.
- Verified raw-import mapping exists for seg001 and seg021.
- NE relocation parsing has been implemented.
- Internal literal far-call fixups have been applied to the raw import.
- PyGhidra fallback tooling exists for create/delete function work and batch scripted edits.
- Conservative boundary-repair workflow already exists and has been used successfully.
- Notes are detailed enough to support a formal executable-wide tracker.
### Objective Milestones Already Reached
- 145 NE segments identified from the internal NE header.
- 8851 internal literal CALLF sites patched to real targets in the raw import.
- 2841 non-CALLF far-pointer relocations identified and deferred.
- 119 import callsites annotated.
- Top 100 far-call target list processed through five tiers, with about 80 named or materially classified.
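
For reference, the segment enumeration behind the 145-segment figure can be reproduced with a short NE segment-table walk. This is a sketch, not the project's actual tooling: it assumes the offset of the internal NE header (`ne_off`) has already been located, and it uses the standard NE layout (segment count at `+0x1C`, segment-table offset at `+0x22`, alignment shift at `+0x32`, 8-byte table entries). The function name and output keys are illustrative.

```python
import struct


def parse_ne_segment_table(image, ne_off):
    """Return one dict per NE segment; ne_off is the caller-supplied NE header offset."""
    if image[ne_off:ne_off + 2] != b"NE":
        raise ValueError("no NE signature at the given offset")
    (seg_count,) = struct.unpack_from("<H", image, ne_off + 0x1C)
    (seg_table_rel,) = struct.unpack_from("<H", image, ne_off + 0x22)
    (align_shift,) = struct.unpack_from("<H", image, ne_off + 0x32)
    align_shift = align_shift or 9  # a zero shift count means the default of 9
    segments = []
    for index in range(seg_count):
        entry_off = ne_off + seg_table_rel + index * 8
        sector, length, flags, _min_alloc = struct.unpack_from("<4H", image, entry_off)
        segments.append({
            "segment": index + 1,                       # NE segments are 1-based
            "type": "data" if flags & 0x0001 else "code",
            "file_offset": sector << align_shift,
            "length": length or 0x10000,                # zero length means 64 KiB
            "has_relocs": bool(flags & 0x0100),
        })
    return segments
```

The `has_relocs` flag marks the segments whose trailing relocation records feed the CALLF and far-pointer counts listed above.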
## Strongly Advanced Areas
### Core Gameplay and Entity Work
- seg001 gameplay, cursor, entity lifecycle, projectile, combat, and AI footholds are strong.
- A verified seg001 raw-port path is working and already used for multiple projectile helpers.
- Entity table, class-table, and several global gameplay fields are partially mapped.
### Timer, Event, and State Systems
- seg021 timer and event-dispatch work has meaningful coverage.
- 000c state-dispatch, cursor-nav, UI-listbox, palette-fade, and mini-VM clusters have footholds.
### Rendering and Camera
- 0007 rendering, draw-list, tile-visibility, and camera work has strong structural coverage.
- `world_to_screen_coords` and adjacent geometric helpers are understood well enough to support further caller analysis.
### Dispatch and Pair-Sync Helpers
- 0008 dispatch-entry helper families have multiple verified rename batches.
- Pair-sync and target-state helper clusters are no longer isolated unknowns.
### Cache, Tracked Handles, and Bucket Logic
- 000a cache manager layer is structurally mapped.
- 000a tracked-handle table is structurally mapped.
- 000d tracked bucket / proximity / visibility bucket logic has several meaningful behavioral names.
- The client/cache distinction is much clearer than before.
### Parser and Animation Framework
- 000e parser cluster has a stable set of verified names.
- 000e animation framework has a real foothold: chunk lookup, audio load, tick, frame advance, and constructor variants are partly mapped.
### Local Repair Successes
- seg043 overlap repair succeeded and recovered multiple valid function objects.
- seg091 boundary recovery succeeded and exposed RNG helpers plus local init/context helpers.
- Recent seg004 reset-path recovery and cache-reset follow-up added a new high-value analysis cluster.
## What Still Blocks Broader Coverage
### High-Value Classification Gaps
- The object rooted at `0x4588` is still not classified well enough to safely rename the callback object itself beyond the current allocator-side glue names.
- `ASYLUM.24` is no longer a classification gap: it is resolved as `_ASS_StopAllSFX` (see Priority 2).
- Some structural names in the cache/backend/finalize cluster are waiting on object-role confirmation.
### Boundary and Decompiler Gaps
- Some high-caller targets still require conservative boundary repair or follow-up validation.
- Certain functions still decompile poorly because of overlaps, thunk-heavy paths, or unresolved downstream targets.
- `000e:ffb0` remains a notable animation/video-side blocker because of overlapping instructions.
### Coverage Management Gap
- A first-pass normalized segment-by-segment coverage ledger now exists for all 145 NE segments.
- The remaining gap is refinement rather than absence: most segments still need manual promotion from `None` to `Foothold` / `Partial` / `Deep` as coverage expands.
### Deferred Data Work
- Non-CALLF far-pointer relocations still exist and will matter for deeper object/table recovery.
- They are no longer the main blocker, but they remain a real second-pass problem.
## Current Best Assessment Of Remaining Work
The project has solved most of the architectural uncertainty needed to keep going efficiently.
The remaining effort is mainly a scaling problem:
- expand coverage across many more segments,
- remove the last high-value boundary blockers,
- convert structural names into subsystem names when evidence is strong enough,
- and normalize progress tracking so the whole program can be managed deliberately.
In practical terms, this looks like a true mid-project state rather than an early exploratory state or a late polish state.
## Implementation Priorities
### Priority 0: Coverage Ledger
First pass completed: an executable-wide coverage ledger now exists for all 145 NE segments in `crusader_segment_coverage_ledger.csv`.
Next work under Priority 0:
1. Promote additional segments from `None` where notes already support a verified foothold.
2. Normalize raw-address subsystem islands (notably the `000e:` parser/animation cluster) back onto exact NE segment rows.
3. Keep the ledger updated together with `crusader_decompilation_notes.md` after each verified batch.
Minimum columns:
| Column | Meaning |
|---|---|
| Segment | NE segment number |
| Type | Code or data |
| File offset | From the NE segment table |
| Length | Segment length |
| Coverage status | None, Foothold, Partial, Deep |
| Known subsystem | Best current classification |
| Key named functions | Short summary only |
| Blockers | Boundary, import, thunk, overlap, unknown object, etc. |
| Notes source | Notes section or evidence anchor |
This remains the most important tracking artifact because it makes the percentage estimates maintainable.
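
A minimal sketch of how the ledger could feed the segment-spread estimate is shown below. It assumes the CSV header row uses the column names from the table above (`Segment`, `Coverage status`) and that "meaningful analysis" means `Partial` or `Deep`; both are assumptions about `crusader_segment_coverage_ledger.csv`, not verified properties of the file, and the function names are illustrative.

```python
import csv
import io
from collections import Counter

# Assumption: Foothold rows do not count toward the segment-spread metric.
MEANINGFUL = {"partial", "deep"}


def load_ledger(text):
    """Parse ledger CSV text into a list of row dicts keyed by header name."""
    return list(csv.DictReader(io.StringIO(text)))


def summarize_ledger(rows):
    """Return (status_counts, percent_of_segments_with_meaningful_analysis)."""
    counts = Counter(row["Coverage status"].strip().lower() for row in rows)
    total = sum(counts.values())
    meaningful = sum(counts[status] for status in MEANINGFUL)
    spread = 100.0 * meaningful / total if total else 0.0
    return counts, spread
```

Running this after each verified batch would keep the dashboard numbers tied to ledger rows rather than to memory.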
### Priority 1: Finish The New Cache/Backend Cluster
Work the newest verified reset-path cluster to closure:
1. Trace more callers of `0009:b06b`.
2. Trace more callers of `FUN_0009_a961`.
3. Classify the object rooted at `0x4588`.
4. Revisit `allocator_phase_finalize_pass` once the object role is clearer.
This is currently the best next analysis target because it closes a live cluster that already has fresh verified work around it.
### Priority 2: `ASYLUM.24` Resolved
`ASYLUM.DLL` was imported as a separate NE program in Ghidra and its export table is now verified as an `ASS_*` audio DLL, not the immortality/USECODE interpreter lane.
Resolved result:
- `ASYLUM.24` = `_ASS_StopAllSFX` at `1018:0681`
- `runtime_cache_reset_sequence` therefore performs an audio stop before the cache/tracked-handle reset work
- this import is not evidence for the immortality cheat path; the `0x410` toggle remains attributed to the interpreted `EUSECODE.FLX` lane rather than `ASYLUM.DLL`
### Priority 3: Continue Small-Batch Boundary Repair
Use the existing conservative repair approach for remaining high-value blockers.
Good candidates include:
- unresolved high-caller function objects,
- ranges that still steal bytes from adjacent real bodies,
- and overlaps that block decompilation of already-active subsystems.
### Priority 4: Finish Partial Subsystem Islands Before Expanding Broadly
Recommended order:
1. seg043 plus connected seg004 reset and dispatch paths
2. 000e animation/video overlap at `000e:ffb0`
3. 000c UI-listbox, mini-VM, and cursor-nav families
4. Remaining structural 0007 and 0008 helper cohorts
The goal is to reduce the number of half-understood islands before starting broad segment sweeps.
### Priority 5: Broaden Coverage Across The Remaining Executable
With the first-pass ledger in place, broaden analysis segment by segment once the current hot cluster is closed.
Preferred method:
1. Group segments by adjacency and call relationships.
2. Identify entry points and hot callees first.
3. Classify globals and tables next.
4. Promote helper names only when supported by strong evidence.
## Recommended Tracking Model
Use these status values for segment coverage:
| Status | Meaning |
|---|---|
| None | No meaningful verified analysis yet |
| Foothold | One or two verified entry points or helper names, but no subsystem picture |
| Partial | Several verified names plus some globals/tables or object fields |
| Deep | Coherent subsystem-level understanding with multiple verified related functions |
Use these status values for subsystem maturity:
| Status | Meaning |
|---|---|
| Unknown | Not enough evidence to classify |
| Structural | Behavior is partly mapped but still generic |
| Behavioral | Confident subsystem role is known |
| Stable | Multiple connected functions and data objects support the classification |
## Suggested Immediate Work Queue
### Queue A: Highest Leverage
1. Expand the first-pass segment coverage ledger beyond the currently seeded segments.
2. Trace `allocator_try_alloc_from_head_table`, `allocator_head_finalize_sweep`, and `allocator_phase_finalize_pass`.
3. ~~Identify `ASYLUM.24`.~~ Done: resolved as `_ASS_StopAllSFX` (see Priority 2).
### Queue B: Repair And Stabilize
1. Review remaining high-caller gap functions.
2. Repair any still-blocking overlaps in small batches.
3. Re-decompile repaired ranges and keep only evidence-backed names.
### Queue C: Broaden Carefully
1. Expand into adjacent segments connected to already-understood clusters.
2. Avoid speculative naming.
3. Update the notes and the coverage ledger together after each verified batch.
## Concrete Progress Interpretation
If a single number is needed, use 37%.
If a more honest dashboard is acceptable, use all three:
- 80% of top-100 hot targets processed
- 37% overall behavioral decompilation progress
- 20% to 26% segment spread with meaningful analysis
That combination best reflects the actual state of the project.
## Source Anchors
Primary sources for this file:
- `crusader_segment_coverage_ledger.csv`
- `crusader_decompilation_notes.md`
- `crusader_ne_segments.csv`
- `tier4_output.txt`
- `tier5_output.txt`
- repo memory progress summary
## Next Update Rule
Update this file when one of the following happens:
- the overall estimate changes materially,
- a new subsystem reaches behavioral or stable status,
- a major blocker such as `0x4588` or `allocator_phase_finalize_pass` is resolved,
- or the segment coverage ledger becomes the new primary progress source.