Crusader_Decomp/plan-mid.md
MaddoScientisto 3daffbf113 Add extractor for Crusader's EUSECODE.FLX container
- Implemented a Python script to extract data from the EUSECODE.FLX file format.
- Defined data structures for candidate entries and extracted chunks using dataclasses.
- Added functions to read and parse the FLX table, extract candidate data, and generate human-readable output files.
- Included functionality for analyzing extracted data, including generating summaries, descriptors, and event family reports.
- Implemented utilities for calculating printable ratios, zero ratios, and identifying text-like data.
- Added support for writing various output formats, including JSON, TSV, and Markdown.
2026-03-22 14:27:38 +01:00

Crusader Decompilation Mid-Project Plan

Purpose

This file is the workspace-facing mid-project tracker for the Crusader decompilation effort. It is intended to answer four questions clearly:

  1. How far along is the project?
  2. What is already solid?
  3. What still blocks broader decompilation?
  4. What should be implemented next?

The estimates below are intentionally conservative. They measure verified behavioral understanding, not just renamed symbols.

Progress Snapshot

Working Progress

Last Confirmed State

  • Priority 0 has started: crusader_segment_coverage_ledger.csv exists and contains a first-pass 145-row ledger.
  • The currently seeded ledger rows are conservative and strongest around seg001, seg004, seg021, seg043, seg080, seg082/083/085, seg091, seg094, and seg095.
  • Priority 1 has started on the cache/backend cluster: the seg082 allocator mechanics are now materially recovered (allocator_head_try_alloc_block, allocator_head_free_block, allocator_free_block_by_ptr, allocator_try_alloc_from_head_table, allocator_phase_finalize_pass), and the 0x4588 path now has named lifecycle helpers (runtime_callback_object_init_once, runtime_callback_object_teardown_once).
  • The 0x4588 blocker is tighter than before: 000a:b988 boundary repair now includes both callback sync callsites (000a:b9e5 / 000a:ba66) inside one real function body, 000d:9d5e / 000d:a3b7 are confirmed inside entity_cleanup_resources_and_dispatch, and adjacent helpers are now clarified as allocator_head_finalize_sweep (0009:a961), video_bios_state_snapshot (000a:4a1f), and video_mode_set_and_record_state (000a:4972). Concrete subsystem identity is still unresolved.
  • A larger MCP rename batch completed for cleanup callees: palette_buffer_alloc_and_init_256 (0009:7853), file_handle_alloc_init_and_open (0009:1c3a), file_handle_open_with_mode (0009:1d6a), surface_release_internal (0009:8d7b), surface_release_and_maybe_free (0009:8e0a), and sprite_redraw_global_if_active (000d:9231). This reduces entity_cleanup_resources_and_dispatch ambiguity on file/surface/palette teardown paths.
  • The previously missing 000d:7e00 function object is now recovered and named entity_dispatch_entry_init_runtime_state, with paired destructor entity_dispatch_entry_release_runtime_state at 000d:8078. Adjacent missing helpers 0003:a880 and 0003:b8e2 were also recovered, with 0003:b8e2 promoted to far_buffer_alloc_with_mode_flags.
  • Additional helper stabilization now covers seg061/064/076: vga_palette_read (0009:6ec7) is confirmed alongside existing palette write/free paths, timer_entity_enable_wrapper (0008:d3ba) is named, and seg064 one-shot gate helpers around 0x3b72/0x3b73 are documented with conservative comments while keeping speculative naming deferred.
  • Constructor-lane semantics tightened further: entity_set_update_period_and_reschedule (0008:d27e) and palette_buffer_alloc_copy_from_source (0009:7905) are now named, and both 0x4588 callback emit callsites (000d:9d5e, 000d:a3b7) now have explicit payload-pair annotations in disassembly.
  • The seg082 allocator table structure is now pinned down as the allocator head table at 0x8724 and active head count at 0x879c, and the old structural helpers at 0009:b06b / 0009:b1c3 are now promoted to allocator_try_alloc_from_head_table and allocator_phase_finalize_pass.
  • New caller-side seg138 evidence now exists at FUN_000d_938c (000d:938c-000d:9583): it builds one scratch-palette dispatch entry (kind 0x3c) and one current-palette dispatch entry (kind 0x14) through entity_dispatch_entry_init_runtime_state, waits for each entry's active flag to clear, then redraws the global sprite path and dispatches through the input object's vtable slot +0x08. This narrows the open lane to presentation/dispatch semantics without yet justifying a concrete subsystem rename.
  • seg137 is now promoted from Foothold to Partial: direct MCP recovery stabilized a coherent palette/dispatch-entry helper family with safe renames for all-black, all-white, arbitrary-RGB, grayscale, black-state, and solid-color state builders around the same entity_dispatch_entry_init_runtime_state lane. The remaining gap is the higher-level event/script meaning of those helpers, not the local mechanics.
  • seg005 and seg136 now have new high-value footholds: FUN_0004_60c0 is recovered as a startup/display orchestration handoff that drives the seg137 palette helper family, validates an object through vtable +0x0c, creates the default active dispatch entry, programs mouse state, and then hands off into 0004:1e00; nearby seg136 helpers are now stabilized as active_dispatch_entry_mark_enabled, active_dispatch_entry_mark_disabled, and active_dispatch_entry_create_default.
  • The downstream seg005 handoff body is now also classified further: FUN_0004_1e00 (0004:1e00-0004:2420) is a non-return startup/display transition driver with confirmed use of vga_palette_set_all_black, animation_ctor_variant_b, sprite_node_get_or_traverse, seg064 gate helpers, the 0x2bd8 vtable lane, and the 0x4aa/0x7e22 resource/object lane. The remaining work is naming the exact state label, not repairing the structure.
  • seg126 is now promoted from Foothold to Partial: FUN_000c_7412, transition_preentry_setup_resources, transition_preentry_release_resources, transition_preentry_run_until_complete_or_abort, transition_preentry_step_script, thunk_callf_0000_ffff_000c_827d, thunk_callf_0000_ffff_000c_82f9, and FUN_000c_834a now show a coherent pre-entry, guarded-entry, script/fade step, and post-transition control shell around the same FUN_0004_1e00 startup/display state.
  • seg127 is now promoted from Foothold to Partial: palette_fade_begin_full_up, palette_fade_begin_full_down, transition_palette_fade_begin, transition_palette_fade_tick, transition_palette_fade_out_step, and transition_palette_fade_in_step form a concrete local palette-fade controller with verified full-range wrappers and caller-side state gating immediately beside the same seg126/seg005 transition lane.
  • seg049 is no longer blank: watch_entity_controller_create_global, watch_entity_controller_create, and watch_entity_controller_dispatch_if_present now show that 0x2bd8 is a real type-stamped watch/camera controller object lane rather than only a raw watched-entity pointer, and that same controller is exercised from FUN_0004_1e00.
  • seg108 is no longer blank: sprite_object_clear_flag40_if_present and sprite_object_set_flag40_if_present now anchor the 0x4f38 global sprite/object lane as a real state-bit-controlled object path used beside the same 0x4588 callback sync and startup/display transition flow.
  • Direct MCP follow-up on seg126 and seg127 now recovered the missing helper bodies after boundary repair: transition_preentry_setup_resources (000c:c63a), transition_preentry_release_resources (000c:c890), transition_preentry_run_until_complete_or_abort (000c:c9f4), transition_preentry_step_script (000c:ca1d), and the neighboring transition_palette_fade_tick / transition_palette_fade_begin / transition_palette_fade_out_step / transition_palette_fade_in_step chain are now named against verified behavior. The latest semantic pass also tightened the two main open globals: 0x8c5c / 0x8c60 are now best understood as a paired temporary text-renderer lane, while 0x31a2 behaves like an external input/event break gate maintained by queue/interrupt-side code. The remaining structural cleanup is the separate oversized overlap rooted at 000c:db68, not the seg126 helper family.
  • Bonus cheat-lane cleanup is now visible in Ghidra too: cheat_code_check has recovered local names (input_event_record, input_event_offset, new_cheat_enabled, cheat_status_display_root) and a decompiler comment stating that it matches the five-byte event-code sequence 50 80 3e fd 27 00 before toggling the cheat-state bytes and taking one of two local notification paths.
  • Point 8 cheat/input-lane pass is complete. keyboard_input_cheat_dispatch (0007:04dc) is renamed and has a full scan-code mapping decompiler comment. cheat_entity_slot_cycle_and_update_sprite (000c:8072) and cheat_anim_type_cycle_and_refresh (000c:81c0) are named. Three DS:0x6050 gate helpers (000c:8221/8227/822b) are named. All seven cheat event case-handlers in the 000c dispatch function now have labels and disassembly comments (event_0x141/0x241/0x441_cheat_debug_overlay_toggle, event_0x7e_cheat_latch_runtime_toggle, event_0x142/0x143_cheat_fullscreen_mode1/0_refresh, event_0x410_cheat_flag_604f_toggle). The cheat-related string table in seg014 is documented (including the dev Easter-egg "FART ...TRY... -laurie"). HACK MOVER / Immortality strings confirmed present with no static code xrefs — attributed to USECODE scripting layer. 0x844 (master cheat flag) vs 0x6045 (live cheat latch) separation remains solid.
  • User-directed JELYHACK producer tracing is now tightened one layer upstream of 000d:208b / 000d:21ed: the immediate stream producer is the embedded mini-VM object created at context +0x36. entity_vm_context_create_from_slot_index (000d:46ec) feeds that object through entity_vm_context_setup (000c:f844), which uses entity_vm_stack_init_with_data (000c:f6e8) and entity_vm_state_copy (000c:f772) semantics to seed or clone [+0xcc..+0xd2]. The actual source payload comes from the runtime owner table at 0x6611 -> +0x1315/+0x1317 -> +0x10/+0x12, addressed as base + 0x0d*slot + 4, and the resulting per-slot source is mirrored into 0x39ca. This still does not expose a direct JELYHACK-named producer object, but it strengthens the current reading that JELYHACK / JELYH2 contribute referent identity while neighboring REE_BOOT / SURCAMEW / SFXTRIG descriptors remain better candidates for event-bearing attachments.
  • The next USECODE/JELYHACK pass now resolves the immediate owner-object writer too. entity_vm_runtime_create (000d:4c99) is the only writer of runtime +0x1315/+0x1317, via newly recovered entity_vm_runtime_owner_resource_create (000d:7000), and the companion entity_vm_runtime_owner_resource_destroy (000d:70fd) releases that helper. The 000d:7000 body does not copy a caller-supplied table directly: it constructs one embedded seg069/070 helper object, queries that helper for the required table size via vtable +0x04, allocates child +0x10/+0x12, then populates the 0x0d-stride per-slot producer records through vtable +0x0c. Wrapper classification around entity_vm_context_try_create_masked_for_entity is tighter too: local wrapper 0004:f033 uses slot mask 0x8000:0007, FUN_0004_f05c uses 0x2000:0015 and is reached from 0004:f2b3 after overlap/proximity and entity byte +0x32 state checks, and FUN_0005_27a4 uses 0x0001:0000 from the 000c:a09e entity +0x5b bit-0x0004 branch. This is enough for a conservative owner/resource classification, but not yet for a source-format-specific or descriptor-specific rename beyond that partial role name.
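
As a worked example of the per-slot addressing recorded above (base + 0x0d*slot + 4), here is a minimal Python sketch. The function name and the base value in the usage lines are hypothetical placeholders, not addresses or symbols recovered from the binary; only the 0x0d stride and the +4 bias come from the notes.

```python
# Constants taken from the notes: 0x0d-byte records, payload at +4.
SLOT_RECORD_STRIDE = 0x0D
SLOT_PAYLOAD_BIAS = 4


def slot_payload_offset(table_base: int, slot: int) -> int:
    """Return the offset of one slot's source payload inside the owner table.

    Implements base + 0x0d*slot + 4 exactly as written in the notes.
    table_base is whatever offset the child +0x10/+0x12 pointer resolves to;
    this sketch does not model segment:offset arithmetic.
    """
    if slot < 0:
        raise ValueError("slot index must be non-negative")
    return table_base + SLOT_RECORD_STRIDE * slot + SLOT_PAYLOAD_BIAS


if __name__ == "__main__":
    base = 0x1000  # hypothetical base, for illustration only
    for slot in range(4):
        print(f"slot {slot}: {slot_payload_offset(base, slot):#06x}")
```

With the placeholder base 0x1000, slot 0 resolves to 0x1004 and slot 3 to 0x102b, which is a quick sanity check when comparing against table reads in disassembly.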

Current Focus

  1. User-directed USECODE/JELYHACK lane: identify who populates the runtime owner/resource object returned by 000d:7000, especially the +0x10/+0x12 per-slot producer table and the gameplay wrappers around entity_vm_context_try_create_masked_for_entity that decide which entities can materialize slot-backed VM contexts.
  2. Finish Priority 0 refinement by promoting more exact segment rows where notes already support a verified foothold.
  3. Continue the Priority 1 pass by tracing the higher-level startup/display callers, branch outcomes, pre-entry object lanes, palette-fade ownership, watch/camera controller ownership, and active sprite/object ownership that stitch the seg137 palette helper family into the wider 0x4588 / dispatch-entry object-role lane.
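
For orientation on focus item 1, the check order this file records for entity_vm_context_try_create_masked_for_entity (runtime-disable byte, entity slot lookup, owner-side slot-mask test, then context creation) can be modeled as a small Python sketch. Everything here is hypothetical scaffolding: the callables stand in for unrecovered helpers, and treating the mask test as a bitwise AND is an assumption, since the notes only say that the owner-side mask is "tested".

```python
from typing import Callable, Optional, Tuple


def try_create_masked_context(
    entity: object,
    requested_mask: int,
    runtime_disabled: bool,
    slot_index_from_entity: Callable[[object], Optional[int]],
    owner_mask_for_slot: Callable[[int], int],
    create_context: Callable[[int], int],
) -> Tuple[bool, int]:
    """Model the documented gate order; return (success, output_word).

    output_word is 0 for an "immediate" success and the created object's
    low word for an "object-backed" success, following the notes.
    """
    # Step 1: the runtime-disable byte (0x6610 in the notes) short-circuits.
    if runtime_disabled:
        return False, 0
    # Step 2: map the gameplay entity to a slot index.
    slot = slot_index_from_entity(entity)
    if slot is None:
        return False, 0
    # Step 3: owner-side slot-mask test. ASSUMPTION: a plain bitwise AND.
    if (owner_mask_for_slot(slot) & requested_mask) == 0:
        return False, 0
    # Step 4: only now is a context created.
    return True, create_context(slot) & 0xFFFF
```

Keeping the four steps as separate injected callables makes it easy to swap in real table reads once the owner-table layout behind 000d:7000 is confirmed.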

Next Resume Point

  1. Continue the user-directed USECODE/JELYHACK follow-on from the recovered producer chain, especially by:
    • identifying the concrete seg069/070 helper class and source arguments behind entity_vm_runtime_owner_resource_create (000d:7000), especially the vtable +0x04 size query and +0x0c table-population call that fill child +0x10/+0x12,
    • extending wrapper classification outward from the now-verified seeds 0004:f033 (0x8000:0007), FUN_0004_f05c (0x2000:0015), and FUN_0005_27a4 (0x0001:0000) into the neighboring 0005:2867/2918/2ae2/2d30 family so the slot-mask groups can be mapped to concrete gameplay object classes,
    • checking whether any recovered owner-table records or slot families line up with the JELYHACK-island referent/event neighborhood more strongly than with generic entity-script traffic,
    • and tracing whether the 0x39ca per-slot payload mirror is initialized only from entity_vm_context_create_from_slot_index or is also refreshed by other runtime-owner helper paths.
  2. Keep classifying the seg126 pre-entry text-renderer lane around transition_preentry_setup_resources, transition_preentry_step_script, and transition_preentry_release_resources, especially by:
    • comparing more preset 0x10 / 0x11 text-renderer callsites,
    • tracing who owns the rendered buffer loaded into 0x6301:0x6303,
    • mapping the control bytes 0x21 / 0x23 / 0x24 / 0x26 / 0x2a / 0x40 / 0x5e to concrete display behavior,
    • and deciding whether the paired 0x8c5c / 0x8c60 lane is a title/body pair, normal/highlight pair, or another fixed UI pairing.
  3. Finish the 0x31a2 gate pass as one batch:
    • classify the read sites at 0004:c24d, 000c:ca11, 000c:e4d8, 000c:e546, 000c:e5c6, 000d:9304, 000d:b6b1, and 000d:c0ee,
    • relate them back to interrupt-side updates at 0008:a283 / 0008:a314,
    • and decide whether 0x31a2 is best described as user-acknowledge, queued-input depth, or a broader event-break gate.
  4. Tighten the DS:0x6341 to 0x6828 relationship:
    • compare the seg126 animation_ctor_variant_a call with the other raw callsites at 0005:3c4f, 0005:3c74, 000c:6176, and 000c:619c,
    • map who owns g_active_dispatch_entry_farptr[+0x40],
    • and classify whether seg126 is constructing a transition-local animation payload for the shared active dispatch entry or only toggling an owner-side state bit after setup.
  5. Identify which higher-level transition states own the seg127 fade-controller inputs at 0x630a-0x6316 and how that fade state is chosen from the seg005/seg126 startup path.
  6. Repair the still-oversized overlap rooted at 000c:db68 only if it blocks follow-on analysis or decompiler visibility in the same transition lane.
  7. Clarify the relationship between the seg049 watch/camera controller at 0x2bd8, the seg108 sprite/object lane at 0x4f38, and the object validated through FUN_0004_60c0 vtable slot +0x0c.
  8. Continue caller-role classification inside entity_cleanup_resources_and_dispatch (contains both 000d:9d5e and 000d:a3b7) and map how it relates to FUN_000d_938c, FUN_0004_60c0, FUN_000c_7412, transition_preentry_release_resources, and the seg136/seg137 active-dispatch helper family.
  9. Cheat/input side lane: COMPLETED this pass. All point-8 sub-items are now resolved:
    • keyboard_input_cheat_dispatch renamed; full scan-code table documented in decompiler comment.
    • cheat_entity_slot_cycle_and_update_sprite and cheat_anim_type_cycle_and_refresh named.
    • DS:0x287b / DS:0x2892 success-path presentation: confirmed as opaque near-code discriminator values stored at +0x49 in the display notification object; the cheat-on/off display is built via display_null_check_dispatch + sprite_node_get_or_traverse.
    • All seven cheat event case-handlers in the 000c dispatch function labeled and commented.
    • 0x844 (master) vs 0x6045 (live latch) separation confirmed solid; 0x604b / 0x604f / 0x6050 also documented.
    • HACK MOVER: no static code xrefs; attributed to USECODE scripting layer. Cheat string table in 000e fully documented.
    • Remaining open: exact user-facing identity of events 0x141/0x241/0x441 overlays (strings suggest targeting-reticle / CD-transfer-display), exact DS:0x6087/6091 notification objects, and any further depth on the 0x4f38 / 0x2bd8 vtable path taken by the overlay events.
    • The Immortality cheat mechanics are now fully traced at the C level: event 0x410 toggles DS:0x604f; the sole read site is player_receive_damage_and_dispatch_effects (0004:c055) at 0004:c205, which divides all incoming 32-bit damage by 0x40000 (262,144) when the flag is set, making HP loss negligible while the hit-stagger animation still plays. No static C keyboard dispatch generates event 0x410 — confirmed USECODE/ASYLUM scripting layer only. DS:0x60d2 / DS:0x60ee are the "Immortality enabled." / "Immortality disabled." notification pointers. A parallel handler at 000b:b62c sets the associated USECODE process state to 0xe when the event arrives.
    • tools/extract_eusecode_flx.py now parses the validated full EUSECODE table (count @ 0x54, table @ 0x80) rather than the old heuristic header scan. Current run extracts all 403 non-zero entries and emits a searchable entry_index.tsv with primary_label and field_names summaries.
    • The extractor now also emits descriptor_index.tsv and descriptor_neighborhoods.tsv, which summarize per-class field-tag patterns and the local neighborhoods around trigger/event-related classes.
    • Current EUSECODE split is now clearer: the 000e parser lane plausibly covers text-heavy records like DATALINK and TEXTFIL1, while the binary descriptor lane exposes object classes such as EVENT, NPCTRIG, CRUZTRIG, TRIGPAD, SPECIAL, SURCAMNS, SURCAMEW, JELYHACK, and JELYH2.
    • The descriptor lane now has a real structural foothold too: field-name strings are preceded by short tagged metadata records (69 xx 00 <name>, 24 xx 02 <name>, etc.) in multiple classes. This looks like compact field-definition encoding rather than arbitrary string spill.
    • That tag grammar is now useful enough to search semantically: 69:0A00 -> event is stable across EVENT, NPCTRIG, SFXTRIG, and several *_BOOT classes, while 24:0A02 -> eventTrigger shows up in SURCAMNS / SURCAMEW.
    • Immortality-specific follow-on is now narrowed but not closed: JELYHACK and JELYH2 are confirmed as real referent-only EUSECODE descriptors; NPCTRIG is confirmed as an event-capable trigger descriptor; CRUZTRIG / TRIGPAD expose referent,item,elev; but no extracted record has yet been tied directly to binary event value 0x410.
    • The clustering pass tightened the local candidate set around JELYHACK: the immediate neighborhood now includes SPECIAL, TRIGPAD, DATALINK, HOFFMAN, REE_BOOT, SURCAMEW, and SFXTRIG, which is a plausible map/object island rather than random sparse table order.
    • The strongest record_table_parse_buffer caller evidence (000e:1b9f..1d49) now appears to belong to the animation-object field lane, because the surrounding setup manipulates the already-mapped animation fields at +0x117/+0x11b/+0x11f/+0x123 and +0xeaf/+0xeb1. That weakens the earlier assumption that 000e:3639 is the primary EUSECODE loader and shifts the likely binary-descriptor consumer search back toward the 000d VM/object path.
    • The first concrete 000c to 000d bridge in that direction is now visible at entity_vm_set_value_from_slot_plus_offset (000c:f95f): it calls entity_vm_slot_load_value_plus_offset (000d:5572) and stores the return pair into object fields +0xd6/+0xd8; on the 000d side, entity_vm_slot_load_value (000d:51fd) contains a verified PUSH 0x410 path. Supporting slot helpers in the same lane are now named too (entity_vm_slot_find_or_select, entity_vm_slot_decrement_use_count, entity_vm_slot_release_value). This still does not prove the immortality trigger chain, but it is the strongest current code-side connection between the mini-VM lane and a live 0x410 producer.
    • The adjacent 000d:45xx..4exx island is now promoted out of FUN_* placeholders as one coherent VM runtime/context family. Newly named helpers include entity_vm_runtime_create / entity_vm_runtime_init_slots / entity_vm_runtime_release_slots / entity_vm_runtime_destroy, entity_vm_slot_index_from_entity, entity_vm_context_try_create_masked_for_entity, entity_vm_context_create_from_slot_index, entity_vm_context_sync_global_value_and_dispatch, and the context save/load/destroy helpers. The runtime global at 0x6611 now reads as a real owner for this lane rather than an opaque far pointer.
    • Two large caller bodies at 000d:208b and 000d:21ed now stand out as concrete context-construction sites: both feed per-object stream/data state from +0xcc/+0xce into entity_vm_context_create_from_slot_index, then continue by reading from the seeded +0xd6/+0xd8 bytecode/value lane. This is the clearest current evidence that the 000d interpreter/object family, not the 000e text parser, is the near-runtime consumer to keep following for the immortality trigger.
    • A second supporting lane is now named too: entity_vm_referent_registry_init / destroy / alloc / release_by_id / free_node show that 0x8c8c/0x8c8e/0x8c90/0x8c94 form a free-list-backed referent registry. entity_vm_set_field_da_to_global writes 0x8c94 from the context +0xda lane before entering the still-misaligned 000c:3350 body, which is the first concrete runtime mechanism explaining how referent-only descriptors such as JELYHACK can still participate in script state.
    • That referent-registry lane is now better structured too: entity_vm_referent_chain_copy, entity_vm_referent_chain_append_unique_from, entity_vm_referent_chain_remove_matching_from, entity_vm_referent_chain_contains_entry, entity_vm_referent_chain_get_entry_data_at, entity_vm_referent_chain_set_entry_data_at, and entity_vm_referent_chain_get_indirect_data show that the runtime can build, subtract, and mutate payload chains hanging off one referent anchor. This is the first runtime shape that looks directly useful for a future human-readable / modifiable script IR.
    • entity_vm_opcode_finish (000d:3350) is now identified as the shared opcode epilogue for this family rather than an opaque helper: it writes 0x8c94 from frame-local state, unwinds the temporary slot-array state at 0x659c/0x659e when present, and returns the current opcode result.
    • The runtime/context half of that lane is now named too. The 0x6611 global is managed by entity_vm_runtime_create / entity_vm_runtime_init_slots / entity_vm_runtime_release_slots / entity_vm_runtime_destroy, while entity_vm_slot_index_from_entity, entity_vm_context_try_create_masked_for_entity, and entity_vm_context_create_from_slot_index now show how gameplay entities are tested against one owner-side slot-mask table before a 0x6714 VM context is created.
    • That context family is no longer anonymous either: entity_vm_context_sync_global_value_and_dispatch, entity_vm_context_save, entity_vm_context_load, entity_vm_context_destroy, and entity_vm_context_free_buffer now pin down the lifecycle around the same +0xd6/+0xd8, +0x102, +0x10c/+0x10e, and +0x11b/+0x11d fields.
    • Current best near-runtime callsites for further immortality work are the large 000d:208b and 000d:21ed bodies, which both build one VM context from caller stream/data state and then continue by consuming bytes from the seeded context value lane.
    • The first opcode family under that lane is also less anonymous now: 000d:0988 can either append unique payload entries or remove matching ones depending on the opcode id (0x1a/0x1b taking the removal path), and both branches return through entity_vm_opcode_finish.
    • That opcode family is now classified one step further: 0x19 = append-unique indirect/string-like payloads, 0x1a = remove-matching indirect/string-like payloads, 0x1b = remove-matching inline payloads, and the same helper body strongly implies 0x18 as the missing append-unique inline sibling.
    • The first stable +0xd6/+0xd8 byte-lane semantics are now visible in the two large caller bodies too. The 000d:208b block is a simple materialize-or-forward path after entity_vm_context_create_from_slot_index, while 000d:21ed copies a caller-owned inline blob into the context +0x102 buffer and then consumes two stream bytes as compact shape/count metadata before building an entity_link closure matrix from the following caller-stream words.
    • Current best JELYHACK reading is tighter than before: the extracted chunks still only expose referent, but the new referent-registry work means that does not relegate them to inert map labels. The most defensible present model is JELYHACK/JELYH2 = referent anchors, with the actual immortality/event behavior carried by neighboring event-capable descriptors in the same local island (REE_BOOT, SURCAMEW, SFXTRIG, or a nearby generic event/trigger record).
    • That readability step now has a first concrete artifact: tools/extract_eusecode_flx.py emits referent_anchor_event_graph.tsv plus a focused jelyhack_island_graph.md, which turns the local table neighborhood into a first readable anchor-to-event view instead of only raw descriptor rows.
    • The extractor now also emits jelyhack_descriptor_compare.tsv, and its first result is useful: JELYHACK and JELYH2 share identical first 16 header words, as expected of referent-only sibling descriptors, while REE_BOOT, SURCAMEW, and SFXTRIG show materially richer header/state patterns consistent with the event-bearing side of the island.
    • Latest opcode-side refinement: entity_vm_opcode_finish (000d:3350) is now the shared epilogue for the chain-mutating handlers, while entity_vm_referent_chain_remove_matching_from (000d:6a9a) and entity_vm_referent_chain_set_entry_data_at (000d:6cf6) show that the VM can subtract and rewrite payload chains in place, not just append/copy them.
    • The 000d:21ed follow-on is now better anchored semantically too: its nested callee 0008:7d27 is entity_link, so the 22bc..2433 block is building a bidirectional entity-link closure matrix from streamed entity ids rather than only emitting an opaque table. A conservative disassembly comment is now in place at 000d:22bc; rename deferred until the bad outer function split is repaired.
    • The extractor work now scales beyond the JELYHACK case: reusable focused-report helpers emit both jelyhack_* and event_* cluster artifacts, and the first new result is strong. The EVENT island (ROLL_NS, COR_BOOT, EVENT, NPCTRIG, CRUZTRIG, NPC_ONLY, VMAIL) contains a compact three-node event-bearing core (COR_BOOT, EVENT, NPCTRIG) surrounded by referent/link/text satellites.
    • That second island materially improves the EUSECODE model: instead of one special-case JELYHACK anchor plus neighbors, we now have a broader pattern of an event-bearing core embedded in a referent-neighbor island, with EVENT acting as a large hub descriptor (source/dest/door/link/time/counter/post1/post2/floor/flicMan) and ROLL_NS / CRUZTRIG / NPC_ONLY / VMAIL reading as attached state or trigger-side records rather than peer event hubs.
    • The descriptor-side taxonomy is now wider too: event_family_index.tsv / event_family_summary.md classify all current event-tagged descriptors into reusable families. The active 69:0A00 -> event lane now breaks cleanly into one EVENT hub, five _BOOT event cores, one NPC trigger core, one minimal event core (SFXTRIG), and three environmental event classes (FLAMEBOX, NOSTRIL, STEAMBOX), while the surveillance pair SURCAMNS / SURCAMEW is now cleanly separated as callback-eventtrigger rather than generic event-bearing descriptors.
    • The _BOOT family is now better constrained too. boot_family_compare.tsv shows that AND_BOOT, BRO_BOOT, COR_BOOT, VAR_BOOT, and REE_BOOT all share one common header/template shape, so the family now reads as repeated instantiations of the same event-core descriptor rather than structurally different boot subclasses.
    • The best remaining _BOOT frontier is now explicit in extractor output as well: boot_frontier_graph.md shows AND_BOOT / BRO_BOOT embedded in a compact referent-heavy neighborhood (OFFWORK, GUARD, GDOOR_*, BIGCAN, CRUMORPH, GUARDSQ, CARD_*, wall variants), which is the cleanest unresolved object-side context for the boot-event template.
    • The environmental event lane is now promoted out of a generic family label into a clearer structural pattern. environmental_family_compare.tsv shows FLAMEBOX and STEAMBOX as close hazard-event siblings with the same active-event backbone plus direction/count, while NOSTRIL is the smaller fire-specific variant that keeps the dual-hazard references and counters but drops the direction/newType side.
    • The callback-trigger lane is also more defensible now: callback_trigger_compare.tsv confirms that SURCAMNS and SURCAMEW are effectively one shared callback template, differing only in one therma slot tag offset. That keeps the active event lane and callback eventTrigger lane separated by more than just naming convention.
    • Runtime follow-through has resumed too: 000d:ebe3 is now backed by direct instruction evidence as one ordered VM/opcode driver body that calls 000d:177c, 000d:1acb, 000d:0988, internal block 000d:22bc, then 000d:1d4a and 000d:2104 in sequence. 000d:ec31 is confirmed as only the internal CALL 000d:22bc site inside that body, so the inner block is still not a safe standalone rename target.
    • entity_vm_context_try_create_masked_for_entity (000d:463a) is now pinned down one step further: it first checks the runtime-disable byte at 0x6610, computes the entity slot, tests the owner-side slot mask in the runtime owner table, and only then creates a context. On success it reports either an immediate result (success with cleared output word) or an object-backed result (success with the created object's low word), which is the clearest current typed boundary between gameplay entities and VM-backed object results.
    • The immediate owner-object writer is now identified too. entity_vm_runtime_create (000d:4c99) stores the only verified runtime +0x1315/+0x1317 value by calling the newly recovered entity_vm_runtime_owner_resource_create (000d:7000), whose helper-managed body allocates child +0x10/+0x12 from a vtable +0x04 size query and fills the 0x0d-stride slot table through vtable +0x0c. The paired release path is entity_vm_runtime_owner_resource_destroy (000d:70fd).
    • The first wrapper-side mask families are now anchored by direct instruction evidence as well: local wrapper 0004:f033 passes 0x8000:0007, FUN_0004_f05c passes 0x2000:0015 from the 0004:f2b3 overlap/proximity branch with entity byte +0x32 state toggling, and FUN_0005_27a4 passes 0x0001:0000 from the 000c:a09e entity +0x5b bit-0x0004 branch. This is enough to distinguish at least three gameplay-side mask lanes without yet claiming descriptor-specific ownership such as JELYHACK versus REE_BOOT.
    • One exact 0x410 collision that could have reopened the wrong lane is now ruled out: 000e:0953 pushes literal 0x410 into imported ASYLUM.27 from the animation/audio path after setting the +0xef1 audio-completion byte. Because ASYLUM.DLL is the ASS_* audio/media library, this is not evidence for a second gameplay or USECODE event source; the live compiled-code bridge for the immortality event remains the 000d VM lane at entity_vm_slot_load_value (000d:51fd).
  10. Revisit allocator_phase_finalize_pass only where it intersects the same callback object semantics, rather than broad allocator mechanics that are already sufficiently constrained.
  11. Continue ASYLUM.24 only after the 0x4588 / dispatch-entry lane and 0004:1e00 transition path have no further cheap wins.
  12. User-directed USECODE/JELYHACK side lane: trace who seeds the caller stream/data pair at +0xcc/+0xce before the 000d:208b and 000d:21ed context-construction blocks, and correlate those producer-side objects with referent ids or descriptor-class neighborhoods that could distinguish JELYHACK / JELYH2 anchors from the neighboring REE_BOOT, SURCAMEW, and SFXTRIG event-bearing attachments.

Headline Estimate

  • Overall useful decompilation progress: about 35%
  • Reasonable uncertainty band: about 30% to 40%

This is the best single-number estimate for the full game right now.

Supporting Metrics

| Metric | Estimate | Meaning |
| --- | --- | --- |
| Top 100 far-call target coverage | about 80% | Roughly 80 of the top 100 most-called far-call targets have been named or materially classified |
| Whole-program behavioral coverage | about 35% | Verified subsystem and function understanding across the executable |
| Segment spread with meaningful analysis | about 19% to 25% | Segments with more than a trivial foothold or isolated note |
| Tooling maturity for continued work | about 75% | Core repair, lookup, and fallback automation needed for continued progress |

Why These Numbers Differ

  • The hot-target metric is much higher because the project has already focused on the most shared and most-called helpers.
  • The whole-program metric is lower because most of the 145 NE segments still have not had systematic coverage passes.
  • The segment-spread metric is lower still because only a subset of segments have coherent subsystem-level treatment.
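
The segment-spread figure could in principle be recomputed from the ledger instead of being estimated by hand. The following is a minimal sketch, not the project's actual tooling: it assumes the ledger CSV has a `Coverage status` column holding the None / Foothold / Partial / Deep values, and it assumes that only Partial and Deep count as "more than a trivial foothold". The function name and both assumptions are illustrative.

```python
import csv
import io
from collections import Counter

# Assumption: only these statuses count as "meaningful analysis".
MEANINGFUL = {"partial", "deep"}

def segment_spread(ledger_csv_text, status_column="Coverage status"):
    """Return (status counts, percent of segments with meaningful coverage)."""
    rows = list(csv.DictReader(io.StringIO(ledger_csv_text)))
    # Normalize case and treat a missing or empty status as "none".
    counts = Counter(
        (row.get(status_column) or "none").strip().lower() for row in rows
    )
    meaningful = sum(counts[status] for status in MEANINGFUL)
    percent = 100.0 * meaningful / len(rows) if rows else 0.0
    return counts, percent
```

Reading the file with `open(path, newline="").read()` and passing the text in keeps the function easy to test; whether Foothold rows should count toward the spread is a policy choice that changes the result.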

What Is Already In Place

Workflow and Tooling

  • Raw full-EXE Ghidra target is established and in active use.
  • Verified raw-import mapping exists for seg001 and seg021.
  • NE relocation parsing has been implemented.
  • Internal literal far-call fixups have been applied to the raw import.
  • PyGhidra fallback tooling exists for create/delete function work and batch scripted edits.
  • Conservative boundary-repair workflow already exists and has been used successfully.
  • Notes are detailed enough to support a formal executable-wide tracker.

Objective Milestones Already Reached

  • 145 NE segments identified from the internal NE header.
  • 8851 internal literal CALLF sites patched to real targets in the raw import.
  • 2841 non-CALLF far-pointer relocations identified and deferred.
  • 119 import callsites annotated.
  • Top 100 far-call target list processed through five tiers, with about 80 named or materially classified.
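
For reference, the segment count and per-segment file offsets come from the NE header's segment table. The sketch below shows one way to read that table using the standard NE header layout (segment count at header offset 0x1C, segment table offset at 0x22, alignment shift at 0x32, eight bytes per entry). It is an illustration under those layout facts, not the script used in this project; the function name and the returned dictionary keys are hypothetical.

```python
import struct

def read_ne_segment_table(image: bytes):
    """Parse the NE segment table; returns one dict per segment (1-based)."""
    ne_off = struct.unpack_from("<I", image, 0x3C)[0]  # e_lfanew in the MZ stub
    if image[ne_off:ne_off + 2] != b"NE":
        raise ValueError("not an NE executable")
    seg_count = struct.unpack_from("<H", image, ne_off + 0x1C)[0]
    # The segment table offset is relative to the start of the NE header.
    seg_table = ne_off + struct.unpack_from("<H", image, ne_off + 0x22)[0]
    # A shift count of 0 is conventionally treated as the default of 9.
    shift = struct.unpack_from("<H", image, ne_off + 0x32)[0] or 9
    segments = []
    for index in range(seg_count):
        sector, length, flags, _min_alloc = struct.unpack_from(
            "<4H", image, seg_table + 8 * index
        )
        segments.append({
            "segment": index + 1,
            "file_offset": sector << shift,   # sector units -> byte offset
            "length": length or 0x10000,      # 0 encodes a 64 KiB segment
            "type": "data" if flags & 0x0001 else "code",
        })
    return segments
```

The `type`, `file_offset`, and `length` keys line up with the corresponding ledger columns, so the output could seed ledger rows directly.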

Strongly Advanced Areas

Core Gameplay and Entity Work

  • seg001 gameplay, cursor, entity lifecycle, projectile, combat, and AI footholds are strong.
  • A verified seg001 raw-port path is working and already used for multiple projectile helpers.
  • Entity table, class-table, and several global gameplay fields are partially mapped.

Timer, Event, and State Systems

  • seg021 timer and event-dispatch work has meaningful coverage.
  • 000c state-dispatch, cursor-nav, UI-listbox, palette-fade, and mini-VM clusters have footholds.

Rendering and Camera

  • 0007 rendering, draw-list, tile-visibility, and camera work has strong structural coverage.
  • world_to_screen_coords and adjacent geometric helpers are understood well enough to support further caller analysis.

Dispatch and Pair-Sync Helpers

  • 0008 dispatch-entry helper families have multiple verified rename batches.
  • Pair-sync and target-state helper clusters are no longer isolated unknowns.

Cache, Tracked Handles, and Bucket Logic

  • 000a cache manager layer is structurally mapped.
  • 000a tracked-handle table is structurally mapped.
  • 000d tracked bucket / proximity / visibility bucket logic has several meaningful behavioral names.
  • The client/cache distinction is much clearer than before.

Parser and Animation Framework

  • 000e parser cluster has a stable set of verified names.
  • 000e animation framework has a real foothold: chunk lookup, audio load, tick, frame advance, and constructor variants are partly mapped.

Local Repair Successes

  • seg043 overlap repair succeeded and recovered multiple valid function objects.
  • seg091 boundary recovery succeeded and exposed RNG helpers plus local init/context helpers.
  • Recent seg004 reset-path recovery and cache-reset follow-up added a new high-value analysis cluster.

What Still Blocks Broader Coverage

High-Value Classification Gaps

  • The object rooted at 0x4588 is still not classified well enough to safely rename the callback object itself beyond the current allocator-side glue names.
  • ASYLUM.24 is no longer an open gap: it is resolved as _ASS_StopAllSFX (see Priority 2).
  • Some structural names in the cache/backend/finalize cluster are waiting on object-role confirmation.

Boundary and Decompiler Gaps

  • Some high-caller targets still require conservative boundary repair or follow-up validation.
  • Certain functions still decompile poorly because of overlaps, thunk-heavy paths, or unresolved downstream targets.
  • 000e:ffb0 remains a notable animation/video-side blocker because of overlapping instructions.

Coverage Management Gap

  • A first-pass normalized segment-by-segment coverage ledger now exists for all 145 NE segments.
  • The remaining gap is refinement rather than absence: most segments still need manual promotion from None to Foothold / Partial / Deep as coverage expands.

Deferred Data Work

  • Non-CALLF far-pointer relocations still exist and will matter for deeper object/table recovery.
  • They are no longer the main blocker, but they remain a real second-pass problem.

Current Best Assessment Of Remaining Work

The project has solved most of the architectural uncertainty needed to keep going efficiently. The remaining effort is mainly a scaling problem:

  • expand coverage across many more segments,
  • remove the last high-value boundary blockers,
  • convert structural names into subsystem names when evidence is strong enough,
  • and normalize progress tracking so the whole program can be managed deliberately.

In practical terms, this looks like a true mid-project state rather than an early exploratory state or a late polish state.

Implementation Priorities

Priority 0: Coverage Ledger

First pass completed: an executable-wide coverage ledger now exists for all 145 NE segments in crusader_segment_coverage_ledger.csv.

Next work under Priority 0:

  1. Promote additional segments from None where notes already support a verified foothold.
  2. Normalize raw-address subsystem islands (notably the 000e: parser/animation cluster) back onto exact NE segment rows.
  3. Keep the ledger updated together with crusader_decompilation_notes.md after each verified batch.

Minimum columns:

| Column | Meaning |
| --- | --- |
| Segment | NE segment number |
| Type | Code or data |
| File offset | From the NE segment table |
| Length | Segment length |
| Coverage status | None, Foothold, Partial, or Deep |
| Known subsystem | Best current classification |
| Key named functions | Short summary only |
| Blockers | Boundary, import, thunk, overlap, unknown object, etc. |
| Notes source | Notes section or evidence anchor |

This was the most important missing artifact; keeping it current is what makes the percentage estimates maintainable.
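
The minimum columns can be expressed as a small record type so that every ledger row carries the same fields. This is a sketch under assumptions: the snake_case field names, the defaults, and the `write_ledger` helper are hypothetical and are not taken from the existing crusader_segment_coverage_ledger.csv.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields

@dataclass
class LedgerRow:
    # Field names are illustrative; the real ledger headers may differ.
    segment: int                  # NE segment number
    type: str                     # "code" or "data"
    file_offset: int              # from the NE segment table
    length: int                   # segment length in bytes
    coverage_status: str = "None" # None, Foothold, Partial, or Deep
    known_subsystem: str = ""     # best current classification
    key_named_functions: str = "" # short summary only
    blockers: str = ""            # boundary, import, thunk, overlap, ...
    notes_source: str = ""        # notes section or evidence anchor

def write_ledger(rows):
    """Serialize ledger rows to CSV text with a fixed column order."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf,
        fieldnames=[f.name for f in fields(LedgerRow)],
        lineterminator="\n",
    )
    writer.writeheader()
    for row in rows:
        writer.writerow(asdict(row))
    return buf.getvalue()
```

Deriving the header from the dataclass means a new column only has to be added in one place.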

Priority 1: Finish The New Cache/Backend Cluster

Work the newest verified reset-path cluster to closure:

  1. Trace more callers of 0009:b06b.
  2. Trace more callers of allocator_head_finalize_sweep (0009:a961).
  3. Classify the object rooted at 0x4588.
  4. Revisit allocator_phase_finalize_pass once the object role is clearer.

This is currently the best next analysis target because it closes a live cluster that already has fresh verified work around it.

Priority 2: ASYLUM.24 Resolved

ASYLUM.DLL was imported as a separate NE program in Ghidra and its export table is now verified as an ASS_* audio DLL, not the immortality/USECODE interpreter lane.

Resolved result:

  • ASYLUM.24 = _ASS_StopAllSFX at 1018:0681
  • runtime_cache_reset_sequence therefore performs an audio stop before the cache/tracked-handle reset work
  • this import is not evidence for the immortality cheat path; the 0x410 toggle remains attributed to the interpreted EUSECODE.FLX lane rather than ASYLUM.DLL

Priority 3: Continue Small-Batch Boundary Repair

Use the existing conservative repair approach for remaining high-value blockers.

Good candidates include:

  • unresolved high-caller function objects,
  • ranges that still steal bytes from adjacent real bodies,
  • and overlaps that block decompilation of already-active subsystems.

Priority 4: Finish Partial Subsystem Islands Before Expanding Broadly

Recommended order:

  1. seg043 plus connected seg004 reset and dispatch paths
  2. 000e animation/video overlap at 000e:ffb0
  3. 000c UI-listbox, mini-VM, and cursor-nav families
  4. Remaining structural 0007 and 0008 helper cohorts

The goal is to reduce the number of half-understood islands before starting broad segment sweeps.

Priority 5: Broaden Coverage Across The Remaining Executable

With the ledger in place, broaden analysis segment by segment once the current hot cluster is closed.

Preferred method:

  1. Group segments by adjacency and call relationships.
  2. Identify entry points and hot callees first.
  3. Classify globals and tables next.
  4. Promote helper names only when supported by strong evidence.
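
Steps 1 and 2 of this method can be sketched as a grouping pass over far-call edges. The example below is a hedged illustration: it assumes the patched CALLF sites can be exported as (caller segment, callee segment) pairs, and the function names are hypothetical. It groups segments into connected components with a union-find and ranks callee segments by incoming far-call count.

```python
from collections import Counter, defaultdict

def group_segments(far_calls):
    """Group segments connected by far calls. far_calls: (caller, callee) pairs."""
    parent = {}

    def find(seg):
        parent.setdefault(seg, seg)
        while parent[seg] != seg:
            parent[seg] = parent[parent[seg]]  # path halving
            seg = parent[seg]
        return seg

    for caller, callee in far_calls:
        root_a, root_b = find(caller), find(callee)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = defaultdict(list)
    for seg in list(parent):
        groups[find(seg)].append(seg)
    return sorted(sorted(members) for members in groups.values())

def hot_callee_segments(far_calls, top=5):
    """Rank callee segments by how many far calls target them."""
    return Counter(callee for _, callee in far_calls).most_common(top)
```

In practice, widely shared helper segments tend to merge everything into one component, so filtering out the hottest callee segments before grouping may give more useful clusters.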

Use these status values for segment coverage:

| Status | Meaning |
| --- | --- |
| None | No meaningful verified analysis yet |
| Foothold | One or two verified entry points or helper names, but no subsystem picture |
| Partial | Several verified names plus some globals/tables or object fields |
| Deep | Coherent subsystem-level understanding with multiple verified related functions |

Use these status values for subsystem maturity:

| Status | Meaning |
| --- | --- |
| Unknown | Not enough evidence to classify |
| Structural | Behavior is partly mapped but still generic |
| Behavioral | Confident subsystem role is known |
| Stable | Multiple connected functions and data objects support the classification |
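
The segment coverage values form an ordered scale, which makes promotion decisions easy to check mechanically. The sketch below is an assumption-laden illustration: the numeric thresholds (three or more verified names for Partial, at least one mapped global or table) are one possible reading of "one or two" and "several" in the table above, and the function name and parameters are hypothetical.

```python
from enum import IntEnum

class Coverage(IntEnum):
    NONE = 0
    FOOTHOLD = 1
    PARTIAL = 2
    DEEP = 3

def suggest_coverage(verified_names, mapped_globals, subsystem_picture):
    """Suggest a coverage status from simple evidence counts.

    Thresholds are illustrative assumptions, not project policy.
    """
    if verified_names == 0:
        return Coverage.NONE
    if subsystem_picture and verified_names >= 2:
        return Coverage.DEEP
    if verified_names >= 3 and mapped_globals >= 1:
        return Coverage.PARTIAL
    return Coverage.FOOTHOLD
```

Because `Coverage` is an `IntEnum`, a ledger update can compare the suggested status with the recorded one and flag any demotion for manual review.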

Suggested Immediate Work Queue

Queue A: Highest Leverage

  1. Expand the first-pass segment coverage ledger beyond the currently seeded segments.
  2. Trace allocator_try_alloc_from_head_table, allocator_head_finalize_sweep, and allocator_phase_finalize_pass.
  3. Identify ASYLUM.24 (done: _ASS_StopAllSFX, see Priority 2).

Queue B: Repair And Stabilize

  1. Review remaining high-caller gap functions.
  2. Repair any still-blocking overlaps in small batches.
  3. Re-decompile repaired ranges and keep only evidence-backed names.

Queue C: Broaden Carefully

  1. Expand into adjacent segments connected to already-understood clusters.
  2. Avoid speculative naming.
  3. Update the notes and the coverage ledger together after each verified batch.

Concrete Progress Interpretation

If a single number is needed, use 35%.

If a more honest dashboard is acceptable, use all three:

  • 80% of top-100 hot targets processed
  • 35% overall behavioral decompilation progress
  • 19% to 25% segment spread with meaningful analysis

That combination best reflects the actual state of the project.

Source Anchors

Primary sources for this file:

  • crusader_segment_coverage_ledger.csv
  • crusader_decompilation_notes.md
  • crusader_ne_segments.csv
  • tier4_output.txt
  • tier5_output.txt
  • repo memory progress summary

Next Update Rule

Update this file when one of the following happens:

  • the overall estimate changes materially,
  • a new subsystem reaches behavioral or stable status,
  • a major blocker such as 0x4588, allocator_phase_finalize_pass, or ASYLUM.24 is resolved,
  • or the segment coverage ledger becomes the new primary progress source.