# Crusader Decompilation Mid-Project Plan

## Purpose

This file is the workspace-facing mid-project tracker for the Crusader decompilation effort.
It is intended to answer four questions clearly:

1. How far along is the project?
2. What is already solid?
3. What still blocks broader decompilation?
4. What should be implemented next?

The estimates below are intentionally conservative. They measure verified behavioral understanding, not just renamed symbols.

## Progress Snapshot

## Working Progress

### Last Confirmed State

- Priority 0 has started: `crusader_segment_coverage_ledger.csv` exists and contains a first-pass 145-row ledger.
- The currently seeded ledger rows are conservative and strongest around seg001, seg004, seg021, seg043, seg080, seg082/083/085, seg091, seg094, and seg095.
- Priority 1 has started on the cache/backend cluster: the seg082 allocator mechanics are now materially recovered (`allocator_head_try_alloc_block`, `allocator_head_free_block`, `allocator_free_block_by_ptr`, `allocator_try_alloc_from_head_table`, `allocator_phase_finalize_pass`), and the `0x4588` path now has named lifecycle helpers (`runtime_callback_object_init_once`, `runtime_callback_object_teardown_once`).
- The `0x4588` blocker is tighter than before: `000a:b988` boundary repair now includes both callback sync callsites (`000a:b9e5` / `000a:ba66`) inside one real function body, `000d:9d5e` / `000d:a3b7` are confirmed inside `entity_cleanup_resources_and_dispatch`, and adjacent helpers are now clarified as `allocator_head_finalize_sweep` (`0009:a961`), `video_bios_state_snapshot` (`000a:4a1f`), and `video_mode_set_and_record_state` (`000a:4972`). Concrete subsystem identity is still unresolved.
- A larger MCP rename batch completed for cleanup callees: `palette_buffer_alloc_and_init_256` (`0009:7853`), `file_handle_alloc_init_and_open` (`0009:1c3a`), `file_handle_open_with_mode` (`0009:1d6a`), `surface_release_internal` (`0009:8d7b`), `surface_release_and_maybe_free` (`0009:8e0a`), and `sprite_redraw_global_if_active` (`000d:9231`). This reduces `entity_cleanup_resources_and_dispatch` ambiguity on file/surface/palette teardown paths.
- The previously missing `000d:7e00` function object is now recovered and named `entity_dispatch_entry_init_runtime_state`, with paired destructor `entity_dispatch_entry_release_runtime_state` at `000d:8078`. Adjacent missing helpers `0003:a880` and `0003:b8e2` were also recovered, with `0003:b8e2` promoted to `far_buffer_alloc_with_mode_flags`.
- Additional helper stabilization now covers seg061/064/076: `vga_palette_read` (`0009:6ec7`) is confirmed alongside existing palette write/free paths, `timer_entity_enable_wrapper` (`0008:d3ba`) is named, and seg064 one-shot gate helpers around `0x3b72/0x3b73` are documented with conservative comments while keeping speculative naming deferred.
- Constructor-lane semantics tightened further: `entity_set_update_period_and_reschedule` (`0008:d27e`) and `palette_buffer_alloc_copy_from_source` (`0009:7905`) are now named, and both `0x4588` callback emit callsites (`000d:9d5e`, `000d:a3b7`) now have explicit payload-pair annotations in disassembly.
- The seg082 allocator table structure is now pinned down as the allocator head table at `0x8724` and active head count at `0x879c`, and the old structural helpers at `0009:b06b` / `0009:b1c3` are now promoted to `allocator_try_alloc_from_head_table` and `allocator_phase_finalize_pass`.
- New caller-side seg138 evidence now exists at `FUN_000d_938c` (`000d:938c-000d:9583`): it builds one scratch-palette dispatch entry (`kind 0x3c`) and one current-palette dispatch entry (`kind 0x14`) through `entity_dispatch_entry_init_runtime_state`, waits for each entry's active flag to clear, then redraws the global sprite path and dispatches through the input object's vtable slot `+0x08`. This narrows the open lane to presentation/dispatch semantics without yet justifying a concrete subsystem rename.
- seg137 is now promoted from `Foothold` to `Partial`: direct MCP recovery stabilized a coherent palette/dispatch-entry helper family with safe renames for all-black, all-white, arbitrary-RGB, grayscale, black-state, and solid-color state builders around the same `entity_dispatch_entry_init_runtime_state` lane. The remaining gap is the higher-level event/script meaning of those helpers, not the local mechanics.
- seg005 and seg136 now have new high-value footholds: `FUN_0004_60c0` is recovered as a startup/display orchestration handoff that drives the seg137 palette helper family, validates an object through vtable `+0x0c`, creates the default active dispatch entry, programs mouse state, and then hands off into `0004:1e00`; nearby seg136 helpers are now stabilized as `active_dispatch_entry_mark_enabled`, `active_dispatch_entry_mark_disabled`, and `active_dispatch_entry_create_default`.
- The downstream seg005 handoff body is now also classified further: `FUN_0004_1e00` (`0004:1e00-0004:2420`) is a non-return startup/display transition driver with confirmed use of `vga_palette_set_all_black`, `animation_ctor_variant_b`, `sprite_node_get_or_traverse`, seg064 gate helpers, the `0x2bd8` vtable lane, and the `0x4aa/0x7e22` resource/object lane. The remaining work is naming the exact state label, not repairing the structure.
- seg126 is now promoted from `Foothold` to `Partial`: `FUN_000c_7412`, `transition_preentry_setup_resources`, `transition_preentry_release_resources`, `transition_preentry_run_until_complete_or_abort`, `transition_preentry_step_script`, `thunk_callf_0000_ffff_000c_827d`, `thunk_callf_0000_ffff_000c_82f9`, and `FUN_000c_834a` now show a coherent pre-entry, guarded-entry, script/fade step, and post-transition control shell around the same `FUN_0004_1e00` startup/display state.
- seg127 is now promoted from `Foothold` to `Partial`: `palette_fade_begin_full_up`, `palette_fade_begin_full_down`, `transition_palette_fade_begin`, `transition_palette_fade_tick`, `transition_palette_fade_out_step`, and `transition_palette_fade_in_step` form a concrete local palette-fade controller with verified full-range wrappers and caller-side state gating immediately beside the same seg126/seg005 transition lane.
- seg049 is no longer blank: `watch_entity_controller_create_global`, `watch_entity_controller_create`, and `watch_entity_controller_dispatch_if_present` now show that `0x2bd8` is a real type-stamped watch/camera controller object lane rather than only a raw watched-entity pointer, and that same controller is exercised from `FUN_0004_1e00`.
- seg108 is no longer blank: `sprite_object_clear_flag40_if_present` and `sprite_object_set_flag40_if_present` now anchor the `0x4f38` global sprite/object lane as a real state-bit-controlled object path used beside the same `0x4588` callback sync and startup/display transition flow.
- Direct MCP follow-up on seg126 and seg127 now recovered the missing helper bodies after boundary repair: `transition_preentry_setup_resources` (`000c:c63a`), `transition_preentry_release_resources` (`000c:c890`), `transition_preentry_run_until_complete_or_abort` (`000c:c9f4`), `transition_preentry_step_script` (`000c:ca1d`), and the neighboring `transition_palette_fade_tick` / `transition_palette_fade_begin` / `transition_palette_fade_out_step` / `transition_palette_fade_in_step` chain are now named against verified behavior. The latest semantic pass also tightened the two main open globals: `0x8c5c` / `0x8c60` are now best understood as a paired temporary text-renderer lane, while `0x31a2` behaves like an external input/event break gate maintained by queue/interrupt-side code. The remaining structural cleanup is the separate oversized overlap rooted at `000c:db68`, not the seg126 helper family.
- Bonus cheat-lane cleanup is now visible in Ghidra too: `cheat_code_check` has recovered local names (`input_event_record`, `input_event_offset`, `new_cheat_enabled`, `cheat_status_display_root`) and a decompiler comment stating that it matches the six-byte event-code sequence `50 80 3e fd 27 00` before toggling the cheat-state bytes and taking one of two local notification paths.
- Point 8 cheat/input-lane pass is complete. `keyboard_input_cheat_dispatch` (`0007:04dc`) is renamed and has a full scan-code mapping decompiler comment. `cheat_entity_slot_cycle_and_update_sprite` (`000c:8072`) and `cheat_anim_type_cycle_and_refresh` (`000c:81c0`) are named. Three `DS:0x6050` gate helpers (`000c:8221/8227/822b`) are named. All seven cheat event case-handlers in the 000c dispatch function now have labels and disassembly comments (`event_0x141/0x241/0x441_cheat_debug_overlay_toggle`, `event_0x7e_cheat_latch_runtime_toggle`, `event_0x142/0x143_cheat_fullscreen_mode1/0_refresh`, `event_0x410_cheat_flag_604f_toggle`). The cheat-related string table in seg014 is documented (including the dev Easter-egg `"FART ...TRY... -laurie"`). HACK MOVER / Immortality strings confirmed present with no static code xrefs — attributed to USECODE scripting layer. `0x844` (master cheat flag) vs `0x6045` (live cheat latch) separation remains solid.
- User-directed JELYHACK producer tracing is now tightened one layer upstream of `000d:208b` / `000d:21ed`: the immediate stream producer is the embedded mini-VM object created at context `+0x36`. `entity_vm_context_create_from_slot_index` (`000d:46ec`) feeds that object through `entity_vm_context_setup` (`000c:f844`), which uses `entity_vm_stack_init_with_data` (`000c:f6e8`) and `entity_vm_state_copy` (`000c:f772`) semantics to seed or clone `[+0xcc..+0xd2]`. The actual source payload comes from the runtime owner table at `0x6611 -> +0x1315/+0x1317 -> +0x10/+0x12`, addressed as `base + 0x0d*slot + 4`, and the resulting per-slot source is mirrored into `0x39ca`. This still does not expose a direct `JELYHACK`-named producer object, but it strengthens the current reading that `JELYHACK` / `JELYH2` contribute referent identity while neighboring `REE_BOOT` / `SURCAMEW` / `SFXTRIG` descriptors remain better candidates for event-bearing attachments.
- The next USECODE/JELYHACK pass now resolves the immediate owner-object writer too. `entity_vm_runtime_create` (`000d:4c99`) is the only writer of runtime `+0x1315/+0x1317`, via newly recovered `entity_vm_runtime_owner_resource_create` (`000d:7000`), and the companion `entity_vm_runtime_owner_resource_destroy` (`000d:70fd`) releases that helper. The `000d:7000` body does not copy a caller-supplied table directly: it constructs one embedded seg069/070 helper object, queries that helper for the required table size via vtable `+0x04`, allocates child `+0x10/+0x12`, then populates the `0x0d`-stride per-slot producer records through vtable `+0x0c`. Wrapper classification around `entity_vm_context_try_create_masked_for_entity` is tighter too: local wrapper `0004:f033` uses slot mask `0x8000:0007`, `FUN_0004_f05c` uses `0x2000:0015` and is reached from `0004:f2b3` after overlap/proximity and entity byte `+0x32` state checks, and `FUN_0005_27a4` uses `0x0001:0000` from the `000c:a09e` entity `+0x5b` bit-`0x0004` branch. This is enough for a conservative owner/resource classification, but not yet for a source-format-specific or descriptor-specific rename beyond that partial role name.
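
The per-slot addressing noted above (`base + 0x0d*slot + 4`) can be sketched in Python for use in offline tooling. This is a minimal illustration, not recovered code: the function name `slot_source_offset` and the constant names are hypothetical, and only the `0x0d` stride and the `+4` field offset come from the notes.

```python
SLOT_STRIDE = 0x0D        # per-slot record stride, taken from the notes above
SLOT_FIELD_OFFSET = 4     # offset of the source payload field inside each record


def slot_source_offset(table_base: int, slot: int) -> int:
    """Return the offset of the payload field for `slot`.

    Implements base + 0x0d*slot + 4. The name and signature are
    hypothetical helpers for offline analysis, not game code.
    """
    if slot < 0:
        raise ValueError("slot index must be non-negative")
    return table_base + SLOT_STRIDE * slot + SLOT_FIELD_OFFSET


if __name__ == "__main__":
    # Print the first few field offsets for a table based at 0x100.
    for slot in range(4):
        print(slot, hex(slot_source_offset(0x100, slot)))
```

With a table based at `0x100`, slot 3 resolves to `0x12b`.
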
### Current Focus

1. User-directed USECODE/JELYHACK lane: identify who populates the runtime owner/resource object returned by `000d:7000`, especially the `+0x10/+0x12` per-slot producer table and the gameplay wrappers around `entity_vm_context_try_create_masked_for_entity` that decide which entities can materialize slot-backed VM contexts.
2. Finish Priority 0 refinement by promoting more exact segment rows where notes already support a verified foothold.
3. Continue the Priority 1 pass by tracing the higher-level startup/display callers, branch outcomes, pre-entry object lanes, palette-fade ownership, watch/camera controller ownership, and active sprite/object ownership that stitch the seg137 palette helper family into the wider `0x4588` / dispatch-entry object-role lane.
### Next Resume Point

1. Continue the user-directed USECODE/JELYHACK follow-on from the recovered producer chain, especially by:
   - identifying the concrete seg069/070 helper class and source arguments behind `entity_vm_runtime_owner_resource_create` (`000d:7000`), especially the vtable `+0x04` size query and `+0x0c` table-population call that fill child `+0x10/+0x12`,
   - extending wrapper classification outward from the now-verified seeds `0004:f033` (`0x8000:0007`), `FUN_0004_f05c` (`0x2000:0015`), and `FUN_0005_27a4` (`0x0001:0000`) into the neighboring `0005:2867/2918/2ae2/2d30` family so the slot-mask groups can be mapped to concrete gameplay object classes,
   - checking whether any recovered owner-table records or slot families line up with the JELYHACK-island referent/event neighborhood more strongly than with generic entity-script traffic,
   - and tracing whether the `0x39ca` per-slot payload mirror is initialized only from `entity_vm_context_create_from_slot_index` or is also refreshed by other runtime-owner helper paths.
2. Keep classifying the seg126 pre-entry text-renderer lane around `transition_preentry_setup_resources`, `transition_preentry_step_script`, and `transition_preentry_release_resources`, especially by:
   - comparing more preset `0x10` / `0x11` text-renderer callsites,
   - tracing who owns the rendered buffer loaded into `0x6301:0x6303`,
   - mapping the control bytes `0x21` / `0x23` / `0x24` / `0x26` / `0x2a` / `0x40` / `0x5e` to concrete display behavior,
   - and deciding whether the paired `0x8c5c` / `0x8c60` lane is a title/body pair, normal/highlight pair, or another fixed UI pairing.
3. Finish the `0x31a2` gate pass as one batch:
   - classify the read sites at `0004:c24d`, `000c:ca11`, `000c:e4d8`, `000c:e546`, `000c:e5c6`, `000d:9304`, `000d:b6b1`, and `000d:c0ee`,
   - relate them back to interrupt-side updates at `0008:a283` / `0008:a314`,
   - and decide whether `0x31a2` is best described as user-acknowledge, queued-input depth, or a broader event-break gate.
4. Tighten the `DS:0x6341` to `0x6828` relationship:
   - compare the seg126 `animation_ctor_variant_a` call with the other raw callsites at `0005:3c4f`, `0005:3c74`, `000c:6176`, and `000c:619c`,
   - map who owns `g_active_dispatch_entry_farptr[+0x40]`,
   - and classify whether seg126 is constructing a transition-local animation payload for the shared active dispatch entry or only toggling an owner-side state bit after setup.
5. Identify which higher-level transition states own the seg127 fade-controller inputs at `0x630a-0x6316` and how that fade state is chosen from the seg005/seg126 startup path.
6. Repair the still-oversized overlap rooted at `000c:db68` only if it blocks follow-on analysis or decompiler visibility in the same transition lane.
7. Clarify the relationship between the seg049 watch/camera controller at `0x2bd8`, the seg108 sprite/object lane at `0x4f38`, and the object validated through `FUN_0004_60c0` vtable slot `+0x0c`.
8. Continue caller-role classification inside `entity_cleanup_resources_and_dispatch` (contains both `000d:9d5e` and `000d:a3b7`) and map how it relates to `FUN_000d_938c`, `FUN_0004_60c0`, `FUN_000c_7412`, `transition_preentry_release_resources`, and the seg136/seg137 active-dispatch helper family.
8. ~~Cheat/input side lane~~ — **COMPLETED** this pass. All point-8 sub-items are now resolved:
- `keyboard_input_cheat_dispatch` renamed; full scan-code table documented in decompiler comment.
- `cheat_entity_slot_cycle_and_update_sprite` and `cheat_anim_type_cycle_and_refresh` named.
- `DS:0x287b` / `DS:0x2892` success-path presentation: confirmed as opaque near-code discriminator values stored at `+0x49` in the display notification object; the cheat-on/off display is built via `display_null_check_dispatch` + `sprite_node_get_or_traverse`.
- All seven cheat event case-handlers in the 000c dispatch function labeled and commented.
- `0x844` (master) vs `0x6045` (live latch) separation confirmed solid; `0x604b` / `0x604f` / `0x6050` also documented.
- HACK MOVER: no static code xrefs; attributed to USECODE scripting layer. Cheat string table in 000e fully documented.
- Remaining open: exact user-facing identity of events `0x141/0x241/0x441` overlays (strings suggest targeting-reticle / CD-transfer-display), exact DS:0x6087/6091 notification objects, and any further depth on the `0x4f38` / `0x2bd8` vtable path taken by the overlay events.
- The Immortality cheat mechanics are now fully traced at the C level: event `0x410` toggles `DS:0x604f`; the sole read site is `player_receive_damage_and_dispatch_effects` (`0004:c055`) at `0004:c205`, which divides all incoming 32-bit damage by `0x40000` (262,144) when the flag is set, making HP loss negligible while the hit-stagger animation still plays. No static C keyboard dispatch generates event `0x410` — confirmed USECODE/ASYLUM scripting layer only. `DS:0x60d2` / `DS:0x60ee` are the "Immortality enabled." / "Immortality disabled." notification pointers. A parallel handler at `000b:b62c` sets the associated USECODE process state to `0xe` when the event arrives.
- `tools/extract_eusecode_flx.py` now parses the validated full EUSECODE table (`count @ 0x54`, table @ `0x80`) rather than the old heuristic header scan. Current run extracts all `403` non-zero entries and emits a searchable `entry_index.tsv` with `primary_label` and `field_names` summaries.
- The extractor now also emits `descriptor_index.tsv` and `descriptor_neighborhoods.tsv`, which summarize per-class field-tag patterns and the local neighborhoods around trigger/event-related classes.
- Current EUSECODE split is now clearer: the `000e` parser lane plausibly covers text-heavy records like `DATALINK` and `TEXTFIL1`, while the binary descriptor lane exposes object classes such as `EVENT`, `NPCTRIG`, `CRUZTRIG`, `TRIGPAD`, `SPECIAL`, `SURCAMNS`, `SURCAMEW`, `JELYHACK`, and `JELYH2`.
- The descriptor lane now has a real structural foothold too: field-name strings are preceded by short tagged metadata records (`69 xx 00 <name>`, `24 xx 02 <name>`, etc.) in multiple classes. This looks like compact field-definition encoding rather than arbitrary string spill.
- That tag grammar is now useful enough to search semantically: `69:0A00 -> event` is stable across `EVENT`, `NPCTRIG`, `SFXTRIG`, and several `*_BOOT` classes, while `24:0A02 -> eventTrigger` shows up in `SURCAMNS` / `SURCAMEW`.
- Immortality-specific follow-on is now narrowed but not closed: `JELYHACK` and `JELYH2` are confirmed as real referent-only EUSECODE descriptors; `NPCTRIG` is confirmed as an event-capable trigger descriptor; `CRUZTRIG` / `TRIGPAD` expose `referent,item,elev`; but no extracted record has yet been tied directly to binary event value `0x410`.
- The clustering pass tightened the local candidate set around `JELYHACK`: the immediate neighborhood now includes `SPECIAL`, `TRIGPAD`, `DATALINK`, `HOFFMAN`, `REE_BOOT`, `SURCAMEW`, and `SFXTRIG`, which is a plausible map/object island rather than random sparse table order.
- The strongest `record_table_parse_buffer` caller evidence (`000e:1b9f..1d49`) now appears to belong to the animation-object field lane, because the surrounding setup manipulates the already-mapped animation fields at `+0x117/+0x11b/+0x11f/+0x123` and `+0xeaf/+0xeb1`. That weakens the earlier assumption that `000e:3639` is the primary EUSECODE loader and shifts the likely binary-descriptor consumer search back toward the `000d` VM/object path.
- The first concrete `000c` to `000d` bridge in that direction is now visible at `entity_vm_set_value_from_slot_plus_offset` (`000c:f95f`): it calls `entity_vm_slot_load_value_plus_offset` (`000d:5572`) and stores the return pair into object fields `+0xd6/+0xd8`; on the `000d` side, `entity_vm_slot_load_value` (`000d:51fd`) contains a verified `PUSH 0x410` path. Supporting slot helpers in the same lane are now named too (`entity_vm_slot_find_or_select`, `entity_vm_slot_decrement_use_count`, `entity_vm_slot_release_value`). This still does not prove the immortality trigger chain, but it is the strongest current code-side connection between the mini-VM lane and a live `0x410` producer.
- The adjacent `000d:45xx..4exx` island is now promoted out of `FUN_*` placeholders as one coherent VM runtime/context family. Newly named helpers include `entity_vm_runtime_create` / `entity_vm_runtime_init_slots` / `entity_vm_runtime_release_slots` / `entity_vm_runtime_destroy`, `entity_vm_slot_index_from_entity`, `entity_vm_context_try_create_masked_for_entity`, `entity_vm_context_create_from_slot_index`, `entity_vm_context_sync_global_value_and_dispatch`, and the context save/load/destroy helpers. The runtime global at `0x6611` now reads as a real owner for this lane rather than an opaque far pointer.
- Two large caller bodies at `000d:208b` and `000d:21ed` now stand out as concrete context-construction sites: both feed per-object stream/data state from `+0xcc/+0xce` into `entity_vm_context_create_from_slot_index`, then continue by reading from the seeded `+0xd6/+0xd8` bytecode/value lane. This is the clearest current evidence that the `000d` interpreter/object family, not the `000e` text parser, is the near-runtime consumer to keep following for the immortality trigger.
- A second supporting lane is now named too: `entity_vm_referent_registry_init` / `destroy` / `alloc` / `release_by_id` / `free_node` show that `0x8c8c/0x8c8e/0x8c90/0x8c94` form a free-list-backed referent registry. `entity_vm_set_field_da_to_global` writes `0x8c94` from the context `+0xda` lane before entering the still-misaligned `000c:3350` body, which is the first concrete runtime mechanism explaining how referent-only descriptors such as `JELYHACK` can still participate in script state.
- That referent-registry lane is now better structured too: `entity_vm_referent_chain_copy`, `entity_vm_referent_chain_append_unique_from`, `entity_vm_referent_chain_remove_matching_from`, `entity_vm_referent_chain_contains_entry`, `entity_vm_referent_chain_get_entry_data_at`, `entity_vm_referent_chain_set_entry_data_at`, and `entity_vm_referent_chain_get_indirect_data` show that the runtime can build, subtract, and mutate payload chains hanging off one referent anchor. This is the first runtime shape that looks directly useful for a future human-readable / modifiable script IR.
- `entity_vm_opcode_finish` (`000d:3350`) is now identified as the shared opcode epilogue for this family rather than an opaque helper: it writes `0x8c94` from frame-local state, unwinds the temporary slot-array state at `0x659c/0x659e` when present, and returns the current opcode result.
- The runtime/context half of that lane is now named too. The `0x6611` global is managed by `entity_vm_runtime_create` / `entity_vm_runtime_init_slots` / `entity_vm_runtime_release_slots` / `entity_vm_runtime_destroy`, while `entity_vm_slot_index_from_entity`, `entity_vm_context_try_create_masked_for_entity`, and `entity_vm_context_create_from_slot_index` now show how gameplay entities are tested against one owner-side slot-mask table before a `0x6714` VM context is created.
- That context family is no longer anonymous either: `entity_vm_context_sync_global_value_and_dispatch`, `entity_vm_context_save`, `entity_vm_context_load`, `entity_vm_context_destroy`, and `entity_vm_context_free_buffer` now pin down the lifecycle around the same `+0xd6/+0xd8`, `+0x102`, `+0x10c/+0x10e`, and `+0x11b/+0x11d` fields.
- Current best near-runtime callsites for further immortality work are the large `000d:208b` and `000d:21ed` bodies, which both build one VM context from caller stream/data state and then continue by consuming bytes from the seeded context value lane.
- The first opcode family under that lane is also less anonymous now: `000d:0988` can either append unique payload entries or remove matching ones depending on the opcode id (`0x1a/0x1b` taking the removal path), and both branches return through `entity_vm_opcode_finish`.
- That opcode family is now classified one step further: `0x19` = append-unique indirect/string-like payloads, `0x1a` = remove-matching indirect/string-like payloads, `0x1b` = remove-matching inline payloads, and the same helper body strongly implies `0x18` as the missing append-unique inline sibling.
- The first stable `+0xd6/+0xd8` byte-lane semantics are now visible in the two large caller bodies too. The `000d:208b` block is a simple materialize-or-forward path after `entity_vm_context_create_from_slot_index`, while `000d:21ed` copies a caller-owned inline blob into the context `+0x102` buffer and then consumes two stream bytes as compact shape/count metadata before building an `entity_link` closure matrix from the following caller-stream words.
- Current best JELYHACK reading is tighter than before: the extracted chunks still only expose `referent`, but the new referent-registry work means that does not relegate them to inert map labels. The most defensible present model is `JELYHACK/JELYH2 = referent anchors`, with the actual immortality/event behavior carried by neighboring event-capable descriptors in the same local island (`REE_BOOT`, `SURCAMEW`, `SFXTRIG`, or a nearby generic event/trigger record).
- That readability step now has a first concrete artifact: `tools/extract_eusecode_flx.py` emits `referent_anchor_event_graph.tsv` plus a focused `jelyhack_island_graph.md`, which turns the local table neighborhood into a first readable anchor-to-event view instead of only raw descriptor rows.
- The extractor now also emits `jelyhack_descriptor_compare.tsv`, and its first result is useful: `JELYHACK` and `JELYH2` have identical first 16 header words as referent-only sibling descriptors, while `REE_BOOT`, `SURCAMEW`, and `SFXTRIG` show materially richer header/state patterns consistent with the event-bearing side of the island.
- Latest opcode-side refinement: `entity_vm_opcode_finish` (`000d:3350`) is now the shared epilogue for the chain-mutating handlers, while `entity_vm_referent_chain_remove_matching_from` (`000d:6a9a`) and `entity_vm_referent_chain_set_entry_data_at` (`000d:6cf6`) show that the VM can subtract and rewrite payload chains in place, not just append/copy them.
- The `000d:21ed` follow-on is now better anchored semantically too: its nested callee `0008:7d27` is `entity_link`, so the `22bc..2433` block is building a bidirectional entity-link closure matrix from streamed entity ids rather than only emitting an opaque table. A conservative disassembly comment is now in place at `000d:22bc`; rename deferred until the bad outer function split is repaired.
- The extractor work now scales beyond the JELYHACK case: reusable focused-report helpers emit both `jelyhack_*` and `event_*` cluster artifacts, and the first new result is strong. The `EVENT` island (`ROLL_NS`, `COR_BOOT`, `EVENT`, `NPCTRIG`, `CRUZTRIG`, `NPC_ONLY`, `VMAIL`) contains a compact three-node event-bearing core (`COR_BOOT`, `EVENT`, `NPCTRIG`) surrounded by referent/link/text satellites.
- That second island materially improves the EUSECODE model: instead of one special-case `JELYHACK` anchor plus neighbors, we now have a broader pattern of `event-bearing core embedded in referent-neighbor island`, with `EVENT` acting as a large hub descriptor (`source/dest/door/link/time/counter/post1/post2/floor/flicMan`) and `ROLL_NS` / `CRUZTRIG` / `NPC_ONLY` / `VMAIL` reading as attached state or trigger-side records rather than peer event hubs.
- The descriptor-side taxonomy is now wider too: `event_family_index.tsv` / `event_family_summary.md` classify all current event-tagged descriptors into reusable families. The active `69:0A00 -> event` lane now breaks cleanly into one `EVENT` hub, five `_BOOT` event cores, one NPC trigger core, one minimal event core (`SFXTRIG`), and three environmental event classes (`FLAMEBOX`, `NOSTRIL`, `STEAMBOX`), while the surveillance pair `SURCAMNS` / `SURCAMEW` is now cleanly separated as `callback-eventtrigger` rather than generic event-bearing descriptors.
- The `_BOOT` family is now better constrained too. `boot_family_compare.tsv` shows that `AND_BOOT`, `BRO_BOOT`, `COR_BOOT`, `VAR_BOOT`, and `REE_BOOT` all share one common header/template shape, so the family now reads as repeated instantiations of the same event-core descriptor rather than structurally different boot subclasses.
- The best remaining `_BOOT` frontier is now explicit in extractor output as well: `boot_frontier_graph.md` shows `AND_BOOT` / `BRO_BOOT` embedded in a compact referent-heavy neighborhood (`OFFWORK`, `GUARD`, `GDOOR_*`, `BIGCAN`, `CRUMORPH`, `GUARDSQ`, `CARD_*`, wall variants), which is the cleanest unresolved object-side context for the boot-event template.
- The environmental event lane is now promoted out of a generic family label into a clearer structural pattern. `environmental_family_compare.tsv` shows `FLAMEBOX` and `STEAMBOX` as close hazard-event siblings with the same active-event backbone plus direction/count, while `NOSTRIL` is the smaller fire-specific variant that keeps the dual-hazard references and counters but drops the direction/newType side.
- The callback-trigger lane is also more defensible now: `callback_trigger_compare.tsv` confirms that `SURCAMNS` and `SURCAMEW` are effectively one shared callback template, differing only in one `therma` slot tag offset. That keeps the active `event` lane and callback `eventTrigger` lane separated by more than just naming convention.
- Runtime follow-through has resumed too: `000d:ebe3` is now backed by direct instruction evidence as one ordered VM/opcode driver body that calls `000d:177c`, `000d:1acb`, `000d:0988`, internal block `000d:22bc`, then `000d:1d4a` and `000d:2104` in sequence. `000d:ec31` is confirmed as only the internal `CALL 000d:22bc` site inside that body, so the inner block is still not a safe standalone rename target.
- `entity_vm_context_try_create_masked_for_entity` (`000d:463a`) is now pinned down one step further: it first checks the runtime-disable byte at `0x6610`, computes the entity slot, tests the owner-side slot mask in the runtime owner table, and only then creates a context. On success it reports either an immediate result (success with cleared output word) or an object-backed result (success with the created object's low word), which is the clearest current typed boundary between gameplay entities and VM-backed object results.
- The immediate owner-object writer is now identified too. `entity_vm_runtime_create` (`000d:4c99`) stores the only verified runtime `+0x1315/+0x1317` value by calling the newly recovered `entity_vm_runtime_owner_resource_create` (`000d:7000`), whose helper-managed body allocates child `+0x10/+0x12` from a vtable `+0x04` size query and fills the `0x0d`-stride slot table through vtable `+0x0c`. The paired release path is `entity_vm_runtime_owner_resource_destroy` (`000d:70fd`).
- The first wrapper-side mask families are now anchored by direct instruction evidence as well: local wrapper `0004:f033` passes `0x8000:0007`, `FUN_0004_f05c` passes `0x2000:0015` from the `0004:f2b3` overlap/proximity branch with entity byte `+0x32` state toggling, and `FUN_0005_27a4` passes `0x0001:0000` from the `000c:a09e` entity `+0x5b` bit-`0x0004` branch. This is enough to distinguish at least three gameplay-side mask lanes without yet claiming descriptor-specific ownership such as `JELYHACK` versus `REE_BOOT`.
- One exact `0x410` collision that could have reopened the wrong lane is now ruled out: `000e:0953` pushes literal `0x410` into imported `ASYLUM.27` from the animation/audio path after setting the `+0xef1` audio-completion byte. Because `ASYLUM.DLL` is the `ASS_*` audio/media library, this is not evidence for a second gameplay or USECODE event source; the live compiled-code bridge for the immortality event remains the `000d` VM lane at `entity_vm_slot_load_value` (`000d:51fd`).
9. Revisit `allocator_phase_finalize_pass` only where it intersects the same callback object semantics, rather than broad allocator mechanics that are already sufficiently constrained.
10. Continue `ASYLUM.24` only after the `0x4588` / dispatch-entry lane and `0004:1e00` transition path have no further cheap wins.
11. User-directed USECODE/JELYHACK side lane: trace who seeds the caller stream/data pair at `+0xcc/+0xce` before the `000d:208b` and `000d:21ed` context-construction blocks, and correlate those producer-side objects with referent ids or descriptor-class neighborhoods that could distinguish `JELYHACK` / `JELYH2` anchors from the neighboring `REE_BOOT`, `SURCAMEW`, and `SFXTRIG` event-bearing attachments.

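
The check order recovered for `entity_vm_context_try_create_masked_for_entity` (`000d:463a`) can be modeled as a minimal Python sketch. Only the sequence of checks and the two success shapes come from the notes above. The `Runtime` container, the dictionary used for the owner table, the `create_context` callback, and the `(False, 0)` failure shape are hypothetical stand-ins, not recovered structures.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Tuple


@dataclass
class Runtime:
    """Hypothetical stand-in for the state the routine reads."""
    disabled: bool = False  # models the runtime-disable byte at 0x6610
    owner_slot_masks: Dict[int, int] = field(default_factory=dict)  # slot -> owner-side mask


def try_create_masked(
    rt: Runtime,
    entity_slot: int,
    mask: int,
    create_context: Callable[[int], Optional[int]],
) -> Tuple[bool, int]:
    """Return (success, output_word) following the recovered check order."""
    # 1. The runtime-disable byte gates everything else.
    if rt.disabled:
        return (False, 0)
    # 2. The entity slot is assumed to be computed already; test the owner-side mask.
    if (rt.owner_slot_masks.get(entity_slot, 0) & mask) == 0:
        return (False, 0)
    # 3. Only now create a context. In this model the callback returns None for an
    #    immediate result, or a 32-bit handle for an object-backed result.
    obj = create_context(entity_slot)
    if obj is None:
        return (True, 0)          # immediate result: cleared output word
    return (True, obj & 0xFFFF)   # object-backed result: created object's low word
```

The two success branches are the part worth keeping in mind when typing callers: a zero output word on success does not by itself mean failure.
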
### Headline Estimate

- Overall useful decompilation progress: about 35%
- Reasonable uncertainty band: about 30% to 40%

This is the best single-number estimate for the full game right now.

### Supporting Metrics

| Metric | Estimate | Meaning |
|---|---:|---|
| Top 100 far-call target coverage | about 80% | Roughly 80 of the top 100 most-called far-call targets have been named or materially classified |
| Whole-program behavioral coverage | about 35% | Verified subsystem and function understanding across the executable |
| Segment spread with meaningful analysis | about 19% to 25% | Segments with more than a trivial foothold or isolated note |
| Tooling maturity for continued work | about 75% | Core repair, lookup, and fallback automation needed for continued progress |

### Why These Numbers Differ

- The hot-target metric is much higher because the project has already focused on the most shared and most-called helpers.
- The whole-program metric is lower because most of the 145 NE segments still have not had systematic coverage passes.
- The segment-spread metric is lower still because only a subset of segments have coherent subsystem-level treatment.

## What Is Already In Place

### Workflow and Tooling

- Raw full-EXE Ghidra target is established and in active use.
- Verified raw-import mapping exists for seg001 and seg021.
- NE relocation parsing has been implemented.
- Internal literal far-call fixups have been applied to the raw import.
- PyGhidra fallback tooling exists for create/delete function work and batch scripted edits.
- Conservative boundary-repair workflow already exists and has been used successfully.
- Notes are detailed enough to support a formal executable-wide tracker.

### Objective Milestones Already Reached

- 145 NE segments identified from the internal NE header.
- 8851 internal literal CALLF sites patched to real targets in the raw import.
- 2841 non-CALLF far-pointer relocations identified and deferred.
- 119 import callsites annotated.
- Top 100 far-call target list processed through five tiers, with about 80 named or materially classified.

## Strongly Advanced Areas

### Core Gameplay and Entity Work

- seg001 gameplay, cursor, entity lifecycle, projectile, combat, and AI footholds are strong.
- A verified seg001 raw-port path is working and already used for multiple projectile helpers.
- Entity table, class-table, and several global gameplay fields are partially mapped.

### Timer, Event, and State Systems

- seg021 timer and event-dispatch work has meaningful coverage.
- 000c state-dispatch, cursor-nav, UI-listbox, palette-fade, and mini-VM clusters have footholds.

### Rendering and Camera

- 0007 rendering, draw-list, tile-visibility, and camera work has strong structural coverage.
- `world_to_screen_coords` and adjacent geometric helpers are understood well enough to support further caller analysis.

### Dispatch and Pair-Sync Helpers

- 0008 dispatch-entry helper families have multiple verified rename batches.
- Pair-sync and target-state helper clusters are no longer isolated unknowns.

### Cache, Tracked Handles, and Bucket Logic

- 000a cache manager layer is structurally mapped.
- 000a tracked-handle table is structurally mapped.
- 000d tracked bucket / proximity / visibility bucket logic has several meaningful behavioral names.
- The client/cache distinction is much clearer than before.

### Parser and Animation Framework

- 000e parser cluster has a stable set of verified names.
- 000e animation framework has a real foothold: chunk lookup, audio load, tick, frame advance, and constructor variants are partly mapped.

### Local Repair Successes

- seg043 overlap repair succeeded and recovered multiple valid function objects.
- seg091 boundary recovery succeeded and exposed RNG helpers plus local init/context helpers.
- Recent seg004 reset-path recovery and cache-reset follow-up added a new high-value analysis cluster.

## What Still Blocks Broader Coverage

### High-Value Classification Gaps

- The object rooted at `0x4588` is still not classified well enough to safely rename the callback object itself beyond the current allocator-side glue names.
- `ASYLUM.24` is no longer an open gap: it is identified as `_ASS_StopAllSFX` (see Priority 2).
- Some structural names in the cache/backend/finalize cluster are waiting on object-role confirmation.

### Boundary and Decompiler Gaps

- Some high-caller targets still require conservative boundary repair or follow-up validation.
- Certain functions still decompile poorly because of overlaps, thunk-heavy paths, or unresolved downstream targets.
- `000e:ffb0` remains a notable animation/video-side blocker because of overlapping instructions.

### Coverage Management Gap

- A first-pass normalized segment-by-segment coverage ledger now exists for all 145 NE segments.
- The remaining gap is refinement rather than absence: most segments still need manual promotion from `None` to `Foothold` / `Partial` / `Deep` as coverage expands.

### Deferred Data Work

- Non-CALLF far-pointer relocations still exist and will matter for deeper object/table recovery.
- They are no longer the main blocker, but they remain a real second-pass problem.

## Current Best Assessment Of Remaining Work

The project has solved most of the architectural uncertainty needed to keep going efficiently.
The remaining effort is mainly a scaling problem:

- expand coverage across many more segments,
- remove the last high-value boundary blockers,
- convert structural names into subsystem names when evidence is strong enough,
- and normalize progress tracking so the whole program can be managed deliberately.

In practical terms, this looks like a true mid-project state rather than an early exploratory state or a late polish state.

## Implementation Priorities

### Priority 0: Coverage Ledger

First pass completed: an executable-wide coverage ledger now exists for all 145 NE segments in `crusader_segment_coverage_ledger.csv`.

Next work under Priority 0:

1. Promote additional segments from `None` where notes already support a verified foothold.
2. Normalize raw-address subsystem islands (notably the `000e:` parser/animation cluster) back onto exact NE segment rows.
3. Keep the ledger updated together with `crusader_decompilation_notes.md` after each verified batch.

Minimum columns:

| Column | Meaning |
|---|---|
| Segment | NE segment number |
| Type | Code or data |
| File offset | From the NE segment table |
| Length | Segment length |
| Coverage status | `None`, `Foothold`, `Partial`, or `Deep` |
| Known subsystem | Best current classification |
| Key named functions | Short summary only |
| Blockers | Boundary, import, thunk, overlap, unknown object, etc. |
| Notes source | Notes section or evidence anchor |

The ledger remains the most important tracking artifact because it makes the percentage estimates maintainable.

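
The Segment, Type, File offset, and Length columns can be seeded mechanically from the NE segment table. The sketch below assumes the standard NE layout (segment count at header offset `0x1C`, segment table offset at `0x22`, alignment shift at `0x32`, 8-byte table entries). It takes the NE header offset as a parameter because this file does not say where the internal header sits in the executable. `SegmentRow` and `read_ne_segments` are hypothetical names, not part of the project's existing tooling.

```python
import struct
from dataclasses import dataclass
from typing import List


@dataclass
class SegmentRow:
    segment: int            # 1-based NE segment number
    type: str               # "code" or "data"
    file_offset: int        # byte offset of the segment contents in the file
    length: int             # bytes stored in the file
    coverage: str = "None"  # starting status for a freshly seeded row


def read_ne_segments(image: bytes, ne_offset: int) -> List[SegmentRow]:
    """Parse the NE segment table found via the header at ne_offset."""
    if image[ne_offset:ne_offset + 2] != b"NE":
        raise ValueError("no NE signature at the given offset")
    (count,) = struct.unpack_from("<H", image, ne_offset + 0x1C)
    (table_rel,) = struct.unpack_from("<H", image, ne_offset + 0x22)
    (shift,) = struct.unpack_from("<H", image, ne_offset + 0x32)
    shift = shift or 9  # assumption: a zero shift count means the default of 9
    rows = []
    for i in range(count):
        entry_at = ne_offset + table_rel + 8 * i
        sector, length, flags, _min_alloc = struct.unpack_from("<4H", image, entry_at)
        rows.append(SegmentRow(
            segment=i + 1,
            type="data" if flags & 0x0001 else "code",  # bit 0 marks a data segment
            file_offset=sector << shift,
            length=length or 0x10000,  # a stored length of 0 encodes 64 KiB
        ))
    return rows
```

Rows produced this way start at `None` coverage, so promotion stays a manual, evidence-backed step.
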
### Priority 1: Finish The New Cache/Backend Cluster

Work the newest verified reset-path cluster to closure:

1. Trace more callers of `0009:b06b`.
2. Trace more callers of `allocator_head_finalize_sweep` (`0009:a961`, formerly `FUN_0009_a961`).
3. Classify the object rooted at `0x4588`.
4. Revisit `allocator_phase_finalize_pass` once the object role is clearer.

This is currently the best next analysis target because it closes a live cluster that already has fresh verified work around it.

### Priority 2: `ASYLUM.24` Resolved

`ASYLUM.DLL` was imported as a separate NE program in Ghidra and its export table is now verified as an `ASS_*` audio DLL, not the immortality/USECODE interpreter lane.

Resolved result:

- `ASYLUM.24` = `_ASS_StopAllSFX` at `1018:0681`
- `runtime_cache_reset_sequence` therefore performs an audio stop before the cache/tracked-handle reset work
- this import is not evidence for the immortality cheat path; the `0x410` toggle remains attributed to the interpreted `EUSECODE.FLX` lane rather than `ASYLUM.DLL`

### Priority 3: Continue Small-Batch Boundary Repair

Use the existing conservative repair approach for remaining high-value blockers.

Good candidates include:

- unresolved high-caller function objects,
- ranges that still steal bytes from adjacent real bodies,
- and overlaps that block decompilation of already-active subsystems.

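
Ranges that steal bytes from adjacent bodies can be shortlisted before any manual repair. This is a minimal sketch, assuming function extents have already been exported as `(name, start, end)` tuples with linear addresses and an exclusive end. `find_overlaps` and the tuple layout are hypothetical and are not part of the existing repair workflow.

```python
from typing import List, Optional, Tuple

FuncRange = Tuple[str, int, int]  # (name, start, end_exclusive)


def find_overlaps(funcs: List[FuncRange]) -> List[Tuple[str, str, int]]:
    """Report each range that starts inside the furthest-reaching earlier range.

    Returns (earlier_name, later_name, overlapping_byte_count) tuples.
    """
    out = []
    ordered = sorted(funcs, key=lambda f: (f[1], f[2]))
    reach: Optional[FuncRange] = None  # earlier range with the largest end so far
    for cur in ordered:
        if reach is not None and cur[1] < reach[2]:
            shared = min(reach[2], cur[2]) - cur[1]
            out.append((reach[0], cur[0], shared))
        if reach is None or cur[2] > reach[2]:
            reach = cur
    return out
```

A report like this only proposes candidates. Which of the two bodies is real still has to be settled from instruction evidence, in keeping with the conservative approach above.
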
### Priority 4: Finish Partial Subsystem Islands Before Expanding Broadly

Recommended order:

1. seg043 plus connected seg004 reset and dispatch paths
2. 000e animation/video overlap at `000e:ffb0`
3. 000c UI-listbox, mini-VM, and cursor-nav families
4. Remaining structural 0007 and 0008 helper cohorts

The goal is to reduce the number of half-understood islands before starting broad segment sweeps.

### Priority 5: Broaden Coverage Across The Remaining Executable

With the first-pass ledger in place, broaden analysis segment by segment once the current hot cluster is closed.

Preferred method:

1. Group segments by adjacency and call relationships.
2. Identify entry points and hot callees first.
3. Classify globals and tables next.
4. Promote helper names only when supported by strong evidence.

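
Step 1 of the preferred method can be approximated by treating inter-segment far calls as undirected edges and taking connected components. This is a sketch under the assumption that far-call edges are available as `(caller_segment, callee_segment)` pairs. `group_segments` is a hypothetical helper, not existing tooling.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def group_segments(
    segments: Iterable[int],
    calls: Iterable[Tuple[int, int]],
) -> List[List[int]]:
    """Group segments into connected components of the far-call graph."""
    adj: Dict[int, Set[int]] = defaultdict(set)
    for caller, callee in calls:
        if caller != callee:  # intra-segment calls add no grouping information
            adj[caller].add(callee)
            adj[callee].add(caller)
    seen: Set[int] = set()
    groups: List[List[int]] = []
    for seg in sorted(set(segments)):
        if seg in seen:
            continue
        seen.add(seg)
        stack, component = [seg], []
        while stack:  # iterative depth-first walk of one component
            cur = stack.pop()
            component.append(cur)
            for nxt in adj[cur]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        groups.append(sorted(component))
    return groups
```

In practice a call-count threshold would probably be needed, since a widely shared helper segment would otherwise merge most of the executable into one group.
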
## Recommended Tracking Model

Use these status values for segment coverage:

| Status | Meaning |
|---|---|
| None | No meaningful verified analysis yet |
| Foothold | One or two verified entry points or helper names, but no subsystem picture |
| Partial | Several verified names plus some globals/tables or object fields |
| Deep | Coherent subsystem-level understanding with multiple verified related functions |

Use these status values for subsystem maturity:

| Status | Meaning |
|---|---|
| Unknown | Not enough evidence to classify |
| Structural | Behavior is partly mapped but still generic |
| Behavioral | Confident subsystem role is known |
| Stable | Multiple connected functions and data objects support the classification |

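
The segment-coverage statuses map naturally onto an ordered enum, which makes a segment-spread figure reproducible from the ledger. This is a minimal sketch, assuming the ledger exposes a `Coverage status` column holding one of the four segment values above. The column name, `Coverage`, and `segment_spread` are assumptions, and the cutoff for "meaningful analysis" (`Partial` or better by default) is a choice this file does not define.

```python
import csv
import io
from enum import IntEnum


class Coverage(IntEnum):
    """Segment coverage statuses, ordered from least to most complete."""
    NONE = 0
    FOOTHOLD = 1
    PARTIAL = 2
    DEEP = 3


def segment_spread(
    ledger_csv: str,
    column: str = "Coverage status",
    threshold: Coverage = Coverage.PARTIAL,
) -> float:
    """Percentage of ledger rows whose status is at or above the threshold."""
    rows = list(csv.DictReader(io.StringIO(ledger_csv)))
    if not rows:
        return 0.0
    hits = sum(
        1 for row in rows
        if Coverage[row[column].strip().upper()] >= threshold
    )
    return 100.0 * hits / len(rows)
```

Passing `threshold=Coverage.FOOTHOLD` gives the looser reading, which is useful for showing how sensitive the spread number is to the cutoff.
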
## Suggested Immediate Work Queue

### Queue A: Highest Leverage

1. Expand the first-pass segment coverage ledger beyond the currently seeded segments.
2. Trace `allocator_try_alloc_from_head_table`, `allocator_head_finalize_sweep`, and `allocator_phase_finalize_pass`.
3. Identify `ASYLUM.24` (resolved as `_ASS_StopAllSFX`; see Priority 2).

### Queue B: Repair And Stabilize

1. Review remaining high-caller gap functions.
2. Repair any still-blocking overlaps in small batches.
3. Re-decompile repaired ranges and keep only evidence-backed names.

### Queue C: Broaden Carefully

1. Expand into adjacent segments connected to already-understood clusters.
2. Avoid speculative naming.
3. Update the notes and the coverage ledger together after each verified batch.

## Concrete Progress Interpretation

If a single number is needed, use 35%.

If a more honest dashboard is acceptable, use all three:

- 80% of top-100 hot targets processed
- 35% overall behavioral decompilation progress
- 19% to 25% segment spread with meaningful analysis

That combination best reflects the actual state of the project.

## Source Anchors

Primary sources for this file:

- `crusader_segment_coverage_ledger.csv`
- `crusader_decompilation_notes.md`
- `crusader_ne_segments.csv`
- `tier4_output.txt`
- `tier5_output.txt`
- repo memory progress summary

## Next Update Rule

Update this file when one of the following happens:

- the overall estimate changes materially,
- a new subsystem reaches behavioral or stable status,
- a major blocker such as `0x4588` or `allocator_phase_finalize_pass` is resolved,
- or the segment coverage ledger becomes the new primary progress source.