Add Crusader-specific USECODE data and documentation
- Introduced new file `vm_mask_ladder.tsv` containing detailed mappings for Crusader USECODE VM masks and their associated descriptors.
- Added comprehensive documentation in `scummvm-crusader-reference.md` outlining the structure, findings, and implications for reverse-engineering the Crusader engine within ScummVM.
- Created `usecode-roundtrip-ir.md` to document the plan for converting Crusader USECODE bytes into a human-readable format, detailing the container layout, event names, and intrinsic tables.
- Implemented a PowerShell script `temp_usecode_sample.ps1` for extracting and analyzing USECODE data from the Crusader FLX files, providing insights into class and event structures.
parent 3daffbf113
commit de42fd1ea1
42 changed files with 21970 additions and 1522 deletions
60
plan-mid.md
@ -40,20 +40,47 @@ The estimates below are intentionally conservative. They measure verified behavi
- Point 8 cheat/input-lane pass is complete. `keyboard_input_cheat_dispatch` (`0007:04dc`) is renamed and has a full scan-code mapping decompiler comment. `cheat_entity_slot_cycle_and_update_sprite` (`000c:8072`) and `cheat_anim_type_cycle_and_refresh` (`000c:81c0`) are named. Three `DS:0x6050` gate helpers (`000c:8221/8227/822b`) are named. All seven cheat event case-handlers in the 000c dispatch function now have labels and disassembly comments (`event_0x141/0x241/0x441_cheat_debug_overlay_toggle`, `event_0x7e_cheat_latch_runtime_toggle`, `event_0x142/0x143_cheat_fullscreen_mode1/0_refresh`, `event_0x410_cheat_flag_604f_toggle`). The cheat-related string table in seg014 is documented (including the dev Easter-egg `"FART ...TRY... -laurie"`). The HACK MOVER / Immortality strings are confirmed present with no static code xrefs and are therefore attributed to the USECODE scripting layer. `0x844` (master cheat flag) vs `0x6045` (live cheat latch) separation remains solid.
- User-directed JELYHACK producer tracing is now tightened one layer upstream of `000d:208b` / `000d:21ed`: the immediate stream producer is the embedded mini-VM object created at context `+0x36`. `entity_vm_context_create_from_slot_index` (`000d:46ec`) feeds that object through `entity_vm_context_setup` (`000c:f844`), which uses `entity_vm_stack_init_with_data` (`000c:f6e8`) and `entity_vm_state_copy` (`000c:f772`) semantics to seed or clone `[+0xcc..+0xd2]`. The actual source payload comes from the runtime owner table at `0x6611 -> +0x1315/+0x1317 -> +0x10/+0x12`, addressed as `base + 0x0d*slot + 4`, and the resulting per-slot source is mirrored into `0x39ca`. This still does not expose a direct `JELYHACK`-named producer object, but it strengthens the current reading that `JELYHACK` / `JELYH2` contribute referent identity while neighboring `REE_BOOT` / `SURCAMEW` / `SFXTRIG` descriptors remain better candidates for event-bearing attachments.
- The next USECODE/JELYHACK pass now resolves the immediate owner-object writer too. `entity_vm_runtime_create` (`000d:4c99`) is the only writer of runtime `+0x1315/+0x1317`, via newly recovered `entity_vm_runtime_owner_resource_create` (`000d:7000`), and the companion `entity_vm_runtime_owner_resource_destroy` (`000d:70fd`) releases that helper. The `000d:7000` body does not copy a caller-supplied table directly: it constructs one embedded seg069/070 helper object, queries that helper for the required table size via vtable `+0x04`, allocates child `+0x10/+0x12`, then populates the `0x0d`-stride per-slot producer records through vtable `+0x0c`. Wrapper classification around `entity_vm_context_try_create_masked_for_entity` is tighter too: local wrapper `0004:f033` uses slot mask `0x8000:0007`, `FUN_0004_f05c` uses `0x2000:0015` and is reached from `0004:f2b3` after overlap/proximity and entity byte `+0x32` state checks, and `FUN_0005_27a4` uses `0x0001:0000` from the `000c:a09e` entity `+0x5b` bit-`0x0004` branch. This is enough for a conservative owner/resource classification, but not yet for a source-format-specific or descriptor-specific rename beyond that partial role name.
- The seg091 `000a:44fd/454d/45fe` cluster is now corrected too: these are reentrancy-guarded fatal-report helpers, not a wrapper selector or live event-dispatch lane. In particular, `entity_vm_slot_load_value` (`000d:51fd`) reaches `000a:44fd` via `PUSH 0x410; PUSH DS; PUSH 0x6616; CALLF 000a:44fd`, which now reads as an error/assert report path rather than a gameplay immortality-event producer.
- The `000d:7000` owner/resource helper is one step tighter as well: its embedded seg069/070 object is file-backed rather than abstract. Construction starts with `dos_file_handle_init` (`0009:1c00`), then uses helper vtable slot `+0x04` as the size query for the child `+0x10/+0x12` allocation and helper vtable slot `+0x0c` as the callback that populates the `0x0d`-stride slot table.
- The caller side for that owner/resource path is now anchored too. A new function object at `000d:44df` is recovered and named `entity_vm_runtime_init_from_path_if_configured`: it checks configured global/string `0x65a`, builds a path through seg072 helper `0009:3600` using globals `0x6d6:0x6d8` plus `0x65a`, validates it through `000a:500a`, then calls `entity_vm_runtime_create(0,0,path)` and seeds post-init runtime state at `0x6611`, `0x6615`, and `0x8c7c`.
- Follow-on correlation for the `000d:21ed/22bc` lane is now tighter: the two bytes consumed at `000d:22d2` and `000d:22ee` are confirmed as compact signed shape/count metadata for the link-matrix walk (not descriptor ids), while the streamed words consumed through `000d:2324..237b` and `entity_link` (`0008:7d27`) behave as runtime entity/link ids with `0x0400` list-flag filtering on writeback. The source stream provenance remains slot-indexed owner-table data (`(+0x10/+0x12) + 0x0d*slot + 4`) mirrored to `0x39ca`, so descriptor-family correlation is strengthened only at the ecosystem level (generic active-event neighborhoods) and not as a direct `SURCAM*` or class-id keyed lane.
- Pass 4 dispatcher/xref follow-up is now partially closed: `FUN_000d_ebe3` sequencing is re-verified (`177c -> 1acb -> 0988 -> 22bc -> optional 1d4a -> 2104`) and the opcode selector local is pinned to `[BP-0x32]` inside `000d:0988` with concrete compares against `0x19/0x1a/0x1b` (`000d:099b`, `000d:09a1`, `000d:0a07`, `000d:0a0d`). The direct upstream selector edge into entry `000d:ebe3` remains unresolved in MCP xref output, so no additional wrapper-level opcode number is promoted yet beyond that internal family identity.
- Readable EUSECODE targeting moved one concrete step forward again. Fresh decompile on `entity_vm_context_create_from_slot_index` re-verifies that the slot-backed VM source copied from the runtime owner table (`(+0x10/+0x12) + 0x0d*slot + 4`) is also written into the per-slot mirror row addressed through `0x39ca[context_slot]` during context creation, which tightens the current source model without over-claiming unique ownership of the mirror. Combined with the extracted `jelyhack_*` and `event_*` reports, this now supports a first verified pseudo-script rendering of two descriptor-side forms: `referent anchor + event-bearing attachment` (`JELYHACK` / `JELYH2` beside `REE_BOOT` / `SFXTRIG` / `SURCAMEW`) and `event hub + trigger-side neighbors` (`EVENT` / `COR_BOOT` / `NPCTRIG` inside the `ROLL_NS..VMAIL` island). The direct descriptor-id bridge and the upstream selector into `FUN_000d_ebe3` both remain unresolved.
- The next wrapper-expansion pass is now also complete. The `0005:2867/2918/2ae2/2d30` family and its adjacent `2c06/2c35/2c68/2c9b/2cd2/2d01` siblings now form one verified mask ladder around `entity_vm_context_try_create_masked_for_entity`: `0x0002:0001`, `0x0020:0005`, `0x0004:0002`, `0x0200:0009`, `0x0400:000a`, `0x0800:000b`, `0x0010:0004`, `0x1000:000c`, `0x4000:000e`, and `0x8000:000f`, alongside the earlier seeds `0x0001:0000`, `0x8000:0007`, and `0x2000:0015`. The strongest new caller-side evidence is gameplay-state-oriented rather than descriptor-id-oriented: `0005:2867` feeds `000c` state helpers that store the result into entity field `+0x39`, `0005:2918` is only reached from a `+0x3c == 0x20b` object lane and carries caller fields `+0x36/+0x38` as an extra dword, and `0005:2d30` gates on entity id range, class word bit `0x8000`, class-record `0x7e46` flag bits, class nibble values `4/7/8`, and a `0x0f16` / `0x20f` dispatch-entry emission path before attempting mask `0x8000:000f`. That makes the wrapper family correlate more naturally with active-event ecosystems (`EVENT` / `NPCTRIG` / `_BOOT`) than with a direct `JELYHACK` referent-anchor lookup, while still stopping short of a hard descriptor-class switch.
- First concrete follow-on pass for the adjacent wrapper entries is now in: direct caller anchors are confirmed for `0005:2c06` (`0005:0292`), `0005:2cd2` (`0005:0fee`), `0005:2c9b` (`0005:5946/59e9`), and `0005:2d01` (`0007:814e/822e`). The old `0005:2c68` indirect-dispatch evidence is now rejected: `0007:e521` and `0007:e73c` push caller-local data `DAT_0000_2c68` into the fatal-report helper `000a:44fd`, not into wrapper `0005:2c68`. That leaves both `0005:2c35` and `0005:2c68` genuinely xref-dark and shifts the remaining selector work back to the true upstream edge into `FUN_000d_ebe3`.
- The current VM owner/path lane is now tighter too. Seg072 helper `0009:3600` is not a raw buffer-advance helper; it is a rotating slash-aware path composer that uses five `0x50`-byte temp buffers and inserts `\` only when adjacent path parts need it. In `entity_vm_runtime_init_from_path_if_configured` (`000d:44df`), `0x65a` therefore reads as the configured relative runtime-owner filename/path component, while `0x6d6:0x6d8` reads as the mutable base/resource-root path buffer. The still-xref-dark wrappers `0005:2c35` / `0005:2c68` are narrower too: their signed extra word is forwarded through `entity_vm_context_create_from_slot_index` into `entity_vm_slot_load_value_plus_offset` and stored in context field `+0x34`, so they are offset-specialized mask wrappers rather than plain duplicates. Direct CALL xrefs into `FUN_000d_ebe3` are also now confirmed from `animation_ctor_variant_a/b/c`, but no new wrapper-level opcode number is proven yet beyond internal `0x19/0x1a/0x1b`.
- The owner/resource helper is now tighter at the file-loader level too. The embedded seg070 methods rooted at raw windows `0009:67b6` and `0009:6916` iterate helper-owned tables at object `+0x10/+0x18`, format per-entry paths with the seg001 string helpers, then open/read/close through `0009:1c3a` / `0009:2034` / `0009:1e61`. That pushes `entity_vm_runtime_owner_resource_create` (`000d:7000`) beyond a generic file-backed label: it now looks like an indexed external file-set loader whose vtable `+0x0c` callback materializes the `0x0d`-stride owner records consumed by the VM runtime.
- The `0x39ca` mirror follow-up is narrower now too. The newly checked `0008:709c/70cb`, `0008:7309/7338`, and `0008:85f9/8617` windows only save/restore or allocate the global `0x39ca:0x39cc` base pointer and zero the backing table. No new competing per-slot row writer is verified there; `entity_vm_context_create_from_slot_index` (`000d:46ec`) still remains the only confirmed writer of concrete `0x39ca[slot] = {source_off, source_seg}` mirror rows.
- The descriptor-side tooling is now one step closer to readable USECODE too. `tools/extract_eusecode_flx.py` widens the JELYHACK local window to include the nearby event-bearing `REE_BOOT` / `SURCAMEW` / `SFXTRIG` records and now emits `readable_descriptor_templates.md` / `.tsv`, which render conservative pseudo-script sketches for the current anchor, event-hub, environmental, and callback lanes rather than only raw descriptor indexes.
- The script-facing bridge artifact is now tighter too. `tools/extract_eusecode_flx.py` emits `runtime_descriptor_family_rankings.md` / `.tsv`, which rank descriptor families against the verified VM/runtime lanes: `EVENT` is the strongest active-event payload fit, `_BOOT` cores and `NPCTRIG` remain strong active-event satellites, `SFXTRIG` and the environmental classes stay moderate active-event fits, `JELYHACK` / `JELYH2` now occupy a dedicated referent-anchor/payload-owner lane, and `SURCAMNS` / `SURCAMEW` stay in a weaker callback/attachment lane until callback-specific opcode or mask evidence appears.
- External reference pass completed against ScummVM's Ultima 8 / Crusader engine. New note `docs/scummvm-crusader-reference.md` records concrete Crusader-specific evidence for `usecode/*.cpp`, `convert/crusader/*.h`, FLEX parsing, sound/speech/movie handling, map record layout, Crusader-only shape/typeflag handling, HUD gumps, and startup/world-state differences. Highest immediate payoff is on USECODE parsing: `usecode/usecode_flex.cpp` gives ScummVM's Crusader class-header/event-count interpretation, while `convert_usecode_crusader.h` provides named event ids `0x00..0x1f` and a large intrinsic-signature table that can be cross-checked against current VM/runtime work.
- The first ScummVM-guided USECODE cross-walk batch is now complete. `usecode/usecode_flex.cpp`, `usecode/usecode.cpp`, `usecode/uc_machine.cpp`, and the Crusader/Regret conversion tables now externally anchor the Crusader class/event model (`classid + 2`, names from object `1`, base offset from bytes `8..11` minus `1`, six-byte event records, event-number call translation through opcode `0x11`) and confirm that Remorse uses a `ByteSet(0x1000)` VM with the shared Crusader event-name table but version-specific intrinsic dispatch. New note `docs/usecode-roundtrip-ir.md` records the safe annotation policy plus a reversible IR v0 that preserves raw class/event bytes, intrinsic ordinals, inline-versus-indirect payload distinctions, and opaque-opcode fallback. Headline estimate is intentionally unchanged because this batch tightened interpretation more than direct binary coverage.
- The next binary-side validation step for that ScummVM cross-walk is now complete too. Sampled owner-loaded EUSECODE class records (`EVENT`, `NPCTRIG`, `SURCAMNS`, `JELYHACK`, `REE_BOOT`, `SURCAMEW`, `SFXTRIG`) now confirm the object-1 name-table and `classid + 2` body lookup locally: deriving `object_index = (table_offset - 0x80) / 8`, `class_id = object_index - 2`, and then reading object `1` at `4 + 13 * class_id` yields the expected class names. The same samples confirm a real header dword at bytes `8..11` and a 6-byte event-slot table at `+20` with `u16 unknown + u32 code/payload` structure.
- The current USECODE-header mismatch is now narrowed further and has a conservative working resolution. `uc_machine.cpp` uses ScummVM's decremented `get_class_base_offset()` as the live code-stream base, while the local owner-loaded records still fit bytes `8..11 = first code-byte offset` with 1-based event code offsets. Under that reading the local event-count rule is `(base_offset - 19) / 6`, equivalently `(raw_u32_at_8_11 - 20) / 6`, which matches the validated `32/33/35` slot tables from the `0x00d4/0x00da/0x00e6` headers. The `000d:44df -> 000d:4c99 -> 000d:7000 -> 000d:46ec` runtime path still shows indexed file loading and slot-table consumption but no verified per-class header rewrite, so the mismatch currently looks best explained by a ScummVM interpretation/detail issue rather than a proven owner-loader transform. No safe new event-label-to-runtime correlation was promoted from this pass, and the headline estimate remains unchanged.
- The conservative owner-loaded class rule is now implemented in `tools/extract_eusecode_flx.py` and refreshed on the current EUSECODE sample. New outputs `class_layout_index.tsv` and `class_event_index.tsv` now expose object index, class id, class-name hint, raw bytes `8..11`, derived `code_base_minus_one`, conservative event counts, and raw 6-byte event rows with ScummVM slot-name hints, giving the round-trip work a concrete parser baseline instead of only prose notes.
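The conservative class-record rule described in the bullets above can be sketched as a few small helpers. This is a minimal illustration, not the code in `tools/extract_eusecode_flx.py`: the function names are hypothetical, and little-endian byte order is an assumption based on the DOS/x86 target rather than something stated in these notes.

```python
import struct

EVENT_TABLE_OFFSET = 20  # event-slot table begins at +20 in a class record
EVENT_ROW_SIZE = 6       # each row: u16 unknown word + u32 code/payload
NAME_TABLE_START = 4     # object 1: first name row sits at offset 4
NAME_ROW_SIZE = 13       # object 1: one 13-byte row per class id


def object_index_from_table_offset(table_offset: int) -> int:
    """object_index = (table_offset - 0x80) / 8, per the notes."""
    return (table_offset - 0x80) // 8


def class_id_from_object_index(object_index: int) -> int:
    """class_id = object_index - 2 (the `classid + 2` body lookup, inverted)."""
    return object_index - 2


def name_row_offset(class_id: int) -> int:
    """Offset of the class-name row inside object 1: 4 + 13 * class_id."""
    return NAME_TABLE_START + NAME_ROW_SIZE * class_id


def conservative_event_count(class_record: bytes) -> int:
    """Event count = (raw_u32_at_8_11 - 20) / 6.

    Assumption: the dword at bytes 8..11 is little-endian.
    """
    (raw_base,) = struct.unpack_from("<I", class_record, 8)
    span = raw_base - EVENT_TABLE_OFFSET
    if span < 0 or span % EVENT_ROW_SIZE:
        raise ValueError(f"header 0x{raw_base:04x} does not fit the 6-byte slot rule")
    return span // EVENT_ROW_SIZE


def event_rows(class_record: bytes):
    """Return (slot, unknown_u16, payload_u32) tuples without interpreting them."""
    rows = []
    for slot in range(conservative_event_count(class_record)):
        offset = EVENT_TABLE_OFFSET + EVENT_ROW_SIZE * slot
        unknown, payload = struct.unpack_from("<HI", class_record, offset)
        rows.append((slot, unknown, payload))
    return rows
```

Under this rule the raw header values `0x00d4`, `0x00da`, and `0x00e6` give 32, 33, and 35 slots, which is the match reported above. The leading event word is kept as an uninterpreted `u16` so the raw rows stay round-trippable.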
### Current Focus
1. User-directed USECODE/JELYHACK lane: identify who populates the runtime owner/resource object returned by `000d:7000`, especially the `+0x10/+0x12` per-slot producer table and the gameplay wrappers around `entity_vm_context_try_create_masked_for_entity` that decide which entities can materialize slot-backed VM contexts.
1. User-directed USECODE/JELYHACK lane: use the updated parser outputs to recover a small safe set of non-zero slot semantics, then tighten the reversible script IR around the already verified `000d` VM families before returning to deeper producer/dispatcher mapping.
2. Finish Priority 0 refinement by promoting more exact segment rows where notes already support a verified foothold.
3. Continue the Priority 1 pass by tracing the higher-level startup/display callers, branch outcomes, pre-entry object lanes, palette-fade ownership, watch/camera controller ownership, and active sprite/object ownership that stitch the seg137 palette helper family into the wider `0x4588` / dispatch-entry object-role lane.
### Next Resume Point
1. Continue the user-directed USECODE/JELYHACK follow-on from the recovered producer chain, especially by:
- identifying the concrete seg069/070 helper class and source arguments behind `entity_vm_runtime_owner_resource_create` (`000d:7000`), especially the vtable `+0x04` size query and `+0x0c` table-population call that fill child `+0x10/+0x12`,
- extending wrapper classification outward from the now-verified seeds `0004:f033` (`0x8000:0007`), `FUN_0004_f05c` (`0x2000:0015`), and `FUN_0005_27a4` (`0x0001:0000`) into the neighboring `0005:2867/2918/2ae2/2d30` family so the slot-mask groups can be mapped to concrete gameplay object classes,
- checking whether any recovered owner-table records or slot families line up with the JELYHACK-island referent/event neighborhood more strongly than with generic entity-script traffic,
- and tracing whether the `0x39ca` per-slot payload mirror is initialized only from `entity_vm_context_create_from_slot_index` or is also refreshed by other runtime-owner helper paths.
1. Continue the user-directed USECODE/JELYHACK follow-on from the recovered producer chain, now with the ScummVM class/event cross-walk in place, especially by:
- validating the new conservative rule against any future main USECODE container sample that becomes available, to decide whether the current mismatch is EUSECODE-specific or whether ScummVM's `get_class_event_count()` arithmetic should be treated as the outlier for Crusader,
- mining the new `class_layout_index.tsv` / `class_event_index.tsv` outputs for repeat non-zero slot patterns before doing more ad hoc byte inspection,
- broadening the local owner-loaded spot-check beyond the first named cluster when convenient, especially across additional `_BOOT`, environmental-event, and callback-eventtrigger classes, while treating the present object-1 / `classid + 2` indexing as the current working model,
- determining whether the owner/resource helper behind `entity_vm_runtime_owner_resource_create` (`000d:7000`) exposes original object indices through its helper `+0x18` table or only slot-local file ids, now that the `+0x10/+0x14/+0x18` table contract is verified,
- re-tracing `0005:2c35` / `0005:2c68` through real caller-role recovery now that they are narrowed to signed slot-offset wrappers, while keeping the disproven `000a:44fd` selector hypothesis retired,
- mapping only the now-verified non-zero low slot ids from sampled classes (`JELYHACK` slot `1`; `EVENT`/`SFXTRIG` slot `10`; `NPCTRIG` slots `10/32`; `REE_BOOT` slots `10/15/16`; `SURCAMNS`/`SURCAMEW` slots `1/10/32/33/34`) onto ScummVM event labels where binary behavior actually matches, and otherwise keeping them as numeric slot annotations,
- separating safe Remorse annotations from Regret-only intrinsic numbering by treating ScummVM intrinsic names as ordinal/signature hints rather than rename authority,
- turning the new conservative parser rule into tooling/tests first, while preserving raw bytes `8..11`, raw 6-byte event rows, and the unresolved leading event word in emitted IR artifacts,
- resolving the remaining selector/opcode path inside `FUN_000d_ebe3` by lifting the write/read path for opcode-local `[BP-0x32]` and any hidden jump/call-table case entry from the now-confirmed `animation_ctor_variant_a/b/c` caller lane,
- hardening the first reversible script IR around preserved class headers, raw event-entry words, intrinsic ordinals, inline-versus-indirect payload forms, and opaque-opcode fallback,
- identifying the concrete seg069/070 helper class behind `entity_vm_runtime_owner_resource_create` (`000d:7000`) now that the helper is narrowed to an indexed external file-set loader around raw windows `0009:67b6` / `0009:6916`,
- tracing the remaining caller roles for the verified ladder entries `0005:2c06/2c35/2c68/2c9b/2cd2/2d01` and the larger `0005:2d30` gate so the slot-mask groups can be mapped to concrete gameplay object/state classes rather than only mask numbers,
- and checking whether any runtime-owner helper besides `entity_vm_context_create_from_slot_index` writes per-slot mirror rows through the far array rooted at `0x39ca`, now that the currently checked `0008:709c/70cb`, `0008:7309/7338`, and `0008:85f9/8617` sites are constrained to base-pointer save/restore or table allocation rather than row writes.
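The slot-indexed source model behind these follow-ups (owner table at child `+0x10/+0x12`, record stride `0x0d`, source field at `+4`, mirrored into `0x39ca[slot]` as `{source_off, source_seg}`) can be sketched as follows. This is a hedged illustration only: the helper names are hypothetical, the owner table is modelled as a flat byte buffer, and reading the `+4` field as a little-endian offset/segment word pair is an assumption inferred from the mirror-row shape rather than a verified layout.

```python
import struct

SLOT_RECORD_STRIDE = 0x0D  # owner-table records are 0x0d bytes apart
SLOT_SOURCE_FIELD = 4      # the VM source pointer sits at record + 4


def slot_source_field_offset(slot: int) -> int:
    """Offset of the source field relative to the owner-table base."""
    return SLOT_RECORD_STRIDE * slot + SLOT_SOURCE_FIELD


def read_slot_source(owner_table: bytes, slot: int):
    """Read the per-slot source as (source_off, source_seg).

    Assumption: the field is a 16:16 far pointer stored offset-first.
    """
    return struct.unpack_from("<HH", owner_table, slot_source_field_offset(slot))


def mirror_slot_source(mirror: dict, context_slot: int, owner_table: bytes, slot: int):
    """Model the only confirmed mirror-row writer: copy the slot source
    into the row keyed by the context slot (a dict stands in for 0x39ca)."""
    mirror[context_slot] = read_slot_source(owner_table, slot)
    return mirror[context_slot]
```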
2. Keep classifying the seg126 pre-entry text-renderer lane around `transition_preentry_setup_resources`, `transition_preentry_step_script`, and `transition_preentry_release_resources`, especially by:
- comparing more preset `0x10` / `0x11` text-renderer callsites,
- tracing who owns the rendered buffer loaded into `0x6301:0x6303`,
@ -88,7 +115,7 @@ The estimates below are intentionally conservative. They measure verified behavi
- Immortality-specific follow-on is now narrowed but not closed: `JELYHACK` and `JELYH2` are confirmed as real referent-only EUSECODE descriptors; `NPCTRIG` is confirmed as an event-capable trigger descriptor; `CRUZTRIG` / `TRIGPAD` expose `referent,item,elev`; but no extracted record has yet been tied directly to binary event value `0x410`.
- The clustering pass tightened the local candidate set around `JELYHACK`: the immediate neighborhood now includes `SPECIAL`, `TRIGPAD`, `DATALINK`, `HOFFMAN`, `REE_BOOT`, `SURCAMEW`, and `SFXTRIG`, which is a plausible map/object island rather than random sparse table order.
- The strongest `record_table_parse_buffer` caller evidence (`000e:1b9f..1d49`) now appears to belong to the animation-object field lane, because the surrounding setup manipulates the already-mapped animation fields at `+0x117/+0x11b/+0x11f/+0x123` and `+0xeaf/+0xeb1`. That weakens the earlier assumption that `000e:3639` is the primary EUSECODE loader and shifts the likely binary-descriptor consumer search back toward the `000d` VM/object path.
- The first concrete `000c` to `000d` bridge in that direction is now visible at `entity_vm_set_value_from_slot_plus_offset` (`000c:f95f`): it calls `entity_vm_slot_load_value_plus_offset` (`000d:5572`) and stores the return pair into object fields `+0xd6/+0xd8`; on the `000d` side, `entity_vm_slot_load_value` (`000d:51fd`) contains a verified `PUSH 0x410` path. Supporting slot helpers in the same lane are now named too (`entity_vm_slot_find_or_select`, `entity_vm_slot_decrement_use_count`, `entity_vm_slot_release_value`). This still does not prove the immortality trigger chain, but it is the strongest current code-side connection between the mini-VM lane and a live `0x410` producer.
- The first concrete `000c` to `000d` bridge in that direction is now visible at `entity_vm_set_value_from_slot_plus_offset` (`000c:f95f`): it calls `entity_vm_slot_load_value_plus_offset` (`000d:5572`) and stores the return pair into object fields `+0xd6/+0xd8`. Supporting slot helpers in the same lane are now named too (`entity_vm_slot_find_or_select`, `entity_vm_slot_decrement_use_count`, `entity_vm_slot_release_value`). The previously noted `000d:51fd` `PUSH 0x410` site is now reclassified as a fatal-report call into `000a:44fd` with `DS:6616`, so it no longer supports a direct compiled-code immortality-event bridge.
- The adjacent `000d:45xx..4exx` island is now promoted out of `FUN_*` placeholders as one coherent VM runtime/context family. Newly named helpers include `entity_vm_runtime_create` / `entity_vm_runtime_init_slots` / `entity_vm_runtime_release_slots` / `entity_vm_runtime_destroy`, `entity_vm_slot_index_from_entity`, `entity_vm_context_try_create_masked_for_entity`, `entity_vm_context_create_from_slot_index`, `entity_vm_context_sync_global_value_and_dispatch`, and the context save/load/destroy helpers. The runtime global at `0x6611` now reads as a real owner for this lane rather than an opaque far pointer.
- Two large caller bodies at `000d:208b` and `000d:21ed` now stand out as concrete context-construction sites: both feed per-object stream/data state from `+0xcc/+0xce` into `entity_vm_context_create_from_slot_index`, then continue by reading from the seeded `+0xd6/+0xd8` bytecode/value lane. This is the clearest current evidence that the `000d` interpreter/object family, not the `000e` text parser, is the near-runtime consumer to keep following for the immortality trigger.
- A second supporting lane is now named too: `entity_vm_referent_registry_init` / `destroy` / `alloc` / `release_by_id` / `free_node` show that `0x8c8c/0x8c8e/0x8c90/0x8c94` form a free-list-backed referent registry. `entity_vm_set_field_da_to_global` writes `0x8c94` from the context `+0xda` lane before entering the still-misaligned `000c:3350` body, which is the first concrete runtime mechanism explaining how referent-only descriptors such as `JELYHACK` can still participate in script state.
@ -100,6 +127,7 @@ The estimates below are intentionally conservative. They measure verified behavi
- The first opcode family under that lane is also less anonymous now: `000d:0988` can either append unique payload entries or remove matching ones depending on the opcode id (`0x1a/0x1b` taking the removal path), and both branches return through `entity_vm_opcode_finish`.
- That opcode family is now classified one step further: `0x19` = append-unique indirect/string-like payloads, `0x1a` = remove-matching indirect/string-like payloads, `0x1b` = remove-matching inline payloads, and the same helper body strongly implies `0x18` as the missing append-unique inline sibling.
- The first stable `+0xd6/+0xd8` byte-lane semantics are now visible in the two large caller bodies too. The `000d:208b` block is a simple materialize-or-forward path after `entity_vm_context_create_from_slot_index`, while `000d:21ed` copies a caller-owned inline blob into the context `+0x102` buffer and then consumes two stream bytes as compact shape/count metadata before building an `entity_link` closure matrix from the following caller-stream words.
- EUSECODE readability moved one concrete step forward in this pass: decompile output now supports a first verified IR vocabulary for the same lane: `APPEND_UNIQUE_INLINE` (implied `0x18` sibling), `APPEND_UNIQUE_INDIRECT` (`0x19`), `REMOVE_MATCHING_INDIRECT` (`0x1a`), `REMOVE_MATCHING_INLINE` (`0x1b`), `MATERIALIZE_OR_FORWARD_VALUE` (`000d:208b`), `PREPEND_INLINE_PAYLOAD` (`000d:21ed`), and `BUILD_ENTITY_LINK_MATRIX` (`000d:22bc` with `entity_link` at `0008:7d27`). The `000d:22bc` tail also confirms a pushback filter where non-`0x0400` results are written back to the caller stream before `entity_vm_opcode_finish`.
- Current best JELYHACK reading is tighter than before: the extracted chunks still only expose `referent`, but the new referent-registry work means that does not relegate them to inert map labels. The most defensible present model is `JELYHACK/JELYH2 = referent anchors`, with the actual immortality/event behavior carried by neighboring event-capable descriptors in the same local island (`REE_BOOT`, `SURCAMEW`, `SFXTRIG`, or a nearby generic event/trigger record).
- That readability step now has a first concrete artifact: `tools/extract_eusecode_flx.py` emits `referent_anchor_event_graph.tsv` plus a focused `jelyhack_island_graph.md`, which turns the local table neighborhood into a first readable anchor-to-event view instead of only raw descriptor rows.
- The extractor now also emits `jelyhack_descriptor_compare.tsv`, and its first result is useful: `JELYHACK` and `JELYH2` have identical first 16 header words as referent-only sibling descriptors, while `REE_BOOT`, `SURCAMEW`, and `SFXTRIG` show materially richer header/state patterns consistent with the event-bearing side of the island.
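The `0x18..0x1b` opcode family described in this hunk lends itself to a small annotation table with an opaque fallback. The sketch below is one possible shape, not an existing tool: `annotate_opcode` and `apply_list_opcode` are hypothetical names, the trailing `?` marker for the implied `0x18` sibling is an illustrative convention, and the list semantics model only "append if absent" and "remove matching" as stated in the notes.

```python
# IR name plus a flag for whether the opcode is directly verified.
# 0x18 is only implied by the shared helper body, so it is flagged False.
OPCODE_IR = {
    0x18: ("APPEND_UNIQUE_INLINE", False),
    0x19: ("APPEND_UNIQUE_INDIRECT", True),
    0x1A: ("REMOVE_MATCHING_INDIRECT", True),
    0x1B: ("REMOVE_MATCHING_INLINE", True),
}


def annotate_opcode(opcode: int) -> str:
    """Return an IR label, keeping unknown opcodes as opaque numeric tokens."""
    entry = OPCODE_IR.get(opcode)
    if entry is None:
        return f"OPAQUE_0x{opcode:02x}"
    name, verified = entry
    return name if verified else name + "?"


def apply_list_opcode(opcode: int, entries: list, payload) -> list:
    """Apply append-unique or remove-matching semantics to a payload list.

    Raises KeyError for opcodes outside the classified family.
    """
    name, _verified = OPCODE_IR[opcode]
    if name.startswith("APPEND_UNIQUE"):
        if payload not in entries:
            entries.append(payload)
    else:
        entries[:] = [entry for entry in entries if entry != payload]
    return entries
```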
@ -113,18 +141,24 @@ The estimates below are intentionally conservative. They measure verified behavi
- The environmental event lane is now promoted out of a generic family label into a clearer structural pattern. `environmental_family_compare.tsv` shows `FLAMEBOX` and `STEAMBOX` as close hazard-event siblings with the same active-event backbone plus direction/count, while `NOSTRIL` is the smaller fire-specific variant that keeps the dual-hazard references and counters but drops the direction/newType side.
- The callback-trigger lane is also more defensible now: `callback_trigger_compare.tsv` confirms that `SURCAMNS` and `SURCAMEW` are effectively one shared callback template, differing only in one `therma` slot tag offset. That keeps the active `event` lane and callback `eventTrigger` lane separated by more than just naming convention.
- Runtime follow-through has resumed too: `000d:ebe3` is now backed by direct instruction evidence as one ordered VM/opcode driver body that calls `000d:177c`, `000d:1acb`, `000d:0988`, internal block `000d:22bc`, then `000d:1d4a` and `000d:2104` in sequence. `000d:ec31` is confirmed as only the internal `CALL 000d:22bc` site inside that body, so the inner block is still not a safe standalone rename target.
- Payload-shape reuse inside that same `FUN_000d_ebe3` sequencer is now partially classified: `000d:177c` behaves as a word-literal stream push, `000d:1acb` consumes one streamed dword pair and pushes a boolean word, `000d:21ed/22bc` remains the signed-byte metadata plus word-id matrix lane, `000d:1d4a` is still a boundary-suspect trap island, and `000d:2104` is a mixed scalar/handle out-pointer finalizer. This is now documented as a compact opcode-to-payload-shape matrix in docs.
- `entity_vm_context_try_create_masked_for_entity` (`000d:463a`) is now pinned down one step further: it first checks the runtime-disable byte at `0x6610`, computes the entity slot, tests the owner-side slot mask in the runtime owner table, and only then creates a context. On success it reports either an immediate result (success with cleared output word) or an object-backed result (success with the created object's low word), which is the clearest current typed boundary between gameplay entities and VM-backed object results.
- The immediate owner-object writer is now identified too. `entity_vm_runtime_create` (`000d:4c99`) stores the only verified runtime `+0x1315/+0x1317` value by calling the newly recovered `entity_vm_runtime_owner_resource_create` (`000d:7000`), whose helper-managed body allocates child `+0x10/+0x12` from a vtable `+0x04` size query and fills the `0x0d`-stride slot table through vtable `+0x0c`. The paired release path is `entity_vm_runtime_owner_resource_destroy` (`000d:70fd`).
- The first wrapper-side mask families are now anchored by direct instruction evidence as well: local wrapper `0004:f033` passes `0x8000:0007`, `FUN_0004_f05c` passes `0x2000:0015` from the `0004:f2b3` overlap/proximity branch with entity byte `+0x32` state toggling, and `FUN_0005_27a4` passes `0x0001:0000` from the `000c:a09e` entity `+0x5b` bit-`0x0004` branch. This is enough to distinguish at least three gameplay-side mask lanes without yet claiming descriptor-specific ownership such as `JELYHACK` versus `REE_BOOT`.
- One exact `0x410` collision that could have reopened the wrong lane is now ruled out: `000e:0953` pushes literal `0x410` into imported `ASYLUM.27` from the animation/audio path after setting the `+0xef1` audio-completion byte. Because `ASYLUM.DLL` is the `ASS_*` audio/media library, this is not evidence for a second gameplay or USECODE event source; the live compiled-code bridge for the immortality event remains the `000d` VM lane at `entity_vm_slot_load_value` (`000d:51fd`).
- One exact `0x410` collision that could have reopened the wrong lane is now ruled out: `000e:0953` pushes literal `0x410` into imported `ASYLUM.27` from the animation/audio path after setting the `+0xef1` audio-completion byte. Because `ASYLUM.DLL` is the `ASS_*` audio/media library, this is not evidence for a second gameplay or USECODE event source; the other previously suspected compiled-code bridge at `000d:51fd` is now ruled out too because that site calls the seg091 fatal-report helper `000a:44fd` with `DS:6616`, not gameplay dispatch.
9. Revisit `allocator_phase_finalize_pass` only where it intersects the same callback object semantics, rather than broad allocator mechanics that are already sufficiently constrained.
10. Continue `ASYLUM.24` only after the `0x4588` / dispatch-entry lane and `0004:1e00` transition path have no further cheap wins.
11. User-directed USECODE/JELYHACK side lane: trace who seeds the caller stream/data pair at `+0xcc/+0xce` before the `000d:208b` and `000d:21ed` context-construction blocks, and correlate those producer-side objects with referent ids or descriptor-class neighborhoods that could distinguish `JELYHACK` / `JELYH2` anchors from the neighboring `REE_BOOT`, `SURCAMEW`, and `SFXTRIG` event-bearing attachments.
11. User-directed USECODE/JELYHACK side lane (next actionable IR step): map the new sequencer-local payload-shape matrix to concrete opcode numbers by recovering the upstream opcode dispatcher lane that selects `FUN_000d_ebe3`, then test whether those opcode numbers correlate better with active-event families (`EVENT`/`NPCTRIG`/`*_BOOT`/`SFXTRIG`) than with callback-trigger (`SURCAM*`) descriptors.
12. Use the new ScummVM reference note as a focused cross-check batch before deeper parser or VM work:
- compare local USECODE/EUSECODE container assumptions against ScummVM's Crusader `UsecodeFlex` class-header parsing (`classid + 2`, class-name table at object `1`, base offset from bytes `8..11`, event-count formula `(base + 19) / 6`),
- import the conservative Crusader event-name table from `convert_usecode_crusader.h` (`look/use/anim/cachein/hit/gotHit/hatch/schedule/release/equip/unequip/combine/calledFromAnim/enterFastArea/leaveFastArea/avatarStoleSomething/animGetHit/unhatch`) into the current USECODE annotation workflow where they match verified behavior,
- compare current weapon/ammo and item-family reads against ScummVM's `WeaponInfo`, `ShapeInfo`, and `ItemFactory` structures so quality/ammo/clip semantics are kept aligned with evidence,
- and prioritize local parsers or validators for the ScummVM-loaded Crusader data files that are still weakly covered here: `dtable.flx`, `damage.flx`, `glob.flx`, `wpnovlay.dat`, `sound.flx`, and per-shape speech FLEX archives.
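The container checks in the first sub-item of step 12 can be sketched as a small validator. This is a minimal sketch, not ScummVM's code: the function names are hypothetical, the little-endian byte order for bytes `8..11` is an assumption, and the sample header is synthetic rather than taken from a real FLX file. Only the `classid + 2` indexing, the object `1` name table, the bytes `8..11` base offset, and the `(base + 19) / 6` formula come from the note above.

```python
import struct

# Per the note above: the class-name table is FLEX object 1.
NAME_TABLE_OBJECT = 1


def class_object_index(classid: int) -> int:
    """FLEX object index that holds the given class (classid + 2)."""
    return classid + 2


def class_base_offset(class_obj: bytes) -> int:
    """Read the base offset from bytes 8..11 of a class object.

    Assumption: the four bytes form a little-endian unsigned dword.
    Objects too short to hold the field are reported as base 0.
    """
    if len(class_obj) < 12:
        return 0
    (base,) = struct.unpack_from("<I", class_obj, 8)
    return base


def class_event_count(class_obj: bytes) -> int:
    """Event count from the (base + 19) / 6 formula, using integer division."""
    if len(class_obj) < 12:
        return 0
    return (class_base_offset(class_obj) + 19) // 6


# Synthetic class object: 8 filler bytes, base offset 167, 4 trailing bytes.
sample = bytes(8) + struct.pack("<I", 167) + bytes(4)
print(class_object_index(0), class_base_offset(sample), class_event_count(sample))
```

Running the same three functions over every class object in a local USECODE/EUSECODE dump and comparing the results with the current container assumptions would flag disagreements before any deeper parser work.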
### Headline Estimate
- Overall useful decompilation progress: about 35%
- Reasonable uncertainty band: about 30% to 40%
- Overall useful decompilation progress: about 37%
- Reasonable uncertainty band: about 31% to 40%

This is the best single-number estimate for the full game right now.
@ -133,8 +167,8 @@ This is the best single-number estimate for the full game right now.
| Metric | Estimate | Meaning |
|---|---:|---|
| Top 100 far-call target coverage | about 80% | Roughly 80 of the top 100 most-called far-call targets have been named or materially classified |
| Whole-program behavioral coverage | about 35% | Verified subsystem and function understanding across the executable |
| Segment spread with meaningful analysis | about 19% to 25% | Segments with more than a trivial foothold or isolated note |
| Whole-program behavioral coverage | about 37% | Verified subsystem and function understanding across the executable |
| Segment spread with meaningful analysis | about 20% to 26% | Segments with more than a trivial foothold or isolated note |
| Tooling maturity for continued work | about 75% | Core repair, lookup, and fallback automation needed for continued progress |
### Why These Numbers Differ