Add Crusader-specific USECODE data and documentation
- Introduced new file `vm_mask_ladder.tsv` containing detailed mappings for Crusader USECODE VM masks and their associated descriptors.
- Added comprehensive documentation in `scummvm-crusader-reference.md` outlining the structure, findings, and implications for reverse-engineering the Crusader engine within ScummVM.
- Created `usecode-roundtrip-ir.md` to document the plan for converting Crusader USECODE bytes into a human-readable format, detailing the container layout, event names, and intrinsic tables.
- Implemented a PowerShell script `temp_usecode_sample.ps1` for extracting and analyzing USECODE data from the Crusader FLX files, providing insights into class and event structures.
parent 3daffbf113
commit de42fd1ea1
42 changed files with 21970 additions and 1522 deletions
@@ -42,6 +42,7 @@ A small helper cluster in the raw `000e:` area implements a fixed-size CRLF reco
- `table_end = 0x6090`, which matches the first non-zero payload offset
- `403` non-zero entries in the current file
- `tools/extract_eusecode_flx.py` now parses the full validated table and emits all `403` non-zero entries under `USECODE/EUSECODE_extracted/`, including `entry_index.tsv`, `descriptor_index.tsv`, `descriptor_neighborhoods.tsv`, `summary.json`, per-chunk `.bin`, and `.strings.txt` sidecars.
- The extractor now also carries the conservative owner-loaded class rule directly into machine-readable outputs: `class_layout_index.tsv` records `object_index`, `class_id`, the raw bytes-`8..11` field, derived `code_base_minus_one`, and `conservative_event_count`, while `class_event_index.tsv` expands parsed classes into raw 6-byte event rows with slot numbers, ScummVM event-name hints for `0x00..0x1f`, unresolved leading words, and raw code-offset dwords.
- The generated reports now expose lightweight descriptor summaries (`primary_label`, `field_names`, `field_tags`) so the object lane can be searched by field grammar instead of only by raw names.
- The extracted data now separates into at least two lanes:
  - text-heavy records that fit the `000e:` CRLF parser model, such as `DATALINK` mission/objective text and `TEXTFIL1` message banks
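
The `class_event_index.tsv` row layout described above (raw 6-byte event rows, each with an unresolved leading word and a raw code-offset dword) can be sketched as follows. This is a minimal illustration, not the code in `tools/extract_eusecode_flx.py`: the names `EventRow` and `iter_event_rows` are hypothetical, and both little-endian byte order and the word-then-dword field order are assumptions drawn from the wording of the notes.

```python
import struct
from typing import Iterator, NamedTuple

# The notes say ScummVM event-name hints exist for slots 0x00..0x1f only.
# No names are filled in here; the sketch only records whether a hint applies.
HINTED_SLOTS = range(0x00, 0x20)


class EventRow(NamedTuple):
    slot: int            # row index within the class event table
    leading_word: int    # unresolved 16-bit field
    code_offset: int     # raw 32-bit code offset
    has_name_hint: bool  # True when the slot falls in 0x00..0x1f


def iter_event_rows(table: bytes, event_count: int) -> Iterator[EventRow]:
    """Split a raw event table into 6-byte rows.

    ASSUMPTION: each row is a little-endian word followed by a dword.
    """
    row = struct.Struct("<HI")  # 2 + 4 = 6 bytes, no padding
    if len(table) < event_count * row.size:
        raise ValueError("event table is shorter than event_count rows")
    for slot in range(event_count):
        word, offset = row.unpack_from(table, slot * row.size)
        yield EventRow(slot, word, offset, slot in HINTED_SLOTS)


if __name__ == "__main__":
    # Synthetic bytes for illustration only, not taken from EUSECODE.FLX.
    sample = bytes.fromhex("0100" "10000000" "0000" "00000000")
    for event in iter_event_rows(sample, 2):
        print(event)
```

Keeping the leading word as an opaque integer mirrors the notes, which treat that field as unresolved.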
@@ -92,7 +93,7 @@ A small helper cluster in the raw `000e:` area implements a fixed-size CRLF reco
- opcode `0x1a` = remove matching indirect/string-like payload entries from the referent chain
- opcode `0x1b` = remove matching inline/fixed-size payload entries from the referent chain
- the same helper body also implies the missing sibling `0x18` as the inline/fixed-size append-unique form, because only `0x19/0x1a` set the indirect compare flag while only `0x1a/0x1b` take the removal path
-- The first concrete `000c` to `000d` bridge inside that lane remains `entity_vm_set_value_from_slot_plus_offset` at `000c:f95f`: it calls `entity_vm_slot_load_value_plus_offset`, stores its return pair into object fields `+0xd6/+0xd8`, and sits immediately beside other `entity_vm_*` helpers in the `000c:f6b8..f9d9` mini-VM cluster. On the `000d` side, `entity_vm_slot_load_value_plus_offset` wraps `entity_vm_slot_load_value`, and `entity_vm_slot_load_value` contains a concrete `PUSH 0x410` event-emission path at `000d:5290`.
+- The first concrete `000c` to `000d` bridge inside that lane remains `entity_vm_set_value_from_slot_plus_offset` at `000c:f95f`: it calls `entity_vm_slot_load_value_plus_offset`, stores its return pair into object fields `+0xd6/+0xd8`, and sits immediately beside other `entity_vm_*` helpers in the `000c:f6b8..f9d9` mini-VM cluster. On the `000d` side, `entity_vm_slot_load_value_plus_offset` wraps `entity_vm_slot_load_value`, but the old `PUSH 0x410` suspicion at `000d:5290` is now rejected: that site reaches the seg091 fatal-report helper family at `000a:44fd`, not live gameplay dispatch.
- The two main `000d` caller blocks beneath that bridge now have a first stable byte/value reading too:
  - internal block `000d:208b` is the simple materialize-or-forward path: it creates one VM context from the caller's stream state, checks the returned object flags, and either writes the returned value pair straight to the caller output slot or forwards the created object's low word through the shared opcode epilogue
  - internal block `000d:21ed` is the inline-payload path: it creates the same VM context, prepends the caller-owned blob into the backward-growing context buffer at `+0x102`, then consumes two bytes from the seeded `+0xd6/+0xd8` lane as small shape/count metadata before building an `entity_link` closure matrix from the following caller-stream words and pushing back the non-`0x0400` results
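
The two-flag reading of the `0x18..0x1b` opcode family earlier in this hunk (only `0x19/0x1a` set the indirect compare flag, only `0x1a/0x1b` take the removal path) can be written out as a small decode table. This is a sketch of that inference only: the names `ChainOp`, `decode_chain_opcode`, and `describe` are hypothetical, and the `0x18` and `0x19` descriptions follow from the flag pattern rather than from lines shown in this hunk.

```python
from typing import NamedTuple


class ChainOp(NamedTuple):
    indirect_compare: bool  # compare via indirect/string-like payload
    removes: bool           # take the removal path instead of appending


# Flag membership as stated in the notes for the shared helper body.
INDIRECT_OPCODES = {0x19, 0x1A}
REMOVAL_OPCODES = {0x1A, 0x1B}


def decode_chain_opcode(opcode: int) -> ChainOp:
    """Map an opcode in 0x18..0x1b onto the two flags from the notes."""
    if not 0x18 <= opcode <= 0x1B:
        raise ValueError(f"opcode {opcode:#04x} is outside the 0x18..0x1b family")
    return ChainOp(opcode in INDIRECT_OPCODES, opcode in REMOVAL_OPCODES)


def describe(opcode: int) -> str:
    """Render the flag pair in the wording used by the notes."""
    op = decode_chain_opcode(opcode)
    payload = "indirect/string-like" if op.indirect_compare else "inline/fixed-size"
    action = "remove matching" if op.removes else "append-unique"
    return f"{action} {payload}"


if __name__ == "__main__":
    for code in range(0x18, 0x1C):
        print(f"{code:#04x}: {describe(code)}")
```

Treating the family as two independent bits is what makes `0x18` fall out as the inline append-unique sibling without a separate handler body.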
@@ -121,6 +122,8 @@ A small helper cluster in the raw `000e:` area implements a fixed-size CRLF reco
- `environmental-event`: `FLAMEBOX`, `NOSTRIL`, `STEAMBOX`
- `callback-eventtrigger`: `SURCAMNS`, `SURCAMEW`
- That split matters because it is the first extractor-backed distinction between active event carriers and callback-only trigger holders. The `69:0A00 -> event` classes now look like the active event-bearing core of the descriptor system, while the surveillance classes with `24:0A02 -> eventTrigger` are better treated as callback/attachment endpoints rather than peer event hubs.
- The extractor now emits a stronger script-facing bridge artifact too: `runtime_descriptor_family_rankings.md` / `.tsv` rank those descriptor families against the verified runtime lanes instead of only listing neighborhoods. Current best fit is `EVENT` as the strongest active-event payload lane, `_BOOT` cores and `NPCTRIG` as strong satellites, `SFXTRIG` / environmental classes as moderate active-event fits, `JELYHACK` / `JELYH2` as the dedicated referent-anchor lane, and `SURCAM*` as structurally distinct callback/attachment holders.
- That ranking is anchored by the current owner-loader evidence as well as the descriptor grammar: `000d:44df -> 000d:4c99 -> 000d:7000` supplies the slot-backed source, and raw seg070 windows `0009:67b6` / `0009:6916` now show the embedded helper walking object `+0x10/+0x18` tables, formatting per-entry paths, and open/read/close-loading files before the `0x0d`-stride owner records are materialized.
- The next focused pass tightened the `_BOOT` lane too. `boot_family_compare.tsv` now shows that all five `_BOOT` event cores (`AND_BOOT`, `BRO_BOOT`, `COR_BOOT`, `VAR_BOOT`, `REE_BOOT`) share the same header skeleton and the same compact field shape (`referent,event,counter,item`). The meaningful differences are payload size and local neighborhood, not descriptor schema.
- The new `boot_frontier_graph.md` makes the best early `_BOOT` frontier explicit: `AND_BOOT` and `BRO_BOOT` sit in one compact referent-heavy neighborhood (`OFFWORK`, `GUARD`, `GDOOR_N`, `GDOOR_E`, `BIGCAN`, `CRUMORPH`, `GUARDSQ`, `CARD_NS`, `CARD_EW`, `EWALLEW`/`EWALLNS`) and also point directly at each other as adjacent event-bearing siblings. So the present best reading is a reusable boot-event core template instantiated in several different local object islands, not a set of unrelated boot scripts.
- The environmental hazard lane is now similarly constrained. `environmental_family_compare.tsv` shows that `FLAMEBOX` and `STEAMBOX` are close structural siblings with the same active-event backbone (`referent,event,<hazard>,<hazard2>,direction,count`) and matching `24:0A02 / 24:FC02 / 24:FE02` object-link pattern, while `NOSTRIL` is a smaller fire-specific variant that keeps the active `event` plus dual fire references and count fields but drops the direction/newType side.
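
The split between active event carriers (`69:0A00 -> event`) and callback-only trigger holders (`24:0A02 -> eventTrigger`) drawn earlier in this hunk could be expressed as a simple classifier over a descriptor's tag/field pairs. This is a hedged sketch: representing a descriptor as `(tag, field_name)` tuples, the function name `classify_descriptor`, the `active-event` and `unclassified` labels, and the rule that an active `event` link takes priority are all assumptions made for illustration, not the extractor's actual logic.

```python
from typing import Iterable, Tuple

Link = Tuple[str, str]  # (tag, field name), e.g. ("69:0A00", "event")

# The two link patterns named in the notes.
ACTIVE_EVENT_LINK: Link = ("69:0A00", "event")
CALLBACK_LINK: Link = ("24:0A02", "eventTrigger")


def classify_descriptor(links: Iterable[Link]) -> str:
    """Return a coarse lane label for one descriptor.

    ASSUMPTION: an active `event` link outranks a callback link when a
    descriptor carries both.
    """
    link_set = set(links)
    if ACTIVE_EVENT_LINK in link_set:
        return "active-event"
    if CALLBACK_LINK in link_set:
        return "callback-eventtrigger"
    return "unclassified"


if __name__ == "__main__":
    # Hypothetical link lists, not copied from the extracted reports.
    print(classify_descriptor([("69:0A00", "event")]))
    print(classify_descriptor([("24:0A02", "eventTrigger")]))
```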
@@ -188,12 +191,18 @@ All three constructor variants (`000e:2777`, `000e:2860`, `000e:2969`) follow th
1. Call `FUN_000e_e935` (allocator — produces garbled 11KB decompile, not renamed)
2. Set fields `+0xb4` through `+0xc2` on the result
-3. Call `000d:ebe3` (multi-step chain initializer: calls `177c`, `1acb`, `0988`, `22bc`, `1d4a`, `2104` in sequence)
+3. Call `000d:ebe3` directly (confirmed CALL sites at `000e:283e`, `000e:2931`, `000e:29e4`; multi-step chain initializer: calls `177c`, `1acb`, `0988`, `22bc`, `1d4a`, `2104` in sequence)
4. Call `assert_alive_sentinel` (assertion: checks `+0xd4 != -1`)
5. Call `func_0x000eec83`

The chain at `000d:ebe3` steps through VM opcode handlers (`000d:177c`, `000d:1acb`, `000d:0988`) that operate on a bytecode VM object with stack pointer at `+0xcc` (decremented by 2 per push) and segment base at `+0xce`.
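
The push behaviour described for that VM object (stack pointer at `+0xcc`, decremented by 2 per push) can be modelled in a few lines. This is a sketch under stated assumptions, not a transcription of the handlers: the 16-bit little-endian width, the decrement-then-store ordering, and modelling stack memory as a separate `bytearray` (instead of resolving it through the segment base at `+0xce`) are all assumptions, and `push_word` is a hypothetical name.

```python
import struct

STACK_POINTER_OFFSET = 0xCC  # from the notes: stack pointer field
SEGMENT_BASE_OFFSET = 0xCE   # from the notes: segment base field (unused here)


def push_word(vm_object: bytearray, stack_memory: bytearray, value: int) -> int:
    """Push one 16-bit value and return the new stack pointer.

    ASSUMPTION: the pointer is decremented first and the word is then
    stored at the new pointer, little-endian.
    """
    (sp,) = struct.unpack_from("<H", vm_object, STACK_POINTER_OFFSET)
    sp = (sp - 2) & 0xFFFF  # downward-growing stack, 2 bytes per push
    struct.pack_into("<H", stack_memory, sp, value & 0xFFFF)
    struct.pack_into("<H", vm_object, STACK_POINTER_OFFSET, sp)
    return sp


if __name__ == "__main__":
    vm = bytearray(0x100)        # synthetic VM object, large enough for +0xce
    stack = bytearray(0x20)      # synthetic stack segment
    struct.pack_into("<H", vm, STACK_POINTER_OFFSET, 0x10)
    new_sp = push_word(vm, stack, 0x1234)
    print(hex(new_sp), stack[new_sp:new_sp + 2].hex())
```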

The constructor-side field setup before that sequencer is now slightly tighter too:

- variants A and B both set `+0xc0 = 1` before the direct `000d:ebe3` call and derive `+0xc2` from `DS:0x604e`
- variant C instead sets `+0xc0 = 0`, `+0xc2 = 1`, and `+0x4c = 0x000d` before the same sequencer call
- these direct xrefs make `000d:ebe3` a constructor-side animation sequencer rather than a globally xref-dark dispatcher, but they still do not expose any new wrapper-level opcode number beyond the internal `0x19/0x1a/0x1b` family already proven inside `000d:0988`

### Constructor variant renames

| Address | Name |