Add detailed class event processing and family comparison tools

- Enhance `extract_eusecode_flx.py` to derive class event rows with additional metadata including derived body windows and repeated template statuses.
- Introduce `usecode_family_compare.py` for comparing event families, analyzing commonalities in event bodies, and generating reports on identical groups and differences.
- Implement new data structures for managing class event rows and family artifact specifications.
- Update output formats to include derived body information and repeated family regression checks.
- Ensure robust validation of repeated family expectations against actual extracted data.
This commit is contained in:
MaddoScientisto 2026-03-22 23:24:46 +01:00
commit 4d3c8cd81b
23 changed files with 15033 additions and 14221 deletions

View file

@ -1,6 +1,6 @@
# EUSECODE.FLX First-Pass Extraction
Input: k:\ghidra\Crusader_Decomp\USECODE\EUSECODE.FLX
Input: USECODE\EUSECODE.FLX
File size: 0x87E45 (556613 bytes)
Candidate entries: 403
@ -423,7 +423,9 @@ ASCII: `........................................................................
- `.strings.txt` files are the main human-readable output for now; `.txt` files are emitted only for chunks that look text-like.
- `descriptor_index.tsv` summarizes guessed class labels, field names, and compact tag patterns for descriptor-like chunks.
- `class_layout_index.tsv` records the conservative owner-loaded class parsing state: object index, class id, class-name hint, raw bytes-8..11 field, derived code-base-minus-one, and event-count/table-end values when the local divisibility and bounds checks succeed.
- `class_event_index.tsv` expands parsed owner-loaded classes into raw 6-byte event rows with slot numbers, ScummVM event-name hints for `0x00..0x1f`, unresolved leading words, and raw code-offset dwords for round-trip tooling work.
- `class_event_index.tsv` now also emits derived body-window columns (`derived_body_start`, `derived_body_end`, `derived_body_length`) plus conservative `repeated_template_status` tags for verified repeated families.
- `boot_family_decompile.md` / `.tsv`, `callback_family_decompile.md` / `.tsv`, and `environmental_family_decompile.md` / `.tsv` now provide reversible per-class decompile artifacts for the `_BOOT`, `SURCAM*`, and environmental repeated-family lanes.
- `repeated_family_regressions.tsv` enforces the current repeated-family slot sets plus the verified raw-row and derived body-window fields for `JELYHACK/JELYH2`, `_BOOT`, `SURCAM*`, and `FLAMEBOX/NOSTRIL/STEAMBOX`.
- `descriptor_neighborhoods.tsv` captures local table neighborhoods around trigger/event-related classes such as `JELYHACK`, `NPCTRIG`, `CRUZTRIG`, `TRIGPAD`, and `SPECIAL`.
- `referent_anchor_event_graph.tsv` groups referent-bearing descriptors with nearby event-bearing neighbors so the attachment model can be inspected without ad hoc grepping.
- `jelyhack_island_graph.md` now uses a wider local window so the `JELYHACK` / `JELYH2` anchors can be inspected alongside the nearby event-bearing `REE_BOOT`, `SURCAMEW`, and `SFXTRIG` descriptors rather than stopping at the referent-only neighbors.