Decompilation unk files generation

MaddoScientisto 2026-04-10 00:45:41 +02:00
commit 746709f40c
503 changed files with 45757 additions and 31 deletions


@@ -0,0 +1,189 @@
# Roadmap To Full Function Knowledge
## Purpose
This note turns the current `CRUSADER.EXE` decompilation state into a concrete path toward broad, evidence-backed function coverage.
`Full function knowledge` here does not mean every body must already have a perfect gameplay-facing name. It means every reachable function should end up in one of three states:
- stable behavioral name with caller/context evidence,
- conservative structural name with a documented local contract,
- or an explicit blocker entry that explains why boundary repair, caller recovery, or data typing is still required.
## Current Baseline
- The live `CRUSADER.EXE` function table currently sits at `3032` total functions, with `1795` non-anonymous names and `1237` remaining `FUN_/nullfn_` placeholders. That puts the current raw naming floor at about `59.2%` known versus `40.8%` still anonymous.
- `plan-mid.md` currently pegs overall useful decompilation progress at about `59%`, with a reasonable band of `56%` to `64%`.
- The strongest live lanes are still gameplay/input/projectiles, VM/runtime structure, startup/display, cheat/debug controls, and several class-lift pilot families.
- The main remaining coverage gaps are no longer random single functions. They cluster around a few recurring problems: upstream caller policy, overlapped function boundaries, weak segment-ledger promotion, and structurally named families that still lack their final behavioral label.
- The hottest anonymous segment concentrations are now easier to state concretely too: `1000` still holds `166` unknowns, `10e8` holds `62`, `1190` holds `35`, `13e8` holds `23`, and `13c8` holds `22`. That means broad sweeps are still best spent in a mixed strategy: keep harvesting low-risk UI/process helpers in the partially named menu/gump lanes while using separate deeper passes for the denser engine/runtime segments.
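
The baseline percentages can be rechecked with a few lines of arithmetic. This is a minimal sketch that only uses the counts quoted above; the helper names `percent` and `hottest_segments` are illustrative and not part of any project tooling.

```python
from __future__ import annotations

# Counts quoted in the baseline above.
TOTAL_FUNCTIONS = 3032
NAMED_FUNCTIONS = 1795
ANONYMOUS_FUNCTIONS = 1237  # FUN_/nullfn_ placeholders

# Anonymous function counts per segment, as listed above.
ANONYMOUS_BY_SEGMENT = {"1000": 166, "10e8": 62, "1190": 35, "13e8": 23, "13c8": 22}


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


def hottest_segments(counts: dict[str, int], limit: int = 3) -> list[str]:
    """Return segment names ordered by anonymous count, densest first."""
    return sorted(counts, key=counts.get, reverse=True)[:limit]


named_pct = percent(NAMED_FUNCTIONS, TOTAL_FUNCTIONS)
anonymous_pct = percent(ANONYMOUS_FUNCTIONS, TOTAL_FUNCTIONS)

# Share of all anonymous functions that sit in the five listed segments.
listed_share_pct = percent(sum(ANONYMOUS_BY_SEGMENT.values()), ANONYMOUS_FUNCTIONS)
```

Rerunning the same arithmetic after each rename batch keeps the quoted floor honest without touching the live project.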
## Definition Of Done
Treat the roadmap as materially complete when all of the following are true:
1. `crusader_segment_coverage_ledger.csv` has no remaining `None` rows for code segments that have active live function objects.
2. Every hot caller-dense range has either named functions or a documented boundary-repair blocker.
3. Top far-call targets are at least `95%` behaviorally classified, not just structurally named.
4. The VM/USECODE selector path is caller-backed far enough upstream that the remaining masked-create helpers are no longer anonymous policy wrappers.
5. The `0x4588` callback family has a defensible subsystem label or a tightly bounded residual uncertainty note.
6. Parser/animation/video helper ranges no longer sit in the ledger as broad unmapped holes.
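
Criteria 1 and 3 are mechanical enough to check with a script. The sketch below assumes a ledger layout with `segment`, `status`, and `live_functions` columns; those column names and the sample rows are invented for illustration, and the real `crusader_segment_coverage_ledger.csv` may be shaped differently.

```python
from __future__ import annotations

import csv
import io

# Invented sample rows; the real ledger columns and values may differ.
SAMPLE_LEDGER = """segment,status,live_functions
1100,Partial,42
13d0,Deep,15
000e,None,7
0007,None,0
"""


def unpromoted_segments(ledger_text: str) -> list[str]:
    """Criterion 1: segments still at `None` that have live function objects."""
    rows = csv.DictReader(io.StringIO(ledger_text))
    return [
        row["segment"]
        for row in rows
        if row["status"] == "None" and int(row["live_functions"]) > 0
    ]


def far_call_targets_done(classified: int, total: int, threshold_pct: int = 95) -> bool:
    """Criterion 3: at least 95% of top far-call targets behaviorally classified."""
    # Integer comparison avoids floating-point edge cases at exactly 95%.
    return total > 0 and classified * 100 >= total * threshold_pct
```

For example, `unpromoted_segments(SAMPLE_LEDGER)` flags only the sample `000e` row, because the sample `0007` row has no live function objects.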
## Workstreams
### 1. Keep The Write Path Healthy
The current batch confirmed another practical workflow rule: analysis can continue while the GUI owns the live project, but explicit write-back through the external MCP edit path fails when the project lock is held by the GUI session.
Use this order:
1. Do read-only decompile/xref analysis against the live open session.
2. Stage small rename/comment batches in notes.
3. Apply them only when one of these is true:
- the active MCP session has a real writable context, or
- the GUI is closed and the local PyGhidra write fallback can open the project.
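
The ordering above reduces to a small decision function. This is a sketch of the rule as written, not project tooling; the `WritePath` enum and the two boolean inputs are hypothetical names.

```python
from enum import Enum


class WritePath(Enum):
    MCP_SESSION = "apply through the writable MCP session"
    PYGHIDRA_FALLBACK = "apply through the local PyGhidra write fallback"
    STAGE_ONLY = "keep the batch staged in notes"


def choose_write_path(mcp_writable: bool, gui_open: bool) -> WritePath:
    """Pick where a staged rename/comment batch may be applied."""
    if mcp_writable:
        # The active MCP session has a real writable context.
        return WritePath.MCP_SESSION
    if not gui_open:
        # No GUI session holds the project lock, so the fallback can open it.
        return WritePath.PYGHIDRA_FALLBACK
    # The GUI holds the lock and MCP is read-only: analysis only.
    return WritePath.STAGE_ONLY
```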
### 2. Finish Low-Risk Live Naming Families First
The next fastest gains are in live clusters where neighboring functions are already named and only a few destructor/helper slots remain anonymous.
Current best examples:
- seg033 NPC action / surrender / guard / loiter process lane
- healer / battery-charger utility process cleanup lane
- compact helper functions whose callers are already named and whose local contract is obvious from one decompile
These batches should stay small: `1` to `5` renames plus provenance comments.
### 3. Push One Layer Upstream From Structural Names
A large fraction of the remaining unknowns are no longer local-contract mysteries. They are caller-policy mysteries.
Highest-payoff upstream fronts remain:
1. VM / USECODE selector and masked-create caller policy
2. collision-storage queue producer policy above `AreaSearch_CollideMove`
3. startup/display residual presentation handoff selectors
4. watch/controller ownership around the seg049 lane
The rule for these passes is simple: prefer caller recovery over renaming isolated helpers in place.
### 4. Convert Segment Holes Into Tracked Families
The ledger is still lagging behind the actual live state in a few places. Segment rows should not stay at `None` once a real family footprint is known.
Promotion rule:
- `Foothold`: one real subsystem lane plus several stable names
- `Partial`: multiple connected helpers and at least one caller-backed behavioral claim
- `Deep`: enough coverage that new work mostly expands or refines the lane instead of discovering it
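
One way to keep promotions consistent is to encode the rule as a function. The sketch below is an approximation: the rule is qualitative, so the numeric cut-off for "several" and the `new_work_mostly_refines` judgment flag are assumptions, and `SegmentEvidence` is a hypothetical record type.

```python
from dataclasses import dataclass

SEVERAL = 3  # assumed reading of "several stable names"


@dataclass
class SegmentEvidence:
    subsystem_lanes: int           # real subsystem lanes identified in the segment
    stable_names: int              # names considered stable
    connected_helpers: int         # helpers linked to each other by calls or tables
    caller_backed_claims: int      # behavioral claims backed by a caller
    new_work_mostly_refines: bool  # reviewer judgment, not a measured value


def ledger_tier(evidence: SegmentEvidence) -> str:
    """Return the highest ledger tier the evidence supports."""
    if evidence.new_work_mostly_refines and evidence.caller_backed_claims >= 1:
        return "Deep"
    if evidence.connected_helpers >= 2 and evidence.caller_backed_claims >= 1:
        return "Partial"
    if evidence.subsystem_lanes >= 1 and evidence.stable_names >= SEVERAL:
        return "Foothold"
    return "None"
```

The tiers are tested from strongest to weakest so that a segment always lands in the highest tier its evidence supports.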
### 5. Repair Boundaries Only Where They Unlock Coverage
Large overlap-heavy ranges still matter, but they should be attacked in payoff order:
1. caller-dense windows that block multiple names
2. parser/animation/video windows with direct downstream value
3. residual startup/display overlap only when it blocks a live rename or caller chain
Current standing examples include `000c:db68`, `000e:ffb0`, and several sparse gaps in `0007` and `000b`.
### 6. Turn Structural Object Families Into Behavioral Ones
Two families still deserve focused classification work because they block naming quality across many segments:
1. the VM selector / owner-resource family
2. the `0x4588` callback / allocator / presentation family
## Batch Discipline
For each rename batch:
1. Verify the body locally by decompile or disassembly.
2. Verify at least one caller or neighboring family anchor when the name is behavioral rather than purely structural.
3. Add a short Ghidra comment preserving why the name is safe.
4. Update the ledger if a segment can be promoted.
5. Update `plan-mid.md` only when the real next resume point changed.
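
The first three checks can be enforced on a staged batch before anything is written. This is a minimal sketch: `StagedRename` and its fields are hypothetical, and the size bound reuses the `1` to `5` rename limit from workstream 2.

```python
from __future__ import annotations

from dataclasses import dataclass

MAX_BATCH_SIZE = 5  # upper bound from the "1 to 5 renames" rule in workstream 2


@dataclass
class StagedRename:
    address: str          # e.g. "1100:0913"
    new_name: str
    behavioral: bool      # True when the name claims behavior, not just structure
    body_verified: bool   # body checked locally by decompile or disassembly
    anchors: tuple = ()   # caller or neighboring family names backing the claim
    comment: str = ""     # provenance comment destined for Ghidra


def rename_problems(rename: StagedRename) -> list[str]:
    """Return the discipline checks a single staged rename fails."""
    problems = []
    if not rename.body_verified:
        problems.append("body not verified")
    if rename.behavioral and not rename.anchors:
        problems.append("behavioral name lacks a caller or family anchor")
    if not rename.comment.strip():
        problems.append("missing provenance comment")
    return problems


def batch_problems(batch: list[StagedRename]) -> list[str]:
    """Return every problem in the batch, prefixed with the affected address."""
    problems = []
    if not 1 <= len(batch) <= MAX_BATCH_SIZE:
        problems.append(f"batch size {len(batch)} outside 1..{MAX_BATCH_SIZE}")
    for rename in batch:
        problems.extend(f"{rename.address}: {p}" for p in rename_problems(rename))
    return problems


example = StagedRename(
    address="1100:0913",
    new_name="NPC_DoRandomIdleAnimTwiceIfNotBusy",
    behavioral=True,
    body_verified=True,
    anchors=("GuardProcess_Run", "LoiterProcess_Run"),
    comment="Only called from GuardProcess_Run and LoiterProcess_Run; gates on NPC_IsBusy.",
)
```

An empty result from `batch_problems([example])` means the batch is ready for the write path; the ledger and `plan-mid.md` updates stay manual.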
## Latest Applied Rename Batch
This batch is now landed in `CRUSADER.EXE`. The practical workflow lesson still stands: read-only MCP analysis can run against the GUI-owned session, but the external write path had to fall back to local PyGhidra after the GUI-held project lock blocked explicit write access.
- `1100:0437` -> `SurrenderProcess_Destroy`
- `1100:0913` -> `NPC_DoRandomIdleAnimTwiceIfNotBusy`
- `1128:1e14` -> `CruHealer_Destroy`
- `1128:1fbe` -> `BatteryChargerProcess_Destroy`
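
Because rename lists in these notes follow one fixed shape, they can be parsed back into a machine-readable plan. The sketch below assumes exactly the `segment:offset` -> name bullet format used above; `parse_renames` is an illustrative helper, not an existing script.

```python
from __future__ import annotations

import re

# Matches bullets such as: - `1100:0437` -> `SurrenderProcess_Destroy`
RENAME_LINE = re.compile(r"^- `([0-9a-fA-F]{4}):([0-9a-fA-F]{4})` -> `(\w+)`$")

NOTES = """\
- `1100:0437` -> `SurrenderProcess_Destroy`
- `1100:0913` -> `NPC_DoRandomIdleAnimTwiceIfNotBusy`
- `1128:1e14` -> `CruHealer_Destroy`
- `1128:1fbe` -> `BatteryChargerProcess_Destroy`
"""


def parse_renames(text: str) -> list[tuple[int, int, str]]:
    """Return (segment, offset, new_name) for every rename bullet in the text."""
    renames = []
    for line in text.splitlines():
        match = RENAME_LINE.match(line.strip())
        if match:
            segment, offset, name = match.groups()
            renames.append((int(segment, 16), int(offset, 16), name))
    return renames


plan = parse_renames(NOTES)
```

Lines that do not match the bullet shape are skipped, so prose between lists does not need to be stripped first.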
Evidence summary:
- `1100:0437` resets the surrender vtable root, clears the local NPC flag bit, and destroys the two embedded dispatch-entry children.
- `1100:0913` is only called from `GuardProcess_Run` and `LoiterProcess_Run`; it gates on `NPC_IsBusy`, seeds a two-way random choice, and then calls `NPC_DoAnim` twice.
- `1128:1e14` and `1128:1fbe` are clear destructor slots for the healer and battery-charger process families because they restore the process vtable root, stop the family-specific sound, clear avatar stasis, and destroy the embedded child entries.
- The next seg033 pass also closed four more slot-1 destructor entries directly from the live process-function tables: `PaceProcess_Destroy`, `GuardProcess_Destroy`, `LoiterProcess_Destroy`, and `StandProcess_Destroy` are now confirmed from `g_*ProcessFnPtr` table ownership rather than from weaker local shape alone.
- The same vtable-driven cleanup method has now also widened beyond seg033: `NPCActionProcess_Destroy`, `DeathSilenceProcess_Destroy`, `PathfinderProcess_Destroy`, `TeleporterProcess_Destroy`, and `EggHatcherProcess_Destroy` are all now grounded directly in their live `g_*ProcessFnPtr` slot-1 ownership rather than left as anonymous generic destructors.
- The current AI-lane residue is narrower again: the base `NPCActionProcess` no-op run body, the shared slot-10 no-op body, the loiter-only slot-10 override, and the common process-vtable slot stubs are now all named structurally. The remaining uncertainty in this immediate window is therefore mostly semantic rather than object-identity: exactly what slot-10/slot-11 mean at the behavior-policy level, not which functions own those slots.
- The same slot-1 cleanup method now generalizes beyond the AI families too: `SnapProcess_Destroy`, `VideoPlayer_Destroy`, `GameTimeProcess_Destroy`, `WaitProcess_Destroy`, and `SpriteProcess_Destroy` are now live as verified vtable-owned process destructors, which makes the broader process-family cleanup lane more systematic instead of one-off.
- The parser/animation-adjacent media lane now has its first comparable foothold too. `FlicPlayProcess_Destroy`, `FlicWaitProcess_Destroy`, `MusicPlayerProcess_RunNoop`, `MusicPlayerProcess_Destroy`, `AssProcess_Destroy`, `FlicWaitProcess_VtableSlot10TickAndMaybeAdvance`, `MusicPlayerProcess_VtableSlot10Noop`, `AssProcess_VtableSlot5ClearCreatedFlag`, and `AssProcess_VtableSlot6SetCreatedFlag` are now live from direct `g_*ProcessFnPtr` slot ownership and local body evidence. The remaining gap in this lane is no longer basic object identity; it is the deeper connection between these process families and the older raw `000e:` parser/video helper anchors.
- That same lane now extends one level deeper into the video helper stack: `VideoPlayer_InitializePlayback`, `VideoPlayer_OpenMediaFiles`, `VideoPlayer_AllocPlaybackBuffers`, `VideoPlayer_OpenMoviListAndPrimeStreams`, `VideoPlayer_StopAndDestroyWrapper`, and `VideoPlayerProcess_VtableSlot11Noop` are now live from direct caller relationships (`PlayFlicProbably_1468_3f77`, `FlicPlay_1468_4169`, `VideoPlayer_Run`), MOVI-string evidence, and the `g_videoPlayerProcessFnPtr` slot map. The next high-value unknowns in this lane are now the remaining unnamed video helper bodies around palette/error handling and the still-unmapped raw `000e:` parser/video equivalents.
- The chunk-processing layer is now materially clearer too: `File_Exists`, `VideoPlayer_FormatErrorMessage`, `VideoPlayer_AdvanceChunkCursor`, `VideoPlayer_AdvanceChunkCursorWrapper`, `VideoPlayer_LoadAudioChunk`, `VideoPlayer_LoadVideoChunk`, and `VideoPlayer_BlitDecodedFrame` are live from direct chunk-tag evidence (`01wb`, `00db`, `00dc`), caller placement inside the playback loop, and the now-named MOVI setup path. That leaves fewer anonymous helpers in the video lane; the remaining gaps are concentrated in subtitle/palette helpers and the four concrete blitters that `VideoPlayer_BlitDecodedFrame` dispatches to.
- A separate music/save-state mini-cluster also closed cleanly after switching lanes: `Music_RestorePreviousTrackFromStack`, `Music_LoadStateAndReplayCurrentTrack`, and `Music_SaveState` are now live from direct `Savegame_LoadProbably` / `Savegame_QuickSave` caller evidence and the explicit music-track stack behavior around `g_musicTrack` and `DAT_1478_3b75`. That gives the audio side a foothold outside the video/process lane and reduces ambiguity around transient-screen music restoration.
- The savegame UI cluster is now fully closed at the `13d0:` helper level: `SavegameSlot_GetLabelPtr`, `SavegameSlot_SetLabel`, `File_CloseAndMaybeFree`, `SavegameNameField_MapInputChar`, `SavegameNameField_HandleKey`, `SavegameNameField_Draw`, `SavegameMenu_Destroy`, `SavegameMenu_HandleKey`, `SavegameMenu_HandleSlotAction`, `SavegameSlot_DrawCornerDecorations`, `SavegameSlotGump_Create`, `SavegameSlotGump_Destroy`, `SavegameSlot_HandleClick`, `SavegameSlot_BeginEditOrActivate`, and `SavegameSlot_Select` are now live from direct save/load-gump caller evidence. That removes the remaining anonymous helpers from the savegame menu lane instead of leaving a partly named cluster behind.
- A fresh `13c8:` top-level menu cluster now has its shell named too: `MainMenu_Destroy`, `MainMenu_DrawCornerDecorations`, `MainMenu_HandleButtonClick`, `MainMenu_HandleKey`, and `MainMenu_ActivateSelection` are live from direct caller behavior and the selection-dispatch cases that open difficulty, save/load, credits, and related modal flows. The remaining work in that lane is now below the shell level, inside the subordinate modal/menu bodies rather than the main dispatcher itself.
- The tiny remaining ASS cluster also closed one concrete gap: `ASS_StoreInitCallbackState` is now named from the direct `Init_ASS` / `Uninit_ASS` lifecycle, where it is passed as the callback address in the ASS init data and later unwound through the stored global state. That leaves the ASS lane with no anonymous helpers in its small top-level init/process callback path.
- The small residual `10f8:` item-type lane is now also cleaner: `ItemScript_AppendBytes` and `ItemTypeflagRecord_ResetDefaults` are live as structural helpers beside the already named typename-record functions. These are not deep semantic wins, but they remove the remaining obvious placeholders from that compact item-type helper cluster.
- A larger ownership-backed process batch is now live too: `MapJumpProcess_Destroy`, `FadeProcess1_Destroy`, `AnimProcess_Destroy`, `ItemProcess_Destroy`, `SuperSpriteProcess_Destroy`, `OneFrameDelayProc_Destroy`, `CameraProcess_Destroy`, `KeyDaemonProcess_Destroy`, `KeyboardProcess_Destroy`, `AccWaitProcess_Destroy`, `SystemTimerProcess_Destroy`, `BiosProcess_Destroy`, `CustomWaitProcess_Destroy`, `DumbTimerProcess_Destroy`, `CycleProcess_Destroy`, `FadeProcAlt_Destroy`, and `MyTimerProcess_Destroy` were all closed from direct `g_*FnPtr` slot-1 ownership. This is a high-confidence unknown-count reduction batch rather than a semantic deep dive, but it materially shrinks the remaining anonymous process families.
- A companion slot-method batch is now live as well: `MapJumpProcess_VtableSlot10AdvanceItemFind`, `AnimProcess_VtableSlot10DispatchByPort`, `FadeProcess2_VtableSlot10BlendTowardTargetPalette`, `AttackProcess_VtableSlot10DispatchByClip`, `WaitProcessFamily_VtableSlot10DispatchByPair`, `AccWaitProcess_VtableSlot10DispatchByAnimation`, `BiosProcess_VtableSlot10DosRealFarCall`, `CustomWaitProcess_VtableSlot11ArmAndRun`, `MyTimerProcess_VtableSlot10IncrementCounterOnTick`, `BaseCameraProcess_VtableSlot10SetViewportRect`, and `BaseCameraProcess_VtableSlot11FreeBuffer` are now named from direct `g_*FnPtr` ownership plus local body contracts. That broad-pass work keeps reducing placeholder density without overclaiming semantics that still depend on table decoding.
- A further broad UI/gump pass is also live: `StdIntHandlerProcess_Destroy`, `GumpShared_DestroyNoop`, `GumpShared_VtableSlot3Noop`, `GumpShared_VtableSlot7Noop`, `GumpShared_VtableSlot8Noop`, `GumpShared_VtableSlot9Noop`, `GumpShared_VtableSlot10Noop`, `GumpShared_VtableSlot16Noop`, `GumpShared_VtableSlot17Noop`, `ButtonGump_Destroy`, `KeypadGump_Destroy`, `KeypadButtonGump_Destroy`, `HelpGump_Destroy`, `HelpGump_RunAmbientSfxTick`, `RunCreditsProcess_Destroy`, `QuickSaveLoadExitGump_Destroy`, and `Gump13f80383_Destroy` are now named from direct table ownership and trivial body evidence. This removes a large amount of UI placeholder noise before any deeper dialog-by-dialog semantics are needed, and it also corrects the earlier too-narrow keyboard-only labels on gump slots that are actually reused across multiple families.
- Another small structural process-family batch is now live too: `AnimProcess_RunNoop`, `Process1048_0000_RunNoop`, `Process1048_0000_Destroy`, `AnimPrimitiveProcessSomethingElse_Destroy`, `AnimPrimitiveProcessFamily_VtableSlot11CallSlot3`, `Process1188_0000_RunOnTimerDelta`, and `Process1188_0000_Destroy` were closed from direct table ownership and trivial local contracts. They are still generic where the owner families remain unnamed, but they further reduce the anonymous surface for later semantic passes.
- A further tiny broad-sweep batch is now live from adjacency-backed structural evidence: `SystemTimerProcess_RunNoop`, `Gump13f80383_VtableSlot10Noop`, and `Gump13f80383_VtableSlot11Noop`. These were deliberately limited to obvious no-op slots sitting directly inside already-partially-named families, keeping the sweep conservative while still trimming another few placeholders.
- Another small broad-sweep cluster is now live in the main-menu neighborhood: `MainMenuOptionsPanel_Create`, `MainMenuOptionButtonGump_Create`, `MainMenuOptionButtonGump_HandlePointerEvent`, `MainMenuOptionButtonGump_Draw`, and `Gump13f80383_Draw`. This batch came from direct local-family layout evidence rather than deep semantic tracing: one parent options-panel constructor, one repeatedly-instantiated button-gump constructor, its pointer-event and draw methods, and the obvious draw method in the already-named `13f8:` gump family.
- That same small main-menu cluster tightened one step further immediately after: `MainMenuOptionButtonGump_SelectPeer` is now live from the direct paired-button state flip and peer-search body that `MainMenuOptionButtonGump_HandlePointerEvent` calls on hit. This keeps the batch conservative while closing the one remaining obvious helper in that local family.
- The broad sweep then picked up another small owner-safe UI batch in the help lane too: `HelpGump_RefreshPage`, `HelpGump_HandleAdvanceAction`, and `HelpGump_HandleNavigationKey` are now live from direct caller links inside `HelpGump_Create` plus the shared page-state field at `+0x49`. That leaves the remaining `13e8:` help-family residue narrower and more obviously local.
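
The slot-1 ownership method used throughout this batch can be sketched as a table walk. The sample tables, their slot-0 names, and the placeholder addresses below are invented for illustration; only the idea that slot 1 of a `g_*ProcessFnPtr` table owns the destructor comes from the notes above. Bodies shared by more than one table are reported separately instead of receiving a family-specific name, which mirrors the earlier correction of too-narrow labels on shared gump slots.

```python
from __future__ import annotations

DESTROY_SLOT = 1  # slot the notes above treat as the destructor entry
PLACEHOLDER_PREFIXES = ("FUN_", "nullfn_")

# Invented sample data: family name -> current function names by table slot.
SAMPLE_TABLES = {
    "PaceProcess": ["PaceProcess_Slot0", "FUN_1100_0aaa"],
    "GuardProcess": ["GuardProcess_Slot0", "FUN_1100_0bbb"],
    "HelpGump": ["HelpGump_Slot0", "FUN_12f8_0ccc"],
    "ButtonGump": ["ButtonGump_Slot0", "FUN_12f8_0ccc"],  # same body as HelpGump
    "WaitProcess": ["WaitProcess_Slot0", "WaitProcess_Destroy"],  # already named
}


def propose_destructor_names(
    tables: dict[str, list[str]],
) -> tuple[dict[str, str], list[str]]:
    """Return (placeholder -> proposed name, placeholders shared across families)."""
    owners: dict[str, list[str]] = {}
    for family, slots in tables.items():
        if len(slots) <= DESTROY_SLOT:
            continue  # table too short to have a destructor slot
        current = slots[DESTROY_SLOT]
        if current.startswith(PLACEHOLDER_PREFIXES):
            owners.setdefault(current, []).append(family)

    proposals = {fn: f"{fams[0]}_Destroy" for fn, fams in owners.items() if len(fams) == 1}
    shared = sorted(fn for fn, fams in owners.items() if len(fams) > 1)
    return proposals, shared


proposals, shared = propose_destructor_names(SAMPLE_TABLES)
```

Each proposal still needs the local body check from the batch discipline before it is applied.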
## Latest Applied Helper Batch
This next pair is now landed in the live session through the write-capable MCP script path:
- `10f8:0437` -> `ItemType_GetTypenameRecordPtrAtIndex`
- `10f8:045b` -> `ItemType_FindTypenameRecordIndex`
Current best read:
- `10f8:0437` is the tiny `0x20`-stride helper behind the `typename.dat` table: it returns the base pointer of the requested `1`-based record and returns `0` for index `0`.
- `10f8:045b` scans the same `typename.dat` record table from a requested starting index, optionally uppercases both the query and the candidate record name, and returns the matching record index or `-1`.
That closes the local seg032 helper batch and confirms a more useful write rule for future work: when the edit-plan endpoint refuses to commit in-session, the live write-capable script path can still land small targeted renames and provenance comments without leaving the open GUI workflow.
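
The two helper contracts described in the current best read can be modeled in a few lines. This is a behavioral sketch, not a transcription of the decompiled bodies: it assumes record `1` starts at offset `0` of the table and that each `0x20`-byte record begins with a NUL-terminated ASCII name, neither of which is confirmed above. The sample names are invented.

```python
from __future__ import annotations

RECORD_STRIDE = 0x20  # record size stated for the typename.dat table


def make_table(names: list[str]) -> bytes:
    """Build an illustrative table: one NUL-padded 0x20-byte record per name."""
    return b"".join(n.encode("ascii").ljust(RECORD_STRIDE, b"\0") for n in names)


def get_record(table: bytes, index: int) -> bytes | None:
    """Model of the 1-based record lookup; index 0 yields the null result."""
    if index <= 0:
        return None
    start = (index - 1) * RECORD_STRIDE  # assumption: record 1 sits at offset 0
    record = table[start:start + RECORD_STRIDE]
    return record if len(record) == RECORD_STRIDE else None


def record_name(record: bytes) -> str:
    """Assumed layout: the name is the NUL-terminated prefix of the record."""
    return record.split(b"\0", 1)[0].decode("ascii", errors="replace")


def find_record_index(
    table: bytes, query: str, start_index: int = 1, ignore_case: bool = False
) -> int:
    """Scan from start_index; return the matching 1-based index or -1."""
    if ignore_case:
        query = query.upper()
    count = len(table) // RECORD_STRIDE
    for index in range(max(start_index, 1), count + 1):
        name = record_name(table[(index - 1) * RECORD_STRIDE:index * RECORD_STRIDE])
        if ignore_case:
            name = name.upper()
        if name == query:
            return index
    return -1


sample = make_table(["Crate", "Door", "Medkit"])  # invented names
```

With the sample table, a case-insensitive search for `door` matches record `2`, while the same search without uppercasing returns `-1`.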
## Latest Applied Broad-Sweep UI Batch
This next conservative broad-sweep batch is now landed in the live session through the MCP-backed write path:
- `12f8:02e4` -> `GumpShared_DestroyCommon`
- `13f8:0237` -> `QuickSaveLoadExitGump_HandleChildButtonEvent`
- `13f8:0299` -> `QuickSaveLoadExitGump_HandleKey`
- `13f8:0349` -> `QuickSaveLoadExitGump_DrawLabel`
- `13f8:0383` -> `QuickSaveLoadExitGump_Create`
- `13c8:2f37` -> `MainMenuOptionsPanelButtonGump_Create`
- `13c8:2fca` -> `MainMenuOptionsPanelButtonGump_DrawLabel`
- `13c8:3004` -> `MainMenuOptionsPanelButtonGump_Select`
- `13c8:3030` -> `MainMenuOptionsPanelButtonGump_Deselect`
Current best read:
- `12f8:02e4` is the shared gump base destroy path used by `HelpGump_Destroy`, `QuickSaveLoadExitGump_Destroy`, `Gump13f80383_Destroy`, and several sibling UI families: it releases the linked child at `+0x42/+0x44`, clears that link, and then runs the common unlink/free helper path.
- The `13f8:` mini-cluster is now a real quick-save/load/exit modal family instead of a partly named shell. `QuickSaveLoadExitGump_Create` is called directly from `World_HandleKeyboardInput`, `QuickSaveLoadExitGump_DrawLabel` consumes the local label fields at `+0x47/+0x49/+0x4b`, and `QuickSaveLoadExitGump_HandleKey` plus `QuickSaveLoadExitGump_HandleChildButtonEvent` both funnel into the same local action-dispatch slot.
- The `13c8:2f37..3030` cluster is now also tighter than a generic unnamed button wrapper. `MainMenuOptionsPanel_Create` calls `MainMenuOptionsPanelButtonGump_Create` six times, the wrapper sits directly over the generic `1308:032b` button-gump constructor, and the adjacent `DrawLabel` / `Select` / `Deselect` bodies all operate on the same local label/selected-state fields.
- The same `13c8:` lane now also has a clearer main options-menu core. `MainMenuOptionsMenu_Destroy` saves options back to config before teardown, `MainMenuOptionsMenu_Create` builds the option-entry table and rectangle layout, `MainMenuOptionsMenu_GetOptionRect` owns the per-index placement math, and the adjacent `HandleChildButtonEvent` / `HandleKey` / `DrawTitle` / `MainMenuOptionsMenuButtonGump_DrawLabel` bodies close another chunk of placeholder-heavy UI surface without forcing final option-by-option semantics.
- This batch is deliberately broad rather than deep: it trims placeholder-heavy UI surface without pretending the remaining dialog/menu semantics are already solved.
## Recommended Next Steps
1. Keep sweeping the same `12f8` / `13c8` / `13f8` UI-gump neighborhood for more structurally obvious virtual slots and tiny wrappers before switching back to deeper caller-policy work.
2. Promote more ledger rows directly from already-verified notes so the `None` set shrinks before the next broad sweep.
3. Resume the VM selector lane from the earlier policy layer above the now-named movement/collision producer surface.
4. Revisit the parser/animation lane and map the live equivalents of the old raw `000e:` parser/video helper anchors, starting from the now-named FLIC/media process families and the newly named video helper stack (`VideoPlayer_InitializePlayback`, `VideoPlayer_OpenMediaFiles`, `VideoPlayer_AllocPlaybackBuffers`, `VideoPlayer_OpenMoviListAndPrimeStreams`, `VideoPlayer_LoadAudioChunk`, `VideoPlayer_LoadVideoChunk`).
5. Keep harvesting small caller-closed side clusters like the music/save-state helpers when a larger lane stalls; they are low-risk ways to keep the global unknown count moving down.
6. Savegame UI is now a completed side cluster at the helper/gump-handler level; future work there should only be broader semantic polish if needed, not placeholder cleanup.
7. The `13c8:` main-menu shell, together with the new options-panel and options-menu helper clusters, is now a good staging point for later work on subordinate menu/dialog bodies if a future pass wants another UI-heavy cluster.
8. Keep the next write batch small enough that every rename can carry a one-sentence provenance comment.
9. Continue treating explicit write-path health as part of the workflow, not as an afterthought, so future rename batches do not stall at commit time.