Roadmap To Full Function Knowledge

Purpose

This note turns the current CRUSADER.EXE decompilation state into a concrete path toward broad, evidence-backed function coverage.

Full function knowledge here does not mean every body must already have a perfect gameplay-facing name. It means every reachable function should end up in one of three states:

  • stable behavioral name with caller/context evidence,
  • conservative structural name with a documented local contract,
  • or an explicit blocker entry that explains why boundary repair, caller recovery, or data typing is still required.

Current Baseline

  • The live CRUSADER.EXE function table currently sits at 3032 total functions, with 1795 non-anonymous names and 1237 remaining FUN_/nullfn_ placeholders. That puts the current raw naming floor at about 59.2% known versus 40.8% still anonymous.
  • plan-mid.md currently pegs overall useful decompilation progress at about 59%, with a reasonable band of 56% to 64%.
  • The strongest live lanes are still gameplay/input/projectiles, VM/runtime structure, startup/display, cheat/debug controls, and several class-lift pilot families.
  • The main remaining coverage gaps are no longer random single functions. They cluster around a few recurring problems: upstream caller policy, overlapped function boundaries, weak segment-ledger promotion, and structurally named families that still lack their final behavioral label.
  • The hottest anonymous segment concentrations are now easier to state concretely too: 1000 still holds 166 unknowns, 10e8 holds 62, 1190 holds 35, 13e8 holds 23, and 13c8 holds 22. That means broad sweeps are still best spent in a mixed strategy: keep harvesting low-risk UI/process helpers in the partially named menu/gump lanes while using separate deeper passes for the denser engine/runtime segments.
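
The baseline arithmetic can be rechecked with a few lines of Python. This is a minimal sketch that uses only the counts quoted above; the function and constant names are illustrative and do not come from any project tooling.

```python
# Counts quoted in the baseline above.
TOTAL_FUNCTIONS = 3032
NAMED_FUNCTIONS = 1795
ANONYMOUS_FUNCTIONS = 1237

# Anonymous counts for the five hottest segments, as quoted above.
HOT_SEGMENTS = {"1000": 166, "10e8": 62, "1190": 35, "13e8": 23, "13c8": 22}


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


def coverage_summary() -> dict:
    """Summarize the naming floor and the share of unknowns in hot segments."""
    hot_total = sum(HOT_SEGMENTS.values())
    return {
        "named_pct": percent(NAMED_FUNCTIONS, TOTAL_FUNCTIONS),
        "anonymous_pct": percent(ANONYMOUS_FUNCTIONS, TOTAL_FUNCTIONS),
        "hot_segment_unknowns": hot_total,
        "hot_share_of_anonymous_pct": percent(hot_total, ANONYMOUS_FUNCTIONS),
    }


if __name__ == "__main__":
    for key, value in coverage_summary().items():
        print(f"{key}: {value}")
```

Taken together, the five listed segments account for 308 of the 1237 placeholders, which is why the mixed strategy still needs broad sweeps outside them.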

Definition Of Done

Treat the roadmap as materially complete when all of the following are true:

  1. crusader_segment_coverage_ledger.csv has no remaining None rows for code segments that have active live function objects.
  2. Every hot caller-dense range has either named functions or a documented boundary-repair blocker.
  3. Top far-call targets are at least 95% behaviorally classified, not just structurally named.
  4. The VM/USECODE selector path is caller-backed far enough upstream that the remaining masked-create helpers are no longer anonymous policy wrappers.
  5. The 0x4588 callback family has a defensible subsystem label or a tightly bounded residual uncertainty note.
  6. Parser/animation/video helper ranges no longer sit in the ledger as broad unmapped holes.
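
Criterion 1 is the easiest one to check mechanically. The sketch below assumes a hypothetical ledger layout with `segment`, `status`, and `live_functions` columns; the real column names in crusader_segment_coverage_ledger.csv may differ, and the sample rows are invented purely for illustration.

```python
import csv
import io

# Invented sample rows in an assumed column layout (not real ledger data).
SAMPLE_LEDGER = """segment,status,live_functions
1100,Partial,41
13d0,Deep,15
000e,None,7
0007,None,0
"""


def unpromoted_live_segments(ledger_text: str) -> list[str]:
    """Return segments still marked None that have live function objects."""
    reader = csv.DictReader(io.StringIO(ledger_text))
    return [
        row["segment"]
        for row in reader
        if row["status"] == "None" and int(row["live_functions"]) > 0
    ]


if __name__ == "__main__":
    # Criterion 1 holds only when this list is empty.
    print(unpromoted_live_segments(SAMPLE_LEDGER))
```

A row with no live function objects is skipped on purpose, since the criterion only covers segments that have them.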

Workstreams

1. Keep The Write Path Healthy

The current batch confirmed another practical workflow rule: analysis can continue while the GUI owns the live project, but explicit write-back through the external MCP edit path fails when the project lock is held by the GUI session.

Use this order:

  1. Do read-only decompile/xref analysis against the live open session.
  2. Stage small rename/comment batches in notes.
  3. Apply them only when one of these is true:
    • the active MCP session has a real writable context, or
    • the GUI is closed and the local PyGhidra write fallback can open the project.
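
The ordering above amounts to a small decision rule. The following sketch models it with a hypothetical `SessionState` record; none of these names correspond to a real MCP or PyGhidra API, they only encode the three conditions listed.

```python
from dataclasses import dataclass


@dataclass
class SessionState:
    """Hypothetical snapshot of the tooling state before a write attempt."""
    mcp_context_writable: bool  # the active MCP session has a real writable context
    gui_open: bool              # the GUI session currently holds the project lock
    pyghidra_can_open: bool     # the local PyGhidra fallback can open the project


def choose_write_path(state: SessionState) -> str:
    """Return 'mcp', 'pyghidra', or 'stage' (keep the batch in notes)."""
    if state.mcp_context_writable:
        return "mcp"
    if not state.gui_open and state.pyghidra_can_open:
        return "pyghidra"
    # Neither write condition holds: keep analysing read-only and stage renames.
    return "stage"


if __name__ == "__main__":
    locked_by_gui = SessionState(
        mcp_context_writable=False, gui_open=True, pyghidra_can_open=True
    )
    print(choose_write_path(locked_by_gui))
```

Making "stage" the default means a blocked write never costs analysis time; it only delays the commit.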

2. Finish Low-Risk Live Naming Families First

The next fastest gains are in live clusters where neighboring functions are already named and only a few destructor/helper slots remain anonymous.

Current best examples:

  • seg033 NPC action / surrender / guard / loiter process lane
  • healer / battery-charger utility process cleanup lane
  • compact helper functions whose callers are already named and whose local contract is obvious from one decompile

These batches should stay small: 1 to 5 renames plus provenance comments.

3. Push One Layer Upstream From Structural Names

A large fraction of the remaining unknowns are no longer local-contract mysteries. They are caller-policy mysteries.

Highest-payoff upstream fronts remain:

  1. VM / USECODE selector and masked-create caller policy
  2. collision-storage queue producer policy above AreaSearch_CollideMove
  3. startup/display residual presentation handoff selectors
  4. watch/controller ownership around the seg049 lane

The rule for these passes is simple: prefer caller recovery over renaming isolated helpers in place.

4. Convert Segment Holes Into Tracked Families

The ledger is still lagging behind the actual live state in a few places. Segment rows should not stay at None once a real family footprint is known.

Promotion rule:

  • Foothold: one real subsystem lane plus several stable names
  • Partial: multiple connected helpers and at least one caller-backed behavioral claim
  • Deep: enough coverage that new work mostly expands or refines the lane instead of discovering it
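
One way to keep promotions consistent is to encode the rule as a function. This is only a sketch: the numeric thresholds for "several" and "multiple" are assumptions, the tiers are assumed to be cumulative, and the "mostly refinement" judgment stays a manual flag.

```python
SEVERAL = 3   # assumed reading of "several stable names"
MULTIPLE = 2  # assumed reading of "multiple connected helpers"


def promotion_tier(
    lanes: int,
    stable_names: int,
    connected_helpers: int,
    caller_backed_claims: int,
    mostly_refinement: bool = False,
) -> str:
    """Return the ledger tier a segment row qualifies for."""
    foothold = lanes >= 1 and stable_names >= SEVERAL
    partial = (
        foothold
        and connected_helpers >= MULTIPLE
        and caller_backed_claims >= 1
    )
    deep = partial and mostly_refinement
    if deep:
        return "Deep"
    if partial:
        return "Partial"
    if foothold:
        return "Foothold"
    return "None"


if __name__ == "__main__":
    print(promotion_tier(lanes=1, stable_names=4,
                         connected_helpers=2, caller_backed_claims=1))
```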

5. Repair Boundaries Only Where They Unlock Coverage

Large overlap-heavy ranges still matter, but they should be attacked in payoff order:

  1. caller-dense windows that block multiple names
  2. parser/animation/video windows with direct downstream value
  3. residual startup/display overlap only when it blocks a live rename or caller chain

Current standing examples include 000c:db68, 000e:ffb0, and several sparse gaps in 0007 and 000b.

6. Turn Structural Object Families Into Behavioral Ones

Two families still deserve focused classification work because they block naming quality across many segments:

  1. the VM selector / owner-resource family
  2. the 0x4588 callback / allocator / presentation family

Batch Discipline

For each rename batch:

  1. Verify the body locally by decompile or disassembly.
  2. Verify at least one caller or neighboring family anchor when the name is behavioral rather than purely structural.
  3. Add a short Ghidra comment preserving why the name is safe.
  4. Update the ledger if a segment can be promoted.
  5. Update plan-mid.md only when the real next resume point changed.
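
Steps 1 to 3 can be enforced before anything reaches the write path. The sketch below uses a hypothetical `StagedRename` record and assumes the "1 to 5 renames" limit from the low-risk naming workstream applies to the batch being checked; it does not talk to Ghidra at all.

```python
from dataclasses import dataclass

MAX_BATCH_SIZE = 5  # "1 to 5 renames" from the low-risk batch guidance


@dataclass
class StagedRename:
    """Hypothetical staging record for one rename held in notes."""
    address: str           # segment:offset, e.g. "1100:0437"
    new_name: str
    body_verified: bool    # step 1: decompile or disassembly checked
    behavioral: bool       # True when the name claims behavior, not just structure
    anchor_verified: bool  # step 2: caller or neighboring family anchor checked
    comment: str           # step 3: provenance comment text


def batch_problems(batch: list[StagedRename]) -> list[str]:
    """Return human-readable reasons the batch is not ready to apply."""
    problems = []
    if not 1 <= len(batch) <= MAX_BATCH_SIZE:
        problems.append(f"batch size {len(batch)} is outside 1..{MAX_BATCH_SIZE}")
    for rename in batch:
        if not rename.body_verified:
            problems.append(f"{rename.address}: body not verified")
        if rename.behavioral and not rename.anchor_verified:
            problems.append(f"{rename.address}: behavioral name lacks an anchor")
        if not rename.comment.strip():
            problems.append(f"{rename.address}: missing provenance comment")
    return problems
```

An empty result means the batch satisfies steps 1 to 3; the ledger and plan-mid.md updates in steps 4 and 5 remain manual.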

Latest Applied Rename Batch

This batch is now landed in CRUSADER.EXE. The practical workflow lesson still stands: read-only MCP analysis can run against the GUI-owned session, but the external write path had to fall back to local PyGhidra after the GUI-held project lock blocked explicit write access.

  • 1100:0437 -> SurrenderProcess_Destroy
  • 1100:0913 -> NPC_DoRandomIdleAnimTwiceIfNotBusy
  • 1128:1e14 -> CruHealer_Destroy
  • 1128:1fbe -> BatteryChargerProcess_Destroy

Evidence summary:

  • 1100:0437 resets the surrender vtable root, clears the local NPC flag bit, and destroys the two embedded dispatch-entry children.
  • 1100:0913 is only called from GuardProcess_Run and LoiterProcess_Run; it gates on NPC_IsBusy, seeds a two-way random choice, and then calls NPC_DoAnim twice.
  • 1128:1e14 and 1128:1fbe are clear destructor slots for the healer and battery-charger process families because they restore the process vtable root, stop the family-specific sound, clear avatar stasis, and destroy the embedded child entries.
  • The next seg033 pass also closed four more slot-1 destructor entries directly from the live process-function tables: PaceProcess_Destroy, GuardProcess_Destroy, LoiterProcess_Destroy, and StandProcess_Destroy are now confirmed from g_*ProcessFnPtr table ownership rather than from weaker local shape alone.
  • The same vtable-driven cleanup method has now also widened beyond seg033: NPCActionProcess_Destroy, DeathSilenceProcess_Destroy, PathfinderProcess_Destroy, TeleporterProcess_Destroy, and EggHatcherProcess_Destroy are all now grounded directly in their live g_*ProcessFnPtr slot-1 ownership rather than left as anonymous generic destructors.
  • The current AI-lane residue is narrower again: the base NPCActionProcess no-op run body, the shared slot-10 no-op body, the loiter-only slot-10 override, and the common process-vtable slot stubs are now all named structurally. The remaining uncertainty in this immediate window is therefore mostly semantic rather than object-identity: exactly what slot-10/slot-11 mean at the behavior-policy level, not which functions own those slots.
  • The same slot-1 cleanup method now generalizes beyond the AI families too: SnapProcess_Destroy, VideoPlayer_Destroy, GameTimeProcess_Destroy, WaitProcess_Destroy, and SpriteProcess_Destroy are now live as verified vtable-owned process destructors, which makes the broader process-family cleanup lane more systematic instead of one-off.
  • The parser/animation-adjacent media lane now has its first comparable foothold too. FlicPlayProcess_Destroy, FlicWaitProcess_Destroy, MusicPlayerProcess_RunNoop, MusicPlayerProcess_Destroy, AssProcess_Destroy, FlicWaitProcess_VtableSlot10TickAndMaybeAdvance, MusicPlayerProcess_VtableSlot10Noop, AssProcess_VtableSlot5ClearCreatedFlag, and AssProcess_VtableSlot6SetCreatedFlag are now live from direct g_*ProcessFnPtr slot ownership and local body evidence. The remaining gap in this lane is no longer basic object identity; it is the deeper connection between these process families and the older raw 000e: parser/video helper anchors.
  • That same lane now extends one level deeper into the video helper stack: VideoPlayer_InitializePlayback, VideoPlayer_OpenMediaFiles, VideoPlayer_AllocPlaybackBuffers, VideoPlayer_OpenMoviListAndPrimeStreams, VideoPlayer_StopAndDestroyWrapper, and VideoPlayerProcess_VtableSlot11Noop are now live from direct caller relationships (PlayFlicProbably_1468_3f77, FlicPlay_1468_4169, VideoPlayer_Run), MOVI-string evidence, and the g_videoPlayerProcessFnPtr slot map. The next high-value unknowns in this lane are now the remaining unnamed video helper bodies around palette/error handling and the still-unmapped raw 000e: parser/video equivalents.
  • The chunk-processing layer is now materially clearer too: File_Exists, VideoPlayer_FormatErrorMessage, VideoPlayer_AdvanceChunkCursor, VideoPlayer_AdvanceChunkCursorWrapper, VideoPlayer_LoadAudioChunk, VideoPlayer_LoadVideoChunk, and VideoPlayer_BlitDecodedFrame are live from direct chunk-tag evidence (01wb, 00db, 00dc), caller placement inside the playback loop, and the now-named MOVI setup path. That leaves fewer anonymous helpers in the video lane; the remaining gaps are concentrated in subtitle/palette helpers and the four concrete blitters that VideoPlayer_BlitDecodedFrame dispatches to.
  • A separate music/save-state mini-cluster also closed cleanly after switching lanes: Music_RestorePreviousTrackFromStack, Music_LoadStateAndReplayCurrentTrack, and Music_SaveState are now live from direct Savegame_LoadProbably / Savegame_QuickSave caller evidence and the explicit music-track stack behavior around g_musicTrack and DAT_1478_3b75. That gives the audio side a foothold outside the video/process lane and reduces ambiguity around transient-screen music restoration.
  • The savegame UI cluster is now fully closed at the 13d0: helper level: SavegameSlot_GetLabelPtr, SavegameSlot_SetLabel, File_CloseAndMaybeFree, SavegameNameField_MapInputChar, SavegameNameField_HandleKey, SavegameNameField_Draw, SavegameMenu_Destroy, SavegameMenu_HandleKey, SavegameMenu_HandleSlotAction, SavegameSlot_DrawCornerDecorations, SavegameSlotGump_Create, SavegameSlotGump_Destroy, SavegameSlot_HandleClick, SavegameSlot_BeginEditOrActivate, and SavegameSlot_Select are now live from direct save/load-gump caller evidence. That removes the remaining anonymous helpers from the savegame menu lane instead of leaving a partly named cluster behind.
  • A fresh 13c8: top-level menu cluster now has its shell named too: MainMenu_Destroy, MainMenu_DrawCornerDecorations, MainMenu_HandleButtonClick, MainMenu_HandleKey, and MainMenu_ActivateSelection are live from direct caller behavior and the selection-dispatch cases that open difficulty, save/load, credits, and related modal flows. The remaining work in that lane is now below the shell level, inside the subordinate modal/menu bodies rather than the main dispatcher itself.
  • The tiny remaining ASS cluster also closed one concrete gap: ASS_StoreInitCallbackState is now named from the direct Init_ASS / Uninit_ASS lifecycle, where it is passed as the callback address in the ASS init data and later unwound through the stored global state. That leaves the ASS lane with no anonymous helpers in its small top-level init/process callback path.
  • The small residual 10f8: item-type lane is now also cleaner: ItemScript_AppendBytes and ItemTypeflagRecord_ResetDefaults are live as structural helpers beside the already named typename-record functions. These are not deep semantic wins, but they remove the remaining obvious placeholders from that compact item-type helper cluster.
  • A larger ownership-backed process batch is now live too: MapJumpProcess_Destroy, FadeProcess1_Destroy, AnimProcess_Destroy, ItemProcess_Destroy, SuperSpriteProcess_Destroy, OneFrameDelayProc_Destroy, CameraProcess_Destroy, KeyDaemonProcess_Destroy, KeyboardProcess_Destroy, AccWaitProcess_Destroy, SystemTimerProcess_Destroy, BiosProcess_Destroy, CustomWaitProcess_Destroy, DumbTimerProcess_Destroy, CycleProcess_Destroy, FadeProcAlt_Destroy, and MyTimerProcess_Destroy were all closed from direct g_*FnPtr slot-1 ownership. This is a high-confidence unknown-count reduction batch rather than a semantic deep dive, but it materially shrinks the remaining anonymous process families.
  • A companion slot-method batch is now live as well: MapJumpProcess_VtableSlot10AdvanceItemFind, AnimProcess_VtableSlot10DispatchByPort, FadeProcess2_VtableSlot10BlendTowardTargetPalette, AttackProcess_VtableSlot10DispatchByClip, WaitProcessFamily_VtableSlot10DispatchByPair, AccWaitProcess_VtableSlot10DispatchByAnimation, BiosProcess_VtableSlot10DosRealFarCall, CustomWaitProcess_VtableSlot11ArmAndRun, MyTimerProcess_VtableSlot10IncrementCounterOnTick, BaseCameraProcess_VtableSlot10SetViewportRect, and BaseCameraProcess_VtableSlot11FreeBuffer are now named from direct g_*FnPtr ownership plus local body contracts. That broad-pass work keeps reducing placeholder density without overclaiming semantics that still depend on table decoding.
  • A further broad UI/gump pass is also live: StdIntHandlerProcess_Destroy, GumpShared_DestroyNoop, GumpShared_VtableSlot3Noop, GumpShared_VtableSlot7Noop, GumpShared_VtableSlot8Noop, GumpShared_VtableSlot9Noop, GumpShared_VtableSlot10Noop, GumpShared_VtableSlot16Noop, GumpShared_VtableSlot17Noop, ButtonGump_Destroy, KeypadGump_Destroy, KeypadButtonGump_Destroy, HelpGump_Destroy, HelpGump_RunAmbientSfxTick, RunCreditsProcess_Destroy, QuickSaveLoadExitGump_Destroy, and Gump13f80383_Destroy are now named from direct table ownership and trivial body evidence. This removes a large amount of UI placeholder noise before any deeper dialog-by-dialog semantics are needed, and it also corrects the earlier too-narrow keyboard-only labels on gump slots that are actually reused across multiple families.
  • Another small structural process-family batch is now live too: AnimProcess_RunNoop, Process1048_0000_RunNoop, Process1048_0000_Destroy, AnimPrimitiveProcessSomethingElse_Destroy, AnimPrimitiveProcessFamily_VtableSlot11CallSlot3, Process1188_0000_RunOnTimerDelta, and Process1188_0000_Destroy were closed from direct table ownership and trivial local contracts. They are still generic where the owner families remain unnamed, but they further reduce the anonymous surface for later semantic passes.
  • A further tiny broad-sweep batch is now live from adjacency-backed structural evidence: SystemTimerProcess_RunNoop, Gump13f80383_VtableSlot10Noop, and Gump13f80383_VtableSlot11Noop. These were deliberately limited to obvious no-op slots sitting directly inside already-partially-named families, keeping the sweep conservative while still trimming another few placeholders.
  • Another small broad-sweep cluster is now live in the main-menu neighborhood: MainMenuOptionsPanel_Create, MainMenuOptionButtonGump_Create, MainMenuOptionButtonGump_HandlePointerEvent, MainMenuOptionButtonGump_Draw, and Gump13f80383_Draw. This batch came from direct local-family layout evidence rather than deep semantic tracing: one parent options-panel constructor, one repeatedly-instantiated button-gump constructor, its pointer-event and draw methods, and the obvious draw method in the already-named 13f8: gump family.
  • That same small main-menu cluster tightened one step further immediately after: MainMenuOptionButtonGump_SelectPeer is now live from the direct paired-button state flip and peer-search body that MainMenuOptionButtonGump_HandlePointerEvent calls on hit. This keeps the batch conservative while closing the one remaining obvious helper in that local family.
  • The broad sweep then picked up another small owner-safe UI batch in the help lane too: HelpGump_RefreshPage, HelpGump_HandleAdvanceAction, and HelpGump_HandleNavigationKey are now live from direct caller links inside HelpGump_Create plus the shared page-state field at +0x49. That leaves the remaining 13e8: help-family residue narrower and more obviously local.
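
Many of the entries above rest on the same mechanical step: slot 1 of a g_*FnPtr table is treated as the family destructor. The sketch below shows how rename proposals could be derived from table symbols. The table contents are placeholder labels rather than real addresses, the symbol spellings simply follow the g_*ProcessFnPtr pattern, and hand-chosen family names may differ from what this produces.

```python
DESTROY_SLOT = 1  # slot index treated as the destructor in the notes above


def family_from_table_symbol(symbol: str) -> str:
    """Turn e.g. 'g_guardProcessFnPtr' into 'GuardProcess'."""
    prefix, suffix = "g_", "FnPtr"
    if not (symbol.startswith(prefix) and symbol.endswith(suffix)):
        raise ValueError(f"not a function-pointer table symbol: {symbol}")
    stem = symbol[len(prefix):-len(suffix)]
    if not stem:
        raise ValueError(f"empty family name in symbol: {symbol}")
    return stem[0].upper() + stem[1:]


def propose_destroy_names(tables: dict[str, list[str]]) -> dict[str, str]:
    """Map each table's slot-1 target to a '<Family>_Destroy' proposal."""
    proposals = {}
    for symbol, slots in tables.items():
        if len(slots) <= DESTROY_SLOT:
            continue  # table too short to have a destructor slot
        proposals[slots[DESTROY_SLOT]] = f"{family_from_table_symbol(symbol)}_Destroy"
    return proposals


if __name__ == "__main__":
    # Placeholder slot targets; a real pass would read them from the live tables.
    tables = {
        "g_guardProcessFnPtr": ["guard_slot0", "guard_slot1"],
        "g_loiterProcessFnPtr": ["loiter_slot0", "loiter_slot1"],
    }
    print(propose_destroy_names(tables))
```

Each proposal still needs the local body check from Batch Discipline before it is applied; table ownership identifies the slot, not the behavior.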

Latest Applied Helper Batch

This next pair is now landed in the live session through the write-capable MCP script path:

  • 10f8:0437 -> ItemType_GetTypenameRecordPtrAtIndex
  • 10f8:045b -> ItemType_FindTypenameRecordIndex

Current best read:

  • 10f8:0437 is the tiny 0x20-stride helper behind the typename.dat table: it returns the base pointer of the requested 1-based record and returns 0 for index 0.
  • 10f8:045b scans the same typename.dat record table from a requested starting index, optionally uppercases both the query and the candidate record name, and returns the matching record index or -1.
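
The two contracts above can be modeled in a few lines. This is a sketch under stated assumptions, not a transcription of the binary: it assumes record 1 sits at the table base (the real helper may instead leave slot 0 unused), it represents the table as a plain list of names, and the sample names in the tests are invented.

```python
RECORD_STRIDE = 0x20  # record size described for the typename.dat table


def record_pointer(table_base: int, index: int) -> int:
    """Model of the 10f8:0437 contract: 0 for index 0, else the record base."""
    if index == 0:
        return 0
    # Assumption: the 1-based record N starts (N - 1) strides past the base.
    return table_base + (index - 1) * RECORD_STRIDE


def find_record_index(
    names: list[str], query: str, start: int = 1, ignore_case: bool = False
) -> int:
    """Model of the 10f8:045b contract: 1-based matching index or -1."""
    if ignore_case:
        query = query.upper()
    for index in range(max(start, 1), len(names) + 1):
        candidate = names[index - 1]
        if ignore_case:
            candidate = candidate.upper()
        if candidate == query:
            return index
    return -1
```

The scan starts at the caller-supplied index rather than always at 1, matching the "requested starting index" behavior described above.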

That closes the local seg032 helper batch and confirms a more useful write rule for future work: when the edit-plan endpoint refuses to commit in-session, the live write-capable script path can still land small targeted renames and provenance comments without leaving the open GUI workflow.

Latest Applied Broad-Sweep UI Batch

This next conservative broad-sweep batch is now landed in the live session through the MCP-backed write path:

  • 12f8:02e4 -> GumpShared_DestroyCommon
  • 13f8:0237 -> QuickSaveLoadExitGump_HandleChildButtonEvent
  • 13f8:0299 -> QuickSaveLoadExitGump_HandleKey
  • 13f8:0349 -> QuickSaveLoadExitGump_DrawLabel
  • 13f8:0383 -> QuickSaveLoadExitGump_Create
  • 13c8:2f37 -> MainMenuOptionsPanelButtonGump_Create
  • 13c8:2fca -> MainMenuOptionsPanelButtonGump_DrawLabel
  • 13c8:3004 -> MainMenuOptionsPanelButtonGump_Select
  • 13c8:3030 -> MainMenuOptionsPanelButtonGump_Deselect

Current best read:

  • 12f8:02e4 is the shared gump base destroy path used by HelpGump_Destroy, QuickSaveLoadExitGump_Destroy, Gump13f80383_Destroy, and several sibling UI families: it releases the linked child at +0x42/+0x44, clears that link, and then runs the common unlink/free helper path.
  • The 13f8: mini-cluster is now a real quick-save/load/exit modal family instead of a partly named shell. QuickSaveLoadExitGump_Create is called directly from World_HandleKeyboardInput, QuickSaveLoadExitGump_DrawLabel consumes the local label fields at +0x47/+0x49/+0x4b, and QuickSaveLoadExitGump_HandleKey plus QuickSaveLoadExitGump_HandleChildButtonEvent both funnel into the same local action-dispatch slot.
  • The 13c8:2f37..3030 cluster is now also tighter than a generic unnamed button wrapper. MainMenuOptionsPanel_Create calls MainMenuOptionsPanelButtonGump_Create six times, the wrapper sits directly over the generic 1308:032b button-gump constructor, and the adjacent DrawLabel / Select / Deselect bodies all operate on the same local label/selected-state fields.
  • The same 13c8: lane now also has a clearer main options-menu core. MainMenuOptionsMenu_Destroy saves options back to config before teardown, MainMenuOptionsMenu_Create builds the option-entry table and rectangle layout, MainMenuOptionsMenu_GetOptionRect owns the per-index placement math, and the adjacent HandleChildButtonEvent / HandleKey / DrawTitle / MainMenuOptionsMenuButtonGump_DrawLabel bodies close another chunk of placeholder-heavy UI surface without forcing final option-by-option semantics.
  • This batch is deliberately broad rather than deep: it trims placeholder-heavy UI surface without pretending the remaining dialog/menu semantics are already solved.

Next Steps

  1. Keep sweeping the same 12f8 / 13c8 / 13f8 UI-gump neighborhood for more structurally obvious virtual slots and tiny wrappers before switching back to deeper caller-policy work.
  2. Promote more ledger rows directly from already-verified notes so the None set shrinks before the next broad sweep.
  3. Resume the VM selector lane from the earlier policy layer above the now-named movement/collision producer surface.
  4. Revisit the parser/animation lane and map the live equivalents of the old raw 000e: parser/video helper anchors, starting from the now-named FLIC/media process families and the newly named video helper stack (VideoPlayer_InitializePlayback, VideoPlayer_OpenMediaFiles, VideoPlayer_AllocPlaybackBuffers, VideoPlayer_OpenMoviListAndPrimeStreams, VideoPlayer_LoadAudioChunk, VideoPlayer_LoadVideoChunk).
  5. Keep harvesting small caller-closed side clusters like the music/save-state helpers when a larger lane stalls; they are low-risk ways to keep the global unknown count moving down.
  6. Savegame UI is now a completed side cluster at the helper/gump-handler level; future work there should only be broader semantic polish if needed, not placeholder cleanup.
  7. The 13c8: main-menu shell plus the new options-panel and options-menu helper clusters is now a good staging point for later work on subordinate menu/dialog bodies if a future pass wants another UI-heavy cluster.
  8. Keep the next write batch small enough that every rename can carry a one-sentence provenance comment.
  9. Continue treating explicit write-path health as part of the workflow, not as an afterthought, so future rename batches do not stall at commit time.