- Created `crusader_segment_coverage_ledger.csv` to track segment coverage status, types, and known functions.
- Introduced `plan-mid.md` as a mid-project tracker outlining progress, objectives, and implementation priorities for the decompilation effort.
- Added scripts in `pyghidra_plans` to assist with instruction window dumping and reference inspection for the object at `0x4588`.
- Implemented functionality to scan for instruction uses of specific targets related to the decompilation project.
Crusader Decompilation Mid-Project Plan
Purpose
This file is the workspace-facing mid-project tracker for the Crusader decompilation effort. It is intended to answer four questions clearly:
- How far along is the project?
- What is already solid?
- What still blocks broader decompilation?
- What should be implemented next?
The estimates below are intentionally conservative. They measure verified behavioral understanding, not just renamed symbols.
Progress Snapshot
Working Progress
Last Confirmed State
- Priority 0 has started: `crusader_segment_coverage_ledger.csv` exists and contains a first-pass 145-row ledger.
- The currently seeded ledger rows are conservative and strongest around seg001, seg004, seg021, seg043, seg080, seg082/083/085, seg091, seg094, and seg095.
- Priority 1 has started on the cache/backend cluster: the seg082 allocator mechanics are now materially recovered (`allocator_head_try_alloc_block`, `allocator_head_free_block`, `allocator_free_block_by_ptr`), and the next unresolved clue is that `0x4588` behaves like a runtime-installed callback/dispatch object used by `entity_conditional_render_dispatch` plus a one-shot teardown path.
- The `0x4588` blocker is tighter than before: no-function windows now confirm a direct install at `000a:493e`, repeated clear paths in seg004, and additional vtable `+0x0c` callbacks from unresolved `000a:` and `000d:` callers, but the concrete subsystem name is still unresolved.
Current Focus
- Finish Priority 0 refinement by promoting more exact segment rows where notes already support a verified foothold.
- Continue the Priority 1 pass by tracing the remaining caller-side `0x4588` / `0009:b1c3` object-role evidence rather than the already-recovered allocator mechanics.
Next Resume Point
- Update the ledger for any additional exact segment anchors found in the reset/cache or render-path notes.
- Recover or classify the still-unbounded callback callers around `000a:b9e5` / `000a:ba66` and `000d:9d5e` / `000d:a3b7`; they now look like the best remaining cheap wins on the `0x4588` path.
- Revisit the nearby install/lifecycle gap around `000a:493e` / `000a:4a56` only if those caller windows need a stronger object-owner model.
- Continue `ASYLUM.24` only after the `0x4588` path has no further cheap wins.
Headline Estimate
- Overall useful decompilation progress: about 25%
- Reasonable uncertainty band: about 20% to 30%
This is the best single-number estimate for the full game right now.
Supporting Metrics
| Metric | Estimate | Meaning |
|---|---|---|
| Top 100 far-call target coverage | about 80% | Roughly 80 of the top 100 most-called far-call targets have been named or materially classified |
| Whole-program behavioral coverage | about 25% | Verified subsystem and function understanding across the executable |
| Segment spread with meaningful analysis | about 10% to 15% | Segments with more than a trivial foothold or isolated note |
| Tooling maturity for continued work | about 75% | Core repair, lookup, and fallback automation needed for continued progress |
Why These Numbers Differ
- The hot-target metric is much higher because the project has already focused on the most shared and most-called helpers.
- The whole-program metric is lower because most of the 145 NE segments still have not had systematic coverage passes.
- The segment-spread metric is lower still because only a subset of segments have coherent subsystem-level treatment.
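The gap between these figures can be checked with simple arithmetic. The sketch below is illustrative only: the helper names are hypothetical, and the only inputs are figures already quoted in this file (145 NE segments, about 80 of the top 100 targets, and a 10% to 15% segment spread).

```python
TOTAL_SEGMENTS = 145  # NE segments identified from the internal NE header
TOP_TARGETS = 100     # size of the hot far-call target list


def percent(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    return 100 * part / whole


def segments_for(pct: int) -> float:
    """Convert a segment-spread percentage into a segment count."""
    return TOTAL_SEGMENTS * pct / 100


hot_target_coverage = percent(80, TOP_TARGETS)
spread_low, spread_high = segments_for(10), segments_for(15)

print(f"hot targets: {hot_target_coverage:.0f}%")
print(f"segment spread: about {spread_low} to {spread_high} of {TOTAL_SEGMENTS} segments")
```

Read this way, a 10% to 15% spread corresponds to roughly 15 to 22 of the 145 segments, while the hot-target figure covers a list of only 100 functions.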
What Is Already In Place
Workflow and Tooling
- Raw full-EXE Ghidra target is established and in active use.
- Verified raw-import mapping exists for seg001 and seg021.
- NE relocation parsing has been implemented.
- Internal literal far-call fixups have been applied to the raw import.
- PyGhidra fallback tooling exists for create/delete function work and batch scripted edits.
- Conservative boundary-repair workflow already exists and has been used successfully.
- Notes are detailed enough to support a formal executable-wide tracker.
Objective Milestones Already Reached
- 145 NE segments identified from the internal NE header.
- 8851 internal literal CALLF sites patched to real targets in the raw import.
- 2841 non-CALLF far-pointer relocations identified and deferred.
- 119 import callsites annotated.
- Top 100 far-call target list processed through five tiers, with about 80 named or materially classified.
Strongly Advanced Areas
Core Gameplay and Entity Work
- seg001 gameplay, cursor, entity lifecycle, projectile, combat, and AI footholds are strong.
- A verified seg001 raw-port path is working and already used for multiple projectile helpers.
- Entity table, class-table, and several global gameplay fields are partially mapped.
Timer, Event, and State Systems
- seg021 timer and event-dispatch work has meaningful coverage.
- 000c state-dispatch, cursor-nav, UI-listbox, palette-fade, and mini-VM clusters have footholds.
Rendering and Camera
- 0007 rendering, draw-list, tile-visibility, and camera work has strong structural coverage.
- `world_to_screen_coords` and adjacent geometric helpers are understood well enough to support further caller analysis.
Dispatch and Pair-Sync Helpers
- 0008 dispatch-entry helper families have multiple verified rename batches.
- Pair-sync and target-state helper clusters are no longer isolated unknowns.
Cache, Tracked Handles, and Bucket Logic
- 000a cache manager layer is structurally mapped.
- 000a tracked-handle table is structurally mapped.
- 000d tracked bucket / proximity / visibility bucket logic has several meaningful behavioral names.
- The client/cache distinction is much clearer than before.
Parser and Animation Framework
- 000e parser cluster has a stable set of verified names.
- 000e animation framework has a real foothold: chunk lookup, audio load, tick, frame advance, and constructor variants are partly mapped.
Local Repair Successes
- seg043 overlap repair succeeded and recovered multiple valid function objects.
- seg091 boundary recovery succeeded and exposed RNG helpers plus local init/context helpers.
- Recent seg004 reset-path recovery and cache-reset follow-up added a new high-value analysis cluster.
What Still Blocks Broader Coverage
High-Value Classification Gaps
- The object rooted at `0x4588` is still not classified well enough to safely rename `0009:b1c3`.
- `ASYLUM.24` is only known as an import site, not yet a confidently identified routine.
- Some structural names in the cache/backend/finalize cluster are waiting on object-role confirmation.
Boundary and Decompiler Gaps
- Some high-caller targets still require conservative boundary repair or follow-up validation.
- Certain functions still decompile poorly because of overlaps, thunk-heavy paths, or unresolved downstream targets.
- `000e:ffb0` remains a notable animation/video-side blocker because of overlapping instructions.
Coverage Management Gap
- A first-pass normalized segment-by-segment coverage ledger now exists for all 145 NE segments.
- The remaining gap is refinement rather than absence: most segments still need manual promotion from `None` to `Foothold` / `Partial` / `Deep` as coverage expands.
Deferred Data Work
- Non-CALLF far-pointer relocations still exist and will matter for deeper object/table recovery.
- They are no longer the main blocker, but they remain a real second-pass problem.
Current Best Assessment Of Remaining Work
The project has solved most of the architectural uncertainty needed to keep going efficiently. The remaining effort is mainly a scaling problem:
- expand coverage across many more segments,
- remove the last high-value boundary blockers,
- convert structural names into subsystem names when evidence is strong enough,
- and normalize progress tracking so the whole program can be managed deliberately.
In practical terms, this looks like a true mid-project state rather than an early exploratory state or a late polish state.
Implementation Priorities
Priority 0: Coverage Ledger
First pass completed: an executable-wide coverage ledger now exists for all 145 NE segments in `crusader_segment_coverage_ledger.csv`.
Next work under Priority 0:
- Promote additional segments from `None` where notes already support a verified foothold.
- Normalize raw-address subsystem islands (notably the `000e:` parser/animation cluster) back onto exact NE segment rows.
- Keep the ledger updated together with `crusader_decompilation_notes.md` after each verified batch.
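Mapping a raw address back onto an exact NE segment row reduces to a range lookup once the address has been converted to a file offset. The following is a minimal sketch under stated assumptions: `SegmentRow` and `find_segment` are hypothetical names, the sample rows are placeholders rather than real segment-table values, and the conversion from a raw-import address to a file offset is deliberately left out because it depends on how the raw import was loaded.

```python
from bisect import bisect_right
from typing import List, NamedTuple, Optional


class SegmentRow(NamedTuple):
    number: int       # NE segment number
    file_offset: int  # start of the segment in the EXE
    length: int       # segment length in bytes


def find_segment(rows: List[SegmentRow], file_offset: int) -> Optional[SegmentRow]:
    """Return the row whose byte range contains file_offset, or None.

    rows must be sorted by file_offset.
    """
    starts = [r.file_offset for r in rows]
    i = bisect_right(starts, file_offset) - 1
    if i < 0:
        return None  # offset lies before the first segment
    row = rows[i]
    if file_offset < row.file_offset + row.length:
        return row
    return None  # offset falls in a gap between segments


# Placeholder rows for illustration only; not real segment-table data.
rows = [
    SegmentRow(1, 0x1000, 0x800),
    SegmentRow(2, 0x1A00, 0x400),
]
hit = find_segment(rows, 0x1A10)
print(hit.number if hit else "no segment")
```

Returning `None` for offsets in a gap keeps the lookup conservative: an address is only attributed to a segment row when the table actually covers it.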
Minimum columns:
| Column | Meaning |
|---|---|
| Segment | NE segment number |
| Type | Code or data |
| File offset | From the NE segment table |
| Length | Segment length |
| Coverage status | None, Foothold, Partial, Deep |
| Known subsystem | Best current classification |
| Key named functions | Short summary only |
| Blockers | Boundary, import, thunk, overlap, unknown object, etc. |
| Notes source | Notes section or evidence anchor |
This remains the most important tracking artifact because it makes the percentage estimates maintainable.
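A small loader can enforce the minimum columns whenever the ledger is edited. This is a sketch, not the project's tooling: the header spellings are assumed to match the "Minimum columns" table, the status spellings are assumed to match the tracking model, and `load_ledger` and the sample row are hypothetical.

```python
import csv
import io

# Assumed header names, taken from the "Minimum columns" table;
# the real CSV may spell them differently.
COLUMNS = [
    "Segment", "Type", "File offset", "Length", "Coverage status",
    "Known subsystem", "Key named functions", "Blockers", "Notes source",
]
STATUSES = {"None", "Foothold", "Partial", "Deep"}


def load_ledger(text: str) -> list:
    """Parse ledger CSV text; reject missing columns and unknown statuses."""
    reader = csv.DictReader(io.StringIO(text))
    missing = [c for c in COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    rows = []
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        status = row["Coverage status"]
        if status not in STATUSES:
            raise ValueError(f"line {line_no}: unknown status {status!r}")
        rows.append(row)
    return rows


# Placeholder row for illustration; the values are not real ledger data.
sample = ",".join(COLUMNS) + "\n1,Code,0x0,0x0,Foothold,example,example_fn,none,notes\n"
ledger = load_ledger(sample)
print(len(ledger), ledger[0]["Coverage status"])
```

Failing loudly on an unknown status keeps typos from silently dropping a segment out of the percentage estimates.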
Priority 1: Finish The New Cache/Backend Cluster
Work the newest verified reset-path cluster to closure:
- Trace more callers of `0009:b06b`.
- Trace more callers of `FUN_0009_a961`.
- Classify the object rooted at `0x4588`.
- Revisit `0009:b1c3` once the object role is clearer.
This is currently the best next analysis target because it closes a live cluster that already has fresh verified work around it.
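One cheap way to collect candidate uses of the `0x4588` object is to scan raw code bytes for the 16-bit little-endian value. The sketch below is a heuristic and is not the project's PyGhidra script: `find_imm16` and the sample byte window are hypothetical, and every hit is only a candidate, since the same two bytes can also occur in data or in the middle of an unrelated instruction.

```python
import struct
from typing import List


def find_imm16(code: bytes, value: int) -> List[int]:
    """Return every offset where value appears as a little-endian 16-bit word."""
    needle = struct.pack("<H", value)
    hits = []
    pos = code.find(needle)
    while pos != -1:
        hits.append(pos)
        pos = code.find(needle, pos + 1)
    return hits


# Hypothetical window: `mov bx, [0x4588]` (8B 1E 88 45) followed by a NOP (90).
window = bytes.fromhex("8b1e884590")
candidates = find_imm16(window, 0x4588)
print([hex(offset) for offset in candidates])
```

Each candidate offset still has to be confirmed in the disassembly before it counts as evidence about the object's role.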
Priority 2: Resolve ASYLUM.24
Identify what imported routine ASYLUM.24 actually is.
Goal:
- tighten the description of `runtime_cache_reset_sequence`,
- determine whether the import belongs to cache/resource/backend/media initialization,
- and improve naming confidence around the reset path.
Priority 3: Continue Small-Batch Boundary Repair
Use the existing conservative repair approach for remaining high-value blockers.
Good candidates include:
- unresolved high-caller function objects,
- ranges that still steal bytes from adjacent real bodies,
- and overlaps that block decompilation of already-active subsystems.
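Ranges that steal bytes from a neighbouring body can be flagged mechanically before any manual repair. The following is a minimal sketch assuming function bodies are available as half-open `(start, end, name)` ranges; `find_overlaps` and the sample addresses are hypothetical and not taken from the real database.

```python
from typing import List, Tuple

Range = Tuple[int, int, str]  # (start, end_exclusive, name)


def find_overlaps(ranges: List[Range]) -> List[Tuple[str, str]]:
    """Return name pairs of ranges that share at least one byte."""
    ordered = sorted(ranges)
    overlaps = []
    for i, (_start, end, name) in enumerate(ordered):
        for other_start, _other_end, other_name in ordered[i + 1:]:
            if other_start >= end:
                break  # sorted by start, so no later range can overlap
            overlaps.append((name, other_name))
    return overlaps


# Placeholder function bodies for illustration only.
bodies = [
    (0x100, 0x140, "func_a"),
    (0x130, 0x160, "func_b"),  # starts inside func_a
    (0x160, 0x180, "func_c"),  # adjacent to func_b, shares no bytes
]
print(find_overlaps(bodies))
```

A report like this only nominates repair candidates; which of the two bodies is the real one still has to be decided from the disassembly.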
Priority 4: Finish Partial Subsystem Islands Before Expanding Broadly
Recommended order:
- seg043 plus connected seg004 reset and dispatch paths
- 000e animation/video overlap at `000e:ffb0`
- 000c UI-listbox, mini-VM, and cursor-nav families
- Remaining structural 0007 and 0008 helper cohorts
The goal is to reduce the number of half-understood islands before starting broad segment sweeps.
Priority 5: Broaden Coverage Across The Remaining Executable
Now that a first-pass ledger exists, broaden analysis segment by segment once the current hot cluster is closed.
Preferred method:
- Group segments by adjacency and call relationships.
- Identify entry points and hot callees first.
- Classify globals and tables next.
- Promote helper names only when supported by strong evidence.
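The first step, grouping segments by call relationships, can be approximated as connected components of the inter-segment call graph. This is a sketch under assumptions: `group_segments` is a hypothetical helper, the edge list stands in for whatever far-call data is exported from the analysis, and the segment numbers are placeholders.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def group_segments(segments: Iterable[int],
                   calls: Iterable[Tuple[int, int]]) -> List[List[int]]:
    """Group segments into connected components of the call graph."""
    neighbours: Dict[int, Set[int]] = defaultdict(set)
    for caller, callee in calls:
        neighbours[caller].add(callee)
        neighbours[callee].add(caller)  # direction is ignored for grouping

    seen: Set[int] = set()
    groups = []
    for seg in sorted(segments):
        if seg in seen:
            continue
        seen.add(seg)
        stack, group = [seg], []
        while stack:  # iterative depth-first walk of one component
            current = stack.pop()
            group.append(current)
            for nxt in neighbours[current]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        groups.append(sorted(group))
    return groups


# Placeholder (caller_segment, callee_segment) edges; not real call data.
edges = [(1, 2), (2, 3), (4, 5)]
groups = group_segments(range(1, 7), edges)
print(groups)
```

Segments that end up alone in a group have no known call links yet, which makes them natural candidates for the adjacency-based grouping instead.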
Recommended Tracking Model
Use these status values for segment coverage:
| Status | Meaning |
|---|---|
| None | No meaningful verified analysis yet |
| Foothold | One or two verified entry points or helper names, but no subsystem picture |
| Partial | Several verified names plus some globals/tables or object fields |
| Deep | Coherent subsystem-level understanding with multiple verified related functions |
Use these status values for subsystem maturity:
| Status | Meaning |
|---|---|
| Unknown | Not enough evidence to classify |
| Structural | Behavior is partly mapped but still generic |
| Behavioral | Confident subsystem role is known |
| Stable | Multiple connected functions and data objects support the classification |
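Because the segment coverage values are ordered, they can be modelled as an ordered enum so that promotions are explicit. The sketch below is one possible encoding, not an established project rule: `Coverage` and `promote` are hypothetical names, and the policy that a promotion needs a non-empty evidence anchor and never demotes is an assumption drawn from the conservative tone of this plan.

```python
from enum import IntEnum


class Coverage(IntEnum):
    NONE = 0      # no meaningful verified analysis yet
    FOOTHOLD = 1  # one or two verified entry points or helper names
    PARTIAL = 2   # several verified names plus some globals/tables
    DEEP = 3      # coherent subsystem-level understanding


def promote(current: Coverage, proposed: Coverage, evidence: str) -> Coverage:
    """Return the new status; promotions need evidence and never demote."""
    if proposed <= current:
        return current  # assumed policy: ignore non-promotions
    if not evidence.strip():
        raise ValueError("promotion requires an evidence anchor")
    return proposed


status = promote(Coverage.NONE, Coverage.FOOTHOLD, "notes: verified entry point")
print(status.name)
```

Tying each promotion to an evidence string mirrors the "Notes source" column, so a status change and its justification are recorded together.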
Suggested Immediate Work Queue
Queue A: Highest Leverage
- Expand the first-pass segment coverage ledger beyond the currently seeded segments.
- Trace `0009:b06b`, `FUN_0009_a961`, and `0009:b1c3`.
- Identify `ASYLUM.24`.
Queue B: Repair And Stabilize
- Review remaining high-caller gap functions.
- Repair any still-blocking overlaps in small batches.
- Re-decompile repaired ranges and keep only evidence-backed names.
Queue C: Broaden Carefully
- Expand into adjacent segments connected to already-understood clusters.
- Avoid speculative naming.
- Update the notes and the coverage ledger together after each verified batch.
Concrete Progress Interpretation
If a single number is needed, use 25%.
If a more honest dashboard is acceptable, use all three:
- 80% of top-100 hot targets processed
- 25% overall behavioral decompilation progress
- 10% to 15% segment spread with meaningful analysis
That combination best reflects the actual state of the project.
Source Anchors
Primary sources for this file:
- `crusader_segment_coverage_ledger.csv`
- `crusader_decompilation_notes.md`
- `crusader_ne_segments.csv`
- `tier4_output.txt`
- `tier5_output.txt`
- repo memory progress summary
Next Update Rule
Update this file when one of the following happens:
- the overall estimate changes materially,
- a new subsystem reaches behavioral or stable status,
- a major blocker such as `0x4588`, `0009:b1c3`, or `ASYLUM.24` is resolved,
- or the segment coverage ledger becomes the new primary progress source.