Crusader_Decomp/docs/remorse-class-lift-index.md

132 lines
No EOL
11 KiB
Markdown

# Remorse Class-Lift Work Index
## Purpose
This note is the easy-to-find landing page for the current Remorse C++ and class-lifting preparation work.
Use it as the starting point when the project returns to:
- class and namespace authoring inside Ghidra
- vtable and instance-layout promotion
- hand-maintained C++ skeleton emission
- ABI-safe source reconstruction planning
This index does not replace the detailed notes. It groups them into one work order so later implementation can resume quickly.
## Read This First
1. [docs/remorse-cpp-decompilation-plan.md](docs/remorse-cpp-decompilation-plan.md)
2. [docs/remorse-class-candidate-inventory.md](docs/remorse-class-candidate-inventory.md)
3. [docs/remorse-rebuild-abi-notes.md](docs/remorse-rebuild-abi-notes.md)
4. [docs/ghidra-mcp-class-lifting-endpoint-spec.md](docs/ghidra-mcp-class-lifting-endpoint-spec.md)
That set gives the high-level target, the current candidate families, the rebuild constraints, and the future MCP authoring surface.
## Current Note Groups
### 1. Overall Direction
- [docs/remorse-cpp-decompilation-plan.md](docs/remorse-cpp-decompilation-plan.md): staged route from decompiler-style C toward evidence-backed C++.
- [docs/remorse-rebuild-abi-notes.md](docs/remorse-rebuild-abi-notes.md): Track A versus Track B guardrails, segmented-pointer concerns, packing, and calling-convention constraints.
- [docs/remorse-toolchain-fingerprint-evidence.md](docs/remorse-toolchain-fingerprint-evidence.md): focused evidence note for the bound NE/Phar Lap/High-C-related toolchain story that underlies the ABI constraints.
- [docs/remorse-cpp-compatibility-header-draft.md](docs/remorse-cpp-compatibility-header-draft.md): first draft of the compatibility/support layer that future C++ skeletons should target.
### 2. Candidate Inventory And Tooling Surface
- [docs/remorse-class-candidate-inventory.md](docs/remorse-class-candidate-inventory.md): strongest current class candidates and modeling order.
- [docs/ghidra-mcp-class-lifting-endpoint-spec.md](docs/ghidra-mcp-class-lifting-endpoint-spec.md): missing class/vtable/datatype authoring operations for the local MCP fork.
### 3. Family-Specific Layout Notes
- [docs/entity-dispatch-entry-class-layout.md](docs/entity-dispatch-entry-class-layout.md): current `EntityDispatchEntry` base-versus-derived model, release surface, and subtype overlays.
- [docs/sprite-node-class-layout.md](docs/sprite-node-class-layout.md): `SpriteNode` destructor/event surface and candidate virtual-slot map.
- [docs/entity-class-family-split.md](docs/entity-class-family-split.md): conservative split of the large `Entity` lane into base, projectile, debris, corpse/actor, and adjacent non-entity families.
- [docs/entity-vm-runtime-owner-resource-layout.md](docs/entity-vm-runtime-owner-resource-layout.md): current runtime/helper/context ownership model for the VM lane.
- [docs/presentation-callback-broker-layout.md](docs/presentation-callback-broker-layout.md): current object/lifecycle/vtable evidence for the `0x4588` presentation-state callback broker family.
- [docs/usecode-debugger-break-state-layout.md](docs/usecode-debugger-break-state-layout.md): current object/lifecycle/layout evidence for the dormant seg1408 debugger-state family.
### 4. Execution Checklists
- [docs/remorse-first-class-authoring-checklist.md](docs/remorse-first-class-authoring-checklist.md): concrete first-batch checklist for the initial Ghidra/MCP class-authoring pass, with pilot-family guidance and source-emission gates.
## Recommended Work Order
### Stage 1: Keep The Evidence Model Honest
1. Re-read the plan, ABI note, and candidate inventory.
2. Pick one family with bounded ambiguity.
3. Confirm ctor, dtor, vtable root, and stable field groups before any class ownership changes in Ghidra.
Best current pilot families:
1. `EntityDispatchEntry`
2. `SpriteNode`
3. `EntityVmOwnerResource`
`Entity` remains a top-priority family, but it should be split deliberately rather than promoted as one giant class too early.
### Stage 2: Ghidra Authoring Pass
1. Create class or namespace owners.
2. Move only strongly owned methods first.
3. Create provisional instance structs and vtable structs.
4. Preserve slot order and unresolved fields instead of trying to beautify them.
The future MCP endpoint sequence should follow the spec note rather than ad hoc scripting.
### Stage 3: First C++ Skeleton Slice
1. Emit one header/source pair for the pilot family.
2. Build it against the compatibility layer rather than raw host C++ alone.
3. Keep unresolved offsets as named placeholders instead of collapsing them into speculative semantics.
4. Record which parts are Track A safe versus Track B convenience-only.
## Immediate Follow-Ups Still Open
1. Convert the first family note into a hand-maintained C++ skeleton once the compatibility header is accepted.
2. Implement the MCP class/vtable authoring endpoints only after the workflow and note set above are stable enough to drive them.
3. Add one more dedicated note for the callback/object lane around `0x4588` only if later caller evidence supports a stronger subsystem name than `PresentationCallbackBroker`.
4. Turn the first-class-authoring checklist into a completed execution log once the first real MCP batch lands.
## Current Live Authoring Snapshot
The live `CRUSADER.EXE` class-authoring lane is no longer just a plan.
Current authored `Remorse` classes in the active database are:
- `EntityDispatchEntry`
- `EntityVmOwnerResource`
- `EntityVmRuntime`
- `EntityVmContext`
- `EntityVmSlotEntry`
The VM lane is still the furthest along in actual Ghidra authoring. Recent live batches added the bounded `EntityVmSlotEntry` class owner plus more owned `EntityVmRuntime` methods (`GetSlotChunkPtrAtOffset`, `ReleaseSlotChunkRef`, `TryUnloadSlotChunk`, `DebugDumpSlotMemory`, `ApplyToMatchingOwnerRows`) rather than stopping at free-function naming.
The next planned pilot family is no longer purely preparatory either. `Remorse::EntityDispatchEntry` now exists as a real class owner in-session with a first provisional `/Remorse/EntityDispatchEntryBase` datatype covering the stable field block through `+0x18` and a matching `/Remorse/EntityDispatchEntryVtable` datatype exposing only the verified `+0x14` and `+0x28` callback slots. The first base-method batch has also landed from the old `0008:` note cluster after re-anchoring that range onto the live `11e0:` process-substrate segment: `InitBase`, `SetSourceType`, `SetEventTypeChecked`, `SetGroupId`, `Unlink`, and `IncrementGroupId` now live under the class owner with provenance comments preserved.
That family also has its first derived slice now. The old `000d:7e00/8078` runtime-state pair is re-anchored in the live `1440:` fade/palette cluster as `InitRuntimeState` and `ReleaseRuntimeState`, and `/Remorse/EntityDispatchEntryRuntimeState` now exists as a provisional overlay datatype with the recovered `+0x40..+0x4c` runtime-state tail fields. That is a meaningful pause point because the pilot family now has a class owner, a base datatype, a vtable shell, a first base-method batch, and one concrete derived/runtime-state batch rather than just one isolated constructor lane.
The latest signature-recovery pass also tightened two of those runtime methods materially:
- `GetSlotChunkPtrAtOffset(runtime_farptr, slot_index, chunk_index, intra_chunk_offset)` now reads as a real slot-chunk accessor instead of a five-word anonymous wrapper.
- `ApplyToMatchingOwnerRows(runtime_farptr, slot_index_filter, chunk_index_filter)` now reads as a real iterator/filter helper instead of a split-word scratch signature.
The next live batch pushed that further still: most of the `EntityVmRuntime` method cluster now carries an explicit 4-byte `EntityVmRuntime * this` in-session, including `Create`. The main remaining type gap inside that class is no longer the runtime object itself, but the exact far slot-entry pointer positions on `AcquireSlotForEntity` and `InitSlotOwnerBuffers`.
That VM-side gap is now closed too: `AcquireSlotForEntity` returns `EntityVmSlotEntry *32` in `DX:AX`, `InitSlotOwnerBuffers` now accepts `EntityVmSlotEntry *32`, `EntityVmOwnerResource::{Create,Destroy}` now carry explicit 4-byte `this`, and the simple `EntityVmContext` lifecycle methods now do the same.
The next family switch has also landed in the live database: `Remorse::UsecodeDebuggerBreakState` now exists as a real class owner with a provisional `0x2f2` datatype and a stronger method batch (`Create`, `MaybeBreakOnCurrentLine`, `BreakpointInsertSorted`, `BreakpointRemove`, `HasBreakpoint`, `CallstackPushFrame`, `CallstackPushEntry`, `CallstackPopEntry`, `EnableSingleStep`, `ClearStepState`, `CurrentEntryGetUnitName`).
That debugger family is no longer just a top-level shell. The internal record shapes are now recovered and applied live well enough to treat the two tables as real fixed-size arrays in-session: breakpoint entries are `0x0b` bytes with `unit_name_inline[9] + line_number`, and callstack entries are `0x15` bytes with `unit_name_inline[9]` plus the currently safest trailing fields `source_stream_cursor_farptr`, `current_frame_payload_farptr`, and still-neutral `aux_farptr`.
The next debugger pass tightened the bounded helper and callback edges too. `1408:0230` now lives under `Remorse::UsecodeDebuggerBreakState::BreakpointFindFirstForUnitAtOrAfterLine` instead of as an anonymous seg1408 helper, and the retail vtable root at `1478:65ab` is no longer a blind spot: slot 0 resolves to `OnBreakTriggeredNoop` and slot 1 resolves to `VtableSlot1ReturnZero`, which keeps the class surface honest about the shipped build's inert break callbacks while leaving non-retail behavior open.
The follow-up seg109 consumer pass is also done now. `13a0:0291` and its local helper `13a0:045c` are documented in-session as the first concrete consumers of the current callstack entry's trailing payload: they read entry `+0x09` as a descriptor/source-stream cursor and entry `+0x0d` as the current-frame payload context while formatting debugger dump/watch text. That cursor name is now promoted into the live `/Remorse/UsecodeDebuggerCallstackEntry` datatype and the `CallstackPushFrame` signature too, which means the remaining open callstack question is mostly the unused `aux_farptr` lane rather than the first two dwords.
The VM lane also advanced one more selective step without overpromoting inheritance: `Remorse::EntityVmContext::CreateFromSlotIndex` now has a caller-backed mixed parameter pack (`owner_source_farptr`, `pitemno_farptr`, `mode_flags`, `slot_index`, `value_add_offset`, `intra_chunk_offset`, `ucparam_farptr`, `ucparamsize`) and an explicit far return restored in `AX:DX`, even though the current live endpoint still textualizes that repaired signature conservatively as plain `dword __cdecl`.
## Bottom Line
The current prep work is now large enough that it should be treated as one coordinated lane rather than scattered notes.
Use this file as the resume point for future class-lift and C++ reconstruction work.