GhidraMCP Class-Lifting Endpoint Spec
Purpose
This note drafts the endpoint surface needed to support the Remorse class-lifting workflow described in docs/remorse-cpp-decompilation-plan.md and grounded by docs/remorse-class-candidate-inventory.md.
This is not an implementation batch. It is a local design spec so that when MCP work resumes later, the endpoint set can be built in a way that matches the actual reverse-engineering workflow instead of a generic symbol-edit API.
Design Goals
The new endpoints should make these workflows cheap and repeatable:
- create class and namespace containers in Ghidra without touching the GUI
- move already-renamed flat functions under explicit class ownership
- build typed instance structs and typed vtables from verified evidence
- attach this-pointer semantics and method signatures to recovered methods
- preserve ambiguity when evidence is partial instead of forcing speculative class conversions
- support dry-run review before any bulk symbol or datatype mutation
Non-Goals
- automatic recovery of class hierarchies from raw heuristics alone
- one-shot "convert whole binary to C++ classes"
- speculative inheritance inference without vtable or field evidence
- silent symbol moves that hide rename collisions or ownership conflicts
Existing MCP Behavior To Reuse
The local fork already has patterns worth reusing:
- explicit target selectors: project_dir, project_name, folder_path, program_name
- dry-run oriented edit-plan behavior
- machine-friendly outputs rather than prose-heavy summaries
- backward-compatible aliases when route names change
Every new class-lifting endpoint should follow the same conventions.
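A minimal sketch of how a client might apply these conventions to every call. The four selector names come from the list above; build_request, the envelope shape, and the sample program name are hypothetical and are not part of the existing fork.

```python
from typing import Any, Optional

# The four selector names come from the spec; everything else here is illustrative.
TARGET_SELECTOR_KEYS = ("project_dir", "project_name", "folder_path", "program_name")


def build_request(endpoint: str, params: dict[str, Any],
                  target: Optional[dict[str, str]] = None,
                  dry_run: Optional[bool] = None) -> dict[str, Any]:
    """Assemble a request body with optional target selectors (hypothetical helper)."""
    body = dict(params)
    for key, value in (target or {}).items():
        if key not in TARGET_SELECTOR_KEYS:
            raise ValueError(f"unknown target selector: {key}")
        body[key] = value
    # dry_run is only sent when the caller sets it, so endpoint defaults still apply.
    if dry_run is not None:
        body["dry_run"] = dry_run
    return {"endpoint": endpoint, "params": body}


# EXAMPLE.EXE is a placeholder program name, not a value from the project.
request = build_request("create_namespace", {"name": "Entity", "kind": "class"},
                        target={"program_name": "EXAMPLE.EXE"}, dry_run=True)
```

Keeping selectors and dry_run in one helper means each endpoint wrapper only has to describe its own parameters.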
Core Object Model Assumptions
The class-lifting workflow needs to manipulate four kinds of things explicitly:
- namespace/class containers in the symbol tree
- function ownership and method naming
- datatypes for instance structs and vtables
- binding metadata between methods, vtable slots, and instance layouts
That means symbol-only endpoints are not enough. Datatype endpoints and method-binding endpoints are part of the minimum viable feature set.
Proposed Endpoints
1. create_namespace
Create a namespace or class container.
Parameters:
- name: string
- parent_path: string, optional
- kind: enum namespace|class, default namespace
- explicit target selectors, optional
Response:
- status
- created: bool
- kind
- path
- symbol_id or equivalent stable identifier if available
- collision: existing path info when create is skipped or merged
Why it matters:
- lets the workflow create Entity, SpriteNode, EntityVmRuntime, or similar owners before moving methods
2. list_namespace_members
Return members of a namespace or class container in a machine-friendly form.
Parameters:
- path: string
- include_child_namespaces: bool, default false
- include_functions: bool, default true
- include_data: bool, default true
- explicit target selectors, optional
Response:
- status
- path
- members: array of { kind, name, address?, datatype?, child_count? }
Why it matters:
- needed for inventory verification and idempotent batch moves
3. move_symbol_to_namespace
Move a function or data symbol under a namespace/class.
Parameters:
- symbol_address: string, optional
- symbol_name: string, optional
- one of the above required
- namespace_path: string
- new_name: string, optional
- conflict_policy: enum fail|keep_existing|rename_incoming, default fail
- dry_run: bool, default false
- explicit target selectors, optional
Response:
- status
- moved: bool
- old_path
- new_path
- collision: optional structured collision detail
Why it matters:
- this is the basic operation needed to turn flat functions into methods after evidence is verified
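The parameter rules above (one of two identifiers required, a closed set of conflict policies) can be checked client-side before a request is sent. The following is a sketch only; move_symbol_params is a hypothetical helper, and the choice to omit unset optional keys rather than send nulls is an assumption.

```python
from typing import Any, Optional

# Enum values and defaults are taken from the parameter list above.
CONFLICT_POLICIES = ("fail", "keep_existing", "rename_incoming")


def move_symbol_params(namespace_path: str,
                       symbol_address: Optional[str] = None,
                       symbol_name: Optional[str] = None,
                       new_name: Optional[str] = None,
                       conflict_policy: str = "fail",
                       dry_run: bool = False) -> dict[str, Any]:
    """Validate and assemble move_symbol_to_namespace parameters (hypothetical helper)."""
    if symbol_address is None and symbol_name is None:
        raise ValueError("one of symbol_address or symbol_name is required")
    if conflict_policy not in CONFLICT_POLICIES:
        raise ValueError(f"unsupported conflict_policy: {conflict_policy}")
    params: dict[str, Any] = {
        "namespace_path": namespace_path,
        "conflict_policy": conflict_policy,
        "dry_run": dry_run,
    }
    # Optional keys are omitted rather than sent as null (an assumption of this sketch).
    for key, value in (("symbol_address", symbol_address),
                       ("symbol_name", symbol_name),
                       ("new_name", new_name)):
        if value is not None:
            params[key] = value
    return params


# Preview moving the function at 0008:ba00 under the pilot class as Init.
preview = move_symbol_params("Remorse::EntityDispatchEntry",
                             symbol_address="0008:ba00",
                             new_name="Init", dry_run=True)
```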
4. set_function_class
High-level helper to move a function into a class and apply method-oriented naming/signature metadata in one call.
Parameters:
- function_address: string
- class_path: string
- method_name: string
- this_param_name: string, optional, default this
- calling_convention: string, optional
- dry_run: bool, default false
- explicit target selectors, optional
Response:
- status
- function_address
- old_path
- new_path
- signature_before
- signature_after
Why it matters:
- reduces the number of separate write operations for the common "move + rename + set this semantics" workflow
5. create_or_update_struct
Create or update a structure datatype.
Parameters:
- name: string
- category_path: string, optional
- size: integer, optional
- packing: integer, optional
- fields: array of field specs
- dry_run: bool, default false
- explicit target selectors, optional
Each field spec:
- offset: integer
- name: string
- datatype: string
- comment: string, optional
- confidence: enum high|medium|low, optional
Response:
- status
- datatype_path
- created_or_updated
- size
- field_count
- conflicts: array, optional
Why it matters:
- class lifting without struct authoring is not enough for readable or recompilable source
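Because field specs carry offsets but not sizes, only a few structural problems can be caught before the request reaches Ghidra. The sketch below shows one way a client or dry-run path might pre-check a field list. The function name, the conflict type strings, and the sample field names other than vtable are hypothetical.

```python
from typing import Any, Optional


def find_field_conflicts(fields: list[dict[str, Any]],
                         size: Optional[int] = None) -> list[dict[str, Any]]:
    """Return structural problems in a field list without knowing datatype sizes."""
    conflicts: list[dict[str, Any]] = []
    seen_offsets: dict[int, str] = {}
    seen_names: set[str] = set()
    for field in sorted(fields, key=lambda f: f["offset"]):
        offset, name = field["offset"], field["name"]
        if offset in seen_offsets:
            conflicts.append({"type": "duplicate_offset", "offset": offset,
                              "existing": seen_offsets[offset], "requested": name})
        else:
            seen_offsets[offset] = name
        if name in seen_names:
            conflicts.append({"type": "duplicate_name", "offset": offset,
                              "requested": name})
        seen_names.add(name)
        # A field that starts at or past the declared size cannot fit.
        if size is not None and offset >= size:
            conflicts.append({"type": "offset_out_of_range", "offset": offset,
                              "requested": name})
    return conflicts


sample_conflicts = find_field_conflicts(
    [{"offset": 0, "name": "vtable"}, {"offset": 4, "name": "flags"},
     {"offset": 4, "name": "state"}, {"offset": 16, "name": "tail"}],
    size=8)
```

Overlap detection between fields of different widths would need datatype sizes from the program, so it is left to the server side in this sketch.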
6. create_or_update_vtable
Create a vtable datatype as a structure of function pointers.
Parameters:
- name: string
- category_path: string, optional
- slots: array of slot specs
- dry_run: bool, default false
- explicit target selectors, optional
Each slot spec:
- offset: integer
- name: string
- function_address: string, optional
- prototype: string, optional
- comment: string, optional
Response:
- status
- datatype_path
- slot_count
- bound_functions: array of { offset, function_address, name }
Why it matters:
- this is the missing datatype-side half of stable virtual dispatch recovery
7. set_function_this_type
Apply or update this-pointer typing on a function.
Parameters:
- function_address: string
- this_type: string
- this_param_name: string, optional, default this
- this_storage: enum stack|register|farptr, optional
- calling_convention: string, optional
- dry_run: bool, default false
- explicit target selectors, optional
Response:
- status
- function_address
- signature_before
- signature_after
Why it matters:
- many decompiler improvements only show up after the instance type is attached to the first argument correctly
8. analyze_vtable
Read-side helper that inspects a suspected vtable region and emits slot candidates.
Parameters:
- address: string
- slot_count: integer, optional
- stop_on_invalid_pointer: bool, default true
- explicit target selectors, optional
Response:
- status
- address
- slots: array of { offset, target_address, target_name, is_function, current_owner?, comment? }
- warnings: array, optional
Why it matters:
- this is the minimum analysis helper needed before class authorship is applied at scale
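To make the slot_count and stop_on_invalid_pointer semantics concrete, here is a sketch of the scan over raw bytes. It assumes each slot is a 4-byte far pointer stored little-endian as an offset word followed by a segment word, which is the usual 16-bit x86 layout but has not been confirmed for these vtables. The known_functions set stands in for a lookup against Ghidra's function table, and scan_vtable is a hypothetical name.

```python
import struct
from typing import Any, Optional


def scan_vtable(data: bytes, known_functions: set[str],
                slot_count: Optional[int] = None,
                stop_on_invalid_pointer: bool = True) -> dict[str, Any]:
    """Decode consecutive far pointers and flag those that do not hit a known function."""
    slots: list[dict[str, Any]] = []
    warnings: list[str] = []
    max_slots = len(data) // 4
    if slot_count is not None:
        max_slots = min(slot_count, max_slots)
    for index in range(max_slots):
        # Assumed layout: 16-bit offset first, then 16-bit segment, both little-endian.
        offset_word, segment_word = struct.unpack_from("<HH", data, index * 4)
        target = f"{segment_word:04x}:{offset_word:04x}"
        is_function = target in known_functions
        if not is_function:
            warnings.append(f"offset {index * 4}: {target} is not a known function")
            if stop_on_invalid_pointer:
                break
        slots.append({"offset": index * 4, "target_address": target,
                      "is_function": is_function})
    return {"slots": slots, "warnings": warnings}
```

With stop_on_invalid_pointer left at its default, the scan ends at the first pointer that does not resolve, which keeps trailing data out of the candidate list when slot_count is unknown.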
9. apply_class_layout
Bind a class namespace, instance struct, optional vtable struct, and a set of methods in one dry-runnable transaction.
Parameters:
- class_path: string
- instance_struct: string
- vtable_struct: string, optional
- vtable_address: string, optional
- methods: array of method specs
- dry_run: bool, default false
- explicit target selectors, optional
Each method spec:
- function_address: string
- method_name: string
- slot_offset: integer, optional
- is_virtual: bool, default false
- this_type: string, optional
- comment: string, optional
Response:
- status
- class_path
- applied_methods
- applied_structs
- warnings
Why it matters:
- supports one-shot promotion of a verified family from notes into Ghidra with explicit review first
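A rough sketch of what a dry-run preview for this endpoint could compute from the method specs alone. The function name, the "dry_run" status value, the warning wording, and the fallback of a missing this_type to a pointer to the instance struct are all assumptions made for illustration.

```python
from typing import Any


def plan_class_layout(class_path: str, instance_struct: str,
                      methods: list[dict[str, Any]]) -> dict[str, Any]:
    """Build a dry-run style preview of an apply_class_layout call (illustrative only)."""
    planned: list[dict[str, Any]] = []
    warnings: list[str] = []
    slot_owner: dict[int, str] = {}
    for method in methods:
        address = method["function_address"]
        slot = method.get("slot_offset")
        if method.get("is_virtual", False):
            if slot is None:
                warnings.append(f"{address}: virtual method has no slot_offset")
            elif slot in slot_owner:
                warnings.append(f"{address}: slot {slot} already bound to {slot_owner[slot]}")
                continue  # Leave the conflicting method out of the plan.
            else:
                slot_owner[slot] = address
        planned.append({
            "function_address": address,
            "new_path": f"{class_path}::{method['method_name']}",
            # Fall back to a pointer to the instance struct when this_type is omitted.
            "this_type": method.get("this_type") or f"{instance_struct} *",
        })
    return {"status": "dry_run", "class_path": class_path,
            "applied_methods": planned, "warnings": warnings}
```

Reporting the slot clash as a warning and skipping only that method keeps the rest of the preview reviewable instead of failing the whole batch.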
10. export_class_candidate
Read-side export helper for documentation and source-generation prep.
Parameters:
- class_path: string
- include_struct_fields: bool, default true
- include_vtable: bool, default true
- include_method_signatures: bool, default true
- explicit target selectors, optional
Response:
- machine-friendly JSON-like object containing class metadata, methods, field layouts, and slot maps
Why it matters:
- the local docs and future C++ skeleton emission need a clean export surface, not just screen scraping
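As an illustration of why a structured export helps, the sketch below turns exported field specs into a C++ struct declaration. The export object's exact shape is not defined in this note, so the sketch assumes fields arrive in the struct field schema form (offset, name, datatype). render_struct is a hypothetical helper, and the naive "datatype name;" rendering would not handle function-pointer or array types.

```python
from typing import Any


def render_struct(name: str, fields: list[dict[str, Any]]) -> str:
    """Render a struct declaration from field specs, ordered by offset."""
    lines = [f"struct {name} {{"]
    for field in sorted(fields, key=lambda f: f["offset"]):
        lines.append(f"    {field['datatype']} {field['name']};  // offset {field['offset']}")
    lines.append("};")
    return "\n".join(lines)


skeleton = render_struct("EntityDispatchEntry", [
    {"offset": 0, "name": "vtable", "datatype": "EntityVTable *"},
])
```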
Field Schemas
Struct field schema
Recommended stable shape:
{
"offset": 0,
"name": "vtable",
"datatype": "EntityVTable *",
"comment": "Primary vtable pointer",
"confidence": "high"
}
Method schema
{
"function_address": "0008:ba00",
"method_name": "Init",
"slot_offset": null,
"is_virtual": false,
"this_type": "EntityDispatchEntry *",
"comment": "Base constructor-style init"
}
Vtable slot schema
{
"offset": 20,
"name": "OnEventType2",
"function_address": "000b:3ab2",
"prototype": "void (__far *OnEventType2)(SpriteNode *, Event *)"
}
Transaction And Safety Rules
All write-capable class-lifting endpoints should support:
- dry_run
- explicit target selectors
- structured conflict reporting
- idempotent repeat calls where practical
- no silent overwrite of unrelated symbols or datatype fields
Recommended conflict output shape:
- type: symbol_collision|datatype_collision|slot_conflict|owner_conflict|signature_conflict
- path or address
- existing
- requested
- resolution_options
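The rules above can be illustrated with a small in-memory simulation of create_namespace. This is not how the endpoint would talk to Ghidra; the {path: kind} table, the status strings, and the resolution_options values are assumptions chosen to show dry-run, idempotent repeats, and structured conflict output together.

```python
from typing import Any


def create_namespace(existing: dict[str, str], path: str, kind: str = "namespace",
                     dry_run: bool = False) -> dict[str, Any]:
    """Simulate create_namespace against an in-memory {path: kind} table."""
    current = existing.get(path)
    if current == kind:
        # Repeat call with the same request: succeed without changing anything.
        return {"status": "ok", "created": False, "path": path, "kind": kind}
    if current is not None:
        # Same path, different kind: report instead of overwriting.
        return {"status": "conflict", "created": False, "collision": {
            "type": "symbol_collision", "path": path,
            "existing": current, "requested": kind,
            "resolution_options": ["keep_existing", "fail"],
        }}
    if not dry_run:
        existing[path] = kind
    return {"status": "ok", "created": not dry_run, "path": path, "kind": kind}
```

A dry run returns the same shape as a real call but leaves the table untouched, so a batch script can diff the two responses before committing.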
Backward Compatibility And Aliases
Where practical, add aliases instead of replacing older names.
Recommended aliases:
- create_class -> create_namespace(kind=class)
- move_function_to_class -> set_function_class
- set_this_type -> set_function_this_type
- build_vtable -> create_or_update_vtable
This follows the local fork’s existing pattern of keeping compatibility wrappers when route names evolve.
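One possible way to implement the aliases is a lookup table that maps each alias to its canonical route plus any parameters the alias pins. The alias pairs are the ones listed above; resolve_route and the rule that pinned values override caller values are assumptions of this sketch.

```python
from typing import Any

# Alias -> (canonical route, parameters the alias pins). Pairs are from the list above.
ALIASES: dict[str, tuple[str, dict[str, Any]]] = {
    "create_class": ("create_namespace", {"kind": "class"}),
    "move_function_to_class": ("set_function_class", {}),
    "set_this_type": ("set_function_this_type", {}),
    "build_vtable": ("create_or_update_vtable", {}),
}


def resolve_route(route: str, params: dict[str, Any]) -> tuple[str, dict[str, Any]]:
    """Map an alias to its canonical route; canonical names pass through unchanged."""
    canonical, pinned = ALIASES.get(route, (route, {}))
    # Pinned values win so create_class can never create a plain namespace.
    return canonical, {**params, **pinned}
```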
Suggested Implementation Order
If implementation resumes later, the smallest useful sequence is:
1. create_namespace
2. move_symbol_to_namespace
3. set_function_this_type
4. create_or_update_struct
5. analyze_vtable
6. create_or_update_vtable
7. apply_class_layout
8. export_class_candidate
That order enables immediate manual class work after only the first three or four endpoints, while leaving the richer transactional workflows for later.
First Real Workflow To Target
The first workflow this API should make easy is the pilot family from the current inventory:
EntityDispatchEntryBase promotion workflow
- create class namespace Remorse::EntityDispatchEntry
- create instance struct EntityDispatchEntry
- move 0008:ba00, 0008:bca8, 0008:bd53, 0008:bf8e, 0008:c01d, 0008:dbec, and constructor variants under that class as methods
- attach this typing
- analyze or define vtables 0x3b06, 0x2d10, 0x3afe, 0x3ad2, 0x3aa6
- export the class candidate for repo-side documentation and C++ skeleton generation
If the endpoint surface handles that family cleanly, it is probably sufficient for the rest of the early C++ lifting work.
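The pilot steps can be written down as an ordered list of endpoint calls, which is a convenient form for dry-run review. The sketch below only builds the plan; it does not send anything. Addresses and names are taken from the steps above, the this_type value follows the method schema example, the constructor variants are omitted because their addresses are not listed here, and pilot_plan itself is a hypothetical helper.

```python
from typing import Any

CLASS_PATH = "Remorse::EntityDispatchEntry"
METHOD_ADDRESSES = ["0008:ba00", "0008:bca8", "0008:bd53",
                    "0008:bf8e", "0008:c01d", "0008:dbec"]
VTABLE_ADDRESSES = ["0x3b06", "0x2d10", "0x3afe", "0x3ad2", "0x3aa6"]


def pilot_plan(dry_run: bool = True) -> list[tuple[str, dict[str, Any]]]:
    """Return the pilot workflow as ordered (endpoint, params) pairs."""
    plan: list[tuple[str, dict[str, Any]]] = [
        ("create_namespace", {"name": "EntityDispatchEntry",
                              "parent_path": "Remorse", "kind": "class"}),
        # Fields are left empty here; real field specs come from verified evidence.
        ("create_or_update_struct", {"name": "EntityDispatchEntry",
                                     "fields": [], "dry_run": dry_run}),
    ]
    for address in METHOD_ADDRESSES:
        plan.append(("move_symbol_to_namespace", {
            "symbol_address": address, "namespace_path": CLASS_PATH,
            "dry_run": dry_run}))
    for address in METHOD_ADDRESSES:
        plan.append(("set_function_this_type", {
            "function_address": address,
            "this_type": "EntityDispatchEntry *", "dry_run": dry_run}))
    for address in VTABLE_ADDRESSES:
        plan.append(("analyze_vtable", {"address": address}))
    plan.append(("export_class_candidate", {"class_path": CLASS_PATH}))
    return plan
```

Once apply_class_layout exists, the move and this-typing steps could collapse into a single transactional call.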
Open Questions To Resolve Later
- whether Ghidra class namespaces or plain namespaces produce better decompiler output in this 16-bit NE environment
- how best to encode far-pointer aware this conventions in method signatures
- whether vtable datatypes should be attached to concrete memory addresses automatically or only on explicit request
- whether confidence annotations should live in datatype comments, decompiler comments, or external export metadata
Summary
The endpoint surface needed here is not large, but it does need to span both symbol ownership and datatype authorship. If later MCP work only adds "move function into class", it will still leave the hardest part of the C++ lift undone.
The minimum viable class-lifting feature set is therefore:
- namespace/class creation
- symbol-to-class moves
- this typing
- struct authoring
- vtable analysis/authoring
- one transactional apply_class_layout path