# Ghidra MCP Wishlist

This file records concrete gaps in the current Ghidra MCP workflow. Update it whenever a task requires PyGhidra or another local-only fallback because MCP lacks the needed operation.

For each new entry, keep the format short:

- Missing capability
- Current fallback
- Why it matters in this repo
- Proposed MCP endpoint or behavior

## Current Wishlist

### Live MCP Issues Hit During Spanish Cheat Pass (2026-03-26)

- Missing capability: working `search_bytes(...)` requests against the currently opened program.
- Current fallback: `read_region(...)`, `get_data_uses(...)`, `search_instructions(...)`, and manual/xref-driven narrowing inside `/es/CRUSADER.EXE`.
- Why it matters: the Spanish-cheat question specifically needed a direct full-memory search for the English `jassica16` scan-code table and any plausible replacement sequence.
- Proposed MCP behavior: `search_bytes(...)` should honor the active program context by default and return a machine-friendly empty-hit result when no matches exist, not `HTTP 404 No context found for request`.

- Missing capability: reliable explicit target selection on read/query endpoints in the live server session.
- Current fallback: repo notes plus manual project `.prp` metadata inspection after `read_region(...)` and `get_function_by_address(...)` ignored explicit root-vs-`/es` selectors and still resolved against the active Spanish program.
- Why it matters: this repo routinely needs side-by-side comparisons between `/CRUSADER.EXE`, `/es/CRUSADER.EXE`, `/Writable/...`, and other project entries without changing the active Ghidra tab.
- Proposed MCP behavior: all selector-aware read endpoints should actually bind to the requested `project_dir` / `project_name` / `folder_path` / `program_name`, or return a structured target-resolution failure instead of silently reading the active program.

- Missing capability: consistent context handling for project/runtime metadata helpers in the live server session.
- Current fallback: direct `get_project_access_info()` plus workspace-side `.prp` reads after `list_project_programs(...)`, `get_callers(...)`, `compare_functions(...)`, and `get_runtime_capabilities()` returned `404 No context found for request` during an otherwise healthy active-program session.
- Why it matters: these are the exact helper endpoints needed to validate which program is active, enumerate comparison targets, and reason about whether a failure is a real analysis result or an MCP/session problem.
- Proposed MCP behavior: metadata helpers should either work whenever an active program exists or return structured unsupported-state details, not raw 404 context failures.
- Status update (2026-03-26, later Spanish pass): the refreshed live server still returned `404 No context found for request` for `get_runtime_capabilities(...)` and `get_callers(...)` during an active `/es/CRUSADER.EXE` session, so this is still a live deployment or routing problem, not just an earlier-session artifact.

### Open Gaps Found During Hidden Usecode Debugger Patch Batch (2026-03-24)

- Missing capability: write-capable project/program selection for MCP edit operations.
- Current fallback: local PyGhidra `run-script` plus `read-region` against `--project-dir K:\ghidra\Crusader_Decomp --project-name Crusader --folder-path /Writable --program-name CRUSADER-PATCHED.EXE`.
- Why it matters: retail NE patch work in this repo must sometimes modify and save `/Writable/CRUSADER-PATCHED.EXE` with the GUI closed, while current MCP write flows depend on the active Ghidra session/program context.
- Proposed MCP addition: add bridge-exposed target selectors (`project_dir`, `project_name`, `folder_path`, `program_name`) for write endpoints, backed by plugin support to open the requested project file, apply `patch_bytes_and_reanalyze` or edit-plan writes, and save deterministically.
- Status update (2026-03-24): local fork now accepts optional `project_dir`, `project_name`, `folder_path`, and `program_name` selectors on `apply_program_edit_plan` and `patch_bytes_and_reanalyze`; explicit targets are opened through `GhidraProject`, written, saved deterministically, and then released.
- Status update (2026-03-24, follow-up): explicit target resolution now reuses an already-open matching program when possible and otherwise opens a writable domain object directly; MCP no longer opens explicit targets in read-only mode for edit operations.

### Open Gaps Found During Current 0x4588 Pass (2026-03-21)

- Missing capability: usable read-only scripting in the live MCP/Ghidra session.
- Current fallback: terminal-side Python and manual MCP inspection windows after `run_readonly_script` returned `Ghidra was not started with PyGhidra. Python is not available`.
- Why it matters: one-off structure probes and byte-pattern scans are still common during EUSECODE and overlap work, and they are much cleaner as constrained in-process reads than as external heuristics.
- Proposed MCP addition: expose runtime capability state for `run_readonly_script` and either guarantee a working in-process script engine or return a machine-friendly unsupported-state response early.
- Status update (2026-03-24): local fork now exposes `get_runtime_capabilities()` with readonly-script probe state and `run_readonly_script()` returns structured `status`/`reason`/`detail` output early when Python support is unavailable in the live session.
- Status update (2026-03-24, follow-up): `open_current_program_readonly()` is now intentionally disabled and returns an unsupported-state response so MCP does not create accidental read-only program instances in normal workflow.
- Status update (2026-03-21): the current live plugin process still returns HTTP 404 for direct symbol routes (`/get_symbol_at`, `/symbol_at`) in this chat session, but bridge `get_symbol_at(address)` now avoids raw 404s by falling back to compatible legacy endpoints and returning deterministic symbol-state output (for example `0x844` -> `symbol=`).
- Remaining gap: reload/redeploy the updated plugin build so direct symbol routes are present in the live process; bridge fallback now covers older live builds in the meantime.
- Implemented now:
  - `get_xrefs_to(address)` / `get_xrefs_from(address)` with typed ref kinds (`call`, `read`, `write`, `jump`, `other`) plus containing-function metadata.
  - `set_function_prototype` now retries tolerantly for legacy calling-convention tokens (for example `__cdecl16far`) and returns an accepted template example on parse/apply failure.
  - `rename_data(address, new_name)` now renames or creates the primary symbol at any valid address and returns the resolved symbol metadata instead of `Rename data attempted`.
  - `get_symbol_at(address)` returns the primary symbol state at an address so label changes can be verified directly without depending on decompiler refresh timing.
  - `get_symbol_at(address)` now resolves the active program on the Swing thread and falls back to the visible/open program when the current-program pointer is transiently unavailable; the bridge retries the compatible `/symbol_at` alias if a stale server route returns `404 No context found for request`.
  - bridge `get_symbol_at(address)` now probes additional legacy aliases (`getSymbolAt`, `symbolAt`, `get_symbol`) and, if symbol routes are absent, derives symbol state from legacy endpoints (`get_function_by_address`, paged `data`) so callers receive machine-friendly output instead of a raw 404.
- Local bridge audit (2026-03-21): `get_xrefs_to` / `get_xrefs_from` wrappers are already present in `K:\mcp\GhidraMCP\bridge_mcp_ghidra.py`; if a client still does not surface them, that is a client/tool-refresh issue rather than a missing local-fork endpoint.

## Implemented In Local GhidraMCP Fork (2026-03-21)

Added endpoints in `K:\mcp\GhidraMCP\src\main\java\com\lauriewired\GhidraMCPPlugin.java` and tools in `K:\mcp\GhidraMCP\bridge_mcp_ghidra.py`:

- Function boundary repair:
  - `create_function_by_address(entry, name, body_start, body_end, comment?)`
  - `delete_function_by_address(entry)`
  - `get_function_containing(address)`
- Arbitrary code and memory inspection:
  - `read_region(start, end)`
  - `disassemble_region(start, end)`
  - `get_instruction_window(address, before_count, after_count)`
  - `search_instructions(query, mode=text|operand|address, limit?)`
  - `get_data_uses(address, include_operand_scans=true, limit?)`
- Batch and transactional edits:
  - `set_comments(batch)`
  - `set_decompiler_comments(batch)`
  - `rename_functions_by_address(batch)`
  - `apply_program_edit_plan(plan, dry_run=false)`
- Reanalysis and repair helpers:
  - `reanalyze_region(start, end)`
  - `patch_bytes_and_reanalyze(start, bytes, comment?)`
  - `analyze_function_boundaries(start, end)`
- Read-only project access and scripting:
  - `get_project_access_info()`
  - `get_runtime_capabilities()`
  - `open_current_program_readonly(version=-1, make_current=true)`
  - `run_readonly_script(script_path|script_text)` with a constrained token denylist policy
- Explicit write targeting:
  - optional `project_dir`, `project_name`, `folder_path`, `program_name` selectors on `apply_program_edit_plan(...)`
  - optional `project_dir`, `project_name`, `folder_path`, `program_name` selectors on `patch_bytes_and_reanalyze(...)`

Batch encoding used by the current bridge:

- `set_comments` and `set_decompiler_comments`: list of `(address, comment)` pairs.
- `rename_functions_by_address`: list of `(address, new_name)` pairs.
- `apply_program_edit_plan`: one action per line with `|` separators, for example:
  - `create_function_by_address|000c:1234|name|000c:1234|000c:1260|note`
  - `delete_function_by_address|000c:1234`
  - `rename_function_by_address|000c:1234|new_name`
  - `set_disassembly_comment|000c:1234|comment text`
  - `set_decompiler_comment|000c:1234|comment text`

Notes on read-only coverage:

- `open_current_program_readonly` opens a read-only program object for the currently loaded domain file.
- Project-switch/open-by-path is still not implemented; MCP still operates on the active Ghidra GUI project context.

### Function boundary repair

- Missing capability: create a function at an explicit entry with an explicit body start/end.
- Current fallback: local PyGhidra `create-function` and JSON repair plans.
- Why it matters: boundary repair is a recurring part of this project, especially for overlapped or truncated raw functions.
- Proposed MCP addition: `create_function_by_address(entry, name, body_start, body_end, comment?)`.

- Missing capability: delete an incorrect auto-created function.
- Current fallback: local PyGhidra `delete-function`.
- Why it matters: bad auto-analysis often blocks decompilation of adjacent real functions.
- Proposed MCP addition: `delete_function_by_address(entry)`.

- Missing capability: get the function containing an arbitrary address.
- Current fallback: local PyGhidra `get-function-containing`.
- Why it matters: no-function windows and overlap investigations depend on quickly mapping instruction hits back to owning functions.
- Proposed MCP addition: `get_function_containing(address)`.

### Arbitrary code and memory inspection

- Missing capability: read raw bytes from an arbitrary address range in program memory.
- Current fallback: local PyGhidra `read-region`.
- Why it matters: some important sites are real code bytes that are not yet part of any function object.
- Proposed MCP addition: `read_region(start, end)` returning bytes and a compact hex view.
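To make the "compact hex view" in the `read_region(start, end)` proposal concrete, here is a minimal Python sketch of one possible row layout. The function name `compact_hex_view`, the flat integer start address, and the 16-bytes-per-row format are all assumptions for illustration; the source does not state the endpoint's actual output format, and this repo's segment:offset addresses would need their own formatting.

```python
def compact_hex_view(start: int, data: bytes, width: int = 16) -> list[str]:
    """Format raw bytes as 'address: hex bytes' rows (hypothetical layout).

    `start` is treated as a flat integer address purely for illustration.
    """
    rows = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        # bytes.hex(sep) inserts the separator between every byte (Python 3.8+).
        rows.append(f"{start + offset:08x}: {chunk.hex(' ')}")
    return rows


if __name__ == "__main__":
    for row in compact_hex_view(0x1000, bytes(range(18))):
        print(row)
```

Keeping the address at the start of every row means a caller can copy a hit location straight from the view without recomputing offsets.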
- Missing capability: dump nearby instructions around an arbitrary address even when no function exists there.
- Current fallback: custom read-only PyGhidra scripts such as `pyghidra_plans/dump_instruction_windows.py`.
- Why it matters: the `0x4588` investigation depended on inspecting instruction windows in no-function regions.
- Proposed MCP addition: `disassemble_region(start, end)` or `get_instruction_window(address, before_count, after_count)`.

- Missing capability: scan all instructions for a literal operand or address token.
- Current fallback: custom PyGhidra scripts such as `scan_4588_instruction_uses.py`.
- Why it matters: normal xref APIs can miss useful operand-text hits in partially analyzed regions.
- Proposed MCP addition: `search_instructions(query, mode=text|operand|address, limit?)`.

- Missing capability: robust data-address xrefs that include operand-based uses even when the reference manager has none.
- Current fallback: instruction-text scans and manual disassembly windows.
- Why it matters: globals like `0x4588` can be heavily used before formal references exist in the database.
- Proposed MCP addition: `get_data_uses(address, include_operand_scans=true)`.

### Batch and transactional edits

- Missing capability: apply a small transactional edit plan containing function removals, function creations, renames, and comments.
- Current fallback: local PyGhidra `apply-plan` with JSON.
- Why it matters: boundary repair work is safer when a verified batch can be replayed atomically.
- Proposed MCP addition: `apply_program_edit_plan(plan)` with dry-run support.

- Missing capability: batch comment creation for a verified address set.
- Current fallback: repeated single-address comment calls or PyGhidra plan files.
- Why it matters: reverse-engineering batches often produce several related evidence comments at once.
- Proposed MCP addition: `set_comments(batch)` and `set_decompiler_comments(batch)`.
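The batch entries above rely on the pipe-separated plan encoding documented under "Batch encoding used by the current bridge" (one action per line, fields joined with `|`). A small Python sketch of assembling such a plan is shown here; the helper names `plan_line` and `build_plan` are hypothetical, and rejecting `|` or newlines inside a field is an assumption, since the source does not describe any escaping rule.

```python
from typing import Iterable, Sequence


def plan_line(action: str, *fields: str) -> str:
    """Join one action and its fields with '|' separators."""
    parts = [action, *fields]
    for part in parts:
        # Assumption: the format has no escaping, so separators inside a
        # field would corrupt the line. Refuse them instead of guessing.
        if "|" in part or "\n" in part:
            raise ValueError(f"field contains a separator: {part!r}")
    return "|".join(parts)


def build_plan(actions: Iterable[Sequence[str]]) -> str:
    """Build a multi-line plan string, one action per line."""
    return "\n".join(plan_line(*action) for action in actions)


if __name__ == "__main__":
    plan = build_plan([
        ("delete_function_by_address", "000c:1234"),
        ("create_function_by_address", "000c:1234", "name",
         "000c:1234", "000c:1260", "note"),
        ("set_decompiler_comment", "000c:1234", "comment text"),
    ])
    print(plan)
```

Building the plan as data first makes it easy to review or log a batch before passing it to `apply_program_edit_plan` with `dry_run` enabled.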
- Missing capability: batch rename-by-address for a small verified set.
- Current fallback: repeated `rename_function_by_address` calls or local plan files.
- Why it matters: verified raw-import ports often land in short, evidence-backed batches.
- Proposed MCP addition: `rename_functions_by_address(batch)`.

### Reanalysis and repair helpers

- Missing capability: re-disassemble or reanalyze a small address range after patching bytes or changing function boundaries.
- Current fallback: local scripted repair passes.
- Why it matters: the far-call fixup workflow and boundary recovery both depend on deterministic reanalysis of touched ranges.
- Proposed MCP addition: `reanalyze_region(start, end, options?)`.

- Missing capability: patch a small byte range and immediately re-disassemble affected instructions.
- Current fallback: local PyGhidra repair scripts.
- Why it matters: the NE far-call fixup pass was a major workflow improvement and is exactly the sort of task MCP should eventually support.
- Proposed MCP addition: `patch_bytes_and_reanalyze(start, bytes, comment?)`.

- Missing capability: detect likely bad function overlaps or candidate function starts in a small range.
- Current fallback: manual repair plus custom PyGhidra probing.
- Why it matters: overlap repair is one of the main reasons the workflow still has to leave MCP.
- Proposed MCP addition: `analyze_function_boundaries(start, end)` returning overlap warnings and candidate entries.

### Read-only project access and scripting

- Missing capability: open a locked project read-only or query a specified project clone directly from MCP.
- Current fallback: local PyGhidra against an unlocked temporary project clone.
- Why it matters: the GUI often owns the main project while read-only inspection still needs to continue.
- Proposed MCP addition: read-only project selection/open options for all analysis endpoints.

- Missing capability: run a small read-only script for one-off inspections that do not justify a permanent MCP endpoint yet.
- Current fallback: local PyGhidra `run-script --read-only`.
- Why it matters: several repo workflows start as one-off analysis helpers before they prove worth productizing.
- Proposed MCP addition: a constrained `run_readonly_script(script_text|script_path)` endpoint with explicit safety limits.

### Migrated entries from `ghidra-mcp_wishlist.md`

Short, concrete gaps hit during live Crusader work. Each entry records what MCP lacked, what fallback was needed, and what a useful MCP feature should look like.

## Open Gaps (migrated)

### Byte-pattern search across program memory

- Status: implemented in local fork (2026-03-26)
- Missing MCP capability: search raw bytes or byte patterns across the current program's mapped segments / address spaces.
- Fallback used: manual `read_region` sweeps plus local Python over the MCP HTTP bridge to scan live Spanish `CRUSADER.EXE` memory for the `jassica16` scan-code table.
- Useful MCP feature:
  - `search_bytes(pattern, start?, end?, segment_filter?, max_hits?)`
  - accepts hex byte patterns with optional wildcards
  - returns exact hit addresses plus nearby hex context
- Why it matters: this would have closed the Spanish cheat-sequence question directly inside MCP instead of forcing ad hoc local scripting.
- Status update (2026-03-26): local fork now exposes `search_bytes(pattern, start?, end?, segment_filter?, max_hits?)` in both the Java plugin and Python bridge; it accepts `??` wildcards, scans mapped memory blocks, and returns machine-friendly hit lines with block names and nearby hex context.

### Reliable caller/xref recovery for local call sites

- Status: implemented in local fork (2026-03-26)
- Missing MCP capability: reliable function-call xrefs for near/local calls inside the active program.
- Fallback used: manual `search_instructions` and instruction-window inspection because `get_function_xrefs` did not surface some obvious local call sites in the Spanish keyboard/helper cluster.
- Useful MCP feature:
  - improve `get_function_xrefs` so it includes near calls, far calls, tail-call-style jumps, and thunk references consistently
  - or add `get_callers(address_or_name, include_near=true, include_far=true, include_jumps=true)`
- Why it matters: tracing helper chains around hidden key-sequence code is slower and less reliable when local callers have to be reconstructed by text search.
- Status update (2026-03-26): local fork now exposes `get_callers(target, include_near=true, include_far=true, include_jumps=true, limit?)`, combining reference-manager hits with instruction-flow scans so local near-call sites show up even when plain xrefs are incomplete; `get_function_xrefs` now reuses the same caller recovery path.

### Cross-program reads inside the same Ghidra project

- Status: implemented in local fork (2026-03-26)
- Missing MCP capability: read/query another program or assembly in the same project without switching the active program first.
- Fallback used: indirect comparison against repo notes, workspace-side files, and ad hoc local scripts instead of querying `/CRUSADER.EXE`, `/es/CRUSADER.EXE`, `/Writable/...`, or other domain files side by side through MCP.
- Useful MCP feature:
  - allow explicit target selectors on all read/query endpoints, not only write endpoints
  - example: `read_region(start, end, project_dir?, project_name?, folder_path?, program_name?)`
  - same for strings, functions, xrefs, data uses, decompile, disassemble, symbol lookup, and segment listing
- Why it matters: live localized-build comparisons and writable-copy verification should not require changing the active Ghidra tab just to inspect another program.
- Status update (2026-03-26): read/query endpoints in the local fork now accept optional explicit target selectors (`project_dir`, `project_name`, `folder_path`, `program_name`) and reuse the same target-resolution layer as write flows; this now covers method/class listings, segments, imports/exports, namespaces, data items, function lookup/listing, decompile/disassembly, symbol lookup, regions, instruction scans, strings, xrefs, and data-use queries.

### Cross-project / cross-program compare tooling

- Status: implemented in local fork (2026-03-26)
- Missing MCP capability: first-class compare operations between two programs in the same project or across projects.
- Fallback used: manual note-to-note comparison, address math, and repeated per-program queries.
- Useful MCP feature:
  - `compare_regions(left_program, left_range, right_program, right_range, mode=bytes|words|disasm|strings)`
  - `compare_strings(left_program, right_program, filter?)`
  - `compare_functions(left_program, left_addr_or_name, right_program, right_addr_or_name, mode=signature|disasm|decompile|xrefs)`
  - machine-readable output with address pairs, similarity score, and differing bytes/instructions/strings
- Why it matters: this would make English vs Spanish / Remorse vs Regret / raw vs live NE comparisons much faster and less error-prone.
- Status update (2026-03-26): local fork now exposes `compare_regions(...)`, `compare_strings(...)`, and `compare_functions(...)` with left/right explicit target selectors; outputs are machine-friendly and include comparison mode, similarity score, and capped difference samples for byte/word, disassembly, string, signature, decompile, and xref views.

### Port renames/comments/symbol facts between programs

- Status: implemented in local fork (2026-03-26)
- Missing MCP capability: apply verified names/comments from one program to another program with explicit provenance instead of re-entering them one by one.
- Fallback used: manual rename/comment batches plus external notes to carry mapping provenance.
- Useful MCP feature:
  - `port_symbols(source_program, target_program, mappings, apply=names|comments|both, provenance_comment_template?)`
  - support direct address maps, segment-relative maps, and user-supplied CSV/JSON mapping tables
  - dry-run mode showing collisions and ambiguous targets
- Why it matters: porting verified English or raw-import findings into Spanish or live NE targets is a recurring workflow.
- Status update (2026-03-26): local fork now exposes `port_symbols(mappings, apply=names|comments|both, provenance_comment_template?, dry_run?)` with `source_*` and `target_*` selectors; the bridge accepts a verified list of source/target address pairs and the plugin ports names plus PRE/EOL comments with optional provenance text and explicit-target save support.

### Project inventory / browse endpoint

- Status: implemented in local fork (2026-03-26)
- Missing MCP capability: list project folders and available programs through MCP.
- Fallback used: repo-side assumptions and local tooling; the current MCP read tools expose only the active program cleanly.
- Useful MCP feature:
  - `list_project_programs(project_dir?, project_name?, folder_path?, recursive=true)`
  - returns folder path, program name, read-only/writable/versioned state, and whether it is currently open
- Why it matters: comparing or porting across programs is awkward without a discoverable inventory of assemblies already in the Ghidra project.
- Status update (2026-03-26): local fork now exposes `list_project_programs(project_dir?, project_name?, folder_path?, recursive=true)` plus a `project_programs` alias; it walks project folders and returns machine-friendly program inventory lines with folder path, program name, content type, read-only/versioned flags, and current-open state.
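The optional selectors that `list_project_programs(...)` and the other selector-aware endpoints accept (`project_dir`, `project_name`, `folder_path`, `program_name`) can be modeled on the client side as a small value object. The sketch below is illustrative only: the `TargetSelector` class and its `as_params` method are hypothetical names, and how the bridge actually serializes selectors is not stated in this file. The one behavior it demonstrates is omitting unset selectors so the server can fall back to the active program.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class TargetSelector:
    """Hypothetical client-side holder for the four optional selectors."""

    project_dir: Optional[str] = None
    project_name: Optional[str] = None
    folder_path: Optional[str] = None
    program_name: Optional[str] = None

    def as_params(self) -> dict[str, str]:
        # Drop unset selectors; an empty dict means "use the active program".
        return {key: value for key, value in asdict(self).items() if value is not None}


if __name__ == "__main__":
    # Illustrative values: the split of /es/CRUSADER.EXE into folder and
    # program name mirrors the --folder-path/--program-name usage noted earlier.
    spanish = TargetSelector(
        project_name="Crusader", folder_path="/es", program_name="CRUSADER.EXE"
    )
    print(spanish.as_params())
    print(TargetSelector().as_params())
```

Treating the selector set as a single immutable value also makes left/right pairs for the compare endpoints easy to pass around without mixing up individual fields.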