MaddoScientisto 2026-04-12 14:45:08 +02:00
commit a9153546ae
56 changed files with 6731 additions and 258 deletions

@@ -1,214 +1,43 @@
---
name: pyghidra-ghidra-ops
description: Local PyGhidra fallback workflow for Crusader Ghidra edits and queries
description: MCP-only Python-backed Ghidra scripting workflow for Crusader edits and queries; use when live MCP Python/script capabilities are needed and never for the offline local CLI toolkit
---
# PyGhidra Ghidra Ops
Use this skill when Ghidra MCP is missing a needed operation and you need native CPython access to the Ghidra API for the local Crusader project.
Use this skill when the live Ghidra MCP session needs Python-backed inspection or scripted edits. Do not use the offline local PyGhidra CLI from this workspace.
## Use Cases
- Create or delete functions in `CRUSADER-RAW.EXE`.
- Apply small batched repairs driven by verified addresses.
- Add comments or rename functions by address from a repeatable JSON plan.
- Decompile or disassemble functions without switching back to the MCP server.
- Query function metadata, search by name, and inspect xrefs from the same local CLI.
- Inspect project root files to confirm the program name/path before running edits.
- Run live MCP readonly Python-backed inspection when decompiler or xref work needs scripted help.
- Run live MCP write-capable scripted edits for small verified rename, comment, function-boundary, or datatype batches.
- Keep scripted Ghidra work inside the active GUI-backed MCP session so project locks do not matter.
## Workspace Defaults
- Ghidra install dir: `I:\Apps\ghidra_12.0.4_PUBLIC`
- Ghidra project dir: repo root
- Ghidra project name: `Crusader`
- Default program: `CRUSADER-RAW.EXE`
- Local Python env: `.venv-pyghidra311`
- CLI entrypoint: `.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader`
- Bootstrap script: `.\tools\pyghidra_crusader\bootstrap_env.ps1`
- Active authority: the live Ghidra MCP session
- Default target unless stated otherwise: `CRUSADER.EXE`
- Python-backed operations must run through MCP endpoints exposed by the active Ghidra session
## Constraints
- Stay conservative. Use the same rename and batch-size rules as the main Ghidra workflow.
- Prefer one focused plan or 1-5 direct edits at a time.
- If a live MCP session was started with Python enabled, use live `run_readonly_script(...)` for quick inspection before falling back to the local CLI; reserve the local PyGhidra path for write-side work or still-missing MCP capabilities.
- Write operations require the project to be openable for modification. If `Crusader.lock` is present because the GUI owns the project, close Ghidra first or work on a copy.
- Never fall back to the offline/local CLI path from this workspace.
- If MCP cannot do the needed Python-backed operation, document the gap in `ghidra_mcp_wishlist.md` rather than using the local toolkit.
- Keep `crusader_decompilation_notes.md` updated after verified repair batches.
For 16-bit NE decompiler failures after prototype edits or function recreation, inspect direct callees before assuming the caller frame is corrupt. In this repo, a broken caller (`1420:1499`) was fixed only after repairing a shared callee (`1000:42e2`) whose pointer-return prototype had decompiled with a hidden `__return_storage_ptr__` and poisoned the caller's stack model.
Refresh the local PyGhidra environment when the bundled Ghidra version changes:
## MCP Usage Pattern
```powershell
powershell -ExecutionPolicy Bypass -File .\tools\pyghidra_crusader\bootstrap_env.ps1
```
## Commands
List root project files:
```powershell
.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader project-files
```
Delete a bad function object:
```powershell
.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader delete-function --entry 0007:5b6f
```
Create a repaired function with an explicit body:
```powershell
.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader create-function `
--entry 0007:5a90 `
--name seg043_func_0090 `
--body-start 0007:5a90 `
--body-end 0007:5b79 `
--plate-comment "Recovered from standalone seg043 boundary scan"
```
Rename a function by entry address:
```powershell
.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader rename-function --entry 0006:02cc --name entity_class_get_flag20
```
MCP-style read/query commands are also available from the same CLI:
```powershell
.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader get-function-by-address --address 000a:48ff
.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader get_function_by_address --address 000a:48ff
.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader get-function-containing --address 000a:4901
.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader decompile-function-by-address --address 000a:48ff
.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader disassemble-function --address 000a:48ff
.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader read-region --start 000a:48ff --end 000a:4912
.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader search-functions-by-name --query rng_
.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader list-methods --limit 20
.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader list_methods --limit 20
.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader list-strings --limit 20
.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader list-imports --limit 20
.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader list-exports --limit 20
.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader list-namespaces --limit 20
.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader list-segments --limit 20
.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader list-data-items --limit 20
.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader list-classes --limit 20
.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader get-xrefs-to --address 000a:48ff
.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader get-function-xrefs --name rng_next_modulo
```
All commands also support structured output for scripting:
```powershell
.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader --format json get-function-by-address --address 000a:48ff
```
JSON output now uses a stable envelope:
```json
{
"schema_version": "1.0",
"command": "get-function-by-address",
"ok": true,
"schema": { "type": "object", "properties": { "name": { "type": "string" } } },
"data": {
"name": "rng_next_modulo",
"signature": "undefined rng_next_modulo()",
"entry": "000a:48ff",
"body_start": "000a:48ff",
"body_end": "000a:4912"
}
}
```
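A consumer of this envelope could check it before trusting `data`. The following is a minimal sketch, not part of the toolkit: `unwrap_envelope` is a hypothetical helper name, and only the field names (`schema_version`, `command`, `ok`, `data`) come from the example above. How a failed command reports its error is not documented here, so the sketch simply refuses any envelope whose `ok` is not `true`.

```python
import json

# Hypothetical helper: only the field names are taken from the envelope above.
REQUIRED_KEYS = {"schema_version", "command", "ok", "data"}


def unwrap_envelope(text, expected_command=None):
    """Parse a JSON envelope and return its `data` payload, or raise."""
    envelope = json.loads(text)
    if not isinstance(envelope, dict):
        raise ValueError("envelope must be a JSON object")
    missing = REQUIRED_KEYS - envelope.keys()
    if missing:
        raise ValueError(f"envelope is missing keys: {sorted(missing)}")
    if expected_command is not None and envelope["command"] != expected_command:
        raise ValueError(f"unexpected command: {envelope['command']!r}")
    if envelope["ok"] is not True:
        # The error layout is unknown, so no attempt is made to read details.
        raise RuntimeError(f"command {envelope['command']!r} did not succeed")
    return envelope["data"]


sample = """
{
  "schema_version": "1.0",
  "command": "get-function-by-address",
  "ok": true,
  "data": {"name": "rng_next_modulo", "entry": "000a:48ff"}
}
"""

function_info = unwrap_envelope(sample, expected_command="get-function-by-address")
print(function_info["name"], function_info["entry"])
```

Checking `command` as well as `ok` guards against feeding the output of one command into code written for another.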
The CLI also accepts exact MCP-style underscore command aliases, so local automation can often swap MCP names directly with little or no translation.
For ad hoc investigation, prefer `run-script` over multiline `python -c` or pasted PowerShell here-strings. It avoids leaving the shared shell stuck in an unfinished string/block state:
```powershell
.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader run-script --script .\pyghidra_plans\inspect_rng.py --read-only
```
Script globals available inside `run-script`:
```python
config
project
program
helpers["get_function"]
helpers["get_function_containing"]
helpers["decompile_function"]
helpers["disassemble_function"]
helpers["get_xrefs_to"]
helpers["get_xrefs_from"]
helpers["read_region_bytes"]
helpers["rename_function"]
helpers["set_comment"]
```
Write-side MCP-style aliases are available too:
```powershell
.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader rename-function-by-address --entry 000a:48ff --name rng_next_modulo
.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader set-decompiler-comment --address 000a:48ff --text "Returns RNG output modulo the requested bound."
.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader set-disassembly-comment --address 000a:48ff --text "Modulo wrapper around rng_advance_state"
```
Apply a small JSON plan:
```json
{
"transaction": "Repair seg043 boundaries",
"remove_functions": [
"0007:5b6f"
],
"create_functions": [
{
"entry": "0007:5a90",
"name": "seg043_func_0090",
"body_start": "0007:5a90",
"body_end": "0007:5b79",
"comment": "Recovered from standalone seg043 boundary scan"
},
{
"entry": "0007:5b7a",
"name": "seg043_func_017a",
"body_start": "0007:5b7a",
"body_end": "0007:5c1b"
},
{
"entry": "0007:5c1c",
"name": "seg043_func_021c",
"body_start": "0007:5c1c",
"body_end": "0007:5c80"
}
],
"comments": [
{
"address": "0007:5b6f",
"text": "Old auto-created split overlaps the earlier seg043:0090..0179 routine.",
"type": "plate"
}
]
}
```
```powershell
.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader apply-plan --plan .\seg043_repair.json
```
Dry-run a plan before touching the project:
```powershell
.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader apply-plan --plan .\seg043_repair.json --dry-run
```
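Before a dry run, a plan file can also be sanity-checked on its own. The sketch below is illustrative only and is not how `apply-plan` validates plans: `check_plan` and `parse_seg_off` are hypothetical names, the accepted top-level keys are those named in this document (`transaction`, `remove_functions`, `rename_functions`, `create_functions`, `comments`, `assert_functions`), and it assumes every address is written in `SSSS:OOOO` hexadecimal form.

```python
# Hypothetical pre-flight check; key names are taken from this document.
KNOWN_KEYS = {
    "transaction",
    "remove_functions",
    "rename_functions",
    "create_functions",
    "comments",
    "assert_functions",
}
EDIT_KEYS = ("remove_functions", "rename_functions", "create_functions", "comments")


def parse_seg_off(text):
    """Split an SSSS:OOOO string into a (segment, offset) pair of ints."""
    segment, offset = text.split(":")
    return int(segment, 16), int(offset, 16)


def check_plan(plan):
    """Return (edit_count, problems) for a plan dictionary."""
    problems = []
    for key in sorted(plan.keys() - KNOWN_KEYS):
        problems.append(f"unknown top-level key: {key}")
    for item in plan.get("create_functions", []):
        label = item.get("name", item["entry"])
        entry = parse_seg_off(item["entry"])
        start = parse_seg_off(item["body_start"])
        end = parse_seg_off(item["body_end"])
        if start[0] != end[0]:
            problems.append(f"{label}: body spans two segments")
        elif not (start <= entry <= end):
            problems.append(f"{label}: entry not inside body_start..body_end")
    edit_count = sum(len(plan.get(key, [])) for key in EDIT_KEYS)
    return edit_count, problems


plan = {
    "transaction": "Repair seg043 boundaries",
    "remove_functions": ["0007:5b6f"],
    "create_functions": [
        {
            "entry": "0007:5a90",
            "name": "seg043_func_0090",
            "body_start": "0007:5a90",
            "body_end": "0007:5b79",
        }
    ],
}

edit_count, problems = check_plan(plan)
print(edit_count, problems)
```

The edit count makes it easy to see whether a plan still fits the small-batch rule under Constraints before it touches the project.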
- Prefer standard MCP endpoints first for decompilation, disassembly, xrefs, renames, comments, function creation/deletion, and datatype work.
- Use live MCP Python/script endpoints only when the ordinary endpoint surface cannot express the needed operation.
- Keep script batches small and evidence-driven, just like ordinary MCP edit plans.
- When a live MCP Python/script batch succeeds, treat that as the canonical workflow; do not duplicate it through the local CLI.
## Implementation Notes
- Address strings accept raw `SSSS:OOOO` form or plain integers such as `0x75a90`.
- The CLI tries a few root folder path variants when opening the program so it can tolerate minor project path differences.
- Plan files support `remove_functions`, `rename_functions`, `create_functions`, `comments`, and `assert_functions`.
- `set-decompiler-comment` maps to a pre-comment and `set-disassembly-comment` maps to an EOL comment.
- Read/query commands open the program read-only; create/rename/comment/plan commands still require the project to be writable.
- `run-script --read-only` is the safest way to do one-off inspection without getting the shared PowerShell session stuck in a multiline Python string.
- `read-region` now reads bytes one address at a time instead of relying on a bulk `getBytes` path that produced misleading all-zero results in this project under PyGhidra.
- PyGhidra startup now suppresses the noisy local GhidraMCP `Module.manifest` warnings during normal CLI operation.
- Address strings still accept raw `SSSS:OOOO` form or plain integers such as `0x75a90` when the underlying MCP endpoint supports them.
- Keep the active-program context in mind; if the wrong Ghidra tab is active, fix that through the live MCP workflow rather than opening a second offline project handle.
- If a missing live endpoint or script capability blocks work, update `ghidra_mcp_wishlist.md` so the gap stays visible instead of reintroducing the local CLI fallback.
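The two address forms mentioned in these notes can be told apart before a string is handed to any endpoint. This is a minimal sketch under stated assumptions: `parse_address` is a hypothetical helper, not a function of the MCP server or the CLI, and it deliberately does not convert between the segmented and plain-integer forms, because that mapping depends on the program's address space.

```python
def parse_address(text):
    """Classify an address string as segmented (SSSS:OOOO) or a plain integer.

    Returns ("segmented", segment, offset) or ("linear", value).
    """
    text = text.strip()
    if ":" in text:
        segment_text, offset_text = text.split(":", 1)
        segment = int(segment_text, 16)
        offset = int(offset_text, 16)
        # Both halves of a 16-bit segmented address must fit in 16 bits.
        if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
            raise ValueError(f"segment or offset out of 16-bit range: {text!r}")
        return ("segmented", segment, offset)
    # Base 0 lets int() honor a 0x prefix and otherwise read decimal.
    return ("linear", int(text, 0))


for raw in ("0007:5a90", "0x75a90"):
    print(raw, "->", parse_address(raw))
```

Rejecting malformed strings early keeps a typo from turning into an edit at the wrong address.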