Research
This commit is contained in:
parent
28cbbe3470
commit
a9153546ae
56 changed files with 6731 additions and 258 deletions
207
.github/skills/pyghidra-ghidra-ops/SKILL.md
vendored
|
|
@@ -1,214 +1,43 @@
|
|||
---
|
||||
name: pyghidra-ghidra-ops
|
||||
description: Local PyGhidra fallback workflow for Crusader Ghidra edits and queries
|
||||
description: MCP-only Python-backed Ghidra scripting workflow for Crusader edits and queries; use when live MCP Python/script capabilities are needed and never for the offline local CLI toolkit
|
||||
---
|
||||
|
||||
# PyGhidra Ghidra Ops
|
||||
|
||||
Use this skill when Ghidra MCP is missing a needed operation and you need native CPython access to the Ghidra API for the local Crusader project.
|
||||
Use this skill when the live Ghidra MCP session needs Python-backed inspection or scripted edits. Do not use the offline local PyGhidra CLI from this workspace.
|
||||
|
||||
## Use Cases
|
||||
|
||||
- Create or delete functions in `CRUSADER-RAW.EXE`.
|
||||
- Apply small batched repairs driven by verified addresses.
|
||||
- Add comments or rename functions by address from a repeatable JSON plan.
|
||||
- Decompile or disassemble functions without switching back to the MCP server.
|
||||
- Query function metadata, search by name, and inspect xrefs from the same local CLI.
|
||||
- Inspect project root files to confirm the program name/path before running edits.
|
||||
- Run live MCP read-only Python-backed inspection when decompiler or xref work needs scripted help.
|
||||
- Run live MCP write-capable scripted edits for small verified rename, comment, function-boundary, or datatype batches.
|
||||
- Keep scripted Ghidra work inside the active GUI-backed MCP session so project locks do not matter.
|
||||
|
||||
## Workspace Defaults
|
||||
|
||||
- Ghidra install dir: `I:\Apps\ghidra_12.0.4_PUBLIC`
|
||||
- Ghidra project dir: repo root
|
||||
- Ghidra project name: `Crusader`
|
||||
- Default program: `CRUSADER-RAW.EXE`
|
||||
- Local Python env: `.venv-pyghidra311`
|
||||
- CLI entrypoint: `.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader`
|
||||
- Bootstrap script: `.\tools\pyghidra_crusader\bootstrap_env.ps1`
|
||||
- Active authority: the live Ghidra MCP session
|
||||
- Default target unless stated otherwise: `CRUSADER.EXE`
|
||||
- Python-backed operations must run through MCP endpoints exposed by the active Ghidra session
|
||||
|
||||
## Constraints
|
||||
|
||||
- Stay conservative. Use the same rename and batch-size rules as the main Ghidra workflow.
|
||||
- Prefer one focused plan or 1-5 direct edits at a time.
|
||||
- If a live MCP session was started with Python enabled, use live `run_readonly_script(...)` for quick inspection before falling back to the local CLI; reserve the local PyGhidra path for write-side work or still-missing MCP capabilities.
|
||||
- Write operations require the project to be openable for modification. If `Crusader.lock` is present because the GUI owns the project, close Ghidra first or work on a copy.
|
||||
- Never fall back to the offline/local CLI path from this workspace.
|
||||
- If MCP cannot do the needed Python-backed operation, document the gap in `ghidra_mcp_wishlist.md` rather than using the local toolkit.
|
||||
- Keep `crusader_decompilation_notes.md` updated after verified repair batches.
|
||||
|
||||
For 16-bit NE decompiler failures after prototype edits or function recreation, inspect direct callees before assuming the caller frame is corrupt. In this repo, a broken caller (`1420:1499`) was fixed only after repairing a shared callee (`1000:42e2`) whose pointer-return prototype had decompiled with a hidden `__return_storage_ptr__` and poisoned the caller's stack model.
|
||||
|
||||
Refresh the local PyGhidra environment when the bundled Ghidra version changes:
|
||||
## MCP Usage Pattern
|
||||
|
||||
```powershell
|
||||
powershell -ExecutionPolicy Bypass -File .\tools\pyghidra_crusader\bootstrap_env.ps1
|
||||
```
|
||||
|
||||
## Commands
|
||||
|
||||
List root project files:
|
||||
|
||||
```powershell
|
||||
.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader project-files
|
||||
```
|
||||
|
||||
Delete a bad function object:
|
||||
|
||||
```powershell
|
||||
.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader delete-function --entry 0007:5b6f
|
||||
```
|
||||
|
||||
Create a repaired function with an explicit body:
|
||||
|
||||
```powershell
|
||||
.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader create-function `
    --entry 0007:5a90 `
    --name seg043_func_0090 `
    --body-start 0007:5a90 `
    --body-end 0007:5b79 `
    --plate-comment "Recovered from standalone seg043 boundary scan"
|
||||
```
|
||||
|
||||
Rename a function by entry address:
|
||||
|
||||
```powershell
|
||||
.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader rename-function --entry 0006:02cc --name entity_class_get_flag20
|
||||
```
|
||||
|
||||
MCP-style read/query commands are also available from the same CLI:
|
||||
|
||||
```powershell
|
||||
.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader get-function-by-address --address 000a:48ff
|
||||
.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader get_function_by_address --address 000a:48ff
|
||||
.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader get-function-containing --address 000a:4901
|
||||
.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader decompile-function-by-address --address 000a:48ff
|
||||
.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader disassemble-function --address 000a:48ff
|
||||
.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader read-region --start 000a:48ff --end 000a:4912
|
||||
.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader search-functions-by-name --query rng_
|
||||
.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader list-methods --limit 20
|
||||
.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader list_methods --limit 20
|
||||
.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader list-strings --limit 20
|
||||
.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader list-imports --limit 20
|
||||
.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader list-exports --limit 20
|
||||
.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader list-namespaces --limit 20
|
||||
.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader list-segments --limit 20
|
||||
.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader list-data-items --limit 20
|
||||
.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader list-classes --limit 20
|
||||
.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader get-xrefs-to --address 000a:48ff
|
||||
.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader get-function-xrefs --name rng_next_modulo
|
||||
```
|
||||
|
||||
All commands also support structured output for scripting:
|
||||
|
||||
```powershell
|
||||
.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader --format json get-function-by-address --address 000a:48ff
|
||||
```
|
||||
|
||||
JSON output now uses a stable envelope:
|
||||
|
||||
```json
|
||||
{
|
||||
"schema_version": "1.0",
|
||||
"command": "get-function-by-address",
|
||||
"ok": true,
|
||||
"schema": { "type": "object", "properties": { "name": { "type": "string" } } },
|
||||
"data": {
|
||||
"name": "rng_next_modulo",
|
||||
"signature": "undefined rng_next_modulo()",
|
||||
"entry": "000a:48ff",
|
||||
"body_start": "000a:48ff",
|
||||
"body_end": "000a:4912"
|
||||
}
|
||||
}
|
||||
```
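
A consumer of this envelope would typically check `ok` before touching `data`. The following is a minimal sketch of that check, assuming only the fields shown in the example above; the helper name `unwrap_envelope` is hypothetical and is not part of the CLI.

```python
import json
from typing import Any, Optional


def unwrap_envelope(raw: str, expected_command: Optional[str] = None) -> dict[str, Any]:
    """Return the `data` payload of a JSON envelope, or raise if it reports failure."""
    envelope = json.loads(raw)
    # `ok`, `command`, and `data` are the field names from the example envelope.
    if envelope.get("ok") is not True:
        raise RuntimeError(f"command failed: {envelope.get('command')!r}")
    if expected_command is not None and envelope.get("command") != expected_command:
        raise ValueError(f"unexpected command: {envelope.get('command')!r}")
    return envelope["data"]


# Sample text mirroring the envelope above (the `schema` field is omitted for brevity).
SAMPLE = """
{
  "schema_version": "1.0",
  "command": "get-function-by-address",
  "ok": true,
  "data": {
    "name": "rng_next_modulo",
    "entry": "000a:48ff",
    "body_start": "000a:48ff",
    "body_end": "000a:4912"
  }
}
"""

function_info = unwrap_envelope(SAMPLE, expected_command="get-function-by-address")
print(function_info["name"], function_info["entry"])
```

Raising on `ok` being anything other than `true` keeps a failed query from being mistaken for an empty result in later automation.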
|
||||
|
||||
The CLI also accepts exact MCP-style underscore command aliases, so local automation can often swap MCP names directly with little or no translation.
|
||||
|
||||
For ad hoc investigation, prefer `run-script` over multiline `python -c` or pasted PowerShell here-strings. It avoids leaving the shared shell stuck in an unfinished string/block state:
|
||||
|
||||
```powershell
|
||||
.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader run-script --script .\pyghidra_plans\inspect_rng.py --read-only
|
||||
```
|
||||
|
||||
Script globals available inside `run-script`:
|
||||
|
||||
```python
|
||||
config
|
||||
project
|
||||
program
|
||||
helpers["get_function"]
|
||||
helpers["get_function_containing"]
|
||||
helpers["decompile_function"]
|
||||
helpers["disassemble_function"]
|
||||
helpers["get_xrefs_to"]
|
||||
helpers["get_xrefs_from"]
|
||||
helpers["read_region_bytes"]
|
||||
helpers["rename_function"]
|
||||
helpers["set_comment"]
|
||||
```
|
||||
|
||||
Write-side MCP-style aliases are available too:
|
||||
|
||||
```powershell
|
||||
.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader rename-function-by-address --entry 000a:48ff --name rng_next_modulo
|
||||
.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader set-decompiler-comment --address 000a:48ff --text "Returns RNG output modulo the requested bound."
|
||||
.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader set-disassembly-comment --address 000a:48ff --text "Modulo wrapper around rng_advance_state"
|
||||
```
|
||||
|
||||
Apply a small JSON plan:
|
||||
|
||||
```json
|
||||
{
|
||||
"transaction": "Repair seg043 boundaries",
|
||||
"remove_functions": [
|
||||
"0007:5b6f"
|
||||
],
|
||||
"create_functions": [
|
||||
{
|
||||
"entry": "0007:5a90",
|
||||
"name": "seg043_func_0090",
|
||||
"body_start": "0007:5a90",
|
||||
"body_end": "0007:5b79",
|
||||
"comment": "Recovered from standalone seg043 boundary scan"
|
||||
},
|
||||
{
|
||||
"entry": "0007:5b7a",
|
||||
"name": "seg043_func_017a",
|
||||
"body_start": "0007:5b7a",
|
||||
"body_end": "0007:5c1b"
|
||||
},
|
||||
{
|
||||
"entry": "0007:5c1c",
|
||||
"name": "seg043_func_021c",
|
||||
"body_start": "0007:5c1c",
|
||||
"body_end": "0007:5c80"
|
||||
}
|
||||
],
|
||||
"comments": [
|
||||
{
|
||||
"address": "0007:5b6f",
|
||||
"text": "Old auto-created split overlaps the earlier seg043:0090..0179 routine.",
|
||||
"type": "plate"
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
```powershell
|
||||
.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader apply-plan --plan .\seg043_repair.json
|
||||
```
|
||||
|
||||
Dry-run a plan before touching the project:
|
||||
|
||||
```powershell
|
||||
.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader apply-plan --plan .\seg043_repair.json --dry-run
|
||||
```
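
A plan can also be sanity-checked before it is handed to any tool. The sketch below is an illustration only: `check_plan` and `MAX_EDITS` are hypothetical names, the limit of five is borrowed from the "1-5 direct edits" guidance, and the checks (entry inside its body, no overlapping bodies) are assumptions about what a reasonable pre-check looks like, not a description of what `--dry-run` does.

```python
from typing import Any

MAX_EDITS = 5  # assumption: mirrors the "1-5 direct edits at a time" guidance


def parse_segmented(text: str) -> tuple[int, int]:
    """Split an `SSSS:OOOO` string into (segment, offset) integers."""
    segment, offset = text.split(":", 1)
    return int(segment, 16), int(offset, 16)


def check_plan(plan: dict[str, Any]) -> list[str]:
    """Return human-readable problems found in a plan; an empty list means none found."""
    problems: list[str] = []
    creates = plan.get("create_functions", [])
    edit_count = sum(
        len(plan.get(key, []))
        for key in ("remove_functions", "rename_functions", "create_functions", "comments")
    )
    if edit_count > MAX_EDITS:
        problems.append(f"plan has {edit_count} edits; limit is {MAX_EDITS}")

    ranges = []
    for item in creates:
        entry = parse_segmented(item["entry"])
        start = parse_segmented(item["body_start"])
        end = parse_segmented(item["body_end"])
        # Tuples compare segment first, then offset.
        if not (start <= entry <= end):
            problems.append(f"{item['name']}: entry lies outside its body")
        ranges.append((start, end, item["name"]))

    # After sorting by start address, neighbouring bodies must not overlap.
    ranges.sort()
    for (_, prev_end, prev_name), (next_start, _, next_name) in zip(ranges, ranges[1:]):
        if next_start <= prev_end:
            problems.append(f"{prev_name} overlaps {next_name}")
    return problems


PLAN = {
    "transaction": "Repair seg043 boundaries",
    "remove_functions": ["0007:5b6f"],
    "create_functions": [
        {"entry": "0007:5a90", "name": "seg043_func_0090",
         "body_start": "0007:5a90", "body_end": "0007:5b79"},
        {"entry": "0007:5b7a", "name": "seg043_func_017a",
         "body_start": "0007:5b7a", "body_end": "0007:5c1b"},
        {"entry": "0007:5c1c", "name": "seg043_func_021c",
         "body_start": "0007:5c1c", "body_end": "0007:5c80"},
    ],
}

problems = check_plan(PLAN)
print(problems)
```

Returning a list of problems rather than raising on the first one lets a reviewer see every boundary issue in a batch at once.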
|
||||
- Prefer standard MCP endpoints first for decompilation, disassembly, xrefs, renames, comments, function creation/deletion, and datatype work.
|
||||
- Use live MCP Python/script endpoints only when the ordinary endpoint surface cannot express the needed operation.
|
||||
- Keep script batches small and evidence-driven, just like ordinary MCP edit plans.
|
||||
- When a live MCP Python/script batch succeeds, treat that as the canonical workflow; do not duplicate it through the local CLI.
|
||||
|
||||
## Implementation Notes
|
||||
|
||||
- Address strings accept raw `SSSS:OOOO` form or plain integers such as `0x75a90`.
|
||||
- The CLI tries a few root folder path variants when opening the program so it can tolerate minor project path differences.
|
||||
- Plan files support `remove_functions`, `rename_functions`, `create_functions`, `comments`, and `assert_functions`.
|
||||
- `set-decompiler-comment` maps to a pre-comment and `set-disassembly-comment` maps to an EOL comment.
|
||||
- Read/query commands open the program read-only; create/rename/comment/plan commands still require the project to be writable.
|
||||
- `run-script --read-only` is the safest way to do one-off inspection without getting the shared PowerShell session stuck in a multiline Python string.
|
||||
- `read-region` now reads bytes one address at a time instead of relying on a bulk `getBytes` path that produced misleading all-zero results in this project under PyGhidra.
|
||||
- PyGhidra startup now suppresses the noisy local GhidraMCP `Module.manifest` warnings during normal CLI operation.
|
||||
- Address strings still accept raw `SSSS:OOOO` form or plain integers such as `0x75a90` when the underlying MCP endpoint supports them.
|
||||
- Keep the active-program context in mind; if the wrong Ghidra tab is active, fix that through the live MCP workflow rather than opening a second offline project handle.
|
||||
- If a missing live endpoint or script capability blocks work, update `ghidra_mcp_wishlist.md` so the gap stays visible instead of reintroducing the local CLI fallback.
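
The two address-string forms mentioned in these notes can be told apart with a few lines of Python. This is a rough sketch, not the repository's parser: `parse_address` and `pack_segmented` are hypothetical helpers, and the packing rule (`segment << 16 | offset`) is only an inference from the paired examples `0007:5a90` and `0x75a90` in this file.

```python
from typing import Union

# Either a (segment, offset) pair or a plain integer address.
Address = Union[tuple[int, int], int]


def parse_address(text: str) -> Address:
    """Parse `SSSS:OOOO` into (segment, offset), or a plain integer string into an int."""
    text = text.strip()
    if ":" in text:
        segment, offset = text.split(":", 1)
        return int(segment, 16), int(offset, 16)
    # Base 0 lets int() accept both decimal and 0x-prefixed input.
    return int(text, 0)


def pack_segmented(segment: int, offset: int) -> int:
    """ASSUMPTION: flat value is segment << 16 | offset, inferred from this file's examples."""
    if not 0 <= offset <= 0xFFFF:
        raise ValueError(f"offset out of 16-bit range: {offset:#x}")
    return (segment << 16) | offset


segmented = parse_address("0007:5a90")
flat = parse_address("0x75a90")
print(segmented, hex(flat), hex(pack_segmented(*segmented)))
```

Keeping the segmented form as a pair until the last moment avoids committing to a packing rule that a given MCP endpoint may not share.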
|
||||