Research
parent 28cbbe3470
commit a9153546ae
56 changed files with 6731 additions and 258 deletions
.github/instructions/ghidra.instructions.md (vendored, 19 lines changed)
@@ -38,7 +38,7 @@ applyTo: "**"
 - Keep `crusader_decompilation_notes.md` updated after each verified batch. That file is now a short index — append new analysis to the appropriate file in `docs/` and add a row to the index table if a new file is created.
 - Keep `crusader_segment_coverage_ledger.csv` updated after each verified batch whenever a segment can be promoted or reclassified.
 - Keep the progress section in `plan-mid.md` updated after each verified batch so the next pass can resume from the exact stopping point.
-- Keep `ghidra_mcp_wishlist.md` updated whenever the workflow hits a missing MCP capability and has to fall back to PyGhidra or another local-only path.
+- Keep `ghidra_mcp_wishlist.md` updated whenever the workflow hits a missing MCP capability and would otherwise tempt a fallback outside MCP.
 - Each wishlist entry should be short and concrete: what MCP lacked, what command/script/tool had to replace it, and what a useful MCP endpoint or behavior would look like.
 - Record raw-import addresses alongside original segment-relative offsets when porting names.
 - **Always use `rename_function_by_address`** — `rename_function` (by name) fails with "must have required property 'old_name'" and is broken. Use `"function_address": "000c:XXXX"` format.
@@ -56,18 +56,13 @@ applyTo: "**"
 - Before running write endpoints such as `patch_bytes_and_reanalyze` or any PyGhidra byte-write script, verify that the selected program is the intended writable copy, not the reference executable.
 - If the target program is not clearly a writable patch copy in `/Writable`, stop and ask the user before performing the byte write.

-# PyGhidra Fallback
+# Python-Backed Ghidra Through MCP Only

-- Use the local PyGhidra toolkit in `tools/pyghidra_crusader` when MCP is missing an operation such as function creation, deletion, or batched scripted edits.
-- If Ghidra was started with Python enabled, prefer live MCP `run_readonly_script(...)` for one-off inspection first; drop to the local PyGhidra CLI only when the work needs write access or MCP still lacks the required operation.
-- When PyGhidra is needed because MCP lacks a required operation, append a note to `ghidra_mcp_wishlist.md` in the same batch if the gap is not already documented.
-- The workspace-local Python environment for this toolkit is `.venv-pyghidra311`, created from `C:\Users\Maddo\.pyenv\pyenv-win\versions\3.11.6\python.exe` and installed from the bundled Ghidra 12.0.4 offline packages.
-- Default install dir for the toolkit is `I:\Apps\ghidra_12.0.4_PUBLIC`.
-- Invoke the toolkit with `\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader ...` from the repo root.
-- Rebuild or refresh that environment with `powershell -ExecutionPolicy Bypass -File .\tools\pyghidra_crusader\bootstrap_env.ps1` from the repo root when the local PyGhidra packages drift or a Ghidra upgrade lands.
-- Keep PyGhidra batches small too: prefer one focused repair plan or 1-5 direct edits at a time.
-- Write operations require the Ghidra project to open successfully. If `Crusader.lock` is present because the GUI owns the project, close Ghidra first or operate on a project copy.
-- If the workflow needs the user to change Ghidra state, use the ask-questions tool with a yes/no confirmation prompt instead of plain text. Ask the user to close Ghidra before PyGhidra write commands, and ask the user to open the Ghidra project before MCP server commands. The prompt should briefly describe exactly what to do and instruct the user to answer `Yes` only after the action is complete.
+- Never use the offline/local PyGhidra CLI toolkit from this workspace.
+- Do not invoke `tools.pyghidra_crusader`, the local `.venv-pyghidra311` entrypoint, or any project-open workflow that competes with the live GUI lock.
+- Treat Python-backed Ghidra capabilities as MCP-only: use live `run_readonly_script(...)`, live write-capable MCP script endpoints, and other MCP operations exposed by the running Ghidra session.
+- If MCP lacks a needed Python-backed operation, record that gap in `ghidra_mcp_wishlist.md` instead of falling back to the offline/local toolkit.
+- If the workflow needs the user to change Ghidra state for MCP access, use the ask-questions tool with a yes/no confirmation prompt instead of plain text. Ask the user to open the correct Ghidra program or make the correct tab active before MCP work when needed.

 # Current Verified Raw-Import Ports

.github/skills/pyghidra-ghidra-ops/SKILL.md (vendored, 207 lines changed)
@@ -1,214 +1,43 @@
 ---
 name: pyghidra-ghidra-ops
-description: Local PyGhidra fallback workflow for Crusader Ghidra edits and queries
+description: MCP-only Python-backed Ghidra scripting workflow for Crusader edits and queries; use when live MCP Python/script capabilities are needed and never for the offline local CLI toolkit
 ---

 # PyGhidra Ghidra Ops

-Use this skill when Ghidra MCP is missing a needed operation and you need native CPython access to the Ghidra API for the local Crusader project.
+Use this skill when the live Ghidra MCP session needs Python-backed inspection or scripted edits. Do not use the offline local PyGhidra CLI from this workspace.

 ## Use Cases

-- Create or delete functions in `CRUSADER-RAW.EXE`.
-- Apply small batched repairs driven by verified addresses.
-- Add comments or rename functions by address from a repeatable JSON plan.
-- Decompile or disassemble functions without switching back to the MCP server.
-- Query function metadata, search by name, and inspect xrefs from the same local CLI.
-- Inspect project root files to confirm the program name/path before running edits.
+- Run live MCP readonly Python-backed inspection when decompiler or xref work needs scripted help.
+- Run live MCP write-capable scripted edits for small verified rename, comment, function-boundary, or datatype batches.
+- Keep scripted Ghidra work inside the active GUI-backed MCP session so project locks do not matter.

 ## Workspace Defaults

-- Ghidra install dir: `I:\Apps\ghidra_12.0.4_PUBLIC`
-- Ghidra project dir: repo root
-- Ghidra project name: `Crusader`
-- Default program: `CRUSADER-RAW.EXE`
-- Local Python env: `.venv-pyghidra311`
-- CLI entrypoint: `.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader`
-- Bootstrap script: `.\tools\pyghidra_crusader\bootstrap_env.ps1`
+- Active authority: the live Ghidra MCP session
+- Default target unless stated otherwise: `CRUSADER.EXE`
+- Python-backed operations must run through MCP endpoints exposed by the active Ghidra session

 ## Constraints

 - Stay conservative. Use the same rename and batch-size rules as the main Ghidra workflow.
 - Prefer one focused plan or 1-5 direct edits at a time.
-- If a live MCP session was started with Python enabled, use live `run_readonly_script(...)` for quick inspection before falling back to the local CLI; reserve the local PyGhidra path for write-side work or still-missing MCP capabilities.
-- Write operations require the project to be openable for modification. If `Crusader.lock` is present because the GUI owns the project, close Ghidra first or work on a copy.
+- Never fall back to the offline/local CLI path from this workspace.
+- If MCP cannot do the needed Python-backed operation, document the gap in `ghidra_mcp_wishlist.md` rather than using the local toolkit.
 - Keep `crusader_decompilation_notes.md` updated after verified repair batches.

 For 16-bit NE decompiler failures after prototype edits or function recreation, inspect direct callees before assuming the caller frame is corrupt. In this repo a broken caller (`1420:1499`) was only fixed after repairing a shared callee (`1000:42e2`) whose pointer-return prototype had decompiled with a hidden `__return_storage_ptr__` and poisoned the caller stack model.

-Refresh the local PyGhidra environment when the bundled Ghidra version changes:
+## MCP Usage Pattern

-```powershell
-powershell -ExecutionPolicy Bypass -File .\tools\pyghidra_crusader\bootstrap_env.ps1
-```
-
-## Commands
-
-List root project files:
-
-```powershell
-.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader project-files
-```
-
-Delete a bad function object:
-
-```powershell
-.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader delete-function --entry 0007:5b6f
-```
-
-Create a repaired function with an explicit body:
-
-```powershell
-.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader create-function \
-  --entry 0007:5a90 \
-  --name seg043_func_0090 \
-  --body-start 0007:5a90 \
-  --body-end 0007:5b79 \
-  --plate-comment "Recovered from standalone seg043 boundary scan"
-```
-
-Rename a function by entry address:
-
-```powershell
-.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader rename-function --entry 0006:02cc --name entity_class_get_flag20
-```
-
-MCP-style read/query commands are also available from the same CLI:
-
-```powershell
-.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader get-function-by-address --address 000a:48ff
-.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader get_function_by_address --address 000a:48ff
-.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader get-function-containing --address 000a:4901
-.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader decompile-function-by-address --address 000a:48ff
-.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader disassemble-function --address 000a:48ff
-.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader read-region --start 000a:48ff --end 000a:4912
-.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader search-functions-by-name --query rng_
-.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader list-methods --limit 20
-.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader list_methods --limit 20
-.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader list-strings --limit 20
-.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader list-imports --limit 20
-.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader list-exports --limit 20
-.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader list-namespaces --limit 20
-.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader list-segments --limit 20
-.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader list-data-items --limit 20
-.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader list-classes --limit 20
-.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader get-xrefs-to --address 000a:48ff
-.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader get-function-xrefs --name rng_next_modulo
-```
-
-All commands also support structured output for scripting:
-
-```powershell
-.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader --format json get-function-by-address --address 000a:48ff
-```
-
-JSON output now uses a stable envelope:
-
-```json
-{
-  "schema_version": "1.0",
-  "command": "get-function-by-address",
-  "ok": true,
-  "schema": { "type": "object", "properties": { "name": { "type": "string" } } },
-  "data": {
-    "name": "rng_next_modulo",
-    "signature": "undefined rng_next_modulo()",
-    "entry": "000a:48ff",
-    "body_start": "000a:48ff",
-    "body_end": "000a:4912"
-  }
-}
-```
-
-The CLI also accepts exact MCP-style underscore command aliases, so local automation can often swap MCP names directly with little or no translation.
-
-For ad hoc investigation, prefer `run-script` over multiline `python -c` or pasted PowerShell here-strings. It avoids leaving the shared shell stuck in an unfinished string/block state:
-
-```powershell
-.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader run-script --script .\pyghidra_plans\inspect_rng.py --read-only
-```
-
-Script globals available inside `run-script`:
-
-```python
-config
-project
-program
-helpers["get_function"]
-helpers["get_function_containing"]
-helpers["decompile_function"]
-helpers["disassemble_function"]
-helpers["get_xrefs_to"]
-helpers["get_xrefs_from"]
-helpers["read_region_bytes"]
-helpers["rename_function"]
-helpers["set_comment"]
-```
-
-Write-side MCP-style aliases are available too:
-
-```powershell
-.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader rename-function-by-address --entry 000a:48ff --name rng_next_modulo
-.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader set-decompiler-comment --address 000a:48ff --text "Returns RNG output modulo the requested bound."
-.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader set-disassembly-comment --address 000a:48ff --text "Modulo wrapper around rng_advance_state"
-```
-
-Apply a small JSON plan:
-
-```json
-{
-  "transaction": "Repair seg043 boundaries",
-  "remove_functions": [
-    "0007:5b6f"
-  ],
-  "create_functions": [
-    {
-      "entry": "0007:5a90",
-      "name": "seg043_func_0090",
-      "body_start": "0007:5a90",
-      "body_end": "0007:5b79",
-      "comment": "Recovered from standalone seg043 boundary scan"
-    },
-    {
-      "entry": "0007:5b7a",
-      "name": "seg043_func_017a",
-      "body_start": "0007:5b7a",
-      "body_end": "0007:5c1b"
-    },
-    {
-      "entry": "0007:5c1c",
-      "name": "seg043_func_021c",
-      "body_start": "0007:5c1c",
-      "body_end": "0007:5c80"
-    }
-  ],
-  "comments": [
-    {
-      "address": "0007:5b6f",
-      "text": "Old auto-created split overlaps the earlier seg043:0090..0179 routine.",
-      "type": "plate"
-    }
-  ]
-}
-```
-
-```powershell
-.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader apply-plan --plan .\seg043_repair.json
-```
-
-Dry-run a plan before touching the project:
-
-```powershell
-.\.venv-pyghidra311\Scripts\python.exe -m tools.pyghidra_crusader apply-plan --plan .\seg043_repair.json --dry-run
-```
+- Prefer standard MCP endpoints first for decompilation, disassembly, xrefs, renames, comments, function creation/deletion, and datatype work.
+- Use live MCP Python/script endpoints only when the ordinary endpoint surface cannot express the needed operation.
+- Keep script batches small and evidence-driven, just like ordinary MCP edit plans.
+- When a live MCP Python/script batch succeeds, treat that as the canonical workflow; do not duplicate it through the local CLI.

 ## Implementation Notes

-- Address strings accept raw `SSSS:OOOO` form or plain integers such as `0x75a90`.
-- The CLI tries a few root folder path variants when opening the program so it can tolerate minor project path differences.
-- Plan files support `remove_functions`, `rename_functions`, `create_functions`, `comments`, and `assert_functions`.
-- `set-decompiler-comment` maps to a pre-comment and `set-disassembly-comment` maps to an EOL comment.
-- Read/query commands open the program read-only; create/rename/comment/plan commands still require the project to be writable.
-- `run-script --read-only` is the safest way to do one-off inspection without getting the shared PowerShell session stuck in a multiline Python string.
-- `read-region` now reads bytes one address at a time instead of relying on a bulk `getBytes` path that produced misleading all-zero results in this project under PyGhidra.
-- PyGhidra startup now suppresses the noisy local GhidraMCP `Module.manifest` warnings during normal CLI operation.
+- Address strings still accept raw `SSSS:OOOO` form or plain integers such as `0x75a90` when the underlying MCP endpoint supports them.
+- Keep the active-program context in mind; if the wrong Ghidra tab is active, fix that through the live MCP workflow rather than opening a second offline project handle.
+- If a missing live endpoint or script capability blocks work, update `ghidra_mcp_wishlist.md` so the gap stays visible instead of reintroducing the local CLI fallback.
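
The two address forms named in the implementation notes (`SSSS:OOOO` and plain integers such as `0x75a90`) can be normalized before an address is handed to an endpoint. The following is a minimal Python sketch, not code from this repository. It assumes the raw import places each segment index at a 0x10000 stride, which is what the pairing of `0007:5a90` with `0x75a90` in this diff suggests but does not state outright. `parse_address` and `format_address` are hypothetical helper names and are not part of the MCP server or the removed CLI.

```python
import re

# Matches raw-import style "SSSS:OOOO" addresses (1-4 hex digits on each side).
_SEG_OFF = re.compile(r"^([0-9a-fA-F]{1,4}):([0-9a-fA-F]{1,4})$")

# Assumption: one segment index spans 0x10000 bytes in the raw import.
SEGMENT_STRIDE = 0x10000


def parse_address(text: str) -> int:
    """Return a linear address from 'SSSS:OOOO' or from an integer literal."""
    text = text.strip()
    match = _SEG_OFF.match(text)
    if match:
        segment = int(match.group(1), 16)
        offset = int(match.group(2), 16)
        return segment * SEGMENT_STRIDE + offset
    # base=0 accepts 0x-prefixed hex as well as plain decimal strings.
    return int(text, 0)


def format_address(linear: int) -> str:
    """Render a linear address back into zero-padded 'SSSS:OOOO' form."""
    if linear < 0:
        raise ValueError("address must be non-negative")
    segment, offset = divmod(linear, SEGMENT_STRIDE)
    if segment > 0xFFFF:
        raise ValueError("address does not fit in SSSS:OOOO form")
    return f"{segment:04x}:{offset:04x}"


if __name__ == "__main__":
    print(format_address(parse_address("0x75a90")))
```

Normalizing to one form up front keeps notes, ledger rows, and endpoint arguments comparable. Whether a given MCP endpoint accepts the integer form still has to be checked per endpoint, as the note above says.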