Crusader_Decomp/psx-map-exporter/docs/implementation-analysis.md
2026-04-13 15:59:50 +02:00


# PSX Map Exporter Implementation Analysis
## Summary
The exporter should be treated as a controlled probe, not as a final renderer.
The key design choice is to keep the whole path raw-file-based and auditable:
- raw WDL in
- explicit carve + record extraction + bundle extraction
- cached sprite/frame artifacts out
- final composed map PNG out
That keeps the work independent from the existing viewer and makes every wrong assumption inspectable.
## Why This Architecture
The existing PSX work already proved two important negative results:
- direct raw bundle-order art binding is too weak to count as solved
- viewer-side polish is low value until extraction is isolated and testable
So the new exporter should optimize for:
- a small number of assumptions
- easy intermediate inspection
- direct correspondence to documented executable behavior where possible
## Chosen `v0` Path
### 1. Parse only the parts of the WDL we can justify now
Implemented directly from docs:
- `0x34` header
- audio-size dword
- absolute region boundaries recovered from high offset words in the header
Not implemented in `v0`:
- full loader section choreography
- detached runtime stream install
- inflated runtime-state interpretation
Those are preserved as future extension points but are not required for the first PNG.
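
The carve step can be sketched as below. This is a minimal illustration, not the exporter's actual code: `carveRegions` is a hypothetical helper, the boundary list is assumed to have already been recovered from the header words, and the idea that the post-audio area starts at header size plus audio size is an assumption rather than a documented fact. Only the `0x34` header length comes from this analysis.

```javascript
// Header length stated in this analysis; everything else here is illustrative.
const HEADER_SIZE = 0x34;

// Hypothetical helper: turn absolute boundary offsets into [start, end) regions.
// `boundaries` is assumed to be already recovered from the header words.
function carveRegions(fileLength, postAudioStart, boundaries) {
  const cuts = [...new Set([postAudioStart, ...boundaries, fileLength])]
    .filter((o) => Number.isInteger(o) && o >= postAudioStart && o <= fileLength)
    .sort((a, b) => a - b);
  const regions = [];
  for (let i = 0; i + 1 < cuts.length; i++) {
    regions.push({
      index: i,
      start: cuts[i],
      end: cuts[i + 1],
      size: cuts[i + 1] - cuts[i],
    });
  }
  return regions;
}

// Made-up numbers: a 0x34-byte header followed by 0x100 bytes of audio.
const audioSize = 0x100;
// Assumption: the post-audio area begins right after header + audio.
const postAudioStart = HEADER_SIZE + audioSize;
const regions = carveRegions(0x1000, postAudioStart, [0x400, 0x800, 0x400]);
console.log(regions.map((r) => `${r.start.toString(16)}-${r.end.toString(16)}`));
```

Deduplicating and sorting the boundaries keeps the carve stable even if the header repeats an offset or lists them out of order.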
### 2. Prefer loader-sized `post_audio_section_00` as a layered authored probe
Why:
- the old region00-first path is now known to overfit the small root-dispatch family
- loader-sized section parsing recovers the dense constructor-placement records from the same first real section, currently modeled as paired 12-byte records inside 24-byte row chunks
- the same section also exposes the smaller root-dispatch lane, which is independently renderable offline and now belongs in the default layered probe
Tradeoff:
- the art binding is still diagnostic-only for many types
- constructor placements are better understood as one runtime object seed layer, not the final visible map or the static world substrate
- root-dispatch rows now render as a second authored layer, but they still do not close the runtime-only control, state, and dynamic effect gaps
This is acceptable for `v0` because the project goal is a fresh, inspectable layered baseline rather than a falsely confident full renderer.
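
The current row model (paired 12-byte records inside 24-byte row chunks) could be read roughly as follows. This is a sketch under stated assumptions: each record is treated as six unsigned little-endian 16-bit words, and the field order is hypothetical. Only the field names are borrowed from the Record contract in this document.

```javascript
const ROW_SIZE = 24;    // row chunk size from the current model
const RECORD_SIZE = 12; // two records per row

// Hypothetical field order; the real layout is not confirmed here.
const FIELDS = ["typeWord", "xWord", "yWord", "zWord", "selectorWord", "laneWord"];

// Read one 12-byte record as six unsigned little-endian 16-bit words (assumption).
function readRecord(buf, offset) {
  const rec = { offset };
  FIELDS.forEach((name, i) => {
    rec[name] = buf.readUInt16LE(offset + i * 2);
  });
  return rec;
}

// Walk whole 24-byte rows and emit both records of each row.
function readRows(section, startOffset = 0) {
  const records = [];
  for (let row = startOffset; row + ROW_SIZE <= section.length; row += ROW_SIZE) {
    records.push(readRecord(section, row));
    records.push(readRecord(section, row + RECORD_SIZE));
  }
  return records;
}

// Synthetic row built in memory; the second record is left zeroed.
const demoRow = Buffer.alloc(ROW_SIZE);
[74, 5635, 3815, 0, 1, 32].forEach((w, i) => demoRow.writeUInt16LE(w, i * 2));
const demoRecords = readRows(demoRow);
console.log(demoRecords.length, demoRecords[0].typeWord);
```

Keeping the byte offset on every record makes it easy to check a suspicious placement against the raw section in a hex viewer.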
### 3. Decode art from raw bundles, but keep binding diagnostic
What is strong already:
- bundle scan can be constrained by executable-backed header fields
- frame decode and row-RLE semantics are pinned
What is still weak:
- exact late-`DAT_800758d8` parse and type-to-resource selection path
- exact palette path
So the current standalone probe does the right split:
- strong part: raw bundle/frame decode
- diagnostic part: `typeWord -> bundle slot`
It also exports candidate late active-header override blobs to cache so the Ghidra-backed `DAT_800758d8` header-only lane can be inspected per run without pretending that binding is already solved.
The newer conclusion from the `LSET1/L0` label failures is narrower than the earlier wording. If one type repeatedly paints a coherent room footprint with obviously wrong art, the exporter is probably visualizing valid world-object seed placement. What it still lacks is the separate static-world layer and the downstream executable bind/state path that chooses the final drawable resource.
Viewer-derived sidecars and donor mappings are no longer acceptable here because they blur exactly the binding problem the exporter is meant to isolate.
## Module Plan
### `src/wdl.js`
Responsibilities:
- read header words
- compute post-audio start
- derive regions from absolute boundary values
- expose region buffers and summary metadata
Reason to isolate it:
- the carve is likely to change as more loader details land
- record extraction should not depend on header internals
### `src/bundles.js`
Responsibilities:
- scan the graphics bank for plausible kind-4/kind-5 bundles
- parse bundle headers and frame entries
- decode frame bytes
- emit grayscale PNG-ready RGBA buffers
When the standalone scan yields zero bundles for a map, `src/export-map.js` may hydrate bundle offsets and frame geometry from `out/psx_wdl_disc/.../summary.json` and continue decoding the actual frame bytes from the raw WDL.
Reason to isolate it:
- this code is reusable even if the map schema changes
- it is the strongest raw-file-backed part of the exporter
### `src/export-map.js`
Responsibilities:
- choose the record source
- choose diagnostic art binding
- normalize screen bounds
- write cache metadata and composed outputs
This file holds the intentionally weak parts of `v0` so they remain easy to replace.
### `src/render.js`
Responsibilities:
- sprite compositing
- sort order approximation
- PNG encoding
- neutral opaque background for evaluation-friendly probe output
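
A minimal sketch of these responsibilities, excluding PNG encoding, is shown below. The sort key (`screenY`, then `screenX`, then `recordIndex`) is an assumed approximation, not the executable's ordering, and the helper names `makeCanvas`, `blit`, and `compose` are hypothetical. Transparency is treated as binary (alpha 0 is skipped, anything else overwrites), which is a simplification.

```javascript
// Neutral opaque gray background so silhouettes are easy to evaluate.
function makeCanvas(width, height, gray = 64) {
  const data = Buffer.alloc(width * height * 4);
  for (let i = 0; i < data.length; i += 4) {
    data[i] = data[i + 1] = data[i + 2] = gray;
    data[i + 3] = 255;
  }
  return { width, height, data };
}

// Copy opaque sprite pixels onto the canvas, clipping at the edges.
function blit(canvas, sprite, dx, dy) {
  for (let y = 0; y < sprite.height; y++) {
    const cy = dy + y;
    if (cy < 0 || cy >= canvas.height) continue;
    for (let x = 0; x < sprite.width; x++) {
      const cx = dx + x;
      if (cx < 0 || cx >= canvas.width) continue;
      const s = (y * sprite.width + x) * 4;
      if (sprite.data[s + 3] === 0) continue; // fully transparent pixel
      const d = (cy * canvas.width + cx) * 4;
      sprite.data.copy(canvas.data, d, s, s + 4);
    }
  }
}

// Painter's order: items further down the screen are drawn later (assumption).
function compose(canvas, items) {
  const ordered = [...items].sort(
    (a, b) =>
      a.screenY - b.screenY || a.screenX - b.screenX || a.recordIndex - b.recordIndex
  );
  for (const it of ordered) blit(canvas, it.sprite, it.drawX, it.drawY);
  return ordered.map((it) => it.recordIndex);
}

// Two 1x1 sprites on the same pixel; the one with the larger screenY wins.
const px = (v) => ({ width: 1, height: 1, data: Buffer.from([v, v, v, 255]) });
const canvas = makeCanvas(2, 2);
const drawOrder = compose(canvas, [
  { recordIndex: 0, screenX: 0, screenY: 5, drawX: 0, drawY: 0, sprite: px(200) },
  { recordIndex: 1, screenX: 0, screenY: 2, drawX: 0, drawY: 0, sprite: px(100) },
]);
console.log(drawOrder, canvas.data[0]);
```

Returning the draw order alongside the pixels gives the cache metadata something concrete to record when sort differences are investigated later.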
## Data Contracts
### Record
```json
{
  "index": 0,
  "source": "region00",
  "typeWord": 74,
  "xWord": 5635,
  "yWord": 3815,
  "zWord": 0,
  "selectorWord": 1,
  "laneWord": 32,
  "screenX": -1820,
  "screenY": -4725
}
```
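
The sample record above is consistent with a simple isometric-style projection, sketched here. This relation is inferred from that single sample only: the rounding rule is a guess, and `zWord` is deliberately ignored because the sample has `zWord = 0` and so cannot show how height contributes. `projectToScreen` is a hypothetical name.

```javascript
// Inferred from one sample record; not confirmed against the executable.
function projectToScreen({ xWord, yWord }) {
  return {
    screenX: yWord - xWord,
    // Rounding for odd sums is an assumption (floor); the sample sum is even.
    screenY: -Math.floor((xWord + yWord) / 2),
  };
}

const projected = projectToScreen({ xWord: 5635, yWord: 3815 });
console.log(projected);
```

If later records with non-zero `zWord` disagree with this relation, the height term is the first place to look.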
### Bundle
```json
{
  "offsetInRegion": 58808,
  "absoluteOffset": 534068,
  "kind": 5,
  "mode": 2,
  "paletteIndex": 12,
  "frameCount": 3,
  "dataOffset": 112,
  "frameTableOffset": 52
}
```
### Scene Item
```json
{
  "recordIndex": 0,
  "bundleSlot": 74,
  "bundleAbsoluteOffset": 954728,
  "frameIndex": 1,
  "screenX": -1820,
  "screenY": -4725,
  "drawX": -1879,
  "drawY": -4815,
  "width": 96,
  "height": 91,
  "originX": 59,
  "originY": 90
}
```
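
In the sample scene item, `drawX` and `drawY` equal the screen position minus the frame origin (`-1820 - 59 = -1879`, `-4725 - 90 = -4815`). The sketch below applies that relation and then shifts all items into a canvas whose top-left corner is `(0, 0)`. The helper names and the `canvasX`/`canvasY` fields are hypothetical and not part of the contract above.

```javascript
// Draw position = screen position minus frame origin (matches the sample item).
function placeItem(item) {
  return {
    ...item,
    drawX: item.screenX - item.originX,
    drawY: item.screenY - item.originY,
  };
}

// Shift all items so the bounding box starts at (0, 0).
function normalizeBounds(items) {
  if (items.length === 0) return { width: 0, height: 0, items: [] };
  let minX = Infinity, minY = Infinity, maxX = -Infinity, maxY = -Infinity;
  for (const it of items) {
    minX = Math.min(minX, it.drawX);
    minY = Math.min(minY, it.drawY);
    maxX = Math.max(maxX, it.drawX + it.width);
    maxY = Math.max(maxY, it.drawY + it.height);
  }
  return {
    width: maxX - minX,
    height: maxY - minY,
    // canvasX/canvasY are hypothetical output fields for the compositor.
    items: items.map((it) => ({ ...it, canvasX: it.drawX - minX, canvasY: it.drawY - minY })),
  };
}

const placed = [
  placeItem({ screenX: -1820, screenY: -4725, originX: 59, originY: 90, width: 96, height: 91 }),
  placeItem({ screenX: 0, screenY: 0, originX: 0, originY: 0, width: 10, height: 10 }),
];
const scene = normalizeBounds(placed);
console.log(scene.width, scene.height);
```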
## Validation Strategy
`v0` validation should answer four questions only:
1. Did the raw WDL parse into the documented regions?
2. Did the graphics-bank scanner recover plausible bundles with decoded frames?
3. Did the constructor-placement extractor recover plausible section-0 rows from the loader-sized section view?
4. Did the compositor produce a non-empty PNG with recognizable art silhouettes on a neutral background?
This is enough for the first pass.
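
The four questions can be turned into a small machine-checkable report, sketched below. All input field names (`decodedFrames`, `paintedPixels`, and so on) are hypothetical, and the "recognizable silhouettes" part of question 4 still needs a human look; the code only checks that something was painted.

```javascript
// Hypothetical summary check; field names are illustrative, not the exporter's.
function validateV0({ regions, bundles, records, png }) {
  return {
    // 1. Regions exist and each has a positive size.
    regionsParsed: regions.length > 0 && regions.every((r) => r.end > r.start),
    // 2. At least one bundle produced decoded frames.
    bundlesDecoded: bundles.some((b) => b.frameCount > 0 && b.decodedFrames > 0),
    // 3. The section-0 extractor returned rows.
    recordsRecovered: records.length > 0,
    // 4. The PNG has area and at least one non-background pixel.
    pngNonEmpty: png.width > 0 && png.height > 0 && png.paintedPixels > 0,
  };
}

const report = validateV0({
  regions: [{ start: 0x134, end: 0x400 }],
  bundles: [{ frameCount: 3, decodedFrames: 3 }],
  records: [{ typeWord: 74 }],
  png: { width: 640, height: 480, paintedPixels: 0 },
});
console.log(report);
```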
## Risks
### Binding risk
The diagnostic bundle binding is the weakest part of the pipeline.
Expected failure modes:
- correct placement with wrong art family
- repeated art across several type families
- frame clamping where selector words exceed available bundle frames
Mitigation:
- keep the chosen bundle slot, frame clamp count, and bundle-repeat metrics in output metadata
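
One way to gather those metrics is sketched below. The slot-picking rule is passed in as `pickSlot` precisely because the real `typeWord -> bundle slot` rule is unknown; the modulo rule in the example is a placeholder, and all names here are hypothetical.

```javascript
// Diagnostic binding with metrics; `pickSlot` is a stand-in for the unknown rule.
function bindDiagnostic(records, bundles, pickSlot) {
  const typesPerSlot = new Map(); // slot -> Set of typeWords that landed on it
  let frameClampCount = 0;
  let unboundCount = 0;
  const items = [];
  for (const rec of records) {
    const bundleSlot = pickSlot(rec.typeWord, bundles.length);
    const bundle = bundles[bundleSlot];
    if (!bundle || bundle.frameCount === 0) {
      unboundCount++;
      continue;
    }
    let frameIndex = rec.selectorWord;
    if (frameIndex >= bundle.frameCount) {
      frameIndex = bundle.frameCount - 1; // clamp and count it
      frameClampCount++;
    }
    if (!typesPerSlot.has(bundleSlot)) typesPerSlot.set(bundleSlot, new Set());
    typesPerSlot.get(bundleSlot).add(rec.typeWord);
    items.push({ recordIndex: rec.index, bundleSlot, frameIndex });
  }
  // Slots shared by more than one type are the "repeated art" signal.
  const sharedSlots = Object.fromEntries(
    [...typesPerSlot]
      .filter(([, types]) => types.size > 1)
      .map(([slot, types]) => [slot, [...types]])
  );
  return { items, metrics: { frameClampCount, unboundCount, sharedSlots } };
}

const bound = bindDiagnostic(
  [
    { index: 0, typeWord: 0, selectorWord: 1 },
    { index: 1, typeWord: 2, selectorWord: 5 },
    { index: 2, typeWord: 1, selectorWord: 0 },
  ],
  [{ frameCount: 3 }, { frameCount: 1 }],
  (typeWord, slotCount) => typeWord % slotCount // placeholder rule only
);
console.log(JSON.stringify(bound.metrics));
```

Writing these counters into the output metadata makes a wrong binding show up as numbers instead of only as odd-looking art.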
### Schema risk
The `region00` record extractor uses a plausibility scan instead of a final loader schema.
Expected failure modes:
- false positives in some maps
- missing records when the preamble differs
Mitigation:
- preserve `recordStartOffset`
- make `region01` fallback selectable from CLI
### Palette risk
Grayscale is intentionally not faithful to the executable color path.
Mitigation:
- keep the grayscale rule explicit
- do not mix partial CLUT heuristics into `v0`
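
An explicit grayscale rule might look like the sketch below. It assumes decoded frames hold one 8-bit palette index per pixel and that index 0 is transparent; both are assumptions for illustration, not confirmed properties of the format. `indexedToGrayRgba` is a hypothetical name.

```javascript
// Map each palette index straight to a gray level; no CLUT lookup at all.
// Assumptions: 8-bit indices, and `transparentIndex` (default 0) is see-through.
function indexedToGrayRgba(indices, width, height, transparentIndex = 0) {
  if (indices.length !== width * height) {
    throw new RangeError("frame size does not match width * height");
  }
  const rgba = Buffer.alloc(width * height * 4);
  for (let i = 0; i < indices.length; i++) {
    const v = indices[i];
    const o = i * 4;
    rgba[o] = rgba[o + 1] = rgba[o + 2] = v;
    rgba[o + 3] = v === transparentIndex ? 0 : 255;
  }
  return rgba;
}

const grayFrame = indexedToGrayRgba(Uint8Array.of(0, 128), 2, 1);
console.log([...grayFrame]);
```

Because the mapping is a single visible line, nobody can mistake the output for a partial palette reconstruction.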
## Immediate Follow-Up Options
After `v0` works, the next pass should choose one of these:
1. Replace provisional art binding with a loader-backed type/resource lookup.
2. Parse the late `DAT_800758d8` bank directly from the large late graphics area instead of relying on slot order.
3. Add executable-backed CLUT reconstruction once the palette path is pinned tightly enough.
4. Recover stage-1 graph ordering when sprite placement is stable enough to make sort differences meaningful.