Crusader_Decomp/psx-map-exporter/docs/implementation-analysis.md
2026-04-13 15:59:50 +02:00

PSX Map Exporter Implementation Analysis

Summary

The exporter should be treated as a controlled probe, not as a final renderer.

The key design choice is to keep the whole path raw-file-based and auditable:

  • raw WDL in
  • explicit carve + record extraction + bundle extraction
  • cached sprite/frame artifacts out
  • final composed map PNG out

That keeps the work independent from the existing viewer and makes every wrong assumption inspectable.

Why This Architecture

The existing PSX work already proved two important negative results:

  • direct raw bundle-order art binding is too weak to count as solved
  • viewer-side polish is low value until extraction is isolated and testable

So the new exporter should optimize for:

  • a small number of assumptions
  • easy intermediate inspection
  • direct correspondence to documented executable behavior where possible

Chosen v0 Path

1. Parse only the parts of the WDL we can justify now

Implemented directly from docs:

  • 0x34 header
  • audio-size dword
  • absolute region boundaries recovered from high offset words in the header

Not implemented in v0:

  • full loader section choreography
  • detached runtime stream install
  • inflated runtime-state interpretation

Those are preserved as future extension points but are not required for the first PNG.

2. Prefer loader-sized post_audio_section_00 as a layered authored probe

Why:

  • the old region00-first path is now known to overfit the small root-dispatch family
  • loader-sized section parsing recovers the dense constructor-placement records from the same first real section, currently modeled as paired 12-byte records inside 24-byte row chunks
  • the same section also exposes the smaller root-dispatch lane, which is independently renderable offline and now belongs in the default layered probe

Tradeoff:

  • the art binding is still diagnostic-only for many types
  • constructor placements are better understood as one runtime object seed layer, not the final visible map or the static world substrate
  • root-dispatch rows now render as a second authored layer, but they still do not close the runtime-only control, state, and dynamic effect gaps

This is acceptable for v0 because the project goal is a fresh, inspectable layered baseline rather than a falsely confident full renderer.
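The row model above (paired 12-byte records inside 24-byte row chunks) can be sketched as follows. This is a minimal illustration, not the exporter's actual code: the function name `parseConstructorRows`, the unsigned little-endian 16-bit read, and the field order inside each 12-byte record are all assumptions. The order simply mirrors the six word fields of the Record contract.

```javascript
// Sketch only. ASSUMPTIONS: each 12-byte record is six little-endian u16
// words, in the same order as the Record data contract. The real field
// order and signedness may differ.
const ROW_SIZE = 24;
const RECORD_SIZE = 12;
const FIELD_NAMES = ["typeWord", "xWord", "yWord", "zWord", "selectorWord", "laneWord"];

function parseConstructorRows(section, startOffset = 0) {
  const records = [];
  // Only whole 24-byte rows are read; trailing bytes are ignored.
  const rowCount = Math.floor((section.length - startOffset) / ROW_SIZE);
  for (let row = 0; row < rowCount; row++) {
    const rowOffset = startOffset + row * ROW_SIZE;
    for (let half = 0; half < 2; half++) {
      const recordOffset = rowOffset + half * RECORD_SIZE;
      // recordOffset is kept so every decoded row stays auditable.
      const record = { index: records.length, recordOffset };
      FIELD_NAMES.forEach((name, i) => {
        record[name] = section.readUInt16LE(recordOffset + i * 2);
      });
      records.push(record);
    }
  }
  return records;
}
```

Keeping the byte offset on every record makes a wrong field-order guess visible in the cache output instead of hiding it behind a rendered sprite.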

3. Decode art from raw bundles, but keep binding diagnostic

What is strong already:

  • bundle scan can be constrained by executable-backed header fields
  • frame decode and row-RLE semantics are pinned

What is still weak:

  • exact late-DAT_800758d8 parse and type-to-resource selection path
  • exact palette path

So the current standalone probe does the right split:

  • strong part: raw bundle/frame decode
  • diagnostic part: typeWord -> bundle slot

It also exports candidate late active-header override blobs to cache so the Ghidra-backed DAT_800758d8 header-only lane can be inspected per run without pretending that binding is already solved.

The newer conclusion from LSET1/L0 label failures is narrower than the earlier wording. If one type repeatedly paints a coherent room footprint with obviously wrong art, the exporter is probably visualizing valid world-object seed placement. What it is still missing is the separate static-world layer and the downstream executable bind/state path that chooses the final drawable resource.

Viewer-derived sidecars and donor mappings are no longer acceptable here because they blur exactly the binding problem the exporter is meant to isolate.

Module Plan

src/wdl.js

Responsibilities:

  • read header words
  • compute post-audio start
  • derive regions from absolute boundary values
  • expose region buffers and summary metadata

Reason to isolate it:

  • the carve is likely to change as more loader details land
  • record extraction should not depend on header internals
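A minimal sketch of the "derive regions from absolute boundary values" step, kept separate from how the boundaries are read out of the header. The function name `carveRegions`, the returned field names, and the choice to start `region00` at the post-audio offset are assumptions for illustration.

```javascript
// Sketch only. ASSUMPTIONS: boundaries are absolute file offsets, the first
// region starts at postAudioStart, and the last region ends at end of file.
function carveRegions(file, postAudioStart, boundaries) {
  // Deduplicate, drop out-of-range values, and sort into ascending cut points.
  const cuts = [...new Set([postAudioStart, ...boundaries, file.length])]
    .filter((v) => Number.isInteger(v) && v >= postAudioStart && v <= file.length)
    .sort((a, b) => a - b);

  const regions = [];
  for (let i = 0; i + 1 < cuts.length; i++) {
    const start = cuts[i];
    const end = cuts[i + 1];
    regions.push({
      name: `region${String(i).padStart(2, "0")}`,
      start,
      end,
      length: end - start,
      buffer: file.subarray(start, end), // a view into the raw file, not a copy
    });
  }
  return regions;
}
```

Because the carve takes only offsets, record extraction can consume `regions[i].buffer` without knowing anything about the header layout.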

src/bundles.js

Responsibilities:

  • scan the graphics bank for plausible kind-4/kind-5 bundles
  • parse bundle headers and frame entries
  • decode frame bytes
  • emit grayscale PNG-ready RGBA buffers

When the standalone scan yields zero bundles for a map, src/export-map.js may hydrate bundle offsets and frame geometry from out/psx_wdl_disc/.../summary.json and continue decoding the actual frame bytes from the raw WDL.

Reason to isolate it:

  • this code is reusable even if the map schema changes
  • it is the strongest raw-file-backed part of the exporter

src/export-map.js

Responsibilities:

  • choose the record source
  • choose diagnostic art binding
  • normalize screen bounds
  • write cache metadata and composed outputs

This file holds the intentionally weak parts of v0 so they remain easy to replace.
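The "normalize screen bounds" responsibility can be illustrated with a small helper. This is a sketch under assumptions: the name `normalizeBounds` and the returned shape are hypothetical, and it only uses the `drawX`, `drawY`, `width`, and `height` fields of the Scene Item contract.

```javascript
// Sketch only: computes the canvas size and the offset that moves the
// top-left-most sprite to (0, 0). Function and result names are hypothetical.
function normalizeBounds(items) {
  if (items.length === 0) {
    return { width: 0, height: 0, offsetX: 0, offsetY: 0 };
  }
  let minX = Infinity;
  let minY = Infinity;
  let maxX = -Infinity;
  let maxY = -Infinity;
  for (const item of items) {
    minX = Math.min(minX, item.drawX);
    minY = Math.min(minY, item.drawY);
    maxX = Math.max(maxX, item.drawX + item.width);
    maxY = Math.max(maxY, item.drawY + item.height);
  }
  // Adding offsetX/offsetY to drawX/drawY yields non-negative canvas coordinates.
  return { width: maxX - minX, height: maxY - minY, offsetX: -minX, offsetY: -minY };
}
```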

src/render.js

Responsibilities:

  • sprite compositing
  • sort order approximation
  • PNG encoding
  • neutral opaque background for evaluation-friendly probe output
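A minimal compositing sketch for these responsibilities, excluding PNG encoding. Everything named here is an assumption: the `composite` function, the sprite shape `{ x, y, width, height, rgba }` in already-normalized canvas coordinates, the background gray value, and the bottom-edge sort key, which is only one possible approximation of draw order.

```javascript
// Sketch only. ASSUMPTIONS: sprites carry RGBA bytes, alpha is a binary
// mask, and sorting by bottom edge approximates the real draw order.
const BACKGROUND = [96, 96, 96, 255]; // hypothetical neutral opaque gray

function composite(width, height, sprites) {
  const canvas = Buffer.alloc(width * height * 4);
  for (let p = 0; p < canvas.length; p += 4) canvas.set(BACKGROUND, p);

  // Sprites lower on screen are drawn later; the sort is stable, so ties
  // keep their input (record) order.
  const ordered = [...sprites].sort((a, b) => (a.y + a.height) - (b.y + b.height));

  for (const s of ordered) {
    for (let sy = 0; sy < s.height; sy++) {
      const cy = s.y + sy;
      if (cy < 0 || cy >= height) continue; // clip rows outside the canvas
      for (let sx = 0; sx < s.width; sx++) {
        const cx = s.x + sx;
        if (cx < 0 || cx >= width) continue; // clip columns outside the canvas
        const src = (sy * s.width + sx) * 4;
        if (s.rgba[src + 3] === 0) continue; // transparent pixel keeps what is below
        const dst = (cy * width + cx) * 4;
        canvas[dst] = s.rgba[src];
        canvas[dst + 1] = s.rgba[src + 1];
        canvas[dst + 2] = s.rgba[src + 2];
        canvas[dst + 3] = 255; // output stays fully opaque
      }
    }
  }
  return canvas;
}
```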

Data Contracts

Record

{
  "index": 0,
  "source": "region00",
  "typeWord": 74,
  "xWord": 5635,
  "yWord": 3815,
  "zWord": 0,
  "selectorWord": 1,
  "laneWord": 32,
  "screenX": -1820,
  "screenY": -4725
}
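The sample values above are consistent with a simple isometric-style projection, sketched here as a sanity check. This relation is inferred from this single sample record only and is not taken from the exporter or the executable; `projectSample` is a hypothetical name, and because `zWord` is 0 in the sample, any height term is left out as unknown.

```javascript
// Sketch only. ASSUMPTION: the relation below is inferred from one sample
// record; the z contribution and any rounding rule are unknown.
function projectSample(record) {
  return {
    screenX: record.yWord - record.xWord,
    screenY: -(record.xWord + record.yWord) / 2,
  };
}

// Matches the sample record: xWord 5635, yWord 3815 gives (-1820, -4725).
const projected = projectSample({ xWord: 5635, yWord: 3815, zWord: 0 });
```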

Bundle

{
  "offsetInRegion": 58808,
  "absoluteOffset": 534068,
  "kind": 5,
  "mode": 2,
  "paletteIndex": 12,
  "frameCount": 3,
  "dataOffset": 112,
  "frameTableOffset": 52
}

Scene Item

{
  "recordIndex": 0,
  "bundleSlot": 74,
  "bundleAbsoluteOffset": 954728,
  "frameIndex": 1,
  "screenX": -1820,
  "screenY": -4725,
  "drawX": -1879,
  "drawY": -4815,
  "width": 96,
  "height": 91,
  "originX": 59,
  "originY": 90
}
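In the sample above, `drawX` and `drawY` equal the screen position minus the frame origin (-1820 - 59 = -1879, -4725 - 90 = -4815), and `bundleSlot` equals the record's `typeWord`. A minimal sketch of building a scene item on that basis follows. The function name `buildSceneItem`, the `frames` geometry array, the clamp rule for out-of-range selector words, and the extra `frameClamped` flag are assumptions for illustration.

```javascript
// Sketch only. ASSUMPTIONS: selectorWord picks the frame and is clamped to
// the last available frame; frames is an array of decoded frame geometry
// { width, height, originX, originY } for the bound bundle.
function buildSceneItem(record, bundle, frames) {
  const frameIndex = Math.min(record.selectorWord, frames.length - 1);
  const frame = frames[frameIndex];
  return {
    recordIndex: record.index,
    bundleSlot: record.typeWord, // diagnostic binding: slot mirrors typeWord
    bundleAbsoluteOffset: bundle.absoluteOffset,
    frameIndex,
    frameClamped: frameIndex !== record.selectorWord, // hypothetical audit flag
    screenX: record.screenX,
    screenY: record.screenY,
    drawX: record.screenX - frame.originX, // top-left corner of the sprite
    drawY: record.screenY - frame.originY,
    width: frame.width,
    height: frame.height,
    originX: frame.originX,
    originY: frame.originY,
  };
}
```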

Validation Strategy

v0 validation should answer four questions only:

  1. Did the raw WDL parse into the documented regions?
  2. Did the graphics-bank scanner recover plausible bundles with decoded frames?
  3. Did the constructor-placement extractor recover plausible section-0 rows from the loader-sized section view?
  4. Did the compositor produce a non-empty PNG with recognizable art silhouettes on a neutral background?

This is enough for the first pass.

Risks

Binding risk

The diagnostic bundle binding is the weakest part of the pipeline.

Expected failure modes:

  • correct placement with wrong art family
  • repeated art across several type families
  • frame clamping where selector words exceed available bundle frames

Mitigation:

  • keep the chosen bundle slot, frame clamp count, and bundle-repeat metrics in output metadata
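One way such metrics could be computed is sketched below. The function name `bindingMetrics`, the output field names, and the per-item inputs `typeWord` and `frameClamped` are hypothetical; "bundle repeat" is interpreted here as one bundle being bound by more than one type, which is an assumption about what the metric should mean.

```javascript
// Sketch only. ASSUMPTIONS: each item carries typeWord, bundleAbsoluteOffset,
// and a boolean frameClamped; a "shared" bundle is one bound by 2+ types.
function bindingMetrics(items) {
  const typesPerBundle = new Map();
  let frameClampCount = 0;
  for (const item of items) {
    if (item.frameClamped) frameClampCount++;
    const key = item.bundleAbsoluteOffset;
    if (!typesPerBundle.has(key)) typesPerBundle.set(key, new Set());
    typesPerBundle.get(key).add(item.typeWord);
  }
  const sharedBundles = [...typesPerBundle.values()].filter((types) => types.size > 1).length;
  return {
    itemCount: items.length,
    frameClampCount,
    distinctBundles: typesPerBundle.size,
    sharedBundles,
  };
}
```

A high `sharedBundles` count relative to `distinctBundles` would be a direct signal of the "repeated art across several type families" failure mode.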

Schema risk

The region00 record extractor uses a plausibility scan instead of a final loader schema.

Expected failure modes:

  • false positives in some maps
  • missing records when the preamble differs

Mitigation:

  • preserve recordStartOffset
  • make region01 fallback selectable from CLI

Palette risk

Grayscale is intentionally not faithful to the executable color path.

Mitigation:

  • keep the grayscale rule explicit
  • do not mix partial CLUT heuristics into v0
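An explicit grayscale rule can be as small as the sketch below. It is one possible rule, not the exporter's confirmed one: it assumes decoded frames are 8-bit palette indices, maps each index directly to a gray level, and treats a single index (0 by default) as transparent. The name `indicesToGrayscaleRgba` is hypothetical.

```javascript
// Sketch only. ASSUMPTIONS: one byte per pixel, gray level = palette index,
// and transparentIndex marks see-through pixels. No CLUT lookup is attempted.
function indicesToGrayscaleRgba(indices, transparentIndex = 0) {
  const rgba = Buffer.alloc(indices.length * 4);
  for (let i = 0; i < indices.length; i++) {
    const value = indices[i];
    const o = i * 4;
    rgba[o] = value;
    rgba[o + 1] = value;
    rgba[o + 2] = value;
    rgba[o + 3] = value === transparentIndex ? 0 : 255;
  }
  return rgba;
}
```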

Immediate Follow-Up Options

After v0 works, the next pass should choose one of these:

  1. Replace provisional art binding with a loader-backed type/resource lookup.
  2. Parse the late DAT_800758d8 bank directly from the large late graphics area instead of relying on slot order.
  3. Add executable-backed CLUT reconstruction once the palette path is pinned tightly enough.
  4. Recover stage-1 graph ordering when sprite placement is stable enough to make sort differences meaningful.