# FaceAI Scaffold

This folder scaffolds the new FaceAI app described in the integration plan.

It includes:

- a Vue frontend for the FaceAI upload and polling flow
- a separate Vue monitor frontend for querying the FaceAI audit database
- a Node/Express backend for session exchange, queueing, and return handoff
- a dedicated processor runner that consumes matcher jobs from Redis and executes `face_matcher`
- a local Dockerized Tomcat/JSP stack so the launch and return flow can be tested against the real legacy race pages under `www/`

## Structure

```text
faceai/
  apps/
    backend/
    frontend/
    monitor-frontend/
    processor/
  docker/
    Dockerfile
    processor.Dockerfile
```

## Runtime Topology

The production-oriented application topology consists of the following services:

- `faceai`: public HTTP service on port `3001`, serving the built Vue app and the authenticated API
- `faceai-monitor`: read-only audit monitor on port `3002`, serving a lightweight Vue dashboard backed by the audit API
- `processor`: background matcher runner consuming BullMQ jobs from Redis and executing the Linux `face_matcher` binary
- `redis`: short-lived queue and search-state store

The checked-in development override now expands that into the full local integration stack:

- `faceai-monitor`: local audit dashboard that proxies read-only audit-monitor API calls to the backend
- `tomcat-www`: local Tomcat runtime serving the real legacy JSP race pages from `www/`
- `mysql`: local legacy database used by the Tomcat stack
- `maildump`: local SMTP sink and viewer for the Tomcat stack

That means the local FaceAI flow now runs against the same legacy page type the real site uses instead of a separate PHP bridge container.

## What The End-To-End Local Test Covers

The local Tomcat stack exercises the exact flow the current integration is aiming for:

1. the real legacy race page loads the original `www/_js/rus-ecom-240621.js` script and shows a `Face ID` button instead of `tipoPuntoFoto`
2. clicking it uses the handoff URL emitted by the real JSP race page and launches FaceAI through `http://localhost:3001/dev/legacy/launch`
3. the backend signs a short-lived handoff token and redirects to the Vue app
4. the Vue app exchanges the token for its own FaceAI session cookie
5. the user uploads a selfie and starts a Redis-backed race-scoped search
6. the frontend polls until the job completes
7. FaceAI requests a signed return URL
8. the browser is redirected back to the real Tomcat race page with `faceaiResultId` and `faceaiToken`
9. the legacy race-page JavaScript fetches the signed result from FaceAI, stores the hydrated match payload in browser storage, and reloads the cleaned race URL
10. the same legacy race page renders the FaceAI-filtered gallery view from that stored payload

## Local Testing With The Merged Legacy JSP Stack

This is the recommended local test path because it exercises the public site, the processor, Redis, and the real legacy JSP page flow together.

### Prerequisites

- Docker Desktop or another Docker Engine with Compose support
- local npm dependencies installed in this `faceai/` workspace

### Start The Stack

From this folder:

```bash
npm install
npm run build
docker compose --env-file .env.development up --build
```

The checked-in Compose setup now uses:

- `docker-compose.yml` as the production-ready base stack
- `docker-compose.override.yml` as the local development overlay
- `.env.production` for production-oriented values
- `.env.development` for the local simulator workflow

The local development stack started by the command above combines the base file and the override and starts:

- FaceAI public site on `http://localhost:3001`
- FaceAI monitor on `http://localhost:3002`
- processor runner on the internal Compose network
- Redis on the internal Compose network
- Tomcat serving the real local legacy site on `http://localhost:8080`
- MySQL for the local legacy stack
- Maildump for the local legacy stack on `http://localhost:8025`

The local stack also mounts:

- `../test_pkl` into both the public FaceAI container and the processor container as the shared read-only PKL dataset root
- `./logs` into both the public FaceAI container and the processor container as the persistent diagnostics directory
- `../www` into the Tomcat build/runtime inputs so the real legacy JSP pages and assets are used

The separate monitor container does not mount the SQLite database directly. Instead, it serves a static Vue app and proxies read-only requests under `/api/audit-monitor/*` to the existing backend service, which remains the only process that opens the audit database.

The `processor` service is built from `docker/processor.Dockerfile` using the repository root as Docker build context. That image copies only the checked-in Unix `face_matcher` into the image, so the matcher is baked into the processor runtime without bringing along the other Unix or Windows binaries.

### Persistent Logs

The checked-in local Compose stack now mirrors the relevant Node service logs to both Docker stdout/stderr and `faceai/logs` on the host.

After `docker compose --env-file .env.development up --build`, inspect:

- `faceai/logs/backend.log` for backend startup and API-side failures
- `faceai/logs/processor.log` for worker startup, queue processing, and uncaught processor errors
- `faceai/logs/faceai-audit.sqlite` for the structured 24-month audit trail of FaceAI usage
- `faceai/logs/searches/<searchId>/worker.log` for the per-search processor trace
- `faceai/logs/searches/<searchId>/matcher.log` for the native `face_matcher` output

This keeps the useful processor diagnostics outside the Docker-managed runtime volume so they survive container rebuilds and can be inspected directly from the workspace.

The audit database is a lightweight SQLite file shared by the public FaceAI service and the processor on the existing log volume. Each search row stores the requesting user, race metadata, request timestamp, request IP and user agent, the uploaded selfie SHA-256 fingerprint, and the final match snapshot. That makes the log queryable without introducing another service and lets you recover the same result set a user originally saw by looking up `selfie_sha256`.

The default retention is 730 days, matching the requested 24-month window. Old audit rows are pruned automatically by FaceAI during normal runtime.

Example query:

```bash
sqlite3 faceai/logs/faceai-audit.sqlite "SELECT search_id, user_id, race_id, requested_at, match_count FROM faceai_audit_searches WHERE selfie_sha256 = '...';"
```
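
To fill in the `'...'` placeholder you need the fingerprint of the selfie in question. The helper below is a minimal sketch that assumes `selfie_sha256` stores the lowercase hex SHA-256 of the raw uploaded file bytes; the function name `selfie_sha256` and the selfie path are hypothetical. It relies on GNU `sha256sum` (on macOS, `shasum -a 256` prints the same format).

```bash
# Print the lowercase hex SHA-256 of a file.
# Assumption: this matches how the audit log computes selfie_sha256.
selfie_sha256() {
  sha256sum -- "$1" | cut -d ' ' -f 1
}

# Usage (hypothetical selfie path):
#   hash="$(selfie_sha256 ./selfie.jpg)"
#   sqlite3 faceai/logs/faceai-audit.sqlite \
#     "SELECT search_id, requested_at, match_count FROM faceai_audit_searches WHERE selfie_sha256 = '$hash';"
```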

Because the service entrypoints now mirror output instead of redirecting it away, the same startup and runtime messages are also visible through `docker logs regalami-faceai`, `docker logs regalami-faceai-processor`, and Portainer's container log viewer.

The current bundled Linux `face_matcher` binary is a PyInstaller build that requires `GLIBC_2.38` or newer and the `libxcb.so.1` runtime library. The checked-in local processor image satisfies both requirements.
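
If you build a custom image, the two requirements can be checked from a shell inside the container. This is a rough sketch rather than a check shipped with the repository: the function names are hypothetical, and it assumes a glibc-based Linux image where `getconf`, `ldconfig`, and GNU `sort -V` are available.

```bash
# Succeed when version $1 is greater than or equal to version $2 (GNU sort -V).
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Report whether the current environment meets the matcher's stated requirements.
check_matcher_runtime() {
  local glibc
  glibc="$(getconf GNU_LIBC_VERSION | awk '{print $2}')"   # "glibc 2.41" -> "2.41"
  version_ge "$glibc" "2.38" || { echo "glibc $glibc is older than 2.38" >&2; return 1; }
  ldconfig -p | grep -q 'libxcb\.so\.1' || { echo "libxcb.so.1 not found" >&2; return 1; }
  echo "matcher runtime requirements satisfied (glibc $glibc)"
}

# Usage, from a shell inside the processor container:
#   check_matcher_runtime
```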

### Run The Browser Test

Open:

```text
http://localhost:8080/Foto2.abl?id_gara=1018557&pageRow=96&pageNumber=1
```

For audit visibility during local runs, also open:

```text
http://localhost:3002
```

The monitor shows:

- lifetime and recent-window search totals from the SQLite audit log
- the latest searches with filter-by-status and free-text lookup for race, user, result, error code, selfie name, or search id
- per-search event timelines and stored match snapshots
- recent events and a recent top-race breakdown

The race URL on port `8080` is the real local Tomcat race page. It loads the original race-page JavaScript from `www/_js/rus-ecom-240621.js`, lets the script replace the visible `tipoPuntoFoto` selector with the new `Face ID` button, and launches the backend handoff endpoint configured through the JSP page.

### Expected Local Flow

Use the page above and verify this sequence:

1. the real local race page renders on port `8080`
2. the visible checkpoint selector is replaced with the `Face ID` launch button
3. clicking `Face ID` redirects through `http://localhost:3001/dev/legacy/launch` into `http://localhost:3001/auth/callback?token=...`
4. the FaceAI app establishes its session and loads the upload flow
5. uploading a selfie creates a queued search that the processor picks up
6. when polling completes, FaceAI redirects back to the same race page with `faceaiResultId` and `faceaiToken`
7. the race-page JavaScript hydrates the result from FaceAI and reloads the cleaned legacy URL
8. the filtered photo list renders from the hydrated FaceAI result payload

### Rebuild Notes

If you change frontend code and want Docker to serve the updated UI, rebuild first with:

```bash
npm run build
```

If you want to stop and remove the local containers afterward, run:

```bash
docker compose --env-file .env.development down
```

### Automated End-To-End Test

The workspace now includes a Playwright suite that drives the real local Tomcat race page, the FaceAI app, and the processor end to end.

From this folder, run:

```bash
npm install
npm run test:e2e:install
npm run test:e2e
```

The suite will:

- build the frontend bundle
- start `docker compose --env-file .env.development up --build -d`
- open `http://localhost:8080/Foto2.abl?id_gara=1018557&pageRow=96&pageNumber=1`
- click the `Face ID` launch button injected by `www/_js/rus-ecom-240621.js`
- upload `test_pkl/test_images/DSC_1960.JPG`
- wait for the processor to complete and for FaceAI to redirect back to the legacy race page
- assert the filtered legacy result contains the expected `6` matches and includes `DSC_1960.JPG`
- validate `faceai/logs/backend.log`, `faceai/logs/processor.log`, and the per-search `worker.log` and `matcher.log` for the run
- stop the Compose stack automatically when the suite finishes

The default deterministic fixture can be overridden with environment variables if the dataset changes:

```bash
FACEAI_E2E_SELFIE=DSC_1960.JPG
FACEAI_E2E_EXPECTED_MATCH_COUNT=6
```

If you want to keep the local containers running after the test for manual inspection, set:

```bash
FACEAI_E2E_KEEP_STACK=1
```

## Live Site Playwright Checks

The `faceai/` workspace now also includes a separate Playwright project for the live site. It is isolated from the Docker-backed simulator suite and is intended to verify that production login still works and that a real race page loads correctly after authentication.

Set these environment variables before running it:

```bash
LIVE_SITE_BASE_URL=https://www.regalamiunsorriso.it
LIVE_SITE_LOGIN_URL=https://www.regalamiunsorriso.it/login_clienti-it.html
LIVE_SITE_RACE_URL=https://www.regalamiunsorriso.it/42%20HALF%20MARATHON%20FIRENZE_gara-1018545---96-1.html
LIVE_SITE_USERNAME=your-login
LIVE_SITE_PASSWORD=your-password
```

Then run:

```bash
npm run test:live:install
npm run test:live
```

What it does:

- opens the live login page
- signs in with the supplied credentials
- persists authenticated Playwright storage state under `tests/live-site/.auth/user.json`
- opens the configured live race URL
- verifies the account UI is present and the race search form renders correctly

Optional live FaceAI checks can also be enabled with:

```bash
LIVE_FACEAI_BASE_URL=https://ai.regalamiunsorriso.it
LIVE_SITE_PORTRAIT_PATH=../test_pkl/live/test_portrait_1.png
LIVE_SITE_RUN_UPLOAD_FLOW=1
```

When enabled, the live suite also:

- validates that the legacy Face ID handoff URL includes the race storage metadata expected by FaceAI
- opens the real FaceAI app and asserts that the legacy header stylesheets load from the live legacy site without injecting cross-origin Font Awesome assets
- confirms the app does not emit the `MISSING_RACE_STORAGE` invalid-race error on launch
- uploads the supplied portrait image, waits for the search to complete, and requires a redirect back to the legacy result page with rendered results

## Processor Troubleshooting

If the processor logs show an error like `spawn /opt/face-recognition/face_matcher ENOENT`, the problem is not the upload flow itself. It means the running processor cannot see the matcher binary at the configured `FACEAI_MATCHER_BINARY` path.

With the current checked-in Dockerfiles, only the Unix `face_matcher` is copied into the processor image from the repository source tree during `docker build`. The runtime container no longer needs a host bind mount for `/opt/face-recognition`.

Published images now get that binary because the Forgejo container workflow builds a dedicated processor image from the repository root, which lets `faceai/docker/processor.Dockerfile` copy:

```text
bin/Face_Recognition_Unix/face_matcher
```

If a running processor still reports `ENOENT`, the deployed image was built before this change or the build did not include the checked-in matcher directory.

## Optional Backend And Frontend Dev Loop

If you only want to iterate on the app without the local Tomcat stack, you can still run the public site and the processor separately. The queue-backed flow now requires Redis and the processor, so `npm run dev` alone is no longer the full stack.

One workable loop is:

```bash
npm install
docker compose --env-file .env.development up redis -d
npm run dev
```

Then start the processor in a second shell, either with its own local environment or by keeping the Compose-managed processor service running.

## Docker Compose Deployment For The Public Site And Matcher Runner

The checked-in `docker-compose.yml` is now the production-ready base stack for hosted deployment. The checked-in `docker-compose.override.yml` is the development overlay that restores the local Tomcat/JSP stack, workspace bind mounts, and development-oriented commands.

Because Docker Compose auto-loads `docker-compose.override.yml` when it is present in the same directory, production-style runs from this workspace must explicitly select only the base file.
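
One way to avoid loading the override by accident is a small wrapper that always passes the base file and the production env file. This is only a convenience sketch; `prod_compose` is a hypothetical name and is not part of the repository.

```bash
# Run any docker compose subcommand against the base file only.
# Passing -f explicitly stops Compose from auto-loading docker-compose.override.yml.
prod_compose() {
  docker compose -f docker-compose.yml --env-file .env.production "$@"
}

# Usage:
#   prod_compose config --services   # list the services the base file defines
#   prod_compose up -d
```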

The public FaceAI site and the matcher runner can both use the same application image. The difference is only the process command:

- `npm run start` for the public site
- `npm run start:processor` for the matcher runner

If that shared image also embeds or mounts the current Linux `face_matcher` build, make sure the base OS provides `GLIBC_2.38` or newer and includes `libxcb1`. A Debian Trixie-based image with that package installed satisfies the requirement; a Bookworm-based image does not.

### Production Compose Commands

This setup assumes:

- FaceAI runtime files, logs, and matcher binaries live under `/var/docker/faceai` on the host
- the NAS export is already mounted on the host at `/mnt/nas12` via `/etc/fstab`, for example `192.168.10.247:/public /mnt/nas12 nfs rw,noatime 0 0`
- the race dataset root is available on the host at `/mnt/nas12/nas2/RUS`

Set the real production values in `.env.production`, then run:

```bash
docker compose -f docker-compose.yml --env-file .env.production up -d
```

To pull newer images before a rollout:

```bash
docker compose -f docker-compose.yml --env-file .env.production pull
docker compose -f docker-compose.yml --env-file .env.production up -d
```

This pattern assumes a reverse proxy on the host publishes `https://ai.regalamiunsorriso.it` and forwards to `127.0.0.1:3001`. The processor is internal-only and does not expose any public port.

The NAS-backed dataset bind mount stays read-only in both containers. That keeps the application aligned with the local Compose contract, where both services can inspect the same PKL tree but neither service can modify the underlying race data.

### Required Runtime Configuration

Shared application settings:

| Variable | Required | Example | Purpose |
| --- | --- | --- | --- |
| `NODE_ENV` | yes | `production` | disables development defaults |
| `FACEAI_REDIS_URL` | yes | `redis://redis:6379` | queue and search-state backend |
| `FACEAI_QUEUE_NAME` | optional | `faceai-searches` | BullMQ queue name |
| `FACEAI_RUNTIME_ROOT` | yes | `/data/runtime` | shared writable runtime root between site and processor |
| `FACEAI_LOG_ROOT` | recommended | `/data/logs` | persistent host-mounted diagnostics root for backend, processor, and per-search logs |
| `FACEAI_AUDIT_DB_PATH` | recommended | `/data/logs/faceai-audit.sqlite` | SQLite audit database shared by backend and processor |
| `FACEAI_AUDIT_RETENTION_DAYS` | recommended | `730` | how long structured audit rows are kept before automatic pruning |
| `FACEAI_SHARED_SECRET` | yes | long random secret | trust boundary between FaceAI and the legacy bridge |

Public site settings:

| Variable | Required | Example | Purpose |
| --- | --- | --- | --- |
| `PORT` | optional | `3001` | internal listen port |
| `FACEAI_FRONTEND_URL` | yes | `https://ai.regalamiunsorriso.it` | URL used when the legacy bridge redirects into the app |
| `FACEAI_PUBLIC_BASE_URL` | yes | `https://ai.regalamiunsorriso.it` | public base URL used for local links and return flow generation |
| `FACEAI_LEGACY_RETURN_URL` | yes | `https://www.regalamiunsorriso.it/faceai_return.php` | legacy endpoint that receives the signed FaceAI result handoff |
| `FACEAI_LEGACY_HOME_URL` | recommended | `https://www.regalamiunsorriso.it/` | fallback destination used when FaceAI has no valid session and needs to return the browser to the legacy site |
| `FACEAI_SESSION_COOKIE` | optional | `rus_faceai_session` | cookie name for the FaceAI session |
| `FACEAI_UPLOAD_ROOT` | optional | `/data/runtime/uploads` | upload directory inside the shared runtime volume |
| `FACEAI_ENABLE_LOCAL_LEGACY_STATIC` | recommended | `0` | disables development-only static serving of local legacy assets |

Processor settings:

| Variable | Required | Example | Purpose |
| --- | --- | --- | --- |
| `FACEAI_PKL_ROOT` | yes | `/data/pkl` | mounted race-to-PKL dataset root |
| `FACEAI_MATCHER_BINARY` | yes | `/app/bin/face_matcher` | matcher executable baked into the processor image |
| `FACEAI_MATCHER_TOLERANCE` | optional | `0.5` | forwarded to `face_matcher --tolerance`; must stay between `0.35` and `0.75` |
| `FACEAI_WORKER_CONCURRENCY` | optional | `2` | BullMQ worker concurrency |
| `FACEAI_WORKER_TIMEOUT_MS` | optional | `300000` | matcher timeout in milliseconds |
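
Because `FACEAI_MATCHER_TOLERANCE` has a bounded range, it can be worth checking the value before a rollout. The function below is an illustrative pre-flight check, not the validation the processor itself performs; the name `valid_matcher_tolerance` is hypothetical.

```bash
# Succeed only when $1 is a plain decimal number within the documented 0.35..0.75 range.
valid_matcher_tolerance() {
  awk -v t="$1" 'BEGIN {
    # Reject anything that is not a plain decimal number before comparing.
    if (t !~ /^[0-9]+(\.[0-9]+)?$/) exit 1
    exit !(t + 0 >= 0.35 && t + 0 <= 0.75)
  }'
}

# Usage:
#   valid_matcher_tolerance "$FACEAI_MATCHER_TOLERANCE" \
#     || echo "FACEAI_MATCHER_TOLERANCE must be between 0.35 and 0.75" >&2
```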

The mounted PKL root is expected to use this structure:

```text
/data/pkl/
  2026/
    04.APRILE/
      PISA/
        any-file-name.pkl
```

The public FaceAI site mounts the same path read-only so it can check availability during session bootstrap and refuse uploads immediately when the race has no `.pkl` data.
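
The same availability check can be approximated from a shell when debugging a mount. This sketch assumes the year/month/race layout shown above, with `.pkl` files directly inside the race directory; `race_has_pkl` is a hypothetical helper, and the backend's actual lookup may differ.

```bash
# Succeed when <root>/<year>/<month>/<race> contains at least one .pkl file.
# Assumption: .pkl files sit directly in the race directory, as in the tree above.
race_has_pkl() {
  local dir="$1/$2/$3/$4"
  [ -d "$dir" ] && [ -n "$(find "$dir" -maxdepth 1 -type f -name '*.pkl' | head -n 1)" ]
}

# Usage:
#   race_has_pkl /data/pkl 2026 04.APRILE PISA && echo "PKL data present"
```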

Do not enable `FACEAI_ENABLE_LOCAL_LEGACY_STATIC` in production. That mode exists only for local simulator flows.

### Legacy-Side Configuration That Must Match

The deployment will not work correctly unless the legacy bridge is configured consistently.

The legacy site must:

- redirect users into `FACEAI_FRONTEND_URL` with a valid signed handoff token
- use the same `FACEAI_SHARED_SECRET` as the FaceAI deployment
- expose the configured `FACEAI_LEGACY_RETURN_URL`
- validate the signed return token and fetch the result payload from FaceAI

The shared secret is the trust boundary between the legacy site and FaceAI. Treat it like any other production secret and inject it through the platform secret store, not through source control.

### Reverse Proxy Expectations

The app should sit behind HTTPS. In practice that means:

- publish only the public FaceAI host name externally
- forward the original host and scheme headers from the proxy
- keep the container bound to localhost or a private network if possible
- allow normal browser redirects between the legacy site and the FaceAI host

### Post-Deploy Validation

After the Compose stack is up, validate at least the following:

1. `GET /health` returns `{"ok":true}` through the public FaceAI host.
2. The legacy handoff endpoint redirects to `https://faceai.../auth/callback?token=...`.
3. FaceAI can exchange the token and establish a session.
4. A search is enqueued in Redis and picked up by the processor.
5. Completing a search produces a redirect URL that points to the intended legacy return target.
6. The legacy page can resolve the signed result and render the filtered race page.
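
The first check can be scripted with `curl`. The helper names below are hypothetical, and the sketch assumes the endpoint returns exactly `{"ok":true}`, ignoring whitespace.

```bash
# Succeed when the response body is {"ok":true}, ignoring whitespace.
is_healthy_body() {
  [ "$(printf '%s' "$1" | tr -d '[:space:]')" = '{"ok":true}' ]
}

# Fetch <base-url>/health and validate the body; fails on HTTP errors or timeouts.
check_health() {
  local body
  body="$(curl -fsS --max-time 5 "$1/health")" || return 1
  is_healthy_body "$body"
}

# Usage:
#   check_health https://ai.regalamiunsorriso.it && echo "FaceAI is healthy"
```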

### Current Production Limitations

This scaffold can now be deployed with the public site, processor, and Redis, but it still has important limitations:

- search state is short-lived in Redis and is not backed by a durable database
- runtime uploads and matcher output still need an agreed production retention and cleanup policy
- the PKL mount contract is now defined, but final NAS operations and cleanup policy still need to be hardened
- the backend currently sets the FaceAI session cookie with `secure: false`, which should be hardened before final public rollout
- the local simulator endpoints under `/dev/*` are still present in the app and should be treated as non-production scaffolding
- the processor CSV parser is still based on the current scaffolded matcher output assumptions

So the Compose deployment is appropriate for hosted integration and controlled production-like rollout, but not yet for the final hardened architecture described in the integration plan.

## Environment Files

The repository now keeps separate env files for the two Compose workflows:

- `.env.production`: production-oriented values used with the base Compose file only
- `.env.development`: local simulator values used with the base file plus the override

To start the local development stack:

```bash
docker compose --env-file .env.development up --build
```

To start the production-style stack from this workspace without loading the development override:

```bash
docker compose -f docker-compose.yml --env-file .env.production up -d
```

If you need a template that lists all supported variables, use `.env.example`.

The most important development defaults are:

```text
NODE_ENV=development
FACEAI_PORT=3001
FACEAI_FRONTEND_URL=http://localhost:3001
FACEAI_PUBLIC_BASE_URL=http://localhost:3001
FACEAI_LEGACY_RETURN_MODE=direct
FACEAI_LEGACY_HOME_URL=http://localhost:8080/index.jsp
FACEAI_FEATURE_ENABLED=1
FACEAI_HANDOFF_URL=http://localhost:3001/dev/legacy/launch
FACEAI_SHARED_SECRET=change-me
FACEAI_SESSION_COOKIE=rus_faceai_session
FACEAI_REDIS_URL=redis://redis:6379
FACEAI_QUEUE_NAME=faceai-searches
FACEAI_RUNTIME_ROOT=/data/runtime
FACEAI_UPLOAD_ROOT=/data/runtime/uploads
FACEAI_LOG_ROOT=/data/logs
FACEAI_AUDIT_DB_PATH=/data/logs/faceai-audit.sqlite
FACEAI_AUDIT_RETENTION_DAYS=730
FACEAI_PKL_ROOT=/data/pkl
FACEAI_MATCHER_BINARY=/app/bin/face_matcher
FACEAI_MATCHER_TOLERANCE=0.5
```

If you want FaceAI to force the older bridge-style return path instead, set `FACEAI_LEGACY_RETURN_MODE=bridge` and point `FACEAI_LEGACY_RETURN_URL` at the appropriate legacy bridge endpoint.

In the current development override, the local legacy handoff wiring is already done with:

```text
FACEAI_HANDOFF_URL=http://localhost:3001/dev/legacy/launch
FACEAI_LEGACY_RETURN_MODE=direct
FACEAI_LEGACY_HOME_URL=http://localhost:8080/index.jsp
```

The development override also keeps the log wiring with `./logs:/data/logs`, so both the backend and the processor write persistent diagnostics into the workspace while also remaining visible through Docker and Portainer container logs.

The Compose contract now also includes an HTTP healthcheck on the public FaceAI service and a Redis readiness check. That makes `docker compose ps` meaningful during rollout: `faceai` only becomes healthy after `GET /health` returns `{"ok":true}`, and both the public site and the processor wait for Redis readiness before their own startup sequence begins.

The local Tomcat stack also needs the legacy bridge feature flag enabled on the JSP side:

```text
FACEAI_FEATURE_ENABLED=1
```

The checked-in `docker-compose.override.yml` sets that on the local `tomcat-www` service so the race page can launch the FaceAI handoff flow locally.

## Audit Monitor API

The monitor frontend uses these read-only backend routes:

- `GET /api/audit-monitor/summary`
- `GET /api/audit-monitor/searches?status=...&query=...&limit=...`
- `GET /api/audit-monitor/searches/<searchId>`

They read the same SQLite audit database already configured through `FACEAI_AUDIT_DB_PATH`. The monitor container proxies those requests to `faceai`, so no browser-side direct SQLite access is involved.
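
For quick manual checks, the routes can be called with `curl` through the local monitor proxy. This is a sketch under assumptions: `audit_searches` is a hypothetical helper, the `completed` status value is a guess at a valid filter, and any authentication the backend may require is not covered here.

```bash
# Query the searches route. curl -G sends the --data-urlencode pairs as a
# URL-encoded query string, so free-text queries with spaces are safe.
audit_searches() {
  local base="$1" status="$2" query="$3" limit="$4"
  curl -fsS -G "$base/api/audit-monitor/searches" \
    --data-urlencode "status=$status" \
    --data-urlencode "query=$query" \
    --data-urlencode "limit=$limit"
}

# Usage (the status value is an assumption):
#   curl -fsS http://localhost:3002/api/audit-monitor/summary
#   audit_searches http://localhost:3002 completed "PISA" 20
```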

## Notes

- Search orchestration now uses Redis and a dedicated processor worker.
- The checked-in base Compose file is production-oriented, while the checked-in override is development-only.
- The local development flow now uses the actual Tomcat-served JSP race page instead of a separate PHP simulator bridge.
- The final legacy integration still needs a real signed identity source and a real return-filter implementation on the old site.