# FaceAI Scaffold

This folder scaffolds the new FaceAI app described in the integration plan.

It includes:

- a Vue frontend for the FaceAI upload and polling flow
- a Node/Express backend for session exchange, queueing, and return handoff
- a dedicated processor runner that consumes matcher jobs from Redis and executes `face_matcher`
- a local legacy simulator so the launch and return flow can be tested without the old Java site
- a Dockerized PHP Apache stack for exercising the real `www/faceai_handoff.php` and `www/faceai_return.php` bridge files

## Structure

```text
faceai/
  apps/
    backend/
    frontend/
    processor/
  docker/
    Dockerfile
```

## Runtime Topology

The scaffold currently expects four runtime roles:

- `faceai`: public HTTP service on port `3001`, serving the built Vue app and the authenticated API
- `processor`: background matcher runner consuming BullMQ jobs from Redis and executing the Linux `face_matcher` binary
- `redis`: short-lived queue and search-state store
- `legacy-php`: local-only PHP Apache simulator for exercising the real bridge files under `www/`

For hosted deployment, the long-lived application topology is `faceai` + `processor` + `redis`. The PHP simulator stays local-only and the real legacy site remains on its existing stack.

## What The End-To-End Local Test Covers

The local simulator exercises the exact flow the plan is aiming for:

1. a legacy-like race page loads the original `www/_js/rus-ecom-240621.js` script and shows a `Face ID` button in place of the `tipoPuntoFoto` selector
2. clicking it hits the real PHP handoff bridge at `www/faceai_handoff.php`
3. the backend signs a short-lived handoff token and redirects to the Vue app
4. the Vue app exchanges the token for its own FaceAI session cookie
5. the user uploads a selfie and starts a Redis-backed race-scoped search
6. the frontend polls until the job completes
7. FaceAI requests a signed return URL
8. the browser is redirected back to the real PHP return bridge at `www/faceai_return.php`
9. the PHP bridge fetches the signed result from FaceAI and renders a filtered legacy-like race page
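
Steps 3 and 4 depend on a token that both sides can verify with `FACEAI_SHARED_SECRET`. The actual token format is defined by the bridge files and the backend, and is not documented here. The following is only a minimal sketch of an HMAC-signed, expiring token; the payload fields (`raceId`, `exp`), the `payload.signature` layout, and both function names are assumptions for illustration.

```javascript
const { createHmac, timingSafeEqual } = require("node:crypto");

// Hypothetical layout: base64url(JSON payload) + "." + base64url(HMAC-SHA256 of the payload part).
function signHandoffToken(payload, secret, ttlSeconds, nowMs = Date.now()) {
  const body = { ...payload, exp: Math.floor(nowMs / 1000) + ttlSeconds };
  const encoded = Buffer.from(JSON.stringify(body)).toString("base64url");
  const signature = createHmac("sha256", secret).update(encoded).digest("base64url");
  return `${encoded}.${signature}`;
}

// Returns the payload when the signature matches and the token has not expired, otherwise null.
function verifyHandoffToken(token, secret, nowMs = Date.now()) {
  const parts = String(token).split(".");
  if (parts.length !== 2) return null;
  const [encoded, signature] = parts;
  const expected = createHmac("sha256", secret).update(encoded).digest();
  const actual = Buffer.from(signature, "base64url");
  // timingSafeEqual throws on a length mismatch, so compare lengths first.
  if (actual.length !== expected.length || !timingSafeEqual(actual, expected)) return null;
  const body = JSON.parse(Buffer.from(encoded, "base64url").toString("utf8"));
  return body.exp > Math.floor(nowMs / 1000) ? body : null;
}
```

A short expiry keeps a leaked redirect URL from being replayed for long, and the constant-time comparison avoids leaking signature bytes through timing.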

## Local Testing With The Legacy PHP Simulator

This is the recommended local test path because it exercises the public site, the processor, Redis, and the real PHP bridge files together.

### Prerequisites

- Docker Desktop or another Docker Engine with Compose support
- local npm dependencies installed in this `faceai/` workspace

### Start The Stack

From this folder:

```bash
npm install
npm run build
docker compose up --build
```

The checked-in `docker-compose.yml` starts:

- FaceAI public site on `http://localhost:3001`
- processor runner on the internal Compose network
- Redis on the internal Compose network
- PHP Apache serving `../www` on `http://localhost:8080`

The local stack also mounts:

- `../test_pkl` into both the public FaceAI container and the processor container as the shared read-only PKL dataset root
- `./logs` into both the public FaceAI container and the processor container as the persistent diagnostics directory
- `../www` into the PHP container so the real bridge files are used

The `processor` service is built from `docker/processor.Dockerfile` using the repository root as the Docker build context. The Dockerfile copies only the checked-in Unix `face_matcher` binary, so the matcher is baked into the processor runtime without bringing along the other Unix or Windows binaries.

### Persistent Logs

The checked-in local Compose stack now mirrors the relevant Node service logs to both Docker stdout/stderr and `faceai/logs` on the host.

After `docker compose up --build`, inspect:

- `faceai/logs/backend.log` for backend startup and API-side failures
- `faceai/logs/processor.log` for worker startup, queue processing, and uncaught processor errors
- `faceai/logs/searches/<searchId>/worker.log` for the per-search processor trace
- `faceai/logs/searches/<searchId>/matcher.log` for the native `face_matcher` output

This keeps the useful processor diagnostics outside the Docker-managed runtime volume so they survive container rebuilds and can be inspected directly from the workspace.
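
As an illustration of the per-search layout listed above, the paths can be derived from the log root and a search id. This is a sketch only: the function name, the return shape, and the allowed id characters are assumptions, not the app's actual code.

```javascript
const path = require("node:path");

// Builds the per-search log locations from a log root (for example /data/logs) and a search id.
function searchLogPaths(logRoot, searchId) {
  // Reject ids that could escape the searches directory, such as "../x".
  // The allowed character set is an assumption.
  if (!/^[A-Za-z0-9_-]+$/.test(searchId)) {
    throw new Error(`unexpected searchId: ${searchId}`);
  }
  const dir = path.posix.join(logRoot, "searches", searchId);
  return {
    worker: path.posix.join(dir, "worker.log"),
    matcher: path.posix.join(dir, "matcher.log"),
  };
}
```

For example, `searchLogPaths("/data/logs", "abc-123").matcher` evaluates to `/data/logs/searches/abc-123/matcher.log`.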

Because the service entrypoints now mirror output instead of redirecting it away, the same startup and runtime messages are also visible through `docker logs regalami-faceai`, `docker logs regalami-faceai-processor`, and Portainer's container log viewer.

The current bundled Linux `face_matcher` binary is a PyInstaller build that requires `GLIBC_2.38` or newer and the `libxcb.so.1` runtime library. The checked-in local processor image satisfies both requirements.

### Run The Browser Test

Open:

```text
http://localhost:8080/faceai_simulator.php?raceId=202&lang=it
```

That page simulates the legacy race page, loads the original race-page JavaScript from `www/_js/rus-ecom-240621.js`, lets the script replace the visible `tipoPuntoFoto` selector with the new `Face ID` button, and launches the real PHP handoff bridge at `www/faceai_handoff.php`.

### Expected Local Flow

Use the page above and verify this sequence:

1. the simulator page renders on port `8080`
2. the visible checkpoint selector is replaced with the `Face ID` launch button
3. clicking `Face ID` redirects through `faceai_handoff.php` into `http://localhost:3001/auth/callback?token=...`
4. the FaceAI app establishes its session and loads the upload flow
5. uploading a selfie creates a queued search that the processor picks up
6. when polling completes, FaceAI redirects back to `http://localhost:8080/faceai_return.php?...`
7. the PHP return page renders the filtered photo list from the FaceAI result payload
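
Steps 5 and 6 amount to a poll-until-terminal loop on the frontend. The status endpoint and its response shape are not documented here, so the sketch below takes the status call as an injected function; the `completed` and `failed` state names and the option names are assumptions.

```javascript
// Polls `fetchStatus` until the search reaches a terminal state or the time budget runs out.
// `fetchStatus` is a caller-supplied async function returning an object with a `state` field.
async function pollUntilDone(fetchStatus, { intervalMs = 1000, timeoutMs = 300000 } = {}) {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const status = await fetchStatus();
    if (status.state === "completed" || status.state === "failed") return status;
    // Stop before sleeping if the next attempt would start after the deadline.
    if (Date.now() + intervalMs > deadline) throw new Error("search polling timed out");
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

Bounding the loop with a deadline keeps the browser from polling forever when the processor is down.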

### Rebuild Notes

If you change frontend code and want Docker to serve the updated UI, rebuild first with:

```bash
npm run build
```

If you want to stop and remove the local containers afterward, run:

```bash
docker compose down
```

### Automated End-To-End Test

The workspace now includes a Playwright suite that drives the PHP simulator, the FaceAI app, and the processor end to end.

From this folder, run:

```bash
npm install
npm run test:e2e:install
npm run test:e2e
```

The suite will:

- build the frontend bundle
- start `docker compose up --build -d`
- open `http://localhost:8080/faceai_simulator.php?raceId=202&lang=it`
- click the `Face ID` launch button injected by `www/_js/rus-ecom-240621.js`
- upload `test_pkl/test_images/DSC_1960.JPG`
- wait for the processor to complete and for FaceAI to redirect to `faceai_return.php`
- assert the filtered legacy result contains the expected `6` matches and includes `DSC_1960.JPG`
- validate `faceai/logs/backend.log`, `faceai/logs/processor.log`, and the per-search `worker.log` and `matcher.log` for the run
- stop the Compose stack automatically when the suite finishes

The default deterministic fixture can be overridden with environment variables if the dataset changes:

```bash
FACEAI_E2E_SELFIE=DSC_1960.JPG
FACEAI_E2E_EXPECTED_MATCH_COUNT=6
```

If you want to keep the local containers running after the test for manual inspection, set:

```bash
FACEAI_E2E_KEEP_STACK=1
```
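
A test helper could resolve these overrides roughly as follows. This is a sketch under the defaults stated above (`DSC_1960.JPG`, `6` matches); the function name and return shape are hypothetical and may not match the suite's real code.

```javascript
// Resolves the E2E fixture settings from the environment, falling back to the documented defaults.
function readE2eFixture(env = process.env) {
  const expected = Number.parseInt(env.FACEAI_E2E_EXPECTED_MATCH_COUNT ?? "6", 10);
  if (!Number.isInteger(expected) || expected < 0) {
    throw new Error("FACEAI_E2E_EXPECTED_MATCH_COUNT must be a non-negative integer");
  }
  return {
    selfie: env.FACEAI_E2E_SELFIE ?? "DSC_1960.JPG",
    expectedMatchCount: expected,
    // Only the literal value "1" keeps the stack running after the suite.
    keepStack: env.FACEAI_E2E_KEEP_STACK === "1",
  };
}
```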

## Live Site Playwright Checks

The `faceai/` workspace now also includes a separate Playwright project for the live site. It is isolated from the Docker-backed simulator suite and is intended to verify that production login still works and that a real race page loads correctly after authentication.

Set these environment variables before running it:

```bash
LIVE_SITE_BASE_URL=https://www.regalamiunsorriso.it
LIVE_SITE_LOGIN_URL=https://www.regalamiunsorriso.it/login_clienti-it.html
LIVE_SITE_RACE_URL=https://www.regalamiunsorriso.it/42%20HALF%20MARATHON%20FIRENZE_gara-1018545---96-1.html
LIVE_SITE_USERNAME=your-login
LIVE_SITE_PASSWORD=your-password
```

Then run:

```bash
npm run test:live:install
npm run test:live
```

What it does:

- opens the live login page
- signs in with the supplied credentials
- persists authenticated Playwright storage state under `tests/live-site/.auth/user.json`
- opens the configured live race URL
- verifies the account UI is present and the race search form renders correctly

Optional live FaceAI checks can also be enabled with:

```bash
LIVE_FACEAI_BASE_URL=https://ai.regalamiunsorriso.it
LIVE_SITE_PORTRAIT_PATH=../test_pkl/live/test_portrait_1.png
LIVE_SITE_RUN_UPLOAD_FLOW=1
```

When enabled, the live suite also:

- validates that the legacy Face ID handoff URL includes the race storage metadata expected by FaceAI
- opens the real FaceAI app and asserts that the legacy header stylesheets load from the live legacy site without injecting cross-origin Font Awesome assets
- confirms the app does not emit the `MISSING_RACE_STORAGE` invalid-race error on launch
- uploads the supplied portrait image, waits for the search to complete, and requires a redirect back to the legacy result page with rendered results

## Processor Troubleshooting

If the processor logs show an error like `spawn /opt/face-recognition/face_matcher ENOENT`, the problem is not the upload flow itself. It means the running processor cannot see the matcher binary at the configured `FACEAI_MATCHER_BINARY` path.

With the current checked-in Dockerfiles, only the Unix `face_matcher` is copied into the processor image from the repository source tree during `docker build`. The runtime container no longer needs a host bind mount for `/opt/face-recognition`.

Published images now get that binary because the Forgejo container workflow builds a dedicated processor image from the repository root, which lets `faceai/docker/processor.Dockerfile` copy:

```text
bin/Face_Recognition_Unix/face_matcher
```

If a running processor still reports `ENOENT`, the deployed image was built before this change or the build did not include the checked-in matcher directory.
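
One way to surface this earlier is a startup preflight that checks the configured path before any job is accepted. The helper below is a hypothetical sketch, not something the processor is known to ship with.

```javascript
const fs = require("node:fs");

// Returns null when the matcher binary exists and is executable,
// otherwise a short reason string suitable for a startup log line.
function checkMatcherBinary(binaryPath) {
  try {
    fs.accessSync(binaryPath, fs.constants.X_OK);
    return null;
  } catch (error) {
    return `${binaryPath}: ${error.code ?? error.message}`;
  }
}
```

Calling `checkMatcherBinary(process.env.FACEAI_MATCHER_BINARY)` at startup would turn a per-search `ENOENT` into a single, obvious boot-time message.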

## Optional Backend And Frontend Dev Loop

If you only want to iterate on the app without the PHP simulator, you can still run the public site and the processor separately. The queue-backed flow now requires Redis and the processor, so `npm run dev` alone is no longer the full stack.

One workable loop is:

```bash
npm install
docker compose up redis -d
npm run dev
```

Then start the processor in a second shell, either with its own local environment or by keeping the Compose-managed processor service running.

## Docker Compose Deployment For The Public Site And Matcher Runner

The checked-in `docker-compose.yml` is for local integration testing because it also includes the PHP simulator and local bind mounts. For hosted deployment, keep the same three-service application topology but remove `legacy-php` and replace the local mounts with the real production paths on the host.

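The public FaceAI site and the matcher runner can both use the same application image, although the production example below pulls a dedicated `faceai-processor` image for the runner. Either way, the two roles differ in their process command: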

- `npm run start` for the public site
- `npm run start:processor` for the matcher runner

If that shared image also embeds or mounts the current Linux `face_matcher` build, make sure the base OS provides `GLIBC_2.38` or newer and includes `libxcb1`. A Debian Trixie-based image with that package installed satisfies the requirement; a Bookworm-based image does not.
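
To check a candidate base image from inside the container, the glibc version can be compared against the `2.38` minimum. The sketch below assumes a glibc-based Linux image, where Node's diagnostic report includes the runtime glibc version; both function names are illustrative.

```javascript
// Compares dotted version strings numerically, so "2.4" is older than "2.38".
function versionAtLeast(actual, required) {
  const a = actual.split(".").map(Number);
  const r = required.split(".").map(Number);
  for (let i = 0; i < Math.max(a.length, r.length); i += 1) {
    const diff = (a[i] ?? 0) - (r[i] ?? 0);
    if (diff !== 0) return diff > 0;
  }
  return true;
}

// Reads the runtime glibc version from Node's diagnostic report.
// The field is absent on non-glibc platforms (musl, macOS, Windows), so this may return null.
function runtimeGlibcVersion() {
  return process.report?.getReport()?.header?.glibcVersionRuntime ?? null;
}
```

Inside the image, `versionAtLeast(runtimeGlibcVersion() ?? "0", "2.38")` gives a quick yes/no before the matcher is ever spawned. It does not cover the `libxcb1` requirement, which still has to be checked through the package manager.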

### Production Compose Example

This example assumes:

- FaceAI runtime files and logs live under `/mnt/storage/data/faceai` on the host (the matcher binary is baked into the processor image, so it needs no host path)
- the NAS export is already mounted on the host at `/mnt/nas12` via `/etc/fstab`, for example `192.168.10.247:/public /mnt/nas12 nfs rw,noatime 0 0`
- the race dataset root is available on the host at `/mnt/nas12/nas2/RUS`

Replace the registry path and secrets with the real deployment values.

```yaml
services:
  faceai:
    image: forgejo.maddoscientisto.net/maddo/faceai-client:latest
    container_name: regalami-faceai
    restart: unless-stopped
    command:
      - node
      - docker/run-with-log-file.mjs
      - /data/logs/backend.log
      - npm
      - run
      - start
    environment:
      NODE_ENV: production
      PORT: 3001
      FACEAI_FRONTEND_URL: https://ai.regalamiunsorriso.it
      FACEAI_PUBLIC_BASE_URL: https://ai.regalamiunsorriso.it
      FACEAI_LEGACY_RETURN_URL: https://www.regalamiunsorriso.it/faceai_return.php
      FACEAI_SHARED_SECRET: replace-with-a-long-random-secret
      FACEAI_SESSION_COOKIE: rus_faceai_session
      FACEAI_REDIS_URL: redis://redis:6379
      FACEAI_QUEUE_NAME: faceai-searches
      FACEAI_RUNTIME_ROOT: /data/runtime
      FACEAI_UPLOAD_ROOT: /data/runtime/uploads
      FACEAI_LOG_ROOT: /data/logs
      FACEAI_PKL_ROOT: /data/pkl
      FACEAI_ENABLE_LOCAL_LEGACY_STATIC: 0
    volumes:
      - /mnt/storage/data/faceai/runtime:/data/runtime
      - /mnt/storage/data/faceai/logs:/data/logs
      - /mnt/nas12/nas2/RUS:/data/pkl:ro
    ports:
      - "3001:3001"
    healthcheck:
      test: ["CMD-SHELL", "wget -qO- http://127.0.0.1:3001/health | grep -q '\"ok\":true'"]
      interval: 10s
      timeout: 5s
      retries: 6
      start_period: 20s
    depends_on:
      redis:
        condition: service_healthy

  processor:
    image: forgejo.maddoscientisto.net/maddo/faceai-processor:latest
    container_name: regalami-faceai-processor
    restart: unless-stopped
    command:
      - node
      - docker/run-with-log-file.mjs
      - /data/logs/processor.log
      - npm
      - run
      - start:processor
    environment:
      NODE_ENV: production
      FACEAI_REDIS_URL: redis://redis:6379
      FACEAI_QUEUE_NAME: faceai-searches
      FACEAI_RUNTIME_ROOT: /data/runtime
      FACEAI_LOG_ROOT: /data/logs
      FACEAI_PKL_ROOT: /data/pkl
      FACEAI_MATCHER_BINARY: /opt/face-recognition/face_matcher
      FACEAI_WORKER_CONCURRENCY: 2
      FACEAI_WORKER_TIMEOUT_MS: 300000
    volumes:
      - /mnt/storage/data/faceai/runtime:/data/runtime
      - /mnt/storage/data/faceai/logs:/data/logs
      - /mnt/nas12/nas2/RUS:/data/pkl:ro
    depends_on:
      redis:
        condition: service_healthy

  redis:
    image: redis:7-alpine
    container_name: regalami-faceai-redis
    restart: unless-stopped
    command: redis-server --appendonly no
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 5s
      timeout: 3s
      retries: 12
```

This pattern assumes a reverse proxy on the host publishes `https://ai.regalamiunsorriso.it` and forwards to `127.0.0.1:3001`. The processor is internal-only and does not expose any public port.

The NAS-backed dataset bind mount stays read-only in both containers. That keeps the application aligned with the local Compose contract, where both services can inspect the same PKL tree but neither service can modify the underlying race data.

### Required Runtime Configuration

Shared application settings:

| Variable | Required | Example | Purpose |
| --- | --- | --- | --- |
| `NODE_ENV` | yes | `production` | disables development defaults |
| `FACEAI_REDIS_URL` | yes | `redis://redis:6379` | queue and search-state backend |
| `FACEAI_QUEUE_NAME` | optional | `faceai-searches` | BullMQ queue name |
| `FACEAI_RUNTIME_ROOT` | yes | `/data/runtime` | shared writable runtime root between site and processor |
| `FACEAI_LOG_ROOT` | recommended | `/data/logs` | persistent host-mounted diagnostics root for backend, processor, and per-search logs |
| `FACEAI_SHARED_SECRET` | yes | long random secret | trust boundary between FaceAI and the legacy bridge |

Public site settings:

| Variable | Required | Example | Purpose |
| --- | --- | --- | --- |
| `PORT` | optional | `3001` | internal listen port |
| `FACEAI_FRONTEND_URL` | yes | `https://ai.regalamiunsorriso.it` | URL used when the legacy bridge redirects into the app |
| `FACEAI_PUBLIC_BASE_URL` | yes | `https://ai.regalamiunsorriso.it` | public base URL used for local links and return flow generation |
| `FACEAI_LEGACY_RETURN_URL` | yes | `https://www.regalamiunsorriso.it/faceai_return.php` | legacy endpoint that receives the signed FaceAI result handoff |
| `FACEAI_LEGACY_HOME_URL` | recommended | `https://www.regalamiunsorriso.it/` | fallback destination used when FaceAI has no valid session and needs to return the browser to the legacy site |
| `FACEAI_SESSION_COOKIE` | optional | `rus_faceai_session` | cookie name for the FaceAI session |
| `FACEAI_UPLOAD_ROOT` | optional | `/data/runtime/uploads` | upload directory inside the shared runtime volume |
| `FACEAI_ENABLE_LOCAL_LEGACY_STATIC` | recommended | `0` | disables development-only static serving of local legacy assets |

Processor settings:

| Variable | Required | Example | Purpose |
| --- | --- | --- | --- |
| `FACEAI_PKL_ROOT` | yes | `/data/pkl` | mounted race-to-PKL dataset root |
| `FACEAI_MATCHER_BINARY` | yes | `/opt/face-recognition/face_matcher` | matcher executable baked into the processor image |
| `FACEAI_WORKER_CONCURRENCY` | optional | `2` | BullMQ worker concurrency |
| `FACEAI_WORKER_TIMEOUT_MS` | optional | `300000` | matcher timeout in milliseconds |
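
A fail-fast startup check can be derived from the rows marked "yes" above. The sketch below follows the tables as written; the grouping and function name are illustrative and may need adjusting, for instance the production example sets `FACEAI_SHARED_SECRET` only on the `faceai` service.

```javascript
// Variables marked "yes" in the tables above, grouped by runtime role.
const REQUIRED = {
  shared: ["NODE_ENV", "FACEAI_REDIS_URL", "FACEAI_RUNTIME_ROOT", "FACEAI_SHARED_SECRET"],
  faceai: ["FACEAI_FRONTEND_URL", "FACEAI_PUBLIC_BASE_URL", "FACEAI_LEGACY_RETURN_URL"],
  processor: ["FACEAI_PKL_ROOT", "FACEAI_MATCHER_BINARY"],
};

// Returns the names of required variables that are unset or empty for the given role.
function missingVariables(role, env = process.env) {
  if (role !== "faceai" && role !== "processor") throw new Error(`unknown role: ${role}`);
  return [...REQUIRED.shared, ...REQUIRED[role]].filter((name) => !env[name]);
}
```

Logging the result of `missingVariables("processor")` before connecting to Redis makes a misconfigured container fail with one readable message.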

The mounted PKL root is expected to use this structure:

```text
/data/pkl/
  2026/
    04.APRILE/
      PISA/
        any-file-name.pkl
```

The public FaceAI site mounts the same path read-only so it can check availability during session bootstrap and refuse uploads immediately when the race has no `.pkl` data.
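
The availability check described here can be as small as a directory listing. In the sketch below, how a race maps to the year, month, and place path segments is an assumption based on the tree above, and the function name is hypothetical.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Returns true when the race directory exists and contains at least one .pkl file.
// `segments` is the race's relative path, for example ["2026", "04.APRILE", "PISA"].
function raceHasPklData(pklRoot, segments) {
  const raceDir = path.join(pklRoot, ...segments);
  try {
    return fs
      .readdirSync(raceDir, { withFileTypes: true })
      .some((entry) => entry.isFile() && entry.name.toLowerCase().endsWith(".pkl"));
  } catch (error) {
    // A missing race directory simply means "no data"; other errors are real failures.
    if (error.code === "ENOENT" || error.code === "ENOTDIR") return false;
    throw error;
  }
}
```

Because the check only reads directory entries, it works unchanged against the read-only mount.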

Do not enable `FACEAI_ENABLE_LOCAL_LEGACY_STATIC` in production. That mode exists only for local simulator flows.

### Legacy-Side Configuration That Must Match

The deployment will not work correctly unless the legacy bridge is configured consistently.

The legacy site must:

- redirect users into `FACEAI_FRONTEND_URL` with a valid signed handoff token
- use the same `FACEAI_SHARED_SECRET` as the FaceAI deployment
- expose the configured `FACEAI_LEGACY_RETURN_URL`
- validate the signed return token and fetch the result payload from FaceAI

The shared secret is the trust boundary between the legacy site and FaceAI. Treat it like any other production secret and inject it through the platform secret store, not through source control.

### Reverse Proxy Expectations

The app should sit behind HTTPS. In practice that means:

- publish only the public FaceAI host name externally
- forward the original host and scheme headers from the proxy
- keep the container bound to localhost or a private network if possible
- allow normal browser redirects between the legacy site and the FaceAI host
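
Forwarded host and scheme matter because the app builds absolute redirect URLs. The sketch below assumes the proxy uses the common `X-Forwarded-Proto` and `X-Forwarded-Host` headers, which this README does not confirm; the function name is illustrative.

```javascript
// Reconstructs the externally visible origin from proxy headers.
// Only use this for requests that arrived through a trusted proxy,
// because clients can set these headers themselves otherwise.
function externalOrigin(headers, fallbackHost) {
  // A chain of proxies may append values; the first entry is the original one.
  const first = (value) => (value ? String(value).split(",")[0].trim() : "");
  const proto = first(headers["x-forwarded-proto"]) || "http";
  const host = first(headers["x-forwarded-host"]) || headers.host || fallbackHost;
  return `${proto}://${host}`;
}
```

In Express, enabling the `trust proxy` setting makes `req.protocol` and `req.hostname` honor the same headers; whether the backend already does so is not stated here.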

### Post-Deploy Validation

After the Compose stack is up, validate at least the following:

1. `GET /health` returns `{"ok":true}` through the public FaceAI host.
2. The legacy handoff endpoint redirects to `https://faceai.../auth/callback?token=...`.
3. FaceAI can exchange the token and establish a session.
4. A search is enqueued in Redis and picked up by the processor.
5. Completing a search produces a redirect URL that points to `FACEAI_LEGACY_RETURN_URL`.
6. The legacy return endpoint can resolve the signed result and render the filtered race page.
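
The first check is easy to script. The following sketch uses the global `fetch` available in Node 18 and later; the function names are illustrative.

```javascript
// Interprets a /health response body: only a JSON object with ok === true counts as healthy.
function isHealthyBody(text) {
  try {
    const body = JSON.parse(text);
    return body !== null && typeof body === "object" && body.ok === true;
  } catch {
    return false;
  }
}

// Fetches <baseUrl>/health and resolves to true only for a 2xx response with a healthy body.
async function checkHealth(baseUrl) {
  const response = await fetch(new URL("/health", baseUrl));
  return response.ok && isHealthyBody(await response.text());
}
```

Running `checkHealth("https://ai.regalamiunsorriso.it")` exercises the reverse proxy as well as the container.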

### Current Production Limitations

This scaffold can now be deployed with the public site, processor, and Redis, but it still has important limitations:

- search state is short-lived in Redis and is not backed by a durable database
- runtime uploads and matcher output still need an agreed production retention and cleanup policy
- the PKL mount contract is now defined, but final NAS operations and cleanup policy still need to be hardened
- the backend currently sets the FaceAI session cookie with `secure: false`, which should be hardened before final public rollout
- the local simulator endpoints under `/dev/*` are still present in the app and should be treated as non-production scaffolding
- the processor CSV parser is still based on the current scaffolded matcher output assumptions

So the Compose deployment is appropriate for hosted integration and controlled production-like rollout, but not yet for the final hardened architecture described in the integration plan.
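
As one possible direction for the cookie hardening item, the `secure` flag can follow the environment instead of being hard-coded. This is a sketch, not the backend's current code; the `sameSite` and `maxAge` values are assumptions.

```javascript
// Builds options for the FaceAI session cookie.
// `secure` is tied to NODE_ENV so local HTTP testing keeps working.
function sessionCookieOptions(env = process.env) {
  return {
    httpOnly: true,
    secure: env.NODE_ENV === "production",
    sameSite: "lax", // still sent on top-level redirects from the legacy site
    path: "/",
    maxAge: 60 * 60 * 1000, // one hour, in milliseconds as Express expects
  };
}
```

With Express this would be passed as the third argument to `res.cookie`, using the name from `FACEAI_SESSION_COOKIE`.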

## Environment

Defaults are already set for local development, but these can be overridden:

```text
PORT=3001
FACEAI_FRONTEND_URL=http://localhost:5173
FACEAI_PUBLIC_BASE_URL=http://localhost:3001
FACEAI_LEGACY_RETURN_URL=http://localhost:3001/dev/legacy/return
FACEAI_LEGACY_HOME_URL=http://localhost:8080/index.jsp
FACEAI_SHARED_SECRET=change-me
FACEAI_SESSION_COOKIE=rus_faceai_session
FACEAI_REDIS_URL=redis://redis:6379
FACEAI_QUEUE_NAME=faceai-searches
FACEAI_RUNTIME_ROOT=/data/runtime
FACEAI_UPLOAD_ROOT=/data/runtime/uploads
FACEAI_LOG_ROOT=/data/logs
FACEAI_PKL_ROOT=/data/pkl
FACEAI_MATCHER_BINARY=/opt/face-recognition/face_matcher
```

If you want FaceAI to return through the new PHP bridge prepared under `www`, point `FACEAI_LEGACY_RETURN_URL` to that endpoint instead, for example `http://localhost/faceai_return.php` or the equivalent URL in your local PHP setup.

In the provided Docker Compose stack, that wiring is already done with:

```text
FACEAI_LEGACY_RETURN_URL=http://localhost:8080/faceai_return.php
FACEAI_LEGACY_HOME_URL=http://localhost:8080/index.jsp
```

The log wiring is also already done in the checked-in Compose file with a host bind mount for `./logs:/data/logs`, so both the backend and the processor write persistent diagnostics into the workspace while also remaining visible through Docker and Portainer container logs.

The Compose contract now also includes an HTTP healthcheck on the public FaceAI service and a Redis readiness check. That makes `docker compose ps` meaningful during rollout: `faceai` only becomes healthy after `GET /health` returns `{"ok":true}`, and both the public site and the processor wait for Redis readiness before their own startup sequence begins.

The local PHP simulator also needs the legacy bridge feature flag enabled:

```text
FACEAI_FEATURE_ENABLED=1
```

The checked-in `docker-compose.yml` now sets that on the `legacy-php` service so the simulator can launch the FaceAI handoff flow locally.

## Notes

- Search orchestration now uses Redis and a dedicated processor worker.
- The checked-in Compose file is meant for local integration testing, not for production use as-is.
- The local legacy simulator is intentionally backend-driven so the handoff can be tested without compiling the existing Java application.
- `www/faceai_simulator.php` exists only for local testing. It does not replace the actual JSP race page.
- The final legacy integration still needs a real signed identity source and a real return-filter implementation on the old site.