# FaceAI Scaffold
This folder scaffolds the new FaceAI app described in the integration plan.
It includes:
- a Vue frontend for the FaceAI upload and polling flow
- a Node/Express backend for session exchange, queueing, and return handoff
- a dedicated processor runner that consumes matcher jobs from Redis and executes `face_matcher`
- a local legacy simulator so the launch and return flow can be tested without the old Java site
- a Dockerized PHP Apache stack for exercising the real `www/faceai_handoff.php` and `www/faceai_return.php` bridge files
## Structure
```text
faceai/
  apps/
    backend/
    frontend/
    processor/
  docker/
    Dockerfile
```
## Runtime Topology
The scaffold currently expects four runtime roles:
- `faceai`: public HTTP service on port `3001`, serving the built Vue app and the authenticated API
- `processor`: background matcher runner consuming BullMQ jobs from Redis and executing the Linux `face_matcher` binary
- `redis`: short-lived queue and search-state store
- `legacy-php`: local-only PHP Apache simulator for exercising the real bridge files under `www/`
For hosted deployment, the long-lived application topology is `faceai` + `processor` + `redis`. The PHP simulator stays local-only and the real legacy site remains on its existing stack.
## What The End-To-End Local Test Covers
The local simulator exercises the exact flow the plan is aiming for:
1. a legacy-like race page loads the original `www/_js/rus-ecom-240621.js` script and shows a `Face ID` button instead of `tipoPuntoFoto`
2. clicking it hits the real PHP handoff bridge at `www/faceai_handoff.php`
3. the backend signs a short-lived handoff token and redirects to the Vue app
4. the Vue app exchanges the token for its own FaceAI session cookie
5. the user uploads a selfie and starts a Redis-backed race-scoped search
6. the frontend polls until the job completes
7. FaceAI requests a signed return URL
8. the browser is redirected back to the real PHP return bridge at `www/faceai_return.php`
9. the PHP bridge fetches the signed result from FaceAI and renders a filtered legacy-like race page
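The signed, short-lived token in steps 3 and 4 can be sketched as follows. This is a minimal illustration only, assuming an HMAC-SHA256 signature over a base64url-encoded JSON payload; the actual token format used by the bridge is not described in this README, and the function names, the `raceId` field, and the `exp` field are hypothetical.

```javascript
const crypto = require("node:crypto");

// Hypothetical layout: base64url(JSON payload) + "." + base64url(HMAC-SHA256 signature).
// The real bridge format may differ; this only illustrates the idea.
function signHandoffToken(payload, secret, ttlSeconds, nowMs = Date.now()) {
  const body = { ...payload, exp: Math.floor(nowMs / 1000) + ttlSeconds };
  const encoded = Buffer.from(JSON.stringify(body)).toString("base64url");
  const signature = crypto.createHmac("sha256", secret).update(encoded).digest("base64url");
  return `${encoded}.${signature}`;
}

// Returns the decoded payload, or null when the token is malformed, forged, or expired.
function verifyHandoffToken(token, secret, nowMs = Date.now()) {
  const [encoded, signature] = String(token).split(".");
  if (!encoded || !signature) return null;
  const expected = crypto.createHmac("sha256", secret).update(encoded).digest();
  const received = Buffer.from(signature, "base64url");
  // timingSafeEqual requires equal lengths, so check that first.
  if (received.length !== expected.length || !crypto.timingSafeEqual(received, expected)) {
    return null;
  }
  const body = JSON.parse(Buffer.from(encoded, "base64url").toString("utf8"));
  if (typeof body.exp !== "number" || body.exp * 1000 <= nowMs) return null; // expired
  return body;
}
```

Both sides would derive the signature from the same `FACEAI_SHARED_SECRET`, which is why a mismatch between the legacy bridge and FaceAI makes every handoff fail verification.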
## Local Testing With The Legacy PHP Simulator
This is the recommended local test path because it exercises the public site, the processor, Redis, and the real PHP bridge files together.
### Prerequisites
- Docker Desktop or another Docker Engine with Compose support
- local npm dependencies installed in this `faceai/` workspace
### Start The Stack
From this folder:
```bash
npm install
npm run build
docker compose up --build
```
The checked-in `docker-compose.yml` starts:
- FaceAI public site on `http://localhost:3001`
- processor runner on the internal Compose network
- Redis on the internal Compose network
- PHP Apache serving `../www` on `http://localhost:8080`
The local stack also mounts:
- `../bin/Face_Recognition_Unix` into the processor container as the matcher binary source
- `../test_pkl` into both the public FaceAI container and the processor container as the shared read-only PKL dataset root
- `./logs` into both the public FaceAI container and the processor container as the persistent diagnostics directory
- `../www` into the PHP container so the real bridge files are used
The `processor` service is built from `docker/processor.Dockerfile`, which uses a Debian Trixie-based Node 22 image, applies the current package upgrades available during build, and installs `libxcb1` so the bundled Linux `face_matcher` binary can run locally.
### Persistent Logs
The checked-in local Compose stack now redirects the relevant Node service logs into `faceai/logs` on the host.
After `docker compose up --build`, inspect:
- `faceai/logs/backend.log` for backend startup and API-side failures
- `faceai/logs/processor.log` for worker startup, queue processing, and uncaught processor errors
- `faceai/logs/searches/<searchId>/worker.log` for the per-search processor trace
- `faceai/logs/searches/<searchId>/matcher.log` for the native `face_matcher` output
This keeps the useful processor diagnostics outside the Docker-managed runtime volume so they survive container rebuilds and can be inspected directly from the workspace.
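When scripting against these logs, the per-search paths can be derived from the log root and the search id. The helper below is a hypothetical sketch that only mirrors the layout listed above; the allowed `searchId` alphabet is an assumption.

```javascript
const path = require("node:path");

// `logRoot` would be `faceai/logs` on the host, or FACEAI_LOG_ROOT inside a container.
function searchLogPaths(logRoot, searchId) {
  // Assumed id alphabet; rejects values such as "../x" that could escape the directory.
  if (!/^[A-Za-z0-9_-]+$/.test(searchId)) {
    throw new Error(`unexpected searchId: ${searchId}`);
  }
  const dir = path.posix.join(logRoot, "searches", searchId);
  return {
    worker: path.posix.join(dir, "worker.log"),
    matcher: path.posix.join(dir, "matcher.log"),
  };
}
```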
The current bundled Linux `face_matcher` binary is a PyInstaller build that requires `GLIBC_2.38` or newer and the `libxcb.so.1` runtime library. The checked-in local processor image satisfies that requirement.
### Run The Browser Test
Open:
```text
http://localhost:8080/faceai_simulator.php?raceId=202&lang=it
```
That page simulates the legacy race page, loads the original race-page JavaScript from `www/_js/rus-ecom-240621.js`, lets the script replace the visible `tipoPuntoFoto` selector with the new `Face ID` button, and launches the real PHP handoff bridge at `www/faceai_handoff.php`.
### Expected Local Flow
Use the page above and verify this sequence:
1. the simulator page renders on port `8080`
2. the visible checkpoint selector is replaced with the `Face ID` launch button
3. clicking `Face ID` redirects through `faceai_handoff.php` into `http://localhost:3001/auth/callback?token=...`
4. the FaceAI app establishes its session and loads the upload flow
5. uploading a selfie creates a queued search that the processor picks up
6. when polling completes, FaceAI redirects back to `http://localhost:8080/faceai_return.php?...`
7. the PHP return page renders the filtered photo list from the FaceAI result payload
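The polling in steps 5 and 6 can be sketched as a loop with a deadline. This is not the frontend's actual code: `fetchStatus` is an injected function standing in for whatever status endpoint the app exposes, and the `completed` and `failed` status values are assumptions.

```javascript
// Polls `fetchStatus` until the search reaches a terminal state or the deadline passes.
async function pollUntilDone(fetchStatus, { intervalMs = 1000, timeoutMs = 300000 } = {}) {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const result = await fetchStatus();
    if (result.status === "completed") return result;
    if (result.status === "failed") throw new Error("search failed");
    // Give up if the next attempt would start after the deadline.
    if (Date.now() + intervalMs > deadline) throw new Error("search timed out");
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

Injecting `fetchStatus` keeps the loop testable without a running backend; in the browser it could wrap a `fetch` call against the search status route.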
### Rebuild Notes
If you change frontend code and want Docker to serve the updated UI, rebuild the frontend bundle before rerunning `docker compose up --build`:
```bash
npm run build
```
If you want to stop and remove the local containers afterward, run:
```bash
docker compose down
```
### Automated End-To-End Test
The workspace now includes a Playwright suite that drives the PHP simulator, the FaceAI app, and the processor end to end.
From this folder, run:
```bash
npm install
npm run test:e2e:install
npm run test:e2e
```
The suite will:
- build the frontend bundle
- start `docker compose up --build -d`
- open `http://localhost:8080/faceai_simulator.php?raceId=202&lang=it`
- click the `Face ID` launch button injected by `www/_js/rus-ecom-240621.js`
- upload `test_pkl/test_images/DSC_1960.JPG`
- wait for the processor to complete and for FaceAI to redirect to `faceai_return.php`
- assert the filtered legacy result contains the expected `6` matches and includes `DSC_1960.JPG`
- validate `faceai/logs/backend.log`, `faceai/logs/processor.log`, and the per-search `worker.log` and `matcher.log` for the run
- stop the Compose stack automatically when the suite finishes
The default deterministic fixture can be overridden with environment variables if the dataset changes:
```bash
FACEAI_E2E_SELFIE=DSC_1960.JPG
FACEAI_E2E_EXPECTED_MATCH_COUNT=6
```
If you want to keep the local containers running after the test for manual inspection, set:
```bash
FACEAI_E2E_KEEP_STACK=1
```
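A test helper could read these overrides with the documented defaults along the following lines. The function name and the rule that only the literal value `1` keeps the stack running are assumptions; the real suite may parse them differently.

```javascript
// Reads the E2E overrides, falling back to the checked-in fixture values.
function readE2EConfig(env = process.env) {
  const expected = Number.parseInt(env.FACEAI_E2E_EXPECTED_MATCH_COUNT ?? "6", 10);
  if (!Number.isInteger(expected) || expected < 0) {
    throw new Error("FACEAI_E2E_EXPECTED_MATCH_COUNT must be a non-negative integer");
  }
  return {
    selfie: env.FACEAI_E2E_SELFIE ?? "DSC_1960.JPG",
    expectedMatchCount: expected,
    // Assumption: only "1" keeps the Compose stack up after the run.
    keepStack: env.FACEAI_E2E_KEEP_STACK === "1",
  };
}
```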
## Optional Backend And Frontend Dev Loop
If you only want to iterate on the app without the PHP simulator, you can still run the public site and the processor separately. The queue-backed flow now requires Redis and the processor, so `npm run dev` alone is no longer the full stack.
One workable loop is:
```bash
npm install
docker compose up redis -d
npm run dev
```
Then start the processor in a second shell, either with its own local environment or by keeping the Compose-managed processor service running.
## Docker Compose Deployment For The Public Site And Matcher Runner
The checked-in `docker-compose.yml` is for local integration testing because it also includes the PHP simulator and local bind mounts. For hosted deployment, keep the same three-service application topology but remove `legacy-php` and replace the local mounts with the real production paths on the host.
The public FaceAI site and the matcher runner can both use the same application image. The difference is only the process command:
- `npm run start` for the public site
- `npm run start:processor` for the matcher runner
If that shared image also embeds or mounts the current Linux `face_matcher` build, make sure the base OS provides `GLIBC_2.38` or newer and includes `libxcb1`. A Debian Trixie-based image with that package installed satisfies the requirement; a Bookworm-based image does not.
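To make the glibc requirement checkable at processor startup, a version comparison like the one below could be used. It is a sketch, not part of the scaffold: the helper names are hypothetical, and the runtime lookup relies on Node's diagnostic report, which exposes `glibcVersionRuntime` only on glibc-based Linux.

```javascript
// Compares dotted versions numerically, so "2.9" is lower than "2.38".
function glibcAtLeast(actual, required = "2.38") {
  const a = actual.split(".").map(Number);
  const r = required.split(".").map(Number);
  for (let i = 0; i < Math.max(a.length, r.length); i += 1) {
    const diff = (a[i] ?? 0) - (r[i] ?? 0);
    if (diff !== 0) return diff > 0;
  }
  return true;
}

// Returns the runtime glibc version, or null on musl (Alpine), macOS, and Windows.
function runtimeGlibcVersion() {
  const header = process.report?.getReport()?.header;
  return header?.glibcVersionRuntime ?? null;
}
```

For reference, Debian Bookworm ships glibc 2.36, so `glibcAtLeast("2.36")` is `false`, while Trixie's 2.41 passes.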
### Production Compose Example
This example assumes:
- FaceAI runtime files, logs, and matcher binaries live under `/var/docker/faceai` on the host
- the NAS export is already mounted on the host at `/mnt/nas12` via `/etc/fstab`, for example `192.168.10.247:/public /mnt/nas12 nfs rw,noatime 0 0`
- the race dataset root is available on the host at `/mnt/nas12/nas2/RUS`
Replace the registry path and secrets with the real deployment values.
```yaml
services:
  faceai:
    image: forgejo.maddoscientisto.net/maddo/faceai-client:latest
    container_name: regalami-faceai
    restart: unless-stopped
    command: sh -c "mkdir -p /data/logs && npm run start >> /data/logs/backend.log 2>&1"
    environment:
      NODE_ENV: production
      PORT: 3001
      FACEAI_FRONTEND_URL: https://ai.regalamiunsorriso.it
      FACEAI_PUBLIC_BASE_URL: https://ai.regalamiunsorriso.it
      FACEAI_LEGACY_RETURN_URL: https://www.regalamiunsorriso.it/faceai_return.php
      FACEAI_SHARED_SECRET: change-this-to-a-long-random-secret
      FACEAI_SESSION_COOKIE: rus_faceai_session
      FACEAI_REDIS_URL: redis://redis:6379
      FACEAI_QUEUE_NAME: faceai-searches
      FACEAI_RUNTIME_ROOT: /data/runtime
      FACEAI_UPLOAD_ROOT: /data/runtime/uploads
      FACEAI_LOG_ROOT: /data/logs
      FACEAI_PKL_ROOT: /data/pkl
      FACEAI_ENABLE_LOCAL_LEGACY_STATIC: 0
    volumes:
      - /var/docker/faceai/runtime:/data/runtime
      - /var/docker/faceai/logs:/data/logs
      - /mnt/nas12/nas2/RUS:/data/pkl:ro
    ports:
      - "127.0.0.1:3001:3001"
    depends_on:
      - redis

  processor:
    image: forgejo.maddoscientisto.net/maddo/faceai-client:latest
    container_name: regalami-faceai-processor
    restart: unless-stopped
    command: sh -c "mkdir -p /data/logs && npm run start:processor >> /data/logs/processor.log 2>&1"
    environment:
      NODE_ENV: production
      FACEAI_REDIS_URL: redis://redis:6379
      FACEAI_QUEUE_NAME: faceai-searches
      FACEAI_RUNTIME_ROOT: /data/runtime
      FACEAI_LOG_ROOT: /data/logs
      FACEAI_PKL_ROOT: /data/pkl
      FACEAI_MATCHER_BINARY: /opt/face-recognition/face_matcher
      FACEAI_WORKER_CONCURRENCY: 2
      FACEAI_WORKER_TIMEOUT_MS: 300000
    volumes:
      - /var/docker/faceai/runtime:/data/runtime
      - /var/docker/faceai/logs:/data/logs
      - /mnt/nas12/nas2/RUS:/data/pkl:ro
      - /var/docker/faceai/bin/Face_Recognition_Unix:/opt/face-recognition:ro
    depends_on:
      - redis

  redis:
    image: redis:7-alpine
    container_name: regalami-faceai-redis
    restart: unless-stopped
    command: redis-server --appendonly no
```
This pattern assumes a reverse proxy on the host publishes `https://ai.regalamiunsorriso.it` and forwards to `127.0.0.1:3001`. The processor is internal-only and does not expose any public port.
The NAS-backed dataset bind mount stays read-only in both containers. That keeps the application aligned with the local Compose contract, where both services can inspect the same PKL tree but neither service can modify the underlying race data.
### Required Runtime Configuration
Shared application settings:
| Variable | Required | Example | Purpose |
| --- | --- | --- | --- |
| `NODE_ENV` | yes | `production` | disables development defaults |
| `FACEAI_REDIS_URL` | yes | `redis://redis:6379` | queue and search-state backend |
| `FACEAI_QUEUE_NAME` | optional | `faceai-searches` | BullMQ queue name |
| `FACEAI_RUNTIME_ROOT` | yes | `/data/runtime` | shared writable runtime root between site and processor |
| `FACEAI_LOG_ROOT` | recommended | `/data/logs` | persistent host-mounted diagnostics root for backend, processor, and per-search logs |
| `FACEAI_SHARED_SECRET` | yes | long random secret | trust boundary between FaceAI and the legacy bridge |
Public site settings:
| Variable | Required | Example | Purpose |
| --- | --- | --- | --- |
| `PORT` | optional | `3001` | internal listen port |
| `FACEAI_FRONTEND_URL` | yes | `https://ai.regalamiunsorriso.it` | URL used when the legacy bridge redirects into the app |
| `FACEAI_PUBLIC_BASE_URL` | yes | `https://ai.regalamiunsorriso.it` | public base URL used for local links and return flow generation |
| `FACEAI_LEGACY_RETURN_URL` | yes | `https://www.regalamiunsorriso.it/faceai_return.php` | legacy endpoint that receives the signed FaceAI result handoff |
| `FACEAI_SESSION_COOKIE` | optional | `rus_faceai_session` | cookie name for the FaceAI session |
| `FACEAI_UPLOAD_ROOT` | optional | `/data/runtime/uploads` | upload directory inside the shared runtime volume |
| `FACEAI_ENABLE_LOCAL_LEGACY_STATIC` | recommended | `0` | disables development-only static serving of local legacy assets |
Processor settings:
| Variable | Required | Example | Purpose |
| --- | --- | --- | --- |
| `FACEAI_PKL_ROOT` | yes | `/data/pkl` | mounted race-to-PKL dataset root |
| `FACEAI_MATCHER_BINARY` | yes | `/opt/face-recognition/face_matcher` | matcher executable inside the processor container |
| `FACEAI_WORKER_CONCURRENCY` | optional | `2` | BullMQ worker concurrency |
| `FACEAI_WORKER_TIMEOUT_MS` | optional | `300000` | matcher timeout in milliseconds |
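A startup check for the variables marked as required might look like the sketch below. The grouping is an assumption based on the Compose example, where `FACEAI_SHARED_SECRET` is set only on the public site; the scaffold's real configuration loader may behave differently, and `missingVariables` is a hypothetical name.

```javascript
// Required variables per role. Assumption: the shared secret is only checked for the site,
// because the production Compose example does not pass it to the processor.
const REQUIRED = {
  shared: ["NODE_ENV", "FACEAI_REDIS_URL", "FACEAI_RUNTIME_ROOT"],
  site: ["FACEAI_SHARED_SECRET", "FACEAI_FRONTEND_URL", "FACEAI_PUBLIC_BASE_URL", "FACEAI_LEGACY_RETURN_URL"],
  processor: ["FACEAI_PKL_ROOT", "FACEAI_MATCHER_BINARY"],
};

// Returns the names that are unset or blank for the given role ("site" or "processor").
function missingVariables(role, env = process.env) {
  const names = [...REQUIRED.shared, ...(REQUIRED[role] ?? [])];
  return names.filter((name) => !env[name] || env[name].trim() === "");
}
```

Failing fast on a non-empty result at boot is usually easier to diagnose than a handoff that breaks later because a URL or secret was left unset.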
The mounted PKL root is expected to use this structure:
```text
/data/pkl/
  2026/
    04.APRILE/
      PISA/
        any-file-name.pkl
```
The public FaceAI site mounts the same path read-only so it can check availability during session bootstrap and refuse uploads immediately when the race has no `.pkl` data.
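The availability check can be pictured as a directory scan for at least one `.pkl` file. The sketch below assumes the race directory has already been resolved (for example `/data/pkl/2026/04.APRILE/PISA`); how a race id maps to that directory is not covered here, and `hasPklData` is a hypothetical name.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Returns true when `raceDir` contains at least one .pkl file directly inside it.
function hasPklData(raceDir) {
  let entries;
  try {
    entries = fs.readdirSync(raceDir, { withFileTypes: true });
  } catch {
    return false; // a missing or unreadable directory counts as "no data"
  }
  return entries.some(
    (entry) => entry.isFile() && path.extname(entry.name).toLowerCase() === ".pkl"
  );
}
```

Because the mount is read-only, a check like this only reads directory entries and never touches the dataset itself.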
Do not enable `FACEAI_ENABLE_LOCAL_LEGACY_STATIC` in production. That mode exists only for local simulator flows.
### Legacy-Side Configuration That Must Match
The deployment will not work correctly unless the legacy bridge is configured consistently.
The legacy site must:
- redirect users into `FACEAI_FRONTEND_URL` with a valid signed handoff token
- use the same `FACEAI_SHARED_SECRET` as the FaceAI deployment
- expose the configured `FACEAI_LEGACY_RETURN_URL`
- validate the signed return token and fetch the result payload from FaceAI
The shared secret is the trust boundary between the legacy site and FaceAI. Treat it like any other production secret and inject it through the platform secret store, not through source control.
### Reverse Proxy Expectations
The app should sit behind HTTPS. In practice that means:
- publish only the public FaceAI host name externally
- forward the original host and scheme headers from the proxy
- keep the container bound to localhost or a private network if possible
- allow normal browser redirects between the legacy site and the FaceAI host
### Post-Deploy Validation
After the Compose stack is up, validate at least the following:
1. `GET /health` returns `{"ok":true}` through the public FaceAI host.
2. The legacy handoff endpoint redirects to `https://faceai.../auth/callback?token=...`.
3. FaceAI can exchange the token and establish a session.
4. A search is enqueued in Redis and picked up by the processor.
5. Completing a search produces a redirect URL that points to `FACEAI_LEGACY_RETURN_URL`.
6. The legacy return endpoint can resolve the signed result and render the filtered race page.
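The first check is easy to automate. The sketch below treats only the exact `{"ok":true}` contract as healthy; the function names are hypothetical, and `checkHealth` assumes Node 18 or newer for the global `fetch`.

```javascript
// Interprets the body of GET /health.
function isHealthy(bodyText) {
  try {
    const body = JSON.parse(bodyText);
    return body !== null && typeof body === "object" && body.ok === true;
  } catch {
    return false; // e.g. an HTML error page served by the reverse proxy
  }
}

// `baseUrl` is whatever public host the proxy publishes for FaceAI.
async function checkHealth(baseUrl) {
  const response = await fetch(new URL("/health", baseUrl));
  return response.ok && isHealthy(await response.text());
}
```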
### Current Production Limitations
This scaffold can now be deployed with the public site, processor, and Redis, but it still has important limitations:
- search state is short-lived in Redis and is not backed by a durable database
- runtime uploads and matcher output still need an agreed production retention and cleanup policy
- the PKL mount contract is now defined, but final NAS operations and cleanup policy still need to be hardened
- the backend currently sets the FaceAI session cookie with `secure: false`, which should be hardened before final public rollout
- the local simulator endpoints under `/dev/*` are still present in the app and should be treated as non-production scaffolding
- the processor CSV parser is still based on the current scaffolded matcher output assumptions
So the Compose deployment is appropriate for hosted integration and controlled production-like rollout, but not yet for the final hardened architecture described in the integration plan.
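As one example of the hardening still outstanding, the session cookie options could be derived from the environment so that `secure` is enabled in production. This is a sketch in the option shape accepted by Express's `res.cookie(name, value, options)`; the `sameSite` and `maxAge` values are assumptions, not the scaffold's current settings.

```javascript
// Builds cookie options for the FaceAI session cookie.
function sessionCookieOptions(nodeEnv = process.env.NODE_ENV) {
  return {
    httpOnly: true, // not readable from page scripts
    secure: nodeEnv === "production", // HTTPS-only once deployed behind the proxy
    sameSite: "lax", // assumed value; still sent on top-level navigations
    path: "/",
    maxAge: 30 * 60 * 1000, // assumed 30-minute lifetime, in milliseconds
  };
}
```

Keeping `secure` off outside production lets the local stack on `http://localhost:3001` continue to work unchanged.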
## Environment
Defaults are already set for local development, but these can be overridden:
```text
PORT=3001
FACEAI_FRONTEND_URL=http://localhost:5173
FACEAI_PUBLIC_BASE_URL=http://localhost:3001
FACEAI_LEGACY_RETURN_URL=http://localhost:3001/dev/legacy/return
FACEAI_SHARED_SECRET=change-me
FACEAI_SESSION_COOKIE=rus_faceai_session
FACEAI_REDIS_URL=redis://redis:6379
FACEAI_QUEUE_NAME=faceai-searches
FACEAI_RUNTIME_ROOT=/data/runtime
FACEAI_UPLOAD_ROOT=/data/runtime/uploads
FACEAI_LOG_ROOT=/data/logs
FACEAI_PKL_ROOT=/data/pkl
FACEAI_MATCHER_BINARY=/opt/face-recognition/face_matcher
```
If you want FaceAI to return through the new PHP bridge prepared under `www`, point `FACEAI_LEGACY_RETURN_URL` to that endpoint instead, for example `http://localhost/faceai_return.php` or the equivalent URL in your local PHP setup.
In the provided Docker Compose stack, that wiring is already done with:
```text
FACEAI_LEGACY_RETURN_URL=http://localhost:8080/faceai_return.php
```
The log wiring is also already done in the checked-in Compose file with a host bind mount for `./logs:/data/logs`, so both the backend and the processor write persistent diagnostics into the workspace.
The local PHP simulator also needs the legacy bridge feature flag enabled:
```text
FACEAI_FEATURE_ENABLED=1
```
The checked-in `docker-compose.yml` now sets that on the `legacy-php` service so the simulator can launch the FaceAI handoff flow locally.
## Notes
- Search orchestration now uses Redis and a dedicated processor worker.
- The checked-in Compose file is meant for local integration testing, not for production use as-is.
- The local legacy simulator is intentionally backend-driven so the handoff can be tested without compiling the existing Java application.
- `www/faceai_simulator.php` exists only for local testing. It does not replace the actual JSP race page.
- The final legacy integration still needs a real signed identity source and a real return-filter implementation on the old site.