feat: add processor service with Redis-backed job queue

- Introduced a new `processor` service in the Docker Compose setup to handle face matching jobs.
- Configured Redis as a job queue and state management system for processing searches.
- Updated the backend to enqueue jobs and manage user locks using Redis.
- Added environment variables for Redis configuration and runtime paths.
- Created technical design documentation for the processor service outlining architecture, queue model, and search lifecycle.
- Updated package.json and package-lock.json to include dependencies for BullMQ and ioredis in the processor workspace.
- Added sample PKL files for local testing in the `test_pkl` directory.

parent d5cdcd3332
commit 81a1ac85af
20 changed files with 1313 additions and 108 deletions
166
faceai/docs/processor-technical-design.md
Normal file
@@ -0,0 +1,166 @@
# FaceAI Processor Technical Design

## Goal

Add an internal processor service that executes `face_matcher` jobs for the public FaceAI site, while preventing duplicate searches per user and keeping all state short-lived and restart-safe.

## Scope Of This Slice

- add Redis-backed queue and job state
- add a dedicated `processor` workspace and container scaffold
- replace in-memory search orchestration in the public backend
- preserve the existing frontend polling and legacy return flow
- support local PKL testing from `test_pkl/`

This slice does not yet implement production NAS mounting, persistent databases, or a final parser tailored to the real matcher CSV format.

## Runtime Architecture

### Public backend

- owns the authenticated API used by the Vue frontend
- stores uploaded selfies in a shared runtime volume
- enqueues jobs into BullMQ
- keeps per-search state, results, rate limits, and active-user locks in Redis
- never executes `face_matcher` directly

### Processor

- consumes queue jobs from Redis using BullMQ worker concurrency
- resolves the race-scoped PKL path for each job
- executes the Linux `face_matcher` binary
- parses the CSV result into legacy-compatible `photoId` matches
- writes final state and result payload back to Redis

### Redis

- queue broker for BullMQ
- source of truth for active-user locks
- source of truth for search status and short-lived results
- source of truth for rate-limit counters

## Queue And Locking Model

- queue name: `faceai-searches`
- active lock key: `faceai:active-search:user:{legacyUserId}`
- search record key: `faceai:search:{searchId}`
- result record key: `faceai:result:{resultId}`
- rate limit key prefix: `faceai:rate-limit:{legacyUserId}`

`POST /api/searches` must acquire the active-user lock before enqueueing. If the lock already exists, the backend returns `409` with error code `ACTIVE_SEARCH_EXISTS`.

The lock is released only when the processor marks the search as terminal: `completed`, `failed`, or `timed_out`.

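The key patterns and the acquire-before-enqueue rule above could be sketched roughly as follows. This is an illustration only, not the actual backend code: `LockStore`, `createMemoryLockStore`, `acquireActiveSearchLock`, and `releaseLockIfTerminal` are hypothetical names, the store is synchronous and in-memory so the sketch runs without Redis, and the TTL argument is an assumed safety net rather than something this section specifies.

```typescript
// Queue name and key patterns as listed in the design.
const QUEUE_NAME = "faceai-searches";

const keys = {
  activeLock: (legacyUserId: string): string => `faceai:active-search:user:${legacyUserId}`,
  search: (searchId: string): string => `faceai:search:${searchId}`,
  result: (resultId: string): string => `faceai:result:${resultId}`,
  rateLimit: (legacyUserId: string): string => `faceai:rate-limit:${legacyUserId}`,
};

// Hypothetical store interface (an assumption, not part of the design).
// Against Redis, setIfAbsent would correspond to `SET key value NX PX ttlMs`
// and del to `DEL key`.
interface LockStore {
  setIfAbsent(key: string, value: string, ttlMs: number): boolean;
  del(key: string): void;
}

// In-memory stand-in so the sketch runs without a Redis server.
function createMemoryLockStore(now: () => number = () => Date.now()): LockStore {
  const entries = new Map<string, { value: string; expiresAt: number }>();
  return {
    setIfAbsent(key, value, ttlMs) {
      const current = entries.get(key);
      if (current !== undefined && current.expiresAt > now()) return false;
      entries.set(key, { value, expiresAt: now() + ttlMs });
      return true;
    },
    del(key) {
      entries.delete(key);
    },
  };
}

type AcquireResult =
  | { ok: true }
  | { ok: false; status: 409; errorCode: "ACTIVE_SEARCH_EXISTS" };

// Would be called by POST /api/searches before enqueueing the job.
function acquireActiveSearchLock(
  store: LockStore,
  legacyUserId: string,
  searchId: string,
  ttlMs: number,
): AcquireResult {
  const acquired = store.setIfAbsent(keys.activeLock(legacyUserId), searchId, ttlMs);
  return acquired ? { ok: true } : { ok: false, status: 409, errorCode: "ACTIVE_SEARCH_EXISTS" };
}

const TERMINAL_STATUSES = ["completed", "failed", "timed_out"];

// Would be called by the processor; deletes the lock only for terminal statuses.
function releaseLockIfTerminal(store: LockStore, legacyUserId: string, status: string): boolean {
  if (TERMINAL_STATUSES.indexOf(status) === -1) return false;
  store.del(keys.activeLock(legacyUserId));
  return true;
}
```

Doing the check and the write as one atomic set-if-absent avoids a gap in which two concurrent requests from the same user could both pass a separate "does the lock exist?" check.
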
## Race And PKL Resolution

The canonical race key is the legacy `id_gara`, already exposed as `raceId` in the existing handoff flow.

The processor resolves the PKL path using a race-based directory layout:

```text
/data/pkl/
  101/
    face_encodings.pkl
  202/
    face_encodings.pkl
```

The lookup rule is:

1. try `/data/pkl/{raceId}/face_encodings.pkl`
2. optionally fall back to `/data/pkl/{raceId}.pkl`
3. fail the job if neither exists

For local development, `test_pkl/` is mounted into `/data/pkl/test` and the backend can fall back to the first `.pkl` file in that folder when no race-specific file exists yet.

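The three-step lookup rule could look roughly like the sketch below. The function names, the injected `exists` predicate, and the `raceId` character check are assumptions made for illustration; the local `test_pkl` fallback is deliberately left out.

```typescript
// Predicate that reports whether a path exists. Injected so the sketch has no
// filesystem dependency; in Node, `fs.existsSync` would fit this signature.
type Exists = (path: string) => boolean;

// Builds the candidate paths in lookup order.
function pklCandidates(raceId: string, root: string = "/data/pkl", allowFlatFallback: boolean = true): string[] {
  // Assumption: race ids are simple tokens. Rejecting anything else keeps a
  // crafted id such as "../x" from escaping the PKL root.
  if (!/^[A-Za-z0-9_-]+$/.test(raceId)) {
    throw new Error(`invalid raceId: ${raceId}`);
  }
  const candidates = [`${root}/${raceId}/face_encodings.pkl`]; // step 1
  if (allowFlatFallback) {
    candidates.push(`${root}/${raceId}.pkl`); // step 2 (optional)
  }
  return candidates;
}

// Returns the first existing candidate, or throws so the job can be failed (step 3).
function resolvePklPath(
  raceId: string,
  exists: Exists,
  root: string = "/data/pkl",
  allowFlatFallback: boolean = true,
): string {
  const candidates = pklCandidates(raceId, root, allowFlatFallback);
  for (let i = 0; i < candidates.length; i++) {
    if (exists(candidates[i])) return candidates[i];
  }
  throw new Error(`no PKL file found for race ${raceId}`);
}
```

Keeping candidate generation separate from the existence check makes the lookup order easy to unit test without touching the disk.
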
## Shared Runtime Storage

Both the public backend and the processor mount the same writable runtime directory:

```text
/data/runtime/
  uploads/
  searches/
```

- uploaded selfies are written under `uploads/{searchId}/`
- worker output and logs are written under `searches/{searchId}/`
- cleanup can safely remove old per-search directories after retention expires

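As a rough illustration of the per-search layout and the retention-based cleanup mentioned above, the sketch below derives both directories from a `searchId` and selects the ones old enough to delete. `RuntimeEntry`, `uploadDir`, `searchDir`, and `expiredDirs` are hypothetical names, and the sketch assumes timestamps are epoch milliseconds.

```typescript
const RUNTIME_ROOT = "/data/runtime";

// Directory holding the uploaded selfie for one search.
function uploadDir(searchId: string, root: string = RUNTIME_ROOT): string {
  return `${root}/uploads/${searchId}`;
}

// Directory holding worker output and logs for one search.
function searchDir(searchId: string, root: string = RUNTIME_ROOT): string {
  return `${root}/searches/${searchId}`;
}

// Hypothetical minimal view of a search used by cleanup.
interface RuntimeEntry {
  searchId: string;
  createdAt: number; // assumed to be epoch milliseconds
}

// Returns every per-search directory whose retention period has elapsed.
function expiredDirs(
  entries: RuntimeEntry[],
  nowMs: number,
  retentionMs: number,
  root: string = RUNTIME_ROOT,
): string[] {
  const dirs: string[] = [];
  entries
    .filter((entry) => nowMs - entry.createdAt >= retentionMs)
    .forEach((entry) => {
      dirs.push(uploadDir(entry.searchId, root), searchDir(entry.searchId, root));
    });
  return dirs;
}
```

A cleanup job could then remove each returned directory, for example with Node's `fs.promises.rm(dir, { recursive: true, force: true })`. Because every search has its own directory, removal never touches files that belong to another search.
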
## Search Lifecycle

1. frontend uploads a selfie and calls `POST /api/searches`
2. backend validates session, rate limit, and active-user lock
3. backend stores the upload and creates a Redis search record with status `queued`
4. backend enqueues a BullMQ job
5. processor picks up the job and sets status `processing`
6. processor runs `face_matcher`
7. processor parses CSV output into matches
8. processor stores a result record and marks the search `completed`
9. frontend polling reads Redis-backed state through `GET /api/searches/:id`
10. existing redirect flow sends the user back to the legacy filtered page

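The status changes in the steps above can be read as a small state machine. The sketch below encodes it under one assumption that the design does not state explicitly: a search only moves `queued` to `processing` and then to one of the terminal statuses, with no other transitions. All names are illustrative.

```typescript
type SearchStatus = "queued" | "processing" | "completed" | "failed" | "timed_out";

// Assumed transition table: terminal statuses have no outgoing transitions.
const ALLOWED_TRANSITIONS: Record<SearchStatus, SearchStatus[]> = {
  queued: ["processing"],
  processing: ["completed", "failed", "timed_out"],
  completed: [],
  failed: [],
  timed_out: [],
};

function isTerminal(status: SearchStatus): boolean {
  return ALLOWED_TRANSITIONS[status].length === 0;
}

function canTransition(from: SearchStatus, to: SearchStatus): boolean {
  return ALLOWED_TRANSITIONS[from].indexOf(to) !== -1;
}

// The lifecycle-related subset of the search record.
interface LifecycleFields {
  status: SearchStatus;
  startedAt: number | null;
  completedAt: number | null;
}

// Returns an updated copy; throws on a transition the table does not allow.
function applyTransition(record: LifecycleFields, to: SearchStatus, nowMs: number): LifecycleFields {
  if (!canTransition(record.status, to)) {
    throw new Error(`illegal transition ${record.status} -> ${to}`);
  }
  return {
    status: to,
    startedAt: to === "processing" ? nowMs : record.startedAt,
    completedAt: isTerminal(to) ? nowMs : record.completedAt,
  };
}
```

Rejecting illegal transitions in one place helps when a job is retried or redelivered, since a late `processing` update cannot overwrite a search that is already terminal.
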
## Search Record Shape

```json
{
  "id": "search_...",
  "status": "queued",
  "raceId": "101",
  "userId": "legacy-user-1",
  "returnUrl": "https://...",
  "lang": "it",
  "selfieName": "selfie.jpg",
  "selfiePath": "/data/runtime/uploads/search_.../selfie.jpg",
  "resultId": null,
  "matchCount": 0,
  "errorCode": null,
  "errorMessage": null,
  "createdAt": 0,
  "startedAt": null,
  "completedAt": null
}
```

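One way to mirror this record in code is an interface plus a small factory for the initial `queued` state. The field types are inferred from the JSON example and may not match the real implementation; `NewSearchInput` and `createQueuedSearch` are hypothetical, the id is supplied by the caller, and timestamps are assumed to be epoch milliseconds.

```typescript
type SearchStatus = "queued" | "processing" | "completed" | "failed" | "timed_out";

// Field types inferred from the JSON example above.
interface SearchRecord {
  id: string;
  status: SearchStatus;
  raceId: string;
  userId: string;
  returnUrl: string;
  lang: string;
  selfieName: string;
  selfiePath: string;
  resultId: string | null;
  matchCount: number;
  errorCode: string | null;
  errorMessage: string | null;
  createdAt: number; // assumed epoch milliseconds
  startedAt: number | null;
  completedAt: number | null;
}

// Hypothetical input type: the fields known when the search is created.
interface NewSearchInput {
  id: string;
  raceId: string;
  userId: string;
  returnUrl: string;
  lang: string;
  selfieName: string;
}

// Builds the record the backend would store before enqueueing the job.
function createQueuedSearch(input: NewSearchInput, nowMs: number): SearchRecord {
  return {
    id: input.id,
    status: "queued",
    raceId: input.raceId,
    userId: input.userId,
    returnUrl: input.returnUrl,
    lang: input.lang,
    selfieName: input.selfieName,
    // Follows the uploads/{searchId}/ layout of the shared runtime volume.
    selfiePath: `/data/runtime/uploads/${input.id}/${input.selfieName}`,
    resultId: null,
    matchCount: 0,
    errorCode: null,
    errorMessage: null,
    createdAt: nowMs,
    startedAt: null,
    completedAt: null,
  };
}
```

Since the record lives in Redis, it would need to be serialized, for example with `JSON.stringify`, before being written under the search record key.
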
## Result Shape

```json
{
  "id": "result_...",
  "raceId": "101",
  "raceName": "Mezza di Firenze",
  "userId": "legacy-user-1",
  "returnUrl": "https://...",
  "lang": "it",
  "matches": [
    {
      "photoId": "legacy-photo-id",
      "score": 0.98,
      "label": "legacy-photo-id"
    }
  ],
  "createdAt": 0
}
```

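The sketch below types this payload and shows how `matches` might be filled from matcher output. The CSV layout is purely an assumption: the real columns emitted by `face_matcher` are not yet confirmed, so a header row followed by `photo_id,score` rows is used here as a placeholder. `parseMatcherCsv` is a hypothetical helper, and `label` is set to the photo id only because the example above does the same.

```typescript
interface Match {
  photoId: string;
  score: number;
  label: string;
}

// Field types inferred from the JSON example above.
interface SearchResult {
  id: string;
  raceId: string;
  raceName: string;
  userId: string;
  returnUrl: string;
  lang: string;
  matches: Match[];
  createdAt: number; // assumed epoch milliseconds
}

// ASSUMED CSV format (not confirmed): one header line, then `photo_id,score`.
// Rows that are blank, incomplete, or carry a non-numeric score are skipped.
function parseMatcherCsv(csv: string): Match[] {
  const lines = csv
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0);

  const matches: Match[] = [];
  // Start at index 1 to skip the assumed header row.
  for (let i = 1; i < lines.length; i++) {
    const cells = lines[i].split(",").map((cell) => cell.trim());
    const photoId = cells[0];
    const rawScore = cells[1];
    if (!photoId || rawScore === undefined || rawScore === "") continue;
    const score = Number(rawScore);
    if (!isFinite(score)) continue;
    matches.push({ photoId, score, label: photoId });
  }
  return matches;
}
```

Once the actual column layout is known, only `parseMatcherCsv` should need to change; the `Match` and `SearchResult` shapes stay tied to the legacy `photoId` contract.
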
## Compose Topology

- `faceai`: public backend plus built frontend
- `processor`: queue consumer and matcher executor
- `redis`: queue and short-lived state
- `legacy-php`: local bridge simulator for end-to-end testing

## Operational Defaults

- worker concurrency: `2`
- active search retention: `24h`
- result retention: `24h`
- rate limit window: `5 requests / 10 minutes / user`
- worker timeout: `5 minutes`

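For illustration, these defaults can be expressed as constants in milliseconds, together with a simple limiter for the `5 requests / 10 minutes / user` rule. The design does not say which limiting algorithm is used, so the fixed-window counter below is an assumption, and `DEFAULTS` and `createRateLimiter` are hypothetical names. The counter is kept in memory so the sketch runs standalone.

```typescript
const MINUTE_MS = 60 * 1000;
const HOUR_MS = 60 * MINUTE_MS;

// The operational defaults listed above, with durations in milliseconds.
const DEFAULTS = {
  workerConcurrency: 2,
  activeSearchRetentionMs: 24 * HOUR_MS,
  resultRetentionMs: 24 * HOUR_MS,
  rateLimitMaxRequests: 5,
  rateLimitWindowMs: 10 * MINUTE_MS,
  workerTimeoutMs: 5 * MINUTE_MS,
};

// Fixed-window limiter (assumed algorithm). Returns a function that reports
// whether one more request from the given user is allowed right now.
function createRateLimiter(
  maxRequests: number = DEFAULTS.rateLimitMaxRequests,
  windowMs: number = DEFAULTS.rateLimitWindowMs,
  now: () => number = () => Date.now(),
): (userId: string) => boolean {
  const windows = new Map<string, { startedAt: number; count: number }>();
  return (userId: string): boolean => {
    const t = now();
    const current = windows.get(userId);
    // Open a fresh window on first use or once the previous one has elapsed.
    if (current === undefined || t - current.startedAt >= windowMs) {
      windows.set(userId, { startedAt: t, count: 1 });
      return true;
    }
    if (current.count >= maxRequests) return false;
    current.count += 1;
    return true;
  };
}
```

With Redis as the source of truth for counters, the same idea is commonly built from `INCR` on the per-user rate-limit key plus `EXPIRE` when the counter is first created, so that all backend instances share one count.
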
## Known Follow-Up Work

- confirm the real CSV columns emitted by `face_matcher`
- verify the Linux binary shared library requirements inside the processor image
- replace the PKL fallback with a strict NAS-backed race mapping once the final folder layout is agreed
- add cleanup jobs for expired runtime files