FaceAI Processor Technical Design
Goal
Add an internal processor service that executes face_matcher jobs for the public FaceAI site, while preventing duplicate searches per user and keeping all state short-lived and restart-safe.
Scope Of This Slice
- add Redis-backed queue and job state
- add a dedicated processor workspace and container scaffold
- replace in-memory search orchestration in the public backend
- preserve the existing frontend polling and legacy return flow
- support local PKL testing from test_pkl/, mounted with the same directory shape used in hosted deployment
This slice does not yet implement production NAS mounting, persistent databases, or a final parser tailored to the real matcher CSV format.
Runtime Architecture
Public backend
- owns the authenticated API used by the Vue frontend
- stores uploaded selfies in a shared runtime volume
- enqueues jobs into BullMQ
- keeps per-search state, results, rate limits, and active-user locks in Redis
- never executes face_matcher directly
Processor
- consumes queue jobs from Redis using BullMQ worker concurrency
- resolves the race-scoped PKL path for each job
- executes the Linux face_matcher binary
- parses the CSV result into legacy-compatible photoId matches
- writes final state and result payload back to Redis
Redis
- queue broker for BullMQ
- source of truth for active-user locks
- source of truth for search status and short-lived results
- source of truth for rate-limit counters
Queue And Locking Model
- queue name: faceai-searches
- active lock key: faceai:active-search:user:{legacyUserId}
- search record key: faceai:search:{searchId}
- result record key: faceai:result:{resultId}
- rate limit key prefix: faceai:rate-limit:{legacyUserId}
POST /api/searches must acquire the active-user lock before enqueueing. If the lock already exists, the backend returns 409 with error code ACTIVE_SEARCH_EXISTS.
The lock is released only when the processor marks the search as terminal: completed, failed, or timed_out.
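The key scheme and the lock rule above can be sketched as follows. This is a minimal illustration under stated assumptions, not the backend's actual code: LockStore, InMemoryLockStore, tryStartSearch, and releaseActiveLock are hypothetical names, and the in-memory store stands in for Redis so the example runs on its own. Against a real Redis server, setIfAbsent would correspond to the SET command with the NX option.

```typescript
// Key builders mirroring the key patterns listed above.
const keys = {
  activeLock: (legacyUserId: string): string => `faceai:active-search:user:${legacyUserId}`,
  search: (searchId: string): string => `faceai:search:${searchId}`,
  result: (resultId: string): string => `faceai:result:${resultId}`,
  rateLimit: (legacyUserId: string): string => `faceai:rate-limit:${legacyUserId}`,
};

// Hypothetical storage interface; with real Redis, setIfAbsent maps to
// SET with NX and remove maps to DEL.
interface LockStore {
  setIfAbsent(key: string, value: string): Promise<boolean>;
  remove(key: string): Promise<void>;
}

// In-memory stand-in so the sketch runs without a Redis server.
class InMemoryLockStore implements LockStore {
  private readonly entries = new Map<string, string>();

  async setIfAbsent(key: string, value: string): Promise<boolean> {
    if (this.entries.has(key)) return false;
    this.entries.set(key, value);
    return true;
  }

  async remove(key: string): Promise<void> {
    this.entries.delete(key);
  }
}

type StartOutcome =
  | { ok: true; searchId: string }
  | { ok: false; httpStatus: 409; errorCode: "ACTIVE_SEARCH_EXISTS" };

// Acquire the per-user lock first; enqueue only if the lock was obtained.
async function tryStartSearch(
  store: LockStore,
  legacyUserId: string,
  searchId: string,
): Promise<StartOutcome> {
  const acquired = await store.setIfAbsent(keys.activeLock(legacyUserId), searchId);
  if (!acquired) {
    return { ok: false, httpStatus: 409, errorCode: "ACTIVE_SEARCH_EXISTS" };
  }
  // The BullMQ enqueue would go here, after the lock is held.
  return { ok: true, searchId };
}

// Called once a search reaches completed, failed, or timed_out.
async function releaseActiveLock(store: LockStore, legacyUserId: string): Promise<void> {
  await store.remove(keys.activeLock(legacyUserId));
}
```

Acquiring the lock before enqueueing means a second request from the same user is rejected without ever touching the queue.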
Race And PKL Resolution
The canonical race key is still the legacy id_gara, but the worker no longer guesses the PKL path from raceId alone.
The legacy handoff must provide a raceStorage object with:
- year
- monthFolder like 04.APRILE
- raceFolder like PISA
The processor resolves the PKL path using this mounted directory layout:
/data/pkl/
  2026/
    04.APRILE/
      PISA/
        face_encodings_20260330_170210.pkl
      LUCCA/
        face_encodings_20260330_170155.pkl
The lookup rule is:
- resolve /data/pkl/{year}/{monthFolder}/{raceFolder}
- list files at that race root
- take the first .pkl file found there, regardless of filename
- fail the job if the directory does not exist or contains no .pkl file
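The lookup rule above could look roughly like this. The sketch makes two assumptions that the design does not state: directory listing is injected through a hypothetical ListDir callback (so the example runs without a mounted volume), and file names are sorted before picking the first .pkl, because a raw directory listing has no guaranteed order. The function name resolvePklPath is also illustrative.

```typescript
interface RaceStorage {
  year: string;
  monthFolder: string;
  raceFolder: string;
}

// Hypothetical callback: returns the file names in a directory,
// or null when the directory does not exist.
type ListDir = (dir: string) => string[] | null;

const PKL_ROOT = "/data/pkl";

function resolvePklPath(
  storage: RaceStorage,
  listDir: ListDir,
  root: string = PKL_ROOT,
): string {
  const raceDir = [root, storage.year, storage.monthFolder, storage.raceFolder].join("/");

  const entries = listDir(raceDir);
  if (entries === null) {
    throw new Error(`PKL directory not found: ${raceDir}`);
  }

  // Assumption: sort names so that "first" is deterministic across hosts.
  const pklName = [...entries].sort().find((name) => name.endsWith(".pkl"));
  if (pklName === undefined) {
    throw new Error(`No .pkl file in ${raceDir}`);
  }

  return `${raceDir}/${pklName}`;
}
```

In Node.js, the callback could wrap fs.readdirSync and return null when the directory is missing. The same function can serve both the backend pre-check and the processor, which keeps the rule identical in every environment.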
For local development, test_pkl/ is mounted directly into /data/pkl in both the public FaceAI container and the processor container, so the same rule is used in every environment.
Shared Runtime Storage
Both the public backend and the processor mount the same writable runtime directory:
/data/runtime/
  uploads/
  searches/
- uploaded selfies are written under uploads/{searchId}/
- worker output and logs are written under searches/{searchId}/
- cleanup can safely remove old per-search directories after retention expires
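Because every artifact lives under a per-search directory, both the path helpers and the cleanup selection are simple. The following sketch uses hypothetical names (uploadDir, searchDir, expiredSearchIds) and takes the retention period as a parameter, since the design does not fix a retention value for runtime files.

```typescript
const RUNTIME_ROOT = "/data/runtime";

// Directory holding the uploaded selfie for one search.
function uploadDir(searchId: string): string {
  return `${RUNTIME_ROOT}/uploads/${searchId}`;
}

// Directory holding worker output and logs for one search.
function searchDir(searchId: string): string {
  return `${RUNTIME_ROOT}/searches/${searchId}`;
}

// Assumed shape: one entry per per-search directory found on disk.
interface SearchDirInfo {
  searchId: string;
  createdAtMs: number;
}

// Returns the ids whose directories are older than the retention period.
function expiredSearchIds(
  dirs: readonly SearchDirInfo[],
  nowMs: number,
  retentionMs: number,
): string[] {
  return dirs
    .filter((dir) => nowMs - dir.createdAtMs > retentionMs)
    .map((dir) => dir.searchId);
}
```

A cleanup job would then remove uploadDir(id) and searchDir(id) for each returned id.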
Search Lifecycle
- frontend uploads a selfie and calls POST /api/searches
- backend validates session, rate limit, and active-user lock
- backend verifies that the mounted race directory exists and already contains a .pkl; if not, it rejects the request before queueing
- backend stores the upload and creates a Redis search record with status queued
- backend enqueues a BullMQ job
- processor picks up the job and sets status processing
- processor runs face_matcher
- processor parses CSV output into matches
- processor stores a result record and marks the search completed
- frontend polling reads Redis-backed state through GET /api/searches/:id
- existing redirect flow sends the user back to the legacy filtered page
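The status changes in this lifecycle form a small state machine. The sketch below encodes one reading of it: queued moves to processing, and processing ends in completed, failed, or timed_out. The transition table and the function names are assumptions inferred from the steps above, not a confirmed specification.

```typescript
type SearchStatus = "queued" | "processing" | "completed" | "failed" | "timed_out";

// Terminal statuses are the ones that release the active-user lock.
const TERMINAL_STATUSES: ReadonlySet<SearchStatus> = new Set<SearchStatus>([
  "completed",
  "failed",
  "timed_out",
]);

// Assumed transition table, inferred from the lifecycle steps.
const ALLOWED_TRANSITIONS: Record<SearchStatus, readonly SearchStatus[]> = {
  queued: ["processing"],
  processing: ["completed", "failed", "timed_out"],
  completed: [],
  failed: [],
  timed_out: [],
};

function isTerminal(status: SearchStatus): boolean {
  return TERMINAL_STATUSES.has(status);
}

// Returns the next status, or throws if the move is not allowed.
function transition(current: SearchStatus, next: SearchStatus): SearchStatus {
  if (!ALLOWED_TRANSITIONS[current].includes(next)) {
    throw new Error(`Invalid status transition: ${current} -> ${next}`);
  }
  return next;
}
```

Guarding every status write with a check like this keeps a late or duplicated job event from moving a finished search back to processing.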
Search Record Shape
{
  "id": "search_...",
  "status": "queued",
  "raceId": "101",
  "raceStorage": {
    "year": "2026",
    "monthFolder": "04.APRILE",
    "raceFolder": "PISA"
  },
  "userId": "legacy-user-1",
  "returnUrl": "https://...",
  "lang": "it",
  "selfieName": "selfie.jpg",
  "selfiePath": "/data/runtime/uploads/search_.../selfie.jpg",
  "resultId": null,
  "matchCount": 0,
  "errorCode": null,
  "errorMessage": null,
  "createdAt": 0,
  "startedAt": null,
  "completedAt": null
}
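The record above maps naturally onto a typed structure. The following is a sketch only: the interface mirrors the JSON fields, while the factory createQueuedSearch, its input shape, and the reading of the timestamps as epoch milliseconds are assumptions made for illustration.

```typescript
interface RaceStorage {
  year: string;
  monthFolder: string;
  raceFolder: string;
}

type SearchStatus = "queued" | "processing" | "completed" | "failed" | "timed_out";

interface SearchRecord {
  id: string;
  status: SearchStatus;
  raceId: string;
  raceStorage: RaceStorage;
  userId: string;
  returnUrl: string;
  lang: string;
  selfieName: string;
  selfiePath: string;
  resultId: string | null;
  matchCount: number;
  errorCode: string | null;
  errorMessage: string | null;
  createdAt: number; // assumed to be epoch milliseconds
  startedAt: number | null;
  completedAt: number | null;
}

// Hypothetical input for the factory; not taken from the real API.
interface NewSearchInput {
  id: string;
  raceId: string;
  raceStorage: RaceStorage;
  userId: string;
  returnUrl: string;
  lang: string;
  selfieName: string;
}

// Builds the initial record stored in Redis before the job is enqueued.
function createQueuedSearch(input: NewSearchInput, nowMs: number): SearchRecord {
  return {
    ...input,
    status: "queued",
    selfiePath: `/data/runtime/uploads/${input.id}/${input.selfieName}`,
    resultId: null,
    matchCount: 0,
    errorCode: null,
    errorMessage: null,
    createdAt: nowMs,
    startedAt: null,
    completedAt: null,
  };
}
```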
Result Shape
{
  "id": "result_...",
  "raceId": "101",
  "raceName": "Mezza di Firenze",
  "userId": "legacy-user-1",
  "returnUrl": "https://...",
  "lang": "it",
  "matches": [
    {
      "photoId": "legacy-photo-id",
      "score": 0.98,
      "label": "legacy-photo-id"
    }
  ],
  "createdAt": 0
}
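To illustrate how CSV output might become the matches array, here is a placeholder parser. The real columns emitted by face_matcher are not yet confirmed, so the layout used here (a header row followed by photo id and score columns) is purely hypothetical, as is the function name parseMatcherCsv. Setting label to the photo id follows the sample result above.

```typescript
interface Match {
  photoId: string;
  score: number;
  label: string;
}

// Placeholder parser. ASSUMED CSV layout (not confirmed by the matcher):
//   photo_id,score
//   legacy-photo-id,0.98
function parseMatcherCsv(csv: string): Match[] {
  const lines = csv
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0);

  const matches: Match[] = [];
  // Skip the first line, which is assumed to be a header row.
  for (const line of lines.slice(1)) {
    const [photoId, rawScore] = line.split(",").map((cell) => cell.trim());
    const score = Number(rawScore);
    // Drop rows with a missing id, a missing score, or a non-numeric score.
    if (!photoId || !rawScore || !Number.isFinite(score)) continue;
    matches.push({ photoId, score, label: photoId });
  }
  return matches;
}
```

Malformed rows are skipped rather than failing the job; whether that is the right policy depends on the confirmed matcher output.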
Compose Topology
- faceai: public backend plus built frontend
- processor: queue consumer and matcher executor
- redis: queue and short-lived state
- legacy-php: local bridge simulator for end-to-end testing
Operational Defaults
- worker concurrency: 2
- active search retention: 24h
- result retention: 24h
- rate limit window: 5 requests / 10 minutes / user
- worker timeout: 5 minutes
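These defaults can be collected into one configuration object, and the rate limit can be illustrated with a fixed-window counter. The design does not say whether the window is fixed or sliding, so the fixed window is an assumption, and DEFAULTS and FixedWindowRateLimiter are hypothetical names. The in-memory map stands in for Redis, where the same idea is commonly built from INCR plus EXPIRE on the rate-limit key.

```typescript
const DEFAULTS = {
  workerConcurrency: 2,
  activeSearchRetentionMs: 24 * 60 * 60 * 1000,
  resultRetentionMs: 24 * 60 * 60 * 1000,
  rateLimitMaxRequests: 5,
  rateLimitWindowMs: 10 * 60 * 1000,
  workerTimeoutMs: 5 * 60 * 1000,
} as const;

interface WindowState {
  count: number;
  expiresAtMs: number;
}

// Fixed-window limiter: the window opens on a user's first request
// and all counts reset once it expires.
class FixedWindowRateLimiter {
  private readonly windows = new Map<string, WindowState>();
  private readonly maxRequests: number;
  private readonly windowMs: number;

  constructor(maxRequests: number, windowMs: number) {
    this.maxRequests = maxRequests;
    this.windowMs = windowMs;
  }

  // Returns true if the request is allowed and counts it.
  allow(userId: string, nowMs: number): boolean {
    const current = this.windows.get(userId);
    if (current === undefined || nowMs >= current.expiresAtMs) {
      this.windows.set(userId, { count: 1, expiresAtMs: nowMs + this.windowMs });
      return true;
    }
    if (current.count >= this.maxRequests) return false;
    current.count += 1;
    return true;
  }
}
```

A backend would construct the limiter with DEFAULTS.rateLimitMaxRequests and DEFAULTS.rateLimitWindowMs and call allow before attempting to take the active-user lock.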
Known Follow-Up Work
- confirm the real CSV columns emitted by face_matcher
- verify the Linux binary shared library requirements inside the processor image
- add cleanup jobs for expired runtime files