r/AWS_cloud
I’m designing a bulk CV upload feature for an ATS (Applicant Tracking System) and would love some feedback on the architecture.
Tech Stack
- Frontend: Next.js
- Backend: FastAPI (Python) running on an EC2 instance
- Cloud/Infra: AWS (S3, SQS, Lambda)
- Database: AWS DocumentDB
The Requirement
Users need to upload batches of CVs (PDF or DOCX; max 50 files per batch, max 10 MB per file). The system needs to parse the text from these files, extract candidate metadata (name, email, phone), and insert the records into our DocumentDB.
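To make the limits concrete, here is a rough sketch of the check the API could run before handing out any upload URLs. The names (`FileInfo`, `validate_batch`) are made up for illustration, and I'm assuming "10 MB" means 10 MiB and that the client sends each file's name and size up front.

```python
from dataclasses import dataclass
from pathlib import Path

MAX_FILES = 50                       # batch limit from the requirement
MAX_FILE_BYTES = 10 * 1024 * 1024    # assumption: "10 MB" read as 10 MiB
ALLOWED_EXTENSIONS = {".pdf", ".docx"}


@dataclass
class FileInfo:
    """Hypothetical shape of what the client declares for each file."""
    name: str
    size_bytes: int


def validate_batch(files: list[FileInfo]) -> list[str]:
    """Return a list of human-readable errors; an empty list means the batch is OK."""
    errors = []
    if not files:
        errors.append("batch is empty")
    if len(files) > MAX_FILES:
        errors.append(f"too many files: {len(files)} > {MAX_FILES}")
    for f in files:
        ext = Path(f.name).suffix.lower()
        if ext not in ALLOWED_EXTENSIONS:
            errors.append(f"{f.name}: unsupported type {ext or '(none)'}")
        if f.size_bytes > MAX_FILE_BYTES:
            errors.append(f"{f.name}: {f.size_bytes} bytes exceeds {MAX_FILE_BYTES}")
    return errors
```

Since the sizes here are only what the client claims, the limit would probably also need to be enforced on the S3 side.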
The Problem We Are Solving
Currently, CV parsing is a heavy, CPU-intensive task running synchronously inside our FastAPI application.
The Proposed Architecture
We are moving to an event-driven architecture to completely decouple the parsing from the web server.
- Direct-to-S3: Next.js requests presigned URLs from FastAPI. The client uploads the files directly to S3.
- State Tracking: The API reserves the database records (Status: Uploading), then drops an event with the file references into an Input SQS Queue.
- Serverless Parsing: AWS Lambda is triggered by the SQS queue in batches. It fetches the files from S3 and performs the CPU-heavy text extraction.
- Direct Database Write: The Lambda function writes the parsed candidate data directly to DocumentDB and updates the record status to Completed.
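For the Lambda side, this is roughly the handler shape I have in mind. It is only a sketch: it assumes one SQS message per file, a made-up message body of `{"bucket": ..., "key": ...}`, and a hypothetical `parse_and_store` callable that does the S3 fetch, the parsing, and the DocumentDB write. The one real AWS piece is the `batchItemFailures` response, which lets SQS retry only the messages that failed, as long as `ReportBatchItemFailures` is enabled on the event source mapping.

```python
import json
from typing import Callable


def process_batch(event: dict, parse_and_store: Callable[[str, str], None]) -> dict:
    """Process an SQS-triggered batch and report per-message failures.

    `parse_and_store(bucket, key)` is a hypothetical callable standing in for
    "fetch from S3, extract text, write to DocumentDB".
    """
    failures = []
    for record in event.get("Records", []):
        try:
            body = json.loads(record["body"])   # assumed body: {"bucket": ..., "key": ...}
            parse_and_store(body["bucket"], body["key"])
        except Exception:
            # Only this message is returned to the queue; the rest of the
            # batch is treated as successfully processed.
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}
```

In the real function, the Lambda entry point would just call `process_batch(event, real_parser)`. Passing the parser in as an argument also makes this easy to test locally without touching AWS.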
u/Busy-Solution-2842 — 11 days ago