u/Master_Ad2559

▲ 43 r/softwarearchitecture+1 crossposts

This project implements an end-to-end event-driven data pipeline using AWS services and Apache Spark.

Architecture
S3 Bronze Layer for raw data
Lambda for orchestration
AWS Glue (Spark) for transformations
S3 Silver Layer in Parquet format
Future extension to Postgres and Power BI

Event Flow
CSV file uploaded to S3 Bronze bucket
S3 triggers Lambda
Lambda triggers AWS Glue job
Glue reads CSV, applies schema and transformations
Output written to S3 Silver as Parquet
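
The S3-to-Lambda-to-Glue hand-off in the flow above could be sketched roughly as follows. This is a minimal sketch, not the project's actual handler: the job name, the Silver location, and the `--source_path` / `--target_path` argument names are all hypothetical, and the optional `glue_client` parameter is only there so the handler can be exercised without AWS credentials.

```python
import urllib.parse

GLUE_JOB_NAME = "bronze-to-silver"             # hypothetical Glue job name
SILVER_PREFIX = "s3://example-silver-bucket/"  # hypothetical Silver location


def build_job_arguments(event):
    """Map an S3 ObjectCreated event to one Glue argument dict per uploaded CSV."""
    runs = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        if not key.lower().endswith(".csv"):
            continue  # ignore anything that is not a CSV upload
        runs.append({
            "--source_path": f"s3://{bucket}/{key}",
            "--target_path": SILVER_PREFIX,
        })
    return runs


def lambda_handler(event, context, glue_client=None):
    """Start one Glue job run per CSV file referenced in the event."""
    if glue_client is None:
        import boto3  # imported lazily so the helper above has no AWS dependency
        glue_client = boto3.client("glue")

    run_ids = []
    for arguments in build_job_arguments(event):
        response = glue_client.start_job_run(
            JobName=GLUE_JOB_NAME,
            Arguments=arguments,
        )
        run_ids.append(response["JobRunId"])
    return {"started": run_ids}
```

Keeping the event parsing in its own function means the orchestration logic can be checked locally with a hand-written event before wiring up the bucket notification.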

Tech Stack
AWS S3
AWS Lambda
AWS Glue (Apache Spark)
IAM
Python
PySpark

Key Design Decisions
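Parquet used for efficient analytics (columnar, compressed storage, so queries read only the columns they need)
Glue used instead of Lambda for large-scale transformations (Lambda has a 15-minute execution limit and capped memory)
Event-driven architecture (processing starts when a file lands, no cron jobs)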
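
To make the Glue and Parquet choices a bit more concrete, the transformation step might look something like the PySpark sketch below. The column names and the cleaning rules are assumptions for illustration only, since the post does not describe the real schema or transformations.

```python
# Hypothetical columns; the post does not list the real schema.
COLUMNS = [
    ("order_id", "INT"),
    ("customer", "STRING"),
    ("amount", "DOUBLE"),
    ("order_date", "DATE"),
]


def schema_ddl(columns):
    """Build a DDL-style schema string that spark.read.schema() accepts."""
    return ", ".join(f"{name} {dtype}" for name, dtype in columns)


def csv_to_parquet(spark, source_path, target_path):
    """Read a Bronze CSV with an explicit schema and append it to Silver as Parquet."""
    df = (
        spark.read
        .option("header", "true")
        .schema(schema_ddl(COLUMNS))  # explicit schema instead of inference
        .csv(source_path)
    )
    # Example transformations (assumed): drop rows without a key, then de-duplicate.
    cleaned = df.dropna(subset=["order_id"]).dropDuplicates(["order_id"])
    cleaned.write.mode("append").parquet(target_path)
```

Inside a Glue job, `spark` would be the job's SparkSession, and the two paths would typically be read from the job arguments (for example with `getResolvedOptions` from `awsglue.utils`) rather than hard-coded.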

u/Master_Ad2559 — 9 days ago

When people talk about movies, a few cult classics like The Godfather, Fight Club, and Goodfellas always come up. I was wondering whether the same is true of books: are there a few must-read cult books that every reader should read at least once? If so, please give me a list of the top 10 must-read books.

u/Master_Ad2559 — 11 days ago