Requirements 1
Workshop: Java Event Pipeline Using Local AWS-like Services
Audience: Team of 3 developers, ~6 years Java experience
Goal: Build an end-to-end event pipeline locally, mimicking S3 and SNS, with bad data handling and reporting.
1. Architecture Overview
- Local AWS emulation: LocalStack (SNS, S3)
- Event generator: Java app simulating laptop telemetry/events
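- Consumer: Java service receiving events from the SNS topic (SNS pushes to subscribers and cannot itself be polled; long polling needs an SQS queue subscribed to the topic)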
- Persistence: MariaDB
- Reporting: SQL rollups + static HTML reports
- Frontend: Simple HTML + JS (tables, filters)
Generator -> SNS(topic)
                 |
             Consumer
                 |
          Clean + Validate
                 |
              MariaDB
                 |
           Rollup Queries
                 |
            HTML Reports
2. Tooling Choices (Freeware)
- LocalStack – SNS + S3 simulation
- Docker Desktop OR WSL2
- MariaDB (Docker or native)
- Java 17
- Maven
- Flyway (optional schema mgmt)
- Chart.js (optional for charts)
3. Setup Instructions
Option A – Docker (Recommended)
docker run -d --name localstack \
  -p 4566:4566 \
  -e SERVICES=sns,s3 \
  localstack/localstack

docker run -d --name mariadb \
  -p 3306:3306 \
  -e MARIADB_ROOT_PASSWORD=root \
  -e MARIADB_DATABASE=laptops \
  mariadb:11
Option B – Windows (No Docker)
- Install Java 17 (Adoptium)
- Install MariaDB Server
- Run LocalStack via Python virtualenv (advanced users)
Option C – WSL2
- Ubuntu 22/24
- Docker inside WSL or native MariaDB
- Same commands as Option A
4. Project Split (Team of 3)
Person A – Event Generator
- Java CLI app
- Configurable:
  - Number of laptops
  - Event interval (minutes/hours)
  - Invalid data % (t1)
- Events published to SNS topic
Sample Event JSON:
{
  "laptopId": "LAP-1021",
  "eventType": "BATTERY_LOW",
  "timestamp": "2025-01-01T10:30:00Z",
  "payload": "charge=12%\u0007"
}
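A minimal sketch of the generator core, assuming events are built as plain JSON strings and that a configurable ratio of them is corrupted on purpose. The class name `EventGenerator`, the extra event types, and the three corruption styles (control character, missing field, truncated JSON) are illustrative assumptions, not part of the workshop spec. Publishing to SNS is left out so the sketch has no dependencies.

```java
import java.time.Instant;
import java.util.Random;

/** Sketch of the generator core: builds event JSON and corrupts a configurable share. */
final class EventGenerator {
    // Only BATTERY_LOW appears in the sample event; the other types are made up.
    private static final String[] EVENT_TYPES = {"BATTERY_LOW", "OVERHEAT", "DISK_FULL"};

    private final int laptopCount;
    private final double invalidRatio; // 0.0 = all valid, 1.0 = all invalid
    private final Random random;

    EventGenerator(int laptopCount, double invalidRatio, long seed) {
        if (laptopCount < 1) {
            throw new IllegalArgumentException("laptopCount must be at least 1");
        }
        if (invalidRatio < 0.0 || invalidRatio > 1.0) {
            throw new IllegalArgumentException("invalidRatio must be between 0 and 1");
        }
        this.laptopCount = laptopCount;
        this.invalidRatio = invalidRatio;
        this.random = new Random(seed); // a fixed seed keeps runs reproducible
    }

    private static String toJson(String laptopId, String eventType,
                                 Instant timestamp, String payload) {
        return "{\"laptopId\":\"" + laptopId + "\",\"eventType\":\"" + eventType
                + "\",\"timestamp\":\"" + timestamp + "\",\"payload\":\"" + payload + "\"}";
    }

    /** Returns one event as a JSON string; a share of them is deliberately invalid. */
    String nextEvent(Instant timestamp) {
        String laptopId = "LAP-" + (1000 + random.nextInt(laptopCount));
        String eventType = EVENT_TYPES[random.nextInt(EVENT_TYPES.length)];
        String payload = "charge=" + random.nextInt(101) + "%";
        String json = toJson(laptopId, eventType, timestamp, payload);
        if (random.nextDouble() >= invalidRatio) {
            return json; // valid event
        }
        switch (random.nextInt(3)) {
            case 0: // JSON-escaped BEL control character in the payload, as in the sample
                return toJson(laptopId, eventType, timestamp, payload + "\\u0007");
            case 1: // mandatory field removed
                return json.replace("\"laptopId\":\"" + laptopId + "\",", "");
            default: // truncated, so it is no longer valid JSON
                return json.substring(0, json.length() / 2);
        }
    }
}
```

Calling `new EventGenerator(10, 0.2, 42L).nextEvent(Instant.now())` in a loop would yield roughly 20% bad events, which is enough to exercise the consumer's rejection paths.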
Person B – Consumer + Cleaning
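- Java service consuming events from the SNS topic, e.g. by long-polling an SQS queue subscribed to it (this requires adding sqs to SERVICES in the LocalStack command)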
- Detect:
  - Invalid JSON
  - Control characters
  - Missing mandatory fields
- Clean + normalize
- Insert into MariaDB
Example cleaning rule:
payload = payload.replaceAll("\\p{Cntrl}", "");
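The detection and cleaning steps above could be sketched as follows. This assumes the message body has already been parsed into a `Map<String, String>` by a JSON library (a parse failure there would be the "invalid JSON" case), and it treats `laptopId`, `eventType`, and `timestamp` as the mandatory fields. Both the class name `EventCleaner` and that choice of mandatory fields are assumptions made for illustration.

```java
import java.time.Instant;
import java.time.format.DateTimeParseException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

/** Sketch of the consumer's validate-and-clean step on an already parsed event. */
final class EventCleaner {
    // Assumed mandatory fields; adjust to the schema the team agrees on.
    static final List<String> MANDATORY = List.of("laptopId", "eventType", "timestamp");

    // Same character class as the cleaning rule: ASCII control characters.
    private static final Pattern CONTROL = Pattern.compile("\\p{Cntrl}");

    static boolean hasControlChars(String value) {
        return value != null && CONTROL.matcher(value).find();
    }

    /** Returns a copy of the event with control characters removed and values trimmed. */
    static Map<String, String> clean(Map<String, String> event) {
        Map<String, String> cleaned = new LinkedHashMap<>();
        event.forEach((key, value) ->
                cleaned.put(key, value == null ? null : CONTROL.matcher(value).replaceAll("").trim()));
        return cleaned;
    }

    /** Returns the problems found; an empty list means the event can be inserted. */
    static List<String> validate(Map<String, String> event) {
        List<String> problems = new ArrayList<>();
        for (String field : MANDATORY) {
            String value = event.get(field);
            if (value == null || value.isBlank()) {
                problems.add("missing " + field);
            }
        }
        String timestamp = event.get("timestamp");
        if (timestamp != null && !timestamp.isBlank()) {
            try {
                Instant.parse(timestamp); // expects ISO-8601, e.g. 2025-01-01T10:30:00Z
            } catch (DateTimeParseException e) {
                problems.add("bad timestamp");
            }
        }
        return problems;
    }
}
```

A consumer would typically call `clean` first and then `validate`; events with a non-empty problem list are candidates for the dead-letter table listed under the stretch goals.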
Person C – Data Model + Reports
- Schema design
- Metadata tables:
  - laptop
  - seller
  - customer
- Rollup queries
- HTML report generation
Example rollup:
SELECT city, model, COUNT(*) AS events
FROM laptop_events
JOIN laptop USING (laptop_id)
JOIN customer USING (customer_id)
GROUP BY city, model;
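For unit-testing report code without a running database, the same aggregation can be mirrored in memory. The sketch below assumes each input row is an event already joined to its laptop model and customer city; `RollupSketch`, `Row`, and the `city|model` key format are hypothetical names chosen for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

/** In-memory counterpart of the rollup: counts events per (city, model). */
final class RollupSketch {
    /** One event after joining: city comes from customer, model from laptop. */
    static final class Row {
        final String city;
        final String model;

        Row(String city, String model) {
            this.city = city;
            this.model = model;
        }
    }

    /** Groups by "city|model" and counts rows, sorted by key for stable output. */
    static Map<String, Long> countByCityAndModel(List<Row> rows) {
        return rows.stream().collect(
                Collectors.groupingBy(row -> row.city + "|" + row.model,
                        TreeMap::new,
                        Collectors.counting()));
    }
}
```

Comparing this map against the rows returned by the SQL query on a small fixture is one cheap way to check that the joins do not drop or duplicate events.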
5. Database Tables (Minimal)
- laptop_events
- laptop
- seller
- customer
Key idea: events are append-only; metadata changes slowly.
6. Reporting
- Generate static HTML using:
  - SQL + Java templating (StringTemplate / Mustache)
  - Or export CSV + client-side JS
- Features:
  - Sort
  - Filter by city / model
  - Simple charts
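As a dependency-free starting point, static HTML generation could look like the sketch below. It uses a plain `StringBuilder` rather than StringTemplate or Mustache, takes rollup rows as lists of strings, and escapes cell values so stray characters in the data cannot break the markup. The class name `HtmlReport` and the page layout are assumptions; sorting, filtering, and charts would be added by client-side JS on top of the generated table.

```java
import java.util.List;

/** Sketch of a static report: renders rollup rows as a standalone HTML table. */
final class HtmlReport {
    /** Escapes the characters that are significant in HTML text. */
    static String escape(String text) {
        return text.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    /** Builds a complete page with one header row and one table row per data row. */
    static String render(String title, List<String> headers, List<List<String>> rows) {
        StringBuilder html = new StringBuilder();
        html.append("<!DOCTYPE html>\n<html><head><meta charset=\"utf-8\"><title>")
                .append(escape(title))
                .append("</title></head>\n<body>\n<table>\n<thead><tr>");
        for (String header : headers) {
            html.append("<th>").append(escape(header)).append("</th>");
        }
        html.append("</tr></thead>\n<tbody>\n");
        for (List<String> row : rows) {
            html.append("<tr>");
            for (String cell : row) {
                html.append("<td>").append(escape(cell)).append("</td>");
            }
            html.append("</tr>\n");
        }
        html.append("</tbody>\n</table>\n</body></html>\n");
        return html.toString();
    }
}
```

The returned string can be written to disk with `java.nio.file.Files.writeString(Path.of("report.html"), html)` and opened directly in a browser.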
7. Stretch Goals
- Dead-letter table for rejected events
- S3 bucket for raw events backup
- Reprocessing job
- Config reload without restart
8. Learning Outcomes
- Event-driven design
- Bad data handling (realistic)
- Local cloud emulation
- End-to-end ownership
- Clean separation of concerns