System Design Interview Questions and Answers

Sanjay Kumar PhD

Jan 9, 2025

1. Design a Parking Lot Management System

Features:

Vehicle Management: Entry/exit logs, license plate recognition.
Parking Slot Allocation: Dynamic slot assignment, floor preferences.
Payment System: Hourly/daily rates, prepaid/postpaid options.
Admin Panel: Slot overview, reports, maintenance scheduling.

Architecture:

Frontend:

Mobile/web app for customers (book slots, view history).
Admin dashboard (manage slots, generate reports).

Backend:

REST API for slot booking, vehicle logs, and payments.
Database for storing parking slots, vehicles, transactions (SQL or NoSQL based on scale).

IoT Integration:

Sensors for slot occupancy.
Cameras for license plate recognition.

Challenges & Solutions:

Concurrency: Use locks or distributed transactions to prevent overbooking.
Scalability: Implement horizontal scaling for high-traffic parking lots.
Real-Time Updates: Use WebSockets or Server-Sent Events for live updates.
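
A minimal sketch of the concurrency point above, assuming a single-process allocator. The ParkingLot class and its method names are hypothetical, not part of any framework. A mutex turns "find a free slot and mark it taken" into one atomic step, so two cars arriving at the same moment never receive the same slot. Across several servers, the same role would typically fall to a database row lock or a distributed lock.

```python
import threading
from typing import Optional


class ParkingLot:
    """Hypothetical in-memory allocator; a real system would back this with a database."""

    def __init__(self, slot_ids):
        self._free = set(slot_ids)
        self._assigned = {}  # license plate -> slot id
        self._lock = threading.Lock()

    def park(self, plate: str) -> Optional[str]:
        # Check-and-assign runs under one lock, so it cannot interleave.
        with self._lock:
            if plate in self._assigned:
                return self._assigned[plate]
            if not self._free:
                return None  # lot is full
            slot = min(self._free)  # deterministic choice: lowest slot id
            self._free.remove(slot)
            self._assigned[plate] = slot
            return slot

    def leave(self, plate: str) -> bool:
        with self._lock:
            slot = self._assigned.pop(plate, None)
            if slot is None:
                return False
            self._free.add(slot)
            return True


def simulate(num_slots: int, num_cars: int):
    """Park num_cars concurrently; return assigned slots (None means rejected)."""
    lot = ParkingLot([f"S{i:03d}" for i in range(num_slots)])
    results = [None] * num_cars

    def worker(i):
        results[i] = lot.park(f"CAR-{i}")

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_cars)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Calling simulate(5, 20) should hand out exactly five distinct slots and reject the remaining fifteen cars, regardless of thread scheduling.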

2. Design an API Rate Limiter

Features:

Rate Limiting: Throttling requests based on user or IP.
Burst Handling: Temporary allowance for bursts.
Quota Management: Monthly/annual limits per user tier.

Implementation:

Algorithms:

Token Bucket: Each request consumes a token; tokens refill at a fixed rate.
Leaky Bucket: Requests are queued and processed at a fixed rate.
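
The token bucket described above can be sketched in a few lines. This is an illustrative in-memory version, not a production limiter: the TokenBucket class, its parameter names, and the injectable clock are assumptions made for the example. Capacity controls how large a burst is tolerated, and the refill rate sets the sustained request rate.

```python
import time


class TokenBucket:
    """Illustrative single-process token bucket (names are hypothetical)."""

    def __init__(self, capacity: float, refill_rate: float, clock=time.monotonic):
        self.capacity = capacity        # maximum burst size
        self.refill_rate = refill_rate  # tokens added per second
        self._clock = clock             # injectable so tests can control time
        self._tokens = capacity         # start with a full bucket
        self._last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        # Refill for the elapsed time, never exceeding capacity.
        self._tokens = min(self.capacity, self._tokens + elapsed * self.refill_rate)
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False
```

With capacity=3 and refill_rate=1.0, three back-to-back requests pass, the fourth is rejected, and two more pass after two seconds have elapsed.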

Storage:

Use Redis to maintain counters for user requests due to its speed and atomic operations.

Middleware:

Integrate with an API gateway, or use a web-server module such as Nginx's ngx_http_limit_req_module.

Challenges & Solutions:

Distributed System: Keep counters in a shared store such as Redis, or use consistent hashing to route each user or IP to the same rate-limiter node so that its counter lives in one place.
User Differentiation: Assign different limits for free vs. premium users.
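
To make the consistent hashing idea concrete, here is a small sketch of a hash ring with virtual nodes. The HashRing class, the node names, and the choice of SHA-256 are assumptions for illustration. The property that matters for rate limiting is that removing a node only remaps the keys that node owned, so most users keep their existing counters.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    # First 8 bytes of SHA-256 as an integer position on the ring.
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class HashRing:
    """Hypothetical consistent-hash ring with virtual nodes."""

    def __init__(self, nodes=(), vnodes: int = 100):
        self._vnodes = vnodes
        self._points = []  # sorted ring positions
        self._owner = {}   # ring position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node: str) -> None:
        for i in range(self._vnodes):
            h = _hash(f"{node}#{i}")
            self._owner[h] = node
            bisect.insort(self._points, h)

    def remove(self, node: str) -> None:
        self._points = [h for h in self._points if self._owner[h] != node]
        self._owner = {h: n for h, n in self._owner.items() if n != node}

    def node_for(self, key: str) -> str:
        if not self._points:
            raise LookupError("ring is empty")
        # First ring position clockwise from the key's hash, wrapping around.
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[idx]]
```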

3. Handle Security in Distributed Systems

Key Aspects:

Authentication:

OAuth 2.0 or OpenID Connect.
Use short-lived JWTs for secure communication.

Encryption:

TLS for communication.
Encrypt sensitive data at rest (e.g., AES-256).

Access Control:

Implement Role-Based Access Control (RBAC) or Attribute-Based Access Control (ABAC).
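
As a rough illustration of RBAC, permissions attach to roles and users are only ever checked through their roles. The role names and permission strings below are made up for the example.

```python
# Hypothetical role -> permission mapping; real systems load this from config or a database.
ROLE_PERMISSIONS = {
    "viewer": {"report:read"},
    "operator": {"report:read", "slot:update"},
    "admin": {"report:read", "slot:update", "user:manage"},
}


def is_allowed(roles, permission: str) -> bool:
    """Return True if any of the user's roles grants the permission."""
    return any(permission in ROLE_PERMISSIONS.get(role, set()) for role in roles)
```

Unknown roles grant nothing, so the check fails closed.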

Auditing:

Log all critical actions (e.g., admin changes, data access).

Example:

For microservices: Use a service mesh (e.g., Istio) for encrypted communication between services.

4. Handle Millions of Events Per Second

Features:

Ingest, process, and store large volumes of events in real time.
Provide fault-tolerant and low-latency operations.

Architecture:

Data Ingestion:

Use Kafka for high-throughput event ingestion; RabbitMQ is an option at lower volumes.

Processing:

Use Apache Flink or Apache Storm for real-time event processing.

Storage:

Use a wide-column store like Cassandra for scalable writes.

Visualization:

Integrate with tools like Grafana for dashboards.

Challenges & Solutions:

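Backpressure: Kafka consumers pull at their own pace, so the log buffers bursts; monitor consumer lag and rely on the stream processor's backpressure (e.g., in Flink) to slow sources down.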
Latency: Ensure processing clusters are geographically close to data sources.

5. Design an Online Booking System like Airbnb

Key Features:

Search & Listings:

Elasticsearch for fast query results.
Ranking based on user preferences and ratings.

Availability & Booking:

Lock slots in the database during the transaction.
Use distributed locks for consistency.
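
The availability check behind "lock during the transaction" comes down to an interval-overlap test performed while holding a lock. The sketch below assumes a single process and a hypothetical ListingCalendar class; stays are half-open intervals, so a checkout day can double as the next guest's check-in day.

```python
import threading
from datetime import date


class ListingCalendar:
    """Hypothetical per-listing calendar; a database would enforce this with row locks."""

    def __init__(self):
        self._bookings = []  # (check_in, check_out) half-open intervals
        self._lock = threading.Lock()

    def book(self, check_in: date, check_out: date) -> bool:
        if check_in >= check_out:
            raise ValueError("check_out must be after check_in")
        # Check and insert under one lock so two guests cannot both succeed.
        with self._lock:
            for start, end in self._bookings:
                if check_in < end and start < check_out:
                    return False  # overlaps an existing stay
            self._bookings.append((check_in, check_out))
            return True
```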

Payment Integration:

Secure payment gateways like Stripe or PayPal.

Notifications:

Email/SMS alerts for booking confirmations.

Architecture:

Frontend: Progressive Web App (PWA) for users and hosts.
Backend:
REST/GraphQL APIs.
Database: Shard by geographic region for scalability.

6. High Availability in Critical Applications

Strategies:

Redundancy:

Use Active-Active deployments for databases.
Implement multi-region replication.

Load Balancing:

Use tools like AWS Elastic Load Balancer.

Health Monitoring:

Implement tools like Prometheus for alerting.

Data Backups:

Frequent snapshots of databases for disaster recovery.

7. Photo-Sharing Service like Instagram

Features:

Image upload, processing, and sharing.
User feeds and notifications.

Architecture:

Frontend:

Web/mobile app.
Image optimization on the client side.

Backend:

Image Storage: Use object storage (e.g., AWS S3) fronted by a CDN (e.g., CloudFront).
Metadata: Store in relational DB (e.g., MySQL).
Feed Generation: Precompute feeds for efficiency.
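
One common way to precompute feeds is fan-out on write: when a user posts, the photo ID is pushed into each follower's stored feed, so reading a feed is a simple lookup. The FeedService class below is a hypothetical in-memory sketch of that idea, and the bounded feed length is an assumption.

```python
from collections import defaultdict, deque


class FeedService:
    """Hypothetical fan-out-on-write feed store."""

    def __init__(self, max_feed_len: int = 100):
        self._followers = defaultdict(set)
        # Each feed keeps only the newest max_feed_len items.
        self._feeds = defaultdict(lambda: deque(maxlen=max_feed_len))

    def follow(self, follower: str, author: str) -> None:
        self._followers[author].add(follower)

    def publish(self, author: str, photo_id: str) -> None:
        # Work happens at write time: push to every follower's feed.
        for follower in self._followers[author]:
            self._feeds[follower].appendleft(photo_id)

    def feed(self, user: str, limit: int = 10):
        # Reads are cheap: newest first, no joins or sorting.
        return list(self._feeds[user])[:limit]
```

The trade-off is write amplification for accounts with very large follower counts, which is why some designs fall back to computing those feeds at read time.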

Search:

Use Elasticsearch for tag-based image search.

8. CAP Theorem

Explanation:

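A distributed system cannot guarantee all three of the following at once. Because network partitions cannot be ruled out in practice, the real choice during a partition is between consistency and availability: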
Consistency: All nodes see the same data at the same time.
Availability: Every request to a non-failing node receives a response, though it may not reflect the latest write.
Partition Tolerance: The system continues to operate despite network partitions.

Examples:

CP Systems (e.g., HBase): Prioritize consistency over availability.
AP Systems (e.g., DynamoDB): Prioritize availability over consistency.

9. Big Data in Real-Time

Architecture:

Data Ingestion:

Kafka for event streaming.

Processing:

Flink for event aggregation.

Storage:

HDFS or S3 for archival.
Cassandra for real-time analytics.

Challenges:

Data Skew: Use key partitioning strategies to balance load.
Latency: Use low-latency databases like Redis for temporary storage.

10. Distributed File Storage System

Features:

High durability and availability.
Efficient data retrieval.

Architecture:

Metadata Server: Stores file system metadata (e.g., file locations).
Chunk Servers: Store actual file data.
Replication: Replicate data across multiple nodes.
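
The chunking and replication steps above can be sketched as two small functions. The function names, the round-robin placement policy, and the tiny chunk size are assumptions for illustration; real systems also weigh rack layout, disk usage, and load when placing replicas.

```python
def split_into_chunks(data: bytes, chunk_size: int):
    """Split a file's bytes into fixed-size chunks (the last one may be shorter)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def place_replicas(num_chunks: int, servers, replication_factor: int):
    """Assign each chunk to replication_factor distinct servers, round-robin.

    The returned mapping (chunk index -> server list) is the kind of
    information a metadata server would hold.
    """
    if replication_factor > len(servers):
        raise ValueError("not enough servers for the requested replication factor")
    return {
        chunk: [servers[(chunk + r) % len(servers)] for r in range(replication_factor)]
        for chunk in range(num_chunks)
    }
```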

Example:

Google's GFS: Divides files into fixed-size 64 MB chunks and stores each chunk on multiple chunkservers (three replicas by default).

11. Design an Ad-Serving Platform

Key Features:

Real-Time Bidding (RTB):

Advertisers bid on ad space in real-time.
Use a demand-side platform (DSP) to manage bids.

Targeting:

Behavioral targeting based on user history, preferences, and location.
Contextual targeting based on content.

Ad Delivery:

Optimize delivery to reduce latency using a Content Delivery Network (CDN).

Architecture:

Frontend:

User-facing application to display ads.

Backend:

Ad Server: Stores ad creatives and serves them based on rules.
User Profile Store: Stores data for targeting (e.g., demographics, history).
Tracking System: Tracks clicks, impressions, and conversions.

Storage:

NoSQL database (e.g., MongoDB or DynamoDB) for storing ad metadata.

Real-Time Processing:

Use Kafka for clickstream data ingestion.
Spark for fraud detection and analytics.

Challenges:

Fraud Detection: Use anomaly detection models to identify invalid clicks/impressions.
Latency: Aim for response times under 100 ms for a seamless user experience.

12. Strategies for Fraud Detection in Online Transactions

Techniques:

Rule-Based Systems:

Define rules (e.g., transactions > $10,000 from a new IP).
Quick to implement, but slow to adapt to new fraud patterns.
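
A rule like the one above can be expressed as a plain predicate over a transaction. The Transaction fields, the known_ips lookup, and the rule registry below are hypothetical; the point is only that each rule is a small, named, independently testable function.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    user_id: str
    amount_usd: float
    ip: str


def large_amount_from_new_ip(tx: Transaction, known_ips) -> bool:
    """Flag transactions over $10,000 from an IP the user has not used before."""
    return tx.amount_usd > 10_000 and tx.ip not in known_ips.get(tx.user_id, set())


# Hypothetical registry: rule name -> predicate.
RULES = {"large_amount_from_new_ip": large_amount_from_new_ip}


def evaluate(tx: Transaction, known_ips):
    """Return the names of all rules the transaction triggers."""
    return [name for name, rule in RULES.items() if rule(tx, known_ips)]
```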

Machine Learning Models:

Supervised learning models (e.g., Random Forest, XGBoost) for classification.
Anomaly detection models for unsupervised fraud detection.

Behavioral Analysis:

Monitor typical user behaviors (e.g., locations, transaction times).

Architecture:

Ingestion:

Use Kafka to stream transaction events.

Processing:

Real-time scoring using ML models deployed via Flask/FastAPI.

Storage:

Store flagged transactions in a NoSQL database for further investigation.

Challenges:

False Positives: Optimize models to reduce legitimate transaction rejections.
Real-Time Analysis: Ensure decisions are made within milliseconds.

13. Real-Time Analytics System

Key Features:

Event Ingestion: Handle high-volume data from multiple sources.
Processing: Aggregate, transform, and analyze data in real time.
Visualization: Present insights via dashboards.

Architecture:

Data Ingestion:

Kafka for event streaming.

Real-Time Processing:

Apache Flink or Apache Spark Streaming for aggregations.

Storage:

Real-time data: Redis or Memcached.
Historical data: Data lake (e.g., AWS S3) or Druid.

Visualization:

Tools like Tableau, Grafana, or Power BI.

Challenges:

High Throughput: Partition data streams to distribute the load.
Fault Tolerance: Use checkpoints in Flink to recover from failures.

14. Design a Trending Topics Feature for a Platform Like Twitter

Key Features:

Topic Detection:

Use NLP techniques like Named Entity Recognition (NER).
Hashtag frequency analysis.

Trend Ranking:

Rank by tweet velocity, user engagement, and geographic location.

Localization:

Tailor trends to specific regions or languages.

Architecture:

Data Ingestion:

Stream tweets using Kafka.

Processing:

Use Spark Streaming for calculating tweet frequencies.
ML models for sentiment analysis and topic classification.

Storage:

Store trending topics in a cache (e.g., Redis) for low-latency reads.

Challenges:

Real-Time Updates: Use sliding window aggregations to compute trends.
Spam Detection: Filter out bots and fake trends using behavioral analytics.
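
A sliding window aggregation can be sketched with a queue of recent events and a running count per hashtag. This version is a single-process illustration that assumes events arrive in timestamp order; the SlidingWindowCounter class is hypothetical, and a streaming engine would provide the same behavior through its windowing operators.

```python
from collections import Counter, deque


class SlidingWindowCounter:
    """Counts hashtags seen within the last window_seconds (in-order events assumed)."""

    def __init__(self, window_seconds: float):
        self.window = window_seconds
        self._events = deque()  # (timestamp, hashtag), oldest first
        self._counts = Counter()

    def add(self, timestamp: float, hashtag: str) -> None:
        self._events.append((timestamp, hashtag))
        self._counts[hashtag] += 1
        self._evict(timestamp)

    def _evict(self, now: float) -> None:
        # Drop events that have slid out of the window and undo their counts.
        while self._events and self._events[0][0] <= now - self.window:
            _, old_tag = self._events.popleft()
            self._counts[old_tag] -= 1
            if self._counts[old_tag] == 0:
                del self._counts[old_tag]

    def top(self, k: int, now: float):
        self._evict(now)
        return self._counts.most_common(k)
```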

15. Design an Email Sending Service

Key Features:

Email Queueing:

Queue emails for asynchronous sending.
Retry mechanism for failures.
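
A retry mechanism is often paired with exponential backoff so that a struggling mail server is not hammered. The sketch below assumes a hypothetical send callable that raises ConnectionError on transient failures; the function name, defaults, and injectable sleep are illustrative choices.

```python
import time


def send_with_retry(send, message, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call send(message), retrying transient failures with exponential backoff.

    Waits base_delay, then 2x, 4x, ... between attempts, and re-raises
    the error once max_attempts is exhausted.
    """
    for attempt in range(max_attempts):
        try:
            return send(message)
        except ConnectionError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the failure to the caller
            sleep(base_delay * 2 ** attempt)
```

In a queue-based design, a message that exhausts its attempts would usually be moved to a dead-letter queue rather than dropped.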

Spam Prevention:

Validate email addresses and enforce SPF/DKIM/DMARC.

Tracking:

Open rates, click-through rates (CTR), and delivery statuses.

Architecture:

Frontend:

Interface for users to compose and schedule emails.

Backend:

Email Sending Service: Use an email delivery service such as SendGrid or AWS SES (over SMTP or their HTTP APIs).
Queueing: Use RabbitMQ or Kafka for email queueing.

Storage:

Relational DB for email logs and tracking data.

Challenges:

Rate Limiting: Use a rate limiter to prevent spamming.
Deliverability: Use domain reputation management tools.

16. Ensure Data Consistency in Microservices Architecture

Strategies:

Eventual Consistency:

Accept that different services may temporarily hold different states.
Use event-driven communication (e.g., Kafka).

Distributed Transactions:

Implement the Saga pattern to manage transactions.
Compensating Transactions: Roll back changes if a failure occurs.
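
The Saga pattern with compensating transactions can be sketched as a list of steps, each paired with an undo action. This is a simplified in-process orchestrator under assumed names (run_saga and the step tuples); in a real deployment each step would be a call or event to a separate service, and compensations would need to be idempotent.

```python
def run_saga(steps):
    """Run steps in order; on failure, undo completed steps in reverse.

    steps is a list of (name, action, compensation) tuples.
    Returns (succeeded, log).
    """
    completed = []
    log = []
    for name, action, compensate in steps:
        try:
            action()
        except Exception:
            log.append(f"failed:{name}")
            # Compensate newest-first so later effects are undone before earlier ones.
            for done_name, undo in reversed(completed):
                undo()
                log.append(f"compensated:{done_name}")
            return False, log
        log.append(f"done:{name}")
        completed.append((name, compensate))
    return True, log
```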

Data Validation:

Use data reconciliation jobs to identify and fix inconsistencies.

Example:

For an e-commerce platform, ensure order, inventory, and payment services are consistent by publishing events to a central event bus.

17. Design a Calendar System

Key Features:

Event Creation:

Single and recurring events.
Notifications and reminders.

Time Zone Management:

Handle users in different time zones.

Sharing:

Allow sharing and collaboration on events.

Architecture:

Frontend:

Calendar UI with drag-and-drop functionality.

Backend:

Store events in a relational database (e.g., PostgreSQL).
Use a job scheduler (e.g., Quartz) for reminders.

Challenges:

Manage conflicts for shared events.
Efficiently query events for a specific time range.
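
For recurring events, a time-range query does not need one stored row per occurrence. A sketch, assuming fixed-interval recurrence and timestamps already normalized to UTC: store the first occurrence plus the interval, then expand only the occurrences that overlap the requested range. The function name and parameters are hypothetical.

```python
from datetime import datetime, timedelta


def occurrences_in_range(first_start, duration, interval, range_start, range_end):
    """Return (start, end) pairs of occurrences overlapping [range_start, range_end)."""
    if interval <= timedelta(0):
        raise ValueError("interval must be positive")
    # Jump close to the range instead of iterating from the very first occurrence.
    skipped = max(0, (range_start - first_start - duration) // interval)
    start = first_start + skipped * interval
    result = []
    while start < range_end:
        end = start + duration
        if end > range_start:
            result.append((start, end))
        start += interval
    return result
```

Calendar rules such as "last Friday of the month" or daylight-saving shifts need a full recurrence-rule library; this sketch covers only the fixed-interval case.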

18. Zero-Downtime Deployments

Strategies:

Blue-Green Deployment:

Deploy the new version to an idle environment (green) that mirrors the live production environment (blue).
Switch traffic to the new version after validation.

Canary Deployment:

Gradually release updates to a small subset of users.
Roll back quickly if issues are detected.
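
One simple way to pick the "small subset of users" for a canary is to hash each user ID into a stable bucket from 0 to 99 and compare it to the rollout percentage. The in_canary function below is an illustrative sketch; the hashing scheme is an assumption, not a feature of any particular deployment tool.

```python
import hashlib


def in_canary(user_id: str, percent: int) -> bool:
    """Return True if this user should be routed to the canary release.

    The bucket depends only on the user ID, so a user stays in the canary
    as the percentage grows and never flips between versions at random.
    """
    digest = hashlib.sha256(user_id.encode()).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return bucket < percent
```

Rolling back is then a matter of setting the percentage to zero.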

Rolling Updates:

Deploy updates incrementally across servers.

Example:

Use Kubernetes rolling updates with health checks to ensure no downtime.

19. Track User Actions on a Website

Features:

Action Logging:

Log page views, clicks, and interactions.

Storage:

Store raw events for analytics and debugging.

Architecture:

Ingestion:

Use JavaScript trackers to send events to Kafka.

Processing:

Aggregate events using Spark/Flink.

Storage:

Use ClickHouse for scalable event storage.

Visualization:

Build dashboards in Grafana or Tableau.

Challenges:

Data Volume: Partition data by user ID for efficient storage and querying.
GDPR Compliance: Anonymize user data for privacy.

20. Optimize for Read-Heavy vs. Write-Heavy Systems

Read-Heavy:

Caching:

Use Redis/Memcached for frequently accessed data.
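
The usual pattern here is cache-aside with a time-to-live: read from the cache, fall back to the database on a miss, and store the result for a limited time. The sketch below uses an in-memory dictionary in place of Redis or Memcached; the TTLCache class, the loader callable, and the injectable clock are assumptions for the example.

```python
import time


class TTLCache:
    """Hypothetical cache-aside wrapper with per-entry expiry."""

    def __init__(self, loader, ttl_seconds: float, clock=time.monotonic):
        self._loader = loader  # called on a miss, e.g. a database query
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}     # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]    # fresh hit
        value = self._loader(key)  # miss or expired: go to the source
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key) -> None:
        # Call after a write so readers do not see stale data until expiry.
        self._entries.pop(key, None)
```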

Read Replicas:

Add database replicas to distribute read queries.

Denormalization:

Pre-compute and store aggregated data for faster reads.

Write-Heavy:

Efficient Schema Design:

Use append-only logs for writes.

Event Sourcing:

Write events instead of updating the state directly.
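
A small sketch of event sourcing, using a bank account as an assumed example domain: writes only append events to a log, and the current state is derived by replaying them. The Event and Account classes are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    kind: str    # "deposited" or "withdrew"
    amount: int


class Account:
    """State is never updated in place; it is derived from the event log."""

    def __init__(self):
        self.events = []  # append-only log

    def deposit(self, amount: int) -> None:
        self.events.append(Event("deposited", amount))

    def withdraw(self, amount: int) -> None:
        if amount > self.balance():
            raise ValueError("insufficient funds")
        self.events.append(Event("withdrew", amount))

    def balance(self) -> int:
        # Replay the log to compute the current state.
        total = 0
        for event in self.events:
            total += event.amount if event.kind == "deposited" else -event.amount
        return total
```

Replaying the full log on every read gets expensive as it grows, so such systems commonly keep periodic snapshots and replay only the events recorded since the last one.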

Partitioning:

Partition data by write-intensive keys to balance load.


