Amazon DynamoDB Interview Questions and Answers
Q: What are the key features of Amazon DynamoDB?
Answer: Key features of Amazon DynamoDB include:
- Fully managed: AWS handles administrative tasks such as hardware provisioning, setup, configuration, monitoring, and scaling.
- Scalable: DynamoDB automatically scales workloads by partitioning data across multiple servers to handle growing demands.
- Performance: Provides consistent single-digit millisecond latency for read and write operations.
- Flexible data model: Supports both key-value and document data models, with documents expressed as JSON-like nested attributes.
- Built-in security: Offers encryption at rest and in transit, fine-grained access control, and integration with AWS Identity and Access Management (IAM).
Q: What are the different types of primary keys supported by DynamoDB?
Answer: DynamoDB supports two types of primary keys:
- Partition key (hash key): A single attribute whose value is hashed by DynamoDB to determine the partition in which the item is stored.
- Composite primary key (hash-and-range key): Comprises a partition key and a sort key. This key determines both the partition and the sort order of the items.
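As an illustration, a composite primary key is declared through the KeySchema and AttributeDefinitions parameters when creating a table. Below is a minimal sketch of the request shape passed to boto3's create_table; the table and attribute names are hypothetical:

```python
# Request shape for creating a table with a composite primary key.
# "Orders" table: partition key customer_id, sort key order_date (both illustrative).
create_table_request = {
    "TableName": "Orders",
    "KeySchema": [
        {"AttributeName": "customer_id", "KeyType": "HASH"},   # partition key
        {"AttributeName": "order_date", "KeyType": "RANGE"},   # sort key
    ],
    # Only key (and index key) attributes are declared up front;
    # all other item attributes are schemaless.
    "AttributeDefinitions": [
        {"AttributeName": "customer_id", "AttributeType": "S"},  # S = string
        {"AttributeName": "order_date", "AttributeType": "S"},
    ],
    "BillingMode": "PAY_PER_REQUEST",  # on-demand capacity mode
}
# With boto3: boto3.client("dynamodb").create_table(**create_table_request)
```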
Q: How does DynamoDB ensure scalability and high availability?
Answer: DynamoDB ensures scalability and high availability through:
- Partitioning: Automatically partitions data across multiple servers using the partition key, enabling DynamoDB to handle high traffic volumes efficiently.
- Replication: Data is synchronously replicated across multiple availability zones within a region, ensuring fault tolerance and high availability.
Q: What is the difference between provisioned throughput and on-demand capacity modes in DynamoDB?
Provisioned throughput mode:
- Allows you to specify the read and write capacity units (RCUs and WCUs) required for the table upfront.
- You are billed based on the provisioned capacity, irrespective of actual usage.
On-demand capacity mode:
- Automatically scales read and write capacity based on actual usage.
- You are charged per request based on the capacity consumed.
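To make the billing units concrete: one RCU covers one strongly consistent read per second of an item up to 4 KB (an eventually consistent read consumes half an RCU), and one WCU covers one write per second of an item up to 1 KB. A small helper illustrating the arithmetic (a sketch based on these published unit sizes, not an official AWS formula):

```python
import math

def read_capacity_units(item_size_bytes: int, strongly_consistent: bool = True) -> float:
    """RCUs consumed by a single read: reads are billed in 4 KB increments."""
    units = math.ceil(item_size_bytes / 4096)
    return units if strongly_consistent else units / 2  # eventual consistency halves the cost

def write_capacity_units(item_size_bytes: int) -> int:
    """WCUs consumed by a single write: writes are billed in 1 KB increments."""
    return math.ceil(item_size_bytes / 1024)
```

For example, a 6 KB item costs 2 RCUs for a strongly consistent read, 1 RCU for an eventually consistent read, and 6 WCUs for a write.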
Q: How does DynamoDB handle consistency?
Answer: DynamoDB offers two consistency models:
- Eventual consistency (the default): Reads may return slightly stale data, but all copies of an item typically converge within a second; eventually consistent reads consume half the read capacity of strongly consistent ones.
- Strong consistency: Read operations return the most up-to-date data, reflecting all prior successful writes, at the cost of higher read capacity consumption.
Q: What are DynamoDB streams?
Answer: DynamoDB Streams captures a time-ordered sequence of item-level modifications (inserts, updates, deletes) in a DynamoDB table and retains the records for 24 hours.
- Use cases include triggering AWS Lambda functions, keeping derived tables or caches in sync, and implementing cross-region replication.
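A stream record carries the event name and, depending on the stream view type, the item's old and new images in DynamoDB's typed JSON format. A minimal sketch of a Lambda handler that tallies the modifications in a Streams batch (the event shape follows the documented Streams record format; the tallying logic is illustrative):

```python
def handler(event, context):
    """Count item-level modifications in a DynamoDB Streams batch."""
    counts = {"INSERT": 0, "MODIFY": 0, "REMOVE": 0}
    for record in event.get("Records", []):
        counts[record["eventName"]] += 1
        # With the NEW_AND_OLD_IMAGES view type, record["dynamodb"] holds
        # "NewImage"/"OldImage" in typed format, e.g. {"customer_id": {"S": "c1"}}.
    return counts

# Example event with one insert and one delete:
sample_event = {
    "Records": [
        {"eventName": "INSERT", "dynamodb": {"NewImage": {"pk": {"S": "a"}}}},
        {"eventName": "REMOVE", "dynamodb": {"OldImage": {"pk": {"S": "b"}}}},
    ]
}
```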
Q: How can you monitor and manage DynamoDB?
Answer: DynamoDB provides several tools for monitoring and management:
- Amazon CloudWatch: Monitors metrics like read/write capacity utilization, error rates, and throttling events.
- DynamoDB Console: Offers a graphical interface for managing tables, monitoring performance, and configuring settings.
- AWS CLI and SDKs: Enable programmatic access to DynamoDB for automation, scripting, and integration with other AWS services.
Q: What are some best practices for designing DynamoDB tables?
Answer: Best practices include:
- Selecting an appropriate partition key to ensure even data distribution and avoid hot partitions.
- Using sparse indexes to minimize storage costs and improve query performance.
- Leveraging secondary indexes, which can act as materialized views of the base table, to support varied access patterns.
- Choosing optimal data types to reduce storage space and improve performance.
- Implementing DynamoDB streams for real-time data processing and change capture.
Q: How does DynamoDB encryption work?
Answer: DynamoDB provides encryption at rest using AWS Key Management Service (KMS).
- All data, including backups, secondary indexes, and replicas, is encrypted using an AWS owned key (the default), an AWS managed key, or a customer managed key in KMS.
- This ensures data protection and compliance with security requirements.
Q: What are the use cases for DynamoDB?
Answer: DynamoDB is ideal for:
- Web and mobile applications: Provides low-latency performance for user authentication, session storage, and activity tracking.
- Gaming applications: Tracks player progress, leaderboards, and real-time interactions.
- IoT applications: Handles time-series data, device tracking, and event logging.
- E-commerce: Manages product catalogs, shopping carts, and order processing.
- Serverless applications: Integrates seamlessly with AWS Lambda for event-driven architectures.
Q: What are global secondary indexes (GSIs) and local secondary indexes (LSIs) in DynamoDB?
Global Secondary Index (GSI):
- Allows queries on attributes other than the base table's primary key.
- Can have a different partition key and sort key than the base table, and can be created or deleted after the table exists.
- Index data is stored and partitioned separately from the base table, with its own throughput settings; GSI reads are always eventually consistent.
Local Secondary Index (LSI):
- Enables querying on an alternate sort key while sharing the base table's partition key.
- Must be defined at table creation time, and the combined size of base table and LSI items for any single partition key value is limited to 10 GB.
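Querying an index uses the same Query API as the base table, plus an IndexName parameter. A sketch of the request shape for boto3's query against a hypothetical GSI named status-index (table, index, and attribute names are illustrative):

```python
# Query items by a non-key attribute via a GSI.
query_request = {
    "TableName": "Orders",
    "IndexName": "status-index",                   # GSI partitioned on "status"
    "KeyConditionExpression": "#s = :status",
    "ExpressionAttributeNames": {"#s": "status"},  # "status" is a reserved word
    "ExpressionAttributeValues": {":status": {"S": "SHIPPED"}},
    # Note: GSI queries cannot request ConsistentRead=True; only LSIs support it.
}
# With boto3: boto3.client("dynamodb").query(**query_request)
```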
Q: What is a DynamoDB Accelerator (DAX)?
- DAX is a fully managed in-memory cache for DynamoDB that improves response times for read-heavy workloads.
- Key features:
- Reduces latency for eventually consistent reads to microseconds.
- Eliminates the need to manage a separate caching layer.
- Fully integrated with DynamoDB, maintaining compatibility with existing applications.
Q: What is the role of AWS Lambda in DynamoDB workflows?
AWS Lambda can be integrated with DynamoDB to:
- Process DynamoDB streams for real-time event-driven architectures.
- Perform data transformations or aggregations based on changes in a table.
- Implement custom business logic when changes occur in the database.
- Enable workflows like notifications, analytics, and cross-region replication.
Q: How does DynamoDB handle large items or datasets?
- DynamoDB imposes a maximum item size of 400 KB, including all attributes.
- For larger datasets, consider:
- Splitting data across multiple items.
- Using S3 for storing large objects and storing references (keys or URLs) in DynamoDB.
- Using batch writes and queries to process large datasets efficiently.
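The S3-pointer pattern above can be sketched as a helper that stores payloads inline when they fit and otherwise writes a reference. The bucket layout and attribute names below are hypothetical, and the size check is simplified (the real 400 KB limit counts attribute names plus values):

```python
MAX_INLINE_BYTES = 350 * 1024  # stay safely under the 400 KB item limit

def build_item(pk: str, payload: bytes) -> dict:
    """Return a DynamoDB item storing the payload inline or as an S3 pointer."""
    if len(payload) <= MAX_INLINE_BYTES:
        return {"pk": {"S": pk}, "payload": {"B": payload}}
    # Too large: upload the payload to S3 (e.g. s3.put_object(...))
    # and store only the object key in DynamoDB.
    s3_key = f"payloads/{pk}"  # hypothetical object key scheme
    return {"pk": {"S": pk}, "payload_s3_key": {"S": s3_key}}
```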
Q: What are DynamoDB transactions?
DynamoDB supports ACID-compliant transactions for applications requiring:
- Multiple-item operations: Perform transactional updates, inserts, or deletes across multiple items or tables.
- Atomicity and consistency: Ensures either all operations succeed or none are applied.
- Example use cases: Financial systems, inventory management, and complex workflows that need data integrity.
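As a sketch of the inventory use case: an order insert paired with a guarded stock decrement, expressed in the request shape of boto3's transact_write_items (table and attribute names are hypothetical):

```python
# Atomically create an order and decrement stock, failing if stock would go negative.
transact_request = {
    "TransactItems": [
        {
            "Put": {
                "TableName": "Orders",
                "Item": {"order_id": {"S": "o-1"}, "sku": {"S": "sku-1"}},
            }
        },
        {
            "Update": {
                "TableName": "Inventory",
                "Key": {"sku": {"S": "sku-1"}},
                "UpdateExpression": "SET stock = stock - :one",
                "ConditionExpression": "stock >= :one",  # guard: all-or-nothing
                "ExpressionAttributeValues": {":one": {"N": "1"}},
            }
        },
    ]
}
# With boto3: boto3.client("dynamodb").transact_write_items(**transact_request)
# If the condition fails, neither the Put nor the Update is applied.
```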
Q: What are the limitations of DynamoDB?
While DynamoDB is powerful, it has some limitations:
- Item size: Maximum size is 400 KB.
- LSI constraints: Maximum of five LSIs per table, defined only at table creation, and 10 GB of storage per partition key value.
- Complex queries: Limited compared to traditional relational databases; requires careful schema design.
- Cost: Can be expensive for write-heavy or unpredictable workloads.
- Strict throughput limits: Misconfigured RCUs/WCUs can lead to throttling.
Q: How does DynamoDB support backup and restore?
DynamoDB provides on-demand backups and point-in-time recovery (PITR):
- On-demand backups: Create full backups of tables at any time without impacting performance.
- PITR: Restores data to any point in time within the preceding 35 days, protecting against accidental writes or deletes.
- On-demand backups are encrypted and retained until you delete them.
Q: How can you optimize cost in DynamoDB?
- Use on-demand capacity mode for unpredictable workloads.
- Choose provisioned throughput mode with auto-scaling for predictable workloads.
- Use the DynamoDB Standard-Infrequent Access (Standard-IA) table class for cost savings on tables whose data is accessed infrequently.
- Implement efficient indexes: Use only necessary GSIs/LSIs to reduce storage and query costs.
- Leverage batch operations to minimize read and write costs.
Q: How does DynamoDB integrate with other AWS services?
Answer: DynamoDB integrates seamlessly with various AWS services, such as:
- AWS Lambda: For event-driven workflows and processing DynamoDB streams.
- Amazon S3: For storing large objects referenced in DynamoDB.
- AWS Glue: For data transformations and ETL processes.
- Amazon CloudWatch: For monitoring and setting alarms for key performance metrics.
- Amazon Kinesis: For real-time data streaming and analytics.
Q: What is the difference between DynamoDB and traditional relational databases?
Answer:
- Data model: DynamoDB is a schemaless key-value/document store (only the primary key is fixed); relational databases enforce a predefined schema of tables, rows, and columns.
- Querying: DynamoDB is queried primarily by key and index, with no joins; relational databases support ad hoc SQL, joins, and aggregations.
- Scaling: DynamoDB scales horizontally and automatically through partitioning; relational databases typically scale vertically.
- Trade-off: DynamoDB offers predictable low-latency performance at scale in exchange for reduced query flexibility.
Q: How does DynamoDB implement fine-grained access control?
- DynamoDB integrates with AWS Identity and Access Management (IAM) to provide fine-grained access control.
- Features include:
- Role-based access: Restrict access based on user roles.
- Attribute-based access: Limit access to specific attributes within an item.
- Condition-based policies: Apply conditions to grant access, such as IP address, time of day, or operation type.
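For example, the dynamodb:LeadingKeys condition key restricts access to items whose partition key matches the caller's identity. A sketch of such a policy, here expressed as a Python dict (the table ARN and account ID are placeholders; the substitution variable is the caller's Cognito identity in this common pattern):

```python
# IAM policy limiting a user to items whose partition key equals their own identity.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["dynamodb:GetItem", "dynamodb:Query", "dynamodb:PutItem"],
            "Resource": "arn:aws:dynamodb:us-east-1:123456789012:table/UserData",
            "Condition": {
                "ForAllValues:StringEquals": {
                    # Only items whose partition key is the caller's Cognito identity.
                    "dynamodb:LeadingKeys": ["${cognito-identity.amazonaws.com:sub}"]
                }
            },
        }
    ],
}
```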
Q: What is the maximum throughput capacity of a single DynamoDB partition?
- A single DynamoDB partition can support up to:
- 3,000 RCUs (read capacity units), i.e. 3,000 strongly consistent reads per second of items up to 4 KB.
- 6,000 eventually consistent reads per second, since each consumes half an RCU.
- 1,000 WCUs (write capacity units), i.e. 1,000 writes per second of items up to 1 KB.
- If a workload exceeds these limits, DynamoDB automatically splits data across additional partitions to handle the demand.
Q: What are time-to-live (TTL) attributes in DynamoDB?
- TTL is a mechanism that automatically deletes expired items based on a designated Number attribute holding a Unix epoch timestamp in seconds; deletion happens in the background at no extra cost.
Benefits:
- Reduces storage costs by cleaning up old or irrelevant data.
- Improves application performance by keeping the dataset smaller.
- Common use cases include session management, event logging, and temporary data storage.
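TTL is enabled per table by naming the attribute that holds the expiry time, which must be a Number containing a Unix epoch timestamp in seconds. A sketch of building a session item that expires after a given interval (the table layout and attribute names are hypothetical):

```python
import time

TTL_ATTRIBUTE = "expires_at"  # the attribute configured as the table's TTL attribute

def session_item(session_id: str, ttl_seconds: int = 24 * 3600) -> dict:
    """Build a session item whose TTL attribute is an epoch timestamp in seconds."""
    expires_at = int(time.time()) + ttl_seconds
    return {
        "session_id": {"S": session_id},
        TTL_ATTRIBUTE: {"N": str(expires_at)},  # TTL values must be Number type
    }
```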
Q: How does DynamoDB support global tables?
- Global tables enable multi-region, multi-active replication.
Features include:
- Automatic replication of table updates across multiple regions.
- Low-latency access for globally distributed applications.
- Built-in conflict resolution for concurrent writes across regions (last writer wins).
- Common use cases include disaster recovery, geographically distributed workloads, and compliance with data residency requirements.
Q: How can you avoid hot partitions in DynamoDB?
- Hot partitions occur when too much traffic is directed to a single partition. To avoid them:
- Choose a partition key that evenly distributes traffic.
- Use randomized partition keys (e.g., hashing or adding random suffixes).
- Leverage composite keys to spread data across partitions.
- Avoid sequential keys such as timestamps for high-traffic workloads.
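The suffix technique above can be sketched as appending a deterministic shard number derived from a hash of some item attribute, so writes spread across N logical partition key values and reads fan out over the same N suffixes (the shard count and key format are tuning choices, not fixed by DynamoDB):

```python
import hashlib

NUM_SHARDS = 10  # number of logical shards per hot key (illustrative choice)

def sharded_key(base_key: str, discriminator: str) -> str:
    """Append a stable shard suffix, e.g. '2024-05-01' -> '2024-05-01#7'."""
    digest = hashlib.md5(discriminator.encode()).hexdigest()
    shard = int(digest, 16) % NUM_SHARDS
    return f"{base_key}#{shard}"

def all_shards(base_key: str) -> list:
    """Keys to query when reading back a sharded partition (scatter-gather)."""
    return [f"{base_key}#{i}" for i in range(NUM_SHARDS)]
```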
Q: What are some common DynamoDB error codes and their meanings?
- ProvisionedThroughputExceededException: The request exceeds the provisioned throughput. Solution: Increase RCUs/WCUs or optimize queries.
- ConditionalCheckFailedException: A conditional operation failed. Solution: Check and revise conditions.
- ThrottlingException: Too many requests are sent in a short period. Solution: Use exponential backoff or optimize capacity.
- ResourceNotFoundException: The specified table or index does not exist. Solution: Verify the resource name and ensure it exists.
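The exponential backoff mentioned above can be sketched as follows. Note that the AWS SDKs implement retries with backoff automatically, so this only illustrates the schedule; the base delay and cap are illustrative values:

```python
import random

def backoff_delays(max_retries: int = 5, base: float = 0.05, cap: float = 2.0) -> list:
    """Full-jitter backoff: each delay is uniform in [0, min(cap, base * 2**attempt)]."""
    return [random.uniform(0, min(cap, base * (2 ** attempt)))
            for attempt in range(max_retries)]

# A caller would sleep for delays[i] before retry i after a throttled request.
```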
Q: What is Adaptive Capacity in DynamoDB?
- Adaptive Capacity automatically adjusts partition throughput to accommodate uneven workloads.
Benefits:
- Prevents throttling on heavily accessed partitions.
- Ensures even distribution of capacity across partitions.
- Requires no manual intervention or reconfiguration.
Q: How does DynamoDB handle batch operations?
- DynamoDB supports two types of batch operations:
- BatchGetItem: Retrieves up to 100 items (up to 16 MB of data) from one or more tables in a single request.
- BatchWriteItem: Puts or deletes up to 25 items (up to 16 MB of data) in a single request; it does not support update operations.
Benefits:
- Reduces network overhead.
- Improves performance for bulk data processing.
- Note: Batch operations are subject to size limits and can return unprocessed items if capacity is exceeded.
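Because BatchWriteItem accepts at most 25 items, bulk loads are typically chunked client-side. A sketch of the chunking plus the request shape (boto3's higher-level batch_writer handles chunking and unprocessed-item retries automatically; table name here is hypothetical):

```python
def chunked(items: list, size: int = 25):
    """Yield successive chunks of at most `size` items (the BatchWriteItem limit)."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

def batch_write_request(table: str, items: list) -> dict:
    """Build a BatchWriteItem request body for one chunk of put operations."""
    assert len(items) <= 25, "BatchWriteItem accepts at most 25 items"
    return {"RequestItems": {table: [{"PutRequest": {"Item": it}} for it in items]}}

# Any "UnprocessedItems" in the response should be retried with backoff.
```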
Q: What are DynamoDB reserved capacity and savings plans?
- Reserved capacity lets you commit to a specific amount of provisioned throughput for a 1- or 3-year term at a significant discount.
- Note: AWS Savings Plans apply to compute services (EC2, Fargate, Lambda, SageMaker), not to DynamoDB; reserved capacity is DynamoDB's commitment-based pricing option.
- Suitable for applications with predictable and consistent workloads.
Q: How does DynamoDB integrate with machine learning?
- DynamoDB integrates with Amazon SageMaker and AWS Glue to facilitate ML workflows.
Use cases:
- Store training data in DynamoDB and process it for model training.
- Use DynamoDB streams to trigger real-time ML predictions or insights.
- Implement feature stores with DynamoDB for serving real-time features to ML models.
Q: How does DynamoDB differ from Amazon RDS?
Answer:
- DynamoDB is a fully managed NoSQL key-value/document database; RDS is a managed service for relational engines such as MySQL, PostgreSQL, MariaDB, Oracle, and SQL Server.
- DynamoDB is serverless (no instances to size) and scales horizontally; RDS requires selecting instance classes and storage, scaling vertically or via read replicas.
- DynamoDB suits high-scale, key-based access patterns; RDS suits applications that need SQL, joins, and complex relational queries.
Q: What are transactional APIs in DynamoDB?
- Transactional APIs support ACID (Atomicity, Consistency, Isolation, Durability) operations.
APIs include:
- TransactWriteItems: Performs up to 100 write actions (Put, Update, Delete, ConditionCheck) across one or more tables as a single all-or-nothing operation.
- TransactGetItems: Atomically retrieves up to 100 items from one or more tables.
Use cases:
- Financial systems, inventory management, and multi-step workflows requiring strong data consistency.
Q: How does DynamoDB manage item versioning with optimistic locking?
- DynamoDB supports optimistic locking using a version number attribute.
Workflow:
- A version number is incremented every time an item is updated.
- Updates fail with a ConditionalCheckFailedException if the version number supplied in the request no longer matches the version stored in the table.
- Ensures data consistency and prevents accidental overwrites in concurrent environments.
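The version check above can be sketched as a conditional write: the request carries the version the client read, and the ConditionExpression rejects the write if the stored version has moved on. Table and attribute names are illustrative; this is the request shape passed to boto3's update_item:

```python
def versioned_update_request(table: str, key: dict, new_status: str,
                             read_version: int) -> dict:
    """Build an UpdateItem request that succeeds only if the version is unchanged."""
    return {
        "TableName": table,
        "Key": key,
        "UpdateExpression": "SET #st = :status, version = :next",
        "ConditionExpression": "version = :expected",  # optimistic lock check
        "ExpressionAttributeNames": {"#st": "status"},
        "ExpressionAttributeValues": {
            ":status": {"S": new_status},
            ":expected": {"N": str(read_version)},
            ":next": {"N": str(read_version + 1)},
        },
    }

# If another writer bumped the version first, DynamoDB raises
# ConditionalCheckFailedException; the client re-reads the item and retries.
```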