my notes: AWS Database

Cheryl
4 min read · Jan 16, 2020

Relational DB:

  • SQL Server, Oracle, MySQL, PostgreSQL, Aurora, MariaDB
  • RDS manages backups, software patching, failure detection and recovery
  • Automated backups or snapshots
  • Provides encryption in transit (SSL) and at rest (AWS KMS). KMS keys are specific to the region they are created in
  • Multi-AZ (Disaster Recovery): when the primary DB fails, RDS fails over automatically by pointing the DB's DNS endpoint at the secondary (standby) DB
  • Read Replicas (Performance, up to 5 copies): Every write to the primary DB is replicated to the read replicas. When the primary DB fails, there is no automatic failover (the application has to be pointed at a read replica manually). A read replica reduces the load on the primary DB by keeping an identical copy of the data and sharing the read traffic with the primary DB
  • Monitoring using CloudWatch and tracking using CloudTrail
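
A minimal sketch of how an application might split traffic between a primary DB and its read replicas. The `ReadWriteRouter` class and the endpoint names are hypothetical and only illustrate the idea; this is not an RDS API. Writes always go to the primary, reads rotate over the replicas, and `promote` stands in for the manual failover step.

```python
from itertools import cycle


class ReadWriteRouter:
    """Route writes to the primary and spread reads over read replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self.replicas = list(replicas)
        # Round-robin over replicas; fall back to the primary if there are none.
        self._read_targets = cycle(self.replicas or [primary])

    def endpoint_for(self, sql):
        """Return the endpoint a statement should be sent to."""
        is_read = sql.lstrip().lower().startswith("select")
        return next(self._read_targets) if is_read else self.primary

    def promote(self, replica):
        """Manual failover: make a read replica the new primary."""
        self.replicas.remove(replica)
        self.primary = replica
        self._read_targets = cycle(self.replicas or [self.primary])


# Hypothetical endpoint names, for illustration only.
router = ReadWriteRouter(
    primary="mydb.example.internal",
    replicas=["mydb-ro-1.example.internal", "mydb-ro-2.example.internal"],
)
print(router.endpoint_for("INSERT INTO orders VALUES (1)"))  # primary
print(router.endpoint_for("SELECT * FROM orders"))           # first replica
print(router.endpoint_for("SELECT * FROM orders"))           # second replica
```

With Multi-AZ the application keeps using one endpoint and the failover happens behind it, so no such routing logic is needed for disaster recovery.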

Aurora:

  • The only RDS engine with a serverless option (Aurora Serverless)
  • 2 copies of data in each AZ, across a minimum of 3 AZs, so there are always 6 copies. Only available in regions with at least 3 AZs
  • Write availability is not affected by the loss of up to 2 copies of data; read availability is not affected by the loss of up to 3 copies of data
  • Storage starts at 10 GB and autoscales in 10 GB increments up to 64 TB
  • Self-healing. Continuously scans for errors and repairs automatically
  • Backups and snapshots are performed with no impact on DB performance
  • Backup retention period of 1 day (default) to 35 days
  • Snapshots can be shared with other AWS accounts
  • Read Replicas for Aurora: Aurora replicas & MySQL read replicas
  • Migrating a MySQL DB to an Aurora DB:

- Create an Aurora read replica from the MySQL DB, which creates a writer node and a reader node (in separate AZs, with different DNS endpoints). Promote either node to primary.

- Take a snapshot of one of the Aurora read replica nodes and restore a new Aurora DB from the snapshot.
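
The Aurora fault-tolerance numbers can be checked with a little arithmetic. The sketch below assumes a quorum model in which writes need 4 of the 6 copies and reads need 3 of the 6; those quorum sizes are an assumption chosen to match tolerating the loss of 2 copies for writes and 3 copies for reads, and `availability` is a hypothetical helper name.

```python
COPIES = 6          # 2 copies in each of 3 AZs, as in the notes
WRITE_QUORUM = 4    # assumption: writes need 4 of 6 copies
READ_QUORUM = 3     # assumption: reads need 3 of 6 copies


def availability(lost_copies):
    """Return (can_write, can_read) after losing some copies."""
    if not 0 <= lost_copies <= COPIES:
        raise ValueError("lost_copies must be between 0 and 6")
    healthy = COPIES - lost_copies
    return healthy >= WRITE_QUORUM, healthy >= READ_QUORUM


for lost in range(COPIES + 1):
    can_write, can_read = availability(lost)
    print(f"lost={lost} write={can_write} read={can_read}")
```

Losing a whole AZ removes 2 copies, which under these assumptions still leaves both reads and writes available.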

Non-Relational DB:

  • Adding a new field to one record does not create a new column across all rows, which in a relational table could leave many empty cells. Records can have different formats/fields
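
A small illustration of the point about empty cells, using plain Python data rather than any real database. The table, field names and values are made up for the example.

```python
# Relational style: one fixed set of columns, so a field used by a single
# record becomes a column that is empty (None) for every other row.
columns = ["id", "name", "email", "twitter"]
rows = [
    (1, "Ann", "ann@example.com", None),
    (2, "Bob", "bob@example.com", None),
    (3, "Cat", "cat@example.com", "@cat"),
]
empty_cells = sum(value is None for row in rows for value in row)

# Document style: each record carries only the fields it actually has.
documents = [
    {"id": 1, "name": "Ann", "email": "ann@example.com"},
    {"id": 2, "name": "Bob", "email": "bob@example.com"},
    {"id": 3, "name": "Cat", "email": "cat@example.com", "twitter": "@cat"},
]
stored_fields = sum(len(doc) for doc in documents)

print(empty_cells)    # 2
print(stored_fields)  # 10
```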

DynamoDB:

  • Serverless
  • Used for mobile, web, gaming, ad-tech, IoT and many other apps
  • Schemaless with initial limit of 256 tables per region
  • Supports document and key-value data models
  • Data is stored in partitions backed by SSDs
  • Spread across 3 geographically distinct data centres
  • Encryption at rest (AWS KMS). New tables are encrypted at rest by default, and encryption cannot be disabled
  • Backup retention period of up to 35 days
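  • Eventually consistent read (default): Consistency across all copies is usually reached within 1 second, so a read right after a write may return stale data
  • Strongly consistent read: Returns a result that reflects all writes that succeeded before the read (may not be available when there are network delays or outages)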
  • Can be monitored using CloudWatch and tracked using CloudTrail
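
The difference between the two read modes can be sketched with a toy model. `TinyReplicatedTable` below is hypothetical and is not how DynamoDB is implemented; it keeps one up-to-date copy and one lagging copy. The `consistent_read` flag is named after DynamoDB's `ConsistentRead` request parameter.

```python
class TinyReplicatedTable:
    """Toy model: one leader copy plus one lagging replica copy."""

    def __init__(self):
        self.leader = {}
        self.replica = {}
        self._pending = []  # writes not yet applied to the replica

    def put(self, key, value):
        self.leader[key] = value
        self._pending.append((key, value))

    def replicate(self):
        """Apply outstanding writes to the replica."""
        for key, value in self._pending:
            self.replica[key] = value
        self._pending.clear()

    def get(self, key, consistent_read=False):
        # Strongly consistent reads go to the leader; eventually consistent
        # reads may hit a replica that has not caught up yet.
        source = self.leader if consistent_read else self.replica
        return source.get(key)


table = TinyReplicatedTable()
table.put("score", 10)
table.replicate()
table.put("score", 25)

print(table.get("score"))                        # 10 (stale)
print(table.get("score", consistent_read=True))  # 25
table.replicate()
print(table.get("score"))                        # 25
```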

Data Warehousing:

  • Business intelligence
  • Cognos, Jaspersoft, SQL Server reporting services, Oracle Hyperion, SAP NetWeaver
  • Allows queries on large and complex data sets
  • OLTP (Online Transaction Processing): Direct queries that fetch specific rows, with no computation
  • OLAP (Online Analytical Processing): Queries that compute over many rows, such as sums and averages
  • Choose a DB designed for the workload, OLTP or OLAP
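
To make the OLTP/OLAP distinction concrete, here is a sketch of the two query shapes against a throwaway in-memory SQLite table. SQLite is only a stand-in to make the example runnable, and the `orders` table and its values are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "APAC", 120.0), (2, "EMEA", 80.0), (3, "APAC", 200.0), (4, "EMEA", 40.0)],
)

# OLTP-style: fetch one specific row by key, no computation.
oltp_row = conn.execute(
    "SELECT id, region, amount FROM orders WHERE id = ?", (3,)
).fetchone()

# OLAP-style: scan many rows and compute aggregates per group.
olap_rows = conn.execute(
    "SELECT region, SUM(amount), AVG(amount) FROM orders "
    "GROUP BY region ORDER BY region"
).fetchall()

print(oltp_row)   # (3, 'APAC', 200.0)
print(olap_rows)  # [('APAC', 320.0, 160.0), ('EMEA', 120.0, 60.0)]
```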

Redshift (for OLAP):

  • Single AZ deployment
  • Single node (160 GB)
  • Multi-node: a leader node (manages client connections, receives queries) and compute nodes (store data, perform queries and computation), with a maximum of 128 compute nodes
  • Advanced compression: Columnar storage compresses better than row-based data stores because data is more similar within a column than across a row. Multiple compression techniques are available. Indexes and materialised views are not needed, so less space is required. Redshift automatically samples data and selects the most appropriate compression scheme.
  • Massively Parallel Processing (MPP): distributes data and queries across nodes. Easy to scale out (add more nodes) to maintain fast performance
  • Backup: retention of 1 day (default) to 35 days. Keeps at least 3 copies of the data (original and replica on the compute nodes, backup on S3). Snapshots can be replicated to S3 in another region for DR
  • Pricing: Compute node hours, backups, data transfer (only within VPC)
  • Security: Encrypted in transit (SSL) and at rest (AES-256). Redshift takes care of key management by default; clients can instead manage their own keys using a hardware security module (HSM) or AWS Key Management Service (KMS)
  • Audit logs are stored in S3 and monitoring through CloudWatch
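
A rough sketch of the leader/compute split behind MPP, in plain Python. The function names are hypothetical and the "nodes" are just lists processed one after another; a real cluster runs the per-node step in parallel. Rows are spread across nodes by key, each node aggregates its own slice, and the leader combines the partial results.

```python
def distribute(rows, num_nodes):
    """Leader step: spread rows across compute nodes by key."""
    slices = [[] for _ in range(num_nodes)]
    for row in rows:
        slices[row["order_id"] % num_nodes].append(row)
    return slices


def partial_sum(node_rows):
    """Compute-node step: aggregate only the local slice."""
    return sum(row["amount"] for row in node_rows), len(node_rows)


def average_amount(rows, num_nodes):
    """Leader step: combine partial results into the final answer."""
    partials = [partial_sum(s) for s in distribute(rows, num_nodes)]
    total = sum(p[0] for p in partials)
    count = sum(p[1] for p in partials)
    return total / count


orders = [{"order_id": i, "amount": 10 * i} for i in range(1, 9)]
print(average_amount(orders, num_nodes=1))  # 45.0
print(average_amount(orders, num_nodes=4))  # 45.0
```

The answer is the same for any node count; adding nodes only shrinks the slice each node has to scan, which is the idea behind scaling out.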

ElastiCache:

  • Improves DB and web application performance using managed in-memory caches instead of relying on slower disk-based databases
  • Caches the results of frequent identical queries (e.g. a list of bestsellers)
  • Open-source in-memory caching engine: Memcached (for simplicity), Redis (more advanced caching)
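
The caching idea can be sketched without Memcached or Redis. `TTLCache` below is a hypothetical in-process stand-in for a managed cache, and `query_bestsellers` stands in for a slow database query; the pattern is to check the cache first and only query the database on a miss.

```python
import time


class TTLCache:
    """Minimal in-memory cache with per-entry expiry."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (expires_at, value)

    def get_or_load(self, key, loader):
        entry = self._store.get(key)
        now = time.monotonic()
        if entry is not None and entry[0] > now:
            return entry[1]              # cache hit
        value = loader()                 # cache miss: go to the database
        self._store[key] = (now + self.ttl, value)
        return value


db_calls = 0


def query_bestsellers():
    """Stand-in for a slow database query."""
    global db_calls
    db_calls += 1
    return ["Book A", "Book B", "Book C"]


cache = TTLCache(ttl_seconds=60)
for _ in range(3):
    bestsellers = cache.get_or_load("bestsellers", query_bestsellers)

print(db_calls)  # 1
```

Three identical requests result in a single database call; the expiry time bounds how stale the cached list can get.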
