
NoSQL Databases: DynamoDB (AWS) & Firestore (GCP)

Why NoSQL for Serverless?

Traditional SQL databases (PostgreSQL, MySQL) expect long-lived connections; thousands of short-lived function instances can quickly exhaust a connection pool.

NoSQL databases designed for serverless provide:

  • Automatic scaling
  • Pay-per-request pricing
  • Low latency
  • No connection limits

AWS: DynamoDB
GCP: Firestore (or Datastore)


Simple Explanation

What it is

This lesson explains how serverless apps store data using managed NoSQL databases that scale automatically.

Why we need it

Serverless functions spin up and down quickly, so we need databases that do not require long-lived connections.

Benefits

  • Automatic scaling without capacity planning.
  • Pay-per-request pricing that matches serverless usage.
  • Low operational overhead compared to running your own database servers.

Tradeoffs

  • Different data modeling than SQL tables and joins.
  • Query limits that require careful key design.

Real-world examples (architecture only)

  • Product catalog → Function → DynamoDB table.
  • User profiles → Function → Firestore collection.

Part 1: AWS DynamoDB

DynamoDB Basics

Key-Value Store

Partition Key: userId
Sort Key: timestamp

Item:
{
  "userId": "user123",
  "timestamp": "2026-02-08T10:00:00Z",
  "message": "Hello!",
  "likes": 5
}

Concepts

  • Table: Collection of items
  • Item: Single record (like a row in SQL)
  • Attribute: Field in an item (like a column)
  • Partition Key: How DynamoDB distributes data
  • Sort Key: Optional; sorts items within a partition

Create a Table

Console Method

  1. Go to DynamoDB Console
  2. Click Create table
  3. Table name: Items
  4. Partition key: itemId (String)
  5. Click Create
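
The console steps above can also be scripted. A minimal sketch with boto3 (assuming AWS credentials and a region are already configured; the actual `create_table` call is shown in comments so the key schema can be inspected on its own):

```python
# Sketch: the same Items table defined programmatically.
def build_table_spec(table_name="Items"):
    """Arguments for create_table: one string partition key, on-demand billing."""
    return {
        "TableName": table_name,
        "KeySchema": [{"AttributeName": "itemId", "KeyType": "HASH"}],
        "AttributeDefinitions": [{"AttributeName": "itemId", "AttributeType": "S"}],
        "BillingMode": "PAY_PER_REQUEST",  # on-demand pricing, no capacity planning
    }

# Usage (with AWS credentials configured):
#   import boto3
#   boto3.client("dynamodb").create_table(**build_table_spec())
```

Note that only key attributes are declared up front; all other attributes are schemaless and can vary per item.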

What You Get

DynamoDB table structure

Access DynamoDB from Lambda

Install SDK

pip install boto3

Create Item

import json
import boto3
from datetime import datetime

ddb = boto3.resource("dynamodb")
table = ddb.Table("Items")

def handler(event, context):
    item = {
        "itemId": "item-123",
        "name": "My Item",
        "createdAt": datetime.utcnow().isoformat(),
        "description": "A great item",
    }

    table.put_item(Item=item)
    return {"statusCode": 201, "body": json.dumps(item)}

Get Item

import json
import boto3

ddb = boto3.resource("dynamodb")
table = ddb.Table("Items")

def handler(event, context):
    item_id = event.get("pathParameters", {}).get("id")
    response = table.get_item(Key={"itemId": item_id})
    item = response.get("Item")
    if item is None:
        return {"statusCode": 404, "body": json.dumps({"error": "Item not found"})}
    return {"statusCode": 200, "body": json.dumps(item)}

Query Items

import json
import boto3
from boto3.dynamodb.conditions import Key

ddb = boto3.resource("dynamodb")
# query() needs the table's partition key, so this assumes a table keyed by
# userId (as in the messages example above); the table name is illustrative.
table = ddb.Table("Messages")

def handler(event, context):
    user_id = event.get("pathParameters", {}).get("userId")
    response = table.query(KeyConditionExpression=Key("userId").eq(user_id))
    items = response.get("Items", [])
    return {"statusCode": 200, "body": json.dumps(items)}
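
One caveat worth knowing: a single `query()` call returns at most 1 MB of data. When the response carries a `LastEvaluatedKey`, pass it back as `ExclusiveStartKey` to fetch the next page. A sketch (the key condition argument is whatever `Key(...).eq(...)` expression you would pass directly):

```python
def query_all(table, key_condition):
    """Collect every matching item, following DynamoDB pagination tokens."""
    items = []
    kwargs = {"KeyConditionExpression": key_condition}
    while True:
        response = table.query(**kwargs)
        items.extend(response.get("Items", []))
        last_key = response.get("LastEvaluatedKey")
        if not last_key:
            return items  # no token means this was the final page
        kwargs["ExclusiveStartKey"] = last_key  # resume where the last page ended
```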

Update Item

import json
import boto3

ddb = boto3.resource("dynamodb")
table = ddb.Table("Items")

def handler(event, context):
    item_id = event.get("pathParameters", {}).get("id")
    body = json.loads(event.get("body") or "{}")
    name = body.get("name")

    # "name" is a DynamoDB reserved word, so alias it with #n.
    table.update_item(
        Key={"itemId": item_id},
        UpdateExpression="SET #n = :name",
        ExpressionAttributeNames={"#n": "name"},
        ExpressionAttributeValues={":name": name},
    )

    return {"statusCode": 200, "body": json.dumps({"itemId": item_id, "name": name})}
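
One subtlety: `update_item` on a key that does not exist silently creates a new item. A condition expression makes DynamoDB reject the write instead (raising a `ConditionalCheckFailedException` from the SDK). A sketch, with the table passed in for clarity:

```python
def update_if_exists(table, item_id, name):
    """Update an item's name, but only if the item already exists."""
    return table.update_item(
        Key={"itemId": item_id},
        UpdateExpression="SET #n = :name",
        # Only apply the update when the item is already present.
        ConditionExpression="attribute_exists(itemId)",
        ExpressionAttributeNames={"#n": "name"},
        ExpressionAttributeValues={":name": name},
    )
```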

Delete Item

import boto3

ddb = boto3.resource("dynamodb")
table = ddb.Table("Items")

def handler(event, context):
    item_id = event.get("pathParameters", {}).get("id")
    table.delete_item(Key={"itemId": item_id})
    return {"statusCode": 204}

DynamoDB Pricing

On-Demand (For Variable Workloads)

  • Pay per request, no capacity to provision
  • Scales automatically with traffic
  • See AWS DynamoDB pricing for current rates

Provisioned (For Predictable Workloads)

  • Reserve capacity upfront
  • Cheaper if usage is predictable
  • Manual scaling required

For Maarifa: Use On-Demand pricing.

Lambda Permissions to DynamoDB

Your Lambda needs IAM permissions:

{
  "Effect": "Allow",
  "Action": [
    "dynamodb:PutItem",
    "dynamodb:GetItem",
    "dynamodb:Query",
    "dynamodb:UpdateItem",
    "dynamodb:DeleteItem"
  ],
  "Resource": "arn:aws:dynamodb:us-east-1:123456789012:table/Items"
}

Best Practices

  1. Use partition keys wisely — Distribute evenly across partitions
  2. Minimize data size — Smaller items load faster
  3. Use on-demand for variability — Matches serverless nature
  4. Cache frequently-read data — Use ElastiCache or DAX
  5. Enable automation — Use Lambda for TTL-based cleanup
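
On point 5: DynamoDB can also expire items itself via TTL, without a cleanup Lambda. Store a future epoch-seconds timestamp in an attribute (here `expiresAt`, a name chosen for this sketch) and enable TTL on that attribute in the table settings:

```python
import time

def with_ttl(item, days=30, now=None):
    """Return a copy of item that DynamoDB TTL will expire `days` from now."""
    now = int(now if now is not None else time.time())
    # TTL attributes must be plain numbers holding epoch seconds.
    return {**item, "expiresAt": now + days * 24 * 60 * 60}
```

Note that TTL deletion is eventual (typically within a day or two of expiry), so treat it as cleanup, not an exact timer.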

Part 2: Google Cloud Firestore

What Is Firestore?

Firestore is Google's managed NoSQL database for serverless applications.

Similarities to DynamoDB:

  • Automatic scaling
  • Pay-per-request pricing
  • Millisecond latency
  • Handles concurrent connections

Differences:

  • Document-based (like MongoDB) vs. key-value
  • Better for hierarchical data
  • Stronger consistency
  • Real-time listeners included
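
The real-time listeners are worth a quick sketch: the Python client can watch a collection and invoke a callback on every change. Here `db` is assumed to be a `firestore.Client()` and the `items` collection name is illustrative:

```python
def watch_items(db):
    """Subscribe to changes on the items collection."""
    def on_change(col_snapshot, changes, read_time):
        for change in changes:
            # change.type.name is ADDED, MODIFIED, or REMOVED
            print(change.type.name, change.document.id)

    # Returns a watch object; call .unsubscribe() on it to stop listening.
    return db.collection("items").on_snapshot(on_change)
```

On AWS, the closest equivalent is DynamoDB Streams feeding a Lambda, which the comparison table below notes.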

Firestore Basics

Data is organized in collections and documents:

Firestore structure

Create a Firestore Database

  1. Go to Google Cloud Console
  2. Click Create database
  3. Choose Native Mode (not Datastore)
  4. Select location
  5. Click Create

Access Firestore from Cloud Function

Install SDK

pip install google-cloud-firestore

Create Item

# main.py - Google Cloud Function
from datetime import datetime
from google.cloud import firestore

db = firestore.Client()

def create_item(request):
    item = {
        "itemId": "item-123",
        "name": "My Item",
        "createdAt": datetime.utcnow(),
        "description": "A great item",
    }

    db.collection("items").document(item["itemId"]).set(item)
    return (item, 201)

CRUD Operations (Firestore)

Create:

def create_item(request):
    data = request.get_json(silent=True) or {}
    # add() returns an (update_time, DocumentReference) tuple.
    _, doc_ref = db.collection("items").add({
        "name": data.get("name"),
        "description": data.get("description"),
        "createdAt": datetime.utcnow(),
    })
    return ({"id": doc_ref.id, **data}, 201)

Read:

def get_item(request, item_id):
    doc = db.collection("items").document(item_id).get()
    if not doc.exists:
        return ({"error": "Item not found"}, 404)
    return ({"id": doc.id, **doc.to_dict()}, 200)

Update:

def update_item(request, item_id):
    data = request.get_json(silent=True) or {}
    db.collection("items").document(item_id).update({"name": data.get("name")})
    return ({"id": item_id, "name": data.get("name")}, 200)

Delete:

def delete_item(request, item_id):
    db.collection("items").document(item_id).delete()
    return ("", 204)

Query:

def list_items(request):
    query = (
        db.collection("items")
        .where("status", "==", "active")
        .order_by("createdAt", direction=firestore.Query.DESCENDING)
    )
    items = [{"id": doc.id, **doc.to_dict()} for doc in query.stream()]
    return (items, 200)

Firestore Best Practices

  1. Design for your queries — Firestore requires indexes for complex queries
  2. Avoid hot partitions — Distribute keys evenly
  3. Use batch writes — Multiple documents at once
  4. Index strategically — Create manually for known queries
  5. Enable offline support — Built-in, good for mobile
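
Best practice 3 can be sketched briefly: a write batch queues operations locally and commits them atomically in one round trip (up to 500 operations per batch). `db` is assumed to be a `firestore.Client()`, and the `items` collection mirrors the examples above:

```python
def save_batch(db, items):
    """Write many item dicts (each with an itemId) in one atomic commit."""
    batch = db.batch()
    for item in items:
        ref = db.collection("items").document(item["itemId"])
        batch.set(ref, item)  # queued locally; nothing is sent yet
    batch.commit()  # single atomic round trip
    return len(items)
```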

AWS DynamoDB vs. Google Cloud Firestore

Feature                DynamoDB            Firestore
Data Model             Key-value           Document-based
Query Power            Index-driven        Rich querying
Consistency            Configurable        Strong by default (see docs)
Real-Time Listeners    Streams             Built-in listeners
Pricing                See AWS pricing     See GCP pricing
Document Size          See limits          See limits
Transactions           See limits          See limits
Offline Support        No                  Yes (mobile SDKs)
Complex Queries        Requires indexes    Requires indexes for some queries

Which Should You Use?

Use DynamoDB if:

  • You're in AWS ecosystem
  • Simple queries (key lookups)
  • Write-heavy workloads
  • You want built-in DAX caching

Use Firestore if:

  • You need rich querying
  • Document hierarchies
  • Real-time listeners
  • Mobile app integration
  • You prefer strong consistency

Best Practice

Learn both. Enterprise systems often use multi-cloud for flexibility and redundancy.


Hands-On: Item Tracker (Both Clouds)

Part A (AWS):

  1. Create DynamoDB table
  2. Write Lambda to create/read/update/delete items

Part B (GCP):

  1. Create Firestore collection
  2. Write Cloud Function with same CRUD operations

Hint for Part A: use itemId as the Partition Key and createdAt as the Sort Key.

Compare the two implementations. Which felt simpler?

Key Takeaway

Managed NoSQL databases (DynamoDB on AWS, Firestore on GCP) are the natural fit for serverless applications: they scale with your functions and charge only for what you use.


Project (Cloud-Agnostic)

Build a CRUD API for items using a NoSQL database.

Deliverables:

  1. Describe the data model and access patterns.
  2. Map the database to AWS or GCP services.
  3. Explain how your schema supports your queries.

If you want feedback, email your write-up to [email protected].

