Enterprise Integration · 7 min read · Written by Tomáš Mikeš

Ingesting millions of photos from IoT devices on Azure: from camera to cloud

For Fotopast.cloud we process photos from thousands of camera traps. The pipeline that scales: Event Grid triggers, blob storage tiering, deduplication, and cost optimization for bulk uploads.

Tags: IoT · Azure · Blob Storage · Pipeline

An IoT device sends a photo once an hour. That sounds like low volume, but multiplied across 5 000 devices in service it's 120 000 photos per day. A chunk of that arrives as batch uploads once daily: traps don't photograph sleeping animals, so during the dusk activity peak a device sends 40-60 photos at once. Ingesting that volume without message drops and at reasonable cost takes more than “an API endpoint that writes to S3.”

This is the architecture we built for Fotopast.cloud, running for ~18 months with no intervention other than scaling.

Phase 1: Ingestion — an HTTP endpoint that doesn't drop

The device POSTs multipart with a photo plus JSON metadata (GPS, time, device ID, battery). The endpoint must:

  • Respond within 5 seconds, or the device retries — doubling the data
  • Accept slightly malformed JSON (older firmware) and recover
  • Verify auth tokens — without blocking ingestion on slow auth checks (pre-validation at the edge)

Our solution: Azure Functions on the Consumption plan with an HTTP trigger. The function only writes a blob to the landing container and returns 202 Accepted. No business validation, no DB write. Average latency 120 ms, P99 800 ms.
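A minimal sketch of that ingestion handler in Python. A plain dict stands in for the landing container (in production it would be a blob client), and the names `handle_upload` and `deviceId` are assumptions for illustration, not the production code:

```python
import json
import uuid

def handle_upload(photo: bytes, metadata_json: str, landing: dict) -> int:
    """Ingestion sketch: persist the raw payload and return 202 immediately."""
    try:
        meta = json.loads(metadata_json)
    except json.JSONDecodeError:
        # Older firmware sends slightly malformed JSON; keep the raw string
        # and let the later processing phase recover what it can.
        meta = {"_raw": metadata_json}

    device_id = str(meta.get("deviceId", "unknown"))
    # No hash yet -- hashing happens in the processing phase, not here.
    blob_name = f"{device_id}/{uuid.uuid4().hex}.jpg"
    landing[blob_name] = photo

    return 202  # accepted; all business validation is deferred
```

The point of the sketch: nothing in the hot path can fail on bad input, so the device always gets its 202 within the 5-second retry window.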

Business logic runs later against the landing container. Because ingestion tolerates bad data, the drop rate is zero.

Phase 2: Deduplication — the same photo shouldn't land twice

Firmware edge cases resend the same photo multiple times. Our solution: store each photo under the name {deviceId}/{photoHash}.jpg, where photoHash is the SHA-256 of the binary content. If a blob with that name already exists, the overwrite is a no-op.

The hash is computed in the processing phase (not at ingestion, where speed is king), and each photo is indexed by its hash in the DB. After hashing, two records of the same photo land on the same DB row and the duplicate vanishes.
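The content-addressed naming can be sketched in a few lines (`dedup_blob_name` is a hypothetical helper name, not from the production code):

```python
import hashlib

def dedup_blob_name(device_id: str, photo: bytes) -> str:
    """Content-addressed blob name: identical bytes always map to the same
    name, so re-uploading a duplicate overwrites the existing blob (a no-op)."""
    photo_hash = hashlib.sha256(photo).hexdigest()
    return f"{device_id}/{photo_hash}.jpg"
```

Because the name is derived purely from the bytes, deduplication needs no lookup table: the storage key itself is the dedup index.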

Phase 3: Event Grid — processing without polling

Event Grid fires an event on every blob upload. A second Azure Function is triggered by the event, not by polling — “every 10 s check for new files” would be slow and expensive.

The processing function:

  • Computes the hash, creates a DB record (idempotent)
  • Generates 3 thumbnail sizes (400 px, 1200 px, full) and stores them in the hot container
  • Extracts EXIF metadata (GPS, time)
  • Pushes a notification to the user (if enabled)
  • Moves the original blob to the cold container (archive, cheaper tier)

Why thumbnails upfront instead of on-the-fly? Because one photo gets read ~50-200× (list view, detail, sharing). Generate thumbnails once, cache forever: 10-50× cheaper than server-side resize on every read.
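The idempotency of the processing step is the part worth sketching: Event Grid may deliver an event more than once, so the handler must converge on the same DB row. A simplified version, with a dict as the database and all names hypothetical:

```python
import hashlib

THUMBNAIL_WIDTHS = (400, 1200)  # plus the full-size copy

def process_photo(device_id: str, photo: bytes, db: dict) -> dict:
    """Idempotent processing sketch: two events for the same photo
    converge on the same DB record, so duplicates vanish."""
    photo_hash = hashlib.sha256(photo).hexdigest()
    key = (device_id, photo_hash)
    if key in db:
        return db[key]  # already processed -- safe to redeliver the event
    record = {
        "hash": photo_hash,
        "thumbnails": [f"hot/{photo_hash}_{w}.jpg" for w in THUMBNAIL_WIDTHS],
        "original": f"cold/{device_id}/{photo_hash}.jpg",
    }
    db[key] = record
    return record
```

The early return on an existing key is what makes at-least-once event delivery safe without any distributed locking.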

Phase 4: Blob storage tiering — cost drops 3-5×

Azure Blob has three tiers:

  • Hot: ~$0.018/GB/mo, instant reads. We use this for thumbnails.
  • Cool: ~$0.010/GB/mo, instant reads but higher per-operation cost. Full photos 30 days to 12 months old.
  • Archive: ~$0.002/GB/mo, 1-15 hour rehydration before reads. Photos > 1 year old.

Automatic lifecycle policy: each photo goes to Cool after 30 days, to Archive after 12 months. Business logic: 90% of users only look at the last 30 days, 99% don't look at things older than a year — but we MUST keep them for compliance / evidence.
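The tiering above maps onto an Azure Storage lifecycle management policy roughly like this (a sketch: the rule name and the `cold/` prefix filter are assumptions based on the container naming earlier in the article):

```json
{
  "rules": [
    {
      "enabled": true,
      "name": "photo-tiering",
      "type": "Lifecycle",
      "definition": {
        "filters": {
          "blobTypes": ["blockBlob"],
          "prefixMatch": ["cold/"]
        },
        "actions": {
          "baseBlob": {
            "tierToCool": { "daysAfterModificationGreaterThan": 30 },
            "tierToArchive": { "daysAfterModificationGreaterThan": 365 }
          }
        }
      }
    }
  ]
}
```

Once the policy is set on the storage account, tiering happens with no application code at all.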

For Fotopast this cut storage cost by 60% after the first year of operation — from ~$1 200/mo to ~$480/mo.

Phase 5: Backpressure and rate limiting

A single device with buggy firmware starts sending 100 photos per minute. Without protection that would fill the queue and slow processing for everyone else. Device-level rate limit:

  • Redis counter per deviceId at the ingestion endpoint
  • Limit 200 photos / 5 min. On breach: return 429 Too Many Requests (device retries later)
  • Alert when 10+ devices breach the limit (possible fleet-wide firmware bug)
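A fixed-window version of that limiter can be sketched as follows. In production the counter lives in Redis (INCR plus EXPIRE per deviceId); here a dict keeps the sketch self-contained, and the class name is hypothetical:

```python
import time

WINDOW_SECONDS = 300   # 5-minute window
LIMIT = 200            # photos per device per window

class RateLimiter:
    """Fixed-window per-device limiter sketch (Redis stand-in)."""

    def __init__(self, now=time.time):
        self._now = now
        self._windows = {}  # device_id -> (window_start, count)

    def allow(self, device_id: str) -> bool:
        now = self._now()
        start, count = self._windows.get(device_id, (now, 0))
        if now - start >= WINDOW_SECONDS:
            start, count = now, 0      # window expired: reset the counter
        count += 1
        self._windows[device_id] = (start, count)
        return count <= LIMIT          # False -> respond 429 Too Many Requests
```

The injectable clock is just for testing; the per-device key is what isolates one buggy camera from the rest of the fleet.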

The resulting numbers

After 18 months of Fotopast operation:

  • ~45M photos in the system
  • ~120 TB total (most in Archive tier)
  • Ingestion throughput peak at 500 photos/minute
  • P99 API end-to-end latency: 950 ms
  • Azure cost ~$600/mo (including compute and storage)
  • 0 lost photos in 18 months

What we'd do differently

Three lessons for the next IoT project:

  • Dead-letter queue from day one. We added it a month in, after seeing the first edge-case data.
  • Lifecycle policy set on day 1, not day 60. Moving a backlog of existing photos to Cool tier is more expensive than setting it right upfront.
  • Monitor per-device metrics, not just system-level. Aggregate “500 photos/min” is a healthy average. But 10 devices sending 50 photos/min each = buggy firmware, which you only find per-device.

Overall: 2 months to deliver, 18 months of hands-off operation, and an architecture that hasn't changed, just scaled.

Working on something similar?

Book a 30-minute technical call. No sales process — direct architectural feedback.

