Writing.
Things we've learned, mostly the hard way. Updated when we have something to say, not on a content calendar.
Cheap RAG: how far you can go on a single Postgres
pgvector, halfvec, BM25 with paradedb — when you don't need a vector DB and how to know.
AI 1 min
AWS SQS: Dead Letter Queues, Visibility Timeout, and Retry Patterns
SQS is simple to start with and easy to misconfigure at scale. Dead letter queues, visibility timeout math, and retry patterns are where most teams get burned. Here's what to get right.
Cloud 7 min
AWS Route 53 Health Checks and Failover Routing: How to Actually Use Them
Route 53 health checks and failover routing can make your application survive regional outages automatically. Here's how to configure them correctly and what the common mistakes are.
Cloud 7 min
AWS ElastiCache: Redis vs Memcached and When to Use Each
ElastiCache Redis and Memcached both cache data, but they have very different capabilities, durability models, and cost profiles. Here's how to choose and what the gotchas are.
Cloud 7 min
AWS EBS vs EFS vs S3: Choosing the Right Storage Service
EBS, EFS, and S3 all store data but serve fundamentally different purposes. Here's how to choose based on access pattern, latency requirements, and cost.
Cloud 7 min
AWS Cost Explorer: How to Actually Use It to Find Savings
Cost Explorer has the data to find most AWS cost problems. Most teams use it wrong. Here's how to navigate it to find actionable savings, not just charts.
Cloud 6 min
AWS CDK vs Terraform: Which IaC Tool to Use on AWS
CDK and Terraform both provision AWS infrastructure, but they have fundamentally different models, learning curves, and appropriate use cases. Here's how to choose.
Cloud 6 min
AWS Bedrock Knowledge Bases: Building RAG Pipelines on AWS
Bedrock Knowledge Bases is AWS's managed RAG infrastructure. Here's how it works, what it costs, and when to use it versus building your own retrieval pipeline.
Cloud 7 min
Aurora Serverless v2: When It Makes Sense and When It Doesn't
Aurora Serverless v2 promises automatic scaling and cost savings. In practice, it's the right choice for some workloads and significantly more expensive for others. Here's the honest breakdown.
Cloud 6 min
Terraform State Management on AWS: S3, DynamoDB, and Getting It Right
Remote state in S3 with DynamoDB locking is the standard, but the details — encryption, versioning, state file organization, and access control — determine whether it works reliably at team scale.
Cloud 8 min
AWS WAF: What to Configure, What to Skip, and Why the Defaults Aren't Enough
AWS WAF protects web applications from common exploits, but the default managed rule groups leave gaps, and misconfigured rate limiting creates more problems than it solves. Here's a practical setup.
Cloud 7 min
AWS Transit Gateway vs VPC Peering: Which to Use and When
Transit Gateway and VPC Peering both connect VPCs, but they have different cost models, routing behaviors, and operational tradeoffs. Here's how to decide.
Cloud 7 min
AWS GuardDuty: Setup, What to Enable, and What the Alerts Actually Mean
GuardDuty is one of the highest-value AWS security services and one of the most misunderstood. Here's what it actually detects, how to configure it properly, and how to respond to findings.
Cloud 7 min
AWS DynamoDB Cost Optimization: Capacity Modes, Design Patterns, and the Charges That Surprise You
DynamoDB costs are non-obvious. The right capacity mode, table design, and read/write patterns can reduce your DynamoDB bill by 60-80%. Here's what to look at.
Cloud 7 min
AWS CloudTrail: What to Enable, What to Audit, and What Teams Miss
CloudTrail is the audit backbone of AWS security and compliance. Most accounts have it partially configured. Here's what a complete setup actually looks like.
Cloud 8 min
AWS API Gateway vs ALB: Which One Should Front Your API?
API Gateway and ALB both route HTTP traffic, but they have different cost models, feature sets, and operational characteristics. Here's when to use each and when to use both.
Cloud 7 min
We replaced our staging environment with a script
Ephemeral preview envs on Fly per PR, 90 seconds cold, $0.04 per branch.
Cloud 1 min
AWS Step Functions vs SQS vs EventBridge: Choosing the Right Orchestration Tool
Three services that all move work between systems, with very different cost models, failure handling, and operational characteristics. Here's when to use each.
Cloud 8 min
AWS Organizations and Multi-Account Strategy: The Right Structure for Growing Teams
A single AWS account works fine until it doesn't. Here's how to design a multi-account structure with Organizations that scales without creating operational overhead.
Cloud 8 min
AWS Lambda Cold Starts: What They Are and How to Fix Them
Cold starts add 200ms to 10 seconds of latency to Lambda invocations. Here's what causes them, which ones actually matter, and the practical fixes for each runtime.
Cloud 8 min
AWS CloudFront: Cost, Performance, and the Configuration Mistakes That Cost Both
CloudFront reduces latency and origin load, but misconfigured caching, wrong price classes, and uncompressed responses can make it expensive without the performance benefits.
Cloud 8 min
AWS Secrets Manager vs Parameter Store: Which One to Use and When
Both store secrets. Both integrate with Lambda, ECS, and EC2. The differences in cost, rotation, cross-account access, and compliance posture determine which one belongs in your architecture.
Cloud 7 min
AWS Multi-Region Architecture: What It Actually Takes
Active-active, active-passive, and disaster recovery — the three patterns, what they cost, and what gets complicated when you cross region boundaries on AWS.
Cloud 8 min
AWS Fargate Cost Optimization: Why Your Containers Cost More Than They Should
Fargate charges for exactly what you provision, not what you use. That distinction — and a few common configuration mistakes — is why most Fargate workloads run 40-60% over budget.
Cloud 7 min
HIPAA-Compliant AWS Architecture: What You Actually Need to Configure
AWS will sign a BAA for most services, but a signed BAA doesn't make your architecture HIPAA-compliant. Here's what the actual configuration requirements look like across encryption, access control, audit logging, and incident response.
Cloud 8 min
EKS Cost Optimization: Where Kubernetes Bills Go Wrong on AWS
EKS clusters accumulate cost through over-provisioned node groups, idle capacity, missing Spot configuration, and data transfer patterns that aren't obvious from the console. Here's where to look and what to fix.
Cloud 8 min
AWS S3 Cost Optimization: Where the Money Actually Goes
S3 looks cheap until it isn't. Storage classes, request costs, data transfer fees, and Intelligent-Tiering gotchas — here's where S3 costs hide and how to eliminate them systematically.
Cloud 7 min
AWS CloudWatch: What to Actually Monitor
Most AWS accounts either monitor nothing or drown in alerts that nobody acts on. Here's the practical set of metrics, alarms, and dashboards that actually tell you when something is wrong.
Cloud 6 min
Amazon SES: Why Your Emails Aren't Being Delivered
SES configuration fails in a small number of predictable ways. Here's how to diagnose sandbox issues, DNS record problems, suppression list blocks, and bounce rate traps before they quietly kill your email sending.
Cloud 6 min
AWS VPC Design Best Practices
Your VPC design affects security, cost, and operational complexity for everything that runs in it. Here's the practical subnet strategy, NAT gateway placement, and security model that holds up at scale.
Cloud 6 min
AWS RDS vs Aurora: When to Migrate
Aurora offers better performance, automatic failover, and storage auto-scaling — but it's not always the right choice. Here's how to decide, and what a migration actually involves.
Cloud 5 min
AWS Lambda Best Practices for Production
Running Lambda in production is different from running it in a demo. Here's what actually matters: cold starts, memory tuning, error handling, observability, and the cost tradeoffs most teams get wrong.
Cloud 5 min
AWS IAM Best Practices: A Practical Guide
IAM misconfigurations are the most common finding in AWS security audits. Here's the practical checklist — root account, least privilege, roles vs users, and how to audit what you have.
Cloud 5 min
Why AWS Bedrock PoCs Fail in Production (And How to Build One That Doesn't)
Most AWS Bedrock proof-of-concepts never make it to production. Here's what separates a PoC that ships from one that gets shelved — and what the production architecture actually looks like.
Cloud 6 min
What I Find in Every AWS Cost Audit
After auditing dozens of AWS accounts, the same three problems keep showing up. Here's what they are, why they happen, and how to fix them.
Cloud 5 min
The AWS Security Audit Checklist I Use on Every Engagement
A practical, prioritized checklist for auditing AWS account security. Covers IAM, S3, CloudTrail, root account hygiene, and Security Hub — with remediation steps for each finding.
Cloud 5 min
Terraform vs CloudFormation: Which to Use on AWS
Terraform and CloudFormation both provision AWS infrastructure as code. Here's how to choose between them — and what the choice actually means for your team's long-term operational posture.
Cloud 5 min
How to Right-Size EC2 Instances (Without Breaking Anything)
Over-provisioned EC2 instances are one of the most consistent findings in AWS cost audits. Here's how to find them, quantify the savings, and downsize safely.
Cloud 5 min
How to Reduce AWS Data Transfer Costs (The Ones You Can't Explain)
AWS data transfer charges are the most misunderstood line item on most AWS bills. Here's how to find where the charges are coming from and what to do about them.
Cloud 5 min
AWS Savings Plans vs Reserved Instances: Which Should You Buy?
AWS Savings Plans and Reserved Instances both reduce your compute bill by up to 72%. Here's how to choose between them — and how to avoid the commitment traps.
Cloud 6 min
AWS Multi-Account Strategy: When You Need It and How to Set It Up
A single AWS account works fine until it doesn't. Here's how to know when you need a multi-account structure, what the AWS Organizations setup looks like, and the common mistakes to avoid.
Cloud 5 min
AWS ECS vs EKS: Which Container Orchestration Platform Should You Use?
ECS and EKS both run containers on AWS, but they're built for different teams and contexts. Here's how to choose — and why the Fargate dimension often matters more than the orchestrator.
Cloud 5 min
Why Bozeman, why now: the rural-tech thesis
What 18 months out of SF taught us about hiring, focus, and the cost of dependencies.
Essay 1 min