Engineering Blog

Writing & Thinking

Practical articles on building production software — from type-safe AI pipelines to legacy migrations and edge deployments.

March 19, 2026 · 9 min read

The Post-Training Revolution: Why Fine-Tuning Beats Prompting for 80% of Tasks

Fine-tuned small models outperform zero-shot GPT-4 on 80% of classification tasks at 10–100x lower cost. LoRA, DPO, and fine-tuning-as-a-service changed the calculus. Here's the decision framework.

AI, Fine-Tuning, SLM, Machine Learning

March 19, 2026 · 9 min read

AI Agents and SQL: How to Give LLMs Safe Database Access

Text-to-SQL is one of the most useful AI features you can build — and the most dangerous if done carelessly. Here's the five-layer security architecture for production.

AI, PostgreSQL, Security, TypeScript

March 19, 2026 · 9 min read

Red-Teaming Your AI: A Developer's Guide to Breaking Things on Purpose

Pre-deployment testing increasingly fails to predict real-world AI behavior. Here's a four-category framework for systematically finding your AI's failure modes before your users do.

AI, Testing, Security, AI Safety

March 19, 2026 · 9 min read

Always-On AI Agents: From Demo to 24/7 Production

The gap between "works on my laptop" and "runs reliably 24/7" is wider than most developers realize. State persistence, error recovery, resource budgets, and the five problems nobody warns you about.

AI, AI Agents, Production Deployment, Architecture

March 18, 2026 · 9 min read

The Judgment Gap: Why AI Makes Senior Engineers More Valuable, Not Less

73% of teams use AI daily but productivity gains are only 10%. The gap between 3x gains and zero gains is engineering judgment — and AI is making it more valuable, not less.

AI, Career, Developer Tools, Productivity

March 18, 2026 · 9 min read

AI Agent Security: What the $47,000 Prompt Injection Taught Everyone

A production AI agent was tricked into issuing $47,000 in unauthorized refunds. Traditional authentication fails for agents. Here's the secure architecture pattern.

AI, Security, AI Agents, Production Deployment

March 18, 2026 · 8 min read

Multimodal AI in Practice: Adding Vision to Your TypeScript App

The multimodal AI market hit $3.85B in 2026. Google's Gemini Embedding 2 maps text, images, and audio into one vector space. Here's how to add vision to your app with practical code.

AI, TypeScript, Multimodal AI, Computer Vision

March 18, 2026 · 9 min read

The EU AI Act Hits in August: What Developers Actually Need to Do

Even calling an LLM API triggers legal obligations if you serve EU users. Fines up to €35M. The major deadline is August 2, 2026. Here's the practical checklist.

AI, Regulation, EU AI Act, Compliance

March 17, 2026 · 9 min read

Small Language Models: Why 3B Parameters Is All You Need

In 2022, 60% on MMLU required 540B parameters. By 2024, 3.8B parameters hit the same score — a 142x reduction. SLMs are the future of production AI, and the data proves it.

AI, SLM, Machine Learning, Cost Optimization

March 17, 2026 · 9 min read

Open-Source LLMs in Production: When to Skip the API

I was paying $400/month for ticket classification. A fine-tuned Mistral 7B matched the accuracy at $31/month. Here's the decision framework for when open-source beats proprietary.

AI, Open Source, LLM, Self-Hosting

March 17, 2026 · 9 min read

The Solo Developer's Stack in 2026: What I Use and Why

Eight years of production software, distilled into one stack. React 19, TanStack Start, Cloudflare Workers, PostgreSQL, and the AI tools that save me 3 hours a day.

Developer Tools, Architecture, React, Cloudflare Workers

March 17, 2026 · 8 min read

Designing AI-Friendly APIs: What LLMs Need From Your Endpoints

Your API was designed for developers who read documentation. AI agents are a different consumer — brilliant at structure, terrible at reading between the lines. Here's how to design for both.

AI, API Design, TypeScript, Architecture

March 17, 2026 · 9 min read

How I Use PostgreSQL as a Complete AI Backend

Vector storage, conversation history, prompt versioning, usage analytics, and job scheduling — all in one database. Why I stopped reaching for six services and used PostgreSQL for everything.

PostgreSQL, AI, Architecture, pgvector

March 17, 2026 · 9 min read

On-Device AI: Running Models in the Browser with WebGPU

A healthcare client refused to send data to external APIs. WebGPU and Transformers.js let me run a text classifier entirely in the browser — 30ms inference, zero API calls, full privacy.

AI, WebGPU, Browser, Edge Computing

March 17, 2026 · 9 min read

The Embedding Problem Nobody Talks About: Why AI Search Degrades Over Time

Six months after launch, users said search was getting worse. The code hadn't changed. The culprit: embedding drift — model deprecation, silent updates, and corpus evolution.

AI, RAG, Embeddings, Maintenance

March 17, 2026 · 8 min read

AI Code Review: How LLMs Catch Bugs Your Linter Misses

My last production bug passed TypeScript strict mode and every lint rule. An LLM found it in 5 seconds. Here's how to add AI code review to your CI pipeline for $2/month.

AI, Code Review, CI/CD, Developer Tools

March 17, 2026 · 9 min read

Why Every AI Feature Needs a Fallback Plan

At 2 AM on a Tuesday, the Anthropic API went down for 47 minutes. Our AI feature became a blank page. That was a $12,000 lesson in building resilient AI products.

AI, Architecture, Reliability, Production Deployment

March 17, 2026 · 8 min read

Semantic Caching for LLM Applications: How I Cut Our API Bill by 60%

40% of our LLM calls were near-duplicates — same question, different words, full price. Semantic caching with pgvector dropped the monthly bill from $215 to $85.

AI, Caching, Cost Optimization, pgvector

March 17, 2026 · 9 min read

LLM Guardrails in Production: How to Keep AI on Script

A user asked our compliance AI to write a poem about marijuana regulations. It did, beautifully. That was the wrong answer. Here are the five layers of guardrails I built afterward.

AI, Security, LLM, Production Deployment

March 17, 2026 · 8 min read

The Latency Tax: Why AI Features Feel Slow and How to Fix It

Users abandoned our AI tool — not because the output was bad, but because it took 5 seconds. Streaming, skeleton states, predictive loading, and the 500ms rule that changed everything.

AI, Performance, UX, React

March 16, 2026 · 9 min read

Context Engineering: The Skill That Replaced Prompt Engineering

Prompt engineering was never the real skill. Context engineering — designing the informational environment around the model — is what actually determines output quality. Here's what changed and why it matters.

AI, Context Engineering, LLM, AI Agents

March 16, 2026 · 9 min read

Vibe Coding: The Most Divisive Trend in Software Development

73% of dev teams use AI coding tools daily, but org-level productivity gains are only ~10%. The reality behind the hype, the controversy, and what actually works.

AI, Developer Tools, Vibe Coding, Productivity

March 16, 2026 · 9 min read

The Terminal-First Revolution: How Coding Agents Changed the Way Developers Work

Claude Code went from launch to the #1 AI coding tool in under a year. This isn't just better autocomplete — it's a fundamental shift from code completion to task completion.

AI, Developer Tools, Claude Code, Coding Agents

March 16, 2026 · 9 min read

MCP: The Protocol That Became AI's USB-C

From internal Anthropic experiment to Linux Foundation standard with 100M monthly downloads in one year. How Model Context Protocol became the universal connector for AI agents.

AI, MCP, AI Infrastructure, Protocols

March 16, 2026 · 9 min read

Testing AI Agents: How to QA Systems That Never Give the Same Answer Twice

Traditional testing assumes determinism. AI agents break that assumption. Here's how evaluation-driven development, behavioral testing, and agent observability are filling the gap.

AI, AI Agents, Testing, Evaluation

March 16, 2026 · 8 min read

Why 89% of AI Agent Pilots Never Reach Production

AI agent pilots doubled in 2025 but full deployment is stuck at 11%. Here's what's blocking production and a practical framework to break through.

AI, AI Agents, Enterprise AI, Production Deployment

March 16, 2026 · 9 min read

Multi-Agent Systems Are the New Microservices

Multi-agent AI architectures are following the same arc as microservices. Here are the design patterns, orchestration models, and protocols you need to know in 2026.

AI, AI Agents, Architecture, Design Patterns

March 16, 2026 · 8 min read

From $30 to $0.10: How Falling Token Costs Unleashed AI Agents

Token pricing dropped 99.7% in three years. Here's how that cost collapse is reshaping AI agent architectures and creating a new discipline of agent economics.

AI, AI Agents, AI Economics, Cost Optimization

March 16, 2026 · 10 min read

Agents That Build Themselves: The Rise of Self-Improving AI

AI agents are learning to optimize their own code, discover new tools, and accumulate skills over time. Here's how recursive self-improvement works, why it matters, and what keeps researchers up at night.

AI, AI Agents, Self-Improving AI, Machine Learning

March 16, 2026 · 8 min read

Streaming LLM Responses in React with Server-Sent Events

A practical guide to building real-time AI chat interfaces in React using Server-Sent Events — from server-side stream creation to smooth client-side token rendering.

React, AI, Streaming, Server-Sent Events

March 15, 2026 · 9 min read

How to Build AI Agents with Tool Calling in TypeScript

A deep dive into building LLM-powered agents that can execute real actions — from defining type-safe tools with Zod to implementing the agent loop and handling failures in production.

AI, TypeScript, LLM, Tool Calling

March 14, 2026 · 8 min read

React Server Components and AI: The Full-Stack Pattern You're Missing

Why AI logic belongs on the server, how React Server Components create a natural boundary for LLM calls, and patterns for streaming AI responses to interactive client UIs.

React, Server Components, AI, Full-Stack

March 13, 2026 · 9 min read

Building a RAG Pipeline with pgvector and TypeScript

How I built a production retrieval-augmented generation pipeline using PostgreSQL and pgvector instead of a dedicated vector database — from embedding documents to semantic search.

RAG, PostgreSQL, pgvector, TypeScript, AI

March 12, 2026 · 7 min read

Managing AI Chat State in React with TanStack Store

Why useState falls apart for AI chat interfaces, and how TanStack Store provides fine-grained reactive state management that keeps streaming UIs performant.

React, TanStack Store, AI, State Management

March 11, 2026 · 7 min read

How to Switch AI Providers Without Rewriting Your App

The adapter pattern for LLM providers — how to build a provider-agnostic AI layer in TypeScript so you can switch between OpenAI, Anthropic, and Google without touching your application code.

AI, TypeScript, Architecture, LLM

March 10, 2026 · 8 min read

Building Type-Safe AI Pipelines with TypeScript

How to use TypeScript's type system and Zod validation to build reliable, production-grade LLM data pipelines that catch errors before they reach your users.

TypeScript, AI, LLM, Data Pipelines

March 7, 2026 · 8 min read

Running LLMs at the Edge: AI on Cloudflare Workers

How to deploy AI-powered features to Cloudflare Workers, from proxying LLM API calls at the edge to running inference on Cloudflare's network with Workers AI, with real latency benchmarks.

Cloudflare Workers, AI, Edge Computing, LLM

March 3, 2026 · 8 min read

Getting Structured Output from LLMs: Stop Parsing JSON by Hand

How to use Zod schemas and provider-specific structured output features to get validated, type-safe data from LLMs — no more regex parsing or hoping the model follows instructions.

AI, Zod, TypeScript, LLM

February 18, 2026 · 10 min read

How I Migrated a Legacy .NET Dashboard to React

A practical guide to migrating legacy Blazor/.NET enterprise applications to React and TypeScript — from reverse-engineering undocumented APIs to shipping a modern frontend.

React, TypeScript, Migration, .NET

February 14, 2026 · 8 min read

Building Real-Time AI Dashboards with React and TanStack Query

How I built a dashboard that displays AI pipeline status, processing metrics, and LLM-generated insights in real time using TanStack Query polling and optimistic updates.

React, TanStack Query, AI, Dashboard

February 8, 2026 · 9 min read

Prompt Engineering Patterns That Actually Work in Production

Battle-tested prompt engineering patterns from production AI systems — system prompt architecture, few-shot examples, chain-of-thought reasoning, and how to version and test prompts like code.

AI, Prompt Engineering, LLM, TypeScript

January 25, 2026 · 7 min read

Deploying Full-Stack Apps to Cloudflare Workers with TanStack Start

Everything you need to know about deploying full-stack React applications to Cloudflare Workers using TanStack Start — from Vite configuration to edge-native patterns.

Cloudflare Workers, TanStack Start, React, Edge Computing

January 8, 2026 · 9 min read

The Developer's Guide to Building AI-Powered SaaS Products

Lessons from building an AI-powered regulatory compliance platform from zero to MVP as a sole developer — covering RAG pipelines, prompt engineering, and payment processing.

AI, SaaS, Product Development, RAG
