Engineering Blog

Writing & Thinking

Practical articles on building production software — from type-safe AI pipelines to legacy migrations and edge deployments.

9 min read

The Post-Training Revolution: Why Fine-Tuning Beats Prompting for 80% of Tasks

Fine-tuned small models outperform zero-shot GPT-4 on 80% of classification tasks at 10–100x lower cost. LoRA, DPO, and fine-tuning-as-a-service changed the calculus. Here's the decision framework.

AI, Fine-Tuning, SLM, Machine Learning
Read article
9 min read

AI Agents and SQL: How to Give LLMs Safe Database Access

Text-to-SQL is one of the most useful AI features you can build — and the most dangerous if done carelessly. Here's the five-layer security architecture for production.

AI, PostgreSQL, Security, TypeScript
Read article
9 min read

Red-Teaming Your AI: A Developer's Guide to Breaking Things on Purpose

Pre-deployment testing increasingly fails to predict real-world AI behavior. Here's a four-category framework for systematically finding your AI's failure modes before your users do.

AI, Testing, Security, AI Safety
Read article
9 min read

Always-On AI Agents: From Demo to 24/7 Production

The gap between "works on my laptop" and "runs reliably 24/7" is wider than most developers realize. State persistence, error recovery, resource budgets, and the five problems nobody warns you about.

AI, AI Agents, Production Deployment, Architecture
Read article
9 min read

The Judgment Gap: Why AI Makes Senior Engineers More Valuable, Not Less

73% of teams use AI daily but productivity gains are only 10%. The gap between 3x gains and zero gains is engineering judgment — and AI is making it more valuable, not less.

AI, Career, Developer Tools, Productivity
Read article
9 min read

AI Agent Security: What the $47,000 Prompt Injection Taught Everyone

A production AI agent was tricked into issuing $47,000 in unauthorized refunds. Traditional authentication fails for agents. Here's the secure architecture pattern.

AI, Security, AI Agents, Production Deployment
Read article
8 min read

Multimodal AI in Practice: Adding Vision to Your TypeScript App

The multimodal AI market hit $3.85B in 2026. Google's Gemini Embedding 2 maps text, images, and audio into one vector space. Here's how to add vision to your app with practical code.

AI, TypeScript, Multimodal AI, Computer Vision
Read article
9 min read

The EU AI Act Hits in August: What Developers Actually Need to Do

Even calling an LLM API triggers legal obligations if you serve EU users. Fines up to €35M. The major deadline is August 2, 2026. Here's the practical checklist.

AI, Regulation, EU AI Act, Compliance
Read article
9 min read

Small Language Models: Why 3B Parameters Is All You Need

In 2022, 60% on MMLU required 540B parameters. By 2024, 3.8B parameters hit the same score — a 142x reduction. SLMs are the future of production AI, and the data proves it.

AI, SLM, Machine Learning, Cost Optimization
Read article
9 min read

Open-Source LLMs in Production: When to Skip the API

I was paying $400/month for ticket classification. A fine-tuned Mistral 7B matched the accuracy at $31/month. Here's the decision framework for when open-source beats proprietary.

AI, Open Source, LLM, Self-Hosting
Read article
9 min read

The Solo Developer's Stack in 2026: What I Use and Why

Eight years of production software, distilled into one stack. React 19, TanStack Start, Cloudflare Workers, PostgreSQL, and the AI tools that save me 3 hours a day.

Developer Tools, Architecture, React, Cloudflare Workers
Read article
8 min read

Designing AI-Friendly APIs: What LLMs Need From Your Endpoints

Your API was designed for developers who read documentation. AI agents are a different consumer — brilliant at structure, terrible at reading between the lines. Here's how to design for both.

AI, API Design, TypeScript, Architecture
Read article
9 min read

How I Use PostgreSQL as a Complete AI Backend

Vector storage, conversation history, prompt versioning, usage analytics, and job scheduling — all in one database. Why I stopped reaching for six services and used PostgreSQL for everything.

PostgreSQL, AI, Architecture, pgvector
Read article
9 min read

On-Device AI: Running Models in the Browser with WebGPU

A healthcare client refused to send data to external APIs. WebGPU and Transformers.js let me run a text classifier entirely in the browser — 30ms inference, zero API calls, full privacy.

AI, WebGPU, Browser, Edge Computing
Read article
9 min read

The Embedding Problem Nobody Talks About: Why AI Search Degrades Over Time

Six months after launch, users said search was getting worse. The code hadn't changed. The culprit: embedding drift — model deprecation, silent updates, and corpus evolution.

AI, RAG, Embeddings, Maintenance
Read article
8 min read

AI Code Review: How LLMs Catch Bugs Your Linter Misses

My last production bug passed TypeScript strict mode and every lint rule. An LLM found it in 5 seconds. Here's how to add AI code review to your CI pipeline for $2/month.

AI, Code Review, CI/CD, Developer Tools
Read article
9 min read

Why Every AI Feature Needs a Fallback Plan

At 2 AM on a Tuesday, the Anthropic API went down for 47 minutes. Our AI feature became a blank page. That was a $12,000 lesson in building resilient AI products.

AI, Architecture, Reliability, Production Deployment
Read article
8 min read

Semantic Caching for LLM Applications: How I Cut Our API Bill by 60%

40% of our LLM calls were near-duplicates — same question, different words, full price. Semantic caching with pgvector dropped the monthly bill from $215 to $85.

AI, Caching, Cost Optimization, pgvector
Read article
9 min read

LLM Guardrails in Production: How to Keep AI on Script

A user asked our compliance AI to write a poem about marijuana regulations. It did, beautifully. That was the wrong answer. Here are the five layers of guardrails I built afterward.

AI, Security, LLM, Production Deployment
Read article
8 min read

The Latency Tax: Why AI Features Feel Slow and How to Fix It

Users abandoned our AI tool — not because the output was bad, but because it took 5 seconds. Streaming, skeleton states, predictive loading, and the 500ms rule that changed everything.

AI, Performance, UX, React
Read article
9 min read

Context Engineering: The Skill That Replaced Prompt Engineering

Prompt engineering was never the real skill. Context engineering — designing the informational environment around the model — is what actually determines output quality. Here's what changed and why it matters.

AI, Context Engineering, LLM, AI Agents
Read article
9 min read

Vibe Coding: The Most Divisive Trend in Software Development

73% of dev teams use AI coding tools daily, but org-level productivity gains are only ~10%. The reality behind the hype, the controversy, and what actually works.

AI, Developer Tools, Vibe Coding, Productivity
Read article
9 min read

The Terminal-First Revolution: How Coding Agents Changed the Way Developers Work

Claude Code went from launch to the #1 AI coding tool in under a year. This isn't just better autocomplete — it's a fundamental shift from code completion to task completion.

AI, Developer Tools, Claude Code, Coding Agents
Read article
9 min read

MCP: The Protocol That Became AI's USB-C

From internal Anthropic experiment to Linux Foundation standard with 100M monthly downloads in one year. How Model Context Protocol became the universal connector for AI agents.

AI, MCP, AI Infrastructure, Protocols
Read article
9 min read

Testing AI Agents: How to QA Systems That Never Give the Same Answer Twice

Traditional testing assumes determinism. AI agents break that assumption. Here's how evaluation-driven development, behavioral testing, and agent observability are filling the gap.

AI, AI Agents, Testing, Evaluation
Read article
8 min read

Why 89% of AI Agent Pilots Never Reach Production

AI agent pilots doubled in 2025 but full deployment is stuck at 11%. Here's what's blocking production and a practical framework to break through.

AI, AI Agents, Enterprise AI, Production Deployment
Read article
9 min read

Multi-Agent Systems Are the New Microservices

Multi-agent AI architectures are following the same arc as microservices. Here are the design patterns, orchestration models, and protocols you need to know in 2026.

AI, AI Agents, Architecture, Design Patterns
Read article
8 min read

From $30 to $0.10: How Falling Token Costs Unleashed AI Agents

Token pricing dropped 99.7% in three years. Here's how that cost collapse is reshaping AI agent architectures and creating a new discipline of agent economics.

AI, AI Agents, AI Economics, Cost Optimization
Read article
10 min read

Agents That Build Themselves: The Rise of Self-Improving AI

AI agents are learning to optimize their own code, discover new tools, and accumulate skills over time. Here's how recursive self-improvement works, why it matters, and what keeps researchers up at night.

AI, AI Agents, Self-Improving AI, Machine Learning
Read article
8 min read

Streaming LLM Responses in React with Server-Sent Events

A practical guide to building real-time AI chat interfaces in React using Server-Sent Events — from server-side stream creation to smooth client-side token rendering.

React, AI, Streaming, Server-Sent Events
Read article
9 min read

How to Build AI Agents with Tool Calling in TypeScript

A deep dive into building LLM-powered agents that can execute real actions — from defining type-safe tools with Zod to implementing the agent loop and handling failures in production.

AI, TypeScript, LLM, Tool Calling
Read article
8 min read

React Server Components and AI: The Full-Stack Pattern You're Missing

Why AI logic belongs on the server, how React Server Components create a natural boundary for LLM calls, and patterns for streaming AI responses to interactive client UIs.

React, Server Components, AI, Full-Stack
Read article
9 min read

Building a RAG Pipeline with pgvector and TypeScript

How I built a production retrieval-augmented generation pipeline using PostgreSQL and pgvector instead of a dedicated vector database — from embedding documents to semantic search.

RAG, PostgreSQL, pgvector, TypeScript, AI
Read article
7 min read

Managing AI Chat State in React with TanStack Store

Why useState falls apart for AI chat interfaces, and how TanStack Store provides fine-grained reactive state management that keeps streaming UIs performant.

React, TanStack Store, AI, State Management
Read article
7 min read

How to Switch AI Providers Without Rewriting Your App

The adapter pattern for LLM providers — how to build a provider-agnostic AI layer in TypeScript so you can switch between OpenAI, Anthropic, and Google without touching your application code.

AI, TypeScript, Architecture, LLM
Read article
8 min read

Building Type-Safe AI Pipelines with TypeScript

How to use TypeScript's type system and Zod validation to build reliable, production-grade LLM data pipelines that catch errors before they reach your users.

TypeScript, AI, LLM, Data Pipelines
Read article
8 min read

Running LLMs at the Edge: AI on Cloudflare Workers

How to deploy AI-powered features to Cloudflare Workers, from proxying LLM API calls at the edge to running inference on Cloudflare's network with Workers AI, with real latency benchmarks.

Cloudflare Workers, AI, Edge Computing, LLM
Read article
8 min read

Getting Structured Output from LLMs: Stop Parsing JSON by Hand

How to use Zod schemas and provider-specific structured output features to get validated, type-safe data from LLMs — no more regex parsing or hoping the model follows instructions.

AI, Zod, TypeScript, LLM
Read article
10 min read

How I Migrated a Legacy .NET Dashboard to React

A practical guide to migrating legacy Blazor/.NET enterprise applications to React and TypeScript — from reverse-engineering undocumented APIs to shipping a modern frontend.

React, TypeScript, Migration, .NET
Read article
8 min read

Building Real-Time AI Dashboards with React and TanStack Query

How I built a dashboard that displays AI pipeline status, processing metrics, and LLM-generated insights in real time using TanStack Query polling and optimistic updates.

React, TanStack Query, AI, Dashboard
Read article
9 min read

Prompt Engineering Patterns That Actually Work in Production

Battle-tested prompt engineering patterns from production AI systems — system prompt architecture, few-shot examples, chain-of-thought reasoning, and how to version and test prompts like code.

AI, Prompt Engineering, LLM, TypeScript
Read article
7 min read

Deploying Full-Stack Apps to Cloudflare Workers with TanStack Start

Everything you need to know about deploying full-stack React applications to Cloudflare Workers using TanStack Start — from Vite configuration to edge-native patterns.

Cloudflare Workers, TanStack Start, React, Edge Computing
Read article
9 min read

The Developer's Guide to Building AI-Powered SaaS Products

Lessons from building an AI-powered regulatory compliance platform from zero to MVP as a sole developer — covering RAG pipelines, prompt engineering, and payment processing.

AI, SaaS, Product Development, RAG
Read article