
Building Type-Safe AI Pipelines with TypeScript

How to use TypeScript's type system and Zod validation to build reliable, production-grade LLM data pipelines that catch errors before they reach your users.

TypeScript · AI · LLM · Data Pipelines

When I started building LLM-powered data pipelines at Lit Alerts, I quickly realized the biggest challenge wasn't the AI models themselves — it was the unpredictable nature of the data flowing between them. You send a carefully structured prompt and get back... whatever the model feels like returning. Missing fields, extra conversational filler, hallucinated JSON keys. If your downstream services expect a specific schema, your application crashes.

In a traditional application, you control your API contracts. With LLMs, you're dealing with probabilistic outputs. I've found that the only way to build reliable, production-grade AI pipelines is to enforce strict type safety at every boundary — treating the LLM as untrusted user input.

Why Type Safety Matters More with LLMs

The core issue is non-determinism. You might ask for a JSON object with four fields, but the model might return five, or three, or wrap its response in conversational text. If your pipeline blindly passes this data to a database insert or a downstream API, you'll get cryptic runtime errors that are nearly impossible to reproduce.

TypeScript alone catches structural issues at compile time. But LLM responses arrive at runtime. You need runtime validation that mirrors your TypeScript types — and that's where Zod comes in.

Schema Validation with Zod

At Lit Alerts, every LLM response passes through a Zod schema before it touches any business logic. Zod gives you two things simultaneously: a TypeScript type and a runtime validator.

import { z } from 'zod'

const PipelineResultSchema = z.object({
  summary: z.string().min(1),
  sentiment: z.enum(['positive', 'negative', 'neutral']),
  confidenceScore: z.number().min(0).max(1),
  entities: z.array(z.string()),
})

type PipelineResult = z.infer<typeof PipelineResultSchema>

async function processLLMResponse(rawOutput: string): Promise<PipelineResult> {
  // JSON.parse throws if the model wrapped its answer in conversational filler;
  // Zod then enforces the expected shape on whatever parsed successfully.
  const parsed = JSON.parse(rawOutput)
  return PipelineResultSchema.parse(parsed)
}

If the LLM omits confidenceScore or returns sentiment: "maybe", this throws a ZodError with a clear message pointing to the exact field that failed. No more guessing why your pipeline silently corrupted a database row three hours ago.

Typed Pipeline Steps

For complex pipelines with multiple LLM calls, I structure each step as a typed function with explicit input and output contracts. This makes the pipeline composable and easy to debug.

type PipelineStep<Input, Output> = {
  name: string
  execute: (input: Input) => Promise<Output>
}

function createPipeline<A, B, C>(
  step1: PipelineStep<A, B>,
  step2: PipelineStep<B, C>
) {
  return async (input: A): Promise<C> => {
    const intermediate = await step1.execute(input)
    return step2.execute(intermediate)
  }
}

const summarize: PipelineStep<string, { summary: string; length: number }> = {
  name: 'summarize',
  execute: async (text) => {
    const result = await callLLM(`Summarize: ${text}`)
    return SummarySchema.parse(JSON.parse(result))
  },
}

const classify: PipelineStep<{ summary: string; length: number }, string> = {
  name: 'classify',
  execute: async ({ summary }) => {
    const result = await callLLM(`Classify sentiment: ${summary}`)
    return SentimentSchema.parse(result)
  },
}
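To see the composition end to end, here is a self-contained sketch that re-declares the step type and swaps the LLM-backed steps for deterministic stubs (the stub logic is purely illustrative):

```typescript
type PipelineStep<Input, Output> = {
  name: string
  execute: (input: Input) => Promise<Output>
}

function createPipeline<A, B, C>(
  step1: PipelineStep<A, B>,
  step2: PipelineStep<B, C>
) {
  return async (input: A): Promise<C> => {
    const intermediate = await step1.execute(input)
    return step2.execute(intermediate)
  }
}

// Deterministic stand-ins for the LLM-backed summarize/classify steps.
const stubSummarize: PipelineStep<string, { summary: string; length: number }> = {
  name: 'summarize',
  execute: async (text) => ({ summary: text.slice(0, 20), length: text.length }),
}

const stubClassify: PipelineStep<{ summary: string; length: number }, string> = {
  name: 'classify',
  execute: async ({ summary }) =>
    summary.includes('great') ? 'positive' : 'neutral',
}

// The compiler verifies stubSummarize's output matches stubClassify's input;
// swapping in a step with a mismatched shape fails the build, not production.
const pipeline = createPipeline(stubSummarize, stubClassify)
```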

If I change the output shape of summarize, TypeScript immediately flags every downstream consumer that needs updating. The compiler becomes your pipeline documentation.

Error Handling and Retry Logic

At Lit Alerts, we process thousands of documents daily. We've seen everything — malformed JSON, empty responses, rate limits, timeouts. Our error handling strategy:

  • Validation errors (wrong schema): Log the raw LLM output, increment a metric, skip the document for human review
  • Transient errors (timeouts, 500s): Retry with exponential backoff, up to 3 attempts
  • Persistent failures: Alert the team, quarantine the document

The key insight: treat validation failures differently from infrastructure failures. A timeout is retryable. A model that consistently returns the wrong schema needs a prompt fix, not more retries.
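A minimal sketch of that policy: validation failures fail fast, transient failures back off exponentially. The error classes and withRetry helper are hypothetical names, not from any particular library:

```typescript
// Hypothetical error taxonomy mirroring the strategy above.
class ValidationError extends Error {}
class TransientError extends Error {}

async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 500
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await fn()
    } catch (err) {
      // Wrong schema: retrying won't help — surface it for a prompt fix.
      if (err instanceof ValidationError) throw err
      if (attempt >= maxAttempts) throw err
      // Timeout / 500: back off exponentially (500ms, 1s, 2s, ...).
      await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** (attempt - 1)))
    }
  }
}
```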

Practical Takeaways

After building multiple production AI pipelines, here's what I'd tell anyone starting out:

  • Validate at every boundary. Between the LLM and your code. Between pipeline steps. Before any database write.
  • Use Zod's .safeParse() in production instead of .parse() — it returns a result object instead of throwing, which is easier to handle in async pipelines.
  • Version your schemas. When the LLM prompt changes, your expected output shape may change too. Schema versioning prevents silent data corruption.
  • Log raw LLM responses before validation. When a schema fails, you need the original output for debugging.

The goal is to move complexity from runtime to compile time. When you do that, you spend less time debugging and more time building features that actually deliver value. TypeScript and Zod together make LLM output as predictable as any other data source in your application.
