
Designing AI-Friendly APIs: What LLMs Need From Your Endpoints

Your API was designed for developers who read documentation. AI agents are a different consumer — brilliant at structure, terrible at reading between the lines. Here's how to design for both.

AI · API Design · TypeScript · Architecture

I spent a week integrating an LLM with a third-party API, and the experience was miserable. Not because the API was bad — it was well-documented, well-structured, and thoroughly tested. It was miserable because the API was designed for humans reading documentation, not for AI agents parsing responses.

Error messages said "Invalid request." Not which field was invalid or what the valid options were. Enum values were numeric codes that required a lookup table. Nested objects went seven levels deep with no way to request a flatter view. The API worked fine when I was the one reading the responses. It was terrible when an LLM was.

This experience made me rethink how I design APIs — because in 2026, AI agents are a primary consumer of your endpoints.

The New Consumer

Your API used to have one type of consumer: application code written by a developer who read your documentation. That developer understood your domain, knew your conventions, and could handle ambiguity.

AI agents are a different kind of consumer. They have massive context windows but limited patience for ambiguity. They can process structured data instantly but struggle with implicit conventions. They are excellent at following explicit instructions but terrible at inferring unwritten rules.

Designing APIs for AI agents does not mean abandoning good API design. It means being more explicit, more descriptive, and more structured than you might have been when your only consumer was a developer with access to Slack and a search engine.

Self-Describing Error Responses

The single most impactful thing you can do is make your error responses informative enough that an AI agent can fix its own mistakes.

// Bad: What is the agent supposed to do with this?
{
  "error": "Invalid request",
  "code": 400
}

// Better: The agent knows exactly what to fix
{
  "error": {
    "type": "validation_error",
    "message": "Request validation failed",
    "details": [
      {
        "field": "date_range.start",
        "issue": "Must be in ISO 8601 format (YYYY-MM-DD)",
        "received": "March 1, 2026",
        "example": "2026-03-01"
      },
      {
        "field": "category",
        "issue": "Must be one of the allowed values",
        "received": "reports",
        "allowed_values": ["financial", "compliance", "operational", "audit"]
      }
    ]
  }
}

When an AI agent receives the first error, it has to guess what went wrong. It might retry with the same bad request, or it might try to parse the word "Invalid" for clues. When it receives the second error, it knows exactly which fields to fix, what format to use, and what values are acceptable. It can construct a valid retry on the first attempt.

In my own APIs, I now follow a simple rule: every error response must contain enough information to construct a valid request.
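That rule is easy to encode as a small helper. This is a minimal sketch, not a library API — the `FieldIssue` shape and `validationError` name are my own illustration of the response format above:

```typescript
// Hypothetical helper: every validation failure reports the field, the
// problem, the value received, and what a valid value looks like.
type FieldIssue = {
  field: string;
  issue: string;
  received: unknown;
  example?: string;
  allowed_values?: string[];
};

function validationError(details: FieldIssue[]) {
  return {
    error: {
      type: 'validation_error' as const,
      message: 'Request validation failed',
      details,
    },
  };
}

const body = validationError([
  {
    field: 'date_range.start',
    issue: 'Must be in ISO 8601 format (YYYY-MM-DD)',
    received: 'March 1, 2026',
    example: '2026-03-01',
  },
]);
```

Because the helper forces you to state `issue`, `received`, and an example or allowed values for every failure, it is structurally impossible to ship a bare "Invalid request".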

Descriptive Enum Values

Numeric codes and abbreviations are a holdover from an era when bandwidth was expensive. An AI agent has no way to know what status code 7 means unless your lookup table happens to be in its context.

// Bad: What does status 3 mean?
{ "status": 3, "type": "A" }

// Good: Self-documenting
{ "status": "pending_review", "type": "annual_compliance_report" }

This applies to every field that has a fixed set of values. Use descriptive strings, not codes. Use full words, not abbreviations. The marginal increase in payload size is negligible compared to the reduction in AI confusion.
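If a legacy data model stores numeric codes, translate them at the API boundary rather than leaking them. A sketch, with a hypothetical code table:

```typescript
// Hypothetical legacy code table, translated once at the API boundary so
// consumers only ever see descriptive strings.
const STATUS_BY_CODE = {
  1: 'draft',
  2: 'submitted',
  3: 'pending_review',
  4: 'approved',
} as const;

type Status = (typeof STATUS_BY_CODE)[keyof typeof STATUS_BY_CODE];

function describeStatus(code: keyof typeof STATUS_BY_CODE): Status {
  return STATUS_BY_CODE[code];
}
```

The `as const` table also gives you a `Status` union type for free, so the descriptive values stay in sync with the serializer.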

Flat Response Structures

Deep nesting is hard for AI agents to navigate. When an agent needs to extract a specific value from a seven-level-deep JSON structure, it often gets confused about which level it is at, especially when similar field names appear at different levels.

// Hard for agents to navigate
{
  "report": {
    "metadata": {
      "audit": {
        "reviewer": {
          "contact": {
            "email": "reviewer@example.com"
          }
        }
      }
    }
  }
}

// Easier: Flat with dot-notation keys
{
  "report_id": "rpt_123",
  "audit_reviewer_email": "reviewer@example.com",
  "audit_reviewer_name": "Jane Smith",
  "audit_status": "pending_review",
  "audit_due_date": "2026-04-01"
}

I am not saying never use nesting — logical grouping still matters for readability. But when the nesting exists only because of object-relational mapping conventions and does not add semantic meaning, flatten it. Two levels of nesting is a good maximum for AI-consumed endpoints.
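Mechanical flattening gets you most of the way. The sketch below joins nested keys with underscores; note that the hand-flattened example above also prunes redundant levels like `metadata` and `contact`, which is a domain judgment no generic helper can make:

```typescript
// Sketch: flatten nested objects into underscore-joined keys for
// agent-facing responses. Arrays are treated as leaf values here; a
// production version would need a policy for them.
function flatten(
  obj: Record<string, unknown>,
  prefix = '',
): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(obj)) {
    const path = prefix ? `${prefix}_${key}` : key;
    if (value !== null && typeof value === 'object' && !Array.isArray(value)) {
      Object.assign(out, flatten(value as Record<string, unknown>, path));
    } else {
      out[path] = value;
    }
  }
  return out;
}

const flat = flatten({
  report: {
    metadata: {
      audit: { reviewer: { contact: { email: 'reviewer@example.com' } } },
    },
  },
});
// flat.report_metadata_audit_reviewer_contact_email === 'reviewer@example.com'
```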

Explicit Pagination Metadata

AI agents need to know how to page through results without guessing. Include explicit pagination metadata with every paginated response.

{
  "data": [...],
  "pagination": {
    "total_items": 247,
    "total_pages": 25,
    "current_page": 1,
    "per_page": 10,
    "has_next": true,
    "has_previous": false,
    "next_url": "/api/reports?page=2&per_page=10",
    "previous_url": null
  }
}

The next_url field is particularly important. An AI agent that can follow a URL does not need to understand your pagination parameter convention. It just follows the link. This is the same principle behind HATEOAS in REST design, and it turns out AI agents are the consumers that benefit from it most.
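Here is what that looks like from the consumer's side. A sketch of an agent-side loop that only ever follows `next_url` — the `getPage` callback stands in for whatever HTTP client the agent uses, and the response shape matches the example above:

```typescript
// Response shape assumed to match the pagination envelope shown above.
type Page<T> = {
  data: T[];
  pagination: { has_next: boolean; next_url: string | null };
};

// Collects every item by following next_url links; the consumer never
// constructs page/per_page parameters itself.
async function fetchAllPages<T>(
  getPage: (path: string) => Promise<Page<T>>,
  firstPath: string,
): Promise<T[]> {
  const items: T[] = [];
  let path: string | null = firstPath;
  while (path !== null) {
    const page: Page<T> = await getPage(path);
    items.push(...page.data);
    path = page.pagination.has_next ? page.pagination.next_url : null;
  }
  return items;
}
```

If you ever rename your pagination parameters, a consumer written this way never notices.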

Structured Tool Descriptions

If you are building APIs that AI agents will consume through tool calling (MCP servers, function calling, etc.), the quality of your tool descriptions directly determines how well the agent uses your API.

// Bad: The agent has to guess what this does
const tool = {
  name: 'getReports',
  description: 'Get reports',
  parameters: z.object({
    type: z.string(),
    from: z.string(),
    to: z.string(),
  }),
};

// Good: The agent knows exactly how to call this
const tool = {
  name: 'getComplianceReports',
  description: 'Retrieve compliance reports for a specific date range. Returns a paginated list of reports sorted by due date. Use this when the user asks about compliance status, upcoming deadlines, or audit history.',
  parameters: z.object({
    report_type: z.enum(['annual', 'quarterly', 'incident', 'audit'])
      .describe('The type of compliance report to retrieve'),
    date_from: z.string()
      .describe('Start date in ISO 8601 format (YYYY-MM-DD). Example: 2026-01-01'),
    date_to: z.string()
      .describe('End date in ISO 8601 format (YYYY-MM-DD). Example: 2026-03-31'),
    status: z.enum(['draft', 'pending_review', 'approved', 'rejected'])
      .optional()
      .describe('Filter by report status. Omit to include all statuses.'),
  }),
};

Every parameter should have a .describe() call. Every enum should use descriptive values. The tool description should explain not just what it does, but when to use it.

Consistent Date and Number Formatting

AI agents struggle with ambiguous formats. "03/04/2026" — is that March 4 or April 3? "1,234.56" — is that one thousand or one point two three four?

Standardize everything:

  • Dates: ISO 8601 always. 2026-03-16T14:30:00Z. No exceptions.
  • Numbers: No formatting. 1234.56, not 1,234.56 or $1,234.56.
  • Currency: Separate value and currency code. { "amount": 1234.56, "currency": "USD" }.
  • Booleans: Use true/false, not 1/0, "yes"/"no", or "Y"/"N".

// Ambiguous
{ "date": "03/16/26", "amount": "$1,234.56", "active": "Y" }

// Unambiguous
{ "date": "2026-03-16", "amount": 1234.56, "currency": "USD", "active": true }
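Enforcing this is a handful of tiny normalizers at the serialization layer. A sketch (the `Money` shape and the hard-coded USD default are assumptions for illustration):

```typescript
type Money = { amount: number; currency: string };

// Dates leave the API as ISO 8601 date strings, always.
function toApiDate(d: Date): string {
  return d.toISOString().slice(0, 10); // YYYY-MM-DD
}

// Strips display formatting from a legacy currency string.
// "$1,234.56" -> { amount: 1234.56, currency: "USD" }
function toMoney(raw: string): Money {
  return { amount: Number(raw.replace(/[$,]/g, '')), currency: 'USD' };
}
```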

Summary Endpoints

For AI agents, sometimes the best API is one that pre-aggregates data. Instead of forcing the agent to fetch 50 records and calculate an average, provide a summary endpoint.

// Instead of making the agent calculate this from raw data
app.get('/api/compliance/summary', async (req, res) => {
  const summary = await db.query(`
    SELECT
      COUNT(*) FILTER (WHERE status = 'approved') as approved,
      COUNT(*) FILTER (WHERE status = 'pending_review') as pending,
      COUNT(*) FILTER (WHERE status = 'overdue') as overdue,
      COUNT(*) as total,
      MIN(due_date) FILTER (WHERE status = 'pending_review') as next_deadline
    FROM compliance_reports
    WHERE organization_id = $1
  `, [req.user.orgId]);

  res.json({
    compliance_status: summary.rows[0],
    generated_at: new Date().toISOString(),
  });
});

This reduces the number of tool calls the agent needs (saving tokens and time), reduces the chance of calculation errors, and gives the agent a clean, pre-processed view of the data.

Testing with AI Consumers

The final step is testing your API with actual AI agents. I run a suite of tests where an LLM attempts to accomplish tasks using only the API's tool definitions and error responses — no documentation, no examples, just the API itself.

interface TestResult {
  completed: boolean;
  iterations: number;
  errors: number;
  totalTokens: number;
}

// `agent` and `apiTools` come from your agent framework and tool registry;
// the shape of agent.run() here is illustrative.
async function testAgentUsability(task: string): Promise<TestResult> {
  const result = await agent.run({
    task,
    tools: apiTools,
    maxIterations: 10,
  });

  return {
    completed: result.success,
    iterations: result.iterationCount,
    errors: result.errorCount,
    totalTokens: result.tokenUsage,
  };
}

// If the agent needs more than 3 iterations for a simple task,
// your API descriptions need improvement

If the agent cannot complete a straightforward task in 3 iterations, your tool descriptions are not descriptive enough. If it completes the task but uses excessive tokens, your responses are too verbose or too deeply nested. If it fails on the first error and cannot recover, your error responses are not informative enough.

Design your APIs for the dumbest smart consumer you will ever have. That consumer is an LLM — brilliant at processing structure, terrible at reading between the lines. Make everything explicit, and your APIs will be better for human consumers too.

