AI pitfall
AI-generated GraphQL schemas almost never include depth limits or complexity limits. Without these, a malicious client can craft a single deeply nested query that fans out into thousands of resolver calls and database queries, taking down your server. Every production GraphQL API needs query depth and complexity guards.

GraphQL gives you power and flexibility. It also gives you the ability to build an API that brings your database to its knees with a single query. Schema design is where you prevent that. This lesson covers the patterns that make GraphQL APIs performant, maintainable, and safe.

Schema-first development

Good to know
Schema-first means your frontend and backend teams can work in parallel from day one. The frontend mocks the API from the schema, the backend implements resolvers against the same contract. This is one of GraphQL's biggest practical advantages over REST, where the contract is often informal.

In GraphQL, you can approach development two ways: code-first (write resolvers, generate schema) or schema-first (write schema, then implement resolvers). Schema-first wins for most teams.

# schema.graphql - write this first
type User {
  id: ID!
  name: String!
  email: String!
  posts(first: Int = 10, after: String): PostConnection!
  createdAt: DateTime!
}

type Post {
  id: ID!
  title: String!
  body: String!
  author: User!
  comments(first: Int = 10): CommentConnection!
  publishedAt: DateTime
  status: PostStatus!
}

enum PostStatus {
  DRAFT
  PUBLISHED
  ARCHIVED
}

type Query {
  user(id: ID!): User
  post(id: ID!): Post
  posts(first: Int = 20, after: String, status: PostStatus): PostConnection!
}

type Mutation {
  createPost(input: CreatePostInput!): CreatePostPayload!
  updatePost(id: ID!, input: UpdatePostInput!): UpdatePostPayload!
  deletePost(id: ID!): DeletePostPayload!
}

The schema becomes the contract. Frontend developers can build their queries immediately using mock data. Backend developers implement resolvers knowing exactly what shape the data must have. Changes to the schema are reviewed by both teams.
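To make "build against the schema immediately" concrete, here is a minimal hand-rolled mock. All fixture values and names below are invented for illustration; real projects typically generate mocks from the schema with tooling such as @graphql-tools/mock.

```javascript
// Hypothetical canned fixtures shaped like the schema above, so the
// frontend can render real query shapes before any resolver exists.
const fixtures = {
  user: {
    id: 'u1',
    name: 'Ada Lovelace',
    email: 'ada@example.com',
    createdAt: '2024-01-15T09:00:00Z',
    posts: { nodes: [{ id: 'p1', title: 'First post', status: 'PUBLISHED' }] },
  },
};

// Stand-in for the real executor: resolve a root field to canned data.
function mockQuery(field) {
  return fixtures[field] ?? null;
}
```

When the backend ships real resolvers, the frontend swaps the mock executor for the live endpoint without changing a single query.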

Input types and payload types

Notice the mutation pattern above: input: CreatePostInput! for arguments and CreatePostPayload! for the return type. This is a GraphQL convention that keeps mutations clean and extensible.

input CreatePostInput {
  title: String!
  body: String!
  status: PostStatus = DRAFT
}

type CreatePostPayload {
  post: Post
  errors: [UserError!]!
}

type UserError {
  field: String!
  message: String!
}

The payload includes both the result and potential errors. This is important because GraphQL always returns HTTP 200; you cannot rely on status codes to communicate domain errors.
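On the client side, this means domain errors are read out of the payload rather than caught as transport failures. A sketch (the response shape follows the payload type above; the function name is invented):

```javascript
// Turn a createPost mutation response into something a form can use.
function handleCreatePost(response) {
  const { post, errors } = response.data.createPost;
  if (errors.length > 0) {
    // Domain errors: map them to the offending form fields
    const fieldErrors = Object.fromEntries(errors.map(e => [e.field, e.message]));
    return { ok: false, fieldErrors };
  }
  return { ok: true, post };
}
```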


The N+1 problem

Edge case
The N+1 problem is not just a performance issue, it can cause database connection pool exhaustion under load. If 50 concurrent GraphQL queries each trigger 20 resolver calls, that is 1,000 database queries happening simultaneously. DataLoader reduces this to 100, which is the difference between a working system and an outage.

This is the single biggest performance issue in GraphQL. If you do not address it, your API will be slow and your database will suffer.

How it happens

Consider this query:

query {
  posts(first: 20) {
    nodes {
      title
      author {
        name
      }
    }
  }
}

A naive resolver implementation:

const resolvers = {
  Query: {
    posts: () => db.query('SELECT * FROM posts LIMIT 20')
  },
  Post: {
    author: (post) => db.query('SELECT * FROM users WHERE id = ?', [post.authorId])
  }
};

What happens at runtime:

  1. One query fetches 20 posts
  2. For each post, one query fetches the author
  3. Total: 1 + 20 = 21 database queries

If many posts share the same authors (say, 20 posts written by the same 3 people), you are fetching the same user rows over and over. With deeper nesting (author's posts, those posts' comments), the query count explodes.
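You can watch the explosion happen with a toy in-memory "database" that counts every query it receives (all names here are illustrative):

```javascript
// 20 posts spread across 3 authors, and a db stub that counts queries.
let queryCount = 0;
const db = {
  postsTable: Array.from({ length: 20 }, (_, i) => ({ id: i + 1, authorId: (i % 3) + 1 })),
  usersTable: [{ id: 1, name: 'A' }, { id: 2, name: 'B' }, { id: 3, name: 'C' }],
  selectPosts() { queryCount += 1; return this.postsTable; },
  selectUser(id) { queryCount += 1; return this.usersTable.find(u => u.id === id); },
};

// Naive resolution: one query for the list, one per post for its author.
const posts = db.selectPosts();
const withAuthors = posts.map(p => ({ ...p, author: db.selectUser(p.authorId) }));

console.log(queryCount); // 21 (1 + 20), even though only 3 distinct users exist
```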

The DataLoader solution

Facebook created DataLoader specifically for this problem. It batches individual lookups into a single query per type per tick of the event loop.

const DataLoader = require('dataloader');

// Create a loader that batches user lookups
const userLoader = new DataLoader(async (userIds) => {
  // One query for ALL user IDs
  const users = await db.query(
    'SELECT * FROM users WHERE id IN (?)',
    [userIds]
  );

  // Return users in the same order as the input IDs
  const userMap = new Map(users.map(u => [u.id, u]));
  return userIds.map(id => userMap.get(id) || null);
});

const resolvers = {
  Query: {
    posts: () => db.query('SELECT * FROM posts LIMIT 20')
  },
  Post: {
    author: (post) => userLoader.load(post.authorId)
    // DataLoader collects all .load() calls in one tick
    // then fires ONE query: SELECT * FROM users WHERE id IN (1, 5, 12)
  }
};

Result: 1 query for posts + 1 query for all authors = 2 queries total. The key rules for DataLoader:

  • Create a new DataLoader instance per request (to avoid caching across users)
  • The batch function must return results in the same order as the input keys
  • DataLoader also deduplicates: if post 3 and post 7 have the same author, it is only fetched once
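To demystify "one batch per tick", here is a toy batcher (not the real DataLoader; it omits caching and deduplication) that collects every key requested in the current tick and fires the batch function exactly once:

```javascript
// Toy per-tick batcher with DataLoader's ordering contract:
// results come back in the same order as the requested keys.
function createToyLoader(batchFn) {
  let queue = [];
  return function load(key) {
    return new Promise((resolve) => {
      queue.push({ key, resolve });
      if (queue.length === 1) {
        // First key this tick: flush the whole queue after the tick ends
        queueMicrotask(async () => {
          const batch = queue;
          queue = [];
          const results = await batchFn(batch.map(item => item.key));
          batch.forEach((item, i) => item.resolve(results[i]));
        });
      }
    });
  };
}

// Three .load() calls in one tick → ONE call to the batch function.
const loadDouble = createToyLoader(async (keys) => keys.map(k => k * 2));
Promise.all([loadDouble(1), loadDouble(2), loadDouble(3)])
  .then(results => console.log(results)); // [2, 4, 6]
```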

Schema evolution

REST APIs version with URL prefixes (/v1/, /v2/). GraphQL takes a different approach: schema evolution. You never remove fields. You add new ones and deprecate old ones.

type User {
  id: ID!
  name: String!                  # Original field
  displayName: String!           # New field (better name)
  fullName: String! @deprecated(reason: "Use displayName instead")
  email: String!
  avatarUrl: String              # Added in month 3
  avatar: String @deprecated(reason: "Use avatarUrl instead. Will be removed 2025-06-01")
}

The @deprecated directive tells clients that a field is going away. GraphQL tooling (GraphiQL, Apollo Client) shows deprecated fields with a warning. Clients can migrate at their own pace.

Evolution workflow

  1. Add new field alongside the old one
  2. Deprecate old field with a reason and removal date
  3. Monitor usage: check if any clients still query the deprecated field
  4. Remove only when usage drops to zero (or after the announced deadline)
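Step 3 needs actual numbers. Managed platforms report field usage for you; as a sketch, you can also count hits on a deprecated field with a resolver wrapper (all names here are invented):

```javascript
// Count how often each deprecated field is still being resolved.
const deprecatedHits = {};

function trackDeprecated(fieldName, resolve) {
  return (...args) => {
    deprecatedHits[fieldName] = (deprecatedHits[fieldName] ?? 0) + 1;
    return resolve(...args);
  };
}

// Wrap the deprecated field's resolver; its behavior is unchanged.
const fullNameResolver = trackDeprecated('User.fullName', (user) => user.name);
```

When `deprecatedHits['User.fullName']` flatlines at zero in production, the field is safe to remove.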

This is significantly better than REST versioning because you do not maintain two separate API versions. There is always exactly one schema, with some fields marked as deprecated.


Error handling in GraphQL

GraphQL error handling is different from REST and frequently misunderstood.

The errors array

Every GraphQL response has this shape:

{
  "data": { ... },
  "errors": [ ... ]
}

Both can be present simultaneously. This is called partial data: some fields resolved successfully, others failed.

{
  "data": {
    "user": {
      "name": "Alice",
      "email": null,
      "recentOrders": null
    }
  },
  "errors": [
    {
      "message": "Not authorized to view email",
      "path": ["user", "email"],
      "extensions": { "code": "FORBIDDEN" }
    },
    {
      "message": "Order service unavailable",
      "path": ["user", "recentOrders"],
      "extensions": { "code": "SERVICE_UNAVAILABLE" }
    }
  ]
}

The client gets the user's name even though email and orders failed. In REST, the entire request would have returned a 403 or 500.
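Client code therefore has to check the errors array per path, not per request. A small sketch that indexes errors by dotted path (the shape mirrors the response above; the helper name is invented):

```javascript
// Index GraphQL errors by their dotted path so UI code can ask
// "did user.email specifically fail?" instead of failing the whole view.
function errorsByPath(response) {
  const index = {};
  for (const err of response.errors ?? []) {
    index[err.path.join('.')] = err.extensions?.code ?? 'UNKNOWN';
  }
  return index;
}
```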

Domain errors vs system errors

Best practice is to separate them:

# System errors → errors array (resolver throws)
# Domain errors → part of the response type

type CreatePostPayload {
  post: Post
  errors: [UserError!]!    # Domain errors: validation, business rules
}

// Domain error: return it in the payload
createPost: async (_, { input }) => {
  if (input.title.length < 3) {
    return {
      post: null,
      errors: [{ field: 'title', message: 'Title must be at least 3 characters' }]
    };
  }
  const post = await db.posts.create(input);
  return { post, errors: [] };
}
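The validation branch above can be pulled into a pure helper, which keeps resolvers thin and makes the business rules unit-testable without a GraphQL server (the helper name and the body rule are illustrative):

```javascript
// Hypothetical pure validator for CreatePostInput: returns UserError-shaped
// objects, or an empty array when the input is valid.
function validateCreatePost(input) {
  const errors = [];
  if (!input.title || input.title.length < 3) {
    errors.push({ field: 'title', message: 'Title must be at least 3 characters' });
  }
  if (!input.body) {
    errors.push({ field: 'body', message: 'Body is required' });
  }
  return errors;
}
```

The resolver then becomes: run the validator, return `{ post: null, errors }` if anything came back, otherwise create the post.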

Authorization in resolvers

Never put authorization logic in the schema definition. The schema defines the shape of data, not who can access it.

// ✅ Good: Authorization in the resolver layer
const resolvers = {
  User: {
    email: (user, _, context) => {
      // Only the user themselves or admins can see email
      if (context.currentUser.id !== user.id && !context.currentUser.isAdmin) {
        throw new ForbiddenError('Not authorized to view email');
      }
      return user.email;
    },
    // A second, per-request loader that batches post lookups by author id
    posts: (user) => postsByUserLoader.load(user.id)
  }
};

// ❌ Bad: Trying to enforce auth in schema
// There is no way to do this in SDL - the schema has no concept of "who"

For complex authorization, use a dedicated layer:

// auth.js - reusable authorization checks
function canViewEmail(viewer, targetUser) {
  return viewer.id === targetUser.id || viewer.role === 'admin';
}

function canEditPost(viewer, post) {
  return post.authorId === viewer.id || viewer.role === 'admin';
}

// resolvers use the auth layer
email: (user, _, { currentUser }) => {
  if (!canViewEmail(currentUser, user)) {
    throw new ForbiddenError('Not authorized');
  }
  return user.email;
}
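A common refinement is a higher-order resolver that pairs a predicate with a field resolver, so the check cannot be forgotten. A sketch (the wrapper name is invented, and the error class is simplified to a plain Error):

```javascript
// Wrap a resolver so it only runs when the predicate passes.
function authorized(check, resolve) {
  return (parent, args, context, info) => {
    if (!check(context.currentUser, parent)) {
      throw new Error('Not authorized');
    }
    return resolve(parent, args, context, info);
  };
}

function canViewEmail(viewer, targetUser) {
  return viewer.id === targetUser.id || viewer.role === 'admin';
}

// Guarded field resolver for User.email
const emailResolver = authorized(canViewEmail, (user) => user.email);
```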

Best practices vs anti-patterns

Best practice                                 | Anti-pattern
Schema-first: design schema before resolvers  | Code-first without reviewing the generated schema
Use DataLoader for all nested resolvers       | Direct DB queries in every resolver (N+1)
Input types for mutations (CreatePostInput)   | Flat argument lists (title: String!, body: String!)
Payload types with errors (CreatePostPayload) | Throwing errors for validation failures
Cursor-based pagination (Connection pattern)  | Returning unbounded arrays
Deprecate fields with @deprecated             | Removing fields without warning
Query complexity/depth limits                 | Allowing arbitrarily deep nested queries
Authorization in resolvers                    | Authorization in schema or nowhere
New DataLoader per request                    | Shared DataLoader across requests (data leaks)
Nullable fields for partial failure           | Non-null everything (one failure kills the query)

Query complexity and depth limits

Without limits, a malicious client can craft a query that crashes your server:

# Depth attack: nested relationships to infinite depth
query {
  user(id: 1) {
    posts {
      nodes {
        author {
          posts {
            nodes {
              author {
                posts { ... }
              }
            }
          }
        }
      }
    }
  }
}

Protect yourself with limits:

const { ApolloServer } = require('apollo-server');
const depthLimit = require('graphql-depth-limit');
const { createComplexityLimitRule } = require('graphql-validation-complexity');

const server = new ApolloServer({
  schema,
  validationRules: [
    createComplexityLimitRule(1000),  // Max query complexity score
    depthLimit(7)                     // Max nesting depth
  ]
});
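What depthLimit counts can be illustrated with a toy depth function over a parsed selection tree (this is not the real GraphQL AST shape; production servers validate the query document itself):

```javascript
// Depth of a selection tree: a leaf field counts as 1,
// each level of nesting adds 1.
function maxDepth(selection) {
  if (!selection.children || selection.children.length === 0) return 1;
  return 1 + Math.max(...selection.children.map(maxDepth));
}

// user { posts { nodes { author } } } → depth 4
const query = {
  name: 'user',
  children: [{ name: 'posts', children: [{ name: 'nodes', children: [{ name: 'author' }] }] }],
};
console.log(maxDepth(query)); // 4
```

A depth limit of 7 rejects the recursive attack query above long before any resolver runs.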

Every production GraphQL API needs these guards. They are not optional.