
Every test needs data. You can't test a user creation endpoint without a user object, can't test a search feature without records to search through, can't test pagination without enough rows in the database. The question isn't whether you need mock data; it's how you generate it without creating a maintenance nightmare.

If you've ever seen a test file with 200 lines of hand-crafted JSON objects at the top, you already know the problem. Let's see how to do it better.

The problem with hard-coded test data

Here's what test data looks like when you start:

const testUser = {
  name: 'John Doe',
  email: 'john@test.com',
  age: 30,
  role: 'admin',
  createdAt: '2025-01-01T00:00:00Z'
};

This works fine for one test. But then you need a second user — with a different email because your database has a unique constraint. Then a third. Then you need one that's a non-admin. Then one with a very long name to test truncation. Before you know it, you're managing 15 nearly-identical objects that differ in one field each.

AI pitfall
When you ask AI to write tests, it often generates a separate hard-coded object for every test case. This works, but it creates fragile, duplicated data that's painful to update when your schema changes. Ask for factories instead.

Factory functions — the right abstraction

A factory is a function that returns a new object every time, with sensible defaults you can override:

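// Counter keeps default emails unique even when two users are created in the same millisecond
let userCount = 0;

function createTestUser(overrides = {}) {
  userCount += 1;
  return {
    // Global crypto is available in Node 19+; on older versions, import { randomUUID } from 'node:crypto'
    id: crypto.randomUUID(),
    name: 'Test User',
    email: `user-${Date.now()}-${userCount}@test.com`,
    age: 25,
    role: 'user',
    createdAt: new Date().toISOString(),
    ...overrides,
  };
}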

Now your tests read like this:

// Default user — all you need for most tests
const user = createTestUser();

// Admin user — override just what matters
const admin = createTestUser({ role: 'admin' });

// User with specific email — for duplicate-check tests
const alice = createTestUser({ email: 'alice@company.com' });

The ...overrides spread is the key pattern. Because it comes last in the object, any field you pass in replaces the default. Every call gets fresh defaults, and you can still override any field. When you add a new field to your user schema, you update one function, not 50 test objects.

Multiple factories for related data

Real apps have related entities. Build factories that compose:

function createTestPost(overrides = {}) {
  return {
    id: crypto.randomUUID(),
    title: 'Test Post',
    body: 'This is a test post body with enough content to be realistic.',
    authorId: crypto.randomUUID(),
    tags: ['test'],
    published: true,
    createdAt: new Date().toISOString(),
    ...overrides,
  };
}

// Create a user with their posts
const author = createTestUser();
const posts = [
  createTestPost({ authorId: author.id, title: 'First post' }),
  createTestPost({ authorId: author.id, title: 'Second post' }),
];

Faker.js — realistic data at scale

Factory functions solve the structure problem. But "Test User" and "user-1234@test.com" don't look like real data — which means your tests might miss bugs that only surface with realistic inputs (unicode names, long emails, special characters).

Faker.js generates realistic-looking data:

npm install -D @faker-js/faker

import { faker } from '@faker-js/faker';

function createTestUser(overrides = {}) {
  return {
    id: crypto.randomUUID(),
    name: faker.person.fullName(),
    email: faker.internet.email(),
    age: faker.number.int({ min: 18, max: 80 }),
    role: 'user',
    avatar: faker.image.avatar(),
    bio: faker.lorem.sentence(),
    createdAt: faker.date.past().toISOString(),
    ...overrides,
  };
}

const user = createTestUser();
// e.g. { name: 'María García', email: 'maria.garcia42@hotmail.com', age: 34, ... }

Every call produces different data. Names come from real name databases, emails follow realistic patterns, dates fall in sensible ranges.

Useful Faker methods

| Method | Generates | Example output |
| --- | --- | --- |
| faker.person.fullName() | Full name | "Elena Kowalski" |
| faker.internet.email() | Email | "elena.k@gmail.com" |
| faker.internet.url() | URL | "https://fair-bicycle.info" |
| faker.lorem.sentence() | Sentence | "Voluptas eum deserunt..." |
| faker.lorem.paragraphs(2) | Paragraphs | Two realistic paragraphs |
| faker.number.int({ min, max }) | Integer | 42 |
| faker.date.past() | Past date | 2024-08-15T... |
| faker.date.future() | Future date | 2026-02-20T... |
| faker.image.avatar() | Avatar URL | "https://avatars..." |
| faker.string.uuid() | UUID | "a1b2c3d4-..." |
| faker.helpers.arrayElement([...]) | Random pick | Picks one from array |

Good to know
Faker supports locales. Use import { faker } from '@faker-js/faker/locale/fr' to get French names and addresses. This matters when you test locale-specific features like postal code validation.

Seeding for reproducible tests

Random data makes tests flaky if the randomness triggers different code paths. Fix this with a seed:

import { faker } from '@faker-js/faker';

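// Same seed = same Faker values every time
faker.seed(12345);

const user = createTestUser();
// Faker-generated fields (name, email, age, bio, avatar) are now identical on every run.
// Note: id still changes because crypto.randomUUID() ignores the seed, and
// faker.date.past() is relative to the current date unless you fix a reference date.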

Use seeding when:

  • Tests depend on specific data values (sorting, filtering)
  • You need snapshot testing with predictable output
  • Debugging a flaky test — seed it, reproduce it, fix it

Skip seeding when:

  • Tests should work with any valid data (most tests)
  • You want to find edge cases through randomness


Generating data in bulk

For testing pagination, search, or performance, you need many records:

function createTestUsers(count, overrides = {}) {
  // The Faker-based createTestUser already generates a fresh name and email on each call
  return Array.from({ length: count }, () => createTestUser(overrides));
}

// 100 users for pagination tests
const users = createTestUsers(100);

// 50 admin users
const admins = createTestUsers(50, { role: 'admin' });

Edge case data

This is where most AI-generated test data falls short. Real users submit surprising input. Your mock data should too:

const edgeCases = [
  createTestUser({ name: '' }),                          // empty name
  createTestUser({ name: 'A' }),                         // single char
  createTestUser({ name: 'José María García-López' }),   // accented + hyphen
  createTestUser({ name: '日本太郎' }),                   // CJK characters
  createTestUser({ name: 'A'.repeat(500) }),             // very long
  createTestUser({ email: 'user+tag@example.com' }),     // plus addressing
  createTestUser({ age: 0 }),                            // zero
  createTestUser({ age: -1 }),                           // negative
  createTestUser({ age: 999 }),                          // unrealistic but valid int
  createTestUser({ bio: null }),                         // null optional field
];

AI pitfall
AI tends to generate tests with "happy path" data: normal names, normal emails, normal ages. But bugs hide in edge cases. Always test with empty strings, null values, unicode, and boundary values.
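An edge-case list is most useful when each entry is paired with the outcome you expect. The sketch below shows one way to do that. Both `validateUser` and its limits (100-character names, ages 0 to 150) are hypothetical stand-ins for whatever validation your app really has, and the factory is a trimmed-down version so the example runs on its own.

```javascript
// Hypothetical validator, standing in for whatever your app really uses.
function validateUser(user) {
  const errors = [];
  if (typeof user.name !== 'string' || user.name.trim() === '') {
    errors.push('name is required');
  } else if (user.name.length > 100) {
    errors.push('name is too long');
  }
  if (!Number.isInteger(user.age) || user.age < 0 || user.age > 150) {
    errors.push('age is out of range');
  }
  return errors;
}

// Minimal stand-in for the lesson's factory.
function createTestUser(overrides = {}) {
  return { name: 'Test User', email: 'user@test.com', age: 25, bio: 'Hello', ...overrides };
}

// Each case pairs an input with the outcome we expect.
const cases = [
  { label: 'empty name', user: createTestUser({ name: '' }), valid: false },
  { label: 'CJK name', user: createTestUser({ name: '日本太郎' }), valid: true },
  { label: '500-char name', user: createTestUser({ name: 'A'.repeat(500) }), valid: false },
  { label: 'age zero', user: createTestUser({ age: 0 }), valid: true },
  { label: 'negative age', user: createTestUser({ age: -1 }), valid: false },
  { label: 'null bio', user: createTestUser({ bio: null }), valid: true },
];

const results = cases.map(({ label, user, valid }) => ({
  label,
  passed: (validateUser(user).length === 0) === valid,
}));
console.log(results.every((r) => r.passed)); // true
```

Keeping the label next to each case means a failing run can report which edge case broke instead of only an index into the array.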

Quick reference

| Approach | Best for | Trade-off |
| --- | --- | --- |
| Hard-coded objects | One-off, very specific tests | Doesn't scale, high duplication |
| Factory functions | Most test suites | Requires a small upfront investment |
| Factory + Faker | Realistic data at scale | Extra dependency, random by default |
| Seeded Faker | Reproducible randomized data | Must manage seed values |
| Bulk generators | Pagination, performance, search | Can slow down test setup |

| Factory pattern | Example |
| --- | --- |
| Default object | createTestUser() |
| Override one field | createTestUser({ role: 'admin' }) |
| Related entities | createTestPost({ authorId: user.id }) |
| Bulk generation | createTestUsers(100) |
| Edge case set | Array of factory calls with extreme values |