faker-data

Authors test-data factories using Faker: the Python `faker` library, the `@faker-js/faker` JS port, and the Ruby `faker` gem (faker-ruby/faker), generating names, emails, addresses, phone numbers, dates, and locale-aware variants. Configures seed-based determinism for reproducible runs and selects providers (person / internet / location / date / finance / lorem) per language. Prefer this skill when the codebase already uses the Faker family or when cross-language consistency across Python, JS, and Ruby matters; use mimesis-data only when deeper Python locale coverage is the primary requirement. Use when authoring fixtures or factories that need realistic-looking field values.

Overview

Faker is a family of libraries (Python / JS / Ruby / Java / .NET / PHP) that generate realistic synthetic field values - names, emails, addresses, dates, etc. - for test fixtures. The three most common ports in this skill's scope:

| Language | Library | Reference |
| --- | --- | --- |
| Python | faker | faker-py |
| JS / TS | @faker-js/faker | faker-js |
| Ruby | faker-ruby/faker | faker-rb |

For .NET, see bogus-data. For Python projects that need stronger locale coverage, also see mimesis-data.

When to use

  • A test fixture needs realistic but synthetic field values (the default 'foo' / 'bar' pattern produces tests that miss real bugs around long names, Unicode, edge-case formats).
  • The team wants reproducible randomness - same seed produces same data, useful for regression repro.
  • Locale coverage matters (i18n testing across de_DE, ja_JP, ar_SA, etc.).
  • A factory library (FactoryBot, factory_boy, Bogus) needs the underlying generator - Faker is typically the random-data engine plugged into those.

Install

Python

pip install Faker

(Per faker-py.)

JavaScript / TypeScript

npm install --save-dev @faker-js/faker

(Per faker-js.)

Ruby

# Gemfile
gem 'faker', group: :test

(Per faker-rb.)

Authoring

Python

from faker import Faker

fake = Faker()
fake.name()              # 'Margaret Boehm'
fake.email()             # 'walker.travis@example.com'
fake.address()           # '123 Main St, Apt 4B\nSpringfield, IL 62701'
fake.phone_number()      # '+1-555-867-5309'
fake.date_of_birth(minimum_age=18, maximum_age=65)
fake.text(max_nb_chars=200)

(Per faker-py.)

Common provider modules: person (name, prefix), address, internet (email, url, ipv4), phone_number, date_time, lorem (paragraphs, sentences, words), company, credit_card, job (faker-py).

JavaScript

import { faker } from '@faker-js/faker';

faker.person.fullName();           // 'Margaret Boehm'
faker.internet.email();             // 'walker.travis@example.com'
faker.location.streetAddress();     // '123 Main St'
faker.phone.number();               // '+1-555-867-5309'
faker.date.past({ years: 30 });
faker.lorem.paragraphs(2);

(Per faker-js.)

Module organization mirrors the Python port but uses a module-namespace form: faker.person.*, faker.internet.*, faker.location.*, faker.date.*, faker.finance.*, faker.commerce.* (faker-js).

Ruby

require 'faker'

Faker::Name.name           # 'Margaret Boehm'
Faker::Internet.email      # 'walker.travis@example.com'
Faker::Address.full_address
Faker::PhoneNumber.cell_phone
Faker::Date.birthday(min_age: 18, max_age: 65)
Faker::Lorem.paragraphs(number: 2)

(Per faker-rb.)

Seeding for deterministic output

The most common test-stability mistake is letting Faker generate non-deterministic values across runs. Always seed in tests so a failure can be reproduced.

Python

from faker import Faker

# Class-level: seeds the shared RNG used by every Faker instance
Faker.seed(4321)
fake = Faker()

# Instance-level: gives this instance its own seeded RNG,
# useful when multiple Faker instances need different seeds
fake.seed_instance(4321)

(Per faker-py.)

JavaScript

import { faker } from '@faker-js/faker';

faker.seed(123);
// All faker.* calls until the next seed() are deterministic.

(Per faker-js.)

Ruby

require 'faker'

Faker::Config.random = Random.new(42)

For test frameworks: place the seed in beforeEach / setup so each test starts with the same baseline; for paired runs, persist the seed used per failing test (similar to the flake-pattern-reference Pattern 8 randomness guidance).

Locale support

Python

fake = Faker('it_IT')                # Italian
fake = Faker(['en_US', 'fr_FR', 'ja_JP'])   # Multi-locale (random per call)
fake.name()                          # generates per the configured locale(s)

(Per faker-py.)

JavaScript

import { fakerDE, fakerJA } from '@faker-js/faker';

fakerDE.person.fullName();   // German name
fakerJA.location.city();     // Japanese city

(Per faker-js; 70+ locales available.)

Ruby

Faker::Config.locale = :ja
Faker::Name.name   # Japanese name

(Per faker-rb.)

Composing factories with referential integrity

Faker generates field values; for referential integrity (a factory that creates a User with a related Order), use a factory library that wraps Faker:

| Language | Factory library | Skill |
| --- | --- | --- |
| Python | factory_boy | (consider mimesis-data for locale-rich pure mimesis pattern) |
| JS / TS | fishery / factory.ts | hand-rolled with Faker as engine |
| Ruby | FactoryBot | factory-bot-data |
| .NET | Bogus | bogus-data |

Faker alone won't enforce that order.user_id == user.id; the factory library handles that.

Anti-patterns

| Anti-pattern | Why it fails | Fix |
| --- | --- | --- |
| Calling Faker without a seed in tests | A failure on CI doesn't reproduce locally; flake-investigation guesswork. | Seed once per test or per suite (Faker.seed(...)). |
| Using fake.email() with a real domain (example.com is shared) | Spam concerns; some validators reject example.com. | Faker's defaults use safe RFC 2606 domains; never override to a real domain in tests. |
| Hardcoding generated values into snapshots | Snapshot bound to a Faker version's PRNG sequence; library bump breaks the snapshot. | Snapshot the shape of the data; assert types and patterns rather than literal values. |
| Generating names with the wrong locale | A test asserting "name has at least one space" fails on :ja (Japanese), where names are formatted differently. | Match the locale to the assertion; or relax the assertion to be locale-aware. |
| Using Faker for security testing payloads | Faker generates "realistic" data, not malicious. SQL injection / XSS won't happen by chance. | Use malicious-payload-bank for adversarial input. |

Limitations

  • PRNG sequence varies across major versions. The same seed can produce different values in Faker v18 vs v19. Pin the version in CI for deterministic tests.
  • Locale coverage is uneven. en_US is the most complete; less-common locales fall back to defaults silently. Test the locales you care about; don't assume completeness.
  • Realistic ≠ valid. Faker may generate an email with a technically-valid but unusual format (e.g. +-tagged); your validation may reject it. Match Faker's domain provider to your validator's regex.

References