Pydantic Faker: How I Supercharged Test Data Generation and Mock API Creation with Python

From Idea to Open Source: The Journey of Pydantic Faker

Hello everyone! Today, I want to share the story behind my open-source project: pydantic-faker. Like many Python developers, I frequently use Pydantic for defining data models. It’s an incredibly convenient tool that brings type safety and validation to our projects. However, when it comes to testing, writing documentation, or prototyping a frontend, a common challenge arises: where do we get realistic data that conforms to these very Pydantic schemas?

Repeatedly writing scripts to generate JSON blobs or populate databases with fake data is a routine chore that consumes time and energy. And what if you need to quickly spin up a mock endpoint that serves this data? That means even more code. These recurring tasks were the primary inspiration for pydantic-faker – a CLI utility designed to simplify these processes as much as possible.

So, what is pydantic-faker?

It’s a command-line tool that empowers you to:

  1. Generate structured fake data based on your Pydantic models.
  2. Serve this generated data via a local mock API server, complete with auto-generated OpenAPI (Swagger) documentation.

Key Problems It Solves:

  • Accelerate Development & Testing: No more wasting time manually creating test data.
  • Data Consistency: Generated data always conforms to your Pydantic schemas.
  • Realistic Data: Thanks to integration with the Faker library, the data (names, emails, addresses, text, etc.) looks and feels real.
  • Reproducibility: The --seed option gives you identical datasets on every run, which is crucial for testing.
  • Constraint Adherence: pydantic-faker understands and applies standard constraints defined via pydantic.Field (e.g., min_length, max_length for strings; gt, le for numbers; min_items, max_items for lists).
  • Type Flexibility: Supports not only basic types but also Optional, nested models, lists, dictionaries, and advanced Python types like Union, Literal, and enum.Enum.
  • Rapid API Mocking: The serve command instantly spins up a FastAPI server with CRUD operations (for in-memory data) and filtering capabilities.
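To make the constraint-adherence idea concrete, here is a minimal sketch using only the standard library. It is not pydantic-faker's actual implementation: the helpers fake_string and fake_int are hypothetical names invented for this illustration, and they only mimic what min_length, max_length, gt, and le mean.

```python
import random
import string


def fake_string(rng: random.Random, min_length: int = 1, max_length: int = 12) -> str:
    """Return random lowercase letters with a length in [min_length, max_length]."""
    length = rng.randint(min_length, max_length)
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))


def fake_int(rng: random.Random, gt: int = 0, le: int = 100) -> int:
    """Return a random integer n such that gt < n <= le."""
    # randint is inclusive on both ends, so shift the lower bound by one
    # to honour the strict "greater than" constraint.
    return rng.randint(gt + 1, le)


# Hypothetical usage: a username of 3 to 8 characters and an age with 17 < age <= 65.
rng = random.Random(42)
name = fake_string(rng, min_length=3, max_length=8)
age = fake_int(rng, gt=17, le=65)
print(name, age)
```

The point of the sketch is that each constraint narrows the range a generator may draw from, so every produced value passes validation by construction.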

1. Generating Data with generate

The core command, pydantic-faker generate my_module:MyModel, takes a reference to your Pydantic model and outputs fake data as JSON.

  • --count N: Generate N instances.
  • --output-file path.json: Save to a file.
  • --faker-locale ru_RU: Use Russian (or any other Faker-supported locale) for names, addresses, etc.
  • --seed 123: Ensure reproducibility.
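Why does a seed give reproducible output? The sketch below shows the general principle with the standard library's random module. It is an illustration under my own assumptions, not the tool's internals: generate_users and its fields are hypothetical, and the count and seed parameters simply mirror the CLI options listed above.

```python
import json
import random
from typing import Optional


def generate_users(count: int, seed: Optional[int] = None) -> list:
    """Build `count` fake records; the same seed always yields the same records."""
    # A dedicated Random instance keeps the sequence isolated from other code
    # that might touch the module-level random state.
    rng = random.Random(seed)
    roles = ["administrator", "editor", "viewer"]
    return [
        {"id": i, "role": rng.choice(roles), "age": rng.randint(18, 65)}
        for i in range(count)
    ]


# Two runs with the same seed serialize to byte-identical JSON.
first = json.dumps(generate_users(3, seed=123), indent=2)
second = json.dumps(generate_users(3, seed=123), indent=2)
print(first == second)
```

Omitting the seed (or passing None) lets the generator pick fresh values on each run, which is usually what you want outside of tests.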

If you define a model like this:

from pydantic import BaseModel, Field
import enum
from typing import Literal, Union # Or use | for Python 3.10+

class UserRole(enum.Enum):
    ADMIN = "administrator"
    EDITOR = "editor"
    VIEWER = "viewer"

class User(BaseModel):
    username: str = Field(examples=["testuser", "admin_user"])
    role: UserRole = Field(examples=[UserRole.ADMIN, UserRole.EDITOR])
    status: Literal["active", "pending"] = "active"
    age_group: Union[Literal["child", "teen"], Literal["adult"]] # Or Literal["child", "teen"] | Literal["adult"]

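pydantic-faker has a chance to pick values directly from your examples list for the username and role fields. The status and age_group fields declare no examples, so their values are constrained by their Literal options instead.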
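One way to picture this selection logic is the sketch below. It uses only the standard library, and everything in it is an assumption made for illustration: candidate_values, pick_value, and the example_probability parameter are hypothetical names, and the real tool may weigh examples differently.

```python
import enum
import random
from typing import Any, Literal, Union, get_args, get_origin


class UserRole(enum.Enum):
    ADMIN = "administrator"
    EDITOR = "editor"
    VIEWER = "viewer"


def candidate_values(annotation: Any) -> list:
    """Collect the finite set of values allowed by a Literal, an Enum, or a Union of those."""
    origin = get_origin(annotation)
    if origin is Literal:
        return list(get_args(annotation))
    if origin is Union:
        values = []
        for member in get_args(annotation):
            values.extend(candidate_values(member))
        return values
    if isinstance(annotation, type) and issubclass(annotation, enum.Enum):
        return list(annotation)
    return []  # open-ended types such as str have no finite candidate list


def pick_value(rng: random.Random, annotation: Any, examples=None, example_probability: float = 0.5):
    """Sometimes return one of the declared examples, otherwise fall back to the type's options."""
    if examples and rng.random() < example_probability:
        return rng.choice(examples)
    choices = candidate_values(annotation)
    return rng.choice(choices) if choices else None


rng = random.Random(7)
age_group = pick_value(rng, Union[Literal["child", "teen"], Literal["adult"]])
role = pick_value(rng, UserRole, examples=[UserRole.ADMIN, UserRole.EDITOR])
print(age_group, role)
```

Whichever branch is taken, the result is a valid value for the field, because the examples are expected to satisfy the annotation too.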

2. Mock API Server with serve

The pydantic-faker serve my_module:MyModel command launches a Uvicorn server with a FastAPI application.

  • --port 8001, --host 0.0.0.0: Configure the server.
  • CRUD Operations: The server provides GET (all & by ID/index), POST (create), PUT (update), DELETE endpoints for in-memory data.
  • Filtering: For GET requests on collections, you can use query parameters for basic filtering (e.g., GET /users?is_active=true).
  • OpenAPI Documentation: Automatically available at /docs and /redoc.
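The filtering behavior can be approximated with a few lines of plain Python. This is a sketch under stated assumptions rather than the server's real code: matches and filter_items are hypothetical helpers, I assume exact-match semantics, and I assume that a query key missing from a record excludes that record.

```python
def matches(item: dict, query: dict) -> bool:
    """Return True when every query parameter equals the corresponding field of `item`."""
    for key, raw in query.items():
        if key not in item:
            return False
        value = item[key]
        if isinstance(value, bool):
            # Query strings carry text, so "true"/"false" must be mapped to booleans.
            if raw.lower() not in ("true", "false"):
                return False
            if value != (raw.lower() == "true"):
                return False
        elif str(value) != raw:
            return False
    return True


def filter_items(items: list, query: dict) -> list:
    """Keep only the in-memory records that satisfy all query parameters."""
    return [item for item in items if matches(item, query)]


users = [
    {"id": 0, "username": "alice", "is_active": True},
    {"id": 1, "username": "bob", "is_active": False},
    {"id": 2, "username": "carol", "is_active": True},
]

# Roughly what a request like GET /users?is_active=true would select.
active = filter_items(users, {"is_active": "true"})
print([user["username"] for user in active])
```

Booleans are checked before the generic string comparison because str(True) is "True", which would never equal the lowercase "true" that typically appears in a URL.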

In developing pydantic-faker, I leaned heavily into the ecosystem created by Sebastián Ramírez. For the CLI, I chose Typer. It was a revelation! Instead of wrestling with the more verbose APIs of argparse or Click (which Typer itself builds on), I can define commands, arguments, and options using standard Python type hints. This not only simplifies development but also provides excellent IDE support, autocompletion, and automatic help generation. If FastAPI is the standard for building web APIs in Python, Typer is its perfect counterpart for CLIs.

Under the hood, the serve command uses FastAPI, ensuring high performance and a familiar environment for many developers.

I have many ideas for pydantic-faker’s evolution:

  • Deeper integration with Pydantic Field (e.g., full support for pattern for strings).
  • Ability to define custom generators for specific types or fields via a configuration file.
  • (Experimental) LLM integration for even more creative text data generation.
  • Enhanced filtering capabilities for the serve command (ranges, partial matches, etc.).

pydantic-faker is still a young project, but I hope it can already be a valuable asset for Python developers who value their time and strive for efficiency. If you frequently work with Pydantic and need test data or quick mock services, please give it a try!

I would be thrilled to hear your feedback, suggestions for improvement, and, of course, contributions on GitHub.

Thanks for reading! I hope pydantic-faker makes your development workflow a little easier and more enjoyable.
