Tutorial

How to Generate Fake Relational Tables: A Step-by-Step Tutorial

Bill Crawford — Step-by-Step Tutorial — February 2026 — 18 min read · Last updated October 29, 2025

This tutorial walks through building three related fake database tables from scratch using the Relational Data Generator — with screenshots of every single step. By the end you'll have a complete, importable dataset with a customers table, an address_book table, and an invoices table, all wired together with real foreign keys.

Connect on LinkedIn →

We'll also cover the theory behind relational tables so you understand why the setup works the way it does — not just how to click through it.

What's in this tutorial

What are relational tables and why do they matter?
The schema we're building: customers, address_book, invoices
Step 1 — Open the tool and choose Start from Scratch
Step 2 — Build the customers table
Step 3 — Build the address_book table with a Foreign Key
Step 4 — Build the invoices table with a Foreign Key
Step 5 — Set record counts
Step 6 — Choose SQL format and dialect
Step 7 — Generate and review the output
Understanding the generated SQL
Exporting as CSV and JSON instead

What Are Relational Tables and Why Do They Matter?

A relational database stores data across multiple tables, and those tables are connected to each other. Rather than cramming every piece of information about a customer into one giant row, a relational database splits the data into logical groups and links them together.

The connection mechanism is the foreign key. A foreign key is a column in one table that stores the ID value of a row in another table. It creates a reference — a pointer — that says "this row belongs to that row over there."

Here's a simple mental model. Imagine three filing cabinets:

The Customers cabinet has one card per customer, each with a unique ID number
The Address Book cabinet has address cards — but instead of writing the customer's name on each card, you write their ID number from the Customers cabinet
The Invoices cabinet has invoice slips — and again, each slip has the customer's ID number written on it, not their full details

When you want to know everything about customer #4821, you look up ID 4821 in the Customers cabinet, then search the other cabinets for anything with that same ID. Every cabinet stays lean — no duplicated names, addresses repeated across dozens of invoices — and the whole system stays consistent by design.

What is referential integrity?

Referential integrity is the rule that says: every foreign key value must exist in the table it references. You cannot have an invoice for customer #9999 if there is no row with ID 9999 in the customers table. This sounds obvious but it's the most common thing that breaks when you generate fake data naively — you end up with invoices referencing customer IDs that don't exist anywhere.

The Relational Data Generator handles this automatically. It generates the customers table first, captures all the real customer IDs it created, and then uses only those real IDs when building the address_book and invoices tables. The result is a dataset that will import cleanly into any database with FK constraints enabled.

The three-table schema we're building. Both address_book and invoices reference customers via a foreign key on customer_id.

The Schema We're Building

Our three tables and what each one stores:

Table	What it stores	Primary Key	Foreign Key
customers	One row per customer: name, email, phone, join date	customer_id (UUID)	None — this is the root table
address_book	One or more addresses per customer	address_id (UUID)	customer_id → customers.customer_id
invoices	One or more invoices per customer	invoice_id (UUID)	customer_id → customers.customer_id

Notice that both address_book and invoices reference customers — customers is the parent table and the other two are child tables. A customer can have multiple addresses and multiple invoices, but every address and every invoice must belong to a real customer.

The record ratio we'll use: 100 customers → 200 addresses → 400 invoices. This means roughly 2 addresses per customer and 4 invoices per customer on average — a realistic ratio for a small business billing system.

Follow along as you read. Open the tool in a new tab and build as you go.

Open Relational Data Generator ↗

Open the Tool and Click "Start from Scratch"

Navigate to the Relational Data Generator. At the top of the page you'll see the quick-start template buttons. Since we're building a custom schema that doesn't match any of the pre-built templates, click ✦ Scratch (the rightmost button).

dataconversioncenter.com/generators/relational-data-generator/

The template row at the top of the page. Click "✦ Scratch" to start with a blank canvas.

After clicking Scratch, the right panel will show an empty first table called "table_1" with a prompt to add columns from the field picker on the left. The tool is now ready for us to start building.

Build the customers Table

The customers table is the root of our schema — it has no foreign keys, only its own primary key. Everything else will reference it. Let's set it up.

2a — Name the table

The active table panel shows a text input with the default name "table_1". Click it and type customers.

Active table panel

Rename the default table name to "customers" by clicking the name field and typing.

2b — Add the UUID primary key

In the field picker panel on the left, scroll to the Identifiers section and click UUID / GUID. A column appears in the table panel. Click the column name to rename it from "UUID / GUID" to customer_id.

💡 Why UUID instead of a number? UUID primary keys never conflict if you run the seed script multiple times or merge datasets from different runs. Auto-increment integers would collide. For test data, UUID is almost always the better choice.

Field picker (left panel) — Identifiers section

Click "UUID / GUID" in the Identifiers section. It adds a UUID column to the active table.

2c — Add the remaining customer fields

Now add these fields from the field picker. For each one, click the column name after it appears and rename it:

Click in field picker	Rename to	Why
UUID / GUID	customer_id	Primary key — every customer gets a unique ID
First Name	first_name	Customer's first name
Last Name	last_name	Customer's last name
Email	email	Contact email
Phone	phone	Contact phone number
Timestamp	created_at	When the customer record was created

customers table — after adding all columns

The completed customers table: 6 columns, all renamed to match the schema. The customer_id column shows a PK badge.

⚙ Column renaming tip: Click directly on the column name text (the white text in the row) to edit it inline. Press Enter or click away to confirm. Rename it to exactly what your database schema expects — customer_id not "UUID / GUID".

Build the address_book Table with a Foreign Key

Now we add the second table. This is where the relational magic happens — we'll add a foreign key that ties every address to a real customer.

3a — Add the second table

Click the + Add Table button at the top right of the builder panel. A new tab appears alongside the customers tab. Click the new table's name field and rename it address_book.

Table tabs after adding the second table

Two table tabs are now visible. The active tab (address_book, highlighted in blue) is the one currently being edited.

3b — Add the UUID primary key

Just like before, click UUID / GUID in the Identifiers section of the field picker. Rename the column to address_id.

3c — Add the Foreign Key column

This is the most important step. At the very top of the field picker panel, above all the categories, you'll see a green chip labelled 🔗 Foreign Key. Click it.

Field picker — Foreign Key chip at the top

The green "Foreign Key" chip sits above all other field categories. Click it to open the FK configuration modal.

The Foreign Key modal opens. Fill it in as follows:

Foreign Key modal — configuring address_book.customer_id

The FK modal configured for address_book. Column name: customer_id. References: customers → customer_id. Distribution: Random.

Column name in this table: Type customer_id
References table: Select customers from the dropdown
References column: Select customer_id from the dropdown
Distribution: Leave as Random — this means each address is randomly assigned to any customer, creating realistic variation where some customers have multiple addresses and some have one

Click Add Column. The foreign key column appears in the table, highlighted in green to distinguish it from regular columns.

3d — Add the address fields

With the FK column in place, add the actual address data columns. Click each one in the field picker and rename:

Click in field picker	Rename to
UUID / GUID	address_id
Foreign Key (configured above)	customer_id
Address Line 1	address_line_1
Address Line 2	address_line_2
City	city
State	state
Zip Code	zip_code
Boolean	is_primary

address_book table — completed with FK column highlighted in green

The address_book table with 8 columns. The customer_id row is highlighted green, indicating it's a FK column linked to the customers table.

Build the invoices Table with a Foreign Key

The invoices table follows the same pattern as address_book. It references customers — so every invoice is tied to a real customer — but it stores invoice-specific data: invoice number, date, amount, and status.

4a — Add the third table

Click + Add Table again. Rename the new table to invoices.

4b — Add the UUID primary key and Foreign Key

Add UUID / GUID and rename it invoice_id. Then click the 🔗 Foreign Key chip again and configure it:

Column name in this table: customer_id
References table: customers
References column: customer_id
Distribution: Random — some customers will have more invoices than others, which reflects real billing behaviour

Three table tabs — all three tables are now defined

All three tables are now defined: customers, address_book, and invoices. The invoices tab is currently active.

4c — Add the invoice data fields

Click in field picker	Rename to	What it stores
UUID / GUID	invoice_id	Unique identifier for each invoice
Foreign Key	customer_id	Which customer this invoice belongs to
Invoice Number	invoice_number	Human-readable invoice reference
Date	invoice_date	Date the invoice was issued
Date	due_date	Payment due date (add Date a second time)
Dollar Amount	amount_due	Total amount owed
Order Status	status	Invoice status: paid, pending, overdue, cancelled
Lorem (sentence)	notes	Optional notes or description

💡 Adding the same field type twice: The Date field is used for both invoice_date and due_date. Just click Date in the field picker twice — two independent Date columns are added, each generating different random dates. Rename each one separately.

invoices table — completed

The completed invoices table with 8 columns including the green FK column linking to customers.

Set Record Counts

Click each table tab in turn and set the record count in the header of each table panel:

Table	Record count	Ratio
customers	100	Root — 100 unique customers
address_book	200	~2 addresses per customer on average
invoices	400	~4 invoices per customer on average

These ratios make sense for a small business billing system. You can adjust them freely — set customers to 500 and invoices to 5,000 for a larger dataset, or reduce to 20/50/100 for lightweight test fixtures.

⚠️ Always make child tables larger than parent tables when using Random FK distribution. If you have 200 parent records and only 50 child records, many parents will have zero children — which may not reflect your application's assumptions.

Choose SQL Format and Dialect

Scroll down to the Settings panel below the table builder. This is where you choose your output format and SQL dialect.

Settings panel — output format and SQL options

Settings panel configured: SQL format selected (blue), PostgreSQL dialect, 500 rows per batch, CREATE TABLE and transactions enabled.

For this tutorial, use these settings:

Output format: SQL
SQL dialect: PostgreSQL (or choose your database)
Rows per INSERT batch: 500
CREATE TABLE: checked ✓
DROP TABLE IF EXISTS: unchecked (leave unchecked unless you want to wipe existing data)
FOREIGN KEY constraints: optionally check this to add REFERENCES clauses to the DDL
Wrap in transactions: checked ✓

Generate the Dataset and Review the Output

Click the ⚡ Generate Dataset button. A progress bar fills briefly as the tool generates all three tables in dependency order — customers first, then address_book and invoices (using the real customer IDs just generated).

Output panel — after generation completes

The output panel after generation. Three table tabs show record counts (100/200/400). The SQL output and a preview of the first rows are shown. Click each tab to preview that table's data.

Switch between the table tabs in the output panel to preview each table. Notice that when you click the address_book tab, the customer_id column shows UUID values — and every one of those values exists in the customers table. That's referential integrity working as designed.

Click ⬇ Download to save the complete SQL script.

Understanding the Generated SQL

Let's look at what the downloaded script actually contains and why each part is structured the way it is.

The file header

The script opens with a comment block summarising everything:

-- Script generated by DataConversionCenter.com
-- Dialect: PostgreSQL  |  Tables: 3  |  Total records: 700
-- Generated: 2026-02-24
-- Insert order respects foreign key dependencies
-- ── Table order: customers → address_book → invoices

The "insert order" note is important. The script always inserts parent tables before child tables. If you try to run the address_book inserts before the customers inserts, the FK constraint check fails because the customer IDs don't exist yet. The generator handles this automatically.

The CREATE TABLE block for customers

-- ── 1 of 3: customers (100 records) ──────────────────────
CREATE TABLE IF NOT EXISTS "customers" (
  "customer_id"  UUID          NOT NULL,
  "first_name"   VARCHAR(100)  NOT NULL,
  "last_name"    VARCHAR(100)  NOT NULL,
  "email"        VARCHAR(255)  NOT NULL,
  "phone"        VARCHAR(25)   NOT NULL,
  "created_at"   TIMESTAMPTZ   NOT NULL,
  PRIMARY KEY ("customer_id")
);

Notice the PostgreSQL-specific types: UUID (a native type in PostgreSQL), TIMESTAMPTZ (timestamp with timezone). If you'd chosen MySQL, these would be VARCHAR(36) and DATETIME respectively. The generator maps each field type to the correct SQL type for your chosen dialect.

The CREATE TABLE block for address_book — with the FK constraint

-- ── 2 of 3: address_book (200 records) ───────────────────
CREATE TABLE IF NOT EXISTS "address_book" (
  "address_id"    UUID          NOT NULL,
  "customer_id"   UUID          NOT NULL,
  "address_line_1" VARCHAR(200) NOT NULL,
  "address_line_2" VARCHAR(100),
  "city"          VARCHAR(100)  NOT NULL,
  "state"         CHAR(2)       NOT NULL,
  "zip_code"      CHAR(5)       NOT NULL,
  "is_primary"    BOOLEAN       NOT NULL,
  PRIMARY KEY ("address_id"),
  FOREIGN KEY ("customer_id") REFERENCES "customers" ("customer_id")
);

The final line is the FK constraint. It tells PostgreSQL: the customer_id column in this table must always reference a value that exists in customers.customer_id. Any attempt to insert an address with a customer_id that doesn't exist in the customers table will be rejected by the database engine itself.

The bulk INSERT blocks — transaction-wrapped

BEGIN;

INSERT INTO "customers" ("customer_id", "first_name", "last_name", "email", "phone", "created_at") VALUES
  ('f47ac10b-58cc-4372-a567-0e02b2c3d479', 'Sarah', 'Mendoza', '[email protected]', '(847) 293-5571', '2023-03-14T09:42:17Z'),
  ('b3c4d5e6-89ab-cdef-0123-456789abcdef', 'James', 'Wilson', '[email protected]', '(312) 847-2910', '2022-11-07T14:22:44Z'),
  -- ... 98 more rows
;

COMMIT;

All 100 customers are inserted in a single batch, wrapped in BEGIN / COMMIT. If any row fails, the entire batch rolls back — you won't end up with a partially-inserted customers table.

The address_book inserts — with real customer IDs

BEGIN;

INSERT INTO "address_book" ("address_id", "customer_id", "address_line_1", ...) VALUES
  ('c8d9e0f1-...', 'f47ac10b-58cc-4372-a567-0e02b2c3d479', '4821 Ridgewood Ave', ...),
  ('d1e2f3a4-...', 'b3c4d5e6-89ab-cdef-0123-456789abcdef', '192 Maple St',       ...),
  -- ...
;

COMMIT;

Look at the customer_id values in the address_book insert: f47ac10b-58cc-4372-a567-0e02b2c3d479. That's the exact same UUID that was used as the primary key for Sarah Mendoza in the customers insert above. The generator captured all 100 customer UUIDs during generation and used them when building the address_book rows. That's referential integrity guaranteed.

The output data

Here's what the actual generated rows look like across all three tables. Notice that the customer_id values in address_book and invoices always match real IDs from customers:

Table 1customers — sample rows

customer_id	first_name	last_name	email	phone	created_at
f47ac10b-58cc-...	Sarah	Mendoza	[email protected]	(847) 293-5571	2023-03-14T09:42Z
b3c4d5e6-89ab-...	James	Wilson	[email protected]	(312) 847-2910	2022-11-07T14:22Z
a1b2c3d4-efab-...	Maria	Thompson	[email protected]	(773) 510-4829	2024-01-19T08:33Z

Table 2address_book — sample rows (customer_id values match real customers above)

address_id	customer_id	address_line_1	city	state	zip_code	is_primary
c8d9e0f1-...	f47ac10b-58cc-...	4821 Ridgewood Ave	Austin	TX	78741	true
d1e2f3a4-...	f47ac10b-58cc-...	113 Oak Blvd Apt 4B	Austin	TX	78702	false
e5f6a7b8-...	b3c4d5e6-89ab-...	192 Maple St	Chicago	IL	60614	true

Table 3invoices — sample rows (customer_id values match real customers above)

invoice_id	customer_id	invoice_number	invoice_date	amount_due	status
11a2b3c4-...	f47ac10b-58cc-...	INV-2024-0892	2024-07-15	4291.47	paid
22b3c4d5-...	a1b2c3d4-efab-...	INV-2024-1047	2024-09-03	892.10	pending
33c4d5e6-...	f47ac10b-58cc-...	INV-2025-0012	2025-01-22	341.00	overdue

Sarah Mendoza (f47ac10b-...) appears in both the address_book (she has two addresses) and in invoices (she has two invoices). Maria Thompson (a1b2c3d4-...) also appears in invoices. Every value in the customer_id column of both child tables is a real UUID from the customers table — that's what makes this dataset importable.

Exporting as CSV and JSON Instead

SQL isn't always the right format. If you need to import the data using a GUI tool, test it in a JavaScript application, or load it into a document database, switch the output format before clicking Generate.

CSV format

Select CSV in the format row. Each table downloads as a separate file: customers.csv, address_book.csv, invoices.csv. The FK relationships are preserved by value — the customer_id values in address_book.csv match real IDs in customers.csv.

CSV is the right choice for importing with a GUI tool like TablePlus, DBeaver, or pgAdmin's import wizard, or for loading data into Google Sheets or Excel for review.

JSON nested format

Select JSON (nested) to get a single file where addresses and invoices are embedded inside each customer object:

[
  {
    "customer_id": "f47ac10b-58cc-4372-a567-0e02b2c3d479",
    "first_name": "Sarah",
    "last_name": "Mendoza",
    "email": "[email protected]",
    "addresses": [
      { "address_id": "c8d9e0f1-...", "address_line_1": "4821 Ridgewood Ave", "city": "Austin", ... },
      { "address_id": "d1e2f3a4-...", "address_line_1": "113 Oak Blvd Apt 4B", ... }
    ],
    "invoices": [
      { "invoice_id": "11a2b3c4-...", "invoice_number": "INV-2024-0892", "amount_due": "4291.47", ... },
      { "invoice_id": "33c4d5e6-...", "invoice_number": "INV-2025-0012", "amount_due": "341.00", ... }
    ]
  }
]

This nested format is ideal for MongoDB or Firestore where documents can contain nested arrays, for GraphQL mock servers, or for React and Vue application development where components receive nested customer data from an API.

Try it yourself. The complete three-table schema from this tutorial takes about 5 minutes to build from scratch.

Open Relational Data Generator →

Database Normalization: Why Relational Tables Are Structured This Way

Relational tables are structured around normalization — a set of rules that eliminate data redundancy and ensure data integrity. Understanding normalization helps you design better schemas and understand why the generated tables are split the way they are.

First Normal Form (1NF)

Each column contains a single atomic value — no comma-separated lists, no arrays, no repeating groups. A customer's multiple phone numbers go in a separate phone_numbers table, not as phone1, phone2 columns.

Second Normal Form (2NF)

Every non-key column depends on the entire primary key. This matters for composite keys: in an order_items table keyed on (order_id, product_id), the product name belongs in the products table — not in order_items — because it only depends on product_id, not the full composite key.

Third Normal Form (3NF)

No non-key column depends on another non-key column (no transitive dependencies). A customers table should not store both zip_code and city — the city is determined by the zip code, creating a transitive dependency. City/state belong in a zip_codes reference table.

Why This Matters for Test Data Generation

The Relational Data Generator creates properly normalised schemas — customers, orders, and order items as separate linked tables. This mirrors real production database design and means the generated SQL and data is directly usable for realistic testing, not just toy examples.

Related Tools & Articles

ToolRelational Data Generator ToolCustom Fake Data Generator ToolFake Address Generator HubAll Data Generators GuideRelational Data Generator: Full Guide GuideCustom Fake Data Generator Guide ToolJSON Formatter GuideJSON vs XML vs CSV

How to Generate Fake Relational Tables: A Step-by-Step Tutorial

What Are Relational Tables and Why Do They Matter?

What is referential integrity?

The Schema We're Building

2a — Name the table

2b — Add the UUID primary key

2c — Add the remaining customer fields

3a — Add the second table

3b — Add the UUID primary key

3c — Add the Foreign Key column

3d — Add the address fields

4a — Add the third table

4b — Add the UUID primary key and Foreign Key

4c — Add the invoice data fields

Understanding the Generated SQL

The file header

The CREATE TABLE block for customers

The CREATE TABLE block for address_book — with the FK constraint

The bulk INSERT blocks — transaction-wrapped

The address_book inserts — with real customer IDs

The output data

Exporting as CSV and JSON Instead

CSV format

JSON nested format

Database Normalization: Why Relational Tables Are Structured This Way

First Normal Form (1NF)

Second Normal Form (2NF)

Third Normal Form (3NF)

Why This Matters for Test Data Generation

Related Tools & Articles

Related Articles