Skip to content
← Blog
🔒 All processing in your browser 🚫 No uploads stored ✓ No login required
Tutorial

How to Generate Fake Relational Tables: A Step-by-Step Tutorial

Bill Crawford — Step-by-Step Tutorial — February 2026 — 18 min read  ·  Last updated October 29, 2025

This tutorial walks through building three related fake database tables from scratch using the Relational Data Generator — with screenshots of every single step. By the end you'll have a complete, importable dataset with a customers table, an address_book table, and an invoices table, all wired together with real foreign keys.

Connect on LinkedIn →

We'll also cover the theory behind relational tables so you understand why the setup works the way it does — not just how to click through it.

What's in this tutorial
  1. What are relational tables and why do they matter?
  2. The schema we're building: customers, address_book, invoices
  3. Step 1 — Open the tool and choose Start from Scratch
  4. Step 2 — Build the customers table
  5. Step 3 — Build the address_book table with a Foreign Key
  6. Step 4 — Build the invoices table with a Foreign Key
  7. Step 5 — Set record counts
  8. Step 6 — Choose SQL format and dialect
  9. Step 7 — Generate and review the output
  10. Understanding the generated SQL
  11. Exporting as CSV and JSON instead

What Are Relational Tables and Why Do They Matter?

A relational database stores data across multiple tables, and those tables are connected to each other. Rather than cramming every piece of information about a customer into one giant row, a relational database splits the data into logical groups and links them together.

The connection mechanism is the foreign key. A foreign key is a column in one table that stores the ID value of a row in another table. It creates a reference — a pointer — that says "this row belongs to that row over there."

Here's a simple mental model. Imagine three filing cabinets:

When you want to know everything about customer #4821, you look up ID 4821 in the Customers cabinet, then search the other cabinets for anything with that same ID. Every cabinet stays lean — no duplicated names, addresses repeated across dozens of invoices — and the whole system stays consistent by design.

What is referential integrity?

Referential integrity is the rule that says: every foreign key value must exist in the table it references. You cannot have an invoice for customer #9999 if there is no row with ID 9999 in the customers table. This sounds obvious but it's the most common thing that breaks when you generate fake data naively — you end up with invoices referencing customer IDs that don't exist anywhere.

The Relational Data Generator handles this automatically. It generates the customers table first, captures all the real customer IDs it created, and then uses only those real IDs when building the address_book and invoices tables. The result is a dataset that will import cleanly into any database with FK constraints enabled.

customers PK customer_id UUID first_name VARCHAR last_name VARCHAR email VARCHAR phone VARCHAR created_at TIMESTAMPTZ address_book PK address_id UUID FK customer_id → customers address_line_1 VARCHAR city VARCHAR state CHAR(2) zip_code CHAR(5) invoices PK invoice_id UUID FK customer_id → customers invoice_number VARCHAR invoice_date DATE amount_due DECIMAL status VARCHAR PK = Primary Key FK = Foreign Key references relationship

The three-table schema we're building. Both address_book and invoices reference customers via a foreign key on customer_id.

The Schema We're Building

Our three tables and what each one stores:

TableWhat it storesPrimary KeyForeign Key
customersOne row per customer: name, email, phone, join datecustomer_id (UUID)None — this is the root table
address_bookOne or more addresses per customeraddress_id (UUID)customer_id → customers.customer_id
invoicesOne or more invoices per customerinvoice_id (UUID)customer_id → customers.customer_id

Notice that both address_book and invoices reference customers — customers is the parent table and the other two are child tables. A customer can have multiple addresses and multiple invoices, but every address and every invoice must belong to a real customer.

The record ratio we'll use: 100 customers → 200 addresses → 400 invoices. This means roughly 2 addresses per customer and 4 invoices per customer on average — a realistic ratio for a small business billing system.

Follow along as you read. Open the tool in a new tab and build as you go.

Open Relational Data Generator ↗
1
Open the Tool and Click "Start from Scratch"

Navigate to the Relational Data Generator. At the top of the page you'll see the quick-start template buttons. Since we're building a custom schema that doesn't match any of the pre-built templates, click ✦ Scratch (the rightmost button).

dataconversioncenter.com/generators/relational-data-generator/
Relational Data Generator — Multi-Table Datasets with SQL INSERT Build 2–5 related tables with real foreign key relationships. Export as SQL INSERT scripts or CSV/JSON/XML. QUICK-START TEMPLATES — OR BUILD FROM SCRATCH BELOW 🛒 E-Commerce 💼 CRM / SaaS 📝 Blog / CMS 👥 HR System 🏥 Healthcare ✦ Scratch click here
The template row at the top of the page. Click "✦ Scratch" to start with a blank canvas.

After clicking Scratch, the right panel will show an empty first table called "table_1" with a prompt to add columns from the field picker on the left. The tool is now ready for us to start building.

2
Build the customers Table

The customers table is the root of our schema — it has no foreign keys, only its own primary key. Everything else will reference it. Let's set it up.

2a — Name the table

The active table panel shows a text input with the default name "table_1". Click it and type customers.

Active table panel
TABLE NAME customers ← type this RECORDS 100
Rename the default table name to "customers" by clicking the name field and typing.

2b — Add the UUID primary key

In the field picker panel on the left, scroll to the Identifiers section and click UUID / GUID. A column appears in the table panel. Click the column name to rename it from "UUID / GUID" to customer_id.

💡 Why UUID instead of a number? UUID primary keys never conflict if you run the seed script multiple times or merge datasets from different runs. Auto-increment integers would collide. For test data, UUID is almost always the better choice.

Field picker (left panel) — Identifiers section
IDENTIFIERS UUID / GUID Customer ID Order ID Invoice Number SKU click to add PERSONAL IDENTITY First Name Last Name
Click "UUID / GUID" in the Identifiers section. It adds a UUID column to the active table.

2c — Add the remaining customer fields

Now add these fields from the field picker. For each one, click the column name after it appears and rename it:

Click in field pickerRename toWhy
UUID / GUIDcustomer_idPrimary key — every customer gets a unique ID
First Namefirst_nameCustomer's first name
Last Namelast_nameCustomer's last name
EmailemailContact email
PhonephoneContact phone number
Timestampcreated_atWhen the customer record was created
customers table — after adding all columns
customers Records: 100 customer_id UUID / GUID PK × first_name First Name × last_name Last Name × email Email × phone created_at
The completed customers table: 6 columns, all renamed to match the schema. The customer_id column shows a PK badge.

Column renaming tip: Click directly on the column name text (the white text in the row) to edit it inline. Press Enter or click away to confirm. Rename it to exactly what your database schema expects — customer_id not "UUID / GUID".

3
Build the address_book Table with a Foreign Key

Now we add the second table. This is where the relational magic happens — we'll add a foreign key that ties every address to a real customer.

3a — Add the second table

Click the + Add Table button at the top right of the builder panel. A new tab appears alongside the customers tab. Click the new table's name field and rename it address_book.

Table tabs after adding the second table
customers address_book + Add Table
Two table tabs are now visible. The active tab (address_book, highlighted in blue) is the one currently being edited.

3b — Add the UUID primary key

Just like before, click UUID / GUID in the Identifiers section of the field picker. Rename the column to address_id.

3c — Add the Foreign Key column

This is the most important step. At the very top of the field picker panel, above all the categories, you'll see a green chip labelled 🔗 Foreign Key. Click it.

Field picker — Foreign Key chip at the top
Click to add to the active table tab. 🔗 Foreign Key click here PERSONAL IDENTITY First Name
The green "Foreign Key" chip sits above all other field categories. Click it to open the FK configuration modal.

The Foreign Key modal opens. Fill it in as follows:

Foreign Key modal — configuring address_book.customer_id
🔗 Add Foreign Key Column Column name in this table customer_id References table customers References column customer_id Distribution Random — any parent (realistic clustering)
The FK modal configured for address_book. Column name: customer_id. References: customers → customer_id. Distribution: Random.

Click Add Column. The foreign key column appears in the table, highlighted in green to distinguish it from regular columns.

3d — Add the address fields

With the FK column in place, add the actual address data columns. Click each one in the field picker and rename:

Click in field pickerRename to
UUID / GUIDaddress_id
Foreign Key (configured above)customer_id
Address Line 1address_line_1
Address Line 2address_line_2
Citycity
Statestate
Zip Codezip_code
Booleanis_primary
address_book table — completed with FK column highlighted in green
address_book Records: 200 address_id UUID / GUID PK customer_id FK → customers address_line_1 Address Line 1 address_line_2 Address Line 2 city state zip_code is_primary Boolean ● Green rows are Foreign Key columns — linked to another table
The address_book table with 8 columns. The customer_id row is highlighted green, indicating it's a FK column linked to the customers table.
4
Build the invoices Table with a Foreign Key

The invoices table follows the same pattern as address_book. It references customers — so every invoice is tied to a real customer — but it stores invoice-specific data: invoice number, date, amount, and status.

4a — Add the third table

Click + Add Table again. Rename the new table to invoices.

4b — Add the UUID primary key and Foreign Key

Add UUID / GUID and rename it invoice_id. Then click the 🔗 Foreign Key chip again and configure it:

Three table tabs — all three tables are now defined
customers address_book invoices + Add Table
All three tables are now defined: customers, address_book, and invoices. The invoices tab is currently active.

4c — Add the invoice data fields

Click in field pickerRename toWhat it stores
UUID / GUIDinvoice_idUnique identifier for each invoice
Foreign Keycustomer_idWhich customer this invoice belongs to
Invoice Numberinvoice_numberHuman-readable invoice reference
Dateinvoice_dateDate the invoice was issued
Datedue_datePayment due date (add Date a second time)
Dollar Amountamount_dueTotal amount owed
Order StatusstatusInvoice status: paid, pending, overdue, cancelled
Lorem (sentence)notesOptional notes or description

💡 Adding the same field type twice: The Date field is used for both invoice_date and due_date. Just click Date in the field picker twice — two independent Date columns are added, each generating different random dates. Rename each one separately.

invoices table — completed
invoices Records: 400 invoice_id PK customer_id FK → customers invoice_number Invoice Number invoice_date due_date amount_due status notes Lorem (sentence) ● customer_id is a FK — every invoice references a real customer
The completed invoices table with 8 columns including the green FK column linking to customers.
5
Set Record Counts

Click each table tab in turn and set the record count in the header of each table panel:

TableRecord countRatio
customers100Root — 100 unique customers
address_book200~2 addresses per customer on average
invoices400~4 invoices per customer on average

These ratios make sense for a small business billing system. You can adjust them freely — set customers to 500 and invoices to 5,000 for a larger dataset, or reduce to 20/50/100 for lightweight test fixtures.

⚠️ Always make child tables larger than parent tables when using Random FK distribution. If you have 200 parent records and only 50 child records, many parents will have zero children — which may not reflect your application's assumptions.

6
Choose SQL Format and Dialect

Scroll down to the Settings panel below the table builder. This is where you choose your output format and SQL dialect.

Settings panel — output format and SQL options
OUTPUT FORMAT SQL CSV JSON (flat) JSON (nested) XML SQL DIALECT PostgreSQL ROWS PER INSERT BATCH 500 rows CREATE TABLE DROP TABLE IF EXISTS FOREIGN KEY constraints Wrap in transactions ⚡ Generate Dataset
Settings panel configured: SQL format selected (blue), PostgreSQL dialect, 500 rows per batch, CREATE TABLE and transactions enabled.

For this tutorial, use these settings:

7
Generate the Dataset and Review the Output

Click the ⚡ Generate Dataset button. A progress bar fills briefly as the tool generates all three tables in dependency order — customers first, then address_book and invoices (using the real customer IDs just generated).

Output panel — after generation completes
customers (100) address_book (200) invoices (400) SQL • PostgreSQL • 700 total records Copy ⬇ Download -- Script generated by DataConversionCenter.com -- Dialect: PostgreSQL | Tables: 3 | Total records: 700 -- Insert order: customers → address_book → invoices -- ── 1 of 3: customers (100 records) ──────────────────── CREATE TABLE IF NOT EXISTS "customers" ( "customer_id" UUID NOT NULL, "first_name" VARCHAR(100) NOT NULL, ...); PREVIEW — FIRST 6 ROWS CUSTOMER_ID FIRST_NAME LAST_NAME EMAIL f47ac10b-58cc... Sarah Mendoza sarah.mendoza@... b3c4d5e6-89ab... James Wilson james.wilson44@... a1b2c3d4-efab... Maria Thompson maria.thompson@...
The output panel after generation. Three table tabs show record counts (100/200/400). The SQL output and a preview of the first rows are shown. Click each tab to preview that table's data.

Switch between the table tabs in the output panel to preview each table. Notice that when you click the address_book tab, the customer_id column shows UUID values — and every one of those values exists in the customers table. That's referential integrity working as designed.

Click ⬇ Download to save the complete SQL script.

Understanding the Generated SQL

Let's look at what the downloaded script actually contains and why each part is structured the way it is.

The file header

The script opens with a comment block summarising everything:

-- Script generated by DataConversionCenter.com
-- Dialect: PostgreSQL  |  Tables: 3  |  Total records: 700
-- Generated: 2026-02-24
-- Insert order respects foreign key dependencies
-- ── Table order: customers → address_book → invoices

The "insert order" note is important. The script always inserts parent tables before child tables. If you try to run the address_book inserts before the customers inserts, the FK constraint check fails because the customer IDs don't exist yet. The generator handles this automatically.

The CREATE TABLE block for customers

-- ── 1 of 3: customers (100 records) ──────────────────────
CREATE TABLE IF NOT EXISTS "customers" (
  "customer_id"  UUID          NOT NULL,
  "first_name"   VARCHAR(100)  NOT NULL,
  "last_name"    VARCHAR(100)  NOT NULL,
  "email"        VARCHAR(255)  NOT NULL,
  "phone"        VARCHAR(25)   NOT NULL,
  "created_at"   TIMESTAMPTZ   NOT NULL,
  PRIMARY KEY ("customer_id")
);

Notice the PostgreSQL-specific types: UUID (a native type in PostgreSQL), TIMESTAMPTZ (timestamp with timezone). If you'd chosen MySQL, these would be VARCHAR(36) and DATETIME respectively. The generator maps each field type to the correct SQL type for your chosen dialect.

The CREATE TABLE block for address_book — with the FK constraint

-- ── 2 of 3: address_book (200 records) ───────────────────
CREATE TABLE IF NOT EXISTS "address_book" (
  "address_id"    UUID          NOT NULL,
  "customer_id"   UUID          NOT NULL,
  "address_line_1" VARCHAR(200) NOT NULL,
  "address_line_2" VARCHAR(100),
  "city"          VARCHAR(100)  NOT NULL,
  "state"         CHAR(2)       NOT NULL,
  "zip_code"      CHAR(5)       NOT NULL,
  "is_primary"    BOOLEAN       NOT NULL,
  PRIMARY KEY ("address_id"),
  FOREIGN KEY ("customer_id") REFERENCES "customers" ("customer_id")
);

The final line is the FK constraint. It tells PostgreSQL: the customer_id column in this table must always reference a value that exists in customers.customer_id. Any attempt to insert an address with a customer_id that doesn't exist in the customers table will be rejected by the database engine itself.

The bulk INSERT blocks — transaction-wrapped

BEGIN;

INSERT INTO "customers" ("customer_id", "first_name", "last_name", "email", "phone", "created_at") VALUES
  ('f47ac10b-58cc-4372-a567-0e02b2c3d479', 'Sarah', 'Mendoza', '[email protected]', '(847) 293-5571', '2023-03-14T09:42:17Z'),
  ('b3c4d5e6-89ab-cdef-0123-456789abcdef', 'James', 'Wilson', '[email protected]', '(312) 847-2910', '2022-11-07T14:22:44Z'),
  -- ... 98 more rows
;

COMMIT;

All 100 customers are inserted in a single batch, wrapped in BEGIN / COMMIT. If any row fails, the entire batch rolls back — you won't end up with a partially-inserted customers table.

The address_book inserts — with real customer IDs

BEGIN;

INSERT INTO "address_book" ("address_id", "customer_id", "address_line_1", ...) VALUES
  ('c8d9e0f1-...', 'f47ac10b-58cc-4372-a567-0e02b2c3d479', '4821 Ridgewood Ave', ...),
  ('d1e2f3a4-...', 'b3c4d5e6-89ab-cdef-0123-456789abcdef', '192 Maple St',       ...),
  -- ...
;

COMMIT;

Look at the customer_id values in the address_book insert: f47ac10b-58cc-4372-a567-0e02b2c3d479. That's the exact same UUID that was used as the primary key for Sarah Mendoza in the customers insert above. The generator captured all 100 customer UUIDs during generation and used them when building the address_book rows. That's referential integrity guaranteed.

The output data

Here's what the actual generated rows look like across all three tables. Notice that the customer_id values in address_book and invoices always match real IDs from customers:

Table 1customers — sample rows
customer_idfirst_namelast_nameemailphonecreated_at
f47ac10b-58cc-...SarahMendoza[email protected](847) 293-55712023-03-14T09:42Z
b3c4d5e6-89ab-...JamesWilson[email protected](312) 847-29102022-11-07T14:22Z
a1b2c3d4-efab-...MariaThompson[email protected](773) 510-48292024-01-19T08:33Z
Table 2address_book — sample rows (customer_id values match real customers above)
address_idcustomer_idaddress_line_1citystatezip_codeis_primary
c8d9e0f1-...f47ac10b-58cc-...4821 Ridgewood AveAustinTX78741true
d1e2f3a4-...f47ac10b-58cc-...113 Oak Blvd Apt 4BAustinTX78702false
e5f6a7b8-...b3c4d5e6-89ab-...192 Maple StChicagoIL60614true
Table 3invoices — sample rows (customer_id values match real customers above)
invoice_idcustomer_idinvoice_numberinvoice_dateamount_duestatus
11a2b3c4-...f47ac10b-58cc-...INV-2024-08922024-07-154291.47paid
22b3c4d5-...a1b2c3d4-efab-...INV-2024-10472024-09-03892.10pending
33c4d5e6-...f47ac10b-58cc-...INV-2025-00122025-01-22341.00overdue

Sarah Mendoza (f47ac10b-...) appears in both the address_book (she has two addresses) and in invoices (she has two invoices). Maria Thompson (a1b2c3d4-...) also appears in invoices. Every value in the customer_id column of both child tables is a real UUID from the customers table — that's what makes this dataset importable.

Exporting as CSV and JSON Instead

SQL isn't always the right format. If you need to import the data using a GUI tool, test it in a JavaScript application, or load it into a document database, switch the output format before clicking Generate.

CSV format

Select CSV in the format row. Each table downloads as a separate file: customers.csv, address_book.csv, invoices.csv. The FK relationships are preserved by value — the customer_id values in address_book.csv match real IDs in customers.csv.

CSV is the right choice for importing with a GUI tool like TablePlus, DBeaver, or pgAdmin's import wizard, or for loading data into Google Sheets or Excel for review.

JSON nested format

Select JSON (nested) to get a single file where addresses and invoices are embedded inside each customer object:

[
  {
    "customer_id": "f47ac10b-58cc-4372-a567-0e02b2c3d479",
    "first_name": "Sarah",
    "last_name": "Mendoza",
    "email": "[email protected]",
    "addresses": [
      { "address_id": "c8d9e0f1-...", "address_line_1": "4821 Ridgewood Ave", "city": "Austin", ... },
      { "address_id": "d1e2f3a4-...", "address_line_1": "113 Oak Blvd Apt 4B", ... }
    ],
    "invoices": [
      { "invoice_id": "11a2b3c4-...", "invoice_number": "INV-2024-0892", "amount_due": "4291.47", ... },
      { "invoice_id": "33c4d5e6-...", "invoice_number": "INV-2025-0012", "amount_due": "341.00", ... }
    ]
  }
]

This nested format is ideal for MongoDB or Firestore where documents can contain nested arrays, for GraphQL mock servers, or for React and Vue application development where components receive nested customer data from an API.

Try it yourself. The complete three-table schema from this tutorial takes about 5 minutes to build from scratch.

Open Relational Data Generator →

Database Normalization: Why Relational Tables Are Structured This Way

Relational tables are structured around normalization — a set of rules that eliminate data redundancy and ensure data integrity. Understanding normalization helps you design better schemas and understand why the generated tables are split the way they are.

First Normal Form (1NF)

Each column contains a single atomic value — no comma-separated lists, no arrays, no repeating groups. A customer's multiple phone numbers go in a separate phone_numbers table, not as phone1, phone2 columns.

Second Normal Form (2NF)

Every non-key column depends on the entire primary key. This matters for composite keys: in an order_items table keyed on (order_id, product_id), the product name belongs in the products table — not in order_items — because it only depends on product_id, not the full composite key.

Third Normal Form (3NF)

No non-key column depends on another non-key column (no transitive dependencies). A customers table should not store both zip_code and city — the city is determined by the zip code, creating a transitive dependency. City/state belong in a zip_codes reference table.

Why This Matters for Test Data Generation

The Relational Data Generator creates properly normalised schemas — customers, orders, and order items as separate linked tables. This mirrors real production database design and means the generated SQL and data is directly usable for realistic testing, not just toy examples.

Related Tools & Articles

BC
Bill Crawford
Founder, Data Conversion Center

Bill Crawford is a data systems developer and technical founder with over 30 years of professional experience in accounting, finance, and business operations.

He holds a Bachelor's degree in Accounting and has spent more than three decades working within financial and operational environments. Over the past 10 years, he has been heavily involved in the development, implementation, and refinement of financial and enterprise data systems for both Fortune 500 companies and smaller organizations.

His work bridges finance and technology — combining deep domain knowledge in structured reporting and accounting workflows with hands-on SQL development and database architecture experience.

Bill founded DataConversionCenter.com to build practical, browser-based tools that simplify complex data challenges, including:

Rather than focusing on theoretical examples, his tools and articles are informed by real-world challenges encountered in enterprise reporting systems, financial databases, and operational data environments.

Professional Background

Bill's mission is to reduce friction in data workflows — particularly for professionals working with structured financial, operational, and reporting data.