
The Complete Guide to Csv To Dbf: Everything You Need to Know

Bill Crawford — Developer Guide — 2026 · Published April 16, 2026

CSV and DBF sit on opposite sides of a common fault line in developer work. CSV is a plain-text, delimiter-separated, essentially typeless format that every modern tool ingests happily. DBF, the dBASE database file, is a binary, fixed-width, strongly typed container defined in 1980 and still anchoring a long tail of working software: FoxPro applications, Clipper-compiled ledgers, county assessor records, insurance back-office systems, and the attribute tables of ESRI shapefiles. Converting a CSV into a valid DBF is a routine developer task whenever a downstream consumer accepts only DBF and the source data arrived as a spreadsheet export or API dump.

The friction is practical. A developer receives a CSV from a client, an upstream API, or a data engineering pipeline, and needs to deliver it into a legacy Visual FoxPro application, a Clipper-compiled reporting tool, or a custom GIS workflow that expects a shapefile’s sibling .dbf. The CSV may have clean headers, messy headers, numeric columns stored as strings, dates in three different formats, and UTF-8 bytes that the target DBF consumer will not accept. Most command-line approaches involve installing a Python library that can write DBF, such as dbf or dbfpy (simpledbf only reads DBF files), writing a schema-mapping script, and hoping the target application accepts the dialect you emit.

The Csv To Dbf converter on this site does the full pipeline entirely in your browser. Drop a .csv file onto the page and it parses the header row, infers a field type and length per column from the sampled data, sanitises column names to the 10-character ASCII-uppercase constraint DBF requires, assembles a dBase III+ binary with the correct 32-byte file header, field descriptors, space-padded records, and the 0x1A end-of-file marker, and hands you a .dbf file ready to open in FoxPro, Clipper, ArcGIS, or any dBase-era consumer. No upload, no server processing, no login. This guide explains the DBF format, how the converter interprets a CSV, what edge cases matter, and the best practices developers should follow when feeding CSV data into a DBF-consuming pipeline.



Table of Contents

What Is the DBF Format?
CSV vs. DBF: Key Differences
How the Conversion Works
Field Types and How They Are Inferred
Column Name Sanitization
Character Encoding and Codepage Handling
Common Use Cases
Best Practices
Limitations and Edge Cases

What Is the DBF Format?

DBF is a binary, fixed-width, row-oriented database file format originally defined by the dBASE database management system in 1980. The format was extended through dBASE III, IV, 5, and 7, adopted by FoxPro and Visual FoxPro, and became the de facto attribute-table format for ESRI shapefiles — every shapefile you encounter in the wild ships with a companion .dbf file carrying the non-geometric attributes of each feature. The converter on this site emits dBase III+ specifically, which is the most broadly compatible dialect: it is accepted by classic dBASE, FoxPro, Visual FoxPro, Clipper, ArcGIS, QGIS, and virtually every DBF reader written in the last forty years.

The physical structure of a dBase III+ DBF file is as follows:

File header (32 bytes). Offsets are zero-based. Byte 0 is 0x03, identifying dBase III+ without memo. Bytes 1–3 hold the last-update date as three binary values: year (conventionally stored as years since 1900), month, and day. Bytes 4–7 hold the record count as a little-endian 32-bit integer. Bytes 8–9 hold the header length, and bytes 10–11 hold the record length, both little-endian 16-bit. The remaining bytes are reserved, apart from the language-driver ID at byte 29 described under Character Encoding and Codepage Handling.
Field descriptor array. Immediately after the file header, one 32-byte descriptor appears for each field. Each descriptor contains the field name (bytes 0–10: null-padded, ASCII, uppercase by convention, maximum 10 printable characters), the field type (byte 11: C for Character, N for Numeric, L for Logical, D for Date), four reserved bytes, the field length (byte 16), and the decimal count (byte 17, used only for numeric fields). The last 14 bytes of the descriptor are reserved and written as zeros.
Header terminator (1 byte). The descriptor array ends with a 0x0D carriage return byte.
Records. Every record has the same fixed length, computed as the sum of all field lengths plus one byte for a deletion flag. The leading byte is 0x20 (space) for active records or 0x2A (asterisk) for soft-deleted records. Field values are stored as ASCII or codepage-specific bytes: numeric fields are right-aligned (left-padded with spaces) and character fields are left-aligned (right-padded with spaces).
End-of-file marker (1 byte). The file ends with a 0x1A byte — the ASCII SUB character, retained from CP/M-era conventions.
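The layout above can be read back with a few lines of Python. This is a minimal sketch rather than the converter's own code: the function name parse_dbf_header and the returned dictionary keys are illustrative, and the year is decoded as 1900 plus the stored byte, which is the usual convention but is not enforced by every writer.

```python
import struct


def parse_dbf_header(data: bytes) -> dict:
    """Decode the 32-byte file header and the field descriptors after it."""
    # "<4BIHH": version, year, month, day, record count, header len, record len
    version, yy, mm, dd, n_records, header_len, record_len = struct.unpack_from(
        "<4BIHH", data, 0
    )
    fields = []
    offset = 32
    while data[offset] != 0x0D:  # 0x0D terminates the descriptor array
        raw_name, ftype = struct.unpack_from("<11sc", data, offset)
        # Length and decimal count sit at bytes 16 and 17 of the descriptor.
        length, decimals = struct.unpack_from("<BB", data, offset + 16)
        fields.append(
            {
                "name": raw_name.split(b"\x00", 1)[0].decode("ascii"),
                "type": ftype.decode("ascii"),
                "length": length,
                "decimals": decimals,
            }
        )
        offset += 32
    return {
        "version": version,
        "last_update": (1900 + yy, mm, dd),  # assumed year convention
        "records": n_records,
        "header_length": header_len,
        "record_length": record_len,
        "fields": fields,
    }
```

Calling it on the first few hundred bytes of a .dbf is usually enough, since the header length field tells you where the records begin.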

CSV vs. DBF: Key Differences

Understanding the structural gap between the two formats explains why the conversion produces the output it does and why certain CSV constructs must be coerced during the write.

Readability. CSV is plain text, Unicode-capable when encoded as UTF-8, and human-readable in any editor. DBF is binary and historically tied to single-byte codepages — CP437 OEM US, Windows-1252, or a codepage specified in the header’s language-driver byte.
Schema. CSV has no enforced schema; each row is a tuple of strings. DBF requires every record to share the same fields with identical types and lengths, fixed at write time. The converter infers a schema from the CSV by sampling each column, then applies it uniformly to every record.
Field names. CSV column headers may be any length and contain any characters. DBF field names are limited to 10 ASCII printable characters (stored in an 11-byte null-padded slot), uppercase by convention. The converter sanitises column names to this constraint and surfaces the mapping so you can audit it.
Type system. CSV values are strings until the consumer parses them. DBF fields are strongly typed at write time: Character, Numeric, Logical, or Date. The converter infers each column’s type by sampling the data and selects the narrowest type that accommodates every value.
Row length. CSV rows may be of any length; each row is self-delimited by a newline. DBF records are fixed-width; the record length is declared in the file header and every record must consume exactly that many bytes. Character fields are right-padded with spaces and numeric fields are left-padded to the declared length.
Nulls. CSV has no dedicated null representation — an empty field is two consecutive delimiters. DBF has no dedicated null either; empty character fields are all-space and empty numeric fields are conventionally all-space or zero. The converter emits all-space for empty source values.

How the Conversion Works

The Csv To Dbf converter reads your .csv file using the Web File API, parses it with a streaming CSV reader, and assembles a dBase III+ binary in memory using a DataView over an ArrayBuffer. The pipeline has four stages.

Stage 1 — CSV parse and sample. The header row is read first, yielding an ordered list of column names. The converter then streams through the records and samples each column to decide its field type and its required length. Sampling is full by default (every row is inspected) because a DBF field cannot be resized after the header is written; if a value in any later row were longer than the sampled maximum, it would be silently truncated.
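A full-scan width pass of this kind might look like the following Python sketch. It is an illustration under assumptions, not the converter's implementation: scan_columns is a hypothetical name, widths are measured in bytes after encoding to Windows-1252, and cells beyond the header's column count are ignored.

```python
import csv
import io


def scan_columns(csv_text: str, codepage: str = "cp1252"):
    """Return (header, max encoded byte width per column, record count)."""
    reader = csv.reader(io.StringIO(csv_text, newline=""))
    header = next(reader)
    widths = [0] * len(header)
    n_records = 0
    for row in reader:
        n_records += 1
        # Inspect every row: one long value late in the file sets the width.
        for i, cell in enumerate(row[: len(header)]):
            encoded = cell.encode(codepage, errors="replace")
            widths[i] = max(widths[i], len(encoded))
    return header, widths, n_records
```

Measuring the encoded byte length rather than the character count matters once a multi-byte output codepage is in play, because DBF field lengths are declared in bytes.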

Stage 2 — Schema construction. For each column the converter picks the narrowest DBF type that accommodates every sampled value: Numeric if every value parses as a finite number, Date if every value matches a recognised date pattern, Logical if every value is one of the accepted truthy/falsy tokens, otherwise Character. The column name is sanitised to the 10-character uppercase ASCII constraint, and conflicts (two columns sanitising to the same name) are disambiguated with a numeric suffix. The resulting schema — a list of {name, type, length, decimals} tuples — is rendered in the preview panel so you can audit it before exporting.

Stage 3 — Header and descriptor write. The 32-byte file header is assembled with the dBase III+ version byte 0x03, the current date as three binary bytes (year, month, day), the record count from the CSV, and the computed header and record lengths. Each field descriptor is then written in its 32-byte slot: 11 bytes for the name, one byte for the type code, four reserved bytes, one byte for the length, one byte for the decimal count, and 14 reserved bytes, with every reserved byte written as zero. The 0x0D header terminator closes the descriptor array.
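As a rough illustration of this stage, the header and descriptors can be packed with Python's struct module. The function name build_header and the (name, type, length, decimals) tuple shape are assumptions made for the sketch, and the year byte is written as years since 1900.

```python
import struct
from datetime import date


def build_header(fields, n_records, today=None):
    """Pack the file header, one descriptor per field, and the 0x0D terminator.

    fields: list of (name, type_code, length, decimals) tuples.
    """
    today = today or date.today()
    header_len = 32 + 32 * len(fields) + 1
    record_len = 1 + sum(length for _, _, length, _ in fields)  # +1 deletion flag
    out = bytearray(
        struct.pack(
            "<4BIHH20x",
            0x03,               # dBase III+ without memo
            today.year - 1900,  # assumed year convention
            today.month,
            today.day,
            n_records,
            header_len,
            record_len,
        )
    )
    for name, type_code, length, decimals in fields:
        if len(name) > 10:
            raise ValueError(f"field name too long: {name!r}")
        # 11-byte name, 1-byte type, 4 reserved, length, decimals, 14 reserved
        out += struct.pack(
            "<11sc4xBB14x",
            name.encode("ascii"),
            type_code.encode("ascii"),
            length,
            decimals,
        )
    out += b"\x0D"
    return bytes(out)
```

For two fields the result is 32 + 2 × 32 + 1 = 97 bytes, which is also the value stored in the header length field.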

Stage 4 — Record write and EOF. Each record is written at its fixed offset. The leading byte is 0x20 (active). For each field, the CSV value is coerced to the column’s declared type, formatted to the declared width (right-padded with spaces for Character, left-padded with spaces for Numeric, YYYYMMDD for Date, single-byte T/F for Logical), and written. The file ends with the 0x1A EOF marker. A download button appears with the resulting .dbf named after the input file.
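The per-field formatting rules can be sketched in Python as follows. The names format_field and format_record are hypothetical, Date input is assumed to already be ISO YYYY-MM-DD, the truthy token set is an assumption, and byte-level truncation of Character values is only safe for single-byte codepages.

```python
from datetime import date
from decimal import Decimal

TRUE_TOKENS = {"true", "t", "yes", "y", "1"}  # assumed truthy tokens


def format_field(value, ftype, length, decimals=0, codepage="cp1252"):
    """Return exactly `length` bytes for one field value."""
    if value == "":
        return b"?" if ftype == "L" else b" " * length
    if ftype == "C":
        raw = value.encode(codepage, errors="replace")
        return raw[:length].ljust(length, b" ")  # left-aligned, space-padded
    if ftype == "N":
        text = f"{Decimal(value):.{decimals}f}".encode("ascii")
        if len(text) > length:
            raise ValueError(f"{value!r} does not fit in {length} bytes")
        return text.rjust(length, b" ")  # right-aligned, space-padded
    if ftype == "D":
        return date.fromisoformat(value).strftime("%Y%m%d").encode("ascii")
    if ftype == "L":
        return b"T" if value.lower() in TRUE_TOKENS else b"F"
    raise ValueError(f"unsupported field type {ftype!r}")


def format_record(values, fields):
    """fields: list of (name, type_code, length, decimals) tuples."""
    parts = [b" "]  # 0x20 marks the record as active
    for value, (_, ftype, length, decimals) in zip(values, fields):
        parts.append(format_field(value, ftype, length, decimals))
    return b"".join(parts)
```

Using Decimal rather than float for Numeric values avoids binary rounding when the source text is re-rendered to a fixed number of decimals.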

The entire pipeline runs in the browser’s JavaScript engine. The .csv file never leaves your machine — it is read directly from the drop target into memory. For a typical CSV of a few thousand rows and 10–20 columns, parsing and writing complete in well under a second.

Field Types and How They Are Inferred

The converter emits four field types in the dBase III+ dialect; the Memo type is not written (see Limitations and Edge Cases). For each column, the converter inspects every sampled value and chooses the narrowest compatible type:

Numeric (N). If every non-empty value in the column parses as a finite JavaScript number, the column becomes Numeric. The field length is set to the widest observed string representation of the number, and the decimal count is set to the maximum number of digits observed after the decimal separator. Empty source values are emitted as all-space bytes in the record. Integers with leading zeros (for example, US ZIP codes) are typed as Numeric and lose the zeros; CSV itself has no way to declare a column type, so force the column to Character typing as described under Best Practices if this matters.
Logical (L). If every non-empty value is one of a small set of recognised truthy/falsy tokens (true/false, t/f, yes/no, y/n, 1/0), the column becomes Logical. It is written as a single byte: T for true, F for false, ? for empty.
Date (D). If every non-empty value matches a recognised date pattern (ISO YYYY-MM-DD, slash-separated YYYY/MM/DD, or MM/DD/YYYY with a four-digit year), the column becomes Date. Values are written as 8 ASCII bytes in YYYYMMDD form. Ambiguous formats (two-digit years, DD/MM vs MM/DD ambiguity) fall back to Character — parse them yourself in the CSV if you need Date type.
Character (C). The catch-all. Any column that fails the Numeric, Logical, and Date tests becomes Character. The field length is set to the longest observed byte length after encoding; oversized cells trigger a warning in the preview and are truncated on write. Empty values become all-space.

The converter displays a schema summary panel after parsing — column name, inferred type, field length, decimal count, and a sample value — so you can confirm the interpretation before downloading. If a numeric column is being typed as Character because one row contains a stray letter, you will see it here.
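The inference rules above can be approximated in a short Python sketch. It is not the converter's code: the function names are illustrative, the number test uses a regular expression instead of JavaScript's number parsing, and the tests run in the order Numeric, Logical, Date, which means a column containing only 1 and 0 is typed Numeric here.

```python
import re
from datetime import datetime

NUMBER = re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)([eE][+-]?\d+)?")
LOGICAL_TOKENS = {"true", "false", "t", "f", "yes", "no", "y", "n", "1", "0"}
DATE_FORMATS = ("%Y-%m-%d", "%Y/%m/%d", "%m/%d/%Y")  # four-digit years only


def is_date(text: str) -> bool:
    for fmt in DATE_FORMATS:
        try:
            datetime.strptime(text, fmt)
            return True
        except ValueError:
            continue
    return False


def infer_type(values) -> str:
    """Return 'N', 'L', 'D', or 'C' for one column of string values."""
    non_empty = [v.strip() for v in values if v.strip()]
    if not non_empty:
        return "C"  # nothing to go on: fall back to Character
    if all(NUMBER.fullmatch(v) for v in non_empty):
        return "N"
    if all(v.lower() in LOGICAL_TOKENS for v in non_empty):
        return "L"
    if all(is_date(v) for v in non_empty):
        return "D"
    return "C"
```

A single stray value is enough to push a column to Character, which is why the schema summary is worth reading before export.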

Column Name Sanitization

DBF field names are limited to 10 printable ASCII characters, stored in an 11-byte null-padded slot, uppercase by convention. A real-world CSV frequently carries column names like Customer Name, Order #, or Total (USD) that cannot be written to DBF verbatim. The converter applies a deterministic sanitization pass before the header is written:

Non-ASCII characters are transliterated to the closest ASCII equivalent, or stripped if no equivalent exists.
Spaces and punctuation other than the underscore are replaced with an underscore or removed.
The result is uppercased and truncated to 10 characters.
If two columns sanitise to the same name (for example Customer Name and Customer Number both truncating to CUSTOMER_N), a numeric suffix disambiguates them.
A column name that starts with a digit is prefixed with an underscore, since DBF consumers vary in how they handle leading digits.

The converter renders the full source-to-DBF name map in the preview panel so you can audit every rename. If the renames would break downstream code that refers to the DBF fields by name, adjust the CSV headers upstream — using DBF-friendly column names in the CSV is the surest way to ship predictable output.
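If you want to predict the renames before dropping a file, a pass along these lines can be run locally. It is a sketch under assumptions: sanitize_name and sanitize_all are hypothetical names, transliteration is approximated with Unicode NFKD decomposition, leading and trailing underscores are trimmed, and the numeric suffix overwrites the tail of the name so the result stays within 10 characters. The converter's exact output may differ, so treat its rename map as authoritative.

```python
import re
import unicodedata


def sanitize_name(raw: str) -> str:
    """Reduce one CSV header to at most 10 uppercase ASCII characters."""
    # Decompose accented letters, then drop whatever is still non-ASCII.
    ascii_text = (
        unicodedata.normalize("NFKD", raw).encode("ascii", "ignore").decode("ascii")
    )
    cleaned = re.sub(r"[^A-Za-z0-9_]+", "_", ascii_text).strip("_").upper()
    if not cleaned:
        cleaned = "FIELD"  # assumed placeholder for an empty result
    if cleaned[0].isdigit():
        cleaned = "_" + cleaned
    return cleaned[:10]


def sanitize_all(headers):
    """Sanitise every header and resolve collisions with a numeric suffix."""
    used = set()
    result = []
    for raw in headers:
        base = sanitize_name(raw)
        name, n = base, 1
        while name in used:
            suffix = str(n)
            name = base[: 10 - len(suffix)] + suffix
            n += 1
        used.add(name)
        result.append(name)
    return result
```

With this scheme, Customer Name and Customer Number come out as CUSTOMER_N and CUSTOMER_1, which shows how little of the original name can survive a collision.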

Character Encoding and Codepage Handling

The DBF format predates Unicode. Byte 29 of the file header is a language-driver ID that specifies a codepage: a numeric hint such as 0x01 (CP437 OEM US), 0x02 (CP850 OEM multilingual), or 0x03 (Windows-1252 ANSI, which some writers signal with 0x57 instead). The converter defaults to writing with Windows-1252 and setting the language-driver byte accordingly, since Windows-1252 is the most common codepage accepted by FoxPro and Clipper consumers in Western locales.

CSV files in 2026 are almost always UTF-8. The converter decodes the CSV as UTF-8 by default and then encodes each Character field’s bytes in the chosen output codepage. Characters not representable in Windows-1252 (emoji, CJK glyphs, Cyrillic, Arabic, and so on) are replaced with a question mark byte and the affected records are flagged in the parse-status panel so you can investigate. For CJK or other non-Latin data, select a matching codepage in the converter’s advanced settings before writing.
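The replace-and-flag behaviour can be reproduced with Python's built-in codecs, which is handy for checking a CSV before conversion. The function name encode_cell and the LANGUAGE_DRIVER_IDS table are illustrative; the table lists only the three IDs mentioned in this section.

```python
# Language-driver IDs for header byte 29 (subset, for illustration).
LANGUAGE_DRIVER_IDS = {"cp437": 0x01, "cp850": 0x02, "cp1252": 0x03}


def encode_cell(text: str, codepage: str = "cp1252"):
    """Encode one cell; return (bytes, lossy) where lossy marks replacements."""
    encoded = text.encode(codepage, errors="replace")  # unmappable chars -> b"?"
    # Round-tripping detects replacements without misreading a literal "?".
    lossy = encoded.decode(codepage) != text
    return encoded, lossy


if __name__ == "__main__":
    for sample in ("Zoë", "東京 Café", "Why?"):
        data, lossy = encode_cell(sample)
        print(sample, data, "LOSSY" if lossy else "ok")
```

Comparing the round-tripped text with the original is more reliable than searching the output for question marks, since a genuine question mark in the source would otherwise raise a false alarm.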

Shapefile attribute tables commonly ship with a sibling .cpg file containing a single text line naming the codepage. If you are writing a DBF for a shapefile consumer, match the codepage its reader expects — ArcGIS and QGIS respect a companion .cpg, while some legacy readers silently assume OEM US.

Common Use Cases

Feeding a legacy FoxPro or Clipper application. Vintage FoxPro and Clipper systems in insurance, accounting, and records management remain deployed. A modern pipeline generating CSV (from an API, ETL tool, or data warehouse) can hand off to these systems by converting the CSV to DBF in the browser. Drop the CSV, download the DBF, deliver it to the legacy application.

Writing ESRI shapefile attribute tables. An ESRI shapefile is a set of files (.shp, .shx, .dbf, and optional .prj/.cpg) where the .dbf carries the non-geometric attributes. If you are assembling a shapefile from a modern pipeline, converting a CSV of attributes to DBF is the quickest way to produce the attribute-table component without pulling a full GIS library into the build.

Migrating data into a dBASE archive. County assessors, municipal records offices, and academic archives occasionally require data deposits in dBASE format for long-term retention. Converting a modern CSV export to dBase III+ produces a deposit-ready artefact in the canonical format.

Round-tripping data for legacy reporting. A reporting tool written in Clipper or FoxPro may be the only sanctioned path for certain regulatory reports in a long-running organisation. If a newer system has the data in CSV, converting to DBF lets the legacy reporter consume it without a schema-mapping rewrite.

Building test fixtures for DBF-reading code. A developer writing or maintaining code that reads DBF (for example, a custom GIS pipeline or an accounting-system importer) can use the converter to produce synthetic DBF fixtures from hand-crafted CSV inputs — a faster loop than authoring DBF files in a hex editor.

Best Practices

Validate the CSV before converting. A CSV with unquoted embedded commas, mixed line endings, or inconsistent column counts produces a DBF in which values land in the wrong fields. Pass the file through the CSV Validator first to confirm the structure is clean.
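For a quick local check of the most common structural fault, inconsistent column counts, a few lines of Python suffice. This is a narrow sketch with a hypothetical function name, not a substitute for a full validator: it only compares each row's field count with the header's.

```python
import csv
import io


def find_ragged_rows(csv_text: str):
    """Return (line number, field count) for rows that disagree with the header."""
    reader = csv.reader(io.StringIO(csv_text, newline=""))
    header = next(reader)
    ragged = []
    for row in reader:
        if len(row) != len(header):
            # line_num is the physical line on which the row ended
            ragged.append((reader.line_num, len(row)))
    return ragged
```

An unquoted embedded comma shows up here as a row with one field too many, which is exactly the fault that would otherwise shift values into the wrong DBF fields.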

Inspect the schema summary before exporting. The converter renders a schema table showing field name, inferred type, length, decimal count, and a sample value per column. Skim this before downloading — the common surprises are a numeric ID typed as Character (because one stray row contains a non-digit) or a column forced to Character instead of Date because one cell is in an unrecognised date format.

Use DBF-friendly column names in the source CSV. If you control the CSV headers, use uppercase ASCII names of 10 characters or fewer — this eliminates sanitization surprises. If you do not control the source, consult the rename map in the preview panel before downstream code references the field names.

Force Character typing when leading zeros matter. US ZIP codes, SKUs with leading zeros, and fixed-width identifiers are semantically strings even though they look numeric. Pre-quote these columns in the CSV or add a non-digit prefix to force Character typing. Once a column is stored as Numeric, the leading zeros are gone.
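Columns at risk can be spotted ahead of time with a small scan. The sketch below is illustrative only: columns_with_leading_zeros is a hypothetical helper that flags any column containing an all-digit value that starts with zero and has at least two digits.

```python
import re

LEADING_ZERO = re.compile(r"0\d+")  # all digits, at least two, starting with 0


def columns_with_leading_zeros(header, rows):
    """Return the names of columns where some value would lose leading zeros."""
    flagged = []
    for i, name in enumerate(header):
        if any(
            i < len(row) and LEADING_ZERO.fullmatch(row[i].strip()) for row in rows
        ):
            flagged.append(name)
    return flagged
```

A value such as 0.5 is not flagged, because the pattern requires the whole cell to be digits.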

Confirm the target consumer’s codepage expectation. FoxPro and Clipper in Western locales typically accept Windows-1252; ArcGIS respects a companion .cpg; legacy DOS readers expect CP437. Match the codepage to the consumer before writing or the consumer will show mojibake.

Spot-check the output by opening it in the target application. Before handing off a DBF to production, open it in the actual consuming application (FoxPro, ArcGIS, a Clipper-compiled program, or whichever tool applies) and confirm the field types, lengths, and values match expectations. A DBF that passes the converter’s internal checks can still fail in a consumer that enforces stricter conventions (for example, requiring all field names to begin with a letter).
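Before opening the file in the target application, a structural sanity check can catch gross problems early. The sketch below is a hypothetical helper that verifies only the arithmetic implied by the format (header length, record length, record count, terminators); it says nothing about whether a particular consumer will accept the file, and some third-party writers omit the final 0x1A byte.

```python
import struct


def check_dbf_layout(data: bytes):
    """Return a list of layout problems; an empty list means the sizes add up."""
    if len(data) < 32:
        return ["file is shorter than the 32-byte header"]
    problems = []
    n_records, header_len, record_len = struct.unpack_from("<IHH", data, 4)
    if data[0] != 0x03:
        problems.append("version byte is not 0x03")
    if header_len < 65 or (header_len - 33) % 32 != 0:
        problems.append("header length is not 32 + 32 * fields + 1")
    elif len(data) < header_len or data[header_len - 1] != 0x0D:
        problems.append("descriptor array is not terminated by 0x0D")
    expected = header_len + n_records * record_len + 1  # +1 for the EOF marker
    if len(data) != expected:
        problems.append(f"file size {len(data)} != expected {expected}")
    elif data[-1] != 0x1A:
        problems.append("missing 0x1A end-of-file marker")
    return problems
```

In practice you would call it with the bytes read from the downloaded file, for example check_dbf_layout(open("export.dbf", "rb").read()), where export.dbf is a placeholder name.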

Keep the original CSV. The DBF is derived; the CSV is the source of truth until you have validated the full round trip. Retain the CSV until your downstream consumer has accepted the DBF successfully.

Limitations and Edge Cases

Memo (.dbt) fields are not emitted. Classic DBF supports Memo fields: long-text columns whose content lives in a companion .dbt file. The converter emits dBase III+ without memo (version byte 0x03) and writes all string content as Character fields. For sources with text longer than 254 bytes in a single cell, truncation will occur and is flagged in the parse-status panel.

Visual FoxPro types are not emitted. Visual FoxPro added Currency (Y), DateTime (T), Integer (I), and Double (B) types beyond classic DBF. The converter does not emit these because not every consumer accepts them. If your CSV has a high-precision decimal or a timestamp-including column, write it as Character with an ISO-formatted string rather than relying on a FoxPro-specific extension.

Field length cap. A DBF Character field is limited to 254 bytes. If your CSV has a column whose longest cell exceeds 254 bytes after encoding, the converter truncates on write and surfaces the affected records in the parse-status panel. Split oversized columns upstream in the CSV, or accept truncation for the rare case where that is the correct behaviour.

Record count cap. dBase III+ stores the record count as a 32-bit little-endian integer, yielding a theoretical cap of about 4 billion records. In practice, browser memory constraints will cap you well below this, because the entire CSV is read into memory before the DBF is assembled. For bulk conversion work involving hundreds of millions of rows, a Python script built on a DBF-writing library such as dbf, or a custom streaming DBF writer, is more appropriate.

Non-Latin content with the default codepage. The default output codepage is Windows-1252. Characters not representable in Windows-1252 are replaced with question marks. If your CSV contains CJK, Cyrillic, or Arabic content, switch the output codepage in the advanced settings to a matching DBF codepage (CP936 for Simplified Chinese, CP1251 for Cyrillic, and so on) before writing.

Very large single files. The converter reads the entire .csv into browser memory, infers the schema by full-sample scan, and assembles the DBF in memory. Files of several hundred megabytes may exceed the available memory of low-RAM devices or mobile browsers. Desktop browsers on modern hardware handle files up to approximately 200 MB comfortably.
