The Complete Guide to DBF to CSV Conversion: Everything You Need to Know
DBF and CSV are the two file formats most likely to land on a developer’s desk when data crosses between a legacy system and a modern pipeline. DBF — the dBASE database file — is a binary, fixed-width, strongly typed container defined in 1980 and still anchoring a long tail of deployed software: FoxPro applications, Clipper-compiled ledgers, county assessor records, insurance back-office systems, and the attribute tables of every ESRI shapefile. CSV is its structural opposite: plain text, delimiter-separated, essentially typeless, and read natively by every modern tool from Excel to pandas to jq. Converting a DBF into a clean CSV is the routine path out of a legacy system and into a modern analysis, reporting, or ETL workflow.
The friction is practical. A developer receives a .dbf file from a client, pulls one out of a shapefile bundle, or exports one from a FoxPro application, and needs the rows in a spreadsheet, a database, or a pandas dataframe. Most command-line approaches involve installing a Python library like simpledbf or dbfread, writing a type-coercion script, and guessing the source codepage when non-ASCII bytes show up. The friction is worse when the DBF came from an unknown producer and the codepage is not documented in the file header.
The Dbf To Csv Converter on this site does the full pipeline entirely in your browser. Drop a .dbf file onto the page and it parses the 32-byte file header, reads the field descriptor array, decodes each fixed-width record according to its declared field types, maps the file’s language-driver byte to a codepage, and emits a clean UTF-8 CSV with quoted fields where needed. The file ships back as a .csv download ready for Excel, pandas, PostgreSQL COPY, or any CSV-consuming pipeline. No upload, no server processing, no login. This guide explains the DBF format, how the converter reads it, what edge cases matter, and the best practices developers should follow when pulling DBF data into a modern workflow.
What Is the DBF Format?
DBF is a binary, fixed-width, row-oriented database file format originally defined by the dBASE database management system in 1980. The format was extended through dBASE III, IV, 5, and 7, adopted by FoxPro and Visual FoxPro, and became the de facto attribute-table format for ESRI shapefiles — every shapefile you encounter in the wild ships with a companion .dbf file carrying the non-geometric attributes of each feature. The converter on this site reads all widely deployed dialects: dBase III+, dBase IV, dBase 5, FoxPro 2, and Visual FoxPro, plus the shapefile dBase dialect.
The physical structure of a DBF file is as follows:
- File header (32 bytes). The first byte is a version signature that identifies the dialect: 0x03 for dBase III+ without memo, 0x83 for dBase III+ with memo, 0x30 for Visual FoxPro, 0xF5 for FoxPro 2 with memo, and so on. The next three bytes are a YYMMDD last-update date. Bytes 4–7 hold the record count as a little-endian 32-bit integer. Bytes 8–9 hold the header length, and bytes 10–11 hold the record length, both little-endian 16-bit. Byte 29 is the language-driver ID encoding the codepage.
- Field descriptor array. Immediately after the file header, one 32-byte descriptor appears for each field. Each descriptor contains the field name (11 bytes, null-padded, ASCII), the field type (1 byte: C, N, L, D, M, and in FoxPro dialects Y, T, I, B, F), the field length (1 byte), and the decimal count (1 byte).
- Header terminator (1 byte). The descriptor array ends with a 0x0D carriage return byte.
- Records. Every record has the same fixed length, computed as the sum of all field lengths plus one byte for a deletion flag. The leading byte is 0x20 (space) for active records or 0x2A (asterisk) for soft-deleted records. Fields are stored as ASCII or codepage-specific bytes, left-padded with spaces for numeric fields and right-padded with spaces for character fields.
- End-of-file marker (1 byte). The file ends with a 0x1A byte, the ASCII SUB character retained from CP/M-era conventions. Some DBF writers omit this byte; readers should not rely on its presence.
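The header layout above maps onto a handful of fixed-offset reads. The following is a minimal Python sketch that assumes only the byte offsets listed above; the function name parse_dbf_header and the hand-built sample bytes are illustrative and are not taken from the converter.

```python
import struct

def parse_dbf_header(buf: bytes) -> dict:
    """Read the fixed-offset fields of a 32-byte DBF file header."""
    if len(buf) < 32:
        raise ValueError("a DBF file header is 32 bytes long")
    # "<" = little-endian, no padding. B: version byte, 3B: YY MM DD,
    # I: record count (bytes 4-7), H: header length (bytes 8-9),
    # H: record length (bytes 10-11).
    version, yy, mm, dd, n_records, header_len, record_len = struct.unpack_from(
        "<B3BIHH", buf, 0
    )
    return {
        "version": version,
        "last_update": (yy, mm, dd),   # raw YY, MM, DD bytes
        "record_count": n_records,
        "header_length": header_len,
        "record_length": record_len,
        "language_driver": buf[29],    # byte 29: language-driver ID
    }

# Hand-built sample: dBase III+ without memo, 2 records, 2 fields
# (32 + 2*32 + 1 = 97 header bytes), 19-byte records.
sample_header = (
    struct.pack("<B3BIHH", 0x03, 95, 7, 26, 2, 97, 19)
    + bytes(17)        # bytes 12-28, not interpreted in this sketch
    + bytes([0x03])    # byte 29
    + bytes(2)         # bytes 30-31
)
header = parse_dbf_header(sample_header)
print(header)
```

Reading every multi-byte value with an explicit little-endian format keeps the result independent of the machine the script runs on.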
DBF vs. CSV: Key Differences
Understanding the structural gap between the two formats explains why the conversion produces the output it does and why certain DBF constructs require interpretation during the read.
- Readability. DBF is binary and historically tied to single-byte codepages — CP437 OEM US, Windows-1252, or a codepage declared in the header’s language-driver byte. CSV is plain text, Unicode-capable when encoded as UTF-8, and human-readable in any editor.
- Schema. DBF requires every record to share the same fields with identical types and lengths, fixed at write time. CSV has no enforced schema; each row is a tuple of strings that the consumer interprets as it sees fit.
- Field names. DBF field names are limited to 10 ASCII printable characters (stored in an 11-byte null-padded slot), uppercase by convention. CSV headers have no length or character restrictions.
- Type system. DBF fields are strongly typed at write time: Character, Numeric, Logical, Date, and in FoxPro dialects Currency, DateTime, Integer, Double, and Float. CSV values are strings until the consumer parses them — the CSV produced by this converter preserves the source values as readable text so the consumer can type-coerce them.
- Row length. DBF records are fixed-width; the record length is declared in the file header and every record consumes exactly that many bytes. CSV rows may be of any length; each row is self-delimited by a newline.
- Deletion. DBF uses a soft-delete convention: the first byte of every record is either a space (active) or an asterisk (deleted). CSV has no equivalent. The converter skips soft-deleted records by default and surfaces the count in the preview panel so you can opt in to including them.
- Nulls. DBF has no dedicated null representation; empty fields are all-space in Character columns and all-space or zero in Numeric columns. CSV represents missing values as an empty field between two consecutive delimiters. The converter emits an empty field for all-space DBF values.
How the Conversion Works
The Dbf To Csv Converter reads your .dbf file using the Web File API, parses the binary structure with a DataView over the file’s ArrayBuffer, and assembles a UTF-8 CSV in memory. The pipeline has four stages.
Stage 1 — Header parse. The first 32 bytes are read as the file header. The version byte is mapped to a dialect label shown in the preview panel. The record count, header length, and record length are extracted from their little-endian 16- and 32-bit fields. The language-driver byte is mapped to a codepage — Windows-1252, CP437, CP850, CP1251, CP936, and so on — that will be used to decode Character fields. If the language-driver byte is zero (common in older files), the converter defaults to Windows-1252 and flags the assumption in the status panel.
Stage 2 — Field descriptor parse. The descriptor array is read in 32-byte slots until the 0x0D terminator appears. Each descriptor yields a field name (stripped of null padding and lowercased for CSV output only if you enable the option), a type code, a byte length, and a decimal count. The schema is rendered in the preview panel so you can audit the field list before exporting.
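As a rough illustration of this stage, the Python sketch below walks 32-byte slots until it meets the 0x0D terminator. It assumes the commonly documented descriptor layout (name in bytes 0-10, type code in byte 11, length in byte 16, decimal count in byte 17); Field, parse_field_descriptors, and make_descriptor are hypothetical names, not the converter's own code.

```python
from typing import NamedTuple

class Field(NamedTuple):
    name: str
    type_code: str
    length: int
    decimals: int

def parse_field_descriptors(buf: bytes, start: int = 32) -> list:
    """Collect descriptors from `start` until the 0x0D terminator."""
    fields = []
    offset = start
    while True:
        if offset >= len(buf):
            raise ValueError("no 0x0D header terminator found")
        if buf[offset] == 0x0D:
            return fields
        slot = buf[offset:offset + 32]
        if len(slot) < 32:
            raise ValueError("truncated field descriptor")
        # The name is null-padded ASCII; keep the part before the first NUL.
        name = slot[:11].split(b"\x00", 1)[0].decode("ascii")
        fields.append(Field(name, chr(slot[11]), slot[16], slot[17]))
        offset += 32

def make_descriptor(name: str, type_code: str, length: int, decimals: int = 0) -> bytes:
    """Build one 32-byte descriptor for the demonstration below."""
    return (
        name.encode("ascii").ljust(11, b"\x00")
        + type_code.encode("ascii")
        + bytes(4)                      # bytes 12-15, ignored here
        + bytes([length, decimals])     # bytes 16 and 17
        + bytes(14)                     # bytes 18-31, ignored here
    )

sample = (
    bytes(32)                           # placeholder file header
    + make_descriptor("NAME", "C", 10)
    + make_descriptor("PRICE", "N", 8, 2)
    + b"\x0d"
)
schema = parse_field_descriptors(sample)
print(schema)
```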
Stage 3 — Record decode. Each record is read at its fixed offset computed from the header length plus record_index × record_length. The leading byte is checked: 0x2A records are skipped (soft-deleted) unless the include-deleted option is enabled. For each field, the raw bytes are sliced from the record, decoded according to the type code — trimmed of trailing spaces for Character, parsed as a signed decimal for Numeric, mapped to T/F/empty for Logical, reformatted to ISO YYYY-MM-DD for Date — and passed through codepage decoding for Character fields.
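The offset arithmetic and per-type rules in this stage can be sketched in Python as follows. This covers only the text-encoded types (C, N, F, L, D), uses true/false as the Logical tokens, and skips soft-deleted records; decode_text_field, decode_records, and the sample bytes are assumptions made for illustration.

```python
def decode_text_field(raw: bytes, type_code: str, encoding: str) -> str:
    """Decode one text-encoded DBF field to the string written to the CSV."""
    if type_code == "C":
        return raw.decode(encoding).rstrip(" ")     # drop right padding
    text = raw.decode("ascii").strip()
    if type_code in ("N", "F"):
        return text                                  # signed decimal or empty
    if type_code == "L":
        if text in ("T", "t", "Y", "y"):
            return "true"
        if text in ("F", "f", "N", "n"):
            return "false"
        return ""                                    # "?" or space
    if type_code == "D":
        # YYYYMMDD becomes YYYY-MM-DD; blank dates stay empty.
        return f"{text[:4]}-{text[4:6]}-{text[6:8]}" if len(text) == 8 else ""
    raise ValueError(f"type {type_code!r} is not handled in this sketch")

def decode_records(buf, header_len, record_len, n_records, fields, encoding="cp1252"):
    """Return decoded rows; `fields` is a list of (name, type_code, length)."""
    rows = []
    for index in range(n_records):
        start = header_len + index * record_len      # fixed record offset
        record = buf[start:start + record_len]
        if len(record) < record_len:
            break                                    # file ends early
        if record[0] == 0x2A:
            continue                                 # soft-deleted record
        pos, row = 1, []                             # byte 0 is the flag
        for _name, type_code, length in fields:
            row.append(decode_text_field(record[pos:pos + length], type_code, encoding))
            pos += length
        rows.append(row)
    return rows

fields = [("NAME", "C", 10), ("PRICE", "N", 8), ("ACTIVE", "L", 1), ("BORN", "D", 8)]
records = (
    b" " + b"Ada".ljust(10) + b"12.50".rjust(8) + b"T" + b"18151210"
    + b"*" + b"Gone".ljust(10) + b"0.00".rjust(8) + b"F" + b"19000101"
    + b" " + b"Caf\xe9".ljust(10) + b" " * 8 + b"?" + b" " * 8
)
data = bytes(4) + records                            # 4 dummy header bytes
rows = decode_records(data, header_len=4, record_len=28, n_records=3, fields=fields)
print(rows)
```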
Stage 4 — CSV write. The header row of column names is emitted first, comma-separated and quoted where necessary. Each record is then written with fields separated by commas; fields containing commas, double quotes, or line breaks are wrapped in double quotes and internal quotes are doubled. Line endings are CRLF by default (the RFC 4180 default), switchable to LF for UNIX pipelines. The assembled CSV is handed to the browser as a .csv download named after the input file.
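The quoting rules described here are the ones Python's standard csv module applies with QUOTE_MINIMAL, so a short sketch can show the expected output shape. The function name write_csv and the sample row are illustrative only; the converter itself runs in the browser and does not use this code.

```python
import csv
import io

def write_csv(header, rows, line_ending="\r\n") -> bytes:
    """Serialise rows as UTF-8 CSV, quoting only the fields that need it."""
    out = io.StringIO(newline="")        # keep line endings exactly as written
    writer = csv.writer(out, quoting=csv.QUOTE_MINIMAL, lineterminator=line_ending)
    writer.writerow(header)
    writer.writerows(rows)
    return out.getvalue().encode("utf-8")

crlf_csv = write_csv(["NAME", "NOTE"], [["Ada", 'said "hi", left']])
lf_csv = write_csv(["NAME", "NOTE"], [["Ada", "plain"]], line_ending="\n")
print(crlf_csv.decode("utf-8"))
```

The field containing a comma and double quotes is wrapped in quotes with its internal quotes doubled, while the plain fields are left bare.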
The entire pipeline runs in the browser’s JavaScript engine. The .dbf file never leaves your machine — it is read directly from the drop target into memory. For a typical DBF of a few thousand rows and 10–20 fields, parsing and writing complete in well under a second.
Field Types and How They Are Decoded
The converter decodes the common field types of the dBase III+, dBase IV, FoxPro 2, and Visual FoxPro dialects. For each field in the descriptor array, the converter applies the decoding rule corresponding to its type code:
- Character (C). Raw bytes are decoded using the codepage derived from the language-driver byte (or Windows-1252 if unset). Trailing spaces are stripped by default, because DBF right-pads Character fields to the declared length. The result is emitted as a UTF-8 string in the CSV.
- Numeric (N). Bytes are interpreted as ASCII signed decimal. Leading spaces are stripped; a sign and digits with an optional decimal point are read. Empty or all-space fields become empty in the CSV. The decimal count from the descriptor is respected when formatting.
- Float (F). dBase IV float type. Decoded identically to Numeric (ASCII signed decimal) and emitted to the CSV as the raw numeric string.
- Logical (L). Single-byte field. T/t/Y/y become true; F/f/N/n become false; ? or space becomes empty. Output tokens are switchable between true/false, T/F, and 1/0 in the converter’s advanced settings to suit your downstream consumer.
- Date (D). 8 ASCII bytes in YYYYMMDD form. The converter re-formats to ISO YYYY-MM-DD for CSV output. All-space fields become empty.
- DateTime (T). Visual FoxPro timestamp. 8 bytes: a Julian date as a 32-bit integer followed by milliseconds-since-midnight as a 32-bit integer. The converter formats these as ISO-8601 YYYY-MM-DDTHH:MM:SS.
- Integer (I). Visual FoxPro 32-bit little-endian signed integer. Decoded directly to a decimal string.
- Double (B) and Currency (Y). Double is a Visual FoxPro 64-bit little-endian IEEE-754 value. Currency is a 64-bit little-endian signed integer with four implied decimal places, so the stored value is divided by 10000 before emission. Both are decoded to a decimal string with appropriate precision.
- Memo (M). In dBase dialects a Memo field holds a 10-byte ASCII block number pointing into a companion .dbt or .fpt memo file (Visual FoxPro stores the block number as a 4-byte integer instead). The converter reads the memo file when you drop it alongside the DBF on the drop zone; otherwise the Memo field is emitted as empty in the CSV and flagged in the status panel.
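The binary Visual FoxPro types are the least obvious entries in the list above, so here is a hedged Python sketch of how they can be unpacked. It reads Currency as a 64-bit signed integer with four implied decimal places, which is how Visual FoxPro stores that type, and it treats an all-zero or all-space DateTime as blank, which is an assumption of this sketch. The function names are illustrative.

```python
import struct
from datetime import datetime, timedelta
from decimal import Decimal

# Julian day number minus this offset gives Python's proleptic Gregorian
# ordinal (JDN 2451545 is 2000-01-01, whose ordinal is 730120).
JULIAN_TO_ORDINAL = 1721425

def decode_integer(raw: bytes) -> int:
    """Integer (I): 32-bit little-endian signed."""
    return struct.unpack("<i", raw)[0]

def decode_double(raw: bytes) -> float:
    """Double (B): 64-bit little-endian IEEE-754."""
    return struct.unpack("<d", raw)[0]

def decode_currency(raw: bytes) -> Decimal:
    """Currency (Y): 64-bit signed integer with four implied decimals."""
    units = struct.unpack("<q", raw)[0]
    return Decimal(units).scaleb(-4)     # exact divide by 10000

def decode_datetime(raw: bytes) -> str:
    """DateTime (T): Julian day + milliseconds since midnight, as ISO 8601."""
    if raw.strip(b" \x00") == b"":
        return ""                        # assumed representation of a blank value
    julian_day, millis = struct.unpack("<II", raw)
    moment = datetime.fromordinal(julian_day - JULIAN_TO_ORDINAL)
    moment += timedelta(milliseconds=millis)
    return moment.isoformat(timespec="seconds")

print(decode_integer(struct.pack("<i", -42)))
print(decode_currency(struct.pack("<q", 123456)))
print(decode_datetime(struct.pack("<II", 2451545, 45_296_000)))
```

Using Decimal for Currency avoids the rounding noise that a float division by 10000 could introduce into the CSV.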
The schema summary panel renders every field name, decoded type, length, decimal count, and a sample value so you can confirm the interpretation before downloading the CSV.
Deleted Records and the Deletion Flag
DBF uses a soft-delete convention that predates garbage collection in most database engines: a record is marked deleted by writing 0x2A (*) to its leading byte. The record remains in the file; consumers are expected to skip it. A PACK operation in dBASE or FoxPro physically removes the soft-deleted records and rebuilds the file.
The converter defaults to skipping soft-deleted records and reports the skip count in the preview panel. This matches the behaviour of most DBF-aware consumers: a CSV that includes deleted records would produce surprising downstream output, since the DBF’s original consumer would never see them. If your use case requires them (for audit, forensic, or migration-integrity reasons), tick the “Include soft-deleted records” checkbox before exporting. The resulting CSV gains an additional _DELETED column whose value is true for previously-deleted records and false otherwise.
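The skip-or-include behaviour comes down to one byte per record. A minimal Python sketch, assuming the 0x2A flag described above; records_with_flag is a hypothetical helper that returns each record's payload together with the value its _DELETED column would carry, plus the skip count.

```python
def records_with_flag(buf, header_len, record_len, n_records, include_deleted=False):
    """Return ([(payload, deleted_token), ...], skipped_count)."""
    kept, skipped = [], 0
    for index in range(n_records):
        start = header_len + index * record_len
        record = buf[start:start + record_len]
        if len(record) < record_len:
            break                                  # file ends early
        deleted = record[0] == 0x2A                # "*" marks a soft delete
        if deleted and not include_deleted:
            skipped += 1
            continue
        kept.append((record[1:], "true" if deleted else "false"))
    return kept, skipped

# Three 6-byte records with no header, for demonstration only.
table = b" alpha" + b"*bravo" + b" delta"
active_only, skipped = records_with_flag(table, 0, 6, 3)
everything, none_skipped = records_with_flag(table, 0, 6, 3, include_deleted=True)
print(active_only, skipped)
print(everything, none_skipped)
```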
Character Encoding and Codepage Handling
The DBF format predates Unicode. Byte 29 of the file header is a language-driver ID that specifies the codepage in which Character fields are stored — a numeric hint like 0x01 (CP437 OEM US), 0x03 (Windows-1252 Latin), 0x57 (CP1252 ANSI), 0x64 (CP852 Eastern European), or 0xC8 (CP1250 Eastern European Windows). The converter maintains the canonical mapping and decodes Character fields accordingly.
Many DBF writers leave the language-driver byte as zero. When the converter sees a zero byte, it falls back to Windows-1252 and surfaces the assumption in the status panel. If the displayed sample values contain mojibake (telltale sequences like Ã© for é or Â£ for £), the source codepage is not Windows-1252. The converter exposes a codepage override in the advanced settings: pick the correct codepage (common alternatives are CP850, CP437, CP1251 for Cyrillic, CP1250 for Central European, CP936 for Simplified Chinese, and CP932 for Shift-JIS) and the preview re-renders with the new decoding.
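To make the fallback and override behaviour concrete, here is a small Python sketch. The table holds only the five language-driver IDs named above, mapped to Python codec names; it is not the converter's full mapping, and treating unknown non-zero IDs the same way as zero is an assumption of this sketch.

```python
# Partial mapping: only the IDs mentioned in the text.
LANGUAGE_DRIVER_CODECS = {
    0x01: "cp437",
    0x03: "cp1252",
    0x57: "cp1252",
    0x64: "cp852",
    0xC8: "cp1250",
}

def codec_for_language_driver(ldid: int, override: str = None):
    """Return (codec_name, assumed); `assumed` is True when falling back."""
    if override:
        return override, False
    if ldid in LANGUAGE_DRIVER_CODECS:
        return LANGUAGE_DRIVER_CODECS[ldid], False
    return "cp1252", True            # zero or unrecognised ID: flag the guess

raw = b"caf\x82"                     # "café" as written by a CP437 system
right_codec, _ = codec_for_language_driver(0x01)
guessed_codec, assumed = codec_for_language_driver(0x00)
print(raw.decode(right_codec))       # decoded with the declared codepage
print(raw.decode(guessed_codec), assumed)   # wrong guess yields a stray character
```

The same four bytes decode to different text under the two codepages, which is why a flagged fallback deserves a look at the preview before exporting.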
Shapefile attribute tables commonly ship with a sibling .cpg file containing a single text line naming the codepage. When you drop the .cpg alongside the .dbf on the drop zone, the converter reads it and applies the named codepage automatically, overriding whatever the language-driver byte declares.
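Reading a .cpg file amounts to turning its one-line label into a codec name. The Python sketch below assumes labels such as "UTF-8", "1251", or "ANSI 1252"; real .cpg files may use other spellings that need their own mapping, and codec_from_cpg is a hypothetical helper rather than the converter's logic.

```python
import codecs

def codec_from_cpg(text: str):
    """Map the contents of a .cpg file to a Python codec name, or None."""
    tokens = text.replace("\ufeff", "").split()    # drop a BOM and whitespace
    if not tokens:
        return None
    label = tokens[-1]                             # "ANSI 1252" becomes "1252"
    candidate = "cp" + label if label.isdigit() else label
    try:
        return codecs.lookup(candidate).name       # normalised codec name
    except LookupError:
        return None                                # unknown label: caller decides

print(codec_from_cpg("UTF-8\n"))
print(codec_from_cpg("1251"))
print(codec_from_cpg("ANSI 1252"))
```

Returning None for an unrecognised label lets the caller fall back to the language-driver byte instead of failing outright.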
CSV output is always UTF-8. The converter transcodes every Character field from its source codepage to UTF-8 during the write, so downstream tools — Excel, pandas, PostgreSQL COPY, command-line utilities — see clean Unicode text without a second decoding step.
Common Use Cases
Extracting data from a legacy FoxPro or Clipper application. Vintage FoxPro and Clipper systems in insurance, accounting, and records management remain deployed. Pulling their data into a modern analysis tool means reading the DBF. Drop the DBF, download the CSV, load into pandas or Excel or a database.
Reading ESRI shapefile attribute tables. An ESRI shapefile is a set of files (.shp, .shx, .dbf, and optional .prj/.cpg) where the .dbf carries the non-geometric attributes of each feature. A GIS analyst pulling attributes into a spreadsheet or into pandas can convert the .dbf directly — the converter reads the accompanying .cpg automatically so non-ASCII place names decode correctly.
Migrating data out of a dBASE archive. Municipal records offices, county assessors, and academic archives hold dBASE files as long-term deposits. Pulling these into a modern query tool requires a CSV (or a JSON, which this site also supports via the Dbf To Json converter).
Spot-checking DBF output from a writer. A developer writing DBF files from a modern pipeline — for example using the Csv To Dbf converter on this site — can confirm the round trip by reading the output back with this tool. If the records decode cleanly and the field types match, the writer is behaving correctly.
Loading DBF data into a SQL database. A CSV is the easiest intermediate format for bulk-loading into SQL Server (BULK INSERT), PostgreSQL (COPY), MySQL (LOAD DATA INFILE), or SQLite. Convert the DBF to CSV, then pair the CSV with one of the INSERT generators on this site (such as CSV to PostgreSQL INSERT) to script the load.
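For SQLite, the load can be scripted with Python's standard library alone. This is a sketch under simple assumptions: every column is created as TEXT, the CSV has a header row and no blank lines, and the table name parcels and the helper names are made up for the example.

```python
import csv
import io
import sqlite3

def quote_ident(name: str) -> str:
    """Quote an SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'

def load_csv(conn, table: str, csv_text: str) -> int:
    """Create `table` from the CSV header and insert every data row."""
    reader = csv.reader(io.StringIO(csv_text, newline=""))
    header = next(reader)
    columns = ", ".join(f"{quote_ident(col)} TEXT" for col in header)
    conn.execute(f"CREATE TABLE {quote_ident(table)} ({columns})")
    marks = ", ".join("?" for _ in header)
    conn.executemany(f"INSERT INTO {quote_ident(table)} VALUES ({marks})", reader)
    conn.commit()
    return conn.execute(f"SELECT COUNT(*) FROM {quote_ident(table)}").fetchone()[0]

conn = sqlite3.connect(":memory:")
csv_text = 'NAME,PRICE\r\nAda,12.50\r\n"Smith, J.",3.00\r\n'
loaded = load_csv(conn, "parcels", csv_text)
print(loaded, conn.execute('SELECT * FROM "parcels"').fetchall())
```

Loading everything as TEXT keeps zero-padded identifiers intact; numeric columns can be cast afterwards once the schema has been checked.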
Best Practices
Inspect the schema summary before exporting. The converter renders a schema table showing field name, DBF type, length, decimal count, and a sample value per column. Skim this before downloading — the common surprises are a Character column that looks numeric but is actually a zero-padded identifier, or a Date column whose values are blank (common when a DBF writer used Character instead of Date for dates).
Check the record count against the file header. The header declares a record count; the converter verifies that reading the declared number of records reaches the expected file offset. If the declared count and the actual trailing byte offset disagree, the DBF may be truncated or padded; the converter surfaces the discrepancy.
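The same check is easy to reproduce outside the converter. A minimal Python sketch, assuming the three header values have already been read; check_declared_size and its status labels are illustrative, and a one-byte surplus is treated as the optional 0x1A end-of-file marker.

```python
def check_declared_size(file_size: int, header_len: int, record_len: int, n_records: int):
    """Compare the real file size with the size implied by the header."""
    expected = header_len + n_records * record_len
    surplus = file_size - expected
    if surplus < 0:
        return "truncated", expected     # fewer bytes than the header promises
    if surplus <= 1:
        return "ok", expected            # exact, or exact plus the EOF marker
    return "padded", expected            # extra trailing bytes

# 97-byte header, two 19-byte records: 97 + 2 * 19 = 135 bytes of content.
print(check_declared_size(136, 97, 19, 2))
print(check_declared_size(120, 97, 19, 2))
```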
Watch for codepage mojibake. If the preview shows Ã© where you expected é, the source codepage is not Windows-1252. Override it in the advanced settings before downloading: a wrong codepage silently corrupts every non-ASCII byte and the CSV consumer will carry the corruption forward.
Include the .cpg file when converting shapefile DBFs. Dropping the .cpg alongside the .dbf lets the converter apply the correct codepage without guesswork. This matters for shapefiles with non-ASCII place names, which are common in Europe and East Asia.
Decide whether to include soft-deleted records. The default is to skip them. For audit, forensic, or migration-integrity work, enable the include-deleted option and retain the added _DELETED column in the downstream consumer so the provenance is not lost.
Validate the CSV before downstream use. The CSV produced by the converter is RFC-4180-compliant, but the downstream consumer may have stricter requirements (line ending, delimiter, header expectations). Pass the CSV through the CSV Validator before shipping it to a rigid consumer.
Retain the original DBF. The CSV is derived; the DBF is the source of truth until the full round trip is validated. Retain the DBF until the downstream consumer has accepted the CSV successfully.
Limitations and Edge Cases
Memo fields require the companion file. Memo (M) fields store a 10-byte offset pointing into a companion .dbt (dBase) or .fpt (FoxPro) file. If the DBF has Memo fields and the memo file is not dropped alongside it, those fields are emitted as empty in the CSV and flagged in the status panel. Drop the .dbt or .fpt with the .dbf to read memo content.
Encrypted or password-protected DBFs are not supported. Some Visual FoxPro deployments use DBC-level encryption or password protection. The converter does not decrypt — an encrypted DBF will surface as garbled Character fields and incorrect Numeric values. Decrypt upstream with the originating application before converting.
A zero language-driver byte forces a codepage guess. When the language-driver byte is zero, the converter does not detect the encoding; it defaults to Windows-1252. This is correct for the majority of Western-locale DBFs but wrong for files produced in Cyrillic, CJK, or other non-Latin locales. Always spot-check the preview for mojibake before downloading.
Very large files are constrained by browser memory. The converter reads the entire .dbf into browser memory, parses the schema, and assembles the CSV in memory. Files of several hundred megabytes may exceed the available memory of low-RAM devices or mobile browsers. Desktop browsers on modern hardware handle files up to approximately 200 MB comfortably. For bulk conversion work of gigabyte-scale DBF archives, a Python script using dbfread or a streaming converter is more appropriate.
Non-standard dialects may produce unexpected results. A handful of niche DBF writers produce files with non-standard field types or malformed headers. The converter recognises all widely deployed dialects (dBase III+, IV, 5, FoxPro 2, Visual FoxPro, shapefile dBase) but may fall back to Character decoding for unrecognised field type codes. The parse-status panel will surface any unrecognised code.
Field name case is preserved by default. DBF field names are conventionally uppercase but stored as raw ASCII. The converter emits them as-is in the CSV header. If your downstream consumer requires lowercase column names, enable the lowercase option in the converter’s advanced settings.
