How to Use the SSV Validator: Step-by-Step Tutorial
The SSV Validator runs entirely in your browser: no file is uploaded to any server, no account is required, and no data leaves your machine. This tutorial walks through every step of using the tool: loading an SSV file, reading the validation results, understanding each type of issue reported, and fixing the most common problems found in semicolon-separated value files. Practical worked examples are included for each issue type.
SSV (semicolon-separated values) files are especially common in exports from European ERP and accounting systems, government data portals, and any tool where the locale uses a comma as the decimal separator. This tutorial is aimed at developers who receive, process, or produce SSV files in data pipelines, database imports, or ETL workflows.
Follow along with the tool open: Open the SSV Validator in a second tab, then work through each step below.
Step 1: Open the Tool
Navigate to /developer-tools/ssv-validator/. The tool loads entirely in your browser with no server-side processing. You can verify this by opening your browser's Network tab in DevTools: after the initial page load, clicking Validate SSV generates zero outbound network requests. Your file data is parsed in memory by JavaScript running inside the browser tab and is never transmitted anywhere.
The tool is reachable from the Developer Tools hub page, via the site command palette (press Ctrl+K or ⌘K and type "SSV Validator"), or directly at the URL above. No login, no sign-up, and no extension is required.
Step 2: Load Your SSV File
There are two ways to load your file into the validator:
Option A: Drag and drop. Drag your .ssv file (or any .csv or .txt file containing semicolon-delimited data) from your file manager and drop it anywhere on the page. A full-page drop overlay activates as soon as a file enters the browser window; release to load. The file is read directly from disk using the browser's FileReader API. Nothing leaves your machine.
Option B: Browse. Click the browse link inside the drop zone to open your operating system's file picker. Select your file and it loads immediately. This is the better option when drag-and-drop is awkward in your workflow, or when the file has an extension like .txt or .csv rather than .ssv.
Once loaded, the file name appears below the drop zone with a clear (×) button beside it. The drop zone hides and the Validate SSV button becomes active. Loading and validating are separate steps: the file is not yet analysed at this point.
If you load a file with an extension the tool does not recognise as SSV-compatible, a red type-error notice appears identifying the dropped extension. The tool accepts .ssv, .csv, and .txt extensions, as well as plain text MIME types, because SSV data frequently arrives in files with these extensions rather than a dedicated .ssv extension.
Step 3: Run Validation
With a file loaded, click the Validate SSV button. Validation runs synchronously in the browser. For most files under a few megabytes, results appear in under a second. Larger files take proportionally longer, but all processing remains local, so there is no server round-trip adding latency.
When validation completes, the status bar directly below the button updates with one of three states:
- Green: "Valid SSV - no issues found." The file passed all checks with no errors or warnings.
- Yellow: "Valid SSV - N warning(s). See details below." The file is structurally valid but has conditions worth reviewing.
- Red: "Validation failed - N error(s) found." The file has structural problems that will likely cause parse or load failures.
To validate a different file, click Clear to fully reset the tool, then load and validate the new file. Clear removes the current file reference, hides all result panels, and returns the drop zone to its ready state.
Step 4: Read the Results
Results are displayed in up to five panels below the status bar, depending on what the validator found:
Error panel (red border). Appears when the file contains one or more structural errors. Each error is listed as a separate line with the row number and a plain-English description of the problem. If this panel is visible, the file has issues that will cause a parser, database loader, or data processing tool to fail or produce misaligned output. Fix all errors before loading the file into any downstream system.
Warning panel (yellow border). Appears when the file has conditions that may or may not be problems depending on your parser. Common warnings include a UTF-8 BOM, commas appearing more frequently than semicolons (possible CSV mislabelled as SSV), duplicate column headers, empty header names, and empty rows. Each warning is explained in plain language.
Stats panel (green border). Appears when the file passes all error-level checks. Displays key metrics: total data rows, column count, delimiter (confirms semicolon was used), empty cells and their percentage, empty row count, file size, line ending style (LF or CRLF), and BOM status. Use these to confirm the file was parsed as expected before loading it downstream.
Column headers panel. Lists every column header by position and name. Use this to confirm that column names are correct, that no names are unexpectedly blank or duplicated, and that no BOM character or invisible whitespace has been prepended to the first header name.
Data preview panel. Renders the first five data rows in a formatted table with row numbers and headers as column labels. Scroll horizontally to inspect wide files. Use this visual check to confirm that column alignment is correct, that is, that the values in each column match the header above them. For SSV files from ERP systems, also check that numeric fields with European decimal formatting (comma as decimal separator) are appearing in the correct columns.
Step 5: Understand Each Issue Type
The SSV Validator reports the following issue types. Here is what each means and how to resolve it:
Column count mismatch
Example error: "Row 7 has 5 field(s) but expected 4 (header column count)."
A data row contains a different number of semicolon-delimited fields than the header row. In SSV files, the most common cause is an embedded semicolon inside a field value, particularly in address fields, product descriptions, or notes columns exported from European ERP systems where semicolons appear in free text. Unlike CSV, SSV has no universally agreed quoting standard: some producers quote fields with embedded semicolons using double quotes, while others do not quote at all, leaving the embedded semicolon indistinguishable from a column delimiter. Fix: open the file at the specified row, locate the embedded semicolon, and either quote the field correctly or replace the semicolon with a safe substitute (a comma, a dash, or an escaped sequence).
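The same kind of check can be approximated outside the browser. The following is a minimal sketch, not the validator's own implementation: it assumes double-quote quoting, uses Python's standard csv module, and the function name find_count_mismatches is hypothetical.

```python
import csv
import io


def find_count_mismatches(text, delimiter=";"):
    """Return (row_number, field_count) for rows whose field count differs from the header."""
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=delimiter)
    header = next(reader, None)
    if header is None:
        return []
    expected = len(header)
    mismatches = []
    # Row 1 is the header, so data rows are numbered from 2.
    for row_number, row in enumerate(reader, start=2):
        # The csv module returns [] for a completely blank line; skip those here.
        if row and len(row) != expected:
            mismatches.append((row_number, len(row)))
    return mismatches


sample = (
    "id;name;address;city\n"
    "1001;Acme GmbH;Hauptstr. 12; Hinterhof;Berlin\n"
    "1002;Beta AG;Musterweg 5;Hamburg\n"
)
print(find_count_mismatches(sample))
```

Row numbers here count parsed records, so they match physical line numbers only when no quoted field spans several lines.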
No semicolons detected
Example warning: "No semicolons found - the file appears to use commas as the delimiter."
The file contains no semicolon characters, or another delimiter scores significantly higher. This means the file may not be genuine SSV: it could be a comma-delimited CSV or a tab-delimited TSV that has been labelled or treated as SSV. This is a common mislabelling: some export tools produce CSV files but name them with an .ssv extension. Fix: confirm the file's actual delimiter by inspecting a few rows in a plain-text editor. If it is comma-delimited, validate it with the CSV Validator instead. If it is tab-delimited, use the TSV Validator.
Comma delimiter dominance warning
Example warning: "Commas appear more frequently than semicolons (47 vs 12). This file might be CSV rather than SSV."
Semicolons are present but commas appear significantly more often. The file most likely uses comma as its delimiter, and the semicolons are appearing inside field values (for example, in address fields: "123 Main St; Suite 4"). Validation continues with semicolon as the delimiter, but results will likely show widespread column count mismatches. Fix: determine the actual delimiter and use the appropriate validator. If the file uses commas as the delimiter with semicolons in field values, it is CSV and should be validated as such.
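A quick way to reproduce this comparison yourself is to count candidate delimiter characters. This is a rough sketch under stated assumptions: it counts raw characters and ignores quoting, and the helper names count_delimiters and looks_like_csv are hypothetical rather than part of the tool.

```python
from collections import Counter


def count_delimiters(text, candidates=(";", ",", "\t", "|")):
    """Count raw occurrences of each candidate delimiter character in the text."""
    counts = Counter(ch for ch in text if ch in candidates)
    return {c: counts.get(c, 0) for c in candidates}


def looks_like_csv(text):
    """True when commas outnumber semicolons, a hint that the file may be CSV."""
    counts = count_delimiters(text)
    return counts[","] > counts[";"]


sample = 'id,name,address\n1,Acme,"123 Main St; Suite 4"\n'
print(count_delimiters(sample))
print(looks_like_csv(sample))
```

Because quoting is ignored, a genuine SSV file with many comma decimals can also trip this heuristic, which is why it is only a warning and not an error.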
Unclosed double-quote
Example error: "Unclosed double-quote on line(s): 23, 41. This will cause parse failures in most tools."
One or more rows contain an opening double-quote character that is never closed before the end of the line. This causes the parser to continue consuming characters across subsequent rows as part of the quoted field, until it finds a matching closing quote or reaches end-of-file. Everything between the unclosed quote and its eventual match, including what should be separate rows, is consumed as a single field value. Fix: open the file at each reported line number, find the unmatched quote, and either close the field with a matching quote or remove the stray opening quote. If the double-quote is a legitimate character in the field value that should not start a quoted field, it needs to be escaped as "" inside a properly quoted field.
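One simple heuristic for locating candidate lines is to look for lines with an odd number of double-quote characters, since an escaped quote ("") always contributes two. The sketch below illustrates that heuristic only; it is an assumption about how such a check could work, and the function name lines_with_odd_quotes is hypothetical.

```python
def lines_with_odd_quotes(text):
    """Return 1-based line numbers whose count of double-quote characters is odd."""
    flagged = []
    for line_number, line in enumerate(text.split("\n"), start=1):
        if line.count('"') % 2 == 1:
            flagged.append(line_number)
    return flagged


sample = 'id;note\n1;"unterminated\n2;"ok"\n'
print(lines_with_odd_quotes(sample))
```

A quoted field that legitimately spans several lines will also be flagged on its first and last line, so treat the result as a list of places to inspect rather than a list of confirmed errors.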
UTF-8 BOM detected
Example warning: "UTF-8 BOM (byte order mark) detected at the start of the file."
The file was saved with a UTF-8 BOM (bytes EF BB BF). This is common in SSV files from Windows-based ERP systems and from Excel when saving as text. Most parsers handle it transparently, but some prepend the BOM character to the first header field name, causing column name lookups to fail. Fix: strip the BOM before distribution. In Python, open the file with encoding='utf-8-sig'. With GNU sed (the default on most Linux distributions): sed -i '1s/^\xef\xbb\xbf//' file.ssv. The sed shipped with macOS uses a different -i syntax and may not interpret \x escapes, so use GNU sed or the Python approach there.
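When you control the pipeline, the BOM can also be removed at the byte level before any parsing happens. A minimal sketch using only the standard library; the function name strip_utf8_bom is hypothetical.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (EF BB BF), if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


raw = codecs.BOM_UTF8 + "buchungsdatum;betrag\n".encode("utf-8")
clean = strip_utf8_bom(raw)
print(clean)
```

Decoding the original bytes with the utf-8-sig codec gives the same text, so either approach works; stripping bytes is useful when the file is passed on to another tool unchanged otherwise.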
Null bytes detected
Example error: "File contains null bytes. This typically indicates binary content or a non-text encoding (e.g. UTF-16)."
The file contains null byte characters (0x00). This is the characteristic signature of UTF-16 encoding, which is used by some Windows ERP exports for full Unicode support. A UTF-16 file appears binary to a UTF-8 text parser and cannot be read without conversion. Fix: convert the file to UTF-8 before validating or loading: iconv -f utf-16 -t utf-8 file.ssv > file_utf8.ssv. Re-validate the converted file to confirm no remaining encoding issues.
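Before converting, it can help to confirm why the file looks binary. The sketch below is a rough diagnostic, not a full encoding detector: it only checks for a UTF-16 BOM and for null bytes, and the function name describe_encoding_problem is hypothetical.

```python
import codecs


def describe_encoding_problem(data: bytes) -> str:
    """Give a rough hint about why a file may look binary to a UTF-8 parser."""
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "UTF-16 BOM found: convert to UTF-8 first"
    if b"\x00" in data:
        return "null bytes found: possibly UTF-16 without a BOM, or binary content"
    return "no null bytes found"


print(describe_encoding_problem("id;name\n".encode("utf-16")))
print(describe_encoding_problem("id;name\n".encode("utf-16-le")))
print(describe_encoding_problem("id;name\n".encode("utf-8")))
```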
Duplicate column headers
Example warning: "Duplicate column header(s) detected: "amount". This can break many data pipelines."
Two or more columns share the same name. This is sometimes seen in ERP exports where multiple currency fields (amount in local currency, amount in EUR) share the same column label, with the currency distinguished only by the column position rather than the name. Data processing libraries that index columns by name may raise an error, silently rename the duplicates, or keep only one of them. Fix: rename duplicate headers to be unique, for example amount_local and amount_eur.
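Duplicate names can be found and, as a stopgap, made unique with a few lines of code. This sketch assumes unquoted header names and splits on the delimiter directly; duplicate_headers and make_unique are hypothetical helper names, and the numeric suffixes are only a fallback when no meaningful names such as amount_local are available.

```python
from collections import Counter


def duplicate_headers(header_line, delimiter=";"):
    """Return header names that occur more than once, in first-seen order."""
    names = [name.strip() for name in header_line.split(delimiter)]
    counts = Counter(names)
    seen = []
    for name in names:
        if name and counts[name] > 1 and name not in seen:
            seen.append(name)
    return seen


def make_unique(names):
    """Append _2, _3, ... to repeated names so every header is distinct."""
    totals = Counter()
    result = []
    for name in names:
        totals[name] += 1
        result.append(name if totals[name] == 1 else f"{name}_{totals[name]}")
    return result


print(duplicate_headers("id;amount;currency;amount"))
print(make_unique(["id", "amount", "currency", "amount"]))
```

The suffixing does not check whether a generated name such as amount_2 already exists in the header, so review the result before relying on it.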
Empty header names
Example warning: "1 empty column header name(s) in the first row."
One or more fields in the header row contain no text, typically caused by a trailing semicolon on the header line, creating a phantom empty final column. Fix: remove the trailing semicolon from the header row, or add a name for the column if it contains real data.
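To see which positions are affected, the header line can be split and scanned for blank names. A minimal sketch, assuming unquoted header fields; the function name empty_header_positions is hypothetical.

```python
def empty_header_positions(header_line, delimiter=";"):
    """Return 1-based positions of header fields that are empty after trimming."""
    names = header_line.rstrip("\r\n").split(delimiter)
    return [i for i, name in enumerate(names, start=1) if name.strip() == ""]


print(empty_header_positions("id;name;amount;\n"))
```

A result that contains only the last position usually points to a trailing semicolon rather than a genuinely unnamed column.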
Empty rows
Example warning: "3 empty row(s) found at line(s): 15, 16, 200."
One or more lines in the file contain only a newline with no field content. In SSV files from ERP systems, empty rows sometimes appear as section separators in what are actually multi-section exports, where different logical tables are concatenated in a single file. Fix: identify whether the empty rows are separators between logical sections (in which case the file needs to be split before validating), or trailing artifacts (which can simply be deleted). Remove or handle them at the specified line numbers.
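Listing the empty lines yourself makes it easier to decide whether they are separators or trailing artifacts. A small sketch under the assumption that an empty row is a line with no characters other than a carriage return; the function name empty_line_numbers is hypothetical.

```python
def empty_line_numbers(text):
    """Return 1-based line numbers of lines that contain no field content."""
    lines = text.split("\n")
    if lines and lines[-1] == "":
        lines.pop()  # the final newline terminates the last row; it is not an empty row
    return [n for n, line in enumerate(lines, start=1) if line.strip("\r") == ""]


sample = "a;b\n1;2\n\n3;4\n\n"
print(empty_line_numbers(sample))
```

Empty lines clustered at the end of the file are usually safe to delete, while one in the middle followed by a row with a different field count suggests a section boundary.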
Step 6: Fix and Re-Validate
After reviewing the results panels, fix identified problems in your source file and run validation again. The most effective workflow:
- Before closing the validator, note every reported row number and issue type from the error and warning panels.
- Open the SSV file in a plain-text editor that shows line numbers and invisible characters. VS Code (View → Render Whitespace), Notepad++, vim, and Sublime Text all support this. Avoid editing SSV files in Excel: Excel will re-interpret numeric fields formatted with European decimal separators (comma), strip leading zeros from numeric IDs, and potentially change the delimiter when re-saving.
- Fix errors starting from the lowest row number. An unclosed quote on an earlier row can swallow the rows that follow it and distort the errors reported for later rows, so working from top to bottom and re-validating as you go keeps the remaining reports trustworthy.
- Save the fixed file. Click Clear in the validator to reset all panels, drag the fixed file in, and click Validate SSV again.
- Repeat until the status bar shows the green valid message. Warnings that are acceptable for your specific parser (such as a BOM that your tool handles transparently) can be noted in your documentation as known and intentional rather than fixed.
Worked Examples
Example 1: Embedded semicolon in an address field
An ERP export includes a free-text address field. A user entered a semicolon as part of the address notation, producing a row with one extra field:
id;name;address;city
1001;Acme GmbH;Hauptstr. 12; Hinterhof;Berlin
1002;Beta AG;Musterweg 5;Hamburg
Row 2 has 5 fields (the semicolon in "Hauptstr. 12; Hinterhof" is interpreted as a delimiter) but the header has 4. The validator reports: "Row 2 has 5 field(s) but expected 4." Fix: quote the address field in the export configuration, or replace embedded semicolons with a safe substitute before writing the file:
1001;Acme GmbH;"Hauptstr. 12; Hinterhof";Berlin
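If you control the code that writes the file, letting a CSV library handle quoting avoids this class of error. A minimal sketch using Python's standard csv module with a semicolon delimiter; the row data is taken from the example above.

```python
import csv
import io

rows = [
    ["id", "name", "address", "city"],
    ["1001", "Acme GmbH", "Hauptstr. 12; Hinterhof", "Berlin"],
]

buffer = io.StringIO(newline="")
# QUOTE_MINIMAL quotes only fields that contain the delimiter, a quote, or a line break.
writer = csv.writer(buffer, delimiter=";", quoting=csv.QUOTE_MINIMAL, lineterminator="\n")
writer.writerows(rows)
output = buffer.getvalue()
print(output)
```

Whether the consuming system honours double-quote quoting is a separate question, so confirm that with the downstream parser before relying on it.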
Example 2: Wrong delimiter (CSV labelled as SSV)
A European data portal publishes a file with a .ssv extension but the export actually uses commas:
id,name,amount
1001,Acme GmbH,1234.56
1002,Beta AG,789.00
The validator reports: "No semicolons found - the file appears to use commas as the delimiter." The file has 1 column per row when parsed as SSV. Fix: this is a CSV file. Use the CSV Validator to validate it, and configure your parser to use comma as the delimiter rather than semicolon.
Example 3: Unclosed quote from a field containing a double-quote character
A product description field in an ERP export contains a double-quote character that was not escaped:
id;name;description
2001;Widget A;The "Pro" model
2002;Widget B;Standard edition
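The validator reports: "Unclosed double-quote on line(s): 2." The description field contains literal double-quote characters but is not itself quoted, so those quotes are unescaped. Parsers differ in how they treat a quote inside an unquoted field: strict ones reject the row, while lenient ones may treat a quote as opening a quoted field and consume text up to the next quote they accept as a terminator, possibly on a later row. Fix: escape the double-quote by doubling it inside a properly quoted field: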
2001;Widget A;"The ""Pro"" model"
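To confirm that the corrected line round-trips as intended, it can be parsed with a standard reader. A short sketch using Python's csv module, which treats a doubled quote inside a quoted field as one literal quote by default.

```python
import csv
import io

line = '2001;Widget A;"The ""Pro"" model"\n'
fields = next(csv.reader(io.StringIO(line, newline=""), delimiter=";"))
print(fields)
```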
Example 4: UTF-16 file from a Windows ERP system
A German ERP system exports its ledger data as UTF-16LE. The validator reports: "File contains null bytes. This typically indicates binary content or a non-text encoding (e.g. UTF-16)." Fix: convert to UTF-8 before validating or loading into your pipeline:
iconv -f utf-16le -t utf-8 ledger.ssv > ledger_utf8.ssv
Re-validate the converted file. Also verify after conversion that accented characters (ä, ö, ü, ß) and the euro sign (€) are correctly represented in the UTF-8 output.
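Where iconv is not available, the same conversion can be done in Python. This is a minimal sketch that works on bytes already read into memory; the function name utf16_to_utf8 is hypothetical.

```python
def utf16_to_utf8(data: bytes, source_encoding: str = "utf-16") -> bytes:
    """Decode UTF-16 bytes and re-encode them as UTF-8."""
    return data.decode(source_encoding).encode("utf-8")


original = "1001;Müller GmbH;1234,50 €\n"
converted = utf16_to_utf8(original.encode("utf-16"))
print(converted.decode("utf-8"))
```

The default utf-16 codec reads the BOM to pick the byte order and removes it. If you pass utf-16-le explicitly, a BOM at the start of the file is kept as a U+FEFF character, which then becomes a UTF-8 BOM in the output.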
Example 5: Multi-section ERP export with empty row separators
Some ERP systems export multiple logical sections into a single file, separating them with blank lines:
invoice_id;date;amount
INV-001;2026-01-15;1234,50

line_id;invoice_id;product;qty;unit_price
L001;INV-001;Widget A;5;246,90
The validator reports empty rows and column count mismatches (the second section's header row is treated as a data row with a different field count). Fix: split the file into separate sections before validating. Each section is its own logical table with its own schema and should be validated independently. If you must handle this format in a pipeline, add a pre-processing step that detects and splits on blank-line boundaries before passing each section to the validator or loader.
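The pre-processing step described above can be sketched as follows. It assumes that sections are separated by one or more blank lines and that no quoted field contains a blank line; the function name split_sections is hypothetical.

```python
def split_sections(text):
    """Split text into lists of lines, one list per blank-line-separated section."""
    sections, current = [], []
    for line in text.splitlines():
        if line.strip() == "":
            if current:
                sections.append(current)
                current = []
        else:
            current.append(line)
    if current:
        sections.append(current)
    return sections


export = (
    "invoice_id;date;amount\n"
    "INV-001;2026-01-15;1234,50\n"
    "\n"
    "line_id;invoice_id;product;qty;unit_price\n"
    "L001;INV-001;Widget A;5;246,90\n"
)
sections = split_sections(export)
for section in sections:
    print(section[0])  # the header line of each section
```

Each resulting section can then be written to its own file and validated separately.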
Example 6: BOM causing KeyError in pandas
A file exported from a German accounting tool includes a UTF-8 BOM. Loading it with pandas:
import pandas as pd

df = pd.read_csv('export.ssv', sep=';')
df['buchungsdatum']  # KeyError: 'buchungsdatum' when the BOM is kept in the header name
The validator shows "UTF-8 BOM detected" and the column headers panel shows the first header as \ufeffbuchungsdatum (with the BOM prefix visible). Fix: use encoding='utf-8-sig' in pandas to strip the BOM automatically:
df = pd.read_csv('export.ssv', sep=';', encoding='utf-8-sig')
Alternatively, strip the BOM from the file itself before distributing it.
Tips and Edge Cases
- Files with .csv extensions from European systems. Excel and other tools in European locales sometimes export semicolon-delimited data with a .csv extension rather than .ssv. The SSV Validator accepts .csv files: drop the file in and validate it as SSV. The stats panel will confirm whether semicolons were found and used as the delimiter.
- Numeric fields with comma decimal separators. European SSV files often contain numeric values like 1.234,56 (period as thousands separator, comma as decimal separator). These are not parsed as numbers by the validator; they appear as string field values, which is correct. What matters is that the field count per row is consistent. The specific formatting of numeric values is a concern for your downstream loader, not the structural validator.
- Files with no header row. Some ERP exports omit the header row, and the column meaning is defined externally by a schema document. The validator uses the first row as its column count reference regardless. All structural checks still apply, but header-specific checks (duplicate names, empty names) will fire against the values in the first data row.
- Large files. The tool processes files entirely in-browser. Files under 10 MB validate in well under a second. Files of 50 MB or more may take several seconds. The 50 MB limit shown in the drop zone is a soft guideline: the tool will process larger files if browser memory allows, but performance degrades proportionally with file size.
- Re-validating after partial fixes. Click Clear (not browser reload) to reset the tool between validation runs. Clear removes the file reference and resets all panels without reloading the page. Reloading the page also works but takes longer.
- Privacy for sensitive ERP data. SSV files from ERP and accounting systems frequently contain financial records, employee data, customer PII, or vendor information. All validation runs locally in your browser tab, and zero bytes of your file are transmitted anywhere. You can safely validate production financial exports, HR records, and customer data without any exposure risk.
- Checking the data preview for column alignment. After a successful validation, scan the data preview panel column by column. Confirm that values in each column match the header above them. A visually misaligned preview (where numeric amounts appear in a name column, for example) indicates a column count problem that the error panel did not catch (this can happen when column counts are consistent across all rows but the header is misconfigured). The preview is the fastest way to catch schema drift.
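For the comma-decimal values mentioned in the tips above, the downstream loader has to normalise the strings before treating them as numbers. The sketch below is one way to do that, assuming the period is always a thousands separator and the comma always the decimal separator; the function name parse_european_decimal is hypothetical.

```python
from decimal import Decimal


def parse_european_decimal(value: str) -> Decimal:
    """Convert a string such as '1.234,56' to Decimal('1234.56')."""
    # Assumption: '.' only ever groups thousands and ',' is the decimal separator.
    normalised = value.strip().replace(".", "").replace(",", ".")
    return Decimal(normalised)


print(parse_european_decimal("1.234,56"))
print(parse_european_decimal("246,90"))
```

A value that already uses a period as the decimal separator would be misread by this function, so apply it only to columns known to use the European format. In pandas, passing decimal=',' and thousands='.' to read_csv covers the same case at load time.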
For a deeper explanation of what each validation check does and why it matters for SSV files, see the complete guide to SSV validation. To validate a different tabular format, visit the Developer Tools hub for CSV, TSV, PSV, and other delimited file validators.
