How to Use the PSV Validator: Step-by-Step Tutorial
The PSV Validator runs entirely in your browser: no file is uploaded to any server, no account is required, and no data leaves your machine. This tutorial walks through every step of using the tool: loading a PSV file, reading the results, understanding each type of issue reported, and fixing the most common problems found in pipe-separated value files.
Follow along with the tool open: Open the PSV Validator in a second tab, then work through each step below.
Step 1: Open the Tool
Navigate to /developer-tools/psv-validator/. The tool loads entirely in the browser with no server-side processing. You can confirm this by opening the browser's Network tab in DevTools: after the page loads, validating a file generates zero outbound requests. Your file data is processed in-memory by JavaScript running inside your browser tab.
The tool is accessible from the Developer Tools hub, via the site command palette (press Ctrl+K or ⌘K and type "PSV Validator"), or directly at the URL above.
Step 2: Load Your PSV File
There are two ways to load your file:
Option A: Drag and drop. Drag your .psv or .txt file from your file manager and drop it anywhere on the page. A full-page drop overlay appears when a file is dragged over the browser window; release to load. The file is read directly from disk using the browser's FileReader API and never leaves your machine.
Option B: Browse. Click the browse link inside the drop zone to open your operating system's file picker. Select your file and it loads immediately. This is useful when the file has an uncommon extension or when drag-and-drop is impractical in your workflow.
Once the file is loaded, its name appears below the drop zone with a clear (×) button beside it. The drop zone hides and the action buttons become active. The file is not yet validated: loading and validating are separate steps.
If you drop a file with an extension the tool does not recognise as PSV, a red type-error notice appears below the drop zone identifying the extension you dropped and noting that only PSV and plain-text files are accepted. Clear the file and try again with the correct file.
Step 3: Run Validation
With a file loaded, click the Validate PSV button. Validation runs synchronously in your browser; for most files under a few megabytes, results appear in under a second. Larger files take proportionally longer but all processing remains local.
While validation runs the button is unresponsive. When it completes, the status bar below the button updates immediately with either a green success message, a red error message, or a yellow warning message.
To validate a different file, click Clear to reset the tool, then load and validate the new file. Clicking Clear removes the current file and resets all result panels to their empty state.
Step 4: Read the Results
Results appear in up to three panels below the status bar:
Error panel (red). Appears when one or more structural errors were found. Each error is listed with the row number and a plain-English description. If this panel appears, the file has problems that will likely cause a parser or database loader to fail or produce incorrect output. Address all errors before attempting to load the file into a production system.
Stats panel (green). Appears when the file passes all error checks. Shows key metrics about the file: row count, column count, encoding, and file size. These values are derived from the file as parsed: the row count reflects actual parsed rows, not simply line count, so quoted fields containing embedded newlines are counted correctly.
Warning panel (yellow). Appears alongside the stats panel when warnings are present. Warnings indicate conditions that may or may not cause problems depending on your target system. A UTF-8 BOM, trailing whitespace in a header name, or a trailing pipe on every row are typical warnings. Review each warning and decide whether it requires action for your specific use case.
Preview panel. When validation succeeds, a scrollable table preview of the first rows of your file appears below the results. Use this to visually confirm that column alignment looks correct and that the header row and data rows match up as expected.
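The difference between line count and parsed row count mentioned for the stats panel can be reproduced outside the tool. The snippet below is a minimal sketch, not the validator's own code; it assumes CSV-style quoting and uses Python's csv module with a pipe delimiter on a made-up three-row sample.

```python
import csv
import io

# Hypothetical sample: the second record has a quoted field with an embedded newline.
text = 'id|note\n1|"line one\nline two"\n2|plain\n'

# Counting physical lines overstates the number of records.
line_count = len(text.splitlines())

# Parsing with a pipe delimiter keeps the quoted newline inside one field.
rows = list(csv.reader(io.StringIO(text, newline=""), delimiter="|"))
row_count = len(rows)

print(line_count, row_count)  # 4 3
```

A row count that is lower than the line count is therefore expected whenever quoted fields span several lines.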
Step 5: Understand Each Issue Type
The PSV Validator reports the following issue types. Here is what each means and how to fix it:
Column count mismatch
Example: "Row 11 has 4 fields; expected 5."
A row contains a different number of pipe-delimited fields than the header row (or the modal column count if no header is present). This is the most common error in PSV files. The most frequent cause is an unescaped literal pipe character inside a field value, for example a product option field containing Red|Blue|Green, where the pipe characters are treated as field separators. Other causes include a missing field (accidentally deleted during manual editing) or an extra trailing pipe. To fix: open the file at the specified row, find the misaligned field, and either escape or remove the embedded pipe, or add the missing field.
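A comparable check can be scripted for use in a pipeline. The sketch below is an approximation under stated assumptions, not the validator's implementation: it treats the first row as the header, assumes CSV-style quoting, and the function name find_mismatched_rows is hypothetical.

```python
import csv
import io


def find_mismatched_rows(text, delimiter="|"):
    """Return (row_number, field_count, expected) for rows that differ from the header."""
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=delimiter)
    header = next(reader, None)
    if header is None:
        return []
    expected = len(header)
    problems = []
    # Rows are numbered from 1, with the header as row 1.
    for row_number, row in enumerate(reader, start=2):
        if len(row) != expected:
            problems.append((row_number, len(row), expected))
    return problems


sample = (
    "id|name|color|price|status\n"
    "1001|Widget Pro|Red|Blue|Green|29.99|Active\n"
    '1002|Gadget|"Red|Blue"|9.99|Active\n'
)
print(find_mismatched_rows(sample))  # [(2, 7, 5)]
```

Row 3 passes because its pipes sit inside a quoted field, while row 2 splits into seven fields.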
Trailing pipe detected
Example: "Row 3 ends with a trailing pipe character."
The row ends with a | character, which creates an empty final field. If every row has a trailing pipe and the header row does too, the file may still be structurally consistent, but many parsers will create a phantom empty final column. If only some rows have trailing pipes, the column count is inconsistent and this will cause errors. Fix: remove the trailing pipe from all affected rows. If your data producer consistently appends trailing pipes, configure your parser to ignore them or strip them in a pre-processing step.
Unclosed quote
Example: "Row 6: unclosed quote starting at column 2."
A double-quote character opens a quoted field but no matching closing quote is found. The parser will consume all subsequent content, including row delimiters, as part of the same field until it finds a closing quote or reaches end of file. This silently collapses multiple rows into one, causing a cascade of column count mismatches on every row below the problem row. Fix: locate the field at the specified row and column, ensure the closing double-quote is present, and double any embedded double-quotes ("").
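To see why the problem is reported at the opening quote rather than where the parse finally fails, it can help to track quote state by hand. The scanner below is a simplified sketch, not the validator's code. It assumes a quote only opens a quoted field at the start of a field, that "" is an escaped quote, and that rows end with \n; the name find_unclosed_quote is hypothetical.

```python
def find_unclosed_quote(text, delimiter="|"):
    """Return (row, column) where a never-closed quoted field starts, else None."""
    row, column = 1, 1
    in_quotes = False
    opened_at = None
    at_field_start = True
    i = 0
    while i < len(text):
        ch = text[i]
        if in_quotes:
            if ch == '"':
                if text[i + 1:i + 2] == '"':
                    i += 1  # doubled quote: an escaped literal quote, stay in the field
                else:
                    in_quotes = False
        elif ch == '"' and at_field_start:
            in_quotes = True
            opened_at = (row, column)
        elif ch == delimiter:
            column += 1
            at_field_start = True
            i += 1
            continue
        elif ch == "\n":
            row += 1
            column = 1
            at_field_start = True
            i += 1
            continue
        at_field_start = False
        i += 1
    return opened_at if in_quotes else None


broken = 'id|note|qty\n1|"ok"|2\n2|"broken|3\n3|fine|4\n'
print(find_unclosed_quote(broken))  # (3, 2)
```

Once the quote on row 3 opens, every later pipe and newline is swallowed into the same field, which is the cascade described above.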
Blank header name
Example: "Column 3 has a blank header name."
A field in the header row is empty: two consecutive pipes (||), a leading pipe at the start of the row, or a trailing pipe at the end. An unnamed column cannot be referenced by name in downstream code. Fix: add a name for the blank column, or remove the field if it is a spurious artifact of a trailing delimiter.
Duplicate header name
Example: "Columns 1 and 4 both have the header 'id'."
Two or more header fields share the same name. Downstream code that references columns by name will either raise an error or silently use the first matching column. Fix: rename one of the duplicate headers to make every column name unique.
Leading or trailing whitespace in header
Example: "Column 2 header ' email' has leading whitespace."
A header field name has a space or other whitespace character before or after it. The header name " email" (with a leading space) is distinct from "email" and will cause column-not-found errors in code that references it by the expected name. Fix: trim whitespace from all header field names in the source file or in your pre-processing step.
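The three header checks (blank, duplicate, and whitespace-padded names) are simple enough to run as a pre-flight step in your own scripts. The following is a rough sketch that assumes the header contains no quoted fields; check_header and its message wording are illustrative, not the validator's actual output.

```python
from collections import defaultdict


def check_header(header_line, delimiter="|"):
    """Return a list of human-readable issues found in a header row."""
    names = header_line.rstrip("\r\n").split(delimiter)
    issues = []
    positions = defaultdict(list)
    for index, name in enumerate(names, start=1):
        if name.strip() == "":
            issues.append(f"Column {index} has a blank header name.")
            continue
        if name != name.strip():
            issues.append(f"Column {index} header {name!r} has leading or trailing whitespace.")
        # Names are compared exactly, so ' email' and 'email' count as different.
        positions[name].append(index)
    for name, columns in positions.items():
        if len(columns) > 1:
            issues.append(f"Columns {columns} share the header {name!r}.")
    return issues


for issue in check_header("id| email||id\n"):
    print(issue)
```

For this sample header the function reports whitespace in column 2, a blank name in column 3, and the duplicate 'id' in columns 1 and 4.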
Empty row
Example: "Row 18 is empty."
A row contains only a newline with no field content. Usually a stray blank line from manual editing, spreadsheet export, or file concatenation. Fix: delete the empty row. A single trailing newline at the very end of the file is acceptable and reported as a warning rather than an error.
Encoding issue
Example: "File contains invalid UTF-8 byte sequences."
The file contains bytes that are not valid in the detected encoding. This typically means the file was saved in a single-byte encoding (Latin-1, Windows-1252) but is being read as UTF-8, or vice versa. Fix: determine the correct encoding by opening the file in a text editor that reports encoding (VS Code shows encoding in the status bar), then re-save in UTF-8 or configure your reader to use the correct encoding.
UTF-8 BOM detected
Example: "File starts with a UTF-8 BOM (EF BB BF)."
The file was saved with a byte order mark. This is typically a warning rather than an error. If your parser handles BOMs transparently, no action is needed. If the BOM is causing the first header name to appear with a leading invisible character, strip it. With GNU sed (the default on most Linux distributions): sed -i '1s/^\xef\xbb\xbf//' file.psv. The BSD sed shipped with macOS handles -i differently and does not interpret \x escapes, so the command is unlikely to work there unchanged. In Python, open the file with encoding='utf-8-sig', which strips the BOM automatically.
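The effect of the BOM on the first header name, and of the utf-8-sig codec, can be seen with a small demonstration. The file name demo.psv and its contents are made up for illustration.

```python
import codecs
import os
import tempfile

# Write a small demo file that starts with a UTF-8 BOM (EF BB BF).
path = os.path.join(tempfile.mkdtemp(), "demo.psv")
with open(path, "wb") as f:
    f.write(codecs.BOM_UTF8 + b"id|name\n1|Alice\n")

# Plain utf-8 keeps the BOM as an invisible first character of the header.
with open(path, encoding="utf-8") as f:
    plain_header = f.readline().rstrip("\n").split("|")

# utf-8-sig drops a leading BOM if one is present.
with open(path, encoding="utf-8-sig") as f:
    clean_header = f.readline().rstrip("\n").split("|")

print(repr(plain_header[0]))  # '\ufeffid'
print(repr(clean_header[0]))  # 'id'
```

Reading with utf-8-sig is also safe for files without a BOM, so it can be used as a default when the producer is inconsistent.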
Step 6: Fix and Re-Validate
After identifying issues in the results panels, fix them in the source file and re-validate. The most efficient workflow:
- Before closing the validator, note all reported row numbers and issue types.
- Open the PSV file in a plain-text editor that shows line numbers, such as VS Code, Notepad++, vim, or any comparable editor. Avoid fixing PSV problems in Excel, which may re-introduce quoting or encoding issues on save.
- Fix issues starting from the lowest row number. Unclosed-quote errors in particular can shift the apparent row numbers of all subsequent issues, so fixing from the top down ensures row numbers remain accurate.
- Save the fixed file, click Clear in the validator to reset, then drag the fixed file in and click Validate PSV again.
- Repeat until the status bar shows the green valid message and no error or warning panels remain (or until any remaining warnings are confirmed acceptable for your use case).
Worked Examples
Example 1: Pipe character inside a field value
A product export file contains options as pipe-separated values within a field:
1001|Widget Pro|Red|Blue|Green|29.99|Active
The header row has 5 columns (id|name|color|price|status) but this row produces 7 fields because the color options contain literal pipes. The validator reports a column count mismatch on this row. The fix is to either remove the embedded pipes and use a different separator within the field, or quote the field:
1001|Widget Pro|"Red|Blue|Green"|29.99|Active
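If the file is produced by your own code, a CSV library can apply this quoting for you. The sketch below uses Python's csv module with a pipe delimiter; with the default minimal quoting, only fields that contain the delimiter, a quote, or a line break are wrapped in quotes. Whether your downstream parser honours quoted fields is an assumption you should verify.

```python
import csv
import io

buffer = io.StringIO()
writer = csv.writer(buffer, delimiter="|", lineterminator="\n")
writer.writerow(["id", "name", "color", "price", "status"])
# The color field contains literal pipes, so the writer quotes it.
writer.writerow(["1001", "Widget Pro", "Red|Blue|Green", "29.99", "Active"])
psv_text = buffer.getvalue()
print(psv_text)

# Reading the text back yields five fields per row again.
rows = list(csv.reader(io.StringIO(psv_text), delimiter="|"))
print([len(row) for row in rows])  # [5, 5]
```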
Example 2: Trailing pipe on every row
A mainframe export appends a pipe after every field including the last:
id|name|email|
101|Alice|[email protected]|
102|Bob|[email protected]|
The validator reports that every row ends with a trailing pipe, creating a 4th blank column. The header row has a blank 4th header name. If your downstream parser creates a phantom empty column from this, strip the trailing pipes in a pre-processing step:
sed 's/|$//' file.psv > file_clean.psv
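The sed command strips a trailing pipe from any line that has one, including rows where the final pipe marks a genuinely empty last field. A slightly more cautious variant, sketched here in Python, only strips when every non-empty line ends with a pipe. The function name strip_trailing_pipes and the sample rows are hypothetical.

```python
def strip_trailing_pipes(lines):
    """Drop one trailing pipe per line, but only if every non-empty line has one."""
    stripped = [line.rstrip("\r\n") for line in lines]
    content = [line for line in stripped if line]
    if not content or not all(line.endswith("|") for line in content):
        return stripped  # inconsistent trailing pipes: leave the rows untouched
    return [line[:-1] for line in stripped]


consistent = ["id|name|city|\n", "101|Alice|Leeds|\n", "102|Bob|York|\n"]
print(strip_trailing_pipes(consistent))
# ['id|name|city', '101|Alice|Leeds', '102|Bob|York']

mixed = ["id|name|city\n", "101|Alice|\n"]
print(strip_trailing_pipes(mixed))
# ['id|name|city', '101|Alice|']
```

In the mixed case the trailing pipe on the second row is an empty city value, so removing it would turn a valid row into a column count mismatch.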
Example 3: Whitespace-padded header names
A file exported from a legacy reporting system pads all header names to a fixed width:
id |name |amount |date
The validator flags each header with trailing whitespace. A query referencing df['amount'] in pandas will raise a KeyError because the actual column name is 'amount '. Fix by trimming all header names to their content, either manually or with:
awk -F'|' 'NR==1{for(i=1;i<=NF;i++) gsub(/^ +| +$/,"",$i)} 1' OFS='|' file.psv
Example 4: Mixed encoding in a financial export
A bank export contains currency symbols in Windows-1252 encoding. The pound sign (£) is encoded as the single byte 0xA3, which is valid in Latin-1 and Windows-1252 but is not a valid UTF-8 sequence on its own. The validator reports invalid UTF-8 byte sequences. Fix: re-open the file specifying Latin-1 encoding and re-save as UTF-8:
iconv -f latin1 -t utf-8 file.psv > file_utf8.psv
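The same conversion can be done in a script that tries UTF-8 first and only falls back to a single-byte encoding when decoding fails. This is a sketch under the assumption that the fallback really is Windows-1252; the function name decode_psv_bytes is hypothetical, and Python's cp1252 codec can itself raise an error on the few byte values that Windows-1252 leaves undefined.

```python
def decode_psv_bytes(raw):
    """Decode as UTF-8, falling back to Windows-1252 when the bytes are not valid UTF-8."""
    try:
        return raw.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        return raw.decode("cp1252"), "cp1252"


# 0xA3 is the pound sign in Windows-1252 and Latin-1, but invalid as UTF-8 here.
raw = b"id|amount\n1|\xa3100\n"
text, used = decode_psv_bytes(raw)
utf8_bytes = text.encode("utf-8")

print(used)        # cp1252
print(utf8_bytes)  # b'id|amount\n1|\xc2\xa3100\n'
```

Writing utf8_bytes back to disk gives a file that passes the UTF-8 check, provided the guess about the original encoding was right.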
Tips and Edge Cases
- Files with no header row. If your file starts with data rows rather than a header, the validator uses the first row as its column count reference. All structural checks still apply. The blank and duplicate header name checks will reference the values in the first data row rather than dedicated header values.
- Very large files. The tool processes files in-browser using JavaScript. Files under 10 MB validate in under a second on a modern machine. Files of 50 MB or more may take several seconds. For multi-gigabyte PSV files, a command-line tool is more practical: awk, miller (mlr), or a custom script using your language's file streaming API.
- Files with a .txt extension. Many PSV files are distributed with a .txt extension. The tool accepts both .psv and .txt extensions. The delimiter is identified from the file content, not from the extension.
- Comparing PSV and CSV validation. If you are not sure whether your file is truly pipe-delimited or comma-delimited, try both validators. The correct one will report a consistent column count across all rows; the wrong one will report column count mismatches on every row that contains the opposite delimiter.
- Privacy. Every byte of your file is processed locally. The FileReader API reads the file from disk, JavaScript parses it in memory, and the results are displayed in the DOM; no network request is made at any point after initial page load. You can validate sensitive files (payroll exports, patient records, financial transactions) without exposure risk.
- Re-validating after fixes. The Clear button fully resets the tool state. After clicking Clear, the drop zone reappears and all result panels are hidden. You can then drag in the fixed file and validate again without reloading the page.
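For the very large files mentioned in the tips above, a custom streaming script might look like the following. It is a minimal sketch rather than a replacement for the validator: it checks column counts only, assumes UTF-8 and CSV-style quoting, and the names stream_column_counts and max_reports are hypothetical.

```python
import csv
import os
import tempfile


def stream_column_counts(path, delimiter="|", max_reports=20):
    """Stream a PSV file row by row; return (row_count, list_of_mismatches)."""
    mismatches = []
    row_count = 0
    expected = None
    # newline="" lets the csv module handle newlines embedded in quoted fields.
    with open(path, newline="", encoding="utf-8") as handle:
        for row in csv.reader(handle, delimiter=delimiter):
            row_count += 1
            if expected is None:
                expected = len(row)  # the first row sets the reference count
            elif len(row) != expected and len(mismatches) < max_reports:
                mismatches.append((row_count, len(row)))
    return row_count, mismatches


# Small demo file; a real use case would point at a multi-gigabyte export.
demo_path = os.path.join(tempfile.mkdtemp(), "big.psv")
with open(demo_path, "w", newline="", encoding="utf-8") as f:
    f.write("id|name|amount\n1|Alice|10\n2|Bob\n3|Carol|30\n")

total_rows, bad_rows = stream_column_counts(demo_path)
print(total_rows, bad_rows)  # 4 [(3, 2)]
```

Because the reader pulls one row at a time from the file handle, memory use stays roughly constant regardless of file size; max_reports caps the list so a badly broken file does not produce millions of entries.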
