• Stars
    star
    156
  • Rank 238,560 (Top 5 %)
  • Language
    C#
  • License
    GNU General Publi...
  • Created almost 5 years ago
  • Updated 2 months ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

CSV Lint plug-in for Notepad++ for syntax highlighting, csv validation, automatic column and datatype detecting, fixed width datasets, change datetime format, decimal separator, sort data, count unique values, convert to xml, json, sql etc. A plugin for data cleaning and working with messy data files.

CSV Lint - Notepad++ plugin

Release version GitHub all releases GitHub latest release Donate

CSV Lint plug-in for Notepad++ adds syntax highlighting to comma-separated values (csv) and fixed width data files. It automatically detects column datatypes, can detect data formatting errors and adjust datetime and decimal formats.

preview screenshot

Use CSV Lint for metadata discovery, technical data validation and reformatting on tabular data files. It is not meant to be a replacement for spreadsheet programs like Excel or SPSS, but rather it's a quality control tool to examine, verify or polish up a dataset before further processing.

With CSV Lint you can take a dataset and:

  • Scan for metadata, i.e. detect columns and datatypes
  • Create schema.ini based on metadata
  • Validate data against schema.ini
  • Convert datetime/decimal values to different formats
  • Convert between comma, semicolon, tab separated, fixed width formats
  • Split valid/invalid values into two separate columns
  • Count unique values of one or more columns
  • Convert csv data to SQL insert script

CSV Lint is stable and usable for most general use-cases, but it is a work-in-progress, so if you encounter any bugs or unexpected output I encourage you to report issues here. CSVLint is based on a prototype project Dataset MultiTool

If you find the CSV Lint plug-in useful you can buy me a coffee! ☕
paypal

How to install

The CSV Lint plugin is available in Notepad++ in the Plugins Admin menu.

  • Install Notepad++
  • In Notepad++ go to menu item Plugins > Plugins Admin...
  • On tab Available search for csv lint
  • Check the checkbox and press Install button
  • Click Yes to quit Notepad++ and "continue the operations"
  • Click Yes on the Windows notification "Allow app to make changes"

Note: the plug-in is available in the Plugins Admin menu starting with Notepad++ v8.1.9.1. If you have a Notepad++ version older than v8.1.9.1 or want to install the plug-in manually:

  • Go to the releases page
  • Find the latest release
  • 32bit version; unzip CSVLint.dll (32bit) to folder .\Program Files (x86)\Notepad++\plugins\CSVLint\CSVLint.dll.
  • 64bit version; unzip CSVLint.dll (64bit) to folder .\Program Files\Notepad++\plugins\CSVLint\CSVLint.dll.
  • restart Notepad++

How to use it

  1. Open your dataset in Notepad++
  2. Open the "CSV Lint window" from the plug-in menu or toolbar
  3. Press "Detect columns" to automatically detect format
  4. Optionally, manually enter or adjust metadata
  5. Press "Validate data" to detect any data errors

If there are no errors in the data, you can click "Reformat data" for data reformatting options, or select "Convert to SQL" menu item to generate an SQL insert script.

Also see this quick tour video, which shows how the plug-in works.

Watch video, CSV Lint plug-in features oveview Watch video, CSV File: View, Add Column Colors and Sort Data using Notepad++

Schema.ini

The metadata uses the standard schema.ini format, see documentation here

When you open a csv file the plug-in try to determine the column meta data. It will first look for a schema.ini file in the same folder as the data file, and check to see if it contains a section with the filename. If the file or section doesn't exist, it will scan the data and try to infer the columns and datatypes. You can manually change the meta data and press the blue disk icon to save it to a schema.ini file in the same folder as the data file for later use.

See schema.ini example below:

[mydata.csv]
Format=TabDelimited
DateTimeFormat=dd-mm-yyyy
DecimalSymbol=.
NumberDigits=2
Col1=OrderId Integer Width 8
Col2=Price Float Width 7
Col3=PartName Text Width 50
Col4=OrderDate DateTime Width 10

Format can be TabDelimited for tabs, CSVDelimited for commas, for any other delimiter use for example Format=Delimited(;). Use FixedLength for fixed width text files and set the Width for each column.

DateTimeFormat is not case sensitive and uses dd/mm/yyyy or yyyy-mm-dd hh:nn:ss etc.

DecimalSymbol can be either . or , and CSV Lint will assume the thousand separators symbol is the opposite of the DecimalSymbol. Define the maximum decimals digits for example NumberDigits=2 for values like "1.23" or "-45.67" etc.

Validating

The plug-in can be used as a CSV validator, press "Validate data" to check the input data for technical errors, based on the metadata in the textbox on the left. The line numbers of any errors will be logged in the textbox on the right. It will check the input data for the following errors:

  • Values that are too long, example value "abcde" when column is "Width 4"
  • Non-numeric values in numeric columns, example value "n/a" when column datatype is Integer
  • Incorrect decimal separator, example value "12.34" when DecimalSymbol is set to comma
  • Too many decimals, example value "12.345" when NumberDigits=2.
  • Incorrect date format, example value "12/31/2018" when DateTimeFormat=dd/mm/yyyy

Roadmap/goals

The CSV Lint plugin is work-in-progress, here is list of features I want to add (strikethrough is done)

  • Convert datetime values to different formats
  • Convert decimal symbol to point/comma
  • Toggle between comma, semicolon, tab separated formats
  • Improve file reading, to process/edit large files (>1MB)
  • Count unique values based on column(s)
  • Allow format masks per individual column
  • Support quoted strings
  • Support two-digit year date values
  • Support for currency/thousand separator "12.345,00" or "1,250,000.00" etc.
  • Load/save schema.ini
  • Improve automatic datatype detection
  • Add feature GUI click to jump to error line
  • Support enumeration coded + error check
  • Replace-code-with-label (using schema.json?)
  • Support FrictionlessData schema.json format
  • Improve GUI instead of plain text
  • Add feature generate scripts (SQL, Python, R)
  • Syntax highlighting, display columns as colors
  • Search for value in column, search next empty/non-empty in column
  • Search/replace in single column
  • Split column into new column ("123/456" -> "123", "456")

Trouble shooting / Known issues

  • When you press the "Validate Data" button after editing the data file, the text and metadata are not always synchronised immediately. if you get unexpected validation results, try saving the datafile or refreshing the meta data before clicking "Detect columns".

  • When you press "Detect columns" the datetime format of the data isn't always detected correctly. Especially when the data contains values like 05/06/2022 the order of day and month can be incorrect. You can adjust it manually to match your data before pressing the "Validate data" button.

  • When you select Language > CSVLint to enable the syntax highlighing colors, or change the metadata manually, the column colors aren't always updated immediately. Click inside the textfile or switch tabs to a different file and then back and it should display correctly.

Acknowledgements

With thanks to:

The CSV Lint plug-in couldn't have been created without their source examples, suggestions and valuable feedback. For a similar plug-in for Visual Studio Code check out Rainbow CSV.

Disclaimer

This software is free-to-use and it is provided as-is without warranty of any kind, always back-up your data files to prevent data loss.
The test data, examples and screenshots provided in this github repository do not contain real data, it is all randomly generated test data.

BdR©2023 Free to use - send questions or comments: Bas de Reuver - [email protected]

More Repositories

1

lcdgame.js

LCD game library in html5/javascript, to recreate simulations of 80s Game & Watch by Nintendo and other electronic handheld games like Tandy, Sunwing, Mini Arcade etc. Currently supports playable simulators of Donkey Kong II, Mario Bros, Highway, Sea Ranger, Tom's Adventure, Jungle Kong, Eagle n Chicken and Mario's Cement Factory, incl. online highscores lists.
JavaScript
98
star
2

Phaser3-example-game

A Phaser3 game example, a bare minimum game template. Shows the structure of a typical Phaser3 game using scenes, a preloader, a loading bar, buttons etc.
JavaScript
42
star
3

MultiLanguage

JavaScript multi-language localisation using a json file to add UI translation to websites or games
JavaScript
37
star
4

GimpSpriteAtlas

GIMP plug-in to create a spriteatlas (aka spritetexture or spritesheet). Takes all layers as images and compiles them into one sprite atlas texture, using a 2D packing algorithm so that the final texture image takes as little space as possible. Also outputs a TexturePacker JSON, LibGDX, CSS or XML coordinates file.
Python
27
star
5

barcode128.bas

A barcode 128 encoding macro for LibreOffice Calc or Excel, to print EAN CODE 128 barcodes and scan with handscanner
VBA
15
star
6

RandomValuesNPP

Generate random values plug-in for Notepad++. Use this plugin to generate passwords, guids or random datasets in CSV, JSON, XML and SQL formats. Use the fake test data for performance and QA testing to improve software quality in application development, reports, database modeling, webdev etc.
C#
13
star
7

phaserlevelselect

Phaser level select screen, example of a typical game select screen which saves the progress for each level to localstorage
JavaScript
13
star
8

GLES.JS-Android-Studio-project

Package GLES.JS in Android-Studio
JavaScript
10
star
9

Manage-translations

Manage translations in a spreadsheet and use a macro to generate the localization files for your website, app or game. Includes example and macro for both Excel and LibreOffice Calc to generate Javascript JSON, XCode strings (iOS), Eclipse xml (Android) or Visual Studio resx
VBA
7
star
10

selection-to-neon

GIMP Python plug-in, create neon light glow effect based on selection. Turn selected region into glowing lightsaber, laser beam effect, 80s cartoon underlighting effects (aka "Bipack Glow") etc.
Python
4
star
11

phaseranimals

Phaser Animals, general game structure test
JavaScript
3
star
12

phaserparallax

Parallax scrolling stars using Phaser JS
JavaScript
3
star
13

Phaser-memory-test

Phaser JS memory test create/destroy groups and sprites
JavaScript
2
star
14

NppPluginLexerExample

Notepad++ Plug-in Lexer and Folder example using the C# template
C#
2
star
15

phaserresize

Phaser JS dynamic screensize test
JavaScript
2
star
16

datasetmultitool

CSV lint tool to validate csv files. It is a helper utility to process csv textfiles and check for data errors. It can check text width, validate and reformat date and datetime values, change point or comma decimal separator, remove thousand separator and change column order.
JavaScript
2
star
17

phasercollision

Phaser Rotating Sprite Collision Test
JavaScript
1
star
18

phaserbackground

Background effect and different states using Phaser JS
JavaScript
1
star
19

Phaser-Group-and-cacheAsBitmap

Phaser js Group and cacheAsBitmap question
JavaScript
1
star
20

phaserparticles

Phaser, emit burst of particles in a circle
JavaScript
1
star
21

phaserbitmapfont

Phaser BitmapText font text using .json instead of .xml
JavaScript
1
star
22

process_icon_buttons

GIMP Python plug-in, automates the processing and resizing and exporting of button icon images, for game assets or websites.
Python
1
star