csvkit 1.0.6
About
csvkit is a suite of command-line tools for converting to and working with CSV, the king of tabular file formats.
It is inspired by pdftk, GDAL and the original csvcut tool by Joe Germuska and Aaron Bycoffe.
Important links:
Documentation: https://csvkit.rtfd.org/
Repository: https://github.com/wireservice/csvkit
Schemas: https://github.com/wireservice/ffs
First time? See Tutorial.
Note
To change the field separator, line terminator, etc. of the output, you must use csvformat.
Note
csvkit, by default, sniffs CSV formats (it deduces whether commas, tabs or spaces delimit fields, for example), and performs type inference (it converts text to numbers, dates, booleans, etc.). These features are useful and work well in most cases, but occasional errors occur. If you don’t need these features, set --snifflimit 0 (-y 0) and --no-inference (-I).
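To make the two behaviours in the note above concrete, here is a minimal Python sketch. It is not csvkit's code: it uses the standard library's csv.Sniffer to guess a delimiter, and a hypothetical naive_infer helper (invented for this illustration, and much cruder than real type inference) to show how a value such as a ZIP code can change when text is converted to a number. The sample data is made up.

```python
import csv
import io

# Hypothetical sample: tab-delimited, with a ZIP code that starts with a zero.
SAMPLE = "name\tzip\nAda\t02134\nGrace\t10001\n"


def sniff_delimiter(text):
    """Guess the field delimiter, considering only commas, tabs and spaces."""
    return csv.Sniffer().sniff(text, delimiters=",\t ").delimiter


def naive_infer(value):
    """Toy type inference (an assumption for illustration): int, then float, else text."""
    for cast in (int, float):
        try:
            return cast(value)
        except ValueError:
            pass
    return value


delimiter = sniff_delimiter(SAMPLE)
rows = list(csv.reader(io.StringIO(SAMPLE), delimiter=delimiter))

# Skip the header row, then apply the toy inference to every cell.
inferred = [[naive_infer(cell) for cell in row] for row in rows[1:]]

print(repr(delimiter))
print(inferred)
```

In this sketch the delimiter is detected as a tab, and the toy helper turns the text 02134 into the integer 2134, dropping the leading zero. Whether csvkit itself changes a given value depends on its version and on the data; the point is only to show the kind of surprise that --snifflimit 0 and --no-inference are there to rule out.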
Why csvkit?
Because it makes your life easier.
Convert Excel to CSV:
in2csv data.xls > data.csv
Convert JSON to CSV:
in2csv data.json > data.csv
Print column names:
csvcut -n data.csv
Select a subset of columns:
csvcut -c column_a,column_c data.csv > new.csv
Reorder columns:
csvcut -c column_c,column_a data.csv > new.csv
Find rows with matching cells:
csvgrep -c phone_number -r "555-555-\d{4}" data.csv > new.csv
Convert to JSON:
csvjson data.csv > data.json
Generate summary statistics:
csvstat data.csv
Query with SQL:
csvsql --query "select name from data where age > 30" data.csv > new.csv
Import into PostgreSQL:
csvsql --db postgresql:///database --insert data.csv
Extract data from PostgreSQL:
sql2csv --db postgresql:///database --query "select * from data" > new.csv
And much more…
Table of contents
- Tutorial
- Reference
- Tips and Troubleshooting
- Contributing to csvkit
- Release process
- License
- Changelog
  - 1.0.6 - July 13, 2021
  - 1.0.5 - March 2, 2020
  - 1.0.4 - March 16, 2019
  - 1.0.3 - March 11, 2018
  - 1.0.2 - April 28, 2017
  - 1.0.1 - December 29, 2016
  - 1.0.0 - December 27, 2016
  - 0.9.1 - March 31, 2015
  - 0.9.0 - September 8, 2014
  - 0.8.0 - July 27, 2014
  - 0.7.3 - April 27, 2014
  - 0.7.2 - March 24, 2014
  - 0.7.1 - March 24, 2014
  - 0.7.0 - March 24, 2014
  - 0.6.1 - August 20, 2013
  - 0.6.0 - August 20, 2013
  - 0.5.0 - August 21, 2012
  - 0.4.4 - May 1, 2012
  - 0.4.3 - February 20, 2012
Citation
When citing csvkit in publications, you may use this BibTeX entry:
@Manual{,
title = {csvkit},
author = {Christopher Groskopf and contributors},
year = 2016,
url = {https://csvkit.readthedocs.org/}
}
Authors
The following individuals have contributed code to csvkit:
Christopher Groskopf
Joe Germuska
Aaron Bycoffe
Travis Mehlinger
Alejandro Companioni
Benjamin Wilson
Bryan Silverthorn
Evan Wheeler
Matt Bone
Ryan Pitts
Hari Dara
Jeff Larson
Jim Thaxton
Miguel Gonzalez
Anton Ian Sipos
Gregory Temchenko
Kevin Schaul
Marc Abramowitz
Noah Hoffman
Jan Schulz
Derek Wilson
Chris Rosenthal
Davide Setti
Gabi Davar
Sriram Karra
James McKinney
Aaron McMillin
Matt Dudys
Joakim Lundborg
Federico Scrinzi
Shane StClair
raistlin7447
Alex Dergachev
Jeff Paine
Jeroen Janssens
Sébastien Fievet
Travis Swicegood
Ryan Murphy
Diego Rabatone Oliveira
Matt Pettis
Tasneem Raja
Richard Low
Kristina Durivage
Espartaco Palma
pnaimoli
Michael Mior
Jennifer Smith
Antonio Lima
Dave Stanton
Pedrow
Neal McBurnett
Anthony DeBarros
Baptiste Mispelon
James Seppi
Karrie Kehoe
Geert Barentsen
Cathy Deng
Eric Bréchemier
Neil Freeman
Fede Isas
Patricia Lipp
Kev++
edwardros
Martin Burch
Pedro Silva
hydrosIII
Tim Wisniewski
Santiago Castro
Dan Davison
Éric Araujo
Sam Stuck
Edward Betts
Jake Zimmerman
Bryan Rankin
Przemek Wesołek
Karl Fogel
sterlingpetersen
kjedamzik
John Vandenberg
Olivier Lacan
Adrien Delessert
Ghislain Antony Vaillant
Forest Gregg
Aliaksei Urbanski
Reid Beels
Rodrigo Lemos
Victor Noagbodji
Connor McArthur
Matěj Cepl
Nicholas Matteo
Matt Giguere
Felix Bünemann
Andriy Orehov (Андрій Орєхов)
Dan Nguyen
谭九鼎
Tomáš Hrnčiar
Christopher Bottoms
panolens
Gabe Walker