Importing and Exporting (I/O)
Importing data from tabular data files
To read data from a CSV-like file, use the readtable function:
DataTables.readtable (Function)
Read data from a tabular-file format (CSV, TSV, ...).
readtable(filename, [keyword options])
Arguments
filename::AbstractString: the filename to be read
Keyword Arguments
header::Bool – Use the information from the file's header line to determine column names. Defaults to true.
separator::Char – Assume that fields are split by the separator character. If not specified, it will be guessed from the filename: .csv defaults to ',', .tsv defaults to '\t', .wsv defaults to ' '.
quotemark::Vector{Char} – Assume that fields contained inside of two quotemark characters are quoted, which disables processing of separators and linebreaks. Set to Char[] to disable this feature and slightly improve performance. Defaults to ['"'].
decimal::Char – Assume that the decimal place in numbers is written using the decimal character. Defaults to '.'.
nastrings::Vector{String} – Translate any of the strings in this vector into a NULL value. Defaults to ["", "NULL", "NA"].
truestrings::Vector{String} – Translate any of the strings in this vector into a Boolean true. Defaults to ["T", "t", "TRUE", "true"].
falsestrings::Vector{String} – Translate any of the strings in this vector into a Boolean false. Defaults to ["F", "f", "FALSE", "false"].
makefactors::Bool – Convert string columns into CategoricalVectors for use as factors. Defaults to false.
nrows::Int – Read only nrows rows from the file. Defaults to -1, which indicates that the entire file should be read.
names::Vector{Symbol} – Use the values in this array as the names for all columns instead of the names in the file's header. Defaults to [], which indicates that the header should be used if present or that numeric names should be invented if there is no header.
eltypes::Vector – Specify the types of all columns. Defaults to [].
allowcomments::Bool – Ignore all text inside comments. Defaults to false.
commentmark::Char – Specify the character that starts comments. Defaults to '#'.
ignorepadding::Bool – Ignore all whitespace on left and right sides of a field. Defaults to true.
skipstart::Int – Specify the number of initial rows to skip. Defaults to 0.
skiprows::Vector{Int} – Specify the indices of lines in the input to ignore. Defaults to [].
skipblanks::Bool – Skip any blank lines in input. Defaults to true.
encoding::Symbol – Specify the file's encoding as either :utf8 or :latin1. Defaults to :utf8.
normalizenames::Bool – Ensure that column names are valid Julia identifiers. For instance, this renames a column named "a b" to "a_b", which can then be accessed with :a_b instead of Symbol("a b"). Defaults to true.
Result
::DataTable
Examples
dt = readtable("data.csv")
dt = readtable("data.tsv")
dt = readtable("data.wsv")
dt = readtable("data.txt", separator = '\t')
dt = readtable("data.txt", header = false)
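The truestrings, falsestrings, and nastrings defaults listed above amount to a lookup on each raw field. The following is only a sketch of that idea for a column that is being read as Boolean. The helper name interpret_bool_field, the :null placeholder, and the order of the checks are assumptions made for illustration and are not part of DataTables.

```julia
# Default string sets, copied from the keyword documentation above.
NASTRINGS = ["", "NULL", "NA"]
TRUESTRINGS = ["T", "t", "TRUE", "true"]
FALSESTRINGS = ["F", "f", "FALSE", "false"]

# Hypothetical helper (not a DataTables function): interpret one raw field
# of a Boolean column. Returns :null for missing data, otherwise true or false.
function interpret_bool_field(field)
    if field in NASTRINGS
        return :null
    elseif field in TRUESTRINGS
        return true
    elseif field in FALSESTRINGS
        return false
    else
        # Anything else is not a recognized Boolean spelling.
        throw(ArgumentError("not a Boolean field: $field"))
    end
end

a = interpret_bool_field("NA")
b = interpret_bool_field("t")
c = interpret_bool_field("FALSE")
```

Passing different vectors through the corresponding readtable keywords changes which spellings are recognized; the sketch only mirrors the documented defaults.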
readtable requires that you specify the path of the file that you would like to read as a String. To read data from a non-file source, you may also supply an IO object. readtable supports many additional keyword arguments: these are documented in the section on advanced I/O operations.
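As a minimal sketch of the IO form, the text below is wrapped in an IOBuffer and passed to readtable in place of a path. It assumes the DataTables package is installed and that the keyword arguments documented above behave the same way for IO input. The sample data and column names are made up for illustration.

```julia
using DataTables  # assumes the DataTables package is installed

# In-memory CSV text with no header line (illustrative data).
text = "Alice,36\nBob,24\nCarol,58\n"

# An IOBuffer is an IO object, so no file on disk is needed.
# There is no filename to guess the separator from, so it is passed explicitly;
# header = false and names supply the column names instead of a header line.
dt = readtable(IOBuffer(text), separator = ',', header = false, names = [:name, :age])
```

The same pattern should apply to any other IO object, such as an open file handle.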
Exporting data to a tabular data file
To write data to a CSV file, use the writetable function:
DataTables.writetable (Function)
Write data to a tabular-file format (CSV, TSV, ...).
writetable(filename, dt, [keyword options])
Arguments
filename::AbstractString: the filename to be created
dt::AbstractDataTable: the AbstractDataTable to be written
Keyword Arguments
separator::Char – The separator character that you would like to use. Defaults to the output of getseparator(filename), which uses commas for files that end in .csv, tabs for files that end in .tsv and a single space for files that end in .wsv.
quotemark::Char – The character used to delimit string fields. Defaults to '"'.
header::Bool – Whether the file should contain a header that specifies the column names from dt. Defaults to true.
nastring::AbstractString – What to write in place of missing data. Defaults to "NULL".
Result
::DataTable
Examples
dt = DataTable(A = 1:10)
writetable("output.csv", dt)
writetable("output.dat", dt, separator = ',', header = false)
writetable("output.dat", dt, quotemark = '\'', separator = ',')
writetable("output.dat", dt, header = false)
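The extension rule behind the default separator can be pictured with a few lines of plain Julia. This is a sketch only: guess_separator is a hypothetical stand-in written for illustration, not the package's getseparator, and the comma fallback for unrecognized extensions is an assumption.

```julia
# Hypothetical stand-in for the documented extension rule; not part of DataTables.
function guess_separator(filename)
    if endswith(filename, ".csv")
        return ','      # comma-separated values
    elseif endswith(filename, ".tsv")
        return '\t'     # tab-separated values
    elseif endswith(filename, ".wsv")
        return ' '      # whitespace-separated values
    else
        return ','      # assumed fallback for other extensions
    end
end

sep_csv = guess_separator("output.csv")
sep_tsv = guess_separator("output.tsv")
sep_wsv = guess_separator("output.wsv")
```

For a file such as output.dat, whose extension carries no hint, passing separator explicitly, as in the examples above, avoids relying on any fallback.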
Supplying DataTables inline with non-standard string literals
You can also provide CSV-like tabular data in a non-standard string literal to construct a new DataTable, as in the following:
dt = csv"""
name, age, squidPerWeek
Alice, 36, 3.14
Bob, 24, 0
Carol, 58, 2.71
Eve, 49, 7.77
"""
The csv string literal prefix indicates that the data are supplied in standard comma-separated value format. Common alternative formats are also available as string literals. For semicolon-separated values, with a comma as the decimal mark, use csv2:
dt = csv2"""
name; age; squidPerWeek
Alice; 36; 3,14
Bob; 24; 0
Carol; 58; 2,71
Eve; 49; 7,77
"""
For whitespace-separated values, use wsv:
dt = wsv"""
name age squidPerWeek
Alice 36 3.14
Bob 24 0
Carol 58 2.71
Eve 49 7.77
"""
And for tab-separated values, use tsv:
dt = tsv"""
name	age	squidPerWeek
Alice	36	3.14
Bob	24	0
Carol	58	2.71
Eve	49	7.77
"""