Extended annotated CSV
Extended annotated CSV provides additional annotations and options that specify how CSV data should be converted to line protocol and written to InfluxDB. InfluxDB uses the csv2lp library to convert CSV into line protocol. Extended annotated CSV supports all Annotated CSV annotations.
The Flux csv.from function only supports annotated CSV, not extended annotated CSV.
To write data to InfluxDB, line protocol must include the following:
- measurement
- field set
- timestamp (Optional but recommended)
- tag set (Optional)
Extended CSV annotations identify the element of line protocol a column represents.
CSV Annotations
Extended annotated CSV extends and adds the following annotations:
datatype
Use the #datatype
annotation to specify the line protocol element a column represents. To explicitly define a column as a field of a specific data type, use the field type in the annotation (for example: string
, double
, long
, etc.).
Data type | Resulting line protocol |
---|---|
measurement | Column is the measurement |
tag | Column is a tag |
dateTime | Column is the timestamp |
field | Column is a field |
ignored | Column is ignored |
string | Column is a string field |
double | Column is a float field |
long | Column is an integer field |
unsignedLong | Column is an unsigned integer field |
boolean | Column is a boolean field |
measurement
Indicates the column is the measurement.
tag
Indicates the column is a tag. The column label is the tag key. The column value is the tag value.
dateTime
Indicates the column is the timestamp. time
is an alias for dateTime
. If the timestamp format includes a time zone, the parsed timestamp includes the time zone offset. By default, all timestamps are UTC. You can also use the #timezone annotation to adjust timestamps to a specific time zone.
There can only be one dateTime
column.
The influx write
command converts timestamps to Unix timestamps. Append the timestamp format to the dateTime
datatype with (:
).
#datatype dateTime:RFC3339
#datatype dateTime:RFC3339Nano
#datatype dateTime:number
#datatype dateTime:2006-01-02
Supported timestamp formats
Timestamp format | Description | Example |
---|---|---|
RFC3339 | RFC3339 timestamp | 2020-01-01T00:00:00Z |
RFC3339Nano | RFC3339 timestamp | 2020-01-01T00:00:00.000000000Z |
number | Unix timestamp | 1577836800000000000 |
If using the number
timestamp format and timestamps are not in nanoseconds, use the influx write —precision flag to specify the timestamp precision.
Custom timestamp formats
To specify a custom timestamp format, use timestamp formats as described in the Go time package. For example: 2020-01-02
.
field
Indicates the column is a field. The column label is the field key. The column value is the field value.
With the field
datatype, field values are copies as-is to line protocol. For information about line protocol values and how they are written to InfluxDB, see Line protocol data types and formats. We generally recommend specifying the field type in annotations.
ignored
The column is ignored and not written to InfluxDB.
Field types
The column is a field of a specified type. The column label is the field key. The column value is the field value.
string
Column is a string field.
double
Column is a float field. By default, InfluxDB expects float values that use a period (.
) to separate the fraction from the whole number. If column values include or use other separators, such as commas (,
) to visually separate large numbers into groups, specify the following float separators:
- fraction separator: Separates the fraction from the whole number.
- ignored separator: Visually separates the whole number into groups but ignores the separator when parsing the float value.
Use the following syntax to specify float separators:
# Syntax
<fraction-separator><ignored-separator>
# Example
.,
# With the float separators above
# 1,200,000.15 => 1200000.15
Append float separators to the double
datatype annotation with a colon (:
). For example:
#datatype "double:.,"
If your float separators include a comma (,
), wrap the column annotation in double quotes (""
) to prevent the comma from being parsed as a column separator or delimiter. You can also define a custom column separator.
long
Column is an integer field. If column values contain separators such as periods (.
) or commas (,
), specify the following integer separators:
- fraction separator: Separates the fraction from the whole number. Integer values are truncated at the fraction separator when converted to line protocol.
- ignored separator: Visually separates the whole number into groups but ignores the separator when parsing the integer value.
Use the following syntax to specify integer separators:
# Syntax
<fraction-separator><ignored-separator>
# Example
.,
# With the integer separators above
# 1,200,000.00 => 1200000i
Append integer separators to the long
datatype annotation with a colon (:
). For example:
#datatype "long:.,"
If your integer separators include a comma (,
), wrap the column annotation in double quotes (""
) to prevent the comma from being parsed as a column separator or delimiter. You can also define a custom column separator.
unsignedLong
Column is an unsigned integer (uinteger) field. If column values contain separators such as periods (.
) or commas (,
), specify the following uinteger separators:
- fraction separator: Separates the fraction from the whole number. Uinteger values are truncated at the fraction separator when converted to line protocol.
- ignored separator: Visually separates the whole number into groups but ignores the separator when parsing the uinteger value.
Use the following syntax to specify uinteger separators:
# Syntax
<fraction-separator><ignored-separator>
# Example
.,
# With the uinteger separators above
# 1,200,000.00 => 1200000u
Append uinteger separators to the long
datatype annotation with a colon (:
). For example:
#datatype "unsignedLong:.,"
If your uinteger separators include a comma (,
), wrap the column annotation in double quotes (""
) to prevent the comma from being parsed as a column separator or delimiter. You can also define a custom column separator.
boolean
Column is a boolean field. If column values are not supported boolean values, specify the boolean format with the following syntax:
# Syntax
<true-values>:<false-values>
# Example
y,Y,1:n,N,0
# With the boolean format above
# y => true, Y => true, 1 => true
# n => false, N => false, 0 => false
Append the boolean format to the boolean
datatype annotation with a colon (:
). For example:
#datatype "boolean:y,Y:n,N"
If your boolean format contains commas (,
), wrap the column annotation in double quotes (""
) to prevent the comma from being parsed as a column separator or delimiter. You can also define a custom column separator.
constant
Use the #constant
annotation to define a constant column label and value for each row. The #constant
annotation provides a way to supply line protocol elements that don’t exist in the CSV data.
Use the following syntax to define constants:
#constant <datatype>,<column-label>,<column-value>
To provide multiple constants, include each #constant
annotations on a separate line.
#constant measurement,m
#constant tag,dataSource,csv
For constants with measurement
and dateTime
datatypes, the second value in the constant definition is the column-value.
timezone
Use the #timezone
annotation to update timestamps to a specific timezone. By default, timestamps are parsed as UTC. Use the ±HHmm
format to specify the timezone offset relative to UTC.
strict mode
Use the :strict
keyword to indicate a loss of precision when parsing long
or unsignedLong
data types. Turn on strict mode by using a column data type that ends with strict
, such as long:strict
. When parsing long
or unsignedLong
value from a string value with fraction digits, the whole CSV row fails when in a strict mode. A warning is printed when not in a strict mode, saying line x: column y: '1.2' truncated to '1' to fit into long data type
. For more information on strict parsing, see the package documentation.
Timezone examples
Timezone | Offset |
---|---|
US Mountain Daylight Time | -0600 |
Central European Summer Time | +0200 |
Australia Eastern Standard Time | +1000 |
Apia Daylight Time | +1400 |
Timezone annotation example
#timezone -0600
concat
The #concat
annotation adds a new column that is concatenated from existing columns according to bash-like string interpolation literal with variables referencing existing column labels.
For example:
#concat,string,fullName,${firstName} ${lastName}
This is especially useful when constructing a timestamp from multiple columns. For example, the following annotation will combine the given CSV columns into a timestamp:
#concat,dateTime:2006-01-02,${Year}-${Month}-${Day}
Year,Month,Day,Hour,Minute,Second,Tag,Value
2020,05,22,00,00,00,test,0
2020,05,22,00,05,00,test,1
2020,05,22,00,10,00,test,2
Define custom column separator
If columns are delimited using a character other than a comma, use the sep
keyword to define a custom separator in the first line of your CSV file.
sep=;
Annotation shorthand
Extended annotated CSV supports annotation shorthand. Include the column label, datatype, and (optional) default value in each column header row using the following syntax:
<column-label>|<column-datatype>|<column-default-value>
Example annotation shorthand
m|measurement,location|tag|Hong Kong,temp|double,pm|long|0,time|dateTime:RFC3339
weather,San Francisco,51.9,38,2020-01-01T00:00:00Z
weather,New York,18.2,,2020-01-01T00:00:00Z
weather,,53.6,171,2020-01-01T00:00:00Z
The shorthand explained
- The
m
column represents the measurement and has no default value. - The
location
column is a tag with the default value,Hong Kong
. - The
temp
column is a field with float (double
) values and no default value. - The
pm
column is a field with integer (long
) values and a default of0
. - The
time
column represents the timestamp, uses the RFC3339 timestamp format, and has no default value.
Resulting line protocol
weather,location=San\ Francisco temp=51.9,pm=38i 1577836800000000000
weather,location=New\ York temp=18.2,pm=0i 1577836800000000000
weather,location=Hong\ Kong temp=53.6,pm=171i 1577836800000000000