Extended annotated CSV

Extended annotated CSV provides additional annotations and options that specify how CSV data should be converted to line protocol and written to InfluxDB. InfluxDB uses the csv2lp library to convert CSV into line protocol. Extended annotated CSV supports all Annotated CSV annotations.

The Flux csv.from function only supports annotated CSV, not extended annotated CSV.

To write data to InfluxDB, line protocol must include the following:

measurement
field set
timestamp (Optional but recommended)
tag set (Optional)

Extended CSV annotations identify the element of line protocol a column represents.

CSV Annotations

Extended annotated CSV extends and adds the following annotations:

datatype
constant
timezone
concat

datatype

Use the #datatype annotation to specify the line protocol element a column represents. To explicitly define a column as a field of a specific data type, use the field type in the annotation (for example: string, double, long, etc.).

Data type	Resulting line protocol
measurement	Column is the measurement
tag	Column is a tag
dateTime	Column is the timestamp
field	Column is a field
ignored	Column is ignored
string	Column is a string field
double	Column is a float field
long	Column is an integer field
unsignedLong	Column is an unsigned integer field
boolean	Column is a boolean field

measurement

Indicates the column is the measurement.

tag

Indicates the column is a tag. The column label is the tag key. The column value is the tag value.

dateTime

Indicates the column is the timestamp. time is an alias for dateTime. If the timestamp format includes a time zone, the parsed timestamp includes the time zone offset. By default, all timestamps are UTC. You can also use the #timezone annotation to adjust timestamps to a specific time zone.

There can only be one dateTime column.

The influx write command converts timestamps to Unix timestamps. Append the timestamp format to the dateTime datatype with (:).

#datatype dateTime:RFC3339
#datatype dateTime:RFC3339Nano
#datatype dateTime:number
#datatype dateTime:2006-01-02

Supported timestamp formats

Timestamp format	Description	Example
RFC3339	RFC3339 timestamp	`2020-01-01T00:00:00Z`
RFC3339Nano	RFC3339 timestamp	`2020-01-01T00:00:00.000000000Z`
number	Unix timestamp	`1577836800000000000`

If using the number timestamp format and timestamps are not in nanoseconds, use the influx write —precision flag to specify the timestamp precision.

Custom timestamp formats

To specify a custom timestamp format, use timestamp formats as described in the Go time package. For example: 2020-01-02.

field

Indicates the column is a field. The column label is the field key. The column value is the field value.

With the field datatype, field values are copies as-is to line protocol. For information about line protocol values and how they are written to InfluxDB, see Line protocol data types and formats. We generally recommend specifying the field type in annotations.

ignored

The column is ignored and not written to InfluxDB.

Field types

The column is a field of a specified type. The column label is the field key. The column value is the field value.

string
double
long
unsignedLong
boolean

string

Column is a string field.

double

Column is a float field. By default, InfluxDB expects float values that use a period (.) to separate the fraction from the whole number. If column values include or use other separators, such as commas (,) to visually separate large numbers into groups, specify the following float separators:

fraction separator: Separates the fraction from the whole number.
ignored separator: Visually separates the whole number into groups but ignores the separator when parsing the float value.

Use the following syntax to specify float separators:

# Syntax
<fraction-separator><ignored-separator>
# Example
.,
# With the float separators above
# 1,200,000.15 => 1200000.15

Append float separators to the double datatype annotation with a colon (:). For example:

#datatype "double:.,"

If your float separators include a comma (,), wrap the column annotation in double quotes ("") to prevent the comma from being parsed as a column separator or delimiter. You can also define a custom column separator.

long

Column is an integer field. If column values contain separators such as periods (.) or commas (,), specify the following integer separators:

fraction separator: Separates the fraction from the whole number. Integer values are truncated at the fraction separator when converted to line protocol.
ignored separator: Visually separates the whole number into groups but ignores the separator when parsing the integer value.

Use the following syntax to specify integer separators:

# Syntax
<fraction-separator><ignored-separator>
# Example
.,
# With the integer separators above
# 1,200,000.00 => 1200000i

Append integer separators to the long datatype annotation with a colon (:). For example:

#datatype "long:.,"

If your integer separators include a comma (,), wrap the column annotation in double quotes ("") to prevent the comma from being parsed as a column separator or delimiter. You can also define a custom column separator.

unsignedLong

Column is an unsigned integer (uinteger) field. If column values contain separators such as periods (.) or commas (,), specify the following uinteger separators:

fraction separator: Separates the fraction from the whole number. Uinteger values are truncated at the fraction separator when converted to line protocol.
ignored separator: Visually separates the whole number into groups but ignores the separator when parsing the uinteger value.

Use the following syntax to specify uinteger separators:

# Syntax
<fraction-separator><ignored-separator>
# Example
.,
# With the uinteger separators above
# 1,200,000.00 => 1200000u

Append uinteger separators to the long datatype annotation with a colon (:). For example:

#datatype "unsignedLong:.,"

If your uinteger separators include a comma (,), wrap the column annotation in double quotes ("") to prevent the comma from being parsed as a column separator or delimiter. You can also define a custom column separator.

boolean

Column is a boolean field. If column values are not supported boolean values, specify the boolean format with the following syntax:

# Syntax
<true-values>:<false-values>
# Example
y,Y,1:n,N,0
# With the boolean format above
# y => true, Y => true, 1 => true
# n => false, N => false, 0 => false

Append the boolean format to the boolean datatype annotation with a colon (:). For example:

#datatype "boolean:y,Y:n,N"

If your boolean format contains commas (,), wrap the column annotation in double quotes ("") to prevent the comma from being parsed as a column separator or delimiter. You can also define a custom column separator.

constant

Use the #constant annotation to define a constant column label and value for each row. The #constant annotation provides a way to supply line protocol elements that don’t exist in the CSV data.

Use the following syntax to define constants:

#constant <datatype>,<column-label>,<column-value>

To provide multiple constants, include each #constant annotations on a separate line.

#constant measurement,m
#constant tag,dataSource,csv

For constants with measurement and dateTime datatypes, the second value in the constant definition is the column-value.

timezone

Use the #timezone annotation to update timestamps to a specific timezone. By default, timestamps are parsed as UTC. Use the ±HHmm format to specify the timezone offset relative to UTC.

strict mode

Use the :strict keyword to indicate a loss of precision when parsing long or unsignedLong data types. Turn on strict mode by using a column data type that ends with strict, such as long:strict. When parsing long or unsignedLong value from a string value with fraction digits, the whole CSV row fails when in a strict mode. A warning is printed when not in a strict mode, saying line x: column y: '1.2' truncated to '1' to fit into long data type. For more information on strict parsing, see the package documentation.

Timezone examples

Timezone	Offset
US Mountain Daylight Time	`-0600`
Central European Summer Time	`+0200`
Australia Eastern Standard Time	`+1000`
Apia Daylight Time	`+1400`

Timezone annotation example

#timezone -0600

concat

The #concat annotation adds a new column that is concatenated from existing columns according to bash-like string interpolation literal with variables referencing existing column labels.

For example:

#concat,string,fullName,${firstName} ${lastName}

This is especially useful when constructing a timestamp from multiple columns. For example, the following annotation will combine the given CSV columns into a timestamp:

#concat,dateTime:2006-01-02,${Year}-${Month}-${Day}
Year,Month,Day,Hour,Minute,Second,Tag,Value
2020,05,22,00,00,00,test,0
2020,05,22,00,05,00,test,1
2020,05,22,00,10,00,test,2

Define custom column separator

If columns are delimited using a character other than a comma, use the sep keyword to define a custom separator in the first line of your CSV file.

sep=;

Annotation shorthand

Extended annotated CSV supports annotation shorthand. Include the column label, datatype, and (optional) default value in each column header row using the following syntax:

<column-label>|<column-datatype>|<column-default-value>

Example annotation shorthand

m|measurement,location|tag|Hong Kong,temp|double,pm|long|0,time|dateTime:RFC3339
weather,San Francisco,51.9,38,2020-01-01T00:00:00Z
weather,New York,18.2,,2020-01-01T00:00:00Z
weather,,53.6,171,2020-01-01T00:00:00Z

The shorthand explained

The m column represents the measurement and has no default value.
The location column is a tag with the default value, Hong Kong.
The temp column is a field with float (double) values and no default value.
The pm column is a field with integer (long) values and a default of 0.
The time column represents the timestamp, uses the RFC3339 timestamp format, and has no default value.

Resulting line protocol

weather,location=San\ Francisco temp=51.9,pm=38i 1577836800000000000
weather,location=New\ York temp=18.2,pm=0i 1577836800000000000
weather,location=Hong\ Kong temp=53.6,pm=171i 1577836800000000000

csv syntax write

Extended annotated CSV

Extended annotated CSV

CSV Annotations

datatype

measurement

tag

dateTime

Supported timestamp formats

Custom timestamp formats

field

ignored

Field types

string

double

long

unsignedLong

boolean

constant

timezone

strict mode

Timezone examples

Timezone annotation example

concat

Define custom column separator

Annotation shorthand

Example annotation shorthand

The shorthand explained

Resulting line protocol

Related