Extended annotated CSV

This page documents an earlier version of InfluxDB. InfluxDB v2.7 is the latest stable version. View this page in the v2.7 documentation.

Extended annotated CSV provides additional annotations and options that specify how CSV data should be converted to line protocol and written to InfluxDB. InfluxDB uses the csv2lp library to convert CSV into line protocol. Extended annotated CSV supports all Annotated CSV annotations.

The Flux csv.from function only supports annotated CSV, not extended annotated CSV.

To write data to InfluxDB, line protocol must include the following:

Extended CSV annotations identify the element of line protocol a column represents.

CSV Annotations

Extended annotated CSV extends and adds the following annotations:

datatype

Use the #datatype annotation to specify the line protocol element a column represents. To explicitly define a column as a field of a specific data type, use the field type in the annotation (for example: string, double, long, etc.).

Data typeResulting line protocol
measurementColumn is the measurement
tagColumn is a tag
dateTimeColumn is the timestamp
fieldColumn is a field
ignoredColumn is ignored
stringColumn is a string field
doubleColumn is a float field
longColumn is an integer field
unsignedLongColumn is an unsigned integer field
booleanColumn is a boolean field

measurement

Indicates the column is the measurement.

tag

Indicates the column is a tag. The column label is the tag key. The column value is the tag value.

dateTime

Indicates the column is the timestamp. time is an alias for dateTime. If the timestamp format includes a time zone, the parsed timestamp includes the time zone offset. By default, all timestamps are UTC. You can also use the #timezone annotation to adjust timestamps to a specific time zone.

There can only be one dateTime column.

The influx write command converts timestamps to Unix timestamps. Append the timestamp format to the dateTime datatype with (:).

  1. #datatype dateTime:RFC3339
  2. #datatype dateTime:RFC3339Nano
  3. #datatype dateTime:number
  4. #datatype dateTime:2006-01-02
Supported timestamp formats
Timestamp formatDescriptionExample
RFC3339RFC3339 timestamp2020-01-01T00:00:00Z
RFC3339NanoRFC3339 timestamp2020-01-01T00:00:00.000000000Z
numberUnix timestamp1577836800000000000

If using the number timestamp format and timestamps are not in nanoseconds, use the influx write —precision flag to specify the timestamp precision.

Custom timestamp formats

To specify a custom timestamp format, use timestamp formats as described in the Go time package. For example: 2020-01-02.

field

Indicates the column is a field. The column label is the field key. The column value is the field value.

With the field datatype, field values are copies as-is to line protocol. For information about line protocol values and how they are written to InfluxDB, see Line protocol data types and formats. We generally recommend specifying the field type in annotations.

ignored

The column is ignored and not written to InfluxDB.

Field types

The column is a field of a specified type. The column label is the field key. The column value is the field value.

string

Column is a string field.

double

Column is a float field. By default, InfluxDB expects float values that use a period (.) to separate the fraction from the whole number. If column values include or use other separators, such as commas (,) to visually separate large numbers into groups, specify the following float separators:

  • fraction separator: Separates the fraction from the whole number.
  • ignored separator: Visually separates the whole number into groups but ignores the separator when parsing the float value.

Use the following syntax to specify float separators:

  1. # Syntax
  2. <fraction-separator><ignored-separator>
  3. # Example
  4. .,
  5. # With the float separators above
  6. # 1,200,000.15 => 1200000.15

Append float separators to the double datatype annotation with a colon (:). For example:

  1. #datatype "double:.,"

If your float separators include a comma (,), wrap the column annotation in double quotes ("") to prevent the comma from being parsed as a column separator or delimiter. You can also define a custom column separator.

long

Column is an integer field. If column values contain separators such as periods (.) or commas (,), specify the following integer separators:

  • fraction separator: Separates the fraction from the whole number. Integer values are truncated at the fraction separator when converted to line protocol.
  • ignored separator: Visually separates the whole number into groups but ignores the separator when parsing the integer value.

Use the following syntax to specify integer separators:

  1. # Syntax
  2. <fraction-separator><ignored-separator>
  3. # Example
  4. .,
  5. # With the integer separators above
  6. # 1,200,000.00 => 1200000i

Append integer separators to the long datatype annotation with a colon (:). For example:

  1. #datatype "long:.,"

If your integer separators include a comma (,), wrap the column annotation in double quotes ("") to prevent the comma from being parsed as a column separator or delimiter. You can also define a custom column separator.

unsignedLong

Column is an unsigned integer (uinteger) field. If column values contain separators such as periods (.) or commas (,), specify the following uinteger separators:

  • fraction separator: Separates the fraction from the whole number. Uinteger values are truncated at the fraction separator when converted to line protocol.
  • ignored separator: Visually separates the whole number into groups but ignores the separator when parsing the uinteger value.

Use the following syntax to specify uinteger separators:

  1. # Syntax
  2. <fraction-separator><ignored-separator>
  3. # Example
  4. .,
  5. # With the uinteger separators above
  6. # 1,200,000.00 => 1200000u

Append uinteger separators to the long datatype annotation with a colon (:). For example:

  1. #datatype "usignedLong:.,"

If your uinteger separators include a comma (,), wrap the column annotation in double quotes ("") to prevent the comma from being parsed as a column separator or delimiter. You can also define a custom column separator.

boolean

Column is a boolean field. If column values are not supported boolean values, specify the boolean format with the following syntax:

  1. # Syntax
  2. <true-values>:<false-values>
  3. # Example
  4. y,Y,1:n,N,0
  5. # With the boolean format above
  6. # y => true, Y => true, 1 => true
  7. # n => false, N => false, 0 => false

Append the boolean format to the boolean datatype annotation with a colon (:). For example:

  1. #datatype "boolean:y,Y:n,N"

If your boolean format contains commas (,), wrap the column annotation in double quotes ("") to prevent the comma from being parsed as a column separator or delimiter. You can also define a custom column separator.

constant

Use the #constant annotation to define a constant column label and value for each row. The #constant annotation provides a way to supply line protocol elements that don’t exist in the CSV data.

Use the following syntax to define constants:

  1. #constant <datatype>,<column-label>,<column-value>

To provide multiple constants, include each #constant annotations on a separate line.

  1. #constant measurement,m
  2. #constant tag,dataSource,csv

For constants with measurement and dateTime datatypes, the second value in the constant definition is the column-value.

timezone

Use the #timezone annotation to update timestamps to a specific timezone. By default, timestamps are parsed as UTC. Use the ±HHmm format to specify the timezone offset relative to UTC.

strict mode

Use the :strict keyword to indicate a loss of precision when parsing long or unsignedLong data types. Turn on strict mode by using a column data type that ends with strict, such as long:strict. When parsing long or unsignedLong value from a string value with fraction digits, the whole CSV row fails when in a strict mode. A warning is printed when not in a strict mode, saying line x: column y: '1.2' truncated to '1' to fit into long data type. For more information on strict parsing, see the package documentation.

Timezone examples
TimezoneOffset
US Mountain Daylight Time-0600
Central European Summer Time+0200
Australia Eastern Standard Time+1000
Apia Daylight Time+1400
Timezone annotation example
  1. #timezone -0600

concat

The #concat annotation adds a new column that is concatenated from existing columns according to bash-like string interpolation literal with variables referencing existing column labels.

For example:

  1. #concat,string,fullName,${firstName} ${lastName}

This is especially useful when constructing a timestamp from multiple columns. For example, the following annotation will combine the given CSV columns into a timestamp:

  1. #concat,dateTime:2006-01-02,${Year}-${Month}-${Day}
  2. Year,Month,Day,Hour,Minute,Second,Tag,Value
  3. 2020,05,22,00,00,00,test,0
  4. 2020,05,22,00,05,00,test,1
  5. 2020,05,22,00,10,00,test,2

Define custom column separator

If columns are delimited using a character other than a comma, use the sep keyword to define a custom separator in the first line of your CSV file.

  1. sep=;

Annotation shorthand

Extended annotated CSV supports annotation shorthand. Include the column label, datatype, and (optional) default value in each column header row using the following syntax:

  1. <column-label>|<column-datatype>|<column-default-value>
Example annotation shorthand
  1. m|measurement,location|tag|Hong Kong,temp|double,pm|long|0,time|dateTime:RFC3339
  2. weather,San Francisco,51.9,38,2020-01-01T00:00:00Z
  3. weather,New York,18.2,,2020-01-01T00:00:00Z
  4. weather,,53.6,171,2020-01-01T00:00:00Z
The shorthand explained
  • The m column represents the measurement and has no default value.
  • The location column is a tag with the default value, Hong Kong.
  • The temp column is a field with float (double) values and no default value.
  • The pm column is a field with integer (long) values and a default of 0.
  • The time column represents the timestamp, uses the RFC3339 timestamp format, and has no default value.
Resulting line protocol
  1. weather,location=San\ Francisco temp=51.9,pm=38i 1577836800000000000
  2. weather,location=New\ York temp=18.2,pm=0i 1577836800000000000
  3. weather,location=Hong\ Kong temp=53.6,pm=171i 1577836800000000000

csv syntax write