CSV Tool

The CSV tool can help you import data in CSV format to IoTDB or export data from IoTDB to a CSV file.

Usage of export-csv.shCSV Load Export Tool - 图1open in new window

Syntax

  1. # Unix/OS X
  2. > tools/export-csv.sh -h <ip> -p <port> -u <username> -pw <password> -td <directory> [-tf <time-format> -datatype <true/false> -q <query command> -s <sql file> -linesPerFile <int>]
  3. # Windows
  4. > tools\export-csv.bat -h <ip> -p <port> -u <username> -pw <password> -td <directory> [-tf <time-format> -datatype <true/false> -q <query command> -s <sql file> -linesPerFile <int>]

Description:

  • -datatype:
    • true (by default): print the data type of timesries in the head line of CSV file. i.e., Time, root.sg1.d1.s1(INT32), root.sg1.d1.s2(INT64).
    • false: only print the timeseries name in the head line of the CSV file. i.e., Time, root.sg1.d1.s1 , root.sg1.d1.s2
  • -q <query command>:
    • specifying a query command that you want to execute
    • example: select * from root.** limit 100, or select * from root.** limit 100 align by device
  • -s <sql file>:
    • specifying a SQL file which can consist of more than one sql. If there are multiple SQLs in one SQL file, the SQLs should be separated by line breaks. And, for each SQL, a output CSV file will be generated.
  • -td <directory>:
    • specifying the directory that the data will be exported
  • -tf <time-format>:
    • specifying a time format that you want. The time format have to obey ISO 8601CSV Load Export Tool - 图2open in new window standard. If you want to save the time as the timestamp, then setting -tf timestamp
    • example: -tf yyyy-MM-dd\ HH:mm:ss or -tf timestamp
  • -linesPerFile <int>:
    • Specifying lines of each dump file, 10000 is default.
    • example: -linesPerFile 1
  • -t <timeout>:
    • Specifies the timeout period for session queries, in milliseconds

More, if you don’t use one of -s and -q, you need to enter some queries after running the export script. The results of the different query will be saved to different CSV files.

example

  1. # Unix/OS X
  2. > tools/export-csv.sh -h 127.0.0.1 -p 6667 -u root -pw root -td ./
  3. # Or
  4. > tools/export-csv.sh -h 127.0.0.1 -p 6667 -u root -pw root -td ./ -tf yyyy-MM-dd\ HH:mm:ss
  5. # or
  6. > tools/export-csv.sh -h 127.0.0.1 -p 6667 -u root -pw root -td ./ -q "select * from root.**"
  7. # Or
  8. > tools/export-csv.sh -h 127.0.0.1 -p 6667 -u root -pw root -td ./ -s sql.txt
  9. # Or
  10. > tools/export-csv.sh -h 127.0.0.1 -p 6667 -u root -pw root -td ./ -tf yyyy-MM-dd\ HH:mm:ss -s sql.txt
  11. # Or
  12. > tools/export-csv.sh -h 127.0.0.1 -p 6667 -u root -pw root -td ./ -tf yyyy-MM-dd\ HH:mm:ss -s sql.txt -linesPerFile 10
  13. # Or
  14. > tools/export-csv.sh -h 127.0.0.1 -p 6667 -u root -pw root -td ./ -tf yyyy-MM-dd\ HH:mm:ss -s sql.txt -linesPerFile 10 -t 10000
  15. # Windows
  16. > tools/export-csv.bat -h 127.0.0.1 -p 6667 -u root -pw root -td ./
  17. # Or
  18. > tools/export-csv.bat -h 127.0.0.1 -p 6667 -u root -pw root -td ./ -tf yyyy-MM-dd\ HH:mm:ss
  19. # or
  20. > tools/export-csv.bat -h 127.0.0.1 -p 6667 -u root -pw root -td ./ -q "select * from root.**"
  21. # Or
  22. > tools/export-csv.bat -h 127.0.0.1 -p 6667 -u root -pw root -td ./ -s sql.txt
  23. # Or
  24. > tools/export-csv.bat -h 127.0.0.1 -p 6667 -u root -pw root -td ./ -tf yyyy-MM-dd\ HH:mm:ss -s sql.txt
  25. # Or
  26. > tools/export-csv.bat -h 127.0.0.1 -p 6667 -u root -pw root -td ./ -tf yyyy-MM-dd\ HH:mm:ss -s sql.txt -linesPerFile 10
  27. # Or
  28. > tools/export-csv.sh -h 127.0.0.1 -p 6667 -u root -pw root -td ./ -tf yyyy-MM-dd\ HH:mm:ss -s sql.txt -linesPerFile 10 -t 10000

Sample SQL file

  1. select * from root.**;
  2. select * from root.** align by device;

The result of select * from root.**

  1. Time,root.ln.wf04.wt04.status(BOOLEAN),root.ln.wf03.wt03.hardware(TEXT),root.ln.wf02.wt02.status(BOOLEAN),root.ln.wf02.wt02.hardware(TEXT),root.ln.wf01.wt01.hardware(TEXT),root.ln.wf01.wt01.status(BOOLEAN)
  2. 1970-01-01T08:00:00.001+08:00,true,"v1",true,"v1",v1,true
  3. 1970-01-01T08:00:00.002+08:00,true,"v1",,,,true

The result of select * from root.** align by device

  1. Time,Device,hardware(TEXT),status(BOOLEAN)
  2. 1970-01-01T08:00:00.001+08:00,root.ln.wf01.wt01,"v1",true
  3. 1970-01-01T08:00:00.002+08:00,root.ln.wf01.wt01,,true
  4. 1970-01-01T08:00:00.001+08:00,root.ln.wf02.wt02,"v1",true
  5. 1970-01-01T08:00:00.001+08:00,root.ln.wf03.wt03,"v1",
  6. 1970-01-01T08:00:00.002+08:00,root.ln.wf03.wt03,"v1",
  7. 1970-01-01T08:00:00.001+08:00,root.ln.wf04.wt04,,true
  8. 1970-01-01T08:00:00.002+08:00,root.ln.wf04.wt04,,true

The data of boolean type signed by true and false without double quotes. And the text data will be enclosed in double quotes.

Note

Note that if fields exported by the export tool have the following special characters:

  1. ,: the field will be escaped by \.

Usage of import-csv.shCSV Load Export Tool - 图3open in new window

Create metadata (optional)

  1. CREATE DATABASE root.fit.d1;
  2. CREATE DATABASE root.fit.d2;
  3. CREATE DATABASE root.fit.p;
  4. CREATE TIMESERIES root.fit.d1.s1 WITH DATATYPE=INT32,ENCODING=RLE;
  5. CREATE TIMESERIES root.fit.d1.s2 WITH DATATYPE=TEXT,ENCODING=PLAIN;
  6. CREATE TIMESERIES root.fit.d2.s1 WITH DATATYPE=INT32,ENCODING=RLE;
  7. CREATE TIMESERIES root.fit.d2.s3 WITH DATATYPE=INT32,ENCODING=RLE;
  8. CREATE TIMESERIES root.fit.p.s1 WITH DATATYPE=INT32,ENCODING=RLE;

IoTDB has the ability of type inference, so it is not necessary to create metadata before data import. However, we still recommend creating metadata before importing data using the CSV import tool, as this can avoid unnecessary type conversion errors.

Sample CSV file to be imported

The data aligned by time, and headers without data type.

  1. Time,root.test.t1.str,root.test.t2.str,root.test.t2.int
  2. 1970-01-01T08:00:00.001+08:00,"123hello world","123\,abc",100
  3. 1970-01-01T08:00:00.002+08:00,"123",,

The data aligned by time, and headers with data type.(Text type data supports double quotation marks and no double quotation marks)

  1. Time,root.test.t1.str(TEXT),root.test.t2.str(TEXT),root.test.t2.int(INT32)
  2. 1970-01-01T08:00:00.001+08:00,"123hello world","123\,abc",100
  3. 1970-01-01T08:00:00.002+08:00,123,hello world,123
  4. 1970-01-01T08:00:00.003+08:00,"123",,
  5. 1970-01-01T08:00:00.004+08:00,123,,12

The data aligned by device, and headers without data type.

  1. Time,Device,str,int
  2. 1970-01-01T08:00:00.001+08:00,root.test.t1,"123hello world",
  3. 1970-01-01T08:00:00.002+08:00,root.test.t1,"123",
  4. 1970-01-01T08:00:00.001+08:00,root.test.t2,"123\,abc",100

The data aligned by device, and headers with data type.(Text type data supports double quotation marks and no double quotation marks)

  1. Time,Device,str(TEXT),int(INT32)
  2. 1970-01-01T08:00:00.001+08:00,root.test.t1,"123hello world",
  3. 1970-01-01T08:00:00.002+08:00,root.test.t1,hello world,123
  4. 1970-01-01T08:00:00.003+08:00,root.test.t1,,123

Syntax

  1. # Unix/OS X
  2. > tools/import-csv.sh -h <ip> -p <port> -u <username> -pw <password> -f <xxx.csv> [-fd <./failedDirectory>] [-aligned <true>] [-tp <ms/ns/us>] [-typeInfer <boolean=text,float=double...>]
  3. # Windows
  4. > tools\import-csv.bat -h <ip> -p <port> -u <username> -pw <password> -f <xxx.csv> [-fd <./failedDirectory>] [-aligned <true>] [-tp <ms/ns/us>] [-typeInfer <boolean=text,float=double...>]

Description:

  • -f:

    • the CSV file that you want to import, and it could be a file or a folder. If a folder is specified, all TXT and CSV files in the folder will be imported in batches.
    • example: -f filename.csv
  • -fd:

    • specifying a directory to save files which save failed lines. If you don’t use this parameter, the failed file will be saved at original directory, and the filename will be the source filename with suffix .failed.
    • example: -fd ./failed/
  • -aligned:

    • whether to use the aligned interface? The option false is default.
    • example: -aligned true
  • -batch:

    • specifying the point’s number of a batch. If the program throw the exception org.apache.thrift.transport.TTransportException: Frame size larger than protect max size, you can lower this parameter as appropriate.
    • example: -batch 100000, 100000 is the default value.
  • -tp <time-precision>:

    • specifying a time precision. Options includes ms(millisecond), ns(nanosecond), and us(microsecond), ms is default.
  • -typeInfer <srcTsDataType1=dstTsDataType1,srcTsDataType2=dstTsDataType2,...>:

    • specifying rules of type inference.
    • Option srcTsDataType includes boolean,int,long,float,double,NaN.
    • Option dstTsDataType includes boolean,int,long,float,double,text.
    • When srcTsDataType is boolean, dstTsDataType should be between boolean and text.
    • When srcTsDataType is NaN, dstTsDataType should be among float, double and text.
    • When srcTsDataType is Numeric type, dstTsDataType precision should be greater than srcTsDataType.
    • example: -typeInfer boolean=text,float=double
  • -linesPerFailedFile <int>:

    • Specifying lines of each failed file, 10000 is default.
    • example: -linesPerFailedFile 1

Example

  1. # Unix/OS X
  2. > tools/import-csv.sh -h 127.0.0.1 -p 6667 -u root -pw root -f example-filename.csv -fd ./failed
  3. # or
  4. > tools/import-csv.sh -h 127.0.0.1 -p 6667 -u root -pw root -f example-filename.csv -fd ./failed
  5. # or
  6. > tools\import-csv.sh -h 127.0.0.1 -p 6667 -u root -pw root -f example-filename.csv -fd ./failed -tp ns
  7. # or
  8. > tools\import-csv.sh -h 127.0.0.1 -p 6667 -u root -pw root -f example-filename.csv -fd ./failed -tp ns -typeInfer boolean=text,float=double
  9. # or
  10. > tools\import-csv.sh -h 127.0.0.1 -p 6667 -u root -pw root -f example-filename.csv -fd ./failed -tp ns -typeInfer boolean=text,float=double -linesPerFailedFile 10
  11. # Windows
  12. > tools\import-csv.bat -h 127.0.0.1 -p 6667 -u root -pw root -f example-filename.csv
  13. # or
  14. > tools\import-csv.bat -h 127.0.0.1 -p 6667 -u root -pw root -f example-filename.csv -fd .\failed
  15. # or
  16. > tools\import-csv.bat -h 127.0.0.1 -p 6667 -u root -pw root -f example-filename.csv -fd .\failed -tp ns
  17. # or
  18. > tools\import-csv.bat -h 127.0.0.1 -p 6667 -u root -pw root -f example-filename.csv -fd .\failed -tp ns -typeInfer boolean=text,float=double
  19. # or
  20. > tools\import-csv.bat -h 127.0.0.1 -p 6667 -u root -pw root -f example-filename.csv -fd .\failed -tp ns -typeInfer boolean=text,float=double -linesPerFailedFile 10

Note

Note that the following special characters in fields need to be checked before importing:

  1. , : fields containing , should be escaped by \.
  2. you can input time format like yyyy-MM-dd'T'HH:mm:ss, yyy-MM-dd HH:mm:ss, or yyyy-MM-dd'T'HH:mm:ss.SSSZ.
  3. the Time column must be the first one.