TsFile Hierarchy
Here is a brief introduction to the structure of a TsFile.
Variable Storage
Big Endian
- For example, the int 0x8 will be stored as 00 00 00 08, not 08 00 00 00.
String with Variable Length
- The format is int size plus String literal. Size can be zero.
- Size equals the number of bytes this string will take, and it may not be equal to the length of the string.
- For example, “sensor_1” will be stored as 00 00 00 08 plus the encoding (ASCII) of “sensor_1”.
- Note that for the “Magic String” (file signature) “TsFilev0.8.0”, the size (12) and encoding (ASCII) are fixed, so there is no need to put the size before this string literal.
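The two storage rules above can be sketched in Python with the standard struct module. The helper names (write_int, write_string, read_string) are hypothetical, and UTF-8 is an assumption; it produces the same bytes as ASCII for ASCII-only text.

```python
import struct

def write_int(value: int) -> bytes:
    """Encode a 32-bit signed integer in big-endian byte order."""
    return struct.pack(">i", value)

def write_string(text: str) -> bytes:
    """Encode a string as a 4-byte big-endian size followed by its bytes."""
    data = text.encode("utf-8")  # assumption: UTF-8 (same bytes as ASCII for ASCII text)
    return write_int(len(data)) + data

def read_string(buf: bytes, offset: int = 0):
    """Decode a size-prefixed string; return (text, offset just past it)."""
    (size,) = struct.unpack_from(">i", buf, offset)
    start = offset + 4
    return buf[start:start + size].decode("utf-8"), start + size

print(write_int(0x8).hex(" "))            # 00 00 00 08
print(write_string("sensor_1").hex(" "))  # 00 00 00 08 73 65 6e 73 6f 72 5f 31
```

The size prefix counts bytes, not characters, so a one-character string such as "é" gets a size of 2 under UTF-8.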
Data Type Hardcode
- 0: BOOLEAN
- 1: INT32 (int)
- 2: INT64 (long)
- 3: FLOAT
- 4: DOUBLE
- 5: TEXT (String)
Encoding Type Hardcode
- 0: PLAIN
- 1: PLAIN_DICTIONARY
- 2: RLE
- 3: DIFF
- 4: TS_2DIFF
- 5: BITMAP
- 6: GORILLA
- 7: REGULAR
Compression Type Hardcode
- 0: UNCOMPRESSED
- 1: SNAPPY
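The three code tables above map naturally onto enumerations. The following is a minimal sketch; the class names and the decode_code helper are hypothetical, and it assumes a code is read from a 2-byte big-endian short, which is how the header tables in this document type these fields.

```python
import struct
from enum import IntEnum

class DataType(IntEnum):
    BOOLEAN = 0
    INT32 = 1
    INT64 = 2
    FLOAT = 3
    DOUBLE = 4
    TEXT = 5

class Encoding(IntEnum):
    PLAIN = 0
    PLAIN_DICTIONARY = 1
    RLE = 2
    DIFF = 3
    TS_2DIFF = 4
    BITMAP = 5
    GORILLA = 6
    REGULAR = 7

class Compression(IntEnum):
    UNCOMPRESSED = 0
    SNAPPY = 1

def decode_code(enum_cls, raw: bytes):
    """Decode a 2-byte big-endian short into a member of enum_cls."""
    (code,) = struct.unpack(">h", raw)
    return enum_cls(code)  # raises ValueError for a code not listed above

print(decode_code(DataType, b"\x00\x04").name)  # DOUBLE
print(decode_code(Encoding, b"\x00\x06").name)  # GORILLA
```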
TsFile Overview
Here is a diagram of the TsFile structure.
Magic String
There is a 12-byte magic string:
TsFilev0.8.0
It appears at both the beginning and the end of a TsFile as its signature.
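A quick signature check could look like the sketch below. The function name has_valid_signature is hypothetical, and the check only inspects the first and last 12 bytes; it does not validate anything in between.

```python
MAGIC = b"TsFilev0.8.0"  # 12 ASCII bytes, stored without a size prefix

def has_valid_signature(data: bytes) -> bool:
    """Return True if data starts and ends with the magic string."""
    return (
        len(data) >= 2 * len(MAGIC)  # the two copies must not overlap
        and data.startswith(MAGIC)
        and data.endswith(MAGIC)
    )

print(has_valid_signature(MAGIC + b"\x02" + MAGIC))  # True
print(has_valid_signature(MAGIC + b"truncated"))     # False
```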
Data
The content of a TsFile can be divided into two parts: data and metadata. There is a byte 0x02 as the marker between data and metadata.
The data section is an array of ChunkGroup; each ChunkGroup represents a device.
ChunkGroup
A ChunkGroup has an array of Chunk, followed by a byte 0x00 as the marker, and a ChunkGroupFooter.
Chunk
A Chunk represents a sensor. There is a byte 0x01 as the marker, followed by a ChunkHeader and an array of Page.
ChunkHeader
Member Description | Member Type |
---|---|
The name of this sensor (measurementID) | String |
Size of this chunk | int |
Data type of this chunk | short |
Number of pages | int |
Compression Type | short |
Encoding Type | short |
Max Tombstone Time | long |
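As an illustration, a ChunkHeader could be decoded as in the sketch below. It assumes the members are stored back to back in the order of the table above, with no padding, and that reading starts right after the 0x01 marker byte. The ChunkHeader tuple and read_chunk_header are hypothetical names, not part of the TsFile library.

```python
import struct
from typing import NamedTuple

class ChunkHeader(NamedTuple):
    measurement_id: str
    data_size: int
    data_type: int
    num_pages: int
    compression: int
    encoding: int
    max_tombstone_time: int

FIXED_FIELDS = ">ihihhq"  # int, short, int, short, short, long (big-endian, unpadded)

def read_chunk_header(buf: bytes, offset: int = 0):
    """Decode a ChunkHeader at offset; return (header, offset just past it)."""
    # The sensor name is a size-prefixed string.
    (size,) = struct.unpack_from(">i", buf, offset)
    offset += 4
    name = buf[offset:offset + size].decode("utf-8")
    offset += size
    # The remaining members have fixed widths (22 bytes in total).
    fields = struct.unpack_from(FIXED_FIELDS, buf, offset)
    offset += struct.calcsize(FIXED_FIELDS)
    return ChunkHeader(name, *fields), offset

sample = struct.pack(">i", 2) + b"s1" + struct.pack(FIXED_FIELDS, 100, 1, 3, 1, 2, -1)
header, end = read_chunk_header(sample)
print(header.measurement_id, header.num_pages, end)  # s1 3 28
```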
Page
A Page represents some data in a Chunk. It contains a PageHeader and the actual data (the encoded time-value pairs).
PageHeader Structure
Member Description | Member Type |
---|---|
Data size before compressing | int |
Data size after compressing (if SNAPPY is used) | int |
Number of values | int |
Minimum time stamp | long |
Maximum time stamp | long |
Minimum value of the page | Type of the page |
Maximum value of the page | Type of the page |
First value of the page | Type of the page |
Last value of the page | Type of the page |
Sum of the Page | double |
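Because four members take the type of the page, the header width depends on the data type. The sketch below decodes a PageHeader for the numeric types only, assuming the members appear in table order without padding. How BOOLEAN and TEXT values are laid out is not described here, so the sketch rejects them. The names read_page_header and VALUE_FORMATS are hypothetical.

```python
import struct

# struct format characters for the numeric data type codes
VALUE_FORMATS = {1: "i", 2: "q", 3: "f", 4: "d"}  # INT32, INT64, FLOAT, DOUBLE

FIELD_NAMES = (
    "uncompressed_size", "compressed_size", "num_values",
    "min_time", "max_time",
    "min_value", "max_value", "first_value", "last_value",
    "sum",
)

def read_page_header(buf: bytes, data_type: int, offset: int = 0):
    """Decode a PageHeader for a numeric page; return (dict, next offset)."""
    if data_type not in VALUE_FORMATS:
        raise ValueError("only INT32, INT64, FLOAT and DOUBLE are handled in this sketch")
    # 3 ints, 2 longs, 4 values of the page type, then the sum as a double
    fmt = ">iiiqq" + VALUE_FORMATS[data_type] * 4 + "d"
    values = struct.unpack_from(fmt, buf, offset)
    return dict(zip(FIELD_NAMES, values)), offset + struct.calcsize(fmt)

sample = struct.pack(">iiiqqiiiid", 64, 40, 5, 1000, 1004, 1, 9, 3, 9, 25.0)
header, end = read_page_header(sample, data_type=1)
print(header["min_value"], header["max_value"], end)  # 1 9 52
```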
ChunkGroupFooter
Member Description | Member Type |
---|---|
Deviceid | String |
Data size of the ChunkGroup | long |
Number of chunks | int |
Metadata
TsDeviceMetaData
The first part of the metadata is TsDeviceMetaData.
Member Description | Member Type |
---|---|
Start time | long |
End time | long |
Number of chunk groups | int |
Then there is an array of ChunkGroupMetaData after TsDeviceMetaData.
ChunkGroupMetaData
Member Description | Member Type |
---|---|
Deviceid | String |
Start offset of the ChunkGroup | long |
End offset of the ChunkGroup | long |
Version | long |
Number of ChunkMetaData | int |
Then there is an array of ChunkMetaData for each ChunkGroupMetaData.
ChunkMetaData
Member Description | Member Type |
---|---|
Measurementid | String |
Start offset of ChunkHeader | long |
Number of data points | long |
Start time | long |
End time | long |
Data type | short |
Number of statistics | int |
The statistics of this chunk | TsDigest |
TsDigest
There are five statistics: min, last, sum, first, max.
The storage format is a name-value pair. The name is a string (remember that the length comes before the literal).
For the value, there is also a size integer before the data, even if it is not a string. For example, if the min is 3, it will be stored as 3 “min” 4 3 in the TsFile.
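The “min” example can be reproduced byte for byte with the sketch below. It assumes the value 3 is an INT32, which matches the size of 4 in the example. The helper names write_statistic and read_statistics are hypothetical, and values are returned as raw bytes because their interpretation depends on the data type of the chunk.

```python
import struct

def write_statistic(name: str, value_bytes: bytes) -> bytes:
    """Encode one statistic as size + name, then size + raw value bytes."""
    name_bytes = name.encode("utf-8")
    return (
        struct.pack(">i", len(name_bytes)) + name_bytes
        + struct.pack(">i", len(value_bytes)) + value_bytes
    )

def read_statistics(buf: bytes, count: int, offset: int = 0):
    """Decode count statistics; return ({name: raw value bytes}, next offset)."""
    stats = {}
    for _ in range(count):
        (name_size,) = struct.unpack_from(">i", buf, offset)
        offset += 4
        name = buf[offset:offset + name_size].decode("utf-8")
        offset += name_size
        (value_size,) = struct.unpack_from(">i", buf, offset)
        offset += 4
        stats[name] = buf[offset:offset + value_size]
        offset += value_size
    return stats, offset

entry = write_statistic("min", struct.pack(">i", 3))
print(entry.hex(" "))  # 00 00 00 03 6d 69 6e 00 00 00 04 00 00 00 03
```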
File Metadata
After the array of ChunkGroupMetaData comes the last part of the metadata.
Member Description | Member Type |
---|---|
Number of Devices | int |
Array of DeviceIndexMetadata | DeviceIndexMetadata |
Number of Measurements | int |
Array of Measurement name and schema | String, MeasurementSchema pair |
Current Version (3 for now) | int |
Author byte | byte |
Author (if author byte is 0x01) | String |
File Metadata size (not including itself) | int |
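Since the size is the last member and only the closing magic string follows it, a reader can find the File Metadata by working backwards from the end of the file. The sketch below assumes that “not including itself” means the 4-byte size field is excluded from the count; locate_file_metadata is a hypothetical helper name.

```python
import struct

MAGIC = b"TsFilev0.8.0"
MAGIC_LEN = len(MAGIC)  # 12

def locate_file_metadata(data: bytes):
    """Return (start offset, size) of the File Metadata body, read from the tail."""
    if len(data) < 2 * MAGIC_LEN + 4:
        raise ValueError("too short to be a TsFile")
    size_pos = len(data) - MAGIC_LEN - 4  # the size int sits just before the tail magic
    (size,) = struct.unpack_from(">i", data, size_pos)
    start = size_pos - size
    if size < 0 or start < MAGIC_LEN:
        raise ValueError("file metadata size is out of range")
    return start, size

# Toy layout: magic, one marker byte, 10 bytes standing in for metadata, size, magic.
sample = MAGIC + b"\x02" + b"M" * 10 + struct.pack(">i", 10) + MAGIC
print(locate_file_metadata(sample))  # (13, 10)
```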
DeviceIndexMetadata
Member Description | Member Type |
---|---|
Deviceid | String |
Start offset of ChunkGroupMetaData (or TsDeviceMetaData if it’s the first one) | long |
Length | int |
Start time | long |
End time | long |
MeasurementSchema
Member Description | Member Type |
---|---|
Measurementid | String |
Data type | short |
Encoding | short |
Compressor | short |
Size of props | int |
If size of props is greater than 0, there is an array of String key-value pairs, such as “max_point_number” “2”.
Done
After the FileMetaData, there will be another Magic String, and you have finished the journey of discovering TsFile!
You can also use /tsfile/example/TsFileSequenceRead to read and validate a TsFile.