Data Model Selection

Before importing data into IoTDB, we first select an appropriate data storage model according to the sample data, and then create the storage groups and timeseries using the SET STORAGE GROUP and CREATE TIMESERIES statements, respectively.

Storage Model Selection

According to the data attribute layers described in the sample data, we can express the data as an attribute hierarchy based on the coverage of the attributes and the subordinate relationships between them, as shown in Figure 3.1 below. The hierarchy is: power group layer - power plant layer - device layer - sensor layer. ROOT is the root node, and each node of the sensor layer is called a leaf node. When using IoTDB, you can connect the attributes on the path from the ROOT node to a leaf node with “.” to form the name of a timeseries. For example, the left-most path in Figure 3.1 yields a timeseries named ROOT.ln.wf01.wt01.status.

Figure 3.1 Attribute hierarchy structure
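
The naming rule described above can be illustrated with a minimal Python sketch. The helper `timeseries_name` is hypothetical and is not part of IoTDB; it only joins the attributes on a root-to-leaf path with "." and rejects empty layer names or names that already contain a dot.

```python
def timeseries_name(*layers: str) -> str:
    """Join the attribute layers on a root-to-leaf path with '.'.

    Hypothetical helper for illustration only; not an IoTDB API.
    """
    if not layers:
        raise ValueError("at least one layer is required")
    for layer in layers:
        # A layer name containing '.' would silently add an extra level.
        if not layer or "." in layer:
            raise ValueError(f"invalid layer name: {layer!r}")
    return ".".join(layers)


# Left-most path in Figure 3.1: power group, power plant, device, sensor.
name = timeseries_name("ROOT", "ln", "wf01", "wt01", "status")
print(name)  # ROOT.ln.wf01.wt01.status
```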

After getting the name of the timeseries, we need to set up the storage group according to the actual scenario and scale of the data. In the scenario of this chapter, data usually arrives in units of power groups (i.e., a batch of data may span several power plants and devices). To avoid frequent IO switching when writing data, and to meet the user’s requirement that data be physically isolated per group, we set the storage group at the power group layer.

Storage Group Creation

After selecting the storage model, we can set up the corresponding storage groups. The SQL statements for creating storage groups are as follows:

  IoTDB> set storage group to root.ln
  IoTDB> set storage group to root.sgcc

These two statements create the storage groups root.ln and root.sgcc.

Note that a path cannot be set as a storage group if the path itself, or any parent or child layer of the path, is already set as a storage group. For example, when the storage groups root.ln and root.sgcc exist, root.ln.wf01 cannot be set as a storage group. The system reports the corresponding error, as shown below:

  IoTDB> set storage group to root.ln.wf01
  Msg: org.apache.iotdb.exception.MetadataErrorException: org.apache.iotdb.exception.PathErrorException: The prefix of root.ln.wf01 has been set to the storage group.
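
The conflict rule above can be sketched in Python as a client-side check. This is only an illustration of the rule as stated in this section, not IoTDB's actual implementation; the function names `conflicts` and `can_set_storage_group` are hypothetical.

```python
def conflicts(candidate: str, existing: str) -> bool:
    """True if the paths are equal or one is a parent layer of the other."""
    a, b = candidate.split("."), existing.split(".")
    shorter = min(len(a), len(b))
    # Compare node by node, so root.lnx does not conflict with root.ln.
    return a[:shorter] == b[:shorter]


def can_set_storage_group(candidate: str, storage_groups: list) -> bool:
    """Apply the rule stated above to a list of existing storage groups."""
    return not any(conflicts(candidate, sg) for sg in storage_groups)


groups = ["root.ln", "root.sgcc"]
print(can_set_storage_group("root.ln.wf01", groups))  # False: parent root.ln is set
print(can_set_storage_group("root.ln", groups))       # False: already set
print(can_set_storage_group("root.lnx", groups))      # True: no overlap
```

Comparing whole path nodes rather than raw string prefixes avoids treating root.lnx as a child of root.ln.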

Show Storage Group

After the storage group is created, we can use the SHOW STORAGE GROUP statement to view all the storage groups. The SQL statement is as follows:

  IoTDB> show storage group

The result is as follows:

[Figure: output of the SHOW STORAGE GROUP statement]

Timeseries Creation

Based on the storage model selected above, we can create the corresponding timeseries in each of the two storage groups. The SQL statements for creating timeseries are as follows:

  IoTDB> create timeseries root.ln.wf01.wt01.status with datatype=BOOLEAN,encoding=PLAIN
  IoTDB> create timeseries root.ln.wf01.wt01.temperature with datatype=FLOAT,encoding=RLE
  IoTDB> create timeseries root.ln.wf02.wt02.hardware with datatype=TEXT,encoding=PLAIN
  IoTDB> create timeseries root.ln.wf02.wt02.status with datatype=BOOLEAN,encoding=PLAIN
  IoTDB> create timeseries root.sgcc.wf03.wt01.status with datatype=BOOLEAN,encoding=PLAIN
  IoTDB> create timeseries root.sgcc.wf03.wt01.temperature with datatype=FLOAT,encoding=RLE
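
When many timeseries share the same pattern, the statements can be generated from a small schema description instead of being typed by hand. The following Python sketch only builds the statement text for the six series of this section; the `SCHEMA` mapping and the `create_timeseries_sql` helper are hypothetical names, and the sketch neither connects to IoTDB nor validates the data type and encoding combination.

```python
# Path -> (data type, encoding), taken from the six statements of this section.
SCHEMA = {
    "root.ln.wf01.wt01.status": ("BOOLEAN", "PLAIN"),
    "root.ln.wf01.wt01.temperature": ("FLOAT", "RLE"),
    "root.ln.wf02.wt02.hardware": ("TEXT", "PLAIN"),
    "root.ln.wf02.wt02.status": ("BOOLEAN", "PLAIN"),
    "root.sgcc.wf03.wt01.status": ("BOOLEAN", "PLAIN"),
    "root.sgcc.wf03.wt01.temperature": ("FLOAT", "RLE"),
}


def create_timeseries_sql(path: str, datatype: str, encoding: str) -> str:
    """Build the text of one CREATE TIMESERIES statement (no validation)."""
    return f"create timeseries {path} with datatype={datatype},encoding={encoding}"


statements = [create_timeseries_sql(p, d, e) for p, (d, e) in SCHEMA.items()]
for statement in statements:
    print(statement)
```

The printed lines can then be pasted into the IoTDB command line client.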

Note that when the encoding method in a CREATE TIMESERIES statement conflicts with the data type, the system reports the corresponding error, as shown below:

  IoTDB> create timeseries root.ln.wf02.wt02.status WITH DATATYPE=BOOLEAN, ENCODING=TS_2DIFF
  error: encoding TS_2DIFF does not support BOOLEAN

Please refer to Encoding for correspondence between data type and encoding.

Show Timeseries

Currently, IoTDB supports two ways of viewing timeseries:

  • The SHOW TIMESERIES statement presents all timeseries information in JSON form.
  • The SHOW TIMESERIES <Path> statement returns, in tabular form, all timeseries information and the total number of timeseries under the given <Path>. Timeseries information includes the timeseries path, the storage group it belongs to, the data type, and the encoding type. <Path> must be a prefix path, a path with a star, or a timeseries path.

The SQL statements are as follows:

  IoTDB> show timeseries root
  IoTDB> show timeseries root.ln

The results are shown below, respectively:

[Figure: output of show timeseries root]

[Figure: output of show timeseries root.ln]

It is worth noting that when the queried path does not exist, the system will return no timeseries.

Precautions

Version 0.8.2 imposes some limitations on the scale of data that users can operate on:

Limit 1: Assuming that the JVM memory allocated to IoTDB at runtime is P and the user-defined size of in-memory data at which data is written to disk (group_size_in_byte) is Q, the number of storage groups should not exceed P/Q.

Limit 2: The number of timeseries should not exceed the ratio of the JVM memory allocated to IoTDB at runtime to 20KB.
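
As a worked example of the two limits, the following Python sketch computes both bounds. The memory and group size values are hypothetical and chosen only for illustration, and the sketch assumes that 1 KB means 1024 bytes, which the text above does not specify.

```python
KB = 1024  # assumption: 1 KB is taken as 1024 bytes


def max_storage_groups(jvm_memory_bytes: int, group_size_in_byte: int) -> int:
    """Limit 1: JVM memory divided by group_size_in_byte, rounded down."""
    return jvm_memory_bytes // group_size_in_byte


def max_timeseries(jvm_memory_bytes: int) -> int:
    """Limit 2: JVM memory divided by 20 KB, rounded down."""
    return jvm_memory_bytes // (20 * KB)


jvm_memory = 4 * 1024**3    # hypothetical: 4 GiB allocated to IoTDB
group_size = 128 * 1024**2  # hypothetical: group_size_in_byte of 128 MiB

print(max_storage_groups(jvm_memory, group_size))  # 32
print(max_timeseries(jvm_memory))                  # 209715
```

With these example values, at most 32 storage groups and roughly 210,000 timeseries would fit within the stated limits.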