ETL - Loaders

When the ETL module executes, Loaders handle the saving of records. They run at the last stage of the process. The ETL module in OrientDB supports the following loaders:

Output Loader

When the ETL module runs the Output Loader, it prints the transformer results to the console output. This is the loader that runs by default.

  • Component name: output
  • Accepted input classes: [Object]
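
As a minimal sketch, the fragment below selects the Output Loader by its component name. It assumes the loader is declared under a top-level "loader" key of the ETL configuration file, and that the component takes an empty configuration, since no parameters are listed for it here.

```json
{
  "loader": {
    "output": {}
  }
}
```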

OrientDB Loader

When the ETL module runs the OrientDB Loader, it loads the records and vertices from the transformers into the OrientDB database.

  • Component name: orientdb
  • Accepted input classes: [ ODocument, OrientVertex ]
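
A minimal sketch of selecting this loader, assuming it is declared under a top-level "loader" key of the ETL configuration file. The database path is a hypothetical placeholder. Only the mandatory "dbURL" is given, so every other parameter falls back to the default listed under Syntax.

```json
{
  "loader": {
    "orientdb": {
      "dbURL": "plocal:/temp/databases/example"
    }
  }
}
```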

Syntax

| Parameter | Description | Type | Mandatory | Default value |
|---|---|---|---|---|
| "dbURL" | Defines the database URL. | string | yes | |
| "dbUser" | Defines the user name. | string | | admin |
| "dbPassword" | Defines the user password. | string | | admin |
| "dbAutoCreate" | Defines whether it automatically creates the database if it doesn’t already exist. | boolean | | true |
| "dbAutoCreateProperties" | Defines whether it automatically creates properties in the schema. | boolean | | false |
| "dbAutoDropIfExists" | Defines whether it automatically drops the database if it already exists. | boolean | | false |
| "tx" | Defines whether it uses transactions. | boolean | | false |
| "txUseLog" | Defines whether it uses the log in transactions. | boolean | | |
| "wal" | Defines whether it uses write-ahead logging. Disable to achieve better performance. | boolean | | true |
| "batchCommit" | When using transactions, defines the number of entries it commits per batch. Helps avoid having one large transaction in memory. | integer | | 0 |
| "dbType" | Defines the database type: graph or document. | string | | document |
| "class" | Defines the class to use when storing new records. | string | | |
| "cluster" | Defines the cluster in which to store new records. | string | | |
| "classes" | Defines the classes to create, if not already defined in the database. | inner document | | |
| "indexes" | Defines the indexes to use in the ETL process. Before starting, it creates any declared indexes not present in the database. Indexes must have "type", "class" and "fields". | inner document | | |
| "useLightweightEdges" | Defines whether it changes the default setting for Lightweight Edges. | boolean | | false |
| "standardElementConstraints" | Defines whether it changes the default setting for TinkerPop Blueprints constraints. With the standard constraints, a value cannot be null and you cannot use id as a property name. | boolean | | true |

When WAL is disabled, the "txUseLog" parameter still lets you achieve reliable transactions. You may find it useful when grouping many operations, such as CREATE EDGE, into a batch.
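
The combination described above could be sketched as follows: WAL is switched off, while transactions, the transaction log and batched commits are switched on. The database path and the batch size are hypothetical placeholders, and the top-level "loader" key is assumed from the ETL configuration file layout.

```json
{
  "loader": {
    "orientdb": {
      "dbURL": "plocal:/temp/databases/example",
      "dbType": "graph",
      "wal": false,
      "tx": true,
      "txUseLog": true,
      "batchCommit": 1000
    }
  }
}
```

With these settings, each transaction would cover at most 1000 entries rather than the whole load, which keeps any single transaction from growing large in memory.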

Classes

The "classes" parameter defines an inner document that contains additional configuration variables.

| Parameter | Description | Type | Mandatory | Default value |
|---|---|---|---|---|
| "name" | Defines the class name. | string | yes | |
| "extends" | Defines the super-class name. | string | | |
| "clusters" | Defines the number of clusters to create under the class. | integer | | 1 |

NOTE: The "clusters" parameter was introduced in version 2.1.

Indexes

| Parameter | Description | Type | Mandatory | Default value |
|---|---|---|---|---|
| "name" | Defines the index name. | string | | |
| "class" | Defines the name of the class in which to create the index. | string | yes | |
| "type" | Defines the index type. | string | yes | |
| "fields" | Defines an array of fields to index. To specify the field type, use the syntax: <field>:<type>. | string | yes | |
| "metadata" | Defines additional index metadata. | string | | |

Examples

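Configuration to load data into the graph database dbpedia, stored in the directory /temp/databases and accessed through the PLocal protocol. "batchCommit" is set to 1000, so commits are grouped in batches of one thousand entries when transactions are in use. Note that "tx" and "wal" are both set to false in this example. It declares two classes, Person (extending V) and Customer (extending Person, with 8 clusters), and two indexes: a unique index on the string property URI in the base vertex class V, and a non-unique index on the string property town in Person.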

    "orientdb": {
      "dbURL": "plocal:/temp/databases/dbpedia",
      "dbUser": "importer",
      "dbPassword": "IMP",
      "dbAutoCreate": true,
      "tx": false,
      "batchCommit": 1000,
      "wal": false,
      "dbType": "graph",
      "classes": [
        { "name": "Person", "extends": "V" },
        { "name": "Customer", "extends": "Person", "clusters": 8 }
      ],
      "indexes": [
        { "class": "V", "fields": ["URI:string"], "type": "UNIQUE" },
        { "class": "Person", "fields": ["town:string"], "type": "NOTUNIQUE",
          "metadata": { "ignoreNullValues": false }
        }
      ]
    }
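
As a further sketch of the "indexes" parameter, the fragment below declares a single explicitly named index. The index name, the email property and the database path are hypothetical placeholders. The field type follows the "URI:string" style used in the example above, and the top-level "loader" key is assumed from the ETL configuration file layout.

```json
{
  "loader": {
    "orientdb": {
      "dbURL": "plocal:/temp/databases/example",
      "dbType": "graph",
      "classes": [
        { "name": "Person", "extends": "V" }
      ],
      "indexes": [
        {
          "name": "Person.email",
          "class": "Person",
          "type": "UNIQUE",
          "fields": ["email:string"]
        }
      ]
    }
  }
}
```

Declaring the class in "classes" alongside the index should let the loader create the class before the index, so the index would not depend on a pre-existing schema.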