Kettle Import

Whenever you want to import metadata from Kettle (Pentaho Data Integration) you can use this plugin.

Usage

You can use it through the File / Import from Kettle/PDI menu in the Hop GUI or using the hop-import script with the --type kettle option.

Functionality

In general the plugin copies Kettle files and non-Kettle files.

Kettle files

FileConversion

Transformations

Kettle transformation files (.ktr extension) are converted to Hop pipelines. During this process the XML metadata is converted to the appropriate Hop format.

Jobs

Kettle job files (.kjb extension) are converted to Hop workflows. During this process the XML metadata is converted to the appropriate Hop format.

kettle.properties

The Kettle properties file in your .kettle folder (typically found in the home directory or ${KETTLE_HOME}) often contains variables and values regarding your environment. These variables and values are converted into an environment configuration file if you specified the -c or —target-config-file option. When you create an environment in Hop you can simply add this file to it to make everything work. If the configuration file already exists it will be updated, not overwritten. The description of the newly imported variables is set to Imported from Kettle to indicate that it’s new. Values of existing variables are overwritten and the existing description is kept.

shared.xml

The shared.xml file in your .kettle folder (typically found in the home directory or ${KETTLE_HOME}) often contains connections which are shared across many transformations and jobs. These connections are imported as Relational Database Connection metadata stored in the target folder metadata/rdbms folder.

jdbc.properties

A jdbc.properties file sometimes contains relational database metadata which is converted into generic RDBMS connections metadata.

Non-Kettle files

Non-Kettle files like text files, JSON, XML, …​ they are all copied over by default, including the sub-folders. You can change this behavior with the options:

  • --skip-folders : don’t copy sub-folders from the input folder

  • --skip-hidden : Don’t copy hidden files and folders like .git, .gitignore and so on

  • --skip-existing : Only copy files which haven’t been copied yet