Plugin Specification
Pipcook uses plugins to carry out the tasks of each stage of the machine learning lifecycle, which keeps the framework itself simple, stable, and efficient.
At the same time, Pipcook defines a set of plugin specifications so that anyone can develop plugins, which makes Pipcook extensible. In theory, plugins can be written for any kind of machine learning task.
Plugin Package
A Pipcook plugin is distributed as an NPM package. On top of the standard NPM package.json, Pipcook defines additional fields that describe the plugin.
{
  "name": "my-own-pipcook-plugin",
  "version": "1.0.0",
  "description": "my own pipcook plugin",
  "dependencies": {
    "@pipcook/pipcook-core": "^0.5.0"
  },
  "pipcook": {
    "category": "dataCollect",
    "datatype": "image"
  },
  "conda": {
    "python": "3.7",
    "dependencies": {
      "tensorflow": "2.2.0"
    }
  }
}
The package.json example above illustrates a few requirements:
- The plugin package must be written in TypeScript and compiled to JavaScript before publishing.
- @pipcook/pipcook-core must be added to dependencies; it contains the types needed for creating a plugin handler.
- A root field pipcook is required:
  - pipcook.category describes the category to which the plugin belongs; all categories are listed in the Plugin Category section below.
  - pipcook.datatype describes the type of data to be processed; the currently supported values are common, image, and text.
- An optional field conda configures Python-related dependencies:
  - conda.python specifies the Python version and must be 3.7.
  - conda.dependencies lists all Python dependencies that are installed when the plugin is initialized. It supports the following kinds of version strings:
    - x.y.z, a specific version on PyPI.
    - *, the same as above but with the latest version.
    - git+https://github.com/foobar/project@master, which installs from a GitHub repository and follows pip-install(1).
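The requirements above can be checked mechanically before publishing. The following is a minimal sketch, not part of Pipcook itself: the PluginPackage interface and the functions validatePluginPackage and isSupportedCondaVersion are hypothetical names introduced here for illustration, and the version-string patterns are assumptions based only on the three forms listed above.

```typescript
// Hypothetical shape of the package.json fields described in this section.
// These types are written for illustration; they are not imported from
// @pipcook/pipcook-core.
interface PluginPackage {
  name: string;
  version: string;
  dependencies?: Record<string, string>;
  pipcook?: { category: string; datatype: string };
  conda?: { python?: string; dependencies?: Record<string, string> };
}

// Category and datatype values as listed in this document.
const CATEGORIES = [
  'dataCollect', 'dataAccess', 'dataProcess',
  'modelLoad', 'modelDefine', 'modelTrain', 'modelEvaluate'
];
const DATA_TYPES = ['common', 'image', 'text'];

// Accepts "x.y.z", "*", or a "git+https://..." URL (assumed patterns).
function isSupportedCondaVersion(version: string): boolean {
  return version === '*'
    || /^\d+\.\d+\.\d+$/.test(version)
    || /^git\+https:\/\//.test(version);
}

// Returns a list of human-readable problems; an empty list means the
// package passes every check in this sketch.
function validatePluginPackage(pkg: PluginPackage): string[] {
  const errors: string[] = [];
  if (!pkg.dependencies || !('@pipcook/pipcook-core' in pkg.dependencies)) {
    errors.push('dependencies must include @pipcook/pipcook-core');
  }
  if (!pkg.pipcook) {
    errors.push('missing root field "pipcook"');
  } else {
    if (CATEGORIES.indexOf(pkg.pipcook.category) === -1) {
      errors.push('unknown pipcook.category: ' + pkg.pipcook.category);
    }
    if (DATA_TYPES.indexOf(pkg.pipcook.datatype) === -1) {
      errors.push('unknown pipcook.datatype: ' + pkg.pipcook.datatype);
    }
  }
  if (pkg.conda) {
    // Assumption: conda.python is only checked when it is present.
    if (pkg.conda.python !== undefined && pkg.conda.python !== '3.7') {
      errors.push('conda.python must be "3.7"');
    }
    const deps = pkg.conda.dependencies || {};
    for (const name of Object.keys(deps)) {
      if (!isSupportedCondaVersion(deps[name])) {
        errors.push('unsupported version string for ' + name + ': ' + deps[name]);
      }
    }
  }
  return errors;
}
```

Passing the parsed contents of the example package.json to validatePluginPackage should yield an empty list, while a package that omits the pipcook field or pins Python to another version would get one message per problem.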
Plugin Category
We have defined the following plugin categories for the machine learning lifecycle.
- dataCollect(args: ArgsType): Promise<void> downloads data from a data source and stores it in the corresponding unified dataset.
- dataAccess(args: ArgsType): Promise<UniDataset> gets the dataset ready in a loader, in a form compatible with the later model.
- dataProcess(sample: Sample, md: Metadata, args: ArgsType): Promise<void> processes the data row by row.
- modelLoad(data: UniDataset, args: ArgsType): Promise<UniModel> loads the model into the pipeline.
- modelDefine(data: UniDataset, args: ModelDefineArgsType): Promise<UniModel> defines the model.
- modelTrain(data: UniDataset, model: UniModel, args: ModelTrainArgsType): Promise<UniModel> outputs the trained model and saves it to the configured location.
- modelEvaluate(data: UniDataset, model: UniModel): Promise<EvaluateResult> calls the corresponding evaluators to report how the trained model performs.
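To make the handler signatures more concrete, here is a rough sketch of what a dataProcess handler could look like. The Sample, Metadata, and ArgsType interfaces below are hypothetical local stand-ins whose fields are assumptions; a real plugin would take these types from @pipcook/pipcook-core. The scale argument and the scaleSample name are likewise invented for illustration, and how a handler is exported from the package is not covered by this section.

```typescript
// Hypothetical stand-ins for the types a real plugin would import from
// @pipcook/pipcook-core. Their fields are assumptions for this sketch.
interface Sample { data: number[]; label?: unknown; }
interface Metadata { [key: string]: unknown; }
interface ArgsType { [key: string]: unknown; }

// Mirrors the dataProcess signature listed above.
type DataProcessType =
  (sample: Sample, md: Metadata, args: ArgsType) => Promise<void>;

// Processes one row: multiplies every value of the sample in place.
// `scale` is a hypothetical plugin argument and defaults to 1.
const scaleSample: DataProcessType = async (sample, _md, args) => {
  const scale = typeof args.scale === 'number' ? args.scale : 1;
  sample.data = sample.data.map((value) => value * scale);
};
```

Because the handler returns Promise<void>, it communicates its result by mutating the sample it receives rather than by returning a new one, which is consistent with the row-by-row description of dataProcess.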
Developing
See the contributing documentation to learn how to develop a new plugin.