SQLFlow Command-line Support Model Zoo Commands
Background
As described in the model zoo design doc, developers can write their own models and publish them to the Model Zoo, a service dedicated to managing model definitions and models. The Model Zoo follows a client-server design: multiple developers can access the same Model Zoo server through their own clients. sqlflow is a command-line tool currently used to access the SQLFlow server. We plan to extend this tool so that it also serves as the client of the Model Zoo server.
Overall Design
From the user's perspective, the commands fall into two categories: accessing the SQLFlow server and operating the Model Zoo. We can therefore use a classic sub-command format like the one below, written in docopt syntax:
SQLFlow Command-line Tool.
Usage:
sqlflow [options] run [-d <data_source> -e <program> -f <file>]
sqlflow [options] release repo [--force] <model_dir> <name_version>
sqlflow [options] release model [--force] <model> <version>
sqlflow [options] delete repo <name_version>
sqlflow [options] delete model <model> <version>
sqlflow [options] list repo
sqlflow [options] list model
Options:
-v, --version print the version and exit
-h, --help print this screen
-c, --cert-file=<file> cert file to connect SQLFlow or Model Zoo server
--env-file=<file> config file in KEY=VAL format
-s, --sqlflow-server=<addr> SQLFlow server address and port
-m, --model-zoo-server=<addr> Model Zoo server address and port
-u, --user=<user> Model Zoo user account
-p, --password=<password> Model Zoo user password
Run Options:
-d, --data-source=<data_source> data source to operate
-e, --execute=<program> execute given program
-f, --file=<file> execute program in file
Release Options:
--force force overwrite existing model
Implementation
Command-line Parsing
Because the command-line interface is described in docopt syntax, it can be parsed by an existing parser such as docopt.go. After parsing, we can easily get all the sub-commands and their parameters from the command line.
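A minimal sketch of what could happen after parsing, assuming the parser hands back a map from sub-command, flag, and argument names to values, as docopt parsers typically do. The parsedArgs type, the dispatch function, and the sample values are hypothetical and only illustrate routing on the sub-commands from the usage text; the sketch uses only the standard library and does not call docopt.go itself.

```go
package main

import (
	"errors"
	"fmt"
)

// parsedArgs mimics the map a docopt parser typically returns: sub-commands
// and flags map to bool, positional arguments and option values to string.
// (Hypothetical type, not part of docopt.go.)
type parsedArgs map[string]interface{}

// isSet reports whether a sub-command or flag is present and true.
func (a parsedArgs) isSet(key string) bool {
	v, ok := a[key].(bool)
	return ok && v
}

// str returns a positional argument or option value, or "" when absent.
func (a parsedArgs) str(key string) string {
	s, _ := a[key].(string)
	return s
}

// dispatch turns parsed arguments into a description of the action to run.
// A real client would call the SQLFlow or Model Zoo server here instead.
func dispatch(a parsedArgs) (string, error) {
	switch {
	case a.isSet("run"):
		return "run program on " + a.str("--data-source"), nil
	case a.isSet("release") && a.isSet("repo"):
		return fmt.Sprintf("release repo %s from %s (force=%t)",
			a.str("<name_version>"), a.str("<model_dir>"), a.isSet("--force")), nil
	case a.isSet("release") && a.isSet("model"):
		return fmt.Sprintf("release model %s %s (force=%t)",
			a.str("<model>"), a.str("<version>"), a.isSet("--force")), nil
	case a.isSet("delete") && a.isSet("repo"):
		return "delete repo " + a.str("<name_version>"), nil
	case a.isSet("delete") && a.isSet("model"):
		return fmt.Sprintf("delete model %s %s", a.str("<model>"), a.str("<version>")), nil
	case a.isSet("list") && a.isSet("repo"):
		return "list repos", nil
	case a.isSet("list") && a.isSet("model"):
		return "list models", nil
	}
	return "", errors.New("unknown sub-command")
}

func main() {
	// Hypothetical parse result for:
	//   sqlflow release repo --force ./my_models my_repo:v0.1
	args := parsedArgs{
		"release":        true,
		"repo":           true,
		"--force":        true,
		"<model_dir>":    "./my_models",
		"<name_version>": "my_repo:v0.1",
	}
	action, err := dispatch(args)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(action)
}
```

Keeping the dispatch separate from the parser means the routing logic can be tested with hand-built maps, without going through the command line.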
Model Uploading
For model definitions, we can simply tar the whole directory and upload it through the gRPC interface. For models, code already exists to export a model from the database to the file system, so we can upload the exported files afterwards.
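The "tar the whole directory" step could look roughly like the sketch below, which uses Go's standard archive/tar package. The function names tarDir and demo and the file name model.py are hypothetical, only regular files are archived, and the gRPC upload itself is left out because its interface is not specified here; the archive is written to any io.Writer, which could be a buffer or a streaming request.

```go
package main

import (
	"archive/tar"
	"bytes"
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// tarDir writes every regular file under dir into a tar stream on w.
// Entry names are relative to dir and use forward slashes.
func tarDir(dir string, w io.Writer) error {
	tw := tar.NewWriter(w)
	err := filepath.Walk(dir, func(path string, info os.FileInfo, err error) error {
		if err != nil {
			return err
		}
		if !info.Mode().IsRegular() {
			return nil // skip directories, symlinks, and special files
		}
		rel, err := filepath.Rel(dir, path)
		if err != nil {
			return err
		}
		hdr, err := tar.FileInfoHeader(info, "")
		if err != nil {
			return err
		}
		hdr.Name = filepath.ToSlash(rel)
		if err := tw.WriteHeader(hdr); err != nil {
			return err
		}
		f, err := os.Open(path)
		if err != nil {
			return err
		}
		defer f.Close()
		_, err = io.Copy(tw, f)
		return err
	})
	if err != nil {
		return err
	}
	return tw.Close() // flushes the tar footer
}

// demo archives a throwaway directory and returns the entry names read back
// from the archive, to show the round trip.
func demo() ([]string, error) {
	dir, err := os.MkdirTemp("", "model_def")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)
	src := []byte("# hypothetical model definition\n")
	if err := os.WriteFile(filepath.Join(dir, "model.py"), src, 0o644); err != nil {
		return nil, err
	}
	var buf bytes.Buffer
	if err := tarDir(dir, &buf); err != nil {
		return nil, err
	}
	var names []string
	tr := tar.NewReader(&buf)
	for {
		hdr, err := tr.Next()
		if err == io.EOF {
			return names, nil
		}
		if err != nil {
			return nil, err
		}
		names = append(names, hdr.Name)
	}
}

func main() {
	names, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(names)
}
```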
Model and Repo Listing
The SQLFlow command-line tool supports listing released repos and models. By default, users can only list the repos and models they released themselves, so the listing requests must carry authentication information. We added the --user and --password options to handle this. Because there may be many models and repos, the implementation fetches the list in multiple requests, each returning only a small number of results.
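The repeated pulling described above can be sketched as a simple offset-based loop. This is only one possible paging scheme: the fetchPage signature, the offset and size parameters, and the in-memory fakeServer are assumptions made for illustration, and the actual Model Zoo listing request may be shaped differently.

```go
package main

import (
	"errors"
	"fmt"
)

// fetchPage is a hypothetical stand-in for one listing request to the Model
// Zoo server. It returns at most size names, starting at offset.
type fetchPage func(offset, size int) ([]string, error)

// listAll calls fetch repeatedly with a small page size and stops when a
// page comes back shorter than requested, which signals the end of the list.
func listAll(fetch fetchPage, size int) ([]string, error) {
	if size <= 0 {
		return nil, errors.New("page size must be positive")
	}
	var all []string
	for offset := 0; ; offset += size {
		page, err := fetch(offset, size)
		if err != nil {
			return nil, err
		}
		all = append(all, page...)
		if len(page) < size {
			return all, nil
		}
	}
}

// fakeServer serves pages from an in-memory slice so the loop can be run
// without a real server.
func fakeServer(items []string) fetchPage {
	return func(offset, size int) ([]string, error) {
		if offset >= len(items) {
			return nil, nil
		}
		end := offset + size
		if end > len(items) {
			end = len(items)
		}
		return items[offset:end], nil
	}
}

func main() {
	models := []string{"a", "b", "c", "d", "e"}
	all, err := listAll(fakeServer(models), 2)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(all)
}
```

When the total is an exact multiple of the page size, this scheme costs one extra request that returns an empty page; a server-provided "has more" flag or cursor would avoid that.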
Action Plan
We will first implement the core logic of the command-line tool, namely uploading and deleting objects in the Model Zoo. The SQLFlow command-line tool may need an authentication process for further operations, which may be implemented with a username and password or with a certificate file. In addition, some command-line parameters could be written into an env file. We postpone the implementation of these features.
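Although the env file support is postponed, reading the KEY=VAL format mentioned for --env-file could be as small as the sketch below. The parseEnv function and the key names are hypothetical, and skipping blank lines and lines starting with # is an assumption rather than part of this design.

```go
package main

import (
	"fmt"
	"strings"
)

// parseEnv parses text in KEY=VAL format, one pair per line.
// Assumption: blank lines and lines starting with '#' are ignored.
// Only the first '=' splits the line, so values may contain '='.
func parseEnv(content string) (map[string]string, error) {
	env := make(map[string]string)
	for i, line := range strings.Split(content, "\n") {
		line = strings.TrimSpace(line)
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		eq := strings.IndexByte(line, '=')
		if eq <= 0 {
			return nil, fmt.Errorf("line %d: expected KEY=VAL, got %q", i+1, line)
		}
		key := strings.TrimSpace(line[:eq])
		env[key] = strings.TrimSpace(line[eq+1:])
	}
	return env, nil
}

func main() {
	// Hypothetical key names; the design does not fix them.
	content := "# client settings\nMODEL_ZOO_SERVER=localhost:50055\nMODEL_ZOO_USER=alice\n"
	env, err := parseEnv(content)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(env["MODEL_ZOO_SERVER"], env["MODEL_ZOO_USER"])
}
```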