Introduction

Linkis builds a layer of computation middleware between upper applications and underlying engines. By using the standard interfaces provided by Linkis, such as REST, WebSocket (WS) and JDBC, upper applications can easily access underlying engines such as MySQL, Spark, Hive, Presto and Flink, and at the same time share user resources such as unified variables, scripts, UDFs, functions and resource files.

As a computation middleware, Linkis provides powerful connectivity, reuse, orchestration, expansion, and governance capabilities. By decoupling the application layer from the engine layer, it simplifies the complex network of call relationships, which reduces overall complexity and saves development and maintenance costs.
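The decoupling described above can be pictured with a toy sketch. The snippet below is purely illustrative and is not the Linkis API: the class `ComputationMiddleware`, its `register` and `submit` methods, and the two stand-in engine functions are hypothetical names chosen for this example. It only shows the idea that applications talk to a single entry point while engines are plugged in behind it.

```python
from typing import Callable, Dict

# Hypothetical adapter type: takes a code string, returns a result string.
EngineAdapter = Callable[[str], str]


class ComputationMiddleware:
    """Toy stand-in for a middleware layer (not the real Linkis API)."""

    def __init__(self) -> None:
        self._engines: Dict[str, EngineAdapter] = {}

    def register(self, engine_type: str, adapter: EngineAdapter) -> None:
        # Engines are plugged in behind the middleware by name.
        self._engines[engine_type] = adapter

    def submit(self, engine_type: str, code: str) -> str:
        # Applications call one entry point and never touch engines directly.
        if engine_type not in self._engines:
            raise KeyError(f"no engine registered for type: {engine_type}")
        return self._engines[engine_type](code)


def fake_spark(code: str) -> str:
    # Stand-in for a real engine; it only echoes the code it was given.
    return f"[spark] ran: {code}"


def fake_hive(code: str) -> str:
    return f"[hive] ran: {code}"


middleware = ComputationMiddleware()
middleware.register("spark", fake_spark)
middleware.register("hive", fake_hive)

print(middleware.submit("spark", "select 1"))
print(middleware.submit("hive", "show tables"))
```

Under this arrangement, adding a new engine means registering one more adapter, and the calling applications do not change.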

Since its first release in 2019, Linkis has accumulated more than 700 trial companies and 1000+ sandbox trial users across diverse industries, including finance, banking, telecommunications, manufacturing and internet companies. Many companies already use Linkis as a unified entrance to the underlying computation and storage engines of their big data platforms.


Features

  • Support for diverse underlying computation and storage engines.
    Currently supported computation/storage engines: Spark, Hive, Python, Presto, ElasticSearch, MLSQL, TiSpark, JDBC, Shell, etc.;
    Computation/storage engines to be supported: Flink (supported in version >=1.0.2), Impala, etc.;
    Supported scripting languages: SparkSQL, HiveQL, Python, Shell, Pyspark, R, Scala, JDBC, etc.

  • Powerful task/request governance capabilities. With services such as Orchestrator, Label Manager and a customized Spring Cloud Gateway, Linkis provides multi-level, label-based, cross-cluster/cross-IDC fine-grained routing, load balancing, multi-tenancy, traffic control, resource control, and orchestration strategies such as dual-active and active-standby.

  • Support for full-stack computation/storage engines. As a computation middleware, Linkis receives, executes and manages tasks and requests for various computation and storage engines, including batch tasks, interactive query tasks, real-time streaming tasks and storage tasks;

  • Resource management capabilities. ResourceManager not only manages resources for Yarn and Linkis EngineManager, as in Linkis 0.X, but also provides label-based, multi-level resource allocation and recycling, giving it powerful resource management capabilities across multiple Yarn clusters and multiple computation resource types;

  • Unified Context Service. Generates a Context ID for each task/request, and associates and manages user and system resource files (JAR, ZIP, Properties, etc.), result sets, parameter variables, functions, etc., across users, systems and computing engines. Set in one place, automatically referenced everywhere;

  • Unified materials. System and user-level unified material management, which can be shared and transferred across users and systems.
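To make the idea of label-based routing from the list above more concrete, here is a minimal sketch. It is an assumption-laden illustration and not how Linkis implements its Label Manager: the label keys (`engineType`, `cluster`, `tenant`), the instance names, and the functions `match_score` and `route` are all hypothetical. The rule used here is simple: an instance is eligible if none of its labels conflict with the request, and the eligible instance that matches the most labels wins.

```python
from typing import Dict, Optional

Labels = Dict[str, str]


def match_score(request: Labels, instance: Labels) -> Optional[int]:
    """Return the number of matching labels, or None if any label conflicts."""
    score = 0
    for key, value in request.items():
        if key in instance:
            if instance[key] != value:
                return None  # conflicting value: the instance is not eligible
            score += 1
    return score


def route(request: Labels, instances: Dict[str, Labels]) -> Optional[str]:
    """Pick the eligible instance that matches the most request labels."""
    best_name: Optional[str] = None
    best_score = -1
    for name in sorted(instances):  # sorted for deterministic tie-breaking
        score = match_score(request, instances[name])
        if score is not None and score > best_score:
            best_name, best_score = name, score
    return best_name


# Hypothetical instances and labels, invented for this example.
instances = {
    "engine-a": {"engineType": "spark", "cluster": "idc1"},
    "engine-b": {"engineType": "spark", "cluster": "idc2", "tenant": "finance"},
    "engine-c": {"engineType": "hive", "cluster": "idc1"},
}

print(route({"engineType": "spark", "tenant": "finance"}, instances))
print(route({"engineType": "hive"}, instances))
print(route({"engineType": "flink"}, instances))
```

A request that no instance can serve returns `None` here; a real system would more likely queue the request or start a new engine, which this sketch does not attempt.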

Supported engine types

| Engine | Supported Version | Linkis 0.X version requirement | Linkis 1.X version requirement | Description |
|---|---|---|---|---|
| Flink | 1.12.2 | >=dev-0.12.0, PR #703 not merged yet. | >=1.0.2 | Flink EngineConn. Supports FlinkSQL code, and also supports Flink Jar to Linkis Manager to start a new Yarn application. |
| Impala | >=3.2.0, CDH >=6.3.0 | >=dev-0.12.0, PR #703 not merged yet. | ongoing | Impala EngineConn. Supports Impala SQL. |
| Presto | >= 0.180 | >=0.11.0 | ongoing | Presto EngineConn. Supports Presto SQL. |
| ElasticSearch | >=6.0 | >=0.11.0 | ongoing | ElasticSearch EngineConn. Supports SQL and DSL code. |
| Shell | Bash >=2.0 | >=0.9.3 | >=1.0.0_rc1 | Shell EngineConn. Supports shell code. |
| MLSQL | >=1.1.0 | >=0.9.1 | ongoing | MLSQL EngineConn. Supports MLSQL code. |
| JDBC | MySQL >=5.0, Hive >=1.2.1 | >=0.9.0 | >=1.0.0_rc1 | JDBC EngineConn. Supports MySQL and HiveQL code. |
| Spark | Apache 2.0.0~2.4.7, CDH >=5.4.0 | >=0.5.0 | >=1.0.0_rc1 | Spark EngineConn. Supports SQL, Scala, Pyspark and R code. |
| Hive | Apache >=1.0.0, CDH >=5.4.0 | >=0.5.0 | >=1.0.0_rc1 | Hive EngineConn. Supports HiveQL code. |
| Hadoop | Apache >=2.6.0, CDH >=5.4.0 | >=0.5.0 | ongoing | Hadoop EngineConn. Supports Hadoop MR/YARN application. |
| Python | >=2.6 | >=0.5.0 | >=1.0.0_rc1 | Python EngineConn. Supports python code. |
| TiSpark | 1.1 | >=0.5.0 | ongoing | TiSpark EngineConn. Support querying TiDB data by SparkSQL. |

Download

Please go to the Linkis releases page to download a compiled distribution or a source code package of Linkis.

Compile and deploy

Please follow the Compile Guide to compile Linkis from source code.
Please refer to Deployment_Documents to deploy Linkis.

Examples and Guidance

You can find examples and guidance for how to use and manage Linkis in User_Manual, Engine_Usage_Documents and API_Documents.

Documentation

The documentation of Linkis is in Linkis-WebSite.

Architecture

Linkis services can be divided into three categories: computation governance services, public enhancement services and microservice governance services.

  • The computation governance services support the 3 major stages of processing a task/request: submission -> preparation -> execution;
  • The public enhancement services include the material library service, context service and data source service;
  • The microservice governance services include Spring Cloud Gateway, Eureka and Open Feign.
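The three processing stages named in the list above can be sketched as a fixed, ordered pipeline. This is a simplified illustration only: the `Stage` enum, the `Task` dataclass and the `process` function are hypothetical names, and the sketch records which stages a task passes through without doing any of the work a real service would perform at each one.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Stage(Enum):
    SUBMISSION = "submission"
    PREPARATION = "preparation"
    EXECUTION = "execution"


# Order in which stages are visited, following the text above.
STAGE_ORDER = [Stage.SUBMISSION, Stage.PREPARATION, Stage.EXECUTION]


@dataclass
class Task:
    code: str
    history: List[str] = field(default_factory=list)


def process(task: Task) -> Task:
    """Walk a task through each stage in order, recording the path taken."""
    for stage in STAGE_ORDER:
        # A real system would do work here; this sketch only logs the stage.
        task.history.append(stage.value)
    return task


task = process(Task(code="select 1"))
print(" -> ".join(task.history))
```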

Below is the Linkis architecture diagram. You can find more detailed architecture docs in Architecture.

architecture

On top of Linkis, the computation middleware, we have built many applications and tools in the big data platform suite WeDataSphere. Below are the currently available open-source projects.

wedatasphere_stack_Linkis

More projects are upcoming; please stay tuned.

Contributing

Contributions are always welcome. We need more contributors to build Linkis together, whether through code, documentation, or other support that helps the community.
For code and documentation contributions, please follow the contribution guide.

Contact Us

If you have any questions or suggestions, please submit an issue.
You can scan the QR code below to join our WeChat and QQ groups for a more immediate response.

introduction05

Meetup videos on Bilibili.

Who is Using Linkis

We opened an issue for users to give feedback and record who is using Linkis.
Since its first release in 2019, Linkis has accumulated more than 700 trial companies and 1000+ sandbox trial users across diverse industries, including finance, banking, telecommunications, manufacturing and internet companies.