Quick Start

This section contains a quick start guide to help you get started with Apache InLong.

Overall architecture

[Figure: Apache InLong overall architecture]

The overall architecture of Apache InLong (incubating) is shown above. InLong is a one-stop integration framework for massive data. It provides automated, secure, distributed, and efficient data publishing and subscription capabilities, so you can easily build stream-based data applications.

InLong (应龙) is a divine beast in Chinese mythology that guides rivers into the sea. The name is a metaphor for the InLong system, which reports streams of data.

InLong was originally built at Tencent and has served online business for more than 8 years. It supports massive data reporting services (over 40 trillion pieces of data per day) in big data scenarios. The platform integrates 5 modules: data collection, aggregation, caching, sorting, and management. With this system, a business only needs to provide the data sources, the data service quality, the data landing clusters, and the data landing formats. Data is then pushed continuously from the source cluster to the target cluster, which meets the data reporting requirements of business big data scenarios.

Compile

  $ mvn clean install -DskipTests

(Optional) Compile using a Docker image:

  $ docker pull maven:3.6-openjdk-8
  $ docker run -v `pwd`:/inlong -w /inlong maven:3.6-openjdk-8 mvn clean install -DskipTests

After a successful compile, you can find the distribution file in tar.gz format under inlong-distribution/target. It includes the following files:

  1. inlong-agent
  2. inlong-dataproxy
  3. inlong-dataproxy-sdk
  4. inlong-manager-web
  5. inlong-sort
  6. inlong-tubemq-manager
  7. inlong-tubemq-server
  8. inlong-website
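To confirm that an unpacked distribution is complete, the list above can be checked with a small script. This is a minimal sketch, not part of InLong itself: the function name missing_modules is hypothetical, and it assumes that each module appears in the unpacked directory as an entry named exactly as in the list.

```shell
#!/bin/sh
# Modules that the distribution is expected to include, as listed above.
EXPECTED_MODULES="inlong-agent inlong-dataproxy inlong-dataproxy-sdk inlong-manager-web inlong-sort inlong-tubemq-manager inlong-tubemq-server inlong-website"

# Print every expected module that has no entry in directory $1.
# Assumption: each module is a file or directory named exactly like the
# module (the real archive layout may differ, e.g. with version suffixes).
missing_modules() {
    for module in $EXPECTED_MODULES; do
        if [ ! -e "$1/$module" ]; then
            echo "$module"
        fi
    done
}
```

Call it with the directory where you unpacked the tar.gz file, for example `missing_modules /path/to/unpacked`. Empty output means every listed module was found.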

Environment Requirements

  • ZooKeeper 3.5+
  • Hadoop 2.10.x and Hive 2.3.x
  • MySQL 5.7+
  • Flink 1.9.x
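The requirements above come in two forms: a minimum version ("3.5+") and a release series ("1.9.x"). The following sketch shows one way to test a version string against each form. The helper names version_ge and version_in_series are hypothetical, the version strings are supplied by the caller, and version_ge assumes a `sort` that supports `-V` (as in GNU coreutils).

```shell
#!/bin/sh
# Succeed when version $1 is greater than or equal to minimum $2.
# Assumption: `sort -V` (version sort) is available, as in GNU coreutils.
version_ge() {
    lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)
    [ "$lowest" = "$2" ]
}

# Succeed when version $1 belongs to release series $2,
# e.g. 1.9.3 belongs to series 1.9 but 1.10.0 does not.
version_in_series() {
    case "$1" in
        "$2"|"$2".*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, `version_ge 3.6.3 3.5` succeeds for a ZooKeeper 3.6.3 installation, and `version_in_series 1.9.3 1.9` succeeds for Flink 1.9.3.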

Deploy InLong TubeMQ Server

deploy InLong TubeMQ Server

Deploy InLong TubeMQ Manager

deploy InLong TubeMQ Manager

Deploy InLong Manager

deploy InLong Manager

Deploy InLong WebSite

deploy InLong WebSite

Deploy InLong Sort

deploy InLong Sort

Deploy InLong DataProxy

deploy InLong DataProxy

Deploy InLong DataProxy-SDK

deploy InLong DataProxy

Deploy InLong Agent

deploy InLong Agent

Business configuration

How to configure a new business

Data report verification

At this stage, you can collect data through the file agent and then check the specified Hive table to verify that the received data is consistent with the sent data.
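One way to make this comparison concrete is to export the rows of the Hive table to a text file and compare it with the file that the agent collected. The sketch below is illustrative only: the function name same_records is hypothetical, and it assumes that both files hold one record per line in the same textual format.

```shell
#!/bin/sh
# Succeed when files $1 (sent) and $2 (received) contain the same lines,
# ignoring order. Duplicated lines must occur equally often in both files.
# Assumption: the received file is a plain-text export of the Hive table,
# one record per line, formatted like the source file.
same_records() {
    sorted_sent=$(mktemp) || return 2
    sorted_received=$(mktemp) || { rm -f "$sorted_sent"; return 2; }
    sort "$1" > "$sorted_sent"
    sort "$2" > "$sorted_received"
    if cmp -s "$sorted_sent" "$sorted_received"; then
        result=0
    else
        result=1
    fi
    rm -f "$sorted_sent" "$sorted_received"
    return "$result"
}
```

For example, `same_records sent.txt received.txt` succeeds only when both files hold the same records. The file names here are placeholders.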