Version Adaptation

This document explains where manual modification is required when adapting Linkis to Apache, CDH, HDP and other distributions.

Enter the root directory of the project and run the following commands in sequence:

mvn -N  install
mvn clean install -Dmaven.test.skip=true

linkis-dist -> package -> linkis-dml.sql(db folder)

Change each engine label to the version you need. If the version you use matches the official default, you can skip this step.

For example:

If Spark is 3.0.0, this is SET @SPARK_LABEL="spark-3.0.0";
If Hive is 2.1.1-cdh6.3.2, first change the version to 2.1.1_cdh6.3.2 (at build time), so this is SET @HIVE_LABEL="hive-2.1.1_cdh6.3.2";

-- variable:
SET @SPARK_LABEL="spark-2.4.3";
SET @HIVE_LABEL="hive-2.3.3";
SET @PYTHON_LABEL="python-python2";
SET @PIPELINE_LABEL="pipeline-1";
SET @JDBC_LABEL="jdbc-4";
SET @PRESTO_LABEL="presto-0.234";
SET @IO_FILE_LABEL="io_file-1.0";
SET @OPENLOOKENG_LABEL="openlookeng-1.5.0";
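
The label switch described above can also be scripted. The following is a minimal sketch, assuming the `SET @NAME="value";` line format shown in the block above; the function name `set_label` is hypothetical and not part of Linkis, and the value is assumed to contain no `/` or `&` characters.

```shell
# set_label FILE VARIABLE VALUE
# Rewrites the line `SET @VARIABLE="...";` in FILE so that it carries VALUE.
# Hypothetical helper: FILE is expected to be your copy of linkis-dml.sql.
set_label() {
  file="$1"; var="$2"; value="$3"
  tmp="$(mktemp)" || return 1
  # Replace only the quoted value on the matching SET line; other lines pass through.
  sed "s/^SET @${var}=\"[^\"]*\";/SET @${var}=\"${value}\";/" "$file" > "$tmp" || return 1
  # Write back through redirection so the original file keeps its permissions.
  cat "$tmp" > "$file" && rm -f "$tmp"
}
```

For instance, `set_label linkis-dml.sql SPARK_LABEL spark-3.0.0` would change only the Spark line and leave the other labels untouched.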

engine	version
hadoop	2.7.2
hive	2.3.3
spark	2.4.3
flink	1.12.2

engine	version
hadoop	3.1.1
hive	3.1.2
spark	3.0.1
flink	1.13.2

For Linkis version < 1.3.2

<hadoop.version>3.1.1</hadoop.version>
<scala.version>2.12.10</scala.version>
<scala.binary.version>2.12</scala.binary.version>
<!-- hadoop-hdfs replace with hadoop-hdfs-client -->
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-hdfs-client</artifactId>
    <version>${hadoop.version}</version>
</dependency>

For Linkis version >= 1.3.2, we only need to set scala.version and scala.binary.version if necessary

<scala.version>2.12.10</scala.version>
<scala.binary.version>2.12</scala.binary.version>

This is because we can compile directly with the hadoop-3.3 or hadoop-2.7 profile. The hadoop-3.3 profile can be used for any Hadoop 3.x and defaults to Hadoop 3.3.1; the hadoop-2.7 profile can be used for any Hadoop 2.x and defaults to Hadoop 2.7.2. Any other Hadoop version can be specified with -Dhadoop.version=xxx.

mvn -N  install
mvn clean install -Phadoop-3.3 -Dmaven.test.skip=true
mvn clean install -Phadoop-3.3 -Dhadoop.version=3.1.1 -Dmaven.test.skip=true
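
The profile rule above can be written down as a small helper. This is only a sketch: the function name `hadoop_build_args` is hypothetical, and it encodes nothing beyond the two profiles and their default versions stated in the text.

```shell
# hadoop_build_args HADOOP_VERSION
# Prints the Maven arguments for the given Hadoop version (Linkis >= 1.3.2).
# Hypothetical helper; it only mirrors the profile rule described in the text.
hadoop_build_args() {
  case "$1" in
    3.3.1) echo "-Phadoop-3.3" ;;                       # default of the hadoop-3.3 profile
    3.*)   echo "-Phadoop-3.3 -Dhadoop.version=$1" ;;   # any other Hadoop 3.x
    2.7.2) echo "-Phadoop-2.7" ;;                       # default of the hadoop-2.7 profile
    2.*)   echo "-Phadoop-2.7 -Dhadoop.version=$1" ;;   # any other Hadoop 2.x
    *)     echo "unsupported Hadoop version: $1" >&2; return 1 ;;
  esac
}

# Print (rather than run) the resulting build command for Hadoop 3.1.1.
echo "mvn clean install $(hadoop_build_args 3.1.1) -Dmaven.test.skip=true"
```

The printed command matches the last build command shown above.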

For Linkis version < 1.3.2

<!-- Note the <version>${hadoop.version}</version> element here; adjust it if you encounter errors -->
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-hdfs-client</artifactId>
    <version>${hadoop.version}</version>
</dependency>

For Linkis version >= 1.3.2, the linkis-hadoop-common module does not need to be changed

<hive.version>3.1.2</hive.version>

For Linkis version < 1.3.2

<spark.version>3.0.1</spark.version>

For Linkis version >= 1.3.2

We can compile directly with the spark-3.2 or spark-2.4-hadoop-3.3 profile; if Spark is used with Hadoop 3, the hadoop-3.3 profile is also needed.
The default Spark 3.x version is Spark 3.2.1. When compiling with spark-3.2, the Scala version defaults to 2.12.15,
so we do not need to set the Scala version in the Linkis project pom file (mentioned in 5.1.1).
If Spark 2.x is used with Hadoop 3, the `spark-2.4-hadoop-3.3` profile must be activated for compatibility reasons.

mvn -N  install
mvn clean install -Pspark-3.2 -Phadoop-3.3 -Dmaven.test.skip=true
mvn clean install -Pspark-2.4-hadoop-3.3 -Phadoop-3.3 -Dmaven.test.skip=true
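
As a rough illustration of how the Spark and Hadoop profiles combine, the sketch below maps the major versions to the profile flags named above. The function name `spark_build_args` is hypothetical. The Spark 2.x with Hadoop 2.x case is an assumption: it is taken to be the default build, which needs no extra profile.

```shell
# spark_build_args SPARK_MAJOR HADOOP_MAJOR
# Prints the Maven profile flags for a Spark/Hadoop major-version pair.
# Hypothetical helper; the 2:2 case (no extra profile) is an assumption.
spark_build_args() {
  case "$1:$2" in
    3:3) echo "-Pspark-3.2 -Phadoop-3.3" ;;            # Spark 3.x on Hadoop 3
    3:2) echo "-Pspark-3.2" ;;                         # Spark 3.x on the default Hadoop 2
    2:3) echo "-Pspark-2.4-hadoop-3.3 -Phadoop-3.3" ;; # Spark 2.x needs the compatibility profile
    2:2) echo "" ;;                                    # assumed to be the default build
    *)   echo "unsupported combination: spark $1, hadoop $2" >&2; return 1 ;;
  esac
}
```

With this helper, `spark_build_args 2 3` yields the same flags as the last build command above.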

<flink.version>1.13.2</flink.version>

Because some classes changed between Flink 1.12.2 and 1.13.2, Flink needs to be adjusted and recompiled. Select Scala 2.12 when compiling Flink.

Temporary plan

Note that the following operations are all performed in the Flink source tree.

Because some classes changed between Flink 1.12.2 and 1.13.2, Flink itself needs to be adjusted and recompiled. Scala 2.12 is used here for compiling Flink (choose the Scala version that matches the one you actually use).

Reference command for compiling Flink: mvn clean install -DskipTests -P scala-2.12 -Dfast -T 4 -Dmaven.compile.fork=true

-- Note that the following classes are copied from version 1.12.2 to version 1.13.2
org.apache.flink.table.client.config.entries.DeploymentEntry
org.apache.flink.table.client.config.entries.ExecutionEntry
org.apache.flink.table.client.gateway.local.CollectBatchTableSink
org.apache.flink.table.client.gateway.local.CollectStreamTableSink
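
Copying the four classes by hand is error-prone, so it may help to script it. The sketch below assumes that the classes are `.java` files and that you pass the Java source roots (the directory that contains `org/`) of the relevant module in the 1.12.2 and 1.13.2 source trees; the function name `copy_flink_classes` and both arguments are hypothetical.

```shell
# Fully qualified names of the classes to carry over, as listed above.
FLINK_CLASSES="
org.apache.flink.table.client.config.entries.DeploymentEntry
org.apache.flink.table.client.config.entries.ExecutionEntry
org.apache.flink.table.client.gateway.local.CollectBatchTableSink
org.apache.flink.table.client.gateway.local.CollectStreamTableSink
"

# copy_flink_classes SRC_ROOT DST_ROOT
# SRC_ROOT: Java source root in the Flink 1.12.2 tree (assumption: supplied by you).
# DST_ROOT: matching Java source root in the Flink 1.13.2 tree.
copy_flink_classes() {
  src_root="$1"; dst_root="$2"
  for cls in $FLINK_CLASSES; do
    # Turn the class name into a relative file path, e.g. a.b.C -> a/b/C.java
    rel="$(printf '%s' "$cls" | tr '.' '/').java"
    mkdir -p "$(dirname "$dst_root/$rel")" || return 1
    cp "$src_root/$rel" "$dst_root/$rel" || return 1
  done
}
```

The function stops at the first class it cannot copy, so a wrong source root is noticed immediately.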

org.apache.linkis.manager.label.conf.LabelCommonConfig file adjustment

    public static final CommonVars<String> SPARK_ENGINE_VERSION =
            CommonVars.apply("wds.linkis.spark.engine.version", "3.0.1");
    public static final CommonVars<String> HIVE_ENGINE_VERSION =
            CommonVars.apply("wds.linkis.hive.engine.version", "3.1.2");

org.apache.linkis.governance.common.conf.GovernanceCommonConf file adjustment

  val SPARK_ENGINE_VERSION = CommonVars("wds.linkis.spark.engine.version", "3.0.1")
  val HIVE_ENGINE_VERSION = CommonVars("wds.linkis.hive.engine.version", "3.1.2")
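
Since the same default has to be changed in both files, a quick consistency check can catch a missed edit. This is a sketch under the assumption that the key and its default value sit on one line, as in the two snippets above; `default_of` and `same_default` are hypothetical helper names.

```shell
# default_of KEY FILE
# Prints the default value that FILE assigns to the CommonVars key KEY.
default_of() {
  # Escape the dots in the key so sed matches them literally.
  key_re="$(printf '%s' "$1" | sed 's/\./\\./g')"
  # Capture the second quoted string on the line that holds the key.
  sed -n "s/.*\"${key_re}\", *\"\([^\"]*\)\".*/\1/p" "$2" | head -n 1
}

# same_default KEY JAVA_FILE SCALA_FILE
# Succeeds only if both files define KEY with the same non-empty default.
same_default() {
  a="$(default_of "$1" "$2")"
  b="$(default_of "$1" "$3")"
  [ -n "$a" ] && [ "$a" = "$b" ]
}
```

Running `same_default wds.linkis.spark.engine.version` against the paths of your LabelCommonConfig and GovernanceCommonConf sources returns a non-zero status when the two defaults differ.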

engine	version
hadoop	3.1.1
hive	3.1.0
spark	2.3.2
json4s.version	3.2.11

For Linkis version < 1.3.2

<hadoop.version>3.1.1</hadoop.version>
<json4s.version>3.2.11</json4s.version>
<!-- hadoop-hdfs replace with hadoop-hdfs-client -->
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-hdfs-client</artifactId>
    <version>${hadoop.version}</version>
</dependency>

For Linkis version >= 1.3.2, we only need to set json4s.version if necessary

<json4s.version>3.2.11</json4s.version>

mvn -N  install
mvn clean install -Phadoop-3.3 -Dmaven.test.skip=true
mvn clean install -Phadoop-3.3 -Dhadoop.version=3.1.1 -Dmaven.test.skip=true

<hive.version>3.1.0</hive.version>

For Linkis version < 1.3.2

<spark.version>2.3.2</spark.version>

For Linkis version >= 1.3.2

We can compile directly with the spark-3.2 profile; if Spark is used with Hadoop 3, the hadoop-3.3 profile is also needed.
The default Spark 3.x version is Spark 3.2.1. When compiling with spark-3.2, the Scala version defaults to 2.12.15,
so we do not need to set the Scala version in the Linkis project pom file (mentioned in 5.1.1).
If Spark 2.x is used with Hadoop 3, the `spark-2.4-hadoop-3.3` profile must be activated for compatibility reasons.

mvn -N  install
mvn clean install -Pspark-3.2 -Phadoop-3.3 -Dmaven.test.skip=true
mvn clean install -Pspark-2.4-hadoop-3.3 -Phadoop-3.3 -Dmaven.test.skip=true

org.apache.linkis.manager.label.conf.LabelCommonConfig file adjustment

    public static final CommonVars<String> SPARK_ENGINE_VERSION =
            CommonVars.apply("wds.linkis.spark.engine.version", "2.3.2");
    public static final CommonVars<String> HIVE_ENGINE_VERSION =
            CommonVars.apply("wds.linkis.hive.engine.version", "3.1.0");

org.apache.linkis.governance.common.conf.GovernanceCommonConf file adjustment

  val SPARK_ENGINE_VERSION = CommonVars("wds.linkis.spark.engine.version", "2.3.2")
  val HIVE_ENGINE_VERSION = CommonVars("wds.linkis.hive.engine.version", "3.1.0")

<mirrors>
  <!-- mirror
   | Specifies a repository mirror site to use instead of a given repository. The repository that
   | this mirror serves has an ID that matches the mirrorOf element of this mirror. IDs are used
   | for inheritance and direct lookup purposes, and must be unique across the set of mirrors.
   |
  <mirror>
    <id>mirrorId</id>
    <mirrorOf>repositoryId</mirrorOf>
    <name>Human Readable Name for this Mirror.</name>
    <url>http://my.repository.com/repo/path</url>
  </mirror>
   -->
   <mirror>
       <id>nexus-aliyun</id>
       <mirrorOf>*,!cloudera</mirrorOf>
       <name>Nexus aliyun</name>
       <url>http://maven.aliyun.com/nexus/content/groups/public</url>
   </mirror>
   <mirror>
       <id>aliyunmaven</id>
       <mirrorOf>*,!cloudera</mirrorOf>
       <name>Alibaba Cloud Public Warehouse</name>
       <url>https://maven.aliyun.com/repository/public</url>
   </mirror>
   <mirror>
       <id>aliyunmaven-spring-plugin</id>
       <mirrorOf>*,!cloudera</mirrorOf>
       <name>spring-plugin</name>
       <url>https://maven.aliyun.com/repository/spring-plugin</url>
   </mirror>
  <mirror>
    <id>maven-default-http-blocker</id>
    <mirrorOf>external:http:*</mirrorOf>
    <name>Pseudo repository to mirror external repositories initially using HTTP.</name>
    <url>http://0.0.0.0/</url>
    <blocked>true</blocked>
  </mirror>
</mirrors>

    <repositories>
        <repository>
            <id>cloudera</id>
            <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
            <releases>
                <enabled>true</enabled>
            </releases>
        </repository>
        <!-- Add the Aliyun repository as a fallback in case an artifact is not found in cloudera -->
        <repository>
            <id>aliyun</id>
            <url>http://maven.aliyun.com/nexus/content/groups/public/</url>
            <releases>
                <enabled>true</enabled>
            </releases>
        </repository>
    </repositories>

engine	version
hadoop	2.6.0-cdh5.12.1
zookeeper	3.4.5-cdh5.12.1
hive	1.1.0-cdh5.12.1
spark	2.3.4
flink	1.12.4
python	python3

<hadoop.version>2.6.0-cdh5.12.1</hadoop.version>
<zookeeper.version>3.4.5-cdh5.12.1</zookeeper.version>
<scala.version>2.11.8</scala.version>

-- update
<hive.version>1.1.0-cdh5.12.1</hive.version>
-- add
<package.hive.version>1.1.0_cdh5.12.1</package.hive.version>

Update the distribution.xml file under the assembly directory

<outputDirectory>/dist/v${package.hive.version}/lib</outputDirectory>
<outputDirectory>dist/v${package.hive.version}/conf</outputDirectory>
<outputDirectory>plugin/${package.hive.version}</outputDirectory>

Update the CustomerDelimitedJSONSerDe file

   /* The Hive version is too low for these cases, so they need to be commented out
   case INTERVAL_YEAR_MONTH:
       {
           wc = ((HiveIntervalYearMonthObjectInspector) oi).getPrimitiveWritableObject(o);
           binaryData = Base64.encodeBase64(String.valueOf(wc).getBytes());
           break;
       }
   case INTERVAL_DAY_TIME:
       {
           wc = ((HiveIntervalDayTimeObjectInspector) oi).getPrimitiveWritableObject(o);
           binaryData = Base64.encodeBase64(String.valueOf(wc).getBytes());
           break;
       }
   */

<flink.version>1.12.4</flink.version>

<spark.version>2.3.4</spark.version>

<python.version>python3</python.version>

org.apache.linkis.manager.label.conf.LabelCommonConfig file adjustment

    public static final CommonVars<String> SPARK_ENGINE_VERSION =
            CommonVars.apply("wds.linkis.spark.engine.version", "2.3.4");
    public static final CommonVars<String> HIVE_ENGINE_VERSION =
            CommonVars.apply("wds.linkis.hive.engine.version", "1.1.0");
    public static final CommonVars<String> PYTHON_ENGINE_VERSION =
            CommonVars.apply("wds.linkis.python.engine.version", "python3");

org.apache.linkis.governance.common.conf.GovernanceCommonConf file adjustment

  val SPARK_ENGINE_VERSION = CommonVars("wds.linkis.spark.engine.version", "2.3.4")
  val HIVE_ENGINE_VERSION = CommonVars("wds.linkis.hive.engine.version", "1.1.0")
  val PYTHON_ENGINE_VERSION = CommonVars("wds.linkis.python.engine.version", "python3")

engine	version
hadoop	3.0.0-cdh6.3.2
hive	2.1.1-cdh6.3.2
spark	3.0.0

<hadoop.version>3.0.0-cdh6.3.2</hadoop.version> 
<scala.version>2.12.10</scala.version>

<!-- hadoop-hdfs replace with hadoop-hdfs-client -->
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-hdfs-client</artifactId>
</dependency>

-- update
<hive.version>2.1.1-cdh6.3.2</hive.version>
-- add
<package.hive.version>2.1.1_cdh6.3.2</package.hive.version>

Update the distribution.xml file under the assembly directory

<outputDirectory>/dist/v${package.hive.version}/lib</outputDirectory>
<outputDirectory>dist/v${package.hive.version}/conf</outputDirectory>
<outputDirectory>plugin/${package.hive.version}</outputDirectory>

<spark.version>3.0.0</spark.version>

org.apache.linkis.manager.label.conf.LabelCommonConfig file adjustment

    public static final CommonVars<String> SPARK_ENGINE_VERSION =
            CommonVars.apply("wds.linkis.spark.engine.version", "3.0.0");
    public static final CommonVars<String> HIVE_ENGINE_VERSION =
            CommonVars.apply("wds.linkis.hive.engine.version", "2.1.1_cdh6.3.2");

org.apache.linkis.governance.common.conf.GovernanceCommonConf file adjustment

  val SPARK_ENGINE_VERSION = CommonVars("wds.linkis.spark.engine.version", "3.0.0")
  val HIVE_ENGINE_VERSION = CommonVars("wds.linkis.hive.engine.version", "2.1.1_cdh6.3.2")

If a class or a method in a class is missing, find the corresponding dependency and try switching to a version that contains that package or class.
If the engine version contains -, replace it with _: add <package.(engine name).version> to specify the replaced version, and use ${package.(engine name).version} in the corresponding engine's distribution file in place of the original.
If downloading the guava jars through the Alibaba Cloud mirror sometimes fails with a 403 error, you can switch to Huawei, Tencent or other mirror repositories.
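
The replacement rule for versions that contain - is mechanical, so it can be expressed as a one-line transformation. The sketch below is only an illustration of that rule; the function names `package_version` and `engine_label` are hypothetical.

```shell
# package_version ENGINE_VERSION
# Prints the version with every '-' replaced by '_', as used for
# <package.(engine name).version>.
package_version() {
  printf '%s\n' "$1" | sed 's/-/_/g'
}

# engine_label ENGINE_NAME ENGINE_VERSION
# Prints the matching engine label, e.g. for the SET @..._LABEL variables.
engine_label() {
  printf '%s-%s\n' "$1" "$(package_version "$2")"
}

package_version 2.1.1-cdh6.3.2      # prints 2.1.1_cdh6.3.2
engine_label hive 2.1.1-cdh6.3.2    # prints hive-2.1.1_cdh6.3.2
```

Versions without a - pass through unchanged, so the same helper can be applied to every engine.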