Quick Start with Apache Doris & Paimon

Usage Notes

  1. When data is stored on HDFS, place core-site.xml, hdfs-site.xml, and hive-site.xml in the conf directories of both FE and BE. Doris reads the Hadoop configuration files in the conf directory first, and then falls back to the configuration files referenced by the HADOOP_CONF_DIR environment variable.
  2. The currently supported Paimon version is 0.8.

Creating a Catalog

A Paimon Catalog can currently be created with two types of Metastore:

  • filesystem (default): stores both metadata and data on the filesystem.
  • hive metastore: additionally stores metadata in the Hive Metastore, so users can access these tables directly from Hive.

Creating a Catalog Based on FileSystem

HDFS

```sql
CREATE CATALOG `paimon_hdfs` PROPERTIES (
    "type" = "paimon",
    "warehouse" = "hdfs://HDFS8000871/user/paimon",
    "dfs.nameservices" = "HDFS8000871",
    "dfs.ha.namenodes.HDFS8000871" = "nn1,nn2",
    "dfs.namenode.rpc-address.HDFS8000871.nn1" = "172.21.0.1:4007",
    "dfs.namenode.rpc-address.HDFS8000871.nn2" = "172.21.0.2:4007",
    "dfs.client.failover.proxy.provider.HDFS8000871" = "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider",
    "hadoop.username" = "hadoop"
);

CREATE CATALOG `paimon_kerberos` PROPERTIES (
    "type" = "paimon",
    "warehouse" = "hdfs://HDFS8000871/user/paimon",
    "dfs.nameservices" = "HDFS8000871",
    "dfs.ha.namenodes.HDFS8000871" = "nn1,nn2",
    "dfs.namenode.rpc-address.HDFS8000871.nn1" = "172.21.0.1:4007",
    "dfs.namenode.rpc-address.HDFS8000871.nn2" = "172.21.0.2:4007",
    "dfs.client.failover.proxy.provider.HDFS8000871" = "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider",
    "hadoop.security.authentication" = "kerberos",
    "hadoop.kerberos.keytab" = "/doris/hdfs.keytab",
    "hadoop.kerberos.principal" = "hdfs@HADOOP.COM"
);
```
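
After the catalog is created, Paimon tables can be browsed and queried from Doris like regular tables. A minimal sketch, where the database `db1` and table `tbl1` are hypothetical names standing in for objects that already exist in the Paimon warehouse:

```sql
-- Switch to the newly created Paimon catalog
SWITCH paimon_hdfs;
-- Browse the databases and pick one (db1 is a hypothetical name)
SHOW DATABASES;
USE db1;
-- Query a Paimon table through Doris (tbl1 is a hypothetical name)
SELECT * FROM tbl1 LIMIT 10;
```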

MinIO

```sql
CREATE CATALOG `paimon_s3` PROPERTIES (
    "type" = "paimon",
    "warehouse" = "s3://bucket_name/paimons3",
    "s3.endpoint" = "http://<ip>:<port>",
    "s3.access_key" = "ak",
    "s3.secret_key" = "sk"
);
```

OBS

```sql
CREATE CATALOG `paimon_obs` PROPERTIES (
    "type" = "paimon",
    "warehouse" = "obs://bucket_name/paimon",
    "obs.endpoint" = "obs.cn-north-4.myhuaweicloud.com",
    "obs.access_key" = "ak",
    "obs.secret_key" = "sk"
);
```

COS

```sql
CREATE CATALOG `paimon_cos` PROPERTIES (
    "type" = "paimon",
    "warehouse" = "cosn://paimon-1308700295/paimoncos",
    "cos.endpoint" = "cos.ap-beijing.myqcloud.com",
    "cos.access_key" = "ak",
    "cos.secret_key" = "sk"
);
```

OSS

```sql
CREATE CATALOG `paimon_oss` PROPERTIES (
    "type" = "paimon",
    "warehouse" = "oss://paimon-zd/paimonoss",
    "oss.endpoint" = "oss-cn-beijing.aliyuncs.com",
    "oss.access_key" = "ak",
    "oss.secret_key" = "sk"
);
```
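
Whichever storage backend is used, the catalog definition can be checked after creation. A quick verification sketch, using the `paimon_oss` catalog above as an example:

```sql
-- List all catalogs known to this Doris cluster
SHOW CATALOGS;
-- Show the statement and properties a catalog was created with
SHOW CREATE CATALOG paimon_oss;
```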

Creating a Catalog Based on Hive Metastore

```sql
CREATE CATALOG `paimon_hms` PROPERTIES (
    "type" = "paimon",
    "paimon.catalog.type" = "hms",
    "warehouse" = "hdfs://HDFS8000871/user/zhangdong/paimon2",
    "hive.metastore.uris" = "thrift://172.21.0.44:7004",
    "dfs.nameservices" = "HDFS8000871",
    "dfs.ha.namenodes.HDFS8000871" = "nn1,nn2",
    "dfs.namenode.rpc-address.HDFS8000871.nn1" = "172.21.0.1:4007",
    "dfs.namenode.rpc-address.HDFS8000871.nn2" = "172.21.0.2:4007",
    "dfs.client.failover.proxy.provider.HDFS8000871" = "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider",
    "hadoop.username" = "hadoop"
);

CREATE CATALOG `paimon_kerberos` PROPERTIES (
    "type" = "paimon",
    "paimon.catalog.type" = "hms",
    "warehouse" = "hdfs://HDFS8000871/user/zhangdong/paimon2",
    "hive.metastore.uris" = "thrift://172.21.0.44:7004",
    "hive.metastore.sasl.enabled" = "true",
    "hive.metastore.kerberos.principal" = "hive/xxx@HADOOP.COM",
    "dfs.nameservices" = "HDFS8000871",
    "dfs.ha.namenodes.HDFS8000871" = "nn1,nn2",
    "dfs.namenode.rpc-address.HDFS8000871.nn1" = "172.21.0.1:4007",
    "dfs.namenode.rpc-address.HDFS8000871.nn2" = "172.21.0.2:4007",
    "dfs.client.failover.proxy.provider.HDFS8000871" = "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider",
    "hadoop.security.authentication" = "kerberos",
    "hadoop.kerberos.principal" = "hdfs@HADOOP.COM",
    "hadoop.kerberos.keytab" = "/doris/hdfs.keytab"
);
```
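
Doris caches the metadata of external catalogs. If databases or tables are created or changed on the Paimon side after the catalog is created, the cache can be refreshed manually. A minimal sketch, where `db1` and `tbl1` are hypothetical names:

```sql
-- Refresh the whole catalog
REFRESH CATALOG paimon_hms;
-- Or refresh only one database or one table
REFRESH DATABASE paimon_hms.db1;
REFRESH TABLE paimon_hms.db1.tbl1;
```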

Column Type Mapping

| Paimon Data Type | Doris Data Type | Comment |
| --- | --- | --- |
| BooleanType | Boolean | |
| TinyIntType | TinyInt | |
| SmallIntType | SmallInt | |
| IntType | Int | |
| FloatType | Float | |
| BigIntType | BigInt | |
| DoubleType | Double | |
| VarCharType | VarChar | |
| CharType | Char | |
| VarBinaryType, BinaryType | String | |
| DecimalType(precision, scale) | Decimal(precision, scale) | |
| TimestampType, LocalZonedTimestampType | DateTime | |
| DateType | Date | |
| ArrayType | Array | Supports nested Array |
| MapType | Map | Supports nested Map |
| RowType | Struct | Supports nested Struct (supported since versions 2.0.10 and 2.1.3) |
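
To see how the columns of a concrete Paimon table are mapped, describe the table from Doris. A sketch using the hypothetical `db1.tbl1` from the examples above:

```sql
SWITCH paimon_hdfs;
-- The types in the output are the Doris types after mapping
DESC db1.tbl1;
```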

FAQ

  1. Kerberos issues

    • Make sure the principal and keytab are configured correctly.
    • Set up a scheduled task (e.g., with crontab) on the BE nodes that runs kinit -kt your_principal your_keytab at a fixed interval (e.g., every 12 hours).
  2. Unknown type value: UNSUPPORTED

    This is a compatibility issue between Doris 2.0.2 and Paimon 0.5. Upgrade to version 2.0.3 or later to resolve it, or apply a patch yourself.

  3. Accessing object storage (OSS, S3, etc.) fails with an error saying the file system is not supported

    In versions up to and including 2.0.5, users need to manually download the following JAR packages, place them in the ${DORIS_HOME}/be/lib/java_extensions/preload-extensions directory, and restart the BE.

    From 2.0.6 onwards, this manual step is no longer required.