TiDB 6.5.6 Release Notes

Release date: December 7, 2023

TiDB version: 6.5.6

Quick access: Quick start | Production deployment

Compatibility changes

Improvements

  • TiDB

  • TiKV

    • Optimize memory usage of Resolver to prevent OOM #15458 @overvenus
    • Eliminate LRUCache in Router objects to reduce memory usage and prevent OOM #15430 @Connor1996
    • Add the alive and leak monitoring dimensions for the apply_router and raft_router metrics #15357 @tonyxuqqi
  • PD

    • Add monitoring metrics such as Status and Sync Progress for DR Auto-Sync on the Grafana dashboard #6975 @disksing
  • Tools

    • Backup & Restore (BR)

      • When restoring a snapshot backup, BR retries if it encounters certain network errors #48528 @Leavrth
      • Introduce a new integration test for Point-In-Time Recovery (PITR) in the delete range scenario, enhancing PITR stability #47738 @Leavrth
      • Enable automatic retry of Region scatter during snapshot recovery when encountering timeout failures or cancellations of Region scatter #47236 @Leavrth
      • BR can pause Region merging by setting the merge-schedule-limit configuration to 0 (see the configuration sketch after this list) #7148 @BornChanger
    • TiCDC
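
The merge-schedule-limit item mentioned in the BR improvement above is a PD scheduling configuration. The following is a minimal sketch of pausing and resuming Region merging from a SQL client, assuming the PD item name schedule.merge-schedule-limit and the SET CONFIG statement; pd-ctl (config set merge-schedule-limit 0) changes the same item.

```sql
-- Pause Region merging; 0 disables the merge scheduler.
SET CONFIG pd `schedule.merge-schedule-limit` = 0;

-- Resume Region merging afterwards; 8 is the PD default, adjust to your cluster's original value.
SET CONFIG pd `schedule.merge-schedule-limit` = 8;
```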

Bug fixes

  • TiDB

    • Fix the issue that the chunk cannot be reused when the HashJoin operator performs probe #48082 @wshwsh12
    • Fix the issue that Duplicate entry might occur when AUTO_ID_CACHE=1 is set (see the usage sketch at the end of this section) #46444 @tiancaiamao
    • Fix the issue that the TIDB_INLJ hint does not take effect when joining two sub-queries (see the usage sketch at the end of this section) #46160 @qw4990
    • Fix the issue that DDL operations might get stuck after TiDB is restarted #46751 @wjhuang2016
    • Fix the issue that DDL operations might get permanently blocked due to incorrect MDL handling #46920 @wjhuang2016
    • Fix the issue that the results of MERGE_JOIN are incorrect #46580 @qw4990
    • Fix the issue that the Sort operator might cause TiDB to crash during the spill process #47538 @windtalker
    • Fix the issue that the cast(col)=range condition causes FullScan when CAST has no precision loss #45199 @AilinKid
    • Fix the panic issue of batch-client in client-go #47691 @crazycs520
    • Prohibit split table operations on non-integer clustered indexes #47350 @tangenta
    • Fix the incompatibility issue between the behavior of prepared plan cache and non-prepared plan cache during time conversion #42439 @qw4990
    • Fix the issue that sometimes an index cannot be created for an empty table using ingest mode #39641 @tangenta
    • Fix the issue of not being able to detect data that does not comply with partition definitions during partition exchange #46492 @mjonss
    • Fix the issue that GROUP_CONCAT cannot parse the ORDER BY column #41986 @AilinKid
    • Fix the issue that HashCode is repeatedly calculated for deeply nested expressions, which causes high memory usage and OOM #42788 @AilinKid
    • Fix the issue that when Aggregation is pushed down through Union in MPP execution plans, the results are incorrect #45850 @AilinKid
    • Fix the issue of incorrect memory usage estimation in INDEX_LOOKUP_HASH_JOIN #47788 @SeaRise
    • Fix the issue that the zip file generated by plan replayer cannot be imported back into TiDB #46474 @YangKeao
    • Fix the incorrect cost estimation caused by an excessively large N in LIMIT N #43285 @qw4990
    • Fix the panic issue that might occur when constructing TopN structure for statistics #35948 @hi-rustin
    • Fix the issue that the result of COUNT(INT) calculated by MPP might be incorrect #48643 @AilinKid
    • Fix the issue that panic might occur when tidb_enable_ordered_result_mode is enabled #45044 @qw4990
    • Fix the issue that the optimizer mistakenly selects IndexFullScan to reduce sort introduced by window functions #46177 @qw4990
    • Fix the issue that the result might be incorrect when predicates are pushed down to common table expressions #47881 @winoros
    • Fix the issue that executing UNION ALL with the DUAL table as the first subnode might cause an error #48755 @winoros
    • Fix the issue that column pruning can cause panic in specific situations #47331 @hi-rustin
    • Fix the issue of possible syntax error when a common table expression (CTE) containing aggregate or window functions is referenced by other recursive CTEs #47603 #47711 @elsa0520
    • Fix the issue that an exception might occur when using the QB_NAME hint in a prepared statement (see the usage sketch at the end of this section) #46817 @jackysp
    • Fix the issue of Goroutine leak when using AUTO_ID_CACHE=1 #46324 @tiancaiamao
    • Fix the issue that TiDB might panic when shutting down #32110 @july2993
    • Fix the issue of not handling locks in the MVCC interface when reading schema diff commit versions from the TiDB schema cache #48281 @cfzjywxk
    • Fix the issue of duplicate rows in information_schema.columns caused by renaming a table #47064 @jiyfhust
    • Fix the bugs in the LOAD DATA REPLACE INTO statement #47995 @lance6716
    • Fix the issue of IMPORT INTO task failure caused by PD leader malfunction for 1 minute #48307 @D3Hunter
    • Fix the issue of ADMIN CHECK failure caused by creating an index on a date type field #47426 @tangenta
    • Fix the issue of unsorted row data returned by TABLESAMPLE (see the usage sketch at the end of this section) #48253 @tangenta
    • Fix the TiDB node panic issue that occurs when DDL jobID is restored to 0 #46296 @jiyfhust
  • TiKV

    • Fix the issue that moving a peer might cause the performance of the Follower Read to deteriorate #15468 @YuJuncen
    • Fix the data error that the raftstore-applys metric keeps increasing #15371 @Connor1996
    • Fix the issue that requests of the TiDB Lightning checksum coprocessor time out when there is online workload #15565 @lance6716
    • Fix security issues by upgrading the version of lz4-sys to 1.9.4 #15621 @SpadeA-Tang
    • Fix security issues by upgrading the version of tokio to 6.5 #15621 @LykxSassinator
    • Fix security issues by removing the flatbuffer #15621 @tonyxuqqi
    • Fix the issue that resolved-ts lag increases when TiKV stores are partitioned #15679 @hicqu
    • Fix the TiKV OOM issue that occurs when restarting TiKV and there are a large number of Raft logs that are not applied #15770 @overvenus
    • Fix the issue that stale peers are retained and block resolved-ts after Regions are merged #15919 @overvenus
    • Fix the issue that the scheduler command variables are incorrect in Grafana on the cloud environment #15832 @Connor1996
    • Fix the issue that blob-run-mode in Titan cannot be updated online #15978 @tonyxuqqi
    • Fix the issue that TiKV panics due to inconsistent metadata between Regions #13311 @cfzjywxk
    • Fix the issue that TiKV panics when the leader is forced to exit during Online Unsafe Recovery #15629 @Connor1996
    • Fix the issue that the joint state of DR Auto-Sync might time out when scaling out #15817 @Connor1996
    • Fix the issue that TiKV coprocessor might return stale data when removing a Raft peer #16069 @overvenus
    • Fix the issue that resolved-ts might be blocked for 2 hours #39130 @overvenus
    • Fix the issue that Flashback might get stuck when encountering notLeader or regionNotFound #15712 @HuSharp
  • PD

    • Fix potential security risks of the plugin directory and files #7094 @HuSharp
    • Fix the issue that modified isolation levels are not synchronized to the default placement rules #7121 @rleungx
    • Fix the issue that evict-leader-scheduler might lose configuration #6897 @HuSharp
    • Fix the issue that the method for counting empty Regions might cause Regions to be unbalanced during the recovery process of BR #7148 @CabinfeverB
    • Fix the issue that canSync and hasMajority might be calculated incorrectly for clusters adopting the Data Replication Auto Synchronous (DR Auto-Sync) mode when the configuration of Placement Rules is complex #7201 @disksing
    • Fix the issue that available_stores is calculated incorrectly for clusters adopting the Data Replication Auto Synchronous (DR Auto-Sync) mode #7221 @disksing
    • Fix the issue that the primary AZ cannot add TiKV nodes when the secondary AZ is down for clusters adopting the Data Replication Auto Synchronous (DR Auto-Sync) mode #7218 @disksing
    • Fix the issue that adding multiple TiKV nodes to a large cluster might cause TiKV heartbeat reporting to become slow or stuck #7248 @rleungx
    • Fix the issue that PD might delete normal Peers when TiKV nodes are unavailable #7249 @lhy1024
    • Fix the issue that it takes a long time to switch the leader in DR Auto-Sync mode #6988 @HuSharp
    • Upgrade the version of Gin Web Framework from v1.8.1 to v1.9.1 to fix some security issues #7438 @niubell
  • TiFlash

    • Fix the issue that the max_snapshot_lifetime metric is displayed incorrectly on Grafana #7713 @JaySon-Huang
    • Fix the issue that executing the ALTER TABLE ... EXCHANGE PARTITION ... statement causes panic #8372 @JaySon-Huang
    • Fix the issue that the memory usage reported by MemoryTracker is inaccurate #8128 @JinheLin
  • Tools

    • Backup & Restore (BR)

      • Fix the issue that the log backup might get stuck in some scenarios when backing up large wide tables #15714 @YuJuncen
      • Fix the issue that frequent flushes cause log backup to get stuck #15602 @3pointer
      • Fix the issue that the retry after an EC2 metadata connection reset causes degraded backup and restore performance #47650 @Leavrth
      • Fix the issue that running PITR multiple times within 1 minute might cause data loss #15483 @YuJuncen
      • Fix the issue that the default values for BR SQL commands and CLI are different, which might cause OOM issues #48000 @YuJuncen
      • Fix the issue that log backup might panic when the PD owner is transferred #47533 @YuJuncen
      • Fix the issue that BR generates incorrect URIs for external storage files #48452 @3AceShowHand
    • TiCDC

      • Fix the issue that TiCDC server might panic when executing lossy DDL statements in upstream #9739 @hicqu
      • Fix the issue that the replication task reports an error when executing RESUME with the redo log feature enabled #9769 @hicqu
      • Fix the issue that replication lag becomes longer when the TiKV node crashes #9741 @sdojjy
      • Fix the issue that the WHERE clause does not use the primary key as the condition when replicating data to TiDB or MySQL #9988 @asddongmen
      • Fix the issue that the workload of a replication task is not distributed evenly across TiCDC nodes #9839 @3AceShowHand
      • Fix the issue that the interval between replicating DDL statements is too long when redo log is enabled #9960 @CharlesCheung96
      • Fix the issue that the changefeed cannot replicate DML events in bidirectional replication mode if the target table is dropped and then recreated in upstream #10079 @asddongmen
      • Fix the issue that the replication lag becomes longer due to too many NFS files when replicating data to an object storage service #10041 @CharlesCheung96
      • Fix the issue that the TiCDC server might panic when replicating data to an object storage service #10137 @sdojjy
      • Fix the issue that TiCDC accesses the invalid old address during PD scaling up and down #9584 @fubinzh @asddongmen
      • Fix the issue that fetching wrong memory information might cause OOM issues in some operating systems #9762 @sdojjy
    • TiDB Data Migration (DM)

      • Fix the issue that DM skips partition DDLs in optimistic mode #9788 @GMHDBJD
      • Fix the issue that DM cannot properly track upstream table schemas when skipping online DDLs #9587 @GMHDBJD
      • Fix the issue that replication lag returned by DM keeps growing when a failed DDL is skipped and no subsequent DDLs are executed #9605 @D3Hunter
      • Fix the issue that DM skips all DMLs when resuming a task in optimistic mode #9588 @GMHDBJD
    • TiDB Lightning

      • Fix the issue that data import fails when encountering the write to tikv with no leader returned error #45673 @lance6716
      • Fix the issue that data import fails because HTTP retry requests do not use the current request content #47930 @lance6716
      • Fix the issue that TiDB Lightning gets stuck during writeToTiKV #46321 @lance6716
      • Remove unnecessary get_regions calls in physical import mode #45507 @mittalrishabh
    • TiDB Binlog

      • Fix the issue that Drainer exits when transporting a transaction greater than 1 GB #28659 @jackysp
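
For the AUTO_ID_CACHE=1 fixes listed under TiDB above, a minimal sketch of a table that uses this allocation mode; the table and column names are placeholders:

```sql
-- AUTO_ID_CACHE = 1 switches the table to centralized auto-ID allocation,
-- the mode covered by the Duplicate entry and goroutine leak fixes above.
CREATE TABLE t (
    id BIGINT AUTO_INCREMENT PRIMARY KEY,
    v  VARCHAR(64)
) AUTO_ID_CACHE = 1;
```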
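
For the TIDB_INLJ fix above, a sketch of the hint applied to a join of two sub-queries; the tables t_a and t_b are hypothetical:

```sql
-- Ask the optimizer to use an index nested-loop join between the two derived tables.
SELECT /*+ TIDB_INLJ(t1, t2) */ *
FROM (SELECT a, b FROM t_a) t1
JOIN (SELECT a, c FROM t_b) t2 ON t1.a = t2.a;
```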
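
For the QB_NAME fix above, a sketch of the hint inside a prepared statement; table t and column a are placeholders:

```sql
-- QB_NAME labels the query block so that other hints can refer to it by name.
PREPARE stmt FROM 'SELECT /*+ QB_NAME(qb1) */ * FROM t WHERE a = ?';
SET @val = 1;
EXECUTE stmt USING @val;
```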
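
For the TABLESAMPLE fix above, a sketch of the statement whose returned row order was affected; table t is a placeholder:

```sql
-- TABLESAMPLE REGIONS() samples rows by Region rather than scanning the whole table.
SELECT * FROM t TABLESAMPLE REGIONS();
```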