Releases: apache/incubator-gluten
v1.3.0
Release Notes - Gluten version 1.3.0
Highlights
- Spark 3.2.2/3.3.1/3.4.3(upgraded)/3.5.2(upgraded)
- 268+ spark functions including json
- Update OAP's Velox codebase to 2025/01/07
- Join: Sort Merge Join support
- Shuffle: Sort based Shuffle(Row)
- Query Plan: RAS Optimization
- Datalake: Hudi 0.15.0/Iceberg 1.5.0/Delta 3.2.0 support
- RSS: Celeborn 0.5.2/Uniffle 0.9.1
- File Format: CSV support via arrow
- JVM libhdfs with viewfs/kerberos support
- Partial Project(UDF) support
- Mix backend refactor
- Bucket write in partitioned Hive table
- CI/Nightly Package Tools Update
- Build & Compile Tools Update (vcpkg with static build is recommended)
- Fix several result mismatch issues
- Fix instability issues caused by OOM/Yarn kill
What's Changed
- [VL] Make velox writer queue size configurable @yikf github.com/apache/incubator-gluten/pull/6341
- [VL] Remove useless ctx variable @gaoyangxiaozhu github.com/apache/incubator-gluten/pull/6348
- [1632][CH]Daily Update Clickhouse Version (20240706) @kyligence-git github.com/apache/incubator-gluten/pull/6359
- [VL] fix build bundle package @zhouyuan github.com/apache/incubator-gluten/pull/6364
- [VL] Fix process_setup_alinux3 arrow CMakeLists.txt path @liujiayi771 github.com/apache/incubator-gluten/pull/6363
- [VL] Daily Update Velox Version (2024_07_08) @GlutenPerfBot github.com/apache/incubator-gluten/pull/6366
- [6262][CH]Json input format ignore key case @KevinyhZou github.com/apache/incubator-gluten/pull/6263
- [6285][VL] Add debian10 vcpkg depends @wenwj0 github.com/apache/incubator-gluten/pull/6286
- [CELEBORN] CelebornShuffleManager#stop should stop non-null _vanillaCelebornShuffleManager @SteNicholas github.com/apache/incubator-gluten/pull/6371
- [VL] Update ubuntu docker to use cmake 3.28 @boneanxs github.com/apache/incubator-gluten/pull/6373
- [6304][CH]Support array_join @KevinyhZou github.com/apache/incubator-gluten/pull/6305
- [VL] Daily Update Velox Version (2024_07_09) @GlutenPerfBot github.com/apache/incubator-gluten/pull/6376
- [6378][CH] Support delta count optimizer for the MergeTree format @zzcclp github.com/apache/incubator-gluten/pull/6379
- [6345][CH] Deprecate SCALAR_FUNCTIONS in SerializedPlanParser @lgbo-ustc github.com/apache/incubator-gluten/pull/6347
- [TEST] Use project version rather than Gluten version in Gluten it @ulysses-you github.com/apache/incubator-gluten/pull/6385
- [6377][CH] Support window function percent_rank @lgbo-ustc github.com/apache/incubator-gluten/pull/6386
- [VL] Minor refactor for ValueStream node construction and usage @Yohahaha github.com/apache/incubator-gluten/pull/6382
- [VL] Enable levenshtein function @zhli1142015 github.com/apache/incubator-gluten/pull/6389
- [VL] Daily Update Velox Version (2024_07_10) @GlutenPerfBot github.com/apache/incubator-gluten/pull/6384
- [1632][CH]Daily Update Clickhouse Version (20240710) @kyligence-git github.com/apache/incubator-gluten/pull/6383
- Test input_file_name, input_file_block_start & input_file_block_length when scan falls back @gaoyangxiaozhu github.com/apache/incubator-gluten/pull/6318
- [6394][VL] Fix the vcpkg package script @weixiuli github.com/apache/incubator-gluten/pull/6395
- [6288][CH] Support BroadcastNestedLoopJoinExe[Part one] @loneylee github.com/apache/incubator-gluten/pull/6290
- [CELEBORN] Rename CelebornHashBasedColumnarShuffleWriter to CelebornColumnarShuffleWriter @kerwin-zk github.com/apache/incubator-gluten/pull/6391
- [VL] Fix E function fallback issue in some condition @gaoyangxiaozhu github.com/apache/incubator-gluten/pull/6397
- [CI] Fix centos7 failure @marin-ma github.com/apache/incubator-gluten/pull/6404
- [1632][CH]Daily Update Clickhouse Version (20240711) @kyligence-git github.com/apache/incubator-gluten/pull/6399
- [CELEBORN] Add compression for row-based shuffle @kerwin-zk github.com/apache/incubator-gluten/pull/6380
- [VL] Daily Update Velox Version (2024_07_11) @GlutenPerfBot github.com/apache/incubator-gluten/pull/6400
- [CORE] Remove local sort for TopNRowNumber @ulysses-you github.com/apache/incubator-gluten/pull/6381
- [VL] Spark assert_true function support @gaoyangxiaozhu github.com/apache/incubator-gluten/pull/6329
- [VL] Add schema validation for all operators @zhli1142015 github.com/apache/incubator-gluten/pull/6406
- [CORE] Minor code cleanups against fallback tagging @zhztheplayer github.com/apache/incubator-gluten/pull/6320
- [VL] Try to find arrow libs from velox bundled path firstly @PHILO-HE github.com/apache/incubator-gluten/pull/6413
- [VL] disable tpch benchmarks on comment/merge @zhouyuan github.com/apache/incubator-gluten/pull/6402
- [UT] Add a tool to validate any unary expression with all its accepted types @PHILO-HE github.com/apache/incubator-gluten/pull/6392
- [CH] Fix a source file name typo @zhztheplayer github.com/apache/incubator-gluten/pull/6412
- [VL] Fix Pi function fallback issue in some condition @gaoyangxiaozhu github.com/apache/incubator-gluten/pull/6408
- [CELEBORN] VeloxCelebornColumnarBatchSerializer uses the key and default value of SHUFFLE_COMPRESS to check whether to compress shuffle output @SteNicholas github.com/apache/incubator-gluten/pull/6414
- [VL] Quick fix for commit conflicts @zhztheplayer github.com/apache/incubator-gluten/pull/6418
- [Doc] Update new supported spark functions @gaoyangxiaozhu github.com/apache/incubator-gluten/pull/6423
- [VL] Add a test to validate substring_index @boneanxs github.com/apache/incubator-gluten/pull/6393
- [VL] Fix shuffle spill triggered by evicting buffers during stop @marin-ma github.com/apache/incubator-gluten/pull/6422
- [VL] Enable repeat function @zhli1142015 github.com/apache/incubator-gluten/pull/6419
- [VL] Accelerate Arrow compile @jinchengchenghh github.com/apache/incubator-gluten/pull/6426
- [CI][VL] Update docker image for CI @zhouyuan github.com/apache/incubator-gluten/pull/6401
- [VL] Daily Update Velox Version (2024_07_12) @GlutenPerfBot github.com/apache/incubator-gluten/pull/6417
- [VL] Daily Update Velox Version (2024_07_13) @GlutenPerfBot github.com/apache/incubator-gluten/pull/6436
- [VL] Daily Update Velox Version (2024_07_14) @GlutenPerfBot github.com/apache/incubator-gluten/pull/6441
- [VL] Set Arrow_SOURCE to AUTO to allow using system arrow libs @PHILO-HE github.com/apache/incubator-gluten/pull/6325
- [CELEBORN] CHCelebornColumnarShuffleWriter supports celeborn.client.spark.shuffle.writer to use memory sort shuffle in ClickHouse backend @SteNicholas github.com/apache/incubator-gluten/pull/6432
- [VL] Make sure the same thrift lib bundled in arrow build is used for building Velox @zhztheplayer github.com/apache/incubator-gluten/pull/6431
- [CORE] Make SparkSession transient in HiveTableScanExecTransformer @yikf github.com/apache/incubator-gluten/pull/6410
- [6176][CH] Add tpcds suite from decimal table schema @loneylee github.com/apache/incubator-gluten/pull/6369
- [VL] Move dependencies setup ahead @PHILO-HE github.com/apache/incubator-gluten/pull/6444
- [CH][CELEBORN] CHCelebornColumnarShuffleWriter supports celeborn.client.spark.shuffle.writer to use memory sort shuffle in ClickHouse backend @SteNicholas github.com/apache/incubator-gluten/pull/6454
- [VL] Enable right and anti join in smj @JkSelf github.com/apache/incubator-gluten/pull/6449
- [CH][CELEBORN] CHCelebornColumnarBatchSerializer uses AtomicBoolean to identify whether to call close() to avoid calling close() twice situation @SteNicholas github.com/apache/incubator-gluten/pull/6455
- [CI][VL] Re-enable a build job running on clean dockers weekly @PHILO-HE github.com/apache/incubator-gluten/pull/6424
- [CORE] Update LICENSE, NOTICE, LICENSE-binary, NOTICE-binary @weiting-chen github.com/apache/incubator-gluten/pull/6443
- [CORE] Change DISCLAIMER to DISCLAIMER-WIP @weiting-chen github.com/apache/incubator-gluten/pull/6442
- [VL] RAS: Minor code cleanup for offloading project @zhztheplayer github.com/apache/incubator-gluten/pull/6452
- [VL] Add a way to create static build with docker container and gluten-te @zhztheplayer github.com/apache/incubator-gluten/pull/6457
- [6467][CH] Minor Fix Build @baibaichen github.com/apache/incubator-gluten/pull/6468
- [VL] Minor improvements and fixes for gluten-it and gluten-te @zhztheplayer github.com/apache/incubator-gluten/pull/6471
- [CORE] Fix fallback for spark sequence function with literal array data as input @gaoyangxiaozhu github.com/apache/incubator-gluten/pull/6433
- [VL] Fix offload input_file_name assert error @zml1206 github.com/apache/incubator-gluten/pull/6390
- [VL] update docker image for cache-native-lib job @yma11 github.com/apache/incubator-gluten/pull/6466
- [BUILD] Fix unbound variable @zml1206 github.com/apache/incubator-gluten/pull/6474
- [VL] Daily Update Velox Version (2024_07_16) @GlutenPerfBot github.com/apache/incubator-gluten/pull/6460
- [6437][BUILD] Fix vcpkg setup-build-dependens.sh for centos @wecharyu github.com/apache/incubator-gluten/pull/6438
- [6470][CH]Fix Task not serializable error when inserting mergetree data @zzcclp github.com/apache/incubator-gluten/pull/6473
- [6425][CH] Support day time interval @lgbo-ustc github.com/apache/incubator-gluten/pull/6456
- [VL] remove redundant code in parquet datasource to avoid memory leakage PR6430 @liujp github.com/apache/incubator-gluten/pull/6462
- [Core] Spark version function support @gaoyangxiaozhu github.com/apache/incubator-gluten/pull/6469
- [VL] Daily Update Velox Version (2024_07_17) @GlutenPerfBot github.com/apache/incubator-gluten/pull/6479
- [VL] Minor improvements on gluten-it / gluten-te toolchains @zhztheplayer github.com/apache/incubator-gluten/pull/6476
- [CH] Support merge MergeTree files @liuneng1994 github.com/apache/incubator-gluten/pull/6472
- [6463][CH]refactor the code of parsing join parameters @lgbo-ustc github.com/apache/incubator-gluten/pull/6485
- [1632][CH]Daily Update Clickhouse Version (20240718) @kyligence-git github.com/apache/incubat...
v1.3.0-rc0
Release Notes - Gluten version 1.3.0-rc0
Highlights
- Spark 3.2.2/3.3.1/3.4.3(upgraded)/3.5.2(upgraded)
- 268+ spark functions including json
- Update OAP's Velox codebase to 2025/01/07
- Join: Sort Merge Join support
- Shuffle: Sort based Shuffle(Row)
- Query Plan: RAS Optimization
- Datalake: Hudi 0.15.0/Iceberg 1.5.0/Delta 3.2.0 support
- RSS: Celeborn 0.5.2/Uniffle 0.9.1
- File Format: CSV support via arrow
- JVM libhdfs with viewfs/kerberos support
- Partial Project(UDF) support
- Mix backend refactor
- Bucket write in partitioned Hive table
- CI/Nightly Package Tools Update
- Build & Compile Tools Update (vcpkg with static build is recommended)
- Fix several result mismatch issues
- Fix instability issues caused by OOM/Yarn kill
What's Changed
- [VL] Make velox writer queue size configurable @yikf github.com/apache/incubator-gluten/pull/6341
- [VL] Remove useless ctx variable @gaoyangxiaozhu github.com/apache/incubator-gluten/pull/6348
- [1632][CH]Daily Update Clickhouse Version (20240706) @kyligence-git github.com/apache/incubator-gluten/pull/6359
- [VL] fix build bundle package @zhouyuan github.com/apache/incubator-gluten/pull/6364
- [VL] Fix process_setup_alinux3 arrow CMakeLists.txt path @liujiayi771 github.com/apache/incubator-gluten/pull/6363
- [VL] Daily Update Velox Version (2024_07_08) @GlutenPerfBot github.com/apache/incubator-gluten/pull/6366
- [6262][CH]Json input format ignore key case @KevinyhZou github.com/apache/incubator-gluten/pull/6263
- [6285][VL] Add debian10 vcpkg depends @wenwj0 github.com/apache/incubator-gluten/pull/6286
- [CELEBORN] CelebornShuffleManager#stop should stop non-null _vanillaCelebornShuffleManager @SteNicholas github.com/apache/incubator-gluten/pull/6371
- [VL] Update ubuntu docker to use cmake 3.28 @boneanxs github.com/apache/incubator-gluten/pull/6373
- [6304][CH]Support array_join @KevinyhZou github.com/apache/incubator-gluten/pull/6305
- [VL] Daily Update Velox Version (2024_07_09) @GlutenPerfBot github.com/apache/incubator-gluten/pull/6376
- [6378][CH] Support delta count optimizer for the MergeTree format @zzcclp github.com/apache/incubator-gluten/pull/6379
- [6345][CH] Deprecate SCALAR_FUNCTIONS in SerializedPlanParser @lgbo-ustc github.com/apache/incubator-gluten/pull/6347
- [TEST] Use project version rather than Gluten version in Gluten it @ulysses-you github.com/apache/incubator-gluten/pull/6385
- [6377][CH] Support window function percent_rank @lgbo-ustc github.com/apache/incubator-gluten/pull/6386
- [VL] Minor refactor for ValueStream node construction and usage @Yohahaha github.com/apache/incubator-gluten/pull/6382
- [VL] Enable levenshtein function @zhli1142015 github.com/apache/incubator-gluten/pull/6389
- [VL] Daily Update Velox Version (2024_07_10) @GlutenPerfBot github.com/apache/incubator-gluten/pull/6384
- [1632][CH]Daily Update Clickhouse Version (20240710) @kyligence-git github.com/apache/incubator-gluten/pull/6383
- Test input_file_name, input_file_block_start & input_file_block_length when scan falls back @gaoyangxiaozhu github.com/apache/incubator-gluten/pull/6318
- [6394][VL] Fix the vcpkg package script @weixiuli github.com/apache/incubator-gluten/pull/6395
- [6288][CH] Support BroadcastNestedLoopJoinExe[Part one] @loneylee github.com/apache/incubator-gluten/pull/6290
- [CELEBORN] Rename CelebornHashBasedColumnarShuffleWriter to CelebornColumnarShuffleWriter @kerwin-zk github.com/apache/incubator-gluten/pull/6391
- [VL] Fix E function fallback issue in some condition @gaoyangxiaozhu github.com/apache/incubator-gluten/pull/6397
- [CI] Fix centos7 failure @marin-ma github.com/apache/incubator-gluten/pull/6404
- [1632][CH]Daily Update Clickhouse Version (20240711) @kyligence-git github.com/apache/incubator-gluten/pull/6399
- [CELEBORN] Add compression for row-based shuffle @kerwin-zk github.com/apache/incubator-gluten/pull/6380
- [VL] Daily Update Velox Version (2024_07_11) @GlutenPerfBot github.com/apache/incubator-gluten/pull/6400
- [CORE] Remove local sort for TopNRowNumber @ulysses-you github.com/apache/incubator-gluten/pull/6381
- [VL] Spark assert_true function support @gaoyangxiaozhu github.com/apache/incubator-gluten/pull/6329
- [VL] Add schema validation for all operators @zhli1142015 github.com/apache/incubator-gluten/pull/6406
- [CORE] Minor code cleanups against fallback tagging @zhztheplayer github.com/apache/incubator-gluten/pull/6320
- [VL] Try to find arrow libs from velox bundled path firstly @PHILO-HE github.com/apache/incubator-gluten/pull/6413
- [VL] disable tpch benchmarks on comment/merge @zhouyuan github.com/apache/incubator-gluten/pull/6402
- [UT] Add a tool to validate any unary expression with all its accepted types @PHILO-HE github.com/apache/incubator-gluten/pull/6392
- [CH] Fix a source file name typo @zhztheplayer github.com/apache/incubator-gluten/pull/6412
- [VL] Fix Pi function fallback issue in some condition @gaoyangxiaozhu github.com/apache/incubator-gluten/pull/6408
- [CELEBORN] VeloxCelebornColumnarBatchSerializer uses the key and default value of SHUFFLE_COMPRESS to check whether to compress shuffle output @SteNicholas github.com/apache/incubator-gluten/pull/6414
- [VL] Quick fix for commit conflicts @zhztheplayer github.com/apache/incubator-gluten/pull/6418
- [Doc] Update new supported spark functions @gaoyangxiaozhu github.com/apache/incubator-gluten/pull/6423
- [VL] Add a test to validate substring_index @boneanxs github.com/apache/incubator-gluten/pull/6393
- [VL] Fix shuffle spill triggered by evicting buffers during stop @marin-ma github.com/apache/incubator-gluten/pull/6422
- [VL] Enable repeat function @zhli1142015 github.com/apache/incubator-gluten/pull/6419
- [VL] Accelerate Arrow compile @jinchengchenghh github.com/apache/incubator-gluten/pull/6426
- [CI][VL] Update docker image for CI @zhouyuan github.com/apache/incubator-gluten/pull/6401
- [VL] Daily Update Velox Version (2024_07_12) @GlutenPerfBot github.com/apache/incubator-gluten/pull/6417
- [VL] Daily Update Velox Version (2024_07_13) @GlutenPerfBot github.com/apache/incubator-gluten/pull/6436
- [VL] Daily Update Velox Version (2024_07_14) @GlutenPerfBot github.com/apache/incubator-gluten/pull/6441
- [VL] Set Arrow_SOURCE to AUTO to allow using system arrow libs @PHILO-HE github.com/apache/incubator-gluten/pull/6325
- [CELEBORN] CHCelebornColumnarShuffleWriter supports celeborn.client.spark.shuffle.writer to use memory sort shuffle in ClickHouse backend @SteNicholas github.com/apache/incubator-gluten/pull/6432
- [VL] Make sure the same thrift lib bundled in arrow build is used for building Velox @zhztheplayer github.com/apache/incubator-gluten/pull/6431
- [CORE] Make SparkSession transient in HiveTableScanExecTransformer @yikf github.com/apache/incubator-gluten/pull/6410
- [6176][CH] Add tpcds suite from decimal table schema @loneylee github.com/apache/incubator-gluten/pull/6369
- [VL] Move dependencies setup ahead @PHILO-HE github.com/apache/incubator-gluten/pull/6444
- [CH][CELEBORN] CHCelebornColumnarShuffleWriter supports celeborn.client.spark.shuffle.writer to use memory sort shuffle in ClickHouse backend @SteNicholas github.com/apache/incubator-gluten/pull/6454
- [VL] Enable right and anti join in smj @JkSelf github.com/apache/incubator-gluten/pull/6449
- [CH][CELEBORN] CHCelebornColumnarBatchSerializer uses AtomicBoolean to identify whether to call close() to avoid calling close() twice situation @SteNicholas github.com/apache/incubator-gluten/pull/6455
- [CI][VL] Re-enable a build job running on clean dockers weekly @PHILO-HE github.com/apache/incubator-gluten/pull/6424
- [CORE] Update LICENSE, NOTICE, LICENSE-binary, NOTICE-binary @weiting-chen github.com/apache/incubator-gluten/pull/6443
- [CORE] Change DISCLAIMER to DISCLAIMER-WIP @weiting-chen github.com/apache/incubator-gluten/pull/6442
- [VL] RAS: Minor code cleanup for offloading project @zhztheplayer github.com/apache/incubator-gluten/pull/6452
- [VL] Add a way to create static build with docker container and gluten-te @zhztheplayer github.com/apache/incubator-gluten/pull/6457
- [6467][CH] Minor Fix Build @baibaichen github.com/apache/incubator-gluten/pull/6468
- [VL] Minor improvements and fixes for gluten-it and gluten-te @zhztheplayer github.com/apache/incubator-gluten/pull/6471
- [CORE] Fix fallback for spark sequence function with literal array data as input @gaoyangxiaozhu github.com/apache/incubator-gluten/pull/6433
- [VL] Fix offload input_file_name assert error @zml1206 github.com/apache/incubator-gluten/pull/6390
- [VL] update docker image for cache-native-lib job @yma11 github.com/apache/incubator-gluten/pull/6466
- [BUILD] Fix unbound variable @zml1206 github.com/apache/incubator-gluten/pull/6474
- [VL] Daily Update Velox Version (2024_07_16) @GlutenPerfBot github.com/apache/incubator-gluten/pull/6460
- [6437][BUILD] Fix vcpkg setup-build-dependens.sh for centos @wecharyu github.com/apache/incubator-gluten/pull/6438
- [6470][CH]Fix Task not serializable error when inserting mergetree data @zzcclp github.com/apache/incubator-gluten/pull/6473
- [6425][CH] Support day time interval @lgbo-ustc github.com/apache/incubator-gluten/pull/6456
- [VL] remove redundant code in parquet datasource to avoid memory leakage PR6430 @liujp github.com/apache/incubator-gluten/pull/6462
- [Core] Spark version function support @gaoyangxiaozhu github.com/apache/incubator-gluten/pull/6469
- [VL] Daily Update Velox Version (2024_07_17) @GlutenPerfBot github.com/apache/incubator-gluten/pull/6479
- [VL] Minor improvements on gluten-it / gluten-te toolchains @zhztheplayer github.com/apache/incubator-gluten/pull/6476
- [CH] Support merge MergeTree files @liuneng1994 github.com/apache/incubator-gluten/pull/6472
- [6463][CH]refactor the code of parsing join parameters @lgbo-ustc github.com/apache/incubator-gluten/pull/6485
- [1632][CH]Daily Update Clickhouse Version (20240718) @kyligence-git github.com/apache/inc...
v1.3.0-preview
What's Changed
- [VL] Make velox writer queue size configurable @yikf #6341
- [VL] Remove useless ctx variable @gaoyangxiaozhu #6348
- [1632][CH]Daily Update Clickhouse Version (20240706) @kyligence-git #6359
- [VL] fix build bundle package @zhouyuan #6364
- [VL] Fix process_setup_alinux3 arrow CMakeLists.txt path @liujiayi771 #6363
- [VL] Daily Update Velox Version (2024_07_08) @GlutenPerfBot #6366
- [6262][CH]Json input format ignore key case @KevinyhZou #6263
- [6285][VL] Add debian10 vcpkg depends @wenwj0 #6286
- [CELEBORN] CelebornShuffleManager#stop should stop non-null _vanillaCelebornShuffleManager @SteNicholas #6371
- [VL] Update ubuntu docker to use cmake 3.28 @boneanxs #6373
- [6304][CH]Support array_join @KevinyhZou #6305
- [VL] Daily Update Velox Version (2024_07_09) @GlutenPerfBot #6376
- [6378][CH] Support delta count optimizer for the MergeTree format @zzcclp #6379
- [6345][CH] Deprecate SCALAR_FUNCTIONS in SerializedPlanParser @lgbo-ustc #6347
- [TEST] Use project version rather than Gluten version in Gluten it @ulysses-you #6385
- [6377][CH] Support window function percent_rank @lgbo-ustc #6386
- [VL] Minor refactor for ValueStream node construction and usage @Yohahaha #6382
- [VL] Enable levenshtein function @zhli1142015 #6389
- [VL] Daily Update Velox Version (2024_07_10) @GlutenPerfBot #6384
- [1632][CH]Daily Update Clickhouse Version (20240710) @kyligence-git #6383
- Test input_file_name, input_file_block_start & input_file_block_length when scan falls back @gaoyangxiaozhu #6318
- [6394][VL] Fix the vcpkg package script @weixiuli #6395
- [6288][CH] Support BroadcastNestedLoopJoinExe[Part one] @loneylee #6290
- [CELEBORN] Rename CelebornHashBasedColumnarShuffleWriter to CelebornColumnarShuffleWriter @kerwin-zk #6391
- [VL] Fix E function fallback issue in some condition @gaoyangxiaozhu #6397
- [CI] Fix centos7 failure @marin-ma #6404
- [1632][CH]Daily Update Clickhouse Version (20240711) @kyligence-git #6399
- [CELEBORN] Add compression for row-based shuffle @kerwin-zk #6380
- [VL] Daily Update Velox Version (2024_07_11) @GlutenPerfBot #6400
- [CORE] Remove local sort for TopNRowNumber @ulysses-you #6381
- [VL] Spark assert_true function support @gaoyangxiaozhu #6329
- [VL] Add schema validation for all operators @zhli1142015 #6406
- [CORE] Minor code cleanups against fallback tagging @zhztheplayer #6320
- [VL] Try to find arrow libs from velox bundled path firstly @PHILO-HE #6413
- [VL] disable tpch benchmarks on comment/merge @zhouyuan #6402
- [UT] Add a tool to validate any unary expression with all its accepted types @PHILO-HE #6392
- [CH] Fix a source file name typo @zhztheplayer #6412
- [VL] Fix Pi function fallback issue in some condition @gaoyangxiaozhu #6408
- [CELEBORN] VeloxCelebornColumnarBatchSerializer uses the key and default value of SHUFFLE_COMPRESS to check whether to compress shuffle output @SteNicholas #6414
- [VL] Quick fix for commit conflicts @zhztheplayer #6418
- [Doc] Update new supported spark functions @gaoyangxiaozhu #6423
- [VL] Add a test to validate substring_index @boneanxs #6393
- [VL] Fix shuffle spill triggered by evicting buffers during stop @marin-ma #6422
- [VL] Enable repeat function @zhli1142015 #6419
- [VL] Accelerate Arrow compile @jinchengchenghh #6426
- [CI][VL] Update docker image for CI @zhouyuan #6401
- [VL] Daily Update Velox Version (2024_07_12) @GlutenPerfBot #6417
- [VL] Daily Update Velox Version (2024_07_13) @GlutenPerfBot #6436
- [VL] Daily Update Velox Version (2024_07_14) @GlutenPerfBot #6441
- [VL] Set Arrow_SOURCE to AUTO to allow using system arrow libs @PHILO-HE #6325
- [CELEBORN] CHCelebornColumnarShuffleWriter supports celeborn.client.spark.shuffle.writer to use memory sort shuffle in ClickHouse backend @SteNicholas #6432
- [VL] Make sure the same thrift lib bundled in arrow build is used for building Velox @zhztheplayer #6431
- [CORE] Make SparkSession transient in HiveTableScanExecTransformer @yikf #6410
- [6176][CH] Add tpcds suite from decimal table schema @loneylee #6369
- [VL] Move dependencies setup ahead @PHILO-HE #6444
- [CH][CELEBORN] CHCelebornColumnarShuffleWriter supports celeborn.client.spark.shuffle.writer to use memory sort shuffle in ClickHouse backend @SteNicholas #6454
- [VL] Enable right and anti join in smj @JkSelf #6449
- [CH][CELEBORN] CHCelebornColumnarBatchSerializer uses AtomicBoolean to identify whether to call close() to avoid calling close() twice situation @SteNicholas #6455
- [CI][VL] Re-enable a build job running on clean dockers weekly @PHILO-HE #6424
- [CORE] Update LICENSE, NOTICE, LICENSE-binary, NOTICE-binary @weiting-chen #6443
- [CORE] Change DISCLAIMER to DISCLAIMER-WIP @weiting-chen #6442
- [VL] RAS: Minor code cleanup for offloading project @zhztheplayer #6452
- [VL] Add a way to create static build with docker container and gluten-te @zhztheplayer #6457
- [6467][CH] Minor Fix Build @baibaichen #6468
- [VL] Minor improvements and fixes for gluten-it and gluten-te @zhztheplayer #6471
- [CORE] Fix fallback for spark sequence function with literal array data as input @gaoyangxiaozhu #6433
- [VL] Fix offload input_file_name assert error @zml1206 #6390
- [VL] update docker image for cache-native-lib job @yma11 #6466
- [BUILD] Fix unbound variable @zml1206 #6474
- [VL] Daily Update Velox Version (2024_07_16) @GlutenPerfBot #6460
- [6437][BUILD] Fix vcpkg setup-build-dependens.sh for centos @wecharyu #6438
- [6470][CH]Fix Task not serializable error when inserting mergetree data @zzcclp #6473
- [6425][CH] Support day time interval @lgbo-ustc #6456
- [VL] remove redundant code in parquet datasource to avoid memory leakage PR6430 @liujp #6462
- [CORE] Spark version function support @gaoyangxiaozhu #6469
- [VL] Daily Update Velox Version (2024_07_17) @GlutenPerfBot #6479
- [VL] Minor improvements on gluten-it / gluten-te toolchains @zhztheplayer #6476
- [CH] Support merge MergeTree files @liuneng1994 #6472
- [6463][CH]refactor the code of parsing join parameters @lgbo-ustc #6485
- [1632][CH]Daily Update Clickhouse Version (20240718) @kyligence-git #6491
- [VL] Daily Update Velox Version (2024_07_18) @GlutenPerfBot #6492
- [6495][VL] Fix build issue: --build_arrow=ON wipes --build_type= setting silently @PHILO-HE #6498
- [VL] RAS: Make default rough cost model exhaustively offload computations @zhztheplayer #6493
- [VL] Print exception early when raised from ManagedReservationListener#unreserve @zhztheplayer https://github.com/apache/inc...
v1.2.1
Highlight
- 3 shuffle- and spill-related bug fixes
- 5 RSS (Celeborn, Uniffle) related bug fixes
- 4 compile- and package-related bug fixes
- 10 CI/CD-related bug fixes
- Move to OAP's Velox v1.2.2
- 4 major issues fixed in OAP's Velox
- More minor bug fixes; see the full list below
What's Changed
- [VL][1.2] Upgrade GHA artifacts version to 3 by @weiting-chen in #7293
- [VL] Port CI changes to branch-1.2 and pick simdjson related fix by @PHILO-HE in #7314
- [CORE][1.2] Bump branch-1.2 version to 1.2.1-SNAPSHOT by @weiting-chen in #7290
- [VL] Follow-up fix for #7314 on branch-1.2: skip data gen in oom test by @PHILO-HE in #7329
- [VL][1.2] Install devtoolset-9 for centos7 build native lib GHA by @wForget in #7678
- [GLUTEN-7037][VL][1.2] Add dwarf dependency to folly when building with vcpkg by @wForget in #7699
- [CK][1.2] Support trigger CK Backend CI/CD in branch1.2 by @weiting-chen in #7936
- [VL][1.2] Port 6563 6679 for build options and collectQueryExecutionFallbackSummary fix by @weiting-chen in #7919
- [VL][1.2] Port 6432 6657 for Celeborn bug fix in branch 1.2 by @weiting-chen in #7922
- [VL][1.2] Port #6573 #7025 #7132 by @weiting-chen in #7973
- [VL][1.2] Port #6560 #6569 #6730 #7117 for vcpkg issue fix by @weiting-chen in #7974
- [VL][1.2] Port #7121 #7448 by @weiting-chen in #7988
- [VL] Branch 1.2: Backport fixes for #7243 by @zhztheplayer in #7943
- [GLUTEN-7126][CORE][1.2] Fix issue that unsupported join type in BNLJ is not fallback by @ccat3z in #7569
- [GLUTEN-7126][VL][1.2] Port Fix shuffle spill triggered by evicting buffers during stop (#6422) by @kecookier in #7991
- [GLUTEN-7126][VL][1.2] Port #6698 #7525 #7560 for Uniffle bug fix by @weiting-chen in #7994
- [VL] Branch 1.2: Port #8047 to fix libelf link by @zhztheplayer in #8059
- [VL][Branch-1.2] Port #8034 & #8027 for fixing march flag setting and #8042 for fixing GHA failure on centos-7 by @PHILO-HE in #8075
- [CORE][BRANCH-1.2] Port #7861 to fix OOM in shuffle writer by @ccat3z in #8078
- [1.2] Preparing for Gluten v1.2.1-rc0 by @weiting-chen in #8110
Full Changelog: v1.2.0...v1.2.1
v1.2.1-rc0
What's Changed
- [VL][1.2] Upgrade GHA artifacts version to 3 by @weiting-chen in #7293
- [VL] Port CI changes to branch-1.2 and pick simdjson related fix by @PHILO-HE in #7314
- [CORE][1.2] Bump branch-1.2 version to 1.2.1-SNAPSHOT by @weiting-chen in #7290
- [VL] Follow-up fix for #7314 on branch-1.2: skip data gen in oom test by @PHILO-HE in #7329
- [VL][1.2] Install devtoolset-9 for centos7 build native lib GHA by @wForget in #7678
- [GLUTEN-7037][VL][1.2] Add dwarf dependency to folly when building with vcpkg by @wForget in #7699
- [CK][1.2] Support trigger CK Backend CI/CD in branch1.2 by @weiting-chen in #7936
- [VL][1.2] Port 6563 6679 for build options and collectQueryExecutionFallbackSummary fix by @weiting-chen in #7919
- [VL][1.2] Port 6432 6657 for Celeborn bug fix in branch 1.2 by @weiting-chen in #7922
- [VL][1.2] Port #6573 #7025 #7132 by @weiting-chen in #7973
- [VL][1.2] Port #6560 #6569 #6730 #7117 for vcpkg issue fix by @weiting-chen in #7974
- [VL][1.2] Port #7121 #7448 by @weiting-chen in #7988
- [VL] Branch 1.2: Backport fixes for #7243 by @zhztheplayer in #7943
- [GLUTEN-7126][CORE][1.2] Fix issue that unsupported join type in BNLJ is not fallback by @ccat3z in #7569
- [GLUTEN-7126][VL][1.2] Port Fix shuffle spill triggered by evicting buffers during stop (#6422) by @kecookier in #7991
- [GLUTEN-7126][VL][1.2] Port #6698 #7525 #7560 for Uniffle bug fix by @weiting-chen in #7994
- [VL] Branch 1.2: Port #8047 to fix libelf link by @zhztheplayer in #8059
- [VL][Branch-1.2] Port #8034 & #8027 for fixing march flag setting and #8042 for fixing GHA failure on centos-7 by @PHILO-HE in #8075
- [CORE][BRANCH-1.2] Port #7861 to fix OOM in shuffle writer by @ccat3z in #8078
- [1.2] Preparing for Gluten v1.2.1-rc0 by @weiting-chen in #8110
Full Changelog: v1.2.0...v1.2.1-rc0
v1.2.1-preview
What's Changed
- [VL][1.2] Upgrade GHA artifacts version to 3 by @weiting-chen in #7293
- [VL] Port CI changes to branch-1.2 and pick simdjson related fix by @PHILO-HE in #7314
- [CORE][1.2] Bump branch-1.2 version to 1.2.1-SNAPSHOT by @weiting-chen in #7290
- [VL] Follow-up fix for #7314 on branch-1.2: skip data gen in oom test by @PHILO-HE in #7329
- [VL][1.2] Install devtoolset-9 for centos7 build native lib GHA by @wForget in #7678
- [GLUTEN-7037][VL][1.2] Add dwarf dependency to folly when building with vcpkg by @wForget in #7699
- [CK][1.2] Support trigger CK Backend CI/CD in branch1.2 by @weiting-chen in #7936
- [VL][1.2] Port 6563 6679 for build options and collectQueryExecutionFallbackSummary fix by @weiting-chen in #7919
- [VL][1.2] Port 6432 6657 for Celeborn bug fix in branch 1.2 by @weiting-chen in #7922
- [VL][1.2] Port #6573 #7025 #7132 by @weiting-chen in #7973
- [VL][1.2] Port #6560 #6569 #6730 #7117 for vcpkg issue fix by @weiting-chen in #7974
- [VL][1.2] Port #7121 #7448 by @weiting-chen in #7988
- [VL] Branch 1.2: Backport fixes for #7243 by @zhztheplayer in #7943
- [GLUTEN-7126][CORE][1.2] Fix issue that unsupported join type in BNLJ is not fallback by @ccat3z in #7569
- [GLUTEN-7126][VL][1.2] Port Fix shuffle spill triggered by evicting buffers during stop (#6422) by @kecookier in #7991
- [GLUTEN-7126][VL][1.2] Port #6698 #7525 #7560 for Uniffle bug fix by @weiting-chen in #7994
Full Changelog: v1.2.0...v1.2.1-preview
v1.2.0
Release Notes - Gluten version 1.2.0
We are pleased to announce that Gluten v1.2.0 has been published as the first official Apache release.
Highlights (Velox backend only)
- Support Spark 3.2.2, 3.3.1, 3.4.2, and 3.5.1 with all UTs passing (where the data type is supported)
- Support 31 common Spark operators (based on Spark 3.2)
- Support 266 common Spark functions (based on Spark 3.2)
- Velox codebase updated to 2024/07/05
- New RSS support: add Apache Uniffle integration
- New Data Lake support: Iceberg, Delta Lake
- New File Format support: CSV
- Enhanced CI workflow
- Refreshed documentation on the Gluten website (https://gluten.apache.org/)
- Improved stability for spill, OOM, and other cases
- More bug fixes
What's Changed
- [CORE] Move all columnar rules to post-columnar transitions by @zhztheplayer in #4790
- [GLUTEN-4398][FOLLOW] Mask PullOutPostProject and PullOutPreProject id by @zwangsheng in #4815
- [GLUTEN-2956][VL] Support Spark NullType by @PHILO-HE in #2996
- [CORE] Add logical link to rewritten spark plan by @ulysses-you in #4817
- [GLUTEN-4803][UT] Add Golden Files for TPC-H Spark33 + Gluten Execution Plan by @zwangsheng in #4804
- [VL] Allow replacing installed minio package by @PHILO-HE in #4825
- [VL] Daily Update Velox Version (2024_03_01) by @GlutenPerfBot in #4821
- [VL] Enable more tests of GlutenParquetIOSuite for Spark32/33/34 by @Yohahaha in #4823
- [GLUTEN-1632][CH]Daily Update Clickhouse Version (20240302) by @lwz9103 in #4837
- [GLUTEN-4039][VL] support map_keys and map_values by @konjac in #4826
- [GLUTEN-4424][CORE] Upgrade spark version to 3.5.1 in Gluten by @JkSelf in #4822
- [VL] Daily Update Velox Version (2024_03_04) by @GlutenPerfBot in #4841
- [GLUTEN-4813] Replace resize/reserve to resize_extact/reserve_exact to reduce memory by @taiyang-li in #4824
- [GLUTEN-1632][CH]Daily Update Clickhouse Version (20240305) by @lwz9103 in #4849
- [VL] Fix boost installation issue and remove useless QueryCtx by @PHILO-HE in #4850
- [VL] Enable "parquet v2 pages - delta encoding" test for Spark33/Spark34 by @Yohahaha in #4816
- [CORE] Support FileSourceScanExec driver metrics for spark3.4/3.5 by @zhli1142015 in #4848
- [GLUTEN-4772][VL] Support empty map/array literal by @WangGuangxin in #4771
- [GLUTEN-4860][CELEBORN] Replace celeborn link by @kerwin-zk in #4861
- [VL][CI] Fix CI failure related to Celeborn by @PHILO-HE in #4862
- [CORE] Support In list option contains non-foldable expression by @ulysses-you in #4843
- [VL] Daily Update Velox Version (2024_03_05) by @GlutenPerfBot in #4852
- [VL] Enable more tests in GlutenParquetQuerySuite for Spark32/33/34 by @Yohahaha in #4854
- [CORE] ColumnarShuffleExchangeExec should respect advisoryPartitionSize for Spark3.5 by @ulysses-you in #4865
- [GLUTEN-4853][CORE] Only trim Alias when its child is semantically equal to resAttr by @liujiayi771 in #4857
- [VL] minor change for delta ut by @zhli1142015 in #4869
- [VL] Add libsodium.so to thirdparty lib for CentOS8 by @kerwin-zk in #4870
- [VL] Updated documentation, refactoring and added more testcases for BNLJ by @Surbhi-Vijay in #4782
- [VL] Daily Update Velox Version (2024_03_06) by @GlutenPerfBot in #4868
- [MINOR] Remove ExtendedAnalysisException by @PHILO-HE in #4864
- [GLUTEN-4831][VL] Support StructType in HashAggregate by @WangGuangxin in #4832
- [VL] Support inline function by @marin-ma in #4847
- [VL] Add flushable decimal sum test case by @liujiayi771 in #4871
- [CORE] Add synchronized for ExplainUtils processPlan by @ulysses-you in #4876
- [VL] Rewrite collect_set and collect_list aggregate function by @ulysses-you in #4805
- [VL] Fix and use flattenVector by @marin-ma in #4783
- [VL] Enable tests of ParquetPartitionDisconverySuite for Spark33/34 by @Yohahaha in #4881
- [CORE] Minor adjustment to columnar rule list, and move all columnar sub-rules to one source folder by @zhztheplayer in #4863
- [VL] Merge Partial and PartialMerge logic in generateMergeCompanionNode by @liujiayi771 in #4883
- [CORE] Fix Spark-3.5 CI by @ulysses-you in #4886
- [GLUTEN-4424][CORE] Follow up upgrading spark version to 3.5.1 by @JkSelf in #4845
- Add .asf.yml by @yaooqinn in #4892
- Update Vulnerability Handling Process by @yaooqinn in #4894
- [VL] Daily Update Velox Version (2024_03_07) by @GlutenPerfBot in #4877
- [GLUTEN-1632][CH]Daily Update Clickhouse Version (20240308) by @lwz9103 in #4890
- [CORE] ColumnarBroadcastExchangeExec should set/cancel with job tag for Spark3.5 by @ulysses-you in #4882
- [VL] Daily Update Velox Version (2024_03_08) by @GlutenPerfBot in #4895
- [VL] Pass partition id to velox functions by @zhli1142015 in #4344
- Add Incubation Standard Disclaimer by @yaooqinn in #4911
- [GLUTEN-4835][CORE] Match metric names with Spark by @clee704 in #4834
- [Gluten-4732][CH] delta-mergetree support update/delete/upsert/insert in a more native delta way by @binmahone in #4733
- [GLUTEN-4898][CH]Bug fix to date diff by @KevinyhZou in #4900
- [VL] Daily Update Velox Version(2024_03_11) by @GlutenPerfBot in #4908
- [DOC] Update release & configuration doc by @PHILO-HE in #4910
- [VL] Support lead window function by @ulysses-you in #4902
- [VL] Fix protobuf configure arguments in get_velox.sh by @liujiayi771 in #4920
- [Gluten-4918][CH]support CTAS for clickhouse table by @binmahone in #4919
- [GLUTEN-4926][CELEBORN] CelebornShuffleManager should remove shuffleId from columnarShuffleIds after unregistering shuffle by @SteNicholas in #4927
- [Gluten-4912][CH]Support Specifying columns in clickhouse tables to b… by @binmahone in #4925
- [Gluten-4706] [CH][CORE] Add a mode to execute count distinct directly instead o… by @binmahone in #4708
- [VL] Daily Update Velox Version (2024_03_12) by @GlutenPerfBot in #4923
- [GLUTEN-4914][CH] Fix exceptions in ASTParser by @taiyang-li in #4916
- [DOC] Minor fix for wrong gluten folder used in doc by @leoluan2009 in #4938
- [VL] Refine log plan/split json into one line by @Yohahaha in #4934
- [VL] Support posexplode function and code refactoring on GenerateExecTransformer by @marin-ma in #4901
- [CORE] Prior to #4893, add vanilla Spark's original scan source code to keep git history by @zhztheplayer in #4931
- [VL] Fix wrong plan equality due to case class inheritance by @zhztheplayer in #4893
- [GLUTEN-3559][VL] enable more sql query tests for Spark34 by @zhouyuan in #4880
- [VL] Daily Update Velox Version (2024_03_13) by @GlutenPerfBot in #4944
- [VL]Bucket join support for Iceberg tables by @SinghAsDev in #4859
- [GLUTEN-4827][UT] Add Golden Files for TPC-H Spark34 + Gluten Execution Plan by @zwangsheng in https://github.com/apache/i...
v1.2.0-rc3
What's Changed
- [CORE] Move all columnar rules to post-columnar transitions by @zhztheplayer in #4790
- [GLUTEN-4398][FOLLOW] Mask PullOutPostProject and PullOutPreProject id by @zwangsheng in #4815
- [GLUTEN-2956][VL] Support Spark NullType by @PHILO-HE in #2996
- [CORE] Add logical link to rewritten spark plan by @ulysses-you in #4817
- [GLUTEN-4803][UT] Add Golden Files for TPC-H Spark33 + Gluten Execution Plan by @zwangsheng in #4804
- [VL] Allow replacing installed minio package by @PHILO-HE in #4825
- [VL] Daily Update Velox Version (2024_03_01) by @GlutenPerfBot in #4821
- [VL] Enable more tests of GlutenParquetIOSuite for Spark32/33/34 by @Yohahaha in #4823
- [GLUTEN-1632][CH]Daily Update Clickhouse Version (20240302) by @lwz9103 in #4837
- [GLUTEN-4039][VL] support map_keys and map_values by @konjac in #4826
- [GLUTEN-4424][CORE] Upgrade spark version to 3.5.1 in Gluten by @JkSelf in #4822
- [VL] Daily Update Velox Version (2024_03_04) by @GlutenPerfBot in #4841
- [GLUTEN-4813] Replace resize/reserve to resize_extact/reserve_exact to reduce memory by @taiyang-li in #4824
- [GLUTEN-1632][CH]Daily Update Clickhouse Version (20240305) by @lwz9103 in #4849
- [VL] Fix boost installation issue and remove useless QueryCtx by @PHILO-HE in #4850
- [VL] Enable "parquet v2 pages - delta encoding" test for Spark33/Spark34 by @Yohahaha in #4816
- [CORE] Support FileSourceScanExec driver metrics for spark3.4/3.5 by @zhli1142015 in #4848
- [GLUTEN-4772][VL] Support empty map/array literal by @WangGuangxin in #4771
- [GLUTEN-4860][CELEBORN] Replace celeborn link by @kerwin-zk in #4861
- [VL][CI] Fix CI failure related to Celeborn by @PHILO-HE in #4862
- [CORE] Support In list option contains non-foldable expression by @ulysses-you in #4843
- [VL] Daily Update Velox Version (2024_03_05) by @GlutenPerfBot in #4852
- [VL] Enable more tests in GlutenParquetQuerySuite for Spark32/33/34 by @Yohahaha in #4854
- [CORE] ColumnarShuffleExchangeExec should respect advisoryPartitionSize for Spark3.5 by @ulysses-you in #4865
- [GLUTEN-4853][CORE] Only trim Alias when its child is semantically equal to resAttr by @liujiayi771 in #4857
- [VL] minor change for delta ut by @zhli1142015 in #4869
- [VL] Add libsodium.so to thirdparty lib for CentOS8 by @kerwin-zk in #4870
- [VL] Updated documentation, refactoring and added more testcases for BNLJ by @Surbhi-Vijay in #4782
- [VL] Daily Update Velox Version (2024_03_06) by @GlutenPerfBot in #4868
- [MINOR] Remove ExtendedAnalysisException by @PHILO-HE in #4864
- [GLUTEN-4831][VL] Support StructType in HashAggregate by @WangGuangxin in #4832
- [VL] Support inline function by @marin-ma in #4847
- [VL] Add flushable decimal sum test case by @liujiayi771 in #4871
- [CORE] Add synchronized for ExplainUtils processPlan by @ulysses-you in #4876
- [VL] Rewrite collect_set and collect_list aggregate function by @ulysses-you in #4805
- [VL] Fix and use flattenVector by @marin-ma in #4783
- [VL] Enable tests of ParquetPartitionDisconverySuite for Spark33/34 by @Yohahaha in #4881
- [CORE] Minor adjustment to columnar rule list, and move all columnar sub-rules to one source folder by @zhztheplayer in #4863
- [VL] Merge Partial and PartialMerge logic in generateMergeCompanionNode by @liujiayi771 in #4883
- [CORE] Fix Spark-3.5 CI by @ulysses-you in #4886
- [GLUTEN-4424][CORE] Follow up upgrading spark version to 3.5.1 by @JkSelf in #4845
- Add .asf.yml by @yaooqinn in #4892
- Update Vulnerability Handling Process by @yaooqinn in #4894
- [VL] Daily Update Velox Version (2024_03_07) by @GlutenPerfBot in #4877
- [GLUTEN-1632][CH]Daily Update Clickhouse Version (20240308) by @lwz9103 in #4890
- [CORE] ColumnarBroadcastExchangeExec should set/cancel with job tag for Spark3.5 by @ulysses-you in #4882
- [VL] Daily Update Velox Version (2024_03_08) by @GlutenPerfBot in #4895
- [VL] Pass partition id to velox functions by @zhli1142015 in #4344
- Add Incubation Standard Disclaimer by @yaooqinn in #4911
- [GLUTEN-4835][CORE] Match metric names with Spark by @clee704 in #4834
- [Gluten-4732][CH] delta-mergetree support update/delete/upsert/insert in a more native delta way by @binmahone in #4733
- [GLUTEN-4898][CH]Bug fix to date diff by @KevinyhZou in #4900
- [VL] Daily Update Velox Version(2024_03_11) by @GlutenPerfBot in #4908
- [DOC] Update release & configuration doc by @PHILO-HE in #4910
- [VL] Support lead window function by @ulysses-you in #4902
- [VL] Fix protobuf configure arguments in get_velox.sh by @liujiayi771 in #4920
- [Gluten-4918][CH]support CTAS for clickhouse table by @binmahone in #4919
- [GLUTEN-4926][CELEBORN] CelebornShuffleManager should remove shuffleId from columnarShuffleIds after unregistering shuffle by @SteNicholas in #4927
- [Gluten-4912][CH]Support Specifying columns in clickhouse tables to b… by @binmahone in #4925
- [Gluten-4706] [CH][CORE] Add a mode to execute count distinct directly instead o… by @binmahone in #4708
- [VL] Daily Update Velox Version (2024_03_12) by @GlutenPerfBot in #4923
- [GLUTEN-4914][CH] Fix exceptions in ASTParser by @taiyang-li in #4916
- [DOC] Minor fix for wrong gluten folder used in doc by @leoluan2009 in #4938
- [VL] Refine log plan/split json into one line by @Yohahaha in #4934
- [VL] Support posexplode function and code refactoring on GenerateExecTransformer by @marin-ma in #4901
- [CORE] Prior to #4893, add vanilla Spark's original scan source code to keep git history by @zhztheplayer in #4931
- [VL] Fix wrong plan equality due to case class inheritance by @zhztheplayer in #4893
- [GLUTEN-3559][VL] enable more sql query tests for Spark34 by @zhouyuan in #4880
- [VL] Daily Update Velox Version (2024_03_13) by @GlutenPerfBot in #4944
- [VL]Bucket join support for Iceberg tables by @SinghAsDev in #4859
- [GLUTEN-4827][UT] Add Golden Files for TPC-H Spark34 + Gluten Execution Plan by @zwangsheng in #4828
- [VL] Verify unhex has been offloaded to native successfully by @Yohahaha in #4937
- [VL] Support skewness aggregate function by @liujiayi771 in #4939
- [GLUTEN-1632][CH]Daily Update Clickhouse Version (20240314) by @lwz9103 in #4948
- [VL] parquet file metadata columns support in velox by @gaoyangxiaozhu in #3870
- [VL] Daily Update Velox Version (2024_03_14) by @GlutenPerfBot in #4949
- [VL] Untangle code of TransformPreOverrides by @zhztheplayer in ht...
v1.2.0-rc2
What's Changed
- [CORE] Move all columnar rules to post-columnar transitions by @zhztheplayer in #4790
- [GLUTEN-4398][FOLLOW] Mask PullOutPostProject and PullOutPreProject id by @zwangsheng in #4815
- [GLUTEN-2956][VL] Support Spark NullType by @PHILO-HE in #2996
- [CORE] Add logical link to rewritten spark plan by @ulysses-you in #4817
- [GLUTEN-4803][UT] Add Golden Files for TPC-H Spark33 + Gluten Execution Plan by @zwangsheng in #4804
- [VL] Allow replacing installed minio package by @PHILO-HE in #4825
- [VL] Daily Update Velox Version (2024_03_01) by @GlutenPerfBot in #4821
- [VL] Enable more tests of GlutenParquetIOSuite for Spark32/33/34 by @Yohahaha in #4823
- [GLUTEN-1632][CH]Daily Update Clickhouse Version (20240302) by @lwz9103 in #4837
- [GLUTEN-4039][VL] support map_keys and map_values by @konjac in #4826
- [GLUTEN-4424][CORE] Upgrade spark version to 3.5.1 in Gluten by @JkSelf in #4822
- [VL] Daily Update Velox Version (2024_03_04) by @GlutenPerfBot in #4841
- [GLUTEN-4813] Replace resize/reserve to resize_extact/reserve_exact to reduce memory by @taiyang-li in #4824
- [GLUTEN-1632][CH]Daily Update Clickhouse Version (20240305) by @lwz9103 in #4849
- [VL] Fix boost installation issue and remove useless QueryCtx by @PHILO-HE in #4850
- [VL] Enable "parquet v2 pages - delta encoding" test for Spark33/Spark34 by @Yohahaha in #4816
- [CORE] Support FileSourceScanExec driver metrics for spark3.4/3.5 by @zhli1142015 in #4848
- [GLUTEN-4772][VL] Support empty map/array literal by @WangGuangxin in #4771
- [GLUTEN-4860][CELEBORN] Replace celeborn link by @kerwin-zk in #4861
- [VL][CI] Fix CI failure related to Celeborn by @PHILO-HE in #4862
- [CORE] Support In list option contains non-foldable expression by @ulysses-you in #4843
- [VL] Daily Update Velox Version (2024_03_05) by @GlutenPerfBot in #4852
- [VL] Enable more tests in GlutenParquetQuerySuite for Spark32/33/34 by @Yohahaha in #4854
- [CORE] ColumnarShuffleExchangeExec should respect advisoryPartitionSize for Spark3.5 by @ulysses-you in #4865
- [GLUTEN-4853][CORE] Only trim Alias when its child is semantically equal to resAttr by @liujiayi771 in #4857
- [VL] minor change for delta ut by @zhli1142015 in #4869
- [VL] Add libsodium.so to thirdparty lib for CentOS8 by @kerwin-zk in #4870
- [VL] Updated documentation, refactoring and added more testcases for BNLJ by @Surbhi-Vijay in #4782
- [VL] Daily Update Velox Version (2024_03_06) by @GlutenPerfBot in #4868
- [MINOR] Remove ExtendedAnalysisException by @PHILO-HE in #4864
- [GLUTEN-4831][VL] Support StructType in HashAggregate by @WangGuangxin in #4832
- [VL] Support inline function by @marin-ma in #4847
- [VL] Add flushable decimal sum test case by @liujiayi771 in #4871
- [CORE] Add synchronized for ExplainUtils processPlan by @ulysses-you in #4876
- [VL] Rewrite collect_set and collect_list aggregate function by @ulysses-you in #4805
- [VL] Fix and use flattenVector by @marin-ma in #4783
- [VL] Enable tests of ParquetPartitionDisconverySuite for Spark33/34 by @Yohahaha in #4881
- [CORE] Minor adjustment to columnar rule list, and move all columnar sub-rules to one source folder by @zhztheplayer in #4863
- [VL] Merge Partial and PartialMerge logic in generateMergeCompanionNode by @liujiayi771 in #4883
- [CORE] Fix Spark-3.5 CI by @ulysses-you in #4886
- [GLUTEN-4424][CORE] Follow up upgrading spark version to 3.5.1 by @JkSelf in #4845
- Add .asf.yml by @yaooqinn in #4892
- Update Vulnerability Handling Process by @yaooqinn in #4894
- [VL] Daily Update Velox Version (2024_03_07) by @GlutenPerfBot in #4877
- [GLUTEN-1632][CH]Daily Update Clickhouse Version (20240308) by @lwz9103 in #4890
- [CORE] ColumnarBroadcastExchangeExec should set/cancel with job tag for Spark3.5 by @ulysses-you in #4882
- [VL] Daily Update Velox Version (2024_03_08) by @GlutenPerfBot in #4895
- [VL] Pass partition id to velox functions by @zhli1142015 in #4344
- Add Incubation Standard Disclaimer by @yaooqinn in #4911
- [GLUTEN-4835][CORE] Match metric names with Spark by @clee704 in #4834
- [Gluten-4732][CH] delta-mergetree support update/delete/upsert/insert in a more native delta way by @binmahone in #4733
- [GLUTEN-4898][CH]Bug fix to date diff by @KevinyhZou in #4900
- [VL] Daily Update Velox Version(2024_03_11) by @GlutenPerfBot in #4908
- [DOC] Update release & configuration doc by @PHILO-HE in #4910
- [VL] Support lead window function by @ulysses-you in #4902
- [VL] Fix protobuf configure arguments in get_velox.sh by @liujiayi771 in #4920
- [Gluten-4918][CH]support CTAS for clickhouse table by @binmahone in #4919
- [GLUTEN-4926][CELEBORN] CelebornShuffleManager should remove shuffleId from columnarShuffleIds after unregistering shuffle by @SteNicholas in #4927
- [Gluten-4912][CH]Support Specifying columns in clickhouse tables to b… by @binmahone in #4925
- [Gluten-4706] [CH][CORE] Add a mode to execute count distinct directly instead o… by @binmahone in #4708
- [VL] Daily Update Velox Version (2024_03_12) by @GlutenPerfBot in #4923
- [GLUTEN-4914][CH] Fix exceptions in ASTParser by @taiyang-li in #4916
- [DOC] Minor fix for wrong gluten folder used in doc by @leoluan2009 in #4938
- [VL] Refine log plan/split json into one line by @Yohahaha in #4934
- [VL] Support posexplode function and code refactoring on GenerateExecTransformer by @marin-ma in #4901
- [CORE] Prior to #4893, add vanilla Spark's original scan source code to keep git history by @zhztheplayer in #4931
- [VL] Fix wrong plan equality due to case class inheritance by @zhztheplayer in #4893
- [GLUTEN-3559][VL] enable more sql query tests for Spark34 by @zhouyuan in #4880
- [VL] Daily Update Velox Version (2024_03_13) by @GlutenPerfBot in #4944
- [VL]Bucket join support for Iceberg tables by @SinghAsDev in #4859
- [GLUTEN-4827][UT] Add Golden Files for TPC-H Spark34 + Gluten Execution Plan by @zwangsheng in #4828
- [VL] Verify unhex has been offloaded to native successfully by @Yohahaha in #4937
- [VL] Support skewness aggregate function by @liujiayi771 in #4939
- [GLUTEN-1632][CH]Daily Update Clickhouse Version (20240314) by @lwz9103 in #4948
- [VL] parquet file metadata columns support in velox by @gaoyangxiaozhu in #3870
- [VL] Daily Update Velox Version (2024_03_14) by @GlutenPerfBot in #4949
- [VL] Untangle code of TransformPreOverrides by @zhztheplayer in ht...
v1.2.0-rc1
What's Changed
- [CORE] Move all columnar rules to post-columnar transitions by @zhztheplayer in #4790
- [GLUTEN-4398][FOLLOW] Mask PullOutPostProject and PullOutPreProject id by @zwangsheng in #4815
- [GLUTEN-2956][VL] Support Spark NullType by @PHILO-HE in #2996
- [CORE] Add logical link to rewritten spark plan by @ulysses-you in #4817
- [GLUTEN-4803][UT] Add Golden Files for TPC-H Spark33 + Gluten Execution Plan by @zwangsheng in #4804
- [VL] Allow replacing installed minio package by @PHILO-HE in #4825
- [VL] Daily Update Velox Version (2024_03_01) by @GlutenPerfBot in #4821
- [VL] Enable more tests of GlutenParquetIOSuite for Spark32/33/34 by @Yohahaha in #4823
- [GLUTEN-1632][CH]Daily Update Clickhouse Version (20240302) by @lwz9103 in #4837
- [GLUTEN-4039][VL] support map_keys and map_values by @konjac in #4826
- [GLUTEN-4424][CORE] Upgrade spark version to 3.5.1 in Gluten by @JkSelf in #4822
- [VL] Daily Update Velox Version (2024_03_04) by @GlutenPerfBot in #4841
- [GLUTEN-4813] Replace resize/reserve to resize_extact/reserve_exact to reduce memory by @taiyang-li in #4824
- [GLUTEN-1632][CH]Daily Update Clickhouse Version (20240305) by @lwz9103 in #4849
- [VL] Fix boost installation issue and remove useless QueryCtx by @PHILO-HE in #4850
- [VL] Enable "parquet v2 pages - delta encoding" test for Spark33/Spark34 by @Yohahaha in #4816
- [CORE] Support FileSourceScanExec driver metrics for spark3.4/3.5 by @zhli1142015 in #4848
- [GLUTEN-4772][VL] Support empty map/array literal by @WangGuangxin in #4771
- [GLUTEN-4860][CELEBORN] Replace celeborn link by @kerwin-zk in #4861
- [VL][CI] Fix CI failure related to Celeborn by @PHILO-HE in #4862
- [CORE] Support In list option contains non-foldable expression by @ulysses-you in #4843
- [VL] Daily Update Velox Version (2024_03_05) by @GlutenPerfBot in #4852
- [VL] Enable more tests in GlutenParquetQuerySuite for Spark32/33/34 by @Yohahaha in #4854
- [CORE] ColumnarShuffleExchangeExec should respect advisoryPartitionSize for Spark3.5 by @ulysses-you in #4865
- [GLUTEN-4853][CORE] Only trim Alias when its child is semantically equal to resAttr by @liujiayi771 in #4857
- [VL] minor change for delta ut by @zhli1142015 in #4869
- [VL] Add libsodium.so to thirdparty lib for CentOS8 by @kerwin-zk in #4870
- [VL] Updated documentation, refactoring and added more testcases for BNLJ by @Surbhi-Vijay in #4782
- [VL] Daily Update Velox Version (2024_03_06) by @GlutenPerfBot in #4868
- [MINOR] Remove ExtendedAnalysisException by @PHILO-HE in #4864
- [GLUTEN-4831][VL] Support StructType in HashAggregate by @WangGuangxin in #4832
- [VL] Support inline function by @marin-ma in #4847
- [VL] Add flushable decimal sum test case by @liujiayi771 in #4871
- [CORE] Add synchronized for ExplainUtils processPlan by @ulysses-you in #4876
- [VL] Rewrite collect_set and collect_list aggregate function by @ulysses-you in #4805
- [VL] Fix and use flattenVector by @marin-ma in #4783
- [VL] Enable tests of ParquetPartitionDisconverySuite for Spark33/34 by @Yohahaha in #4881
- [CORE] Minor adjustment to columnar rule list, and move all columnar sub-rules to one source folder by @zhztheplayer in #4863
- [VL] Merge Partial and PartialMerge logic in generateMergeCompanionNode by @liujiayi771 in #4883
- [CORE] Fix Spark-3.5 CI by @ulysses-you in #4886
- [GLUTEN-4424][CORE] Follow up upgrading spark version to 3.5.1 by @JkSelf in #4845
- Add .asf.yml by @yaooqinn in #4892
- Update Vulnerability Handling Process by @yaooqinn in #4894
- [VL] Daily Update Velox Version (2024_03_07) by @GlutenPerfBot in #4877
- [GLUTEN-1632][CH]Daily Update Clickhouse Version (20240308) by @lwz9103 in #4890
- [CORE] ColumnarBroadcastExchangeExec should set/cancel with job tag for Spark3.5 by @ulysses-you in #4882
- [VL] Daily Update Velox Version (2024_03_08) by @GlutenPerfBot in #4895
- [VL] Pass partition id to velox functions by @zhli1142015 in #4344
- Add Incubation Standard Disclaimer by @yaooqinn in #4911
- [GLUTEN-4835][CORE] Match metric names with Spark by @clee704 in #4834
- [Gluten-4732][CH] delta-mergetree support update/delete/upsert/insert in a more native delta way by @binmahone in #4733
- [GLUTEN-4898][CH]Bug fix to date diff by @KevinyhZou in #4900
- [VL] Daily Update Velox Version(2024_03_11) by @GlutenPerfBot in #4908
- [DOC] Update release & configuration doc by @PHILO-HE in #4910
- [VL] Support lead window function by @ulysses-you in #4902
- [VL] Fix protobuf configure arguments in get_velox.sh by @liujiayi771 in #4920
- [Gluten-4918][CH]support CTAS for clickhouse table by @binmahone in #4919
- [GLUTEN-4926][CELEBORN] CelebornShuffleManager should remove shuffleId from columnarShuffleIds after unregistering shuffle by @SteNicholas in #4927
- [Gluten-4912][CH]Support Specifying columns in clickhouse tables to b… by @binmahone in #4925
- [Gluten-4706] [CH][CORE] Add a mode to execute count distinct directly instead o… by @binmahone in #4708
- [VL] Daily Update Velox Version (2024_03_12) by @GlutenPerfBot in #4923
- [GLUTEN-4914][CH] Fix exceptions in ASTParser by @taiyang-li in #4916
- [DOC] Minor fix for wrong gluten folder used in doc by @leoluan2009 in #4938
- [VL] Refine log plan/split json into one line by @Yohahaha in #4934
- [VL] Support posexplode function and code refactoring on GenerateExecTransformer by @marin-ma in #4901
- [CORE] Prior to #4893, add vanilla Spark's original scan source code to keep git history by @zhztheplayer in #4931
- [VL] Fix wrong plan equality due to case class inheritance by @zhztheplayer in #4893
- [GLUTEN-3559][VL] enable more sql query tests for Spark34 by @zhouyuan in #4880
- [VL] Daily Update Velox Version (2024_03_13) by @GlutenPerfBot in #4944
- [VL]Bucket join support for Iceberg tables by @SinghAsDev in #4859
- [GLUTEN-4827][UT] Add Golden Files for TPC-H Spark34 + Gluten Execution Plan by @zwangsheng in #4828
- [VL] Verify unhex has been offloaded to native successfully by @Yohahaha in #4937
- [VL] Support skewness aggregate function by @liujiayi771 in #4939
- [GLUTEN-1632][CH]Daily Update Clickhouse Version (20240314) by @lwz9103 in #4948
- [VL] parquet file metadata columns support in velox by @gaoyangxiaozhu in #3870
- [VL] Daily Update Velox Version (2024_03_14) by @GlutenPerfBot in #4949
- [VL] Untangle code of TransformPreOverrides by @zhztheplayer in ht...