
Ray-2.41.0

@aslonnie aslonnie released this 23 Jan 10:02
· 89 commits to master since this release
021baf7

Highlights

  • Major update of RLlib docs and example scripts for the new API stack.

Ray Libraries

Ray Data

🎉 New Features:

  • Expression support for filters (#49016)
  • Support partition_cols in write_parquet (#49411)
  • Implement multi-directional sort over Ray Data datasets (#49281)
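
A minimal pure-Python sketch of the multi-directional (multi-key) sort idea — per-column ascending/descending order applied with a stable sort — illustrative only, not Ray Data's implementation:

```python
# Rows to order by city ascending, then price descending.
rows = [
    {"city": "b", "price": 3},
    {"city": "a", "price": 3},
    {"city": "a", "price": 1},
]

def multi_sort(rows, keys):
    # keys: list of (column, descending); applying them right-to-left
    # with a stable sort yields the combined multi-key ordering.
    for col, desc in reversed(keys):
        rows = sorted(rows, key=lambda r: r[col], reverse=desc)
    return rows

out = multi_sort(rows, [("city", False), ("price", True)])
print(out[0])  # {'city': 'a', 'price': 3}
```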

💫 Enhancements:

  • Use dask 2022.10.2 (#48898)
  • Clarify schema validation error (#48882)
  • Raise ValueError when the data sort key is None (#48969)
  • Provide clearer error messages when the webdataset format is invalid (#48643)
  • Upgrade Arrow version from 17 to 18 (#48448)
  • Update hudi version to 0.2.0 (#48875)
  • webdataset: expand JSON objects into individual samples (#48673)
  • Support passing kwargs to map tasks. (#49208)
  • Add ExecutionCallback interface (#49205)
  • Add seed for read files (#49129)
  • Make select_columns and rename_columns use Project operator (#49393)
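
The last item routes both operations through a single projection. A hedged per-row sketch of the "Project" idea (select a column subset, optionally renaming, in one pass) — illustrative only, not Ray Data's operator:

```python
def project(row, cols, rename=None):
    # Keep only `cols`, emitting each under its renamed key if one is given.
    rename = rename or {}
    return {rename.get(c, c): row[c] for c in cols}

row = {"a": 1, "b": 2, "c": 3}
print(project(row, ["a", "b"], rename={"b": "b2"}))  # {'a': 1, 'b2': 2}
```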

🔨 Fixes:

  • Fix partial function name parsing in map_groups (#48907)
  • Always launch one task for read_sql (#48923)
  • Reimplement the pandas memory-usage fix (#48970)
  • webdataset: flatten return args (#48674)
  • Handle numpy > 2.0.0 behaviour in _create_possibly_ragged_ndarray (#48064)
  • Fix DataContext sealing for multiple datasets. (#49096)
  • Fix to_tf for List types (#49139)
  • Fix type mismatch error while mapping nullable column (#49405)
  • Datasink: support passing write results to on_write_complete (#49251)
  • Fix groupby hang when value contains np.nan (#49420)
  • Fix bug where file_extensions doesn't work with compound extensions (#49244)
  • Fix map operator fusion when concurrency is set (#49573)
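
The file_extensions fix concerns compound suffixes such as "csv.gz", which a naive last-dot-segment check misses. A sketch of the bug class and the suffix-based matching that avoids it (illustrative, not Ray's code):

```python
def matches_extensions(path, extensions):
    # Compare against the trailing suffix of the filename, so a
    # compound extension like "csv.gz" matches "data.csv.gz".
    return any(path.endswith("." + ext) for ext in extensions)

print(matches_extensions("data.csv.gz", ["csv.gz"]))  # True
print(matches_extensions("data.csv.gz", ["csv"]))     # False
```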

Ray Train

🎉 New Features:

  • Output JSON structured log files for system and application logs (#49414)
  • Add support for AMD ROCR_VISIBLE_DEVICES (#49346)
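
For the structured-log feature, a hedged sketch of emitting one JSON object per log record with Python's stdlib logging; the field names below are illustrative, not Ray Train's schema:

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    def format(self, record):
        # Serialize a few standard record attributes as one JSON line.
        return json.dumps({
            "levelname": record.levelname,
            "name": record.name,
            "message": record.getMessage(),
        })

handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("train_app")
logger.addHandler(handler)
logger.propagate = False
logger.warning("checkpoint saved")  # emits a JSON line to stderr
```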

🏗 Architecture refactoring:

  • LightGBM: Rewrite get_network_params implementation (#49019)

Ray Tune

🎉 New Features:

  • Update optuna_search to allow users to configure optuna storage (#48547)

Ray Serve

💫 Enhancements:

  • Improved request_id generation to reduce proxy CPU overhead (#49537)
  • Tune GC threshold by default in proxy (#49720)
  • Use pickle.dumps for faster serialization from proxy to replica (#49539)
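
On the GC change: CPython triggers a generation-0 collection after a threshold of allocations, and raising that threshold trades memory for fewer GC pauses under request load. A sketch with the stdlib knob (the values are illustrative, not Serve's defaults):

```python
import gc

default = gc.get_threshold()  # typically (700, 10, 10)
# Raise the generation-0 threshold so collections run less often.
gc.set_threshold(10_000, default[1], default[2])
print(gc.get_threshold()[0])  # 10000
gc.set_threshold(*default)    # restore the defaults
```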

🔨 Fixes:

  • Handle nested ‘=’ in serve run arguments (#49719)
  • Fix bug when ray.init() is called multiple times with different runtime_envs (#49074)
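
The nested '=' fix addresses a classic parsing pitfall: splitting an argument like `env_var=A=B` on every '=' loses the value's tail; splitting only on the first '=' preserves it. A sketch of the bug class (illustrative, not Serve's parser):

```python
def parse_arg(arg):
    # Split on the first '=' only, so values may themselves contain '='.
    key, _, value = arg.partition("=")
    return key, value

print(parse_arg("env_var=A=B"))  # ('env_var', 'A=B')
```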

🗑️ Deprecations:

  • Added a warning that the default behavior for sync methods will change in a future release: they will run in a threadpool by default. You can opt into this behavior early by setting RAY_SERVE_RUN_SYNC_IN_THREADPOOL=1. (#48897)
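
A hedged sketch of what "run in a threadpool" means for a sync method: the blocking call is dispatched to an executor so the event loop stays responsive (stdlib asyncio; the real behavior is enabled via the environment variable above):

```python
import asyncio
import time

def sync_handler():
    time.sleep(0.01)  # stand-in for blocking work
    return "ok"

async def main():
    loop = asyncio.get_running_loop()
    # None selects the loop's default ThreadPoolExecutor; the loop can
    # keep serving other coroutines while sync_handler blocks a thread.
    return await loop.run_in_executor(None, sync_handler)

print(asyncio.run(main()))  # ok
```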

RLlib

🎉 New Features:

  • Add support for external Envs to new API stack: New example script and custom tcp-capable EnvRunner. (#49033)
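
A minimal sketch of the idea behind an external env protocol: the environment runs out-of-process and exchanges length-prefixed observation/action messages over a byte stream (a socketpair stands in for TCP here; this is illustrative, not RLlib's EnvRunner protocol):

```python
import pickle
import socket

env_side, runner_side = socket.socketpair()

def send_msg(sock, obj):
    payload = pickle.dumps(obj)
    # 4-byte big-endian length prefix, then the pickled payload.
    sock.sendall(len(payload).to_bytes(4, "big") + payload)

def recv_exact(sock, n):
    data = b""
    while len(data) < n:
        data += sock.recv(n - len(data))
    return data

def recv_msg(sock):
    n = int.from_bytes(recv_exact(sock, 4), "big")
    return pickle.loads(recv_exact(sock, n))

# Env sends an observation; the runner replies with an action.
send_msg(env_side, {"obs": [0.1, 0.2]})
print(recv_msg(runner_side))  # {'obs': [0.1, 0.2]}
send_msg(runner_side, {"action": 1})
print(recv_msg(env_side))     # {'action': 1}
```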

💫 Enhancements:

  • Offline RL:
    • Add sequence sampling to EpisodeReplayBuffer. (#48116)
    • Allow incomplete SampleBatch data and fully compressed observations. (#48699)
    • Add option to customize OfflineData. (#49015)
    • Enable offline training without specifying an environment. (#49041)
    • Various fixes: #48309, #49194, #49195
  • APPO/IMPALA acceleration (new API stack):
    • Add support for AggregatorActors per Learner. (#49284)
    • Auto-sleep time AND thread-safety for MetricsLogger. (#48868)
    • Activate APPO continuous-actions release and CI tests (HalfCheetah-v1 and Pendulum-v1 new in tuned_examples). (#49068)
    • Add "burn-in" period setting to the training of stateful RLModules. (#49680)
  • Callbacks API: Add support for individual lambda-style callbacks. (#49511)
  • Other enhancements: #49687, #49714, #49693, #49497, #49800, #49098
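
For the sequence-sampling item, a hedged sketch of drawing fixed-length contiguous slices from stored episodes — the essence of sequence sampling for stateful (e.g. recurrent) training — illustrative only, not RLlib's EpisodeReplayBuffer:

```python
import random

episodes = [list(range(10)), list(range(100, 108))]

def sample_sequence(episodes, seq_len, rng=random):
    # Pick an episode long enough, then a uniform random start offset.
    eligible = [ep for ep in episodes if len(ep) >= seq_len]
    ep = rng.choice(eligible)
    start = rng.randrange(len(ep) - seq_len + 1)
    return ep[start:start + seq_len]

seq = sample_sequence(episodes, seq_len=4)
print(len(seq))  # 4
```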

🏗 Architecture refactoring:

  • RLModule: Introduce Default[algo]RLModule classes (#49366, #49368)
  • Remove RLlib dependencies from setup.py; add ormsgpack (#49489)

Ray Core and Ray Clusters

Ray Core

💫 Enhancements:

  • Add task_name, task_function_name and actor_name in Structured Logging (#48703)
  • Support redis/valkey authentication with username (#48225)
  • Add v6e TPU Head Resource Autoscaling Support (#48201)
  • compiled graphs: Support all driver and actor read combinations (#48963)
  • compiled graphs: Add ascii based CG visualization (#48315)
  • compiled graphs: Add ray[cg] pip install option (#49220)
  • Allow uv cache at installation (#49176)
  • Support != Filter in GCS for Task State API (#48983)
  • compiled graphs: Add CPU-based NCCL communicator for development (#48440)
  • Support gcs and raylet log rotation (#48952)
  • compiled graphs: Support nsight.nvtx profiling (#49392)
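
For the gcs/raylet log-rotation item, a sketch of the underlying mechanism with the stdlib's size-based rotating handler (the sizes and filenames are illustrative; Ray's settings differ):

```python
import logging
import logging.handlers
import os
import tempfile

log_dir = tempfile.mkdtemp()
path = os.path.join(log_dir, "raylet.log")
# Roll over at ~200 bytes, keeping at most two rotated backups.
handler = logging.handlers.RotatingFileHandler(
    path, maxBytes=200, backupCount=2)
logger = logging.getLogger("raylet_demo")
logger.addHandler(handler)
logger.propagate = False
for i in range(50):
    logger.warning("event %d", i)
print(sorted(os.listdir(log_dir)))  # e.g. ['raylet.log', 'raylet.log.1', 'raylet.log.2']
```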

🔨 Fixes:

  • autoscaler: Health check logs are not visible in the autoscaler container's stdout (#48905)
  • Only publish WORKER_OBJECT_EVICTION when the object is out of scope or manually freed (#47990)
  • autoscaler: Autoscaler doesn't scale up correctly when the KubeRay RayCluster is not in the goal state (#48909)
  • autoscaler: Fix incorrectly terminating nodes misclassified as idle in autoscaler v1 (#48519)
  • compiled graphs: Fix the missing dependencies when num_returns is used (#49118)
  • autoscaler: Fuse scaling requests together to avoid overloading the Kubernetes API server (#49150)
  • Fix bug to support S3 pre-signed url for .whl file (#48560)
  • Fix data race on gRPC client context (#49475)
  • Make sure draining node is not selected for scheduling (#49517)
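
The last fix's intent can be sketched as a scheduling filter: a node being drained is excluded from candidates even if it has the most free resources (illustrative only, not Ray's scheduler):

```python
nodes = [
    {"id": "n1", "draining": False, "free_cpu": 4},
    {"id": "n2", "draining": True,  "free_cpu": 16},
]

def pick_node(nodes, cpu_needed):
    # Draining nodes never receive new work, regardless of capacity.
    candidates = [n for n in nodes
                  if not n["draining"] and n["free_cpu"] >= cpu_needed]
    if not candidates:
        return None
    return max(candidates, key=lambda n: n["free_cpu"])["id"]

print(pick_node(nodes, 2))  # n1
```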

Ray Clusters

💫 Enhancements:

  • Azure: Enable accelerated networking as a flag in Azure VMs (#47988)

📖 Documentation:

  • Kuberay: Logging: Add Fluent Bit DaemonSet and Grafana Loki to "Persist KubeRay Operator Logs" (#48725)
  • Kuberay: Logging: Specify the Helm chart version in "Persist KubeRay Operator Logs" (#48937)

Dashboard

💫 Enhancements:

  • Add instance variable to many default dashboard graphs (#49174)
  • Display duration in milliseconds if under 1 second. (#49126)
  • Add RAY_PROMETHEUS_HEADERS env for carrying additional headers to Prometheus (#49353)
  • Document about the RAY_PROMETHEUS_HEADERS env for carrying additional headers to Prometheus (#49700)
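
A hedged sketch of the headers-from-environment idea: read a mapping from the variable and attach it to Prometheus requests. A JSON mapping is assumed here for illustration; check the Ray docs for the exact formats RAY_PROMETHEUS_HEADERS accepts:

```python
import json
import os

os.environ["RAY_PROMETHEUS_HEADERS"] = '{"Authorization": "Bearer t0ken"}'

def prometheus_headers():
    # Empty or unset means no extra headers.
    raw = os.environ.get("RAY_PROMETHEUS_HEADERS", "")
    return json.loads(raw) if raw else {}

print(prometheus_headers()["Authorization"])  # Bearer t0ken
```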

🏗 Architecture refactoring:

  • Move memray dependency from default to observability (#47763)
  • Move StateHead's methods into free functions. (#49388)

Thanks

@raulchen, @alanwguo, @omatthew98, @xingyu-long, @tlinkin, @yantzu, @alexeykudinkin, @andrewsykim, @win5923, @csy1204, @dayshah, @richardliaw, @stephanie-wang, @gueraf, @rueian, @davidxia, @fscnick, @wingkitlee0, @KPostOffice, @GeneDer, @MengjinYan, @simonsays1980, @pcmoritz, @petern48, @kashiwachen, @pfldy2850, @zcin, @scottjlee, @Akhil-CM, @Jay-ju, @JoshKarpel, @edoakes, @ruisearch42, @gorloffslava, @jimmyxie-figma, @bthananjeyan, @sven1977, @bnorick, @jeffreyjeffreywang, @ravi-dalal, @matthewdeng, @angelinalg, @ivanthewebber, @rkooo567, @srinathk10, @maresb, @gvspraveen, @akyang-anyscale, @mimiliaogo, @bveeramani, @ryanaoleary, @kevin85421, @richardsliu, @hartikainen, @coltwood93, @mattip, @Superskyyy, @justinvyu, @hongpeng-guo, @ArturNiederfahrenhorst, @jecsand838, @Bye-legumes, @hcc429, @WeichenXu123, @martinbomio, @HollowMan6, @MortalHappiness, @dentiny, @zhe-thoughts, @anyadontfly, @smanolloff, @richo-anyscale, @khluu, @xushiyan, @rynewang, @japneet-anyscale, @jjyao, @sumanthratna, @saihaj, @aslonnie

Many thanks to all those who contributed to this release!