Releases: GoogleCloudPlatform/gcs-connector-for-pytorch
1.4.0
What's Changed
- Lightning multinode parquet by @jdnurme in #73
- Install pytest on running continuous test for dataflux pypi package. by @akansha1812 in #75
- Make DatafluxPytTrain a wrapper of DataFluxMapStyleDataset by @abhibyreddi in #74
- updated base docker, example yaml, added readme by @jdnurme in #76
- Fix DatafluxPytTrain.getitem by @abhibyreddi in #77
- Run continuous test on the pypi installed package on presubmit by @akansha1812 in #78
- Add code to make it possible to deploy training on a multi-node GKE cluster by @abhibyreddi in #81
- Configure shared memory size by @abhibyreddi in #82
- Reorder Dockerfile and add dockerignore to speed up builds by @MattIrv in #84
- Correcting the checkpointing functions to handle the Path object. by @Yash9060 in #83
- Parse bucket name from ckpt directory name instead of separate parameter for bucket name by @Yash9060 in #85
- Make Lightning checkpoint demo work with Bernard's GKE framework and with FSDP strategy by @MattIrv in #86
- Initialize new storage_client.bucket on every request by @Yash9060 in #87
- Add README file for lightning image segmentation workload by @abhibyreddi in #89
- Check in initial Parquet benchmark based on MaxText data loading benchmark by @MattIrv in #90
- Add GKE deployment for MaxText Parquet training benchmark by @MattIrv in #91
- Skip training when demo is run to benchmark Dataflux by @abhibyreddi in #92
- Update the definition of the local flag by @abhibyreddi in #93
- Allow running demo code in listing-only mode by @abhibyreddi in #95
- Raise exception when ADC are missing by @abhibyreddi in #94
- Update defaults for batch_size and num_workers by @abhibyreddi in #96
- Faster Lightning Checkpoint download by @MattIrv in #99
- Adding custom GCS Writer. by @Yash9060 in #98
- update to latest dataflux client by @jdnurme in #101
- add continuous benchmark with kokoro by @jdnurme in #102
- Run image training demo as part of continuous integ tests by @abhibyreddi in #104
- Adding GCS Custom reader by @Yash9060 in #105
- MultiNode demo by @Yash9060 in #106
- add benchmark code and update kokoro scripts by @jdnurme in #108
- Parameterizing min_epochs, max_epochs & max_steps by @Yash9060 in #107
- Add a helper method to create storage_client when needed. by @awonak in #109
- Make step time configurable by @abhibyreddi in #110
- Remove client initialization for fast listing from dataflux-pytorch by @akansha1812 in #111
- Multipart checkpoint upload by @jdnurme in #114
- adds unit tests, adds presubmit integration test, updated demo code by @jdnurme in #117
- Add code to clear kernel cache after saving checkpoints by @abhibyreddi in #122
- update continuous to run full benchmark by @jdnurme in #123
- Adding benchmarking code for multi node checkpointing. by @Yash9060 in #121
- set multipart upload to default behavior by @jdnurme in #127
- Introduce AsyncCheckpointIO option for non-blocking checkpoint saves by @awonak in #116
- Print average times to save and load checkpoints together by @abhibyreddi in #129
- Changing hardcoded values to placeholders by @Yash9060 in #128
- Make num_nodes configurable by @abhibyreddi in #130
- update lightning bench with multipart and 10k info by @jdnurme in #131
- update default dataflux to use multipart by @jdnurme in #133
- Run unit tests on x86 Mac by @abhibyreddi in #115
- implement fast download for df checkpoint by @jdnurme in #134
- Add image segmentation benchmark results to README by @abhibyreddi in #118
- Add single node async benchmark execution to integration tests by @awonak in #135
- Refactor benchmark tables by @awonak in #136
- add option to run benchmark without lightning by @jdnurme in #137
- Fix AsyncCheckpointIO race condition by @awonak in #138
- Update image segmentation benchmark README by @abhibyreddi in #139
- add upload and download improvements to multinode by @jdnurme in #141
- Update documented step time by @abhibyreddi in #142
- CPU simulated benchmarking for GKE cluster. by @Yash9060 in #143
- Simulated CPU benchmarking code by @Yash9060 in #145
- Add support for multi-node checkpointing with fsspec by @abhibyreddi in #144
- Correcting the code for simulated benchmarks by @Yash9060 in #146
- Multi-node checkpoint benchmark improvements by @MattIrv in #149
- Set pytorch version to 2.3.1 by @abhibyreddi in #148
- update main readme with checkpointing bench results by @jdnurme in #150
- Add support to benchmark multi-node checkpointing with default FSDP strategy by @abhibyreddi in #151
- Remove duplicative pip install instructions from multi-node checkpoint benchmark readme by @MattIrv in #152
- Skip saving checkpoints during training by @abhibyreddi in #153
- Install checkpoint benchmark dependencies before running the benchmark by @abhibyreddi in #155
- Update checkpoint readmes by @MattIrv in #159
- Implement a custom FSDP strategy for benchmarking loads from boot disk by @abhibyreddi in #157
- Added debug flag to GCSReader/Writer by @Yash9060 in #154
- Correcting load_checkpoint for simulated benchmarks. by @Yash9060 in #161
- Add support for benchmarking checkpoint save/restore to/from distributed filesystems by @abhibyreddi in #162
- Correct table header row by @abhibyreddi in #163
- Adding option to use FSspec with simulated benchmarks by @Yash9060 in #164
- Create client for each process by @akansha1812 in #166
- update bench script to run simulated multinode bench by @jdnurme in https://github.com/GoogleCloudPlatform/dataflux-pyt...
v1.3.0
What's Changed
- Add boilerplate code for Dataflux-Pytorch Lightning demo by @abhibyreddi in #57
- Refactor Dataflux simple demo loops and add retry flags by @MattIrv in #59
- Catch exception when loading arrays from raw bytes fails by @abhibyreddi in #38
- Implement data module for the PyTorch Lightning workload by @abhibyreddi in #60
- Update default retry config to match successful 1k-node benchmarks. by @MattIrv in #63
- Lightning text by @jdnurme in #64
- add limit_train_batches param by @jdnurme in #65
- Make it possible to deploy PyTorch Lightning image segmentation workload on a Ray cluster. by @abhibyreddi in #66
- Update demo loops to configure multiprocessing start method by @MattIrv in #68
- For mac and windows skip passing client storage to avoid pickling error in multiprocessing by @akansha1812 in #67
- Disable compose download when create and delete permissions are missing by @akansha1812 in #70
- Continuous test which installs gcs-torch-dataflux from PyPI and runs integration test. by @akansha1812 in #71
- Added lightning package to setup file and updated version for re… by @divrawal in #69
New Contributors
- @akansha1812 made their first contribution in #67
Full Changelog: v1.2.0...v1.3.0
v1.2.0
What's Changed
- Standardize Python extensions and formatting settings. by @MattIrv in #53
- Set Dataflux user-agent through dataflux_core.user_agent module by @MattIrv in #52
- Apply new formatter to all Python files. by @MattIrv in #54
- Benchmark update by @divrawal in #51
- Configure retry logic by @jdnurme in #55
Full Changelog: v1.1.0...v1.2.0
v1.1.0
What's Changed
- Update README.md by @MattIrv in #33
- Presubmit Integration Testing by @jdnurme in #34
- Update README.md by @dutchiechris in #36
- Add a new flag for specifying number of dataloader threads by @abhibyreddi in #37
- add CODEOWNERS file by @jdnurme in #39
- Lightning checkpoint by @divrawal in #40
- Fix the continuous build failure by introducing virtual environment by @bernardhan33 in #43
- Add threaded download to map-style dataset. by @MattIrv in #44
- Readme typo fix by @divrawal in #42
- add disable_compose config by @jdnurme in #46
- Fix continuous and presubmit tests by @bernardhan33 in #47
- Benchmark checkpoint by @divrawal in #45
- update readme with 429 info by @jdnurme in #48
- Fix QPS limit example URL referring to project instead of bucket. by @MattIrv in #49
- Bump version to 1.1.0 by @bernardhan33 in #50
New Contributors
- @dutchiechris made their first contribution in #36
- @abhibyreddi made their first contribution in #37
- @divrawal made their first contribution in #40
Full Changelog: v1.0.0...v1.1.0
v1.0.0
What's Changed
- Add iterable dataset to Colab demo by @bernardhan33 in #19
- Update README to note the iterable dataset support by @bernardhan33 in #18
- add kokoro configs by @jdnurme in #21
- Fix Iterable Dataset bug on downloading the whole subset of data by @bernardhan33 in #28
- Update baseline performance numbers for Dataflux datasets by @bernardhan33 in #29
- Debug logging for Kokoro unit test build by @MattIrv in #30
- Increase pytest verbosity in Kokoro by @MattIrv in #31
- bump version to 1.0.0 by @bernardhan33 in #32
Full Changelog: v0.1.0...v1.0.0
v0.1.0
What's Changed
- Create a real-world end-to-end image segmentation training demo with Dataflux Dataset by @bernardhan33 in #8
- Add checkpointing support by @bernardhan33 in #9
- Add the simple walkthrough Colab by @bernardhan33 in #11
- Add fast listing component to quick demo by @bernardhan33 in #13
- Add pyproject.toml to prepare for PyPI release by @bernardhan33 in #14
- Update README and demos to note the new pip install command by @bernardhan33 in #16
- Add support for Dataflux Iterable Dataset by @bernardhan33 in #17
Full Changelog: v0.0.0...v0.1.0
Dataflux v0.0.0
- Added support for PyTorch map-style dataset.
- Published early README.
What's Changed
- Initial commit of dataflux-pytorch by @Magichan33 in #1
- Fix padding by @Magichan33 in #2
- Fix typo by @Magichan33 in #3
New Contributors
- @Magichan33 made their first contribution in #1
Full Changelog: https://github.com/GoogleCloudPlatform/dataflux-pytorch/commits/v0.0.0