chore: Prepare 3.10.0 release #2992

Merged 2 commits on Oct 8, 2024
2 changes: 1 addition & 1 deletion .bumpversion.toml
@@ -1,5 +1,5 @@
[tool.bumpversion]
current_version = "3.9.2b1"
current_version = "3.10.0"
commit = false
tag = false
tag_name = "{new_version}"
80 changes: 40 additions & 40 deletions README.md
@@ -94,27 +94,27 @@ FROM "sampleDB"."sampleTable" ORDER BY time DESC LIMIT 3
## At scale
AWS SDK for pandas can also run your workflows at scale by leveraging [Modin](https://modin.readthedocs.io/en/stable/) and [Ray](https://www.ray.io/). Both projects aim to speed up data workloads by distributing processing over a cluster of workers.

Read our [docs](https://aws-sdk-pandas.readthedocs.io/en/3.9.2b1/scale.html) or head to our latest [tutorials](https://github.com/aws/aws-sdk-pandas/tree/main/tutorials) to learn more.
Read our [docs](https://aws-sdk-pandas.readthedocs.io/en/3.10.0/scale.html) or head to our latest [tutorials](https://github.com/aws/aws-sdk-pandas/tree/main/tutorials) to learn more.

> ⚠️ **Ray is currently not available for Python 3.12. While AWS SDK for pandas supports Python 3.12, it cannot be used at scale.**

## [Read The Docs](https://aws-sdk-pandas.readthedocs.io/)

- [**What is AWS SDK for pandas?**](https://aws-sdk-pandas.readthedocs.io/en/3.9.2b1/about.html)
- [**Install**](https://aws-sdk-pandas.readthedocs.io/en/3.9.2b1/install.html)
- [PyPi (pip)](https://aws-sdk-pandas.readthedocs.io/en/3.9.2b1/install.html#pypi-pip)
- [Conda](https://aws-sdk-pandas.readthedocs.io/en/3.9.2b1/install.html#conda)
- [AWS Lambda Layer](https://aws-sdk-pandas.readthedocs.io/en/3.9.2b1/install.html#aws-lambda-layer)
- [AWS Glue Python Shell Jobs](https://aws-sdk-pandas.readthedocs.io/en/3.9.2b1/install.html#aws-glue-python-shell-jobs)
- [AWS Glue PySpark Jobs](https://aws-sdk-pandas.readthedocs.io/en/3.9.2b1/install.html#aws-glue-pyspark-jobs)
- [Amazon SageMaker Notebook](https://aws-sdk-pandas.readthedocs.io/en/3.9.2b1/install.html#amazon-sagemaker-notebook)
- [Amazon SageMaker Notebook Lifecycle](https://aws-sdk-pandas.readthedocs.io/en/3.9.2b1/install.html#amazon-sagemaker-notebook-lifecycle)
- [EMR](https://aws-sdk-pandas.readthedocs.io/en/3.9.2b1/install.html#emr)
- [From source](https://aws-sdk-pandas.readthedocs.io/en/3.9.2b1/install.html#from-source)
- [**At scale**](https://aws-sdk-pandas.readthedocs.io/en/3.9.2b1/scale.html)
- [Getting Started](https://aws-sdk-pandas.readthedocs.io/en/3.9.2b1/scale.html#getting-started)
- [Supported APIs](https://aws-sdk-pandas.readthedocs.io/en/3.9.2b1/scale.html#supported-apis)
- [Resources](https://aws-sdk-pandas.readthedocs.io/en/3.9.2b1/scale.html#resources)
- [**What is AWS SDK for pandas?**](https://aws-sdk-pandas.readthedocs.io/en/3.10.0/about.html)
- [**Install**](https://aws-sdk-pandas.readthedocs.io/en/3.10.0/install.html)
- [PyPi (pip)](https://aws-sdk-pandas.readthedocs.io/en/3.10.0/install.html#pypi-pip)
- [Conda](https://aws-sdk-pandas.readthedocs.io/en/3.10.0/install.html#conda)
- [AWS Lambda Layer](https://aws-sdk-pandas.readthedocs.io/en/3.10.0/install.html#aws-lambda-layer)
- [AWS Glue Python Shell Jobs](https://aws-sdk-pandas.readthedocs.io/en/3.10.0/install.html#aws-glue-python-shell-jobs)
- [AWS Glue PySpark Jobs](https://aws-sdk-pandas.readthedocs.io/en/3.10.0/install.html#aws-glue-pyspark-jobs)
- [Amazon SageMaker Notebook](https://aws-sdk-pandas.readthedocs.io/en/3.10.0/install.html#amazon-sagemaker-notebook)
- [Amazon SageMaker Notebook Lifecycle](https://aws-sdk-pandas.readthedocs.io/en/3.10.0/install.html#amazon-sagemaker-notebook-lifecycle)
- [EMR](https://aws-sdk-pandas.readthedocs.io/en/3.10.0/install.html#emr)
- [From source](https://aws-sdk-pandas.readthedocs.io/en/3.10.0/install.html#from-source)
- [**At scale**](https://aws-sdk-pandas.readthedocs.io/en/3.10.0/scale.html)
- [Getting Started](https://aws-sdk-pandas.readthedocs.io/en/3.10.0/scale.html#getting-started)
- [Supported APIs](https://aws-sdk-pandas.readthedocs.io/en/3.10.0/scale.html#supported-apis)
- [Resources](https://aws-sdk-pandas.readthedocs.io/en/3.10.0/scale.html#resources)
- [**Tutorials**](https://github.com/aws/aws-sdk-pandas/tree/main/tutorials)
- [001 - Introduction](https://github.com/aws/aws-sdk-pandas/blob/main/tutorials/001%20-%20Introduction.ipynb)
- [002 - Sessions](https://github.com/aws/aws-sdk-pandas/blob/main/tutorials/002%20-%20Sessions.ipynb)
@@ -155,30 +155,30 @@ Read our [docs](https://aws-sdk-pandas.readthedocs.io/en/3.9.2b1/scale.html) or
- [039 - Athena Iceberg](https://github.com/aws/aws-sdk-pandas/blob/main/tutorials/039%20-%20Athena%20Iceberg.ipynb)
- [040 - EMR Serverless](https://github.com/aws/aws-sdk-pandas/blob/main/tutorials/040%20-%20EMR%20Serverless.ipynb)
- [041 - Apache Spark on Amazon Athena](https://github.com/aws/aws-sdk-pandas/blob/main/tutorials/041%20-%20Apache%20Spark%20on%20Amazon%20Athena.ipynb)
- [**API Reference**](https://aws-sdk-pandas.readthedocs.io/en/3.9.2b1/api.html)
- [Amazon S3](https://aws-sdk-pandas.readthedocs.io/en/3.9.2b1/api.html#amazon-s3)
- [AWS Glue Catalog](https://aws-sdk-pandas.readthedocs.io/en/3.9.2b1/api.html#aws-glue-catalog)
- [Amazon Athena](https://aws-sdk-pandas.readthedocs.io/en/3.9.2b1/api.html#amazon-athena)
- [Amazon Redshift](https://aws-sdk-pandas.readthedocs.io/en/3.9.2b1/api.html#amazon-redshift)
- [PostgreSQL](https://aws-sdk-pandas.readthedocs.io/en/3.9.2b1/api.html#postgresql)
- [MySQL](https://aws-sdk-pandas.readthedocs.io/en/3.9.2b1/api.html#mysql)
- [SQL Server](https://aws-sdk-pandas.readthedocs.io/en/3.9.2b1/api.html#sqlserver)
- [Oracle](https://aws-sdk-pandas.readthedocs.io/en/3.9.2b1/api.html#oracle)
- [Data API Redshift](https://aws-sdk-pandas.readthedocs.io/en/3.9.2b1/api.html#data-api-redshift)
- [Data API RDS](https://aws-sdk-pandas.readthedocs.io/en/3.9.2b1/api.html#data-api-rds)
- [OpenSearch](https://aws-sdk-pandas.readthedocs.io/en/3.9.2b1/api.html#opensearch)
- [AWS Glue Data Quality](https://aws-sdk-pandas.readthedocs.io/en/3.9.2b1/api.html#aws-glue-data-quality)
- [Amazon Neptune](https://aws-sdk-pandas.readthedocs.io/en/3.9.2b1/api.html#amazon-neptune)
- [DynamoDB](https://aws-sdk-pandas.readthedocs.io/en/3.9.2b1/api.html#dynamodb)
- [Amazon Timestream](https://aws-sdk-pandas.readthedocs.io/en/3.9.2b1/api.html#amazon-timestream)
- [Amazon EMR](https://aws-sdk-pandas.readthedocs.io/en/3.9.2b1/api.html#amazon-emr)
- [Amazon CloudWatch Logs](https://aws-sdk-pandas.readthedocs.io/en/3.9.2b1/api.html#amazon-cloudwatch-logs)
- [Amazon Chime](https://aws-sdk-pandas.readthedocs.io/en/3.9.2b1/api.html#amazon-chime)
- [Amazon QuickSight](https://aws-sdk-pandas.readthedocs.io/en/3.9.2b1/api.html#amazon-quicksight)
- [AWS STS](https://aws-sdk-pandas.readthedocs.io/en/3.9.2b1/api.html#aws-sts)
- [AWS Secrets Manager](https://aws-sdk-pandas.readthedocs.io/en/3.9.2b1/api.html#aws-secrets-manager)
- [Global Configurations](https://aws-sdk-pandas.readthedocs.io/en/3.9.2b1/api.html#global-configurations)
- [Distributed - Ray](https://aws-sdk-pandas.readthedocs.io/en/3.9.2b1/api.html#distributed-ray)
- [**API Reference**](https://aws-sdk-pandas.readthedocs.io/en/3.10.0/api.html)
- [Amazon S3](https://aws-sdk-pandas.readthedocs.io/en/3.10.0/api.html#amazon-s3)
- [AWS Glue Catalog](https://aws-sdk-pandas.readthedocs.io/en/3.10.0/api.html#aws-glue-catalog)
- [Amazon Athena](https://aws-sdk-pandas.readthedocs.io/en/3.10.0/api.html#amazon-athena)
- [Amazon Redshift](https://aws-sdk-pandas.readthedocs.io/en/3.10.0/api.html#amazon-redshift)
- [PostgreSQL](https://aws-sdk-pandas.readthedocs.io/en/3.10.0/api.html#postgresql)
- [MySQL](https://aws-sdk-pandas.readthedocs.io/en/3.10.0/api.html#mysql)
- [SQL Server](https://aws-sdk-pandas.readthedocs.io/en/3.10.0/api.html#sqlserver)
- [Oracle](https://aws-sdk-pandas.readthedocs.io/en/3.10.0/api.html#oracle)
- [Data API Redshift](https://aws-sdk-pandas.readthedocs.io/en/3.10.0/api.html#data-api-redshift)
- [Data API RDS](https://aws-sdk-pandas.readthedocs.io/en/3.10.0/api.html#data-api-rds)
- [OpenSearch](https://aws-sdk-pandas.readthedocs.io/en/3.10.0/api.html#opensearch)
- [AWS Glue Data Quality](https://aws-sdk-pandas.readthedocs.io/en/3.10.0/api.html#aws-glue-data-quality)
- [Amazon Neptune](https://aws-sdk-pandas.readthedocs.io/en/3.10.0/api.html#amazon-neptune)
- [DynamoDB](https://aws-sdk-pandas.readthedocs.io/en/3.10.0/api.html#dynamodb)
- [Amazon Timestream](https://aws-sdk-pandas.readthedocs.io/en/3.10.0/api.html#amazon-timestream)
- [Amazon EMR](https://aws-sdk-pandas.readthedocs.io/en/3.10.0/api.html#amazon-emr)
- [Amazon CloudWatch Logs](https://aws-sdk-pandas.readthedocs.io/en/3.10.0/api.html#amazon-cloudwatch-logs)
- [Amazon Chime](https://aws-sdk-pandas.readthedocs.io/en/3.10.0/api.html#amazon-chime)
- [Amazon QuickSight](https://aws-sdk-pandas.readthedocs.io/en/3.10.0/api.html#amazon-quicksight)
- [AWS STS](https://aws-sdk-pandas.readthedocs.io/en/3.10.0/api.html#aws-sts)
- [AWS Secrets Manager](https://aws-sdk-pandas.readthedocs.io/en/3.10.0/api.html#aws-secrets-manager)
- [Global Configurations](https://aws-sdk-pandas.readthedocs.io/en/3.10.0/api.html#global-configurations)
- [Distributed - Ray](https://aws-sdk-pandas.readthedocs.io/en/3.10.0/api.html#distributed-ray)
- [**License**](https://github.com/aws/aws-sdk-pandas/blob/main/LICENSE.txt)
- [**Contributing**](https://github.com/aws/aws-sdk-pandas/blob/main/CONTRIBUTING.md)

2 changes: 1 addition & 1 deletion VERSION
@@ -1 +1 @@
3.9.2b1
3.10.0
2 changes: 1 addition & 1 deletion awswrangler/__metadata__.py
@@ -7,5 +7,5 @@

__title__: str = "awswrangler"
__description__: str = "Pandas on AWS."
__version__: str = "3.9.2b1"
__version__: str = "3.10.0"
__license__: str = "Apache License 2.0"
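
The hunks above all make the same substitution: the string `3.9.2b1` becomes `3.10.0` in `.bumpversion.toml`, the README links, `VERSION`, and `__metadata__.py`. As a rough illustration of that kind of search-and-replace (not the tool's actual implementation), here is a minimal Python sketch. The function name `bump_version` and the boundary rules in the regular expression are assumptions made for this example.

```python
import re

OLD_VERSION = "3.9.2b1"  # taken from the diff above
NEW_VERSION = "3.10.0"   # taken from the diff above


def bump_version(text: str, old: str = OLD_VERSION, new: str = NEW_VERSION) -> str:
    """Replace stand-alone occurrences of `old` with `new` (hypothetical helper)."""
    # re.escape makes the dots literal. The lookbehind rejects a match that is
    # preceded by a word character or a dot (e.g. "13.9.2b1"); the lookahead
    # rejects a match that is followed by a word character (e.g. "3.9.2b10").
    pattern = re.compile(rf"(?<![\w.]){re.escape(old)}(?!\w)")
    return pattern.sub(new, text)


# A docs URL and a metadata line, both copied from the diff above.
url = "https://aws-sdk-pandas.readthedocs.io/en/3.9.2b1/scale.html"
meta = '__version__: str = "3.9.2b1"'

print(bump_version(url))
print(bump_version(meta))
```

The boundary checks are a design choice for the sketch: a plain `str.replace` would also rewrite longer version strings that merely contain the old one.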
16 changes: 8 additions & 8 deletions awswrangler/athena/_read.py
@@ -793,11 +793,11 @@ def read_sql_query(

**Related tutorial:**

- `Amazon Athena <https://aws-sdk-pandas.readthedocs.io/en/3.9.2b1/
- `Amazon Athena <https://aws-sdk-pandas.readthedocs.io/en/3.10.0/
tutorials/006%20-%20Amazon%20Athena.html>`_
- `Athena Cache <https://aws-sdk-pandas.readthedocs.io/en/3.9.2b1/
- `Athena Cache <https://aws-sdk-pandas.readthedocs.io/en/3.10.0/
tutorials/019%20-%20Athena%20Cache.html>`_
- `Global Configurations <https://aws-sdk-pandas.readthedocs.io/en/3.9.2b1/
- `Global Configurations <https://aws-sdk-pandas.readthedocs.io/en/3.10.0/
tutorials/021%20-%20Global%20Configurations.html>`_

**There are three approaches available through ctas_approach and unload_approach parameters:**
@@ -861,7 +861,7 @@ def read_sql_query(
/athena.html#Athena.Client.get_query_execution>`_ .

For a practical example check out the
`related tutorial <https://aws-sdk-pandas.readthedocs.io/en/3.9.2b1/
`related tutorial <https://aws-sdk-pandas.readthedocs.io/en/3.10.0/
tutorials/024%20-%20Athena%20Query%20Metadata.html>`_!


@@ -1137,11 +1137,11 @@ def read_sql_table(

**Related tutorial:**

- `Amazon Athena <https://aws-sdk-pandas.readthedocs.io/en/3.9.2b1/
- `Amazon Athena <https://aws-sdk-pandas.readthedocs.io/en/3.10.0/
tutorials/006%20-%20Amazon%20Athena.html>`_
- `Athena Cache <https://aws-sdk-pandas.readthedocs.io/en/3.9.2b1/
- `Athena Cache <https://aws-sdk-pandas.readthedocs.io/en/3.10.0/
tutorials/019%20-%20Athena%20Cache.html>`_
- `Global Configurations <https://aws-sdk-pandas.readthedocs.io/en/3.9.2b1/
- `Global Configurations <https://aws-sdk-pandas.readthedocs.io/en/3.10.0/
tutorials/021%20-%20Global%20Configurations.html>`_

**There are three approaches available through ctas_approach and unload_approach parameters:**
@@ -1205,7 +1205,7 @@ def read_sql_table(
/athena.html#Athena.Client.get_query_execution>`_ .

For a practical example check out the
`related tutorial <https://aws-sdk-pandas.readthedocs.io/en/3.9.2b1/
`related tutorial <https://aws-sdk-pandas.readthedocs.io/en/3.10.0/
tutorials/024%20-%20Athena%20Query%20Metadata.html>`_!


4 changes: 2 additions & 2 deletions awswrangler/catalog/_create.py
@@ -1100,7 +1100,7 @@ def create_csv_table(
If True allows schema evolution (new or missing columns), otherwise an exception will be raised.
(Only considered if dataset=True and mode in ("append", "overwrite_partitions"))
Related tutorial:
https://aws-sdk-pandas.readthedocs.io/en/3.9.2b1/tutorials/014%20-%20Schema%20Evolution.html
https://aws-sdk-pandas.readthedocs.io/en/3.10.0/tutorials/014%20-%20Schema%20Evolution.html
sep
String of length 1. Field delimiter for the output file.
skip_header_line_count
@@ -1280,7 +1280,7 @@ def create_json_table(
If True allows schema evolution (new or missing columns), otherwise an exception will be raised.
(Only considered if dataset=True and mode in ("append", "overwrite_partitions"))
Related tutorial:
https://aws-sdk-pandas.readthedocs.io/en/3.9.2b1/tutorials/014%20-%20Schema%20Evolution.html
https://aws-sdk-pandas.readthedocs.io/en/3.10.0/tutorials/014%20-%20Schema%20Evolution.html
serde_library
Specifies the SerDe Serialization library which will be used. You need to provide the Class library name
as a string.
4 changes: 2 additions & 2 deletions awswrangler/s3/_read_orc.py
@@ -225,7 +225,7 @@ def read_orc(
must return a bool, True to read the partition or False to ignore it.
Ignored if `dataset=False`.
E.g ``lambda x: True if x["year"] == "2020" and x["month"] == "1" else False``
https://aws-sdk-pandas.readthedocs.io/en/3.9.2b1/tutorials/023%20-%20Flexible%20Partitions%20Filter.html
https://aws-sdk-pandas.readthedocs.io/en/3.10.0/tutorials/023%20-%20Flexible%20Partitions%20Filter.html
columns
List of columns to read from the file(s).
validate_schema
@@ -384,7 +384,7 @@ def read_orc_table(
must return a bool, True to read the partition or False to ignore it.
Ignored if `dataset=False`.
E.g ``lambda x: True if x["year"] == "2020" and x["month"] == "1" else False``
https://aws-sdk-pandas.readthedocs.io/en/3.9.2b1/tutorials/023%20-%20Flexible%20Partitions%20Filter.html
https://aws-sdk-pandas.readthedocs.io/en/3.10.0/tutorials/023%20-%20Flexible%20Partitions%20Filter.html
columns
List of columns to read from the file(s).
validate_schema
4 changes: 2 additions & 2 deletions awswrangler/s3/_read_parquet.py
@@ -398,7 +398,7 @@ def read_parquet(
must return a bool, True to read the partition or False to ignore it.
Ignored if `dataset=False`.
E.g ``lambda x: True if x["year"] == "2020" and x["month"] == "1" else False``
https://aws-sdk-pandas.readthedocs.io/en/3.9.2b1/tutorials/023%20-%20Flexible%20Partitions%20Filter.html
https://aws-sdk-pandas.readthedocs.io/en/3.10.0/tutorials/023%20-%20Flexible%20Partitions%20Filter.html
columns
List of columns to read from the file(s).
validate_schema
@@ -639,7 +639,7 @@ def read_parquet_table(
must return a bool, True to read the partition or False to ignore it.
Ignored if `dataset=False`.
E.g ``lambda x: True if x["year"] == "2020" and x["month"] == "1" else False``
https://aws-sdk-pandas.readthedocs.io/en/3.9.2b1/tutorials/023%20-%20Flexible%20Partitions%20Filter.html
https://aws-sdk-pandas.readthedocs.io/en/3.10.0/tutorials/023%20-%20Flexible%20Partitions%20Filter.html
columns
List of columns to read from the file(s).
validate_schema
6 changes: 3 additions & 3 deletions awswrangler/s3/_read_text.py
@@ -236,7 +236,7 @@ def read_csv(
This function MUST return a bool, True to read the partition or False to ignore it.
Ignored if `dataset=False`.
E.g ``lambda x: True if x["year"] == "2020" and x["month"] == "1" else False``
https://aws-sdk-pandas.readthedocs.io/en/3.9.2b1/tutorials/023%20-%20Flexible%20Partitions%20Filter.html
https://aws-sdk-pandas.readthedocs.io/en/3.10.0/tutorials/023%20-%20Flexible%20Partitions%20Filter.html
s3_additional_kwargs
Forwarded to botocore requests.
ray_args
@@ -397,7 +397,7 @@ def read_fwf(
This function MUST return a bool, True to read the partition or False to ignore it.
Ignored if `dataset=False`.
E.g ``lambda x: True if x["year"] == "2020" and x["month"] == "1" else False``
https://aws-sdk-pandas.readthedocs.io/en/3.9.2b1/tutorials/023%20-%20Flexible%20Partitions%20Filter.html
https://aws-sdk-pandas.readthedocs.io/en/3.10.0/tutorials/023%20-%20Flexible%20Partitions%20Filter.html
s3_additional_kwargs
Forwarded to botocore requests.
ray_args
@@ -565,7 +565,7 @@ def read_json(
This function MUST return a bool, True to read the partition or False to ignore it.
Ignored if `dataset=False`.
E.g ``lambda x: True if x["year"] == "2020" and x["month"] == "1" else False``
https://aws-sdk-pandas.readthedocs.io/en/3.9.2b1/tutorials/023%20-%20Flexible%20Partitions%20Filter.html
https://aws-sdk-pandas.readthedocs.io/en/3.10.0/tutorials/023%20-%20Flexible%20Partitions%20Filter.html
s3_additional_kwargs
Forwarded to botocore requests.
ray_args
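
The reader docstrings quoted above all describe the same contract for the partition filter: a callable that receives a partition's values and returns True to read it or False to skip it. The sketch below exercises the docstring's own example lambda on plain dictionaries. It assumes partition values arrive as strings, as the quoted example suggests; `select_partitions` is a hypothetical helper written for this illustration and is not part of awswrangler.

```python
from typing import Callable, Dict, List

PartitionFilter = Callable[[Dict[str, str]], bool]


def select_partitions(
    partitions: List[Dict[str, str]], partition_filter: PartitionFilter
) -> List[Dict[str, str]]:
    """Keep only the partitions for which the filter returns True (hypothetical helper)."""
    return [p for p in partitions if partition_filter(p)]


# The example from the docstrings, unchanged.
verbose_filter = lambda x: True if x["year"] == "2020" and x["month"] == "1" else False

# An equivalent, shorter form: the comparison already evaluates to a bool.
concise_filter = lambda x: x["year"] == "2020" and x["month"] == "1"

# Illustrative partition values, assumed to be strings.
partitions = [
    {"year": "2020", "month": "1"},
    {"year": "2020", "month": "2"},
    {"year": "2021", "month": "1"},
]

selected = select_partitions(partitions, verbose_filter)
print(selected)
```

Under the string assumption, a filter that compares against an integer (for example `x["month"] == 1`) would match nothing, which is why the docstring example quotes its values.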
4 changes: 2 additions & 2 deletions awswrangler/s3/_write_orc.py
@@ -407,7 +407,7 @@ def to_orc(
concurrent_partitioning
If True will increase the parallelism level during the partitions writing. It will decrease the
writing time and increase the memory usage.
https://aws-sdk-pandas.readthedocs.io/en/3.9.2b1/tutorials/022%20-%20Writing%20Partitions%20Concurrently.html
https://aws-sdk-pandas.readthedocs.io/en/3.10.0/tutorials/022%20-%20Writing%20Partitions%20Concurrently.html
mode
``append`` (Default), ``overwrite``, ``overwrite_partitions``. Only takes effect if dataset=True.
catalog_versioning
@@ -416,7 +416,7 @@ def to_orc(
If True allows schema evolution (new or missing columns), otherwise an exception will be raised. True by default.
(Only considered if dataset=True and mode in ("append", "overwrite_partitions"))
Related tutorial:
https://aws-sdk-pandas.readthedocs.io/en/3.9.2b1/tutorials/014%20-%20Schema%20Evolution.html
https://aws-sdk-pandas.readthedocs.io/en/3.10.0/tutorials/014%20-%20Schema%20Evolution.html
database
Glue/Athena catalog: Database name.
table
6 changes: 3 additions & 3 deletions awswrangler/s3/_write_parquet.py
@@ -435,18 +435,18 @@ def to_parquet(
concurrent_partitioning
If True will increase the parallelism level during the partitions writing. It will decrease the
writing time and increase the memory usage.
https://aws-sdk-pandas.readthedocs.io/en/3.9.2b1/tutorials/022%20-%20Writing%20Partitions%20Concurrently.html
https://aws-sdk-pandas.readthedocs.io/en/3.10.0/tutorials/022%20-%20Writing%20Partitions%20Concurrently.html
mode
``append`` (Default), ``overwrite``, ``overwrite_partitions``. Only takes effect if dataset=True.
For details check the related tutorial:
https://aws-sdk-pandas.readthedocs.io/en/3.9.2b1/tutorials/004%20-%20Parquet%20Datasets.html
https://aws-sdk-pandas.readthedocs.io/en/3.10.0/tutorials/004%20-%20Parquet%20Datasets.html
catalog_versioning
If True and `mode="overwrite"`, creates an archived version of the table catalog before updating it.
schema_evolution
If True allows schema evolution (new or missing columns), otherwise an exception will be raised. True by default.
(Only considered if dataset=True and mode in ("append", "overwrite_partitions"))
Related tutorial:
https://aws-sdk-pandas.readthedocs.io/en/3.9.2b1/tutorials/014%20-%20Schema%20Evolution.html
https://aws-sdk-pandas.readthedocs.io/en/3.10.0/tutorials/014%20-%20Schema%20Evolution.html
database
Glue/Athena catalog: Database name.
table