Contributing to hudi-rs

Welcome to the Apache Hudi community! We appreciate your interest in contributing to this open-source data lake platform. This guide will walk you through the process of making your first contribution.

Starter issues

If you are new to the project, we recommend starting with issues listed in https://github.com/apache/hudi-rs/contribute.

File an issue

Testing and reporting bugs are also valuable contributions. Please follow the issue template when filing bug reports.

Issue tracking

All issues tagged for a release can be found on the corresponding milestone page; see https://github.com/apache/hudi-rs/milestones.

Features, bugs, and P0 issues targeting the next release can be found in this project view. Pull requests are not tracked in the project view; instead, they are linked to the corresponding issues.

Prepare for development

  • Install Rust, e.g. as described here
  • Have a compatible Python version installed (check python/pyproject.toml for current requirement)
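As a quick way to confirm the second prerequisite, the sketch below compares the running interpreter against a `requires-python` style lower bound. It is only an illustration: the helper names are hypothetical, the `">=3.9"` value is a placeholder rather than the project's actual requirement (read the real one from python/pyproject.toml), and only simple `>=` bounds are handled.

```python
import re
import sys
from typing import Optional, Tuple


def parse_min_version(requires_python: str) -> Tuple[int, ...]:
    """Extract the lower bound from a specifier such as '>=3.9'."""
    match = re.search(r">=\s*(\d+(?:\.\d+)*)", requires_python)
    if match is None:
        raise ValueError(f"no '>=' lower bound in {requires_python!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def is_compatible(requires_python: str,
                  version: Optional[Tuple[int, ...]] = None) -> bool:
    """Return True when `version` (default: this interpreter) meets the bound."""
    current = tuple(sys.version_info[:3]) if version is None else version
    return current >= parse_min_version(requires_python)


if __name__ == "__main__":
    # Placeholder value; substitute the one found in python/pyproject.toml.
    placeholder_requirement = ">=3.9"
    print(is_compatible(placeholder_requirement))
```

Upper bounds and exclusions (for example `<4` or `!=3.10.*`) are deliberately ignored here; a full check would need a proper version-specifier parser.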

Commonly used dev commands

Most of the time, use the dev commands specified in the Makefile.

To set up a Python virtual environment, run

make setup-venv

Note

This runs the python3 command to set up the virtual environment in venv/. Activate the virtual environment by running, for example, source venv/bin/activate.
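If you are unsure whether activation worked, a small check such as the following can help. It is a minimal sketch that relies on standard `venv` behaviour, where `sys.prefix` differs from `sys.base_prefix` inside an environment; the function name `in_virtualenv` is hypothetical and not part of the project.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-style environment."""
    # In an environment created by the `venv` module, sys.prefix points at the
    # environment directory and sys.base_prefix at the base installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    state = "active" if in_virtualenv() else "not active"
    print(f"virtual environment is {state} (prefix: {sys.prefix})")
```

Run it with the same `python3` you intend to use for development; when the environment is active, the printed prefix should point into venv/.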

Once the virtual environment is activated, build the project for development by running

make develop

This installs the hudi dependency, built from your local repo, into the virtual environment.

Run tests locally

For Rust,

# For all tests
make test-rust
# or
cargo test --workspace

# For all tests in a crate / package
cargo test -p hudi-core

# For a specific test case
cargo test -p hudi-core table::tests::hudi_table_get_schema

For Python,

# For all tests
make test-python
# or
pytest -s python/tests

# For a specific test case
pytest python/tests/test_table_read.py -s -k "test_read_table_has_correct_schema"

Before creating a pull request

Run the command below and fix any issues it reports:

make format check test

Create a pull request

When submitting a pull request, please follow these guidelines:

  1. Title Format: The pull request title must follow the format outlined in the conventional commits spec. This is a standardized format for commit messages, and also allows us to auto-generate change logs and release notes. Since only the main branch requires this format, and we always squash commits and then merge the PR, incremental commits' messages do not need to conform to it.
  2. Line Count: A general guideline is to keep the PR's diff, i.e., max(added lines, deleted lines), less than 1000 lines. Keeping PRs concise makes it easier for reviewers to thoroughly examine changes without experiencing fatigue. If your changes exceed this limit, consider breaking them down into smaller, logical PRs that address specific aspects of the feature or bug fix.
  3. Coverage Requirements: All new features and bug fixes must include appropriate unit tests to ensure functionality and prevent regressions. Tests should cover both typical use cases and edge cases. Ensure that new tests pass locally before submitting the PR.
  4. Code Comments: Well-designed APIs and code should be self-explanatory, making in-code comments redundant. Where complex logic or a non-obvious implementation is unavoidable, add comments that explain the code's purpose and behavior.
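The first two guidelines lend themselves to a quick local check before opening a PR. The sketch below is illustrative only: the regular expression follows the general `type(scope)!: description` shape of the conventional commits spec, the `ALLOWED_TYPES` set is an assumption (the project's CI may accept a different list), and the function names and sample titles are hypothetical.

```python
import re

# Assumed set of commit types; the project's actual check may differ.
ALLOWED_TYPES = {
    "build", "chore", "ci", "docs", "feat", "fix",
    "perf", "refactor", "revert", "style", "test",
}

# <type>, optional (scope), optional "!", then ": " and a description.
TITLE_RE = re.compile(
    r"^(?P<type>[a-z]+)(\((?P<scope>[^()\s]+)\))?(?P<breaking>!)?: (?P<desc>\S.*)$"
)


def is_conventional_title(title: str) -> bool:
    """Return True when a PR title has the conventional commits shape."""
    match = TITLE_RE.match(title)
    return match is not None and match.group("type") in ALLOWED_TYPES


def diff_size(added: int, deleted: int) -> int:
    """Diff size as described in the guideline: max(added lines, deleted lines)."""
    return max(added, deleted)


def within_size_guideline(added: int, deleted: int, limit: int = 1000) -> bool:
    """Return True when the diff size stays below the suggested limit."""
    return diff_size(added, deleted) < limit


if __name__ == "__main__":
    # Hypothetical titles, for illustration only.
    print(is_conventional_title("feat(core): add incremental reader"))
    print(is_conventional_title("Add incremental reader"))
    print(within_size_guideline(added=400, deleted=1200))
```

Note that the size guideline uses the larger of the two counts rather than their sum, so a PR that adds 400 lines and deletes 1200 is measured as 1200 lines.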

Code coverage

We use codecov to generate code coverage reports and enforce the code coverage rate. See codecov.yml for the configuration.

Learning

To learn more before contributing to the project, please explore Hudi's documentation.

Code of Conduct

We expect all community members to follow our Code of Conduct.