Commit

Merge branch 'main' into optimise-indexing-operation
aivanoved committed Sep 24, 2024
2 parents a73e650 + d019d6c commit 0350661
Showing 68 changed files with 1,068 additions and 411 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/check_docs_build.yml
@@ -19,7 +19,7 @@ jobs:
with:
python-version: ${{ matrix.python-version }}
- name: Install uv
uses: astral-sh/setup-uv@v2
uses: astral-sh/setup-uv@v3
with:
enable-cache: "true"
cache-suffix: ${{ matrix.python-version }}
2 changes: 1 addition & 1 deletion .github/workflows/check_tpch_queries.yml
@@ -19,7 +19,7 @@ jobs:
with:
python-version: ${{ matrix.python-version }}
- name: Install uv
uses: astral-sh/setup-uv@v2
uses: astral-sh/setup-uv@v3
with:
enable-cache: "true"
cache-suffix: ${{ matrix.python-version }}
4 changes: 2 additions & 2 deletions .github/workflows/downstream_tests.yml
@@ -19,7 +19,7 @@ jobs:
with:
python-version: ${{ matrix.python-version }}
- name: Install uv
uses: astral-sh/setup-uv@v2
uses: astral-sh/setup-uv@v3
with:
enable-cache: "true"
cache-suffix: ${{ matrix.python-version }}
@@ -63,7 +63,7 @@ jobs:
with:
python-version: ${{ matrix.python-version }}
- name: Install uv
uses: astral-sh/setup-uv@v2
uses: astral-sh/setup-uv@v3
with:
enable-cache: "true"
cache-suffix: ${{ matrix.python-version }}
8 changes: 4 additions & 4 deletions .github/workflows/extremes.yml
@@ -19,7 +19,7 @@ jobs:
with:
python-version: ${{ matrix.python-version }}
- name: Install uv
uses: astral-sh/setup-uv@v2
uses: astral-sh/setup-uv@v3
with:
enable-cache: "true"
cache-suffix: ${{ matrix.python-version }}
@@ -46,7 +46,7 @@ jobs:
with:
python-version: ${{ matrix.python-version }}
- name: Install uv
uses: astral-sh/setup-uv@v2
uses: astral-sh/setup-uv@v3
with:
enable-cache: "true"
cache-suffix: ${{ matrix.python-version }}
@@ -75,7 +75,7 @@ jobs:
with:
python-version: ${{ matrix.python-version }}
- name: Install uv
uses: astral-sh/setup-uv@v2
uses: astral-sh/setup-uv@v3
with:
enable-cache: "true"
cache-suffix: ${{ matrix.python-version }}
@@ -104,7 +104,7 @@ jobs:
with:
python-version: ${{ matrix.python-version }}
- name: Install uv
uses: astral-sh/setup-uv@v2
uses: astral-sh/setup-uv@v3
with:
enable-cache: "true"
cache-suffix: ${{ matrix.python-version }}
2 changes: 1 addition & 1 deletion .github/workflows/mkdocs.yml
@@ -27,6 +27,6 @@ jobs:
path: .cache
restore-keys: |
mkdocs-material-
- run: pip install -r docs/requirements-docs.txt -e . pandas polars
- run: pip install -r docs/requirements-docs.txt -e . pandas polars pyarrow

- run: mkdocs gh-deploy --force
6 changes: 3 additions & 3 deletions .github/workflows/pytest.yml
@@ -19,7 +19,7 @@ jobs:
with:
python-version: ${{ matrix.python-version }}
- name: Install uv
uses: astral-sh/setup-uv@v2
uses: astral-sh/setup-uv@v3
with:
enable-cache: "true"
cache-suffix: ${{ matrix.python-version }}
@@ -46,7 +46,7 @@ jobs:
with:
python-version: ${{ matrix.python-version }}
- name: Install uv
uses: astral-sh/setup-uv@v2
uses: astral-sh/setup-uv@v3
with:
enable-cache: "true"
cache-suffix: ${{ matrix.python-version }}
@@ -75,7 +75,7 @@ jobs:
with:
python-version: ${{ matrix.python-version }}
- name: Install uv
uses: astral-sh/setup-uv@v2
uses: astral-sh/setup-uv@v3
with:
enable-cache: "true"
cache-suffix: ${{ matrix.python-version }}
2 changes: 1 addition & 1 deletion .github/workflows/random_ci_pytest.yml
@@ -17,7 +17,7 @@ jobs:
with:
python-version: ${{ matrix.python-version }}
- name: Install uv
uses: astral-sh/setup-uv@v2
uses: astral-sh/setup-uv@v3
with:
enable-cache: "true"
cache-suffix: ${{ matrix.python-version }}
3 changes: 3 additions & 0 deletions .gitignore
@@ -32,5 +32,8 @@ tpch/data/*
# VSCode
.vscode/

# IntelliJ IDEA / PyCharm
.idea/

# MacOS
.DS_Store
2 changes: 1 addition & 1 deletion .pre-commit-config.yaml
@@ -1,7 +1,7 @@
repos:
- repo: https://github.com/astral-sh/ruff-pre-commit
# Ruff version.
rev: 'v0.6.5'
rev: 'v0.6.7'
hooks:
# Run the formatter.
- id: ruff-format
8 changes: 5 additions & 3 deletions CONTRIBUTING.md
@@ -90,8 +90,10 @@ If you add code that should be tested, please add tests.
### 6. Running tests
To run tests, run `pytest`. To check coverage: `pytest --cov=narwhals`.
To run tests on the docset-module, use `pytest narwhals --doctest-modules`.
- To run tests, run `pytest`. To check coverage: `pytest --cov=narwhals`
- To run tests on the doctests, use `pytest narwhals --doctest-modules`
- To run unit tests and doctests at the same time, run `pytest tests narwhals --cov=narwhals --doctest-modules`
- To run tests multiprocessed, you may also want to use [pytest-xdist](https://github.com/pytest-dev/pytest-xdist) (optional)
If you want to have less surprises when opening a PR, you can take advantage of [nox](https://nox.thea.codes/en/stable/index.html) to run the entire CI/CD test suite locally in your operating system.
@@ -153,7 +155,7 @@ listed above in [Working with local development environment](#working-with-local
## How it works

If Narwhals looks like underwater unicorn magic to you, then please read
[how it works](https://narwhals-dev.github.io/narwhals/how-it-works/).
[how it works](https://narwhals-dev.github.io/narwhals/how_it_works/).

## Imports

1 change: 1 addition & 0 deletions docs/api-reference/dataframe.md
@@ -37,6 +37,7 @@
- tail
- to_arrow
- to_dict
- to_native
- to_numpy
- to_pandas
- unique
1 change: 1 addition & 0 deletions docs/api-reference/dtypes.md
@@ -18,6 +18,7 @@
- Categorical
- Enum
- String
- Date
- Datetime
- Duration
- Object
1 change: 1 addition & 0 deletions docs/api-reference/expr_str.md
@@ -7,6 +7,7 @@
- contains
- ends_with
- head
- len_chars
- slice
- replace
- replace_all
46 changes: 18 additions & 28 deletions docs/api-reference/index.md
@@ -1,30 +1,20 @@
# API Reference

Anything documented in the API reference is intended to work consistently among
supported backends.

For example:
```python
import narwhals as nw

df.with_columns(
a_mean=nw.col("a").mean(),
a_std=nw.col("a").std(),
)
```
is supported, as `DataFrame.with_columns`, `narwhals.col`, `Expr.mean`, and `Expr.std` are
all documented in the API reference.

However,
```python
import narwhals as nw

df.with_columns(
a_ewm_mean=nw.col("a").ewm_mean(alpha=0.7),
)
```
is not - `Expr.ewm_mean` only appears in the Polars API reference, but not in the Narwhals
one.

In general, you should expect any fundamental dataframe operation to be supported - if
one that you need is not, please do open a feature request!
- [Top-level functions](narwhals.md)
- [narwhals.DataFrame](dataframe.md)
- [narwhals.Expr](expr.md)
- [narwhals.Expr.cat](expr_cat.md)
- [narwhals.Expr.dt](expr_dt.md)
- [narwhals.Expr.name](expr_name.md)
- [narwhals.Expr.str](expr_str.md)
- [narwhals.GroupBy](group_by.md)
- [narwhals.LazyFrame](lazyframe.md)
- [narwhals.Schema](schema.md)
- [narwhals.Series](series.md)
- [narwhals.Series.cat](series_cat.md)
- [narwhals.Series.dt](series_dt.md)
- [narwhals.Series.str](series_str.md)
- [narwhals.dependencies](dependencies.md)
- [narwhals.dtypes](dtypes.md)
- [narwhals.selectors](selectors.md)
- [narwhals.typing](typing.md)
1 change: 1 addition & 0 deletions docs/api-reference/lazyframe.md
@@ -23,6 +23,7 @@
- select
- sort
- tail
- to_native
- unique
- with_columns
- with_row_index
2 changes: 2 additions & 0 deletions docs/api-reference/series.md
@@ -6,6 +6,7 @@
members:
- __arrow_c_stream__
- __getitem__
- __iter__
- abs
- alias
- all
@@ -57,6 +58,7 @@
- to_list
- to_numpy
- to_pandas
- to_native
- unique
- value_counts
- zip_with
1 change: 1 addition & 0 deletions docs/api-reference/series_str.md
@@ -7,6 +7,7 @@
- contains
- ends_with
- head
- len_chars
- replace
- replace_all
- slice
44 changes: 43 additions & 1 deletion docs/basics/dataframe.md
@@ -4,7 +4,7 @@ To write a dataframe-agnostic function, the steps you'll want to follow are:

1. Initialise a Narwhals DataFrame or LazyFrame by passing your dataframe to `nw.from_native`.
All the calculations stay lazy if we start with a lazy dataframe - Narwhals will never automatically trigger computation without you asking it to.

Note: if you need eager execution, make sure to pass `eager_only=True` to `nw.from_native`.

2. Express your logic using the subset of the Polars API supported by Narwhals.
@@ -21,6 +21,7 @@ Just like in Polars, we can pass expressions to
`DataFrame.select` or `LazyFrame.select`.

Make a Python file with the following content:

```python exec="1" source="above" session="df_ex1"
import narwhals as nw
from narwhals.typing import FrameT
@@ -34,6 +35,7 @@ def func(df: FrameT) -> FrameT:
a_std=nw.col("a").std(),
)
```

Let's try it out:

=== "pandas"
@@ -60,7 +62,16 @@ Let's try it out:
print(func(df).collect())
```

=== "PyArrow"
```python exec="true" source="material-block" result="python" session="df_ex1"
import pyarrow as pa

table = pa.table({"a": [1, 1, 2]})
print(func(table))
```

Alternatively, we could have opted for the more explicit version:

```python
import narwhals as nw
from narwhals.typing import IntoFrameT
@@ -75,6 +86,7 @@ def func(df_native: IntoFrameT) -> IntoFrameT:
)
return nw.to_native(df)
```

Despite being more verbose, it has the advantage of preserving the type annotation of the native
object - see [typing](../api-reference/typing.md) for more details.

@@ -84,6 +96,7 @@ In general, in this tutorial, we'll use the former.

Just like in Polars, we can pass expressions to `GroupBy.agg`.
Make a Python file with the following content:

```python exec="1" source="above" session="df_ex2"
import narwhals as nw
from narwhals.typing import FrameT
@@ -93,6 +106,7 @@ from narwhals.typing import FrameT
def func(df: FrameT) -> FrameT:
return df.group_by("a").agg(nw.col("b").mean()).sort("a")
```

Let's try it out:

=== "pandas"
@@ -119,12 +133,21 @@ Let's try it out:
print(func(df).collect())
```

=== "PyArrow"
```python exec="true" source="material-block" result="python" session="df_ex2"
import pyarrow as pa

table = pa.table({"a": [1, 1, 2], "b": [4, 5, 6]})
print(func(table))
```

## Example 3: horizontal sum

Expressions can be free-standing functions which accept other expressions as inputs.
For example, we can compute a horizontal sum using `nw.sum_horizontal`.

Make a Python file with the following content:

```python exec="1" source="above" session="df_ex3"
import narwhals as nw
from narwhals.typing import FrameT
@@ -134,6 +157,7 @@ from narwhals.typing import FrameT
def func(df: FrameT) -> FrameT:
return df.with_columns(a_plus_b=nw.sum_horizontal("a", "b"))
```

Let's try it out:

=== "pandas"
@@ -160,6 +184,14 @@ Let's try it out:
print(func(df).collect())
```

=== "PyArrow"
```python exec="true" source="material-block" result="python" session="df_ex3"
import pyarrow as pa

table = pa.table({"a": [1, 1, 2], "b": [4, 5, 6]})
print(func(table))
```

## Example 4: multiple inputs

`nw.narwhalify` can be used to decorate functions that take multiple inputs as well and
@@ -169,6 +201,7 @@ For example, let's compute how many rows are left in a dataframe after filtering
on a series.

Make a Python file with the following content:

```python exec="1" source="above" session="df_ex4"
from typing import Any

@@ -201,3 +234,12 @@ Let's try it out:
s = pl.Series([1, 3])
print(func(df, s.to_numpy(), "a"))
```

=== "PyArrow"
```python exec="true" source="material-block" result="python" session="df_ex4"
import pyarrow as pa

table = pa.table({"a": [1, 1, 2, 2, 3], "b": [4, 5, 6, 7, 8]})
a = pa.array([1, 3])
print(func(table, a.to_numpy(), "a"))
```