Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: use mypy pre-commit in local environment #1966

Merged
merged 56 commits into from
Feb 11, 2025
Merged
Show file tree
Hide file tree
Changes from 44 commits
Commits
Show all changes
56 commits
Select commit Hold shift + click to select a range
8be02ed
add hypothesis and control mypy in pyproject.toml
EdAbati Feb 5, 2025
eb70be2
mypy in local
EdAbati Feb 7, 2025
8e7990c
get_item tidy up draft
EdAbati Feb 7, 2025
6ae4436
Merge remote-tracking branch 'upstream/main' into mypy-fixes
EdAbati Feb 7, 2025
6017193
declaring relative path
EdAbati Feb 7, 2025
86c23ed
pyproject fix
EdAbati Feb 7, 2025
13745b4
restore precommit
EdAbati Feb 8, 2025
faa477b
add makefile
EdAbati Feb 8, 2025
b7aebab
skip mypy from precommit ci
EdAbati Feb 8, 2025
75f474d
new typing CI
EdAbati Feb 8, 2025
5f1aea6
add forgotten setup-python
EdAbati Feb 8, 2025
2144196
where is mypy?
EdAbati Feb 8, 2025
8664672
trying to create a venv
EdAbati Feb 8, 2025
3aaf983
fix
EdAbati Feb 8, 2025
a030f61
trying again with PATH
EdAbati Feb 8, 2025
9dbd938
always activate
EdAbati Feb 8, 2025
a4f6ae2
one more activate
EdAbati Feb 8, 2025
46fade5
remove redundant which mypy
EdAbati Feb 8, 2025
6bcea6b
force color?
EdAbati Feb 8, 2025
ce3eae9
force_color
EdAbati Feb 8, 2025
4e09713
Merge branch 'main' into mypy-fixes
EdAbati Feb 8, 2025
0ee5312
Merge branch 'main' into mypy-fixes
EdAbati Feb 8, 2025
07532ba
env var check
EdAbati Feb 8, 2025
57adae7
just mypy?
EdAbati Feb 8, 2025
5689858
colors are not showing, giving up
EdAbati Feb 8, 2025
a6892e2
add polars only in mypy dependencies
EdAbati Feb 8, 2025
96446f4
switching back to core
EdAbati Feb 8, 2025
45deb67
calm down mypy a bit
EdAbati Feb 8, 2025
5e03ed0
easy fix
EdAbati Feb 8, 2025
9eec997
Merge branch 'main' into mypy-fixes
EdAbati Feb 8, 2025
8c3fe94
some type fixes
EdAbati Feb 8, 2025
b301356
disable pyarrow-stubs
EdAbati Feb 8, 2025
8698311
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Feb 8, 2025
4e63fcb
restore
EdAbati Feb 8, 2025
c22fd06
Merge branch 'main' into mypy-fixes
dangotbanned Feb 8, 2025
9a2d9ff
Merge branch 'main' into mypy-fixes
EdAbati Feb 9, 2025
b9612cc
fix mypy version
EdAbati Feb 9, 2025
8177933
organise optional dependencies
EdAbati Feb 9, 2025
4456c33
add typing_extensions to tests
EdAbati Feb 9, 2025
5dcdc1a
Merge remote-tracking branch 'upstream/main' into mypy-fixes
EdAbati Feb 9, 2025
0dfed72
fix downstream tests
EdAbati Feb 9, 2025
58ed0bf
fix last types
EdAbati Feb 9, 2025
99a6690
add tests reqs
EdAbati Feb 9, 2025
81ee931
Merge branch 'main' into mypy-fixes
EdAbati Feb 9, 2025
bf9e0d9
fix(typing): resolve some new `_duckdb.expr` warnings
dangotbanned Feb 10, 2025
338962c
chore(typing): add stub ignore
dangotbanned Feb 10, 2025
9abdd34
Merge remote-tracking branch 'upstream/main' into pr/EdAbati/1966
dangotbanned Feb 10, 2025
54a29c7
fix(typing): widen annotation to inclide `_1DArray`
dangotbanned Feb 10, 2025
5306ce2
refactor: replace `ConstantExpression` w/ `lit` alias
dangotbanned Feb 10, 2025
6fc845e
fix(typing): resolve `_duckdb` `[operator]` warnings
dangotbanned Feb 10, 2025
1dd6d98
revert(typing): undo "fixes" for `ddof`
dangotbanned Feb 10, 2025
fd192f2
chore(typing): resolve `PandasLikeSeries.hist` warnings
dangotbanned Feb 10, 2025
53a8dda
fix(typing): resolve 26 `pyright` warnings
dangotbanned Feb 10, 2025
9f87adf
Merge branch 'main' into mypy-fixes
dangotbanned Feb 10, 2025
59b7ba5
refactor(typing): replace `NoReturn` w/ `Never`
dangotbanned Feb 10, 2025
e7a8494
Merge remote-tracking branch 'upstream/main' into pr/EdAbati/1966
dangotbanned Feb 11, 2025
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion .github/workflows/check_tpch_queries.yml
Original file line number Diff line number Diff line change
Expand Up @@ -25,7 +25,7 @@ jobs:
cache-suffix: ${{ matrix.python-version }}
cache-dependency-glob: "pyproject.toml"
- name: local-install
run: uv pip install -U --pre -e ".[dev, core, dask]" --system
run: uv pip install -U --pre -e ".[tests, core, dask]" --system
- name: generate-data
run: cd tpch && python generate_data.py
- name: tpch-tests
Expand Down
2 changes: 1 addition & 1 deletion .github/workflows/downstream_tests.yml
Original file line number Diff line number Diff line change
Expand Up @@ -220,7 +220,7 @@ jobs:
run: |
cd tea-tasting
pdm remove narwhals
pdm add ./..[dev]
pdm add ./..[tests]
- name: show-deps
run: |
cd tea-tasting
Expand Down
8 changes: 4 additions & 4 deletions .github/workflows/extremes.yml
Original file line number Diff line number Diff line change
Expand Up @@ -28,7 +28,7 @@ jobs:
run: uv pip install pipdeptree tox virtualenv setuptools pandas==0.25.3 polars==0.20.3 numpy==1.17.5 pyarrow==11.0.0 "pyarrow-stubs<17" scipy==1.5.0 scikit-learn==1.1.0 duckdb==1.0 tzdata --system
- name: install-reqs
run: |
uv pip install -e ".[dev]" --system
uv pip install -e ".[tests]" --system
- name: show-deps
run: uv pip freeze
- name: Assert dependencies
Expand Down Expand Up @@ -64,7 +64,7 @@ jobs:
- name: install-pretty-old-versions
run: uv pip install pipdeptree tox virtualenv setuptools pandas==1.1.5 polars==0.20.3 numpy==1.17.5 pyarrow==11.0.0 "pyarrow-stubs<17" pyspark==3.5.0 scipy==1.5.0 scikit-learn==1.1.0 duckdb==1.0 tzdata --system
- name: install-reqs
run: uv pip install -e ".[dev]" --system
run: uv pip install -e ".[tests]" --system
- name: show-deps
run: uv pip freeze
- name: show-deptree
Expand Down Expand Up @@ -103,7 +103,7 @@ jobs:
- name: install-not-so-old-versions
run: uv pip install tox virtualenv setuptools pandas==2.0.3 polars==0.20.8 numpy==1.24.4 pyarrow==15.0.0 "pyarrow-stubs<17" pyspark==3.5.0 scipy==1.8.0 scikit-learn==1.3.0 duckdb==1.0 dask[dataframe]==2024.10 tzdata --system
- name: install-reqs
run: uv pip install -e ".[dev]" --system
run: uv pip install -e ".[tests]" --system
- name: show-deps
run: uv pip freeze
- name: Assert not so old versions dependencies
Expand Down Expand Up @@ -140,7 +140,7 @@ jobs:
cache-suffix: ${{ matrix.python-version }}
cache-dependency-glob: "pyproject.toml"
- name: install-reqs
run: uv pip install -e ".[dev]" --system
run: uv pip install -e ".[tests]" --system
- name: install-kaggle
run: uv pip install kaggle --system
- name: Download Kaggle notebook artifact
Expand Down
6 changes: 3 additions & 3 deletions .github/workflows/pytest.yml
Original file line number Diff line number Diff line change
Expand Up @@ -25,7 +25,7 @@ jobs:
cache-dependency-glob: "pyproject.toml"
- name: install-reqs
# Python3.8 is technically at end-of-life, so we don't test everything
run: uv pip install -e ".[dev, core]" --system
run: uv pip install -e ".[tests, core]" --system
- name: show-deps
run: uv pip freeze
- name: Run pytest
Expand All @@ -49,7 +49,7 @@ jobs:
cache-suffix: ${{ matrix.python-version }}
cache-dependency-glob: "pyproject.toml"
- name: install-reqs
run: uv pip install -e ".[dev, core, extra, dask, modin]" --system
run: uv pip install -e ".[tests, core, extra, dask, modin]" --system
- name: install pyspark
run: uv pip install -e ".[pyspark]" --system
# PySpark is not yet available on Python3.12+
Expand Down Expand Up @@ -83,7 +83,7 @@ jobs:
cache-suffix: ${{ matrix.python-version }}
cache-dependency-glob: "pyproject.toml"
- name: install-reqs
run: uv pip install -e ".[dev, core, extra, modin, dask]" --system
run: uv pip install -e ".[tests, core, extra, modin, dask]" --system
- name: install pyspark
run: uv pip install -e ".[pyspark]" --system
# PySpark is not yet available on Python3.12+
Expand Down
2 changes: 1 addition & 1 deletion .github/workflows/random_ci_pytest.yml
Original file line number Diff line number Diff line change
Expand Up @@ -27,7 +27,7 @@ jobs:
- name: install-random-verions
run: uv pip install -r random-requirements.txt --system
- name: install-narwhals
run: uv pip install -e ".[dev]" --system
run: uv pip install -e ".[tests]" --system
- name: show versions
run: uv pip freeze
- name: Run pytest
Expand Down
40 changes: 40 additions & 0 deletions .github/workflows/typing.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,40 @@
name: Type checking

on:
pull_request:
push:
branches: [main]

jobs:
mypy:
strategy:
matrix:
python-version: ["3.11"]
os: [ubuntu-latest]
runs-on: ${{ matrix.os }}
steps:
- uses: actions/checkout@v4
- uses: actions/setup-python@v5
with:
python-version: ${{ matrix.python-version }}
- name: Install uv
uses: astral-sh/setup-uv@v5
with:
enable-cache: "true"
cache-suffix: ${{ matrix.python-version }}
cache-dependency-glob: "pyproject.toml"
- name: Create venv
run: uv venv .venv
- name: install-reqs
# TODO: add more dependencies/backends incrementally
run: |
source .venv/bin/activate
uv pip install -e ".[tests, typing, core]"
- name: show-deps
run: |
source .venv/bin/activate
uv pip freeze
- name: Run mypy
run: |
source .venv/bin/activate
make typing
14 changes: 8 additions & 6 deletions .pre-commit-config.yaml
Original file line number Diff line number Diff line change
@@ -1,5 +1,6 @@
ci:
autoupdate_schedule: monthly
skip: [mypy]
repos:
- repo: https://github.com/astral-sh/ruff-pre-commit
# Ruff version.
Expand All @@ -13,12 +14,6 @@ repos:
- id: ruff
alias: check-docstrings
entry: python utils/check_docstrings.py
- repo: https://github.com/pre-commit/mirrors-mypy
rev: 'v1.14.1'
hooks:
- id: mypy
additional_dependencies: ['polars==1.4.1', 'pytest==8.3.2']
files: ^(narwhals|tests)/
- repo: https://github.com/codespell-project/codespell
rev: 'v2.4.1'
hooks:
Expand Down Expand Up @@ -84,6 +79,13 @@ repos:
entry: pull_request_target
language: pygrep
files: ^\.github/workflows/
- id: mypy
name: mypy
entry: make typing
files: ^(narwhals|tests)/
language: system
types: [python]
require_serial: true
- repo: https://github.com/adamchainz/blacken-docs
rev: "1.19.1" # replace with latest tag on GitHub
hooks:
Expand Down
23 changes: 23 additions & 0 deletions Makefile
Original file line number Diff line number Diff line change
@@ -0,0 +1,23 @@
# Mostly based on polars Makefile
# https://github.com/pola-rs/polars/blob/main/py-polars/Makefile

.DEFAULT_GOAL := help

SHELL=bash
VENV=./.venv

ifeq ($(OS),Windows_NT)
VENV_BIN=$(VENV)/Scripts
else
VENV_BIN=$(VENV)/bin
endif


.PHONY: help
help: ## Display this help screen
@echo -e "\033[1mAvailable commands:\033[0m"
@grep -E '^[a-z.A-Z_-]+:.*?## .*$$' $(MAKEFILE_LIST) | awk 'BEGIN {FS = ":.*?## "}; {printf " \033[36m%-22s\033[0m %s\n", $$1, $$2}' | sort

.PHONY: typing
typing: ## Run typing checks
$(VENV_BIN)/mypy
5 changes: 3 additions & 2 deletions narwhals/_arrow/series.py
Original file line number Diff line number Diff line change
Expand Up @@ -38,6 +38,7 @@
from narwhals._arrow.namespace import ArrowNamespace
from narwhals.dtypes import DType
from narwhals.typing import _1DArray
from narwhals.typing import _2DArray
from narwhals.utils import Version


Expand Down Expand Up @@ -339,7 +340,7 @@ def __getitem__(self: Self, idx: int | slice | Sequence[int]) -> Any | Self:
def scatter(self: Self, indices: int | Sequence[int], values: Any) -> Self:
import numpy as np # ignore-banned-import

mask = np.zeros(self.len(), dtype=bool)
mask: _1DArray = np.zeros(self.len(), dtype=bool)
mask[indices] = True
if isinstance(values, self.__class__):
ser, values = broadcast_and_extract_native(
Expand Down Expand Up @@ -726,7 +727,7 @@ def to_dummies(self: Self, *, separator: str, drop_first: bool) -> ArrowDataFram
name = self._name
da = series.dictionary_encode(null_encoding="encode").combine_chunks()

columns = np.zeros((len(da.dictionary), len(da)), np.int8)
columns: _2DArray = np.zeros((len(da.dictionary), len(da)), np.int8)
columns[da.indices, np.arange(len(da))] = 1
null_col_pa, null_col_pl = f"{name}{separator}None", f"{name}{separator}null"
cols = [
Expand Down
4 changes: 2 additions & 2 deletions narwhals/_duckdb/dataframe.py
Original file line number Diff line number Diff line change
Expand Up @@ -159,7 +159,7 @@ def select(
):
return self._from_native_frame(
self._native_frame.aggregate(
[val.alias(col) for col, val in new_columns_map.items()]
[val.alias(col) for col, val in new_columns_map.items()] # type: ignore[arg-type]
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

based on the duckdb stubs, only str are allowed in .aggregate. πŸ€”

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks wrong, it should be str | list[Expression], right? want to make a PR to duckdb?

),
validate_column_names=False,
)
Expand Down Expand Up @@ -482,7 +482,7 @@ def explode(self: Self, columns: list[str]) -> Self:
original_columns = self.columns

not_null_condition = (
col_to_explode.isnotnull() & FunctionExpression("len", col_to_explode) > 0
col_to_explode.isnotnull() & FunctionExpression("len", col_to_explode) > 0 # type: ignore[operator]
)
non_null_rel = rel.filter(not_null_condition).select(
*(
Expand Down
17 changes: 10 additions & 7 deletions narwhals/_duckdb/expr.py
Original file line number Diff line number Diff line change
Expand Up @@ -33,7 +33,7 @@
from narwhals.utils import Version


class DuckDBExpr(CompliantExpr["duckdb.Expression"]):
class DuckDBExpr(CompliantExpr["duckdb.Expression"]): # type: ignore[type-var]
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This ignore should actually be solved. duckdb.Expression is not a CompatibleSeries

Is it ok for a follow-up?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yup definitely

_implementation = Implementation.DUCKDB
_depth = 0 # Unused, just for compatibility with CompliantExpr

Expand Down Expand Up @@ -312,13 +312,16 @@ def skew(self: Self) -> Self:
def func(_input: duckdb.Expression) -> duckdb.Expression:
count = FunctionExpression("count", _input)
return CaseExpression(
condition=count == 0, value=ConstantExpression(None)
condition=count == 0, # type: ignore[arg-type]
value=ConstantExpression(None),
).otherwise(
CaseExpression(
condition=count == 1, value=ConstantExpression(float("nan"))
condition=count == 1, # type: ignore[arg-type]
value=ConstantExpression(float("nan")),
).otherwise(
CaseExpression(
condition=count == 2, value=ConstantExpression(0.0)
condition=count == 2, # type: ignore[arg-type]
value=ConstantExpression(0.0),
).otherwise(FunctionExpression("skewness", _input))
)
)
Expand Down Expand Up @@ -427,7 +430,7 @@ def _std(_input: duckdb.Expression, ddof: int) -> duckdb.Expression:
return (
FunctionExpression("stddev_pop", _input)
* FunctionExpression("sqrt", n_samples)
/ (FunctionExpression("sqrt", (n_samples - ddof)))
/ (FunctionExpression("sqrt", (n_samples - ddof))) # type: ignore[operator]
)

return self._from_call(
Expand All @@ -440,7 +443,7 @@ def _std(_input: duckdb.Expression, ddof: int) -> duckdb.Expression:
def var(self: Self, ddof: int) -> Self:
def _var(_input: duckdb.Expression, ddof: int) -> duckdb.Expression:
n_samples = FunctionExpression("count", _input)
return FunctionExpression("var_pop", _input) * n_samples / (n_samples - ddof)
return FunctionExpression("var_pop", _input) * n_samples / (n_samples - ddof) # type: ignore[operator]

return self._from_call(
_var,
Expand Down Expand Up @@ -521,7 +524,7 @@ def fill_null(self: Self, value: Any, strategy: Any, limit: int | None) -> Self:
def cast(self: Self, dtype: DType | type[DType]) -> Self:
def func(_input: duckdb.Expression) -> duckdb.Expression:
native_dtype = narwhals_to_native_dtype(dtype, self._version)
return _input.cast(native_dtype)
return _input.cast(native_dtype) # type: ignore[arg-type]

return self._from_call(
func,
Expand Down
12 changes: 6 additions & 6 deletions narwhals/_duckdb/expr_dt.py
dangotbanned marked this conversation as resolved.
Show resolved Hide resolved
Original file line number Diff line number Diff line change
Expand Up @@ -60,23 +60,23 @@ def second(self: Self) -> DuckDBExpr:
def millisecond(self: Self) -> DuckDBExpr:
return self._compliant_expr._from_call(
lambda _input: FunctionExpression("millisecond", _input)
- FunctionExpression("second", _input) * 1_000,
dangotbanned marked this conversation as resolved.
Show resolved Hide resolved
- FunctionExpression("second", _input) * 1_000, # type: ignore[operator]
"millisecond",
expr_kind=self._compliant_expr._expr_kind,
)

def microsecond(self: Self) -> DuckDBExpr:
return self._compliant_expr._from_call(
lambda _input: FunctionExpression("microsecond", _input)
- FunctionExpression("second", _input) * 1_000_000,
- FunctionExpression("second", _input) * 1_000_000, # type: ignore[operator]
"microsecond",
expr_kind=self._compliant_expr._expr_kind,
)

def nanosecond(self: Self) -> DuckDBExpr:
return self._compliant_expr._from_call(
lambda _input: FunctionExpression("nanosecond", _input)
- FunctionExpression("second", _input) * 1_000_000_000,
- FunctionExpression("second", _input) * 1_000_000_000, # type: ignore[operator]
"nanosecond",
expr_kind=self._compliant_expr._expr_kind,
)
Expand Down Expand Up @@ -123,7 +123,7 @@ def total_minutes(self: Self) -> DuckDBExpr:
def total_seconds(self: Self) -> DuckDBExpr:
return self._compliant_expr._from_call(
lambda _input: 60
* FunctionExpression("datepart", ConstantExpression("minute"), _input)
* FunctionExpression("datepart", ConstantExpression("minute"), _input) # type: ignore[operator]
+ FunctionExpression("datepart", ConstantExpression("second"), _input),
"total_seconds",
expr_kind=self._compliant_expr._expr_kind,
Expand All @@ -132,7 +132,7 @@ def total_seconds(self: Self) -> DuckDBExpr:
def total_milliseconds(self: Self) -> DuckDBExpr:
return self._compliant_expr._from_call(
lambda _input: 60_000
* FunctionExpression("datepart", ConstantExpression("minute"), _input)
* FunctionExpression("datepart", ConstantExpression("minute"), _input) # type: ignore[operator]
+ FunctionExpression("datepart", ConstantExpression("millisecond"), _input),
"total_milliseconds",
expr_kind=self._compliant_expr._expr_kind,
Expand All @@ -141,7 +141,7 @@ def total_milliseconds(self: Self) -> DuckDBExpr:
def total_microseconds(self: Self) -> DuckDBExpr:
return self._compliant_expr._from_call(
lambda _input: 60_000_000
* FunctionExpression("datepart", ConstantExpression("minute"), _input)
* FunctionExpression("datepart", ConstantExpression("minute"), _input) # type: ignore[operator]
+ FunctionExpression("datepart", ConstantExpression("microsecond"), _input),
"total_microseconds",
expr_kind=self._compliant_expr._expr_kind,
Expand Down
4 changes: 2 additions & 2 deletions narwhals/_duckdb/expr_str.py
Original file line number Diff line number Diff line change
Expand Up @@ -54,10 +54,10 @@ def func(_input: duckdb.Expression) -> duckdb.Expression:
_input,
ConstantExpression(offset + 1)
if offset >= 0
else FunctionExpression("length", _input) + offset + 1,
else FunctionExpression("length", _input) + offset + 1, # type: ignore[operator]
FunctionExpression("length", _input)
if length is None
else ConstantExpression(length) + offset,
else ConstantExpression(length) + offset, # type: ignore[operator]
)

return self._compliant_expr._from_call(
Expand Down
5 changes: 3 additions & 2 deletions narwhals/_duckdb/group_by.py
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,7 @@
from typing import TYPE_CHECKING

if TYPE_CHECKING:
from duckdb import Expression
from typing_extensions import Self

from narwhals._duckdb.dataframe import DuckDBLazyFrame
Expand All @@ -23,7 +24,7 @@ def __init__(
self._keys = keys

def agg(self: Self, *exprs: DuckDBExpr) -> DuckDBLazyFrame:
agg_columns = self._keys.copy()
agg_columns: list[str | Expression] = list(self._keys)
df = self._compliant_frame
for expr in exprs:
output_names = expr._evaluate_output_names(df)
Expand All @@ -49,5 +50,5 @@ def agg(self: Self, *exprs: DuckDBExpr) -> DuckDBLazyFrame:
)

return self._compliant_frame._from_native_frame(
self._compliant_frame._native_frame.aggregate(agg_columns)
self._compliant_frame._native_frame.aggregate(agg_columns) # type: ignore[arg-type]
)
Loading
Loading