Polars
stinodego committed Nov 2, 2023
1 parent 01b2d79 commit 4f5b729
Showing 7 changed files with 18 additions and 18 deletions.
14 changes: 7 additions & 7 deletions README.md
@@ -158,7 +158,7 @@ Refer to the [Polars CLI repository](https://github.com/pola-rs/polars-cli) for
Polars is very fast. In fact, it is one of the best performing solutions available.
See the results in [DuckDB's db-benchmark](https://duckdblabs.github.io/db-benchmark/).

In the [TPCH benchmarks](https://www.pola.rs/benchmarks.html) polars is orders of magnitudes faster than pandas, dask, modin and vaex
In the [TPCH benchmarks](https://www.pola.rs/benchmarks.html) Polars is orders of magnitudes faster than pandas, dask, modin and vaex
on full queries (including IO).

### Lightweight
@@ -228,9 +228,9 @@ Required Rust version `>=1.71`.

Want to contribute? Read our [contribution guideline](/CONTRIBUTING.md).

## Python: compile polars from source
## Python: compile Polars from source

If you want a bleeding edge release or maximal performance you should compile **polars** from source.
If you want a bleeding edge release or maximal performance you should compile **Polars** from source.

This can be done by going through the following steps in sequence:

@@ -251,7 +251,7 @@ can `pip install polars` and `import polars`.

## Use custom Rust function in Python?

Extending polars with UDFs compiled in Rust is easy. We expose pyo3 extensions for `DataFrame` and `Series`
Extending Polars with UDFs compiled in Rust is easy. We expose pyo3 extensions for `DataFrame` and `Series`
data structures. See more in https://github.com/pola-rs/pyo3-polars.

## Going big...
@@ -260,13 +260,13 @@ Do you expect more than `2^32` ~4,2 billion rows? Compile polars with the `bigid

Or for Python users install `pip install polars-u64-idx`.

Don't use this unless you hit the row boundary as the default polars is faster and consumes less memory.
Don't use this unless you hit the row boundary as the default Polars is faster and consumes less memory.
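
As a rough illustration of the row boundary mentioned above, the sketch below works out the limit numerically. The names `U32_ROW_LIMIT` and `needs_big_index` are hypothetical helpers made up for this example; they are not part of Polars.

```python
# Hypothetical helper names; this only illustrates the arithmetic, not Polars internals.
U32_ROW_LIMIT = 2**32  # row count quoted in the README text (~4.29 billion)


def needs_big_index(n_rows: int) -> bool:
    """Return True when the expected row count exceeds the 2^32 boundary."""
    return n_rows > U32_ROW_LIMIT


print(f"{U32_ROW_LIMIT:,}")  # 4,294,967,296
print(needs_big_index(5_000_000_000))  # True
```

Only when a check like this comes out true would the `polars-u64-idx` build be worth its extra cost.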

## Legacy

Do you want polars to run on an old CPU (e.g. dating from before 2011), or on an `x86-64` build
Do you want Polars to run on an old CPU (e.g. dating from before 2011), or on an `x86-64` build
of Python on Apple Silicon under Rosetta? Install `pip install polars-lts-cpu`. This version of
polars is compiled without [AVX](https://en.wikipedia.org/wiki/Advanced_Vector_Extensions) target
Polars is compiled without [AVX](https://en.wikipedia.org/wiki/Advanced_Vector_Extensions) target
features.

## Sponsors
10 changes: 5 additions & 5 deletions docs/user-guide/expressions/plugins.md
@@ -1,7 +1,7 @@
# Expression plugins

Expression plugins are the preferred way to create user defined functions. They allow you to compile a Rust function
and register that as an expression into the polars library. The polars engine will dynamically link your function at runtime
and register that as an expression into the Polars library. The Polars engine will dynamically link your function at runtime
and your expression will run almost as fast as native expressions. Note that this works without any interference of Python
and thus no GIL contention.

@@ -85,9 +85,9 @@ named `expression_lib` and we create an `expression_lib/__init__.py`. The result
```

Then we create a new class `Language` that will hold the expressions for our new `expr.language` namespace. The function
name of our expression can be registered. Note that it is important that this name is correct, otherwise the main polars
package cannot resolve the function name. Furthermore we can set additional keyword arguments that explain to polars how
this expression behaves. In this case we tell polars that this function is elementwise. This allows polars to run this
name of our expression can be registered. Note that it is important that this name is correct, otherwise the main Polars
package cannot resolve the function name. Furthermore we can set additional keyword arguments that explain to Polars how
this expression behaves. In this case we tell Polars that this function is elementwise. This allows Polars to run this
expression in batches. Whereas for other operations this would not be allowed, think for instance of a sort, or a slice.

```python
@@ -96,7 +96,7 @@ import polars as pl
from polars.type_aliases import IntoExpr
from polars.utils.udfs import _get_shared_lib_location

# boilerplate needed to inform polars of the location of binary wheel.
# Boilerplate needed to inform Polars of the location of binary wheel.
lib = _get_shared_lib_location(__file__)

@pl.api.register_expr_namespace("language")
4 changes: 2 additions & 2 deletions docs/user-guide/expressions/user-defined-functions.md
@@ -160,15 +160,15 @@ In Python, those would be passed as `dict` to the calling Python function and ca

### Return types?

Custom Python functions are black boxes for polars. We really don't know what kind of black arts you are doing, so we have
Custom Python functions are black boxes for Polars. We really don't know what kind of black arts you are doing, so we have
to infer and try our best to understand what you meant.

As a user it helps to understand what we do to better utilize custom functions.

The data type is automatically inferred. We do that by waiting for the first non-null value. That value will then be used
to determine the type of the `Series`.

The mapping of Python types to polars data types is as follows:
The mapping of Python types to Polars data types is as follows:

- `int` -> `Int64`
- `float` -> `Float64`
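
The inference rule described above (wait for the first non-null value, then map its Python type) can be sketched in plain Python. This is a simplified model and not the Polars implementation; `infer_dtype_name` and `PYTHON_TO_POLARS` are hypothetical names, and the table holds only the two mappings visible in this excerpt.

```python
from typing import Any, Iterable, Optional

# Only the mappings shown in this excerpt; the remainder of the list is not reproduced here.
PYTHON_TO_POLARS = {int: "Int64", float: "Float64"}


def infer_dtype_name(values: Iterable[Any]) -> Optional[str]:
    """Return the data type name implied by the first non-null value.

    Returns None when every value is null or the type is not in the table above.
    """
    for value in values:
        if value is None:
            continue  # keep waiting for the first non-null value
        # Exact type lookup, so that e.g. bool is not silently treated as int.
        return PYTHON_TO_POLARS.get(type(value))
    return None


print(infer_dtype_name([None, None, 3, 4.5]))  # Int64
print(infer_dtype_name([None, 1.5, 2]))        # Float64
```

Note that in this sketch the first non-null value alone decides the result, which is why returning a consistent type from a custom function matters.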
2 changes: 1 addition & 1 deletion docs/user-guide/io/json.md
@@ -12,7 +12,7 @@ Reading a JSON file should look familiar:

### Newline Delimited JSON

JSON objects that are delimited by newlines can be read into polars in a much more performant way than standard json.
JSON objects that are delimited by newlines can be read into Polars in a much more performant way than standard json.
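
To make the format concrete: in NDJSON every line is a complete JSON document, so a reader can handle one record at a time instead of parsing one large array. The sketch below shows this with the Python standard library only. It illustrates the file format, not how Polars reads it, and `iter_ndjson` is a hypothetical helper.

```python
import io
import json

# Two records, one JSON object per line.
ndjson_text = '{"a": 1, "b": "x"}\n{"a": 2, "b": "y"}\n'


def iter_ndjson(stream):
    """Yield one parsed record per non-empty line of an NDJSON stream."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


records = list(iter_ndjson(io.StringIO(ndjson_text)))
print(records)  # [{'a': 1, 'b': 'x'}, {'a': 2, 'b': 'y'}]
```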

Polars can read an NDJSON file into a `DataFrame` using the `read_ndjson` function:

2 changes: 1 addition & 1 deletion docs/user-guide/migration/pandas.md
@@ -21,7 +21,7 @@ Operations like resampling will be done by specialized functions or methods that
stating the columns that that 'verb' operates on. As such, it is our conviction that not having indices make things simpler,
more explicit, more readable and less error-prone.

Note that an 'index' data structure as known in databases will be used by polars as an optimization technique.
Note that an 'index' data structure as known in databases will be used by Polars as an optimization technique.

### Polars uses Apache Arrow arrays to represent data in memory while pandas uses NumPy arrays

2 changes: 1 addition & 1 deletion docs/user-guide/misc/alternatives.md
@@ -1,6 +1,6 @@
# Alternatives

These are some tools that share similar functionality to what polars does.
These are some tools that share similar functionality to what Polars does.

- Pandas

2 changes: 1 addition & 1 deletion docs/user-guide/transformations/pivot.md
@@ -32,7 +32,7 @@ aggregation.

## Lazy

A polars `LazyFrame` always need to know the schema of a computation statically (before collecting the query).
A Polars `LazyFrame` always need to know the schema of a computation statically (before collecting the query).
As a pivot's output schema depends on the data, and it is therefore impossible to determine the schema without
running the query.
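
A small plain-Python model shows why the output schema of a pivot cannot be known up front: the resulting column names are the distinct values found in the pivoted column. This is a sketch of the idea only, not Polars code, and `pivot_column_names` is a hypothetical helper.

```python
def pivot_column_names(rows, index, columns):
    """Return the column names a pivot on `columns` would produce for these rows."""
    names = [index]
    for row in rows:
        name = str(row[columns])
        if name not in names:
            names.append(name)  # each distinct value becomes an output column
    return names


rows_a = [{"id": 1, "kind": "x", "v": 10}, {"id": 1, "kind": "y", "v": 20}]
rows_b = [{"id": 1, "kind": "x", "v": 10}, {"id": 1, "kind": "z", "v": 30}]

# Same input schema, different output schemas: the result depends on the data.
print(pivot_column_names(rows_a, "id", "kind"))  # ['id', 'x', 'y']
print(pivot_column_names(rows_b, "id", "kind"))  # ['id', 'x', 'z']
```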

