🛫 Optimize vec_transform again #103

Korijn · 2025-02-23T00:38:46Z

Follow-up to #102

I like optimizing too much I think. 🤪

After comparing many different implementations, this one turned out to be the fastest. In theory a 3x3 matrix multiplication should be faster than a full 4x4 matrix multiplication, but in practice that only holds for single vectors; for batches the full 4x4 product wins. I suspect there are hidden optimizations within NumPy (BLAS etc.). 🤡

@panxinmiao With projection=False, vec_transform is now always faster than apply_matrix (in both the single-vector and the batch case).
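For reference, here is a minimal sketch of why the 3x3 and 4x4 routes are interchangeable when projection=False. It assumes the matrix is affine (bottom row `[0, 0, 0, 1]`), so no perspective divide is needed. The helper names and the example matrix are made up for illustration and are not part of pylinalg.

```python
import numpy as np


def transform_affine_3x3(vectors, matrix):
    """Apply the upper-left 3x3 block, then add the translation column."""
    vectors = np.asarray(vectors, dtype="f8")
    matrix = np.asarray(matrix, dtype="f8")
    # vectors @ A.T equals (A @ vectors.T).T, and also works for a single vector
    return vectors @ matrix[:-1, :-1].T + matrix[:-1, -1]


def transform_homogeneous_4x4(vectors, matrix):
    """Append w=1, apply the full 4x4 matrix, then drop the w component."""
    vectors = np.asarray(vectors, dtype="f8")
    matrix = np.asarray(matrix, dtype="f8")
    ones = np.ones((*vectors.shape[:-1], 1))
    homogeneous = np.concatenate([vectors, ones], axis=-1)
    return (homogeneous @ matrix.T)[..., :-1]


# Hypothetical affine matrix: 90 degree rotation about z, scale 2 in x/y,
# translation (1, 2, 3).
matrix = np.array(
    [
        [0.0, -2.0, 0.0, 1.0],
        [2.0, 0.0, 0.0, 2.0],
        [0.0, 0.0, 1.0, 3.0],
        [0.0, 0.0, 0.0, 1.0],
    ]
)
single = np.array([1.0, 2.0, 3.0])
batch = np.tile(single, (5, 1))

print(transform_affine_3x3(single, matrix))
print(transform_homogeneous_4x4(single, matrix))
print(np.allclose(transform_affine_3x3(batch, matrix), transform_homogeneous_4x4(batch, matrix)))
```

Both routes give the same numbers, so choosing between them is purely a question of speed, which is what the timings below measure.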

# vector batch=1, timeit number=100000
vec_transform (projection=True) 0.402374500001315
vec_transform (projection=False) 0.25878190000366885
apply_matrix 0.28508210000291

# vector batch=100000, timeit number=10
vec_transform (batch, projection=True) 0.03246779998880811
vec_transform (batch, projection=False) 0.02070159999129828
apply_matrix (batch) 0.022843300001113676

Benchmarked with the script below:

import numpy as np
import pylinalg as la
import timeit


def vec_transform(
    vectors, matrix, /, *, w=1, projection=True, out=None, dtype=None
) -> np.ndarray:
    matrix = np.asarray(matrix, dtype='f8')

    if projection:
        vectors = la.vec_homogeneous(vectors, w=w, dtype='f8')
        if vectors.ndim == 1:
            vectors = matrix @ vectors
            vectors[:-1] /= vectors[-1]
            vectors = vectors[:-1]
        elif vectors.ndim == 2:
            vectors = (matrix @ vectors.T).T
            vectors = vectors[..., :-1] / vectors[..., -1, None]
        else:
            raise ValueError("vectors must be a 1D or 2D array")
    else:
        if vectors.ndim == 1:
            # scale the translation by w so this matches the batch path for w != 1
            vectors = matrix[:-1, :-1] @ vectors + w * matrix[:-1, -1]
        elif vectors.ndim == 2:
            vectors = la.vec_homogeneous(vectors, w=w, dtype='f8')
            vectors = (matrix @ vectors.T).T
            vectors = vectors[..., :-1]
        else:
            raise ValueError("vectors must be a 1D or 2D array")

    if out is None:
        out = vectors
        if dtype is not None:
            out = out.astype(dtype, copy=False)
    else:
        out[:] = vectors

    return out


def apply_matrix(v, m):
    v = la.vec_homogeneous(v, dtype='f8')
    vv = m @ v.T
    return vv.T[..., :-1]


v = np.array([1, 2, 3])
m = la.mat_compose(np.array([1, 2, 3]), la.quat_from_euler((1, 2, 3)), np.array([1, 2, 1]))

print(vec_transform(v, m))
print(vec_transform(v, m, projection=False))
print(apply_matrix(v, m))

v_batch = np.stack([v] * 100000)

print(vec_transform(v_batch, m)[:5])
print(vec_transform(v_batch, m, projection=False)[:5])
print(apply_matrix(v_batch, m)[:5])

print("vec_transform (projection=True)", timeit.timeit(lambda: vec_transform(v, m), number=100000))
print("vec_transform (projection=False)", timeit.timeit(lambda: vec_transform(v, m, projection=False), number=100000))
print("apply_matrix", timeit.timeit(lambda: apply_matrix(v, m), number=100000))
print("vec_transform (batch, projection=True)", timeit.timeit(lambda: vec_transform(v_batch, m), number=10))
print("vec_transform (batch, projection=False)", timeit.timeit(lambda: vec_transform(v_batch, m, projection=False), number=10))
print("apply_matrix (batch)", timeit.timeit(lambda: apply_matrix(v_batch, m), number=10))
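
The differences measured above are small, so a single timeit run can be swayed by whatever else the machine is doing. A possible refinement, sketched here with only the standard library, is to repeat each measurement and keep the minimum. The `best_time_per_call` helper and the `workload` stand-in are hypothetical names, not part of the script above.

```python
import timeit


def best_time_per_call(func, *, number, repeat=5):
    """Return the fastest observed time per call, in seconds."""
    # timeit.repeat runs `number` calls per round for `repeat` rounds and
    # returns one total per round; the minimum is the round least disturbed
    # by other processes.
    totals = timeit.repeat(func, number=number, repeat=repeat)
    return min(totals) / number


def workload():
    # Hypothetical stand-in for the function under test.
    return sum(range(100))


per_call = best_time_per_call(workload, number=1000)
print(f"workload: {per_call * 1e6:.3f} µs per call")
```

In the script above one would pass, for example, `lambda: vec_transform(v_batch, m)` with `number=10` instead of `workload`.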

panxinmiao · 2025-02-23T09:12:48Z

Great, ultimate performance optimization :)

almarklein · 2025-02-24T07:48:20Z

pylinalg/vector.py

@@ -105,17 +105,28 @@ def vec_transform(
        transformed vectors
    """

-    matrix = np.asarray(matrix)
+    matrix = np.asarray(matrix, dtype="f8")


I'm curious: is there a reason for using "f8" instead of float?

Korijn · 2025-02-24

Yes! They have the same effect, but "f8" is a string literal that is stored as a constant, whereas float is a name that has to be looked up at runtime (globals first, then builtins). It improves performance (significantly).
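
A rough way to check this yourself, assuming CPython and NumPy: the sketch below confirms that both spellings resolve to the same dtype, inspects the bytecode to see how each argument is loaded, and times both calls. The function names are hypothetical, the timings depend on the machine and NumPy version, and the name lookup is not necessarily the only factor, since NumPy also has to convert each argument into a dtype.

```python
import dis
import timeit

import numpy as np


def with_builtin(matrix):
    return np.asarray(matrix, dtype=float)


def with_literal(matrix):
    return np.asarray(matrix, dtype="f8")


def load_opname(func, value):
    """Return the name of the bytecode instruction that loads `value`."""
    for instruction in dis.get_instructions(func):
        if isinstance(instruction.argval, str) and instruction.argval == value:
            return instruction.opname
    return None


# Both spellings describe the same 64-bit float dtype.
same_dtype = np.dtype("f8") == np.dtype(float)
print("same dtype:", same_dtype)

# `float` is resolved by name at call time, "f8" is a stored constant.
print("float loaded via:", load_opname(with_builtin, "float"))
print('"f8" loaded via:', load_opname(with_literal, "f8"))

matrix = np.eye(4)
print("dtype=float:", timeit.timeit(lambda: with_builtin(matrix), number=20000))
print('dtype="f8": ', timeit.timeit(lambda: with_literal(matrix), number=20000))
```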

optimize vec_transform

06383c0

Korijn merged commit e436128 into main Feb 23, 2025
10 checks passed

Korijn deleted the optimize-vec-transform-more branch February 23, 2025 09:55

Korijn mentioned this pull request Feb 23, 2025

🔥 Optimize vec_transform again #104

Merged

almarklein reviewed Feb 24, 2025
