User facing linear algebra abstraction #187

Open
akunft opened this issue Apr 20, 2016 · 6 comments

Comments

@akunft
Contributor

akunft commented Apr 20, 2016

This issue is for discussing the user-facing abstraction for the matrix and vector types.

The initial prototype and ongoing effort is tracked in PR #191.

The focus is on the traits Matrix and Vector.

Below, I highlight the parts of the abstraction that are open for discussion:

Type bounds for the generic value

Currently, we use spire.Numeric as the type bound for the values in Matrix/Vector. This gives us all the basic operations +, -, * and / (Scala's own Numeric does not support /), and there are implicit conversions for all of Scala's numeric primitives.

Instead, we could define our own type as the bound for the values, similar to spire.Field, to support e.g. Strings, Booleans, ... As this would generalize the abstraction, we would also have to implement the implicit conversions that spire gives us for free.

I suggest keeping the Numeric bound for now and seeing whether a wider bound is needed.
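To make the trade-off concrete, here is a minimal sketch of a value type bound that provides +, -, * and /. It uses only the standard library: `scala.math.Fractional` stands in for spire.Numeric here, because the standard `Numeric` has no `/`. The `Vec` class and its methods are hypothetical and are not the traits from PR #191.

```scala
object TypeBoundSketch {
  // Hypothetical vector type. `Fractional` is a standard-library stand-in
  // for the spire.Numeric bound discussed above.
  final case class Vec[A](values: Seq[A])(implicit num: Fractional[A]) {
    import num._ // brings the infix operators +, -, * and / into scope for A

    // Element-wise addition of two vectors.
    def +(that: Vec[A]): Vec[A] =
      Vec(values.zip(that.values).map { case (a, b) => a + b })

    // Division of every element by a scalar; this needs `/` on A.
    def /(scalar: A): Vec[A] = Vec(values.map(_ / scalar))

    def sum: A = values.foldLeft(num.zero)(_ + _)

    def mean: A = sum / num.fromInt(values.length)
  }
}
```

With a plain `Numeric[A]` bound the `/` and `mean` methods would not compile, which is the gap the spire bound closes for all numeric primitives.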

Aggregations

Currently, we allow aggregations over vectors only. This lets users define their own aggregation functions. In combination with the cols() and rows() methods, users can define aggregations over the columns and rows of a matrix.

One point to note is that the return type depends on the result of the traversal.

    // one scalar per column: means should be a row-vector
    val means = for (col <- M.cols()) yield {
      col.aggregate(_ + _) / col.length
    }
    // one vector per column: shifted should be a matrix
    val shifted = for (col <- M.cols()) yield {
      col + 3
    }

Open questions:

  • What is the best-suited return type for the cols() and rows() methods: Array, Traversable, or DataBag[(Int, A)]?
  • Should we allow aggregations over all values in a matrix directly (e.g. to compute the sum over all values in the matrix)?
  • If we introduce matrix-wide aggregations, should we fix the traversal order of the values (row- or column-wise), let the user specify it, or allow only order-independent aggregations?
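The return-type question can be seen in a small self-contained sketch. The `Vec` and `Mat` types below are hypothetical stand-ins (row-major storage and `Seq` as the traversal type are assumptions, not the design in PR #191). With a plain `Seq` as the result of cols(), the two comprehensions yield a `Seq[Double]` and a `Seq[Vec]` rather than the row-vector and matrix the user would expect.

```scala
object TraversalSketch {
  // Hypothetical vector with a user-defined aggregation.
  final case class Vec(values: Vector[Double]) {
    def length: Int = values.length
    def aggregate(f: (Double, Double) => Double): Double = values.reduce(f)
    def +(scalar: Double): Vec = Vec(values.map(_ + scalar))
  }

  // Hypothetical matrix; values are stored row by row (an assumption).
  final case class Mat(numRows: Int, numCols: Int, data: Vector[Double]) {
    def rows(): Seq[Vec] =
      (0 until numRows).map(i => Vec(data.slice(i * numCols, (i + 1) * numCols)))

    def cols(): Seq[Vec] =
      (0 until numCols).map(j => Vec((0 until numRows).map(i => data(i * numCols + j)).toVector))
  }

  // 2 x 3 matrix with rows (1, 2, 3) and (4, 5, 6).
  val M: Mat = Mat(2, 3, Vector(1.0, 2.0, 3.0, 4.0, 5.0, 6.0))

  // One scalar per column: conceptually a row-vector, but typed as Seq[Double].
  val means: Seq[Double] =
    for (col <- M.cols()) yield col.aggregate(_ + _) / col.length

  // One vector per column: conceptually a matrix, but typed as Seq[Vec].
  val shifted: Seq[Vec] =
    for (col <- M.cols()) yield col + 3
}
```

Getting a vector or matrix back from the comprehension would need a traversal type whose `map` picks the result type based on the element type, which is what the first open question is about.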

Default implementation

The current implementations are based on one-dimensional arrays, which should make it easy to hand the execution of operators over to netlib-java (not yet done).
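As an illustration of the one-dimensional layout, the following sketch stores a matrix in a single flat array. The class name, the row-major index formula, and the methods are assumptions made for this example. The netlib-java call is left out on purpose, since that part is not done yet. BLAS routines conventionally assume column-major storage, so the actual layout would have to be chosen with that in mind.

```scala
object DenseStorageSketch {
  // Hypothetical dense matrix backed by one flat array (row-major assumed).
  final class DenseMatrix(val numRows: Int, val numCols: Int, val data: Array[Double]) {
    require(data.length == numRows * numCols, "data must hold numRows * numCols values")

    // Element (i, j) lives at offset i * numCols + j in the flat array.
    def apply(i: Int, j: Int): Double = data(i * numCols + j)

    // Element-wise addition is a single pass over the two flat arrays;
    // this is the kind of loop a native BLAS-style routine could replace.
    def +(that: DenseMatrix): DenseMatrix = {
      require(numRows == that.numRows && numCols == that.numCols, "shape mismatch")
      val out = new Array[Double](data.length)
      var k = 0
      while (k < data.length) {
        out(k) = data(k) + that.data(k)
        k += 1
      }
      new DenseMatrix(numRows, numCols, out)
    }
  }
}
```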

@aalexandrov
Contributor

  • 👍 for the Numeric type bound
  • 👍 for adding a third transformation elements() that allows for element-wise aggregations
  • 👎 for fixing the traversal order. I suggest sticking to the expressiveness of union-style folds, which are order-independent, unless we see a good reason for something else
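A rough sketch of what an order-independent, union-style fold over the elements of a matrix could look like follows. The `fold` helper and its parameter names (`zero`, `init`, `plus`) are hypothetical. The only assumption that matters is that `plus` is associative and commutative with `zero` as its neutral element, so the result does not depend on whether the elements arrive row-wise or column-wise.

```scala
object ElementsFoldSketch {
  // Hypothetical fold: map every element with `init`, then combine with `plus`.
  // `plus` is assumed to be associative and commutative, with `zero` as identity.
  def fold[A, B](elements: Iterable[A])(zero: B)(init: A => B, plus: (B, B) => B): B =
    elements.map(init).foldLeft(zero)(plus)

  // The same 2 x 2 matrix ((1, 2), (3, 4)) traversed in two different orders.
  val rowWise: Seq[Double] = Seq(1.0, 2.0, 3.0, 4.0)
  val colWise: Seq[Double] = Seq(1.0, 3.0, 2.0, 4.0)

  // Sum over all elements.
  val sumRowWise: Double = fold(rowWise)(0.0)(x => x, _ + _)
  val sumColWise: Double = fold(colWise)(0.0)(x => x, _ + _)

  // Count of elements greater than 2, using a different result type.
  val countRowWise: Int = fold(rowWise)(0)(x => if (x > 2.0) 1 else 0, _ + _)
  val countColWise: Int = fold(colWise)(0)(x => if (x > 2.0) 1 else 0, _ + _)
}
```

An operation such as subtraction would not qualify as `plus`, because its result changes with the traversal order.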

@fschueler
Contributor

I am still not sure about the Numeric type bound for matrices, but I think it should be fine for a start.

Do we also allow aggregations over Vectors? I think this is important, e.g. for vector norms.

👍 for aggregation over all elements. Most APIs offer aggregations specified by dimension (1 = rows, 2 = columns, default = all elements).

I would also not fix the traversal order, and would allow only commutative aggregation operations.

For the rest I will think about it some more!

@akunft
Contributor Author

akunft commented Apr 20, 2016

Yes, we do allow aggregations over vectors, as shown in the example above.
I also agree on the aggregation over elements, but I would keep the methods separate: cols() and rows() for per-vector aggregations, and an additional elements() method for aggregations over the elements of a matrix.

@akunft akunft added the LINALG label Apr 20, 2016
@akunft
Contributor Author

akunft commented Apr 21, 2016

Changes to the API are now tracked in PR #191.

@akunft
Contributor Author

akunft commented Apr 23, 2016

@stratosphere/emma-committers If nobody objects, I would allow +, -, *, / (single character only) as method names in the Scala style formatter, for the methods in matrix/vector.

@aalexandrov
Contributor

👍

3 participants