add reference output #39

willow-ahrens · 2023-10-30T19:16:02Z

This PR adds reference output for binsparse matrices and vectors @BenBrock @ivirshup. fixes #38

github-actions · 2023-10-30T19:16:16Z

Automated Review URLs

render latest/index.bs

BenBrock · 2023-11-27T18:01:18Z

Hi @willow-ahrens, just took a look and ran into two issues with my parser:

My parser reads all the binsparse matrices as having shape [0, 0], which I don't think is correct. Not sure if this is a parsing issue or an error in the files.
There's no "version" key in the files, which my parser expects. Maybe we can add a "version" key equal to 0.1?

willow-ahrens · 2023-12-04T16:33:59Z

I think it should be fixed now!

BenBrock · 2024-04-23T00:03:29Z

I'm still not seeing a version in the HDF5, which is causing my parser to fail. Here's what I see in h5dump for the binsparse JSON in /examples/reference/b1_ss_COO/b1_ss_COO.bsp.h5:

   ATTRIBUTE "binsparse" {
      DATATYPE  H5T_STRING {
         STRSIZE 328;
         STRPAD H5T_STR_NULLTERM;
         CSET H5T_CSET_UTF8;
         CTYPE H5T_C_S1;
      }
      DATASPACE  SCALAR
      DATA {
      (0): "{
               "binsparse": {
                   "format": "COOR",
                   "fill": true,
                   "shape": [
                       7,
                       7
                   ],
                   "data_types": {
                       "indices_0": "int64",
                       "indices_1": "int64",
                       "values": "float64",
                       "fill_value": "float64"
                   },
                   "attrs": {}
               }
           }
           "
      }
   }
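A quick way to reproduce the check above is to parse the descriptor text and list the keys a parser expects but cannot find. This is only a minimal sketch: the `required` tuple is a hypothetical set of keys, and it assumes the version belongs inside the "binsparse" object next to "format" and "shape".

```python
import json

# Descriptor text copied from the h5dump output above (whitespace condensed).
descriptor_text = """
{
    "binsparse": {
        "format": "COOR",
        "fill": true,
        "shape": [7, 7],
        "data_types": {
            "indices_0": "int64",
            "indices_1": "int64",
            "values": "float64",
            "fill_value": "float64"
        },
        "attrs": {}
    }
}
"""


def missing_keys(text, required=("version", "format", "shape")):
    """Return the required keys that are absent from the "binsparse" object.

    `required` is a hypothetical list chosen for illustration only.
    """
    descriptor = json.loads(text)["binsparse"]
    return [key for key in required if key not in descriptor]


print(missing_keys(descriptor_text))  # prints ['version']
```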

Could we add a version tag with value 1.0? We should also perhaps discuss how version checking should work at some point. (e.g. should a non-tensor parser try to accept everything in [1.0, 2.0) and fail upon encountering something unparsable, or instead only accept something in [1.0, highest_known_nontensor_version]?)
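To make the two candidate policies concrete, here is a rough sketch. It assumes versions are already available as (major, minor) integer tuples; the function names and the `highest_known` parameter are hypothetical and not part of any existing parser.

```python
def accepts_half_open(version, major=1):
    """Policy A: accept anything in [major.0, (major + 1).0).

    The parser would still fail later if it meets content it cannot parse.
    """
    return version[0] == major


def accepts_up_to_known(version, highest_known, major=1):
    """Policy B: accept only versions in [major.0, highest_known]."""
    return (major, 0) <= version <= highest_known


# A parser that knows versions up to 1.2 meets a file tagged 1.7:
print(accepts_half_open((1, 7)))            # prints True
print(accepts_up_to_known((1, 7), (1, 2)))  # prints False
```

Policy A lets older parsers read newer minor versions as long as nothing unparsable shows up, while Policy B rejects them up front.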

willow-ahrens · 2024-04-23T16:11:05Z

Oops! I fixed Finch but didn't regenerate the reference files. I've now checked that there is a version; it looks like it's currently set to 0.1. Should we set it to 1.0?

Regarding tensor vs. matrix formats, I think that tensors should be v1.0, but tensor parsers should normalize to the matrix format tags whenever possible. If it helps, we could define a key that tells us if there is tensor stuff.

Whether or not there are tensor formats feels orthogonal to 1.0 vs 2.0. We may one day want to upgrade matrix version from 1.0 to 2.0, but that shouldn't have to mean the upgraded version needs to support tensors.

BenBrock · 2024-04-23T23:03:13Z

I can see the version there now, but it's stored as a string, not a number ("1.0" vs. 1.0), which creates some problems for my parser. Could you update again?

That makes sense about adding a key for the tensor extensions. I added a note to the agenda for our next meeting about that.

willow-ahrens · 2024-04-23T23:10:57Z

Are we sure we want to store versions as numbers? Do you mean a floating-point number?

DrTimothyAldenDavis · 2024-04-23T23:12:01Z

It would be best to have it as a string or as an array of 3 integers. Version 1.12 > 1.2 for example.

DrTimothyAldenDavis · 2024-04-23T23:12:42Z

And also a number can’t be 1.2.3.

BenBrock · 2024-04-24T18:00:14Z

Okay, so it sounds like we're going with a string version, where the string must satisfy the regex ^\d+\.\d+$. [illustrative examples]

i.e., there's a major and minor version number only.
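As a rough illustration of the string approach, the sketch below validates a "major.minor" string and compares versions as integer pairs. The helper name `parse_version` is hypothetical, and the pattern escapes the dot so that it matches only a literal ".".

```python
import re

# "major.minor" with ASCII digits only; the dot is escaped so it is literal.
VERSION_RE = re.compile(r"[0-9]+\.[0-9]+")


def parse_version(version):
    """Turn a "major.minor" string into a (major, minor) tuple of ints."""
    if not isinstance(version, str):
        raise TypeError("version must be a string, e.g. \"1.0\"")
    if VERSION_RE.fullmatch(version) is None:
        raise ValueError(f"not a major.minor version: {version!r}")
    major, minor = version.split(".")
    return int(major), int(minor)


# Integer pairs order the way version numbers are meant to...
print(parse_version("1.12") > parse_version("1.2"))  # prints True
# ...whereas floating-point comparison gets it backwards.
print(float("1.12") > float("1.2"))                  # prints False
```

A JSON number such as 1.0 is rejected with a TypeError, and a three-part string such as "1.2.3" is rejected with a ValueError.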

BenBrock · 2024-04-29T20:35:08Z

@willow-ahrens I can read the binsparse files successfully now, but I have a couple of issues with the Matrix Market files. Would it be possible to just use the raw files from SuiteSparse Matrix Collection? My parser doesn't work on the tensor object type you use in these files, which seems to swap the row and column index.

willow-ahrens · 2024-05-10T13:38:57Z

I'll try to fix my matrixmarket today

willow-ahrens · 2024-05-14T15:35:41Z

@BenBrock I investigated this, and updated the matrixmarket writer to use matrix when applicable. Regarding row and column swapping, these mtx files match the index ordering used in the original files, so I don't think that's the issue?

BenBrock · 2024-05-16T21:20:48Z

Thanks, I can read the Matrix Market files, although I do still have a little problem with the rows/columns, which appear to me to be swapped in the Binsparse files.

Here's b1_ss.mtx as an example:

%%MatrixMarket matrix coordinate real general
7 7 15
5 1 -0.03599942
6 1 -0.0176371
7 1 -0.007721779
1 2 1.0
2 2 -1.0
1 3 1.0
3 3 -1.0
1 4 1.0
4 4 -1.0
2 5 0.45
5 5 1.0
3 6 0.1
6 6 1.0
*4 7 0.45* <- Let's look at this stored value as an example.
7 7 1.0

There's a stored value at (4, 7) one-indexed. That's (3, 6) zero-indexed, so that's what we should see in Binsparse. If I h5dump the included b1_ss_COO.bsp.h5 to compare, I see the following indices_0 (row indices) and indices_1 (column indices):

   DATASET "indices_0" {
      DATATYPE  H5T_STD_I64LE
      DATASPACE  SIMPLE { ( 15 ) / ( 15 ) }
      DATA {
      (0): 1, 2, 3, 1, 4, 2, 5, 3, *6*, 0, 4, 0, 5, 0, 6
      }
   }
   DATASET "indices_1" {
      DATATYPE  H5T_STD_I64LE
      DATASPACE  SIMPLE { ( 15 ) / ( 15 ) }
      DATA {
      (0): 0, 0, 0, 1, 1, 2, 2, 3, *3*, 4, 4, 5, 5, 6, 6
      }
   }

I've highlighted our indices with *, which are stored at indices_0[8] and indices_1[8]. For COOR, indices_0 holds row indices and indices_1 holds column indices, which means the Binsparse file stored (6, 3), not (3, 6) as it should be.

Not sure exactly where things are going wrong---maybe you're defaulting to COOC for COO instead of COOR?
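The comparison above can also be scripted. The following sketch hard-codes the entries of b1_ss.mtx and the two index arrays from the h5dump output (it does not read the files), converts the Matrix Market coordinates to zero-based (row, column) pairs, and checks them against the stored pairs, assuming COOR means indices_0 holds rows and indices_1 holds columns.

```python
# One-indexed (row, column) entries copied from b1_ss.mtx above.
mtx_entries = [
    (5, 1), (6, 1), (7, 1), (1, 2), (2, 2), (1, 3), (3, 3), (1, 4),
    (4, 4), (2, 5), (5, 5), (3, 6), (6, 6), (4, 7), (7, 7),
]

# Index arrays copied from the h5dump of b1_ss_COO.bsp.h5 above.
indices_0 = [1, 2, 3, 1, 4, 2, 5, 3, 6, 0, 4, 0, 5, 0, 6]
indices_1 = [0, 0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6]

# Matrix Market is one-indexed; Binsparse indices are zero-indexed.
expected = {(row - 1, col - 1) for row, col in mtx_entries}
stored = set(zip(indices_0, indices_1))
transposed = {(col, row) for row, col in expected}

print((3, 6) in stored)      # prints False: the entry from "4 7 0.45" is missing
print(stored == expected)    # prints False
print(stored == transposed)  # prints True: every stored pair is swapped
```

The last check suggests the swap affects the whole file rather than a single entry.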

willow-ahrens · 2024-05-16T22:13:10Z

Ah, I see. I think I caught it: it was on my side. The writer for COO was mixing up the numbering of the levels.

add reference output

b2a1eb7

willow-ahrens added 2 commits November 27, 2023 13:16

fix zero sizes

54d2df9

updated

35555f0

updated Finch version and reference output

eb8f2ec

willow-ahrens mentioned this pull request May 5, 2024

matrixmarket reader may be swapping row and column finch-tensor/Finch.jl#523

Closed

willow-ahrens added 5 commits May 14, 2024 10:49

Merge branch 'main' into wma/reference_output

735a691

quick update, still broken

b718880

ignore manifest

6c45117

symmetry

33f453b

removed until symmetry becomes a thing

b7a08d6

fix examples and update compat

9bf0dea
