Unconventional MatrixMarket format #25

fenekku · 2014-05-20T18:13:32Z

Several output files produced by the collaborative filtering algorithms in toolkits/collaborative_filtering present themselves as MatrixMarket files, either through a .mm extension or a %%MatrixMarket matrix array real general header, but do not seem to follow the MatrixMarket format as defined by NIST.

For example, running ./toolkits/collaborative_filtering/rating --training=smallnetflix_mm --num_ratings=5 --quiet=1 --algorithm=als produces two files:

smallnetflix_mm.ids
smallnetflix_mm.ratings

Their header is (only one is shown here):

$ head -n 10 smallnetflix_mm.ids
%%MatrixMarket matrix array real general 
%This file contains item ids matching the ratings. In each row i, num_ratings top item ids for user i. (First column: user id, next columns, top K ratings). Note: 0 item id means there are no more items to recommend for this user.
95526 6 
1 1243 424 2641 2109 1557
2 2641 1548 1227 548 76 
3 1243 2548 1227 2641 76 
4 1449 2641 2109 3172 1227 
5 1449 1227 2298 735 1382 
6 2109 2669 1227 3112 2583
7 3516 2016 2647 1548 1243

The 'array' keyword in the header tells a parser to expect a dense matrix written one value per line in column-major order, yet these files put a whole row on each line. Other files with the same problem include those ending in _U.mm or _V.mm.
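For comparison, here is a minimal sketch of what a parser following the NIST description would expect for a dense 'array' file: a size line, then one value per line, walking down each column in turn. The function name to_mm_array_lines is made up for this illustration and is not part of the toolkit.

```python
def to_mm_array_lines(rows):
    """Render a dense matrix (a list of equal-length rows) as the text
    lines of a MatrixMarket 'array' file.

    Hypothetical helper for illustration only; not part of the toolkit.
    """
    n_rows = len(rows)
    n_cols = len(rows[0]) if rows else 0
    lines = [
        "%%MatrixMarket matrix array real general",
        f"{n_rows} {n_cols}",  # size line: number of rows, number of columns
    ]
    # Column-major order: all entries of column 0, then column 1, and so on,
    # with exactly one value per line.
    for j in range(n_cols):
        for i in range(n_rows):
            lines.append(str(rows[i][j]))
    return lines


if __name__ == "__main__":
    for line in to_mm_array_lines([[1, 2], [3, 4]]):
        print(line)
```

For the 2x2 matrix in the example, the values after the size line come out in the order 1, 3, 2, 4, which is quite different from the row-per-line layout in the files above.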

This problem is especially apparent when reading these files with mmread from scipy.io (a third-party Python library for reading MatrixMarket files): the format is treated as invalid and the file can't be read. (The --R_output_format option does not change any of this for me.)
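As a possible workaround, the files can be parsed by hand instead of with mmread. The sketch below assumes the layout seen in the head output above: comment lines starting with %, then a "rows cols" size line, then one whitespace-separated row per line. The name read_row_oriented is hypothetical, and I have only checked this layout against the sample shown here, not against every output file.

```python
def read_row_oriented(lines):
    """Parse the row-per-line layout from an iterable of text lines.

    Hypothetical helper; assumes '%' comment lines, a 'rows cols' size
    line, then one whitespace-separated matrix row per line.
    Returns ((declared_rows, declared_cols), list_of_rows).
    """
    # Drop blank lines and '%' comment/header lines, split the rest into tokens.
    data = [
        ln.split()
        for ln in lines
        if ln.strip() and not ln.lstrip().startswith("%")
    ]
    if not data:
        raise ValueError("no size line found")

    # First remaining line is the size line; this raises ValueError
    # if it does not contain exactly two integers.
    n_rows, n_cols = (int(tok) for tok in data[0])

    rows = [[float(tok) for tok in toks] for toks in data[1:]]
    for k, row in enumerate(rows):
        if len(row) != n_cols:
            raise ValueError(f"data row {k} has {len(row)} values, expected {n_cols}")
    return (n_rows, n_cols), rows


if __name__ == "__main__":
    sample = [
        "%%MatrixMarket matrix array real general ",
        "%This file contains item ids matching the ratings.",
        "95526 6 ",
        "1 1243 424 2641 2109 1557",
        "2 2641 1548 1227 548 76 ",
    ]
    shape, rows = read_row_oriented(sample)
    print(shape, rows[0])
```

To read a real file, pass an open file object, for example read_row_oriented(open("smallnetflix_mm.ids")). The number of rows is not checked against the size line, so a truncated file (such as head output) still parses.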

I might be missing something here though. Thanks for the tool :).

zachmayer · 2014-06-20T19:51:47Z

There's a similar issue with inputs, particularly for the gensgd program.

meteotester · 2014-11-25T15:25:57Z

More about this unconventional MatrixMarket format:
#9