Graphee 📈

Graphee is a light-weight code written in C++ to compute graph properties, such as: PageRank, TrustRank, harmonic centrality, etc...

What's the purpose of Graphee?

In order to improve search engine results, one has compute some values from the graph of the web. Actually, some famous works are known like the PageRank or the TrustRank algorithms. They were booth indispensable to make their search engines pertinent and spam proof.

However, with time the graph of the web has become wider and wider. It clearly contains more than a dozen billion vertices and at least ten times more edges. Some, has chosen to persist in full-RAM computations and using distributed framework as Hadoop or Spark. While, some other approaches as GraphChi or m-flash are using serialization methods, but both limited by uint32_t for the vertex ids.

Graphee's innovations

Thus, we propose to develop software able to handle uint64 and still performing calculations on a single laptop (sometimes with an external hard-drive for wide graphs).

The graph is converted into an adjacency matrix and then saved with the fully-tested (since 70's) Compressed Sparse Row (CSR) matrix format. The serialization is done by dividing vectors into slices, and matrices into blocks, with sizes not exceeding the RAM-limit of the computer.

However, nowadays the graph of the web is so wide that one cannot restart calculations from scratch. The preferred solution would be using delta files, saving only differences between two graph snapshots. But the CSR format is unfriendly with insertions and deletions. So we propose to implement the Dynamic CSR matrix format, which presents high speedups for the matrix modifications.

Also a binding of Graphee with Python is under development !

Efficiency

Technically we use pthread and OpenMP technologies for the parallel calculations on CPUs. The full support or CUDA standards is one of the major properties to be included soon !

Examples & functionalities

NOTE: For now on, the code contains all the needed tools for the PageRank. But I need to write a small script to DL few test data. I'm on my way...

Contributing

Please first read CONRTIBUTING.md and propose what you want or you can fix or add functionalities detailed within TODO.md.

For any questions, comments, or collaborations, please use: n.martin [at] qwantresearch [dot] com.

Name		Name	Last commit message	Last commit date
Latest commit History 107 Commits
examples		examples
src		src
.doxyconf		.doxyconf
.gitignore		.gitignore
.gitmodules		.gitmodules
CHANGELOG.md		CHANGELOG.md
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
Makefile		Makefile
README.md		README.md
TODO.md		TODO.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Graphee 📈

What's the purpose of Graphee?

Graphee's innovations

Efficiency

Examples & functionalities

Contributing

References

About

Releases

Packages

Languages

License

nitot/graphee

Folders and files

Latest commit

History

Repository files navigation

Graphee 📈

What's the purpose of Graphee?

Graphee's innovations

Efficiency

Examples & functionalities

Contributing

References

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages