wiki-pagerank-hadoop

Getting started

  1. Download these files: frwiki-latest-page.sql.gz and frwiki-latest-pagelinks.sql.gz.
  2. Create the input_pages and input_pagelinks folders at the root of the project.
  3. Put frwiki-latest-page.sql.gz in input_pages and frwiki-latest-pagelinks.sql.gz in input_pagelinks.
  4. Download the dependencies and build the jar with mvn install.
  5. Run the jar in the target folder with 3 arguments: "input_pagelinks input_pages final_result". The final_result folder is created automatically and must not exist beforehand.
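
The setup steps above can be sketched as a small shell script. This is only an illustration: the function names prepare_inputs and check_output_absent are hypothetical helpers, not part of the project. It also assumes the two dump files were downloaded into the project root and that the jar is launched with the hadoop command. The jar file name is left as a placeholder because it depends on the build.

```shell
#!/bin/sh
# Hypothetical helpers illustrating the setup steps; not part of the project.

# Create the two input folders and move the dumps into them when present.
prepare_inputs() {
    mkdir -p input_pages input_pagelinks
    if [ -f frwiki-latest-page.sql.gz ]; then
        mv frwiki-latest-page.sql.gz input_pages/
    fi
    if [ -f frwiki-latest-pagelinks.sql.gz ]; then
        mv frwiki-latest-pagelinks.sql.gz input_pagelinks/
    fi
}

# Succeed only if the output folder does not exist yet,
# since the job creates it itself and must not find it at start.
check_output_absent() {
    if [ -e "$1" ]; then
        echo "error: $1 already exists; remove it before running" >&2
        return 1
    fi
}

# Typical sequence, run from the project root (assumed launcher and
# placeholder jar name; use the jar that mvn install writes to target/):
#   prepare_inputs
#   check_output_absent final_result
#   mvn install
#   hadoop jar target/<your-jar>.jar input_pagelinks input_pages final_result
```

Checking for final_result before launching avoids starting a long job that would fail only because the output folder is already there.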