wubdut/ScaleOfWebsites

ScaleOfWebsites

Crawl websites to measure their scale.

This is a Python crawler that counts the number of pages on a single website. I want to study crawlers that run on distributed systems such as Hadoop, so this project is a starting point.
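
To make the idea concrete, here is a minimal sketch of how such a page counter could work. It is not the code from this repository: the names `LinkParser`, `fetch_html`, and `count_pages`, the `max_pages` cap, and the choice to count distinct same-host URLs found by a breadth-first crawl are all assumptions made for illustration. It uses only the Python standard library.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag in a page (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch_html(url):
    """Download a page and return its HTML, or "" if it fails or is not HTML."""
    try:
        with urlopen(url, timeout=10) as response:
            if "text/html" not in response.headers.get("Content-Type", ""):
                return ""
            return response.read().decode("utf-8", errors="replace")
    except (OSError, ValueError):
        return ""


def count_pages(start_url, fetch=fetch_html, max_pages=1000):
    """Breadth-first crawl; returns the number of distinct same-host URLs found.

    Assumption: a "page" is any URL on the same host as start_url, with the
    #fragment removed. The crawl stops early once max_pages URLs are known.
    """
    host = urlparse(start_url).netloc
    start = urldefrag(start_url).url
    seen = {start}
    queue = deque([start])
    while queue:
        url = queue.popleft()
        parser = LinkParser()
        parser.feed(fetch(url))
        for href in parser.links:
            # Resolve relative links and drop the fragment part.
            link = urldefrag(urljoin(url, href)).url
            if urlparse(link).netloc != host or link in seen:
                continue
            if len(seen) >= max_pages:
                return len(seen)
            seen.add(link)
            queue.append(link)
    return len(seen)
```

Calling `count_pages("https://example.com/")` would crawl from that start URL. The `fetch` parameter can be replaced with any function that maps a URL to HTML, which makes the crawl easy to try offline. A real crawler would also need to respect `robots.txt` and rate limits, which this sketch leaves out.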

I got this idea from an interview for an internship at Google. Although I did not pass, the interviewer taught me how to write a crawler to get the number of pages of a website in an hour.

I was very sad about the loss, and that motivated me a lot to write this crawler.
