Seo Server is a command line tool that runs a server allowing GoogleBot (and any other crawler) to crawl your heavily JavaScript-based websites. The tool works with very few changes to your server- or client-side code.
- Install CoffeeScript (if not already installed)
npm install -g coffee-script
- Edit configuration file
src/config.coffee.sample
and save it as src/config.coffee
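The sample file defines the available settings; the field names below are hypothetical, just a rough sketch of what such a config might contain, assuming it covers the listening port and the memcached connection:

```coffeescript
# Hypothetical sketch only -- the authoritative fields are in src/config.coffee.sample
module.exports =
  port: 10300             # port the seoserver listens on (assumed field name)
  memcached:
    host: 'localhost'     # memcached host (assumed field name)
    port: 11211           # memcached default port
    lifetime: 86400       # cache TTL in seconds (assumed field name)
```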
- Compile the config into the project directory
coffee --output lib/ -c src/config.coffee
- Install npm dependencies
npm install
- Install PhantomJS
npm install -g phantomjs
- Start the main process on port 10300 with the default memcached configuration:
bin/seoserver start -p 10300
The crawler has three parts:
lib/phantom-server.js A small PhantomJS script that fetches a page and returns the rendered response along with the response headers in serialized form. It can be executed via:
phantomjs lib/phantom-server.js http://moviepilot.com/stories
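The actual script ships with the project; as a minimal sketch of the idea (the render delay and the header capture below are assumptions, not the project's code):

```javascript
// Minimal sketch of a PhantomJS fetcher, not the project's actual phantom-server.js
var page = require('webpage').create();
var system = require('system');
var url = system.args[1];
var headers = {};

// Capture the response headers of the main document
page.onResourceReceived = function (resource) {
  if (resource.url === url) {
    resource.headers.forEach(function (h) { headers[h.name] = h.value; });
  }
};

page.open(url, function (status) {
  if (status !== 'success') {
    console.log('Failed to load ' + url);
    phantom.exit(1);
    return;
  }
  // Give client-side JavaScript a moment to render before snapshotting (assumed delay)
  setTimeout(function () {
    console.log(JSON.stringify(headers)); // serialized response headers
    console.log(page.content);            // fully rendered HTML
    phantom.exit();
  }, 1000);
});
```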
lib/seoserver.js A node express app responsible for accepting requests from Googlebot, checking whether a cached version exists in memcached, and otherwise fetching the page via phantom-server.js.
You can start it locally with:
node lib/seoserver.js start
And test its output with:
curl -v http://localhost:10300
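The essential cache-or-render flow, as a rough sketch (the target host, TTL, and error handling here are assumptions, not the project's actual code):

```javascript
// Rough sketch of the cache-or-render flow; hostnames and TTL are assumptions
var express = require('express');
var Memcached = require('memcached');
var execFile = require('child_process').execFile;

var app = express();
var memcached = new Memcached('localhost:11211'); // assumed memcached address

app.get('*', function (req, res) {
  var url = 'http://moviepilot.com' + req.url;    // assumed target host
  memcached.get(url, function (err, cached) {
    if (!err && cached) return res.send(cached);  // cache hit
    // Cache miss: render the page with PhantomJS
    execFile('phantomjs', ['lib/phantom-server.js', url],
      { maxBuffer: 10 * 1024 * 1024 },            // rendered pages can be large
      function (err, stdout) {
        if (err) return res.status(503).send('Rendering failed');
        memcached.set(url, stdout, 86400, function () {}); // assumed 24h TTL
        res.send(stdout);
      });
  });
});

app.listen(10300);
```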
bin/seoserver A forever-monitor script for launching and monitoring the main node process.
bin/seoserver start -p 10300
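This is a thin wrapper around the forever-monitor package; a launcher might look roughly like this (the restart limit and option values are assumptions):

```javascript
// Sketch of a forever-monitor launcher; the restart limit is an assumption
var forever = require('forever-monitor');

var child = new forever.Monitor('lib/seoserver.js', {
  max: 10,        // give up after 10 restarts (assumed)
  silent: false,  // pass the child's stdout/stderr through
  args: []
});

child.on('exit', function () {
  console.log('seoserver exited after the maximum number of restarts');
});

child.start();
```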
Your webserver has to detect incoming search engine requests in order to route them to the seoserver. One way of doing so is to look for the string "bot" in the User-Agent header, or to check for Google's _escaped_fragment_ parameter. In Nginx you can check the variable $http_user_agent and set the backend like this:
```nginx
location / {
  proxy_pass http://defaultbackend;
  if ($http_user_agent ~* bot) {
    proxy_pass http://seoserver;
  }
}

location ~* escaped_fragment {
  proxy_pass http://seoserver;
}
```
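You can verify the routing by sending a request with a crawler-like User-Agent (replace yoursite.example with your own host):
curl -H "User-Agent: Googlebot" http://yoursite.example/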
If you deliver a cached version of your website with a reverse proxy in front, you can do a similar check there. A VCL example for Varnish, tagging crawler requests so they can be cached separately from regular ones:
```vcl
if (req.http.User-Agent ~ "bot" || req.url ~ "escaped_fragment") {
  set req.http.UA-Type = "crawler";
} else {
  set req.http.UA-Type = "regular";
}
```
This code is based on a tutorial by Thomas Davis and on https://github.com/apiengine/seoserver.