Developers' Guide
Developers are responsible for the technical infrastructure of the website and solve complex problems about accessibility, performance, internationalization, SEO, and more.
- Platform development: Developers will work together to build the front and back ends of the Almanac website, reusing some components from previous editions and building new ones. This is an ongoing commitment with varying levels of work at any given time from June to November.
The amount of time you put in is up to you, but developers typically give about 10 hours over 6 months.
Leave a comment in the Call for Developers issue.
Browse issues with the Development label.
Casual contributors should fork the repo into their own GitHub account and submit pull requests. Core developers can work from the main repo in separate branches, which may be easier for collaboration. Only core maintainers (currently Rick and Barry) are able to alter the main or production branches; others must submit PRs to main for them to merge.
For 2020 we moved our main branch to main, so those involved in 2019 with their own fork may need to follow the steps in #880 to migrate their own fork.
Developers should assign Issues to themselves so other developers are aware they are being worked on, and release them if they are unable to continue on that Issue for whatever reason. Comment frequently, and reach out for help! We also label Issues to make them easier to find. The good first issue label is a great place to start for new developers!
Developers should also help review other developers' code to help the core maintainers, improve the quality of merged code, and familiarise themselves with changes to the code.
The Web Almanac website is built using vanilla CSS and JS, hosted on Google Cloud Platform, through a Python-based Flask application server, serving Jinja2 templates. Wow that's a lot of technical terms!
The current tech stack is essentially split into the following pieces:
- Scripts
- Templates
- Python
- Static
- Config
These are discussed in more detail below. This tech stack has served us pretty well, but we are always up for changing it if there are good reasons to and there is general consensus. There's probably a bit too much technology in there, some overlap between EJS and Jinja2, possibly too many layers of hierarchy for the Jinja2 templates, and we've even discussed in the past whether it should just be a static site! But that's what we have for now. Raise an issue if you want to discuss the tech stack further, but we suggest you familiarise yourself with it first to see its advantages and disadvantages.
The scripts in the src/tools/generate folder are Node/JavaScript files which are used to convert the chapters (written in Markdown in the content folder) into Jinja2/HTML templates. They use some EJS templates (very like Jinja2 in functionality but with slight syntax differences). This is run automatically on commit to main using a GitHub Action (which also automatically updates the CSS and JS cache-busting query params), but can also be run manually. Run npm install from the src directory to be able to run the npm run generate command.
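To make the conversion step a little more concrete, here is a minimal Python sketch of how a chapter's Markdown path corresponds to its generated template path. The real scripts are Node/JavaScript and do far more than this. The function name `generated_template_path` is hypothetical, and the path layout is taken from the 2019 CSS chapter example used in this guide.

```python
from pathlib import PurePosixPath


def generated_template_path(content_path: str) -> str:
    """Map content/<lang>/<year>/<chapter>.md to
    templates/<lang>/<year>/chapters/<chapter>.html (paths relative to src)."""
    parts = PurePosixPath(content_path).parts
    # Expect exactly: content / <lang> / <year> / <chapter>.md
    if len(parts) != 4 or parts[0] != "content" or not parts[3].endswith(".md"):
        raise ValueError(f"unexpected content path: {content_path}")
    _, lang, year, filename = parts
    chapter = PurePosixPath(filename).stem  # "css.md" -> "css"
    return str(PurePosixPath("templates", lang, year, "chapters", f"{chapter}.html"))


print(generated_template_path("content/en/2019/css.md"))
# prints: templates/en/2019/chapters/css.html
```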
There are also further commands (npm run ebooks) to generate the ebooks using princexml, but that only needs to be done on release, so core maintainers can mostly take care of that.
The Jinja2 templates in the templates folder allow us to avoid duplicating code for all the pages, and support multiple languages and years. The templates in the base folders are the majority of the HTML, and the language-specific templates mostly just contain translations of various phrases and paragraphs. The individual chapters' HTML should never need to be edited directly, as it is overwritten by the generate process described above.
The templates follow a bit of layering, which can be quite confusing at first. Take for example the CSS chapter in English for 2019. It is made of the following files in the src directory:
- The core content, written by the Authors, is in content/en/2019/css.md
- This is converted to templates/en/2019/chapters/css.html using the templates/base/2019/chapter.html EJS template before the site is deployed to production.
- The templates/en/2019/chapters/css.html template extends the templates/en/2019/base_chapter.html Jinja2 template, which adds English wording used in the base_chapter template (note we mostly treat English the same as other languages for consistency).
- The templates/en/2019/base_chapter.html template extends the templates/base/2019/base_chapter.html Jinja2 template, which is the core HTML for the chapters.
- The templates/base/2019/base_chapter.html template extends the templates/en/2019/base.html Jinja2 template, which includes any generic wording and phrases in English needed by the whole site.
- The templates/en/2019/base.html template extends the templates/base/2019/base.html Jinja2 template, which is the core HTML and design for the 2019 edition.
- The templates/base/2019/base.html template extends the templates/base.html Jinja2 template, which is the base HTML every page needs no matter what the year (including standard meta entries, Google Analytics...etc.). Jinja2 allows these to be overridden by layers above.
As you can see, it's quite convoluted. Usually it's easy to figure out which page to edit from a quick search of the code, but this could probably be simplified a fair bit (maybe just have one translation file instead of a page-specific file like templates/en/2019/base_chapter.html as well as the templates/en/2019/base.html file?). Anyway, that's what we have for now.
The non-chapter pages (e.g. the home, Methodology, table of contents...etc. pages) follow a similar route, but their content is not written in Markdown. They are written directly in HTML/Jinja2, as they are normally written by the development team, who have the skills to write directly in those, whereas we want chapter authors to concentrate on the content.
Python scripts in the main src folder are the webserver code, including the mapping of URL routes to templates and various functions made available to the Jinja2 templates. You need to install the dependencies as detailed in the src/README and then run the webserver with python main.py so you can browse the site locally at http://127.0.0.1:8080/ using the built-in Werkzeug development server.
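As a rough illustration of what "mapping URL routes to templates" means, here is a framework-free Python sketch. In the real site Flask handles the routing. The URL shape `/<lang>/<year>/<chapter>`, the regular expression and the function name `template_for_url` are all assumptions made for this example, and non-chapter pages are not handled.

```python
import re
from typing import Optional

# Hypothetical chapter URL shape: /<lang>/<year>/<chapter>, e.g. /en/2019/css
CHAPTER_URL = re.compile(r"/(?P<lang>[a-z]{2})/(?P<year>\d{4})/(?P<chapter>[a-z0-9-]+)")


def template_for_url(path: str) -> Optional[str]:
    """Return the template name (relative to the templates folder) for a
    chapter URL, or None if the path does not look like a chapter URL."""
    match = CHAPTER_URL.fullmatch(path)
    if match is None:
        return None
    return "{lang}/{year}/chapters/{chapter}.html".format(**match.groupdict())


print(template_for_url("/en/2019/css"))    # prints: en/2019/chapters/css.html
print(template_for_url("/not-a-chapter"))  # prints: None
```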
The static folder contains CSS, JS, Images, Fonts and other "static" files that can be served directly by Google Cloud Platform without going through the Python application server. Developers will mostly be editing the CSS and JS files. Some CSS and JS is inlined into the templates, but code that is shared across many pages (e.g. the core CSS, or the chapter JS) is separated out into static files to allow caching and reuse.
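Because static files are cached, their URLs need to change when their content changes, which is what the cache-busting query params mentioned earlier are for. The following Python sketch shows one common way to do this, using a short content hash. The `v` parameter name, the choice of SHA-256, and the example path are assumptions for illustration, not necessarily what the project's GitHub Action does.

```python
import hashlib


def cache_busted_url(url: str, content: bytes, length: int = 8) -> str:
    """Append a short hash of the file content as a query parameter,
    so the URL changes whenever the file changes."""
    digest = hashlib.sha256(content).hexdigest()[:length]
    separator = "&" if "?" in url else "?"  # respect an existing query string
    return f"{url}{separator}v={digest}"


# Hypothetical stylesheet path and content, for demonstration only.
before = cache_busted_url("/static/css/example.css", b"body { color: black; }")
after = cache_busted_url("/static/css/example.css", b"body { color: navy; }")
print(before)
print(after)
print(before != after)  # prints: True
```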
The config folder contains a JSON config per year. This avoids hardcoding common config, and using JSON allows the config to be shared between the Node/JS generate scripts and the Python/Jinja2 site.
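As a sketch of how the Python side might read such a file, the snippet below loads a per-year JSON config. The file name pattern (`<year>.json`), the `languages` key, and the function name `load_year_config` are hypothetical and used only for illustration. Check the real files in the config folder for their actual names and contents.

```python
import json
import tempfile
from pathlib import Path


def load_year_config(config_dir: Path, year: str) -> dict:
    """Load the JSON config for one edition from <config_dir>/<year>.json
    (the file name pattern is an assumption)."""
    with open(config_dir / f"{year}.json", encoding="utf-8") as f:
        return json.load(f)


# Demonstration with a throwaway config file; the "languages" key is made up.
with tempfile.TemporaryDirectory() as tmp:
    config_dir = Path(tmp)
    (config_dir / "2019.json").write_text(
        json.dumps({"languages": ["en", "ja"]}), encoding="utf-8"
    )
    config = load_year_config(config_dir, "2019")
    print(config["languages"])  # prints: ['en', 'ja']
```

Because the file is plain JSON, a Node script can read the same file with `JSON.parse`, which is the point of keeping the config in a language-neutral format.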
Note the top-level sql folder is a collection of the HTTP Archive SQL queries used to get the stats. It is not used on the site (though chapters link to it for readers who want to explore). So developers can mostly ignore this folder and leave it for the [Analysts](https://github.com/HTTPArchive/almanac.httparchive.org/wiki/Analysts'-Guide) to manage. They will also create the figure images for each chapter.
The site is released periodically by the core maintainers (currently Rick and Barry). We won't release on every merge (especially if there are a few PRs in the pipeline that look nearly done), but at the same time we are not afraid to release if a good bit of functionality or an important bug fix has just been merged.
We aim for this site to be as available and inclusive as possible and follow all best practices for creating a website – especially as we are calling out the usage of such best practices on the web!
As such, the site needs to be built with high-quality code (including code reuse rather than duplication, readability and maintainability) and be performant (many of us involved in the Almanac are web performance evangelists).
Inclusion and accessibility are very important to us and we publish a comprehensive [Accessibility Statement](https://almanac.httparchive.org/en/accessibility-statement) to abide by and keep up to date. On a similar note, internationalisation is very important. At the time of writing the 2019 Web Almanac is available in English and Japanese, and partially available in French and Spanish. The tech stack is built with multi-lingual and multi-year support, but translations may require additional development to support language-specific features or other localisation needs.
SEO is an important consideration for any site and we have spent a lot of time optimising the site for it, so developers must consider the SEO impact of their changes too.
We have a good basis for the site thanks to the hard work in 2019, and we've cleaned up most of the outstanding development issues since that was launched. So we're in a good place for 2020. However, there are a few things I'd love to tackle this year as well as launching the site, including:
- Adding more automatic testing (either via scripts or GitHub Actions?) - we have some test cases for the Python code, but very little else.
- Adding automatic linting for HTML/CSS/JS (either via scripts or GitHub actions?). This might be more difficult for HTML and inline code, given we use Jinja2 so not pure HTML but would be good to have.
- Improved production monitoring to know if we have broken anything.
- Consider what to do about the [interactive visuals](https://github.com/HTTPArchive/almanac.httparchive.org/issues/896).
- Simplify the templates (see above).
- Maybe consider moving off of Python/Flask/Jinja2 to a JavaScript-based stack to simplify our technologies?
- Maybe consider again whether to move to a static site?
Would love to hear your thoughts on these, or any other goals you think we should have!
Let me know if you have any questions, Barry