ezunpaywall is an Unpaywall mirror hosted in France by Inist-CNRS. It has mirrored Unpaywall data since 2020 and is updated daily. Unpaywall is a metadata repository of free and open access electronic resources. This app is available at https://unpaywall.inist.fr/.
ezunpaywall operates as a service. It is updated daily by its own update service. Data is stored in an Elasticsearch index. To access this data, ezunpaywall offers two types of access:
- A GraphQL API for querying Unpaywall data by one or more DOIs
- A file enrichment service that enriches a CSV file containing a doi column, or a JSONL file containing a doi key
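As an illustration of the input the enrichment service works on, the following sketch creates a minimal CSV file with a doi column and a minimal JSONL file with a doi key. The file names, the title field and the DOIs are placeholders chosen for this example (assumptions, not real records); only the presence of the doi column or key comes from the description above.

```shell
#!/bin/sh
# Sketch: build two tiny input files for the enrichment service.
# File names, the "title" field and the DOIs are placeholders (assumptions).

# CSV input: the header row has a "doi" column.
cat > sample.csv <<'EOF'
doi,title
10.1234/example.1,First placeholder record
10.1234/example.2,Second placeholder record
EOF

# JSONL input: one JSON object per line, each with a "doi" key.
cat > sample.jsonl <<'EOF'
{"doi": "10.1234/example.1", "title": "First placeholder record"}
{"doi": "10.1234/example.2", "title": "Second placeholder record"}
EOF

# Quick sanity checks before sending the files to the service.
if head -n 1 sample.csv | tr ',' '\n' | grep -qx 'doi'; then
  echo "sample.csv: doi column found"
fi
echo "sample.jsonl: $(grep -c '"doi"' sample.jsonl) lines with a doi key"
```

Checking the header and keys locally is a cheap way to catch a misnamed column before uploading a large file.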
These services are accessed with API keys, which are managed by the API key service. The keys are stored in a Redis database that the graphql and enrich services can read. A web interface is also available as a demonstrator. It provides:
- Data metrics
- Examples of how to use the GraphQL API and the enrichment service
- OpenAPI documentation
- A contact form
- A server administration section
- A history of data update reports
A health check service makes sure that all the services work and communicate properly with each other.
At the front, an nginx instance acts as a reverse proxy, exposing all these services through a single entry point.
ezunpaywall is made up of several services, distributed across several Docker containers.
git clone https://github.com/ezpaarse-project/ezunpaywall
The tools you need to run ezunpaywall are:
- docker
- npm
Commands:
# install dependencies
npm i
# create volume for elastic
docker-compose -f docker-compose.debug.yml run --rm elastic chown -R elasticsearch /usr/share/elasticsearch/
# Start ezunpaywall as daemon
docker-compose -f docker-compose.debug.yml up -d
# Stop ezunpaywall
docker-compose -f docker-compose.debug.yml stop
# Get the status of ezunpaywall services
docker-compose -f docker-compose.debug.yml ps
To run the tests, ezunpaywall must be running in dev mode with fakeUnpaywall. You can then run the tests:
# aliases are available in the root folder
npm run test
npm run test:admin
npm run test:enrich
npm run test:graphql
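# you can also run the tests of a single service from its own directory
# (paths are relative to the directory that contains the clone)
(cd ezunpaywall/src/admin && npm run test)
(cd ezunpaywall/src/enrich && npm run test)
(cd ezunpaywall/src/graphql && npm run test)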
- docker
- docker compose
- Enough disk space: Unpaywall data in a single-node Elasticsearch index with 3 shards takes about 130 GB. Plan storage for the index, plus the Unpaywall files if you want to keep them.
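To check the storage requirement before installing, a minimal sketch along these lines compares the free space of a directory with the figure of about 130 GB given above. DATA_DIR is an assumption for this example (it defaults to the current directory); point it at wherever the Elasticsearch data will actually live.

```shell
#!/bin/sh
# Sketch: check that a directory has enough free space for the index.
# REQUIRED_GB reflects the ~130 GB figure above; DATA_DIR is an assumption.
REQUIRED_GB=130
DATA_DIR="${DATA_DIR:-.}"

# Print the free space of the filesystem holding $1, in whole gigabytes.
free_gb() {
  # df -Pk: POSIX output format, sizes in 1024-byte blocks; column 4 = available.
  df -Pk "$1" | awk 'NR == 2 { printf "%d\n", $4 / 1024 / 1024 }'
}

# Succeed when the available space ($1, in GB) covers the requirement ($2, in GB).
has_enough_space() {
  [ "$1" -ge "$2" ]
}

available=$(free_gb "$DATA_DIR")
if has_enough_space "$available" "$REQUIRED_GB"; then
  echo "OK: ${available} GB free in ${DATA_DIR}"
else
  echo "Warning: only ${available} GB free in ${DATA_DIR}, about ${REQUIRED_GB} GB needed"
fi
```

The check only covers the index itself; add your own margin if you plan to keep the downloaded Unpaywall files.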
Create an environment file named ezunpaywall.local.env.sh and export the environment variables you want to set in it. You can then source ezunpaywall.env.sh, which contains a set of predefined variables and is overridden by ezunpaywall.local.env.sh.
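The override pattern can be sketched as follows. This is an illustration only: it runs in a scratch directory with stand-in files (demo.env.sh and demo.local.env.sh), the variables EXAMPLE_HOST and EXAMPLE_PORT are hypothetical, and it assumes that the main file loads the local file itself after setting its defaults. Check the real ezunpaywall.env.sh for the actual variable names and loading logic.

```shell
#!/bin/sh
# Sketch of the override pattern, run in a scratch directory so that no real
# file is touched. EXAMPLE_HOST and EXAMPLE_PORT are hypothetical variables.
workdir=$(mktemp -d)

# Stand-in for ezunpaywall.env.sh: predefined defaults, then local overrides.
cat > "$workdir/demo.env.sh" <<EOF
export EXAMPLE_HOST=localhost
export EXAMPLE_PORT=8080
# Load the local file last, if present, so that its values win.
if [ -f "$workdir/demo.local.env.sh" ]; then
  . "$workdir/demo.local.env.sh"
fi
EOF

# Stand-in for ezunpaywall.local.env.sh: only the values you want to change.
cat > "$workdir/demo.local.env.sh" <<'EOF'
export EXAMPLE_PORT=9090
EOF

# Sourcing the main file yields the defaults plus the local overrides.
. "$workdir/demo.env.sh"
echo "host=$EXAMPLE_HOST port=$EXAMPLE_PORT"

rm -r "$workdir"
```

Keeping site-specific values in the local file means the predefined file can be updated without losing your settings.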
Elasticsearch has some system requirements that you should check.
To avoid memory exceptions, you may have to increase the mmap count limit (vm.max_map_count). Edit /etc/sysctl.conf and add the following line:
# configuration needed for elastic search
vm.max_map_count=262144
Then apply the changes:
sysctl -p
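To confirm that the setting is active, a small sketch like the one below reads the live kernel value and compares it with the 262144 set above. It assumes a Linux host where /proc/sys/vm/max_map_count is readable; the helper name map_count_ok is made up for this example.

```shell
#!/bin/sh
# Sketch: compare the current vm.max_map_count with the value set above.
REQUIRED=262144

# Succeed when the current value ($1) is at least the required one ($2).
map_count_ok() {
  [ "$1" -ge "$2" ]
}

# /proc/sys/vm/max_map_count exposes the live kernel setting on Linux.
if [ -r /proc/sys/vm/max_map_count ]; then
  current=$(cat /proc/sys/vm/max_map_count)
  if map_count_ok "$current" "$REQUIRED"; then
    echo "vm.max_map_count=$current is sufficient"
  else
    echo "vm.max_map_count=$current is below $REQUIRED"
  fi
else
  echo "Cannot read vm.max_map_count on this system"
fi
```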
Before you start ezunpaywall, make sure all necessary environment variables are set.
# Start ezunpaywall as daemon
docker-compose up -d
# Stop ezunpaywall
docker-compose stop
# Get the status of ezunpaywall services
docker-compose ps
You can update your data with the update snapshots that Unpaywall provides on a weekly or daily basis (if you have an API key). The admin service includes a cron job that automatically updates the data from Unpaywall, weekly or daily.