RapidPro Normalizer is a command-line utility that flattens records from RapidPro API responses so that they can be exported as files or database records.
- Interactive command line interface
- Easy YAML configuration
- Export datasets to a file and to a database
- Works on Linux and Windows (and may work on macOS as well)
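To make "flattening" concrete, here is a minimal Python sketch of the idea. It is an illustration only, not the utility's actual code: the `flatten` helper, the dotted column names and the sample record are all assumptions.

```python
def flatten(record, parent_key="", sep="."):
    """Turn nested dicts into one flat dict whose keys are dotted paths."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into nested objects and prefix their keys
            flat.update(flatten(value, full_key, sep))
        else:
            # Scalars (and lists) are kept as they are
            flat[full_key] = value
    return flat


# Hypothetical record, shaped like one entry of an API response
record = {"uuid": "r-1", "contact": {"uuid": "c-1", "name": "Awa"}, "responded": True}
print(flatten(record))
```

Each flat key can then become one column of a file or one column of a database table.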
The easiest way to install this utility is to clone it from GitHub:
$ git clone https://github.com/supermalang/rapidpro-normalizer.git
Navigate to the directory and install the Python requirements:
$ cd rapidpro-normalizer
$ pip3 install -r requirements.txt
Create the .env file from the sample.env file:
$ cp sample.env .env
Now open the .env file and set the correct values for RAPIDPRO_TOKEN, DB_HOST, DB_NAME, DB_USER and DB_PASSWORD.
- 🆘 If you do not have a RapidPro token please contact your Technical Focal Point.
- 🆗 If you do not export to a database, you can ignore the database credentials.
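If you want to check your .env file before running the utility, a small script along these lines can help. This is a sketch under assumptions: `read_env` and `missing_keys` are hypothetical helpers, not part of RapidPro Normalizer, and the parser only understands simple KEY=VALUE lines.

```python
import os
import tempfile

ALWAYS_REQUIRED = ["RAPIDPRO_TOKEN"]
DB_KEYS = ["DB_HOST", "DB_NAME", "DB_USER", "DB_PASSWORD"]


def read_env(path):
    """Read simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values, export_to_db):
    """List the keys that are absent or empty for the chosen export mode."""
    needed = ALWAYS_REQUIRED + (DB_KEYS if export_to_db else [])
    return [key for key in needed if not values.get(key)]


# Demo on a throwaway file so the sketch runs anywhere
with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, ".env")
    with open(sample_path, "w", encoding="utf-8") as handle:
        handle.write("RAPIDPRO_TOKEN=example-token\nDB_HOST=localhost\nDB_NAME=\n")
    sample = read_env(sample_path)

print(missing_keys(sample, export_to_db=False))
print(missing_keys(sample, export_to_db=True))
```

Point `read_env` at your own .env file to run the same check on real values.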
Create the config.yml file and update its content.
- Create the config file:
$ cp sample.config.yml config.yml
- Define the file export settings. The path must be inside the directory of the utility.
- Enable or disable the export to database.
- Give the field group to use to fetch the columns that need to be exported. Here covid_edu_poll is the field group, but you can choose another name. Make sure it does not contain spaces, numbers or special characters. The field group will be referred to on the command line as fieldgroup.
- Give your fields. The RapidPro field hierarchy from the API response must be preserved.
First, you can use an API client such as Postman to send a request and inspect the response to see what the field hierarchy looks like. Only consider fields that are in the results property.
You do not need to list all fields. Give only the fields you want exported in the dataset.
You can add many field groups to your config file, but only one fieldgroup can be used at a time on the command line.
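The sketch below illustrates why the field hierarchy matters: each exported column is found by walking the same path as in the API response. It is not the utility's implementation. The `pick` helper, the dotted-path notation and the sample response are assumptions made for the example.

```python
def pick(record, dotted_path):
    """Follow a dotted path through nested dicts; return None if it is absent."""
    current = record
    for part in dotted_path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


# Hypothetical response: only the entries under "results" are considered
response = {
    "next": None,
    "previous": None,
    "results": [
        {"uuid": "c-1", "name": "Awa", "fields": {"gender": "F", "age": 31}},
        {"uuid": "c-2", "name": "Moussa", "fields": {"gender": "M"}},
    ],
}

# Only the fields we want in the dataset, written with their hierarchy
wanted = ["uuid", "fields.gender", "fields.age"]

rows = [{path: pick(item, path) for path in wanted} for item in response["results"]]
for row in rows:
    print(row)
```

Fields that are not listed (here `name`) are simply left out, and a path that does not match the response hierarchy yields an empty value.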
This part is optional
You can customize the values of the request types getcontacts, getruns and getmessages directly in the config.yml file to add request parameters as necessary.
Example: If you are only interested in runs that belong to a given flow, you can customize the request type as follows:
# Types of api requests you can use
# You can customize the requests by adding parameters that comply with the RapidPro API
rapidpro_api_settings:
  - request_types:
      - getruns: "https://api.rapidpro.io/api/v2/runs.json?flow=f5901b62-ba76-4003-9c62-72fdacc1b7b7"
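If you prefer to build such a URL programmatically instead of typing the query string by hand, the following sketch uses only the Python standard library. The `with_params` helper is hypothetical, and any parameter you add should be checked against the RapidPro API documentation.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_params(url, **params):
    """Return the URL with the given query parameters added or replaced."""
    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query))
    query.update(params)
    return urlunsplit(parts._replace(query=urlencode(query)))


base = "https://api.rapidpro.io/api/v2/runs.json"
url = with_params(base, flow="f5901b62-ba76-4003-9c62-72fdacc1b7b7")
print(url)
```

The printed value can be pasted as the getruns entry in config.yml.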
You can ignore this part if you do not export to a database.
Make sure your database user (databaseuser) has at least the ALTER privilege on the database.
Update the database to use utf8mb4 as the default character set.
ALTER SCHEMA `databasename` DEFAULT CHARACTER SET utf8mb4 ;
Replace databasename with the name of your database.
The syntax to use the RapidPro Normalizer is:
$ python3 src/data/make.py [OPTIONS]
⚠️ Depending on your environment, you might need to use python (with version 3) instead of python3.
You can use the following options:
- requesttype: type of the RapidPro request. The requesttype needs to be defined in the config file.
- fieldgroup: group of fields to export. The fieldgroup needs to be defined in the config file.
- datasetname: name of the dataset to export.
Inline execution
$ python3 src/data/make.py --requesttype getcontacts --fieldgroup contact_fields --datasetname mycontacts
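For readers curious how three options like these can be handled in Python, here is a minimal sketch with the standard argparse module. It is an assumption for illustration only: the actual src/data/make.py may parse its options differently, for example by prompting interactively for missing values.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the three documented options."""
    parser = argparse.ArgumentParser(description="Export a flattened RapidPro dataset")
    parser.add_argument("--requesttype", required=True,
                        help="request type defined in the config file, e.g. getcontacts")
    parser.add_argument("--fieldgroup", required=True,
                        help="field group defined in the config file")
    parser.add_argument("--datasetname", required=True,
                        help="name of the dataset to export")
    return parser


# Parse the same arguments as the inline execution example
args = build_parser().parse_args([
    "--requesttype", "getcontacts",
    "--fieldgroup", "contact_fields",
    "--datasetname", "mycontacts",
])
print(args.requesttype, args.fieldgroup, args.datasetname)
```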
This part is optional
You can schedule the automatic execution of the utility by creating a cron task on a Linux machine or by using the Task Scheduler on Windows. If you are using Linux, follow these steps:
- Display and copy the command to be executed by the cron task
⚠️ Make sure you are still in the rapidpro-normalizer directory.
Run the following to display the command to give to the cron task, then copy it. You will need to update the parameters accordingly.
$ echo "python3 $(pwd)/src/data/make.py --requesttype getcontacts --fieldgroup contact_fields --datasetname mycontacts"
- Edit the crontab file
The crontab file contains instructions for the cron daemon in the following simplified manner: "run this command on this date at this time".
$ crontab -e
At the end of the file, add the command you copied in the previous step, prefixed with the schedule as shown below. Then save and close the file:
0 1 * * * python3 /home/user/path/to/rapidpro-normalizer/src/data/make.py --requesttype getcontacts --fieldgroup contact_fields --datasetname mycontacts
This instructs the cron daemon to run the command python3 /home/user/path/to/rapidpro-normalizer/src/data/make.py --requesttype getcontacts --fieldgroup contact_fields --datasetname mycontacts every day at 1:00 AM.
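A crontab entry is five schedule fields (minute, hour, day of month, month, day of week) followed by the command. The sketch below assembles such a line in Python so you can see where each value goes. The `cron_line` helper is hypothetical, and the path is the same placeholder as in the example above.

```python
import shlex


def cron_line(minute, hour, command_parts, day_of_month="*", month="*", day_of_week="*"):
    """Build one crontab entry: five schedule fields followed by the command."""
    schedule = f"{minute} {hour} {day_of_month} {month} {day_of_week}"
    # Quote each argument so that spaces in paths survive the shell
    command = " ".join(shlex.quote(part) for part in command_parts)
    return f"{schedule} {command}"


line = cron_line(0, 1, [
    "python3",
    "/home/user/path/to/rapidpro-normalizer/src/data/make.py",
    "--requesttype", "getcontacts",
    "--fieldgroup", "contact_fields",
    "--datasetname", "mycontacts",
])
print(line)
```

Changing the first two arguments changes the time of day. For example, `cron_line(30, 6, ...)` would schedule the run at 6:30 AM every day.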