This project contains Lambda functions that migrate data from Neptune to Solr and Postgres when an appropriately formatted SQS message is received. In the RIALTO architecture these messages come from https://github.com/sul-dlss/rialto-trigger-rebuild when a full rebuild is needed or from https://github.com/sul-dlss/sparql-loader when a single entity needs to be updated.
Start localstack. If you're on a Mac, ensure you are running the docker daemon.
SERVICES=lambda,sns,sqs LAMBDA_EXECUTOR=docker localstack start
Start Blazegraph. On AWS we would use Neptune, but Neptune is not yet a part of localstack.
- Note: use Java 8. Blazegraph won't work with newer versions of Java.
export JAVA_HOME="$(/usr/libexec/java_home -v 1.8)"
java -server -Xmx4g -jar blazegraph.jar
make
- Start localstack. If you're on a Mac, ensure you are running the docker daemon.
SERVICES=lambda,sns LAMBDA_EXECUTOR=docker localstack start
- Set up the environment for localstack
export AWS_DEFAULT_REGION=us-east-1
export AWS_ACCESS_KEY_ID=_not_needed_locally_
export AWS_SECRET_ACCESS_KEY=_not_needed_locally_
- Upload zip and create function definitions
aws lambda \
--endpoint-url http://localhost:4574 create-function \
--function-name f1 \
--runtime go1.x \
--role r1 \
--handler postgres_derivative \
--environment "Variables={\
SPARQL_ENDPOINT=http://127.0.0.1:9999/blazegraph/namespace/kb/sparql, \
RDS_DB_NAME=rialto_development, \
RDS_USERNAME=postgres, \
RDS_HOSTNAME=127.0.0.1, \
RDS_PORT=5432, \
RDS_PASSWORD=sekret}" \
--zip-file fileb://postgres_derivative.zip
aws lambda \
--endpoint-url http://localhost:4574 create-function \
--function-name f2 \
--runtime go1.x \
--role r1 \
--handler solr_derivative \
--environment "Variables={SOLR_HOST=http://127.0.0.1:8983/solr,SOLR_COLLECTION=collection1,\
SPARQL_ENDPOINT=http://127.0.0.1:9999/blazegraph/namespace/kb/sparql}" \
--zip-file fileb://solr_derivative.zip
- Create SNS topic
aws sns \
--endpoint-url=http://localhost:4575 create-topic \
--name data-update
- Subscribe to SNS events
aws sns \
--endpoint-url=http://localhost:4575 subscribe \
--topic-arn arn:aws:sns:us-east-1:123456789012:data-update \
--protocol lambda \
--notification-endpoint arn:aws:lambda:us-east-1:000000000000:function:f1
aws sns \
--endpoint-url=http://localhost:4575 subscribe \
--topic-arn arn:aws:sns:us-east-1:123456789012:data-update \
--protocol lambda \
--notification-endpoint arn:aws:lambda:us-east-1:000000000000:function:f2
- Start Solr and create a collection
gem install solr_wrapper
solr_wrapper
- Publish a Message
aws sns \
--endpoint-url=http://localhost:4575 publish \
--topic-arn arn:aws:sns:us-east-1:123456789012:data-update \
--message '{"Records": [{"EventSource": "foo", "Sns": { "Timestamp": "2014-05-16T08:28:06.801Z",
"Message": "{\"Action\": \"touch\", \"Entities\": [\"http://sul.stanford.edu/rialto/agents/orgs/school-of-engineering\"]}" }}]}'
- View output
When you go to http://127.0.0.1:8983/solr/collection1/select?q=*:* you should see an item record with:
"_source":{"foo": "barfoo"}
- Cleanup (necessary before you upload a newer version of the function)
aws lambda \
--endpoint-url=http://localhost:4574 delete-function \
--function-name f1
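The JSON passed to --message in the publish step nests a second JSON document as a string inside Records[].Sns.Message. The following is a minimal sketch of unwrapping that payload in Go, assuming only the keys visible in the example message. The type and function names (Message, snsEvent, decodeMessages) are hypothetical and are not taken from this project's handlers, which may decode the event differently.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Message is a hypothetical type for the inner payload. Its fields mirror
// the "Action" and "Entities" keys in the example message.
type Message struct {
	Action   string   `json:"Action"`
	Entities []string `json:"Entities"`
}

// snsEvent models only the parts of the outer envelope that this sketch reads.
type snsEvent struct {
	Records []struct {
		Sns struct {
			Message string `json:"Message"`
		} `json:"Sns"`
	} `json:"Records"`
}

// decodeMessages unwraps every record in the envelope and parses the JSON
// string stored in Sns.Message.
func decodeMessages(raw []byte) ([]Message, error) {
	var event snsEvent
	if err := json.Unmarshal(raw, &event); err != nil {
		return nil, fmt.Errorf("decoding envelope: %w", err)
	}
	msgs := make([]Message, 0, len(event.Records))
	for i, rec := range event.Records {
		var m Message
		if err := json.Unmarshal([]byte(rec.Sns.Message), &m); err != nil {
			return nil, fmt.Errorf("decoding message in record %d: %w", i, err)
		}
		msgs = append(msgs, m)
	}
	return msgs, nil
}

func main() {
	// Same payload as the publish step above.
	raw := []byte(`{"Records": [{"EventSource": "foo", "Sns": {"Timestamp": "2014-05-16T08:28:06.801Z", "Message": "{\"Action\": \"touch\", \"Entities\": [\"http://sul.stanford.edu/rialto/agents/orgs/school-of-engineering\"]}"}}]}`)
	msgs, err := decodeMessages(raw)
	if err != nil {
		panic(err)
	}
	for _, m := range msgs {
		fmt.Println(m.Action, m.Entities)
	}
}
```

Because the inner message is a string rather than a nested object, it takes a second json.Unmarshal call. Decoding it in one pass would fail with a type error.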
go test ./...
The database.dump file was generated by checking out rialto-webapp and running:
pg_dump rialto_test > database.dump
To restore it:
psql circle_test < database.dump
Alternatively, the test database can be run in a docker container:
# To start db
docker run --rm --name rialto_test_db -e POSTGRES_DB=rialto_test -p "5432:5432" -e POSTGRES_USER=$USER -d postgres:9.6.2-alpine
# To load test data
cat database.dump | docker exec -i rialto_test_db psql -U $USER rialto_test
# Run tests
go test ./...
# Stop container
docker stop rialto_test_db