This script acts as glue between a local file storage mount point and RIAK. It is targeted at specific use cases when local mount-points need to be migrated to RIAK without changing the applications accessing those mount point. Think of it as a transparent RIAK filesystem layer with multiple options to control it’s behavior regarding local files.
The tool is started from command line or supervisord (see [supervisord] manual for further information how to set-up as daemon). It takes 2 parameters:
riak-fuse.py -h
For a help text:
usage: riak-fuse.py [-h] -s SOURCE -t TARGET [-f] [-rp RIAKPORT]
[-rh RIAKHOST] [-rnp RIAK_NAMESPACE_PREFIX]
[-rdnp RIAK_DIRECTORY_NAMESPACE_PREFIX]
[-rbt RIAK_DIRECTORY_SET_BUCKETTYPE]
[-rdk RIAK_DIRECTORY_SET_DIRECTORYKEY]
[-rct RIAK_CONTENT_TYPE] [-dell] [-ddir] [-rreadcontent]
[-rreaddir] [-rfuid RIAK_CONTENTS_FILE_UID]
[-rfgid RIAK_CONTENTS_FILE_GID]
optional arguments:
-h, --help show this help message and exit
-s SOURCE, --source SOURCE
the source mount point
-t TARGET, --target TARGET
the target mount point
-f, --foreground don't go into background on start-up
-rp RIAKPORT, --riakport RIAKPORT
the port RIAK PBC is listening on
-rh RIAKHOST, --riakhost RIAKHOST
the host or IP adress RIAK PBC is listening on
-rnp RIAK_NAMESPACE_PREFIX, --riak_namespace_prefix RIAK_NAMESPACE_PREFIX
the prefix given to each RIAK binary content bucket
-rdnp RIAK_DIRECTORY_NAMESPACE_PREFIX, --riak_directory_namespace_prefix RIAK_DIRECTORY_NAMESPACE_PREFIX
the prefix given to each RIAK directory content bucket
-rbt RIAK_DIRECTORY_SET_BUCKETTYPE, --riak_directory_set_buckettype RIAK_DIRECTORY_SET_BUCKETTYPE
the RIAK bucket type name used for directory content
buckets
-rdk RIAK_DIRECTORY_SET_DIRECTORYKEY, --riak_directory_set_directorykey RIAK_DIRECTORY_SET_DIRECTORYKEY
the reserved key name of the directory listing set
-rct RIAK_CONTENT_TYPE, --riak_content_type RIAK_CONTENT_TYPE
the mime type used for the RIAK binary content
-dell, --delete_local
when present the local copy of a file shall be removed
when it was successfully transferred to RIAK
-ddir, --disable_maintain_directory
when present the directory structure will NOT be
maintained in RIAK
-rreadcontent, --use_riak_read_content
should the contents of RIAK be used to fulfill read
accesses to files
-rreaddir, --use_riak_read_directory
should also the maintained RIAK datastructure be used
for directory read access
-rfuid RIAK_CONTENTS_FILE_UID, --riak_contents_file_uid RIAK_CONTENTS_FILE_UID
the UID used for files when RIAK is used for read
directory access
-rfgid RIAK_CONTENTS_FILE_GID, --riak_contents_file_gid RIAK_CONTENTS_FILE_GID
the GID used for files when RIAK is used for read
directory access
Where source-mountpoint is the mountpoint from where you migrate - essentially the local disk or mounted share that holds the current data-set. The tool is acting only upon a certain path-scheme that is being in the following form: /$foldername/images/*
Every file matching the *
is going to be handled by the script. Directories below that structure are not supported. Everything else is going to be ignored. This behavior is hard-coded and can be changed in the NameMapping.py files legacyPath*
methods.
The target-mountpoint is the mount point where the tool will interact with the applications. It’s probably to replace the previously mounted source-mountpoint.
- hardlinks and symlinks are not supported and won't be supported
- renaming across buckets is not supported
- chown/chmod/setattr is not supported when RIAK is used for reading of files and directories
- subfolders are not supported beyond the matched path
- the directory set name must be named so that it does not collide with filenames/directory names inside that directory matching the pattern for this tool
There are two buckets per merchant:
- Binary Content Bucket
- here the file data / binary contents get stored
- bucket name:
$riak_namespace_prefix$foldername
- so for a folder like:
/test/images/file.jpg
with default prefix it’ll have the bucket nameIMG_test
- so for a folder like:
- the keys stored in this bucket are going to be named after the file part of the path
- so for the bespoke example it’ll be
file.jpg
- so for the bespoke example it’ll be
- for debugging:
- to quickly get the contents of a bucket
curl "http://localhost:8098/buckets/IMG_test/keys?keys=true" | python -m json.tool
- to get the contents of a file
curl "http://localhost:8098/buckets/IMG_test/keys/file.jpg"
- to quickly get the contents of a bucket
- Directory Bucket
- here the directory listing and file size information get stored
- bucket name:
$riak_directory_namespace_prefix$foldername
- so for a folder like:
/test/images/file.jpg
with default prefix it’ll have the bucket nameIMGDIR_test
- so for a folder like:
- the keys (sets) stored in this bucket holding the size information are named like the file part of the path. As a set they are only holding one piece of information (the size) for now.
- so for bespoke example it’ll be
file.jpg
which contains a set of 1 piece of information which is the file size
- so for bespoke example it’ll be
- in addition there’s a directory set holding a full directory listing of this directory. Each entry in the set is one file.
- so for default values this key is named
directory
- so for default values this key is named
- for debugging:
-
to get the contents of a directory
curl http://localhost:8098/types/sets/buckets/IMGDIR_test/datatypes/directory | python -m json.tool
-
to get all buckets contents of a directory set bucket
curl http://localhost:8098/types/sets/buckets/IMGDIR_test/keys?keys=true | python -m json.tool
-
to get all available directory buckets
curl http://localhost:8098/types/sets/buckets?buckets=true | python -m json.tool
- riak KV 2.1 or later
apt-get install pkg-config libtool autoconf build-essentials gcc automake make libffi-dev libssl-dev openssl curl apt-transport-https libc6-dev-i386 libncurses5-dev fop xsltproc unixodbc-dev libpam0g-dev
curl -O https://raw.githubusercontent.com/spawngrid/kerl/master/kerl
chmod a+x kerl
- this installs Erlang
./kerl build git git://github.com/basho/otp.git OTP_R16B02_basho8 R16B02-basho8
- this installs RIAK
./kerl install R16B02-basho8 ~/erlang/R16B02-basho8
- this activates RIAK
.~/erlang/R16B02-basho8/activate
- after RIAK is set-up and started
- start using
riak-admin start
- add the required SETS bucket-type by running these commands:
riak-admin bucket-type create sets '{"props":{"datatype":"set"}}'
riak-admin bucket-type activate sets
- start using
- fuse 2.9.3 or later
- example set-up:
- install build essentials:
apt-get install pkg-config libtool autoconf build-essentials gcc automake make libffi-dev libssl-dev openssl curl apt-transport-https libc6-dev-i386
- get current source from https://github.com/libfuse/libfuse:
wget https://github.com/libfuse/libfuse/releases/download/fuse-2.9.7/fuse-2.9.7.tar.gz
tar xvf fuse-2.9.7.tar.gz
cd fuse-2.9.7
./configure
make -j8
make install
- install build essentials:
- example set-up:
- python 2.7 or later
apt-get install python python-pip
- libraries installed:
- fusepy 2.0.4 or later (
pip install fusepy
) - RIAK KV 2.5.3 or later (
pip install riak
)
- fusepy 2.0.4 or later (
Several aspects of this tool can and need to be configured (as in: changed in
riak-fuse.py
), as there is:- RIAK
- Host/IP
- default
riak_host = 'localhost'
- default
- PBC port of RIAK
- default:
riak_port = 8087
- default:
- Namespace Prefix for the file contents bucket
- default:
riak_namespace_prefix = 'IMG_'
- default:
- Namespace Prefix for the directory bucket
- default:
riak_directory_namespace_prefix = 'IMGDIR_'
- default:
- RIAK Bucket-Type to use for the directory bucket
- default:
riak_directory_set_buckettype = 'sets'
- Note: This needed to be set-up earlier, see Pre-Requisites and Dependencies
- default:
- RIAK directory bucket key name to store the directory listing in
- default:
riak_directory_set_directorykey = 'directory'
- default:
- RIAK content value content type (better left unchanged for now)
- default:
riak_content_type = 'application/octet-stream'
- default:
- Host/IP
- Behavior Options
- wether or not the local copy of a file shall be removed when it was successfully transferred to RIAK
- default:
remove_local_copy_after_successful_mapping = False
- default:
- wether or not the directory structure will be maintained in RIAK
- default:
maintain_riak_directory_structure = True
- default:
- should also the maintained RIAK directory data structure be used for directory read access (like listing files)
- CHOWN/CHMOD and other attribute commands are ignored when enabled
- local directory contents are ignored
- default:
use_riak_directory_structure_for_read_access = False
- should the contents of RIAK be used to fulfill read accesses to files
- default:
use_riak_file_contents_for_read_access = False
- default:
- File Permission Mask to be used when listing files from RIAK directory data structure
- default:
riak_contents_file_mask = 0o777
- default:
- File GID and UID reported when listing directories / file status flags and information
- default:
riak_contents_file_uid = 0
riak_contents_file_gid = 0
- default:
- wether or not the local copy of a file shall be removed when it was successfully transferred to RIAK
- Logging Options
- which log-level should be outputted as log
- choose from: CRITICAL, ERROR, WARNING, INFO, DEBUG
- default:
logger.setLevel(logging.ERROR)
- which log-level should be outputted as log
-
You can use docker to spawn a RIAK + riak-Fuse demo machine right away.
In order to do that run this to build the environment:
'docker-compose build riak-fuse'
To run:
'docker-compose run riak-fuse'
You now should be able to map the 22 port of the riak-fuse machine to whatever local port you feel good about and connect through SSH as user root with password root.
The riak-fuse script resides in /src/riak-fuse and you should call it from there.
For a brief demo - log in and go ahead and make a test:
check if riak is there: 'ping riak-kv'
now short demo:
'cd ~' 'mkdir source' 'mkdir source/test' 'mkdir source/test/images' 'mkdir target' '/src/riak-fuse/riak-fuse.py -s ~/source/ -t ~/target/ -rh riak-kv -dell -rreaddir -rreadcontent'
you now should be able to generate files in ~/target/test/images - reading and writing.
Those get stored on riak-kv.
You can check if they are there by doing:
'echo "testcontent" >~/target/test/images/test.txt'
and get the dir-listing:
'curl "http://riak-kv:8098/buckets/IMG_test/keys?keys=true" | python -m json.tool'
and contents: