
Introduction

This repository is a proof of concept (PoC) for HDFS Provided Storage (HDFS-9806). HDFS offers administrators a few storage types for storing data; the folks at Hortonworks give a good explanation here. The current project extends the idea of storage policies with a PROVIDED storage type whose blocks are hosted elsewhere. Loosely, it's a bit like the cloud backup solution you might use on your laptop to sync files between machines.

The proof of concept works by using a tool to scan your provided storage (e.g. Azure, an S3 bucket, etc.) and create a static file system image (fsimage) for HDFS. Creating the fsimage is done with a new tool called fs2img (in the hadoop-tools subproject of hadoop).

Getting started

To try out the PoC, we need to:

  1. Get the code and build it
  2. Setup the environment
  3. Create configuration files
  4. Create the filesystem image (fsimage)
  5. Start HDFS

Get the code:

git clone https://github.com/Microsoft-CSIL/hadoop-prototype.git

Make sure you have all the development dependencies installed (as seen here).

Build the code:

mvn install -Dmaven.javadoc.skip=true -Pdist,native -Djava.awt.headless=true -DskipTests

You should now have the built software at hadoop-dist/target/hadoop-2.8.0-SNAPSHOT/.
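
If you want to confirm where the build landed, a quick listing of the distribution directory should show the usual bin, etc, sbin and share subdirectories:

ls hadoop-dist/target/hadoop-2.8.0-SNAPSHOT/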

Setup the environment

Now it's time to set up the environment so you can use the build. Make a fresh directory with subdirectories for your configuration, logs, and pid files:

mkdir -p hadoop-run hadoop-run/conf/client hadoop-run/conf/nn hadoop-run/conf/dn1 hadoop-run/log hadoop-run/pid
cd hadoop-run

Create a script (env.sh) that we will use to source our environment:

#!/bin/bash
export JAVA_HOME=/path/to/java/if/you/dont/already/have/it/set
export HADOOP_HOME=/path/to/hadoop/checkout/hadoop-dist/target/hadoop-2.8.0-SNAPSHOT
export HADOOP_CONF_DIR=$(pwd)/conf
export HADOOP_LOG_DIR=$(pwd)/log
PATH="$HADOOP_HOME"/bin:$PATH
export HADOOP_CLASSPATH=$(hadoop classpath):"$HADOOP_HOME"/share/hadoop/tools/lib/*
export HADOOP_USER_CLASSPATH_FIRST=true

fs2img () {
    # e.g. fs2img -b org.apache.hadoop.hdfs.server.common.TextFileRegionFormat s3a://bucket/
    CP=$(hadoop classpath):"$HADOOP_HOME"/share/hadoop/tools/lib/*
    # Reads the hdfs-site.xml under the currently active HADOOP_CONF_DIR, so set it before calling.
    java -cp "$CP" org.apache.hadoop.hdfs.server.namenode.FileSystemImage -conf "$HADOOP_CONF_DIR"/hdfs-site.xml "$@"
}
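
As an optional sanity check, source the script and make sure the hadoop binary on your PATH is the one you just built (the paths above are placeholders, so adjust them first):

source ./env.sh
which hadoop        # should point under $HADOOP_HOME/bin
hadoop version      # should report 2.8.0-SNAPSHOT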

Create configuration files

We need three sets of configuration files: one for the namenode, one for the datanode, and one for the client. The namenode and datanode each use an hdfs-site.xml and the client uses a core-site.xml.

conf/nn/hdfs-site.xml (don't forget to update hdfs.image.block.csv.read.path and dfs.namenode.name.dir, etc):

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:60010</value>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>hdfs.namenode.block.provider.class</name>
        <value>org.apache.hadoop.hdfs.server.namenode.BlockFormatProvider</value>
    </property>
    <property>
        <name>hdfs.image.writer.block.class</name>
        <value>org.apache.hadoop.hdfs.server.common.TextFileRegionFormat</value>
    </property>
    <property>
        <name>hdfs.namenode.block.provider.id</name>
        <value><!-- Fill this in later with the DS-<some-uuid> value reported in the namenode logs --></value>
    </property>
    <property>
        <name>hdfs.image.block.csv.read.path</name>
        <value>file:///<...>/hadoop-run/blocks.csv</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:///<...>/hadoop-run/hdfs/name</value>
    </property>
    <property>
        <name>dfs.namenode.http-address</name>
        <value>localhost:51070</value>
    </property>
    <property>
        <name>dfs.namenode.rpc-address</name>
        <value>localhost:60010</value>
    </property>
    <property>
        <name>dfs.namenode.rpc-bind-host</name>
        <value>0.0.0.0</value>
    </property>
    <property>
        <name>dfs.namenode.servicerpc-address</name>
        <value>localhost:60020</value>
    </property>
    <property>
        <name>dfs.namenode.servicerpc-bind-host</name>
        <value>0.0.0.0</value>
    </property>
</configuration>
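
Before starting anything, you can optionally read the configuration back with hdfs getconf to catch typos; this assumes env.sh from earlier has been sourced:

export HADOOP_CONF_DIR=$(pwd)/conf/nn
hdfs getconf -confKey dfs.namenode.rpc-address    # should print localhost:60010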

conf/dn1/hdfs-site.xml (don't forget to edit the fields for dfs.datanode.data.dir and hdfs.image.block.csv.read.path, etc.):

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:60010</value>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>hdfs.image.writer.block.class</name>
        <value>org.apache.hadoop.hdfs.server.common.TextFileRegionFormat</value>
    </property>
    <property>
        <name>hdfs.image.block.csv.read.path</name>
        <value>file:///<...>/hadoop-run/blocks.csv</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>[PROVIDED]/<...>/hadoop-run/hdfs/data/</value>
    </property>
    <property>
        <name>dfs.datanode.address</name>
        <value>localhost:51010</value>
    </property>
    <property>
        <name>dfs.datanode.http.address</name>
        <value>localhost:51075</value>
    </property>
    <property>
        <name>dfs.datanode.ipc.address</name>
        <value>localhost:51020</value>
    </property>
    <property>
        <name>dfs.datanode.provided</name>
        <value>true</value>
    </property>

<!--
Add the values you would use for your S3a or Azure client here, e.g.
    <name>fs.s3a.access.key</name>
    <name>fs.s3a.secret.key</name>
    <name>fs.s3a.proxy.host</name>
    <name>fs.s3a.proxy.port</name>
-->
</configuration>

conf/client/core-site.xml:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:60010</value>
    </property>
</configuration>
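
The client commands later in this walkthrough (hdfs dfs -ls, -cat, etc.) assume this configuration is the active one; with the directory layout used here, that means pointing HADOOP_CONF_DIR at conf/client in the client terminal:

source env.sh
export HADOOP_CONF_DIR=$(pwd)/conf/client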

Create the filesystem image (fsimage)

Run the fs2img program:

source env.sh
export HADOOP_CONF_DIR=`pwd`/conf/dn1
fs2img -b org.apache.hadoop.hdfs.server.common.TextFileRegionFormat s3a://bucket/

You should now have a blocks.csv file in your local directory as well as a directory called hdfs. The hdfs/name directory contains the fsimage, i.e. the directory structure that maps file names to block IDs; this is a normal part of how HDFS works. The blocks.csv file contains the mapping of blocks to file regions in your S3 or other storage (in the HDFS-9806 specification, these are the tuples).
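
If you're curious about what fs2img produced, peeking at the first few lines of the block map can be instructive; each record ties a block ID to a region (roughly a path, offset and length) of an object in the remote store, though the exact column layout is an implementation detail of TextFileRegionFormat:

head -3 blocks.csv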

Start HDFS

Open two spare terminal windows: one for your namenode (NN) and one for your datanode (DN).

Start the namenode

source env.sh
HADOOP_CONF_DIR=`pwd`/conf/nn
hdfs namenode

Start the datanode

source env.sh
HADOOP_CONF_DIR=`pwd`/conf/dn1
hdfs datanode
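
Once both daemons are up, you can optionally confirm from a third terminal that the datanode has registered with the namenode (using the client configuration described earlier):

source env.sh
export HADOOP_CONF_DIR=$(pwd)/conf/client
hdfs dfsadmin -report    # should list one live datanode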

Access the data

If things are running without error, in a third terminal window you should be able to run the following:

hdfs dfs -ls /

You should see the items that were in the bucket when you scanned it. Note that, so far, this only gives you the file name to block ID mapping; it is not fetching the blocks yet. To get the blocks, we need to fish the Storage ID out of the namenode logs. There should be a value such as DS-d2d2b174-d9d9-487b-a188-644c46b2fdf9; it is different each time you run fs2img. At this point, stop the namenode, edit its config to set hdfs.namenode.block.provider.id to this DS-<uuid> value, restart the namenode, and then try to cat a file:

# We use `tail` in case it's a large file and we don't want to spew everything to the screen.
hdfs dfs -cat /some-file | tail -5
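
As an aside, if you would rather not scroll through the namenode output by hand to find the Storage ID, a grep along these lines works; it assumes you captured the namenode's console output to a file (nn.log is a hypothetical name), since running hdfs namenode in the foreground logs to the console:

# Extract candidate storage IDs (DS-<uuid>) from the captured namenode output.
grep -oE 'DS-[0-9a-f-]{36}' nn.log | sort -u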

Conclusion

If the instructions have worked, you should be able to retrieve data from your provided storage system through HDFS. This example used the TextFileRegionFormat class, which holds the mapping in a CSV file; this won't scale beyond a single node unless the file is kept on a shared file system. Microsoft has also provided AzureFileRegionFormat, which uses Azure Tables as a key-value store to map the blocks to storage locations. There is also an S3AFileRegionFormat, which holds the CSV file in S3, and a ZookeeperFileRegionFormat, which holds the mapping in ZooKeeper (though it is of prototype quality).
