hive upgrade to 3.1.2 and hadoop upgrade to 3.2.1 #56

Open
wants to merge 5 commits into
base: master
22 changes: 15 additions & 7 deletions Dockerfile
Original file line number Diff line number Diff line change
@@ -1,28 +1,32 @@
FROM bde2020/hadoop-base:2.0.0-hadoop2.7.4-java8
FROM bde2020/hadoop-base:2.0.0-hadoop3.2.1-java8

MAINTAINER Yiannis Mouchakis <[email protected]>
MAINTAINER Ivan Ermilov <[email protected]>
MAINTAINER Jian Shen <[email protected]>

# Allow buildtime config of HIVE_VERSION
ARG HIVE_VERSION
# Set HIVE_VERSION from arg if provided at build, env if provided at run, or default
# https://docs.docker.com/engine/reference/builder/#using-arg-variables
# https://docs.docker.com/engine/reference/builder/#environment-replacement
ENV HIVE_VERSION=${HIVE_VERSION:-2.3.2}
ENV HIVE_VERSION=${HIVE_VERSION:-3.1.3}

ENV HIVE_HOME /opt/hive
ENV PATH $HIVE_HOME/bin:$PATH
ENV HADOOP_HOME /opt/hadoop-$HADOOP_VERSION

WORKDIR /opt

RUN sed -i 's/^.*$/deb http:\/\/mirrors.tuna.tsinghua.edu.cn\/debian\/ buster main\ndeb-src http:\/\/mirrors.tuna.tsinghua.edu.cn\/debian\/ buster main\ndeb http:\/\/mirrors.tuna.tsinghua.edu.cn\/debian-security\/ buster\/updates main\ndeb-src http:\/\/mirrors.tuna.tsinghua.edu.cn\/debian-security\/ buster\/updates main/g' /etc/apt/sources.list

#Install Hive and PostgreSQL JDBC
#Install Hive and PostgreSQL JDBC
RUN apt-get update && apt-get install -y wget procps && \
wget https://archive.apache.org/dist/hive/hive-$HIVE_VERSION/apache-hive-$HIVE_VERSION-bin.tar.gz && \
tar -xzvf apache-hive-$HIVE_VERSION-bin.tar.gz && \
mv apache-hive-$HIVE_VERSION-bin hive && \
wget https://jdbc.postgresql.org/download/postgresql-9.4.1212.jar -O $HIVE_HOME/lib/postgresql-jdbc.jar && \
rm apache-hive-$HIVE_VERSION-bin.tar.gz && \
wget https://mirrors.tuna.tsinghua.edu.cn/apache/hive/hive-3.1.3/apache-hive-3.1.3-bin.tar.gz && \
tar -xzvf apache-hive-3.1.3-bin.tar.gz && \
mv apache-hive-3.1.3-bin hive
RUN wget https://jdbc.postgresql.org/download/postgresql-9.4.1212.jar -O $HIVE_HOME/lib/postgresql-jdbc.jar && \
rm apache-hive-3.1.3-bin.tar.gz && \
apt-get --purge remove -y wget && \
apt-get clean && \
rm -rf /var/lib/apt/lists/*
@@ -46,6 +50,10 @@ RUN chmod +x /usr/local/bin/startup.sh
COPY entrypoint.sh /usr/local/bin/
RUN chmod +x /usr/local/bin/entrypoint.sh

# Resolve the Guava version conflict: replace Hive's bundled guava-19.0 with the guava-27.0-jre shipped with Hadoop 3.2.1
RUN cp /opt/hadoop-3.2.1/share/hadoop/common/lib/guava-27.0-jre.jar /opt/hive/lib/
RUN rm -rf /opt/hive/lib/guava-19.0.jar

EXPOSE 10000
EXPOSE 10002

5 changes: 3 additions & 2 deletions README.md
@@ -2,7 +2,7 @@

# docker-hive

This is a docker container for Apache Hive 2.3.2. It is based on https://github.com/big-data-europe/docker-hadoop so check there for Hadoop configurations.
This is a docker container for Apache Hive 3.1.2. It is based on https://github.com/big-data-europe/docker-hadoop so check there for Hadoop configurations.
This deploys Hive and starts a hiveserver2 on port 10000.
Metastore is running with a connection to postgresql database.
The hive configuration is performed with HIVE_SITE_CONF_ variables (see hadoop-hive.env for an example).
@@ -29,7 +29,7 @@ This deploys a Presto server listens on port `8080`
Load data into Hive:
```
$ docker-compose exec hive-server bash
# /opt/hive/bin/beeline -u jdbc:hive2://localhost:10000
# /opt/hive/bin/beeline -u jdbc:hive2://
> CREATE TABLE pokes (foo INT, bar STRING);
> LOAD DATA LOCAL INPATH '/opt/hive/examples/files/kv1.txt' OVERWRITE INTO TABLE pokes;
```
@@ -47,3 +47,4 @@ Then query it from PrestoDB. You can get [presto.jar](https://prestosql.io/docs/
* Ivan Ermilov [@earthquakesan](https://github.com/earthquakesan) (maintainer)
* Yiannis Mouchakis [@gmouchakis](https://github.com/gmouchakis)
* Ke Zhu [@shawnzhu](https://github.com/shawnzhu)
* Jian Shen [@SJshenjian](https://github.com/SJshenjian)
4 changes: 4 additions & 0 deletions conf/hive-site.xml
@@ -15,4 +15,8 @@
See the License for the specific language governing permissions and
limitations under the License.
--><configuration>
<property>
<name>hive.metastore.event.db.notification.api.auth</name>
<value>false</value>
</property>
</configuration>
28 changes: 18 additions & 10 deletions docker-compose.yml
@@ -1,28 +1,29 @@
version: "3"
version: "3.5"

services:
namenode:
image: bde2020/hadoop-namenode:2.0.0-hadoop2.7.4-java8
image: bde2020/hadoop-namenode:2.0.0-hadoop3.2.1-java8
volumes:
- namenode:/hadoop/dfs/name
environment:
- CLUSTER_NAME=test
env_file:
- ./hadoop-hive.env
ports:
- "50070:50070"
- "9870:9870"
- "9000:9000"
datanode:
image: bde2020/hadoop-datanode:2.0.0-hadoop2.7.4-java8
image: bde2020/hadoop-datanode:2.0.0-hadoop3.2.1-java8
volumes:
- datanode:/hadoop/dfs/data
env_file:
- ./hadoop-hive.env
environment:
SERVICE_PRECONDITION: "namenode:50070"
SERVICE_PRECONDITION: "namenode:9870"
ports:
- "50075:50075"
- "9864:9864"
hive-server:
image: bde2020/hive:2.3.2-postgresql-metastore
image: bde2020/hive:3.1.2-postgresql-metastore
env_file:
- ./hadoop-hive.env
environment:
@@ -31,16 +32,18 @@ services:
ports:
- "10000:10000"
hive-metastore:
image: bde2020/hive:2.3.2-postgresql-metastore
image: bde2020/hive:3.1.2-postgresql-metastore
env_file:
- ./hadoop-hive.env
command: /opt/hive/bin/hive --service metastore
environment:
SERVICE_PRECONDITION: "namenode:50070 datanode:50075 hive-metastore-postgresql:5432"
SERVICE_PRECONDITION: "namenode:9870 datanode:9864 hive-metastore-postgresql:5432"
ports:
- "9083:9083"
hive-metastore-postgresql:
image: bde2020/hive-metastore-postgresql:2.3.0
image: bde2020/hive-metastore-postgresql:3.1.0
ports:
- "5432:5432"
presto-coordinator:
image: shawnzhu/prestodb:0.181
ports:
@@ -49,3 +52,8 @@ services:
volumes:
namenode:
datanode:

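# Give the default network a name without an underscore. The auto-generated name
# (docker-hive_default) puts "_" into the metastore hostname and triggers:
# java.net.URISyntaxException: Illegal character in hostname at index 49: thrift://docker-hive-hive-metastore-1.docker-hive_default:9083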
networks:
default:
name: docker-hive-default
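
Note on the network rename: in the URI quoted in the exception, index 49 is the `_` in `docker-hive_default`, which is why naming the network with a hyphen avoids the error. The sketch below is only an approximation of that rule, assuming hostnames may contain just letters, digits, hyphens and dots; `check_hostname` is a hypothetical helper, not part of this repository or of Hive.

```shell
# Hypothetical helper: prints "ok" if the hostname contains only letters,
# digits, hyphens and dots, and "bad" otherwise (a simplified stand-in for
# the hostname check that rejected the underscore).
check_hostname() {
  case "$1" in
    "" | *[!A-Za-z0-9.-]*) echo "bad" ;;
    *) echo "ok" ;;
  esac
}

# Hostname with the auto-generated network name (contains "_").
check_hostname "docker-hive-hive-metastore-1.docker-hive_default"
# Hostname with the network name set in this change.
check_hostname "docker-hive-hive-metastore-1.docker-hive-default"
```

The first call prints `bad` and the second prints `ok`.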
2 changes: 2 additions & 0 deletions entrypoint.sh
@@ -110,9 +110,11 @@ function wait_for_it()
echo "[$i/$max_try] $service:${port} is available."
}

# shellcheck disable=SC2068
for i in ${SERVICE_PRECONDITION[@]}
do
wait_for_it ${i}
done

# shellcheck disable=SC2068
exec $@
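
Note on the SC2068 suppressions: ShellCheck flags unquoted `$@` because each argument is split again on whitespace before being passed on. A minimal sketch of the difference, using hypothetical helper functions that are not part of `entrypoint.sh`:

```shell
# Hypothetical helper: prints how many arguments it received.
count_args() { echo "$#"; }

# Unquoted expansion: every argument is re-split on whitespace.
forward_unquoted() { count_args $@; }

# Quoted expansion: every argument is passed through unchanged.
forward_quoted() { count_args "$@"; }

forward_unquoted "a b" c   # prints 3
forward_quoted "a b" c     # prints 2
```

Whether the suppression is harmless here depends on the container command never containing arguments with spaces; quoting as `exec "$@"` would remove that assumption.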
2 changes: 2 additions & 0 deletions hadoop-hive.env
@@ -4,6 +4,8 @@ HIVE_SITE_CONF_javax_jdo_option_ConnectionUserName=hive
HIVE_SITE_CONF_javax_jdo_option_ConnectionPassword=hive
HIVE_SITE_CONF_datanucleus_autoCreateSchema=false
HIVE_SITE_CONF_hive_metastore_uris=thrift://hive-metastore:9083
HIVE_SITE_CONF_hive_server2_thrift_bind_host=0.0.0.0
HIVE_SITE_CONF_hive_server2_thrift_port=10000
HDFS_CONF_dfs_namenode_datanode_registration_ip___hostname___check=false

CORE_CONF_fs_defaultFS=hdfs://namenode:8020
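
Note on the variable names: the two new `HIVE_SITE_CONF_` entries are meant to end up as `hive.server2.thrift.bind.host` and `hive.server2.thrift.port` in `hive-site.xml`. The sketch below illustrates one plausible name mapping, assuming the base image's entrypoint turns `___` into `-`, `__` into `_`, and a single `_` into `.`. That rule is an assumption inferred from the names in this file, and `env_to_property` is a hypothetical helper rather than the entrypoint's actual code.

```shell
# Hypothetical helper: converts the part of a variable name after the
# HIVE_SITE_CONF_ / HDFS_CONF_ prefix into a property name.
# Assumed rule: "___" -> "-", "__" -> "_", "_" -> "."
env_to_property() {
  printf '%s\n' "$1" | sed -e 's/___/-/g' -e 's/__/@/g' -e 's/_/./g' -e 's/@/_/g'
}

env_to_property "hive_server2_thrift_bind_host"
env_to_property "dfs_namenode_datanode_registration_ip___hostname___check"
```

Under that assumption the calls print `hive.server2.thrift.bind.host` and `dfs.namenode.datanode.registration.ip-hostname-check`.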
2 changes: 1 addition & 1 deletion startup.sh
@@ -6,4 +6,4 @@ hadoop fs -chmod g+w /tmp
hadoop fs -chmod g+w /user/hive/warehouse

cd $HIVE_HOME/bin
./hiveserver2 --hiveconf hive.server2.enable.doAs=false
./hiveserver2 --hiveconf hive.server2.enable.doAs=false