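The installation instructions below only illustrate the overall procedure; tune the related configuration to your own needs. If you use different versions of Spark, Kafka, Redis, or Cassandra, update the driver versions in this platform's pom.xml accordingly.
## Directories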
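sudo -s
su - root
# note: useradd -p expects an already-hashed password; run "passwd spark" afterwards to set a usable one
useradd -s /bin/bash -p 123456789 -m spark -d /data/spark
mkdir -p /data/apps/meteor
chown -R spark:spark /data/apps
sudo -s
su - spark
# press Enter at every prompt
ssh-keygen -t rsa
cat /data/spark/.ssh/id_rsa.pub >> /data/spark/.ssh/authorized_keys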
Install Java HotSpot 1.7
export SPARK_DAEMON_MEMORY=512m
export JAVA_HOME=/usr/local/jdk
export SPARK_HOME=/data/apps/spark
export SPARK_WORKER_CORES=60
export SPARK_WORKER_MEMORY=2g
export SPARK_WORKER_DIR=$SPARK_HOME/work
export SPARK_LOCAL_DIRS=/tmp
10. Start the Spark cluster with /data/apps/spark/sbin/start-all.sh, then open http://<public IP of this machine>:8080 to check it.
/data/apps/kafka/bin/zookeeper-server-start.sh /data/apps/kafka/config/zookeeper.properties > /tmp/startup_zookeeper.log 2>&1 &
/data/apps/kafka/bin/kafka-server-start.sh /data/apps/kafka/config/server.properties > /tmp/startup_kafka.log 2>&1 &
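Both commands above return immediately because they run in the background, so it can help to confirm that the services are actually listening before moving on. The helper below is a minimal sketch, not part of the platform: `wait_for_port` is a hypothetical name, it relies on bash's `/dev/tcp` feature, and it assumes the stock ports (2181 for ZooKeeper, 9092 for Kafka) unless you changed them in the two properties files.

```shell
#!/usr/bin/env bash
# wait_for_port HOST PORT [TIMEOUT_SECONDS]  (hypothetical helper)
# Returns 0 once a TCP connection succeeds, 1 if the timeout expires first.
wait_for_port() {
  local host=$1 port=$2 timeout=${3:-30} i
  for ((i = 0; i < timeout; i++)); do
    # bash-only: opening /dev/tcp/HOST/PORT attempts a TCP connection
    if (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; then
      return 0
    fi
    sleep 1
  done
  return 1
}

# Usage (ports are the assumed defaults):
#   wait_for_port 127.0.0.1 2181 && echo "zookeeper is up"
#   wait_for_port 127.0.0.1 9092 && echo "kafka is up"
```

If a port never opens, /tmp/startup_zookeeper.log and /tmp/startup_kafka.log are the first places to look.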
1. Download redis-3.0.7: http://redis.io/download
mkdir -p /data/apps/redis-3.0.7/conf
mkdir -p /data/apps/redis-3.0.7/bin
mkdir -p /data/apps/redis-3.0.7/data
ln -s /data/apps/redis-3.0.7 /data/apps/redis
cp redis.conf sentinel.conf /data/apps/redis/conf
cp runtest* /data/apps/redis/bin
cd src
cp mkreleasehdr.sh redis-benchmark redis-check-aof redis-check-dump redis-cli redis-sentinel redis-server redis-trib.rb /data/apps/redis/bin/
cp /data/apps/redis/conf/redis.conf /data/apps/redis/conf/redis-6379.conf
vim /data/apps/redis/conf/redis-6379.conf
daemonize yes
pidfile /var/run/redis-6379.pid
port 6379
#save 900 1
#save 300 10
#save 60 10000
dbfilename dump-6379.rdb
dir /data/apps/redis/data
maxmemory 1g
maxmemory-policy allkeys-lru
maxmemory-samples 3
cluster-enabled yes
cluster-config-file /data/apps/redis/conf/nodes-6379.conf
cp /data/apps/redis/conf/redis.conf /data/apps/redis/conf/redis-6380.conf
vim /data/apps/redis/conf/redis-6380.conf
daemonize yes
pidfile /var/run/redis-6380.pid
port 6380
#save 900 1
#save 300 10
#save 60 10000
dbfilename dump-6380.rdb
dir /data/apps/redis/data
maxmemory 1g
maxmemory-policy allkeys-lru
maxmemory-samples 3
cluster-enabled yes
cluster-config-file /data/apps/redis/conf/nodes-6380.conf
cp /data/apps/redis/conf/redis.conf /data/apps/redis/conf/redis-6381.conf
vim /data/apps/redis/conf/redis-6381.conf
daemonize yes
pidfile /var/run/redis-6381.pid
port 6381
#save 900 1
#save 300 10
#save 60 10000
dbfilename dump-6381.rdb
dir /data/apps/redis/data
maxmemory 1g
maxmemory-policy allkeys-lru
maxmemory-samples 3
cluster-enabled yes
cluster-config-file /data/apps/redis/conf/nodes-6381.conf
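The three per-port files differ only in the port number, so hand-editing them invites copy-paste slips. As an alternative, here is a minimal sketch that generates the settings from a template. `render_redis_conf` is a hypothetical helper, and it emits only the settings listed above, so anything else falls back to Redis's built-in defaults rather than to the values in the stock redis.conf; treat that as an assumption to verify for your setup.

```shell
# render_redis_conf PORT  (hypothetical helper)
# Prints the per-port settings from this guide for the given port.
render_redis_conf() {
  port=$1
  cat <<EOF
daemonize yes
pidfile /var/run/redis-${port}.pid
port ${port}
dbfilename dump-${port}.rdb
dir /data/apps/redis/data
maxmemory 1g
maxmemory-policy allkeys-lru
maxmemory-samples 3
cluster-enabled yes
cluster-config-file /data/apps/redis/conf/nodes-${port}.conf
EOF
}

# Usage: write one file per node
#   for p in 6379 6380 6381; do
#     render_redis_conf "$p" > "/data/apps/redis/conf/redis-$p.conf"
#   done
```

Because every port-dependent value comes from the same variable, the port, pidfile, dump file, and nodes file always stay in sync.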
/data/apps/redis/bin/redis-server /data/apps/redis/conf/redis-6379.conf
/data/apps/redis/bin/redis-server /data/apps/redis/conf/redis-6380.conf
/data/apps/redis/bin/redis-server /data/apps/redis/conf/redis-6381.conf
exit
sudo -s
su - root
apt-get update
apt-get install ruby1.9.3
apt-get install rubygems
gem install redis
sudo -s
su - spark
/data/apps/redis/bin/redis-trib.rb create 127.0.0.1:6379 127.0.0.1:6380 127.0.0.1:6381
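After redis-trib.rb finishes, the cluster state can be checked from the output of the Redis `CLUSTER INFO` command. The following is a small sketch: `cluster_is_ok` is a hypothetical helper that reads that output on stdin and succeeds only when the state is `ok` and all 16384 hash slots are assigned.

```shell
# cluster_is_ok  (hypothetical helper)
# Reads CLUSTER INFO output on stdin; succeeds if the cluster is healthy.
cluster_is_ok() {
  info=$(cat)
  # Prefix matches tolerate the \r\n line endings that redis-cli prints
  printf '%s\n' "$info" | grep -q '^cluster_state:ok' &&
    printf '%s\n' "$info" | grep -q '^cluster_slots_assigned:16384'
}

# Usage:
#   /data/apps/redis/bin/redis-cli -p 6379 cluster info | cluster_is_ok \
#     && echo "redis cluster is healthy"
```

Any of the three ports can be queried, since every node reports the cluster-wide state.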
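Optional. It is only needed for very large-scale deduplication and joins, for example computing new UV against historical data, or joining new users with their source-channel data.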
sudo -s
su - root
vim /etc/hosts
127.0.0.1 kafka1
127.0.0.1 redis1
127.0.0.1 cassandra1
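If you prefer to script the /etc/hosts change instead of editing the file by hand, the sketch below appends each alias only when it is not already present, so it is safe to re-run. `add_host_alias` is a hypothetical helper; it takes the file path as an argument so that it can be tried on a scratch file first.

```shell
# add_host_alias FILE IP NAME  (hypothetical helper)
# Appends "IP NAME" to FILE unless NAME already appears as a host name in it.
add_host_alias() {
  file=$1
  ip=$2
  name=$3
  if ! grep -Eq "[[:space:]]${name}([[:space:]]|\$)" "$file"; then
    printf '%s %s\n' "$ip" "$name" >> "$file"
  fi
}

# Usage (as root):
#   for h in kafka1 redis1 cassandra1; do
#     add_host_alias /etc/hosts 127.0.0.1 "$h"
#   done
```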
/data/meteor/doc/sql/create.sql
/data/meteor/doc/sql/init_demo.sql
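Downloading the Scala packages is slow because they are hosted overseas. You can instead download them from http://pan.baidu.com/s/1bpxBhrL and extract them into the org/ directory of your Maven repository.
4. Start the front-end management system, then log in at http://x.x.x.x:8070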
java -Xms128m -Xmx128m -cp /data/meteor/jetty-server/target/meteor-jetty-server-1.0-SNAPSHOT-jar-with-dependencies.jar com.meteor.jetty.server.JettyServer "/data/meteor/mc/target/meteor-mc-1.0-SNAPSHOT.war" "/" "8070" > mc.log 2>&1 &
For details on operating platform tasks, see the built-in help documentation and the annotations on the forms.
java -Xms128m -Xmx128m -cp /data/meteor/demo/target/meteor-demo-1.0-SNAPSHOT-jar-with-dependencies.jar com.meteor.demo.DemoSourceData
1) Edit /data/meteor/conf/meteor.properties as needed
2) cp /data/meteor/hiveudf/target/meteor-hiveudf-1.0-SNAPSHOT-jar-with-dependencies.jar /data/spark_lib_ext/
3) cp /data/meteor/conf/log4j.properties /data/apps/spark/conf/
4) vim /data/apps/spark/conf/spark-defaults.conf
spark.driver.extraClassPath /data/spark_lib_ext/*
spark.executor.extraClassPath /data/spark_lib_ext/*
5) Start the program
/data/apps/spark/bin/spark-submit \
  --class com.meteor.server.MeteorServer \
  --master spark://<your private IP>:7077 \
  --executor-memory 1G \
  --total-executor-cores 16 \
  --driver-cores 4 \
  --driver-memory 1G \
  --supervise \
  --verbose \
  /data/meteor/server/target/meteor-server-1.0-SNAPSHOT-jar-with-dependencies.jar \
  "/data/meteor/conf/meteor.properties"
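On the first start, errors are reported because some Kafka topics do not exist yet; the missing topics are then created automatically.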
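If you would rather avoid those first-run errors, topics can be created ahead of time with the kafka-topics.sh script that ships in the same Kafka bin directory. The sketch below only prints the command so that it can be reviewed before running. `topic_create_cmd` is a hypothetical helper, the partition and replication counts of 1 are assumptions suited to a single-node setup, and the topic names the platform needs are not listed here, so `uv_ref_hour` serves purely as an example.

```shell
# topic_create_cmd TOPIC [PARTITIONS] [REPLICATION]  (hypothetical helper)
# Prints a kafka-topics.sh invocation for the ZooKeeper started earlier.
topic_create_cmd() {
  printf '%s --create --zookeeper %s --replication-factor %s --partitions %s --topic %s\n' \
    /data/apps/kafka/bin/kafka-topics.sh 127.0.0.1:2181 "${3:-1}" "${2:-1}" "$1"
}

# Usage: review the printed command, then pipe it to a shell to run it
#   topic_create_cmd uv_ref_hour
#   topic_create_cmd uv_ref_hour | sh
```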
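The running application can be viewed at http://<public IP of this machine>:4040
Used to write execution logs back to MySQL so that the front-end management system can display them.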
java -Xms128m -Xmx128m -cp /data/meteor/jetty-server/target/meteor-jetty-server-1.0-SNAPSHOT-jar-with-dependencies.jar com.meteor.jetty.server.JettyServer "/data/meteor/transfer/target/meteor-transfer-1.0-SNAPSHOT.war" "/" "8090" > transfer.log 2>&1 &
/data/apps/kafka/bin/kafka-console-consumer.sh --zookeeper 127.0.0.1:2181 --topic uv_ref_hour