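1. Install the Hadoop cluster
Reference:
2. Install Hive
Reference:
3. Install and configure Spark
Build Spark:
Deployment reference:
4. Integrate spark-sql with Hive
Copy the hdfs-site.xml and hive-site.xml configuration files into Spark's conf/ directory (the commands below are run from inside conf/, hence the "." destination):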
$ cp /opt/cdh-5.3.6/hive-0.13.1-cdh5.3.6/conf/hive-site.xml .
$ cp /opt/cdh-5.3.6/hadoop-2.5.0-cdh5.3.6/etc/hadoop/hdfs-site.xml .
5. Start spark-sql
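Before launching, it can be worth confirming that both copied configuration files are actually present in conf/, since a missing hive-site.xml typically leaves spark-sql unable to see the existing Hive databases. The following is only a minimal sketch: check_spark_conf is a hypothetical helper written for this note, not something shipped with Spark, and the directory to check is passed in by the caller.

```shell
# Hypothetical helper (not part of Spark): verify that the Hive and HDFS
# client configs are present in the given Spark conf directory.
check_spark_conf() {
  conf_dir="$1"
  missing=0
  for f in hive-site.xml hdfs-site.xml; do
    if [ ! -f "$conf_dir/$f" ]; then
      # Report each missing file on stderr and remember the failure.
      echo "missing: $conf_dir/$f" >&2
      missing=1
    fi
  done
  # Returns 0 when both files exist, 1 otherwise.
  return "$missing"
}
```

From the Spark installation directory this could be called as check_spark_conf conf before running the start command.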
$ bin/spark-sql --master local[2]
Once it has started, you can run HQL interactively against the Hive databases from the shell client.
6. Test:
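For scripted rather than interactive use, the spark-sql CLI also accepts a single statement through the -e option, in the same way as the Hive CLI. The sketch below wraps that in a small function. run_hql and the SPARK_SQL variable are hypothetical names introduced here for illustration, and the default path assumes the command is run from the Spark installation directory as in the start command above.

```shell
# Hypothetical wrapper: run one HQL statement non-interactively.
# SPARK_SQL may be set to override the launcher path; it defaults to bin/spark-sql.
run_hql() {
  # local[2] is quoted so the shell cannot treat the brackets as a glob pattern.
  "${SPARK_SQL:-bin/spark-sql}" --master "local[2]" -e "$1"
}
```

For example, run_hql "show databases;" would issue the same statement that is typed at the interactive prompt.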
spark-sql (default)> show databases;
... ...
result
chavin
default
... ...
spark-sql (default)> select * from chavin.dept;
... ...
deptno  dname       loc
10      ACCOUNTING  NEW YORK
20      RESEARCH    DALLAS
30      SALES       CHICAGO
40      OPERATIONS  BOSTON
Time taken: 0.378 seconds, Fetched 4 row(s)
... ...