Starting with Spark 2.3, the distribution ships with a Docker image build script.
This article covers building the 2.x image and deploying Spark on k8s. I have not yet gotten 3.x to work; the problem I ran into with 3.0 is that after starting on k8s, it reports that it has no permission to create the logs directory.
1. Download a 2.x release from the Spark site; I downloaded spark-2.4.8-bin-hadoop2.6.
Spark downloads: https://archive.apache.org/dist/spark/
2. Extract spark-2.4.8-bin-hadoop2.6.
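A minimal sketch of the download-and-extract step (the archive URL follows the standard archive.apache.org layout; adjust paths to taste):

wget https://archive.apache.org/dist/spark/spark-2.4.8/spark-2.4.8-bin-hadoop2.6.tgz
tar -xzf spark-2.4.8-bin-hadoop2.6.tgz
cd spark-2.4.8-bin-hadoop2.6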
3. Edit the sbin/spark-daemon.sh script and remove the "--", otherwise the k8s startup later fails with the following error.
failed to launch: nice -n 0 /opt/spark/bin/spark-class org.apache.spark.deploy.master.Master --host spark-manager-84c8795878-4n8x9 --port 7077 --webui-port 8080
nohup: can't execute '--': No such file or directory
full log in /opt/spark/logs/spark--org.apache.spark.deploy.master.Master-1-spark-manager-84c8795878-4n8x9.out
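In the 2.4.x script the culprit is the nohup call inside execute_command; the busybox nohup in the Alpine-based image cannot handle "--". Assuming that line has its stock form (check your copy before running this), a one-line fix is:

sed -i 's/nohup -- /nohup /' sbin/spark-daemon.sh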
4. Run the image build command:
./bin/docker-image-tool.sh -t <your-docker-tag> build
5. Run docker images | grep spark.
Two images show up; I use the one without -r.
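For example, if the build used the tag my_spark_2.4_hadoop_2.7 (the tag the YAML below references), I would expect the listing to look roughly like this (image names are the tool's defaults, output abbreviated):

docker images | grep spark
# spark     my_spark_2.4_hadoop_2.7   ...
# spark-r   my_spark_2.4_hadoop_2.7   ...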
6. Next, use a YAML file to create the Spark cluster. I'll just paste the file; there's not much to explain, the only thing to note is to change the image to your own.
---
apiVersion: v1
kind: Service
metadata:
  name: spark-manager
spec:
  type: ClusterIP
  ports:
  - name: rpc
    port: 7077
  - name: ui
    port: 8080
  selector:
    app: spark
    component: sparkmanager
---
apiVersion: v1
kind: Service
metadata:
  name: spark-manager-rest
spec:
  type: NodePort
  ports:
  - name: rest
    port: 8080
    nodePort: 30221
    targetPort: 8080
  selector:
    app: spark
    component: sparkmanager
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: spark-manager
spec:
  replicas: 1
  selector:
    matchLabels:
      app: spark
      component: sparkmanager
  template:
    metadata:
      labels:
        app: spark
        component: sparkmanager
    spec:
      containers:
      - name: sparkmanager
        image: spark:my_spark_2.4_hadoop_2.7
        workingDir: /opt/spark
        command: ["/bin/bash", "-c", "/opt/spark/sbin/start-master.sh && while true;do echo hello;sleep 6000;done"]
        ports:
        - containerPort: 7077
          name: rpc
        - containerPort: 8080
          name: ui
        livenessProbe:
          tcpSocket:
            port: 7077
          initialDelaySeconds: 30
          periodSeconds: 60
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: spark-worker
spec:
  replicas: 2
  selector:
    matchLabels:
      app: spark
      component: worker
  template:
    metadata:
      labels:
        app: spark
        component: worker
    spec:
      containers:
      - name: sparkworker
        image: spark:my_spark_2.4_hadoop_2.7
        workingDir: /opt/spark
        command: ["/bin/bash", "-c", "/opt/spark/sbin/start-slave.sh spark://spark-manager:7077 && while true;do echo hello;sleep 6000;done"]
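Assuming the manifest above is saved as spark.yaml (the filename is my own choice), apply it and check that the pods come up:

kubectl apply -f spark.yaml
kubectl get pods -l app=spark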
7. The spark-manager-rest Service maps the Spark UI out through a NodePort, so it can be opened in a browser.
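With the nodePort 30221 defined above, that means (replace <node-ip> with any cluster node's address):

curl -I http://<node-ip>:30221
# or simply open http://<node-ip>:30221 in a browser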
8. Exec into the master pod and run a spark-submit test job, the Pi example:
./bin/spark-submit --class org.apache.spark.examples.SparkPi --master local examples/jars/spark-examples_2.11-2.4.8.jar
The computation succeeds.
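For reference, one way to get a shell in the master pod (the label selector comes from the Deployment above):

kubectl exec -it $(kubectl get pods -l component=sparkmanager -o jsonpath='{.items[0].metadata.name}') -- /bin/bash

Note that --master local runs the example entirely inside the master container; to submit against the standalone cluster itself, the usual form would point at the master Service (not tested in this walkthrough):

./bin/spark-submit --class org.apache.spark.examples.SparkPi --master spark://spark-manager:7077 examples/jars/spark-examples_2.11-2.4.8.jar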
9. That completes Spark on k8s. If you have questions, leave a comment; I'm only just exploring this myself and leaned on a lot of material from around the web.