
Kubernetes and Big Data, Part 2: Compiling and Running a Scala-based Spark Program (WordCount)

1. Introduction

This post compiles a Scala Spark program with SBT and then submits it to Kubernetes to run as a Spark job. The program walked through below is the classic SparkPi example; a WordCount sketch in the same style follows its listing.

Reposted from https://blog.csdn.net/cloudvtech

2. Environment Setup and Compilation

2.1 Install SBT
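The repo file moved into place below is the sbt RPM repository definition. At the time of the original post it was fetched from Bintray (since retired; current sbt releases publish an equivalent repo file on scala-sbt.org), roughly like this:

curl https://bintray.com/sbt/rpm/rpm > bintray-sbt-rpm.repo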

mv bintray-sbt-rpm.repo /etc/yum.repos.d/

yum install -y sbt

2.2 Create the Project Files

sbt expects the standard layout: the build definition (simple.sbt) in the project root and sources under src/main/scala/.

mkdir -p spark-example-project/src/main/scala/

cd spark-example-project

Verify the Scala and Spark versions bundled with the local Spark distribution, since the build definition must match them:

ls /opt/spark/spark-2.1.1-bin-hadoop2.7/jars/scala-library-2.11.8.jar

ls /opt/spark/spark-2.1.1-bin-hadoop2.7/jars/spark-core_2.11-2.1.1.jar

simple.sbt:

name := "Simple Project"
version := "1.0"
scalaVersion := "2.11.8"
libraryDependencies += "org.apache.spark" % "spark-core_2.11" % "2.1.1"
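Note: "org.apache.spark" % "spark-core_2.11" % "2.1.1" pins the Scala binary version by hand; the equivalent idiom "org.apache.spark" %% "spark-core" % "2.1.1" lets sbt append the project's scalaVersion automatically. Either form resolves to the same artifact here.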

src/main/scala/SparkPi.scala:

package org.apache.spark.examples
import scala.math.random
import org.apache.spark._
/** Computes an approximation to pi */
object SparkPi {
  def main(args: Array[String]) {
    val conf = new SparkConf().setAppName("Spark Pi")
    val spark = new SparkContext(conf)
    val slices = if (args.length > 0) args(0).toInt else 2
    val n = math.min(100000L * slices, Int.MaxValue).toInt // avoid overflow
    val count = spark.parallelize(1 until n, slices).map { i =>
      // sample a random point in the square [-1, 1] x [-1, 1]
      val x = random * 2 - 1
      val y = random * 2 - 1
      // 1 if the point falls inside the unit circle, else 0
      if (x*x + y*y < 1) 1 else 0
    }.reduce(_ + _)
    // the hit ratio approximates pi/4
    println("Pi is roughly " + 4.0 * count / n)
    spark.stop()
  }
}
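
The build log in section 2.3 reports two Scala sources and multiple main classes, so the project evidently contained a second program alongside SparkPi. Since the article title promises WordCount, here is a minimal sketch of what that second file could look like (a reconstruction, not the original author's code; the default input path is an assumption):

src/main/scala/WordCount.scala:

package org.apache.spark.examples
import org.apache.spark._
/** Counts word occurrences in a text file */
object WordCount {
  def main(args: Array[String]) {
    val conf = new SparkConf().setAppName("Word Count")
    val spark = new SparkContext(conf)
    // input path: first CLI argument, falling back to an assumed sample file
    val input = if (args.length > 0) args(0) else "/opt/spark/README.md"
    val counts = spark.textFile(input)
      .flatMap(_.split("\\s+"))  // split each line into words
      .filter(_.nonEmpty)
      .map(word => (word, 1))
      .reduceByKey(_ + _)        // sum the 1s per distinct word
    counts.take(20).foreach { case (word, n) => println(s"$word: $n") }
    spark.stop()
  }
}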

2.3 Compile

sbt clean
sbt package
[info] Loading project definition from /root/spark-example-project/project
[info] Loading settings for project spark-example-project from simple.sbt ...
[info] Set current project to Simple Project (in build file:/root/spark-example-project/)
[info] Updating ...
[info] Done updating.
[warn] There may be incompatibilities among your library dependencies.
[warn] Run 'evicted' to see detailed eviction warnings
[info] Compiling 2 Scala sources to /root/spark-example-project/target/scala-2.11/classes ...
[info] Done compiling.
[warn] Multiple main classes detected.  Run 'show discoveredMainClasses' to see the list
[info] Packaging /root/spark-example-project/target/scala-2.11/simple-project_2.11-1.0.jar ...
[info] Done packaging.
[success] Total time: 27 s, completed Sep 10, 2018 1:48:22 PM
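
As an optional sanity check (assuming a JDK jar tool is on the PATH), confirm the compiled class landed inside the artifact:

jar tf target/scala-2.11/simple-project_2.11-1.0.jar | grep SparkPi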


3. Packaging and Execution

3.1 Build the Docker Image

The compiled jar is copied into the Spark 2.3.0 distribution's examples/jars directory (renamed to sparkpi_2.11-1.0.jar) because the official Kubernetes Dockerfile bakes that directory into the image at /opt/spark/examples/jars; the spark-submit step below references the jar there via a local:// URI. Note that the jar was built against Spark 2.1.1 while the runtime image is based on Spark 2.3.0.

cp spark-example-project/target/scala-2.11/simple-project_2.11-1.0.jar /root/spark-2.3.0-bin-hadoop2.7/examples/jars/sparkpi_2.11-1.0.jar
cd /root/spark-2.3.0-bin-hadoop2.7
docker build -t  192.168.56.10:5000/spark:2.3.0.4 -f kubernetes/dockerfiles/spark/Dockerfile  .
docker push 192.168.56.10:5000/spark:2.3.0.4

3.2 Run the Task

/root/spark/bin/spark-submit  \
   --master k8s://https://192.168.56.10:6443  \
   --deploy-mode cluster  \
   --name spark-pi   \
   --class org.apache.spark.examples.SparkPi \
   --conf spark.executor.instances=3  \
   --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
   --conf spark.kubernetes.container.image=192.168.56.10:5000/spark:2.3.0.4  \
          local:///opt/spark/examples/jars/sparkpi_2.11-1.0.jar
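
When the job completes, the Pi estimate appears in the driver pod's log. Assuming kubectl is pointed at the same cluster (the pod name below is a placeholder):

kubectl get pods -l spark-role=driver
kubectl logs <spark-pi-driver-pod> | grep "Pi is roughly"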
