Submitting an Apache Spark Job Using the REST API
阿新 • Published: 2018-12-09
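When working with Apache Spark, you sometimes need to trigger a Spark job on demand from outside the cluster. There are two ways to submit an Apache Spark job to a cluster.
- Submitting from within the Spark cluster with spark-submit
To submit a Spark job from within the Spark cluster, we use spark-submit. Below is a sample shell script that submits a Spark job. Most of the arguments are self-explanatory.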
<pre><code>#!/bin/bash
# Submit the batch application to the standalone master in cluster mode.
# The last two arguments are the application's input and output paths.
$SPARK_HOME/bin/spark-submit \
  --class com.nitendragautam.sparkbatchapp.main.Boot \
  --master spark://192.168.133.128:7077 \
  --deploy-mode cluster \
  --supervise \
  --executor-memory 4G \
  --driver-memory 4G \
  --total-executor-cores 2 \
  /home/hduser/sparkbatchapp.jar \
  /home/hduser/NDSBatchApp/input \
  /home/hduser/NDSBatchApp/output/
</code></pre>
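- Submitting from outside the Spark cluster with the REST API
In this post, I will explain how to trigger a Spark job with the help of the REST API. Before submitting a Spark job, please make sure the Spark cluster is running.
Figure: Apache Spark Master
Triggering a Spark batch job with a shell script
Create a shell script named submit_spark_job.sh with the following content. Give the shell script execute permission.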
<pre><code>#!/bin/bash
# POST a CreateSubmissionRequest to the REST endpoint on port 6066.
curl -X POST http://192.168.133.128:6066/v1/submissions/create \
  --header "Content-Type:application/json;charset=UTF-8" \
  --data '{
    "appResource": "/home/hduser/sparkbatchapp.jar",
    "sparkProperties": {
      "spark.executor.memory": "4g",
      "spark.master": "spark://192.168.133.128:7077",
      "spark.driver.memory": "4g",
      "spark.driver.cores": "2",
      "spark.eventLog.enabled": "false",
      "spark.app.name": "Spark REST API201804291717022",
      "spark.submit.deployMode": "cluster",
      "spark.jars": "/home/hduser/sparkbatchapp.jar",
      "spark.driver.supervise": "true"
    },
    "clientSparkVersion": "2.0.1",
    "mainClass": "com.nitendragautam.sparkbatchapp.main.Boot",
    "environmentVariables": {
      "SPARK_ENV_LOADED": "1"
    },
    "action": "CreateSubmissionRequest",
    "appArgs": [
      "/home/hduser/NDSBatchApp/input",
      "/home/hduser/NDSBatchApp/output/"
    ]
  }'
</code></pre>
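The request body above hardcodes the jar, main class, and arguments. As a rough sketch of how the same body could be generated for other applications, the function below prints a CreateSubmissionRequest from its parameters. The function name build_submission_payload and the SPARK_APP_NAME variable are hypothetical, the memory and core settings are copied from the script above, and the sketch assumes that no value contains double quotes or backslashes, because it does no JSON escaping.

```shell
#!/bin/bash
# Hypothetical helper (not part of Spark): print a CreateSubmissionRequest body.
# Usage: build_submission_payload MASTER_URL APP_JAR MAIN_CLASS [APP_ARG...]
# Assumption: no value contains double quotes or backslashes (no JSON escaping).
build_submission_payload() {
  local master_url="$1"
  local app_jar="$2"
  local main_class="$3"
  shift 3

  # Join the remaining arguments into a comma-separated list of JSON strings.
  local app_args=""
  local arg
  for arg in "$@"; do
    app_args="${app_args:+$app_args, }\"$arg\""
  done

  # SPARK_APP_NAME is a hypothetical override for the application name.
  local app_name="${SPARK_APP_NAME:-Spark REST API job}"

  cat <<EOF
{
  "appResource": "$app_jar",
  "sparkProperties": {
    "spark.executor.memory": "4g",
    "spark.master": "$master_url",
    "spark.driver.memory": "4g",
    "spark.driver.cores": "2",
    "spark.eventLog.enabled": "false",
    "spark.app.name": "$app_name",
    "spark.submit.deployMode": "cluster",
    "spark.jars": "$app_jar",
    "spark.driver.supervise": "true"
  },
  "clientSparkVersion": "2.0.1",
  "mainClass": "$main_class",
  "environmentVariables": { "SPARK_ENV_LOADED": "1" },
  "action": "CreateSubmissionRequest",
  "appArgs": [ $app_args ]
}
EOF
}

# Example call (not run here): pipe the body into curl, which reads it from stdin.
# build_submission_payload spark://192.168.133.128:7077 \
#     /home/hduser/sparkbatchapp.jar \
#     com.nitendragautam.sparkbatchapp.main.Boot \
#     /home/hduser/NDSBatchApp/input /home/hduser/NDSBatchApp/output/ \
#   | curl -X POST http://192.168.133.128:6066/v1/submissions/create \
#       --header "Content-Type:application/json;charset=UTF-8" --data @-
```

Keeping the payload in a function separates what is submitted from how it is sent, so the body can be printed and inspected before anything is posted to the cluster.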
Once the Spark job has been submitted successfully, you will see output like the following.
<pre><code>$ sh submit_spark_job.sh
{
  "action" : "CreateSubmissionResponse",
  "message" : "Driver successfully submitted as driver-20180429125849-0001",
  "serverSparkVersion" : "2.0.1",
  "submissionId" : "driver-20180429125849-0001",
  "success" : true
}
</code></pre>
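The submissionId in this response is the value the status endpoint expects later, so a script may want to capture it. Here is a minimal sketch, assuming the response keeps the shape shown above. The function name extract_submission_id is hypothetical, and the sketch relies on grep and sed pattern matching rather than a real JSON parser, so it is only suitable for simple responses like this one.

```shell
#!/bin/bash
# Hypothetical helper (not part of Spark): read a CreateSubmissionResponse
# on standard input and print the value of its submissionId field.
extract_submission_id() {
  grep -o '"submissionId"[[:space:]]*:[[:space:]]*"[^"]*"' \
    | sed 's/.*:[[:space:]]*"\([^"]*\)"/\1/' \
    | head -n 1
}

# Example call (not run here):
# submission_id=$(sh submit_spark_job.sh | extract_submission_id)
# echo "Submitted as $submission_id"
```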
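Checking the status of a Spark job with the REST API
If you want to check the status of a Spark job, you can pass its submission ID to the status endpoint, as in the command below.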
<pre><code>curl http://192.168.133.128:6066/v1/submissions/status/driver-20180429125849-0001
{
  "action" : "SubmissionStatusResponse",
  "driverState" : "FINISHED",
  "serverSparkVersion" : "2.0.1",
  "submissionId" : "driver-20180429125849-0001",
  "success" : true,
  "workerHostPort" : "192.168.133.128:38451",
  "workerId" : "worker-20180429124356-192.168.133.128-38451"
}
</code></pre>
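A single status call only reports the state at that moment, so a caller that needs to wait for the job may want to poll. The sketch below is one possible way to do that, not a definitive implementation. The function names extract_driver_state and wait_for_driver are hypothetical, the base URL is taken from the example above, and treating FINISHED as success and FAILED, KILLED, and ERROR as terminal failures is an assumption about the driverState values.

```shell
#!/bin/bash
# Base URL of the REST submission server, taken from the example above.
SPARK_REST_URL="http://192.168.133.128:6066"

# Hypothetical helper: read a SubmissionStatusResponse on standard input
# and print the value of its driverState field.
extract_driver_state() {
  grep -o '"driverState"[[:space:]]*:[[:space:]]*"[^"]*"' \
    | sed 's/.*:[[:space:]]*"\([^"]*\)"/\1/' \
    | head -n 1
}

# Hypothetical helper: poll the status endpoint until the driver reaches a
# state treated as terminal, or until MAX_ATTEMPTS polls have been made.
# Usage: wait_for_driver SUBMISSION_ID [INTERVAL_SECONDS] [MAX_ATTEMPTS]
wait_for_driver() {
  local submission_id="$1"
  local interval="${2:-10}"
  local max_attempts="${3:-60}"
  local attempt=1
  local state
  while [ "$attempt" -le "$max_attempts" ]; do
    state=$(curl -s "$SPARK_REST_URL/v1/submissions/status/$submission_id" \
      | extract_driver_state)
    echo "attempt $attempt: driverState=${state:-unavailable}"
    case "$state" in
      FINISHED) return 0 ;;
      FAILED|KILLED|ERROR) return 1 ;;  # assumed terminal failure states
    esac
    attempt=$((attempt + 1))
    sleep "$interval"
  done
  return 2  # gave up without seeing a terminal state
}

# Example call (not run here): poll every 5 seconds, at most 20 times.
# wait_for_driver driver-20180429125849-0001 5 20
```

The attempt limit keeps the loop from running forever if the master is unreachable or the driver never leaves a non-terminal state.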