StreamingPro again supports Structured Streaming
Preface
In an earlier article I wrote that StreamingPro supported Spark Structured Streaming, but that support was largely experimental: the Spark 2.0+ path was a trial, and the real focus was on the Spark 1.6 series. As time goes on, though, Spark 2.0+ is clearly the trend, so this release reworks much of the underlying code. StreamingPro now supports three engines: Flink, Spark 1.6+, and Spark 2.0+.
Preparation
Download the StreamingPro package built for Spark 2.0, and then download a Spark 2.1 distribution.
You can also find the Spark 1.6+ and Flink builds in the streamingpro download directory. The latest releases follow this naming convention:
streamingpro-spark-0.4.14-SNAPSHOT.jar: for Spark 1.6+, Scala 2.10
streamingpro-spark-2.0-0.4.14-SNAPSHOT.jar: for Spark 2.0+, Scala 2.11
streamingpro.flink-0.4.14-SNAPSHOT-online-1.2.0.jar: for Flink 1.2.0, Scala 2.10
Test example
Write a JSON file ss.json as follows:
{
"scalamaptojson": {
"desc": "test",
"strategy": "spark",
"algorithm": [],
"ref": [
],
"compositor": [
{
"name": "ss.sources",
"params": [
{
"format": "socket",
"outputTable": "test",
"port":"9999",
"host":"localhost",
"path": "-"
},
{
"format": "com.databricks.spark.csv",
"outputTable": "sample",
"header":"true",
"path": "/Users/allwefantasy/streamingpro/sample.csv"
}
]
},
{
"name": "ss.sql",
"params": [
{
"sql": "select city from test left join sample on test.value == sample.name",
"outputTableName": "test3"
}
]
},
{
"name": "ss.outputs",
"params": [
{
"mode": "append",
"format": "console",
"inputTableName": "test3",
"path": "-"
}
]
}
],
"configParams": {
}
}
}
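The compositor chain above wires its stages together by table name: each source registers its result under outputTable, the ss.sql step queries those tables and registers its result as outputTableName, and ss.outputs reads inputTableName. A minimal sketch of that dataflow in plain Python (this only illustrates the table-registry idea implied by the config; StreamingPro itself runs these stages on Spark):

```python
# Sketch of how the compositor stages hand tables to each other by name.
# Illustration of the config's dataflow, not StreamingPro's actual API.

config = {
    "compositor": [
        {"name": "ss.sources", "params": [
            {"format": "socket", "outputTable": "test"},
            {"format": "com.databricks.spark.csv", "outputTable": "sample"},
        ]},
        {"name": "ss.sql", "params": [
            {"sql": "select city from test left join sample on test.value == sample.name",
             "outputTableName": "test3"},
        ]},
        {"name": "ss.outputs", "params": [
            {"format": "console", "inputTableName": "test3"},
        ]},
    ]
}

def wire(config):
    """Walk the compositor list, recording which stage produces each table
    and checking that consumers only read tables already registered."""
    registry = {}  # table name -> producing stage
    plan = []      # (producer stage, table, consumer stage)
    for stage in config["compositor"]:
        for p in stage["params"]:
            out = p.get("outputTable") or p.get("outputTableName")
            if out:
                registry[out] = stage["name"]
            inp = p.get("inputTableName")
            if inp:
                assert inp in registry, f"unknown table: {inp}"
                plan.append((registry[inp], inp, stage["name"]))
    return registry, plan

registry, plan = wire(config)
print(registry)  # {'test': 'ss.sources', 'sample': 'ss.sources', 'test3': 'ss.sql'}
print(plan)      # [('ss.sql', 'test3', 'ss.outputs')]
```

Running it shows that both sources register tables, ss.sql produces test3 from them, and the console output stage consumes test3.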
The config defines, roughly, a socket source and a sample file: the socket source is a stream, while the sample file is read as a static batch table.
The sample.csv file contains:
id,name,city,age
1,david,shenzhen,31
2,eason,shenzhen,27
3,jarry,wuhan,35
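The ss.sql step is a stream-static left join: each micro-batch of socket lines (the test table, one value column per line) is joined against the static sample table loaded from the CSV. A rough simulation of that per-batch semantics in plain Python (an illustration only; Spark executes this as SQL over micro-batches):

```python
# Simulates the stream-static left join from the ss.sql step:
#   select city from test left join sample on test.value == sample.name
# Illustration of the semantics only, using the sample.csv data above.

import csv, io

SAMPLE_CSV = """\
id,name,city,age
1,david,shenzhen,31
2,eason,shenzhen,27
3,jarry,wuhan,35
"""

# The static side: sample.csv is loaded once and keyed by name.
sample = {row["name"]: row for row in csv.DictReader(io.StringIO(SAMPLE_CSV))}

def join_batch(lines):
    """Left-join one micro-batch of socket lines against the static table."""
    out = []
    for value in lines:  # each socket line becomes one row with a 'value' column
        match = sample.get(value)
        out.append(match["city"] if match else None)  # left join keeps unmatched rows
    return out

# One micro-batch, as if these lines were typed into `nc -lk 9999`:
print(join_batch(["david", "jarry", "nobody"]))  # ['shenzhen', 'wuhan', None]
```

Typing a name that exists in sample.csv yields that person's city; any other line still produces a row, with a null city, because the join is a left join.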
Then run nc -lk 9999 in a terminal to start the socket server.
Next, submit the Spark program:
SHome=/Users/allwefantasy/streamingpro
./bin/spark-submit --class streaming.core.StreamingApp \
--master local[2] \
--name test \
$SHome/streamingpro-spark-2.0-0.4.14-SNAPSHOT.jar \
-streaming.name test \
-streaming.platform spark_structrued_streaming \
-streaming.job.file.path file://$SHome/ss.json