Spark function to explain: checkpoint
Marks the current RDD for checkpointing. Spark writes the RDD's contents as binary files (one per partition) to the checkpoint directory, which must be set beforehand with SparkContext.setCheckpointDir(). Once the checkpoint completes, all references to the RDD's parent RDDs (its lineage) are removed. Checkpointing does not happen at the moment checkpoint() is called; an action must be run on the RDD to trigger it.
Function prototype
def checkpoint(): Unit
scala> val data = sc.parallelize(1 to 100000, 15)
data: org.apache.spark.rdd.RDD[Int] = ParallelCollectionRDD[12] at parallelize at <console>:12

scala> sc.setCheckpointDir("/iteblog")

scala> data.checkpoint

scala> data.count
15/02/15 11:47:47 INFO RDDCheckpointData: Done checkpointing RDD 12 to hdfs://iteblogcluster/iteblog/5f2053e9-a02f-4661-ad1d-2250a8473e92/rdd-12, new parent is RDD 13
res17: Long = 100000

[iteblog.com @ ~]$ bin/hadoop fs -ls /iteblog/5f2053e9-a02f-4661-ad1d-2250a8473e92/rdd-12
Found 15 items
-rw-r--r-- ... 2015-02-15 /iteblog/5f2053e9-a02f-4661-ad1d-2250a8473e92/rdd-12/part-00000
-rw-r--r-- ... 2015-02-15 /iteblog/5f2053e9-a02f-4661-ad1d-2250a8473e92/rdd-12/part-00001
-rw-r--r-- ... 2015-02-15 /iteblog/5f2053e9-a02f-4661-ad1d-2250a8473e92/rdd-12/part-00002
-rw-r--r-- ... 2015-02-15 /iteblog/5f2053e9-a02f-4661-ad1d-2250a8473e92/rdd-12/part-00003
-rw-r--r-- ... 2015-02-15 /iteblog/5f2053e9-a02f-4661-ad1d-2250a8473e92/rdd-12/part-00004
-rw-r--r-- ... 2015-02-15 /iteblog/5f2053e9-a02f-4661-ad1d-2250a8473e92/rdd-12/part-00005
-rw-r--r-- ... 2015-02-15 /iteblog/5f2053e9-a02f-4661-ad1d-2250a8473e92/rdd-12/part-00006
-rw-r--r-- ... 2015-02-15 /iteblog/5f2053e9-a02f-4661-ad1d-2250a8473e92/rdd-12/part-00007
-rw-r--r-- ... 2015-02-15 /iteblog/5f2053e9-a02f-4661-ad1d-2250a8473e92/rdd-12/part-00008
-rw-r--r-- ... 2015-02-15 /iteblog/5f2053e9-a02f-4661-ad1d-2250a8473e92/rdd-12/part-00009
-rw-r--r-- ... 2015-02-15 /iteblog/5f2053e9-a02f-4661-ad1d-2250a8473e92/rdd-12/part-00010
-rw-r--r-- ... 2015-02-15 /iteblog/5f2053e9-a02f-4661-ad1d-2250a8473e92/rdd-12/part-00011
-rw-r--r-- ... 2015-02-15 /iteblog/5f2053e9-a02f-4661-ad1d-2250a8473e92/rdd-12/part-00012
-rw-r--r-- ... 2015-02-15 /iteblog/5f2053e9-a02f-4661-ad1d-2250a8473e92/rdd-12/part-00013
-rw-r--r-- ... 2015-02-15 /iteblog/5f2053e9-a02f-4661-ad1d-2250a8473e92/rdd-12/part-00014
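The lazy behaviour and the lineage truncation described at the top can also be checked from code rather than from the log output. The following is a minimal sketch, not part of the original session: it assumes Spark is on the classpath and runs in local mode, and the object name CheckpointSketch, the application name and the temporary checkpoint directory are all hypothetical choices made for illustration. It uses isCheckpointed, getCheckpointFile and toDebugString to inspect the RDD before and after an action.

```scala
import java.nio.file.Files
import org.apache.spark.{SparkConf, SparkContext}

object CheckpointSketch {
  // Returns (isCheckpointed before the action, isCheckpointed after it, element count).
  def run(): (Boolean, Boolean, Long) = {
    // Assumption: local mode with 2 threads; a real job would target a cluster.
    val conf = new SparkConf().setMaster("local[2]").setAppName("checkpoint-sketch")
    val sc = new SparkContext(conf)
    try {
      // Assumption: a temporary local directory stands in for the HDFS path "/iteblog".
      sc.setCheckpointDir(Files.createTempDirectory("ckpt").toString)

      val data = sc.parallelize(1 to 100000, 15).map(_ * 2)

      // Caching first avoids computing the RDD twice (once for the action,
      // once more when the checkpoint files are written).
      data.cache()

      // Only marks the RDD; nothing is written yet.
      data.checkpoint()
      val before = data.isCheckpointed

      // The action triggers the job and, after it, the checkpoint itself.
      val n = data.count()
      val after = data.isCheckpointed

      // Location of the checkpoint files and the (now truncated) lineage.
      println(data.getCheckpointFile)
      println(data.toDebugString)

      (before, after, n)
    } finally {
      sc.stop()
    }
  }

  def main(args: Array[String]): Unit = {
    val (before, after, n) = run()
    println(s"before=$before after=$after count=$n")
  }
}
```

Calling cache() before checkpoint() is optional, but without it Spark has to recompute the RDD when it writes the checkpoint files.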