I. Operations and maintenance

1. The Master hangs, and restarting the standby does not help

The Master defaults to 512 MB of memory. When the cluster runs a very large number of jobs, the Master hangs: it reads every job's event log to build the Spark UI, so its heap eventually runs out of memory (OOM). You can confirm this in the Master's run log. A standby Master started through HA fails for the same reason.

Solution

Increase the Master's memory. In spark-env.sh on the master node, set:

    export SPARK_DAEMON_MEMORY=10g   # adjust to your actual situation

Reduce the job information kept in the Master's memory:

    spark.ui.retainedJobs 500     # both default to 1000
    spark.ui.retainedStages 500

2. A worker hangs or appears dead

Sometimes a worker node disappears from the web UI or is shown in the DEAD state, and the tasks running on that node report various "lost worker" errors. The cause is the same as above: the worker keeps a large amount of UI information in memory, which leads to long GC pauses, and heartbeats to the Master are lost during those pauses.
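As a rough sketch of how the same idea (more daemon memory, less retained UI state) might be applied on the worker side: the fragment below is an assumption, not something the original post spells out. The property names `spark.worker.ui.retainedExecutors` and `spark.worker.ui.retainedDrivers` come from Spark's configuration documentation (both default to 1000), and `SPARK_WORKER_OPTS` is the documented way to pass `-Dx=y` properties to the worker daemon. The values 2g and 200 are purely illustrative and should be tuned for your cluster.

```shell
# spark-env.sh on each worker node (illustrative values, not from the original post)

# Heap for the worker daemon itself, not for the executors it launches.
export SPARK_DAEMON_MEMORY=2g

# Keep fewer finished executors and drivers in the worker's UI state,
# so the daemon holds less in memory and spends less time in GC.
export SPARK_WORKER_OPTS="-Dspark.worker.ui.retainedExecutors=200 -Dspark.worker.ui.retainedDrivers=200"
```

The worker daemon has to be restarted for these settings to take effect. If workers still drop out, the worker's GC log is the place to check whether pauses are long enough to miss heartbeats.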