Upgrading a Running Spark Streaming Application with Code Changes with Zero-Data-Loss
Spark Streaming is one of the most reliable (near) real-time processing solutions available in the streaming world today. It offers ease of development and continuously evolving features, backed by very good community support. One of the most important concerns for any streaming solution is how it handles failure situations, such as the application going down due to an issue or error. Because the data is real-time and keeps arriving even while the application is down, it is really important to process the backlog data before processing current data, so that nothing is lost once the application comes back to life. Together with Kafka, a distributed messaging system capable of replaying past data, Spark Streaming handles failure situations quite nicely. Restarts are not limited to failures, though: upgrading a running streaming application with code changes is another common reason to restart, and doing so with zero data loss is the subject of this post.
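As a starting point, the standard way to let a Spark Streaming application recover from a failure (or a plain restart) without losing data is to drive it through a checkpoint directory. The sketch below is a minimal outline under assumed names (`checkpointDir`, the app name, the batch interval are all placeholders), not the full upgrade procedure discussed in this post:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object RecoverableApp {
  // Hypothetical checkpoint location; in production this would be
  // a fault-tolerant store such as HDFS or S3.
  val checkpointDir = "/tmp/streaming-checkpoint"

  // Builds a fresh StreamingContext; only invoked when no checkpoint exists.
  def createContext(): StreamingContext = {
    val conf = new SparkConf().setAppName("recoverable-app")
    val ssc  = new StreamingContext(conf, Seconds(10))
    ssc.checkpoint(checkpointDir)
    // ... define the Kafka input stream and processing logic here ...
    ssc
  }

  def main(args: Array[String]): Unit = {
    // On restart, getOrCreate rebuilds the context (including pending
    // batches and stream state) from the checkpoint instead of calling
    // createContext, so backlog data is processed before new data.
    val ssc = StreamingContext.getOrCreate(checkpointDir, createContext _)
    ssc.start()
    ssc.awaitTermination()
  }
}
```

One important caveat, and the motivation for what follows: checkpoints serialize the application's code along with its state, so a restart with changed code generally cannot deserialize an old checkpoint. That is exactly why upgrading with code changes needs a different approach than plain checkpoint recovery.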