What improvements does Apache Spark 2 bring compared to Apache Spark 1.x?
- From an architecture perspective
- From an application point of view
- or more
The Apache Spark 2.0.0 APIs have stayed largely similar to 1.x, but Spark 2.0.0 does include API-breaking changes.
Apache Spark 2.0.0 is the first release on the 2.x line. The major updates are API usability, SQL 2003 support, performance improvements, structured streaming, R UDF support, as well as operational improvements.
New in Spark 2:
You can go through the Spark 2.0.0 release notes, where the updates in the following areas are explained:
There is not much difference in architecture, as the core is still the DAG and the RDD, which is the most important part of it.
Spark 2.0 is, however, much more optimized and has the Dataset API, which puts more power in the hands of developers. So I would say the architecture is the same; Spark 2.0 simply provides better optimization and a richer set of APIs.
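To make the Dataset API point concrete, here is a minimal sketch in Scala. It assumes a Spark 2.x dependency on the classpath and a local master; the `Person` case class, the `DatasetSketch` object, the `adults` helper, and the sample rows are hypothetical names made up for illustration. It uses `SparkSession`, the entry point introduced in 2.0, and shows a typed filter whose field access is checked at compile time.

```scala
import org.apache.spark.sql.{DataFrame, Dataset, SparkSession}

// Hypothetical record type, used only for this illustration.
// It is declared at top level so Spark can derive an encoder for it.
case class Person(name: String, age: Int)

object DatasetSketch {
  // Typed transformation: `p.age` is checked by the compiler,
  // unlike a column name passed as a string.
  def adults(people: Dataset[Person]): Dataset[Person] =
    people.filter(p => p.age >= 18)

  def main(args: Array[String]): Unit = {
    // SparkSession is the unified entry point in Spark 2.x.
    val spark = SparkSession.builder()
      .appName("dataset-sketch") // hypothetical application name
      .master("local[*]")        // assumption: run locally for the demo
      .getOrCreate()

    // Brings in the implicit encoders and the toDS() conversion.
    import spark.implicits._

    // Made-up sample data.
    val people: Dataset[Person] =
      Seq(Person("Ann", 34), Person("Bob", 12)).toDS()

    val result = adults(people)
    result.show()

    // In Spark 2.x a DataFrame is an alias for Dataset[Row],
    // so moving to the untyped API is a single call.
    val df: DataFrame = result.toDF()
    df.printSchema()

    spark.stop()
  }
}
```

The same job can still be written against RDDs, which fits the point above that the underlying execution model has not changed; the Dataset version mainly adds compile-time types and lets the optimizer see the structure of the data.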
These are the main things that are provided by Apache Spark 2.0:
For more information, please take a look here: https://www.quora.com/What-are-special-features-and-advantages-of-Apache-Spark-2-0-over-earlier-versions