Solving Analytical Problems using Apache Spark
In this talk, we will explore why Spark has become a more prominent solution than Hadoop alone. We will look at MapReduce and at how Spark makes Big Data algorithms simpler to write and faster to run. Next, we will explore the Spark Context and how Resilient Distributed Datasets (RDDs) help build a Directed Acyclic Graph (DAG) of operations: transformations such as map and filter, and actions such as collect, count and reduce. Later, we will explore the Spark Cassandra connector, the Spark API and Spark SQL. We will also discuss how DataStax helps give a high level of stability to the open source Apache Spark and Apache Cassandra projects. The key takeaway for developers and architects is an understanding of how Apache Spark and Apache Cassandra help in implementing enterprise-level analytical solutions. For some in-memory workloads, Spark has been reported to run up to 100x faster than Hadoop MapReduce.
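To make the distinction between transformations and actions concrete, here is a minimal pure-Python sketch. It does not use Spark: `ToyRDD` is a hypothetical stand-in, written only for illustration, that records map and filter steps as a linear lineage (a very simple DAG) and evaluates them only when an action such as collect, count or reduce is called. Real RDDs are also partitioned across a cluster and fault tolerant, which this sketch ignores.

```python
from functools import reduce as _reduce


class ToyRDD:
    """Hypothetical stand-in for an RDD: lazy transformations, eager actions."""

    def __init__(self, data, lineage=()):
        self._data = list(data)    # source records
        self._lineage = lineage    # recorded (kind, function) steps, not yet run

    # Transformations: return a new ToyRDD with a longer lineage; nothing runs.
    def map(self, fn):
        return ToyRDD(self._data, self._lineage + (("map", fn),))

    def filter(self, pred):
        return ToyRDD(self._data, self._lineage + (("filter", pred),))

    # Internal helper: replay the recorded lineage over the source data.
    def _compute(self):
        items = iter(self._data)
        for kind, fn in self._lineage:
            items = map(fn, items) if kind == "map" else filter(fn, items)
        return items

    # Actions: trigger the computation and return a plain Python value.
    def collect(self):
        return list(self._compute())

    def count(self):
        return sum(1 for _ in self._compute())

    def reduce(self, fn):
        return _reduce(fn, self._compute())


numbers = ToyRDD(range(1, 11))

# Two transformations: only the lineage grows, no element is touched yet.
even_squares = numbers.map(lambda x: x * x).filter(lambda x: x % 2 == 0)

# Three actions: each one replays the lineage over the source data.
collected = even_squares.collect()
n = even_squares.count()
total = even_squares.reduce(lambda a, b: a + b)
```

In PySpark, the RDD methods of the same names (map, filter, collect, count, reduce) follow the same pattern: transformations are lazy and only describe the computation, while actions trigger execution of the DAG.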