Home
Awantik Das edited this page Mar 17, 2017 · 3 revisions
- What is Apache Spark?
- Spark Jobs and APIs
- Review of Resilient Distributed Datasets (RDDs), DataFrames, and Datasets
- Review of Catalyst Optimizer and Project Tungsten
- Review of the Spark 2.0 architecture
- Internal workings of an RDD
- Creating RDDs
- Global versus local scopes
- Transformations
- Actions
- Understanding Spark
- Spark Jobs & API
- Architecture
- RDD Internals
- Creating RDD
- Understanding Deployment & Program Behaviour
- RDD Transformation & Action
- Assignments 1
- Best Practices 1
- Introduction to DataFrame
- PySpark SQL
- Pandas to DataFrames
- Machine Learning with PySpark
- Transformers
- Estimators
- Spark Streaming
- Structured Streaming
- GraphX & GraphFrames
- Data Processing Architectures
- Problems