Big Data

Introduction to Advanced Spark



About this course

Apache Spark, the unified analytics engine, has seen rapid adoption by enterprises across a wide range of industries. Internet powerhouses such as Netflix, Yahoo, and eBay have deployed Spark at massive scale, collectively processing multiple petabytes of data on clusters of over 8,000 nodes. This course begins with Spark configuration and Spark properties. It is designed for anyone with a general idea of programming, and its easy-to-follow code and intuitive, hands-on explanations make the concepts easier to understand. The course covers important and frequently used concepts in data processing, such as memory tuning, cluster management, and application scheduling. It also covers parallel programming in detail, which is the most important concept to grasp when learning distributed systems.

Skills covered

  • Spark properties
  • Spark job scheduling

Course Syllabus

Introduction to Advanced Spark

  • Spark configuration
  • Spark properties
  • Performance tuning
  • Data serialization
  • Memory tuning
  • Garbage collection
  • Level of parallelism and memory usage
  • Broadcasting and data locality
  • Job scheduling
  • Modes in cluster management
  • Dynamic resource allocation
  • Graceful decommission of executors
  • Scheduling within an application

Course Certificate

Get a course completion certificate for Introduction to Advanced Spark from Great Learning, which you can share in the Certifications section of your LinkedIn profile, on printed resumes, CVs, or other documents.

