Introduction to Hadoop



About this course

Data is everywhere. People upload videos, take pictures, use apps on their phones, search the web, and more. Machines, too, are generating and storing more and more data. Existing tools struggle to process data sets of this size, so Hadoop and large-scale distributed data processing in general are rapidly becoming an important skill set for many programmers. Hadoop is an open-source framework for writing and running distributed applications that process large amounts of data.

This course introduces Hadoop both as a distributed system and as a data processing system. You will get an overview of the MapReduce programming model through a simple word count example, first using existing tools to highlight the challenges of processing data at large scale, then implementing the same example in Hadoop to appreciate its simplicity.
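To give a feel for the word count example before the course reaches Hadoop itself, here is a minimal sketch of the MapReduce idea in plain Python. It runs on one machine and does not use any Hadoop API; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are illustrative assumptions, not part of the course material.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle step: group all emitted values by their key (the word)."""
    grouped = defaultdict(list)
    for word, count in pairs:
        grouped[word].append(count)
    return grouped


def reduce_phase(word, counts):
    """Reduce step: sum the counts collected for a single word."""
    return word, sum(counts)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over an iterable of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(word, counts) for word, counts in grouped.items())


if __name__ == "__main__":
    sample = ["the quick fox", "The lazy dog"]
    print(word_count(sample))
```

In a real cluster the map and reduce steps run in parallel on different machines and the framework performs the grouping between them; this sketch only shows how the three steps fit together.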

Skills covered

  • Different techniques of big data analytics using Hadoop
  • The importance of distributed data storage systems

Course Syllabus

Introduction to Hadoop

  • Introduction to Big Data / Hadoop
  • Hadoop Distributed File System (HDFS)
  • Intro to ETL
  • Distributed computing
  • MapReduce abstraction
  • Programming MapReduce jobs
  • Introduction to Oozie and HDFS processing
  • Hadoop cluster and ecosystem
  • Input/output formats and conversion between different formats
  • MapReduce features
  • Troubleshooting MapReduce jobs
  • YARN (Hadoop 2.0)

Course Certificate

Get an Introduction to Hadoop course completion certificate from Great Learning, which you can share in the Certifications section of your LinkedIn profile and on printed resumes, CVs, or other documents.

