In Hadoop version 1.0, known as MRV1 (MapReduce Version 1), MapReduce handled both data processing and resource management. It had a single master called the Job Tracker. The Job Tracker allocated resources, scheduled jobs, and monitored the processing jobs. It assigned map and reduce tasks to several subordinate processes called Task Trackers. The Task Trackers periodically reported their progress to the Job Tracker.
This design resulted in a scalability bottleneck because of the single Job Tracker. In its article, IBM noted that, according to Yahoo!, the practical limits of such a design are reached with a cluster of 5,000 nodes and 40,000 tasks running concurrently. Besides this limitation, the utilization of computational resources was inefficient in MRV1. Moreover, the Hadoop framework was restricted to the MapReduce processing model alone.
To overcome all these problems, YARN was introduced in Hadoop version 2.0 in 2012 by Yahoo and Hortonworks. YARN's central idea is to relieve MapReduce by taking over the responsibilities of resource management and job scheduling. YARN gave Hadoop the ability to run non-MapReduce jobs within the Hadoop framework.
Introduction to Hadoop YARN
Now that I have explained the need for YARN, let me introduce you to the core component of Hadoop v2.0, YARN. YARN enables various data processing methods like graph processing, interactive processing, stream processing, and batch processing to run on and process data stored in HDFS. Hence, YARN opens up Hadoop to other types of distributed applications beyond MapReduce.
Besides resource management, YARN also performs job scheduling. YARN carries out all your processing activities by allocating resources and scheduling tasks. The Apache Hadoop YARN architecture consists of the following main components:
- Resource Manager: Runs as a master daemon and manages resource allocation across the cluster.
- Node Manager: Runs as a slave daemon and is responsible for executing tasks on each individual Data Node.
- Application Master: Manages the user job lifecycle and the resource needs of individual applications. It works with the Node Manager and monitors the execution of tasks.
- Container: A package of resources, including RAM, CPU, network, HDD, etc., on a single node.
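To make the container idea concrete, here is a minimal, self-contained sketch of how a scheduler might track per-node resources when granting containers. This is a toy Python model for illustration only; the class names (`Container`, `NodeResources`) and fields are assumptions, not the real Hadoop YARN API.

```python
from dataclasses import dataclass

@dataclass
class Container:
    """A requested bundle of resources on one node (toy model)."""
    memory_mb: int   # RAM to grant to the container
    vcores: int      # virtual CPU cores to grant

@dataclass
class NodeResources:
    """Free capacity that a Node Manager reports for its node (toy model)."""
    memory_mb: int
    vcores: int

    def can_fit(self, c: Container) -> bool:
        # A container can only be placed if the node still has
        # enough free memory and CPU.
        return self.memory_mb >= c.memory_mb and self.vcores >= c.vcores

    def allocate(self, c: Container) -> bool:
        # Deduct the container's share from the node's free capacity,
        # mimicking the bookkeeping a scheduler performs.
        if not self.can_fit(c):
            return False
        self.memory_mb -= c.memory_mb
        self.vcores -= c.vcores
        return True

node = NodeResources(memory_mb=8192, vcores=4)
print(node.allocate(Container(memory_mb=2048, vcores=1)))  # True
print(node.memory_mb)                                      # 6144
```

A request that exceeds the remaining capacity is simply refused, which is why, on a real cluster, oversized container requests leave applications waiting for resources.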
Components of YARN
The workflow in Hadoop YARN
Refer to the given image and observe the following steps involved in the application workflow of Apache Hadoop YARN:
- The client submits an application
- The Resource Manager allocates a container to start the Application Master
- The Application Master registers with the Resource Manager
- The Application Master requests containers from the Resource Manager
- The Application Master notifies the Node Manager to launch the containers
- The application code is executed in the container
- The client contacts the Resource Manager/Application Master to monitor the application's status
- The Application Master unregisters with the Resource Manager
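The steps above can be walked through with a small simulation. The sketch below is a toy Python model of the workflow, not the real Hadoop API; every class and method name here (`ResourceManager`, `submit_application`, and so on) is an assumption made for illustration.

```python
class ResourceManager:
    """Toy stand-in for the Resource Manager (hypothetical API)."""
    def __init__(self):
        self.status = {}          # app_id -> lifecycle state
        self.next_container = 0

    def submit_application(self, app_id):
        # Steps 1-2: the client submits; the RM grants a container
        # in which the Application Master is started.
        self.status[app_id] = "ACCEPTED"
        return ApplicationMaster(app_id, self)

    def register_am(self, app_id):
        # Step 3: the Application Master registers with the RM.
        self.status[app_id] = "RUNNING"

    def allocate_containers(self, n):
        # Step 4: the AM requests worker containers from the RM.
        ids = [f"container_{self.next_container + i}" for i in range(n)]
        self.next_container += n
        return ids

    def unregister_am(self, app_id):
        # Step 8: the AM unregisters once the application finishes.
        self.status[app_id] = "FINISHED"


class NodeManager:
    """Toy stand-in for a Node Manager."""
    def launch(self, container_id):
        # Steps 5-6: the NM launches the container and the
        # application code runs inside it.
        return f"{container_id}: done"


class ApplicationMaster:
    """Toy stand-in for a per-application Application Master."""
    def __init__(self, app_id, rm):
        self.app_id, self.rm = app_id, rm

    def register(self):
        self.rm.register_am(self.app_id)

    def request_containers(self, n):
        return self.rm.allocate_containers(n)

    def unregister(self):
        self.rm.unregister_am(self.app_id)


rm, nm = ResourceManager(), NodeManager()
am = rm.submit_application("app_0001")   # steps 1-2
am.register()                            # step 3
for c in am.request_containers(2):       # step 4
    print(nm.launch(c))                  # steps 5-6
print(rm.status["app_0001"])             # step 7: client polls -> RUNNING
am.unregister()                          # step 8
print(rm.status["app_0001"])             # FINISHED
```

Note how the Resource Manager only tracks high-level state while the per-application logic lives in the Application Master; that separation is exactly what removed the single Job Tracker bottleneck of MRV1.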