Contact us on +(44) 74484 56791 info@globaltnc.com

Hadoop Online Training

Course Duration: 35 hours

Hadoop is a free, open-source, Java-based programming framework that supports the processing of large data sets in a distributed computing environment. It is designed to scale from a single server to thousands of machines with a high degree of fault tolerance, which makes it practical to run applications on clusters of thousands of nodes holding thousands of terabytes of data. Its main advantage is that businesses and organizations can now find value in data that, until recently, was considered useless.

Hadoop Course Content

INTRODUCTION

  • What is Hadoop?
  • History of Hadoop
  • Building Blocks – Hadoop Eco-System
  • Who is behind Hadoop?
  • What Hadoop is good for, and why

HDFS

  • Configuring HDFS
  • Interacting With HDFS
  • HDFS Permissions and Security
  • Additional HDFS Tasks
  • HDFS Overview and Architecture
  • HDFS Installation
  • Hadoop File System Shell
  • File System Java API

MAPREDUCE

  • Map/Reduce Overview and Architecture
  • Installation
  • Developing Map/Reduce Jobs
  • Input and Output Formats
  • Job Configuration
  • Job Submission
  • Practicing Map/Reduce Programs (at least 10 Map/Reduce algorithms)

Getting Started With Eclipse IDE

  • Configuring Hadoop API on Eclipse IDE
  • Connecting Eclipse IDE to HDFS

Hadoop Streaming

Advanced MapReduce Features

  • Custom Data Types
  • Input Formats
  • Output Formats
  • Partitioning Data
  • Reporting Custom Metrics
  • Distributing Auxiliary Job Data
  • Distributing Debug Scripts
  • Using Yahoo Web Services

Pig

  • Pig Overview
  • Installation
  • Pig Latin
  • Pig with HDFS

Hive

  • Hive Overview
  • Installation
  • Hive QL
  • Hive Unstructured Data Analysis
  • Hive Semi-structured Data Analysis

HBase

  • HBase Overview and Architecture
  • HBase Installation
  • HBase Shell
  • CRUD operations
  • Scanning and Batching
  • Filters
  • HBase Key Design

ZooKeeper

  • ZooKeeper Overview
  • Installation
  • Server Maintenance

Sqoop

  • Sqoop Overview
  • Installation
  • Imports and Exports

CONFIGURATION

  • Basic Setup
  • Important Directories
  • Selecting Machines
  • Cluster Configurations
  • Small Clusters: 2-10 Nodes
  • Medium Clusters: 10-40 Nodes
  • Large Clusters: Multiple Racks

Integrations: Putting It All Together

  • Distributed installations
  • Best Practices
