Hadoop for Business Analysts

Duration: 3 Days
Course Price: $2,195

Overview

Apache Hadoop is the most popular framework for processing Big Data. Hadoop provides rich and deep analytics capability, and it is making inroads into the traditional BI analytics world. This course introduces analysts to the core components of the Hadoop ecosystem and its analytics.

Audience

  • Business Analyst
  • Data Analyst

Objectives

  • Understanding the Hadoop ecosystem
  • Data storage using HDFS
  • ETL using Pig
  • Data warehousing and querying using Hive

 

Prerequisites

  • programming background with databases / SQL
  • basic knowledge of Linux (ability to navigate the Linux command line and edit files with vi / nano)

Outline

1: Quick primer on Hadoop / HDFS / MapReduce

  • Hadoop ecosystem
      • distributions
      • high-level architecture
      • hardware / software
      • Lab: first look at Hadoop
  • HDFS overview
      • concepts (horizontal scaling, replication, data locality)
      • architecture (NameNode, DataNode)
      • Demo: interacting with HDFS
  • MapReduce overview
      • MapReduce concepts
      • YARN operating system
      • Demo: running a MapReduce program
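
For a feel of the MapReduce concepts listed above, the following is a minimal single-machine sketch in plain Python. It is not Hadoop code and not taken from the course materials; the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical and chosen only for illustration. It mimics the three conceptual stages of a word count: map emits key/value pairs, shuffle groups them by key, and reduce aggregates each group.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group values by key, standing in for the framework's shuffle step."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all counts for one word into a single total."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence on an in-memory list of lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    sample = ["big data big insights", "data drives insights"]
    print(word_count(sample))
```

On a real cluster the map and reduce steps would run in parallel on many nodes, close to where the data blocks are stored; this sketch only shows the shape of the computation.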

 

2: Hive

  • Hive concepts & architecture
  • SQL support in Hive
  • Data warehousing in Hive
  • data types
  • table creation and queries
  • partitions
  • joins
  • text analytics
  • Labs (multiple): creating Hive tables and running queries, joins, using partitions, using text analytics functions

 

3: Pig

  • Pig concepts and architecture
  • Pig Latin language
  • understanding Pig job flow
  • basic data analysis with Pig
  • data cleanup
  • ETL workloads with Pig
  • joins and multiple datasets with Pig
  • user-defined functions
  • debugging Pig scripts
  • Lab: writing Pig scripts to analyze / transform data

 

4: BI Tools for Hadoop

  • BI tools and Hadoop
  • Overview of current BI tools landscape
  • Choosing the best tool for the job

 

 
