Hive Essentials

Duration: 1 Day
Course Price: $1,595

Overview

Hive is a system for querying and managing structured data, built on top of Hadoop. It uses Map-Reduce for execution and HDFS for storage, and it represents data with rich data types (structs, lists, and maps). Hive lets you directly query data stored in different encodings (text or binary) and file formats (flat or sequence files), using SQL as a familiar tool for standard analytics. For non-standard applications, Hive can be extended with embedded scripts, and its rich metadata supports data discovery and optimization. This comprehensive one-day Hive training class gives you the skills you need to start using Hive in your project.

Audience

· Software Engineers
· Administrators of Hadoop/Hive
· Data Analysts

 

Objectives

· Understand the main concepts of using Hive
· Create Hive’s native and external tables
· Write SQL queries and learn some optimization techniques
· Debug and resolve issues
· Write pluggable Map-Reduce scripts
· Learn important settings and some administration tasks

Pre-requisites

· Working knowledge of SQL
· Some knowledge of scripting languages
· Basic understanding of the Linux operating system

1: Why Hive instead of regular Map-Reduce?

o History

o Definitions and terminology

2: Hive’s architecture and functionality

o Services and interoperability with Hadoop

o Query processor

 

3: Hive’s metadata

o Creating new tables

o Partitioned tables

o Dynamic partitions

o Tables with different serialization and encoding formats

 

4: Writing complex Hive queries

o Different kinds of joins

o Embedding custom scripts

 

5: Administration of running Hive queries

o Hadoop permissions and groups

o Enabling job scheduling and prioritization strategies

o Setting controls on shared resources

o Production-quality storage for Hive’s metadata and its backup

o Overview of tools for job control flow

 

6: Advanced Hive functionality

o Writing embedded Map/Reduce scripts

o Considerations: Map vs. Reduce, RAM vs. writes

o Writing embedded Java UDFs and UDAFs

Case studies and best practices
