Hadoop Development, Administration & BI - 0 to 100

Development
Data Analysis: Master the data analysis tools like Pig and Hive
Data Science: Build a recommendation engine

Hadoop Development - 0 to 100

Basics: Learn the basics of Big Data and Hadoop
Hands On: Play with Hadoop and the Hadoop ecosystem
Development: Become a top-notch Hadoop developer

Hadoop Development, Administration and BI Program - 0 to 100 (60 Hours)

Overview of the course:
The Hadoop Development, Administration and BI Program is a one-stop course that introduces you to the domain of Hadoop development and gives you the technical know-how to work in it. By the end of the course you will be able to earn a Hadoop professional credential and will be capable of handling terabyte-scale data and analyzing it successfully using MapReduce.

Who is this course for, and who is it not for?
For: Professionals with basic knowledge of software development, programming languages, and databases will find this course really helpful. Basic knowledge is enough to succeed in this course.
Not for: Absolute beginners in software development as a discipline will find the course difficult to follow.

Phase 1: Hadoop Fundamentals (20 Hours)
Getting the Basics Right

Big Data
- What is Big Data
- Dimensions of Big Data
- Big Data in Advertising
- Big Data in Banking
- Big Data in Telecom
- Big Data in eCommerce
- Big Data in Healthcare
- Big Data in Defense
- Processing options of Big Data
- Hadoop as an option

Hadoop
- What is Hadoop
- How Hadoop Works
- HDFS
- MapReduce
- How Hadoop has an edge

Hadoop Ecosystem
- Sqoop
- Oozie
- Pig
- Hive
- Flume

Hadoop Hands On
- Setting up Hadoop on a single-node cluster
- Running HDFS commands
- Running your MapReduce program
- Running Sqoop Import and Sqoop Export
- Creating Hive tables directly from Sqoop
- Creating Hive tables
- Querying Hive tables
- Running an Oozie workflow
- Analyzing Twitter data using Flume

Multinode Setup
- Setting up a multinode cluster on Amazon EC2
- Setting up a multinode cluster on the classroom machines
- Setting up Cloudera Manager on the cloud
- Setting up Cloudera Manager on a local setup

Cluster Capacity Planning
Level 1: Mini Project
Level 1: Evaluation Test (50 marks)
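As a rough illustration of the "How Hadoop Works" and MapReduce topics in this phase, the sketch below simulates the map, shuffle, and reduce steps of a word count in plain Python. It does not use Hadoop itself; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample input are assumptions made for this example only.

```python
from collections import defaultdict


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group emitted values by key, the step a framework runs between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reducer: sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines):
    """Run the three steps in sequence over a list of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    # Hypothetical sample input, not taken from the course material.
    sample = ["big data needs hadoop", "hadoop stores big data in HDFS"]
    print(word_count(sample))
```

In a real cluster the mapper and reducer run on different nodes and the shuffle moves data across the network; here everything runs in one process so that the data flow is easy to follow.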


Phase 2: Hadoop Development Program (16 hours)
Become a Pro developer

Advanced MapReduce
- MapReduce Code Walkthrough
- ToolRunner
- MRUnit
- Distributed Cache
- Combiner
- Partitioner
- Setup and Cleanup methods
- Using the Java API to access HDFS
- Map-side joins
- Reduce-side joins
- Input Types in MapReduce
- Output Types in MapReduce
- Custom Input Data types
- Custom Output Data types
- Multiple reducer MR program
- Zero Reducer Mapper

Advanced MapReduce Hands On
- MRUnit hands on
- Distributed Cache hands on
- Partitioner hands on
- Combiner hands on
- Accessing files using the HDFS API hands on
- Map-side joins hands on
- Reduce-side joins hands on

MapReduce Design Patterns
- Searching
- Sorting
- Filtering
- Inverted Index
- TF-IDF
- Word Co-occurrence

MapReduce Design Patterns Hands On
- Searching hands on
- Sorting hands on
- Filtering hands on
- Inverted Index hands on
- TF-IDF hands on
- Word Co-occurrence hands on

Evaluation Test (50 marks)
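To give a feel for one of the design patterns listed in this phase, here is a minimal sketch of the Inverted Index pattern in plain Python. It mimics the map and reduce steps without using Hadoop; the function names (`map_document`, `build_inverted_index`) and the sample documents are assumptions made for illustration, not course code.

```python
from collections import defaultdict


def map_document(doc_id, text):
    """Mapper: emit (term, doc_id) once per distinct term in the document."""
    return [(term, doc_id) for term in sorted(set(text.lower().split()))]


def build_inverted_index(documents):
    """Group mapper output by term and reduce each group to a sorted posting list."""
    postings = defaultdict(set)
    for doc_id, text in documents.items():
        for term, emitted_id in map_document(doc_id, text):
            postings[term].add(emitted_id)
    return {term: sorted(ids) for term, ids in postings.items()}


if __name__ == "__main__":
    # Hypothetical documents keyed by an assumed document id.
    docs = {"d1": "hadoop stores data", "d2": "pig and hive query data"}
    for term, ids in sorted(build_inverted_index(docs).items()):
        print(term, ids)
```

The resulting index maps each term to the documents that contain it, which is the lookup structure that the Searching and TF-IDF patterns typically build on.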

           

Phase 3: Hadoop BI (16 hours)
Analyze data using Pig and Hive

Pig
- Introduction
- Basic Data Analysis
- Complex Data Analysis
- Multi Data Set Analysis
- UDFs in Pig
- Troubleshooting and Optimizing Pig
- Pig Hands On

Hive
- Introduction
- Basic Data Analysis with Hive
- Hive Data Management
- Text Processing with Hive
- Transformations in Hive
- Optimizing Hive
- Hive Hands On

Data Analysis Using Pentaho as an ETL tool
- Introduction
- Setting up Pentaho
- Loading Data to HDFS
- Loading Data to Hive
- Aggregation through MapReduce
- Transforming Data with Hive
- Transforming Data with Pig
- Loading Data from HDFS to RDBMS
- Loading Data from Hive to RDBMS
- Reporting on HDFS Data
- Reporting on Hive Data

Evaluation Test

Phase 4: Hadoop Administration (8 hours)
Master Hadoop Administration

Scheduling in Hadoop
- FIFO Scheduling
- Fair Scheduling

Cluster Monitoring
- Basic Monitoring
- Log Management
- Using Ganglia for monitoring

Cluster Maintenance
- Cluster Upgrades
- Failover Mechanism

Hands On
60 Mark Evaluation

Note: The 60 hours are split into 40 hours of classroom training and 20 hours of hands-on assignments.

Trainer Profile

About the trainer
- Experienced: 8+ years of enterprise software development experience
- Certified: Hadoop, HBase, and MapR certified
- Customers: Served customers like Accenture, HP, Genpact, Mastek, and Cisco

Trainer's Certifications
- CCAH, CCHD, CCHSB
- MapR M5
- Zend
- SCJP
- SCWCD
