What you'll learn

  • HDFS
  • Hive
  • Oozie & Sqoop
  • Scala
  • Spark
  • MLlib & GraphX

Why Learn Big Data Masters At iNeuron?

iNeuron faculty are industry professionals who have worked as big data engineers in leading MNCs. Learn how to process your big data quickly using Hadoop and Spark.

  • Students: 1400+
  • Duration: 6 months
  • Major Projects: 3+
  • Minor Projects: 10+
  • Mode: Online & Offline
  • Job Assistance: 100%
  • Certificate: Yes
  • Mock Interviews: Yes

Course Curriculum

  • Why is Data So Important?
  • Pre-requisite – Data Scale
  • What is Big Data?
  • Big Bank: Big Challenge
  • Common Problems
  • 3 Vs of Big Data
  • Defining Big Data
  • Sources of Data Flood
  • Exploding Data Problem
  • Redefining the Challenges of Big Data
  • Possible Solutions: Scaling Up Vs. Scaling Out
  • Challenges of Scaling Out
  • Solution for Data Explosion: Hadoop
  • Hadoop: Introduction
  • Hadoop In Layman’s Term
  • Hadoop Ecosystem
  • Evolutionary Features of Hadoop
  • Hadoop Timeline
  • Why Learn Big Data Technologies?
  • Who Is Using Big Data?
  • HDFS: Introduction
  • Design Of HDFS
  • HDFS Blocks
  • Components of Hadoop 1.X
  • NameNode And Hadoop Cluster
  • Arrangement of Racks
  • Arrangement of Machines and Racks
  • Local FS And HDFS

  • NameNode
  • Checkpointing
  • Replica Placement
  • Benefits-Replica Placement and Rack Awareness
  • URI
  • URL and URN
  • HDFS Commands
  • Problems with HDFS in Hadoop 1.X
  • HDFS Federation (Included in Hadoop 2.X)
  • HDFS Federation
  • High Availability
  • Anatomy of File Read From HDFS
  • Data Read Steps
  • Important Java Classes to Write Data To HDFS
  • Anatomy of File Write To HDFS
  • Writing File To HDFS: Steps
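
As a taste of the file-write anatomy and the Java classes listed above, here is a minimal sketch (not part of the official course material) that writes to and reads from HDFS using the Hadoop FileSystem API from Scala, the language used later in the course; the NameNode URI and file path are placeholders.

```scala
import java.net.URI
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

object HdfsWriteSketch {
  def main(args: Array[String]): Unit = {
    val conf = new Configuration()
    // "hdfs://localhost:9000" is a placeholder NameNode URI; use your cluster's value.
    val fs = FileSystem.get(new URI("hdfs://localhost:9000"), conf)

    // fs.create returns an FSDataOutputStream, one of the classes this module covers.
    val out = fs.create(new Path("/user/demo/hello.txt"))
    out.writeBytes("Hello, HDFS!\n")
    out.close()

    // Read it back through an FSDataInputStream.
    val in = fs.open(new Path("/user/demo/hello.txt"))
    scala.io.Source.fromInputStream(in).getLines().foreach(println)
    in.close()
    fs.close()
  }
}
```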

  • Building Principles
  • Introduction to MapReduce
  • MR Demo
  • Pseudo Code
  • Mapper Class
  • Reducer Class
  • Driver Code
  • Input Split
  • Input Split And Data Blocks
  • Difference
  • Why Is the Block Size 128 MB?
  • RecordReader
  • InputFormat
  • Default InputFormat: TextInputFormat
  • OutputFormat
  • Using A Different OutputFormat
  • Important Points
  • Partitioner
  • Using Partitioner
  • Map Only Job
  • Flow of Operations in MapReduce
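
For the Mapper Class, Reducer Class and Driver Code items above, a hedged word-count sketch against the Hadoop MapReduce API is shown below. It is written in Scala to stay consistent with the rest of the examples on this page (the course demos may well use Java), and class names and paths are illustrative.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path
import org.apache.hadoop.io.{IntWritable, LongWritable, Text}
import org.apache.hadoop.mapreduce.{Job, Mapper, Reducer}
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat
import scala.collection.JavaConverters._

// Mapper: emit (word, 1) for every token in the input split.
class TokenMapper extends Mapper[LongWritable, Text, Text, IntWritable] {
  private val one  = new IntWritable(1)
  private val word = new Text()
  override def map(key: LongWritable, value: Text,
                   ctx: Mapper[LongWritable, Text, Text, IntWritable]#Context): Unit =
    value.toString.split("\\s+").filter(_.nonEmpty).foreach { w =>
      word.set(w); ctx.write(word, one)
    }
}

// Reducer: sum the counts emitted for each word.
class SumReducer extends Reducer[Text, IntWritable, Text, IntWritable] {
  override def reduce(key: Text, values: java.lang.Iterable[IntWritable],
                      ctx: Reducer[Text, IntWritable, Text, IntWritable]#Context): Unit =
    ctx.write(key, new IntWritable(values.asScala.map(_.get).sum))
}

// Driver: wire the mapper, reducer and input/output paths together.
object WordCountDriver {
  def main(args: Array[String]): Unit = {
    val job = Job.getInstance(new Configuration(), "word count")
    job.setJarByClass(getClass)
    job.setMapperClass(classOf[TokenMapper])
    job.setReducerClass(classOf[SumReducer])
    job.setOutputKeyClass(classOf[Text])
    job.setOutputValueClass(classOf[IntWritable])
    FileInputFormat.addInputPath(job, new Path(args(0)))   // input dir
    FileOutputFormat.setOutputPath(job, new Path(args(1))) // output dir (must not exist)
    System.exit(if (job.waitForCompletion(true)) 0 else 1)
  }
}
```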

  • Serialization in MapReduce
  • Custom Writable in MapReduce
  • Custom WritableComparable In MapReduce
  • Schedulers In YARN
  • FIFO Scheduler
  • Capacity Scheduler
  • Fair Scheduler
  • Differences Between Hadoop 1.X And Hadoop 2.X
  • Introduction to Apache Pig
  • Why Pig?
  • Apache Pig Architecture
  • Simple Data Types
  • Complex Data Types

  • Sample Execution
  • Pig Operators demo
  • Parameter Substitution
  • Macros
  • Anatomy of Reduce-Side-Join
  • Job Optimizations in Pig
  • UDF’s in Pig
  • Execution of XML and CSV Files in Pig

  • Introduction
  • Hive DDL
  • Demo: Databases.Ddl
  • Demo: Tables.Ddl
  • Hive Views
  • Demo: Views.Ddl
  • Architecture
  • Primary Data Types
  • Data Load
  • Demo: Import Export.Dml
  • Demo: Hive Queries.Dml
  • Demo: Explain.Hql
  • Table Types
  • Demo: ExternalTable.Ddl
  • Complex Data Types
  • Demo: Working with Complex Datatypes
  • Hive Variables
  • Demo: Working with Hive Variables
  • Hive Variables and Execution Customisation
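
The Hive demos in this module use HQL scripts; purely as an illustrative, unofficial sketch, the same DDL can be issued from Scala through a Hive-enabled SparkSession. The database, table and view names below are made up.

```scala
import org.apache.spark.sql.SparkSession

object HiveDdlSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("hive-ddl-sketch")
      .enableHiveSupport()        // requires a Hive-enabled Spark build and hive-site.xml
      .getOrCreate()

    spark.sql("CREATE DATABASE IF NOT EXISTS retail")
    spark.sql("""
      CREATE TABLE IF NOT EXISTS retail.orders (
        order_id INT,
        customer STRING,
        amount   DOUBLE
      )
      ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
    """)
    spark.sql("CREATE VIEW IF NOT EXISTS retail.big_orders AS " +
              "SELECT * FROM retail.orders WHERE amount > 1000")

    spark.sql("SHOW TABLES IN retail").show()
    spark.stop()
  }
}
```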

  • Working with Arrays
  • Sort by And Order By
  • Distribute by And Cluster By
  • Partitioning
  • Static and Dynamic Partitioning
  • Bucketing Vs Partitioning
  • Joins and Types
  • Bucket-Map Join
  • Sort-Merge-Bucket-Map Join
  • Left Semi Join
  • Demo: Join Optimisations
  • Input Formats in Hive
  • Sequence Files in Hive
  • RC File in Hive
  • File Formats in Hive
  • ORC Files in Hive
  • Inline Index in ORC Files
  • ORC File Configurations in Hive
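
A hedged sketch of partitioning with ORC storage, again via a Hive-enabled SparkSession; it assumes the illustrative retail.orders table from the previous sketch, and all names are made up.

```scala
import org.apache.spark.sql.SparkSession

object HivePartitionSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("hive-partition-sketch")
      .enableHiveSupport()
      .getOrCreate()

    // A partitioned, ORC-backed table.
    spark.sql("""
      CREATE TABLE IF NOT EXISTS retail.orders_by_country (
        order_id INT,
        amount   DOUBLE
      )
      PARTITIONED BY (country STRING)
      STORED AS ORC
    """)

    // Dynamic partitioning: the partition value comes from the data itself.
    spark.sql("SET hive.exec.dynamic.partition.mode=nonstrict")
    spark.sql("""
      INSERT INTO TABLE retail.orders_by_country PARTITION (country)
      SELECT order_id, amount, 'IN' AS country FROM retail.orders
    """)

    spark.sql("SHOW PARTITIONS retail.orders_by_country").show()
    spark.stop()
  }
}
```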

  • SerDe In Hive
  • Demo: CSV SerDe
  • JSONSerDe
  • RegexSerDe
  • Analytic and Windowing in Hive
  • Demo: Analytics.Hql
  • HCatalog In Hive
  • Demo: Using_HCatalog
  • Accessing Hive With JDBC
  • Demo: HiveQueries.Java
  • HiveServer2 And Beeline
  • Demo: Beeline
  • UDF In Hive
  • Demo: ToUpper.Java and Working_with_UDF
  • Optimizations in Hive
  • Demo: Optimizations
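
For the "Accessing Hive With JDBC" item, here is a minimal sketch using the standard HiveServer2 JDBC driver from Scala; the host, port, credentials and queried table are placeholders.

```scala
import java.sql.DriverManager

object HiveJdbcSketch {
  def main(args: Array[String]): Unit = {
    // HiveServer2 JDBC URL; host, port and credentials are placeholders.
    Class.forName("org.apache.hive.jdbc.HiveDriver")
    val conn = DriverManager.getConnection(
      "jdbc:hive2://localhost:10000/default", "hive", "")
    val stmt = conn.createStatement()

    val rs = stmt.executeQuery("SELECT order_id, amount FROM retail.orders LIMIT 10")
    while (rs.next())
      println(s"${rs.getInt("order_id")} -> ${rs.getDouble("amount")}")

    rs.close(); stmt.close(); conn.close()
  }
}
```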

  • Challenges with Traditional RDBMS
  • Features of NoSQL Databases
  • NoSQL Database Types
  • CAP Theorem
  • What Are HBase Regions?
  • HBase HMaster and ZooKeeper
  • HBase First Read
  • HBase Meta Table
  • Region Split
  • Apache HBase Architecture Benefits
  • HBase Vs. RDBMS
  • Shell Commands
  • Hive Integration with HBase
  • Pig Integration with HBase
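
The module itself focuses on shell commands and Hive/Pig integration; purely as an illustration, the same put/get operations look like this through the HBase client API called from Scala. The table, column family and values are made up.

```scala
import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Get, Put}
import org.apache.hadoop.hbase.util.Bytes

object HBaseClientSketch {
  def main(args: Array[String]): Unit = {
    // Reads hbase-site.xml from the classpath; table and column names are illustrative.
    val conn  = ConnectionFactory.createConnection(HBaseConfiguration.create())
    val table = conn.getTable(TableName.valueOf("users"))

    // Put a cell: row key "u1", column family "info", qualifier "name".
    val put = new Put(Bytes.toBytes("u1"))
    put.addColumn(Bytes.toBytes("info"), Bytes.toBytes("name"), Bytes.toBytes("Alice"))
    table.put(put)

    // Get it back.
    val result = table.get(new Get(Bytes.toBytes("u1")))
    val name = Bytes.toString(result.getValue(Bytes.toBytes("info"), Bytes.toBytes("name")))
    println(s"u1.info:name = $name")

    table.close(); conn.close()
  }
}
```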

  • Introduction to Oozie
  • Oozie Architecture
  • Oozie Workflow Nodes
  • Oozie Server
  • Oozie Workflow
  • Sqoop Architecture
  • Sqoop Features

  • Sqoop Hands On
  • Flume: Introduction
  • Flume Architecture
  • Example Description
  • Transactions
  • Batching
  • Partitioning
  • Exec Source
  • Spooling Directory Source
  • File Channel
  • Memory Channel
  • Logger Sink
  • HDFS Sink

  • Project Discussion
  • Introduction to Functional Programming and Scala
  • Functional vs OOP
  • Variables
  • Functions
  • Using if and while to define logic
  • Loops in Scala
  • Collections in Scala
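
A small, self-contained sketch touching the topics above (variables, functions, if/while, loops and collections); the values are arbitrary and only for illustration.

```scala
object ScalaBasicsSketch extends App {
  // Variables: val is immutable, var is mutable.
  val course   = "Big Data Masters"
  var enrolled = 1400

  // Functions: the last expression is the return value.
  def discount(fee: Double, percent: Double): Double = fee - fee * percent / 100

  // if is an expression, not just a statement.
  val label = if (enrolled > 1000) "popular" else "new"

  // A while loop and a for comprehension over a collection.
  var i = 0
  while (i < 3) { println(s"batch $i"); i += 1 }

  val fees = List(20000.0, 18000.0, 22000.0)            // an immutable collection
  val afterDiscount = for (f <- fees) yield discount(f, 10)
  println(s"$course ($label): $afterDiscount")
}
```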

  • Object Oriented Programming
  • Classes and Objects
  • Traits in Scala
  • Constructors in Scala
  • Method Overloading
  • Implicit parameter usage

  • Inheritance – OOP
  • Override modifier
  • Polymorphism
  • Invoking superclass methods
  • Final members
  • Traits in detail

  • Control Structures in detail
  • Exception Handling
  • Coding without break and continue
  • Coding the functional way
  • Case classes in Scala
  • Implicit conversions and implicit parameters in depth
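
A short sketch showing case classes, implicit parameters and implicit conversions together; the Employee type and bonus logic are invented purely for illustration.

```scala
object ImplicitsSketch extends App {
  // Case class: immutable data with equality, pattern matching and copy for free.
  case class Employee(name: String, salary: Double)

  // Implicit parameter: the compiler supplies the bonus rate if one is in scope.
  def payout(e: Employee)(implicit bonusRate: Double): Double =
    e.salary * (1 + bonusRate)

  implicit val defaultBonus: Double = 0.10

  // Implicit conversion: treat a plain name string as an Employee with a default salary.
  import scala.language.implicitConversions
  implicit def stringToEmployee(name: String): Employee = Employee(name, 50000.0)

  println(payout(Employee("Asha", 80000.0)))  // uses defaultBonus
  println(payout("Ravi"))                     // string converted implicitly
}
```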

  • Introduction to Apache Spark
  • MapReduce Limitations
  • RDDs
  • SparkContext, SQLContext and HiveContext

  • Programming with RDDs
  • Creating RDDs from text files
  • Transformations and Actions
  • How Spark execution works
  • RDD APIs – filter
  • flatMap
  • fold
  • foreach
  • glom
  • groupBy
  • map
  • reduceByKey
  • zip
  • persist
  • unpersist
  • Read/Write from storage
  • RDD Examples
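
The classic word count ties most of the RDD APIs above together; here is a minimal sketch, assuming a local master and a placeholder input path.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object RddWordCount {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("rdd-wordcount").setMaster("local[*]"))

    // Transformations are lazy; nothing runs until an action is called.
    // The input path is a placeholder.
    val lines  = sc.textFile("data/input.txt")
    val counts = lines
      .flatMap(_.split("\\s+"))        // one record per word
      .filter(_.nonEmpty)
      .map(word => (word, 1))
      .reduceByKey(_ + _)              // shuffle + per-key sum
      .persist()                       // cached because it is reused below

    counts.take(10).foreach(println)   // action: triggers execution
    counts.saveAsTextFile("data/output")
    counts.unpersist()
    sc.stop()
  }
}
```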

  • RDD API’s – aggregate
  • Cartesian
  • Checkpoint
  • Coalesce
  • Reparition
  • Cogroup
  • CollectAsMap
  • CombineByKey
  • count and countApprox functions
  • More RDD Examples
  • Schema – StructType
  • StructFields
  • DataType
  • DataFrame API’s and examples

  • Create temporary tables
  • SparkSQL
  • Parquet vs Avro
  • Examples and problem solving on real data using RDDs and converting them to DataFrames
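
A hedged sketch of converting an RDD to a DataFrame, registering a temporary view and querying it with Spark SQL; it also writes Parquet, one of the two formats compared in this module. Data and paths are illustrative.

```scala
import org.apache.spark.sql.SparkSession

object RddToDataFrameSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("rdd-to-df").master("local[*]").getOrCreate()
    import spark.implicits._   // needed for the toDF conversion

    // Sample data in an RDD; in the course this would come from a real dataset.
    val orders = spark.sparkContext.parallelize(Seq(
      (1, "IN", 1200.0), (2, "US", 300.0), (3, "IN", 4500.0)
    ))

    // Convert the RDD to a DataFrame and register a temporary view for SQL.
    val df = orders.toDF("order_id", "country", "amount")
    df.createOrReplaceTempView("orders")

    spark.sql("SELECT country, SUM(amount) AS total FROM orders GROUP BY country").show()

    // Parquet support is built in; writing it out is one line.
    df.write.mode("overwrite").parquet("data/orders_parquet")
    spark.stop()
  }
}
```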

  • Understanding Spark configurations better
  • Open-source REST interfaces on top of Spark (JobServer/Livy); we will work with JobServer
  • Demo of JobServer and its use case

  • Accumulators and Broadcast Variables
  • Query Execution Plan
  • Internals of how Spark works
  • Spark Tuning – what your production configuration should look like
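
A small sketch showing a broadcast variable, a long accumulator and a query-plan printout; the lookup map and data are invented purely for illustration.

```scala
import org.apache.spark.sql.SparkSession

object SharedVariablesSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("shared-vars").master("local[*]").getOrCreate()
    val sc = spark.sparkContext

    // Broadcast variable: read-only lookup table shipped once to each executor.
    val countryNames = sc.broadcast(Map("IN" -> "India", "US" -> "United States"))

    // Accumulator: executors add to it, the driver reads the total.
    val badRecords = sc.longAccumulator("bad records")

    val data = sc.parallelize(Seq("IN", "US", "XX", "IN"))
    val named = data.map { code =>
      countryNames.value.getOrElse(code, { badRecords.add(1); "unknown" })
    }
    named.collect().foreach(println)
    println(s"bad records = ${badRecords.value}")

    // Query execution plan for a DataFrame operation.
    import spark.implicits._
    Seq((1, "IN"), (2, "US")).toDF("id", "code").groupBy("code").count().explain(true)

    spark.stop()
  }
}
```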

  • Spark 2.1.0 – what has changed
  • Datasets; creating a Spark project with SBT/Maven; how Maven repositories work
  • Creating and submitting an application to JobServer/Livy

  • Spark Streaming
  • Project discussion
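
A minimal Spark Streaming sketch (socket source, 5-second micro-batches); the host and port are placeholders, and you would feed it with a tool such as netcat (`nc -lk 9999`).

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object StreamingWordCount {
  def main(args: Array[String]): Unit = {
    // Two local threads: one for receiving, one for processing.
    val conf = new SparkConf().setAppName("streaming-wordcount").setMaster("local[2]")
    val ssc  = new StreamingContext(conf, Seconds(5))   // 5-second micro-batches

    // Listens on a socket; host/port are placeholders.
    val lines  = ssc.socketTextStream("localhost", 9999)
    val counts = lines.flatMap(_.split("\\s+"))
                      .map(word => (word, 1))
                      .reduceByKey(_ + _)
    counts.print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```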

  • Spark MLlib
  • Spark GraphX
  • Project discussion contd.
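
As a taste of MLlib, here is a hedged K-means sketch on a toy dataset (GraphX is not shown); all values are illustrative.

```scala
import org.apache.spark.ml.clustering.KMeans
import org.apache.spark.ml.linalg.Vectors
import org.apache.spark.sql.SparkSession

object KMeansSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("mllib-kmeans").master("local[*]").getOrCreate()
    import spark.implicits._

    // Tiny toy dataset: two obvious clusters.
    val data = Seq(
      Vectors.dense(0.0, 0.0), Vectors.dense(1.0, 1.0),
      Vectors.dense(8.0, 9.0), Vectors.dense(9.0, 8.0)
    ).map(Tuple1.apply).toDF("features")

    val model = new KMeans().setK(2).setSeed(1L).fit(data)
    model.clusterCenters.foreach(println)

    spark.stop()
  }
}
```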

Projects

  • ETL pipeline with Big Data Component.
  • ETL pipeline with Azure Cloud.
  • UBER Big Data pipeline.
  • Supply Chain Inventory Management forecasting.
  • Big Data Ready Enterprise.
  • Deployment of a Big Data application on the cloud with logging and monitoring.

Batch Timings & Fee

Fee Structure

₹20,000 + GST

Online

Start Date : 28th Mar 2020

Sat & Sun: 10.00 AM - 12.00 PM

Thu (Doubt Clearing): 10.00 PM - 12.00 AM

Big Data, Hadoop & Spark Masters from iNeuron

Complete your Big Data, Hadoop & Spark development program at iNeuron and get your certificate.

  • 6 Months live session training
  • Placement Assistance
  • 3 Months in-house internship

Our Features

6 Months Classroom Training

Once on board, our candidates go through an intense six-month classroom training program conducted by our best team of experienced senior data scientists, who deliver all the conceptual knowledge in an innovative and interactive manner.

Career Counselling

After candidates successfully complete the course, we train them through mock interviews, personal interviews, and group discussions, provide professional mentoring, and help them build an attractive resume that enables them to land a rewarding job.

3 Months Of In-House Projects

What makes us different from other training institutes is that we also build products, which lets us give our candidates hands-on experience contributing to live projects and a deeper, industry-level understanding of the course material.

Admission Process

  • Search Courses
  • View Course Details
  • Apply, Enroll or Register
