Big Data Crash Course

This course blends a tour of key big data technologies, such as Hadoop, Spark, NiFi, and Apache Kafka, with hands-on experience using Google Cloud Platform.

Bhavuk Chawla
Data Science | core | 9 hours 30 minutes |   Published: Jan 2022
In partnership with:  Coursera

Overview

1.4K students*
96.3% recommend*

This course includes:

  • 9+ hours of on-demand video
  • 12 modules
  • Intermediate level
  • Direct access/chat with the instructor
  • 100% self-paced online
  • Many downloadable resources
  • Shareable certificate of completion
Want to ramp up on key big data technologies in the shortest possible time? Then this big data crash course is for you. It not only covers the basics but also takes you on a journey where you learn and grow at the same time. The course blends a tour of big data technologies such as Hadoop, Spark, NiFi, and Apache Kafka with hands-on experience using Google Cloud Platform. Whiteboarding gives the course a classroom-like feel, which makes it a good fit for newcomers to the big data world. Enroll now and start your journey into the realm of data.

Skills You Will Gain

Apache Kafka
Apache NiFi
Apache Spark
Big Data
Google Cloud
Hadoop
HBase

Learning Outcomes (At the end of this program you will be able to)

  • Develop big data pipelines
  • Understand various big data technologies such as Hadoop, Apache Spark, Apache NiFi, Apache Kafka, Sqoop, Hive, Impala, HBase, and many more
  • Handle large amounts of data with ease
  • Set up a big data framework VM instance on Google Cloud Platform using Dataproc
  • Understand the key architectures of big data
  • Gain a holistic understanding of the big data ecosystem
  • Work with various file formats within big data frameworks, such as Avro, JSON, Parquet, and many more
  • Create a real-time data analytics pipeline that uses Apache NiFi to fetch data from Twitter and analyzes it in Apache Spark

Prerequisites

  • Basics of SQL and RDBMS
  • Basic Unix/Linux commands such as mkdir, ls, and cat
  • Python/Java (not used extensively in the course)
  • Credit card for setting up a GCP account (no charges will be deducted if you use the GCP trial version). You can perform all exercises in this course without incurring charges. Please refer to the “GCP Account Best Practices” section for more details.
  • Twitter account

Who Should Attend

  • Engineers who are aiming to get a job in big data
  • Engineers who would like to transition into roles working with big data technologies
  • Big data engineers planning to take certifications such as CCA175 and CCA159
  • Big data engineers who are looking for a promotion

Curriculum

Instructors

Frequently Asked Questions

How much do the courses at Starweaver cost?

We offer flexible payment options to make learning accessible for everyone. With our Pay-As-You-Go plan, you can pay for each course individually. Alternatively, our Subscription-Based plan provides you with unlimited access to all courses for a monthly or yearly fee.

Do you offer any certifications upon completion of a course at Starweaver?

Yes. We offer a certificate upon completion of a course so you can showcase your newly acquired skills and expertise.

Does Starweaver offer any free courses or trials?

No, we don't offer free courses, but we do offer a 5-day trial, available only on our subscription-based plans.

Are Starweaver's courses designed for beginners or advanced students?

Our courses are offered at three levels to cater to your learning needs: Core, Intermediate, and Advanced. You can choose the level that best suits your knowledge and skill set.

What payment options are available for Starweaver courses?

We accept various payment methods, including major credit cards, PayPal, wire transfer, and company purchase orders. For more information about payments, contact customer support.

Do you offer refunds?

Yes, we do offer a 100% refund guarantee for our courses within a specified time frame. If you are not satisfied with the course, contact our customer support team to request a refund with your order details. Some restrictions may apply.

*Where courses have been offered multiple times, the “# Students” includes all students who have enrolled. The “%Recommended” shown is also based on this data.
  • Welcome to the course!
  • Module 1: Overview
  • Module 2: Environment setup
  • Module 3: Getting Started with Big Data Journey
  • Module 4: Hadoop Filesystem
  • Module 5: Distributed Processing using MapReduce and Beyond
  • Module 6: Data Persistence in Big Database
  • Module 7: Data Ingestion using Sqoop
  • Module 8: Data Analysis using Hive Impala
  • Module 9: Data Processing using Spark
  • Module 10: Streaming Events through Kafka
  • Module 11: Building Dataflows using NiFi
  • Module 12: Epilogue

Bhavuk Chawla

With a distinguished career spanning decades in cutting-edge technologies such as Generative AI, Machine Learning, Cloud Computing, and Big Data Analytics, Bhavuk brings a wealth of hands-on experience and strategic insight to senior professionals seeking to elevate their skill sets. As an elite instructor on platforms like Pluralsight and through partnerships with industry giants including Google, Adobe, and Microsoft, he has empowered over 150,000 participants across the globe.

Recognized as Cloudera Instructor of the Year in 2016 and widely regarded as a Google Cloud and AI Evangelist, Bhavuk is committed to delivering transformative learning experiences tailored to diverse audiences—from CEOs to developers. His extensive background in technology consulting includes architecting scalable, cross-platform solutions that address the unique needs of global enterprises.

Currently serving as Head of Big Data Sciences & AI Practice and Co-Founder of several technology transformation ventures, Bhavuk has led high-impact training and consulting initiatives that drive innovation and operational excellence. His approach combines theoretical expertise with real-world problem-solving, gained through working closely with Fortune 500 companies.

Passionate about fostering a culture of continuous learning and knowledge sharing, Bhavuk ensures that every engagement—whether a training session or a strategic consultation—equips professionals with the tools they need to thrive in today’s fast-evolving technological landscape. Through comprehensive and impactful educational programs, he is dedicated to helping teams and individuals achieve lasting success.

About this course: Overview, Learning Outcomes, Who Should Enroll...

Segment - 01 - Course Structure and Approach

Segment - 02 - Course Pre-requisites

Segment - 03 - Course Outcomes

Segment - 04 - Google Cloud Account Setup

Segment - 05 - Creating a Dataproc Cluster

Segment - 06 - GCP Account Best Practices

Segment - 07 - Twitter Developer Account Setup

Segment - 08 - Definition of Big Data

Segment - 09 - Data Lake Overview

Segment - 10 - Key Roles in Big Data Science Project

Segment - 11 - Big Data Logical Architecture

Segment - 12 - Typical Big Data Pipeline

Segment - 13 - Hadoop Overview

Segment - 14 - Bonus Demystifying JVM vs JDK vs JRE

Segment - 15 - HDFS Overview

Segment - 16 - Small FS vs HDFS

Segment - 17 - HDFS Architecture

Segment - 18 - Hands-on HDFS

Segment - 19 - Introduction to MR

Segment - 20 - Logical & Physical Architecture of MR

Segment - 21 - YARN (Distributed OS)

Segment - 22 - YARN Architecture

Segment - 23 - Hands-on Spark Job on YARN

Segment - 24 - RDBMS USPs & its Limitations

Segment - 25 - Polyglot Persistence

Segment - 26 - Why HBase and Limitations

Segment - 27 - HBase Terms

Segment - 28 - HBase Physical Storage

Segment - 29 - HBase Architecture

Segment - 30 - Installation HBase on DataProc cluster

Segment - 31 - Installing Confluent Kafka

Segment - 32 - KSQLDB Troubleshoot

Segment - 33 - Hands-on HBase

Segment - 34 - Sqoop Overview

Segment - 35 - Sqoop Architecture

Segment - 36 - Sqoop Installation

Segment - 37 - Hands-on Sqoop

Segment - 38 - Hive Overview

Segment - 39 - Hive Architecture

Segment - 40 - Impala Overview

Segment - 41 - Impala Architecture

Segment - 42 - Text vs Binary Data Formats

Segment - 43 - Avro Format

Segment - 44 - Hive Hands-on

Segment - 45 - Hands-on Sqoop + Hive Integration

Segment - 46 - Hands-on Schema Evolution

Segment - 47 - Spark Overview

Segment - 48 - Spark Logical Architecture

Segment - 49 - Spark Physical Architecture

Segment - 50 - Spark Core Vs Spark SQL

Segment - 51 - Spark Execution Modes

Segment - 52 - Hands-on Spark on Jupyter

Segment - 53 - Introduction to Apache Kafka

Segment - 54 - Evolution of Kafka

Segment - 55 - Why Kafka?

Segment - 56 - Apache Kafka Vs Confluent Kafka

Segment - 57 - Kafka Architecture

Segment - 58 - Kafka Demo - Producer Consumer

Segment - 59 - NiFi Overview

Segment - 60 - NiFi UseCases

Segment - 61 - NiFi Limitations

Segment - 62 - NiFi Components and its Architecture

Segment - 63 - Hands-on NiFi Installation on GCP

Segment - 64 - Hands-on Twitter Data Ingestion Using Nifi Part 1

Segment - 65 - Hands-on Twitter Data Ingestion Using Nifi Part 2

Segment - 66 - Hands-on Twitter Data Ingestion Using Nifi Part 3

Segment - 67 - Conclusion