Price: 450.00 US$

Description

Spark & Scala Course Contents

Describe Features of Apache Spark
• How Spark fits in Big Data ecosystem
• Why Spark & Hadoop fit together
Define Spark Components
• Driver Program
 SparkContext
• Cluster Manager
• Worker
 Executor
 Task
• Spark RDD
 SparkContext
• Spark Libraries
Load data into Spark
• Different data sources and formats

 HDFS
 Amazon S3
 Local File System
 Text
 JSON
 CSV
 Sequence File

• Create & Use RDDs and DataFrames
Apply dataset operations to Resilient Distributed Datasets
• Transformation
• Actions
• Cache Intermediate RDD

 Lineage Graph
 Lazy Evaluation
Use Spark DataFrames for simple queries
• Create a DataFrame
• Spark Interactive shell (Scala & Python)
• Spark SQL
Define different ways to run your application
Build and launch a standalone application
• Spark Program Life Cycle
• Function of SparkContext
• Different Ways to Launch a Spark Application

 Local
 Standalone
 Hadoop YARN
 Apache Mesos

• Launch Spark Application

 spark-submit
 Monitor the Spark Job
Describe & Create pair RDD
• Key-Value pair
• Apache Spark vs Apache Hadoop MapReduce
• Create pair RDD from existing non-pair RDD
• Create pair RDD by loading certain formats
• Create pair RDD from in-memory collection of pairs
Apply Operations on pair RDD
• groupByKey
• reduceByKey
• Other Transformations

 Joins
Control partitioning across nodes
• RDD Partition
• Types of Partition

 Hash Partitioning
 Range Partitioning

• Benefit of Partitioning
• Best Practices
More on DataFrames
• Explore Data in DataFrames
• Create UDFs (user-defined functions)

 UDF with Scala DSL
 UDF with SQL

• Repartition DataFrames
• Infer Schema by Reflection
• DataFrame from database table
• DataFrame from JSON
Monitor Apache Spark Applications
• Spark Execution Model
• Debug and Tune Spark Applications
Identify Spark Unified Stack Components
• Spark SQL
• Spark Streaming
• Spark MLlib
• Spark GraphX
Benefits of Apache Spark over Hadoop Ecosystem
Describe Spark Data pipeline Use Cases
• Spark Streaming Architecture
• DStream and a Spark Streaming application

 Define Use Case (Time Series Data)
 Basic Steps
 Save Data to HBase

• Operations on DStream
 Transformations
 DataFrame and SQL Operations
• Define Windowed Operation

 Sliding Window
 Windowed Computation
 Window based Transformation
 Window Operations

• Fault tolerance of streaming applications

 Fault Tolerance in Spark Streaming
 Fault Tolerance in Spark RDD
 Checkpointing
Describe GraphX
Define Regular, Directed, and Property Graphs
Create a Property Graph
Perform Operations on Graphs
Describe Apache Spark MLlib
Describe the Machine Learning Techniques
• Classification
• Clustering
• Collaborative Filtering
Use Collaborative filtering to predict user choice
Scala
• Introduction
• A first example
• Expressions and Simple Functions
• First-Class Functions
• Classes and Objects
• Case classes and Pattern matching
• Generic types and methods
• Lists
• For-Comprehensions
• Mutable State
• Computing with Streams
• Lazy Values
• Implicit Parameters and Conversions
• Hindley/Milner Type Inference
• Abstractions for Concurrency

Contact details: +1 416-834-6577 / +1 201-905-1656
WhatsApp : 9030990003/9000444287
Mail : selfpacedtech@gmail.com/training@selfpacedtech.com

