Evifa-Portal

hits 1 - 2 | 2 hits

Online Resource

Apache Spark 2.x cookbook : Cloud-ready recipes to do analytics and data science on Apache Spark (2017)

Yadav, Rishi [author]

Birmingham, UK : Packt Publishing

Details

ISBN: 9781787127517, 1787127516

Language: English

Pages: 1 online resource (1 volume), illustrations

Keywords: Spark (Electronic resource : Apache Software Foundation) ; Big data ; Data mining ; Computer programs ; Electronic books ; Electronic books ; local

Abstract: Over 70 recipes to help you use Apache Spark as your single big data computing platform and master its libraries.

About This Book:
- Contains recipes on how to use Apache Spark as a unified compute engine
- Covers how to connect various source systems to Apache Spark
- Covers various parts of machine learning, including supervised/unsupervised learning and recommendation engines

Who This Book Is For: This book is for data engineers, data scientists, and those who want to implement Spark for real-time data processing. Anyone who is using Spark (or is planning to) will benefit from this book. The book assumes you have a basic knowledge of Scala as a programming language.

What You Will Learn:
- Install and configure Apache Spark with various cluster managers and on AWS
- Set up a development environment for Apache Spark, including Databricks Cloud notebook
- Find out how to operate on data in Spark with schemas
- Get to grips with real-time streaming analytics using Spark Streaming and Structured Streaming
- Master supervised learning and unsupervised learning using MLlib
- Build a recommendation engine using MLlib
- Process graphs using the GraphX and GraphFrames libraries
- Develop a set of common applications or project types, and solutions that solve complex big data problems

In Detail: While Apache Spark 1.x gained a lot of traction and adoption in the early years, Spark 2.x delivers notable improvements in the areas of API, schema awareness, performance, Structured Streaming, and simplifying building blocks to build better, faster, smarter, and more accessible big data applications. This book uncovers all these features in the form of structured recipes to analyze and mature large and complex sets of data. Starting with installing and configuring Apache Spark with various cluster managers, you will learn to set up development environments. Further on, you will be introduced to working with RDDs, DataFrames and Datasets to operate on schema-aware data, and real-time streaming with various sources such as Twitter Stream and Apache Kafka. You will also work through recipes on machine learning, including supervised learning, unsupervised learning and recommendation engines in Spark. Last but not least, the final few chapters delve deeper into the concepts of graph processing using GraphX, securing your implementations, cluster optimization, and troubleshooting.

Style and approach: This book is packed with intuitive recipes supported with line-by-line explanations to help you...

Note: Description based on online resource; title from title page (viewed June 26, 2017)

URL: https://learning.oreilly.com/library/view/-/9781787127265/?ar

Library	Location	Call Number	Volume/Issue/Year	Availability

Online Resource

MPI Ethno. Forsch.

Online Resource

Spark cookbook : over 60 recipes on Spark, covering Spark Core, Spark SQL, Spark Streaming, MLlib, and GraphX libraries (2015)

Yadav, Rishi [author]

Birmingham, UK : Packt Publishing

Details

ISBN: 9781783987078, 1783987073

Language: English

Pages: 1 online resource (1 volume), illustrations

Series Statement: Quick answers to common problems

Keywords: Spark (Electronic resource : Apache Software Foundation) ; Big data ; Data mining ; Computer programs ; Electronic books ; Electronic books ; local

Abstract: Over 60 recipes on Spark, covering Spark Core, Spark SQL, Spark Streaming, MLlib, and GraphX libraries.

In Detail: By introducing in-memory persistent storage, Apache Spark eliminates the need to store intermediate data in filesystems, thereby increasing processing speed by up to 100 times. This book will focus on how to analyze large and complex sets of data. Starting with installing and configuring Apache Spark with various cluster managers, you will cover setting up development environments. You will then cover various recipes to perform interactive queries using Spark SQL and real-time streaming with various sources such as Twitter Stream and Apache Kafka. You will then focus on machine learning, including supervised learning, unsupervised learning, and recommendation engine algorithms. After mastering graph processing using GraphX, you will cover various recipes for cluster optimization and troubleshooting.

What You Will Learn:
- Install and configure Apache Spark with various cluster managers
- Set up development environments
- Perform interactive queries using Spark SQL
- Get to grips with real-time streaming analytics using Spark Streaming
- Master supervised learning and unsupervised learning using MLlib
- Build a recommendation engine using MLlib
- Develop a set of common applications or project types, and solutions that solve complex big data problems
- Use Apache Spark as your single big data compute platform and master its libraries

Note: Includes index. - Description based on online resource; title from cover (Safari, viewed August 13, 2015)

URL: https://learning.oreilly.com/library/view/-/9781783987061/?ar

Library	Location	Call Number	Volume/Issue/Year	Availability

Online Resource

MPI Ethno. Forsch.
