Evifa-Portal

hits 1 - 3 | 3 hits

Online Resource

Apache Spark solution for rank product (2015)

Parsian, Mahmoud [Author]

[Place of publication not identified] : O'Reilly

Details

ISBN: 9781491951064

Language: English

Pages: 1 online resource (1 streaming video file (42 min., 5 sec.)), digital, sound, color

Keywords: Spark (Electronic resource : Apache Software Foundation) ; Social networks ; Data processing ; Statistical methods ; Electronic commerce ; Data processing ; Statistical methods ; Electronic videos ; local

Abstract: "The 'rank product' is a statistical technique, used for detecting differentially regulated genes in replicated microarray experiments. The technique has achieved widespread acceptance and is now used more broadly, in such diverse fields as RNAi analysis, proteomics, and machine learning. The 'rank product' technique may be used in ranking users (in social networks) and items (such as Amazon.com). Given large set of genes, users, or items, in this webcast I will present two distinct Spark solutions: (using groupByKey() and combineByKey()) for solving the 'rank product.'"--Resource description page.

Note: Title from title screen (viewed February 12, 2016). - Date of publication from resource description page

URL: license required

Online Resource

MPI Ethno. Forsch.

Online Resource

Data algorithms : recipes for scaling up with Hadoop and Spark (2015)

Parsian, Mahmoud [Author]

[Sebastopol, CA] : O'Reilly Media

Details

Language: English

Pages: 1 online resource (1 volume), illustrations

Keywords: MapReduce (Computer file) ; Apache Hadoop ; Electronic data processing ; Big data ; Electronic books ; Electronic books ; local

Abstract: If you are ready to dive into the MapReduce framework for processing large datasets, this practical book takes you step by step through the algorithms and tools you need to build distributed MapReduce applications with Apache Hadoop or Apache Spark. Each chapter provides a recipe for solving a massive computational problem, such as building a recommendation system. You'll learn how to implement the appropriate MapReduce solution with code that you can use in your projects. Dr. Mahmoud Parsian covers basic design patterns, optimization techniques, and data mining and machine learning solutions for problems in bioinformatics, genomics, statistics, and social network analysis. This book also includes an overview of MapReduce, Hadoop, and Spark.

Note: Place of publication suggested by publisher's website. - Includes bibliographical references. - Description based on online resource; title from title page (Safari, viewed February 18, 2015)

URL: https://learning.oreilly.com/library/view/-/9781491906170/?ar

Online Resource

MPI Ethno. Forsch.

Online Resource

Data Algorithms with Spark (2021)

Parsian, Mahmoud [Author]

[Place of publication not identified] : O'Reilly Media, Inc. | Boston, MA : Safari

Details

Language: English

Pages: 1 online resource (110 pages)

Edition: 1st edition

Keywords: Electronic books ; local ; Electronic books

Abstract: Apache Spark’s speed, ease of use, sophisticated analytics, and multilanguage support make practical knowledge of this cluster-computing framework a required skill for data engineers and data scientists. With this hands-on guide, anyone looking for an introduction to Spark will learn practical algorithms and examples for this framework using PySpark. In each chapter, author Mahmoud Parsian shows you how to solve a data problem with a set of Spark transformations and algorithms. You’ll learn how to tackle problems involving ETL, design patterns, machine learning algorithms, data partitioning, and genomics analysis. Each detailed recipe includes PySpark algorithms using the PySpark driver and shell script. With this book, you will: learn how to select Spark transformations for optimized solutions; explore powerful transformations and reductions including reduceByKey(), combineByKey(), and mapPartitions(); understand data partitioning for optimized queries; design machine learning algorithms including Naive Bayes, linear regression, and logistic regression; build and apply a model using PySpark design patterns; apply motif finding algorithms to graph data; analyze graph data by using the GraphFrames API; and apply PySpark algorithms to clinical and genomics data (such as DNA-Seq).

Note: Online resource; title from title page (viewed December 25, 2021). - Mode of access: World Wide Web.

URL: license required

Online Resource

MPI Ethno. Forsch.
