Evifa-Portal

1

Online Resource

Streaming architecture : new designs using Apache Kafka and MapR streams (2016)

Dunning, Ted [Author] 1956- ; Friedman, Ellen [Contributor]

Sebastopol, CA : O'Reilly Media

Details

Language: English

Pages: 1 online resource (1 volume) , illustrations

Edition: First edition.

Keywords: Streaming technology (Telecommunications) ; Electronic books ; Electronic books ; local

Abstract: More and more data-driven companies are looking to adopt stream processing and streaming analytics. With this concise ebook, you'll learn best practices for designing a reliable architecture that supports this emerging big-data paradigm. Authors Ted Dunning and Ellen Friedman (Real World Hadoop) help you explore some of the best technologies to handle stream processing and analytics, with a focus on the upstream queuing or message-passing layer. To illustrate the effectiveness of these technologies, this book also includes specific use cases. Ideal for developers and non-technical people alike, this book describes: key elements in good design for streaming analytics, focusing on the essential characteristics of the messaging layer; new messaging technologies, including Apache Kafka and MapR Streams, with links to sample code; technology choices for streaming analytics (Apache Spark Streaming, Apache Flink, Apache Storm, and Apache Apex); how stream-based architectures are helpful to support microservices; and specific use cases such as fraud detection and geo-distributed data streams. Ted Dunning is Chief Applications Architect at MapR Technologies, and active in the open source community. He currently serves as VP for Incubator at the Apache Foundation, as a champion and mentor for a large number of projects, and as committer and PMC member of the Apache ZooKeeper and Drill projects. Ted is on Twitter as @ted_dunning. Ellen Friedman, a committer for the Apache Drill and Apache Mahout projects, is a solutions consultant and well-known speaker and author, currently writing mainly about big data topics. With a PhD in Biochemistry, she has years of experience as a research scientist and has written about a variety of technical topics. Ellen is on Twitter as @Ellen_Friedman.

Note: Includes bibliographical references. - Description based on online resource; title from title page (Safari, viewed May 20, 2016)

URL: https://learning.oreilly.com/library/view/-/9781491953914/?ar

Online Resource

MPI Ethno. Forsch.

2

Online Resource

Data where you want it : geo-distribution of big data and analytics (2017)

Dunning, Ted [Author] 1956- ; Friedman, B. Ellen [Contributor]

Sebastopol, CA : O'Reilly Media

Details

Language: English

Pages: 1 online resource (1 volume) , illustrations

Edition: First edition.

Keywords: Computer networks ; Management ; Information technology ; Management ; Business enterprises ; Computer networks ; Management ; Electronic books ; Electronic books ; local

Abstract: Many organizations have begun to rethink the strategy of allowing regional teams to maintain independent databases that are periodically consolidated with the head office. As businesses extend their reach globally, these hierarchical approaches no longer work. Instead, an enterprise's entire data infrastructure, including multiple types of data persistence, needs to be shared and updated everywhere at the same time with fine-grained control over who has access. This practical report examines the requirements and challenges of constructing a geo-distributed data platform, including examples of specific technologies designed to meet them. Authors Ted Dunning and Ellen Friedman also provide real-world use cases that show how low-latency geo-distribution of very large-scale data and computation provides a competitive edge. With this report, you'll explore: how replication and mirroring methods for data movement provide the large scale, low latency, and low cost that systems demand; the importance of multimaster replication of data streams and databases; advantages (and disadvantages) of cloud neutrality, cloud bursting, and hybrid cloud architecture for transferring data; why effective data governance is a complex process that requires the right tools for controlling and monitoring geo-distributed data; how to make containers work for geo-distributed data at scale, even where stateful applications are involved; and use cases that demonstrate how telecoms and online advertisers distribute large quantities of data.

Note: Includes bibliographical references. - Description based on online resource; title from title page (Safari, viewed September 12, 2018)

URL: https://learning.oreilly.com/library/view/-/9781491983577/?ar

Online Resource

MPI Ethno. Forsch.

3

Online Resource

Online evaluation of machine learning models (2019)

Dunning, Ted [Author]

[Place of publication not identified] : O'Reilly Media, Inc. | Boston, MA : Safari

Details

Language: English

Pages: 1 online resource (1 video file, approximately 41 min.)

Edition: 1st edition

Keywords: Electronic videos ; local

Abstract: Academic machine learning almost exclusively involves offline evaluation of machine learning models. In the real world this is, somewhat surprisingly, only good enough for a rough cut that eliminates the real dogs. For production work, online evaluation is often the only option to determine which of several final-round candidates might be chosen for further use. As Einstein is rumored to have said, theory and practice are the same, in theory. In practice, they are different. So it is with models. Part of the problem is interaction with other models and systems. Part of the problem has to do with the variability of the real world. Often, there are adversaries at work. It may even be sunspots. One particular problem arises when models choose their own training data and thus couple back onto themselves. In addition to these difficulties, production models almost always have service-level agreements that have to do with how quickly they must produce results and how often they are allowed to fail. These operational considerations can be as important as the accuracy of the model: the right results returned late are worse than slightly wrong results returned in time. Ted Dunning (MapR) offers a survey of useful ways to evaluate models in the real world, breaking the problem of evaluation apart into operational and function evaluation and demonstrating how to do each without unnecessary pain and suffering. You'll learn about decoy and canary models, nonlinear latency histogramming, model-delta diagrams, and more. These techniques may sound arcane, but each is simple at heart and doesn't require any advanced mathematics to understand. Along the way, he shares exciting visualization techniques that will help make differences strikingly apparent. This session was recorded at the 2019 O'Reilly Strata Data Conference in San Francisco.

Note: Online resource; Title from title screen (viewed October 31, 2019)

URL: license required

Online Resource

MPI Ethno. Forsch.

4

Online Resource

Sharing big data safely : managing data security (2015)

Dunning, Ted [Author] 1956- ; Friedman, B. Ellen [Contributor]

Sebastopol, CA : O'Reilly Media

Details

Language: English

Pages: 1 online resource (1 volume) , illustrations

Edition: First edition.

Keywords: Big data ; Security measures ; Data protection ; Electronic books ; Electronic books ; local

Abstract: Many big data-driven companies today are moving to protect certain types of data against intrusion, leaks, or unauthorized eyes. But how do you lock down data while granting access to people who need to see it? In this practical book, authors Ted Dunning and Ellen Friedman offer two novel and practical solutions that you can implement right away.

Note: Includes bibliographical references. - Description based on online resource; title from title page (viewed January 13, 2016)

URL: https://learning.oreilly.com/library/view/-/9781491953624/?ar

Online Resource

MPI Ethno. Forsch.

5

Online Resource

Practical machine learning : innovations in recommendation (2014)

Dunning, Ted [Author] 1956- ; Friedman, B. Ellen [Contributor]

Sebastopol, CA : O'Reilly Media

Details

Language: English

Pages: 1 online resource (1 v.) , ill.

Keywords: Machine learning ; Development ; Machine learning ; Case studies ; Electronic books ; Electronic books ; local

Abstract: Building a simple but powerful recommendation system is much easier than you think. Approachable for all levels of expertise, this report explains innovations that make machine learning practical for business production settings and demonstrates how even a small-scale development team can design an effective large-scale recommendation system. Apache Mahout committers Ted Dunning and Ellen Friedman walk you through a design that relies on careful simplification. You'll learn how to collect the right data, analyze it with an algorithm from the Mahout library, and then easily deploy the recommender using search technology, such as Apache Solr or Elasticsearch. Powerful and effective, this efficient combination does learning offline and delivers rapid response recommendations in real time. Understand the tradeoffs between simple and complex recommenders. Collect user data that tracks user actions rather than their ratings. Predict what a user wants based on behavior by others, using Mahout for co-occurrence analysis. Use search technology to offer recommendations in real time, complete with item metadata. Watch the recommender in action with a music service example. Improve your recommender with dithering, multimodal recommendation, and other techniques.

Note: Includes bibliographical references. - Description based on print version record

URL: https://learning.oreilly.com/library/view/-/9781491915707/?ar

Online Resource

MPI Ethno. Forsch.

6

Online Resource

AI and analytics in production : how to make it work (2018)

Dunning, Ted [Author] 1956- ; Friedman, Ellen [Contributor]

Sebastopol, CA : O'Reilly Media

Details

Language: English

Pages: 1 online resource (1 volume) , illustrations

Edition: First edition.

Keywords: Real-time data processing ; Electronic data processing ; Distributed processing ; Data mining ; Electronic books ; Electronic books ; local

Abstract: If you've begun to deploy large-scale data systems into production, or have at least explored the process, this practical ebook shows business team leaders, business analysts, and technical developers how to make your big data analytics, machine learning, and AI initiatives production ready. Authors Ted Dunning and Ellen Friedman provide a non-technical guide to best practices for a process that can be quite challenging. Rather than provide a complex review of tools, this ebook explores fundamental ideas on how to make your analytics production easier and more effective, based on the authors' observations across a wide range of industries. Whether your organization is just getting started or already has data-driven applications in production, you'll find content that will help you succeed. Gain an understanding of the goals, challenges, and potential pitfalls of deploying analytics and AI to production. Learn the best way to design, plan, and execute large data systems in production. Focus on the special case of machine learning and AI in production. Examine MapR, a data platform with the technical capabilities to support emerging trends for large-scale data. Explore a range of design patterns that work well for production customers across various sectors. Get best practices for avoiding various gotchas as you move to production.

Note: Includes bibliographical references. - Description based on online resource; title from title page (Safari, viewed November 5, 2018)

URL: https://learning.oreilly.com/library/view/-/9781492044116/?ar

Online Resource

MPI Ethno. Forsch.

7

Online Resource

Real-world Hadoop (2015)

Dunning, Ted [Author] 1956- ; Friedman, B. Ellen [Contributor]

Sebastopol, CA : O'Reilly Media

Details

Language: English

Pages: 1 online resource (1 volume) , illustrations

Parallel Title: Also published as

Keywords: Apache Hadoop ; Electronic data processing ; Distributed processing ; File organization (Computer science) ; Cloud computing ; Electronic books ; Electronic books ; local

Abstract: If you're a business team leader, CIO, business analyst, or developer interested in how Apache Hadoop and Apache HBase-related technologies can address problems involving large-scale data in cost-effective ways, this book is for you. Using real-world stories and situations, authors Ted Dunning and Ellen Friedman show Hadoop newcomers and seasoned users alike how NoSQL databases and Hadoop can solve a variety of business and research issues. You'll learn about early decisions and pre-planning that can make the process easier and more productive. If you're already using these technologies, you'll discover ways to gain the full range of benefits possible with Hadoop. While you don't need a deep technical background to get started, this book does provide expert guidance to help managers, architects, and practitioners succeed with their Hadoop projects. Examine a day in the life of big data: India's ambitious Aadhaar project. Review tools in the Hadoop ecosystem such as Apache's Spark, Storm, and Drill to learn how they can help you. Pick up a collection of technical and strategic tips that have helped others succeed with Hadoop. Learn from several prototypical Hadoop use cases, based on how organizations have actually applied the technology. Explore real-world stories that reveal how MapR customers combine use cases when putting Hadoop and NoSQL to work, including in production. Ted Dunning is Chief Applications Architect at MapR Technologies, and committer and PMC member of Apache's Drill, Storm, Mahout, and ZooKeeper projects. He is also a mentor for Apache's Datafu, Kylin, Zeppelin, Calcite, and Samoa projects. Ellen Friedman is a solutions consultant, speaker, and author, writing mainly about big data topics. She is a committer for the Apache Mahout project and a contributor to the Apache Drill project.

Note: Description based on print version record

URL: https://learning.oreilly.com/library/view/-/9781491928899/?ar

Online Resource

MPI Ethno. Forsch.

8

Online Resource

Time series databases : new ways to store and access data (2014)

Dunning, Ted [Author] 1956- ; Friedman, B. Ellen [Contributor]

Sebastopol, CA : O'Reilly Media

Details

Language: English

Pages: 1 online resource (1 volume) , illustrations

Keywords: Time-series analysis ; Data processing ; Information storage and retrieval systems ; Electronic books ; Electronic books ; local

Abstract: Time series data is of growing importance, especially with the rapid expansion of the Internet of Things. This concise guide shows you effective ways to collect, persist, and access large-scale time series data for analysis. You'll explore the theory behind time series databases and learn practical methods for implementing them. Authors Ted Dunning and Ellen Friedman provide a detailed examination of open source tools such as OpenTSDB and new modifications that greatly speed up data ingestion.

Note: Includes bibliographical references. - Description based on online resource; title from title page (Safari, viewed January 5, 2015)

URL: https://learning.oreilly.com/library/view/-/9781491920909/?ar

Online Resource

MPI Ethno. Forsch.

9

Online Resource

Practical machine learning : a new look at anomaly detection (2014)

Dunning, Ted [Author] 1956- ; Friedman, Ellen [Contributor]

Sebastopol, CA : O'Reilly Media

Details

Language: English

Pages: 1 online resource (1 v.) , ill.

Keywords: Machine learning ; Anomaly detection (Computer security) ; Electronic books ; Electronic books ; local

Abstract: Anomaly detection is the detective work of machine learning: finding the unusual, catching the fraud, discovering strange activity in large and complex datasets. But, unlike Sherlock Holmes, you may not know what the puzzle is, much less what "suspects" you're looking for. This O'Reilly report uses practical examples to explain how the underlying concepts of anomaly detection work.

Note: Description based on online resource; title from title page (Safari, viewed Aug. 29, 2014)

URL: https://learning.oreilly.com/library/view/-/9781491914151/?ar

Online Resource

MPI Ethno. Forsch.

10

Online Resource

Practical feature engineering (2020)

Dunning, Ted [Author]

[Place of publication not identified] : O'Reilly Media, Inc. | Boston, MA : Safari

Details

Language: English

Pages: 1 online resource (1 video file, approximately 39 min.)

Edition: 1st edition

Keywords: Electronic videos ; local

Abstract: Feature engineering is generally the section that gets left out of machine learning books, but it’s also the most important part of successful models, even in today’s world of deep learning. While academic courses on machine learning focus on gradients and the latest flavor of recurrent network, Ted Dunning (MapR) explores the techniques that practitioners in the real world use to seek out better features and figure out how to extract value with a variety of time-honored (and occasionally exceptionally clever) heuristics. In a sense, feature engineering is the Rodney Dangerfield of machine learning, never getting any respect. It is, however, the task that will get you the most value for time spent in terms of model performance. This work is not just the work of the data scientist. Good features encode business realities as well and are the cross-product of good business sense and good data engineering. Prerequisite knowledge: a basic understanding of how machine learning is used to teach models. What you'll learn: some surprising techniques that can help you solve some really hard problems. This session is from the 2019 O'Reilly Strata Conference in New York, NY.

Note: Online resource; Title from title screen (viewed February 28, 2020) , Mode of access: World Wide Web.

URL: license required

Online Resource

MPI Ethno. Forsch.