You may have to Search all our reviewed books and magazines, click the sign up button below to create a free account.
Summary The Spark distributed data processing platform provides an easy-to-implement tool for ingesting, streaming, and processing data from any source. In Spark in Action, Second Edition, you’ll learn to take advantage of Spark’s core features and incredible processing speed, with applications including real-time computation, delayed evaluation, and machine learning. Spark skills are a hot commodity in enterprises worldwide, and with Spark’s powerful and flexible Java APIs, you can reap all the benefits without first learning Scala or Hadoop. Foreword by Rob Thomas. Purchase of the print book includes a free eBook in PDF, Kindle, and ePub formats from Manning Publications. About the t...
As data continues to grow and become more complex, organizations seek innovative solutions to manage their data effectively. Data mesh is one solution that provides a new approach to managing data in complex organizations. This practical guide offers step-by-step guidance on how to implement data mesh in your organization. In this book, Jean-Georges Perrin and Eric Broda focus on the key components of data mesh and provide practical advice supported by code. Data engineers, architects, and analysts will explore a simple and intuitive process for identifying key data mesh components and data products. You'll learn a consistent set of interfaces and access methods that make data products easy ...
Summary Spark in Action teaches you the theory and skills you need to effectively handle batch and streaming data using Spark. Fully updated for Spark 2.0. Purchase of the print book includes a free eBook in PDF, Kindle, and ePub formats from Manning Publications. About the Technology Big data systems distribute datasets across clusters of machines, making it a challenge to efficiently query, stream, and interpret them. Spark can help. It is a processing system designed specifically for distributed data. It provides easy-to-use interfaces, along with the performance you need for production-quality analytics and machine learning. Spark 2 also adds improved programming APIs, better performance...
Proven techniques and principles for modernizing legacy systems into new architectures that deliver serious competitive advantage. For a business to thrive, it needs a modern software architecture that is aligned with its corporate architecture. This book presents concrete practices that sync software, product, strategy, team dynamics, and work practices. You’ll evolve your technical and social architecture together, reducing needless dependencies and achieving faster flow of innovation across your organization. In Architecture Modernization: Socio-technical alignment of software, strategy, and structure you’ll learn how to: Identify strategic ambitions and challenges using listening and...
Many enterprises are investing in a next-generation data lake, hoping to democratize data at scale to provide business insights and ultimately make automated intelligent decisions. In this practical book, author Zhamak Dehghani reveals that, despite the time, money, and effort poured into them, data warehouses and data lakes fail when applied at the scale and speed of today's organizations. A distributed data mesh is a better choice. Dehghani guides architects, technical leaders, and decision makers on their journey from monolithic big data architecture to a sociotechnical paradigm that draws from modern distributed architecture. A data mesh considers domains as a first-class concern, applie...
Apache Spark is amazing when everything clicks. But if you haven’t seen the performance improvements you expected, or still don’t feel confident enough to use Spark in production, this practical book is for you. Authors Holden Karau and Rachel Warren demonstrate performance optimizations to help your Spark queries run faster and handle larger data sizes, while using fewer resources. Ideal for software engineers, data engineers, developers, and system administrators working with large-scale data applications, this book describes techniques that can reduce data infrastructure costs and developer hours. Not only will you gain a more comprehensive understanding of Spark, you’ll also learn ...
Make Software Architecture Choices That Maximize Value and Innovation "[Vernon and Jaskuła] provide insights, tools, proven best practices, and architecture styles both from the business and engineering viewpoint. . . . This book deserves to become a must-read for practicing software engineers, executives as well as senior managers." --Michael Stal, Certified Senior Software Architect, Siemens Technology Strategic Monoliths and Microservices helps business decision-makers and technical team members clearly understand their strategic problems through collaboration and identify optimal architectural approaches, whether the approach is distributed microservices, well-modularized monoliths, or ...
Update Your Architectural Practices for New Challenges, Environments, and Stakeholder Expectations "I am continuously delighted and inspired by the work of these authors. Their first book laid the groundwork for understanding how to evolve the architecture of a software-intensive system, and this latest one builds on it in some wonderfully actionable ways." --Grady Booch, Chief Scientist for Software Engineering, IBM Research Authors Murat Erder, Pierre Pureur, and Eoin Woods have taken their extensive software architecture experience and applied it to the practical aspects of software architecture in real-world environments. Continuous Architecture in Practice provides hands-on advice for l...
The perimeter defenses guarding your network perhaps are not as secure as you think. Hosts behind the firewall have no defenses of their own, so when a host in the "trusted" zone is breached, access to your data center is not far behind. That’s an all-too-familiar scenario today. With this practical book, you’ll learn the principles behind zero trust architecture, along with details necessary to implement it. The Zero Trust Model treats all hosts as if they’re internet-facing, and considers the entire network to be compromised and hostile. By taking this approach, you’ll focus on building strong authentication, authorization, and encryption throughout, while providing compartmentalized access and better operational agility. Understand how perimeter-based defenses have evolved to become the broken model we use today Explore two case studies of zero trust in production networks on the client side (Google) and on the server side (PagerDuty) Get example configuration for open source tools that you can use to build a zero trust network Learn how to migrate from a perimeter-based network to a zero trust network in production
Quickly find solutions to common programming problems encountered while processing big data. Content is presented in the popular problem-solution format. Look up the programming problem that you want to solve. Read the solution. Apply the solution directly in your own code. Problem solved! PySpark Recipes covers Hadoop and its shortcomings. The architecture of Spark, PySpark, and RDD are presented. You will learn to apply RDD to solve day-to-day big data problems. Python and NumPy are included and make it easy for new learners of PySpark to understand and adopt the model. What You Will Learn Understand the advanced features of PySpark2 and SparkSQL Optimize your code Program SparkSQL with Python Use Spark Streaming and Spark MLlib with Python Perform graph analysis with GraphFrames Who This Book Is For Data analysts, Python programmers, big data enthusiasts