Research Software Engineering: A Guide to the Open Source Ecosystem strives to give a big-picture overview and an understanding of the opportunities of programming as an approach to analytics and statistics. The book argues that a solid "programming" skill level is not only well within reach for many but also worth pursuing for researchers and business analysts. The ability to write a program leverages field-specific expertise and fosters interdisciplinary collaboration as source code continues to become an important communication channel. Given the pace of development in data science, many senior researchers and mentors, as well as non-computer-science curricula, lack a basic software engineering component. This book fills the gap by providing a dedicated programming-with-data resource to both academic scholars and practitioners. Key Features: Overview: breakdown of complex data science software stacks into core components. Applied: source code of figures, tables and examples available and reproducible solely with license cost-free, open source software. Reader guidance: different entry points and rich references to deepen the understanding of selected aspects.
Unlike the first edition, the new edition has been split into two books. Thoroughly revised and updated, this is the first book of the second edition of Introduction to Data Science: Data Wrangling and Visualization with R. It introduces skills that can help you tackle real-world data analysis challenges. These include R programming, data wrangling with dplyr, data visualization with ggplot2, file organization with UNIX/Linux shell, version control with Git and GitHub, and reproducible document preparation with Quarto and knitr. The new edition includes additional material/chapters on data.table, locales, and accessing data through APIs. The book is divided into four parts: R, Data Visualiza...
Mathematical Engineering of Deep Learning provides a complete and concise overview of deep learning using the language of mathematics. The book provides a self-contained background on machine learning and optimization algorithms and progresses through the key ideas of deep learning. These ideas and architectures include deep neural networks, convolutional models, recurrent models, long/short-term memory, the attention mechanism, transformers, variational auto-encoders, diffusion models, generative adversarial networks, reinforcement learning, and graph neural networks. Concepts are presented using simple mathematical equations together with a concise description of relevant tricks of the tra...
Spatial data is crucial to improve decision-making in a wide range of fields including environment, health, ecology, urban planning, economy, and society. Spatial Statistics for Data Science: Theory and Practice with R describes statistical methods, modeling approaches, and visualization techniques to analyze spatial data using R. The book provides a comprehensive overview of the varying types of spatial data, and detailed explanations of the theoretical concepts of spatial statistics, alongside fully reproducible examples which demonstrate how to simulate, describe, and analyze spatial data in various applications. Combining theory and practice, the book includes real-world data science exa...
Classification problems are common in business, medicine, science, engineering and other sectors of the economy. Data scientists and machine learning professionals solve these problems through the use of classifiers. Choosing one of these data-driven classification algorithms for a given problem is a challenging task. An important aspect involved in this task is classifier performance analysis (CPA). Introduction to Classifier Performance Analysis with R provides an introductory account of commonly used CPA techniques for binary and multiclass problems, and use of the R software system to accomplish the analysis. Coverage draws on the extensive literature available on the subject, including ...
Data Science: A First Introduction with Python focuses on using the Python programming language in Jupyter notebooks to perform data manipulation and cleaning, create effective visualizations, and extract insights from data using classification, regression, clustering, and inference. It emphasizes workflows that are clear, reproducible, and shareable, and includes coverage of the basics of version control. Based on educational research and active learning principles, the book uses a modern approach to Python and includes accompanying autograded Jupyter worksheets for interactive, self-directed learning. The text will leave readers well-prepared for data science projects. It is designed for l...
Data graphics are used extensively to present information. Understanding graphics is a lot about understanding the data represented by the graphics, having a feel not just for the numbers themselves, the reliability and uncertainty associated with them, but also for what they mean. This book presents a practical approach to data visualisation with real applications front and centre. The first part of the book is a series of case studies, each describing a graphical analysis of a real dataset. The second part pulls together ideas from the case studies and provides an overview of the main factors affecting understanding graphics. Key Features: Explains how to get insights from graphics. Emphasises the value of drawing many graphics. Underlines the importance for analysis of background knowledge and context. Readers may be data scientists, statisticians or people who want to become more visually literate. A knowledge of Statistics is not required, just an interest in data graphics and some experience of working with data. It will help if the reader knows something of basic graphic forms such as barcharts, histograms, and scatterplots.
The Data Preparation Journey: Finding Your Way With R introduces the principles of data preparation within a systematic approach that follows a typical data science or statistical workflow. With that context, readers will work through practical solutions to resolving problems in data using the statistical and data science programming language R. These solutions include examples of complex real-world data, adding greater context and exposing the reader to greater technical challenges. This book focuses on the Import to Tidy to Transform steps. It demonstrates how “Visualise” is an important part of Exploratory Data Analysis, a strategy for identifying potential problems with the data p...
The field of artificial intelligence, data science, and analytics is crippling itself. Exaggerated promises of unrealistic technologies, simplifications of complex projects, and marketing hype are leading to an erosion of trust in one of our most critical approaches to making decisions: the data-driven approach. This book aims to fix this by countering the AI hype with a dose of realism. The authors, two experts in the field, firmly believe in the power of mathematics, computing, and analytics, but if false expectations are set and practitioners and leaders don’t fully understand everything that really goes into data science projects, then a stunning 80% (or more) of analytics projects will continue to fail, costing enterprises and society hundreds of billions of dollars, and leading to non-experts abandoning one of the most important data-driven decision-making capabilities altogether. For the first time, business leaders, practitioners, students, and interested laypeople will learn what really makes a data science project successful. By illustrating with many personal stories, the authors reveal the harsh realities of implementing AI and analytics.
Data Scientists are experts at analyzing, modelling and visualizing data but, at one point or another, have all encountered difficulties in collaborating with or delivering their work to the people and systems that matter. Born out of the agile software movement, DevOps is a set of practices, principles and tools that help software engineers reliably deploy work to production. This book takes the lessons of DevOps and applies them to creating and delivering production-grade data science projects in Python and R. This book’s first section explores how to build data science projects that deploy to production with no frills or fuss. Its second section covers the rudiments of administering a se...