Entity Resolution (ER) lies at the core of data integration and cleaning and, thus, the bulk of the research examines ways of improving its effectiveness and time efficiency. The initial ER methods primarily target Veracity in the context of structured (relational) data that are described by a schema of well-known quality and meaning. To achieve high effectiveness, they leverage schema, expert, and/or external knowledge. Some of these methods have been extended to address Volume, processing large datasets through multi-core or massive parallelization approaches, such as the MapReduce paradigm. However, these early schema-based approaches are inapplicable to Web Data, which abound in voluminous, noi...
This book outlines the basic principles of creating and maintaining taxonomies and thesauri. It also provides step-by-step instructions for building a taxonomy or thesaurus and discusses the various ways to get started on a taxonomy construction project. Often, the first step is to get management and budgetary approval, so I start this book with a discussion of reasons to embark on the taxonomy journey. From there I move on to a discussion of metadata and how taxonomies and metadata are related, and then consider how, where, and why taxonomies are used. Information architecture has its cornerstone in taxonomies and metadata. While a good discussion of information architecture is beyond th...
Many data-intensive applications that use machine learning or artificial intelligence techniques depend on humans to provide the initial dataset, which enables algorithms to process the rest or lets other humans evaluate the performance of such algorithms. Not only can labeled data for training and evaluation be collected faster, cheaper, and easier than ever before, but we now see the emergence of hybrid human-machine software that combines computations performed by humans and machines. There are, however, real-world practical issues with the adoption of human computation and crowdsourcing. Building systems and data processing pipelines that require crowd computing remains difficult. In this book, we present practical considerations for designing and implementing tasks that require the use of humans and machines in combination, with the goal of producing high-quality labels.
This book constitutes the refereed proceedings of the 6th International Conference on Asian Digital Libraries, ICADL 2003, held in Kuala Lumpur, Malaysia, in December 2003. The 68 revised full papers presented together with 15 poster abstracts and 3 invited papers were carefully reviewed and selected from numerous submissions. The papers are organized in topical sections on information retrieval techniques, multimedia digital libraries, data mining and digital libraries, machine architecture and organization, human resources and training, human-computer interaction, digital library infrastructure, building and using digital libraries, knowledge management, intellectual property rights and copyright, e-learning and mobile learning, data storage and retrieval, digital library services, content development, information retrieval and Asian languages, and metadata.
Everybody knows what relevance is. It is a "ya'know" notion, concept, idea: no need to explain whatsoever. Searching for relevant information using information technology (IT) has become a ubiquitous activity in contemporary information society. Relevant information means information that pertains to the matter or problem at hand; it is directly connected with effective communication. The purpose of this book is to trace the evolution, and with it the history, of thinking and research on relevance in information science and related fields from the human point of view. The objective is to synthesize what we have learned about relevance in several decades of investigation about the notion in infor...
This book is intended for anyone interested in learning more about how search works and how it is evaluated. We all use search; it's a familiar utility. Yet few of us stop and think about how search works, what makes search results good, and who, if anyone, decides what good looks like. Search has a long and glorious history, yet it continues to evolve, and with it, the measurement and our understanding of the kinds of experiences search can deliver continue to evolve as well. We will discuss the basics of how search engines work, how humans use search engines, and how measurement works. Equipped with these general topics, we will then dive into the established ways of measuring search u...
This book introduces the fundamentals of information communication. First, the concepts and characteristics of information and information communication are summarized. Then, five classic models of information communication are introduced. The mechanisms and fundamental laws of the information transmission process are also discussed. To show what stands in the way of information communication, impediments in the information communication process are identified and analyzed. To investigate the implications of Internet information communication, patterns and characteristics of information communication in the Internet and Web 2.0 environment are also analyzed. Finally, case studies are provided to help readers understand the theory.
The field of human information behavior runs the gamut of processes from the realization of a need or gap in understanding, to the search for information from one or more sources to fill that gap, to the use of that information to complete a task at hand or to satisfy a curiosity, as well as other behaviors such as avoiding information or finding information serendipitously. Designers of mechanisms, tools, and computer-based systems to facilitate this seeking and search process often lack a full knowledge of the context surrounding the search. This context may vary depending on the job or role of the person; individual characteristics such as personality, domain knowledge, age, gender, perce...
Citation analysis—the exploration of reference patterns in the scholarly and scientific literature—has long been applied in a number of social sciences to study research impact, knowledge flows, and knowledge networks. It has important information science applications as well, particularly in knowledge representation and in information retrieval. Recent years have seen a burgeoning interest in citation analysis to help address research, management, or information service issues such as university rankings, research evaluation, or knowledge domain visualization. This renewed and growing interest stems from significant improvements in the availability and accessibility of digital bibliogra...
The rise of social media technologies has created new ways to seek and share information for millions of users worldwide, but has also presented new challenges for libraries in meeting users where they are within social spaces. From social networking sites such as Facebook and Google+ and microblogging platforms such as Twitter and Tumblr, to image and video sites such as YouTube, Flickr, and Instagram, to geotagging sites such as Foursquare, libraries have responded by establishing footholds within a variety of social media platforms and seeking new ways of engaging with online users in social spaces. Libraries are also responding to new social review sites such as Yelp and Tripadvisor, awar...