Search Engines: Information Retrieval in Practice is ideal for introductory information retrieval courses at the undergraduate and graduate levels in computer science, information science, and computer engineering departments. It is also a valuable tool for search engine and information retrieval professionals. Written by a leader in the field of information retrieval, Search Engines: Information Retrieval in Practice is designed to give undergraduate students the understanding and tools they need to evaluate, compare, and modify search engines. Coverage of the underlying IR and mathematical models reinforces key concepts. The book's numerous programming exercises make extensive use of Galago, a Java-based open-source search engine.
A statistical language model, or more simply a language model, is a probabilistic mechanism for generating text. Such a definition is general enough to include an endless variety of schemes. However, a distinction should be made between generative models, which can in principle be used to synthesize artificial text, and discriminative techniques to classify text into predefined categories. The first statistical language modeler was Claude Shannon. In exploring the application of his newly founded theory of information to human language, Shannon considered language as a statistical source, and measured how well simple n-gram models predicted or, equivalently, compressed natural text. To do this...
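Shannon's link between prediction and compression is easy to make concrete. The following is a minimal sketch, not taken from the book: a bigram model with add-one smoothing (the smoothing choice and all function names here are our own illustrative assumptions), scored by the average number of bits per token it needs to encode a toy text. A better-predicting model yields fewer bits.

```python
import math
from collections import Counter

def train_bigram_model(tokens):
    """Estimate bigram probabilities P(w_i | w_{i-1}) from a token list,
    using add-one smoothing so unseen pairs keep nonzero probability."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    vocab_size = len(unigrams)

    def prob(prev, word):
        return (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)

    return prob

def cross_entropy(prob, tokens):
    """Average negative log2 probability per token: the bits the model
    needs to encode the text, tying prediction quality to compression."""
    bits = -sum(math.log2(prob(p, w)) for p, w in zip(tokens, tokens[1:]))
    return bits / (len(tokens) - 1)

text = "the cat sat on the mat and the dog sat on the rug".split()
model = train_bigram_model(text)
print(f"{cross_entropy(model, text):.2f} bits per token")
```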
The amounts of information that are flooding people both at the workplace and in private life have increased dramatically in the past ten years. The number of paper documents doubles every four years, and the amount of information stored on all data carriers every six years. New knowledge, however, increases at a considerably lower rate. Possibilities for automatic content recognition in various media and for the processing of documents are therefore becoming more important every day. Especially in economic terms, the efficient handling of information, i.e., finding the right information at the right time, is an invaluable resource for any enterprise, but it is particularly important for small-...
The symposium on which this volume was based brought together approximately fifty scientists from a variety of backgrounds to discuss the rapidly emerging set of competing technologies for exploiting a massive quantity of textual information. This group was challenged to explore new ways to take advantage of the power of on-line text. A billion words of text can be more generally useful than a few hundred logical rules, if advanced computation can extract useful information from streams of text and help find what is needed in the sea of available material. While the extraction task is a hot topic for the field of natural language processing and the retrieval task is a solid aspect in the fie...
The NSF Center for Intelligent Information Retrieval (CIIR) was formed in the Computer Science Department of the University of Massachusetts, Amherst, in 1992. Through its efforts in basic research, applied research, and technology transfer, the CIIR has become known internationally as one of the leading research groups in the area of information retrieval. The CIIR focuses on research that results in more effective and efficient access and discovery in large, heterogeneous, distributed text and multimedia databases. The scope of the work that is done in the CIIR is broad and goes significantly beyond 'traditional' areas of information retrieval such as retrieval models, cross-lingual search...
Held in Gaithersburg, MD, November 2-4, 1994. The conference was co-sponsored by the National Institute of Standards and Technology (NIST) and the Advanced Research Projects Agency (ARPA) and was attended by 150 people involved in the 32 participating groups. Evaluates new technologies in text retrieval. Includes 34 papers: indexing structures, fragmentation schemes, probabilistic retrieval, latent semantic indexing, interactive document retrieval, and much more. Numerous graphs, tables, and charts.
The Turn analyzes the research of information seeking and retrieval (IS&R) and proposes a new direction of integrating research in these two areas: the fields should turn off their separate and narrow paths and construct a new avenue of research. An essential direction for this avenue is context, as given in the subtitle Integration of Information Seeking and Retrieval in Context. Other essential themes in the book include: IS&R research models, frameworks and theories; search and work tasks and situations in context; interaction between humans and machines; information acquisition, relevance and information use; research design and methodology based on a structured set of explicit variables...
Karen Spärck Jones is one of the major figures of 20th-century and early 21st-century computing and information processing. Her ideas have had an important influence on the development of Internet search engines. Her contribution has been recognized by awards from the natural language processing, information retrieval, and artificial intelligence communities, including being asked to present the prestigious Grace Hopper lecture. She continues to be an active and influential researcher. Her contribution to the scientific evaluation of the effectiveness of such computer systems has been quite outstanding. This book celebrates the life and work of Karen Spärck Jones in her seventieth year. It consists of fifteen new and original chapters written by leading international authorities reviewing the state of the art and her influence in the areas in which Karen Spärck Jones has been active. Although she has a publication record which goes back over forty years, it is clear that even the very early work reviewed in the book can be read with profit by those working on recent developments in information processing such as bioinformatics and the Semantic Web.
Information retrieval (IR) is becoming an increasingly important area as scientific, business and government organisations take up the notion of "information superhighways" and make available their full text databases for searching. Containing a selection of 35 papers taken from the 17th Annual SIGIR Conference held in Dublin, Ireland, in July 1994, the book addresses basic research and provides an evaluation of information retrieval techniques in applications. Topics covered include text categorisation, indexing, user modelling, IR theory and logic, natural language processing, statistical and probabilistic models of information retrieval systems, routing, passage retrieval, and implementation issues.
Collections of digital documents can nowadays be found everywhere in institutions, universities, and companies; Web sites and intranets are examples. But searching them for information can still be painful: searches often return either large numbers of matches or no suitable matches at all. Such document collections vary a lot in size and in how much structure they carry. What they have in common is that they typically do have some structure and that they cover a limited range of topics. The latter point distinguishes them significantly from the Web in general. The type of search system that we propose in this book can suggest ways of refining or relaxing the query to assist a user in the search process. In order to suggest sensible query modifications, we need to know what the documents are about, that is, explicit knowledge about the document collection encoded in some electronic form. However, such knowledge is typically not available, so we construct it automatically.
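Automatically constructed knowledge about a collection can take many forms; the sketch below is purely illustrative and is not the book's method. It shows one simple possibility: mining term co-occurrence statistics from the documents themselves and using them to suggest terms that could narrow a query returning too many matches. All function names and the toy documents are our own assumptions.

```python
from collections import Counter, defaultdict

def build_cooccurrence(docs):
    """Derive crude 'knowledge' about a collection automatically:
    for each term, count which other terms share a document with it."""
    cooc = defaultdict(Counter)
    for doc in docs:
        terms = set(doc.lower().split())
        for t in terms:
            for other in terms - {t}:
                cooc[t][other] += 1
    return cooc

def suggest_refinements(query, cooc, k=3):
    """Suggest terms that frequently co-occur with the query terms,
    as candidate additions for refining an overly broad query."""
    scores = Counter()
    for term in query.lower().split():
        scores.update(cooc.get(term, Counter()))
    for term in query.lower().split():
        scores.pop(term, None)  # never suggest the query's own terms
    return [t for t, _ in scores.most_common(k)]

docs = [
    "galago is a java search engine",
    "the search engine indexes web documents",
    "java libraries for text indexing",
]
print(suggest_refinements("search", build_cooccurrence(docs)))
```

A real system would weight co-occurrences (e.g., by inverse document frequency) rather than use raw counts, but the principle of deriving refinement knowledge from the collection itself is the same.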