Speech is the natural medium of human communication, but audible speech can be overheard by bystanders and excludes speech-disabled people. This work presents a speech recognizer based on surface electromyography, where electric potentials of the facial muscles are captured by surface electrodes, allowing speech to be processed nonacoustically. A system which was state-of-the-art at the beginning of this book is substantially improved in terms of accuracy, flexibility, and robustness.
Spoken Dialogue Systems Technology and Design covers key topics in the field of spoken language dialogue interaction from a variety of leading researchers. It brings together several perspectives in the areas of corpus annotation and analysis, dialogue system construction, as well as theoretical perspectives on communicative intention, context-based generation, and modelling of discourse structure. These topics are all part of the general research and development within the area of discourse and dialogue, with an emphasis on dialogue systems, corpora and corpus tools, and semantic and pragmatic modelling of discourse and dialogue.
Before designing a speech application system, three key questions have to be answered: who will use it, why, and how often? This book focuses on these high-level questions and gives criteria for when and how to design speech systems. After an introduction, the state of the art in modern voice user interfaces is presented. The book goes on to develop criteria for designing and evaluating successful voice user interfaces. Trends in this fast-growing area are also presented.
In this landmark project, Moratto and Zhang evaluate how conference interpreting developed as a profession in China, and the directions in which it is heading. Bringing together perspectives from leading researchers in the field, Moratto and Zhang present a thematically organized analysis of the trajectory of professional conference interpreting in China. This includes discussion of the pedagogies used both currently and historically, the professionalization of interpreter education, and future prospects for virtual reality, multimodal conferences, and artificial intelligence. Taken as a whole, the contributors present a rich and detailed picture of the development of conference interpreting in China since 1979, its status today, and how it is likely to develop in the coming decades. An essential resource for scholars and students of conference interpreting in China, alongside its sister volume, The Pioneers of Chinese Interpreting: Insiders’ Accounts on the Rise of a Profession.
This book provides the reader with the knowledge necessary for comprehension of the field of Intelligent Audio Analysis. It first introduces standard methods and discusses the typical Intelligent Audio Analysis chain going from audio data to audio features to audio recognition. Further, an introduction to audio source separation, enhancement, and robustness is given. After the introductory parts, the book shows several applications for the three types of audio: speech, music, and general sound. Each task is briefly introduced, followed by a description of the specific data and methods applied, experiments and results, and a conclusion for this specific task. The book provides benchmark results and standardized test-beds for a broader range of audio analysis tasks. The main focus thereby lies on the parallel advancement of realism in audio analysis, as too often today’s results are overly optimistic owing to idealized testing conditions. The book also serves to stimulate synergies arising from the transfer of methods, leading to a holistic audio analysis.
This book integrates a wide range of research topics related to and necessary for the development of proactive, smart computers in the human interaction loop, including the development of audio-visual perceptual components for such environments; the design, implementation and analysis of novel proactive perceptive services supporting humans; the development of software architectures, ontologies and tools necessary for building such environments and services; and approaches for the evaluation of such technologies and services. The book is based on a major European Integrated Project, CHIL (Computers in the Human Interaction Loop), and throws light on the paradigm shift in the area of HCI: rather than humans interacting directly with machines, computers should observe and understand human interaction, and support humans during their work and interaction in an implicit and proactive manner.
Speech processing and speech transmission technology are expanding fields of active research. New challenges arise from the 'anywhere, anytime' paradigm of mobile communications, the ubiquitous use of voice communication systems in noisy environments, and the convergence of communication networks toward Internet-based transmission protocols, such as Voice over IP. As a consequence, new speech coding, new enhancement and error concealment, and new quality assessment methods are emerging. Advances in Digital Speech Transmission provides an up-to-date overview of the field, including topics such as speech coding in heterogeneous communication networks, wideband coding, and the quality assessment...
This book constitutes the refereed proceedings of the 26th Symposium of the German Association for Pattern Recognition, DAGM 2004, held in Tübingen, Germany in August/September 2004. The 22 revised papers and 48 revised poster papers presented were carefully reviewed and selected from 146 submissions. The papers are organized in topical sections on learning, Bayesian approaches, vision and faces, vision and motion, biologically motivated approaches, segmentation, object recognition, and object recognition and synthesis.
The two-volume set LNCS 8935 and 8936 constitutes the thoroughly refereed proceedings of the 21st International Conference on Multimedia Modeling, MMM 2015, held in Sydney, Australia, in January 2015. The 49 revised regular papers and 24 poster presentations were carefully reviewed and selected from 189 submissions. For the three special sessions, a total of 18 papers were accepted for MMM 2015. The three special sessions are Personal (Big) Data Modeling for Information Access and Retrieval; Social Geo-Media Analytics and Retrieval; and Image or Video Processing, Semantic Analysis and Understanding. In addition, 9 demonstrations and 9 video showcase papers were accepted for MMM 2015. The accepted contributions included in these two volumes represent the state of the art in multimedia modeling research and cover a diverse range of topics, including image and video processing, multimedia encoding and streaming, applications of multimedia modelling, and 3D and augmented reality.