Spoken Dialogue Systems Technology and Design covers key topics in the field of spoken language dialogue interaction from a variety of leading researchers. It brings together several perspectives in the areas of corpus annotation and analysis, dialogue system construction, as well as theoretical perspectives on communicative intention, context-based generation, and modelling of discourse structure. These topics are all part of the general research and development within the area of discourse and dialogue, with an emphasis on dialogue systems, corpora and corpus tools, and semantic and pragmatic modelling of discourse and dialogue.
Speech is the natural medium of human communication, but audible speech can be overheard by bystanders and excludes speech-disabled people. This work presents a speech recognizer based on surface electromyography, where electric potentials of the facial muscles are captured by surface electrodes, allowing speech to be processed nonacoustically. A system which was state-of-the-art at the beginning of this book is substantially improved in terms of accuracy, flexibility, and robustness.
Before designing a speech application system, three key questions have to be answered: who will use it, why, and how often? This book focuses on these high-level questions and gives criteria for when and how to design speech systems. After an introduction, the state of the art in modern voice user interfaces is presented. The book goes on to develop criteria for designing and evaluating successful voice user interfaces. Trends in this fast-growing area are also presented.
This book provides the reader with the knowledge necessary for comprehension of the field of Intelligent Audio Analysis. It first introduces standard methods and discusses the typical Intelligent Audio Analysis chain going from audio data to audio features to audio recognition. Further, introductions to audio source separation, enhancement, and robustness are given. After the introductory parts, the book shows several applications for the three types of audio: speech, music, and general sound. Each task is briefly introduced, followed by a description of the specific data and methods applied, experiments and results, and a conclusion for this specific task. The book provides benchmark results and standardized test-beds for a broader range of audio analysis tasks. The main focus lies on the parallel advancement of realism in audio analysis, as today’s results are too often overly optimistic owing to idealized testing conditions. The book also aims to stimulate synergies arising from the transfer of methods, leading toward a holistic audio analysis.
Speech processing and speech transmission technology are expanding fields of active research. New challenges arise from the 'anywhere, anytime' paradigm of mobile communications, the ubiquitous use of voice communication systems in noisy environments and the convergence of communication networks toward Internet based transmission protocols, such as Voice over IP. As a consequence, new speech coding, new enhancement and error concealment, and new quality assessment methods are emerging. Advances in Digital Speech Transmission provides an up-to-date overview of the field, including topics such as speech coding in heterogeneous communication networks, wideband coding, and the quality assessment...
This book constitutes the refereed proceedings of the 26th Symposium of the German Association for Pattern Recognition, DAGM 2004, held in Tübingen, Germany, in August/September 2004. The 22 revised papers and 48 revised poster papers presented were carefully reviewed and selected from 146 submissions. The papers are organized in topical sections on learning, Bayesian approaches, vision and faces, vision and motion, biologically motivated approaches, segmentation, object recognition, and object recognition and synthesis.
This work combines interdisciplinary knowledge and experience from research fields of psychology, linguistics, audio-processing, machine learning, and computer science. The work systematically explores a novel research topic devoted to automated modeling of personality expression from speech. For this aim, it introduces a novel personality assessment questionnaire and presents the results of extensive labeling sessions to annotate the speech data with personality assessments. It provides estimates of the Big 5 personality traits, i.e. openness, conscientiousness, extroversion, agreeableness, and neuroticism. Based on a database built on the questionnaire, the book presents models to tell apart different personality types or classes from speech automatically.
The two-volume set LNCS 8325 and 8326 constitutes the thoroughly refereed proceedings of the 20th Anniversary International Conference on Multimedia Modeling, MMM 2014, held in Dublin, Ireland, in January 2014. The 46 revised regular papers, 11 short papers, and 9 demonstration papers were carefully reviewed and selected from 176 submissions. 28 special session papers and 6 papers from the Video Browser Showdown workshop are also included in the proceedings. The papers included in these two volumes cover a diverse range of topics including: applications of multimedia modelling, interactive retrieval, image and video collections, 3D and augmented reality, temporal analysis of multimedia content, compression and streaming. Special session papers cover the following topics: Mediadrom: artful post-TV scenarios, MM analysis for surveillance video and security applications, 3D multimedia computing and modeling, social geo-media analytics and retrieval, multimedia hyperlinking and retrieval.
In this landmark project, Moratto and Zhang evaluate how conference interpreting developed as a profession in China, and the directions in which it is heading. Bringing together perspectives from leading researchers in the field, Moratto and Zhang present a thematically organized analysis of the trajectory of professional conference interpreting in China. This includes discussion of the pedagogies used both currently and historically, the professionalization of interpreter education, and future prospects for virtual reality, multimodal conferences, and artificial intelligence. Taken as a whole, the contributors present a rich and detailed picture of the development of conference interpreting in China since 1979, its status today, and how it is likely to develop in the coming decades. An essential resource for scholars and students of conference interpreting in China, alongside its sister volume, The Pioneers of Chinese Interpreting: Insiders’ Accounts on the Rise of a Profession.
The two-volume set LNCS 8935 and 8936 constitutes the thoroughly refereed proceedings of the 21st International Conference on Multimedia Modeling, MMM 2015, held in Sydney, Australia, in January 2015. The 49 revised regular papers and 24 poster presentations were carefully reviewed and selected from 189 submissions. For the three special sessions, a total of 18 papers were accepted for MMM 2015. The three special sessions are Personal (Big) Data Modeling for Information Access and Retrieval; Social Geo-Media Analytics and Retrieval; and Image or video processing, semantic analysis and understanding. In addition, 9 demonstrations and 9 video showcase papers were accepted for MMM 2015. The accepted contributions included in these two volumes represent the state of the art in multimedia modeling research and cover a diverse range of topics including: Image and Video Processing, Multimedia encoding and streaming, applications of multimedia modelling, and 3D and augmented reality.