State-of-the-art database systems manage and process a variety of complex objects, including strings and trees. For such objects equality comparisons are often not meaningful and must be replaced by similarity comparisons. This book describes the concepts and techniques to incorporate similarity into database systems. We start out by discussing the properties of strings and trees, and identify the edit distance as the de facto standard for comparing complex objects. Since the edit distance is computationally expensive, token-based distances have been introduced to speed up edit distance computations. The basic idea is to decompose complex objects into sets of tokens that can be compared effi...
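To make the contrast between the two families of distances concrete, here is a minimal sketch for strings only. It is an illustration under stated assumptions, not the book's own algorithms: the function names, the choice of q-grams as tokens, the `#` padding character, and the default `q=2` are all assumptions introduced for this example.

```python
from collections import Counter


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via dynamic programming, O(len(a) * len(b)) time."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def qgrams(s: str, q: int = 2) -> Counter:
    """Bag (multiset) of q-grams of s, padded with '#' (an assumed pad symbol)."""
    padded = "#" * (q - 1) + s + "#" * (q - 1)
    return Counter(padded[i:i + q] for i in range(len(padded) - q + 1))


def qgram_distance(a: str, b: str, q: int = 2) -> int:
    """Token-based distance: size of the symmetric bag difference of q-grams."""
    ga, gb = qgrams(a, q), qgrams(b, q)
    return sum(((ga - gb) + (gb - ga)).values())


print(edit_distance("kitten", "sitting"))   # 3
print(qgram_distance("kitten", "sitting"))  # 11
```

The token-based function only counts and compares bags of tokens, so it avoids the quadratic table that the edit distance fills in; this is the kind of cheaper comparison the blurb alludes to.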
Entity Resolution (ER) lies at the core of data integration and cleaning and, thus, the bulk of the research examines ways of improving its effectiveness and time efficiency. The initial ER methods primarily target Veracity in the context of structured (relational) data that are described by a schema of well-known quality and meaning. To achieve high effectiveness, they leverage schema, expert, and/or external knowledge. Some of these methods have been extended to address Volume, processing large datasets through multi-core or massively parallel approaches, such as the MapReduce paradigm. However, these early schema-based approaches are inapplicable to Web Data, which abound in voluminous, noi...
Interacting with graphs using queries has emerged as an important research problem for real-world applications that center on large graph data. Given the syntactic complexity of graph query languages (e.g., SPARQL, Cypher), visual graph query interfaces make it easy for non-programmers to query such graph data repositories. In this book, we present recent developments in the emerging visual graph querying paradigm, which bridges traditional graph querying with human-computer interaction (HCI). Specifically, we focus on techniques that emphasize deep integration between the visual graph query interface and the underlying graph query engine. We discuss various strategies and guidance for...
The chapter “An Efficient Index for Reachability Queries in Public Transport Networks” is available open access under a Creative Commons Attribution 4.0 International License via link.springer.com.
This book constitutes the refereed proceedings of the 6th International XML Database Symposium, XSym 2009, held in Lyon, France, in August 2009 in conjunction with the International Conference on Very Large Data Bases, VLDB 2009. The 8 revised full papers together with 7 short papers were carefully reviewed and selected from 26 submissions. Covering all current aspects of core database technology for XML data management, XML and data integration, and development and deployment of XML applications, the papers are organized in topical sections on XML twig queries, query execution, XML document parsing and compression, XQuery and XML transaction management, and schema design.
Data usually comes in a plethora of formats and dimensions, rendering the exploration and information extraction processes challenging. Thus, being able to perform exploratory analyses on the data with the intent of getting an immediate glimpse of some of the data properties is becoming crucial. Exploratory analyses should be simple enough to avoid complicated declarative languages (such as SQL) and mechanisms, and at the same time retain the flexibility and expressiveness of such languages. Recently, we have witnessed a rediscovery of the so-called example-based methods, in which the user, or the analyst, circumvents query languages by using examples as input. An example is a representative o...
This two-volume set LNCS 8644 and LNCS 8645 constitutes the refereed proceedings of the 25th International Conference on Database and Expert Systems Applications, DEXA 2014, held in Munich, Germany, September 1-4, 2014. The 37 revised full papers presented together with 46 short papers and 2 keynote talks were carefully reviewed and selected from 159 submissions. The papers discuss a range of topics including: data quality; social web; XML keyword search; skyline queries; graph algorithms; information retrieval; XML; security; semantic web; classification and clustering; queries; social computing; similarity search; ranking; data mining; big data; approximations; privacy; data exchange; data integration; web semantics; repositories; partitioning; and business applications.
The EGOV Conference Series intends to assess the state of the art in e-Government and to provide guidance for research and development in this fast-moving field. The annual conferences bring together leading research experts and professionals from all over the globe. Thus, EGOV 2003 in Prague built on the achievements of the 1st EGOV Conference (Aix-en-Provence, 2002), which provided an illustrative overview of e-Government activities. This year the interest even increased: nearly 100 contributions, and authors coming from 34 countries. In this way EGOV Conference 2003 was a reunion for professionals from all over the globe. EGOV 2003 brought some changes in the outline and structure of the c...
This book constitutes the refereed proceedings of the 12th International Symposium on Foundations of Information and Knowledge Systems, FoIKS 2022, held in Helsinki, Finland, in June 2022. The 13 full papers presented were carefully reviewed and selected from 21 submissions. The papers address various topics such as information and knowledge systems, including submissions that apply ideas, theories or methods from specific disciplines to information and knowledge systems. Examples of such disciplines are discrete mathematics, logic and algebra, model theory, databases, information theory, complexity theory, algorithmics and computation, statistics and optimization.
Large-scale data analytics using machine learning (ML) underpins many modern data-driven applications. ML systems provide means of specifying and executing these ML workloads in an efficient and scalable manner. Data management is at the heart of many ML systems due to data-driven application characteristics, data-centric workload characteristics, and system architectures inspired by classical data management techniques. In this book, we follow this data-centric view of ML systems and aim to provide a comprehensive overview of data management in ML systems for the end-to-end data science or ML lifecycle. We review multiple interconnected lines of work: (1) ML support in database (DB) systems...