SIRET Research Group
Department of Software Engineering
Faculty of Mathematics and Physics
Charles University
Malostranské nám. 25,
118 00 Prague
Czech Republic
email: info@siret.cz
phone: +420 95155 4227
The general paradigm for content-based retrieval is the similarity search model, which consists of three key components. First, given a database of complex data objects (e.g., multimedia documents), a set of feature descriptors must be extracted from the actual data objects. Second, a distance function must be defined on the descriptors that mimics the similarity between the respective objects. Third, a query is specified using the query-by-example concept: distances are evaluated between a query (example) descriptor and all the descriptors in the database, and the objects sufficiently close (similar) to the example are returned to the user as the result.
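The three components can be illustrated with a minimal sequential sketch. Everything named here is an assumption made for illustration: `extract_descriptor` is a stand-in for real feature extraction, the Euclidean distance is just one possible distance function, and the query is a range query with a hypothetical `radius` threshold for "sufficiently close".

```python
import math

def extract_descriptor(obj):
    # Hypothetical feature extraction: the "object" is assumed to be
    # a numeric sequence already, so it is only normalized to a tuple.
    return tuple(float(x) for x in obj)

def euclidean(a, b):
    # One possible distance function on descriptors (an assumption;
    # any function modeling similarity could be plugged in instead).
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def range_query(database, query_obj, radius, distance=euclidean):
    # Query by example: compare the query descriptor with every
    # descriptor in the database (a sequential scan) and keep the
    # objects whose distance does not exceed the radius.
    q = extract_descriptor(query_obj)
    results = []
    for key, obj in database.items():
        d = distance(q, extract_descriptor(obj))
        if d <= radius:
            results.append((d, key))
    return sorted(results)  # closest objects first

database = {"a": [0, 0], "b": [3, 4], "c": [1, 1], "d": [10, 10]}
hits = range_query(database, [0, 0], radius=2.0)
print([key for _, key in hits])
```

With the toy data above, only the objects `a` and `c` lie within the radius. Note that the scan evaluates the distance once per database object, which is what makes this approach costly at scale.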
If we accepted the query process outlined above as a naive implementation (a sequential search of the entire database), there would be no problem and no constraints on the distance function. However, the distance function is often computationally expensive, and the databases are too large for a sequential search to be efficient. Hence, various models for indexing similarity have been developed, most of which follow the metric space model, which assumes a metric distance function. The metric postulates make it possible to partition the descriptor space so that query processing visits only the promising partitions, making the search efficient. However, the restriction to metric distances is quite serious, because real-world applications often require non-metric distances or even dynamic distances that change with evolving user preferences. The SRG aims at investigating general techniques for indexing metric, non-metric and dynamic distance functions at large scale.
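As an illustration of how the metric postulates help, the following sketch uses pivot-based filtering, one well-known metric indexing idea and not necessarily the technique used by the group. For a metric d, the triangle inequality gives |d(q, p) - d(o, p)| <= d(q, o) for any pivot p, so an object o can be discarded without computing d(q, o) whenever this lower bound exceeds the query radius. The class name `PivotTable`, the choice of pivots, and the Euclidean distance are all assumptions made for the example.

```python
import math

def euclidean(a, b):
    # A metric distance (assumed here; the filter below is only
    # correct for distances that satisfy the triangle inequality).
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

class PivotTable:
    """Hypothetical pivot-based index: stores object-to-pivot distances."""

    def __init__(self, descriptors, pivots, distance=euclidean):
        self.descriptors = list(descriptors)
        self.pivots = list(pivots)
        self.distance = distance
        # Precompute d(o, p) for every object o and pivot p at build time.
        self.table = [[distance(o, p) for p in self.pivots]
                      for o in self.descriptors]

    def range_query(self, q, radius):
        q_to_pivots = [self.distance(q, p) for p in self.pivots]
        results, computed = [], 0
        for o, row in zip(self.descriptors, self.table):
            # Lower bound on d(q, o) from the triangle inequality.
            lower_bound = max(
                (abs(dq - do) for dq, do in zip(q_to_pivots, row)),
                default=0.0,
            )
            if lower_bound > radius:
                continue  # safely discarded without computing d(q, o)
            computed += 1
            if self.distance(q, o) <= radius:
                results.append(o)
        return results, computed

points = [(0, 0), (1, 1), (5, 5), (9, 9), (10, 10)]
index = PivotTable(points, pivots=[(0, 0), (10, 10)])
results, computed = index.range_query((1, 0), radius=1.5)
print(results, computed)
```

In this toy run only two of the five objects require an actual distance computation against the query; the remaining three are filtered by the precomputed pivot distances. For a non-metric distance the lower bound no longer holds, so the same filter could discard true results, which is the limitation the paragraph above describes.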
Multimedia data (images, audio, video) have already confirmed their dominant role within the flood of data available over the Internet. With the exponential growth of multimedia data volumes, multimedia retrieval can no longer rely solely on conventional keyword-search technology, which requires an annotation given as a set of keywords. Not only is such annotation mostly unavailable for multimedia data at this scale, but even the available annotations usually suffer from subjectiveness and incompleteness. Thus, content-based multimedia retrieval systems need to be designed that employ similarity search models and techniques considering the actual multimedia content rather than the keywords. The SRG is involved in two research directions concerning indexing in content-based multimedia retrieval: multimedia exploration access methods and indexing adaptive similarity. The outcomes of this research will be incorporated into our web-based Smart image retrieval system (SIR).
Bioinformatics is the application of computer science and mathematics to the field of molecular biology in order to solve complex biological problems. We are mainly interested in developing efficient algorithms and computational tools that are widely used by the bioinformatics community. We are mostly involved in the following fields.