
GAUK 57907

Name: 
Similarity search in biological databases
Start year: 
2007
End year: 
2008

In recent years, the volume of gene and protein banks (databases) has grown rapidly. Such large volumes of gene and protein sequences are stored in one place not only so that the sequences themselves can be browsed, but primarily so that the stored sequences can be searched for similarities. Similar sequences indicate similar functionality, which helps in determining the functions of unknown genes.

Investigators
Investigator: 
david.hoksza
Investigator role: 
Principal investigator
Investigator: 
tomas.skopal
Investigator role: 
Team member

GAUK 18208

Name: 
Distributed and parallel metric indexing in multimedia databases
Start year: 
2008
End year: 
2009

Current data processing applications use data with considerably less structure and much less precise queries than traditional database systems. Multimedia data such as images or videos, searched by query-by-example, are a typical example. Such data can neither be ordered in a canonical manner nor meaningfully searched by precise database queries that return exact matches. This situation has given rise to similarity searching.

Investigators
Investigator: 
jakub.lokoc
Investigator role: 
Principal investigator
Investigator: 
tomas.skopal
Investigator role: 
Team member

GAČR 201/09/0683

Name: 
Similarity Searching in Very Large Multimedia Databases
Start year: 
2009
End year: 
2011

Finished, rated as excellent

Investigators
Investigator: 
tomas.skopal
Investigator role: 
Co-Investigator

GAČR P202/11/0968

Name: 
Large-scale Nonmetric Similarity Search in Complex Domains
Start year: 
2011
End year: 
2014

Similarity search is popular in various areas of computing, including multimedia databases, data mining, and bioinformatics. For a long time, database approaches to similarity search assumed that the similarity is a metric distance. The metric properties make it possible to index a database so that it can be queried efficiently. However, with the increasing complexity of data across various domains, many similarity functions that are not metrics (i.e., nonmetrics) have appeared in recent years.
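
As an illustration of why metric properties help indexing, the following minimal sketch filters a range query with a single pivot and the triangle inequality. It is not the project's indexing method; the function names, the toy 2-D data, and the choice of one pivot are assumptions made only for this example.

```python
import math


def euclidean(a, b):
    """Euclidean distance, a standard metric on coordinate tuples."""
    return math.dist(a, b)


def range_query(database, pivot_dists, pivot, query, radius, dist=euclidean):
    """Return (matches, number of distance computations at query time).

    pivot_dists[i] must hold dist(database[i], pivot), computed in advance.
    """
    d_qp = dist(query, pivot)
    computed = 1  # the query-to-pivot distance
    matches = []
    for obj, d_op in zip(database, pivot_dists):
        # Triangle inequality gives the lower bound |d(q,p) - d(o,p)| <= d(q,o).
        # If the bound already exceeds the radius, obj cannot be a match.
        if abs(d_qp - d_op) > radius:
            continue
        computed += 1
        if dist(query, obj) <= radius:
            matches.append(obj)
    return matches, computed


# Hypothetical toy data: four 2-D points, with the first one used as the pivot.
database = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (9.0, 9.0)]
pivot = (0.0, 0.0)
pivot_dists = [euclidean(o, pivot) for o in database]

matches, computed = range_query(database, pivot_dists, pivot, (1.0, 1.0), 1.5)
print(matches, computed)
```

In this toy run the two distant points are discarded without computing their distance to the query, which is the saving a metric index relies on. The lower bound is only valid when the distance satisfies the triangle inequality, so the same filter could miss results for a nonmetric similarity.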

Investigators
Investigator: 
tomas.skopal
Investigator role: 
Principal investigator
Investigator: 
david.hoksza
Investigator role: 
Team member
Investigator: 
jakub.lokoc
Investigator role: 
Team member
Investigator: 
jiri.novak
Investigator role: 
Team member
Investigator: 
juraj.mosko
Investigator role: 
Team member
Investigator: 
tomas.bartos
Investigator role: 
Team member

GAUK 430711

Name: 
Application of Metric and Non-metric Indexing Methods in Computational Proteomics
Start year: 
2011
End year: 
2012

The volume of unstructured databases is growing rapidly, while their annotation remains problematic. The similarity search concept, based on a similarity function defined for each pair of database objects, is more suitable for this kind of data. The similarity is usually modelled by a distance function satisfying the metric axioms, which allows efficient indexing. However, the metric axioms can be very restrictive for domain experts, who may prefer non-metric functions.
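
To make the restriction concrete, the sketch below checks one metric axiom, the triangle inequality, for a simple function that violates it. Squared Euclidean distance is used purely as a familiar stand-in for a non-metric function; it is not taken from the project, and the helper names and sample points are hypothetical.

```python
from itertools import permutations


def squared_euclidean(a, b):
    """Squared Euclidean distance: symmetric and non-negative, but not a metric."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triangle_violations(objects, dist, eps=1e-12):
    """List ordered triples (a, b, c) where dist(a, c) > dist(a, b) + dist(b, c)."""
    violations = []
    for a, b, c in permutations(objects, 3):
        if dist(a, c) > dist(a, b) + dist(b, c) + eps:
            violations.append((a, b, c))
    return violations


# Three collinear 1-D points: going 0 -> 2 directly "costs" 4,
# while the detour 0 -> 1 -> 2 costs only 1 + 1 = 2.
points = [(0.0,), (1.0,), (2.0,)]
violations = triangle_violations(points, squared_euclidean)
print(violations)
```

A check like this only samples a dataset and cannot prove that a function is a metric, but a single violation is enough to show that index pruning based on the triangle inequality is not guaranteed to be exact for that function.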

Investigators
Investigator: 
jakub.galgonek
Investigator role: 
Principal investigator
Investigator: 
tomas.skopal
Investigator role: 
Team member
Investigator: 
jiri.novak
Investigator role: 
Team member
Investigator: 
jakub.lokoc
Investigator role: 
Team member

Tomáš Skopal (head)

Name: 
Tomáš
Surname: 
Skopal (head)
Homepage: 
http://siret.ms.mff.cuni.cz/skopal
Research interests: 
  • similarity search in metric and nonmetric spaces
  • multimedia databases
  • database indexing
  • information retrieval
Type of member: 
Staff
State: 
Active


SETTER

Type: 
Bioinformatics & Cheminformatics
One line description: 
RNA structure similarity search
Annotation: 

The SETTER web server uses the SETTER (SEcondary sTructure-based TERtiary Structure Similarity Algorithm) method for fast and accurate pairwise structural alignment. The server can compare a pair of RNA structures, or use one structure as a query and search against a user-defined database of RNA structures. The efficiency of the algorithm comes from decomposing the RNA structure into a set of non-overlapping generalized secondary structure units (GSSUs). A GSSU usually resembles a hairpin motif, possibly containing bulges and/or internal loops in its stem part. Segmentation into GSSUs offers good scalability with respect to structure size: SETTER scales linearly with the structure size, because the number of residues per GSSU (with which SETTER scales quadratically) generally does not increase with the size of the RNA structure. The underlying SETTER algorithm is both accurate and very fast, and does not impose limits on the size of aligned RNA structures. SETTER is able to compare a pair of even the largest RNA structures in less than one minute.

Developers: 
david.hoksza

SIR

Type: 
Multimedia
One line description: 
Smart image retrieval
Annotation: 

The visual similarity of two images is evaluated on feature representations that consist of content-based image properties. Conventional feature representations aggregate and store these properties in global feature histograms (e.g., MPEG-7 visual descriptors).

Recent feature representations, however, adaptively aggregate local image features into more flexible feature signatures, which can be compared by adaptive similarity measures. The SIR engine, developed by the SIRET research group, combines traditional MPEG-7 visual descriptors with feature signatures, leading to improved similarity search in image collections.

Currently, the SIR engine operates in demo mode as a standalone image search engine. To manage large image collections in real time, the engine employs original database indexing technology. The SIR engine also includes meta-search functionality that makes it possible to augment, rerank, and explore results provided by other image search engines, such as Google Images. The current version of the online re-ranking and exploration tool employs a particle physics model that both distributes images on the screen and, as a side effect, automatically creates visually similar clusters. To cite this tool, please refer to our publications: Image Exploration using Online Feature Extraction and Reranking (ICMR, 2012) and SIR: The Smart Image Retrieval Engine (SISAP, 2012).

Developers: 
jakub.lokoc
tomas.skopal

David Hoksza

Name: 
David
Surname: 
Hoksza
Homepage: 
http://siret.ms.mff.cuni.cz/members/hoksza
Research interests: 
  • bioinformatics
  • structural bioinformatics
  • cheminformatics
  • protein and RNA databases
Type of member: 
Staff
State: 
Active
