GAČR 15-08916S
Efficient subgraph discovery for petabyte-scale web analysis
2015 - 2017

The study of network behavior without packet content inspection is of growing importance in network administration and security. Recent years have seen increasing demand for machine learning algorithms on graphs, since graphs are a natural way to model interactions between entities in large computer networks. A promising approach to graph modeling that leverages machine learning techniques is based on so-called "graphlets", which embed graph fragments into vector spaces. However, wider adoption of graphlets is hindered by the cost of computing the embedding and by their limitation to unweighted, undirected graphs. In this project, we focus on the design of generalized graphlet-based models and their respective vocabularies, with the aim of widening the range of applications that can benefit from graphlet-based descriptors. The proposed methodology will be verified in the domain of network security. In particular, we will search for malicious web communities in a petabyte-scale network traffic database available to Cisco.

Principal investigator: Jakub Lokoc
Team members: Tomas Skopal, Premek Cech