Neo4j database of proteins for protein-protein interaction identification

The extraction of data from the PDB into the Neo4j database is described in the paper "Using Neo4j for mining protein graphs: a case study", published at the 2nd International Workshop on NoSQL Databases, Emerging Database Technologies and Applications.

  • DB download (258 MB), generated using neo4j-shell -c dump


  • 69,200 protein structures translating into about 69,200 connected components
  • 14,285,327 nodes
  • 37,788,722 edges
  • DB size (uncompressed): ~3 GB
  • Neo4j version: 2.2.0
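
To put the figures above in perspective, the following minimal Python sketch derives per-structure averages from them. It assumes that each protein structure corresponds to exactly one connected component, which the list states only approximately; the function name `summarize` and the constant names are illustrative and not part of the dataset.

```python
# Figures copied from the statistics listed above.
STRUCTURES = 69_200      # protein structures (assumed ~ one connected component each)
NODES = 14_285_327
EDGES = 37_788_722


def summarize(structures, nodes, edges):
    """Return simple averages describing the size of the graph."""
    return {
        "nodes_per_structure": nodes / structures,
        "edges_per_structure": edges / structures,
        "edges_per_node": edges / nodes,
    }


if __name__ == "__main__":
    for name, value in summarize(STRUCTURES, NODES, EDGES).items():
        print(f"{name}: {value:.2f}")
```

Under that assumption, an average structure is a subgraph of roughly 206 nodes and 546 edges.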