Neo4j Interview Questions & Answers

Neo4j is an open-source NoSQL graph database implemented in Java. It stores data as graphs rather than in tables.

Neo4j is widely used for

  • Highly connected data (social networks)
  • Recommendations (e-commerce)
  • Path Finding
  • Data First Schema (bottom-up)
  • Schema Evolution
  • A* (Least Cost Path)

Neo4j:

  • It consists of vertices and edges. Each vertex (node) represents a key value or attribute
  • It can store dynamic content such as images, videos, and audio
  • It can search deep into the database without a significant loss of performance or response time
  • Any two objects can be related by creating a relationship between their two nodes

MySQL:

  • In relational databases, attributes are appended in plain table format
  • In relational databases such as MySQL, storing videos, audio, and images is more cumbersome
  • Searches across connected records take longer and are less convenient than in Neo4j
  • It has no first-class relationships, so it is difficult to use for connected graphs and data

Some important characteristics of Neo4j include

  • Relationships are materialized at creation time, so queries pay no penalty for them at runtime
  • Constant-time traversal of relationships in the graph, both in breadth and in depth, due to double linking at the storage level between nodes and relationships
  • Relationships in Neo4j are fast, which makes it possible to materialize new relationships later on to “shortcut” and speed up access to the domain data when new requirements arise
  • It caches graphs in memory and provides compact storage, resulting in efficient scale-up
  • It runs on top of the JVM
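
The traversal behaviour described above can be illustrated with a small in-memory sketch. It is not Neo4j's storage format; the class and function names (`Node`, `Rel`, `breadth_first`, `depth_first`) are hypothetical. The point it shows is that when every relationship is linked from both of its end nodes at creation time, each hop only follows stored references instead of searching a table.

```python
from collections import deque


class Node:
    def __init__(self, name):
        self.name = name
        self.rels = []  # direct references to relationships, no lookup table


class Rel:
    def __init__(self, start, end, rel_type):
        self.start, self.end, self.type = start, end, rel_type
        # Materialize the link on both end nodes at creation time.
        start.rels.append(self)
        end.rels.append(self)

    def other(self, node):
        """Return the node at the opposite end of this relationship."""
        return self.end if node is self.start else self.start


def breadth_first(root):
    """Visit nodes level by level, following stored references only."""
    seen, order, queue = {root}, [], deque([root])
    while queue:
        node = queue.popleft()
        order.append(node.name)
        for rel in node.rels:
            nxt = rel.other(node)
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return order


def depth_first(root):
    """Visit nodes by going as deep as possible before backtracking."""
    seen, order, stack = set(), [], [root]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        order.append(node.name)
        # Reverse so the first stored relationship is explored first.
        for rel in reversed(node.rels):
            stack.append(rel.other(node))
    return order


alice, bob, carol, dave = (Node(n) for n in ("alice", "bob", "carol", "dave"))
Rel(alice, bob, "KNOWS")
Rel(alice, carol, "KNOWS")
Rel(bob, dave, "KNOWS")

print(breadth_first(alice))  # ['alice', 'bob', 'carol', 'dave']
print(depth_first(alice))    # ['alice', 'bob', 'dave', 'carol']
```

The cost of expanding a node here depends on how many relationships that node has, not on the total size of the graph.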

The role of building blocks

  • Nodes: They are the entities
  • Relationships: They connect entities and structure the domain
  • Properties: They hold metadata and attributes
  • Labels: They group nodes by role
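
As a rough illustration of how the four building blocks fit together, the sketch below models them with plain Python dataclasses. The class names, the `nodes_with_label` helper, and the sample data are assumptions made for the example and are not part of any Neo4j API.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Set


@dataclass
class Node:
    labels: Set[str] = field(default_factory=set)             # group nodes by role
    properties: Dict[str, Any] = field(default_factory=dict)  # attributes and metadata


@dataclass
class Relationship:
    start: Node   # entity the relationship leaves from
    end: Node     # entity the relationship points to
    type: str     # name of the connection, e.g. "BOUGHT"
    properties: Dict[str, Any] = field(default_factory=dict)


def nodes_with_label(nodes: List[Node], label: str) -> List[Node]:
    """Select the nodes that play a given role."""
    return [n for n in nodes if label in n.labels]


alice = Node({"Person", "Customer"}, {"name": "Alice"})
book = Node({"Product"}, {"title": "Graph Databases"})
bought = Relationship(alice, book, "BOUGHT", {"quantity": 1})

customers = nodes_with_label([alice, book], "Customer")
print([n.properties["name"] for n in customers])  # ['Alice']
```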

All CQL commands in Neo4j are run at the “$” prompt.

There are two different types of object caches in Neo4j

  • Reference caches: With this cache, Neo4j uses as much of the allocated JVM heap memory as it can to hold nodes and relationships
  • High-performance caches: These are assigned a fixed maximum amount of space on the JVM heap and evict objects whenever they grow beyond it.

Relationships and nodes are added to the object cache as soon as they are accessed.
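
The behaviour of a size-capped cache can be sketched as follows. This is a simplified illustration, not Neo4j's implementation: the `BoundedObjectCache` class and the `load_node` loader are hypothetical, the limit is counted in entries rather than heap bytes, and least-recently-used eviction is an assumption, since the text above only says that objects are deleted once the cache outgrows its limit.

```python
from collections import OrderedDict


class BoundedObjectCache:
    def __init__(self, max_entries, loader):
        self.max_entries = max_entries
        self.loader = loader            # hypothetical function that reads from the store
        self._entries = OrderedDict()

    def get(self, key):
        if key in self._entries:
            self._entries.move_to_end(key)     # mark as most recently used
            return self._entries[key]
        value = self.loader(key)               # cache miss: load on first access
        self._entries[key] = value
        if len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)  # evict the least recently used entry
        return value

    def cached_keys(self):
        return list(self._entries)


loads = []


def load_node(node_id):
    """Stand-in for a disk read; records which ids were actually loaded."""
    loads.append(node_id)
    return {"id": node_id}


cache = BoundedObjectCache(max_entries=2, loader=load_node)
cache.get(1)
cache.get(2)
cache.get(1)  # served from the cache, no load
cache.get(3)  # cache is full, so node 2 is evicted

print(cache.cached_keys())  # [1, 3]
print(loads)                # [1, 2, 3]
```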

Neo4j uses the Cypher query language, which was created for Neo4j. Traversing the graph requires knowing where you want to begin (START), the rules that allow traversal (MATCH), and what data you expect back (RETURN). In the older Cypher syntax, the basic query consists of

  • START n
  • MATCH n-[r]-m
  • RETURN r;

Because Neo4j exposes a RESTful API, you can query it over the web or run it locally. It can also run on Heroku or in the cloud.

To delete the entire graph directory, you can use the command rm -rf data/* because Neo4j does not store anything outside that directory.

Neo4j can store and retrieve multiple complex relationships. Its ability to run complex queries in real time helps identify a brute force attack much more quickly. The most crucial part of detecting such attacks is capturing enough information about each request, such as

  • The client's real IP address, not the proxy's
  • Whether the login attempt failed or succeeded
  • The timestamp
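
To make the use of these three fields concrete, here is a minimal detection sketch in plain Python. It does not query Neo4j; in a graph model the same attempts would be stored as nodes and relationships. The `BruteForceDetector` class, the threshold of three failures, and the one-minute window are all assumptions chosen for illustration.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta


class BruteForceDetector:
    def __init__(self, max_failures=3, window=timedelta(minutes=1)):
        self.max_failures = max_failures
        self.window = window
        self._failures = defaultdict(deque)  # client IP -> timestamps of recent failures

    def record(self, client_ip, success, timestamp):
        """Record one login attempt; return True if the IP looks suspicious."""
        if success:
            return False  # successful logins are not counted in this sketch
        recent = self._failures[client_ip]
        recent.append(timestamp)
        # Forget failures that fall outside the sliding time window.
        while recent and timestamp - recent[0] > self.window:
            recent.popleft()
        return len(recent) >= self.max_failures


detector = BruteForceDetector(max_failures=3, window=timedelta(minutes=1))
start = datetime(2024, 1, 1, 12, 0, 0)

# Three failed logins from the same address, ten seconds apart.
flags = [
    detector.record("203.0.113.7", False, start + timedelta(seconds=10 * i))
    for i in range(3)
]
print(flags)  # [False, False, True]
```

Keying on the real client address matters here: if only the proxy address were recorded, every user behind that proxy would share one failure counter.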

Early versions of Neo4j had no indexing; it was introduced later with the Automatic Indexes feature, by using the command

Neo4j stores graph data in a number of different store files. Each store file holds the data for a specific part of the graph, such as relationships, nodes, or properties; for example, neostore.nodestore.db, neostore.propertystore.db, and so on.

The Neo4j CQL CREATE command can be used to

  • Create nodes with and without properties
  • Create a relationship between nodes with properties
  • Create a relationship between nodes without properties
  • Attach one or more labels to a node, or a type to a relationship

The CQL MATCH command in Neo4j is used to

  • Get data about nodes and properties from the database
  • Get data about relationships, nodes, and properties from the database

The syntax for the MATCH command is

MATCH (<node-name>:<label-name>)

The rule for using the MATCH command is that it cannot be used on its own to fetch data from the database. It must be combined with another clause such as RETURN; otherwise it produces an invalid syntax error.

Neo4j CQL uses the SET clause to

  • Update or add property values
  • Add new properties to an existing relationship or node

The Neo4j CQL LIMIT clause is used to limit the number of rows returned by a query.

The syntax of the IN operator in Neo4j looks like this

IN [<Collection-of-values>]

Neo4j stores primitive arrays in a compressed form to save space on disk. To do that, it uses a “bit saving” algorithm.
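
The general idea of saving bits can be sketched as below. This is not Neo4j's actual on-disk algorithm, which the text above does not spell out; the `pack` and `unpack` functions are hypothetical and assume one simple scheme: every value is stored with only as many bits as the largest value in the array needs.

```python
def pack(values):
    """Pack non-negative ints using only as many bits as the largest value needs."""
    if any(v < 0 for v in values):
        raise ValueError("this sketch only handles non-negative integers")
    # Bits per value; at least 1 so that arrays of zeros still round-trip.
    bits = max((v.bit_length() for v in values), default=0) or 1
    packed = 0
    for v in values:
        packed = (packed << bits) | v  # append each value to the bit stream
    n_bytes = (bits * len(values) + 7) // 8
    return bits, len(values), packed.to_bytes(n_bytes, "big")


def unpack(bits, count, data):
    """Reverse pack(): slice the bit stream back into fixed-width values."""
    packed = int.from_bytes(data, "big")
    mask = (1 << bits) - 1
    # The last packed value sits in the lowest bits, so read backwards.
    values = [(packed >> (bits * i)) & mask for i in range(count)]
    return values[::-1]


values = [3, 7, 1, 0, 5, 2, 6, 4]
bits, count, data = pack(values)

print(bits, len(data))                      # 3 3
print(unpack(bits, count, data) == values)  # True
```

In this example eight values fit in 3 bytes, whereas storing each one as a 64-bit integer would take 64 bytes.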