Book name: Graph Databases: New Opportunities for Connected Data
Authors – Ian Robinson, Jim Webber and Emil Eifrem
Publisher – O’Reilly Media
The book can be downloaded for free from http://neo4j.com/books/
Chapter 6 covers Graph Database Internals and discusses how graph
databases are implemented. It surveys the most common architectures and
uses the Neo4j graph database architecture as the basis for discussion.
Native Graph Processing – A database engine that utilizes index-free
adjacency is one in which each node maintains direct references to its
adjacent nodes. Each node, therefore, acts as a micro-index of other
nearby nodes, which is much cheaper than using global indexes. A
nonnative graph database engine, in contrast, uses (global) indexes to
link nodes together. The chapter also gives a good explanation of how
index-free adjacency leads to low-cost “joins”.
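To make the contrast concrete, here is a minimal sketch in Python. It is not how Neo4j is implemented; the `Node` class, the `friends_of_friends` helper and the sorted edge list standing in for a global index are all illustrative assumptions. The first half follows direct references from node to node, the second half has to search a global structure for every hop.

```python
from bisect import bisect_left


class Node:
    """Hypothetical node that holds direct references to its neighbours."""

    def __init__(self, name):
        self.name = name
        self.neighbours = []  # direct object references, no lookup structure

    def connect(self, other):
        # Undirected relationship: each side remembers the other.
        self.neighbours.append(other)
        other.neighbours.append(self)


def friends_of_friends(node):
    """Two-hop traversal; the cost depends only on the neighbourhood visited."""
    result = set()
    for friend in node.neighbours:
        for fof in friend.neighbours:
            if fof is not node and fof not in node.neighbours:
                result.add(fof.name)
    return result


def neighbours_via_index(sorted_edges, name):
    """Nonnative style: binary-search a global sorted edge index for each hop."""
    i = bisect_left(sorted_edges, (name, ""))
    found = []
    while i < len(sorted_edges) and sorted_edges[i][0] == name:
        found.append(sorted_edges[i][1])
        i += 1
    return found


alice, bob, carol, dave = (Node(n) for n in ("alice", "bob", "carol", "dave"))
alice.connect(bob)
bob.connect(carol)
bob.connect(dave)

# The same graph expressed as a global index of (from, to) pairs.
edges = sorted([
    ("alice", "bob"), ("bob", "alice"),
    ("bob", "carol"), ("carol", "bob"),
    ("bob", "dave"), ("dave", "bob"),
])

print(friends_of_friends(alice))
print(neighbours_via_index(edges, "bob"))
```

With direct references, each hop is a pointer dereference regardless of how large the whole graph is. With the global index, every hop pays for a search whose cost grows with the total number of relationships.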
Native Graph Storage – Neo4j stores graph data in a number of different
store files. Each store file contains the data for a specific part of
the graph (e.g., there are separate stores for nodes, relationships,
labels, and properties). The chapter then explains the record structure
of the Neo4j node and relationship store files in detail.
Programmatic APIs – The following APIs are discussed:
- Kernel API: The kernel's transaction event handlers allow user code to listen to transactions as they flow through the kernel, and thereafter to react (or not) based on the data content and lifecycle stage of the transaction.
- Core API: This is an imperative Java API that exposes the graph primitives of nodes, relationships, properties, and labels to the user. When used for reads, the API is lazily evaluated, meaning that relationships are only traversed as and when the calling code demands the next node.
- Traversal Framework: A declarative Java API which enables the user to specify a set of constraints that limit the parts of the graph the traversal is allowed to visit.
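The two ideas in the last two bullets, lazy evaluation and constraining which parts of the graph may be visited, can be sketched with a Python generator. This is an illustration only and does not use the Neo4j Java APIs; `GRAPH`, `traverse`, and the `expanded` list are hypothetical names introduced here.

```python
from collections import deque

# Hypothetical graph: node -> list of (relationship type, neighbour).
GRAPH = {
    "a": [("KNOWS", "b"), ("LIKES", "x")],
    "b": [("KNOWS", "c")],
    "c": [("KNOWS", "d")],
    "d": [],
    "x": [],
}


def traverse(graph, start, rel_type, max_depth, expanded):
    """Lazily yield nodes reachable from start, breadth first.

    Constraints: follow only rel_type relationships, and go no deeper
    than max_depth. Every node the traversal actually touches is
    appended to expanded so the laziness can be observed.
    """
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        expanded.append(node)
        if depth > 0:
            yield node  # execution pauses here until the caller asks for more
        if depth == max_depth:
            continue
        for rtype, neighbour in graph[node]:
            if rtype == rel_type and neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, depth + 1))


expanded = []
walker = traverse(GRAPH, "a", "KNOWS", 2, expanded)

first = next(walker)           # only now does any traversal work happen
touched_so_far = list(expanded)
rest = list(walker)            # drain the remaining results

print(first, touched_so_far, rest)
```

After the first `next` call only "a" and "b" have been touched. Node "x" is never visited because it is reached through a LIKES relationship, and "d" is never visited because it lies beyond the depth limit.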
The next section discusses the following nonfunctional characteristics
in detail:
- Transactions (How transactions are implemented in Neo4j)
- Recoverability
- Availability (Replication)
- Scale – Capacity (graph size), Latency (response time), Read and Write throughput.
Chapter 7 is Predictive Analysis with Graph Theory, which examines some
analytical techniques and algorithms for processing graph data.
The following search and path-finding algorithms are explained in brief:
- Depth- and Breadth-First Search
- Path-Finding with Dijkstra’s Algorithm
- The A* (A-star) Algorithm
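The algorithms listed above can be sketched in a few lines of Python. This is a textbook-style illustration under assumed data, not code from the book: the `ROADS` graph, its weights and the `ESTIMATE` heuristic are made up. The `shortest_path` function is A*; when its heuristic returns 0 for every node it behaves as Dijkstra's algorithm.

```python
import heapq
from collections import deque

# Hypothetical weighted, directed graph: node -> {neighbour: cost}.
ROADS = {
    "A": {"B": 1, "C": 4},
    "B": {"D": 5},
    "C": {"D": 1},
    "D": {},
}


def bfs_order(graph, start):
    """Breadth-first: visit all nodes at one depth before going deeper."""
    seen, order, queue = {start}, [], deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return order


def dfs_order(graph, start, seen=None):
    """Depth-first: follow one branch to its end before backtracking."""
    seen = set() if seen is None else seen
    seen.add(start)
    order = [start]
    for nxt in graph[start]:
        if nxt not in seen:
            order.extend(dfs_order(graph, nxt, seen))
    return order


def shortest_path(graph, start, goal, heuristic=lambda node: 0):
    """A* search; with the default zero heuristic this is Dijkstra."""
    frontier = [(heuristic(start), 0, start, [start])]
    best = {start: 0}
    while frontier:
        _, cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return cost, path
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry; a cheaper route was found already
        for nxt, weight in graph[node].items():
            new_cost = cost + weight
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                priority = new_cost + heuristic(nxt)
                heapq.heappush(frontier, (priority, new_cost, nxt, path + [nxt]))
    return None


# Assumed admissible estimates of the remaining cost to reach "D".
ESTIMATE = {"A": 4, "B": 5, "C": 1, "D": 0}

print(bfs_order(ROADS, "A"))
print(dfs_order(ROADS, "A"))
print(shortest_path(ROADS, "A", "D"))
print(shortest_path(ROADS, "A", "D", heuristic=ESTIMATE.get))
```

Both calls to `shortest_path` find the same cheapest route. The heuristic only changes the order in which nodes are explored, which is where A* can save work on larger graphs.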
The next section explains graph theory and predictive modeling through
the following concepts:
- Triadic Closures – A triadic closure is a common property of social graphs, where we observe that if two nodes are connected via a path involving a third node, there is an increased likelihood that the two nodes will become directly connected at some point in the future.
- Structural Balance – Whether the triads in a graph are stable, given the positive or negative sentiment of the relationships between their nodes.
- Local Bridges – A connection between two sub-graphs.
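As a rough illustration of how triadic closure can be used for prediction, the sketch below lists pairs of people who share at least one neighbour but are not yet directly connected, and counts how many neighbours they share. The `SOCIAL` graph and the `open_triads` function are hypothetical, and using the shared-neighbour count as a likelihood signal is a simplifying assumption rather than the book's method.

```python
from itertools import combinations

# Hypothetical undirected social graph: person -> set of direct connections.
SOCIAL = {
    "alice": {"bob", "carol"},
    "bob": {"alice", "dave"},
    "carol": {"alice", "dave"},
    "dave": {"bob", "carol", "erin"},
    "erin": {"dave"},
}


def open_triads(adj):
    """Map each unconnected pair to the number of neighbours they share.

    Each shared neighbour is the "third node" of an open triad that
    could close in the future.
    """
    candidates = {}
    for neighbours in adj.values():
        # Every pair of this node's neighbours forms a triad through it.
        for a, b in combinations(sorted(neighbours), 2):
            if b not in adj[a]:
                candidates[(a, b)] = candidates.get((a, b), 0) + 1
    return candidates


predictions = open_triads(SOCIAL)
for pair, shared in sorted(predictions.items(), key=lambda kv: -kv[1]):
    print(pair, shared)
```

Here alice and dave share two neighbours (bob and carol), so under this assumption they are a stronger candidate for a future connection than bob and erin, who share only dave.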
The book also has an appendix that gives a NOSQL overview. Readers new
to NOSQL should read this overview first for a better understanding of
the book.