Tuesday, October 28, 2025

Pagerank in Oracle Graph (SQL/PGQ)

Oracle 26ai has support for running advance Graphs Algorithms like Pagerank on Property Graphs (SQL/PGQ) stored the in database. While Graphs represent relationships, Graph algorithms provide additional features that quantitatively capture how an entity is connected to other entities.

This blog is intended to explore the Graph Algorithms and how to use then Oracle 26ai. In this post we will look at the Pagerank Algorithm.

When analyzing Graphs, we are often interested in what is the value of a vertex (node) in relation to other vertices? This is where the Centrality, PageRank, Betweenness Centrality measures become useful.  In pagerank, we measure importance of nodes by incoming edges, importance of neighboring nodes.

Pagerank Algorithm:

  1. Give the initial rank to the starting node only
  2. Each node gets ranks from incoming edges, and distribute own ranks to outgoing edges
  3. Iterate until the ranks are settled  

Let's start with a simple example where have a Beneficial Ownership graph of some fictitious entities:

Let's first create some sample data and a Property Graph that captures the Beneficial Ownership relationship.

 


create table entity(entity_name varchar(80));
create table ownership(entity_name varchar(80), owner_name varchar(80), percentage float);


insert into entity values 
  ('Uroosa'), ('Fatima'), ('Zainab')
  , ('Maryam'), ('Hannah'), ('Acme Corp.')
  , ('Northwind'), ('Fourth Coffee'), ('Tasmanian Traders')
  , ('Blue Yonder'), ('Contoso'), ('Hotel Kitchen Sink'), ('Coffee Corp.');
insert into OWNERSHIP values 
  ('Acme Corp.', 'Uroosa', 0.80)
  , ('Acme Corp.', 'Fatima', 0.20)
  , ('Coffee Corp.', 'Maryam', 1)
  , ('Hotel Kitchen Sink', 'Hannah', 0.25)
  , ('Hotel Kitchen Sink', 'Uroosa', 0.15)
  , ('Fourth Coffee', 'Zainab', 0.19)
  , ('Fourth Coffee', 'Acme Corp.', 0.30)
  , ('Hotel Kitchen Sink', 'Northwind', 0.09)
  , ('Fourth Coffee', 'Blue Yonder', 0.51)
  , ('Hotel Kitchen Sink', 'Fourth Coffee', 0.51) 
  , ('Blue Yonder', 'Coffee Corp.', 0.51)
  , ('Blue Yonder', 'Tasmanian Traders', 0.49);

Next create a Property Graph to capture the Beneficial Ownership relationships. In CREATE PROPERTY GRAPH statement VERTEX TABLES clause and EDGE TABLES clause list the source tables. This creates a graph with materialized vertices and edges.
CREATE OR REPLACE PROPERTY GRAPH beneficial_owner_graph
  VERTEX TABLES (
    entity KEY ( entity_name ) PROPERTIES ARE ALL COLUMNS
)
  EDGE TABLES (
   ownership AS ownership_percentage
     key (entity_name, owner_name)
     SOURCE KEY ( owner_name ) REFERENCES entity( entity_name )
     DESTINATION KEY ( entity_name ) REFERENCES entity( entity_name )
     PROPERTIES ARE ALL COLUMNS
);
  

Next run the pagerank algorithm on this graph.


%python-pgx

print(session)
graph = session.read_graph_by_name("BENEFICIAL_OWNER_GRAPH", "pg_sql")
print(graph)
analyst.pagerank(graph)
analyst.degree_centrality(graph)
analyst.in_degree_centrality(graph)
analyst.out_degree_centrality(graph)
analyst.vertex_betweenness_centrality(graph)

After running algorithms, the results are stored into the graph, e.g. each node gets a new property "pagerank". Users can access the results using pgql:

graph.query_pgql("""
SELECT 
 a.degree
 , a.in_degree
 , a.out_degree
 , a.betweenness
FROM MATCH (a)
ORDER BY a.pagerank DESC
LIMIT 10
""").print()

 Here is the output from the Pagerank algorithm: 


PgxSession(id: 9a0e419f-733f-415a-a8a7-388aa2f48d4a, name: ADMIN)
PgxGraph(name: BENEFICIAL_OWNER_GRAPH_6, v: 13, e: 12, directed: True, memory(Mb): 0)
+-------------------------------------------------------------------------------------------+
| ENTITY_NAME        | pagerank             | degree | in_degree | out_degree | betweenness |
+-------------------------------------------------------------------------------------------+
| Hotel Kitchen Sink | 0.10169935096153847  | 4      | 4         | 0          | 0.0         |
| Fourth Coffee      | 0.07722548076923078  | 4      | 3         | 1          | 7.0         |
| Blue Yonder        | 0.03949038461538462  | 3      | 2         | 1          | 6.0         |
| Acme Corp.         | 0.026250000000000006 | 3      | 2         | 1          | 3.0         |
| Coffee Corp.       | 0.021346153846153848 | 2      | 1         | 1          | 3.0         |
| Northwind          | 0.01153846153846154  | 1      | 0         | 1          | 0.0         |
| Zainab             | 0.01153846153846154  | 1      | 0         | 1          | 0.0         |
| Hannah             | 0.01153846153846154  | 1      | 0         | 1          | 0.0         |
| Tasmanian Traders  | 0.01153846153846154  | 1      | 0         | 1          | 0.0         |
| Maryam             | 0.01153846153846154  | 1      | 0         | 1          | 0.0         |
+-------------------------------------------------------------------------------------------+

Each vertex has a pagerank value between 0 and 1. Higher the value, more important the node. This is additional information about your data. It is explicilitly not present in the data, but derived from how data is connected.

No comments:

Post a Comment