parallel implementation for all_pairs_bellman_ford_path #14

Merged: 7 commits, Dec 5, 2023
1 change: 1 addition & 0 deletions nx_parallel/algorithms/shortest_paths/__init__.py
@@ -0,0 +1 @@
from .weighted import *
61 changes: 61 additions & 0 deletions nx_parallel/algorithms/shortest_paths/weighted.py
@@ -0,0 +1,61 @@
from joblib import Parallel, delayed
from networkx.algorithms.shortest_paths.weighted import (
    single_source_bellman_ford_path
)

import nx_parallel as nxp

__all__ = ["all_pairs_bellman_ford_path"]


def all_pairs_bellman_ford_path(G, weight="weight"):
    """Compute shortest paths between all nodes in a weighted graph.

    Parameters
    ----------
    G : NetworkX graph

    weight : string or function (default="weight")
        If this is a string, then edge weights will be accessed via the
        edge attribute with this key (that is, the weight of the edge
        joining `u` to `v` will be ``G.edges[u, v][weight]``). If no
        such edge attribute exists, the weight of the edge is assumed to
        be one.

        If this is a function, the weight of an edge is the value
        returned by the function. The function must accept exactly three
        positional arguments: the two endpoints of an edge and the
        dictionary of edge attributes for that edge. The function must
        return a number.

    Returns
    -------
    paths : iterator
        (source, dictionary) iterator with dictionary keyed by target and
        shortest path as the key value.

    Notes
    -----
    Edge weight attributes must be numerical.
    Distances are calculated as sums of weighted edges traversed.

    """
    def _calculate_shortest_paths_subset(G, source, weight):
        return (source, single_source_bellman_ford_path(G, source, weight=weight))
Member

I think `G` and `weight` are already in the outer function's namespace, so the helper can find them without taking them as arguments. You could drop those two parameters and make this a function of `source` only, which also shortens the call site below. The trade-off is less time assembling function arguments versus more time on variable lookups, but I think it could be faster overall. Can you measure it?

Member Author

@Schefflera-Arboricola Schefflera-Arboricola Nov 3, 2023

Yes, passing only `source` is a little faster. These are the speed-ups I measured:

[Screenshot: speed-up measurements, 2023-11-03]

I changed it in the most recent commit.
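
The change discussed in this thread can be sketched as follows. This is a minimal illustration, not the merged code: to stay dependency-free it swaps joblib for the standard library's `ThreadPoolExecutor` and NetworkX for a hand-written Bellman-Ford over a dict-of-dicts graph, and it returns a list rather than a generator. The names `bellman_ford_paths`, `all_pairs_paths`, and `_paths_from` are hypothetical. The only point is that the helper takes `source` alone and finds `graph` in the enclosing scope.

```python
from concurrent.futures import ThreadPoolExecutor


def bellman_ford_paths(graph, source):
    """Shortest paths from `source` in a dict-of-dicts graph {u: {v: weight}}.

    Assumes every node is a key of `graph` and there are no negative cycles.
    Returns {target: [source, ..., target]} for every reachable target.
    """
    dist = {source: 0}
    pred = {source: None}
    # Relax every edge up to |V| - 1 times, stopping early once nothing changes.
    for _ in range(len(graph) - 1):
        changed = False
        for u in graph:
            if u not in dist:
                continue  # u has not been reached yet
            for v, w in graph[u].items():
                if v not in dist or dist[u] + w < dist[v]:
                    dist[v] = dist[u] + w
                    pred[v] = u
                    changed = True
        if not changed:
            break
    # Rebuild each path by walking predecessors back to the source.
    paths = {}
    for target in dist:
        path, node = [], target
        while node is not None:
            path.append(node)
            node = pred[node]
        paths[target] = path[::-1]
    return paths


def all_pairs_paths(graph, max_workers=None):
    def _paths_from(source):
        # `graph` comes from the enclosing scope, so only `source` is passed in.
        return (source, bellman_ford_paths(graph, source))

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Iterating a dict yields its keys, i.e. one task per source node.
        return list(pool.map(_paths_from, graph))


graph = {"a": {"b": 1, "c": 5}, "b": {"c": 2}, "c": {}}
paths = dict(all_pairs_paths(graph))
print(paths["a"]["c"])  # ['a', 'b', 'c']
```

Threads are used here only because a closure can be handed to them without any pickling; whether the same closure works with a process-based backend depends on how that backend serializes callables.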


    if hasattr(G, "graph_object"):
        G = G.graph_object

    nodes = G.nodes

    total_cores = nxp.cpu_count()
Member

We should move this into the function signature, following something like scikit-learn's `n_jobs`, but that is not a blocker here.
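
One way to read this suggestion, as a hedged sketch: expose `n_jobs` as a keyword argument and resolve it using the convention joblib documents, where `-1` means all CPUs, `-2` all but one, and so on. `resolve_n_jobs` and its `total_cores` parameter are hypothetical names used for illustration; they are not part of nx-parallel, joblib, or scikit-learn.

```python
import os


def resolve_n_jobs(n_jobs=None, total_cores=None):
    """Turn a scikit-learn style `n_jobs` value into a positive worker count."""
    if total_cores is None:
        total_cores = os.cpu_count() or 1
    if n_jobs is None:
        return 1  # conventional default: run sequentially
    if n_jobs == 0:
        raise ValueError("n_jobs == 0 has no meaning")
    if n_jobs < 0:
        # -1 -> all cores, -2 -> all but one, ...; never fewer than one worker.
        return max(total_cores + 1 + n_jobs, 1)
    return n_jobs


print(resolve_n_jobs(-2, total_cores=8))  # 7
```

joblib's `Parallel` accepts negative `n_jobs` values directly, so the function could also simply forward the argument; the helper only makes the convention explicit.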


    paths = Parallel(n_jobs=total_cores, return_as="generator")(
        delayed(_calculate_shortest_paths_subset)(
            G,
            source,
            weight=weight
        )
        for source in nodes
    )
    return paths
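
The callable form of `weight` described in the docstring above can be sketched like this. The edge attribute names `length` and `toll` are assumptions made up for illustration, and `travel_cost` is a hypothetical name; the function is called directly here so the example needs nothing beyond the standard library.

```python
def travel_cost(u, v, attrs):
    """Weight function: receives both endpoints and the edge's attribute dict."""
    # Fall back to 1 when `length` is missing, mirroring the string form's default.
    return attrs.get("length", 1) + attrs.get("toll", 0)


print(travel_cost("a", "b", {"length": 2.5, "toll": 1}))  # 3.5
print(travel_cost("a", "b", {}))  # 1
```

Per the docstring, such a function would be passed as `all_pairs_bellman_ford_path(G, weight=travel_cost)`.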
4 changes: 4 additions & 0 deletions nx_parallel/interface.py
@@ -1,4 +1,5 @@
from nx_parallel.algorithms.centrality.betweenness import betweenness_centrality
from nx_parallel.algorithms.shortest_paths.weighted import all_pairs_bellman_ford_path
from nx_parallel.algorithms.efficiency_measures import (
    local_efficiency,
)
@@ -43,6 +44,9 @@ class Dispatcher:
    # Efficiency
    local_efficiency = local_efficiency

    # Shortest Paths : all pairs shortest paths(bellman_ford)
    all_pairs_bellman_ford_path = all_pairs_bellman_ford_path

    # =============================

    @staticmethod
3 changes: 3 additions & 0 deletions timing/timing_comparison.md
@@ -27,3 +27,6 @@ local_efficiency

tournament is_reachable
![alt text](heatmap_is_reachable_timing.png)

all_pairs_bellman_ford_path
![alt text](heatmap_all_pairs_bellman_ford_path_timing.png)
13 changes: 11 additions & 2 deletions timing/timing_individual_function.py
@@ -1,4 +1,6 @@
import time
import random
import types

import networkx as nx
import pandas as pd
@@ -11,23 +13,30 @@
heatmapDF = pd.DataFrame()
number_of_nodes_list = [10, 50, 100, 300, 500]
pList = [1, 0.8, 0.6, 0.4, 0.2]
currFun = nx.betweenness_centrality
currFun = nx.all_pairs_bellman_ford_path
for i in range(0, len(pList)):
    p = pList[i]
    for j in range(0, len(number_of_nodes_list)):
        num = number_of_nodes_list[j]

        # create original and parallel graphs
        G = nx.fast_gnp_random_graph(num, p, directed=False)

        # for weighted graphs
        for u, v in G.edges():
            G[u][v]['weight'] = random.random()

        H = nx_parallel.ParallelGraph(G)

        # time both versions and update heatmapDF
        t1 = time.time()
        c = currFun(H)
        if type(c)==types.GeneratorType: d = dict(c)
        t2 = time.time()
        parallelTime = t2 - t1
        t1 = time.time()
        c = currFun(G)
        if type(c)==types.GeneratorType: d = dict(c)
Member

Is it worth checking that the results are the same (outside the timed part)? Something like `assert d1 == d2`.

Member Author

I checked `d1 == d2` for `all_pairs_bellman_ford_path` before committing, and it was true in all cases. For `betweenness_centrality`, however, it was not always true; I had to round all the values first. I can add separate tests for all the algorithms if that seems good to you.
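
A hedged sketch of the check discussed in this thread, meant to run outside the timed region: exact equality is enough for path results, while floating-point results such as centrality scores are compared with a tolerance instead of being rounded. `results_match` and its tolerance defaults are hypothetical and not part of the timing script.

```python
import math


def results_match(d1, d2, rel_tol=1e-9, abs_tol=1e-12):
    """Compare two {key: value} results; float values use a tolerance."""
    if d1.keys() != d2.keys():
        return False
    for key, v1 in d1.items():
        v2 = d2[key]
        if isinstance(v1, float) and isinstance(v2, float):
            if not math.isclose(v1, v2, rel_tol=rel_tol, abs_tol=abs_tol):
                return False
        elif v1 != v2:
            return False
    return True


# Floats that differ only by rounding error compare unequal with `==`.
serial = {"a": 0.1 + 0.2}
parallel = {"a": 0.3}
print(serial == parallel)  # False
print(results_match(serial, parallel))  # True
```

Small differences of this kind are plausible when partial sums are combined in a different order across workers, which may be why the centrality values needed rounding.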

        t2 = time.time()
        stdTime = t2 - t1
        timesFaster = stdTime / parallelTime