Network Connectivity
Original Source: https://www.coursera.org/specializations/data-science-python
Clustering Coefficient
Triadic closure: The tendency for people who share connections in a social network to become connected.
To measure the prevalence of triadic closure in a network, we use clustering coefficient
.
Local Clustering Coefficient
Fraction of pairs of the nodes’ friends that are friends with each other
Example
Local Clustering Coefficient of ‘C’ = number of pairs of C's friends who are friends
/ number of pairs of C's friends
= number of pairs of C's friends who are friends
/ (degree of C
X (degree of C
-1) / 2)
2 / (12/2) = 0.333
We will assume that the local clustering coefficient of a node of degree less than 2 to be 0.
%matplotlib inline
import networkx as nx
import matplotlib.pyplot as plt
import numpy as np
np.random.seed(0)
G = nx.Graph()
G.add_edges_from([('A', 'K'), ('A', 'B'), ('A', 'C'), ('B', 'C'), ('B', 'K'),
('C', 'E'), ('C', 'F'), ('D', 'E'), ('E', 'F'), ('E', 'H'), ('F', 'G'), ('I', 'J')])
print(nx.clustering(G, 'C'))
print(nx.clustering(G, 'J'))
0.3333333333333333
0
Global Clustering Coefficient
Approach 1) Average clustering: Average local clustering coefficient over all nodes in the graph
nx.average_clustering(G)
0.28787878787878785
Approach 2) Transitivity: Percentage of “open triads” that are triangles in a network
Transitivity = 3 X Number of closed triads (Triangles)
/ Number of open triads
nx.transitivity(G)
0.4090909090909091
Transitivity vs Average Clustering Coefficient
Transitivity weights nodes with large degree higher
Distance Measures
Distance between two nodes
the length of the shortest path between them.
Example
The distance between node A and H is 4
G.remove_edge('A','C')
G.add_edges_from([('E', 'I')])
print(nx.shortest_path(G, 'A', 'H'))
print(nx.shortest_path_length(G, 'A', 'H'))
['A', 'B', 'C', 'E', 'H']
4
Distance between one node to every other nodes
Breadth-first search: a systematic and efficient procedure for computing distances from a node to all other nodes in a large network by “discovering”nodes in layers
# Generating breadth-first search tree
T = nx.bfs_tree(G, 'A')
T.edges()
OutEdgeView([('A', 'K'), ('A', 'B'), ('B', 'C'), ('C', 'E'), ('C', 'F'), ('E', 'D'), ('E', 'H'), ('E', 'I'), ('F', 'G'), ('I', 'J')])
nx.shortest_path_length(G, 'A')
{'A': 0,
'K': 1,
'B': 1,
'C': 2,
'E': 3,
'F': 3,
'D': 4,
'H': 4,
'I': 4,
'G': 4,
'J': 5}
Global distance measures
1. Average distance
nx.average_shortest_path_length(G)
2.5272727272727273
2. Diameter: maximum distance between any pair of nodes
nx.diameter(G)
5
3. Eccentricitiy: largest distance between node n and all other nodes
nx.eccentricity(G)
{'A': 5,
'K': 5,
'B': 4,
'C': 3,
'E': 3,
'F': 3,
'D': 4,
'H': 4,
'G': 4,
'I': 4,
'J': 5}
4. radius: minimum eccentricity
nx.radius(G)
3
5. Periphery: set of nodes that have eccentricity equal to the diameter
nx.periphery(G)
['A', 'K', 'J']
6. center: set of nodes that have eccentricity equal to the radius
nx.center(G)
['C', 'E', 'F']
Connected Components
Connectivity in undirected graphs
An undirected graph is connected if, for every pair nodes, there is a path between them.
G = nx.Graph()
G.add_edges_from([('A', 'B'), ('A', 'C'), ('G', 'J'), ('A', 'E'), ('A', 'G'), ('B', 'C'), ('B', 'D'), ('B', 'E'), ('C', 'D'), ('C', 'E'), ('F', 'G'), ('F', 'I'), ('F', 'J'), ('I', 'J'), ('I', 'H'), ('I', 'G'), ('H', 'G'), ('J', 'O'), ('A', 'N'), ('M', 'L'), ('M', 'K'), ('L', 'K'), ('L', 'N'), ('L', 'O'), ('O', 'K'), ('O', 'N')])
nx.is_connected(G)
True
G.remove_edges_from([('A', 'G'), ('A', 'N'), ('J', 'O')])
nx.is_connected(G)
False
Connected component
A subset of nodes such as:
- Every node in the subset has a path to every other node.
- No other node has a path to any node in the subset.
nx.number_connected_components(G)
3
sorted(nx.connected_components(G))
[{'A', 'B', 'C', 'D', 'E'},
{'F', 'G', 'H', 'I', 'J'},
{'K', 'L', 'M', 'N', 'O'}]
nx.node_connected_component(G, 'M')
{'K', 'L', 'M', 'N', 'O'}
Connectivitiy in directed graphs
A directed graph is strongly connected if, for every pair nodes u and v, there is a directed path from u to v and a directed path from v to u.
This graph is not strongly connected since there is no directed path from A and H.
G_di = nx.DiGraph()
G_di.add_edges_from([('A', 'B'), ('C', 'A'), ('A', 'E'), ('G', 'J'), ('G', 'A'), ('D', 'B'), ('B', 'E'), ('B' 'C'), ('C', 'D'), ('E', 'C'), ('E', 'D'), ('F', 'G'), ('I', 'F'), ('J', 'F'), ('I', 'J'), ('I', 'H'), ('H', 'I'), ('I', 'G'), ('H', 'G'), ('O', 'J'), ('J', 'O'), ('A', 'N'), ('L', 'M'), ('K', 'M'), ('K', 'L'), ('N', 'L'), ('O', 'L'), ('O', 'K'), ('N', 'O')])
nx.is_strongly_connected(G_di)
False
A directed graph is weakly connected if replacing all directed edges with undirected edges produces a connected undirected graph.
nx.is_weakly_connected(G_di)
True
Strongly connected component
A subset of nodes such as:
- Every node in the subset has a directed path to every other node.
- No other node has a directed path to every node in the subset.
sorted(nx.strongly_connected_components(G_di))
[{'M'},
{'L'},
{'K'},
{'A', 'B', 'C', 'D', 'E', 'F', 'G', 'J', 'N', 'O'},
{'H', 'I'}]
sorted(nx.weakly_connected_components(G_di))
[{'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O'}]
Network Robustness
Network robustness is the ability of a network to maintain its general structural properties when it faces failures or attacks. (ex. airport closures, internet router failures, power line failures)
Failures or attacks means removal of nodes or edges, and structural properties means connectivity of a network.
graph G (undirected graph)
Measurements of network robustness
1. Node connectivity
smallest number of nodes that can be removed from a graph in order to disconnect it.
G.add_edges_from([('A', 'G'), ('A', 'N'), ('J', 'O')])
nx.node_connectivity(G)
1
# which edges?
nx.minimum_node_cut(G)
{'A'}
2. Edge connectivity
smallest number of edges that can be removed from a graph in order to disconnect it.
nx.edge_connectivity(G)
2
# which edges?
nx.minimum_edge_cut(G)
{('A', 'N'), ('J', 'O')}
For Directed Graphs
graph G_di (directed graph)
- Simple Paths
Imagine node G wants to send a message to node L by passing it along to other nodes in this network.What options does G have to deliver the message?
sorted(nx.all_simple_paths(G_di, 'G', 'L'))
[['G', 'A', 'N', 'L'],
['G', 'A', 'N', 'O', 'K', 'L'],
['G', 'A', 'N', 'O', 'L'],
['G', 'J', 'O', 'K', 'L'],
['G', 'J', 'O', 'L']]
** 1. Node Connectivitiy**
If we wanted to block the message from G to L by removing nodes from the network, how many nodes would we need to remove?
nx.node_connectivity(G, 'G', 'L')
2
nx.minimum_node_cut(G, 'G', 'L')
{'N', 'O'}
nx.minimum_node_cut
only returns one case of many. For example, cutting ‘J’ and ‘A’ would have also disconnected the network.
2. Edge Connectivity
If we wanted to block the message from G to L by removing edges from the network, how many edges would we need to remove?
nx.edge_connectivity(G, 'G', 'L')
2
nx.minimum_edge_cut(G, 'G', 'L')
{('A', 'N'), ('J', 'O')}
Visualizing Networks
We will create a graph which simulates populations of a state and mobility between states.
Then we will visualize it in different ways.
# create a graph
n_nodes = 20
G = nx.fast_gnp_random_graph(n_nodes, 0.3)
edges = list(G.edges())
n_edges = len(edges)
# set node attributes (population and location)
population = np.random.randint(low=1000, high=10000, size=n_nodes)
location = list(zip(np.random.randn(n_nodes), np.random.randn(n_nodes)))
node_attr = {i: {'population': population[i], 'location': location[i]} for i in range(20)}
nx.set_node_attributes(G, node_attr)
# set edge attributes (weight, which means mobility between states here)
weight = np.random.randint(low=1, high=100, size=n_edges)
edge_attr = {edges[i]: {'weight': weight[i]} for i in range(n_edges)}
nx.set_edge_attributes(G, edge_attr)
# check that the attributes are assigned successfully
print(nx.get_node_attributes(G, 'population'))
print(nx.get_node_attributes(G, 'location'))
print(nx.get_edge_attributes(G, 'weight'))
{0: 3732, 1: 4264, 2: 5859, 3: 8891, 4: 5373, 5: 6874, 6: 7744, 7: 4468, 8: 1705, 9: 3599, 10: 3222, 11: 8768, 12: 3897, 13: 1537, 14: 7216, 15: 7921, 16: 7036, 17: 3163, 18: 6072, 19: 5851}
{0: (-0.07770456650883938, 2.161717373494259), 1: (1.0896301649283022, -0.9569314292805337), 2: (0.09654266759327351, 0.06731083219488099), 3: (1.4186671084676297, 0.20649883721106427), 4: (1.1682731374280053, -0.45688132622556726), 5: (0.9471859464018728, -1.0599757640207523), 6: (1.0854870347559817, 0.6149573156935568), 7: (2.3822244479133983, 1.4296607664750955), 8: (-0.40602373632291733, -0.21195225635208395), 9: (0.2664453411788736, -0.08033725576207178), 10: (-1.3557137239186656, 0.4053977840180792), 11: (-0.11410253188685747, 0.11860658523361353), 12: (-0.844230864581227, 1.2544140683375582), 13: (0.7056408100507655, 1.419102040788718), 14: (-0.39878617335193706, -0.7438560828698322), 15: (-0.8271965334869942, -2.5174371027954736), 16: (-0.41574470158503485, -1.5070960235350006), 17: (-0.5245121937314283, 1.149076125381565), 18: (0.8131012718382519, -1.193578251944493), 19: (-0.2292506291606075, 1.1410424458433976)}
{(0, 2): 70, (0, 3): 42, (0, 8): 36, (0, 11): 65, (0, 17): 96, (1, 5): 70, (1, 6): 95, (1, 7): 1, (1, 8): 51, (1, 9): 37, (1, 13): 35, (1, 14): 49, (2, 7): 94, (3, 4): 4, (3, 10): 99, (3, 11): 43, (3, 12): 78, (3, 17): 22, (3, 19): 74, (4, 6): 1, (4, 10): 11, (4, 11): 44, (4, 15): 59, (4, 18): 24, (4, 19): 60, (5, 6): 3, (5, 7): 99, (5, 14): 63, (5, 15): 36, (6, 13): 95, (7, 10): 68, (7, 12): 83, (7, 13): 47, (7, 17): 21, (7, 18): 82, (9, 12): 51, (9, 16): 28, (9, 17): 15, (9, 18): 42, (10, 14): 59, (10, 17): 66, (10, 19): 37, (11, 12): 11, (11, 13): 87, (11, 14): 44, (11, 16): 12, (11, 19): 3, (12, 14): 52, (12, 16): 81, (12, 17): 33, (12, 19): 55, (13, 14): 1, (14, 15): 39, (14, 18): 20, (14, 19): 47, (16, 18): 43, (17, 19): 57}
# draw the graph using the default spring layout
plt.figure(figsize=(10,9))
nx.draw_networkx(G)
# See what layouts are available in networkX
[x for x in nx.__dir__() if x.endswith('_layout')]
['circular_layout',
'kamada_kawai_layout',
'random_layout',
'rescale_layout',
'shell_layout',
'spring_layout',
'spectral_layout',
'fruchterman_reingold_layout']
# Draw the graph using the random layout
plt.figure(figsize=(10,9))
pos = nx.random_layout(G)
nx.draw_networkx(G, pos)
# Draw the graph using the circular layout
plt.figure(figsize=(10,9))
pos = nx.circular_layout(G)
nx.draw_networkx(G, pos)
# Draw the graph using custom node positions
plt.figure(figsize=(10,7))
pos = nx.get_node_attributes(G, 'location')
nx.draw_networkx(G, pos)
Draw the graph adding alpha and softening edge color
plt.figure(figsize=(10,7))
nx.draw_networkx(G, pos, alpha=0.7, edge_color='.4')
plt.axis('off')
plt.tight_layout();
Draw graph with varying node color, node size, and edge width. Also specify edge that has weight bigger that 90.
plt.figure(figsize=(10,7))
node_color = [G.degree(v) for v in G]
node_size = [0.1*nx.get_node_attributes(G, 'population')[v] for v in G]
edge_width = [0.015*G[u][v]['weight'] for u,v in G.edges()]
nx.draw_networkx(G, pos, node_size=node_size,
node_color=node_color, alpha=0.7, with_labels=True,
width=edge_width, edge_color='.4', cmap=plt.cm.Blues)
greater_than_90 = [x for x in G.edges(data=True) if x[2]['weight']>90]
nx.draw_networkx_edges(G, pos, edgelist=greater_than_90, edge_color='r', alpha=0.2, width=5)
plt.axis('off')
plt.tight_layout();
Leave a Comment