Network Connectivity

Original Source: https://www.coursera.org/specializations/data-science-python

Clustering Coefficient

Triadic closure: The tendency for people who share connections in a social network to become connected.

To measure the prevalence of triadic closure in a network, we use clustering coefficient.

Local Clustering Coefficient

Fraction of pairs of the nodes’ friends that are friends with each other

Example

local clustering coefficient

Local Clustering Coefficient of ‘C’ = number of pairs of C's friends who are friends / number of pairs of C's friends = number of pairs of C's friends who are friends / (degree of C X (degree of C-1) / 2)
2 / (12/2) = 0.333

We will assume that the local clustering coefficient of a node of degree less than 2 to be 0.

%matplotlib inline

import networkx as nx
import matplotlib.pyplot as plt
import numpy as np
np.random.seed(0)

G = nx.Graph()
G.add_edges_from([('A', 'K'), ('A', 'B'), ('A', 'C'), ('B', 'C'), ('B', 'K'),
('C', 'E'), ('C', 'F'), ('D', 'E'), ('E', 'F'), ('E', 'H'), ('F', 'G'), ('I', 'J')])
print(nx.clustering(G, 'C'))
print(nx.clustering(G, 'J'))
0.3333333333333333
0

Global Clustering Coefficient

Approach 1) Average clustering: Average local clustering coefficient over all nodes in the graph

nx.average_clustering(G)
0.28787878787878785

Approach 2) Transitivity: Percentage of “open triads” that are triangles in a network
Transitivity = 3 X Number of closed triads (Triangles) / Number of open triads

triads

nx.transitivity(G)
0.4090909090909091

Transitivity vs Average Clustering Coefficient

Transitivity weights nodes with large degree higher

transitivity vs average clustering coefficient

Distance Measures

Distance between two nodes

the length of the shortest path between them.
Example

graph

The distance between node A and H is 4

G.remove_edge('A','C')
G.add_edges_from([('E', 'I')])

print(nx.shortest_path(G, 'A', 'H'))
print(nx.shortest_path_length(G, 'A', 'H'))
['A', 'B', 'C', 'E', 'H']
4

Distance between one node to every other nodes

Breadth-first search: a systematic and efficient procedure for computing distances from a node to all other nodes in a large network by “discovering”nodes in layers

breadth-first serach

# Generating breadth-first search tree
T = nx.bfs_tree(G, 'A')
T.edges()
OutEdgeView([('A', 'K'), ('A', 'B'), ('B', 'C'), ('C', 'E'), ('C', 'F'), ('E', 'D'), ('E', 'H'), ('E', 'I'), ('F', 'G'), ('I', 'J')])
nx.shortest_path_length(G, 'A')
{'A': 0,
 'K': 1,
 'B': 1,
 'C': 2,
 'E': 3,
 'F': 3,
 'D': 4,
 'H': 4,
 'I': 4,
 'G': 4,
 'J': 5}

Global distance measures

1. Average distance

nx.average_shortest_path_length(G)
2.5272727272727273

2. Diameter: maximum distance between any pair of nodes

nx.diameter(G)
5

3. Eccentricitiy: largest distance between node n and all other nodes

nx.eccentricity(G)
{'A': 5,
 'K': 5,
 'B': 4,
 'C': 3,
 'E': 3,
 'F': 3,
 'D': 4,
 'H': 4,
 'G': 4,
 'I': 4,
 'J': 5}

4. radius: minimum eccentricity

nx.radius(G)
3

5. Periphery: set of nodes that have eccentricity equal to the diameter

nx.periphery(G)
['A', 'K', 'J']

6. center: set of nodes that have eccentricity equal to the radius

nx.center(G)
['C', 'E', 'F']

Connected Components

Connectivity in undirected graphs

An undirected graph is connected if, for every pair nodes, there is a path between them.

graph

G = nx.Graph()
G.add_edges_from([('A', 'B'), ('A', 'C'), ('G', 'J'), ('A', 'E'), ('A', 'G'), ('B', 'C'), ('B', 'D'), ('B', 'E'), ('C', 'D'), ('C', 'E'), ('F', 'G'), ('F', 'I'), ('F', 'J'), ('I', 'J'), ('I', 'H'), ('I', 'G'), ('H', 'G'), ('J', 'O'), ('A', 'N'), ('M', 'L'), ('M', 'K'), ('L', 'K'), ('L', 'N'), ('L', 'O'), ('O', 'K'), ('O', 'N')])
nx.is_connected(G)
True
G.remove_edges_from([('A', 'G'), ('A', 'N'), ('J', 'O')])
nx.is_connected(G)
False

Connected component

A subset of nodes such as:

  1. Every node in the subset has a path to every other node.
  2. No other node has a path to any node in the subset.
nx.number_connected_components(G)
3
sorted(nx.connected_components(G))
[{'A', 'B', 'C', 'D', 'E'},
 {'F', 'G', 'H', 'I', 'J'},
 {'K', 'L', 'M', 'N', 'O'}]
nx.node_connected_component(G, 'M')
{'K', 'L', 'M', 'N', 'O'}

Connectivitiy in directed graphs

A directed graph is strongly connected if, for every pair nodes u and v, there is a directed path from u to v and a directed path from v to u.

graph

This graph is not strongly connected since there is no directed path from A and H.

G_di = nx.DiGraph()
G_di.add_edges_from([('A', 'B'), ('C', 'A'), ('A', 'E'), ('G', 'J'), ('G', 'A'), ('D', 'B'), ('B', 'E'), ('B' 'C'), ('C', 'D'), ('E', 'C'), ('E', 'D'), ('F', 'G'), ('I', 'F'), ('J', 'F'), ('I', 'J'), ('I', 'H'), ('H', 'I'), ('I', 'G'), ('H', 'G'), ('O', 'J'), ('J', 'O'), ('A', 'N'), ('L', 'M'), ('K', 'M'), ('K', 'L'), ('N', 'L'), ('O', 'L'), ('O', 'K'), ('N', 'O')])
nx.is_strongly_connected(G_di)
False

A directed graph is weakly connected if replacing all directed edges with undirected edges produces a connected undirected graph.

nx.is_weakly_connected(G_di)
True

Strongly connected component
A subset of nodes such as:

  1. Every node in the subset has a directed path to every other node.
  2. No other node has a directed path to every node in the subset.
sorted(nx.strongly_connected_components(G_di))
[{'M'},
 {'L'},
 {'K'},
 {'A', 'B', 'C', 'D', 'E', 'F', 'G', 'J', 'N', 'O'},
 {'H', 'I'}]
sorted(nx.weakly_connected_components(G_di))
[{'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O'}]

Network Robustness

Network robustness is the ability of a network to maintain its general structural properties when it faces failures or attacks. (ex. airport closures, internet router failures, power line failures)
Failures or attacks means removal of nodes or edges, and structural properties means connectivity of a network.

graph G (undirected graph)

graph

Measurements of network robustness

1. Node connectivity
smallest number of nodes that can be removed from a graph in order to disconnect it.

G.add_edges_from([('A', 'G'), ('A', 'N'), ('J', 'O')])
nx.node_connectivity(G)
1
# which edges?
nx.minimum_node_cut(G)
{'A'}

2. Edge connectivity
smallest number of edges that can be removed from a graph in order to disconnect it.

nx.edge_connectivity(G)
2
# which edges?
nx.minimum_edge_cut(G)
{('A', 'N'), ('J', 'O')}

For Directed Graphs

graph G_di (directed graph)

graph

  • Simple Paths
    Imagine node G wants to send a message to node L by passing it along to other nodes in this network.What options does G have to deliver the message?
sorted(nx.all_simple_paths(G_di, 'G', 'L'))
[['G', 'A', 'N', 'L'],
 ['G', 'A', 'N', 'O', 'K', 'L'],
 ['G', 'A', 'N', 'O', 'L'],
 ['G', 'J', 'O', 'K', 'L'],
 ['G', 'J', 'O', 'L']]

** 1. Node Connectivitiy**
If we wanted to block the message from G to L by removing nodes from the network, how many nodes would we need to remove?

nx.node_connectivity(G, 'G', 'L')
2
nx.minimum_node_cut(G, 'G', 'L')
{'N', 'O'}

nx.minimum_node_cut only returns one case of many. For example, cutting ‘J’ and ‘A’ would have also disconnected the network.

2. Edge Connectivity
If we wanted to block the message from G to L by removing edges from the network, how many edges would we need to remove?

nx.edge_connectivity(G, 'G', 'L')
2
nx.minimum_edge_cut(G, 'G', 'L')
{('A', 'N'), ('J', 'O')}

Visualizing Networks

We will create a graph which simulates populations of a state and mobility between states.
Then we will visualize it in different ways.

# create a graph
n_nodes = 20
G = nx.fast_gnp_random_graph(n_nodes, 0.3)
edges = list(G.edges())
n_edges = len(edges)

# set node attributes (population and location)
population = np.random.randint(low=1000, high=10000, size=n_nodes)
location = list(zip(np.random.randn(n_nodes), np.random.randn(n_nodes)))
node_attr = {i: {'population': population[i], 'location': location[i]} for i in range(20)}
nx.set_node_attributes(G, node_attr)

# set edge attributes (weight, which means mobility between states here)
weight = np.random.randint(low=1, high=100, size=n_edges)
edge_attr = {edges[i]: {'weight': weight[i]} for i in range(n_edges)}
nx.set_edge_attributes(G, edge_attr)

# check that the attributes are assigned successfully
print(nx.get_node_attributes(G, 'population'))
print(nx.get_node_attributes(G, 'location'))
print(nx.get_edge_attributes(G, 'weight'))
{0: 3732, 1: 4264, 2: 5859, 3: 8891, 4: 5373, 5: 6874, 6: 7744, 7: 4468, 8: 1705, 9: 3599, 10: 3222, 11: 8768, 12: 3897, 13: 1537, 14: 7216, 15: 7921, 16: 7036, 17: 3163, 18: 6072, 19: 5851}
{0: (-0.07770456650883938, 2.161717373494259), 1: (1.0896301649283022, -0.9569314292805337), 2: (0.09654266759327351, 0.06731083219488099), 3: (1.4186671084676297, 0.20649883721106427), 4: (1.1682731374280053, -0.45688132622556726), 5: (0.9471859464018728, -1.0599757640207523), 6: (1.0854870347559817, 0.6149573156935568), 7: (2.3822244479133983, 1.4296607664750955), 8: (-0.40602373632291733, -0.21195225635208395), 9: (0.2664453411788736, -0.08033725576207178), 10: (-1.3557137239186656, 0.4053977840180792), 11: (-0.11410253188685747, 0.11860658523361353), 12: (-0.844230864581227, 1.2544140683375582), 13: (0.7056408100507655, 1.419102040788718), 14: (-0.39878617335193706, -0.7438560828698322), 15: (-0.8271965334869942, -2.5174371027954736), 16: (-0.41574470158503485, -1.5070960235350006), 17: (-0.5245121937314283, 1.149076125381565), 18: (0.8131012718382519, -1.193578251944493), 19: (-0.2292506291606075, 1.1410424458433976)}
{(0, 2): 70, (0, 3): 42, (0, 8): 36, (0, 11): 65, (0, 17): 96, (1, 5): 70, (1, 6): 95, (1, 7): 1, (1, 8): 51, (1, 9): 37, (1, 13): 35, (1, 14): 49, (2, 7): 94, (3, 4): 4, (3, 10): 99, (3, 11): 43, (3, 12): 78, (3, 17): 22, (3, 19): 74, (4, 6): 1, (4, 10): 11, (4, 11): 44, (4, 15): 59, (4, 18): 24, (4, 19): 60, (5, 6): 3, (5, 7): 99, (5, 14): 63, (5, 15): 36, (6, 13): 95, (7, 10): 68, (7, 12): 83, (7, 13): 47, (7, 17): 21, (7, 18): 82, (9, 12): 51, (9, 16): 28, (9, 17): 15, (9, 18): 42, (10, 14): 59, (10, 17): 66, (10, 19): 37, (11, 12): 11, (11, 13): 87, (11, 14): 44, (11, 16): 12, (11, 19): 3, (12, 14): 52, (12, 16): 81, (12, 17): 33, (12, 19): 55, (13, 14): 1, (14, 15): 39, (14, 18): 20, (14, 19): 47, (16, 18): 43, (17, 19): 57}
# draw the graph using the default spring layout
plt.figure(figsize=(10,9))
nx.draw_networkx(G)

png

# See what layouts are available in networkX
[x for x in nx.__dir__() if x.endswith('_layout')]
['circular_layout',
 'kamada_kawai_layout',
 'random_layout',
 'rescale_layout',
 'shell_layout',
 'spring_layout',
 'spectral_layout',
 'fruchterman_reingold_layout']
# Draw the graph using the random layout
plt.figure(figsize=(10,9))
pos = nx.random_layout(G)
nx.draw_networkx(G, pos)

png

# Draw the graph using the circular layout
plt.figure(figsize=(10,9))
pos = nx.circular_layout(G)
nx.draw_networkx(G, pos)

png

# Draw the graph using custom node positions
plt.figure(figsize=(10,7))

pos = nx.get_node_attributes(G, 'location')
nx.draw_networkx(G, pos)

png

Draw the graph adding alpha and softening edge color

plt.figure(figsize=(10,7))

nx.draw_networkx(G, pos, alpha=0.7,  edge_color='.4')

plt.axis('off')
plt.tight_layout();

png

Draw graph with varying node color, node size, and edge width. Also specify edge that has weight bigger that 90.

plt.figure(figsize=(10,7))

node_color = [G.degree(v) for v in G]
node_size = [0.1*nx.get_node_attributes(G, 'population')[v] for v in G]
edge_width = [0.015*G[u][v]['weight'] for u,v in G.edges()]

nx.draw_networkx(G, pos, node_size=node_size,
                 node_color=node_color, alpha=0.7, with_labels=True,
                 width=edge_width, edge_color='.4', cmap=plt.cm.Blues)


greater_than_90 = [x for x in G.edges(data=True) if x[2]['weight']>90]
nx.draw_networkx_edges(G, pos, edgelist=greater_than_90, edge_color='r', alpha=0.2, width=5)


plt.axis('off')
plt.tight_layout();

png

Leave a Comment