{
"metadata": {
"name": "",
"signature": "sha256:0f51e58de9c2d13317870983e692ad145bb2aa6c081dfbefc1620bcc660a71b5"
},
"nbformat": 3,
"nbformat_minor": 0,
"worksheets": [
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#Lab9 - Random Graphs\n",
"\n",
"In this lab, we explore random graphs, introduced by Erdos and Renyi. You will need to install networkx in order to complete this lab (http://networkx.github.io/documentation/latest/install.html, http://stackoverflow.com/questions/9836909/easy-install-networkx). Many of you may already have networkx because it comes default with the Anaconda installation of iPython."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You will need the following basic imports as well as a function written to draw graphs for you. The structure of a graph object is a collection of edges, in (node1, node2) form. You should know how to use `draw_graph`, but you don't really need to know how it works. Play around with it to see what it does. Wow! Look at those pretty graphs `:)`"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"%matplotlib inline\n",
"from pylab import *\n",
"import random as rnd\n",
"import networkx as nx\n",
"from __future__ import division\n",
"\n",
"rcParams['figure.figsize'] = 12, 12 # that's default image size for this interactive session\n",
"\n",
"def draw_graph(graph, labels=None, graph_layout='shell',\n",
" node_size=1600, node_color='blue', node_alpha=0.3,\n",
" node_text_size=12,\n",
" edge_color='blue', edge_alpha=0.3, edge_tickness=1,\n",
" edge_text_pos=0.3,\n",
" text_font='sans-serif'):\n",
" \"\"\" \n",
" Based on: https://www.udacity.com/wiki/creating-network-graphs-with-python\n",
" We describe a graph as a list enumerating all edges.\n",
" Ex: graph = [(1,2), (2,3)] represents a graph with 2 edges - (node1 - node2) and (node2 - node3)\n",
" \"\"\"\n",
" \n",
" # create networkx graph\n",
" G=nx.Graph()\n",
"\n",
" # add edges\n",
" for edge in graph:\n",
" G.add_edge(edge[0], edge[1])\n",
"\n",
" # these are different layouts for the network you may try\n",
" # shell seems to work best\n",
" if graph_layout == 'spring':\n",
" graph_pos=nx.spring_layout(G)\n",
" elif graph_layout == 'spectral':\n",
" graph_pos=nx.spectral_layout(G)\n",
" elif graph_layout == 'random':\n",
" graph_pos=nx.random_layout(G)\n",
" else:\n",
" graph_pos=nx.shell_layout(G)\n",
"\n",
" # draw graph\n",
" nx.draw_networkx_nodes(G,graph_pos,node_size=node_size, \n",
" alpha=node_alpha, node_color=node_color)\n",
" nx.draw_networkx_edges(G,graph_pos,width=edge_tickness,\n",
" alpha=edge_alpha,edge_color=edge_color)\n",
" nx.draw_networkx_labels(G, graph_pos,font_size=node_text_size,\n",
" font_family=text_font)\n",
" # show graph\n",
" plt.show()"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"graph = [(1,2),(2,3),(1,3)]\n",
"draw_graph(graph)"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"graph = [(1,1),(2,2)]\n",
"draw_graph(graph) # no self-loops, so put a self-loop if you want a disconnected node"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Lets create a function that returns all the nodes that can be reached from a certain starting point given the representation of a graph above.\n",
"\n",
"### Q1. Fill out the following method, `find_connected_component`, that takes graph and starting_node as input arguments. It must return a set of nodes that are connected to the `starting_node`, including the `starting_node` itself."
]
},
{
"cell_type": "code",
"collapsed": true,
"input": [
"def find_connected_component(graph, starting_node):\n",
" \"\"\"\n",
" >>> graph = [(1,2),(2,3),(1,3)]\n",
" >>> find_connected_component(graph,1)\n",
" {1, 2, 3}\n",
" >>> graph = [(1,1),(2,3),(2,4),(3,5),(3,6),(4,6),(1,7),(7,8),(1,8)]\n",
" >>> find_connected_component(graph,1)\n",
" {1, 7, 8}\n",
" >>> find_connected_component(graph,2)\n",
" {2, 3, 4, 5, 6}\n",
" \"\"\"\n",
" connected_nodes = set()\n",
" connected_nodes.add( starting_node )\n",
" \n",
" #Your code here\n",
" \n",
" return connected_nodes "
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Q2. Fill out the following method, `connected_components`, that takes graph and returns all the connected components of the graph. You may want to use the function you wrote above."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def connected_components(graph):\n",
" \"\"\"\n",
" >>> graph = [(1,1),(2,3),(2,4),(3,5),(3,6),(4,6),(1,7),(7,8),(1,8)]\n",
" >>> connected_components(graph)\n",
" [{1, 7, 8}, {2, 3, 4, 5, 6}]\n",
" >>> largest_component_size(graph)\n",
" 5\n",
" \"\"\"\n",
" nodes = set()\n",
" components = []\n",
"\n",
" # Your code here\n",
" \n",
" return components\n",
"\n",
"# These guys should work after you've implemented connected_components\n",
"component_sizes = lambda graph: [len(component) for component in (connected_components(graph))]\n",
"largest_component_size = lambda graph: max(component_sizes(graph))"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Next, we want to create a function that, given the number of nodes in a graph, will randomly generate edges between nodes. That is, we want to construct a random graph folowing the Erdos-Renyi model.\n",
"\n",
"### Q3. Fill out the following function to create an Erdos-Renyi random graph $\\operatorname{G}(n,p)$. For each pair of nodes, randomly create an edge between them with probability $p$. Return the resulting graph (same format as before)."
]
},
{
"cell_type": "code",
"collapsed": true,
"input": [
"def G(n,p):\n",
" graph = [] \n",
" # Recall that we describe a graph as a list enumerating all edges. Node names can be numbers.\n",
" \n",
" #Your code here\n",
" \n",
" return graph"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# Try this!\n",
"graph = G(10,0.1)\n",
"draw_graph(graph)"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Phase Transitions!!!\n",
"\n",
"Now let's examine some of the qualatative properties of a random graph developed in the original Erdos & Renyi paper."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"epsilon = 1/100"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Transition 1: If $np < 1$, then a graph in $\\operatorname{G}(n, p)$ will almost surely have no connected components of size larger than $\\operatorname{O}(\\log(n))$"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"largest_sizes = []\n",
"n = 50\n",
"p = 1/50 - epsilon\n",
"for i in range(1000):\n",
" graph = G(n,p)\n",
" largest_sizes.append(largest_component_size(graph))\n",
" \n",
"\n",
"print \"We expect the largest component size to be on the order of: \", np.log2(n)\n",
"print \"True average size of the largest component: \", np.mean(largest_sizes)"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Transition 2: If $np = 1$, then a graph in $\\operatorname{G}(n, p)$ will almost surely have a largest component whose size is of order $n^{2/3}$."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"largest_sizes = []\n",
"n = 50\n",
"p = 1/50\n",
"for i in range(1000):\n",
" graph = G(n,p)\n",
" largest_sizes.append(largest_component_size(graph))\n",
" \n",
"\n",
"print \"We expect the largest component size to be on the order of: \", n**(2/3)\n",
"print \"True average size of the largest component: \", np.mean(largest_sizes)"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Transition 3: If $np \u2192 c > 1$, where $c$ is a constant, then a graph in $\\operatorname{G}(n,p)$ will almost surely have a unique giant component containing a positive fraction of the vertices. No other component will contain more than $\\operatorname{O}(\\log(n))$ vertices.\n",
"\n",
"We'll increase the number of nodes by a factor of 10 here so we can see this more clearly. Pay attention the precipitious decline from the size of the largest connected tomponent to that of all the rest."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"largest_sizes = []\n",
"epsilon = 1/10000\n",
"n = 5000\n",
"p = 1/5000 + epsilon\n",
"graph = G(n,p)\n",
"print sorted(component_sizes(graph))[::-1]"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Transition 4: If $p<\\tfrac{(1-\\epsilon)\\ln n}{n}$, then a graph in $\\operatorname{G}(n,p)$ will almost surely contain isolated vertices, and thus be disconnected."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"largest_sizes = []\n",
"epsilon = 1/20000\n",
"n = 10000\n",
"p = (1-epsilon)*np.log(n) / n - epsilon\n",
"num_isolated = 0\n",
"for _ in range(10):\n",
" graph = G(n,p)\n",
" if 1 in component_sizes(graph):\n",
" num_isolated += 1\n",
"print \"Probability of graphs containing isolated vertices: \", num_isolated / 10"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Transition 5: If $p>\\tfrac{(1-\\epsilon)\\ln n}{n}$, then a graph in $\\operatorname{G}(n,p)$ will almost surely be connected."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"largest_sizes = []\n",
"epsilon = 1/20000\n",
"n = 10000\n",
"p = (1-epsilon)*np.log(n) / n + epsilon\n",
"num_isolated = 0\n",
"for _ in range(10):\n",
" graph = G(n,p)\n",
" if 1 in component_sizes(graph):\n",
" num_isolated += 1\n",
"print \"Probability that graphs are connected: \", 1 - num_isolated / 10"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
" "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Cool! Now we've experimentally verified the results of the Erdos-Reyni paper. Isn't it neat that you can rigorously formalize this kind of qualatative behavior of a graph? And that you can clearly see these transitions in simulation? I think it's cool."
]
}
],
"metadata": {}
}
]
}