{ "metadata": { "name": "", "signature": "sha256:0f51e58de9c2d13317870983e692ad145bb2aa6c081dfbefc1620bcc660a71b5" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "#Lab9 - Random Graphs\n", "\n", "In this lab, we explore random graphs, introduced by Erdos and Renyi. You will need to install networkx in order to complete this lab (http://networkx.github.io/documentation/latest/install.html, http://stackoverflow.com/questions/9836909/easy-install-networkx). Many of you may already have networkx because it comes default with the Anaconda installation of iPython." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You will need the following basic imports as well as a function written to draw graphs for you. The structure of a graph object is a collection of edges, in (node1, node2) form. You should know how to use draw_graph, but you don't really need to know how it works. Play around with it to see what it does. Wow! Look at those pretty graphs :)" ] }, { "cell_type": "code", "collapsed": false, "input": [ "%matplotlib inline\n", "from pylab import *\n", "import random as rnd\n", "import networkx as nx\n", "from __future__ import division\n", "\n", "rcParams['figure.figsize'] = 12, 12 # that's default image size for this interactive session\n", "\n", "def draw_graph(graph, labels=None, graph_layout='shell',\n", " node_size=1600, node_color='blue', node_alpha=0.3,\n", " node_text_size=12,\n", " edge_color='blue', edge_alpha=0.3, edge_tickness=1,\n", " edge_text_pos=0.3,\n", " text_font='sans-serif'):\n", " \"\"\" \n", " Based on: https://www.udacity.com/wiki/creating-network-graphs-with-python\n", " We describe a graph as a list enumerating all edges.\n", " Ex: graph = [(1,2), (2,3)] represents a graph with 2 edges - (node1 - node2) and (node2 - node3)\n", " \"\"\"\n", " \n", " # create networkx graph\n", " G=nx.Graph()\n", "\n", " # add edges\n", " for edge in graph:\n", " G.add_edge(edge[0], edge[1])\n", "\n", " # these are different layouts for the network you may try\n", " # shell seems to work best\n", " if graph_layout == 'spring':\n", " graph_pos=nx.spring_layout(G)\n", " elif graph_layout == 'spectral':\n", " graph_pos=nx.spectral_layout(G)\n", " elif graph_layout == 'random':\n", " graph_pos=nx.random_layout(G)\n", " else:\n", " graph_pos=nx.shell_layout(G)\n", "\n", " # draw graph\n", " nx.draw_networkx_nodes(G,graph_pos,node_size=node_size, \n", " alpha=node_alpha, node_color=node_color)\n", " nx.draw_networkx_edges(G,graph_pos,width=edge_tickness,\n", " alpha=edge_alpha,edge_color=edge_color)\n", " nx.draw_networkx_labels(G, graph_pos,font_size=node_text_size,\n", " font_family=text_font)\n", " # show graph\n", " plt.show()" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "graph = [(1,2),(2,3),(1,3)]\n", "draw_graph(graph)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "graph = [(1,1),(2,2)]\n", "draw_graph(graph) # no self-loops, so put a self-loop if you want a disconnected node" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Lets create a function that returns all the nodes that can be reached from a certain starting point given the representation of a graph above.\n", "\n", "### Q1. Fill out the following method, find_connected_component, that takes graph and starting_node as input arguments. It must return a set of nodes that are connected to the starting_node, including the starting_node itself." ] }, { "cell_type": "code", "collapsed": true, "input": [ "def find_connected_component(graph, starting_node):\n", " \"\"\"\n", " >>> graph = [(1,2),(2,3),(1,3)]\n", " >>> find_connected_component(graph,1)\n", " {1, 2, 3}\n", " >>> graph = [(1,1),(2,3),(2,4),(3,5),(3,6),(4,6),(1,7),(7,8),(1,8)]\n", " >>> find_connected_component(graph,1)\n", " {1, 7, 8}\n", " >>> find_connected_component(graph,2)\n", " {2, 3, 4, 5, 6}\n", " \"\"\"\n", " connected_nodes = set()\n", " connected_nodes.add( starting_node )\n", " \n", " #Your code here\n", " \n", " return connected_nodes " ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Q2. Fill out the following method, connected_components, that takes graph and returns all the connected components of the graph. You may want to use the function you wrote above." ] }, { "cell_type": "code", "collapsed": false, "input": [ "def connected_components(graph):\n", " \"\"\"\n", " >>> graph = [(1,1),(2,3),(2,4),(3,5),(3,6),(4,6),(1,7),(7,8),(1,8)]\n", " >>> connected_components(graph)\n", " [{1, 7, 8}, {2, 3, 4, 5, 6}]\n", " >>> largest_component_size(graph)\n", " 5\n", " \"\"\"\n", " nodes = set()\n", " components = []\n", "\n", " # Your code here\n", " \n", " return components\n", "\n", "# These guys should work after you've implemented connected_components\n", "component_sizes = lambda graph: [len(component) for component in (connected_components(graph))]\n", "largest_component_size = lambda graph: max(component_sizes(graph))" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Next, we want to create a function that, given the number of nodes in a graph, will randomly generate edges between nodes. That is, we want to construct a random graph folowing the Erdos-Renyi model.\n", "\n", "### Q3. Fill out the following function to create an Erdos-Renyi random graph $\\operatorname{G}(n,p)$. For each pair of nodes, randomly create an edge between them with probability $p$. Return the resulting graph (same format as before)." ] }, { "cell_type": "code", "collapsed": true, "input": [ "def G(n,p):\n", " graph = [] \n", " # Recall that we describe a graph as a list enumerating all edges. Node names can be numbers.\n", " \n", " #Your code here\n", " \n", " return graph" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "# Try this!\n", "graph = G(10,0.1)\n", "draw_graph(graph)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Phase Transitions!!!\n", "\n", "Now let's examine some of the qualatative properties of a random graph developed in the original Erdos & Renyi paper." ] }, { "cell_type": "code", "collapsed": false, "input": [ "epsilon = 1/100" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Transition 1: If $np < 1$, then a graph in $\\operatorname{G}(n, p)$ will almost surely have no connected components of size larger than $\\operatorname{O}(\\log(n))$" ] }, { "cell_type": "code", "collapsed": false, "input": [ "largest_sizes = []\n", "n = 50\n", "p = 1/50 - epsilon\n", "for i in range(1000):\n", " graph = G(n,p)\n", " largest_sizes.append(largest_component_size(graph))\n", " \n", "\n", "print \"We expect the largest component size to be on the order of: \", np.log2(n)\n", "print \"True average size of the largest component: \", np.mean(largest_sizes)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Transition 2: If $np = 1$, then a graph in $\\operatorname{G}(n, p)$ will almost surely have a largest component whose size is of order $n^{2/3}$." ] }, { "cell_type": "code", "collapsed": false, "input": [ "largest_sizes = []\n", "n = 50\n", "p = 1/50\n", "for i in range(1000):\n", " graph = G(n,p)\n", " largest_sizes.append(largest_component_size(graph))\n", " \n", "\n", "print \"We expect the largest component size to be on the order of: \", n**(2/3)\n", "print \"True average size of the largest component: \", np.mean(largest_sizes)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Transition 3: If $np \u2192 c > 1$, where $c$ is a constant, then a graph in $\\operatorname{G}(n,p)$ will almost surely have a unique giant component containing a positive fraction of the vertices. No other component will contain more than $\\operatorname{O}(\\log(n))$ vertices.\n", "\n", "We'll increase the number of nodes by a factor of 10 here so we can see this more clearly. Pay attention the precipitious decline from the size of the largest connected tomponent to that of all the rest." ] }, { "cell_type": "code", "collapsed": false, "input": [ "largest_sizes = []\n", "epsilon = 1/10000\n", "n = 5000\n", "p = 1/5000 + epsilon\n", "graph = G(n,p)\n", "print sorted(component_sizes(graph))[::-1]" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Transition 4: If $p<\\tfrac{(1-\\epsilon)\\ln n}{n}$, then a graph in $\\operatorname{G}(n,p)$ will almost surely contain isolated vertices, and thus be disconnected." ] }, { "cell_type": "code", "collapsed": false, "input": [ "largest_sizes = []\n", "epsilon = 1/20000\n", "n = 10000\n", "p = (1-epsilon)*np.log(n) / n - epsilon\n", "num_isolated = 0\n", "for _ in range(10):\n", " graph = G(n,p)\n", " if 1 in component_sizes(graph):\n", " num_isolated += 1\n", "print \"Probability of graphs containing isolated vertices: \", num_isolated / 10" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Transition 5: If $p>\\tfrac{(1-\\epsilon)\\ln n}{n}$, then a graph in $\\operatorname{G}(n,p)$ will almost surely be connected." ] }, { "cell_type": "code", "collapsed": false, "input": [ "largest_sizes = []\n", "epsilon = 1/20000\n", "n = 10000\n", "p = (1-epsilon)*np.log(n) / n + epsilon\n", "num_isolated = 0\n", "for _ in range(10):\n", " graph = G(n,p)\n", " if 1 in component_sizes(graph):\n", " num_isolated += 1\n", "print \"Probability that graphs are connected: \", 1 - num_isolated / 10" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ " " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Cool! Now we've experimentally verified the results of the Erdos-Reyni paper. Isn't it neat that you can rigorously formalize this kind of qualatative behavior of a graph? And that you can clearly see these transitions in simulation? I think it's cool." ] } ], "metadata": {} } ] }