{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Lab 5 (Part 1) - PageRank\n",
"\n",
"#### Authors:\n",
"\n",
"##### v1.0 (2014 Fall) Rishi Sharma \\*, Sahaana Suri \\*, Kangwook Lee \\*\\*, Kannan Ramchandran \\*\\*\n",
"##### v1.1 (2015 Fall) Kabir Chandrasekher \\*, Max Kanwal \\*, Kangwook Lee \\*\\*, Kannan Ramchandran \\*\\*\n",
"\n",
"From Wikipedia: \n",
"\n",
"\n",
"> PageRank is an algorithm used by Google Search to rank websites in their search engine results. PageRank was named after Larry Page, one of the founders of Google. PageRank is a way of measuring the importance of website pages. According to Google:\n",
"\n",
">>PageRank works by counting the number and quality of links to a page to determine a rough estimate of how important the website is. The underlying assumption is that more important websites are likely to receive more links from other websites.\n",
"\n",
"\n",
"There are four common frameworks by which academics view Google's PageRank algorithm. The first looks at the social impact, both positive and negative, of immediate access to previously unimaginable knowledge through one centralized terminal. The second, and most mathematical, sees PageRank as a computation of the Singular Value Decomposition (SVD) of the adjacency matrix of the graph formed by the internet, with particular emphasis paid to the first few singular vectors. The third, and most far-reaching, practical technical implication of Google's work is the implementation of algorithms and computation at enormous scale. Much of the computing infrastructure which operates at a global scale deployed today can trace its origins to Google's need to perform SVD on an object as enormous as the Internet. Finally, a more intuitive way to look at the PageRank algorithm is through the lens of a web crawler (or many web crawler) acting as an agent (or agents) in a Markov Chain the size of the web. We will investigate this viewpoint.\n",
"\n",
"This crawler is searching for an approximate \"invariant\" distribution (why does a true invariant distribution almost certainly not exist?) and will rank pages based on their \"probability\" in this generated distribution. In order to do so, our crawler chooses to follow a link uniformly at random from the page it is on in order to arrive at a new page, keeping tally of how many times it has visited each page. If this crawler runs for a really, really long time, the fraction of time it has spent on each webpage will approximately be the probability of being on that page (assuming we account for pathologies in the Markov chain which we will discuss soon). We then rank pages in decreasing order of probability."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Alright, great! Let's do stuff. First, visit the following webpage, and see how many web pages can be reached by clicking the links on each page.\n",
"http://www.eecs.berkeley.edu/~kw1jjang/ee126/1.html\n",
"\n",
"There are total of $8$ pages, and they are connected as follows.\n",
""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Since we choose a link at uniform from each page, the probability of going between pages $x$ and $y$ is $\\Large \\frac{\\text{# of pages from x to y}}{\\text{# of pages leaving x}}$\n",
"\n",
"Thus the Markov chain generated by the web pages above is\n",
"\n",
"\n",
"\n",
"and the transition matrix of the Markov chain is\n",
"\n",
"$$\n",
"\\left( \\begin{array}{cccccccc}\n",
"0 & \\frac{1}{5} & \\frac{1}{5} & \\frac{1}{5} & \\frac{1}{5} & 0 & 0 & \\frac{1}{5} \\\\\n",
"\\frac{1}{2} & 0 & 0 & 0 & \\frac{1}{2} & 0 & 0 & 0 \\\\\n",
"0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\\\\n",
"0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\\\\n",
"0 & \\frac{1}{3} & \\frac{1}{3} & \\frac{1}{3} & 0 & 0 & 0 & 0 \\\\\n",
"\\frac{1}{4} & 0 & \\frac{1}{4} & 0 & 0 & 0 & \\frac{1}{4} & \\frac{1}{4} \\\\\n",
"\\frac{1}{4} & \\frac{1}{4} & 0 & 0 & \\frac{1}{4} & \\frac{1}{4} & 0 & 0 \\\\\n",
"\\frac{1}{5} & \\frac{1}{5} & \\frac{1}{5} & \\frac{1}{5} & 0 & 0 & \\frac{1}{5} & 0\n",
"\\end{array} \\right)\n",
"$$"
]
},
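{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make the formula above concrete, here is a minimal sketch that builds a transition matrix from a table of outgoing links. The three-page link structure `links` and the helper `transition_matrix` are hypothetical names introduced only for this illustration; they are not part of the lab's code or pages.\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"# Hypothetical link structure (not the lab's pages): page -> pages it links to.\n",
"# Pages are assumed to be labeled 0, 1, ..., n-1.\n",
"links = {0: [1, 2], 1: [2], 2: [0]}\n",
"\n",
"def transition_matrix(links):\n",
"    # Entry (x, y) is (# of links from x to y) / (# of links leaving x).\n",
"    n = len(links)\n",
"    P = np.zeros((n, n))\n",
"    for x, targets in links.items():\n",
"        for y in targets:\n",
"            P[x, y] += 1.0 / len(targets)\n",
"    return P\n",
"\n",
"P_toy = transition_matrix(links)\n",
"```\n",
"\n",
"Every page in this toy example has at least one outgoing link, so each row of `P_toy` sums to 1. A page without links would leave an all-zero row and needs separate handling."
]
},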
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## $\\mathcal{Q}$1. Find the steady-state (invariant/stationary) distribution $\\pi$ of the Markov chain above. How do you know that it exists? \n",
"\n",
" The Markov matrix is copied in code below. This might make your computation easier, but you can solve this in any way you wish. (Note: don't forget about the difference between right and left eigenvectors) "
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"collapsed": false,
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[[ 0. 0.2 0.2 0.2 0.2 0. 0.\n",
" 0.2 ]\n",
" [ 0.5 0. 0. 0. 0.5 0. 0.\n",
" 0. ]\n",
" [ 0. 1. 0. 0. 0. 0. 0.\n",
" 0. ]\n",
" [ 0. 0. 0. 0. 1. 0. 0.\n",
" 0. ]\n",
" [ 0. 0.33333333 0.33333333 0.33333333 0. 0. 0.\n",
" 0. ]\n",
" [ 0.25 0. 0.25 0. 0. 0. 0.25\n",
" 0.25 ]\n",
" [ 0.25 0.25 0. 0. 0.25 0.25 0.\n",
" 0. ]\n",
" [ 0.2 0.2 0.2 0.2 0. 0. 0.2\n",
" 0. ]]\n"
]
}
],
"source": [
"import numpy as np\n",
"from __future__ import division\n",
"\n",
"P = np.matrix([[0, 1/5, 1/5, 1/5, 1/5, 0, 0, 1/5],\n",
" [1/2, 0, 0, 0, 1/2, 0, 0, 0],\n",
" [0, 1, 0, 0, 0, 0, 0, 0],\n",
" [0, 0, 0, 0, 1, 0, 0, 0],\n",
" [0, 1/3, 1/3, 1/3, 0, 0, 0, 0],\n",
" [1/4,0,1/4,0,0,0,1/4,1/4],\n",
" [1/4, 1/4, 0, 0, 1/4, 1/4, 0, 0],\n",
" [1/5, 1/5, 1/5, 1/5, 0, 0, 1/5, 0]])\n",
"\n",
"P_transpose = P.T\n",
"\n",
"print P"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# Your code here"
]
},
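{
"cell_type": "markdown",
"metadata": {},
"source": [
"The note about right and left eigenvectors can be illustrated on a small example. The sketch below uses a hypothetical 3-state chain `P_toy` (not the lab's matrix) and assumes we want the row vector $\\pi$ satisfying $\\pi P = \\pi$. Since `np.linalg.eig` returns right eigenvectors, one option is to apply it to the transpose.\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"# Hypothetical 3-state chain (not the lab's matrix); each row sums to 1.\n",
"P_toy = np.array([[0.0, 0.5, 0.5],\n",
"                  [0.0, 0.0, 1.0],\n",
"                  [1.0, 0.0, 0.0]])\n",
"\n",
"# np.linalg.eig returns right eigenvectors (A v = lambda v), so a left\n",
"# eigenvector of P_toy (pi P = pi) is a right eigenvector of its transpose.\n",
"eigenvalues, eigenvectors = np.linalg.eig(P_toy.T)\n",
"\n",
"# Pick the eigenvector whose eigenvalue is closest to 1 and rescale it\n",
"# so that its entries sum to 1.\n",
"k = np.argmin(np.abs(eigenvalues - 1.0))\n",
"pi_toy = np.real(eigenvectors[:, k])\n",
"pi_toy = pi_toy / pi_toy.sum()\n",
"```\n",
"\n",
"Dividing by the sum also fixes the sign, since the eigenvector returned by the solver is only determined up to a scalar factor."
]
},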
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Simulation time!\n",
"\n",
"We now want to empirically test what we solved above by modeling a random user hopping along those webpages. We will start the user at \"1.html\" and behave as per the Markov chain above. In the code below, we simulate this and keep track of the average amount of time a user spends in each state. We will expect that after enough iterations, the fraction of time spent in each state should approach the stationary distribution.\n",
"\n",
"We use the `parse_links()` method to parse all hyperlinks in a page. We use the library Beautiful Soup in order to complete this portion of the lab in order to easiliy parse pages. Once you download the latest release, you must build and install setup.py. Alternatively, use pip or easy_install (help)."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"import re\n",
"import sys\n",
"import urllib\n",
"import urlparse\n",
"import random\n",
"from bs4 import BeautifulSoup\n",
"\n",
"#http://wolfprojects.altervista.org/articles/change-urllib-user-agent/ \n",
"\n",
"class MyOpener(urllib.FancyURLopener):\n",
" version = 'Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.15) Gecko/20110303 Firefox/3.6.15'\n",
"\n",
"#This function will parse a url to give you the domain. Test it!\n",
"def domain(url):\n",
" #urlparse breaks down the url passed it, and you split the hostname up \n",
" #Ex: hostname=\"www.google.com\" becomes ['www', 'google', 'com']\n",
" hostname = urlparse.urlparse(url).hostname.split(\".\")\n",
" hostname = \".\".join(len(hostname[-2]) < 4 and hostname[-3:] or hostname[-2:])\n",
" return hostname\n",
" \n",
"#This function will return all the urls on a page, and return the start url if there is an error or no urls\n",
"def parse_links(url, url_start):\n",
" url_list = []\n",
" myopener = MyOpener()\n",
" try:\n",
" #open, read, and parse the text using beautiful soup\n",
" page = myopener.open(url)\n",
" text = page.read()\n",
" page.close()\n",
" soup = BeautifulSoup(text, \"lxml\")\n",
"\n",
" #find all hyperlinks using beautiful soup\n",
" for tag in soup.findAll('a', href=True):\n",
" #concatenate the base url with the path from the hyperlink\n",
" tmp = urlparse.urljoin(url, tag['href'])\n",
" #we want to stay in the berkeley domain. This becomes more relevant later\n",
" if domain(tmp).endswith('berkeley.edu'):\n",
" url_list.append(tmp)\n",
" if len(url_list) == 0:\n",
" return [url_start]\n",
" return url_list\n",
" except:\n",
" return [url_start]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## $\\mathcal{Q}$2. Simulating a Random Walk\n",
"\n",
"In the following code block, use the above functions to surf the web pages described by the Markov chain above. This code block may take a while to run. If it is taking more than a couple of minutes, maybe try reducing `num_of_visits` in order to at least get results. Also, running your code while connected to AirBears may help if you have a slow internet connection at home."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false,
"scrolled": true
},
"outputs": [],
"source": [
"import random\n",
"\n",
"#the url we want to begin with \n",
"url_start = \"http://www.eecs.berkeley.edu/~kw1jjang/ee126/1.html\"\n",
"current_url = url_start\n",
"\n",
"#parameter to set the number of transitions you make/different pages you visit\n",
"num_of_visits = 1000\n",
"\n",
"#dictionary of pages visited so far\n",
"visit_history = {}\n",
"\n",
"#initialize dictionary since we know exactly where we'll end up\n",
"for i in range(1, 9):\n",
" page = \"http://www.eecs.berkeley.edu/~kw1jjang/ee126/\" + str(i) + \".html\"\n",
" visit_history[page] = 0\n",
"\n",
"for i in range(num_of_visits):\n",
"\n",
" # Your code here"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Print your results:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"for i in range(1, 9):\n",
" page = \"http://www.eecs.berkeley.edu/~kw1jjang/ee126/\" + str(i) + \".html\"\n",
" print 'Fraction of time staying on page %d is %f' % (i, float(visit_history[page])/num_of_visits)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Does this approximately match the invariant distribution you expected?"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Generalizing to the Web\n",
"\n",
"The toy websites given above conveniently form an *irreducible* Markov chain (look up what this means if you do not remember from class), but most of the web will not look like this. There will be fringes of the internet containing only self-loops, or some web pages which do not link to others at all. In order to account for such pathologies in the web, we need to make a more intelligent surfer. The simplest idea would be to just jump back to the starting page if there are no links found on the page you are on, and to always return to a \"good\" starting point with probability $p$ on every page.\n",
"\n",
"This is a very naive scheme, and there are many more intelligent methods by which you can sample from the distribution of the Internet, accounting for its pathologies and all."
]
},
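{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a rough sketch of the naive scheme described above, the following code simulates such a surfer on a hypothetical four-page toy web. The names `toy_links` and `surf`, the restart probability 0.15, and the number of steps are all assumptions made for illustration; this is not the lab's crawler.\n",
"\n",
"```python\n",
"import random\n",
"\n",
"# Hypothetical toy web (not real pages): page -> pages it links to.\n",
"# 'c' has no outgoing links and 'd' only links to itself.\n",
"toy_links = {'a': ['b', 'c', 'd'], 'b': ['a'], 'c': [], 'd': ['d']}\n",
"\n",
"def surf(links, start, p, num_steps, seed=0):\n",
"    # Random walk that restarts at `start` with probability p on every step,\n",
"    # and also whenever the current page has no outgoing links.\n",
"    rng = random.Random(seed)\n",
"    visits = dict((page, 0) for page in links)\n",
"    current = start\n",
"    for _ in range(num_steps):\n",
"        visits[current] += 1\n",
"        if not links[current] or rng.random() < p:\n",
"            current = start\n",
"        else:\n",
"            current = rng.choice(links[current])\n",
"    return visits\n",
"\n",
"visits = surf(toy_links, 'a', 0.15, 10000)\n",
"```\n",
"\n",
"Without the restart, the walk would stay on 'd' forever once it arrived there. With it, the surfer keeps escaping, so the visit counts remain informative for every page."
]
},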
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Ranking Berkeley Professors\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The following code is a (weak) attempt to rank the Berkeley faculty based on a crawler which begins on the EECS research homepage."
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {
"collapsed": false,
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"1 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"2 Visiting... http://www.eecs.berkeley.edu/XRG/entrepreneur.html\n",
"3 Visiting... http://www.eecs.berkeley.edu/Resguide/acad.shtml\n",
"4 Visiting... http://www.eecs.berkeley.edu/department/people.shtml\n",
"5 Visiting... http://www.eecs.berkeley.edu/education/\n",
"6 Visiting... http://www.eecs.berkeley.edu/education/\n",
"7 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"8 Visiting... http://www.eecs.berkeley.edu/department/internal.shtml\n",
"9 Visiting... http://www.eecs.berkeley.edu/Gradadm/\n",
"10 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"11 Visiting... http://www.eecs.berkeley.edu/cal/\n",
"12 Visiting... http://events.berkeley.edu/index.php/calendar/sn/eecs.html?view=quick&timeframe=month&filter=Secondary%20Event%20Type&filtersel=32\n",
"13 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"14 Visiting... http://www.eecs.berkeley.edu/department/Impact.shtml\n",
"15 Visiting... http://www.eecs.berkeley.edu\n",
"16 Visiting... http://www.eecs.berkeley.edu/XRG/conferences.shtml\n",
"17 Visiting... http://www.eecs.berkeley.edu/BEARS/2008\n",
"18 Visiting... http://www.eecs.berkeley.edu/Accommodations\n",
"19 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"20 Visiting... http://www.eecs.berkeley.edu/\n",
"21 Visiting... http://www.eecs.berkeley.edu/department/OutreachPrograms.shtml\n",
"22 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"23 Visiting... http://www.eecs.berkeley.edu/department/staff.shtml\n",
"24 Visiting... http://www.eecs.berkeley.edu/education/degrees.shtml\n",
"25 Visiting... http://www.eecs.berkeley.edu/education/degrees.shtml#cse\n",
"26 Visiting... http://www.eecs.berkeley.edu/education/\n",
"27 Visiting... http://www.eecs.berkeley.edu/department/about.shtml\n",
"28 Visiting... http://www.eecs.berkeley.edu/education/\n",
"29 Visiting... http://www.eecs.berkeley.edu/department/internal.shtml\n",
"30 Visiting... http://www.eecs.berkeley.edu/education/\n",
"31 Visiting... http://www.eecs.berkeley.edu/Resguide/admin.shtml\n",
"32 Visiting... http://www.eecs.berkeley.edu/Resguide/admin.shtml#directory\n",
"33 Visiting... http://www.eecs.berkeley.edu/Resguide/admin.shtml#av1\n",
"34 Visiting... http://caltime.berkeley.edu/\n",
"35 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"36 Visiting... http://www.eecs.berkeley.edu/Research/Areas/\n",
"37 Visiting... http://www.eecs.berkeley.edu/Research/Areas/EDUC/\n",
"38 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"39 Visiting... http://www.eecs.berkeley.edu/department/internal.shtml\n",
"40 Visiting... http://www.eecs.berkeley.edu/Students/\n",
"41 Visiting... http://www.eecs.berkeley.edu/XRG/IAB/\n",
"42 Visiting... http://www.eecs.berkeley.edu/\n",
"43 Visiting... http://www.eecs.berkeley.edu/Includes/copyright.shtml\n",
"44 Visiting... http://www.eecs.berkeley.edu/cal/\n",
"45 Visiting... http://www.berkeley.edu\n",
"46 Visiting... http://www.berkeley.edu\n",
"47 Visiting... http://www.berkeley.edu/utility/jobs\n",
"48 Visiting... http://www.berkeley.edu/research\n",
"49 Visiting... http://www.berkeley.edu/about/history-discoveries\n",
"50 Visiting... http://www.berkeley.edu/about/history-discoveries\n",
"51 Visiting... http://www.berkeley.edu/about/history-discoveries\n",
"52 Visiting... http://www.berkeley.edu/about/experience-berkeley\n",
"53 Visiting... http://www.berkeley.edu/about/experience-berkeley\n",
"54 Visiting... http://bulletin.berkeley.edu\n",
"55 Visiting... http://bulletin.berkeley.edu/tuition-fees-financial-aid/\n",
"56 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"57 Visiting... http://www.eecs.berkeley.edu/XRG/eecsgifts.shtml\n",
"58 Visiting... http://www.eecs.berkeley.edu/\n",
"59 Visiting... http://www.eecs.berkeley.edu/XRG/recruitment.shtml\n",
"60 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"61 Visiting... http://www.eecs.berkeley.edu/Pubs/\n",
"62 Visiting... http://www.eecs.berkeley.edu/department/about.shtml\n",
"63 Visiting... http://www.berkeley.edu\n",
"64 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"65 Visiting... http://www.eecs.berkeley.edu/Research/Areas/Centers/\n",
"66 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"67 Visiting... http://www.eecs.berkeley.edu/XRG/menu.shtml\n",
"68 Visiting... http://www.eecs.berkeley.edu/education/usli/\n",
"69 Visiting... http://www.eecs.berkeley.edu/XRG/recruitment.shtml\n",
"70 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"71 Visiting... http://www.eecs.berkeley.edu/education/courses.shtml\n",
"72 Visiting... http://www.eecs.berkeley.edu/education/\n",
"73 Visiting... http://www.eecs.berkeley.edu/Students/directories.shtml\n",
"74 Visiting... http://www.eecs.berkeley.edu/Students/\n",
"75 Visiting... http://www.eecs.berkeley.edu/Includes/privacy.shtml\n",
"76 Visiting... http://www.eecs.berkeley.edu/Gradadm/\n",
"77 Visiting... http://www.eecs.berkeley.edu/XRG/eecsgifts.shtml\n",
"78 Visiting... http://www.eecs.berkeley.edu/education/\n",
"79 Visiting... http://www.eecs.berkeley.edu/department/staff.shtml\n",
"80 Visiting... http://www.eecs.berkeley.edu/department/people.shtml\n",
"81 Visiting... http://www.eecs.berkeley.edu/education/usli/\n",
"82 Visiting... http://www.eecs.berkeley.edu/XRG/entrepreneur.html\n",
"83 Visiting... http://www.eecs.berkeley.edu/Pubs/\n",
"84 Visiting... http://www.eecs.berkeley.edu/department/internal.shtml\n",
"85 Visiting... http://www.eecs.berkeley.edu/department/about.shtml\n",
"86 Visiting... http://www.eecs.berkeley.edu/department/\n",
"87 Visiting... http://www.eecs.berkeley.edu/Directions/\n",
"88 Visiting... http://www.eecs.berkeley.edu/Includes/copyright.shtml\n",
"89 Visiting... http://www.eecs.berkeley.edu/department/internal.shtml\n",
"90 Visiting... http://www.eecs.berkeley.edu/Research/Areas/Centers/\n",
"91 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"92 Visiting... http://www.eecs.berkeley.edu/department/people.shtml\n",
"93 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"94 Visiting... http://www.eecs.berkeley.edu/help\n",
"95 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"96 Visiting... http://www.eecs.berkeley.edu/department/about.shtml\n",
"97 Visiting... http://www.eecs.berkeley.edu/department/people.shtml\n",
"98 Visiting... http://www.eecs.berkeley.edu/deptinfo/login.html?returnto=http://www.eecs.berkeley.edu:80/department/people.shtml\n",
"99 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"100 Visiting... http://www.eecs.berkeley.edu/deptinfo\n",
"101 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"102 Visiting... http://www.eecs.berkeley.edu/Resguide/admin.shtml\n",
"103 Visiting... http://www.eecs.berkeley.edu/department/emergency/sodaill.shtml\n",
"104 Visiting... http://www.eecs.berkeley.edu/department/internal.shtml\n",
"105 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"106 Visiting... http://www.eecs.berkeley.edu/Faculty/Awards/#nae\n",
"107 Visiting... http://www.eecs.berkeley.edu/Pubs/\n",
"108 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"109 Visiting... http://www.eecs.berkeley.edu/news/\n",
"110 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"111 Visiting... http://www.eecs.berkeley.edu/department/internal.shtml\n",
"112 Visiting... http://www.eecs.berkeley.edu/XRG/IAB/\n",
"113 Visiting... http://www.eecs.berkeley.edu/department/history.shtml\n",
"114 Visiting... http://www.eecs.berkeley.edu/department/OutreachPrograms.shtml\n",
"115 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"116 Visiting... http://www.eecs.berkeley.edu/education/degrees.shtml\n",
"117 Visiting... http://coe.berkeley.edu/students/current-undergraduates/majors-minors/simultaneous-degrees.html/\n",
"118 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"119 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"120 Visiting... http://www.eecs.berkeley.edu/Resguide/admin.shtml\n",
"121 Visiting... http://www.eecs.berkeley.edu/department/\n",
"122 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"123 Visiting... http://www.eecs.berkeley.edu/deptinfo\n",
"124 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"125 Visiting... http://www.eecs.berkeley.edu/Pubs/\n",
"126 Visiting... http://www.eecs.berkeley.edu/education/courses.shtml\n",
"127 Visiting... http://www.eecs.berkeley.edu/Scheduling/EE/schedule-next.html\n",
"128 Visiting... http://www.eecs.berkeley.edu/Courses/Data/79.html\n",
"129 Visiting... http://www.eecs.berkeley.edu/Gradadm/\n",
"130 Visiting... http://www.eecs.berkeley.edu/department/internal.shtml\n",
"131 Visiting... http://www.eecs.berkeley.edu/Gradadm/\n",
"132 Visiting... http://www.eecs.berkeley.edu/help\n",
"133 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"134 Visiting... http://www.eecs.berkeley.edu/XRG/recruitment.shtml\n",
"135 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"136 Visiting... http://www.eecs.berkeley.edu/Colloquium/\n",
"137 Visiting... http://www.eecs.berkeley.edu/Colloquium/Archives/10-11/Fall2010/index.shtml\n",
"138 Visiting... http://www.eecs.berkeley.edu/Directions/\n",
"139 Visiting... http://www.eecs.berkeley.edu/Resguide/stud.shtml\n",
"140 Visiting... http://www.eecs.berkeley.edu/Programs/Notes/section6.shtml#6.2\n",
"141 Visiting... http://www.eecs.berkeley.edu/Programs/Notes/index.shtml\n",
"142 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"143 Visiting... http://www.eecs.berkeley.edu/department/people.shtml\n",
"144 Visiting... http://www.eecs.berkeley.edu/Resguide/admin.shtml\n",
"145 Visiting... http://www.eecs.berkeley.edu/Resguide/admin.shtml#top\n",
"146 Visiting... http://www.eecs.berkeley.edu/Resguide/admin.shtml#events\n",
"147 Visiting... http://www.eecs.berkeley.edu/Resguide/admin.shtml#parking\n",
"148 Visiting... http://www.eecs.berkeley.edu/department/emergency/coryill.shtml\n",
"149 Visiting... http://www.eecs.berkeley.edu/education/courses.shtml\n",
"150 Visiting... http://www.eecs.berkeley.edu/Scheduling/EE/schedule.html\n",
"151 Visiting... http://www.eecs.berkeley.edu/Courses/Data/928.html\n",
"152 Visiting... http://www.eecs.berkeley.edu/department/history.shtml\n",
"153 Visiting... http://www.eecs.berkeley.edu/Includes/copyright.shtml\n",
"154 Visiting... http://www.eecs.berkeley.edu/cal/\n",
"155 Visiting... http://www.eecs.berkeley.edu/alumni/lists.shtml\n",
"156 Visiting... http://www.eecs.berkeley.edu/cal/\n",
"157 Visiting... http://www.eecs.berkeley.edu/education/courses.shtml\n",
"158 Visiting... http://www.cs.berkeley.edu/Scheduling/faq.shtml\n",
"159 Visiting... http://www.cs.berkeley.edu/Research/\n",
"160 Visiting... http://www.cs.berkeley.edu/department/\n",
"161 Visiting... http://www.cs.berkeley.edu/Visiting/Scholars/menu.shtml\n",
"162 Visiting... http://www.cs.berkeley.edu/education/degrees.shtml\n",
"163 Visiting... http://www.cs.berkeley.edu/education/\n",
"164 Visiting... http://www.cs.berkeley.edu/Research/\n",
"165 Visiting... http://www.cs.berkeley.edu/cal/\n",
"166 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"167 Visiting... http://www.eecs.berkeley.edu/help\n",
"168 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"169 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"170 Visiting... http://www.eecs.berkeley.edu/department/internal.shtml\n",
"171 Visiting... http://www.eecs.berkeley.edu/education/usli/\n",
"172 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"173 Visiting... http://www.eecs.berkeley.edu/bears/\n",
"174 Visiting... http://crest.berkeley.edu/\n",
"175 Visiting... http://crest.berkeley.edu/carbonroadmap.html\n",
"176 Visiting... http://crest.berkeley.edu/bears.html\n",
"177 Visiting... http://crest.berkeley.edu/mission.html\n",
"178 Visiting... http://sinberise.berkeley.edu/\n",
"179 Visiting... http://sinberise.berkeley.edu/\n",
"180 Visiting... http://sinberise.berkeley.edu/\n",
"181 Visiting... http://sinberise.berkeley.edu/\n",
"182 Visiting... http://sinberise.berkeley.edu/\n",
"183 Visiting... http://sinberise.berkeley.edu/\n",
"184 Visiting... http://sinberise.berkeley.edu/\n",
"185 Visiting... http://sinberise.berkeley.edu/\n",
"186 Visiting... http://sinberise.berkeley.edu/\n",
"187 Visiting... http://sinberise.berkeley.edu/\n",
"188 Visiting... http://sinberise.berkeley.edu/\n",
"189 Visiting... http://sinberise.berkeley.edu/\n",
"190 Visiting... http://sinberise.berkeley.edu/\n",
"191 Visiting... http://sinberise.berkeley.edu/\n",
"192 Visiting... http://sinberise.berkeley.edu/\n",
"193 Visiting... http://sinberise.berkeley.edu/\n",
"194 Visiting... http://sinberise.berkeley.edu/\n",
"195 Visiting... http://sinberise.berkeley.edu/\n",
"196 Visiting... http://sinberise.berkeley.edu/\n",
"197 Visiting... http://sinberise.berkeley.edu/\n",
"198 Visiting... http://sinberise.berkeley.edu/\n",
"199 Visiting... http://sinberise.berkeley.edu/\n",
"200 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"201 Visiting... http://www.berkeley.edu\n",
"202 Visiting... http://www.berkeley.edu/students\n",
"203 Visiting... http://studentcentral.berkeley.edu/\n",
"204 Visiting... http://studentcentral.berkeley.edu/about\n",
"205 Visiting... http://studentcentral.berkeley.edu/faqs\n",
"206 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"207 Visiting... http://www.eecs.berkeley.edu/education/usli/\n",
"208 Visiting... http://www.eecs.berkeley.edu/department/people.shtml\n",
"209 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"210 Visiting... http://www.eecs.berkeley.edu/Facilities/\n",
"211 Visiting... https://buffy.eecs.berkeley.edu/PHP/bldgreq/startup.php?dept=eecs\n",
"212 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"213 Visiting... http://www.eecs.berkeley.edu\n",
"214 Visiting... http://www.eecs.berkeley.edu/cal/\n",
"215 Visiting... http://www.eecs.berkeley.edu/department/statistics.shtml\n",
"216 Visiting... http://www.eecs.berkeley.edu/education/degrees.shtml\n",
"217 Visiting... http://www.eecs.berkeley.edu/Prospective/ugrad.shtml\n",
"218 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"219 Visiting... http://www.eecs.berkeley.edu/XRG/eecsgifts.shtml\n",
"220 Visiting... http://www.eecs.berkeley.edu/education/\n",
"221 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"222 Visiting... http://www.eecs.berkeley.edu/department/about.shtml\n",
"223 Visiting... http://www.eecs.berkeley.edu/department/internal.shtml\n",
"224 Visiting... http://www.eecs.berkeley.edu/deptinfo/login.html?returnto=http://www.eecs.berkeley.edu:80/department/internal.shtml\n",
"225 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"226 Visiting... http://www.eecs.berkeley.edu/Faculty/Awards/\n",
"227 Visiting... http://www.eecs.berkeley.edu/Faculty/Awards/siam.shtml\n",
"228 Visiting... http://www.eecs.berkeley.edu/deptinfo\n",
"229 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"230 Visiting... http://www.eecs.berkeley.edu/Resguide/acad.shtml\n",
"231 Visiting... http://www.eecs.berkeley.edu/Resguide/acad.shtml#top\n",
"232 Visiting... http://www.eecs.berkeley.edu/Resguide/acad.shtml#course.reports\n",
"233 Visiting... http://www.eecs.berkeley.edu/Resguide/acad.shtml#top\n",
"234 Visiting... http://www.eecs.berkeley.edu/Resguide/acad.shtml#forms\n",
"235 Visiting... http://www.eecs.berkeley.edu/Scheduling/CS/\n",
"236 Visiting... http://bulletin.berkeley.edu/courses/compsci/\n",
"237 Visiting... http://bulletin.berkeley.edu/search/?P=COMPSCI 199\n",
"238 Visiting... http://bulletin.berkeley.edu/student-life/\n",
"239 Visiting... http://students.berkeley.edu/finaid/\n",
"240 Visiting... http://students.berkeley.edu/prospective-students\n",
"241 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"242 Visiting... http://www.eecs.berkeley.edu/bears/\n",
"243 Visiting... http://www.eecs.berkeley.edu/BEARS/2015/program.pdf\n",
"244 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"245 Visiting... http://www.berkeley.edu\n",
"246 Visiting... http://www.berkeley.edu/research\n",
"247 Visiting... http://www.berkeley.edu/research\n",
"248 Visiting... http://financialaid.berkeley.edu/\n",
"249 Visiting... http://financialaid.berkeley.edu/estimate-your-financial-aid\n",
"250 Visiting... http://financialaid.berkeley.edu/understanding-your-commitment\n",
"251 Visiting... http://financialaid.berkeley.edu/berkeley-undergraduate-scholarship\n",
"252 Visiting... http://financialaid.berkeley.edu/frequently-asked-questions\n",
"253 Visiting... http://financialaid.berkeley.edu/prizes-and-honors-nicola-de-lorenzo-prize-music-composition\n",
"254 Visiting... http://financialaid.berkeley.edu/innovative-berkeley-aid-programs\n",
"255 Visiting... http://financialaid.berkeley.edu/financial-aid-glance\n",
"256 Visiting... http://financialaid.berkeley.edu/graduate-award-guide\n",
"257 Visiting... http://financialaid.berkeley.edu/prizes-and-honors-university-medal\n",
"258 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"259 Visiting... http://www.eecs.berkeley.edu/Resguide/stud.shtml\n",
"260 Visiting... http://www.eecs.berkeley.edu/education/\n",
"261 Visiting... http://www.eecs.berkeley.edu/Gradadm/\n",
"262 Visiting... http://www.eecs.berkeley.edu/MEng/\n",
"263 Visiting... http://www.eecs.berkeley.edu/XRG/menu.shtml\n",
"264 Visiting... http://www.eecs.berkeley.edu/Students/directories.shtml\n",
"265 Visiting... http://www.eecs.berkeley.edu/Visiting/Scholars/menu.shtml\n",
"266 Visiting... http://www.eecs.berkeley.edu/education/\n",
"267 Visiting... http://www.eecs.berkeley.edu/education/courses.shtml\n",
"268 Visiting... http://www.eecs.berkeley.edu/Scheduling/CS/schedule.html\n",
"269 Visiting... http://www.eecs.berkeley.edu/Courses/Data/335.html\n",
"270 Visiting... http://www.eecs.berkeley.edu/Resguide/admin.shtml\n",
"271 Visiting... http://www.eecs.berkeley.edu/department/statistics.shtml\n",
"272 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"273 Visiting... http://www.eecs.berkeley.edu/department/staffinfo.shtml\n",
"274 Visiting... http://www.eecs.berkeley.edu/Colloquium/\n",
"275 Visiting... http://www.eecs.berkeley.edu/education/\n",
"276 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"277 Visiting... http://www.eecs.berkeley.edu/cal/\n",
"278 Visiting... http://www.eecs.berkeley.edu/XRG/menu.shtml\n",
"279 Visiting... http://www.eecs.berkeley.edu/Students/\n",
"280 Visiting... http://www.eecs.berkeley.edu/cal/\n",
"281 Visiting... http://www.eecs.berkeley.edu/Diversity/\n",
"282 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"283 Visiting... http://www.eecs.berkeley.edu/department/statistics.shtml\n",
"284 Visiting... http://www.eecs.berkeley.edu/Includes/privacy.shtml\n",
"285 Visiting... http://www.eecs.berkeley.edu/department/history.shtml\n",
"286 Visiting... http://netshow01.eecs.berkeley.edu/zadeh2010.wmv\n",
"287 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"288 Visiting... http://www.eecs.berkeley.edu/help\n",
"289 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"290 Visiting... http://www.eecs.berkeley.edu/department/internal.shtml\n",
"291 Visiting... http://www.eecs.berkeley.edu/Resguide/admin.shtml\n",
"292 Visiting... http://www.eecs.berkeley.edu/Resguide/admin.shtml#timesheets\n",
"293 Visiting... http://www.eecs.berkeley.edu/department/people.shtml\n",
"294 Visiting... http://www.eecs.berkeley.edu/department/staff.shtml\n",
"295 Visiting... http://www.eecs.berkeley.edu/XRG/menu.shtml\n",
"296 Visiting... http://www.eecs.berkeley.edu/department/history.shtml\n",
"297 Visiting... http://www.eecs.berkeley.edu/\n",
"298 Visiting... http://www.eecs.berkeley.edu/Faculty/Homepages/chang-hasnain.html\n",
"299 Visiting... http://www.eecs.berkeley.edu/XRG/conferences.shtml\n",
"300 Visiting... http://www-bisc.cs.berkeley.edu/BISCSE2005/\n",
"301 Visiting... http://www-bisc.cs.berkeley.edu/XRG/menu.shtml\n",
"302 Visiting... http://www-bisc.cs.berkeley.edu/Directions/\n",
"303 Visiting... http://www-bisc.cs.berkeley.edu/department/about.shtml\n",
"304 Visiting... http://www-bisc.cs.berkeley.edu/Directions/\n",
"305 Visiting... http://www-bisc.cs.berkeley.edu/Research/\n",
"306 Visiting... http://www-bisc.cs.berkeley.edu/department/staff.shtml\n",
"307 Visiting... http://www-bisc.cs.berkeley.edu/department/OutreachPrograms.shtml\n",
"308 Visiting... http://www-bisc.cs.berkeley.edu/Research/\n",
"309 Visiting... http://www-bisc.cs.berkeley.edu/Pubs/\n",
"310 Visiting... http://www-bisc.cs.berkeley.edu/department/internal.shtml\n",
"311 Visiting... http://www-bisc.cs.berkeley.edu/XRG/\n",
"312 Visiting... http://events.berkeley.edu/index.php/calendar/sn/eecs.html?view=quick&timeframe=month&filter=Secondary%20Event%20Type&filtersel=32\n",
"313 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"314 Visiting... http://www.eecs.berkeley.edu/Research/Projects/\n",
"315 Visiting... http://www.eecs.berkeley.edu/Research/Areas/\n",
"316 Visiting... http://www.eecs.berkeley.edu/department/history.shtml\n",
"317 Visiting... http://netshow01.eecs.berkeley.edu/zadeh2010.wmv\n",
"318 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"319 Visiting... http://www.eecs.berkeley.edu/Faculty/Awards/\n",
"320 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"321 Visiting... http://www.eecs.berkeley.edu/Faculty/Awards/#nms\n",
"322 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"323 Visiting... http://www.eecs.berkeley.edu/department/officers.shtml\n",
"324 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"325 Visiting... http://www.eecs.berkeley.edu/department/about.shtml\n",
"326 Visiting... http://www.eecs.berkeley.edu/department/administration.shtml\n",
"327 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"328 Visiting... http://www.eecs.berkeley.edu/department/people.shtml\n",
"329 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"330 Visiting... http://www.eecs.berkeley.edu/cal/\n",
"331 Visiting... http://events.berkeley.edu/index.php/calendar/sn/eecs.html?view=quick&timeframe=month&filter=Secondary%20Event%20Type&filtersel=32\n",
"332 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"333 Visiting... http://www.eecs.berkeley.edu/cal/\n",
"334 Visiting... http://www.eecs.berkeley.edu/Diversity/\n",
"335 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"336 Visiting... http://www.eecs.berkeley.edu/Research/Areas/Centers/\n",
"337 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"338 Visiting... http://www.eecs.berkeley.edu/XRG/menu.shtml\n",
"339 Visiting... http://www.eecs.berkeley.edu/department/officers.shtml\n",
"340 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"341 Visiting... http://www.eecs.berkeley.edu/Facilities/\n",
"342 Visiting... http://www.eecs.berkeley.edu/Resguide/roomsched.shtml\n",
"343 Visiting... http://www.eecs.berkeley.edu/department/statistics.shtml\n",
"344 Visiting... http://guide.berkeley.edu/search/?P=compsci+61a\n",
"345 Visiting... http://guide.berkeley.edu/tuition-fees-financial-aid/\n",
"346 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"347 Visiting... http://www.eecs.berkeley.edu/cal/\n",
"348 Visiting... http://www.berkeley.edu\n",
"349 Visiting... http://www.berkeley.edu/atoz/dept\n",
"350 Visiting... http://financialaid.berkeley.edu/\n",
"351 Visiting... http://financialaid.berkeley.edu/private-alternative-loan\n",
"352 Visiting... http://financialaid.berkeley.edu/loans\n",
"353 Visiting... http://financialaid.berkeley.edu/grants\n",
"354 Visiting... http://financialaid.berkeley.edu/cost-attendance-2014-15\n",
"355 Visiting... http://financialaid.berkeley.edu/money-matters-101\n",
"356 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"357 Visiting... http://www.eecs.berkeley.edu/education/\n",
"358 Visiting... http://www.eecs.berkeley.edu/education/usli/\n",
"359 Visiting... http://www.eecs.berkeley.edu/Resguide/stud.shtml\n",
"360 Visiting... http://www.eecs.berkeley.edu/Students/Handbook/section3.shtml#3.2.4\n",
"361 Visiting... http://www.eecs.berkeley.edu/Students/Handbook/section3.shtml#3.3.5\n",
"362 Visiting... http://career.berkeley.edu/\n",
"363 Visiting... http://career.berkeley.edu/Tools/Tools\n",
"364 Visiting... http://career.berkeley.edu/Tools/JobSearchExercise\n",
"365 Visiting... http://career.berkeley.edu/StaffFaculty/StaffFaculty\n",
"366 Visiting... http://career.berkeley.edu/Info/MakeAppt\n",
"367 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"368 Visiting... http://www.eecs.berkeley.edu/Faculty/Awards/\n",
"369 Visiting... http://www.eecs.berkeley.edu/department/about.shtml\n",
"370 Visiting... http://www.eecs.berkeley.edu/Visiting/Scholars/menu.shtml\n",
"371 Visiting... http://www.berkeley.edu\n",
"372 Visiting... http://newsroom.haas.berkeley.edu/article/patrick-awuah-mba-99-and-founder-ghana%E2%80%99s-aseshi-university-wins-macarthur-genius-grant\n",
"373 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"374 Visiting... http://www.eecs.berkeley.edu/Visiting/Scholars/menu.shtml\n",
"375 Visiting... http://www.eecs.berkeley.edu/Faculty/Lists/\n",
"376 Visiting... http://events.berkeley.edu/index.php/calendar/sn/eecs.html?view=quick&timeframe=month&filter=Secondary%20Event%20Type&filtersel=32\n",
"377 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"378 Visiting... http://www.eecs.berkeley.edu/department/people.shtml\n",
"379 Visiting... http://www.eecs.berkeley.edu/department/internal.shtml\n",
"380 Visiting... http://www.eecs.berkeley.edu/cal/\n",
"381 Visiting... http://www.eecs.berkeley.edu/education/usli/\n",
"382 Visiting... http://www.eecs.berkeley.edu/Includes/copyright.shtml\n",
"383 Visiting... http://www.eecs.berkeley.edu/Facilities/\n",
"384 Visiting... http://www.eecs.berkeley.edu/department/internal.shtml\n",
"385 Visiting... http://www.eecs.berkeley.edu/Pubs/\n",
"386 Visiting... http://www.eecs.berkeley.edu/Resguide/stud.shtml\n",
"387 Visiting... http://www.eecs.berkeley.edu/education/showcase/\n",
"388 Visiting... http://www.eecs.berkeley.edu/Faculty/Homepages/wolisz.html\n",
"389 Visiting... http://www.berkeley.edu\n",
"390 Visiting... http://www.berkeley.edu\n",
"391 Visiting... http://www.berkeley.edu/about/visit\n",
"392 Visiting... http://www.berkeley.edu/about/visit\n",
"393 Visiting... http://www.berkeley.edu/research/museumscollections\n",
"394 Visiting... http://grad.berkeley.edu/\n",
"395 Visiting... http://grad.berkeley.edu/events/category/professional-development/month/\n",
"396 Visiting... http://grad.berkeley.edu/event/the-hacker-within-pandas/\n",
"397 Visiting... http://gradlectures.berkeley.edu\n",
"398 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"399 Visiting... http://www.eecs.berkeley.edu/XRG/conferences.shtml\n",
"400 Visiting... http://www.eecs.berkeley.edu/Resguide/admin.shtml\n",
"401 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"402 Visiting... http://www.eecs.berkeley.edu/Visiting/Scholars/menu.shtml\n",
"403 Visiting... http://www.eecs.berkeley.edu/Pubs/\n",
"404 Visiting... http://www.eecs.berkeley.edu/XRG/\n",
"405 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"406 Visiting... http://www.eecs.berkeley.edu/Faculty/Awards/#nms\n",
"407 Visiting... http://www.eecs.berkeley.edu/deptinfo/login.html?returnto=http://www.eecs.berkeley.edu:80/Faculty/Awards/#nms\n",
"408 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"409 Visiting... http://www.eecs.berkeley.edu/department/Impact.shtml\n",
"410 Visiting... http://www.eecs.berkeley.edu/department/officers.shtml\n",
"411 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"412 Visiting... http://www.eecs.berkeley.edu/department/people.shtml\n",
"413 Visiting... http://www.eecs.berkeley.edu/department/about.shtml\n",
"414 Visiting... http://www.eecs.berkeley.edu/Colloquium/\n",
"415 Visiting... http://www.eecs.berkeley.edu/Colloquium/Archives/05-06/Fall2005/\n",
"416 Visiting... http://www.eecs.berkeley.edu/XRG/\n",
"417 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"418 Visiting... http://www.eecs.berkeley.edu/department/OutreachPrograms.shtml\n",
"419 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"420 Visiting... http://www.eecs.berkeley.edu/department/people.shtml\n",
"421 Visiting... http://www.eecs.berkeley.edu/XRG/IAB/\n",
"422 Visiting... http://www.eecs.berkeley.edu/Prospective/ugrad.shtml\n",
"423 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"424 Visiting... http://www.eecs.berkeley.edu/department/officers.shtml\n",
"425 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"426 Visiting... http://www.eecs.berkeley.edu/Prospective/ugrad.shtml\n",
"427 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"428 Visiting... http://www.eecs.berkeley.edu/Pubs/\n",
"429 Visiting... http://www.eecs.berkeley.edu/cal/\n",
"430 Visiting... http://www.eecs.berkeley.edu/\n",
"431 Visiting... http://www.eecs.berkeley.edu/Research/Projects/\n",
"432 Visiting... http://www.eecs.berkeley.edu/Resguide/admin.shtml\n",
"433 Visiting... http://www.eecs.berkeley.edu/Resguide/admin.shtml#top\n",
"434 Visiting... http://www.eecs.berkeley.edu/Staff/Procedures/account.shtml\n",
"435 Visiting... https://calnet.calnet.berkeley.edu/myi/questions/set\n",
"436 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"437 Visiting... http://www.eecs.berkeley.edu\n",
"438 Visiting... http://www.eecs.berkeley.edu/deptinfo\n",
"439 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"440 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"441 Visiting... http://www.eecs.berkeley.edu/Research/Projects/\n",
"442 Visiting... http://www.eecs.berkeley.edu/department/people.shtml\n",
"443 Visiting... http://www.eecs.berkeley.edu/Faculty/Lists/faculty.shtml\n",
"444 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"445 Visiting... http://events.berkeley.edu/index.php/calendar/sn/eecs.html?view=quick&timeframe=month&filter=Secondary%20Event%20Type&filtersel=32\n",
"446 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"447 Visiting... http://www.eecs.berkeley.edu/deptinfo/login.html?returnto=http://www.eecs.berkeley.edu:80/Research/\n",
"448 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"449 Visiting... http://www.eecs.berkeley.edu/education/\n",
"450 Visiting... http://www.eecs.berkeley.edu/department/people.shtml\n",
"451 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"452 Visiting... http://www.berkeley.edu\n",
"453 Visiting... http://www.berkeley.edu/research/museumscollections\n",
"454 Visiting... http://essig.berkeley.edu/\n",
"455 Visiting... http://bnhm.berkeley.edu/\n",
"456 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"457 Visiting... http://www.eecs.berkeley.edu/Students/\n",
"458 Visiting... http://www.eecs.berkeley.edu/deptinfo/login.html?returnto=http://www.eecs.berkeley.edu:80/Students/\n",
"459 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"460 Visiting... http://www.eecs.berkeley.edu/Research/Projects/\n",
"461 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"462 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"463 Visiting... http://www.eecs.berkeley.edu/department/OutreachPrograms.shtml\n",
"464 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"465 Visiting... http://www.eecs.berkeley.edu/XRG/menu.shtml\n",
"466 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"467 Visiting... http://www.eecs.berkeley.edu/department/people.shtml\n",
"468 Visiting... http://www.eecs.berkeley.edu/Research/Areas/\n",
"469 Visiting... http://www.eecs.berkeley.edu/department/officers.shtml\n",
"470 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"471 Visiting... http://www.berkeley.edu\n",
"472 Visiting... http://news.berkeley.edu/\n",
"473 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"474 Visiting... http://www.eecs.berkeley.edu/Diversity/\n",
"475 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"476 Visiting... http://www.eecs.berkeley.edu/cal/\n",
"477 Visiting... http://www.eecs.berkeley.edu/department/internal.shtml\n",
"478 Visiting... http://www.eecs.berkeley.edu/department/about.shtml\n",
"479 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"480 Visiting... http://www.eecs.berkeley.edu/Faculty/Awards/#turing\n",
"481 Visiting... http://www.eecs.berkeley.edu/education/\n",
"482 Visiting... http://www.eecs.berkeley.edu/department/history.shtml\n",
"483 Visiting... http://www.eecs.berkeley.edu/education/courses.shtml\n",
"484 Visiting... http://www.eecs.berkeley.edu/Scheduling/EE/schedule-draft.html\n",
"485 Visiting... http://www.eecs.berkeley.edu/Faculty/Homepages/ctnguyen.html\n",
"486 Visiting... http://www.eecs.berkeley.edu/\n",
"487 Visiting... http://www.eecs.berkeley.edu/Faculty/Homepages/salahuddin.html\n",
"488 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"489 Visiting... http://www.eecs.berkeley.edu/help\n",
"490 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"491 Visiting... http://www.eecs.berkeley.edu/alumni/lists.shtml\n",
"492 Visiting... http://www.eecs.berkeley.edu/Includes/privacy.shtml\n",
"493 Visiting... http://www.eecs.berkeley.edu/alumni/lists.shtml\n",
"494 Visiting... http://www.eecs.berkeley.edu/help\n",
"495 Visiting... http://www.eecs.berkeley.edu/Research/\n",
"496 Visiting... http://www.eecs.berkeley.edu/Pubs/\n",
"497 Visiting... http://www.eecs.berkeley.edu/alumni/lists.shtml\n",
"498 Visiting... http://www.eecs.berkeley.edu/education/\n",
"499 Visiting... http://www.eecs.berkeley.edu/Students/directories.shtml\n"
]
}
],
"source": [
"url_start = \"http://www.eecs.berkeley.edu/Research/\"\n",
"current_url = url_start\n",
"num_of_visits = 500\n",
"\n",
"#List of professors obtained from the EECS page\n",
"profs = ['Abbeel','Agrawala','Alon','Anantharam','Arcak','Arias','AsanoviÄ‡','Bachrach','Bajcsy','Bodik','Bokor','Boser','Brewer','Canny','Chang-Hasnain','Culler','Darrell','Demmel','Fearing','Fox','Franklin','Garcia','Goldberg','Hartmann','Harvey','Hellerstein','Javey','Joseph','Katz','Keutzer','Liu','Klein','Kubiatowicz','Lee','Lustig','Maharbiz','Malik','Nguyen','Niknejad','Nikolic',\"O'Brien\",'Parekh','Patterson','Paxson','Pisano','Rabaey','Ramchandran','Roychowdhury','Russell','Sahai','Salahuddin','Sanders','Sangiovanni-Vincentelli','Sastry','Sen','Seshia','Shenker','Song','Song','Spanos','Stoica','Stojanovic','Tomlin','Tygar','Walrand','Wawrzynek','Wu','Yablonovitch','Yelick','Zakhor']\n",
"#Bad URLs help take care of some pathologies that ruin our surfing\n",
"bad_urls = ['http://www.erso.berkeley.edu/','http://www.eecs.berkeley.edu/Rosters/roster.name.nostudentee.html','http://www.eecs.berkeley.edu/Resguide/admin.shtml#aliases','http://www.eecs.berkeley.edu/department/EECSbrochure/c1-s3.html']\n",
"\n",
"#Creating a dictionary to keep track of how often we come across a professor\n",
"profdict = {}\n",
"for i in profs:\n",
" profdict[i] = 0\n",
"\n",
"for i in range(num_of_visits):\n",
" print i , ' Visiting... ', current_url\n",
" if random.random() < 0.95: #follow a link!\n",
" url_list = parse_links(current_url, url_start)\n",
" updated = False\n",
" while not updated:\n",
" current_url = random.choice(url_list)\n",
" updated = True\n",
" if current_url in bad_urls or \"iris\" in current_url or \"Deptonly\" in current_url or \"anchor\" in current_url or \"erso\" in current_url: #dealing with more pathologies:\n",
" updated = False\n",
" myopener = MyOpener()\n",
" page = myopener.open(current_url)\n",
" text = page.read()\n",
" page.close()\n",
" #Figuring out which professor is mentioned on a page.\n",
" for p in profs:\n",
" profdict[p]+= 1 if \" \" + p + \" \" in text else 0 #can use regex re.findall(i,text), but it's overkill\n",
" else: #click the \"home\" button!\n",
" current_url = url_start"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {
"collapsed": false,
"scrolled": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"1 1.000000: Chang-Hasnain\n",
"2 0.916667: Lee\n",
"3 0.583333: Russell\n",
"4 0.458333: Joseph\n",
"5 0.458333: Franklin\n",
"6 0.416667: Patterson\n",
"7 0.416667: Paxson\n",
"8 0.416667: Sen\n",
"9 0.375000: Wu\n",
"10 0.375000: Demmel\n",
"11 0.333333: Yablonovitch\n",
"12 0.291667: Harvey\n",
"13 0.291667: Bajcsy\n",
"14 0.291667: Brewer\n",
"15 0.291667: Bokor\n",
"16 0.291667: Hartmann\n",
"17 0.291667: Goldberg\n",
"18 0.291667: Pisano\n",
"19 0.250000: Katz\n",
"20 0.250000: Fearing\n",
"21 0.208333: Tomlin\n",
"22 0.125000: Yelick\n",
"23 0.125000: Arias\n",
"24 0.125000: Culler\n",
"25 0.083333: Liu\n",
"26 0.083333: Song\n",
"27 0.083333: Salahuddin\n",
"28 0.041667: Zakhor\n",
"29 0.041667: Sangiovanni-Vincentelli\n",
"30 0.041667: Anantharam\n",
"31 0.041667: Nguyen\n",
"32 0.041667: Rabaey\n",
"33 0.041667: Lustig\n",
"34 0.041667: Garcia\n",
"35 0.041667: Javey\n",
"36 0.041667: Darrell\n",
"37 0.041667: Tygar\n",
"38 0.000000: Boser\n",
"39 0.000000: Walrand\n",
"40 0.000000: Niknejad\n",
"41 0.000000: Fox\n",
"42 0.000000: Shenker\n",
"43 0.000000: Hellerstein\n",
"44 0.000000: Seshia\n",
"45 0.000000: Stojanovic\n",
"46 0.000000: Sanders\n",
"47 0.000000: Ramchandran\n",
"48 0.000000: Klein\n",
"49 0.000000: Roychowdhury\n",
"50 0.000000: Agrawala\n",
"51 0.000000: Canny\n",
"52 0.000000: AsanoviÄ‡\n",
"53 0.000000: Arcak\n",
"54 0.000000: O'Brien\n",
"55 0.000000: Nikolic\n",
"56 0.000000: Keutzer\n",
"57 0.000000: Kubiatowicz\n",
"58 0.000000: Sastry\n",
"59 0.000000: Sahai\n",
"60 0.000000: Alon\n",
"61 0.000000: Maharbiz\n",
"62 0.000000: Malik\n",
"63 0.000000: Wawrzynek\n",
"64 0.000000: Parekh\n",
"65 0.000000: Bachrach\n",
"66 0.000000: Abbeel\n",
"67 0.000000: Bodik\n",
"68 0.000000: Stoica\n",
"69 0.000000: Spanos\n"
]
}
],
"source": [
"prof_ranks = [pair[0] for pair in sorted(profdict.items(), key = lambda item: item[1], reverse=True)]\n",
"top_score = profdict[prof_ranks[0]]\n",
"for i in range(len(prof_ranks)):\n",
" print \"%d %f: %s\" % (i+1,profdict[prof_ranks[i]]/top_score, prof_ranks[i])"
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"24"
]
},
"execution_count": 31,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"top_score"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"As you can see, the top score only got 24 mentions, so not only is this method indirect in trying to achieve its goal, but it is quite difficult to efficiently sample the true distribution we want."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## $\\mathcal{Q}$3. Try your hand at applying the above idea to a website you personally visit (somewhat) frequently! Do a simple crawl (similar to the above) and see if you can figure out something interesting. (Keep it simple.)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# Your code here"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 2",
"language": "python",
"name": "python2"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 2
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.5"
}
},
"nbformat": 4,
"nbformat_minor": 0
}