{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Audio File Matching\n", "\n", "This notebook continues the audio file matching problem. Be sure to have `song.wav` and `clip.wav` in the same directory as the notebook.\n", "\n", "In this notebook, we will look at the problem of searching for a small audio clip inside a song.\n", "\n", "The song \"Mandelbrot Set\" by Jonathan Coulton is licensed under CC BY-NC 3.0." ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "Run the next block of code before proceeding." ] },
{ "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "import numpy as np\n", "import wave\n", "import matplotlib.pyplot as plt\n", "import scipy.io.wavfile\n", "import operator\n", "from IPython.display import Audio\n", "%matplotlib inline\n", "\n", "given_file = 'song.wav'\n", "target_file = 'clip.wav'\n", "rate_given, given_signal = scipy.io.wavfile.read(given_file)\n", "rate_target, target_signal = scipy.io.wavfile.read(target_file)\n", "given_signal = given_signal[:2000000].astype(float)\n", "target_signal = target_signal.astype(float)\n", "\n", "def play_clip(start, end, signal=given_signal):\n", "    # Write the selected slice of the signal to a temporary file and play it.\n", "    scipy.io.wavfile.write('temp.wav', rate_given, signal[start:end].astype(np.int16))\n", "    return Audio(url='temp.wav', autoplay=True)\n", "\n", "def run_comparison(target_signal, given_signal, idxs=None):\n", "    # Compare target_signal against every candidate window of given_signal.\n", "    # Run everything if not called with idxs set to something.\n", "    if idxs is None:\n", "        idxs = list(range(len(given_signal) - len(target_signal)))\n", "    return idxs, [vector_compare(target_signal, given_signal[i:i+len(target_signal)])\n", "                  for i in idxs]\n", "\n", "play_clip(0, len(given_signal))\n", "\n", "#scipy.io.wavfile.write(target_file, rate_given, (-0.125*given_signal[1380000:1380000+70000]).astype(np.int16))" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "We will load the song into the variable `given_signal` and the short clip into the variable `target_signal`. Your job is to finish code that will identify the short clip's location in the song. The clip we are trying to find will play after executing the following block." ] },
{ "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "Audio(url=target_file, autoplay=True)" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "We have defined the function `vector_compare` to compare vectors by the \"cosine similarity\" measure.\n", "A small self-contained sketch of the sliding comparison appears just below, followed by code that compares some example vectors. Because the song has a lot of data, check the measure on these short example vectors from the previous parts of the problem before running the later code. Do your results here make sense given your answers to the previous parts of the problem?"
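] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The sketch below uses toy values (the arrays `clip`, `song`, and the helper `cosine_similarity` are made up for this illustration and are not taken from the assignment): it slides a short pattern across a longer signal and scores every alignment with the cosine similarity. The alignment whose score has the largest magnitude marks where the pattern sits, which is why the later parts of the notebook look at the absolute value of the comparison." ] },
{ "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "import numpy as np\n", "\n", "# Toy signals (illustrative values only): a scaled, negated copy of `clip` is hidden\n", "# inside `song` starting at offset 3.\n", "clip = np.array([1.0, -2.0, 3.0, -1.0])\n", "song = np.concatenate([np.array([0.5, -0.3, 0.2]), -0.5*clip, np.array([0.1, 0.4])])\n", "\n", "def cosine_similarity(a, b):\n", "    # Same measure as vector_compare: dot product normalized by the product of the norms.\n", "    return np.dot(a, b)/(np.linalg.norm(a)*np.linalg.norm(b) + 1e-10)\n", "\n", "# Score every possible alignment of the clip inside the song.\n", "scores = [cosine_similarity(clip, song[i:i+len(clip)]) for i in range(len(song) - len(clip))]\n", "print(np.round(scores, 3))   # the score at offset 3 is -1.0, the largest in magnitude"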
] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "def vector_compare(desired_vec, test_vec):\n", "    \"\"\"This function compares two vectors, returning a number.\n", "    The test vector with the largest-magnitude return value is regarded as being closest to the desired vector.\"\"\"\n", "    return np.dot(desired_vec.T, test_vec)/(np.linalg.norm(desired_vec)*np.linalg.norm(test_vec)+1.e-10)\n", "\n", "print(\"PART A:\")\n", "print(vector_compare(np.array([1,1,1]), np.array([1,1,1])))\n", "print(vector_compare(np.array([1,1,1]), np.array([-1,-1,-1])))\n", "print(\"PART C:\")\n", "print(vector_compare(np.array([1,2,3]), np.array([1,2,3])))\n", "print(vector_compare(np.array([1,2,3]), np.array([2,3,4])))\n", "print(vector_compare(np.array([1,2,3]), np.array([3,4,5])))\n", "print(vector_compare(np.array([1,2,3]), np.array([4,5,6])))\n", "print(vector_compare(np.array([1,2,3]), np.array([5,6,7])))\n", "print(vector_compare(np.array([1,2,3]), np.array([6,7,8])))" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## Part 1\n", "Run the following code, which runs `vector_compare` on every subsequence in the song; it will probably take at least 5 minutes. How do you interpret this plot to find where the clip is in the song?" ] },
{ "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "import time\n", "\n", "t0 = time.time()\n", "idxs, song_compare = run_comparison(target_signal, given_signal)\n", "t1 = time.time()\n", "plt.plot(idxs, song_compare)\n", "print(\"That took %(time).2f minutes to run\" % {'time': (t1-t0)/60.0})" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## Part 2\n", "Fill in the following code to use `song_compare` to print the index of `given_signal` where `target_signal` begins. Then, verify that your answer is correct by playing the song at that index using the `play_clip` function." ] },
{ "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "index = np.????(np.abs(np.array(song_compare)))\n", "print(index)\n", "play_clip(index, index+len(target_signal))" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "Hint: Have you heard of an argmax?" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "# Problem 7: Image Stitching\n", "\n", "This section of the notebook continues the image stitching problem. Be sure to have a `figures` folder in the same directory as the notebook. The `figures` folder should contain the files:\n", "\n", "    Berkeley_banner_1.jpg\n", "    Berkeley_banner_2.jpg\n", "    stacked_pieces.jpg\n", "    lefthalfpic.jpg\n", "    righthalfpic.jpg\n", "\n", "Note: This structure is present in the provided HW2 zip file."
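] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The stitching code below maps each pixel coordinate of one image into the other by multiplying it by a 2x2 transform matrix and adding a translation vector. As a minimal self-contained sketch (the rotation angle, translation, and coordinate here are toy values, not the answer to the problem), this is what that mapping does to a single coordinate; `euclidean_transform_2to1` in the setup code below applies the same operation to every pixel it looks up." ] },
{ "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "import numpy as np\n", "\n", "# Toy Euclidean transform (illustrative values only): rotate by 90 degrees, then translate.\n", "theta = np.pi/2\n", "R = np.array([[np.cos(theta), -np.sin(theta)],\n", "              [np.sin(theta),  np.cos(theta)]])\n", "t = np.array([[3.0], [1.0]])\n", "\n", "position = np.array([[2.0], [0.0]])   # a (row, col) coordinate written as a column vector\n", "new_position = R.dot(position) + t    # rotate, then shift\n", "print(np.round(new_position))         # approximately [[3.], [3.]]"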
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Run the next block of code before proceeding." ] },
{ "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "tags": [ "worksheet-0" ] }, "outputs": [], "source": [ "import numpy as np\n", "import numpy.matlib\n", "import matplotlib.pyplot as plt\n", "from mpl_toolkits.mplot3d import Axes3D\n", "from numpy import pi, cos, exp, sin\n", "import matplotlib.image as mpimg\n", "import matplotlib.transforms as mtransforms\n", "\n", "%matplotlib inline\n", "\n", "# Loading images (pixel values rescaled to the range [0, 1]).\n", "image1 = mpimg.imread('figures/Berkeley_banner_1.jpg')\n", "image1 = image1/255.0\n", "image2 = mpimg.imread('figures/Berkeley_banner_2.jpg')\n", "image2 = image2/255.0\n", "image_stack = mpimg.imread('figures/stacked_pieces.jpg')\n", "image_stack = image_stack/255.0\n", "\n", "image1_marked = mpimg.imread('figures/lefthalfpic.jpg')\n", "image1_marked = image1_marked/255.0\n", "image2_marked = mpimg.imread('figures/righthalfpic.jpg')\n", "image2_marked = image2_marked/255.0\n", "\n", "def euclidean_transform_2to1(transform_mat, translation, image, position, LL, UL):\n", "    # Map the coordinate `position` through the transform and translation, then read the\n", "    # pixel of `image` at the mapped location if it falls inside the bounds [LL, UL).\n", "    new_position = np.round(transform_mat.dot(position) + translation)\n", "    new_position = new_position.astype(int)\n", "\n", "    if (new_position >= LL).all() and (new_position < UL).all():\n", "        values = image[new_position[0][0], new_position[1][0], :]\n", "    else:\n", "        # Out-of-range positions are filled with white (an assumed placeholder value).\n", "        values = np.array([1.0, 1.0, 1.0])\n", "\n", "    return values" ] },
{ "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# Rebuild the missing (near-white) pixels of image1 from image2, using the 2x2 transform\n", "# matrix and translation vector from the earlier parts of the problem; fill in the values here.\n", "matrix_transform = np.array([[????, ????], [????, ????]])\n", "translation = np.array([[????], [????]])\n", "\n", "image_rec = 1.0*image1  # start from image1 and overwrite its missing pixels\n", "LL = np.array([[0], [0]])  # lower bound on valid (row, col) coordinates in image2\n", "UL = np.array([[image2.shape[0]], [image2.shape[1]]])  # upper bound (exclusive)\n", "\n", "for row in range(image1.shape[0]):\n", "    for col in range(image1.shape[1]):\n", "        position = np.array([[row], [col]])\n", "        if image1[row,col,0] > 0.995 and image1[row,col,1] > 0.995 and image1[row,col,2] > 0.995:\n", "            temp = euclidean_transform_2to1(matrix_transform, translation, image2, position, LL, UL)\n", "            image_rec[row,col,:] = temp\n", "        else:\n", "            image_rec[row,col,:] = image1[row,col,:]\n", "\n", "plt.figure(figsize=(20,20))\n", "plt.imshow(image_rec)\n", "plt.axis('on')\n", "plt.show()" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.4.3" }, "name": "graphs_for_SOE.ipynb" }, "nbformat": 4, "nbformat_minor": 0 }