The video scan consists of roughly 260 horizontal lines (262.5 per field) containing intensity information, along with extra analog signaling to make sure that the lines line up with each other (synchronization, or "sync", information) and to make sure that nothing is written on the screen while the electron beam deflection circuitry is scanning back across the page (blanking).  The format of this signal comes from an earlier age when the world was analog and displays were made using electron guns and phosphors.

The video signal generated by the camera is a composite video signal. That is, the video data (gray levels) and the horizontal and vertical synchronization information are all encoded on a single wire.  The television locks on to the horizontal and vertical sync information in the video signal. Horizontal sync defines the beginning of a new line, and vertical sync defines the beginning of a new field. The horizontal sweep rate is 60 fields per second times 262.5 lines per field, or 15.75 kHz. The horizontal sweep interval is divided into three portions: blanking, synchronization, and data. Each line lasts 63.5 microseconds (= 1.0/15.75 kHz). Retrace occurs during the blanking interval, and the leftmost visible data occurs at the end of the blanking interval. The horizontal synchronization signal occurs approximately in the middle of the blanking interval. Note that the video signal is analog, not binary, with levels ranging from white down through black to sync.

So a single horizontal line of video (shown below) lasts for 63.5 microseconds and has several components.  The longest piece is the analog signal that represents the intensity of the image as a function of time (position along the scan line).  That analog piece is 53 microseconds long.  After the analog signal, there is some semi-digital information that tells the TV (or your project's FSM) exactly when the next line is about to start relative to the last one.  This is the horizontal sync signal.
In addition to the horizontal sync signal, at the end of each frame of video there is a vertical sync signal that indicates when the next frame is going to start, and which tells the TV's controller to turn off the electron gun while the beam deflection circuits are bringing the beam back to the top left corner of the screen.  The actual encoding of the vertical sync information is a bit cryptic, but you've got a chip on your board that takes the video signal and gives you a clear vertical sync signal.
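The line timing quoted above can be checked with a little arithmetic. The following is only a sketch in Python; the constant and function names are made up for illustration, and the numbers are the ones given in the text (60 fields per second, 262.5 lines per field, 53 microseconds of analog video per line).

```python
# Figures taken from the text; names are illustrative only.
FIELDS_PER_SEC = 60.0
LINES_PER_FIELD = 262.5
ACTIVE_VIDEO_US = 53.0  # analog (intensity) portion of one line


def line_rate_hz():
    """Horizontal sweep rate: fields per second times lines per field."""
    return FIELDS_PER_SEC * LINES_PER_FIELD


def line_period_us():
    """Duration of one horizontal line in microseconds."""
    return 1e6 / line_rate_hz()


def non_video_us():
    """Time per line left over for blanking and horizontal sync."""
    return line_period_us() - ACTIVE_VIDEO_US


if __name__ == "__main__":
    print("line rate    : %.2f kHz" % (line_rate_hz() / 1e3))
    print("line period  : %.2f us" % line_period_us())
    print("blanking+sync: %.2f us" % non_video_us())
```

The line period comes out just under the 63.5 microseconds quoted above, leaving about 10.5 microseconds per line for blanking and sync.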

Don't forget that the video signals are all asynchronous as far as your Xilinx board is concerned.  Treat them with caution!
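One common way to handle an asynchronous input is to pass it through two flip-flops clocked by your own clock before any logic looks at it, and to detect edges only on the synchronized copy. The handout does not prescribe this; the following is a behavioral Python model of that idea, not hardware code, and the class name, the idle-high default, and the method names are assumptions made for illustration.

```python
class SyncFallingEdge:
    """Behavioral model: two-flip-flop synchronizer plus falling-edge detect.

    Assumption: the input idles high and goes low when active.
    """

    def __init__(self, idle_level=1):
        self.ff1 = idle_level   # first flop, samples the raw async input
        self.ff2 = idle_level   # second flop, the "safe" synchronized copy
        self.prev = idle_level  # previous value of ff2, for edge detection

    def clock(self, async_in):
        """Advance one clock edge; return True on a synchronized falling edge."""
        # All three registers update from the values held before the edge.
        self.prev = self.ff2
        self.ff2 = self.ff1
        self.ff1 = async_in
        return self.prev == 1 and self.ff2 == 0


if __name__ == "__main__":
    sync = SyncFallingEdge()
    samples = [1, 1, 0, 0, 0, 1]
    print([sync.clock(s) for s in samples])
```

In this model the falling edge is reported one clock after the input is first sampled low, which is the price paid for the extra flip-flop.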

The various sync signals are shown below.  Note that the vertical blanking interval is roughly 1 millisecond long.  This gives you plenty of time to think, do some math on a good-sized chunk of a frame, transmit information on the serial line, etc.

 

Since the vertical scan is much slower than the horizontal scan, a longer vertical blanking interval used to be necessary to avoid seeing the retrace lines. The example in the figure indicates a blanking time of 15 horizontal periods. How many horizontal lines will be visible? Approximately 247 (= 262 - 15) lines. The standard blanking interval is between 18 and 21 horizontal periods. It is OK if your design misses a few lines at the top or bottom of the screen. A wide SYNC pulse triggers the vertical oscillator in the monitor to start another cycle; however, the horizontal oscillator must be kept synchronized during this time, hence the serrations in the COMP_SYNCH.H signal.
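The visible-line count and the length of the vertical blanking interval follow directly from the number of blanked lines. Here is a small Python sketch of that arithmetic, using the 63.5 microsecond line period and the rounded 262 lines per field from the text; the function names are illustrative only.

```python
LINE_PERIOD_US = 63.5   # duration of one horizontal line, from the text
LINES_PER_FIELD = 262   # 262.5 rounded down, as in the text's example


def visible_lines(blanking_lines):
    """Lines left in a field after vertical blanking."""
    return LINES_PER_FIELD - blanking_lines


def blanking_time_ms(blanking_lines):
    """Length of the vertical blanking interval in milliseconds."""
    return blanking_lines * LINE_PERIOD_US / 1000.0


if __name__ == "__main__":
    for n in (15, 18, 21):
        print("%2d blanked lines: %3d visible, %.2f ms of blanking"
              % (n, visible_lines(n), blanking_time_ms(n)))
```

With the figure's 15 blanked lines this gives 247 visible lines and a little under 1 millisecond of blanking; at the standard 18 to 21 lines it gives 244 down to 241 visible lines.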

Figuring out where the top and left side of the image are could be complicated. Fortunately, you will be using the LM1881 sync separator in checkpoint 2. It takes the composite video signal as input and provides vertical sync and composite sync as outputs. So to find the beginning of the video frame, wait about 17 horizontal periods after the vertical sync signal goes low.  Then begin reading in video data about 8.8 microseconds after the composite sync goes low. Read a line, then wait for the next line, and repeat until you have read all the lines you need.
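To turn these delays into counter values for your FSM, multiply each time by your clock frequency. The Python sketch below does that conversion. The 16 MHz clock is purely an assumed example value (substitute whatever your board actually runs at), and the function and key names are hypothetical; the 17 line periods, 8.8 microseconds, 63.5 microseconds, and 53 microseconds come from the text.

```python
LINE_PERIOD_NS = 63500     # one horizontal line (63.5 us)
VSYNC_SKIP_LINES = 17      # lines to wait after vertical sync goes low
HSYNC_TO_VIDEO_NS = 8800   # delay from composite sync low to first pixel
ACTIVE_VIDEO_NS = 53000    # analog video portion of one line

CLOCK_HZ = 16_000_000      # ASSUMPTION: example clock, not from the handout


def ns_to_cycles(duration_ns, clock_hz=CLOCK_HZ):
    """Nearest whole number of clock cycles in a duration given in ns."""
    return round(duration_ns * clock_hz / 1e9)


def capture_counts(clock_hz=CLOCK_HZ):
    """Counter targets for a frame-capture FSM at the given clock rate."""
    return {
        "after_vsync": ns_to_cycles(VSYNC_SKIP_LINES * LINE_PERIOD_NS, clock_hz),
        "after_csync": ns_to_cycles(HSYNC_TO_VIDEO_NS, clock_hz),
        "per_line": ns_to_cycles(ACTIVE_VIDEO_NS, clock_hz),
    }


if __name__ == "__main__":
    for name, cycles in capture_counts().items():
        print("%-12s %6d cycles" % (name, cycles))
```

Since the text gives both delays as approximate ("about 17", "about 8.8"), rounding to the nearest cycle is more than accurate enough.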
 

For more info on the video signal, see checkpoint 2.