Let's say we want to refocus the image — i.e. show the foreground clearly while the background becomes blurry, or vice versa. With lightfield data, we can do this as a post-processing step!
Given our 17-by-17 camera grid, let's take the center camera, (8, 8), and use it as our fixed observation point. Because we have the camera positions, we can compute the relative offset between any two cameras. For example, a camera near the center (say, camera (8, 7)) captured an image that is fairly similar to the one captured by the center camera (8, 8). However, the image from camera (0, 1) was taken from a substantially different position, and thus captured the scene from a different angle. To compensate for this, we can shift that camera's image by some amount: the millimeter difference in the two cameras' (x, y) positions, scaled by some alpha factor. Using a multiplicative scaling factor preserves the relative shifts, ensuring that cameras farther from the center are shifted proportionally more to compensate for their larger difference in perspective.
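The per-camera shift computation can be sketched roughly like this (the `positions` array, the 10 mm grid spacing, and the flattened camera indexing are assumptions for illustration, not the actual archive format):

```python
import numpy as np

def compute_shifts(positions, center_index, alpha):
    """Shift for each camera: alpha times its (x, y) millimeter
    offset from the chosen center camera."""
    center = positions[center_index]
    return alpha * (positions - center)

# Hypothetical 17x17 grid of camera (x, y) positions, 10 mm apart.
grid = np.stack(np.meshgrid(np.arange(17), np.arange(17), indexing="ij"), axis=-1)
positions = grid.reshape(-1, 2).astype(float) * 10.0

center_index = 8 * 17 + 8  # camera (8, 8) in the flattened grid
shifts = compute_shifts(positions, center_index, alpha=0.5)
```

The center camera gets a zero shift, and cameras at the edge of the grid get proportionally larger shifts, matching the scaling behavior described above.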
Then, averaging all these shifted images produces an image that is in focus at one particular depth. Which depth is highly dependent on the alpha value mentioned above. A lower alpha (e.g. 0) brings the background into clear focus, which makes intuitive sense -- objects in the background shift less as the camera/observer moves. For example, if you took a portrait of someone with the Sun behind them, then moved a foot to the left and took another picture, the person would appear to have shifted a lot while the Sun would barely move relative to other objects in the background. This is because the light rays from the Sun are essentially parallel.
As we scale our alpha up — all the way to around 0.5 in the gif above — we shift the plane of focus closer to the observer. By tuning this alpha value, we can choose which depth in the scene is in focus.
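The whole shift-and-average procedure can be sketched as follows. This is a minimal illustration on tiny synthetic data: the `refocus` helper, the camera positions, and the images are all made up, and integer shifts via `np.roll` stand in for the sub-pixel interpolation (e.g. `scipy.ndimage.shift`) a real implementation would use:

```python
import numpy as np

def refocus(images, positions, center_index, alpha):
    """Average the camera images after shifting each one by alpha
    times its positional offset from the center camera. Shifts are
    rounded to whole pixels here for simplicity."""
    center = positions[center_index]
    acc = np.zeros_like(images[0], dtype=float)
    for img, pos in zip(images, positions):
        # Hypothetical convention: positions are (row, col) offsets.
        dy, dx = np.round(alpha * (pos - center)).astype(int)
        acc += np.roll(img, shift=(dy, dx), axis=(0, 1))
    return acc / len(images)

# Tiny synthetic example: 4 cameras, random 8x8 images.
rng = np.random.default_rng(0)
images = [rng.random((8, 8)) for _ in range(4)]
positions = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)

sharp_background = refocus(images, positions, center_index=0, alpha=0.0)
shifted_focus = refocus(images, positions, center_index=0, alpha=0.5)
```

With alpha set to 0, no image is shifted and the result is just the plain average, which is why the background (whose features barely move between cameras) comes out sharp; increasing alpha aligns nearer objects instead.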