This is the second episode of my new Computer Vision for the Robotic Age podcast. This episode is about video I/O. The podcast demonstrates how a video player with proper audio/video synchronisation can be implemented in the Interactive Ruby Shell. The Sintel short film (copyright Blender Foundation) was used as a test video.
Here’s the source code of the Ruby video player created in the podcast:
require 'rubygems'
# load FFMPEG bindings
require 'hornetseye_ffmpeg'
# load X.Org bindings
require 'hornetseye_xorg'
# load ALSA bindings
require 'hornetseye_alsa'
# include the namespace
include Hornetseye
# open a video file
input = AVInput.new 'sintel.mp4'
# open sound output with sampling rate of video
alsa = AlsaOutput.new 'default:0', input.sample_rate, input.channels
# read first audio frame
audio_frame = input.read_audio
# display images using width of 600 pixels and XVideo hardware acceleration
X11Display.show 600, :output => XVideoOutput do |display|
  # read an image
  img = input.read
  # while there is space in the audio output buffer ...
  while alsa.avail >= audio_frame.shape[1]
    # ... write previous frame to audio buffer
    alsa.write audio_frame
    # read new audio frame
    audio_frame = input.read_audio
  end
  # compute difference of video clock to audio clock
  delay = input.video_pos - input.audio_pos +
          (alsa.delay + audio_frame.shape[1]).quo(alsa.rate)
  # suspend program in order to synchronise the video with the audio
  display.event_loop [delay, 0].max
  # display image
  img
end
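The synchronisation boils down to the clock-difference formula near the end of the loop. As a self-contained illustration (no HornetsEye required), here is the same computation with made-up stand-ins for `input.video_pos`, `input.audio_pos`, `alsa.delay`, `audio_frame.shape[1]` and `alsa.rate`:

```ruby
# Hypothetical clock readings; in the player these come from the video file
# and the ALSA sound card.
video_pos  = Rational(1001, 1000) # presentation time of current video frame (s)
audio_pos  = Rational(9, 10)      # presentation time of last decoded audio frame (s)
alsa_delay = 2_205                # samples still queued in the sound card buffer
frame_len  = 2_205                # samples of the audio frame not yet written
rate       = 44_100               # sampling rate (Hz)

# The effective audio clock lags audio_pos by the audio not yet played, so the
# video frame must wait for the difference of the two clocks.
delay = video_pos - audio_pos + (alsa_delay + frame_len).quo(rate)

# Never wait a negative amount of time; clamp at zero if the video is late.
wait = [delay, 0].max
puts wait.to_f
```

With these numbers the video clock is 201 ms ahead of the audio clock, so the player would suspend for 0.201 s before showing the frame.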
This is the first episode of my new Computer Vision for the Robotic Age podcast. This episode is on replacing the background of a live video with another video. The background replacement algorithm is implemented live using the HornetsEye real-time computer vision library for the Ruby programming language.
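The episode implements the algorithm on live video with HornetsEye; as a rough, self-contained sketch of the underlying idea only, here is threshold-based background substitution on plain Ruby arrays standing in for grayscale images (the threshold and all pixel values are invented for illustration):

```ruby
# Pixels that differ little from a reference image of the empty scene are
# treated as background and taken from the replacement video instead.
THRESHOLD = 16 # arbitrary grey-level difference threshold

def replace_background(frame, reference, replacement, threshold = THRESHOLD)
  frame.each_index.map do |y|
    frame[y].each_index.map do |x|
      if (frame[y][x] - reference[y][x]).abs < threshold
        replacement[y][x] # background pixel: take it from the other video
      else
        frame[y][x]       # foreground pixel: keep it
      end
    end
  end
end

reference   = [[10, 10], [10, 10]]   # empty scene
frame       = [[12, 200], [10, 180]] # scene with a foreground object
replacement = [[99, 99], [99, 99]]   # frame of the new background video

p replace_background(frame, reference, replacement)
# => [[99, 200], [99, 180]]
```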
I am currently working on camera calibration. Many implementations require the user to manually point out the corners. Here is an idea on how to detect and label the corners automatically.
Compute corners of input image (and use non-maxima suppression).
Count corners in each component.
Look for a component which contains exactly 40 corners.
Get largest component of inverse of grid (i.e. the surroundings).
Grow that component and find all corners on it (i.e. corners on the boundary of the grid).
Find centre of gravity of all corners and compute vectors from centre to each boundary corner.
Sort boundary corners by angle of those vectors.
Use non-maxima suppression on the list of vector lengths to get the 4 “corner corners” (convexity).
Use the locations of the 4 “corner corners” to compute a planar homography mapping the image coordinates of the 8 times 5 grid to the ranges 0..7 and 0..4 respectively.
Use the homography to transform the 40 corners and round the coordinates.
Order the points using the rounded coordinates.
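The geometric steps above (centre of gravity, sorting by angle, and picking the convex “corner corners”) can be sketched in plain Ruby. The boundary corner coordinates below are hypothetical stand-ins for the corner detector's output:

```ruby
# Hypothetical boundary corners of an axis-aligned grid; in practice these come
# from the corner detector and the boundary component.
boundary = [[0, 0], [4, 0], [4, 2], [0, 2], [2, 0], [2, 2], [0, 1], [4, 1]]

# centre of gravity of the corners
n  = boundary.size.to_f
cx = boundary.sum { |x, _| x } / n
cy = boundary.sum { |_, y| y } / n

# sort boundary corners by the angle of the vector from the centre
sorted = boundary.sort_by { |x, y| Math.atan2(y - cy, x - cx) }

# length of each vector from the centre
radii = sorted.map { |x, y| Math.hypot(x - cx, y - cy) }

# non-maxima suppression on the cyclic list of lengths: keep corners whose
# radius is at least as large as both neighbours (the convex extremes)
m = radii.size
corner_corners = (0...m).select do |i|
  radii[i] >= radii[(i - 1) % m] && radii[i] >= radii[(i + 1) % m]
end.map { |i| sorted[i] }

p corner_corners
# => [[0, 0], [4, 0], [4, 2], [0, 2]]
```

The four surviving corners, in angular order, are the ones fed to the homography estimation in the next step.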
Further work is about taking several images to perform the actual camera calibration.
Thanks to Manuel Boissenin for suggesting convexity for finding the “corner corners”.
Update:
After calibrating the camera, the ratio of focal length to pixel size is known (see also Zhengyou Zhang’s camera calibration method). It then becomes possible to estimate the 3D pose of the calibration grid in every frame.
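As an illustration of that pose estimation, here is a sketch of the standard decomposition used in Zhang’s method: given the camera matrix K, the planar homography H = K · [r1 r2 t] can be split back into two rotation columns and the translation. All numeric values below are invented, and the known pose is used only to synthesise a test homography:

```ruby
require 'matrix'

# hypothetical camera matrix (focal length / pixel size and principal point)
K = Matrix[[500.0, 0.0, 320.0],
           [0.0, 500.0, 240.0],
           [0.0,   0.0,   1.0]]

# a known pose, used here only to synthesise a homography H = K * [r1 r2 t]
r1 = Vector[1.0, 0.0,  0.0]
r2 = Vector[0.0, 0.0, -1.0]
t  = Vector[0.5, 0.2,  3.0]
h  = K * Matrix.columns([r1, r2, t])

# decompose: the columns of K^-1 * H are r1, r2 and t up to a common scale;
# the scale follows from r1 being a unit vector
m      = K.inverse * h
scale  = 1.0 / m.column(0).norm
r1_est = m.column(0) * scale
r2_est = m.column(1) * scale
r3_est = r1_est.cross_product(r2_est) # third rotation axis completes R
t_est  = m.column(2) * scale

p t_est
```

In practice H would be estimated from the 40 detected grid corners rather than synthesised, and the recovered rotation should be re-orthonormalised to compensate for noise.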