A High-Level Description of the CMUcam's Implementation

Tom Kneeland


The CMUcam is an inexpensive optical color recognition system that can be used for, among other things, robotic object recognition. The system consists of two primary components: a digital camera and a dedicated micro controller. Firmware running on the micro controller allows for recognition and tracking of colored objects, referred to as blobs. The system can calculate the mean color and variance of objects within the camera's view and report various object-related data to a host computer using an RS-232 or TTL serial connection. The system can operate at a respectable rate of 17 frames per second. The host serial connection can run at various rates, including 9600 and 115,200 bits per second.


The system consists of two components: a CMOS camera manufactured by Omnivision and a small micro controller manufactured by Ubicom. The CMOS camera used in the system is the OV6620 color camera. Its custom-built circuit board contains a 17MHz clock and provides the external connections needed to use the camera on an 8- or 16-bit bus. The camera's maximum resolution is 356 x 292 at 60 fps, with a pixel format of 8-bit RGB or YCrCb.

The Ubicom SX28 micro controller provides the brains of the system. It is a 75MHz RISC processor capable of 75 MIPS. The configuration used by the CMUcam has 2048 words of ROM (used to store the program) and 136 bytes of SRAM.

The firmware is written entirely in C and is responsible for Color Blob Tracking, Frame Dumping, "Line Mode", and Color Statistics. Two serial protocols are also implemented in the firmware. One uses displayable characters, so any ASCII terminal (such as the Windows HyperTerminal) can be used to set up and control the camera. The other is more compact, uses binary data, and is intended for use with other embedded systems. The program occupies 2035 of the 2048 words of ROM and uses 135 of the 136 bytes of SRAM.


The CMUcam configures the Omnivision OV6620 to use a resolution of 176 x 144. The camera is connected to the rest of the CMU-designed system using an 8-bit bus. The 136 bytes of SRAM are divided into an 80-byte pixel buffer and 56 bytes for processing. The 80-byte pixel buffer provides enough space to store one row of camera data for processing. This configuration provides an effective resolution of 80 x 144.

The CMOS camera does not produce 1 byte per pixel, however, so some software wizardry is needed to fit one row of data into the 80-byte pixel buffer. The CMOS camera can produce pixel data in one of two formats, RGB and YCrCb. A simple mathematical conversion is widely available for RGB -> YCrCb and YCrCb -> RGB, so we will limit this description to an RGB example. The exact format used by the CMUcam is best described using a ratio, 4:2:2. This means that for every 2 red and 2 blue values there are 4 green values. This system takes advantage of the fact that the human eye is more sensitive to the green spectrum than to red and blue. Each component, red, green, or blue, is represented using 8 bits of data. In this format two image pixels are packed into one macropixel. The figure below illustrates how, within one macropixel, the red and blue channels are shared between two green values. This is a fairly standard form of compression used in video systems and effectively creates an "average" of 16 bits per pixel.
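The macropixel packing described above can be sketched in a few lines of Python. This is only an illustration: the byte order (G1, R, G2, B) and the function names are assumptions made for the example, not details taken from the CMUcam firmware or the OV6620 datasheet.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (red, green, blue), each 0..255


def unpack_macropixel(g1: int, r: int, g2: int, b: int) -> Tuple[Pixel, Pixel]:
    """Expand one 4-byte macropixel into two RGB pixels.

    The two pixels keep their own green value but share red and blue.
    The (G1, R, G2, B) byte order is an assumption for this sketch.
    """
    return (r, g1, b), (r, g2, b)


def unpack_row(raw: bytes) -> List[Pixel]:
    """Unpack a row of 4-byte macropixels into a list of RGB pixels."""
    if len(raw) % 4 != 0:
        raise ValueError("row length must be a multiple of 4 bytes")
    pixels: List[Pixel] = []
    for i in range(0, len(raw), 4):
        g1, r, g2, b = raw[i:i + 4]
        pixels.extend(unpack_macropixel(g1, r, g2, b))
    return pixels


def bits_per_pixel(raw: bytes) -> float:
    """Average number of bits stored per unpacked pixel."""
    return len(raw) * 8 / len(unpack_row(raw))
```

Four bytes yield two pixels, which is where the average of 16 bits per pixel comes from.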


A quick look at the numbers, 16 bpp x 80 pixels per row, gives a buffer size of 160 bytes. This is twice the allotted 80 bytes. The CMU implementation therefore has to perform even more compression on the image data produced by the camera. The method chosen essentially creates a proprietary 2:1:1 format. The first step is to skip every other pixel. This can be achieved by dropping the second green value in every macropixel. The CMU documentation then states that the red and blue values are reused, or shared. The figure below shows how two macropixels can be merged into one by dropping 2 green values and reusing the red and blue channels of one of the original macropixels. This process obviously sacrifices image resolution (compressing 176 columns of data into 80) as well as color accuracy.

Compressed Macropixels
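The merge step described above might look roughly like the following Python sketch. The (G1, R, G2, B) byte order, the choice to keep the red and blue values of the first macropixel, and the function names are all assumptions for illustration; the actual firmware may make different choices.

```python
def merge_macropixels(first: bytes, second: bytes) -> bytes:
    """Merge two 4-byte macropixels into one.

    Keeps the first green value of each macropixel and the red and blue
    values of the first one; everything else is dropped. Byte order
    (G1, R, G2, B) is assumed for this sketch.
    """
    if len(first) != 4 or len(second) != 4:
        raise ValueError("each macropixel must be exactly 4 bytes")
    g1, r1, _g2, b1 = first        # second green of the first macropixel is dropped
    g3, _r2, _g4, _b2 = second     # only the first green of the second one is kept
    return bytes([g1, r1, g3, b1])


def compress_row(raw: bytes) -> bytes:
    """Merge consecutive macropixel pairs, halving the byte count of a row."""
    if len(raw) % 8 != 0:
        raise ValueError("row length must be a multiple of 8 bytes")
    out = bytearray()
    for i in range(0, len(raw), 8):
        out += merge_macropixels(raw[i:i + 4], raw[i + 4:i + 8])
    return bytes(out)
```

Each pair of input macropixels contributes 2 green, 1 red, and 1 blue value to the output, matching the 2:1:1 ratio.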

Color Blob Tracking

The Color Blob Tracking algorithm is fairly straightforward. As the data streams in from the camera, the firmware compares each pixel to a min and max color value (configured by the host) and checks whether matching pixels fall within a configurable bounding box. From the data collected about matching pixels the CMUcam can produce a Confidence Measure calculated by the following formula:

confidence = num_matching_pixels/box_area * 256

The center of an identified blob is also calculated, by summing the x and y coordinates of the matched pixels separately and dividing each sum by the number of matched pixels.

8x8 Bounding Box with a 3x3 Blob

The above illustration represents an 8x8 bounding box with a 3x3 "blob". Here the centroid calculation would look as follows:

Centroid_x = ((2 + 3 + 4) x 3) / 9 = 27 / 9 = 3
Centroid_y = ((2 + 3 + 4) x 3) / 9 = 27 / 9 = 3

So the centroid in the illustration is located at (3,3) and is represented by the dark blue "pixel".
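The tracking pass, confidence formula, and centroid calculation can be tried out with a short Python sketch. It is a simplified model, not the firmware: it works on a whole frame held in memory rather than on streamed rows, it assumes zero-based coordinates and integer arithmetic, and the names track_blob, lo, hi, and box are invented for this example.

```python
from typing import List, Optional, Tuple

Pixel = Tuple[int, int, int]       # (red, green, blue)
Box = Tuple[int, int, int, int]    # (x1, y1, x2, y2), inclusive corners


def track_blob(frame: List[List[Pixel]], lo: Pixel, hi: Pixel,
               box: Box) -> Tuple[int, Optional[Tuple[int, int]]]:
    """Return (confidence, centroid) for pixels inside box whose
    channels all lie between lo and hi (inclusive)."""
    x1, y1, x2, y2 = box
    count = sum_x = sum_y = 0
    for y in range(y1, y2 + 1):
        for x in range(x1, x2 + 1):
            pixel = frame[y][x]
            if all(l <= c <= h for c, l, h in zip(pixel, lo, hi)):
                count += 1
                sum_x += x
                sum_y += y
    box_area = (x2 - x1 + 1) * (y2 - y1 + 1)
    # confidence = num_matching_pixels / box_area * 256, in integer math
    confidence = count * 256 // box_area
    centroid = (sum_x // count, sum_y // count) if count else None
    return confidence, centroid


# The 8x8 bounding box with a 3x3 blob from the text, at columns and rows 2..4.
background, blob = (0, 0, 0), (255, 0, 0)
frame = [[blob if 2 <= x <= 4 and 2 <= y <= 4 else background
          for x in range(8)] for y in range(8)]
result = track_blob(frame, lo=(200, 0, 0), hi=(255, 50, 50), box=(0, 0, 7, 7))
```

For this frame, 9 of the 64 pixels in the box match, so the confidence is 9 / 64 * 256 = 36 and the centroid is (3, 3), in agreement with the worked example.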

Other Features

The CMUcam provides three other features worth noting. The Frame Dump feature allows the CMUcam to stream the video provided by the CMOS camera directly to the host. This feature is limited by the speed of the host serial connection and can therefore only dump one or two columns of data at a time. The following picture illustrates an image that can be captured in this manner.

Frame Dump Image

The CMUcam can also operate in "Line Mode". In this mode the CMUcam can stream, in real time, a line-by-line binary image of passing and failing pixels. Because it is real time, this can occur while the camera is scanning for a blob. The figure below illustrates how the binary Line Mode data can be superimposed on a Frame Dump. (Also note that the centroid is being displayed.)

Line Mode Data Superimposed on a Frame Dump
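One way to picture the binary Line Mode image is as one bit per pixel, set when the pixel passes the color test. The Python sketch below packs a row that way. The packing (8 pixels per byte, most significant bit first) and the function names are assumptions chosen for illustration and are not taken from the CMUcam protocol documentation.

```python
from typing import Sequence, Tuple

Pixel = Tuple[int, int, int]  # (red, green, blue)


def passes(pixel: Pixel, lo: Pixel, hi: Pixel) -> bool:
    """True when every channel lies within the configured min/max bounds."""
    return all(l <= c <= h for c, l, h in zip(pixel, lo, hi))


def line_mode_row(row: Sequence[Pixel], lo: Pixel, hi: Pixel) -> bytes:
    """Pack one row into a bitmap: bit set = passing pixel.

    Assumes 8 pixels per byte with the leftmost pixel in the most
    significant bit; a trailing partial byte is padded with zero bits.
    """
    out = bytearray()
    for start in range(0, len(row), 8):
        byte = 0
        for offset, pixel in enumerate(row[start:start + 8]):
            if passes(pixel, lo, hi):
                byte |= 0x80 >> offset
        out.append(byte)
    return bytes(out)
```

Under this packing an 80-pixel row shrinks to 10 bytes, which suggests why a pass/fail image is far cheaper to send over the serial link than a full Frame Dump.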

A third feature worth noting is the ability to connect multiple CMU micro controller boards to a single CMOS camera. This allows the boards to process the camera data in parallel, and could be used to track multiple colors simultaneously.


The photos and technical description were provided by the CMUcam web page located here: http://www-2.cs.cmu.edu/~cmucam/

Information about the Omnivision CMOS camera can be obtained from here: http://www-2.cs.cmu.edu/~cmucam/Downloads/ov6620DSLF.PDF

Information about the Ubicom SX28 micro controller can be found at this URL: http://www-2.cs.cmu.edu/~cmucam/Downloads/SX-DDS-SX2028AC-16.pdf