At the International Solid-State Circuits Conference this week, MIT researchers unveiled their own Quad HD video chip design.
Quad HD is also known as 4K and ultrahigh-definition (UHD). The new Quad HD video standard enables a fourfold increase in the resolution of TV screens.
At the Consumer Electronics Show (CES) in January, several manufacturers debuted new UHD models.
There is no UHD content yet, but the Japanese government plans to launch the world's first 4K TV broadcast in July 2014, initially from communications satellites and later via satellite broadcasting and terrestrial digital broadcasting, NBC News Gadget Box has reported.
Nonetheless, 4K TVs are already on sale from Japanese makers including Sony, Panasonic and Sharp, as well as from other manufacturers such as South Korea's LG Electronics.
HEVC
A key to efficient video compression is predicting future video frames on the basis of past ones. This diagram concerns "intra angular prediction."
(Credit: ISO)
Although the MIT chip isn't intended for commercial release, its developers believe that the challenge of implementing HEVC algorithms in silicon helps illustrate design principles that could be broadly useful.
Moreover, "because now we have the chip with us, it is now possible for us to figure out ways in which different types of video data actually interact with hardware," says Mehul Tikekar, an MIT graduate student in electrical engineering and computer science and one of the paper's co-authors.
How HEVC works
Like older coding standards, the HEVC standard exploits the fact that in successive frames of video, most of the pixels stay the same. Rather than transmitting entire frames, it's usually enough for broadcasters to transmit just the moving pixels, saving a great deal of bandwidth. The first step in the encoding process is thus to calculate "motion vectors" — mathematical descriptions of the motion of objects in the frame.
On the receiving end, however, that description will not yield a perfectly faithful image, as the orientation of a moving object and the way it's illuminated can change as it moves. So the next step is to add a little extra information to correct motion estimates that are based solely on the vectors. Finally, to save even more bandwidth, the motion vectors and the corrective information are run through a standard data-compression algorithm, and the results are sent to the receiver.
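The predict-then-correct idea can be sketched in a few lines of Python. This is a toy illustration rather than the HEVC algorithm: it assumes one motion vector for the whole frame, grayscale frames stored as lists of rows, and wrap-around at the frame edges. The function names are hypothetical.

```python
def shift_frame(frame, dx, dy):
    """Predict a frame by moving every pixel by (dx, dy), wrapping at the edges."""
    h, w = len(frame), len(frame[0])
    return [[frame[(y - dy) % h][(x - dx) % w] for x in range(w)]
            for y in range(h)]

def encode_residual(prev, cur, mv):
    """Encoder side: corrective data = actual frame minus the prediction."""
    pred = shift_frame(prev, mv[0], mv[1])
    return [[c - p for c, p in zip(cur_row, pred_row)]
            for cur_row, pred_row in zip(cur, pred)]

def decode_frame(prev, mv, residual):
    """Decoder side: rebuild the frame from the prediction plus the correction."""
    pred = shift_frame(prev, mv[0], mv[1])
    return [[p + r for p, r in zip(pred_row, res_row)]
            for pred_row, res_row in zip(pred, residual)]

# A bright pixel moves one step to the right and gets slightly brighter.
prev = [[0, 0, 0, 0],
        [0, 9, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 0]]
cur = [[0, 0, 0, 0],
       [0, 0, 10, 0],
       [0, 0, 0, 0],
       [0, 0, 0, 0]]
mv = (1, 0)  # assumed motion vector: one pixel to the right

residual = encode_residual(prev, cur, mv)
rebuilt = decode_frame(prev, mv, residual)
```

Here the residual is zero everywhere except at the one pixel whose brightness changed, which is why sending a vector plus a small correction is cheaper than sending the whole frame.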
The chip's first trick for increasing efficiency is to "pipeline" the decoding process: a chunk of data is decompressed and passed to a motion-compensation circuit, but as soon as the motion compensation begins, the decompression circuit takes in the next chunk of data. After motion compensation is complete, the data passes to a circuit that applies the corrective data and, finally, to a filtering circuit that smooths out whatever rough edges remain.
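The benefit of pipelining can be illustrated with a small scheduling sketch. It assumes, for simplicity, that each of the four stages named above takes exactly one time step per chunk, which the article does not claim; the stage labels and function names are hypothetical.

```python
STAGES = ["decompress", "motion_compensation", "apply_correction", "filter"]

def pipeline_schedule(num_chunks, stages=STAGES):
    """Map each time step to the (stage, chunk) pairs active during it."""
    schedule = {}
    for chunk in range(num_chunks):
        for offset, stage in enumerate(stages):
            # Chunk i enters stage s at time i + s in an ideal pipeline.
            schedule.setdefault(chunk + offset, []).append((stage, chunk))
    return schedule

def total_steps(num_chunks, num_stages=len(STAGES), pipelined=True):
    """Time steps needed to push num_chunks through all stages."""
    if num_chunks == 0:
        return 0
    if pipelined:
        return num_chunks + num_stages - 1
    return num_chunks * num_stages

schedule = pipeline_schedule(8)
for step in sorted(schedule):
    print(step, schedule[step])
```

Under these assumptions, eight chunks finish in 11 steps with pipelining versus 32 without, because every stage is kept busy once the pipeline has filled.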
Fine-tuning
Pipelining is fairly standard in most video chips, but the MIT researchers developed a couple of other tricks to further improve efficiency. The application of the corrective data, for instance, is a single calculation known as matrix multiplication. A matrix is just a big grid of numbers; in matrix multiplication, numbers in the rows of one matrix are multiplied by numbers in the columns of another, and the results are added together to produce entries in a new matrix.
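As a rough sketch, the row-by-column rule can be written directly in Python. The second half of the example builds a 32-by-32 cosine matrix of the kind used by DCT-style video transforms and counts how many distinct magnitudes it contains. That matrix is an assumed stand-in: the article does not name the transform, and HEVC's actual integer coefficients are not reproduced here.

```python
import math

def matmul(a, b):
    """Multiply rows of a by columns of b and sum the products."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols]
            for row in a]

def cosine_basis(n=32):
    """Unnormalized DCT-II style matrix: entry [k][i] = cos((2i+1) k pi / (2n))."""
    return [[math.cos((2 * i + 1) * k * math.pi / (2 * n)) for i in range(n)]
            for k in range(n)]

def count_distinct_magnitudes(matrix, tol=1e-9):
    """Count absolute values that differ by more than tol (ignores sign)."""
    values = sorted(abs(v) for row in matrix for v in row)
    if not values:
        return 0
    count = 1
    for lo, hi in zip(values, values[1:]):
        if hi - lo > tol:
            count += 1
    return count

product = matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
basis = cosine_basis(32)
distinct = count_distinct_magnitudes(basis)
print(product, len(basis) * len(basis[0]), distinct)
```

For this stand-in matrix, the 1,024 entries collapse to a few dozen distinct magnitudes once sign is ignored, which is the kind of repetition a hardware multiplier can share.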
"We observed that the matrix has some patterns in it," Tikekar explains. In the new standard, a 32-by-32 matrix, representing a 32-by-32 block of pixels, is multiplied by another 32-by-32 matrix, containing corrective information. In principle, the corrective matrix could contain 1,024 different values. But the MIT researchers observed that, in practice, "there are only 32 unique numbers," Tikekar says. "So we can efficiently implement one of these [multiplications] and then use the same hardware to do the rest."
Similarly, Chiraag Juvekar, another graduate student in Chandrakasan's group, developed a more efficient way to store video data in memory. The "naive way," he explains, would be to store the values of each row of pixels at successive memory addresses. In that scheme, the values of pixels that are next to each other in a row would also be adjacent in memory, but the value of the pixels below them would be far away.
In video decoding, however, "it is highly likely that if you need the pixel on top, you also need the pixel right below it," Juvekar says. "So we optimize the data into small square blocks that are stored together. When you access something from memory, you not only get the pixels on the right and left, but you also get the pixels on the top and bottom in the same request."
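The difference between the two layouts can be sketched as two address functions. The tile size, frame width and burst length below are illustrative assumptions, not figures from the MIT chip, and the function names are hypothetical.

```python
def raster_address(x, y, width):
    """Naive layout: each row of pixels is stored at successive addresses."""
    return y * width + x

def tiled_address(x, y, width, tile=4):
    """Tiled layout: each tile x tile square of pixels is stored contiguously.

    Assumes width is a multiple of tile.
    """
    tiles_per_row = width // tile
    tile_index = (y // tile) * tiles_per_row + (x // tile)
    offset_in_tile = (y % tile) * tile + (x % tile)
    return tile_index * tile * tile + offset_in_tile

def bursts_touched(address_fn, x0, y0, size, width, burst=16):
    """Count the distinct burst-sized memory lines needed for a size x size block."""
    return len({address_fn(x, y, width) // burst
                for y in range(y0, y0 + size)
                for x in range(x0, x0 + size)})

WIDTH = 64  # assumed frame width in pixels
raster_requests = bursts_touched(raster_address, 8, 4, 4, WIDTH)
tiled_requests = bursts_touched(tiled_address, 8, 4, 4, WIDTH)
print(raster_requests, tiled_requests)
```

With these assumed sizes, an aligned 4-by-4 block costs four memory requests in the row-by-row layout and a single request in the tiled layout, since the pixels above and below arrive together with those to the left and right.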
Chandrakasan's group specializes in low-power devices, and in ongoing work, the researchers are trying to reduce the power consumption of the chip even further, to prolong the battery life of Quad HD cell phones or tablet computers.
One design modification they plan to investigate, Tikekar says, is the use of several smaller decoding pipelines that work in parallel. Reducing the computational demands on each group of circuits would also reduce the chip's operating voltage.
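Why parallel pipelines can save power is easiest to see with the textbook first-order model of CMOS dynamic power, P ≈ C·V²·f. The article does not state this model, and the capacitance, voltage and frequency values below are made up purely for illustration.

```python
def dynamic_power(capacitance, voltage, frequency):
    """First-order dynamic power model: P = C * V^2 * f."""
    return capacitance * voltage ** 2 * frequency

def parallel_power(capacitance, scaled_voltage, frequency, lanes):
    """Total power of `lanes` copies, each clocked at frequency / lanes.

    Each copy is assumed to have the same capacitance as the original
    circuit and to run at the lower scaled_voltage.
    """
    return lanes * dynamic_power(capacitance, scaled_voltage, frequency / lanes)

# Normalized, assumed values: one pipeline at 1.0 V versus two at 0.7 V.
baseline = dynamic_power(1.0, 1.0, 1.0)
two_lanes = parallel_power(1.0, 0.7, 1.0, lanes=2)
ratio = two_lanes / baseline
print(round(ratio, 2))
```

In this model the total throughput is unchanged, but because each pipeline runs more slowly it can tolerate a lower voltage, and power falls with the square of that voltage.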