Simple Video Codec Tips

1) If you can store a buffer of at least one extra scanline, you could try the
Paeth predictor + RLE. The predictor gives a reasonable estimate of each
pixel's grayscale value, and when the predictions are good the residuals
contain long runs of zeros, which RLE compresses well.
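
A minimal C sketch of that pipeline, assuming 8-bit grayscale and a
caller-managed one-line history buffer (the names and the escape-byte RLE
scheme are mine, nothing standard):

#include <stdlib.h>
#include <stddef.h>

/* Paeth predictor (as used by PNG): pick whichever neighbour -- left (a),
 * above (b) or above-left (c) -- is closest to the gradient guess a+b-c. */
static unsigned char paeth(unsigned char a, unsigned char b, unsigned char c)
{
    int p  = (int)a + (int)b - (int)c;
    int pa = abs(p - a), pb = abs(p - b), pc = abs(p - c);
    if (pa <= pb && pa <= pc) return a;
    if (pb <= pc)             return b;
    return c;
}

/* Predict one scan line against the previous one; residuals are taken
 * modulo 256 so each stays one byte. Pass an all-zero `prev` for line 0. */
static void predict_line(const unsigned char *cur, const unsigned char *prev,
                         unsigned char *resid, size_t width)
{
    for (size_t x = 0; x < width; x++) {
        unsigned char a = x ? cur[x - 1]  : 0;   /* left       */
        unsigned char b = prev[x];               /* above      */
        unsigned char c = x ? prev[x - 1] : 0;   /* above-left */
        resid[x] = (unsigned char)(cur[x] - paeth(a, b, c));
    }
}

/* Zero-run RLE: non-zero residuals pass through untouched; a zero is
 * emitted as 0x00 followed by the run length (1..255). `out` must hold
 * up to 2*n bytes in the worst case. */
static size_t rle_encode(const unsigned char *in, size_t n, unsigned char *out)
{
    size_t i = 0, o = 0;
    while (i < n) {
        if (in[i] != 0) { out[o++] = in[i++]; continue; }
        size_t run = 0;
        while (i < n && in[i] == 0 && run < 255) { i++; run++; }
        out[o++] = 0;
        out[o++] = (unsigned char)run;
    }
    return o;
}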

If you can "afford it" (in other words, the FPGA is fast enough), you could
use arithmetic coding on the resulting predicted values, with a simple order
0 model, instead of RLE.
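
If you go that route, a binary arithmetic coder driven by per-bit adaptive
probabilities is probably the easiest form to reason about in hardware.
Below is one possible C sketch (a classic CACM87-style encoder; the 16-bit
probabilities, the shift-by-5 adaptation rate and the bit-tree decomposition
of each byte are all arbitrary choices of mine, not from any standard):

#include <stdint.h>
#include <stdio.h>

/* 32-bit binary arithmetic encoder with carry handling via pending
 * (underflow) bits -- the classic CACM87 construction. */
typedef struct {
    uint32_t low, high;
    uint32_t pending;      /* opposite bits owed after the next output bit */
    int      bitbuf, nbits;
    FILE    *out;
} ArEnc;

static void put_bit(ArEnc *e, int bit)
{
    e->bitbuf = (e->bitbuf << 1) | bit;
    if (++e->nbits == 8) { fputc(e->bitbuf, e->out); e->bitbuf = 0; e->nbits = 0; }
}

static void out_bit(ArEnc *e, int bit)
{
    put_bit(e, bit);
    for (; e->pending; e->pending--) put_bit(e, !bit);
}

static void ar_init(ArEnc *e, FILE *out)
{
    e->low = 0; e->high = 0xFFFFFFFFu; e->pending = 0;
    e->bitbuf = 0; e->nbits = 0; e->out = out;
}

/* Encode one bit given p1 = P(bit==1) as a 16-bit fixed-point fraction. */
static void ar_bit(ArEnc *e, int bit, uint32_t p1)
{
    uint32_t split = e->low + (uint32_t)(((uint64_t)(e->high - e->low) * p1) >> 16);
    if (bit) e->high = split; else e->low = split + 1;
    for (;;) {                                   /* renormalize */
        if (e->high < 0x80000000u)        out_bit(e, 0);
        else if (e->low >= 0x80000000u) { out_bit(e, 1);
            e->low -= 0x80000000u; e->high -= 0x80000000u; }
        else if (e->low >= 0x40000000u && e->high < 0xC0000000u) {
            e->pending++; e->low -= 0x40000000u; e->high -= 0x40000000u; }
        else break;
        e->low <<= 1; e->high = (e->high << 1) | 1;
    }
}

static void ar_flush(ArEnc *e)
{
    e->pending++;
    out_bit(e, e->low >= 0x40000000u);
    while (e->nbits) put_bit(e, 0);              /* pad the last byte */
}

/* Order-0 byte model: a 256-entry tree of adaptive bit probabilities.
 * Initialize every entry to 1 << 15 (p = 0.5) before use. */
static void ar_byte(ArEnc *e, uint16_t model[256], uint8_t byte)
{
    unsigned ctx = 1;
    for (int i = 7; i >= 0; i--) {
        int bit = (byte >> i) & 1;
        ar_bit(e, bit, model[ctx]);
        if (bit) model[ctx] += (65536 - model[ctx]) >> 5;   /* adapt */
        else     model[ctx] -= model[ctx] >> 5;
        ctx = (ctx << 1) | bit;
    }
}

Feed it the Paeth residuals in place of the RLE stage; the decoder is the
mirror image of the encoder.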

Paeth + RLE will do OK on computer-generated images, but not on natural
images. Paeth + AC will do OK on both.

Both will fit in 1 KB of code for sure.

2) In comp.arch.fpga Melanie Nasic <> wrote:
: I want the compression to be lossless and not based on perceptional
: irrelevancy reductions.

If it has to be lossless, there's no way you can guarantee 2:1
compression (or indeed any compression at all: a simple counting
argument shows no lossless code can shrink every possible input). You
may achieve it with certain kinds of input, but it all comes down to
the statistics of the data. The smaller your storage, the less you
can benefit from statistical variation across the image, and 1 Kbyte
is very small!

Given that a lossless system is inevitably variable bit rate (VBR),
the concept of "real-time capability" is somewhat vague; the latency
is bound to be variable. In real-world applications the output bit
rate is often constrained, so a guaranteed minimum degree of
compression must be achieved; such systems cannot always be lossless.

From my experience I would say you will need at least a 4-line
buffer to get near to 2:1 compression on a wide range of input
material. For a constant-bit-rate (CBR) system based on a 4x4
integer transform see:

http://www.bbc.co.uk/rd/pubs/whp/whp119.shtml

This is designed for ease of hardware implementation rather than
ultimate performance, and is necessarily lossy.
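
For a flavour of what that style of transform looks like in
hardware-friendly code, here's the forward 4x4 integer transform from
H.264, which is in the same family -- I haven't checked whether WHP 119
uses these exact coefficients, so treat it purely as an illustration:

#include <stdint.h>

/* Forward 4x4 integer transform (H.264 style): Y = C * X * C^T with
 * C = [ 1 1 1 1 ; 2 1 -1 -2 ; 1 -1 -1 1 ; 1 -2 2 -1 ],
 * implemented with adds, subtracts and shifts only. */
static void fwd4x4(const int16_t in[4][4], int16_t out[4][4])
{
    int32_t tmp[4][4];
    for (int i = 0; i < 4; i++) {            /* transform rows */
        int32_t s03 = in[i][0] + in[i][3], d03 = in[i][0] - in[i][3];
        int32_t s12 = in[i][1] + in[i][2], d12 = in[i][1] - in[i][2];
        tmp[i][0] = s03 + s12;
        tmp[i][1] = (d03 << 1) + d12;
        tmp[i][2] = s03 - s12;
        tmp[i][3] = d03 - (d12 << 1);
    }
    for (int j = 0; j < 4; j++) {            /* transform columns */
        int32_t s03 = tmp[0][j] + tmp[3][j], d03 = tmp[0][j] - tmp[3][j];
        int32_t s12 = tmp[1][j] + tmp[2][j], d12 = tmp[1][j] - tmp[2][j];
        out[0][j] = (int16_t)(s03 + s12);
        out[1][j] = (int16_t)((d03 << 1) + d12);
        out[2][j] = (int16_t)(s03 - s12);
        out[3][j] = (int16_t)(d03 - (d12 << 1));
    }
}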

3) JPEG supports a lossless encoding mode that can fit (at least roughly)
within the constraints you've imposed. It uses linear prediction of the
current pixel based on one or more previous pixels, and encodes the
difference between the prediction and the actual value. The difference
is encoded in two parts: the number of bits needed for the difference,
and the difference bits themselves. The bit count is Huffman-coded, but
the remainder is not.
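
As a sketch, the residual-to-codeword split looks like this in C (per
the T.81 convention of storing a negative difference as diff-1 in its
low SSSS bits):

/* JPEG-style coding of a prediction residual: the magnitude category
 * SSSS (the bit count, which would go through the Huffman table) plus
 * SSSS raw extra bits. */
static void jpeg_category(int diff, int *ssss, unsigned *extra)
{
    unsigned mag = (unsigned)(diff < 0 ? -diff : diff);
    int n = 0;
    while (mag >> n) n++;                       /* bits needed for |diff| */
    *ssss = n;
    *extra = (unsigned)(diff < 0 ? diff - 1 : diff) & ((1u << n) - 1);
}

For example, +5 yields SSSS = 3 with extra bits 101, while -5 yields
SSSS = 3 with extra bits 010; only the "3" is Huffman-coded.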

This has a number of advantages. First and foremost, it can be done
based on only the current scan line or (depending on the predictor you
choose) only one scan line plus one pixel. In the latter case, you need
to (minutely) modify the model you've outlined though -- instead of
reading, compressing, and discarding an entire scan line, then starting
the next, you always retain one scan line's worth of data. As you process
pixel X of scan line Y, you're storing pixels 0 through X-1 of the
current scan line plus pixels X-1 through N (= the line width) of the
previous scan line.

Another nice point is that the math involved is always simple -- the
most complex case is one addition, one subtraction and a one-bit right
shift.
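
Concretely, the standard lossless-JPEG predictor set is (a = left,
b = above, c = above-left):

/* Lossless-JPEG predictors, per ITU-T T.81. Selections 5 and 6 are the
 * worst case mentioned above: one add, one subtract, one shift.
 * (>> 1 on a negative value assumes arithmetic shift, which holds on
 * essentially every platform.) */
static int jpeg_predict(int sel, int a, int b, int c)
{
    switch (sel) {
    case 1: return a;
    case 2: return b;
    case 3: return c;
    case 4: return a + b - c;
    case 5: return a + ((b - c) >> 1);
    case 6: return b + ((a - c) >> 1);
    case 7: return (a + b) >> 1;
    default: return 0;               /* 0 = no prediction */
    }
}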

4) Though it's only rarely used, there's a lossless version of JPEG
encoding. It's almost completely different from normal JPEG encoding.
This can be done within your constraints, but would be improved if you
can relax them minutely. Instead of only ever using the current scan
line, you can improve things if you're willing to place the limit at
only ever storing one scan line. The difference is that when you're in
the middle of a scan line (for example) you're storing the second half
of the previous scan line, and the first half of the current scan line,
rather than having half of the buffer sitting empty. If you're storing
the data in normal RAM, this makes little real difference -- the data
from the previous scan line will remain in memory until you overwrite
it, so it's only really a question of whether you use it or ignore it.
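
In code, the bookkeeping amounts to one reused line buffer plus a single
carried pixel (names mine; `emit` stands in for whatever entropy-coding
stage follows):

#include <stddef.h>

/* Code one scan line against a single reused line buffer. Slot x still
 * holds the previous line's pixel until we overwrite it, and `carry`
 * preserves the previous line's pixel x-1 (the above-left neighbour)
 * after slot x-1 has been overwritten -- the "one line plus one pixel". */
static void code_line(unsigned char *buf, const unsigned char *cur,
                      size_t width, void (*emit)(int residual))
{
    unsigned char carry = 0;
    for (size_t x = 0; x < width; x++) {
        int a = x ? cur[x - 1] : 0;      /* left                        */
        int b = buf[x];                  /* above: not yet overwritten  */
        int c = x ? carry : 0;           /* above-left, from `carry`    */
        emit((int)cur[x] - (a + b - c)); /* predictor 4 as an example   */
        carry = buf[x];                  /* save before overwriting     */
        buf[x] = cur[x];                 /* slot now holds current line */
    }
}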

Yes. There's also JPEG-LS (a separate, later standard, based on HP's
LOCO-I algorithm), which is another lossless encoder. A full-blown
JPEG-LS encoder needs to store roughly two full scan lines if memory
serves, which is outside your constraints. Nonetheless, if you're not
worried about following the standard, you could create more or less a
hybrid between lossless JPEG and JPEG-LS that would incorporate some
advantages of the latter without the increased storage requirements.
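
The most obvious thing to borrow is JPEG-LS's median edge detector
(MED) predictor, which needs exactly the same three neighbours as the
lossless-JPEG predictors and therefore the same one-line buffer:

/* JPEG-LS MED predictor (a = left, b = above, c = above-left). */
static int med_predict(int a, int b, int c)
{
    int mn = a < b ? a : b;
    int mx = a < b ? b : a;
    if (c >= mx) return mn;       /* edge above: take the smaller      */
    if (c <= mn) return mx;       /* edge to the left: take the larger */
    return a + b - c;             /* smooth area: planar guess         */
}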

I suspect you could improve the prediction a bit as well. In essence,
you're creating a (rather crude) low-pass filter by averaging a number
of pixels together. That's equivalent to an FIR filter with all the
coefficients set to one. I haven't put it to the test, but I'd guess
that by turning it into a full-blown FIR with carefully selected
coefficients (and possibly using more of the data you already have in
the buffer) you could probably improve the predictions. Better
predictions mean smaller errors, and tighter compression.
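
As a purely hypothetical sketch (the coefficients below are placeholders
I made up, not tuned values), keeping the weights in eighths keeps the
divide a shift; note above-right is still sitting in the line buffer:

/* Weighted (FIR) predictor over the causal neighbourhood:
 * 3/8*left + 3/8*above + 1/8*above-left + 1/8*above-right, rounded. */
static int fir_predict(int left, int above, int above_left, int above_right)
{
    return (3 * left + 3 * above + above_left + above_right + 4) >> 3;
}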


