How Image Compression Works

What's up, everybody! Today we're talking about how digital images are represented, compressed, and stored on your devices. Okay, let's get started.

A typical image is represented as a matrix. The values of this matrix correspond to pixel intensities: a larger number means a brighter pixel, and a smaller number means a darker pixel. Color images have a separate channel for each color component, such as red, green, and blue.

Although this is probably the most common way to represent an image, it's not how images are typically stored on a disk. Why not? Let's take a look at what happens when we do. Let's say we have a 12-megapixel color picture, which means we have 12 million values to store for each color channel, leading to a total of 36 million values. If we assume that these values are stored as 8-bit, or single-byte, integers, we should end up with a 36-megabyte file.

I have a 12-megapixel image here. Let's see how big it is. Wait, what? That's not even two megabytes. How is this possible? The answer is image compression. In this case, it's JPEG image compression. You've probably seen the extension JPEG at the end of your image file names.
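To put numbers on that, here's a quick sketch of the arithmetic. The function name and the 2 MB figure are assumptions for illustration; the transcript only says the file is "not even two megabytes".

```python
def raw_size_bytes(megapixels, channels=3, bytes_per_value=1):
    """Uncompressed size: one value per pixel per channel."""
    pixels = int(megapixels * 1_000_000)
    return pixels * channels * bytes_per_value

raw = raw_size_bytes(12)              # 12-megapixel RGB image, 8 bits per value
print(raw / 1_000_000, "MB raw")      # 36.0 MB raw

# Assumed JPEG size for illustration only (an upper bound from the example).
jpeg_bytes = 2_000_000
print("compression ratio of at least", raw / jpeg_bytes)   # 18.0
```

So in this example the JPEG file is at least 18 times smaller than the raw pixel data.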

JPEG is not the only compressed image format, but it's probably the most common one. JPEG is a lossy compression format, meaning that some information in the original image is actually thrown out. The more information you discard, the worse the image quality gets, so there's a trade-off between image quality and file size. But JPEG makes a profitable trade, reducing the file size while preserving the perceived image quality, because the thrown-out parts are designed to be the parts that we wouldn't notice easily. Let's see how this is possible.

The first step is color space conversion. Instead of representing an image with its red, green, and blue color component intensities, the pixel values are converted into a color space where one channel represents the light intensity and the other two channels represent the colors. This is a linear transform that can be expressed as a matrix multiplication. This conversion separates the luminance from the chrominance components. Since our visual system is much more sensitive to changes in brightness than to changes in color, we can safely downsample the chroma components to save some space. This strategy is called chroma subsampling, and it's used in many image and video processing pipelines.

Another characteristic of the human visual system that we can take advantage of is the frequency-dependent contrast sensitivity.
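Before moving on to contrast sensitivity, here's a minimal sketch of the color conversion and chroma subsampling steps. It assumes the full-range YCbCr coefficients used by the JFIF file format, with an offset of 128 added to the two chroma channels, and it subsamples by averaging 2x2 blocks. The function names are made up for this example, and real encoders may filter or pad differently.

```python
# JFIF-style RGB -> YCbCr coefficients for 8-bit, full-range values.
M = [
    [0.299, 0.587, 0.114],            # Y  (luminance)
    [-0.168736, -0.331264, 0.5],      # Cb (blue-difference chroma)
    [0.5, -0.418688, -0.081312],      # Cr (red-difference chroma)
]
OFFSET = [0.0, 128.0, 128.0]          # centers the chroma channels at 128

def rgb_to_ycbcr(rgb):
    """Matrix-vector product plus offset for one (R, G, B) pixel."""
    return tuple(
        sum(m * c for m, c in zip(row, rgb)) + off
        for row, off in zip(M, OFFSET)
    )

def subsample_2x2(channel):
    """Average each 2x2 block; assumes even height and width."""
    h, w = len(channel), len(channel[0])
    return [
        [
            (channel[y][x] + channel[y][x + 1]
             + channel[y + 1][x] + channel[y + 1][x + 1]) / 4
            for x in range(0, w, 2)
        ]
        for y in range(0, h, 2)
    ]

# A gray pixel has no color, so both chroma values sit at about 128.
print(rgb_to_ycbcr((128, 128, 128)))

# A 4x4 chroma channel shrinks to 2x2: a quarter of the values.
cb = [[100, 102, 110, 112],
      [104, 106, 114, 116],
      [120, 122, 130, 132],
      [124, 126, 134, 136]]
print(subsample_2x2(cb))
```

If both chroma channels are halved in each direction like this, the image goes from 3 values per pixel down to 1.5 on average, before any of the later steps run.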

What frequency-dependent contrast sensitivity means is that it's easier to miss small objects or fine details in a picture than large ones, which is kind of obvious. In this figure, the spatial frequency of the bars increases from left to right, and the contrast decreases from bottom to top.

This may vary from person to person, but as you can see, the bars under the curve are more visible than the rest. This is because our visual system is more sensitive to brightness variations in this range of spatial frequencies. Look at the bars in the low-contrast, high-frequency part of the figure: they are barely visible. This phenomenon gives us some room for compression in those less visible frequencies. JPEG compression does that by dividing the image into 8x8 blocks and quantizing them in a frequency domain representation. This is done by comparing each one of these 8x8 blocks with 64 frequency patterns, where the spatial frequency increases from left to right and top to bottom.
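Comparing a block with those 64 patterns is a two-dimensional discrete cosine transform (DCT). Here's a deliberately slow, pure-Python sketch of the textbook 8x8 DCT-II formula, just to show the idea. The function name is made up for this example, and real encoders use fast factorized versions and typically subtract 128 from the pixel values first.

```python
import math

N = 8

def dct_2d(block):
    """Naive 8x8 DCT-II: one output coefficient per frequency pattern."""
    def c(k):
        # Normalization factor for the zero-frequency (DC) term.
        return 1 / math.sqrt(2) if k == 0 else 1.0

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):          # vertical frequency index
        for v in range(N):      # horizontal frequency index
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = 0.25 * c(u) * c(v) * s
    return out

# A perfectly flat block contains only the zero-frequency pattern.
flat = [[10.0] * N for _ in range(N)]
coeffs = dct_2d(flat)
print(round(coeffs[0][0], 6))   # 80.0
```

For the flat block, every coefficient other than the top-left one comes out as zero up to floating-point error, because a constant block contains no variation at any frequency.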

This comparison decomposes the image into its frequency components, converting an 8x8 block where each element represents a brightness level into another 8x8 block where each element represents the presence of a particular frequency component. This method is called the discrete cosine transform.

In this representation, we can easily compress the frequencies that are less visible to us by dividing these frequency components by some constants and then quantizing them. The frequency components that we are less sensitive to get divided by larger constants than the ones that we are more sensitive to. Quantization in this context simply means rounding the results to the nearest integers. Using larger divisors leads to more numbers being rounded down to zero. This results in higher compression rates, but it also lowers the image quality.

After quantization, we end up with a lot of zeros in the high frequencies. We can store this information more efficiently by rearranging the elements. If you rearrange these coefficients in a zigzag order from top left to bottom right, we can group these zeros together. Once we have the zeros together, instead of storing each one of them separately, we can store their value and the number of times they consecutively occur in tuples. This technique is called run-length encoding, and it's used in many other algorithms as well.

Finally, we can further compress what's left by encoding the more frequent values with fewer bits and the less frequent values with more bits. Doing so reduces the average number of bits per symbol. This process is called entropy coding, and Huffman coding is a common way to do it. Both run-length encoding and Huffman coding are lossless compression methods. No information is thrown out in these steps.
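Here's a rough sketch of those lossless stages on a made-up quantized block: the zigzag scan, run-length encoding as (value, count) tuples, and Huffman code lengths. The block contents and function names are assumptions for illustration. Real JPEG files use a more specialized run-length scheme and predefined or optimized Huffman tables, so treat this as the general idea rather than the exact bitstream.

```python
import heapq
from collections import Counter
from itertools import groupby

def zigzag_indices(n=8):
    """(row, col) pairs along anti-diagonals, alternating direction."""
    return sorted(
        ((r, c) for r in range(n) for c in range(n)),
        key=lambda rc: (rc[0] + rc[1],
                        rc[0] if (rc[0] + rc[1]) % 2 else -rc[0]),
    )

def run_length_encode(values):
    """Collapse consecutive repeats into (value, count) tuples."""
    return [(v, len(list(g))) for v, g in groupby(values)]

def huffman_code_lengths(symbols):
    """Return {symbol: code length in bits} for a Huffman code."""
    freq = Counter(symbols)
    if len(freq) == 1:
        return {next(iter(freq)): 1}
    # Heap entries: (weight, unique tie-breaker, {symbol: depth so far}).
    heap = [(w, i, {s: 0}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes all of their symbols one level deeper.
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, counter, merged))
        counter += 1
    return heap[0][2]

# Hypothetical quantized block: a few low-frequency values, zeros elsewhere.
block = [[0] * 8 for _ in range(8)]
block[0][0], block[0][1], block[1][0], block[1][1] = 10, -3, 1, 1

scanned = [block[r][c] for r, c in zigzag_indices()]
print(run_length_encode(scanned))
# [(10, 1), (-3, 1), (1, 1), (0, 1), (1, 1), (0, 59)]

lengths = huffman_code_lengths(scanned)
total_bits = sum(lengths[s] for s in scanned)
print(lengths, total_bits)   # zeros get a 1-bit code; 70 bits in total
```

In this toy example the 64 coefficients fit in 70 bits with the Huffman code, compared with 512 bits if each one were stored as a full byte, and nothing is lost along the way.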

The compression is achieved solely by storing redundant data more efficiently. This type of compression is used to compress, transmit, and store many types of data, including images, audio, and documents.

When it's time to decode an image, all these steps are reversed. Since some information is lost during the subsampling and quantization steps, the decoded image won't be identical to the original one.
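To see why the quantization step can't be undone exactly, here's a small round-trip sketch. The quantization table below is made up for illustration (divisors simply grow with frequency) and is not the table from the JPEG standard. The coefficient values are also hypothetical.

```python
def quantize(coeffs, table):
    """Divide each coefficient by its divisor and round to an integer."""
    return [[round(c / q) for c, q in zip(crow, qrow)]
            for crow, qrow in zip(coeffs, table)]

def dequantize(levels, table):
    """Decoder side: multiply back. The rounding error is gone for good."""
    return [[lv * q for lv, q in zip(lrow, qrow)]
            for lrow, qrow in zip(levels, table)]

# Hypothetical table: larger divisors for higher frequencies.
table = [[8 + 4 * (u + v) for v in range(8)] for u in range(8)]

# Hypothetical DCT coefficients for one block.
coeffs = [[0.0] * 8 for _ in range(8)]
coeffs[0][0], coeffs[0][1], coeffs[1][0], coeffs[7][7] = 83.0, -31.0, 14.0, 5.0

levels = quantize(coeffs, table)
restored = dequantize(levels, table)

print(coeffs[0][0], "->", levels[0][0], "->", restored[0][0])   # 83.0 -> 10 -> 80
print(coeffs[7][7], "->", levels[7][7], "->", restored[7][7])   # 5.0 -> 0 -> 0
```

The low-frequency value comes back slightly off, and the small high-frequency value disappears entirely because its divisor is large. That's the information the decoder can never recover.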

However, the compressed images should look almost as good as the original ones when a reasonable compression rate is used. Compression artifacts become more visible as the compression rate increases. It's hard to show an uncompressed image here and compare it to a compressed one because the video you're watching now is compressed as well.

One type of image where JPEG particularly falls short is synthetic images, such as web graphics. Sharp edges are not common in natural images, but they are in synthetic images. The high-frequency components that make up a strong edge in a synthetic image get compressed harshly, leading to visible compression artifacts near the edges. So what to do with synthetic images, then? Use another image format, such as PNG or WebP, or better yet, use vector graphics when possible. Vector graphics are stored as mathematical equations rather than pixel values. They are lossless and can be scaled to any size without losing quality. Vector graphics are not feasible for pictures, but they're perfect for graphics like logos, illustrations, and diagrams.

Alright, that's all for today. I hope you liked it. If you have any comments or questions, let me know in the comment section below. Subscribe for more videos. And as always, thanks for watching, and see you next time.

Dated : 17-Mar-2022
