Cloudinary Blog

One pixel is worth three thousand words

By Jul 20, 2016

How various image formats compress one-pixel images

A couple of months ago while taking a break from implementing cool new features like q_auto and g_auto, I was joking in our team chat about how well various image formats “compress” one-pixel images. In response, Orly — who runs the blog — asked me if I’d write a post about single-pixel images. I said: "Sure, why not. But it will be a very short blog post. After all, there’s not much you can say about a single pixel."

Looks like I was wrong. Very wrong.

What can you do with one pixel?

Back in the early days of the web, one-pixel images were widely used as a poor man’s solution to do things we now do with CSS. Spacing, creating lines or rectangles, semi-transparent backgrounds: there’s quite a lot you can do by simply scaling one pixel to arbitrary dimensions. Another use of one-pixel images, still a common practice today, is as a web beacon, for tracking or analytics.

In responsive web design, one-pixel images are often used as temporary placeholders while the page is loading. Since most browsers do not support client-hints, some responsive image solutions wait for the page to fully load in order to determine the actual rendered image sizes, and then replace a one-pixel image with the right breakpoint image using JavaScript.

Broken image example

There is one other use of single-pixel images: they can be used as ‘default’ images. If for whatever reason the actual image that you want to show cannot be found, it might in some cases be better to hide that fact (by showing one transparent pixel) than to return a “404 - Not Found” error, which will usually be rendered by browsers as a “broken image” icon. In both cases, you don’t get to see the intended image, but it might look a bit more professional if you don’t ‘rub it in’ by showing a broken image icon.

OK, it looks like one-pixel images do have some uses. So, what’s the best way to encode a 1x1 image?

Obviously, this is a fringe case for image compression formats. If the “image” only consists of a single pixel, there sure is not a lot of data to compress. In fact, the uncompressed data is just one bit to four bytes – depending on how you interpret the data: black & white (1 bit), grayscale (1 byte), grayscale + alpha (2 bytes), RGB (3 bytes), or RGBA (4 bytes).

But you can’t encode just the data. In any image format, you need to specify how to interpret the data. At the very least, you need to know the width and height of the image, and the number of bits or bytes per pixel.

Headers

Typically, to encode the width and height, four bytes are used: two bytes per number (if it were only one byte, the maximum image dimension would be 255x255). Let’s say that we need another byte to encode the color type of the image (e.g. grayscale, RGB or RGBA). In this minimalistic image format, a single-pixel image would take at least 6 bytes (e.g. for a white pixel) and at most 9 bytes (for a semi-transparent, arbitrary color pixel).

However, actual image formats tend to have a “header” that contains quite a bit more information. First of all, the first few bytes of any image format contain a fixed identifier that is only there to say “Hey! I’m a file in this particular file format!”. This fixed sequence of bytes is also known as the magic number. For example, a GIF file always starts with either GIF87a or GIF89a (depending on which version of the GIF spec is used), a PNG file always starts with an 8-byte sequence that includes PNG, JPEG files have a header that contains the string JFIF or Exif, and so on.

Headers can contain all sorts of meta-information about an image. Some of it is format-specific information to indicate what kind of subformat is used, and is necessary to decode the pixels correctly. Some of it might not be necessary to decode the pixels, but is still useful to know how to render them – e.g. color profiles, orientation, gamma, or dots-per-pixel. Some of it might be arbitrary metadata, like comments, timestamps, copyright notices, or GPS coordinates. These things might be optional, or they might be obligatory; it depends on the format specification. Of course all of this metadata has some cost in terms of file size. So let’s focus on “minimal” files, where all of the non-obligatory metadata has been stripped. Otherwise we might be wasting precious bytes on silly things.

Besides headers, image formats may have other kinds of “overhead”. They may contain all kinds of markers and checksums, intended to make the format more robust in case of transmission errors or other forms of corruption. Also, sometimes some kind of padding is required, to ensure that the data gets aligned properly.

One-pixel images – the smallest possible images – reveal exactly how much “overhead” there is in an image format. Let’s take a look.

Here is a hexdump of a 67-byte PNG file, representing a 1x1 white pixel:

00000000  89 50 4e 47 0d 0a 1a 0a  00 00 00 0d 49 48 44 52  |.PNG........IHDR|
00000010  00 00 00 01 00 00 00 01  01 00 00 00 00 37 6e f9  |.............7n.|
00000020  24 00 00 00 0a 49 44 41  54 78 01 63 68 00 00 00  |$....IDATx.ch...|
00000030  82 00 81 4c 17 d7 df 00  00 00 00 49 45 4e 44 ae  |...L.......IEND.|
00000040  42 60 82                                          |B`.|

This file consists of the 8-byte PNG magic number, followed by a header chunk (IHDR) which contains 13 bytes, an image data chunk (IDAT) with 10 bytes of “compressed” image data, and an end marker (IEND). Every chunk starts with a 4-byte chunk length and a 4-byte chunk identifier and ends with a 4-byte chunk checksum, and these three chunks are obligatory, so that’s another 36 bytes, for a total file size of 67 bytes.

A black pixel is also 67 bytes in PNG; a fully transparent pixel is 68 bytes, and an arbitrary RGBA color will be between 67 and 70 bytes.

JPEG has a longer header. The smallest one-pixel JPEG is 160 bytes (Update: 141 bytes). And it cannot be transparent, because JPEG does not support an alpha channel.

GIF is the most compact (in terms of headers) amongst the three universally supported image formats. A white pixel can be encoded as a valid GIF file in just 35 bytes:

00000000  47 49 46 38 37 61 01 00  01 00 80 01 00 00 00 00  |GIF87a..........|
00000010  ff ff ff 2c 00 00 00 00  01 00 01 00 00 02 02 4c  |...,...........L|
00000020  01 00 3b                                          |..;|

and a fully transparent pixel can be done in 43 bytes:

00000000  47 49 46 38 39 61 01 00  01 00 80 01 00 00 00 00  |GIF89a..........|
00000010  ff ff ff 21 f9 04 01 0a  00 01 00 2c 00 00 00 00  |...!.......,....|
00000020  01 00 01 00 00 02 02 4c  01 00 3b                 |.......L..;|

Note that for all of the above formats, you can come up with even smaller files that will still decode to a one-pixel image in all or most browsers, but they are not valid with respect to the format specifications, which means that an image decoder might at any time complain (rightfully) that the file is corrupt, and show the broken image icon which we were trying to avoid.

So what’s the best format for a one-pixel image on the web? That depends. If it’s an opaque pixel, then the answer is GIF. If it’s a fully transparent pixel, then the answer is also GIF. But if it’s a semi-transparent pixel, then the answer is PNG, since GIF only supports all-or-nothing transparency.

Not that all of this matters very much. All of these files fit easily in a single network package, so in practice, there is no real speed difference – and the storage needed for this is negligible anyway. But still, it’s an amusing thing to look at, at least for image format geeks like me.

What about other, more exotic file formats?

If you use WebP for one-pixel images, be sure to use lossless WebP. A single-pixel lossless WebP image is between 34 and 38 bytes. A single-pixel lossy WebP image is between 44 and 104 bytes, depending mostly on whether there’s an alpha channel or not. For example, this is a fully transparent pixel as a 34-byte lossless WebP:

00000000  52 49 46 46 1a 00 00 00  57 45 42 50 56 50 38 4c  |RIFF....WEBPVP8L|
00000010  0d 00 00 00 2f 00 00 00  10 07 10 11 11 88 88 fe  |..../...........|
00000020  07 00                                             |..|

and here is the same pixel as a lossy (default) WebP of 82 bytes:

00000000  52 49 46 46 4a 00 00 00  57 45 42 50 56 50 38 58  |RIFFJ...WEBPVP8X|
00000010  0a 00 00 00 10 00 00 00  00 00 00 00 00 00 41 4c  |..............AL|
00000020  50 48 0b 00 00 00 01 07  10 11 11 88 88 fe 07 00  |PH..............|
00000030  00 00 56 50 38 20 18 00  00 00 30 01 00 9d 01 2a  |..VP8 ....0....*|
00000040  01 00 01 00 02 00 34 25  a4 00 03 70 00 fe fb fd  |......4%...p....|
00000050  50 00                                             |P.|

The main difference between the two, is that a lossy WebP with transparency is actually stored internally as two images, thrown together into one container file: one lossy image for the RGB values, and one lossless image for the alpha values.

BPG

For Bellard’s BPG format, which also has a lossless and a lossy mode, it’s the other way around. The lossy BPG encoding of a single white pixel is 31 bytes, the smallest we’ve seen so far:

00000000  42 50 47 fb 00 00 01 01  00 03 92 47 40 44 01 c1  |BPG........G@D..|
00000010  71 81 12 00 00 01 26 01  af c0 b6 20 bc b6 fc     |q.....&.... ...|

The lossless BPG for the same white pixel is 59 bytes. However, a fully transparent pixel is 57 or 113 bytes as a lossy or lossless BPG, respectively. Interestingly, for a single white pixel, BPG wins versus WebP (31 byte BPG vs 38 byte WebP), but for a single transparent pixel, WebP wins versus BPG (34 byte WebP vs 57 byte BPG).

FLIF

And then there’s FLIF. As the main creator of the Free Lossless Image Format, obviously I cannot forget about that one. Here’s a 15 byte FLIF file for one white pixel:

00000000  46 4c 49 46 31 31 00 01  00 01 18 44 c6 19 c3     |FLIF11.....D...|

And here’s a 14 byte file for a black pixel:

00000000  46 4c 49 46 31 31 00 01  00 01 1e 18 b7 ff        |FLIF11........|

The black pixel file is one byte smaller because the number zero happens to compress better than the number 255. The header is pretty simple: the first four bytes are always “FLIF”, the next byte is a human-readable indication of the color and interlacing type. In this case it is “1”, which means we have just one color channel (i.e. it’s a grayscale image). The next byte indicates the color depth: “1” means one byte per channel. And the next four bytes are the image dimensions, in this case 0x0001 by 0x0001. The last four or five bytes are the actual compressed data.

One fully transparent pixel is also 14 bytes in FLIF:

00000000  46 4c 49 46 34 31 00 01  00 01 4f fd 72 80        |FLIF41....O.r.|

In this case, we have 4 color channels (RGBA) instead of just one. You might expect the data section to be longer in this file (after all, there are four times as many color channels), but that’s not the case: since the alpha value happens to be zero (it’s a fully transparent pixel), the RGB values are considered irrelevant so they don’t end up being encoded at all.

For an arbitrary RGBA color, the FLIF file can be up to 20 bytes.

OK, so FLIF is the clear winner in the “one pixel” category of some weird image encoding competition. If only this were an important thing to compete at :)

Actually, no. FLIF isn’t the winner. Remember the minimalistic (and non-existent) image format I mentioned in the beginning? The one that would encode single-pixel images in 6 to 9 bytes? Well that format doesn’t exist, so I suppose it doesn’t count. But there is an image format that does exist, and which gets quite close to that.

It’s called the Portable Bitmap format (PBM), and it’s an uncompressed image format from the 1980s. Here’s how you could encode a single white pixel as a PBM file in just 8 bytes:

00000000  50 31 0a 31 20 31 0a 30                           |P1.1 1.0|

Actually, forget about the hexdump, this is a human-readable file format. You can open it in a text editor if you want (at least this particular subformat):

P1
1 1
0

The first line (“P1”) indicates that this is a black & white image. Not grayscale; there are only two colors: black (which confusingly gets the number 1) and white (0). The second line indicates the image dimensions. And then it’s just a whitespace-delimited list of numbers, one number per pixel. So in this case just the number 0.

If you need something other than pure white or black, you can use the PGM format to get one pixel in any other shade of gray in just 12 bytes, or the PPM format to get any RGB color in just 14 bytes. This is always smaller than the corresponding FLIF file (or any other compressed format, for that matter).

The traditional PNM family (PBM, PGM and PPM) does not support transparency. There is an extension of PNM though, called Portable Arbitrary Map (PAM), which does support images with transparency. Unfortunately for our current purposes, its syntax is quite a bit more verbose. The smallest valid PAM file that encodes a fully transparent pixel, is the following:

P7
WIDTH 1
HEIGHT 1
DEPTH 4
MAXVAL 1
TUPLTYPE RGB_ALPHA
ENDHDR
\0\0\0\0

On the last line there are four zero (NULL) bytes. The above file is 67 bytes. You might be tempted to use grayscale+alpha instead of RGBA, because that would save two bytes in the data section. But that results in a 71 byte file, since you have to change the TUPLTYPE from RGB_ALPHA to GRAYSCALE_ALPHA. Oh and by the way, your image software might not like the use of MAXVAL 1, so you might need to change that to MAXVAL 255 (which takes two more bytes).

So all in all, for one-pixel images, when there’s no transparency involved, PNM is the smallest (8 to 14 bytes for PNM vs 14 to 18 bytes for FLIF), but when there is transparency, FLIF is smallest (14 to 20 bytes for FLIF vs 67 to 69 bytes for PAM).

Here is a summary table that gives the (optimal) file sizes for various one-pixel images:

	white	black	gray	yellow #FFFF00	transparent	semitransparent #1337BABE
PNG	67	67	67	69	68	70
GIF	35	35	43	35	43	/
JPEG	160	160	159	288	/	/
Lossy WebP	44	44	44	64	82	92
Lossless WebP	38	34	38	36	34	38
Lossy BPG	31	31	29	36	57	62
Lossless BPG	59	59	37	124	113	160
FLIF	15	14	15	18	14	20
PNM/PAM	8	8	12	14	67	69

It might seem a bit surprising that an uncompressed image format actually beats most of the compressed formats at this particular task. But it’s not that surprising if you think about it. One-pixel images are in a sense the worst-case scenario for image compression: they’re all headers and overhead, and very little data. And the very little data there is cannot really be compressed because compression depends on predictability, and how are you supposed to predict one single pixel?

In part two of this blog post I will discuss the other extreme. How well do extremely predictable single-color images perform in various formats? Stay tuned….

Update: Check out part two as well: A one-color image is worth two thousand words

Recent Blog Posts

Generate Waveform Images from Audio with Cloudinary

By David Walsh Aug 30, 2017

This is a reposting of an article written by David Walsh. Check out his blog HERE!
I've been working a lot with visualizations lately, which is a far cry from your normal webpage element interaction coding; you need advanced geometry knowledge, render and performance knowledge, and much more. It's been a great learning experience but it can be challenging and isn't always an interest of all web developers. That's why we use apps and services specializing in complex tasks like Cloudinary: we need it done quickly and by a tool written by an expert.

Make All Images on Your Website Responsive in 3 Easy Steps

By Christian Nwamba Aug 28, 2017

Images are crucial to website performance, but most still don't implement responsive images. It’s not just about fitting an image on the screen, but also making the the image size relatively rational to the device. The srcset and sizes options, which are your best hit are hard to implement. Cloudinary provides an easier way, which we will discuss in this article.

The Future of Audio and Video on the Web

By Prosper Otemuyiwa Aug 23, 2017

Web sites and platforms are becoming increasingly media-rich. Today, approximately 62 percent of internet traffic is made up of images, with audio and video constituting a growing percentage of the bytes.

Embed Images in Email Campaigns at Scale

By Sourav Kundu Aug 21, 2017

tl;dr

Cloudinary is a powerful image hosting solution for email marketing campaigns of any size. With features such as advanced image optimization and on-the-fly image transformation, backed by a global CDN, Cloudinary provides the base for a seamless user experience in your email campaigns leading to increased conversion and performance.

Build the Back-End For Your Own Instagram-style App with Cloudinary

By Christian Nwamba Aug 16, 2017

Github Repo

Managing media files (processing, storage and manipulation) is one of the biggest challenges we encounter as practical developers. These challenges include:

A great service called Cloudinary can help us overcome many of these challenges. Together with Cloudinary, let's work on solutions to these challenges and hopefully have a simpler mental model towards media management.

Build A Miniflix in 10 Minutes

By Prosper Otemuyiwa Aug 14, 2017

Developers are constantly faced with challenges of building complex products every single day. And there are constraints on the time needed to build out the features of these products.

Engineering and Product managers want to beat deadlines for projects daily. CEOs want to roll out new products as fast as possible. Entrepreneurs need their MVPs like yesterday. With this in mind, what should developers do?