I’m a responsible web designer, and as such, since WordPress (finally) accepts media uploads of the image/webp
MIME type and since every web browser released after September 2020 (even Apple Safari \o/) can display it, I have been moving my photo library to WebP. After all, when you create content, the least you can do is also provide the smoothest user experience around it.
WebP sounds close to magical: lookie those file weights! 15% savings compared to JPEG at the same quality! What are we waiting for? Google even claims 25–34% smaller files.
There are dozens of WordPress plugins that will convert your old media library on the fly, most of them operating as SaaS (Shit as a Software) and doing the conversion on their own servers, which apparently entitles them to charge you a ridiculous amount for it. I’m very unhappy to have actually paid for one of them (something about saving time, which actually led to losing time AND money). All of them claim that their “aggressive” compression factor is safe for 99% of your pictures.
The most technical ones will go as far as telling you that a WebP quality greater than lossy 80
is useless for most pictures, supporting their claim with a glorious Google logo encoded at various rates. Because everyone knows shooting logos is the bread and butter of every photographer, especially the Adobe Stock ones. Also, logos have their gradients following a hyper-Laplacian distribution like any other natural image [1]. Or maybe not. Who cares about gradient stats anyway? We are only talking about 2D compression heuristics built on entropy and high-frequency thresholding, after all.
So, while I may have lost all respect for coding monkeys turned into image dudes just because a position opened (and everyone loves pics, right? They are fun and much easier on the brain than words), especially the internet image dudes, I still fall every time for that silly assumption: people who are supposed to know actually know. Years pass, I don’t learn: I read docs, I do what they say, I discover it doesn’t work as promised, and only then do I remember those guys don’t know shit about images. And here I am, losing faith in humanity one wrong expert at a time.
In my great silliness, I set the third and last plugin I tried to the advised lossy 80
quality and trigger the batch conversion. I have relative faith in it since it uses server-side GraphicsMagick instead of the unfortunate PHP shitstack (GD, Gmagick and the like) or the laggy HTTP-error connection-timed-out DNS-said-not-today please-retry-later SaaS nonsense.
Everything goes well, until this happens…
To the non-educated eye, this might look OK, but to a photographer it’s not, for several reasons. See the posterized ring in the background? First of all, it’s not graceful, but more importantly it has no business being there. This comes from a 16-bit scan of an Ilford Delta 400 film shot with a Mamiya RB67, that is, old-school analog medium format at 6×7 cm. The silver halide crystals of the Delta 400 emulsion act as natural dithering, which makes high-frequency compression more difficult and therefore prevents posterization in smooth areas. So, for any compression algo, managing to posterize a Delta 400 scan is a feat of the wrong kind.
Look at the original JPEG at quality 85:
It’s not 100% clean either, but much better. Granted, this is a WebP re-encoding of an already lossy-compressed JPEG, so we stack two steps of destructive compression. But this is what Google PageSpeed Insights encourages you to do and what a shitload of plugins enable you to do, while pretending it’s completely safe. It’s not.
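To be explicit about what those plugins actually do, here is the double compression in a nutshell: the already-lossy JPEG gets decoded and re-encoded as lossy WebP. A minimal Pillow sketch, assuming their typical quality-80 default; filenames are placeholders:

```python
from PIL import Image

# Step 1 already happened long ago: the photo was exported as a lossy JPEG (quality 85 here).
original = Image.open("portrait_q85.jpg")

# Step 2, what the plugin does: decode that JPEG and re-encode it as lossy WebP at its
# default quality, stacking a second round of destructive compression on top of the first.
original.save("portrait_q85.webp", quality=80, method=6)
```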
I have seen a similar effect in other, similar pictures: always pictures with large, smooth gradients in the background, which happens a lot when a more or less point-like light falls off across a wall. That’s not accidental: smooth fall-offs are actively built by photographers to create organic-looking backgrounds with just enough texture to not get boring, yet discreet enough not to draw attention away from the foreground/subject.
So, I wondered how bad it was for actual raw photos encoded straight from darktable. Meaning just one step of encoding. Meaning a real WebP quality comparison for real-life studio head-shots, which are one of the last things customers are still willing to pay actual photographers for (instead of snapping their own iPhone). Meaning real money for real professionals. Meaning something the image-coding douchebags may not have foreseen, because it doesn’t happen in VS Code (or Vim, for that matter).
Let’s start. The following two images use Floyd-Steinberg dithering in 8 bits, with lossy compression set at 90 for both JPEG and WebP (remember, the experts recommend 80 for WebP). All images below, saved in WebP, use the “photo” image hint of Jeroen Oom’s libwebp 1.2.1. Click on images to open the full-size version, or better: right-click and open them in a new tab.
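For the record, these encoding settings translate roughly to the following commands; a minimal sketch assuming `cwebp` (libwebp) and Pillow are installed, with placeholder filenames (the actual test files were exported straight from darktable):

```python
import subprocess
from PIL import Image

SRC = "headshot_fs_dither.png"  # hypothetical 8-bit export with Floyd-Steinberg dithering applied

# JPEG at quality 90, for the comparison baseline
Image.open(SRC).save("headshot_q90.jpg", quality=90)

# WebP at quality 90 with the "photo" image hint, as used for all the WebP files below
subprocess.run(
    ["cwebp", "-q", "90", "-hint", "photo", SRC, "-o", "headshot_q90.webp"],
    check=True,
)
```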
JPEG 85 and WebP 90 both fail the test, looking like shit. But WebP looks more like shit: the contrast of the posterized rings is higher. And we are already 10% above the recommended quality that “should fit 99% of pictures”. JPEG 90 looks OK, though, but it’s a lot heavier.
So, let’s try something else now: lossless WebP. That should be our ground truth of WebP supremacy.
So, the WebP is now clean, but I’m not impressed with the weight, especially since you need a really close look to distinguish it from JPEG 90, which weighs about a third of that, and it’s forensically similar to JPEG 95, which weighs a bit more than half. Oops.
Let’s try something else: redo it, but instead of the light Floyd-Steinberg dithering, use heavier random noise at -48 dB PSNR. That’s a very high PSNR, meaning it should be almost unnoticeable to human eyes but should give a harder time to the high-frequency filtering that is most of the trick behind image compression.
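In darktable this is just the dithering module in its random-noise mode (exact labels may differ), but the arithmetic is worth spelling out: 48 dB below full scale means a noise RMS of 255 × 10^(−48/20) ≈ 1 level in 8 bits, i.e. invisible. A rough numpy equivalent, assuming an 8-bit sRGB export and placeholder filenames (darktable actually dithers before quantization, so this is only an approximation):

```python
import numpy as np
from PIL import Image

TARGET_PSNR_DB = 48.0
# PSNR = 20*log10(peak / RMSE)  =>  noise RMS = peak * 10^(-PSNR/20)
sigma = 255.0 * 10 ** (-TARGET_PSNR_DB / 20.0)   # ~1.0 level on an 8-bit scale

img = np.asarray(Image.open("headshot.png").convert("RGB"), dtype=np.float64)
noisy = img + np.random.normal(0.0, sigma, img.shape)
noisy = np.clip(np.rint(noisy), 0, 255).astype(np.uint8)

Image.fromarray(noisy).save("headshot_noisy.png")  # then feed this to the JPEG/WebP encoders
```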
The WebP is still more prone to posterization. So I wondered which WebP quality would be as smooth as the JPEG 85 with -48 dB of noise (which was pretty damn smooth). The answer is somewhere between 95 and 96, even though it’s hard to make a strict equivalence since the quality and texture of the artifacts differ.
Yeah, you read that right. WebP is actually 39% heavier than JPEG 85 plus noise for a similar-ish look on this difficult picture, and still not quite as smooth as the JPEG (there is still a tiny bit of ringing). It’s also 30% heavier than JPEG 90 with simple Floyd-Steinberg dithering.
So, what do we take away from that?
First, at similar visual quality and for photographs, WebP is not lighter than JPEG; it’s actually the opposite. All the Google claims rely on measuring the average SSIM and average bitrate over a dataset of images. Call me crazy, but I don’t give a shit about averages. For a Gaussian (“normal”) process, probability says half of your sample will be above the average and half below (the average is also the median in a Gaussian distribution). If we designed cars for the average load they have to sustain, we would kill about half of the customers. Instead, we design cars for the worst foreseeable scenario, add a safety factor on top, and they still kill a fair amount of people, but a lot fewer than in the past. Most probability distributions are close enough to Gaussian that the assumption average ≈ median ± a little something is fair. Also, the SSIM metric is an incomplete, biased, disputed metric of image similarity that takes no actual perceptual model into account [2]: it’s just averages, variances and covariances, meaning it is barely more than a pattern-matching scheme from a machine’s perspective.
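For reference, the standard SSIM formula makes the point on its own: it is built solely from the local means, variances and covariance of the two images $x$ and $y$ (with $C_1$ and $C_2$ small stabilizing constants), and nothing in it models how human vision weights smooth gradients against textures:

$$\mathrm{SSIM}(x, y) = \frac{(2\,\mu_x \mu_y + C_1)\,(2\,\sigma_{xy} + C_2)}{(\mu_x^2 + \mu_y^2 + C_1)\,(\sigma_x^2 + \sigma_y^2 + C_2)}$$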
As a photographer, I care about the robustness of the visual output. Which means, as a designer, designing for the worst possible image and taking numerical metrics with a grain of salt. And that whole WebP hype is unjustified in this regard. It surely performs well on well-chosen examples, no doubt. The question is: what happens when it doesn’t? I can’t fine-tune the WebP quality for each individual image on my website; that’s time-consuming and WordPress doesn’t even allow it. I can’t have a portfolio of pictures with even 25% posterized backgrounds either: the whole point of a portfolio is to showcase your skills and results, not to take a wild guess on the compression performance of your image backend. Average won’t do; it’s simply not good enough. And in setting the weight vs. quality ratio, the nature of the induced artifacts matters perhaps more than the norm of the deviation. We can tolerate more variance in random noise than in patterned blotches.
Second, I don’t know why all the techies around have a huge kink for sharpness, but the most challenging situations I have faced as a photographer were with smooth gradients. Or, more accurately, gradients that should have been smooth and weren’t in the output. So there is a real issue with the design priorities of image algos, set by tech guys who clearly lack historical and artistic background and don’t talk to artists, who anyway have largely decided that they were above science, maths and other menial materialistic concerns. Most test pictures for WebP compression showcase sharp scenes with large depth of field, so lots of details, a.k.a. high frequencies, which have zero chance of posterization and are not the pain point of such algorithms. Lack of sharpness has never destroyed a picture, quite the contrary. Painters took as much trouble to render atmospheric haze and sfumato as photographers now take to undo them. But having a staircase in place of a smooth vignette surely damages the picture in an unacceptable way.
Third, a big shout-out to all the morons, idiots, douchebags and monkeys who make big claims all around on matters they don’t remotely understand. Why the big words? I have been told, on my previous article, that I was too heavy on insults… Well, we live in a time where time is the ultimate luxury, and the idiots-who-should-know-but-didn’t are not only causing damage, they also cost money and time, and I really think they should be punished for it. You can refund money; you can’t refund time. Thing is, as technologies “improve”, people don’t get more free time because the work doesn’t get any easier. Instead, the tools become more complex and customers expect more as the tools get faster, meaning workers have as much work as before, only with more complex toolkits. So better tech doesn’t actually mean less or easier work for the actual workers; it may just mean better results if it is actually better tech, which, in this case, it is not. The point has been made here that WebP is simply not robust enough for image makers, regardless of its average performance, if lower (or even similar) data bandwidth is the ultimate goal. The test done here is simple enough that anyone could have run it much earlier, provided they used image datasets from actual photographers.
Image-making is not just a vocational part-time activity for bored upper-class or retired citizens with enough money to buy 10 k€ camera systems and do mostly nothing with them. Some people rely on it to make a living. And they are already in a precarious enough situation (even before COVID… how many newspapers still had a photo staff in 2015?) not to take more shit from the people who pretend to help them while doing the opposite. I have the ability to double-check the stupid shit I read here and there, but the large majority of visual artists don’t, and they will take the word of “experts” for truth even when it contradicts facts they have witnessed themselves for years.
The Google monkeys at PageSpeed are idiots when they advise you to move all your content to WebP. They are also dishonest, since they authored WebP themselves: they are both judge and interested party. The Google monkeys who said WebP has lower weight at similar average SSIM said nothing meaningful, because neither the SSIM nor the average is robust: at best it’s a 50/50 split between satisfying and unacceptable outcomes. The WordPress plugin monkeys are idiots when they advise and tool you up to convert already-lossy JPEGs to WebP. Oh, they probably make all their claims in good faith; the problem is precisely that they didn’t see the problems. And it’s super difficult to argue with people who literally don’t see the problems, because it’s their bad eyes against your experience, and since people believe only what they see, you are screwed. But then, a lot of lower-tier websites and blogs will repeat everything coming from these “trustable” sources, doing even more damage. I have personally lost about a full working week over the past 6 months to that whole WebP migration madness, thanks to all this fake news: getting it to work across URL rewriting and CDN redirections, and then understanding why it looked so bad in the end.
Finally, WebP is badly designed. Being necessarily RGB or RGB+alpha, there is no way to save a monochrome grey image as a single channel. We can see that all the posterization here is made worse by magenta and green rings that come solely from the chroma subsampling. With a purely monochrome format saved on a single channel, you don’t introduce any additional chroma shift. It’s as bad as JPEG, but it could have been fixed. That’s what AVIF did, at least, but it won’t be a practical reality for at least another decade.
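A quick way to see the single-channel limitation for yourself, as a minimal Pillow sketch with placeholder filenames: the grayscale JPEG stays a one-channel file, while the same image has to go through three channels (then chroma-subsampled YUV) in WebP, so any spread between the decoded R, G and B values is chroma error that was never in the source:

```python
import numpy as np
from PIL import Image

gray = Image.open("delta400_scan.png").convert("L")   # hypothetical monochrome source

gray.save("scan.jpg", quality=90)                     # JPEG keeps a single luma channel
gray.convert("RGB").save("scan.webp", quality=90)     # WebP has no grayscale mode: 3 channels, then YUV 4:2:0

print(Image.open("scan.jpg").mode)                    # "L" -> still one channel
decoded = np.asarray(Image.open("scan.webp").convert("RGB"), dtype=np.int16)

# On a neutral grey source, R == G == B by definition; any spread between channels after
# decoding is chroma error introduced by the encoder (the magenta/green rings above).
chroma_spread = int((decoded.max(axis=2) - decoded.min(axis=2)).max())
print("max chroma deviation:", chroma_spread)
```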
How do we solve that?
- Stick to JPEG at 90 quality (or at least 85) if images matter to you, e.g. if you are a visual artist. If images are pretty decorations for your textual content, it doesn’t matter.
- Always add dithering and/or a tiny bit of noise to your images, just to be sure smooth gradients stay smooth no matter how much damage they take from stupid website recompressions.
- Don’t convert your old JPEGs to WebP even if every idiot around tells you to, unless you find the images shown above remotely acceptable.
- Serve your images from a fast CDN, use responsive image sizes and lazy loading to improve loading speed and perceived responsiveness on the user/client side, but there is not much more you can do without damaging the quality of your images.
- Avoid all the SaaS ways of converting your images on another server. On paper, they sound great because they relieve your own server of the conversion load, which matters on shared hosting. Except they cost money, don’t disclose the actual quality factor they use, and fail in lots of cases (HTTP connection errors everywhere, especially if you have hardened WordPress with a security plugin). You would be better off with better hosting and running conversions straight on your server with ImageMagick/GraphicsMagick (not the PHP interfaces, but the server program directly); see the sketch after this list. There is a WordPress plugin that does just that.
- Devs and techs really need to pull their heads out of their arses and start talking with actual artists to understand their challenges and priorities.
- Devs and techs really need to get a grasp of basic probability because… average, really?
- We really need people with one foot in the tech world and the other in the art world, able to talk with both, because keeping them in two separate bubbles is doing damage on a large scale right now, and I don’t see it improving.
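As an illustration of the “run it straight on your server” advice above, here is a minimal batch-conversion sketch, assuming GraphicsMagick (`gm`) is installed; the uploads path and file pattern are placeholders, and a real script should obviously keep backups and skip files that were already processed:

```python
import subprocess
from pathlib import Path

UPLOADS = Path("/var/www/example.com/wp-content/uploads")  # hypothetical WordPress uploads dir

for src in UPLOADS.rglob("*.tif"):          # e.g. full-quality masters exported from darktable
    dst = src.with_suffix(".jpg")
    # Re-encode with GraphicsMagick directly (no PHP wrapper), JPEG quality 90
    subprocess.run(["gm", "convert", str(src), "-quality", "90", str(dst)], check=True)
```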
[1] Krishnan, Dilip and Fergus, Rob. “Fast Image Deconvolution using Hyper-Laplacian Priors.” Advances in Neural Information Processing Systems, 2009, vol. 22, pp. 1033–1041. http://people.ee.duke.edu/~lcarin/NIPS2009_0341.pdf
[2] Dosselmann, Richard and Yang, Xue Dong. “A Comprehensive Assessment of the Structural Similarity Index.” Signal, Image and Video Processing, 2011, vol. 5, no. 1, pp. 81–91.