What Matters in Practical Learned Image Compression

Apple

About

We introduce PICO (Perceptual Image Codec), the first learned codec that is both practical and optimized directly for the human visual system. To derive it, we perform a comprehensive study of modeling choices for practical learned codecs and search over millions of model configurations to jointly optimize perceptual quality and on-device runtime.

Based on large-scale subjective user studies, PICO provides 2.3-3× bitrate savings against AV1, AV2, VVC, ECM and JPEG-AI, and 20-40% bitrate savings against the best learned codec alternatives. At the same time, on an iPhone 17 Pro Max, it encodes 12MP images in as little as 230ms and decodes them in 150ms, faster than most top ML-based codecs run on a V100 GPU. Unlike most learned codecs, PICO also comes with cross-platform robustness guarantees.
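The headline numbers mix two conventions (a multiplicative factor and a percentage), so a small conversion sketch may help when comparing them. It assumes that "k× bitrate savings" means the reference codec needs k times as many bits as PICO at matched perceptual quality; the page does not define the term, so treat this reading as an assumption. The function names are hypothetical and only illustrate the arithmetic.

```python
def savings_factor_to_percent(factor: float) -> float:
    """Percent bitrate reduction implied by a 'factor x' saving.

    Assumption: a k-times saving means the reference codec spends
    k times as many bits at equal quality, so the reduction is 1 - 1/k.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    return (1.0 - 1.0 / factor) * 100.0


def percent_to_savings_factor(percent: float) -> float:
    """Inverse conversion: a p% bitrate reduction as a multiplicative factor."""
    if not 0.0 <= percent < 100.0:
        raise ValueError("percent must be in [0, 100)")
    return 1.0 / (1.0 - percent / 100.0)


def throughput_mp_per_s(megapixels: float, millis: float) -> float:
    """Megapixels processed per second for a given latency in milliseconds."""
    if millis <= 0:
        raise ValueError("millis must be positive")
    return megapixels / (millis / 1000.0)
```

Under this reading, 2.3× corresponds to roughly a 56.5% bitrate reduction and 3× to roughly 66.7%, while the 20-40% savings against learned codecs correspond to factors of about 1.25× to 1.67×. The quoted 12MP timings work out to roughly 52 MP/s for encoding (230ms) and 80 MP/s for decoding (150ms).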

Figure: Interactive side-by-side comparison across different images, with PICO (Ours) fixed on the left.

Comparisons of state-of-the-art traditional and learned codecs across different considerations of practicality.

Performance comparison of PICO against traditional and learned codecs
Comparisons of state-of-the-art traditional and learned codecs. Perceptual BD-rates are based on human ratings from a large-scale subjective study. Speed benchmarks on iPhone 17 Pro Max use identical compiler optimizations.

Citation

If you find our work useful, please cite:

@article{tatwawadi2026pico,
  title={What Matters in Practical Learned Image Compression},
  author={Tatwawadi, Kedar and Rahimzadeh, Parisa and Sun, Zhanghao and Chen, Zhiqi and Yang, Ziyun and Nair, Sanjay and Hasteer, Divija and Rippel, Oren},
  journal={arXiv preprint arXiv:2605.05148},
  year={2026}
}