Andre Bogus Andre "llogiq" Bogus is Chief Rustacean (yes, that's his official title) for Synth, a Rust contributor, and Clippy maintainer. A musician-turned-programmer, he has worked in many fields, from voice acting and teaching, to programming and managing software projects. He enjoys learning new things and telling others about them.

Rust compression libraries


Editor’s note: This post was updated on Jan. 8, 2021 to correct errors in the original benchmark.

Data compression is an important component in many applications. Fortunately, the Rust community has a number of crates to deal with this.

Unlike my review of Rust serialization libraries, it doesn’t make sense to compare performance between different formats. For some formats, all we have are thin wrappers around C libraries. Here’s hoping most will be ported to Rust soon.

What do Rust compression libraries do?

There are two variants of compression utilities: stream compressors and archivers. A stream compressor takes a stream of bytes and emits a shorter stream of compressed bytes. An archiver enables you to serialize multiple files and directories. Some formats (such as the venerable .zip) can both collect and compress files.

For the web, only two stream formats have achieved widespread implementation: gzip/deflate and Brotli. I list gzip and deflate under the same title because they implement the same underlying algorithm; gzip merely adds header information and more checks. Some clients also allow for bzip2 compression, but this is no longer widespread, since gzip can reach a similar compression ratio with Zopfli (at the cost of compression time), and Brotli can go even smaller.

What are the best compression libraries for Rust?

As with any problem, there are myriad solutions with different trade-offs in terms of runtime for compression and decompression, CPU and memory use versus compression ratio, the capability to stream data, and safety measures such as checksums. We’ll focus only on lossless compression, so no image, audio, or video-related lossy compression formats.

For a somewhat realistic benchmark, I’ll use the following files to compress and decompress, ranging from very compressible to very hard to compress:

  • A 100MB file of zeroes created with cat /dev/zero | head -c $[1024 * 1024 * 100] > zeros.bin
  • The concatenated markdown of my personal blog, small and text-heavy (created with cat llogiq.github.io/_posts/*.md > blogs.txt)
  • An image of my cat
  • The x86_64 rustc binary of the current stable toolchain (1.47.0)
  • A TV recording of the movie “Hackers,” which I happen to have lying around
  • A 100MB file of pseudorandom numbers created with cat /dev/urandom | head -c $[1024 * 1024 * 100] > randoms.bin

All compression and decompression will be from and into memory. My machine has a Ryzen 3550H four-core CPU running Linux 5.8.18_1.

Note: Some of the compressors failed the roundtrip test, which means the compressed and then uncompressed data failed to match the original. This may mean my implementation of the benchmark is faulty, or it may be a fault in the library. I will conduct further tests. For now, I have marked the respective libraries with an asterisk (*).
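
As a rough illustration of what such a roundtrip test checks, here is a minimal sketch. The run-length codec (rle_compress, rle_decompress) and the roundtrips helper are hypothetical stand-ins, not part of the benchmark or of any crate listed here:

```rust
/// Toy run-length encoder standing in for a real compressor (hypothetical).
/// The output is a sequence of (run length, byte) pairs.
fn rle_compress(input: &[u8]) -> Vec<u8> {
    let mut out = Vec::new();
    let mut iter = input.iter().copied().peekable();
    while let Some(byte) = iter.next() {
        let mut run: u8 = 1;
        // Extend the run while the next byte is identical and the counter fits in a u8.
        while run < u8::MAX && iter.peek() == Some(&byte) {
            iter.next();
            run += 1;
        }
        out.push(run);
        out.push(byte);
    }
    out
}

/// Inverse of `rle_compress`: expands each (run length, byte) pair.
fn rle_decompress(packed: &[u8]) -> Vec<u8> {
    let mut out = Vec::new();
    for pair in packed.chunks_exact(2) {
        out.extend(std::iter::repeat(pair[1]).take(pair[0] as usize));
    }
    out
}

/// Returns true if decompressing the compressed data reproduces the input exactly.
fn roundtrips(
    data: &[u8],
    compress: impl Fn(&[u8]) -> Vec<u8>,
    decompress: impl Fn(&[u8]) -> Vec<u8>,
) -> bool {
    decompress(&compress(data)) == data
}
```

Calling roundtrips(data, rle_compress, rle_decompress) returns true for this toy codec; a library marked with an asterisk is one where the equivalent check came back false.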


Stream compression libraries for Rust

In the stream compressor department, the benchmark covers the following crates:

DEFLATE/zlib

DEFLATE is an older algorithm that uses a dictionary and Huffman encoding. It has three variants: plain (without any headers), gzip (with some headers, popularized by the UNIX compression utility of the same name), and zlib (which is often used in other file formats and also by browsers). We will benchmark the plain variant.

  • yazi 0.1.3 has a simple interface that returns its own Vec, but we benchmark the more complex interface, which lets us supply our own. On the upside, we can choose the compression level
  • deflate 0.8.6 does not allow supplying an output Vec, so its runtime includes allocation*
  • flate2 1.0.14 comes with three possible backends, two of which wrap a C implementation. This benchmark only uses the default backend because I wanted to avoid the setup effort — sorry
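
To make the difference between the three variants concrete: a gzip stream starts with the magic bytes 0x1f 0x8b followed by a compression-method byte (8 for DEFLATE), a zlib stream starts with a two-byte header whose low nibble is 8 and whose 16-bit big-endian value is a multiple of 31, and a plain DEFLATE stream has no header at all. The sketch below guesses the wrapper from the first bytes; the Container enum and the sniff function are made up for this example. Since plain DEFLATE carries no signature, it can only be reported as unknown:

```rust
/// Wrapper formats around a DEFLATE stream that can be recognized by their header.
#[derive(Debug, PartialEq)]
enum Container {
    Gzip,
    Zlib,
    /// Plain DEFLATE (no header) or something else entirely.
    Unknown,
}

/// Guesses the container from the leading bytes (hypothetical helper).
fn sniff(data: &[u8]) -> Container {
    // gzip: magic 0x1f 0x8b, then compression method 8 (DEFLATE).
    if data.len() >= 3 && data[0] == 0x1f && data[1] == 0x8b && data[2] == 8 {
        return Container::Gzip;
    }
    if data.len() >= 2 {
        let (cmf, flg) = (data[0], data[1]);
        // zlib: method nibble 8, window-size nibble at most 7,
        // and the header read as a big-endian u16 is divisible by 31.
        let is_deflate = (cmf & 0x0f) == 8 && (cmf >> 4) <= 7;
        let check_ok = (u16::from(cmf) * 256 + u16::from(flg)) % 31 == 0;
        if is_deflate && check_ok {
            return Container::Zlib;
        }
    }
    Container::Unknown
}
```

This is only a heuristic: a plain DEFLATE stream can, by coincidence, begin with bytes that pass the zlib check.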

Snappy

Snappy is Google’s 2011 answer to LZ77, offering fast runtime with a fair compression ratio.

LZ4

Also released in 2011, LZ4 is another speed-focused algorithm in the LZ77 family. It does away with arithmetic and Huffman coding, relying solely on dictionary matching. This makes the decompressor very simple.

There is some variance in the implementations. Though all of them use basically the same algorithm, the resulting compression ratio may differ since some implementations choose defaults that maximize speed whereas others opt for a higher compression ratio.
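
To show why a decompressor built purely on dictionary matching stays simple, here is a sketch of an LZ77-style decoder. The Token type is a hypothetical in-memory representation, not the actual LZ4 block format, which packs literals and matches into a byte stream:

```rust
/// One decoded instruction of a hypothetical LZ77-style stream.
#[derive(Debug, Clone, Copy)]
enum Token {
    /// Emit a single byte as-is.
    Literal(u8),
    /// Copy `len` bytes, starting `offset` bytes back from the end of the output.
    Match { offset: usize, len: usize },
}

/// Replays the tokens; returns None if a match points before the start of the output.
fn decode(tokens: &[Token]) -> Option<Vec<u8>> {
    let mut out = Vec::new();
    for &token in tokens {
        match token {
            Token::Literal(byte) => out.push(byte),
            Token::Match { offset, len } => {
                if offset == 0 || offset > out.len() {
                    return None;
                }
                let start = out.len() - offset;
                // Copy byte by byte so that overlapping matches (offset < len)
                // repeat the bytes that were just written.
                for i in 0..len {
                    let byte = out[start + i];
                    out.push(byte);
                }
            }
        }
    }
    Some(out)
}
```

A literal followed by a match with offset 1 expands into a run of identical bytes, which is how this family of formats represents long runs with very few tokens.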

ZStandard

The ZStandard (or ‘zstd’) algorithm, published in 2016 by Facebook, is meant for real-time applications. It offers very fast compression and decompression with zlib-level or better compression ratios. It is also used in other cases where time is of the essence, e.g., in BTRFS file system compression.

  • zstd 0.5.3 binds the standard C-based implementation. This needs pkg-config and the libzstd library, including headers*

LZMA

Going in the other direction and trading runtime for higher compression, the LZMA algorithm extends the dictionary-based LZ algorithm class with Markov chains. This algorithm is quite asymmetric in that compression is far slower and requires much more memory than decompression. It is often used in Linux distributions’ package formats to reduce network usage while keeping decompression CPU and memory requirements agreeable.

Zopfli

Zopfli is a zlib-compatible compression algorithm that trades a long runtime for a superior compression ratio. It is useful on the web, where zlib decompression is widely implemented.

Though Zopfli takes more time to compress, this is an acceptable tradeoff for reducing network traffic. Since the format is DEFLATE-compatible, the crate only implements compression.

  • zopfli 0.4.0* (the output could not be unpacked to match the original using either deflate or flate2)

Brotli

Brotli, developed by the Google team that created Zopfli, extends the LZ77-based compression algorithm with second-order context modeling, which gives it an edge in compressing otherwise hard-to-compress data streams.

  • brotli 3.3.0 is a rough translation of the original C++ source that has seen a good number of tweaks and optimizations (as the high version number attests). The interface is somewhat unidiomatic, but works well enough.

Archiving libraries for Rust

For the archivers, the benchmark has the tar 0.4.30, zip 0.5.8, and rar 0.2.0 crates.

zip is probably the best-known format: first released in 1989, it is also the most widely supported format of its kind. tar, the venerable Tape ARchiver, is the oldest format, with an initial release in 1979. It actually has no compression of its own but, in typical UNIX fashion, delegates to stream compressors such as gzip (DEFLATE), bzip2 (Burrows-Wheeler transform), and xz (LZMA). The rar format is somewhat younger than zip and rose to popularity on file sharing services due to less file overhead and slightly better compression.
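
Because tar only collects files, its output size is predictable. Assuming a single file with a short name (so no extended headers are needed), an archive consists of one 512-byte header, the data padded to a multiple of 512 bytes, and two 512-byte zero blocks marking the end. The hypothetical helper below does that arithmetic:

```rust
/// tar works in fixed-size 512-byte blocks.
const BLOCK: u64 = 512;

/// Size of a tar archive holding one file of `file_len` bytes.
/// Assumes a single header block (short file name, no extensions)
/// and no extra padding beyond the two end-of-archive blocks.
fn tar_size_single_file(file_len: u64) -> u64 {
    // Round the payload up to the next multiple of the block size.
    let padded = (file_len + BLOCK - 1) / BLOCK * BLOCK;
    // header + padded data + two zero blocks
    BLOCK + padded + 2 * BLOCK
}
```

For the 593.820-byte blog file this gives 595.456 bytes, and for the 104.857.600-byte zeros file 104.859.136 bytes, matching the tar rows in the benchmark results.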

Interfaces

Regarding stream processors, there are three possible options to implement the API. The easiest approach is to take a &[u8] slice of bytes and return a Vec<u8> with the compressed data. An obvious optimization is to not return the compressed data, but take a &mut Vec<u8> mutable reference to a Vec in which to write the compressed data instead.
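
A minimal sketch of these two signatures follows. It uses a made-up "store" codec (a 4-byte length prefix followed by the unmodified input) so the example stays self-contained; compress and compress_into are illustrative names, not any crate's API:

```rust
/// Variant that writes into a caller-supplied buffer, so the allocation
/// can be reused across calls. The "codec" is a placeholder that stores
/// the input unchanged behind a little-endian u32 length prefix.
fn compress_into(input: &[u8], out: &mut Vec<u8>) {
    out.clear(); // keeps the existing capacity
    out.reserve(4 + input.len());
    // Note: the cast truncates for inputs larger than u32::MAX bytes.
    out.extend_from_slice(&(input.len() as u32).to_le_bytes());
    out.extend_from_slice(input);
}

/// Simplest variant: allocates and returns a fresh Vec on every call.
fn compress(input: &[u8]) -> Vec<u8> {
    let mut out = Vec::new();
    compress_into(input, &mut out);
    out
}
```

With compress_into, a buffer created once with Vec::with_capacity can be passed to every call, which keeps allocation out of the measured time.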

The most versatile interface is a method that takes an impl Read and an impl Write, reading from the former and writing into the latter. This may not be optimal for all formats, though, because sometimes you need to go back and fix block lengths in the written output. This interface would leave some performance on the table unless the output also implements Seek, which, for Vecs, can be done with a std::io::Cursor. On the other hand, it allows us to work somewhat comfortably with data that may not fit in memory.
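
The "go back and fix block lengths" step might look like the following sketch. The block layout (a 4-byte little-endian length followed by the payload) and the write_block function are invented for illustration, and the payload is copied unmodified rather than compressed:

```rust
use std::io::{self, Read, Seek, SeekFrom, Write};

/// Writes `input` to `output` as one block: a length placeholder, the payload,
/// then a seek back to patch in the real length. Returns the payload length.
fn write_block<R: Read, W: Write + Seek>(mut input: R, mut output: W) -> io::Result<u64> {
    let header_pos = output.stream_position()?;
    output.write_all(&[0u8; 4])?; // placeholder, the length is not known yet
    let payload_len = io::copy(&mut input, &mut output)?;
    let end_pos = output.stream_position()?;

    // Go back and overwrite the placeholder with the actual length.
    // Note: the cast truncates for payloads larger than u32::MAX bytes.
    output.seek(SeekFrom::Start(header_pos))?;
    output.write_all(&(payload_len as u32).to_le_bytes())?;

    // Restore the position so that later writes append after the block.
    output.seek(SeekFrom::Start(end_pos))?;
    Ok(payload_len)
}
```

With a Cursor<Vec<u8>> as output, write_block(&b"hello"[..], &mut cursor) leaves the bytes 05 00 00 00 followed by "hello" in the underlying Vec. After a seek, the cursor position no longer has to equal the number of bytes written, so the output size is best read from the Vec's length.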

In any event, for a somewhat meaningful comparison, we’ll compress from RAM to RAM, preallocated where the API allows this. Exceptions are marked as such.
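
For readers who want to reproduce this kind of measurement without a benchmarking framework, a crude timing loop could look like the sketch below. This is not the harness behind the numbers in this post; time_runs and store_into are hypothetical names, and a real benchmark would also need warm-up runs and outlier handling:

```rust
use std::time::{Duration, Instant};

/// Runs `f` `runs` times and returns the mean wall-clock time per run.
fn time_runs(runs: u32, mut f: impl FnMut()) -> Duration {
    assert!(runs > 0, "need at least one run");
    let start = Instant::now();
    for _ in 0..runs {
        f();
    }
    start.elapsed() / runs
}

/// Placeholder for a compressor that writes into a caller-supplied buffer;
/// it copies the input unchanged.
fn store_into(input: &[u8], out: &mut Vec<u8>) {
    out.clear(); // keeps the capacity, so no reallocation inside the timed loop
    out.extend_from_slice(input);
}
```

A typical call preallocates the output once with Vec::with_capacity(input.len()) and then passes a closure such as || store_into(&input, &mut out) to time_runs, so only the RAM-to-RAM work is inside the timed loop.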

Some libraries allow you to set options to (de)activate certain features, such as checksums, and pick a certain size/runtime tradeoff. This benchmark takes the default configuration, sometimes varying compression level if this is readily available in the API.

Rust compression libraries: Benchmarks

Without further ado, here are the results, presented in six tables to avoid cluttering up your display:

Zeros:

| Compressor | Time to pack | Bytes packed | Time to unpack |
| --- | --- | --- | --- |
| uncompressed | | 104.857.600 b | |
| lz4-compression | 536.89 ms | 411.212 b | 250.26 ms |
| lz4_flex | 8.1599 ms | 411.223 b | 14.096 ms |
| lz_fear | 41.534 ms | 411.590 b | 73.517 ms |
| lzzzz | 7.1806 ms | 411.217 b | 6.1133 ns |
| zstd-level-1 | 19.437 ms | 3.219 b | 331.79 ns |
| zstd-level-2 | 15.808 ms | 3.219 b | 333.12 ns |
| zstd-level-3 | 13.186 ms | 3.219 b | 332.96 ns |
| zstd-level-4 | 13.131 ms | 3.219 b | 331.72 ns |
| zstd-level-5 | 16.241 ms | 3.219 b | 331.78 ns |
| zstd-level-6 | 16.487 ms | 3.219 b | 332.63 ns |
| zstd-level-7 | 47.429 ms | 3.218 b | 328.78 ns |
| zstd-level-8 | 60.406 ms | 3.218 b | 331.46 ns |
| zstd-level-9 | 61.099 ms | 3.218 b | 329.93 ns |
| snap | 9.9748 ms | 4.918.404 b | 59.525 ms |
| snappy-framed | 247.52 ms | 4.936.010 b | 95.653 ms |
| snappy-framed + crc | | | 332.09 ms |
| deflate-Fast | 264.42 ms | 101.771 b | 297.41 us |
| deflate-Default | 233.50 ms | 101.771 b | 299.65 us |
| deflate-Best | 234.01 ms | 101.771 b | 299.61 us |
| flate2-1 | 31.842 ms | 477.265 b | 65.131 ms |
| flate2-2 | 332.71 ms | 101.858 b | 247.63 ms |
| flate2-3 | 331.96 ms | 101.858 b | 247.67 ms |
| flate2-4 | 331.59 ms | 101.858 b | 247.70 ms |
| flate2-5 | 332.51 ms | 101.858 b | 247.78 ms |
| flate2-6 | 330.03 ms | 101.858 b | 247.59 ms |
| flate2-7 | 330.63 ms | 101.858 b | 247.84 ms |
| flate2-8 | 330.29 ms | 101.858 b | 247.96 ms |
| yazi-BestSpeed | 328.44 ms | 101.858 b | 16.490 ms |
| yazi-Default | 329.38 ms | 101.858 b | 16.434 ms |
| yazi-BestSize | 328.35 ms | 101.858 b | 16.472 ms |
| lzma-rs | 2.2817 s | 2.597.689 b | 5.9204 s |
| lzma-rs/2 | 21.299 ms | 104.862.401 b | 25.546 ms |
| lzma-rs/xz | 21.383 ms | 104.862.401 b | 25.355 ms |
| zopfli | 1434.4 s | 103.092 b | |
| brotli | 3.8735 s | 172 b | 217.82 ms |
| tar | 22.269 ms | 104.859.136 b | |
| zip | 337.60 ms | 101.964 b | 303.23 ms |

Blog:

| Compressor | Time to pack | Bytes packed | Time to unpack |
| --- | --- | --- | --- |
| uncompressed | | 593.820 b | |
| lz4-compression | 4.9841 ms | 372.750 b | 1.7485 ms |
| lz4_flex | 2.8571 ms | 334.778 b | 816.33 us |
| lz_fear | 4.4788 ms | 363.218 b | 1.8757 ms |
| lzzzz | 1.5531 ms | 363.199 b | 6.1198 ns |
| zstd-level-1 | 2.2007 ms | 251.789 b | 50.807 us |
| zstd-level-2 | 2.8105 ms | 232.466 b | 39.041 us |
| zstd-level-3 | 3.7554 ms | 221.980 b | 31.714 us |
| zstd-level-4 | 4.0346 ms | 219.932 b | 31.310 us |
| zstd-level-5 | 7.5117 ms | 217.398 b | 29.830 us |
| zstd-level-6 | 10.431 ms | 214.032 b | 29.733 us |
| zstd-level-7 | 14.142 ms | 208.353 b | 30.104 us |
| zstd-level-8 | 17.765 ms | 206.647 b | 30.767 us |
| zstd-level-9 | 26.935 ms | 204.925 b | 30.561 us |
| snap | 1.8698 ms | 355.779 b | 645.94 us |
| snappy-framed | 3.3497 ms | 355.895 b | 742.69 us |
| snappy-framed + crc | | | 2.0513 ms |
| deflate-Fast | 8.2300 ms | 274.956 b | 6.8837 ms |
| deflate-Default | 31.865 ms | 225.712 b | 5.6215 ms |
| deflate-Best | 36.495 ms | 225.374 b | 5.5747 ms |
| flate2-1 | 6.2127 ms | 303.492 b | 3.0871 ms |
| flate2-2 | 8.9741 ms | 254.046 b | 2.3682 ms |
| flate2-3 | 12.651 ms | 235.389 b | 2.0439 ms |
| flate2-4 | 14.586 ms | 231.861 b | 2.0981 ms |
| flate2-5 | 18.574 ms | 228.565 b | 2.0585 ms |
| flate2-6 | 27.958 ms | 225.937 b | 2.0185 ms |
| flate2-7 | 29.698 ms | 225.669 b | 2.0186 ms |
| flate2-8 | 32.423 ms | 225.595 b | 2.0142 ms |
| yazi-BestSpeed | 7.5707 ms | 274.155 b | 1.5033 ms |
| yazi-Default | 27.400 ms | 225.937 b | 1.2362 ms |
| yazi-BestSize | 31.374 ms | 225.595 b | 1.2380 ms |
| lzma-rs | 22.292 ms | 345.651 b | 44.046 ms |
| lzma-rs/2 | 42.668 us | 593.851 b | 61.292 us |
| lzma-rs/xz | 34.953 us | 593.851 b | 61.907 us |
| zopfli | 1.9484 s | 216.931 b | |
| brotli | 1.1627 s | 184.232 b | 2.9552 ms |
| tar | 40.794 us | 595.456 b | |
| zip | 27.536 ms | 226.043 b | 2.3466 ms |

Cat:

| Compressor | Time to pack | Bytes packed | Time to unpack |
| --- | --- | --- | --- |
| uncompressed | | 5.996.972 b | |
| lz4-compression | 35.233 ms | 6.019.432 b | 1.0572 ms |
| lz4_flex | 1.3853 ms | 6.019.565 b | 1.0434 ms |
| lz_fear | 4.3359 ms | 5.996.995 b | 8.9415 ms |
| lzzzz | 1.3029 ms | 6.019.558 b | 6.0961 ns |
| zstd-level-1 | 5.3535 ms | 5.997.120 b | 277.57 ns |
| zstd-level-2 | 5.2604 ms | 5.997.120 b | 276.35 ns |
| zstd-level-3 | 5.9432 ms | 5.997.120 b | 276.99 ns |
| zstd-level-4 | 6.5448 ms | 5.997.120 b | 276.82 ns |
| zstd-level-5 | 26.954 ms | 5.997.120 b | 276.65 ns |
| zstd-level-6 | 33.099 ms | 5.997.120 b | 278.19 ns |
| zstd-level-7 | 31.469 ms | 5.997.120 b | 276.26 ns |
| zstd-level-8 | 32.382 ms | 5.997.120 b | 276.73 ns |
| zstd-level-9 | 59.873 ms | 5.997.120 b | 277.68 ns |
| snap | 1.1936 ms | 5.996.786 b | 1.0078 ms |
| snappy-framed | 14.805 ms | 5.997.804 b | 3.2142 ms |
| snappy-framed + crc | | | 16.639 ms |
| deflate-Fast | 102.25 ms | 5.988.904 b | 152.67 ms |
| deflate-Default | 175.94 ms | 5.988.712 b | 151.66 ms |
| deflate-Best | 176.00 ms | 5.988.712 b | 151.33 ms |
| flate2-1 | 43.591 ms | 5.985.064 b | 19.644 ms |
| flate2-2 | 175.70 ms | 5.988.765 b | 17.393 ms |
| flate2-3 | 178.75 ms | 5.988.740 b | 17.419 ms |
| flate2-4 | 178.34 ms | 5.988.733 b | 17.401 ms |
| flate2-5 | 178.27 ms | 5.988.723 b | 17.344 ms |
| flate2-6 | 178.07 ms | 5.988.720 b | 17.346 ms |
| flate2-7 | 178.25 ms | 5.988.720 b | 17.349 ms |
| flate2-8 | 178.87 ms | 5.988.720 b | 17.347 ms |
| yazi-BestSpeed | 189.66 ms | 5.988.773 b | 19.841 ms |
| yazi-Default | 207.22 ms | 5.988.720 b | 20.053 ms |
| yazi-BestSize | 207.68 ms | 5.988.720 b | 20.048 ms |
| lzma-rs | 332.58 ms | 6.035.262 b | 595.84 ms |
| lzma-rs/2 | 1.2232 ms | 5.997.249 b | 1.4694 ms |
| lzma-rs/xz | 1.2213 ms | 5.997.249 b | 1.4887 ms |
| zopfli | 10.056 s | 5.979.994 b | |
| brotli | 5.9862 s | 5.996.992 b | 6.3637 ms |
| tar | 1.2933 ms | 5.998.592 b | |
| zip | 178.73 ms | 5.988.826 b | 20.713 ms |

rustc:

| Compressor | Time to pack | Bytes packed | Time to unpack |
| --- | --- | --- | --- |
| uncompressed | | 3.073.584 b | |
| lz4-compression | 20.110 ms | 1.072.891 b | 8.0218 ms |
| lz4_flex | 7.7173 ms | 1.010.565 b | 3.8216 ms |
| lz_fear | 14.191 ms | 1.026.934 b | 7.7286 ms |
| lzzzz | 4.2795 ms | 1.026.915 b | 6.1406 ns |
| zstd-level-1 | 6.4406 ms | 705.366 b | 73.815 us |
| zstd-level-2 | 7.0974 ms | 687.089 b | 66.501 us |
| zstd-level-3 | 9.1309 ms | 639.689 b | 54.920 us |
| zstd-level-4 | 12.386 ms | 640.254 b | 54.440 us |
| zstd-level-5 | 23.483 ms | 623.721 b | 52.706 us |
| zstd-level-6 | 30.142 ms | 621.271 b | 52.636 us |
| zstd-level-7 | 41.537 ms | 593.284 b | 51.566 us |
| zstd-level-8 | 49.291 ms | 590.382 b | 51.925 us |
| zstd-level-9 | 61.354 ms | 588.635 b | 51.922 us |
| snap | 4.6536 ms | 1.014.750 b | 1.9190 ms |
| snappy-framed | 11.806 ms | 1.015.273 b | 3.3145 ms |
| snappy-framed + crc | | | 10.137 ms |
| deflate-Fast | 26.259 ms | 803.387 b | 23.408 ms |
| deflate-Default | 110.58 ms | 670.476 b | 19.268 ms |
| deflate-Best | 402.86 ms | 665.145 b | 19.119 ms |
| flate2-1 | 16.242 ms | 825.036 b | 8.5785 ms |
| flate2-2 | 26.419 ms | 739.349 b | 8.0749 ms |
| flate2-3 | 40.826 ms | 696.929 b | 7.3984 ms |
| flate2-4 | 42.507 ms | 689.885 b | 7.0987 ms |
| flate2-5 | 54.415 ms | 681.863 b | 6.9239 ms |
| flate2-6 | 102.29 ms | 671.424 b | 6.7224 ms |
| flate2-7 | 143.61 ms | 669.388 b | 6.6738 ms |
| flate2-8 | 200.04 ms | 667.776 b | 6.6292 ms |
| yazi-BestSpeed | 26.324 ms | 800.773 b | 5.1149 ms |
| yazi-Default | 102.27 ms | 671.424 b | 3.7663 ms |
| yazi-BestSize | 241.13 ms | 667.030 b | 3.7035 ms |
| lzma-rs | 93.622 ms | 1.144.310 b | 216.56 ms |
| lzma-rs/2 | 589.55 us | 3.073.726 b | 710.17 us |
| lzma-rs/xz | 592.90 us | 3.073.726 b | 710.56 us |
| zopfli | 50.235 s | 629.089 b | |
| brotli | 7.5668 s | 484.743 b | 12.214 ms |
| tar | 600.54 us | 3.075.584 b | |
| zip | 102.87 ms | 671.530 b | 8.1872 ms |

“Hackers”:

| Compressor | Time to pack | Bytes packed | Time to unpack |
| --- | --- | --- | --- |
| uncompressed | | 650.614.271 b | |
| lz4-compression | 3.7903 s | 651.414.695 b | 129.46 ms |
| lz4_flex | 85.586 ms | 651.216.503 b | 65.234 ms |
| lz_fear | 366.12 ms | 648.771.012 b | 575.33 ms |
| lzzzz | 63.256 ms | 651.305.354 b | 6.0870 ns |
| zstd-level-1 | 439.54 ms | 648.460.233 b | 22.411 us |
| zstd-level-2 | 441.59 ms | 648.418.708 b | 23.721 us |
| zstd-level-3 | 548.52 ms | 648.368.014 b | 7.2039 us |
| zstd-level-4 | 658.25 ms | 648.365.759 b | 7.2404 us |
| zstd-level-5 | 4.1662 s | 648.337.922 b | 5.2644 us |
| zstd-level-6 | 5.9076 s | 648.333.128 b | 5.2728 us |
| zstd-level-7 | 5.7557 s | 648.343.350 b | 5.4069 us |
| zstd-level-8 | 5.9004 s | 648.341.149 b | 5.5170 us |
| zstd-level-9 | 8.8779 s | 648.335.678 b | 5.5076 us |
| snap | 128.50 ms | 648.918.453 b | 107.04 ms |
| snappy-framed | 1.6064 s | 649.027.666 b | 324.44 ms |
| snappy-framed + crc | | | 1.8128 s |
| deflate-Fast | 8.6462 s | 648.538.656 b | 16.544 s |
| deflate-Default | 16.693 s | 648.426.143 b | 16.252 s |
| deflate-Best | 17.246 s | 648.400.325 b | 16.208 s |
| flate2-1 | 4.6952 s | 648.842.015 b | 2.0983 s |
| flate2-2 | 20.246 s | 648.514.794 b | 528.90 ms |
| flate2-3 | 20.439 s | 648.473.313 b | 581.54 ms |
| flate2-4 | 20.464 s | 648.487.896 b | 591.07 ms |
| flate2-5 | 20.475 s | 648.472.324 b | 602.50 ms |
| flate2-6 | 20.593 s | 648.430.184 b | 602.63 ms |
| flate2-7 | 20.763 s | 648.406.714 b | 604.39 ms |
| flate2-8 | 20.897 s | 648.404.739 b | 605.78 ms |
| yazi-BestSpeed | 22.319 s | 648.532.301 b | 232.64 ms |
| yazi-Default | 24.115 s | 648.430.184 b | 354.83 ms |
| yazi-BestSize | 24.478 s | 648.404.238 b | 359.26 ms |
| lzma-rs | 36.416 s | 657.469.682 b | 63.492 s |
| lzma-rs/2 | 131.01 ms | 650.644.056 b | 157.51 ms |
| lzma-rs/xz | 131.32 ms | 650.644.056 b | 156.57 ms |
| zopfli | 1143.0 s | 648.028.632 b | |
| brotli | 991.38 s | 648.143.053 b | 744.05 ms |
| tar | 139.79 ms | 650.615.808 b | |
| zip | 20.725 s | 648.430.290 b | 903.65 ms |

Random:

| Compressor | Time to pack | Bytes packed | Time to unpack |
| --- | --- | --- | --- |
| uncompressed | | 104.857.600 b | |
| lz4-compression | 613.13 ms | 105.268.759 b | 17.846 ms |
| lz4_flex | 12.249 ms | 105.268.808 b | 9.4065 ms |
| lz_fear | 58.225 ms | 104.857.715 b | 98.943 ms |
| lzzzz | 10.981 ms | 105.268.808 b | 6.0830 ns |
| zstd-level-1 | 62.791 ms | 104.860.010 b | 264.05 ns |
| zstd-level-2 | 63.049 ms | 104.860.010 b | 264.89 ns |
| zstd-level-3 | 73.445 ms | 104.860.010 b | 263.81 ns |
| zstd-level-4 | 86.024 ms | 104.860.010 b | 263.88 ns |
| zstd-level-5 | 414.09 ms | 104.860.010 b | 266.23 ns |
| zstd-level-6 | 499.91 ms | 104.860.010 b | 265.65 ns |
| zstd-level-7 | 502.17 ms | 104.860.010 b | 263.70 ns |
| zstd-level-8 | 503.29 ms | 104.860.010 b | 264.23 ns |
| zstd-level-9 | 989.28 ms | 104.860.010 b | 263.80 ns |
| snap | 20.548 ms | 104.862.404 b | 17.891 ms |
| snappy-framed | 258.06 ms | 104.880.010 b | 53.925 ms |
| snappy-framed + crc | | | 288.75 ms |
| deflate-Fast | 1.3971 s | 104.874.105 b | 2.6327 s |
| deflate-Default | 2.6175 s | 104.874.100 b | 2.6149 s |
| deflate-Best | 2.6238 s | 104.874.100 b | 2.6236 s |
| flate2-1 | 762.13 ms | 104.954.683 b | 290.82 ms |
| flate2-2 | 3.2967 s | 104.874.120 b | 18.494 ms |
| flate2-3 | 3.3415 s | 104.874.120 b | 18.491 ms |
| flate2-4 | 3.3275 s | 104.874.120 b | 18.513 ms |
| flate2-5 | 3.3259 s | 104.874.120 b | 18.515 ms |
| flate2-6 | 3.3308 s | 104.874.120 b | 18.512 ms |
| flate2-7 | 3.3412 s | 104.874.120 b | 18.457 ms |
| flate2-8 | 3.3266 s | 104.874.120 b | 18.456 ms |
| yazi-BestSpeed | 3.6276 s | 104.874.120 b | 18.529 ms |
| yazi-Default | 3.9292 s | 104.874.120 b | 18.020 ms |
| yazi-BestSize | 3.9184 s | 104.874.120 b | 18.064 ms |
| lzma-rs | 5.8499 s | 106.344.201 b | 10.309 s |
| lzma-rs/2 | 21.551 ms | 104.862.401 b | 25.388 ms |
| lzma-rs/xz | 21.304 ms | 104.862.401 b | 26.078 ms |
| zopfli | 172.59 s | 104.866.020 b | |
| brotli | 114.65 s | 104.857.921 b | 103.72 ms |
| tar | 22.409 ms | 104.859.136 b | |
| zip | 3.3542 s | 104.874.226 b | 109.24 ms |

As usual, my benchmarks are available on GitHub.


4 Replies to “Rust compression libraries”

  1. Zip compressing 100mb random data in 60kb? That’s impossible. In fact, it’s very similar for all test inputs, another huge red flag.

    What is happening (from a quick look): After compression, you take the position of the Cursor instead of the length of the compressed data – the whole point of a Cursor is that it’s seekable.

    Also, take a look at the lz4_flex and lzzzz, decompressing in 5.3/7.6 ns – impossibly fast – regardless of input (with one exception that has realistic time). It’s not actually decompressing the data. I don’t know why though, maybe criterion::black_box is failing?

  2. Hi, this is a nice comparison and you put quite some effort into it.

    However, as like many other statistics, it would require additional details on how the data was measured to make the data usable.
    Such as, are those numbers from a single run? Is it a mean value? If it is a mean value, how often was it executed (cold cache, hot cache, …)? If it is a mean value, how big are the outliers, variance or standard deviation.

    Disclaimer: did not look at the Github project, since I prefer to have this information directly available with the data tables.
