Benchmarking Ruby 2.6 to 3.2

The yearly benchmarking Ruby post

It’s this time of the year again where Ruby is released and everyone asks: Is it faster? You will find out below! And if you are interested, you can compare the results to the previous installations of this post for the years 2016, 2017, 2020 and 2021.

This christmas Ruby 3.2.0 was released, featuring improvements across the board. The YJIT just in time compiler has been ported to Rust and can now be used on ARM machines. And variable width allocation (VWA) has been enabled by default. Let’s see how fast Ruby got in the past year!

The Benchmark Setup

I will be using the same benchmark setup as last year with only a few changes to make the graphs easier to read:

  1. HexaPDF

    The following commands were excecuted in the benchmark/ directory:

    ./rubies.sh "2.6.10 2.7.7 3.0.5 3.1.3y 3.2.0 3.2.0y" optimization -b "hexapdf CS$"
    ./rubies.sh "2.6.10 2.7.7 3.0.5 3.1.3y 3.2.0 3.2.0y" raw_text -b hexapdf
    ./rubies.sh "2.6.10 2.7.7 3.0.5 3.1.3y 3.2.0 3.2.0y" line_wrapping -b "hexapdf C"
    

    A Ruby version with an appended “y” tells the script to activate YJIT.

    Did you know that HexaPDF is part of the YJIT headline benchmarks?

  2. kramdown

    The following command was excecuted in the kramdown repository directory:

    ./benchmark/benchmark-rubies.sh "2.6.10 2.7.7 3.0.5 3.1.3y 3.2.0 3.2.0y"
    

    A Ruby version with an appended “y” tells the script to activate YJIT.

  3. geom2d

    This benchmark is done using the superb benchmark-driver gem.

    benchmark-driver benchmark.yaml --rbenv "2.6.10;2.7.7;3.0.5;3.1.3 --yjit;3.2.0;3.2.0 --yjit" -o record
    benchmark-driver benchmark_driver.record.yml -o gruff
    

Results

HexaPDF

The images are SVG files, click on them to open them in a new window to view details. The raw data is already the post-processed data ready for gnuplot-ingestion, with the time in milliseconds and the memory in kilobytes.

Optimization Benchmark

HexaPDF optimization benchmark

Time "hexapdf 2.6.10" "hexapdf 2.7.7" "hexapdf 3.0.5" "hexapdf 3.1.3-yjit" "hexapdf 3.2.0" "hexapdf 3.2.0-yjit"
"CS a.pdf" 188 191 195 424 189 305
"CS b.pdf" 796 885 860 958 819 854
"CS c.pdf" 1542 1659 1714 1502 1549 1218
"CS d.pdf" 3124 3306 3538 3066 3393 2690
"CS e.pdf" 747 819 820 969 774 797
"CS f.pdf" 45440 48377 49116 37725 46389 34489


Memory "hexapdf 2.6.10" "hexapdf 2.7.7" "hexapdf 3.0.5" "hexapdf 3.1.3-yjit" "hexapdf 3.2.0" "hexapdf 3.2.0-yjit"
"CS a.pdf" 29224 28420 27988 294724 30336 35128
"CS b.pdf" 51104 50380 50012 315624 45756 57380
"CS c.pdf" 49184 52524 52628 319708 54344 59840
"CS d.pdf" 79956 78908 75352 340456 78360 83844
"CS e.pdf" 89796 98672 88084 373956 110704 116940
"CS f.pdf" 586996 602100 597028 872624 547892 532800


Filesize "hexapdf 2.6.10" "hexapdf 2.7.7" "hexapdf 3.0.5" "hexapdf 3.1.3-yjit" "hexapdf 3.2.0" "hexapdf 3.2.0-yjit"
"CS a.pdf" 49227 49226 49228 49226 49226 49227
"CS b.pdf" 11045210 11045211 11045210 11045209 11045212 11045210
"CS c.pdf" 13180717 13180715 13180714 13180715 13180717 13180717
"CS d.pdf" 6418483 6418483 6418483 6418482 6418483 6418483
"CS e.pdf" 21751180 21751181 21751180 21751180 21751181 21751180
"CS f.pdf" 117545254 117545255 117545254 117545254 117545254 117545254

Raw Text Benchmark

HexaPDF raw text benchmark

Time "hexapdf 2.6.10" "hexapdf 2.7.7" "hexapdf 3.0.5" "hexapdf 3.1.3-yjit" "hexapdf 3.2.0" "hexapdf 3.2.0-yjit"
"1x" 547 543 548 720 568 513
"5x" 1856 2182 2090 1862 1954 1570
"10x" 3661 4109 4003 3425 3679 2909
"1x ttf" 570 590 593 742 590 571
"5x ttf" 2148 2299 2423 2185 2221 1880
"10x ttf" 4284 4467 4698 3950 4269 3486


Memory "hexapdf 2.6.10" "hexapdf 2.7.7" "hexapdf 3.0.5" "hexapdf 3.1.3-yjit" "hexapdf 3.2.0" "hexapdf 3.2.0-yjit"
"1x" 35152 34252 32356 298724 38928 43348
"5x" 47480 47648 46792 313284 49712 53444
"10x" 60140 59800 59256 325612 62100 65252
"1x ttf" 33760 34400 33392 298996 36592 41268
"5x ttf" 49204 45592 43896 310380 51056 55004
"10x ttf" 63212 63012 63908 323152 70132 73668


Filesize "hexapdf 2.6.10" "hexapdf 2.7.7" "hexapdf 3.0.5" "hexapdf 3.1.3-yjit" "hexapdf 3.2.0" "hexapdf 3.2.0-yjit"
"1x" 441386 441388 441386 441386 441386 441386
"5x" 2201631 2201631 2201633 2201631 2201631 2201631
"10x" 4403276 4403278 4403276 4403278 4403276 4403278
"1x ttf" 535240 535241 535241 535239 535239 535239
"5x ttf" 2615338 2615339 2615337 2615337 2615339 2615339
"10x ttf" 5217071 5217070 5217071 5217072 5217071 5217073

Line Wrapping Benchmark

HexaPDF line wrapping benchmark

Time "hexapdf 2.6.10" "hexapdf 2.7.7" "hexapdf 3.0.5" "hexapdf 3.1.3-yjit" "hexapdf 3.2.0" "hexapdf 3.2.0-yjit"
"C 400" 1472 1699 1646 1544 1675 1379
"C 200" 1601 1815 1851 1733 1970 1539
"C 100" 1945 2209 2210 1901 2307 1730
"C 50" 3281 3584 3576 2834 3563 2499
"C 400 ttf" 1559 1779 1753 1619 1716 1516
"C 200 ttf" 1717 1976 1958 1732 1966 1622
"C 100 ttf" 2223 2515 2423 2141 2626 1885
"C 50 ttf" 5231 6203 6047 5071 5701 4759


Memory "hexapdf 2.6.10" "hexapdf 2.7.7" "hexapdf 3.0.5" "hexapdf 3.1.3-yjit" "hexapdf 3.2.0" "hexapdf 3.2.0-yjit"
"C 400" 112224 113632 115872 369368 86496 93916
"C 200" 112144 106964 108920 369136 88396 96840
"C 100" 109572 104924 105588 367296 91876 100108
"C 50" 159716 192212 235116 467600 197040 207976
"C 400 ttf" 109004 114376 92308 382244 96928 105088
"C 200 ttf" 115452 103608 103792 372156 90280 95440
"C 100 ttf" 116328 101844 104080 369268 92012 100312
"C 50 ttf" 158264 268552 258640 541588 273504 279660


Filesize "hexapdf 2.6.10" "hexapdf 2.7.7" "hexapdf 3.0.5" "hexapdf 3.1.3-yjit" "hexapdf 3.2.0" "hexapdf 3.2.0-yjit"
"C 400" 361579 361581 361579 361579 361581 361579
"C 200" 408490 408495 408493 408493 408495 408495
"C 100" 463816 463815 463814 463816 463814 463813
"C 50" 569332 569332 569332 569332 569332 569334
"C 400 ttf" 442441 442443 442446 442440 442440 442441
"C 200 ttf" 504456 504458 504457 504456 504456 504457
"C 100 ttf" 606549 606547 606549 606549 606548 606548
"C 50 ttf" 767841 767843 767839 767839 767841 767841

Comments

  • I’m just blown away at how much faster 3.2.0 with YJIT is compared to 3.2.0 without it! This is very visible in longer running benchmarks, like with optimizing f.pdf where the difference is a whopping 25%!

  • The three fastest Rubies are 3.2.0 with YJIT, 3.1.3 with YJIT and then 2.6.10. The “plain” Ruby versions are noticably slower than 2.6.10. However, it is good to see that 3.2.0 is the fastest among them in most benchmarks.

  • There is a big difference in memory usage between 3.1.3 with YJIT and 3.2.0 with YJIT since the latter doesn’t reserve memory up-front anymore. The result is that the memory overhead of 3.2.0 with YJIT compared to plain 3.2.0 is very small. This means that there is no need to tune any YJIT-related memory options anymore!

  • It seems that startup time with YJIT has drastically improved compared to 3.1.3 which is noticable in the short-running benchmarks.

kramdown

kramdown benchmark

# ruby-2.6.10p210 ||	ruby-2.7.7p221 ||	ruby-3.0.5p211 ||	ruby-3.1.3p185-yjit ||	ruby-3.2.0	|| ruby-3.2.0-0-yjit
256	0.68738	0.75570	0.73757	0.67675	0.81496	0.70220
512	1.67798	1.68121	1.54550	1.45382	1.74624	1.54535
1024	3.42654	3.40070	3.44496	3.06991	3.59604	3.18020

Surprinsingly, Ruby 3.2.0 is the slowest one in this benchmark. The good news is that 3.2.0 with YJIT performs very well and is only beaten by 3.1.3 with YJIT.

geom2d

The bars represent instructions per seconds, so larger bars are better.

geom2d small benchmark

Comparison:
                            small
        3.2.0 --yjit:      8200.6 i/s
        3.1.3 --yjit:      7124.2 i/s - 1.15x  slower
              2.6.10:      4373.8 i/s - 1.87x  slower
               3.0.5:      3972.3 i/s - 2.06x  slower
               3.2.0:      3908.2 i/s - 2.10x  slower
               2.7.7:      3789.6 i/s - 2.16x  slower

Like last time Ruby with YJIT performs best in this largely CPU-bound benchmark. And YJIT got faster compared to Ruby 3.1.3!

Conclusion

Ruby 3.2.0 with YJIT is a big step forward performance-wise for all kinds of applications. And I’m very pleased to see that it got faster since 3.1.x and deemed production ready!

Other enhancements like enabled-by-default variable width allocation (VWA) and object shapes are certainly also contributing to a better and more performant Ruby experience.