Is Ruby 2.3 Faster? How to Prepare Yourself for Frozen String Literals and Not Lose Performance

Posted by Alexander Dymo on February 23, 2016

Ruby 2.3 was released in December 2015 with yet another bunch of performance improvements. But is it really faster than 2.2? Let's take a look.

This is the fifth post in my series about Ruby 2.3 performance. In it we'll talk about how to port your code to work with frozen string literals, the experimental feature in Ruby 2.3 that is planned to become the default in 3.0. It turns out you can lose performance if you are not careful during porting.

Aren't Frozen String Literals Supposed to Speed Up Your Code?

In some cases they make your code faster. In others they do not make any difference. Read my previous post if you're interested in the details.

What Can Go Wrong?

Most Ruby code today breaks when you turn on the frozen string literals feature. Chances are you'll have to port both your code and the gems you use.

It turns out that you can actually lose performance in the process. Let me show you the pitfalls.

String#gsub!, String#upcase!, and Other Bang Functions

I've been a huge fan of the bang functions. They (often) save memory, reduce the load on garbage collection, and greatly improve performance.

They obviously do not work on frozen strings. It's tempting to simply replace them with their non-bang variants. But you should not, because every non-bang call allocates a new string.

Instead, locate the place where the string is constructed, and make it mutable. If you're doing string modifications (usually in template builders or similar code), it's absolutely fine to keep the string mutable. You will not benefit from freezing it anyway.

Consider this example:

s = "some string"
s.gsub!(/some/, 'random')
s.upcase!

To port this code you'll only need to use the String constructor instead of the literal:

s = String.new("some string")
s.gsub!(/some/, 'random')
s.upcase!
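
Here is a minimal sketch of the same port that runs the same way with or without the feature turned on. It assumes an explicit .freeze as a stand-in for a frozen literal, and the helper name randomize_and_upcase is made up for illustration:

```ruby
# Simulate a frozen literal with an explicit .freeze, so the sketch behaves
# the same whether or not the frozen string literals feature is enabled.
frozen_source = "some string".freeze

# Hypothetical helper: makes one mutable copy, then modifies it in place.
def randomize_and_upcase(source)
  s = String.new(source)       # mutable copy; the source stays frozen
  s.gsub!(/some/, 'random')    # in place, no separate result string
  s.upcase!                    # in place as well
  s
end

result = randomize_and_upcase(frozen_source)
puts result                    # prints RANDOM STRING
```

The only allocation added by the port is the single copy made by String.new. The bang calls keep working on that copy as before.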

Often the place where the string is initialized is far away from the bang function call. Fortunately, Ruby can report where the string was created in the error message. If it does not do so for you, run it with the --debug=frozen-string-literal flag.

String#<<

This is also my favorite. It's a faster version of String#+= that modifies the string in place instead of creating a separate result string in memory. It's often used like this:

s = ""
s << do_something()
s << do_something_else()

It is bad for performance to change this code to String#+=. As in the previous example, the right way is to create the mutable string in the first place:

s = String.new("")
s << do_something()
s << do_something_else()

This snippet is a common pattern in template generators. In fact, Ruby's own ERB must be patched in this fashion.
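
To see the difference between the two operators, here is a small sketch that checks object identity. The helper names and the sample parts are made up for illustration:

```ruby
# Appends every part with String#<<; the buffer stays the same object.
def build_in_place(parts)
  buffer = String.new("")
  original = buffer
  parts.each { |part| buffer << part }
  [buffer, buffer.equal?(original)]
end

# Appends every part with String#+=; each step allocates a new String
# and rebinds the variable to it.
def build_with_plus_equals(parts)
  buffer = ""
  original = buffer
  parts.each { |part| buffer += part }
  [buffer, buffer.equal?(original)]
end

parts = ["<h1>", "title", "</h1>"]
in_place, same_object = build_in_place(parts)
copied, still_same = build_with_plus_equals(parts)

puts same_object   # prints true
puts still_same    # prints false
```

Both helpers produce the same text, but only the first one keeps a single buffer. Note that String.new("") is used rather than a bare String.new, because the latter returns a binary (ASCII-8BIT) string.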

String#encode!

This one is frequently used in Ruby gems and Rails. Usually you will see code like this:

result = "some_string".force_encoding(Encoding::UTF_8).encode!

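Here it's actually OK to use non-bang String#encode because String#encode! is one of the few bang functions that does not save any memory. It creates a copy of the string internally. Note that force_encoding also modifies its receiver, so in real code it must be called on a mutable string; the literal here stands in for a string obtained elsewhere.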

result = "some_string".force_encoding(Encoding::UTF_8).encode
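
A quick sketch of how the two variants behave on a frozen string. It assumes an explicit .freeze as a stand-in for a frozen literal, and UTF-16LE is an arbitrary target encoding picked for illustration:

```ruby
frozen = "some_string".freeze
encoding_before = frozen.encoding

# Non-bang encode returns a new, mutable string and leaves the receiver alone.
converted = frozen.encode(Encoding::UTF_16LE)

# The bang variant refuses to touch a frozen receiver.
bang_error = begin
  frozen.encode!(Encoding::UTF_16LE)
  nil
rescue RuntimeError => e   # FrozenError on newer Rubies is a RuntimeError subclass
  e
end

puts converted.encoding    # prints UTF-16LE
puts bang_error.class
```

Since the bang variant builds a converted copy internally anyway, switching to encode here should not cost extra memory.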

Function Arguments with Default String Values

Another thing you can often see in Ruby libraries: a function accepts an argument with a default string value, like this:

def do_something(value, encoding = "UTF-8")
  encoding.upcase!
  # ...
end

This function expects the encoding argument to be all upper-case. Its default value is, of course, upper-case. But it doesn't trust the caller and does the upcase! itself to make double sure.

If you change the code to upcase, you'll start copying strings in memory. That is potentially a bad thing, especially in frequently called library functions.

There are several better solutions, each with its own benefit and caveat. You could trust the caller and skip upper-casing. Better, you could validate the argument before using it. For example:

def do_something(value, encoding = "UTF-8")
  unless ["UTF-8", "ASCII"].include?(encoding)
    raise "bad encoding #{encoding}"
  end
  # ...
end

Passing a symbol to such a function would make even more sense. You'd still need a check for allowed encodings though.
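
The symbol-based variant could look roughly like this. The ALLOWED_ENCODINGS table, the symbol names, and the function body are assumptions made for illustration, not code from a real library:

```ruby
# Hypothetical lookup table: symbols in, frozen encoding names out.
ALLOWED_ENCODINGS = {
  utf8:  "UTF-8".freeze,
  ascii: "ASCII".freeze
}.freeze

def do_something(value, encoding = :utf8)
  # Hash#fetch runs the block only when the key is missing,
  # so the lookup doubles as the validation step.
  name = ALLOWED_ENCODINGS.fetch(encoding) do
    raise ArgumentError, "bad encoding #{encoding.inspect}"
  end
  value.encode(name)   # placeholder body for the sketch
end

puts do_something("abc").encoding
puts do_something("abc", :ascii).encoding
```

No string is upper-cased or copied to normalize the argument, and an unknown symbol fails early with an ArgumentError.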

Verdict: Be Careful

It's easy to lose performance when making your code work with the frozen string literals feature. If you're not careful, you'll allocate extra strings in memory. That will defeat the whole purpose of turning the feature on. Keep in mind the pitfalls I demonstrated above.
