Baptiste Fontaine’s Blog

Syntax Quiz #1: Ruby’s mysterious percent suite

This is the first post of a series I’m starting about syntax quirks in various languages. I’ll divide each post into two parts: the first one states the question (mainly “Is this valid? What does it do? How does it work?”); the second one gives an answer.

In Ruby, any sequence of 3+4n (where n≥0) percent signs (%) is valid; can you guess why?

Here are the first members of this sequence:

%%%             # n=0
%%%%%%%         # n=1
%%%%%%%%%%%     # n=2
%%%%%%%%%%%%%%% # n=3

I’m using Ruby 2.3.0, but I was able to reproduce this behavior on the almost-10-year-old Ruby 1.8.5, so you should be fine with any version.

You can stop here and try to solve this problem or skip below for an answer. I’m including an unrelated image below so that people with large screens are not spoiled.

The answer to the problem lies in two things: string literals and string formatting.

You might know you can use %q() to create a string, which can be handy if you have both single and double quotes and don’t want to escape them:

my_str = %q(it's a "valid" string)

This syntax doesn’t support interpolation with #{}, but its uppercased friend does:

my_str = %Q(it's still #{42 - 41} "valid" string)

Equivalent literals exist for arrays of strings (%w() and %W()), for regular expressions (%r) and, starting in Ruby 2.0.0, for arrays of symbols (%i):

names = %w(alice bob charlie) # => ["alice", "bob", "charlie"]
names.each { |name| puts "Hello #{name}" }

my_syms = %i(my symbols) # => [:my, :symbols]

puts "yep" if %q(my string) =~ %r(my.+string)

You can also use [], {} or <> instead of parentheses:

%w{foo bar}             # => ["foo", "bar"]
%q[my string goes here] # => "my string goes here"
%i<a b c>               # => [:a, :b, :c]

Ruby lets you use a percent sign alone as an alias of %Q:

%(foo bar) == %<foo bar> # => true
%{foo bar} == "foo bar"  # => true
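
To make the “alias of %Q” part concrete, here is a small sketch comparing the bare % form with %Q and %q when interpolation is involved. The answer variable and the constant names are mine, made up for this illustration:

```ruby
# Hypothetical value, used only for this illustration.
answer = 42

BARE  = %(the answer is #{answer})   # bare % interpolates, like %Q
UPPER = %Q(the answer is #{answer})
LOWER = %q(the answer is #{answer})  # %q keeps the text as written

puts BARE == UPPER  # prints "true"
puts LOWER          # prints the raw text, interpolation marker included
```

So the bare form is the interpolating one; if you want the raw text you have to ask for %q explicitly.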

But wait; there’s more! You can also use most non-alphanumeric characters like | (%|my string|), ^ (%w^x y z^), or… %:

%w%my array%        # => ["my", "array"]
%q%my string%       # => "my string"
%%my other string%  # => "my other string"

This means that %||, %^^ or %%% can be used to denote an empty string (don’t do that in real programs, please). It answers the problem for the case n=0: %%% is an empty string; the first percent sign indicates it’s a literal string, and the following two are respectively the beginning and end delimiters.
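
As a quick check of the claim above, this sketch collects a few of these empty literals in an array and compares them with a plain "". The constant name is just something I picked for the example:

```ruby
# Each entry is a percent literal whose opening and closing delimiters
# are adjacent, so each one should be an empty string.
EMPTY_LITERALS = [
  %%%,  # percent signs as delimiters
  %||,  # pipes
  %^^,  # carets
  %()   # parentheses
]

puts EMPTY_LITERALS.all? { |s| s == "" }  # prints "true"
```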

The second part of our answer is string formatting.

If you have ever written a Python program you know it supports string formatting à la sprintf with %:

print "this is %s, I'm %d years-old" % ("Python", 25)

Well, Ruby supports the same operation through a method called String#%:

puts "this is %s, I'm %d years-old" % ["Ruby", 21]

In both languages you can drop the array/tuple if you have only one argument:

print "I'm %s" % "Python"
print "I'm %d" % 25
puts "I'm %s" % "Ruby"
puts "I'm %d" % 21

Both will raise an exception if you don’t pass enough arguments, but only Python will also do so if you pass too many:

# Python
print "I'm %s" % ("Python", "Ruby")
# => TypeError: not all arguments converted during string formatting
# Ruby
puts "I'm %s" % ["Ruby", "Python"]
# prints "I'm Ruby"

This means that while "" % "" is syntactically valid in both languages, only Ruby runs it without error: Python raises an exception complaining that the argument (the string on the right) is not used.
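
Here is a minimal sketch of the Ruby side of this, assuming a default Ruby run (no debug flag; as far as I know Ruby only raises on extra arguments when $DEBUG is set). It also shows that String#% gives the same result as Kernel#format. The constant names are mine:

```ruby
# String#% and Kernel#format should produce the same string.
WITH_OPERATOR = "this is %s, I'm %d years-old" % ["Ruby", 21]
WITH_FORMAT   = format("this is %s, I'm %d years-old", "Ruby", 21)

puts WITH_OPERATOR == WITH_FORMAT  # prints "true"

# An empty format string has no placeholder, so the argument is unused.
EMPTY_RESULT = "" % ""
puts EMPTY_RESULT.empty?           # prints "true"
```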

If we combine this knowledge with what we saw above about string literals, we now know we can write the following in Ruby:

%%% % %%% # equivalent to "" % ""

The last piece of the puzzle is that this works without spaces and can be chained:

""  % ""  % ""  # => ""
%%% % %%% % %%% # => ""
%%%%%%%%%%%     # => ""

The 3+4n refers to the way the expression is constructed: the first three percent signs are an empty string, and each following group of four is the formatting operator followed by another empty string.
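
If you’d rather check the rule than take my word for it, here is a rough sketch that feeds runs of percent signs to eval and keeps the lengths Ruby accepts. The percent_run_valid? helper is hypothetical, written just for this post:

```ruby
# Hypothetical helper: true when a run of `count` percent signs evaluates
# to an empty string, false when Ruby rejects it as a syntax error.
def percent_run_valid?(count)
  eval("%" * count) == ""
rescue SyntaxError
  false
end

valid_counts = (1..15).select { |count| percent_run_valid?(count) }
p valid_counts  # the 3+4n rule predicts [3, 7, 11, 15]
```

Note that SyntaxError has to be rescued explicitly, since a bare rescue only catches StandardError.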

Want more of these? Here are a few other valid Ruby expressions using strings and percent signs (one per line); guess how they’re parsed and evaluated:

%# <- is it really valid? :-#