Strip Leading Whitespace from Heredocs in Ruby

Mike Kenneally

Update: Starting in Ruby 2.3 this can be done natively with the “squiggly” heredoc. Using a tilde instead of a hyphen strips the indentation of the least-indented line from every line, so any deeper, relative indentation is preserved.

<<~EOS
  <pre>
    <code>puts "foobar"</code>
  </pre>
EOS
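To make the effect concrete, here is a minimal sketch that assigns the same squiggly heredoc to a variable and prints it. The variable name html is just for illustration.

```ruby
# The two-space indentation shared by every line is removed;
# the extra indentation of the <code> line is kept.
html = <<~EOS
  <pre>
    <code>puts "foobar"</code>
  </pre>
EOS

puts html
# <pre>
#   <code>puts "foobar"</code>
# </pre>
```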

A “heredoc” in programming is short for “here document”. If you’re not familiar with the concept I encourage you to read up on it. It’s a way to declare a file or a long, multiline string inline. Formatting is preserved and quotes have no special meaning inside the body, which means strings with quotes in them don’t have to be escaped.

For example, if you needed a string with some spacing, indentation, and quotes you could do something like this:

"<pre>\n  <code>puts \"foobar\"</code>\n</pre>"

But reading and writing that is difficult. With a heredoc you could do this [1]:

<<-EOS
<pre>
  <code>puts "foobar"</code>
</pre>
EOS
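As a quick check, the sketch below compares the heredoc with an escaped string. One detail worth noting: a heredoc always ends with a newline after its last line, so the escaped version needs a trailing \n to be equal. The variable names are only illustrative.

```ruby
# Escaped version: newlines and inner quotes must be written by hand.
escaped = "<pre>\n  <code>puts \"foobar\"</code>\n</pre>\n"

# Heredoc version: the body is written exactly as it should appear.
heredoc = <<-EOS
<pre>
  <code>puts "foobar"</code>
</pre>
EOS

puts escaped == heredoc # prints true
```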

There are all sorts of uses for heredocs, but I find them especially useful in tests when I need to compare some kind of long, formatted output.

Case in point, I just switched to Redcarpet to process the Markdown this blog is written in. But before I did I wrote several tests, one of which ensures that Redcarpet renders fenced code blocks.

describe "render_html_from_markdown" do

  it "renders fenced code blocks" do
    markdown = "```\ndef foobar\n  true\nend\n```"
    output   = "<pre><code>def foobar\n  true\nend\n</code></pre>\n"

    render_html_from_markdown(markdown).must_equal(output)
  end

end

This works, but heredocs can dramatically improve the readability of this test.

describe "render_html_from_markdown" do

  it "renders fenced code blocks" do
    markdown = <<-EOS
      ```
      def foobar
        true
      end
      ```
    EOS

    render_html_from_markdown(markdown).must_equal <<-EOS
      <pre><code>def foobar
        true
      end
      </code></pre>
    EOS
  end

end

The HTML output isn’t quite formatted how I would have done it by hand, but the important thing is that the test is significantly more readable.

There is, however, one problem: the test now fails. I’ve indented the heredocs for readability, as you normally would with any code, but the hyphen in <<- only allows the closing delimiter to be indented. The leading whitespace of the body is still treated as part of the string, so the strings no longer match.
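The mismatch is easy to see in isolation. This is a small sketch with a hypothetical method, indented_markdown, that returns an indented <<- heredoc:

```ruby
def indented_markdown
  # The hyphen lets the closing EOS be indented, but the body's
  # leading spaces stay in the string.
  <<-EOS
    def foobar
      true
    end
  EOS
end

p indented_markdown
# prints "    def foobar\n      true\n    end\n"

p indented_markdown == "def foobar\n  true\nend\n"
# prints false
```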

If you’re using Rails (or ActiveSupport) there’s a strip_heredoc helper method that solves this problem. It looks for the least indented line in the string and removes that amount of leading whitespace from each line [2].

describe "render_html_from_markdown" do

  it "renders fenced code blocks" do
    markdown = <<-EOS.strip_heredoc
      ```
      def foobar
        true
      end
      ```
    EOS

    render_html_from_markdown(markdown).must_equal <<-EOS.strip_heredoc
      <pre><code>def foobar
        true
      end
      </code></pre>
    EOS
  end

end
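If you aren’t using ActiveSupport and can’t use the squiggly heredoc, the same idea can be sketched in plain Ruby. The helper name strip_leading_indent is my own, and this is an approximation of the behaviour described above rather than ActiveSupport’s actual source.

```ruby
# Hypothetical helper: removes the indentation of the least-indented
# non-blank line from every line of the string.
def strip_leading_indent(text)
  # Collect the leading whitespace of each non-blank line and take the shortest.
  indent = text.scan(/^[ \t]*(?=\S)/).map(&:length).min || 0
  # Remove exactly that many leading whitespace characters from each line.
  text.gsub(/^[ \t]{#{indent}}/, "")
end

expected = strip_leading_indent(<<-EOS)
      <pre><code>def foobar
        true
      end
      </code></pre>
EOS

p expected
# prints "<pre><code>def foobar\n  true\nend\n</code></pre>\n"
```

Blank lines and lines shorter than the common indentation are left alone, which is usually what you want in test fixtures.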

  1. The actual acronym or word you use to delimit the heredoc doesn’t really matter. The two most common are EOS (end of string) and EOF (end of file), but using semantic names like “SQL” is even better.
  2. The syntax is admittedly a little awkward, but a method called on a heredoc is written after the starting delimiter, not after the closing one.
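Both footnotes can be seen in one short sketch: a semantic delimiter name (SQL) and a method call attached to the opening delimiter. The query and the posts table are made up for illustration.

```ruby
# .strip is written on the opening delimiter, but it is applied to the
# whole heredoc body, removing the trailing newline here.
query = <<~SQL.strip
  SELECT title
  FROM posts
SQL

p query # prints "SELECT title\nFROM posts"
```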