Capture standard output in Ruby

Recently, I wrote a little Ruby script in which I wanted to capture the data that is sent to standard output. To be clear, this is not about capturing output from a subprocess. What I wanted was to call a method and redirect everything that method writes to standard output into a string buffer. It took me a while to figure out how to do this, so I thought I had better write it down. Maybe it will be helpful for someone else, too.

Ruby has two ways to access standard output. First, there is a global constant STDOUT that refers to an IO object. You can print data to standard output by sending, for example, a puts message to STDOUT.

STDOUT.puts "Hello, World!"
=> Hello, World!
(This code is rarely seen in Ruby programs. The Kernel module also provides methods like puts and print which delegate to standard output, so one can simply write puts "Hello, World!" instead of STDOUT.puts "Hello, World!".)

The second way to access standard output is through the global variable $stdout. Both STDOUT and $stdout initially refer to the same IO object, so you can substitute $stdout for STDOUT in the example above. The difference, however, is that STDOUT is a global constant and $stdout is a variable. This means that you cannot assign a new value to STDOUT (well, you can, but Ruby will issue a warning and you will burn in programmer’s hell forever if you ignore it). What you can do is assign another value to $stdout. We will see in a moment why this is important.

My first attempt was to use the reopen method that is provided by the IO class. This method takes an IO object or a path and optionally a mode string. In the first case, reopen re-associates the receiver with the given IO object. In the second case, reopen opens a new output stream on the path and then re-associates the receiver with this stream.

old_stdout = STDOUT.dup
STDOUT.reopen('/tmp/ruby-output')
puts "Hello, World!"
STDOUT.reopen(old_stdout)

In the example above, we use reopen on STDOUT to send all data that is written to standard output to the file /tmp/ruby-output (warning: don’t try this in irb, because after STDOUT.reopen(...) you won’t see any more output!). So far so good, but as I mentioned earlier, I wanted to have the output in a string. Of course, we could use File.read('/tmp/ruby-output') to read the contents of the file into a string, but that’s not exactly an elegant solution. What we really want is an alternative implementation of IO that collects all output into a string. Searching the Ruby documentation quickly revealed StringIO, an IO-compatible class that provides pseudo I/O on a String object. Fine, so let’s try it out:

require 'stringio'   # make sure StringIO is loaded when running as a script
old_stdout = STDOUT.dup
out = StringIO.new
STDOUT.reopen(out)
puts "Hello, World!"
STDOUT.reopen(old_stdout)
puts out.string

In contrast to the previous example, we reopen STDOUT not to a file but to a StringIO object. Sadly, this doesn’t work. The reopen statement fails with “can’t convert StringIO into String (TypeError)”. The implementation of IO#reopen checks whether the first argument is an IO object. If it is not, the argument is converted to a string and a new IO object is opened on the path represented by that string. The error occurs because reopen does not recognize the StringIO object as a valid IO object and then fails to convert it to a string. Typing StringIO.superclass into irb reveals Data, not IO, as the superclass of StringIO! Since reopen checks whether the first argument is_a?(IO), it does not accept our StringIO object.
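
The failure is easy to reproduce in isolation. The following is only a minimal sketch: the helper name reopen_error_for is made up for this illustration, and because the exact wording of the error message differs between Ruby versions, the sketch looks only at the exception class.

```ruby
require 'stringio'

# Hypothetical helper for this illustration: tries to reopen STDOUT onto
# the given target and returns the TypeError that is raised, or nil if
# reopen succeeded. Only pass objects that are neither an IO nor a path.
def reopen_error_for(target)
  STDOUT.reopen(target)
  nil
rescue TypeError => e
  e
end

buffer = StringIO.new
error = reopen_error_for(buffer)

puts buffer.is_a?(IO)   # StringIO is not a subclass of IO
puts error.class        # the class of the exception raised by reopen
```

Because reopen raises before it touches the underlying stream, STDOUT still works after the failed attempt.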

As a side note, the IO implementation does not follow the concept of duck typing here (while it could do so easily). Instead of checking the base class of the first argument, it would be better to check whether the object behaves like an IO object. It doesn’t really matter whether its class is a subclass of IO. As long as it behaves like an IO, it should be ok (duck typing: “looks like a duck, walks like a duck, must be a duck!”). Checking the behavior at this point would mean checking that the object responds to the messages that could be sent to an IO object. In fact, reopen could simply omit the check and let errors happen as soon as a message is sent to the object that it doesn’t understand.
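
To make the idea concrete, here is a rough sketch of what such a behavior check could look like. This is not what IO#reopen does; the method name quacks_like_writer? and the list of messages are assumptions chosen for this illustration.

```ruby
require 'stringio'

# Hypothetical duck-typing check: instead of asking for the class,
# ask whether the object understands the messages we intend to send.
WRITER_MESSAGES = %i[write puts print flush].freeze

def quacks_like_writer?(object)
  WRITER_MESSAGES.all? { |message| object.respond_to?(message) }
end

puts quacks_like_writer?(StringIO.new)  # true
puts quacks_like_writer?(STDOUT)        # true
puts quacks_like_writer?('plain text')  # false
```

A StringIO passes this check even though it is not a subclass of IO, which is exactly the point of duck typing.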

So, back to our reopen problem. A solution would be to create a subclass of IO that simply delegates all messages to a StringIO object. Or we could implement our own StringIO class. Both solutions are feasible, but require some work and add complexity to the application. Luckily, we still have the global variable $stdout. Since it is a variable, we can assign it a new value.

old_stdout = $stdout
out = StringIO.new
$stdout = out
puts "Hello, World!"
$stdout = old_stdout
puts out.string

Simply assigning a StringIO object to $stdout does the trick. Since the implementation of Kernel#puts delegates to $stdout, the output is sent to our StringIO object. This solution has a few drawbacks, however. First, it doesn’t work if someone writes to standard output directly via STDOUT. Second, output from subprocesses is not captured. If this is a problem, you need to reopen STDOUT, either to a file or to a custom IO object. If not, the simple solution from above should work fine.
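
For the case where direct writes to STDOUT or subprocess output have to be captured as well, reopening STDOUT onto a temporary file is one option. The following is a rough sketch, assuming a Unix-like system where the echo command is available; the helper name with_stdout_fd_captured is hypothetical.

```ruby
require 'tempfile'

# Hypothetical helper: redirects the process-level standard output to a
# temporary file while the block runs and returns what was written.
def with_stdout_fd_captured
  saved_stdout = STDOUT.dup            # keep a handle on the real stdout
  Tempfile.create('captured-stdout') do |file|
    STDOUT.flush
    STDOUT.reopen(file)                # a File is an IO, so reopen accepts it
    begin
      yield
    ensure
      STDOUT.flush                     # push buffered output into the file
      STDOUT.reopen(saved_stdout)      # restore the original stdout
      saved_stdout.close
    end
    file.rewind
    file.read                          # becomes the return value of the helper
  end
end

captured = with_stdout_fd_captured do
  STDOUT.puts 'written via STDOUT'     # bypasses $stdout
  system('echo written by a subprocess')
end
puts captured
```

The price for the wider reach is a detour through the file system, which is the inelegance the $stdout approach avoids.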

There is one more pitfall in this solution (it also applies to the reopen solution). If an exception occurs between $stdout = out and $stdout = old_stdout, $stdout won’t be set back to its original value. To avoid this problem, we have to surround our “business code” with a begin...ensure block.

old_stdout = $stdout
out = StringIO.new
$stdout = out
begin
   puts "Hello, World!"
ensure
   $stdout = old_stdout
end
puts out.string
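
To see the ensure clause at work, the business code can be made to fail on purpose. This is only an illustrative sketch; the raised error and the outer rescue are additions for the demonstration and not part of the solution itself.

```ruby
require 'stringio'

old_stdout = $stdout
out = StringIO.new
$stdout = out
begin
  begin
    puts 'written before the failure'
    raise 'simulated failure'          # hypothetical error in the business code
  ensure
    $stdout = old_stdout               # runs even though an exception was raised
  end
rescue RuntimeError => e
  puts "rescued: #{e.message}"         # goes to the real standard output again
end
puts out.string
```

The text printed before the exception still ends up in the buffer, and everything printed afterwards reaches the original standard output.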

Now, the ratio of “infrastructure code” to “business code” is not very good. Well, it wasn’t good before either :-). We have exactly two lines where the actual work is done and seven lines dealing with standard output redirection. Fortunately, Ruby gives us blocks, so we can easily encapsulate the infrastructure code.

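require 'stringio'

# Runs the given block with $stdout redirected to a string buffer
# and returns everything the block wrote to standard output.
def with_stdout_captured
  old_stdout = $stdout
  out = StringIO.new
  $stdout = out
  begin
    yield
  ensure
    # Restore the original stream even if the block raises.
    $stdout = old_stdout
  end
  out.string
end

out = with_stdout_captured do
  puts "Hello, World!"
end
puts out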

The method with_stdout_captured makes the code easier to read. Not only does it hide the details of redirecting standard output but it also clearly reveals the intention when someone else is reading the code.
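
One place where such a helper pays off is checking what a method prints. The sketch below repeats the helper so that it runs on its own; the greet method is a made-up example, and the plain raise stands in for whatever assertion your test framework provides.

```ruby
require 'stringio'

def with_stdout_captured
  old_stdout = $stdout
  out = StringIO.new
  $stdout = out
  begin
    yield
  ensure
    $stdout = old_stdout
  end
  out.string
end

# Hypothetical method under test: it only prints, it returns nothing useful.
def greet(name)
  puts "Hello, #{name}!"
end

output = with_stdout_captured { greet('World') }
raise "unexpected output: #{output.inspect}" unless output == "Hello, World!\n"
puts 'greet prints the expected greeting'
```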

Ok, this was a rather lengthy post about something as simple as capturing standard output. However, at least I learned a lot about standard output in Ruby, and if you have read this far, I hope you enjoyed it.

2 Comments

  1. Nick said,

    September 21, 2007 @ 8:24 pm

    This rocks. I wrote a couple of scripts to check on MySQL replication. The scripts write to a log file and standard out, but I want to send a daily report on the replication, and this should make it easy to capture the output from the script and stick it in an email. Thanks

  2. Lancer Kind said,

    February 19, 2008 @ 9:57 pm

    Great article! This outlines a solution that goes a long way into solving unit testing of stdout that is a problem with Java/C#. It’s also great that based on the need you’ve set up for code reuse, you introduced blocks/yield.

    The problem around StringIO not working is interesting. I agree with your conclusion about the code not using duck typing. Duck typing is something new that a lot of us are going to need to work at to get used to. Checking the type of the superclass in order to make decisions about its capabilities feels like the developer was still stuck in the strong type checking paradigm in Java/C#. When Java was the new cool OO kid on the block in 1995, one saw a lot of Java code that was written as if it was C (poor encapsulation, lots of public fields, few/really-long methods, checking return codes instead of using exception handling). I expect a lot of Ruby code is going to look like Java/C# code but implemented in Ruby for a while until people unlearn some habits. (I know I’m working to do that.)

    Bravo! Reading this has been a great use of my time!
