Capture standard output in Ruby
Recently, I wrote a little Ruby script in which I wanted to capture the data that is sent to standard output. To be clear, this is not about capturing output from a subprocess. What I wanted to do was call a method and redirect everything this method writes to standard output into a string buffer. It took me a while to figure out how this can be done, so I thought I had better write it down. Maybe it is helpful for someone else, too.
Ruby has two ways to access standard output. First, there is a global constant STDOUT that refers to an IO object. You can print data to standard output by sending, for example, a puts message to STDOUT:
STDOUT.puts "Hello, World!"   # prints: Hello, World!
(You can also simply write puts "Hello, World!" instead of STDOUT.puts "Hello, World!".)
The second way to access standard output is through the global variable $stdout. Both STDOUT and $stdout initially refer to the same IO object, so you can substitute $stdout for STDOUT in the example above. The difference is that STDOUT is a constant and $stdout is a variable. This means that you cannot assign a new value to STDOUT (well, you can, but Ruby will issue a warning and you will burn in programmer’s hell forever if you ignore it). What you can do is assign another value to $stdout. We will see in a moment why this is important.
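To make the constant-versus-variable distinction concrete, here is a minimal sketch. The local variable names are my own, and the first check assumes a fresh interpreter in which nobody has reassigned $stdout yet.

```ruby
require 'stringio'

# In a fresh interpreter both names refer to the very same IO object.
same_object_at_start = $stdout.equal?(STDOUT)

original_stdout = $stdout
$stdout = StringIO.new               # plain assignment to a variable, no warning
diverged = !$stdout.equal?(STDOUT)   # the variable now points somewhere else
$stdout = original_stdout            # restore it before printing the results

puts "same object at start: #{same_object_at_start}"
puts "differs after assignment: #{diverged}"
```

The constant STDOUT still refers to the original IO object throughout; only the variable changes.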
My first attempt was to use the reopen method that is provided by the IO class. This method takes an IO object or a path and optionally a mode string. In the first case, reopen re-associates the receiver with the given IO object. In the second case, reopen opens a new output stream on the path and then re-associates the receiver with this stream.
old_stdout = STDOUT.dup
STDOUT.reopen('/tmp/ruby-output')
puts "Hello, World!"
STDOUT.reopen(old_stdout)
In the example above, we use reopen on STDOUT to send all data that is written to standard output to the file /tmp/ruby-output (warning: don’t try this in irb, since after STDOUT.reopen(...) you won’t see any more output!).
So far so good, but as I mentioned earlier, I wanted to have the output in a string. Of course, we could use File.read('/tmp/ruby-output') to read the contents of the file into a string, but that’s not exactly an elegant solution. What we really want is an alternative implementation of IO that collects all output into a string. Searching the Ruby documentation quickly revealed StringIO, an IO-compatible class that provides pseudo I/O on a String object. Fine, so let’s try it out:
require 'stringio'
old_stdout = STDOUT.dup
out = StringIO.new
STDOUT.reopen(out)   # raises TypeError
puts "Hello, World!"
STDOUT.reopen(old_stdout)
puts out.string
In contrast to the previous example, we don’t reopen STDOUT to a file but to a StringIO object. Sadly, this doesn’t work. The reopen statement fails with “can’t convert StringIO into String (TypeError)”. In fact, the implementation of IO#reopen checks whether the first argument is an IO object. If not, the argument is converted to a string and a new IO object is opened on the path represented by that string. The error occurs because reopen does not recognize the StringIO object as a valid IO object and then fails to convert it to a string. Typing StringIO.superclass into irb reveals Data, not IO, as the superclass of StringIO! Since reopen checks whether the first argument is_a?(IO), it does not accept our StringIO object.
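The failure is easy to reproduce without putting the real standard output at risk. The sketch below calls reopen on a duplicate of STDOUT; the variable names are my own, and I only print the exception class because the exact message wording differs between Ruby versions.

```ruby
require 'stringio'

buffer = StringIO.new
buffer_is_io = buffer.is_a?(IO)

# Work on a duplicate so the real standard output is never touched.
scratch = STDOUT.dup
reopen_error =
  begin
    scratch.reopen(buffer)
    nil                      # only reached if reopen unexpectedly succeeds
  rescue TypeError => error
    error
  ensure
    scratch.close
  end

puts "StringIO is_a?(IO): #{buffer_is_io}"
puts "reopen raised: #{reopen_error.class}"
```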
As a side note, the IO implementation does not follow the concept of duck typing here (although it easily could). Instead of checking the base class of the first argument, it would be better to check whether the object behaves like an IO object. It doesn’t really matter whether its class is a subclass of IO. As long as it behaves like an IO, it should be OK (duck typing: “looks like a duck, walks like a duck, must be a duck!”). Checking the behavior at this point would mean checking that the object responds to the messages that could be sent to an IO object. In fact, reopen could simply omit the check and let errors happen as soon as a message is sent that the object doesn’t understand.
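A behavior-based check of this kind could look roughly like the following. Note that io_like? and the list of messages are purely hypothetical choices of mine for illustration. They are not part of Ruby, and a real check would have to decide which messages actually matter.

```ruby
require 'stringio'

# Hypothetical list of messages a "writable IO" should understand.
WRITER_MESSAGES = %i[write puts print flush].freeze

# Hypothetical helper: asks what the object can do, not what class it is.
def io_like?(object)
  WRITER_MESSAGES.all? { |message| object.respond_to?(message) }
end

puts io_like?(STDOUT)            # a real IO
puts io_like?(StringIO.new)      # not an IO subclass, but behaves like one
puts io_like?("just a string")   # lacks the writing messages
```

With this check, STDOUT and a StringIO both pass, while a plain String does not.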
So, back to our reopen problem. A solution would be to create a subclass of IO that simply delegates all messages to a StringIO object. Or we could implement our own StringIO class. Both solutions are feasible, but require some work and add complexity to the application. Luckily, we still have the global variable $stdout. Since it is a variable, we can assign it a new value.
old_stdout = $stdout
out = StringIO.new
$stdout = out
puts "Hello, World!"
$stdout = old_stdout
puts out.string
Simply assigning a StringIO object to $stdout does the trick. Since the implementation of Kernel#puts delegates to $stdout, the output is sent to our StringIO object. This solution has a few drawbacks, however. First, it doesn’t work if someone writes to standard output directly via STDOUT. Second, output from subprocesses is not captured. If this is a problem, you need to reopen STDOUT, either to a file or to a custom IO object. If not, the simple solution from above should work fine.
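To see both drawbacks side by side, here is a sketch that runs the same block through two capture strategies. The method names capture_by_assignment and capture_by_reopen are my own inventions, the temporary file name is arbitrary, and the subprocess part assumes an echo command is available on the system.

```ruby
require 'stringio'
require 'tempfile'

# Variant from the text: swap the $stdout variable for a StringIO.
def capture_by_assignment
  old_stdout = $stdout
  $stdout = StringIO.new
  yield
  $stdout.string
ensure
  $stdout = old_stdout
end

# Hypothetical alternative: point the STDOUT stream itself at a temp file.
def capture_by_reopen
  saved = STDOUT.dup
  Tempfile.create('captured-stdout') do |file|
    begin
      STDOUT.reopen(file.path, 'w')
      yield
    ensure
      STDOUT.flush
      STDOUT.reopen(saved)   # put the original stream back
      saved.close
    end
    File.read(file.path)
  end
end

noisy = lambda do
  puts 'via Kernel#puts'          # goes through $stdout
  STDOUT.puts 'via STDOUT'        # bypasses the $stdout variable
  system('echo via subprocess')   # child process writes to the inherited stream
end

by_assignment = capture_by_assignment(&noisy)
by_reopen     = capture_by_reopen(&noisy)

puts "assignment captured: #{by_assignment.inspect}"
puts "reopen captured:     #{by_reopen.inspect}"
```

In the first run only the Kernel#puts line ends up in the string, while the other two lines appear on the real standard output. The file-based variant also picks up the STDOUT and subprocess lines, at the price of going through the file system.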
There is one more pitfall in this solution (it also applies to the reopen solution). If an exception occurs between $stdout = out and $stdout = old_stdout, $stdout won’t be set back to its original value. To avoid this problem, we have to surround our “business code” with a begin...ensure block.
old_stdout = $stdout
out = StringIO.new
$stdout = out
begin
  puts "Hello, World!"
ensure
  $stdout = old_stdout
end
puts out.string
Now, the ratio of “infrastructure code” to “business code” is not very good. Well, it wasn’t good before either :-). We have exactly two lines where the actual work is done and seven lines dealing with standard output redirection. Fortunately, Ruby gives us blocks, so we can easily encapsulate the infrastructure code.
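require 'stringio'

# Runs the given block with $stdout redirected to a string buffer
# and returns everything the block wrote to it.
def with_stdout_captured
  old_stdout = $stdout
  out = StringIO.new
  $stdout = out
  begin
    yield
  ensure
    # Always restore the original stream, even if the block raises.
    $stdout = old_stdout
  end
  out.string
end

out = with_stdout_captured do
  puts "Hello, World!"
end
puts out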
The method with_stdout_captured makes the code easier to read. Not only does it hide the details of redirecting standard output but it also clearly reveals the intention when someone else is reading the code.
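To convince myself that the ensure clause really does its job, the following sketch repeats the helper (so the snippet runs on its own) and raises inside the block. The ArgumentError and its message are made up for the demonstration.

```ruby
require 'stringio'

# Restatement of the helper from the text so this snippet is self-contained.
def with_stdout_captured
  old_stdout = $stdout
  out = StringIO.new
  $stdout = out
  begin
    yield
  ensure
    $stdout = old_stdout
  end
  out.string
end

before = $stdout
begin
  with_stdout_captured do
    puts "this line goes into the buffer"
    raise ArgumentError, "simulated failure"   # hypothetical error
  end
rescue ArgumentError => error
  restored = $stdout.equal?(before)
  puts "rescued: #{error.message}; stdout restored: #{restored}"
end
```

One thing to keep in mind: because the exception propagates out of the helper, the text captured up to that point is not returned to the caller.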
OK, this was a rather lengthy post about something as simple as capturing standard output. However, at least I learned a lot about standard output in Ruby, and if you have read this far, I hope you enjoyed it.