Ticket #121 (closed enhancement: duplicate)

Opened 1 year ago

Last modified 5 months ago

Progressive HTTP output

Reported by: b.candl..@pobox.com Assigned to:
Priority: medium Milestone: The Future
Component: Merb Keywords:
Cc:

Description

It can be useful in several scenarios if a web application can return data to the browser in a stream, sending data as it becomes available, rather than building up the whole body first and then sending it back in one burst. Examples include:

  • Returning progress information for long-running requests
  • Proxy applications (where the proxied data is of arbitrary size)
  • Screen updates using multipart/x-mixed-replace
  • Any time you want to send a large amount of data back, but X-sendfile isn't appropriate

This is easy to do with a traditional CGI, for example:

#!/usr/bin/ruby -w
$stdout.sync = true
puts "Content-Type: text/html"
puts
puts "<html>", " " * 256
puts "<body>"
(1..10).each do |i|
  puts "<p>#{i}</p>"
  sleep 1
end
puts "</body></html>" 

It would be a plus for the Merb framework if it were able to do this (assuming that the underlying Mongrel server allows it).

I tried to do it by creating a socket pair, returning one end of this from the controller, and pumping data into it from another thread:

==> dist/app/controllers/hello.rb <==
require 'thread'
require 'socket'

class Hello < Application
  def world
    inp, out = Socket.pair("PF_UNIX","SOCK_STREAM",0)
    inp.sync = true
    out.sync = true
    Thread.new(out) do |o|
      o.puts "<html>#{" " * 256}"
      o.puts "<body>"
      (1..10).each do |i|
        o.puts "<p>#{i}</p>"
        sleep 1
      end
      o.puts "</body></html>"
      o.close
    end
    return inp
  end
end

However, testing this seems to show that all the output is buffered and returned as a single burst after 10 seconds.

telnet localhost 4000
GET /hello/world HTTP/1.0
<blank line>

Change History

08/11/07 05:04:39 changed by r.@tinyclouds.org

You can do this with Merb. The catch is that you must stop Mongrel from writing the Content-Length header, which requires either monkey-patching Mongrel::HttpResponse#send_status or sending your own headers. Here is a Merb action which does the latter:

  def show
    response.header['Content-Type'] = 'text/plain'
    response.header['Cache-Control'] = 'no-cache'
    response.header['Pragma'] = 'no-cache'
    
    wait = params[:wait].to_f > 0.05 ? params[:wait].to_f : 1
    
    Proc.new do
      response.write("HTTP/1.1 200 OK\n")
      response.write("Cache-Control: no-cache\n")
      response.write("Pragma: no-cache\n")
      response.write("Content-type: text/json\n\n")
      
      while true
        sleep wait
        response.write "Hello World\n"
      end
      response.done = true
    end
  end

08/11/07 06:51:25 changed by b.candl..@pobox.com

Thank you. I had just got around to working out how to do this myself with Mongrel. I found that I also had to undo the $tcp_cork_opts on the socket to make it work. Here's the complete code:

require 'rubygems'
require 'mongrel'

class SimpleHandler < Mongrel::HttpHandler
  def process(request, response)
    if $tcp_cork_opts
      # Mongrel does setsockopt($tcp_cork_opts) to prevent TCP sending
      # partial segments. We need to undo it.
      response.socket.setsockopt(*($tcp_cork_opts[0..-2] + [0]))
    end

    response.send_status(nil)   # BUG: wrongly adds "Content-Length: " header.
    response.header.out.seek(-18, IO::SEEK_CUR) # So we have to unset it.
    response.header["Content-Type"] = "text/plain"
    response.send_header
    response.write "<html>#{" " * 256}\n"
    response.write "<body>\n"
    (1..10).each do |i|
      response.write "<p>#{i}</p>#{" " * 1}\n"
      #response.socket.flush # -- not needed
      sleep 1
    end
    response.write "</body></html>\n"
  end
end

h = Mongrel::HttpServer.new("0.0.0.0", "3000")
h.register("/test", SimpleHandler.new)
h.run.join

Now, if I run your example under Merb, I find that the output is not buffered to fill whole TCP segments. But strangely, it doesn't come one line at a time; it comes in twos and threes:

$ echo -en "GET /hello/show HTTP/1.0\r\n\r\n" | nc -o /dev/stderr localhost 4000
> 00000000 47 45 54 20 2f 68 65 6c 6c 6f 2f 73 68 6f 77 20 # GET /hello/show
> 00000010 48 54 54 50 2f 31 2e 30 0d 0a 0d 0a             # HTTP/1.0....
HTTP/1.1 200 OK
Cache-Control: no-cache
Pragma: no-cache
Content-type: text/json

Hello World
Hello World
< 00000000 48 54 54 50 2f 31 2e 31 20 32 30 30 20 4f 4b 0a # HTTP/1.1 200 OK.
< 00000010 43 61 63 68 65 2d 43 6f 6e 74 72 6f 6c 3a 20 6e # Cache-Control: n
< 00000020 6f 2d 63 61 63 68 65 0a 50 72 61 67 6d 61 3a 20 # o-cache.Pragma:
< 00000030 6e 6f 2d 63 61 63 68 65 0a 43 6f 6e 74 65 6e 74 # no-cache.Content
< 00000040 2d 74 79 70 65 3a 20 74 65 78 74 2f 6a 73 6f 6e # -type: text/json
< 00000050 0a 0a 48 65 6c 6c 6f 20 57 6f 72 6c 64 0a 48 65 # ..Hello World.He
< 00000060 6c 6c 6f 20 57 6f 72 6c 64 0a                   # llo World.
Hello World
Hello World
Hello World
< 0000006a 48 65 6c 6c 6f 20 57 6f 72 6c 64 0a 48 65 6c 6c # Hello World.Hell
< 0000007a 6f 20 57 6f 72 6c 64 0a 48 65 6c 6c 6f 20 57 6f # o World.Hello Wo
< 0000008a 72 6c 64 0a                                     # rld.
Hello World
Hello World
Hello World
< 0000008e 48 65 6c 6c 6f 20 57 6f 72 6c 64 0a 48 65 6c 6c # Hello World.Hell
< 0000009e 6f 20 57 6f 72 6c 64 0a 48 65 6c 6c 6f 20 57 6f # o World.Hello Wo
< 000000ae 72 6c 64 0a                                     # rld.
Hello World
Hello World
< 000000b2 48 65 6c 6c 6f 20 57 6f 72 6c 64 0a 48 65 6c 6c # Hello World.Hell
< 000000c2 6f 20 57 6f 72 6c 64 0a                         # o World.
Hello World
Hello World
< 000000ca 48 65 6c 6c 6f 20 57 6f 72 6c 64 0a 48 65 6c 6c # Hello World.Hell
< 000000da 6f 20 57 6f 72 6c 64 0a                         # o World.
Hello World
Hello World
... etc

This suggests to me that something is flushing the TCP buffer every 2 or 3 seconds. If I add the $tcp_cork_opts frig to your code, then it comes a line at a time as expected.
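The $tcp_cork_opts adjustment amounts to repeating Mongrel's own setsockopt call with the final value changed to 0. Below is a minimal sketch of that step on its own, assuming $tcp_cork_opts has the [level, option, value] shape implied by the handler above. The helper names uncork_args and uncork! are hypothetical and not part of Mongrel.

```ruby
require 'socket'

# Given cork options of the form [level, optname, value], return the
# arguments that switch the same option off again.
def uncork_args(cork_opts)
  cork_opts[0..-2] + [0]
end

# Undo the cork on a socket. The handler above guards with
# `if $tcp_cork_opts`, so nil is treated here as "nothing to undo".
# Returns true if setsockopt was called, false otherwise.
def uncork!(socket, cork_opts = $tcp_cork_opts)
  return false unless cork_opts
  socket.setsockopt(*uncork_args(cork_opts))
  true
end
```

Inside a handler this would be called as uncork!(response.socket) before the first write.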

So, I guess the answer is "yes this can be done", but it would be nice if the recipe were documented somewhere :-) Thanks again for the code.

08/11/07 07:04:09 changed by b.candl..@pobox.com

10/20/07 08:32:37 changed by todd.fish..@gmail.com

Hi,

I think you need to set the HTTP Transfer-Encoding: chunked header.

From the RFC => http://www.w3.org/Protocols/rfc2616/rfc2616.html

"The chunked encoding modifies the body of a message in order to transfer it as a series of chunks, each with its own size indicator, followed by an OPTIONAL trailer containing entity-header fields. This allows dynamically produced content to be transferred along with the information necessary for the recipient to verify that it has received the full message."

Here's a snippet of code I used to send these headers in a proxy server I worked on, taken from here => http://mongrel-esi.googlecode.com/svn/trunk/lib/esi/handler.rb

begin
  # we detected the surrogate wants us to process the response checking for ESI 
  # tags.

  response.header["Transfer-Encoding"] = "chunked"
  # this is the important part: rather than send the whole document back, we send it in chunks.
  # each fragment is its own chunk; this does mean we require HTTP 1.1
  header = Mongrel::Const::STATUS_FORMAT % [@status, Mongrel::HTTP_STATUS_CODES[@status]]
  header.gsub!(/Connection: close\r\n/,'')
  response.header.out.rewind
  header << response.header.out.read + Mongrel::Const::LINE_END
  header.gsub!(/Status:.*?\r\n/,'')
  response.write( header ) 

  #puts header

  @parser.process_io( :input_stream => proxy_response,
                      :response_headers => response.header,
                      :request => request,
                      :http_params => @params ) do |buffer|
    # send a new chunk
    size = buffer.size
    chunk_header = "#{"%x" % size}" + Mongrel::Const::LINE_END
    #puts chunk_header.inspect
    response.write( chunk_header )  # write the chunk size
    #puts buffer.inspect
    response.write( buffer + Mongrel::Const::LINE_END )  # write the chunk
  end
rescue => e 
  response.write( error_response(..@url) ) 
end
response.write( "0\r\n\r\n" ) 
response.done = true
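
The chunk framing used in the snippet above can be sketched in isolation. This is an illustration of the wire format only, not code from mongrel-esi. The names encode_chunk and LAST_CHUNK are made up for the example. It uses bytesize rather than size because the chunk length is counted in bytes, and it skips empty buffers because a zero-length chunk is what marks the end of the body.

```ruby
CRLF = "\r\n"

# The zero-length chunk that terminates a chunked body (no trailers).
LAST_CHUNK = "0#{CRLF}#{CRLF}"

# Frame one piece of body data as an HTTP/1.1 chunk:
# size in hex, CRLF, the data itself, CRLF.
# Empty data yields an empty string, since "0\r\n\r\n" would be read
# by the client as the end of the response.
def encode_chunk(data)
  return "" if data.empty?
  "#{data.bytesize.to_s(16)}#{CRLF}#{data}#{CRLF}"
end

# A complete chunked body built from two fragments.
body = ["<p>1</p>\n", "<p>2</p>\n"].map { |part| encode_chunk(part) }.join + LAST_CHUNK
```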

10/20/07 12:48:43 changed by b.candl..@pobox.com

I don't think it is necessary to set chunked encoding. The code snippet above dated 08/11/07 06:51:25 shows how it *can* work without this.

The point is: Merb doesn't in itself support sending a response in stages. You need to manipulate the underlying Mongrel response object directly, i.e. you have to call methods on response to send the headers, send the chunks of body as required (with or without HTTP chunking), and finally mark the response as sent. You also need to frig about with $tcp_cork_opts under Linux.

This is all fine by me, as it's an uncommon usage scenario. However I would like it to be documented: "yes you can do this with Merb, and here's how". So far the only documentation I know of is this ticket :-)

01/09/08 23:51:53 changed by davidl..@berkeley.edu

I'm assuming changes would need to be made to ERB / Erubis / Eruby / HAML / Markaby for this to work, since those renderers all return completed pages rather than rendering in increments.

01/10/08 01:02:14 changed by b.candl..@pobox.com

I'd be happy to render partials explicitly as required. This is easy enough for ERB anyway. Markaby may not be so useful as AFAIK you can't generate unbalanced HTML.

So a solution might be to have a variant of 'render', say 'render_now', which instead of just returning a string actually sends it to the browser as an HTTP chunk (sending the headers first, if they've not yet been sent), leaving the connection open. Any string returned at the end of the controller action would be the final chunk. This would basically be Todd Fisher's code above, wrapped up nicely.
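
A rough sketch of what such a 'render_now' could look like, to make the proposal concrete. Everything here is hypothetical: ChunkedStreamer is not a Merb or Mongrel class, and it writes to any object that responds to #write (a socket in practice, a StringIO when trying it out) rather than hooking into a real controller.

```ruby
require 'stringio'

# Hypothetical helper, not Merb or Mongrel API.
class ChunkedStreamer
  CRLF = "\r\n"

  def initialize(io, headers = {})
    @io = io
    @headers = headers
    @headers_sent = false
  end

  # Send one piece of the body immediately as an HTTP chunk,
  # writing the status line and headers first if not yet sent.
  def render_now(string)
    send_headers unless @headers_sent
    return if string.empty? # a zero-length chunk would end the body
    @io.write("#{string.bytesize.to_s(16)}#{CRLF}#{string}#{CRLF}")
  end

  # Send the action's return value as the final chunk, then terminate.
  def finish(string = "")
    render_now(string)
    @io.write("0#{CRLF}#{CRLF}")
  end

  private

  def send_headers
    @io.write("HTTP/1.1 200 OK#{CRLF}")
    @headers.merge("Transfer-Encoding" => "chunked").each do |name, value|
      @io.write("#{name}: #{value}#{CRLF}")
    end
    @io.write(CRLF)
    @headers_sent = true
  end
end

out = StringIO.new
streamer = ChunkedStreamer.new(out, "Content-Type" => "text/html")
streamer.render_now("<p>1</p>\n")    # headers go out with the first chunk
streamer.finish("</body></html>\n")  # final chunk plus terminator
```

A real implementation would also need the $tcp_cork_opts handling discussed earlier in this ticket, and would require an HTTP/1.1 client, since chunked encoding is not available in HTTP/1.0.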

You would still have to manage the controller lock if you have these long-running actions: i.e. either disable it entirely, or return a proc from your controller action which does the actual work.

02/11/08 12:20:39 changed by shayarne..@gmail.com

  • status changed from new to closed.
  • resolution set to duplicate.

02/11/08 12:29:07 changed by b.candl..@pobox.com

For my information, could you reference the ticket which this is a duplicate of?