Hiveminds | Sat, 2005-08-20 18:13

So you installed Ruby 1.8. So you wanted to do some web development. So you heard about this thing called WEBrick that comes standard with Ruby. So you googled for documentation. So all you could find was Eric Hodel’s articles. So you thought “Gosh, where is the documentation?”. So you were feeling brave and tried to “use the source, Luke”. So you realised that you were a newbie to Ruby and the source looked like Yoda doing yoga while chanting “may the force be with you”. So you didn’t think the force would be with you for at least another week. So you were feeling impatient because you wanted to start trying now and perhaps you could have practised the force while trying. So you finally screamed “I’ve had it!”.

So I hope you will find this paper adequate to start you off and accompany you during your journey into WEBrick. There is also the reference section, which you can refer to when you are lost or unsure or both.

Note: This documentation references the WEBrick code shipped with Ruby 1.8.1.

1 What is WEBrick

WEBrick is an HTTP server library written by Takahashi Masayoshi and Gotou Yuuzou, along with patches contributed by various Ruby users and developers. It started from an article entitled “Internet Programming with Ruby” in a Japanese network engineering magazine, “OpenDesign”. It is now part of the Ruby 1.8 standard library.

You can use WEBrick to create HTTP-based servers or applications. You may also use it as a base for building web application frameworks like RubyOnRails, IOWA, Tofu and many others.

WEBrick can also be used to build non-HTTP servers, like the Daytime Server example on the WEBrick home page, although that would be a pity since you would not be able to use WEBrick’s support for the HTTP protocol.

In the web application paradigm WEBrick is quite low-level. It does not know about “web applications” for starters. A “user interaction session” is also a foreign concept.

All WEBrick knows about are servlets. As far as it is concerned each servlet is independent from the others. If there are many servlets working together to provide a web application, guess who should provide the glue? You! If you want to track a user’s interaction through the servlets, guess who should provide the code? You! If you need that functionality I recommend using IOWA or Tofu or others. Other people have taken pains to provide those additional layers on top of WEBrick so you do not have to re-invent the wheel.

     # A simple WEBrick invocation
     require 'webrick'

     server = WEBrick::HTTPServer.new
     #
     # You would want to mount handlers here.
     # Read further to know what
     # handlers are.
     #
     # trap signals to invoke the shutdown procedure cleanly
     ['INT', 'TERM'].each { |signal|
       trap(signal) { server.shutdown }
     }
     server.start

The above example will start WEBrick with the default configuration, including the setting that tells it to listen on port 80. Now let us try to override some of the configuration:

  1. Listen on port 8080 instead of port 80 (for the rest of this documentation the default listening port is port 8080, since I already reserve port 80 for the Apache HTTP Server that always runs on my machine).
  2. Serve files from the directory /var/www

To do the above one would need to pass the appropriate configuration when instantiating the HTTPServer. Since for the rest of this document we are going to modify the configuration and instantiate HTTPServer quite frequently let us also ease that process by defining a method that does all that.

 
     require 'webrick'
     include WEBrick    # let's import the namespace so
                        # I don't have to keep typing
                        # WEBrick:: in this documentation.

     def start_webrick(config = {})
       # always listen on port 8080
       config.update(:Port => 8080)
       server = HTTPServer.new(config)
       yield server if block_given?
       ['INT', 'TERM'].each {|signal|
         trap(signal) {server.shutdown}
       }
       server.start
     end

     start_webrick(:DocumentRoot => '/var/www')

     Output:

     dede:~$ w3m -dump http://localhost:8080
     Index of /

      Name                 Last modified       Size
     ----------------------------------------------------------
      Parent Directory     2004/07/18 06:51    -
      docbook-dsssl/       2003/10/15 00:30    -
      pub/                 2004/05/24 15:46    -
     ----------------------------------------------------------
     WEBrick/1.3.1 (Ruby/1.8.1/2004-02-03) at localhost:8080

2 Mounting Servlets

In WEBrick terminology, mounting means setting up an instance of a subclass of HTTPServlet::AbstractServlet (a.k.a. a “servlet”) to service a request-URI.

When mounting a servlet one should specify the prefix of the request-URI it services. If more than one mount matches the request-URI, the one with the closest (longest) match is selected. For example, a servlet mounted at /foo would normally service the request-URI /foo/bar/is/foolish, but if there is another servlet mounted at /foo/bar then that servlet would be the one selected instead.
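The selection rule can be pictured with a small stand-alone sketch. It only illustrates longest-prefix matching and is not WEBrick's own mount table code; the helper name find_mount and the symbols standing in for servlet classes are hypothetical.

```ruby
# Hypothetical helper (not part of WEBrick): pick the mount whose
# prefix is the longest one matching the request path.
def find_mount(mounts, path)
  candidates = mounts.keys.select { |prefix|
    base = prefix.chomp('/')
    path == prefix || path.start_with?(base + '/')
  }
  best = candidates.max_by { |prefix| prefix.length }
  best && mounts[best]
end

# Symbols stand in for servlet classes.
mounts = { '/foo' => :FooServlet, '/foo/bar' => :FooBarServlet }

p find_mount(mounts, '/foo/bar/is/foolish')  # => :FooBarServlet
p find_mount(mounts, '/foo/baz')             # => :FooServlet
p find_mount(mounts, '/elsewhere')           # => nil
```

Note that the sketch matches on whole path segments, so /foobar is not treated as being under /foo.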

To mount a servlet, specify the mount path along with the class of the servlet. WEBrick creates a new instance of the servlet class for each request it receives and executes it in a separate thread.

 
     class FooServlet < HTTPServlet::AbstractServlet
     end

     class FooBarServlet < HTTPServlet::AbstractServlet
     end

     start_webrick {|server|
       server.mount('/foo', FooServlet)
       server.mount('/foo/bar', FooBarServlet)
     }

3 Standard Servlets

WEBrick comes with several servlets that you can use right away.

  • HTTPServlet::FileHandler
  • HTTPServlet::ProcHandler
  • HTTPServlet::CGIHandler
  • HTTPServlet::ERBHandler

3.1 FileHandler

FileHandler is one of the more useful standard servlets. If you specify the :DocumentRoot option, WEBrick will install a FileHandler configured to serve the path specified in the option. Roughly speaking, WEBrick automates the following for you when you set the :DocumentRoot option.

The example shown in an earlier section

     start_webrick(:DocumentRoot => '/var/www')

is functionally similar to:

     start_webrick {|server|
       doc_root = '/var/www'
       server.mount("/", HTTPServlet::FileHandler, doc_root,
         {:FancyIndexing=>true})
     }

The above example passes the :FancyIndexing option to the FileHandler servlet. There are more options described in the FileHandler Configuration section.

If the request path refers to a directory, FileHandler serves the directory index file. If there is no directory index file and the :FancyIndexing option is specified, it will serve the directory listing; otherwise it will return a 403 status (Forbidden).
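The decision just described can be sketched in plain Ruby. This is only a model of the rule, not FileHandler's implementation; directory_response, its arguments and its return values are hypothetical names chosen for illustration.

```ruby
require 'tmpdir'

# Hypothetical helper (not WEBrick code): decide what to do with a
# request that maps to the directory +dir+.
def directory_response(dir, index_files, fancy_indexing)
  index = index_files.find { |name| File.file?(File.join(dir, name)) }
  return [:index_file, index] if index
  return [:listing, Dir.entries(dir).sort - ['.', '..']] if fancy_indexing
  [:forbidden, 403]
end

Dir.mktmpdir do |dir|
  File.write(File.join(dir, 'notes.txt'), 'hello')
  p directory_response(dir, ['index.html'], true)   # a directory listing
  p directory_response(dir, ['index.html'], false)  # 403 Forbidden

  File.write(File.join(dir, 'index.html'), '<html></html>')
  p directory_response(dir, ['index.html'], false)  # the index file
end
```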

3.1.1 Overriding Default MIME Type

FileHandler needs to know the MIME type of the file so it can set the Content-Type header in the HTTP response properly. It derives the MIME type of a file by matching the filename extension against a table of extension => MIME type. The default table is in HTTPUtils::DefaultMimeTypes and is adequate for many occasions.

However, should you find it inadequate, or should you want to use the Apache-style MIME type file on your system (usually at /etc/mime.types), you can configure WEBrick to use that.

     system_mime_table = HTTPUtils::load_mime_types('/etc/mime.types')
     my_mime_table = system_mime_table.update(
       {"foo" => "application/foo"})

     start_webrick(:MimeTypes => my_mime_table)
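To make the extension lookup concrete, here is a minimal stand-alone sketch. It is a simplification rather than WEBrick's own lookup; MIME_TABLE, mime_type_for and the fallback type are assumptions made for the example.

```ruby
# A small stand-in table; the real default table is much larger.
MIME_TABLE = {
  'html' => 'text/html',
  'txt'  => 'text/plain',
  'foo'  => 'application/foo'
}

# Hypothetical helper: look up the MIME type by filename extension,
# falling back to a generic type for unknown extensions.
def mime_type_for(filename, table, fallback = 'application/octet-stream')
  ext = File.extname(filename).sub(/\A\./, '').downcase
  table.fetch(ext, fallback)
end

puts mime_type_for('index.html', MIME_TABLE)  # text/html
puts mime_type_for('data.foo', MIME_TABLE)    # application/foo
puts mime_type_for('README', MIME_TABLE)      # application/octet-stream
```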

3.1.2 Default File Handler

When FileHandler receives a request it analyses the request path. It delegates the request handling to DefaultFileHandler if the request path:

  1. does not end with *.cgi
  2. does not end with *.rhtml
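In other words, the handler is chosen by filename suffix. The sketch below models that choice with a plain hash; the table and handler_for are hypothetical, and the symbols merely stand in for the handler classes (my understanding is that .cgi goes to CGIHandler and .rhtml to ERBHandler).

```ruby
# Hypothetical suffix table; symbols stand in for handler classes.
HANDLER_TABLE = {
  'cgi'   => :CGIHandler,
  'rhtml' => :ERBHandler
}

# Hypothetical helper: anything without a special suffix falls through
# to the default file handler.
def handler_for(path)
  suffix = File.extname(path).sub(/\A\./, '')
  HANDLER_TABLE.fetch(suffix, :DefaultFileHandler)
end

p handler_for('/cgi-bin/test.cgi')  # => :CGIHandler
p handler_for('/toys/index.rhtml')  # => :ERBHandler
p handler_for('/pub/readme.txt')    # => :DefaultFileHandler
```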

DefaultFileHandler emits an ETag header based on the file’s inode (what is the inode value on non-Unix systems: nil?), size and modification time. It also understands some of the request headers. Below is a list of the HTTP headers it services, along with the corresponding explanation from the HTTP/1.1 RFC.

if-modified-since

The If-Modified-Since request-header field is used with a method to make it conditional: if the requested variant has not been modified since the time specified in this field, an entity will not be returned from the server; instead, a 304 (not modified) response will be returned without any message-body.

if-none-match

The If-None-Match request-header field is used with a method to make it conditional. A client that has one or more entities previously obtained from the resource can verify that none of those entities is current by including a list of their associated entity tags in the If-None-Match header field. The purpose of this feature is to allow efficient updates of cached information with a minimum amount of transaction overhead. It is also used to prevent a method (e.g. PUT) from inadvertently modifying an existing resource when the client believes that the resource does not exist.

if-range

If a client has a partial copy of an entity in its cache, and wishes to have an up-to-date copy of the entire entity in its cache, it could use the Range request-header with a conditional GET (using either or both of If-Unmodified-Since and If-Match). However, if the condition fails because the entity has been modified, the client would then have to make a second request to obtain the entire current entity-body. The If-Range header allows a client to “short-circuit” the second request. Informally, its meaning is ‘if the entity is unchanged, send me the part(s) that I am missing; otherwise, send me the entire new entity’.

range

The presence of a Range header in an unconditional GET modifies what is returned if the GET is otherwise successful. In other words the response carries a status code of 206 (Partial Content) instead of 200 (OK).

The presence of a Range header in a conditional GET (a request using one or both of If-Modified-Since and If-None-Match or one or both of If-Unmodified-Since and If-Match) modifies what is returned if the GET is otherwise successful and the condition is true. It does not affect the 304 (Not Modified) response returned if the conditional is false.
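As a rough illustration of how an ETag and the two conditional headers fit together, consider the sketch below. It is not DefaultFileHandler's code: the hex layout of the ETag, the unquoted tag comparison and both helper names are assumptions, and If-Range and Range handling are left out entirely.

```ruby
require 'time'

# Assumed ETag layout: inode, size and modification time in hex.
def etag_for(inode, size, mtime)
  format('%x-%x-%x', inode, size, mtime.to_i)
end

# Hypothetical helper: true when a 304 (Not Modified) would be appropriate.
def not_modified?(headers, etag, mtime)
  if headers['if-none-match']
    tags = headers['if-none-match'].split(/\s*,\s*/)
    return tags.include?(etag)
  end
  if headers['if-modified-since']
    since = Time.httpdate(headers['if-modified-since'])
    return mtime.to_i <= since.to_i
  end
  false
end

mtime = Time.utc(2004, 9, 19, 22, 33, 25)
etag  = etag_for(255, 16, mtime)

p not_modified?({ 'if-none-match' => etag }, etag, mtime)
p not_modified?({ 'if-modified-since' => 'Sat, 18 Sep 2004 22:33:25 GMT' },
                etag, mtime)
```

The first call reports true because the client already holds the current entity tag; the second reports false because the file changed after the date the client sent.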

3.2 CGIHandler

What if you have some CGI programs that you do not want or do not have the time to rewrite as WEBrick servlets? Worry not for you can still use them. Simply install a FileHandler on the directory containing your CGIs and make sure that the program files have a .cgi suffix.

     start_webrick {|server|
       cgi_dir = File.expand_path('~ysantoso/public_html/cgi-bin')
       server.mount("/cgi-bin", HTTPServlet::FileHandler, cgi_dir,
         {:FancyIndexing=>true})
     }

     dede:~$ cat ~ysantoso/public_html/cgi-bin/test.cgi
     #!/usr/bin/env ruby
     print "Content-type: text/plain\r\n\r\n"
     ENV.keys.sort.each{|k| puts "#{k} ==> #{ENV[k]}"}

     dede:~$ w3m -dump http://localhost:8080/cgi-bin/test.cgi
     GATEWAY_INTERFACE ==> CGI/1.1
     HTTP_ACCEPT ==> text/*, image/*, application/*, video/*,
     audio/*, message/*
     HTTP_ACCEPT_ENCODING ==> gzip, compress, bzip, bzip2, deflate
     HTTP_ACCEPT_LANGUAGE ==> en;q=1.0
     HTTP_HOST ==> localhost:8080
     HTTP_USER_AGENT ==> w3m/0.5.1
     PATH_INFO ==>
     QUERY_STRING ==>
     REMOTE_ADDR ==> 127.0.0.1
     REMOTE_HOST ==> dede
     REQUEST_METHOD ==> GET
     REQUEST_URI ==> http://localhost:8080/cgi-bin/test.cgi
     SCRIPT_FILENAME ==> /home/ysantoso/public_html/cgi-bin/test.cgi
     SCRIPT_NAME ==> /cgi-bin/test.cgi
     SERVER_NAME ==> localhost
     SERVER_PORT ==> 8080
     SERVER_PROTOCOL ==> HTTP/1.1
     SERVER_SOFTWARE ==> WEBrick/1.3.1 (Ruby/1.8.1/2004-02-03)

When FileHandler sees that the request path ends with .cgi, it delegates the request to CGIHandler. CGIHandler then sets up the necessary CGI-related environment variables and runs the requested CGI program. The CGI program can affect the HTTP response status returned by WEBrick by setting the “Status” header to the desired response number.

     dede:~$ cat ~ysantoso/public_html/cgi-bin/test.410.cgi
     #!/usr/bin/ruby
     print "Status: 410\r\n"
     print "Content-type: text/plain\r\n\r\n"
     puts "Tired. Frustrated. Too many requests. " +
          "Gone fishing. Be back after 5pm."

     dede:~$ w3m -dump_extra http://localhost:8080/cgi-bin/test.410.cgi
     W3m-current-url: http://localhost:8080/cgi-bin/test.410.cgi
     W3m-document-charset: US-ASCII
     HTTP/1.1 410 Gone
     Connection: close
     Date: Sun, 19 Sep 2004 22:33:25 GMT
     Server: WEBrick/1.3.1 (Ruby/1.8.1/2004-02-03)
     Content-Length: 71

     Tired. Frustrated. Too many requests. Gone fishing. Be back
     after 5pm.
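What happens to the program's output can be pictured with a small parser. This is a sketch of the idea only, not CGIHandler's actual code; parse_cgi_output and the default of 200 when no Status header is present are assumptions made for the example.

```ruby
# Hypothetical helper: split raw CGI output into status, headers and body.
def parse_cgi_output(raw)
  head, body = raw.split(/\r?\n\r?\n/, 2)
  headers = {}
  head.each_line do |line|
    name, value = line.chomp.split(/:\s*/, 2)
    headers[name.downcase] = value
  end
  # The Status pseudo-header selects the response status; assume 200 otherwise.
  status = (headers.delete('status') || '200').to_i
  [status, headers, body.to_s]
end

raw = "Status: 410\r\nContent-type: text/plain\r\n\r\nGone fishing.\n"
status, headers, body = parse_cgi_output(raw)
p status    # => 410
p headers   # => {"content-type"=>"text/plain"} (exact formatting varies by Ruby version)
p body      # => "Gone fishing.\n"
```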

Warning: CGIHandler waits until the called CGI process finishes. If your CGI performs incremental output, the output will not be sent back to the client until after the CGI process exits. I have been told by someone (I know the name but I do not want to mention it because I do not want to push him to commit to this) that he will try to get another CGI handler, one that sends back the output immediately, included in Ruby 1.8.2.

3.3 ERBHandler

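When FileHandler sees that the request path ends with .rhtml, it delegates the request to ERBHandler, which runs the file through ERB. The rest of this section is an overview of ERB itself.

ERB provides an easy to use but powerful templating system for Ruby. Using ERB, actual Ruby code can be added to any plain text document for the purposes of generating document information details and/or flow control.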

A very simple example is this:

  require 'erb'

  x = 42
  template = ERB.new <<-EOF
    The value of x is: <%= x %>
  EOF
  puts template.result(binding)

Prints: The value of x is: 42

More complex examples are given below.

Recognized Tags

ERB recognizes certain tags in the provided template and converts them based on the rules below:

  <% Ruby code -- inline with output %>
  <%= Ruby expression -- replace with result %>
  <%# comment -- ignored -- useful in testing %>
  % a line of Ruby code -- treated as <% line %> (optional -- see ERB.new)
  %% replaced with % if first thing on a line and % processing is used
  <%% or %%> -- replace with <% or %> respectively

All other text is passed through ERB filtering unchanged.

Options

There are several settings you can change when you use ERB:

  • the nature of the tags that are recognized;
  • the value of $SAFE under which the template is run;
  • the binding used to resolve local variables in the template.

See the ERB.new and ERB#result methods for more detail.

Examples

Plain Text

ERB is useful for any generic templating situation. Note that in this example, we use the convenient "% at start of line" tag, and we quote the template literally with %q{…} to avoid trouble with the backslash.

  require "erb"

  # Create template.
  template = %q{
    From:  James Edward Gray II <james@grayproductions.net>
    To:  <%= to %>
    Subject:  Addressing Needs

    <%= to[/\w+/] %>:

    Just wanted to send a quick note assuring that your needs are being
    addressed.

    I want you to know that my team will keep working on the issues,
    especially:

    <%# ignore numerous minor requests -- focus on priorities %>
    % priorities.each do |priority|
      * <%= priority %>
    % end

    Thanks for your patience.

    James Edward Gray II
  }.gsub(/^  /, '')

  message = ERB.new(template, 0, "%<>")

  # Set up template data.
  to = "Community Spokesman <spokesman@ruby_community.org>"
  priorities = [ "Run Ruby Quiz",
                 "Document Modules",
                 "Answer Questions on Ruby Talk" ]

  # Produce result.
  email = message.result
  puts email

Generates:

  From:  James Edward Gray II <james@grayproductions.net>
  To:  Community Spokesman <spokesman@ruby_community.org>
  Subject:  Addressing Needs

  Community:

  Just wanted to send a quick note assuring that your needs are being addressed.

  I want you to know that my team will keep working on the issues, especially:

      * Run Ruby Quiz
      * Document Modules
      * Answer Questions on Ruby Talk

  Thanks for your patience.

  James Edward Gray II

Ruby in HTML

ERB is often used in .rhtml files (HTML with embedded Ruby). Notice the need in this example to provide a special binding when the template is run, so that the instance variables in the Product object can be resolved.

  require "erb"

  # Build template data class.
  class Product
    def initialize( code, name, desc, cost )
      @code = code
      @name = name
      @desc = desc
      @cost = cost

      @features = [ ]
    end

    def add_feature( feature )
      @features << feature
    end

    # Support templating of member data.
    def get_binding
      binding
    end

    # ...
  end

  # Create template.
  template = %{
    <html>
      <head><title>Ruby Toys -- <%= @name %></title></head>
      <body>

        <h1><%= @name %> (<%= @code %>)</h1>
        <p><%= @desc %></p>

        <ul>
          <% @features.each do |f| %>
            <li><b><%= f %></b></li>
          <% end %>
        </ul>

        <p>
          <% if @cost < 10 %>
            <b>Only <%= @cost %>!!!</b>
          <% else %>
             Call for a price, today!
          <% end %>
        </p>

      </body>
    </html>
  }.gsub(/^  /, '')

  rhtml = ERB.new(template)

  # Set up template data.
  toy = Product.new( "TZ-1002",
                     "Rubysapien",
                     "Geek's Best Friend!  Responds to Ruby commands...",
                     999.95 )
  toy.add_feature("Listens for verbal commands in the Ruby language!")
  toy.add_feature("Ignores Perl, Java, and all C variants.")
  toy.add_feature("Karate-Chop Action!!!")
  toy.add_feature("Matz signature on left leg.")
  toy.add_feature("Gem studded eyes... Rubies, of course!")

  # Produce result.
  rhtml.run(toy.get_binding)

Generates (some blank lines removed):

   <html>
     <head><title>Ruby Toys -- Rubysapien</title></head>
     <body>

       <h1>Rubysapien (TZ-1002)</h1>
       <p>Geek's Best Friend!  Responds to Ruby commands...</p>

       <ul>
           <li><b>Listens for verbal commands in the Ruby language!</b></li>
           <li><b>Ignores Perl, Java, and all C variants.</b></li>
           <li><b>Karate-Chop Action!!!</b></li>
           <li><b>Matz signature on left leg.</b></li>
           <li><b>Gem studded eyes... Rubies, of course!</b></li>
       </ul>

       <p>
            Call for a price, today!
       </p>

     </body>
   </html>

Notes

There are a variety of templating solutions available in various Ruby projects:

  • ERB’s big brother, eRuby, works the same but is written in C for speed;
  • Amrita (smart at producing HTML/XML);
  • cs/Template (written in C for speed);
  • RDoc, distributed with Ruby, uses its own template engine, which can be reused elsewhere;
  • and others; search the RAA.

Rails, the web application framework, uses ERB to create views.

3.4 ProcHandler

WEBrick allows you to be lazy. If your need is trivial and can be expressed in a simple Proc or a block then you don’t have to bother with subclassing AbstractServlet.

 
     start_webrick {|server|
       server.mount_proc('/myblock') {|req, resp|
         resp.body = "a block mounted at #{req.script_name}"
       }

       my_wonderful_proc = Proc.new {|req, resp|
         resp.body = "my wonderful proc mounted at #{req.script_name}"
       }
       server.mount_proc('/myproc', my_wonderful_proc)

       server.mount('/myprochandler',
         HTTPServlet::ProcHandler.new(my_wonderful_proc))
     }

     Output:

     dede:~$ w3m -dump http://localhost:8080/myblock
     a block mounted at /myblock
     dede:~$ w3m -dump http://localhost:8080/myproc
     my wonderful proc mounted at /myproc
     dede:~$ w3m -dump http://localhost:8080/myprochandler
     my wonderful proc mounted at /myprochandler

4 Writing a Custom Servlet

4.1 The do Methods

Writing a servlet is easy enough. First you need to create a subclass of HTTPServlet::AbstractServlet. Then, depending on whether you want to service GET, POST, OPTIONS or HEAD requests, you add a do_GET, do_POST, do_OPTIONS or do_HEAD method respectively. If you want to support some of the less frequently encountered requests like PUT, you just need to create a corresponding do_ method, e.g. do_PUT.

AbstractServlet implements a do_HEAD and a do_OPTIONS for you. do_HEAD simply calls do_GET (which you need to provide) and sends back everything except the body. do_OPTIONS simply returns a list of the do_ methods available.
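The way a list of supported methods can be derived from the do_ methods is easy to sketch in plain Ruby. The class below is a stand-in written for this illustration; SketchServlet and allowed_methods are hypothetical names, and a real servlet would subclass HTTPServlet::AbstractServlet instead.

```ruby
# Stand-in class for illustration only.
class SketchServlet
  def do_GET;  end
  def do_POST; end
  def do_HEAD; end

  # Collect the HTTP method names for which a do_ method exists.
  def allowed_methods
    methods.map { |m| m.to_s }
           .select { |m| m =~ /\Ado_[A-Z]+\z/ }
           .map { |m| m.sub(/\Ado_/, '') }
           .sort
  end
end

p SketchServlet.new.allowed_methods  # => ["GET", "HEAD", "POST"]
```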

“What should a do_ method do?” you asked. That is up to you. WEBrick will call your do_ method with two arguments:

  • the request object
  • the response object.

Normally you want to query the request object and set the response object correspondingly.

 
     class GreetingServlet < HTTPServlet::AbstractServlet
       def do_GET(req, resp)
         if req.query['name']
           resp.body =
             "#{@options[0]} #{req.query['name']}. #{@options[1]}"
           raise HTTPStatus::OK
         else
           raise HTTPStatus::PreconditionFailed.new(
             "missing attribute: 'name'")
         end
       end
       alias do_POST do_GET  # let's accept POST requests too.
     end

     start_webrick {|server|
       server.mount('/greet',
         GreetingServlet,
         'Hi', 'Are you having a nice day?')
     }

     Output:

     dede:~$ w3m -dump 'http://localhost:8080/greet'
     Precondition Failed

     missing attribute: 'name'

     WEBrick/1.3.1 (Ruby/1.8.1/2004-02-03) at localhost:8080
     dede:~$ w3m -dump 'http://localhost:8080/greet?name=Gadis+Manis'
     Hi Gadis Manis. Are you having a nice day?

4.2 Responding

There are two ways to set the response status. The first, as shown above, is to raise an HTTPStatus exception. I recommend this method because in the case of an error status it returns an HTML page filled with the backtrace. If you need to provide a custom error page there are two options:

  1. Set the response status and body manually.
  2. Extend the HTTPResponse object with the create_error_page method which will be called upon error.

I favour the first approach since you cannot access the exception that was thrown from within a create_error_page method.

 
     class GreetingWithCustomisedErrorPageServlet <
         HTTPServlet::AbstractServlet

       def do_GET(req, resp)
         if req.query['name']
           resp.body =
             "#{@options[0]} #{req.query['name']}. #{@options[1]}"
           raise HTTPStatus::OK
         else
           resp.status = 412
           resp.body = "Error within #{self.class}"
           resp['content-type'] = 'text/plain'
         end
       end
     end

     class GreetingWithExtendedResponseObjectServlet <
         HTTPServlet::AbstractServlet

       def do_GET(req, resp)
         # Extend the resp object
         class << resp
           def create_error_page
             # 'content-type' default is 'text/html'
             self['content-type'] = 'text/plain'
             self.body = "Error within " +
               "GreetingWithExtendedResponseObjectServlet"

             # Response status is determined from
             # the HTTPStatus exception produced
           end
         end

         raise HTTPStatus::PreconditionFailed unless req.query['name']
         resp.body =
           "#{@options[0]} #{req.query['name']}. #{@options[1]}"
         raise HTTPStatus::OK
       end
     end

     start_webrick {|server|
       server.mount('/greet1',
         GreetingWithCustomisedErrorPageServlet,
         'Hi', 'Are you having a nice day?'
       )
       server.mount('/greet2',
         GreetingWithExtendedResponseObjectServlet,
         'Hi', 'Are you having a nice day?'
       )
     }

     Output:

     dede:~$ w3m -dump http://localhost:8080/greet1
     Error within GreetingWithCustomisedErrorPageServlet
     dede:~$ w3m -dump http://localhost:8080/greet2
     Error within GreetingWithExtendedResponseObjectServlet

So what HTTPStatus exceptions are available? Many. Take a look at httpstatus.rb and apply the following substitutions to each value in the StatusMessage table:

  • Remove all ’-’ characters
  • Remove all spaces
Example:

     irb(main):001:0> require 'webrick'; include WEBrick::HTTPStatus
     => Object
     irb(main):002:0> OK
     => WEBrick::HTTPStatus::OK
     irb(main):003:0> RequestURITooLarge
     => WEBrick::HTTPStatus::RequestURITooLarge
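The substitution is mechanical enough to try out without WEBrick at all. In the sketch below, STATUS_MESSAGES is a hand-copied excerpt of standard HTTP reason phrases (the 414 phrase is the one implied by the irb session above) and exception_name is a hypothetical helper.

```ruby
# A hand-copied excerpt; the real table covers many more status codes.
STATUS_MESSAGES = {
  200 => 'OK',
  203 => 'Non-Authoritative Information',
  412 => 'Precondition Failed',
  414 => 'Request-URI Too Large'
}

# Hypothetical helper: drop hyphens and spaces to get the exception name.
def exception_name(message)
  message.gsub(/[ \-]/, '')
end

STATUS_MESSAGES.each do |code, message|
  puts "#{code} HTTPStatus::#{exception_name(message)}"
end
```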

The body of a response does not necessarily have to be a String. You can pass an IO object too. This is handy if the response is long, e.g. when returning the content of a 16 MB file.

4.3 Controlling Servlet Instantiations

Sometimes you do not want WEBrick to automatically create a new instance of your servlet class. For example if the initialisation part of your servlet is expensive you may want to reuse the same instance or at least manage a pool of instances.

WEBrick calls the class method get_instance of your servlet class with the parameters config and *options. This method should return the instance that WEBrick should use to service the request, so you can override it to take control of instantiation. I recommend placing a mutex around the critical area since now the same instance may be accessed from more than one thread simultaneously.

     require 'thread'

     class CounterServlet < HTTPServlet::AbstractServlet

       @@instance = nil
       @@instance_creation_mutex = Mutex.new

       def self.get_instance(config, *options)
         @@instance_creation_mutex.synchronize {
           @@instance = @@instance || self.new(config, *options) }
       end

       attr_reader :count
       attr :count_mutex

       def initialize(config, starting_count)
         super
         @count = starting_count
         @count_mutex = Mutex.new
       end

       def do_GET(req, resp)
         resp['content-type'] = 'text/plain'
         @count_mutex.synchronize {
           resp.body = @count.to_s   # the body should be a String (or an IO)
           @count += 1 }
       end
     end

     start_webrick {|server|
       server.mount('/count_from_0', CounterServlet, 0)
       # 100 has no effect
       server.mount('/count_from_0_too', CounterServlet, 100)
     }

     Output:

     dede:~$ w3m -dump http://localhost:8080/count_from_0
     0
     dede:~$ w3m -dump http://localhost:8080/count_from_0
     1
     dede:~$ w3m -dump http://localhost:8080/count_from_0
     2
     dede:~$ w3m -dump http://localhost:8080/count_from_0
     3
     dede:~$ w3m -dump http://localhost:8080/count_from_0_too
     4
     dede:~$ w3m -dump http://localhost:8080/count_from_0_too
     5
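Why the mutex matters is easier to see outside a server. The following sketch shares one object between several threads, much as a shared servlet instance would be shared between request threads; SharedCounter and next_value are hypothetical names used only for this illustration.

```ruby
require 'thread'

# Stand-in for a shared servlet instance holding mutable state.
class SharedCounter
  attr_reader :count

  def initialize(start)
    @count = start
    @mutex = Mutex.new
  end

  # Read and increment as one critical section, so no two threads
  # can ever be handed the same value.
  def next_value
    @mutex.synchronize {
      value = @count
      @count += 1
      value
    }
  end
end

counter = SharedCounter.new(0)
threads = (1..10).map { Thread.new { 1000.times { counter.next_value } } }
threads.each { |t| t.join }
puts counter.count  # 10000
```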

4.4 Cookies

Eric Hodel has graciously allowed me to reproduce his article on WEBrick’s cookies here for the benefit of hard-copy readers. The WEBrick::Cookie structure is also copied in the reference section.

4.4.1 Eric Hodel’s “WEBrick and Cookies”

WEBrick exposes cookies in a simple, easy to use Cookie class that exposes all the properties of RFC 2109 cookies. Both HTTPRequest and HTTPResponse handily allow you to read and set cookies on requests. (Cookies are delicious delicacies.)

WEBrick::Cookie is a wrapper around a cookie that exposes all the properties of a cookie. To construct a WEBrick cookie simply call WEBrick::Cookie.new and provide the name and value for the cookie. After instantiating a cookie you can access the cookie’s properties with the following methods (descriptions from RFC 2109 and the Netscape Cookie specifications):

name

The name of the cookie. The name of the cookie may only be read, not set.

value

The value of the cookie. The value should be in a printable ASCII encoding.

version

Identifies which cookie specification this cookie conforms to: 0, the default, for Netscape Cookies and 1 for RFC 2109 cookies.

domain

The domain for which the cookie is valid. An explicitly specified domain must always start with a dot.

expires

A Time or String representing when the cookie should expire. Expires must be in the following format:

Wdy, DD-Mon-YYYY HH:MM:SS GMT

max_age

The lifetime of the cookie in seconds from the time the cookie is sent. A zero value means the cookie should be discarded immediately.

comment

Allows an origin server to document its intended use of a cookie. The user can inspect the information to decide whether to initiate or continue a session with this cookie.

path

The subset of URLs to which the cookie applies.

secure

When set to true the cookie should only be sent back over a secure connection.
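If you need to build the expires string yourself, the Netscape-style date format named above can be produced with Time#strftime. The sketch below is plain Ruby and does not touch WEBrick; cookie_expires is a hypothetical helper name.

```ruby
# Hypothetical helper: format a Time in the Netscape cookie date format.
def cookie_expires(time)
  time.getutc.strftime('%a, %d-%b-%Y %H:%M:%S GMT')
end

puts cookie_expires(Time.utc(2004, 9, 19, 22, 33, 25))
# prints: Sun, 19-Sep-2004 22:33:25 GMT
```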

Retrieving and Setting Cookies

Cookies are read in by WEBrick::HTTPRequest automatically and are available as an Array from HTTPRequest#cookies. Cookies may be appended to the HTTPResponse#cookies Array when creating a WEBrick::HTTPResponse.

Cookies will not be automatically copied from the HTTPRequest to the HTTPResponse. You must do this by hand.

5 Logging

WEBrick uses a logger to record its activity. This server-level logger is also made available to all servlets. Please use it to log servlet activity instead of spewing log after log directly to, say, $stderr. The logger has five different logging levels and a default level. Each level has its own priority, and messages logged at a level of lower priority than the default level are not recorded. The levels are (arranged from the highest to lowest priority):

  • fatal
  • error
  • warn
  • info
  • debug

You may log a message by calling the logger like so:

@logger.error("1+1 is 3? You must have been skimping on memory stick.")

You may also want to send the << message, which will log the message under the info level:

@logger << "This is an info-level message"

The default logger has a default level of ’info’ and outputs to $stderr but you can change it easily enough as shown in the following example.

 
     class HelloWorldServlet < HTTPServlet::AbstractServlet
       def do_GET(req, resp)
         @logger.debug("About to return 'Hello World'")
         resp.body = 'Hello World'
       end
     end

     # a logger that outputs to /dev/null
     # and has a default level of 'INFO'
     null_logger = Log.new('/dev/null')

     # a logger that outputs to $stderr
     # and has a default level of 'DEBUG'
     debug_stderr_logger = Log.new($stderr, Log::DEBUG)

     start_webrick(:Logger => debug_stderr_logger) {|server|
       server.mount('/helloworld', HelloWorldServlet)
     }
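The level filtering itself is simple enough to model in a few lines. The class below is a toy written for this illustration, not WEBrick's logger; SketchLogger, its numeric level values and its output format are all assumptions.

```ruby
# Toy logger for illustration; a lower number means a higher priority.
class SketchLogger
  LEVELS = { :fatal => 1, :error => 2, :warn => 3, :info => 4, :debug => 5 }

  def initialize(sink, level = :info)
    @sink  = sink    # anything responding to <<
    @level = level   # the default level
  end

  # Record the message unless its priority is below the default level.
  def log(level, message)
    return if LEVELS.fetch(level) > LEVELS.fetch(@level)
    @sink << "#{level.to_s.upcase} #{message}\n"
  end

  # << logs under the info level.
  def <<(message)
    log(:info, message)
  end
end

sink = []
logger = SketchLogger.new(sink, :info)
logger.log(:debug, 'dropped')
logger.log(:error, 'kept')
logger << 'kept too'
p sink  # => ["ERROR kept\n", "INFO kept too\n"]
```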

5.1 Access Log

The access log is special: you are likely to access it more frequently than the logs of other activities, and you may not want to do anything special to extract it from the general log. Thus WEBrick does not mix the access log with other logs.

Well, actually the default access log and the server-level log output to the same sink: $stderr. Let’s change that in the next example.

     server_logger = Log.new('/var/log/webrick/server.log')

     # The :AccessLog configuration takes an array.
     # Each element of the array should be
     # a two-element array where the first element
     # is the stream (or anything responding to <<) and
     # the second element is the access log format.
     # Please see webrick/accesslog.rb for available formats.
     access_log_stream = File.open('/var/log/webrick/access.log', 'w')
     access_log = [ [ access_log_stream, AccessLog::COMBINED_LOG_FORMAT ] ]

     start_webrick(
       :Logger => server_logger,
       :AccessLog => access_log
     )

6 Hooks

WEBrick has many hooks you can tap into. Following is a flow-chart (somewhat) of the order of hook invocation.

Server:

     :ServerType.start (before yield)
     :StartCallback
       :AcceptCallback
       :RequestHandler 
       #servlet invoked at this point
     :StopCallback
     :ServerType.start (after yield)
     
     FileHandler Servlet:
     :DirectoryCallback or :FileCallback
     :HandlerCallback
     #handler is invoked at this point

7 HTTP Authentication

RFC 2617 specifies two mechanisms for HTTP authentication: basic and digest. WEBrick supports both. WEBrick verifies authentication information against a user-specified, Apache-compatible user database.

Sometimes you may find setting up a user database file troublesome. With basic authentication you can pass a block of code to WEBrick that returns true if the authentication token is valid or false otherwise. This is a shortcut that saves you from having to create a user database file.

     realm = "Gnome's realm"
     start_webrick {|server|
       server.mount_proc('/convenient_basic_auth') {|req, resp|
         HTTPAuth.basic_auth(req, resp, realm) {|user, pass|
           # this block returns true if
           # authentication token is valid
           user == 'gnome' && pass == 'supersecretpassword'
         }
         resp.body =
           "You are authenticated to see the super secret data\n"
       }
     }

     dede:~$ w3m -dump http://localhost:8080/convenient_basic_auth
     Username for Gnome's realm: gnome
     Password: supersecretpassword
     You are authenticated to see the super secret data
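For the curious, the user and password reach the server in the Authorization request header, base64-encoded as user:password (this is what RFC 2617 specifies for the basic scheme). The sketch below decodes such a header in plain Ruby; basic_credentials is a hypothetical helper and not part of WEBrick.

```ruby
# Hypothetical helper: extract [user, password] from an Authorization
# header value, or nil if it is not a Basic credential.
def basic_credentials(authorization_header)
  scheme, token = authorization_header.to_s.split(' ', 2)
  return nil unless scheme && token && scheme.casecmp('Basic').zero?
  token.unpack1('m').split(':', 2)
end

# Build the header a client would send for gnome/supersecretpassword.
header = 'Basic ' + ['gnome:supersecretpassword'].pack('m0')

user, pass = basic_credentials(header)
puts user  # gnome
puts pass  # supersecretpassword
```

Since base64 is an encoding and not encryption, basic authentication protects nothing on the wire by itself.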

7.1 Basic Authentication

Basic authentication is done by HTTPAuth::BasicAuth. If using a user database file, the file must be similar to what htpasswd (from the Apache HTTP Server package) generates. The supplied HTTPAuth::Htpasswd parser can only understand passwords generated using the standard crypt() function. This means you have to invoke htpasswd with the -d argument. On all platforms except Windows and TPF, -d is the default argument.

     realm = "Gnome's realm"
     start_webrick {|server|
       htpasswd = HTTPAuth::Htpasswd.new('/tmp/gnome.htpasswd')
       authenticator = HTTPAuth::BasicAuth.new(
         :UserDB => htpasswd,
         :Realm => realm
       )
       server.mount_proc('/htpasswd_auth') {|req, resp|
         authenticator.authenticate(req, resp)
         resp.body =
           "You are authenticated to see the super secret data\n"
       }
     }

     #
     # -c create password file
     # -d use the default crypt() function
     # -b accept password specified on the command line
     #
     dede:~$ htpasswd -cdb /tmp/gnome.htpasswd gnome supersecretpassword
     Adding password for user gnome

     dede:~$ cat /tmp/gnome.htpasswd
     gnome:O2.19saB33Yk.

     dede:~$ w3m -dump http://localhost:8080/htpasswd_auth
     Username for Gnome's realm: gnome
     Password: notsosecretpassword
     Wrong username or password
     Username for Gnome's realm: gnome
     Password: supersecretpassword
     You are authenticated to see the super secret data

7.2 Digest Authentication

WEBrick requires a user database file for digest authentication. The file must be in a format similar to what htdigest produces. The parser for the file is HTTPAuth::Htdigest and the authenticator is HTTPAuth::DigestAuth.

 
     realm = "Gnome's realm"
     start_webrick {|server|
       htdigest = HTTPAuth::Htdigest.new('/tmp/gnome.htdigest')
       authenticator = HTTPAuth::DigestAuth.new(
         :UserDB => htdigest,
         :Realm => realm
       )
       server.mount_proc('/htdigest_auth') {|req, resp|
         authenticator.authenticate(req, resp)
         resp.body =
           "You are authenticated to see the super secret data\n"
       }
     }

     dede:~$ htdigest -c /tmp/gnome.htdigest "Gnome's realm" gnome
     Adding password for gnome in realm Gnome's realm.
     New password: supersecretpassword
     Re-type new password: supersecretpassword

     dede:~$ cat /tmp/gnome.htdigest
     gnome:Gnome's realm:97b64451958049b15eab578ecf5ea4b2

     dede:~$ w3m -dump http://localhost:8080/htdigest_auth
     Username for Gnome's realm: gnome
     Password: supersecretpassword
     You are authenticated to see the super secret data

8 Becoming a Proxy Server

9 Doing Virtual Host

Not ready yet. In the meantime, please see the following post about virtual hosting with WEBrick.

10 Tips & Tricks

Dier Koenig has a suggestion to speed up servlet development. While developing, it is very useful to define get_instance like this:

def MyServlet.get_instance(config, *options)
  load __FILE__
  MyServlet.new(config, *options)
end

This reloads the file for every request, thus allowing you to change the servlet code without restarting the server (just like one can do with CGIs). Of course it could be made much more elaborate, for example by reloading only when the file has actually changed.
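
One possible way to reload only when the file has changed is to remember the file's modification time and compare it before each load. The class below is a minimal sketch of that idea. ChangeAwareLoader and reload_if_changed are hypothetical names, not something WEBrick provides.

```ruby
# Hypothetical helper: loads a Ruby source file again only when its
# modification time differs from the one seen at the previous load.
class ChangeAwareLoader
  def initialize(path)
    @path = File.expand_path(path)
    @loaded_mtime = nil
  end

  # Returns true if the file was (re)loaded, false if it was unchanged.
  def reload_if_changed
    mtime = File.mtime(@path)
    return false if mtime == @loaded_mtime

    load @path
    @loaded_mtime = mtime
    true
  end
end
```

Inside get_instance you would then call reload_if_changed in place of the unconditional load. The loader object has to be kept somewhere that survives the reload, such as a global variable.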

11 Configuration Reference

11.1 Server Configuration

:ServerName

Default: Utils::getservername, which usually returns whatever value is in /etc/hostname.

:BindAddress

Default: nil. "0.0.0.0" and "::" have the same effect as nil, which is to listen on all available network interfaces. If you want WEBrick to listen on a particular network interface, give this the address of that interface.

:Port

Default: 80 (for HTTPServer). The listening port number. It can also take a string (typically a service name), which will be resolved through /etc/services (or another OS-dependent mechanism) to a port number.

:MaxClients

Default: 100. Maximum number of concurrent connections. WEBrick uses a new thread for each new connection, so data in thread-local storage is lost when the connection is closed.

:ServerType

Default: SimpleServer. SimpleServer simply starts the server. This is provided mainly so that you can override how WEBrick starts the server, e.g. to provide starting and stopping hooks. Please see the Hooks section.

:Logger

Default: Log.new, a simple logging library implemented in webrick/log.rb. You may use another logging library such as log4r.

:ServerSoftware

Default:

 "WEBrick/#{WEBrick::VERSION} " +
 "(Ruby/#{RUBY_VERSION}/#{RUBY_RELEASE_DATE})"

:TempDir

Default:

 ENV['TMPDIR']||ENV['TMP']||ENV['TEMP']||'/tmp'

Among the standard handlers, only HTTPServlet::CGIHandler uses this, to capture the invoked CGI's stdout and stderr streams.

:DoNotListen

Default: false, which causes WEBrick to listen on :BindAddress at port :Port.

:StartCallback

Default: nil. An alternative way to hook into the startup process. If not nil, the value must respond to the call message. Please see the Hooks section.

:StopCallback

Default: nil. Similar to :StartCallback except that it is called during the shutdown process.

:AcceptCallback

Default: nil. Similar to the other callbacks, but called when a new connection has been accepted. The socket of the accepted connection is passed as the argument.

:RequestTimeout

Default: 30 (seconds). Specifies how long to wait for each read operation on the socket. Some reads are line-based, for example while reading the request line, the headers and a chunked body, while some are stream-based.

:HTTPVersion

Default: HTTPVersion.new("1.1"). If WEBrick receives a non-HTTP 1.1 request, it will respond appropriately, using whatever HTTP protocol version the request specifies.

:AccessLog

Default:

[ [$stderr, AccessLog::COMMON_LOG_FORMAT],
  [$stderr, AccessLog::REFERER_LOG_FORMAT] ]

Please see the Logging section for further description.

:MimeTypes

Default: HTTPUtils::DefaultMimeTypes. Please see the Overriding Default MIME Type section.

:DirectoryIndex

Default:

["index.html", "index.htm", "index.cgi", "index.rhtml"]

A FileHandler looks for these files when it receives a request to display a directory. If it finds any of them, that file is displayed instead of a listing of the directory.

:DocumentRoot

Default: nil.

If it is not nil, WEBrick will set up a FileHandler for request-URI '/' that serves the specified filesystem path. Please see the FileHandler section.

:DocumentRootOptions

Default:

{ :FancyIndexing => true }

Please see the FileHandler Configuration reference for other options.

:RequestHandler

Default: nil. If not nil, it will be invoked like so: handler.call(request, response) before WEBrick services the request. Please see the Hooks section.

:ProxyAuthProc

Default: nil.

:ProxyContentHandler

Default: nil.

:ProxyVia

Default: true.

:ProxyTimeout

Default: true.

:ProxyURI

Default: nil.

:CGIInterpreter

Default: nil.

:CGIPathEnv

Default: nil.

:Escape8bitURI

Default: false. If true, any 8-bit characters in the request-URI are escaped before it is parsed. (This option needs a more detailed explanation.)
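
The :MaxClients entry above notes that each connection is served by its own thread. The plain-Ruby sketch below shows the consequence for thread-local storage. No WEBrick is involved, the threads merely stand in for connections, and the :visits key is just an illustration.

```ruby
# Each "connection" is simulated by a fresh thread, the way the
# :MaxClients description says WEBrick handles connections.
seen = []
2.times do |i|
  Thread.new do
    seen << Thread.current[:visits]   # nothing stored by earlier threads
    Thread.current[:visits] = i + 1   # vanishes when this thread ends
  end.join
end

p seen   # prints [nil, nil]
```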

11.2 FileHandler Configuration

:NondisclosureName

Default: ".ht*". In a directory listing, any file that matches the value (as per shell globbing, not regular expressions) is not displayed. If the request-URI refers to a file that matches the value, FileHandler will return a 403 (Forbidden) status.

:FancyIndexing

Default: false. If this is true and the request-URI refers to a directory and not a file, then the FileHandler servlet will list the contents of that directory. Otherwise it will return a 403 (Forbidden) status.

:HandlerTable

Default: {}. This is a mapping of filename suffix => handler. If this is left empty, all requests for files are passed on to an instance of HTTPServlet::DefaultFileHandler. This handler understands HTTP's range directive (partial file transfer).

:HandlerCallback

Default: nil. A callback which is invoked before the handler for the request.

:DirectoryCallback

Default: nil. A callback which is invoked before the handler for the request (and before HandlerCallback) if the request-URI refers to a directory.

:FileCallback

Default: nil. Similar to DirectoryCallback, except that it is invoked if the request-URI refers to a file.

:UserDir

Default: "public_html". If the FileHandler servlet is mounted on '/' and the request-URI starts with '/~username', then it is mapped to "#{username's home directory}/#{:UserDir value}".
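
The :NondisclosureName entry above stresses that its value is a shell glob and not a regular expression. Ruby's File.fnmatch gives a feel for the difference. It is used here as a stand-in, and whether FileHandler calls it with exactly these arguments is an assumption.

```ruby
pattern = '.ht*'

# Shell-style globbing: the whole name must match the pattern.
%w[.htaccess .htpasswd index.html xhtml].each do |name|
  puts "#{name}: #{File.fnmatch(pattern, name)}"
end

# The same text read as a regular expression matches far more,
# because "." means "any character" and the match is unanchored.
puts 'xhtml' =~ /.ht*/ ? 'regexp matches xhtml' : 'regexp does not match xhtml'
```

With the glob, only the two names that start with ".ht" match. The regular expression also matches "xhtml".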

11.3 BasicAuth Configuration

:UserDB

An instance of HTTPAuth::Htpasswd initialised with the filename of the htpasswd file.

:Realm

You have to supply this, but it is not used to verify the password; it only appears in the challenge sent to the client.

11.4 DigestAuth Configuration

:UserDB

An instance of HTTPAuth::Htdigest initialised with the filename of the htdigest file.

:Realm

You have to supply this and it is used.

12 Class & Module Reference

12.1 HTTPRequest

Following is a list of methods of a HTTPRequest object. The list also contains example values corresponding to this HTTP request:

   GET /foo/bar?key1=value1&KEY2=value2 HTTP/1.1
   Host: localhost:8080
   Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9
   Accept: text/plain;q=0.8,image/png,image/jpeg,image/gif;q=0.2,*/*;q=0.1
   Accept-Encoding: gzip,deflate
   Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
   Keep-Alive: 300
   Connection: keep-alive

Request line

request_line

"GET /foo/bar?key1=value1&KEY2=value2 HTTP/1.1\r\n"

request_method

"GET"

unparsed_uri

"/foo/bar?key1=value1&KEY2=value2"

http_version

HTTPVersion.new("1.1"). If the request line is missing the HTTP part, it is considered to be HTTP 0.9.

Request-URI

request_uri

::URI::parse("http://localhost:8080/foo/bar?key1=value1&KEY2=value2")

host

"localhost"

port

8080

path

"/foo/bar"

query_string

"key1=value1&KEY2=value2"

script_name

"/foo"

path_info

"/bar"

Header and Entity Body

raw_header

[ "Host: localhost:8080\r\n",
  "Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9\r\n",
  "Accept: text/plain;q=0.8,image/png,image/jpeg,image/gif;q=0.2,*/*;q=0.1\r\n",
  "Accept-Encoding: gzip,deflate\r\n",
  "Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7\r\n",
  "Keep-Alive: 300\r\n",
  "Connection: keep-alive\r\n" ]

header

A hash of key => [value] of the header. The header name is downcased. If a header name appears more than once, its values are appended to the array of values.

[]

The result of request['ACCEPT'] is request.header['ACCEPT'.downcase].join(", ").

each

Invokes the passed block for every header key-value pair, e.g. req.each {|key, value| puts "#{key} => #{value}" }

keep_alive

true. Set to true if the 'connection' header is not set to 'close' and the request is HTTP 1.1.

keep_alive?

true. Alias of keep_alive.

cookies

An array containing instances of Cookie, each representing a cookie that the client sent.
These cookies are not automatically copied to the HTTPResponse object.

query

{"key1"=>"value1", "KEY2"=>"value2"}

This is a table of key => value. Each value is of type FormData, which is just a subclass of String.
The distinction matters only if you have duplicate keys in the query_string.

body

A String containing the body of the request. It is nil unless the request is POST or PUT.

Miscellaneous

user

nil. This is set if the client is using HTTP authentication.

addr

["AF_INET", 8080, "dede", "127.0.0.1"]

The local address of the socket on which this request was received.

peeraddr

["AF_INET", 37934, "dede", "127.0.0.1"]

The address of the client.

attributes

{}. I am not sure what this is for.

request_time

A Time object set to when the request was made.

meta_vars

A hash containing the CGI meta-variables. The CGI specification has a list of these meta-variables.
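
The relationship between the raw header lines and the header hash described above can be illustrated without a running server. The sketch below is a simplified model, not WEBrick's parser. The name parse_raw_header is hypothetical, and details such as continuation lines are ignored.

```ruby
# Simplified model: turns raw header lines into a hash of
# downcased name => array of values, as described for the header method.
def parse_raw_header(lines)
  header = Hash.new { |hash, key| hash[key] = [] }
  lines.each do |line|
    name, value = line.chomp.split(':', 2)
    next unless value

    header[name.downcase] << value.strip
  end
  header
end

raw = [
  "Host: localhost:8080\r\n",
  "Accept: text/html;q=0.9\r\n",
  "Accept: text/plain;q=0.8\r\n"
]
header = parse_raw_header(raw)
p header['host']     # prints ["localhost:8080"]
p header['accept']   # prints ["text/html;q=0.9", "text/plain;q=0.8"]
```

The two Accept lines end up as two entries in one array, which is why a header value is always an array.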

12.2 HTTPUtils::FormData

The FormData object is a subclass of String. It is used to represent query values. In a query the same key may be assigned multiple values. Each value is assigned to an instance of FormData. This instance stores a reference to the next instance of FormData that stores the next value and so on.

each_data

Pass a block to it and for each value it will call the block.

list

Puts the values into an array.
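
The chaining described above can be pictured with a small stand-alone class. ChainedValue and its append method are hypothetical names used for illustration. This is a sketch of the idea, not the actual FormData source.

```ruby
# Illustrative model of a String that also links to further values
# submitted under the same key.
class ChainedValue < String
  attr_reader :next_value

  # Adds another value to the end of the chain.
  def append(value)
    if @next_value
      @next_value.append(value)
    else
      @next_value = ChainedValue.new(value)
    end
    self
  end

  # Calls the block once for each value in the chain, in order.
  def each_data
    node = self
    while node
      yield node
      node = node.next_value
    end
  end

  # Collects all values in the chain into an array of plain strings.
  def list
    values = []
    each_data { |v| values << v.to_s }
    values
  end
end

value = ChainedValue.new("red")
value.append("green").append("blue")
puts value      # prints "red"
p value.list    # prints ["red", "green", "blue"]
```

Used as an ordinary string, the object behaves like its first value. The remaining values only show up when you ask for them.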

12.3 HTTPResponse Object

Many of the methods in HTTPResponse are called by WEBrick after your servlet has serviced the request. Instead of listing all public methods as in the HTTPRequest listing above, the following lists only the methods that are meaningful in a servlet context:

status=

You can set the response status using this, e.g. resp.status = 202

[]=

You can set a custom header using this, e.g. resp['content-type'] = 'text/html'

body=

You can set the body of the response using this. It can also be an IO object, in which case the content is transmitted in blocks.

set_redirect

Sends a redirect response to the given URI, e.g.:

resp.set_redirect(HTTPStatus::MovedPermanently, 'http://www.google.com')

cookies

An array containing instances of Cookie that are going to be sent back to the client. Initially the array is empty as the cookies received from the clients are not automatically copied here.

12.4 Cookie

name

The name of the cookie. The name may only be read, not set.

value

The value of the cookie. The value should be in a printable ASCII encoding.

version

Identifies which cookie specification this cookie conforms to: 0, the default, for Netscape cookies and 1 for RFC 2109 cookies.

domain

The domain for which the cookie is valid. An explicitly specified domain must always start with a dot.

expires

A Time or String representing when the cookie should expire. It must be in the following format:

    Wdy, DD-Mon-YYYY HH:MM:SS GMT

max_age

The lifetime of the cookie in seconds from the time the cookie is sent. A zero value means the cookie should be discarded immediately.

comment

Allows an origin server to document its intended use of a cookie. The user can inspect the information to decide whether to initiate or continue a session with this cookie.

path

The subset of URLs to which the cookie applies.

secure

When set to true the cookie should only be sent back over a secure connection.
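
The date format given for expires above can be produced from a Time object with strftime. This is a small sketch in plain Ruby. The format string assumes the comma after the weekday that the Netscape cookie specification uses, and the date is arbitrary.

```ruby
# Cookie expiry dates are expressed in GMT, so build the time in UTC.
expires = Time.utc(2005, 8, 20, 18, 13, 0)
puts expires.strftime('%a, %d-%b-%Y %H:%M:%S GMT')
# prints "Sat, 20-Aug-2005 18:13:00 GMT"
```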

12.5 HTTPStatus Module

Parent Class   Response Code   Class Name
Info           100             Continue
Info           101             SwitchingProtocols
Success        200             OK
Success        201             Created
Success        202             Accepted
Success        203             NonAuthoritativeInformation
Success        204             NoContent
Success        205             ResetContent
Success        206             PartialContent
Redirect       300             MultipleChoices
Redirect       301             MovedPermanently
Redirect       302             Found
Redirect       303             SeeOther
Redirect       304             NotModified
Redirect       305             UseProxy
Redirect       307             TemporaryRedirect
ClientError    400             BadRequest
ClientError    401             Unauthorized
ClientError    402             PaymentRequired
ClientError    403             Forbidden
ClientError    404             NotFound
ClientError    405             MethodNotAllowed
ClientError    406             NotAcceptable
ClientError    407             ProxyAuthenticationRequired
ClientError    408             RequestTimeout
ClientError    409             Conflict
ClientError    410             Gone
ClientError    411             LengthRequired
ClientError    412             PreconditionFailed
ClientError    413             RequestEntityTooLarge
ClientError    414             RequestURITooLarge
ClientError    415             UnsupportedMediaType
ClientError    416             RequestRangeNotSatisfiable
ClientError    417             ExpectationFailed
ServerError    500             InternalServerError
ServerError    501             NotImplemented
ServerError    502             BadGateway
ServerError    503             ServiceUnavailable
ServerError    504             GatewayTimeout
ServerError    505             HTTPVersionNotSupported

13 Glossary

Callback

An object that responds to the call message. Usually this is an instance of Proc or Method.

Path-Info

The trailing path after the handler's path. If a handler is mounted at '/foo' and the request-URI is '/foo/bar/is/boring', then path-info would be '/bar/is/boring'.

Request-URI

The path specified in an HTTP URI. For example, the request-URI of 'http://hoohoo.ncsa.uiuc.edu/cgi/env.html' is '/cgi/env.html'.
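
The Path-Info definition above amounts to cutting the mount point off the front of the request path. The sketch below shows that split in plain Ruby. split_path is a hypothetical helper, not a WEBrick method, and it ignores the special case of a servlet mounted at '/'.

```ruby
# Hypothetical helper: splits a request path into the part that matched
# the mount point (script name) and the remainder (path-info).
# Returns nil if the path does not fall under the mount point.
def split_path(mount_point, path)
  return nil unless path == mount_point || path.start_with?("#{mount_point}/")

  [mount_point, path[mount_point.length..-1]]
end

script_name, path_info = split_path('/foo', '/foo/bar/is/boring')
puts script_name   # prints "/foo"
puts path_info     # prints "/bar/is/boring"
```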

14 Author’s Note

The first WEBrick-based application I built was a port of a Java REST-ful server. I attended a seattle.rb meeting where Eric Hodel was demonstrating WEBrick. At that time I was a bit overwhelmed maintaining a Java-Servlet-based REST-ful server because of its extensive class hierarchy (there were 560-ish classes). Many of those classes were there to get around Java's restrictiveness, for example to create first-class function objects (Proc or block in Ruby).

The performance I am getting is also acceptable, averaging 50 requests per second on a 600 MHz P-III machine with 256 MB of memory, a bit faster than Tomcat's 40 requests per second (I suspect because of the lighter memory requirement, which translates to less frequent swapping on that machine). The memory usage is also acceptable, hovering around 27 MB for about 100 concurrent clients, compared to 127 MB in Tomcat. Yes, I probably should not have used Tomcat as the comparison, as it is well known to be a behemoth, but it is the official Java Servlet container and also the most widely used.

Obviously these statistics are very activity-dependent. A simple hello world server would be chastised for having these statistics. In any case, I hope you will enjoy using WEBrick. I certainly do.

Acknowledgements:

Gnome’s Guide to WEBrick

Yohanes Santoso

Eric Hodel

Carl McDade

Version 0.6.1 2004.10.19

License

Permission is granted to copy, distribute and/or modify this
document under the terms of the GNU Free Documentation License,
Version 1.1 or any later version published by the Free Software
Foundation; with no Invariant Sections, no Front-Cover Texts and
no Back-Cover Texts. A copy of the license is included in the
section entitled "GNU Free Documentation License".

 
