by Steven J. Owens (unless otherwise attributed)
The Servlet Specification is pretty readable. I highly recommend getting it and reading it.
http://jcp.org/aboutJava/communityprocess/final/jsr154/index.html
A servlet engine gives you three things.
First, it gives you a set of basic tools for dealing with HTTP - parsing HTTP requests into objects, and generating HTTP responses.
Second, it gives you the concept of a servlet itself - you map a URL to a servlet class, the servlet engine instantiates that class the first time a request comes in to that URL, and it keeps that class instance around thereafter, to deal with subsequent requests.
Third, the servlet engine gives you a handy framework to handle a lot of the stuff that you have to have for a web application, like server-side sessions, user authentication, filters, the request dispatcher, etc.
If serious programmers often run into the problem of reinventing HTTP, inexperienced programmers often make the mistake of trying to cram too much into a servlet. It's important to keep in mind that the servlet is just an entry point into the JVM, and you can have all sorts of things happen thereafter.
Another mistake that serious programmers trip over is that the servlet itself is only instantiated once; effectively it's a singleton and it has to be thread-safe. You have to build multi-thread-safe ways for the servlet to interact with your application as necessary. If servlets were designed "properly" in an OO sense, you'd be issued a new instance of a servlet class for each incoming request. But that's not the way they're designed.
(Caveat: each servlet is instantiated once per JVM; in clustered servlet engine environments, you have multiple JVMs, so you have multiple instances. This is generally true of singletons as well; if you really need a singleton in a full-blown multi-JVM java application server context, then you really need to use some of the larger J2EE concepts to implement it.)
(Another Caveat: in theory the servlet spec allows for servlet instances to be unloaded and reloaded, but in practice I've never seen that actually used. There are servlet methods you override to handle the saving and reloading of any servlet state in such a case, but pretty much every best practice steers you away from having much, if any, state in the servlet anyway. State should go in the servlet engine's session manager or in the database.)
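To make the thread-safety point concrete, here is a minimal sketch in plain java (no servlet API needed). HitCounterHandler and SharedInstanceDemo are hypothetical names invented for this example; the handler plays the role of the single servlet instance that the engine calls from many request threads at once.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a servlet: ONE instance is created and its
// handle() method is called from many request threads at the same time.
class HitCounterHandler {
    // Shared instance state has to be thread-safe. A plain int field with
    // hits++ could lose updates when two requests overlap.
    private final AtomicInteger hits = new AtomicInteger();

    String handle(String user) {
        // Per-request data lives in parameters and locals, which are
        // private to the calling thread.
        int n = hits.incrementAndGet();
        return "hello " + user + ", request #" + n;
    }

    int hits() {
        return hits.get();
    }
}

class SharedInstanceDemo {
    // Simulates `threads` concurrent clients, each sending `perThread`
    // requests to the same handler instance; returns the final hit count.
    static int run(int threads, int perThread) {
        HitCounterHandler handler = new HitCounterHandler();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            final String user = "user" + i;
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    handler.handle(user);
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return handler.hits();
    }

    public static void main(String[] args) {
        // With the AtomicInteger, every one of the 8 * 10000 requests is counted.
        System.out.println(run(8, 10_000));
    }
}
```

The same rule of thumb carries over to a real servlet: keep per-request data in local variables, and treat any instance field as shared between all requests.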
Here's a basic step by step of what happens when a servlet is used:
1) The servlet engine listens for incoming network connections.
2) When a connection comes in, the servlet engine parses the data sent over the connection - the HTTP request, which may include parameters - as an HttpServletRequest object.
3) Once the servlet engine has parsed the data, the engine looks at the URL path that the HTTP request is for.
4) If that path is mapped to a servlet in the web application's configuration file (web.xml), the servlet engine checks to see if the servlet class has been instantiated yet.
5) If the servlet class for that servlet has not been instantiated, the servlet engine instantiates it. This is a class that's descended from HttpServlet.
6) After the servlet engine instantiates the servlet class, the engine invokes the servlet class init() method, so the servlet can do any setup work necessary. If it ever unloads the class (when shutting down, for example) the servlet engine will call the servlet's destroy() method, to give it a chance to clean house, close database connections, etc.
7) After the servlet init() method returns, the servlet engine invokes the servlet doGet() method (or doPost() and so on, depending on the HTTP method of the request), with the HTTP request object as a parameter.
The servlet can do anything a normal java program can do, except that, because it is running on a separate machine, it cannot display a GUI to the user. Any user interaction has to take place within the HTTP request and response.
The other major limitation of a servlet - and of the HTTP protocol - is that it has to rely on the client - the browser, or a special HTTP client you write - to send each request, to initiate each exchange. It has no way to proactively push data to the client (though there are some tricks you can do at the browser level to make sure the client keeps requesting updates, without needing the user to keep clicking reload).
For example, there is no way for the servlet engine to know if the user shuts down the browser, unless the user first clicks on a "log out" link or button. The browser does not usually keep a connection open to the server. There is no way, without going outside the HTTP protocol, for the servlet to open a connection to the browser without the browser first opening a connection to the servlet.
8) When the servlet engine invokes the servlet doGet() method, it also passes an HttpServletResponse object as a parameter, in addition to the request object. The servlet uses the response object to format the response and send it back.
Specifically, the response object has a PrintWriter in it (you get it from response.getWriter()), that the servlet uses to write to the output stream. Note that, because HTTP responses are formatted like MIME messages, the headers go first. Once you start writing body content to the output stream, any headers you've set on the response object are written out, then a blank line separating the headers from the body, then the body. You can fiddle with this by using response.setBufferSize() to tell the response to buffer the output. I don't like this; I prefer to be more deliberate and conscious about buffering, constructing the output I'm going to send back and then writing it all to the PrintWriter at once.
But sooner or later, we get to the next step:
9) The response object writes the response - headers and body - back across the still-open network connection.
10) The servlet engine closes the network connection.
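The steps above (3 through 9, anyway) can be sketched as a toy engine in plain java. This is not the servlet API: MiniServlet, MiniResponse, HelloServlet and MiniEngine are all hypothetical stand-ins made up for this example, networking and request parsing are left out, and it is single-threaded for brevity.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical, simplified stand-in for HttpServlet.
abstract class MiniServlet {
    int initCalls = 0;

    void init() {
        initCalls++;
    }

    abstract void doGet(Map<String, String> params, MiniResponse response);
}

// Hypothetical, simplified stand-in for HttpServletResponse.
class MiniResponse {
    private final Map<String, String> headers = new LinkedHashMap<>();
    private final StringBuilder body = new StringBuilder();

    void setHeader(String name, String value) {
        headers.put(name, value);
    }

    void write(String text) {
        body.append(text);
    }

    // Headers go first, then a blank line, then the body.
    String serialize() {
        StringBuilder out = new StringBuilder("HTTP/1.1 200 OK\r\n");
        for (Map.Entry<String, String> h : headers.entrySet()) {
            out.append(h.getKey()).append(": ").append(h.getValue()).append("\r\n");
        }
        return out.append("\r\n").append(body).toString();
    }
}

class HelloServlet extends MiniServlet {
    @Override
    void doGet(Map<String, String> params, MiniResponse response) {
        response.setHeader("Content-Type", "text/plain");
        response.write("Hello, " + params.getOrDefault("name", "world"));
    }
}

class MiniEngine {
    // URL path -> how to create the servlet (the role web.xml plays)
    private final Map<String, Supplier<MiniServlet>> mappings = new LinkedHashMap<>();
    // URL path -> the single instance, created on the first request
    private final Map<String, MiniServlet> instances = new LinkedHashMap<>();

    void map(String path, Supplier<MiniServlet> factory) {
        mappings.put(path, factory);
    }

    // Handles one already-parsed request and returns the raw response text.
    String handle(String path, Map<String, String> params) {
        Supplier<MiniServlet> factory = mappings.get(path);   // steps 3-4
        if (factory == null) {
            return "HTTP/1.1 404 Not Found\r\n\r\n";
        }
        MiniServlet servlet = instances.get(path);
        if (servlet == null) {                                 // steps 5-6
            servlet = factory.get();
            servlet.init();
            instances.put(path, servlet);
        }
        MiniResponse response = new MiniResponse();
        servlet.doGet(params, response);                       // steps 7-8
        return response.serialize();                           // step 9
    }

    MiniServlet instanceFor(String path) {
        return instances.get(path);
    }
}
```

Calling engine.map("/hello", HelloServlet::new) and then engine.handle("/hello", params) twice creates and initializes the servlet on the first call only; the second call reuses the same instance.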
The elements of the servlet framework are:
As of servlet spec 2.3, everything in a suite of servlets, JSPs and affiliated data and java files is gathered together in a "web application", a formalized hierarchy of directories, config files, classes, web-visible data files, etc.
One of the most important files is the web.xml file, which contains configuration entries for most of the important parts of your web application, including URL mappings - this URL gets this servlet, and so forth. In fact, one of the more annoying bits of the web application spec is that you're not only able, but are required to specify URL mappings for your servlets and everything else. This gives you a heck of a lot of power, of course, which was the idea. It also adds a considerable amount of config file tweaking before you can get anything done, which I find annoying. JSPs are somewhat exempt from this - they get automatically mapped.
Each user has a session, which you can get at via request.getSession(), which gives you an instance of HttpSession. The session is basically a Hashtable. You can setAttribute("label", object) and getAttribute("label").
When the browser first loads a page from the servlet engine, the engine sets a JSESSIONID cookie, which contains a unique value to identify that user's session data thereafter. All of this is hidden from your servlet, you don't have to worry about it, just call getSession().
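Here is a rough sketch of the bookkeeping the engine does for you behind getSession(). It is plain java, not the servlet API; MiniSessionManager and its method names are hypothetical, and real engines add timeouts, cookie handling and a lot more.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the servlet engine's session manager.
class MiniSessionManager {
    // session id (the JSESSIONID cookie value) -> that user's attributes
    private final Map<String, Map<String, Object>> sessions = new ConcurrentHashMap<>();

    // Returns the session id to use for this client: the cookie value if it
    // names a live session, otherwise a freshly generated id, which the
    // engine would send back to the browser in a cookie.
    String resolveId(String cookieValue) {
        if (cookieValue != null && sessions.containsKey(cookieValue)) {
            return cookieValue;
        }
        String id = UUID.randomUUID().toString();
        sessions.put(id, new ConcurrentHashMap<>());
        return id;
    }

    // The session itself is basically a map of labels to objects.
    Map<String, Object> getSession(String id) {
        return sessions.get(id);
    }
}
```

A browser that presents a known id gets the same attribute map back on every request; a missing or unknown id gets a new, empty session.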
You can also get and set attributes on the request itself. This is pretty pointless by itself, since the request object only hangs around for the duration of this request and response, but it comes in handy when you forward a request off to another servlet for further handling. (A forward is like a redirect, but it all happens behind the scenes at the servlet engine, invisible to the browser.)
You can also set and get attributes on the ServletContext, which are available application-wide, not just for this user.
Take a file full of HTML tags. Rename it, from foo.html to foo.jsp. Stick it in the public area of a webapp. Congratulations, you've just written your first JSP.
JSP is basically a way to generate brute-force, presentation-oriented servlets fairly quickly and easily.
JSP is the opposite of a templating scheme. In a templating scheme, you have a template data file that contains the static, boilerplate tags, with a few special placeholders. Some sort of template processor reads in the template file, gets the data, interpolates the data where the placeholders are, and now you have a finished HTML page.
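For comparison, a templating scheme of the kind just described can be sketched in a few lines of plain java. MiniTemplate is a hypothetical name, and the ${name} placeholder syntax is just an assumption picked for this example, not any particular template engine's format.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical template processor: static boilerplate plus placeholders.
class MiniTemplate {
    // Placeholders look like ${name}; this syntax is an assumption.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{(\\w+)\\}");

    // Reads the template text and interpolates the data where the
    // placeholders are. Unknown placeholders become empty strings.
    static String render(String template, Map<String, String> data) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = data.getOrDefault(m.group(1), "");
            // quoteReplacement keeps '$' and '\' in the data literal
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

Note that the template here is pure data: the processor is the program and the template is its input.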
JSP is not my favorite technology, but it is the Sun-endorsed standard, and it is easier than lots of print statements in hand-coded servlets. The main things I don't like about JSP are that:
I have come to the conclusion that using JSP is tolerable, but only if you keep your use of it extremely shallow. See MVC and Model 2, below.
At its simplest, a javabean is a brainless class with simply a bunch of instance variables and getters/setters. More complex javabeans have more logic, but in essence they still present the appearance of simplicity to the programs that interact with them.
In a technical sense, a javabean is simply a java class that follows a set of conventions for how methods are named. The resulting class conforms to a standard, and there are various tools and APIs out there that know how to manipulate and use classes that conform to that standard.
For the full and actual details, go read the javabean spec:
http://java.sun.com/products/javabeans/docs/spec.html
The standard itself is pretty simple - essentially there needs to be a zero-arguments constructor and the instance variable getter/setter methods all need to follow the format:
    SomeType someVariable;
    public SomeType getSomeVariable();
    public void setSomeVariable(SomeType)
Note that the actual instance variable name "someVariable" starts with a lowercase letter, but in the method names that first letter gets capitalized. This inconsistency occasionally trips people up.
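A minimal sketch of the convention in action: PersonBean follows the naming rules above, and BeanReader (a hypothetical helper written for this example, not part of any javabean tool) goes from a property name to the getter using nothing but the convention and plain reflection.

```java
import java.lang.reflect.Method;

// A minimal javabean: zero-argument constructor plus getter/setter pairs.
class PersonBean {
    private String firstName;
    private int age;

    public PersonBean() { }

    public String getFirstName() { return firstName; }
    public void setFirstName(String firstName) { this.firstName = firstName; }

    public int getAge() { return age; }
    public void setAge(int age) { this.age = age; }
}

// Hypothetical helper: finds and calls a getter by property name.
class BeanReader {
    // "firstName" -> "getFirstName": the first letter gets capitalized.
    static String getterName(String property) {
        return "get" + Character.toUpperCase(property.charAt(0)) + property.substring(1);
    }

    static Object read(Object bean, String property) {
        try {
            Method getter = bean.getClass().getMethod(getterName(property));
            return getter.invoke(bean);
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("no readable property: " + property, e);
        }
    }
}
```

Tools that work with javabeans do essentially this kind of lookup, which is why the capitalization has to be exactly right.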
The two probably most commonly used tools/APIs that use javabeans are Swing and JSP/servlets. The servlet API per se doesn't actually do much with javabeans, but JSP has a variety of tags to make life easier when working with javabeans, and the new JSP expression language also has stuff for working with javabeans properties (which is what an instance variable is called in javabean land).
You hear a lot of stuff about MVC (Model-View-Controller) and Model 2 (which is a nickname for the MVC approach to servlets/JSP development).
While these are good ideas, sometimes the hype gets a bit higher than my hip boots. Basically they boil down to good design, which a competent developer didn't really need explained to him or her. However, what might be useful is a brief explanation of the customary approach to implementing MVC in the servlet world.
The MVC approach is to use javabeans for the model, JSPs for the view, and a servlet for the controller. Specifically:
Any given rendered web page has an HTML form that submits to the One True Servlet (the controller).
The form includes a parameter that the OneTrueServlet uses to decide what to do, what business logic to run, what data to get. This produces some output data, which the OneTrueServlet sticks in a javabean -- usually whatever logic returns the results in the form of a javabean.
The OneTrueServlet then attaches that javabean to the request as a request attribute, and forwards the request (a server-side forward using the RequestDispatcher, not a browser redirect) off to a JSP.
The JSP is a very shallow, presentation-oriented JSP. It essentially consists of HTML tags, with the occasional JSP tag to pull data out of the javabean. Any complex logic should either be off behind the OneTrueServlet, or in a custom JSP tag (see taglib).
This JSP, in turn, has a form that submits to the OneTrueServlet.
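The round trip above can be sketched in plain java, without the servlet API. Everything here is a hypothetical stand-in: OneTrueController plays the One True Servlet, GreetingBean is the model, GreetingView plays the shallow JSP, and the "action" parameter name and "greeting" attribute label are assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Model: a javabean carrying the result of the business logic.
class GreetingBean {
    private String message;

    public GreetingBean() { }

    public String getMessage() { return message; }
    public void setMessage(String message) { this.message = message; }
}

// View: stand-in for the shallow JSP. It only pulls data out of the bean
// and wraps it in HTML, including the form that submits back to the controller.
class GreetingView {
    static String render(Map<String, Object> requestAttributes) {
        GreetingBean bean = (GreetingBean) requestAttributes.get("greeting");
        return "<form action=\"/controller\"><p>" + bean.getMessage() + "</p>"
             + "<input type=\"hidden\" name=\"action\" value=\"greet\"></form>";
    }
}

// Controller: stand-in for the One True Servlet.
class OneTrueController {
    // value of the "action" form parameter -> business logic to run
    private final Map<String, Function<Map<String, String>, GreetingBean>> actions = new HashMap<>();

    OneTrueController() {
        actions.put("greet", params -> {
            GreetingBean bean = new GreetingBean();
            bean.setMessage("Hello, " + params.getOrDefault("name", "stranger"));
            return bean;
        });
    }

    String handle(Map<String, String> params) {
        // Decide what to do based on a request parameter.
        Function<Map<String, String>, GreetingBean> logic = actions.get(params.get("action"));
        if (logic == null) {
            return "<p>Unknown action</p>";
        }
        // Attach the resulting bean as a request attribute...
        Map<String, Object> requestAttributes = new HashMap<>();
        requestAttributes.put("greeting", logic.apply(params));
        // ...and hand the request off to the view (the "forward").
        return GreetingView.render(requestAttributes);
    }
}
```

In a real webapp the hand-off to the view would go through the RequestDispatcher and the view would be a JSP; the shape of the flow is the same.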
Continued in Part 4