A 30-Second Intro to Server-Side Web Programming
While Web programming today typically involves quite a number of software frameworks and different protocols, the core bits of Web programming haven’t changed much since they were invented in the early 1990s. For simple applications, such as the one you’ll write in Chapter 29, you need to understand only a few key concepts, so I’ll review them quickly here. Experienced Web programmers can skim or skip the rest of this section.1
To start, you need to understand the roles the Web browser and the Web server play in Web programming. While a modern browser comes with a lot of bells and whistles, the core functionality of a Web browser is to request Web pages from a Web server and then render them. Typically those pages will be written in the Hypertext Markup Language (HTML), which tells the browser how to render the page, including where to insert inline images and links to other Web pages. HTML consists of text marked up with tags that give the text a structure that the browser uses when rendering the page. For instance, a simple HTML document looks like this:
<html>
<head>
<title>Hello</title>
</head>
<body>
<p>Hello, world!</p>
<p>This is a picture: <img src="some-image.gif"></p>
<p>This is a <a href="another-page.html">link</a> to another page.</p>
</body>
</html>
Figure 26-1 shows how the browser renders this page.
Figure 26-1. Sample Web page
The browser and server communicate using a protocol called the Hypertext Transfer Protocol (HTTP). While you don’t need to worry about the details of the protocol, it’s worth understanding that it consists entirely of a sequence of requests initiated by the browser and responses generated by the server. That is, the browser connects to the Web server and sends a request that includes, at the least, the desired URL and the version of HTTP that the browser speaks. The browser can also include data in its request; that’s how the browser submits HTML forms to the server.
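To make the shape of a request concrete, here is a minimal sketch in Python of the text a browser might send. The function name `build_request`, the host `example.com`, and the form field are hypothetical, and real browsers send many more headers than this; the point is only that a request names a URL and an HTTP version and may carry form data.

```python
from urllib.parse import urlencode


def build_request(path, host, form_data=None):
    """Return the text of a minimal HTTP/1.1 request (illustrative only)."""
    # A request carrying form data is sent as POST; otherwise it is a GET.
    method = "POST" if form_data is not None else "GET"
    # The request line holds the method, the desired URL path, and the
    # version of HTTP the browser speaks.
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]
    body = ""
    if form_data is not None:
        # Form fields are encoded as key=value pairs joined by "&".
        body = urlencode(form_data)
        lines.append("Content-Type: application/x-www-form-urlencoded")
        lines.append(f"Content-Length: {len(body.encode('ascii'))}")
    # A blank line separates the headers from the (possibly empty) body.
    return "\r\n".join(lines) + "\r\n\r\n" + body


print(build_request("/another-page.html", "example.com"))
print(build_request("/submit", "example.com", {"name": "Ann"}))
```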
To reply to a request, the server sends a response made up of a set of headers and a body. The headers contain information about the body, such as what type of data it is (for instance, HTML, plain text, or an image), and the body is the data itself, which is then rendered by the browser. The server can also send an error response telling the browser that its request couldn’t be answered for some reason.
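The split between headers and body can be illustrated with a small Python sketch that takes a response apart. The function `parse_response` is a hypothetical helper, and it deliberately ignores details a real HTTP client must handle (binary bodies, repeated headers, chunked encoding, status lines without a reason phrase).

```python
def parse_response(raw):
    """Split a simple HTTP response into (status code, headers, body)."""
    # Headers and body are separated by the first blank line.
    head, _, body = raw.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    # The status line looks like "HTTP/1.1 200 OK".
    _version, code, _reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so normalize them.
        headers[name.strip().lower()] = value.strip()
    return int(code), headers, body


ok = "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<p>Hello, world!</p>"
status, headers, body = parse_response(ok)
print(status, headers["content-type"], body)

# An error response has the same shape; only the status code differs.
missing = "HTTP/1.1 404 Not Found\r\nContent-Type: text/plain\r\n\r\nNo such page."
print(parse_response(missing)[0])
```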
And that’s pretty much it. Once the browser has received the complete response from the server, there’s no communication between the browser and the server until the next time the browser decides to request a page from the server.2 This is the main constraint of Web programming—there’s no way for code running on the server to affect what the user sees in their browser unless the browser issues a new request to the server.3
Some Web pages, called static pages, are simply HTML files stored on the Web server and served up when requested by the browser. Dynamic pages, on the other hand, consist of HTML generated each time the page is requested by a browser. For instance, a dynamic page might be generated by querying a database and then constructing HTML to represent the results of the query.4
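As a rough illustration of a dynamic page, the following Python sketch builds HTML from a list of rows. The function `render_results_page` and the sample rows are hypothetical; the list simply stands in for the results of a database query.

```python
from html import escape


def render_results_page(title, rows):
    """Build an HTML page listing the given rows (illustrative only)."""
    # Escape each value so characters such as "<" and "&" in the data
    # are shown as text rather than interpreted as markup.
    items = "\n".join(f"<li>{escape(row)}</li>" for row in rows)
    return (
        "<html>\n"
        f"<head><title>{escape(title)}</title></head>\n"
        f"<body>\n<ul>\n{items}\n</ul>\n</body>\n"
        "</html>"
    )


# The page is regenerated from the current data on every request.
print(render_results_page("Results", ["a & b", "c"]))
```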
When generating its response to a request, server-side code has four main pieces of information to act on. The first is the requested URL, although typically the Web server itself uses the URL to determine what code is responsible for generating the response. Next, if the URL contains a question mark, everything after the question mark is the query string; the Web server typically ignores it, apart from making it available to the code generating the response. Most of the time the query string contains a set of key/value pairs. The request from the browser can also contain post data, which also usually consists of key/value pairs and is typically used to submit HTML forms. The key/value pairs supplied in either the query string or the post data are collectively called the query parameters.
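The way the URL path, query string, and post data combine into query parameters can be sketched in Python with the standard `urllib.parse` module. The function `query_parameters` and the sample URL are hypothetical, and merging both sources into one dictionary is just one reasonable choice; some frameworks keep them separate.

```python
from urllib.parse import parse_qs, urlsplit


def query_parameters(url, post_body=""):
    """Return (path, params), where params merges query string and post data."""
    parts = urlsplit(url)
    # Everything after the question mark is the query string.
    params = parse_qs(parts.query)
    # Form post data usually uses the same key=value&key=value encoding.
    for key, values in parse_qs(post_body).items():
        params.setdefault(key, []).extend(values)
    return parts.path, params


path, params = query_parameters("/search?q=lisp&page=2", "q=macros")
print(path)    # the part a server would use to pick the handling code
print(params)  # each key maps to a list, since a key may appear more than once
```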
Finally, in order to string together a sequence of individual requests from the same browser, code running on the server can set a cookie by sending a special header in its response that contains a bit of opaque data. After a particular server sets a cookie, the browser sends that cookie with each request it makes to that server. The browser doesn’t care about the data in the cookie; it just echoes it back to the server for the server-side code to interpret however it wants.
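The cookie round trip can be sketched with Python's standard `http.cookies` module. The helper names `set_cookie_header` and `read_cookies`, and the cookie name `session`, are hypothetical; a real application would also set attributes such as a path and an expiry.

```python
from http.cookies import SimpleCookie


def set_cookie_header(name, value):
    """Return the response header line a server sends to set a cookie."""
    cookie = SimpleCookie()
    cookie[name] = value
    return cookie.output()


def read_cookies(cookie_header):
    """Parse the value of the Cookie header the browser echoes back."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    return {key: morsel.value for key, morsel in cookie.items()}


# The server sets the cookie in one response ...
print(set_cookie_header("session", "abc123"))
# ... and the browser returns it, uninterpreted, on later requests.
print(read_cookies("session=abc123; theme=dark"))
```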
These are the primitive elements on top of which 99 percent of server-side Web programming is built. The browser sends a request, the server finds some code to handle the request and runs it, and the code uses query parameters and cookies to determine what to do.