From HTTP request to dynamically rendering a webpage

Author: Admin
August 1, 2013

Overview

Emil Stenstrom put together an excellent high-level overview of the rendering process in the context of a modern web application. Now let’s go through an edited and updated version by Stanford.

1. You begin by typing a URL into the address bar of your preferred browser or by clicking a link.

2. The browser parses the URL to find the protocol, host, port, and path.
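As a rough illustration of this parsing step, here is a minimal sketch using Python's standard urllib.parse module. The helper name split_url, the DEFAULT_PORTS table, and the example.com URLs are assumptions made for the example; a real browser's URL parser handles many more cases.

```python
from urllib.parse import urlsplit

# Default ports per scheme, used when the URL does not name a port explicitly.
DEFAULT_PORTS = {"http": 80, "https": 443}

def split_url(url):
    """Return (protocol, host, port, path) for an absolute URL."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    path = parts.path or "/"   # an empty path means the site root
    return parts.scheme, parts.hostname, port, path

print(split_url("http://example.com/courses/index.html"))
# ('http', 'example.com', 80, '/courses/index.html')
print(split_url("https://example.com:8443"))
# ('https', 'example.com', 8443, '/')
```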

3. If HTTP was specified, it forms an HTTP request.
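To make the request concrete, the sketch below builds the raw text of a minimal HTTP/1.1 GET request. The function name build_get_request is hypothetical, and real browsers send many more headers (User-Agent, Accept, cookies, and so on).

```python
def build_get_request(host, path="/"):
    """Build the raw bytes of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",  # request line: method, path, protocol version
        f"Host: {host}",         # the Host header is required in HTTP/1.1
        "Connection: close",     # ask the server to close the socket when done
        "",                      # blank line marks the end of the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")

print(build_get_request("example.com", "/courses/index.html").decode("ascii"))
```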

4. To reach the host, it first needs to translate the human-readable host name into an IP address, and it does this by doing a DNS lookup on the host.
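The lookup can be sketched with the system resolver via Python's socket.getaddrinfo. The helper name resolve is an assumption for the example, and the addresses returned depend on your network and DNS configuration.

```python
import socket

def resolve(host, port=80):
    """Return the distinct IP addresses the system resolver reports for host."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    addresses = []
    for info in infos:
        # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
        ip = info[4][0]
        if ip not in addresses:
            addresses.append(ip)
    return addresses
```

Calling resolve("example.com") on a machine with network access returns a list of one or more IPv4 or IPv6 addresses as strings.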

5. Then a socket needs to be opened from the user’s computer to that IP address, on the port specified (most often port 80 for HTTP).

6. Once the network connection is open, the HTTP request is sent to the host. The layers involved in this connection are described by the 7-layer OSI model.
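Steps 5 and 6 can be sketched together: open a TCP socket to the IP address and port, send the request bytes, and read until the server closes the connection. The function name fetch_raw and the 5-second timeout are assumptions for the example, and the sketch relies on the request asking the server to close the connection (for instance with a "Connection: close" header).

```python
import socket

def fetch_raw(ip, port, request_bytes, timeout=5.0):
    """Open a TCP connection, send the request, and return the raw response bytes."""
    with socket.create_connection((ip, port), timeout=timeout) as sock:
        sock.sendall(request_bytes)
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:  # an empty read means the server closed the connection
                break
            chunks.append(chunk)
    return b"".join(chunks)
```

The returned bytes contain the status line, the response headers, a blank line, and then the body.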

7. The host forwards the request to the server software (most often Apache) configured to listen on the specified port.

8. The server inspects the request (most often only the path), and the subsequent behavior depends on the type of site.

For Static content:

– The HTTP response does not change as a function of the user, time of day, geographic location, or other parameters (such as HTTP Request headers).
– In this case a fast static web server like nginx can rapidly serve up the same HTML/CSS/JS and binary files (jpg, mp4, etc.) to each visitor.
– Academic webpages are good examples of static content: the experience is the same for each user and there is no login.

For Dynamic content:

– The HTTP response does change as a function of the user, time of day, geographical location, or the like.
– In this case you will usually forward dynamic requests from a web server like nginx (or Apache) to a constantly running server-side daemon (like mod_wsgi hosting Django or node.js behind nginx), with the static requests intercepted and returned by nginx (or even earlier, by caching layers).
– The server-side web framework you use (such as Python / Django, Ruby / Rails, or node.js / Express) gets access to the full request, and starts to prepare an HTTP response. A web framework is a collection of related libraries for working with HTTP responses and requests (and other things); for example, here are the Express request and response APIs, which allow programmatic manipulation of requests and responses as JavaScript objects.
– To construct the HTTP response a relational database is often accessed. You can think of a relational database as a set of tables similar to Excel tables, with links between rows (example database schema).
– While sometimes raw SQL is used to access the database, modern web frameworks allow engineers to access data via so-called Object-Relational Mappers (ORMs), such as sequelize.js (for node.js) or the Django ORM (for Python).
The ORM provides a high-level way of manipulating data within your language after defining some models. (Example models / instances in sequelize.js).
– The specific data for the current HTTP request is often obtained via a database search using the ORM (example), based on parameters in the path (or data) of the request.
– The objects created via the ORM are then used to template an HTML page (server-side templating), to directly return JSON (for use in APIs or client-side templating), or to otherwise populate the body of the HTTP response.
This body is then conceptually put in an envelope with HTTP Response headers as metadata labeling that envelope.
– The web framework then returns the HTTP response back to the browser.
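The dynamic path above can be sketched end to end with Python's standard library alone. This is a minimal illustration under several assumptions, not how any particular framework does it: it uses the plain WSGI interface instead of Django or Express, raw SQL through sqlite3 instead of an ORM, and string.Template for server-side templating. The courses table, its slug and title columns, and the sample row are all hypothetical.

```python
import sqlite3
from html import escape
from string import Template

# Hypothetical schema and data, kept in memory for the sake of the example.
db = sqlite3.connect(":memory:", check_same_thread=False)
db.execute("CREATE TABLE courses (slug TEXT PRIMARY KEY, title TEXT)")
db.execute("INSERT INTO courses VALUES ('cs101', 'Intro to Computing')")

PAGE = Template("<html><body><h1>$title</h1></body></html>")
NOT_FOUND = "<html><body><h1>Not found</h1></body></html>"

def app(environ, start_response):
    """WSGI application: look up a row by the request path and template it into HTML."""
    slug = environ.get("PATH_INFO", "/").strip("/")
    # Parameterized query: the path is passed as data, never spliced into the SQL.
    row = db.execute("SELECT title FROM courses WHERE slug = ?", (slug,)).fetchone()
    if row is None:
        status, body = "404 Not Found", NOT_FOUND
    else:
        status, body = "200 OK", PAGE.substitute(title=escape(row[0]))
    payload = body.encode("utf-8")
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(payload))),
    ]
    start_response(status, headers)  # the headers: metadata on the "envelope"
    return [payload]                 # the body: what goes inside the envelope
```

To try it locally, the app can be served with wsgiref.simple_server.make_server("", 8000, app).serve_forever() and visited at /cs101; any other path returns the 404 page.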

9. The browser receives the response. Assuming for now that the web framework used server-side templating and returned HTML, this HTML is parsed. Importantly, the browser must be robust to broken or misformatted HTML.

10. A Document Object Model (DOM) tree is built out of the HTML. The DOM is a tree-structured representation of a webpage. This is confusing at first, as a webpage may look planar rather than hierarchical, but the key is to think of a webpage as composed of chapter headings, subsections, and sub-subsections (due to HTML’s ancestry as a descendant of SGML, used for formatting books).
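To see the hierarchy hiding in a flat-looking page, the sketch below builds a simplified DOM-like tree with Python's html.parser. The Node and TreeBuilder classes and the outline helper are made up for this example; unlike a real browser, the sketch assumes well-formed HTML and does no error recovery.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so nothing can nest inside them.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}

class Node:
    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.children = []
        self.text = ""

class TreeBuilder(HTMLParser):
    """Build a simplified DOM-like tree from well-formed HTML."""
    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self.current)
        self.current.children.append(node)
        if tag not in VOID_TAGS:
            self.current = node  # descend: later content nests inside this element

    def handle_endtag(self, tag):
        # Climb back to the parent only when the closing tag matches the open element.
        if self.current.tag == tag and self.current.parent is not None:
            self.current = self.current.parent

    def handle_data(self, data):
        self.current.text += data.strip()

def outline(node, depth=0):
    """Return the tree as a list of indented lines, one tag per line."""
    lines = ["  " * depth + node.tag]
    for child in node.children:
        lines.extend(outline(child, depth + 1))
    return lines

builder = TreeBuilder()
builder.feed("<html><body><h1>Title</h1><p>Hello <b>world</b></p></body></html>")
builder.close()
print("\n".join(outline(builder.root)))
```

The printed outline nests h1 and p under body, and b under p, which is the same parent-child structure the DOM exposes.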

11. All browsers provide a standard programmatic JavaScript API for interacting with the DOM, though today most engineers manipulate the DOM through the cross-browser jQuery library or higher-level frameworks like Backbone. Compare and contrast jQuery with the legacy APIs to see why.

12. New requests are made to the server for each new resource that is found in the HTML source (typically images, style sheets, and JavaScript files). Go back to step 3 and repeat for each resource.
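Finding those resources can be sketched by scanning the HTML for img, script, and stylesheet link tags and resolving each reference against the page URL. The ResourceFinder class and find_resources helper are hypothetical names, and the sketch is simplified: it only recognizes a rel value of exactly "stylesheet" and ignores resources referenced from CSS or added by scripts.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class ResourceFinder(HTMLParser):
    """Collect absolute URLs of images, style sheets, and scripts referenced by a page."""
    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.resources = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("img", "script") and attrs.get("src"):
            ref = attrs["src"]
        elif tag == "link" and attrs.get("rel") == "stylesheet" and attrs.get("href"):
            ref = attrs["href"]
        else:
            return
        # Relative references are resolved against the URL of the page itself.
        self.resources.append(urljoin(self.base_url, ref))

def find_resources(html, base_url):
    """Return the resource URLs found in html, in document order."""
    finder = ResourceFinder(base_url)
    finder.feed(html)
    finder.close()
    return finder.resources
```

Each URL in the returned list would then go through the same parse, resolve, connect, and request cycle as the original page.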

13. CSS is parsed, and used to annotate each node in the DOM tree with style information on how it should render. CSS controls appearance.

14. JavaScript is parsed and executed, and DOM nodes are moved and style information is updated accordingly. That is, JavaScript controls behavior, and the JavaScript executed on page load can be used to move nodes around or change appearance (by updating or setting CSS styles).

15. The browser renders the page on the screen according to the DOM tree and the final style information for each node.

16. You see the webpage and can interact with it by clicking on buttons or submitting forms. Every link you click or form you submit sends another HTTP request to a server, and the process repeats.

To recap, this is a good overview of what happens on the server and then the client (in this case the browser) when you navigate to a URL. Quite a miracle, and that’s just what happens when you click a link.