Dr. Greg Bernstein
Updated March 18th, 2021
From RFC7230:
HTTP is a stateless request/response protocol that operates by exchanging messages across a reliable transport or session-layer “connection”.
Stateless: The protocol itself provides “no memory” about previous requests or responses. Each request is “new”.
Request/Response: All exchanges are initiated by the client making a request. Servers cannot independently send updates to clients; use WebSockets for that.
How is a message different from a packet?
A message in this case is an application layer concept. Large HTTP messages are broken into smaller packets by the transport layer (usually TCP) and then sent over IP.
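To make the distinction concrete, here is a minimal Node.js sketch of one application-layer message being cut into smaller pieces and put back together. The helper names `splitIntoSegments` and `reassemble` and the 16-byte limit are assumptions chosen for illustration; real segmentation is done by the operating system's TCP implementation, not by application code.

```javascript
// Illustration only: the OS's TCP stack performs the real segmentation.
// splitIntoSegments and reassemble are hypothetical helper names.
function splitIntoSegments(message, maxBytes) {
  const data = Buffer.from(message, "utf8");
  const segments = [];
  for (let offset = 0; offset < data.length; offset += maxBytes) {
    // subarray clamps the end index, so the last piece may be shorter
    segments.push(data.subarray(offset, offset + maxBytes));
  }
  return segments;
}

function reassemble(segments) {
  return Buffer.concat(segments).toString("utf8");
}

const message = "GET / HTTP/1.1\r\nHost: www.grotto-networking.com\r\n\r\n";
const segments = splitIntoSegments(message, 16);
console.log(segments.length, "segments carry one HTTP message");
console.log("reassembled correctly:", reassemble(segments) === message);
```

The application only ever sees the whole message; the pieces are an implementation detail of the layers below.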
There are only two types of HTTP messages: request and response.
“Messages are passed in a format similar to that used by Internet mail [RFC5322] and the Multipurpose Internet Mail Extensions (MIME) [RFC2045]”
From RFC7230:
An HTTP client is a program that establishes a connection to a server for the purpose of sending one or more HTTP requests.
From RFC7230:
An HTTP server is a program that accepts connections in order to service HTTP requests by sending HTTP responses.
The same computer can host multiple clients and servers.
A single program may have both client and server functionality.
We typically think of a web browser as the client, but we can and will make HTTP requests programmatically for testing and other purposes.
Web servers come in many “flavors” depending on deployment context, features, performance, etc…
However, intermediate systems called proxies may also take part.
HTTP/1.1 is a text-based protocol that uses CRLF to separate the various parts of messages.
HTTP/2 keeps most of the high-level interface that we will learn but provides much more efficient methods for encoding (binary) and transmission of messages.
Both versions are supported by Node.js and Express.js, but we will only use HTTP/1.1 for simplicity.
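Because HTTP/1.1 is plain text with CRLF separators, a message can be taken apart with ordinary string operations. The following is a rough sketch, not a complete parser: `splitMessage` is a hypothetical helper, and it assumes the whole message is available as a single string (real bodies may be binary).

```javascript
// Hypothetical helper: split a raw HTTP/1.1 message into its three parts.
function splitMessage(raw) {
  // An empty line (CRLF CRLF) marks the end of the header section.
  const headEnd = raw.indexOf("\r\n\r\n");
  if (headEnd === -1) {
    throw new Error("No empty line found after the headers");
  }
  const head = raw.slice(0, headEnd);
  const body = raw.slice(headEnd + 4);
  const [startLine, ...headerLines] = head.split("\r\n");
  return { startLine, headerLines, body };
}

const raw =
  "HTTP/1.1 200 OK\r\n" +
  "Content-Type: text/plain\r\n" +
  "Content-Length: 5\r\n" +
  "\r\n" +
  "hello";
const parts = splitMessage(raw);
console.log(parts);
```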
method request-target HTTP/1.1 \r\n
Method types: GET, HEAD, POST, PUT, DELETE, CONNECT, OPTIONS, TRACE. See RFC7231.
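The request-target is usually only the path part of the URI (everything from the first slash after the domain, including any query string); requests sent to a proxy use the entire URI instead.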
All general-purpose servers MUST support the methods GET and HEAD. All other methods are OPTIONAL.
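The request line format described above can be checked with a few lines of code. This is a simplified sketch: `parseRequestLine` is a hypothetical helper, and rejecting every method outside the eight listed here is an assumption made to keep the example short (HTTP allows additional methods to be defined).

```javascript
// The eight methods listed above (RFC7231).
const METHODS = ["GET", "HEAD", "POST", "PUT", "DELETE", "CONNECT", "OPTIONS", "TRACE"];

// Hypothetical helper: parse "method request-target HTTP-version".
function parseRequestLine(line) {
  const parts = line.split(" ");
  if (parts.length !== 3) {
    throw new Error("Malformed request line: " + line);
  }
  const [method, target, version] = parts;
  if (!METHODS.includes(method)) {
    // Simplification: real servers may accept other registered methods.
    throw new Error("Unknown method: " + method);
  }
  if (!/^HTTP\/\d\.\d$/.test(version)) {
    throw new Error("Bad HTTP version: " + version);
  }
  return { method, target, version };
}

console.log(parseRequestLine("GET / HTTP/1.1"));
console.log(parseRequestLine("POST /login?next=home HTTP/1.1"));
```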
The start line of a message (request or response) is followed by zero or more headers. These headers have the form:
Header-Name: Information \r\n
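Request headers, grouped as in RFC7231:
Controls: Host, Cache-Control, Expect, …
Content negotiation: Accept, Accept-Charset, Accept-Encoding, Accept-Language
Authentication credentials: Authorization, Proxy-Authorization
Request context: From, Referer, User-Agent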
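Header lines of the `Header-Name: Information` form can be collected into a lookup table. The sketch below assumes the lines have already been split on CRLF; `parseHeaders` is a hypothetical helper. Header names are case-insensitive, so they are lowercased here, and joining repeated headers with a comma is a simplification that does not suit every header.

```javascript
// Hypothetical helper: turn header lines into a Map keyed by lowercase name.
function parseHeaders(headerLines) {
  const headers = new Map();
  for (const line of headerLines) {
    const colon = line.indexOf(":");
    if (colon === -1) {
      throw new Error("Malformed header line: " + line);
    }
    const name = line.slice(0, colon).toLowerCase(); // names are case-insensitive
    const value = line.slice(colon + 1).trim();
    // Simplification: repeated headers are joined with ", ".
    headers.set(name, headers.has(name) ? headers.get(name) + ", " + value : value);
  }
  return headers;
}

const headers = parseHeaders([
  "Host: www.grotto-networking.com",
  "Accept-Language: en-US,en;q=0.8",
  "User-Agent: example-client",
]);
console.log(headers.get("host"));
console.log(headers.get("accept-language"));
```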
Example request to www.grotto-networking.com:
GET / HTTP/1.1\r\n
Host: www.grotto-networking.com\r\n
Connection: keep-alive\r\n
Upgrade-Insecure-Requests: 1\r\n
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36\r\n
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8\r\n
Accept-Encoding: gzip, deflate, sdch\r\n
Accept-Language: en-US,en;q=0.8\r\n
\r\n
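The `q=` values in the `Accept` and `Accept-Language` headers of the request above are relative preference weights; an entry without one defaults to 1. As a rough illustration, the sketch below orders the media types of an `Accept` value by weight. `parseAccept` is a hypothetical helper and ignores parameters other than `q`.

```javascript
// Hypothetical helper: list media types from an Accept header, most preferred first.
function parseAccept(value) {
  return value
    .split(",")
    .map((item) => {
      const [type, ...params] = item.trim().split(";");
      let q = 1; // default weight when no q parameter is given
      for (const param of params) {
        const [key, val] = param.trim().split("=");
        if (key === "q") {
          q = parseFloat(val);
        }
      }
      return { type: type.trim(), q };
    })
    .sort((a, b) => b.q - a.q); // stable sort keeps the original order among ties
}

const accept =
  "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8";
for (const entry of parseAccept(accept)) {
  console.log(entry.q, entry.type);
}
```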
HTTP/1.1 status-code reason-phrase \r\n
See RFC7231
200 OK: Things worked!
400 Bad Request, 401 Unauthorized, 403 Forbidden: Malformed request or permission/authorization issues.
404 Not Found: Client asking for something that doesn’t exist or we put something in the wrong place on the server.
500 Internal Server Error: A problem with the server code.
Response headers, grouped as in RFC7231:
Control data: Age, Cache-Control, Expires, Date, Location, …
Validators: ETag, Last-Modified
Authentication challenges: WWW-Authenticate, Proxy-Authenticate
Response context: Accept-Ranges, Allow, Server
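Returning to the status line, the first digit of the status code gives its general class (1xx informational, 2xx successful, 3xx redirection, 4xx client error, 5xx server error). The following sketch parses a status line and reports that class; `parseStatusLine` and `statusClass` are hypothetical helper names.

```javascript
// Hypothetical helper: parse "HTTP-version status-code reason-phrase".
function parseStatusLine(line) {
  const match = /^(HTTP\/\d\.\d) (\d{3}) (.*)$/.exec(line);
  if (!match) {
    throw new Error("Malformed status line: " + line);
  }
  return {
    version: match[1],
    statusCode: Number(match[2]),
    reasonPhrase: match[3],
  };
}

// Hypothetical helper: map a status code to its class by first digit.
function statusClass(code) {
  const classes = {
    1: "Informational",
    2: "Successful",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
  };
  return classes[Math.floor(code / 100)] || "Unknown";
}

for (const line of ["HTTP/1.1 200 OK", "HTTP/1.1 404 Not Found"]) {
  const status = parseStatusLine(line);
  console.log(status.statusCode, status.reasonPhrase, "->", statusClass(status.statusCode));
}
```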
Representation metadata headers such as Content-Type and Content-Encoding are used to describe body content; see RFC7231.
Example response from www.grotto-networking.com:
HTTP/1.1 200 OK\r\n
Server: nginx\r\n
Date: Wed, 12 Apr 2017 20:08:38 GMT\r\n
Content-Type: text/html\r\n
Transfer-Encoding: chunked\r\n
Connection: keep-alive\r\n
Vary: Accept-Encoding\r\n
Last-Modified: Thu, 16 Mar 2017 02:56:28 GMT\r\n
ETag: W/"5581190-2b5e-54ad0351c774c"\r\n
Content-Encoding: gzip\r\n
\r\n
[Page content gzipped]
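The `Content-Encoding: gzip` header in the response above tells the client that the body must be decompressed before use. A minimal sketch of that round trip with Node's built-in `zlib` module follows; the HTML string is a made-up stand-in for the real page content.

```javascript
const zlib = require("zlib");

// Made-up page content standing in for the real response body.
const page = "<html><body><h1>Hello HTTP</h1></body></html>";

// What a server might do before sending with Content-Encoding: gzip.
const compressed = zlib.gzipSync(Buffer.from(page, "utf8"));

// What a client does after seeing Content-Encoding: gzip.
const restored = zlib.gunzipSync(compressed).toString("utf8");

console.log("compressed bytes:", compressed.length);
console.log("round trip ok:", restored === page);
```

Browsers perform this decompression automatically, which is why the encoded bytes are normally only visible in a packet trace.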
From MDN HTTP Overview
Wireshark HTTP trace to www.grotto-networking.com:
A forward proxy is an Internet-facing proxy used to retrieve data from a wide range of sources. Used for monitoring, content filtering, bypassing filters and censorship, caching, and more.
A reverse proxy is an internal-facing proxy used as a front-end to control access to servers on a private network. Common tasks include: load-balancing, authentication, decryption or caching.
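As one example of a reverse proxy task, load balancing can be as simple as handing each incoming request to the next server in a list. The sketch below shows only that selection step; `makeRoundRobin` is a hypothetical helper, the backend addresses are invented, and actually forwarding the request is left out.

```javascript
// Hypothetical helper: returns a function that picks backends in rotation.
function makeRoundRobin(backends) {
  if (backends.length === 0) {
    throw new Error("At least one backend is required");
  }
  let next = 0;
  return function pick() {
    const backend = backends[next];
    next = (next + 1) % backends.length; // wrap around to the first backend
    return backend;
  };
}

// Invented private-network addresses, for illustration only.
const pickBackend = makeRoundRobin(["10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.3:8080"]);
for (let i = 0; i < 4; i++) {
  console.log("request", i + 1, "->", pickBackend());
}
```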
We will want to use proxy functionality in development:
Servers and proxies will perform different actions on request messages based on:
The URL or portions of the URL
The HTTP Method (GET, POST, etc…)
We can think of this as application layer switching. High-end servers such as NGINX and Apache 2 provide elaborate configuration options for this. Almost all servers provide some capabilities for this.
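A small sketch of this kind of switching: a table of rules matched on the HTTP method and the start of the URL path, where the first matching rule wins. The rule table, the action strings and the `route` function are all assumptions made for illustration, not the configuration format of any particular server.

```javascript
// Hypothetical routing table: the first rule that matches wins.
const routes = [
  { method: "GET", prefix: "/api/", action: "forward to API server" },
  { method: "POST", prefix: "/api/", action: "forward to API server" },
  { method: "GET", prefix: "/", action: "serve static file" },
];

// Hypothetical helper: choose an action from the method and the URL.
function route(method, url) {
  const path = url.split("?")[0]; // ignore the query string when matching
  const match = routes.find(
    (rule) => rule.method === method && path.startsWith(rule.prefix)
  );
  return match ? match.action : null; // null means no rule applies
}

console.log(route("GET", "/api/users?id=1"));
console.log(route("GET", "/index.html"));
console.log(route("DELETE", "/index.html"));
```

Placing the more specific `/api/` rules ahead of the catch-all `/` rule is what keeps API requests from being treated as static files.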