What is HTTP?


We all use HTTP when we surf the web. Without it, browsing or searching the web as we know it would not be possible.

Even if we leave "http://" off an address, the browser fills in the scheme for us (traditionally HTTP, and HTTPS in many newer browsers).

Let's see what HTTP is.

What is HTTP?

We all know it is a transfer protocol used to send data over the web.

A common definition: HTTP is an application-level protocol used for fetching resources such as HTML documents.

Application level means that it works at the top layer (layer 7) of the OSI model.

We won't go deep into the OSI model but for an overview:

The OSI model is a conceptual model that provides a standard for communication between computers.

The application layer is where applications access network services and interact with users.

HTTP is the foundation of data exchange on the web. It is a client-server protocol: the client (usually the browser) sends requests, and the server answers them.

Apart from being a simple data transfer protocol, it also serves as a generic protocol for communication between user agents (such as browsers) and proxies/gateways to other Internet systems, including those that speak protocols like SMTP and FTP. This lets a client reach resources offered by different applications.

History of HTTP

  • The first version of HTTP was HTTP/0.9, a simple protocol for transferring raw data.
  • HTTP/1.0 came later and allowed messages in a MIME-like format, which made it possible to transfer content such as media files along with metadata about it.
  • HTTP/1.0 was doing great but it had gaps: it did not properly account for hierarchical proxies, caching, persistent connections, or virtual hosts. HTTP/1.1 was introduced with stricter requirements to ensure reliable implementations of these features.

HTTP usually runs over TCP, but it does not strictly depend on it. It only needs a reliable transport, so it can also run over a TLS-secured connection (this is HTTPS) or any other reliable transport protocol.

Just to understand: Transport Layer Security (TLS), the successor to Secure Sockets Layer (SSL), is a cryptographic protocol that provides private, tamper-resistant communication between two parties.

HTTP versioning

HTTP uses a <major>.<minor> numbering scheme to indicate versions.

<minor> is incremented when a change adds features that do not alter how messages are parsed but may add to the meaning of a message or to what the sender can do.

<major> is incremented when the format of a message within the protocol is changed.
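To make the numbering scheme concrete, here is a minimal sketch that parses a version string and compares two versions. The function name `parse_http_version` is made up for this example; the key point is that major and minor are compared as separate integers, not as one decimal number.

```python
import re

# An HTTP version string looks like "HTTP/<major>.<minor>".
_VERSION_RE = re.compile(r"HTTP/(\d+)\.(\d+)")


def parse_http_version(text):
    """Return (major, minor) as integers, or raise ValueError."""
    match = _VERSION_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not an HTTP version: {text!r}")
    return int(match.group(1)), int(match.group(2))


# Tuples compare element by element, so major is checked before minor.
print(parse_http_version("HTTP/1.1") > parse_http_version("HTTP/1.0"))   # True
# Each part is its own integer: 2.13 is a higher version than 2.4.
print(parse_http_version("HTTP/2.13") > parse_http_version("HTTP/2.4"))  # True
```

Returning a tuple keeps the comparison logic trivial, since Python already orders tuples the way the versioning rule requires.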

Working of HTTP

So now we know what HTTP is and its history. Let's see how it works.

Since it is a client-server protocol, it follows a request-response model.

Request

A client sends a request to the server. It consists of a request method, a URI, and a protocol version, followed by a MIME-like message containing request modifiers, client information, and possibly a body.

  • Request method: what to do with the resource, such as GET, POST, PUT, or DELETE
  • URI: which resource you want from the server
  • Protocol version: such as HTTP/1.0 or HTTP/1.1
  • Client information: your browser's identity (for example, the User-Agent header)
  • Content body: data you send to the server, such as a submitted form (often empty for GET)

The sender of a request is called a user agent. It can be a browser, or a robot that crawls the web to populate and maintain a search engine index. A proxy can also forward requests on a client's behalf.

Proxy

A proxy is a computer or program that sits between the client and the server and relays HTTP messages between them.

It can intercept a request, answer it itself, or modify it before sending it further.

Many other machines also sit between the client and the server, such as routers and modems. Most of them work at the lower layers of the OSI model and are invisible at the HTTP level. The ones that work at the application layer are generally called proxies.

Response

When the server receives the request, it responds with a status line (protocol version, status code, and status message) followed by a MIME-like message containing server information, meta information, and possibly body content.

  • Status code: a three-digit number that says whether the request succeeded or failed (for example, 200 or 404)
  • Status message: a human-readable form of the status code (for example, OK or Not Found)
  • Body content: the requested result, such as an HTML page

Most HTTP communication is initiated by the client toward the origin server, although newer mechanisms (server-sent events, for example) let a server push data to the client once the client has opened a connection.

Components of HTTP-based Systems

Client (The User Agent)

A user agent is anything that acts on behalf of the user. This role is generally performed by the browser. The browser sends a request to the server, receives an HTML page, and then requests the other files that the page references (such as scripts, stylesheets, and images).

Proxies

As we have covered proxies above, it's the same old same old.

A proxy can modify a request coming from the client, for example by translating it to a different HTTP version or into a form the server understands.

Servers

A server is a computer that serves documents to the client. It looks like a single machine to the client, but it may actually be a collection of computers sharing the load or communicating with other computers.

HTTP Flow:

  • Open a TCP connection: A connection is established between client and server.

The client sends the server a segment called SYN (synchronize). It carries the client's initial sequence number and is addressed to the port the server is listening on.

SYN = 3023

The server replies with a SYN-ACK segment. Its ACK (acknowledgment number) is the client's sequence number plus one, and its SYN carries the server's own initial sequence number.

ACK = 3024

SYN = 5043

When the client receives this, it sends a final ACK whose acknowledgment number is the server's sequence number plus one.

ACK = 5044

Each side knows the other received its message because the acknowledgment number it gets back is the next number after the sequence number it sent.

And this is what we call a 3-way handshake.

It confirms that both sides can send and receive before any data flows, so packets are not being sent into the void.
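The number bookkeeping in the handshake can be simulated in a few lines. This is a simplified model, not real TCP: the function `three_way_handshake` and the dictionary layout are invented here, real sequence numbers are 32-bit values that wrap around, and the operating system performs the handshake for you. The starting numbers 3023 and 5043 are the example values used in this section.

```python
def three_way_handshake(client_isn, server_isn):
    """Return the three segments of a simplified TCP handshake."""
    # 1. Client -> server: SYN carrying the client's initial sequence number.
    syn = {"flags": "SYN", "seq": client_isn}
    # 2. Server -> client: SYN-ACK acknowledging client_isn + 1 and
    #    carrying the server's own initial sequence number.
    syn_ack = {"flags": "SYN-ACK", "seq": server_isn, "ack": syn["seq"] + 1}
    # 3. Client -> server: ACK acknowledging server_isn + 1.
    ack = {"flags": "ACK", "seq": syn_ack["ack"], "ack": syn_ack["seq"] + 1}
    return [syn, syn_ack, ack]


for segment in three_way_handshake(client_isn=3023, server_isn=5043):
    print(segment)
```

The second segment acknowledges 3024 and the third acknowledges 5044, which is how each side confirms that the other received its starting number.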

  • Send HTTP message: A request is formed and sent to the server. Example: GET / HTTP/1.1

  • Server responds back: The server sends a response back to the client. Example: HTTP/1.1 200 OK

  • Connection is closed or reused.
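The whole flow above can be tried locally with Python's standard library. This sketch starts a throwaway server on the local machine and makes one request against it. The handler class `HelloHandler`, the function `fetch_once`, and the `hello` body are made up for the demo. The TCP connection (including the handshake) is opened by the operating system when `conn.request` is called.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b"hello"
        self.send_response(200)  # status line, e.g. "200 OK"
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def fetch_once():
    # Port 0 asks the OS for any free port.
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        conn = http.client.HTTPConnection("127.0.0.1", port)
        conn.request("GET", "/")        # steps 1 and 2: connect, send request
        response = conn.getresponse()   # step 3: read the server's response
        result = (response.status, response.reason, response.read())
        conn.close()                    # step 4: close the connection
        return result
    finally:
        server.shutdown()
        server.server_close()


print(fetch_once())
```

Running the server in a background thread keeps the example self-contained, so it does not depend on any external site being reachable.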

Connection and HTTP

A connection between two devices can take many forms, and it may or may not be secured.

A connection is established at the transport layer, with or without reliability.

HTTP, however, needs a reliable connection, so protocols like TCP are used.

For a simple connection between a client and a server, it's not much of a story.

But when proxies, gateways, and tunnels come in, it gets complicated, and a request may be rewritten several times.

A proxy is a forwarding agent that rewrites all or part of the message and sends it on toward the server identified by the URI.

A gateway is a receiving agent that acts as a layer above some other server and, if needed, translates requests into that server's protocol.

A tunnel acts as a relay point between two connections without changing the messages. It is generally used to pass messages through a firewall.
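The difference between a proxy and a tunnel is easy to show on a single message. In this sketch the proxy rewrites part of the request by adding a `Via` header (a standard header that intermediaries use to announce themselves), while the tunnel passes the bytes along untouched. The function names and the proxy name `demo-proxy` are invented, and the code assumes it is given one complete request.

```python
def proxy_forward(request: bytes, proxy_name: str = "demo-proxy") -> bytes:
    """Rewrite part of the message: add a Via header before forwarding."""
    # Split the head (request line + headers) from the body.
    head, sep, body = request.partition(b"\r\n\r\n")
    via = f"Via: 1.1 {proxy_name}".encode("ascii")
    return head + b"\r\n" + via + sep + body


def tunnel_relay(data: bytes) -> bytes:
    """A tunnel relays bytes without looking at or changing them."""
    return data


request = b"GET / HTTP/1.1\r\nHost: example.com\r\n\r\n"
print(proxy_forward(request))
print(tunnel_relay(request) == request)
```

A gateway is not shown here because what it does depends entirely on the protocol of the server behind it.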

Request ------> UA --------> P --------> V ------> G --------> V --------> S

UA - user agent, P - proxy, V - single connection, G - gateway, S - server

Proxies and gateways must be careful when forwarding messages between different protocol versions: they must not send a message labeled with a version higher than the one they actually support. If they receive a request with a higher version, they must downgrade it, respond with an error, or switch to tunnel behavior.
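The three options for a too-new request can be written as a small decision function. This is a sketch under assumptions: the name `handle_version` and the `policy` parameter are invented, versions are (major, minor) tuples, 505 (HTTP Version Not Supported) is used as one plausible error status, and requests that are not newer than the intermediary are simply forwarded with their version unchanged, which is a simplification.

```python
def handle_version(request_version, own_version, policy="downgrade"):
    """Decide what an intermediary does with an incoming request version.

    Versions are (major, minor) tuples, e.g. (1, 1) for HTTP/1.1.
    """
    if request_version <= own_version:
        # Nothing to fix: the request is not newer than what we speak.
        return ("forward", request_version)
    if policy == "downgrade":
        # Never label a message with a version higher than our own.
        return ("forward", own_version)
    if policy == "error":
        return ("reject", 505)  # assumed status: HTTP Version Not Supported
    if policy == "tunnel":
        # Relay the bytes untouched, so the original version is preserved.
        return ("tunnel", request_version)
    raise ValueError(f"unknown policy: {policy!r}")


# An HTTP/1.0 proxy receiving an HTTP/1.1 request:
print(handle_version((1, 1), (1, 0)))
print(handle_version((1, 1), (1, 0), policy="error"))
print(handle_version((1, 1), (1, 0), policy="tunnel"))
```

Whichever policy is chosen, the intermediary never sends a message that claims a version it does not implement.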

That's all about HTTP. Thank you for reading. Feedback is appreciated.