One of the cool things that we do is make an HTTP server (ASP.NET-like) from scratch. In this series of posts (I don't really know how many there are going to be) I will try to show how simple (or not) it is to make a functional HTTP server in 3 different languages:
- C# – The first implementation.
- Java – The second.
- C++ – The last.
You are probably asking why the hell this guy is doing the same thing in three different languages. I have good reasons: the first implementation is going to be in C# because I already have it coded xD; the second in Java for the challenge (we here are very, very sick of Java, so doing something in it is always a challenge xDD); and finally in C++, simply because I haven't done anything substantial in this language and out of plain curiosity to learn its concurrency libraries.
Now, introductions aside, let's talk about the simple (but complex) thing that is the HyperText Transfer Protocol.
HTTP is one of the simplest protocols that I've studied since I started my degree. As stupid as it may appear, this protocol can be explained in 3 lines:
- HTTP is a request/response protocol and it's stateless.
- The request is characterized by a method, destination URL, protocol version, headers and data.
- The response is characterized by an HTTP status code, some mandatory headers, an empty line and data.
Of course this means nothing if the only thing you know about HTTP is that it's the thing that appears before URLs. So let's go deep into HTTP.
HTTP is a request/response protocol: a client requests something from a server and the server produces a response. The execution path is always the same: the client requests, the server responds.
HTTP is a stateless protocol. This feature matters most on the server side: every request carries everything needed to process it, so the protocol itself does not require the server to keep session data about users between requests.
A simple HTTP request is composed of just three mandatory lines:
First line: <method><space><request URI><space><HTTP protocol version><CRLF>
This line is called the request line, and the name isn't just for show: it defines the 'what' and the 'how' of the request. It ends with CRLF, a carriage return followed by a line feed. Let's check an example. When you put the programmaticallySpeaking URL in your browser and press enter, the following is sent to the server:
GET /blog/ HTTP/1.1
This line can be read as: give me (GET) the content in /blog/, and I'm using version 1.1 of the protocol. (1.1?! Shouldn't it be at version 100000 by now? No: the protocol hasn't changed that much since 1999, and considering that the web changes daily, that is quite something.)
The method in this case is GET. There are a bunch of methods you can use (see the HTTP RFC in the references below), but the most common are GET, for asking the server for some content, and POST, for sending data to the server (for example, to create or update something).
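To make the three parts of the request line concrete, here is a minimal sketch in Python (not one of the three languages of this series, just handy for a short illustration). The function name parse_request_line is made up for this example, and the checks are deliberately simple.

```python
def parse_request_line(line: str) -> tuple:
    """Split a request line into (method, request URI, protocol version)."""
    # Drop the trailing line terminator, then split on single spaces.
    parts = line.rstrip("\r\n").split(" ")
    if len(parts) != 3:
        raise ValueError(f"malformed request line: {line!r}")
    method, uri, version = parts
    if not version.startswith("HTTP/"):
        raise ValueError(f"unexpected protocol version: {version!r}")
    return method, uri, version


print(parse_request_line("GET /blog/ HTTP/1.1\r\n"))
# ('GET', '/blog/', 'HTTP/1.1')
```

A real server would also have to decide what to answer when the line is malformed; here the sketch simply raises an exception.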
After the request line, the headers are sent to the server like this:
Headers – <Header Name>:<Header Value>
The headers are sent one per line, each line ending in CRLF, and their purpose is to pass additional information from the client to the server. There are plenty of headers, but only one is mandatory on an HTTP/1.1 request: the Host header. This header names the host (and optionally the port) that the request is addressed to; its main purpose is to “differentiate between internally-ambiguous URL”, for example when several sites share one IP address.
To wrap up, here is a complete HTTP request:
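As a rough sketch of how a server could read those header lines, the Python snippet below splits each one at the first colon and checks that Host is present. The names parse_headers and require_host, and the host example.com, are assumptions made up for this illustration.

```python
def parse_headers(lines):
    """Turn 'Name: value' lines into a dict keyed by lowercase header name."""
    headers = {}
    for line in lines:
        # Split at the first colon only, so values such as 'example.com:8080' survive.
        name, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"malformed header line: {line!r}")
        # Header names are case-insensitive, so normalise them to lowercase.
        headers[name.strip().lower()] = value.strip()
    return headers


def require_host(headers):
    """Return the Host value, or fail if the mandatory header is missing."""
    if "host" not in headers:
        raise ValueError("missing Host header")
    return headers["host"]


headers = parse_headers(["Host: example.com", "Accept: text/html"])
print(require_host(headers))  # example.com
```

Keeping the names in lowercase is just one convenient choice; it means "Host", "host" and "HOST" all end up under the same key.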
GET /blog/ HTTP/1.1
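For illustration, here is a sketch in Python of how a minimal request could be assembled byte by byte: the request line, the Host header and the empty line that closes the header section. The host example.com and the helper build_request are assumptions for this example, not the real host of this blog.

```python
CRLF = "\r\n"


def build_request(method, uri, host):
    """Assemble a minimal HTTP/1.1 request with no body."""
    lines = [f"{method} {uri} HTTP/1.1", f"Host: {host}"]
    # Every line ends with CRLF; one extra CRLF (an empty line) ends the headers.
    return (CRLF.join(lines) + CRLF + CRLF).encode("ascii")


raw = build_request("GET", "/blog/", "example.com")
print(raw)
# b'GET /blog/ HTTP/1.1\r\nHost: example.com\r\n\r\n'
```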
Just like the HTTP request, the HTTP response is defined by a first line (the status line) and some mandatory headers. Let's check it out:
Status-Line: <HTTP Protocol Version><space><Status-Code><space><Reason-Phrase><CRLF>
The protocol version was already explained for the HTTP request, so no changes here. Now, Status-Code?! What the hell is this? Everyone who has opened a browser a few times has seen a 404 Not Found or 500 Internal Server Error page. Never wondered what those numbers were? Well, now you know: 404, 500, 200, 400, 401, etc. are all status codes, and each one has a well-defined meaning, grouped by first digit:
- 1XX – Information.
- 2XX – Success.
- 3XX – Redirection.
- 4XX – Client Error.
- 5XX – Server Error.
So next time you see a 404, you know that you screwed up when writing the URL; when you see a 500, you know that the guys who made the server screwed up (just kidding); and when you see the page that you wanted to see, it's probably because the server sent a 200 to your browser, and so on.
Finally, the status line ends with the Reason-Phrase. What is this? What I just said xD. Reading the RFC you can find this line that explains it: “The Reason-Phrase is intended to give a short textual description of the Status-Code.” There is nothing more to say about it.
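The five classes can be told apart just by looking at the first digit of the code. Here is a small Python sketch of that idea; the names STATUS_CLASSES and status_class are made up for this example.

```python
# Class names as listed above, keyed by the first digit of the status code.
STATUS_CLASSES = {
    1: "Information",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def status_class(code: int) -> str:
    """Return the class of a three-digit HTTP status code."""
    if not 100 <= code <= 599:
        raise ValueError(f"not a valid status code: {code}")
    # Integer division by 100 keeps only the first digit.
    return STATUS_CLASSES[code // 100]


for code in (200, 404, 500):
    print(code, status_class(code))
```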
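To see the three pieces of the status line together, here is a short Python sketch. It leans on the standard library's http.HTTPStatus enum, which already knows the usual reason phrase for each code; the function name status_line is an assumption for this example.

```python
from http import HTTPStatus


def status_line(code: int, version: str = "HTTP/1.1") -> str:
    """Build '<version> <status code> <reason phrase>' followed by CRLF."""
    status = HTTPStatus(code)  # raises ValueError for codes it does not know
    return f"{version} {status.value} {status.phrase}\r\n"


print(repr(status_line(404)))  # 'HTTP/1.1 404 Not Found\r\n'
```

The reason phrase is only there for humans; a client decides what to do based on the numeric code.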
After the status line come the headers, just as in the request. This time there are two mandatory headers:
Content-Length – defines the length in bytes of the response data (the body).
So after these two headers comes an empty line, then the data. There is no extra line feed to signal the end of the message: the receiver knows the data has ended once it has read Content-Length bytes.
So after reading all this, you probably have some questions. Feel free to write a comment or check the RFC.
HTTP is much more than the stuff I wrote here. What I explained was just the basics needed for the next part of the “tutorial”. With each new post in this series I will probably add more information about HTTP: more features, more headers, etc.
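Here is a Python sketch of putting a whole response together: status line, headers, an empty line and the data. The helper name build_response and the Content-Type header are assumptions for illustration only. Note that Content-Length is computed from the encoded bytes, not from the number of characters.

```python
def build_response(status_code, reason, body_text):
    """Assemble a simple HTTP/1.1 response carrying a text body."""
    body = body_text.encode("utf-8")
    head = [
        f"HTTP/1.1 {status_code} {reason}",
        "Content-Type: text/plain; charset=utf-8",  # assumed header, for illustration
        f"Content-Length: {len(body)}",  # length of the body in bytes
    ]
    # CRLF after every header line, then one empty line, then the body.
    return ("\r\n".join(head) + "\r\n\r\n").encode("ascii") + body


response = build_response(200, "OK", "olá")
print(response)
```

In this example the body "olá" has three characters but takes four bytes in UTF-8, so the header reads Content-Length: 4.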
References : HTTP RFC