What happens when you type holbertonschool.com in your browser and press Enter

Caroline
7 min read · Sep 5, 2021

In this article we will look at what happens between the moment we enter a URL (Uniform Resource Locator) in the browser’s address bar and the moment the website’s content loads. We will walk through the main steps that take place behind the scenes.

First, let’s define a few terms:

Server:

A server is a piece of computer software or hardware that provides functionality to other computers, programs or devices, called “clients”.

The client-server model

Web server:

A web server is typically software running on a physical server that manages how web users and clients access specific website files. The web server answers client requests by returning static content, such as HTML files. Two common web servers are Nginx and Apache.

When users enter a URL in a web browser, they are requesting specific files hosted on a server. In short, the web browser sends a request to the hosting server asking for the files, and the web server sends the files back to the browser. The browser then renders the files in a readable form.

Network Protocol:

A set of rules and conventions for communication between network devices. It includes ways devices can identify and make connections with each other. There are also formatting rules that specify how the data is packaged into sent and received messages.

IP address:

An IP address is a unique identifier assigned to a machine connected to a network. It is the computer’s address, and it allows computers on a network to send data back and forth to each other.

There are two types of IP addresses: IPv4 and IPv6. The ‘v’ stands for “version”. An IPv4 address is made up of four numbers ranging from 0 to 255, separated by dots, for example 192.0.2.10. IPv4 offers about 4.3 billion unique addresses, but because of the growing number of devices connected to the internet, IPv6 was created to avoid a shortage of available IP addresses. An IPv6 address consists of eight groups of up to four hexadecimal digits, separated by colons, for example 2001:db8:0:0:0:0:0:1.
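To make the difference between the two versions concrete, here is a minimal sketch using Python's standard `ipaddress` module. The helper name `describe` is made up for this example, and the two sample addresses come from ranges reserved for documentation, not from any real server.

```python
import ipaddress


def describe(address: str) -> str:
    """Return a short description of an IPv4 or IPv6 address string."""
    # ip_address() raises ValueError if the text is not a valid address.
    ip = ipaddress.ip_address(address)
    # max_prefixlen is the address length in bits: 32 for IPv4, 128 for IPv6.
    return f"{ip} is IPv{ip.version} ({ip.max_prefixlen} bits)"


# Sample addresses (documentation ranges, chosen for illustration only).
print(describe("192.0.2.10"))
print(describe("2001:db8::1"))

# 32 bits give 2**32 possible IPv4 addresses, roughly 4.3 billion.
print(2 ** 32)
```

The `::` in the second address is the usual shorthand for a run of all-zero groups.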

When a URL is entered, the browser first breaks it down into multiple parts in order to retrieve the domain name of the website.

The purpose of domain names is to spare humans from having to remember IP addresses. To get the IP address of the server hosting the files, the browser first checks whether the address is in its cache. If it is not there, the browser asks the operating system, and if the operating system does not have it either, a query is sent to a remote DNS server.

Now that we have defined some important terms, we will take a look at what happens between the moment we enter a URL in the address bar and the moment the website loads its content.
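As a rough illustration of that first step, the sketch below splits a URL into its parts with Python's standard `urllib.parse`. It is not how a browser is actually implemented. The helper name `split_url` is hypothetical, and the example assumes the browser has already added the `https://` scheme in front of what was typed.

```python
from urllib.parse import urlsplit


def split_url(url: str) -> dict:
    """Break a URL into the pieces needed to locate the server."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,    # protocol to use, e.g. "https"
        "host": parts.hostname,    # the domain name to resolve
        "port": parts.port,        # None when the URL names no port
        "path": parts.path or "/"  # resource requested on the server
    }


print(split_url("https://holbertonschool.com/"))
```

The `host` value is the domain name that will be handed to DNS.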

DNS request

DNS stands for Domain Name System; it translates domain names into IP addresses. A DNS query first goes to the resolver, which is typically run by the internet service provider (ISP). Most ISPs dedicate a number of servers to resolving domain names. If the resolver already knows the IP address, the resolution process ends and the IP address is sent back to the browser. If the resolver doesn’t know it, the request goes to a root server. The root server does not know the IP address, but it knows where to find the TLD (Top-Level Domain) server. An example of a top-level domain is “.com”. The TLD server does not know the IP address either, but it directs the resolver to the ANS, the Authoritative Name Servers of the domain name. These are the servers that know the IP address of the domain name and send it back to the resolver, which then sends it back to the web browser.

If an error occurs, it is displayed on the web page. Once an IP address is acquired, it is stored locally in the cache.

Once the browser has the IP address, it sends an HTTP request to the server. HTTP stands for Hypertext Transfer Protocol. It is the protocol that defines how messages are formatted and transmitted, and what actions web servers and browsers take in response to various commands, known as HTTP methods.

For example, when the browser sends an HTTP request, the GET method is used by default. That means the browser tries to get data from a specified resource on the server. Other methods include POST, PUT, HEAD, and DELETE.
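The DNS lookup chain described in this section (cache, root server, TLD server, authoritative name server) can be sketched as a toy simulation. Nothing here talks to real DNS: the dictionaries, the server labels and the address are all invented for illustration, and a real resolver exchanges network messages instead of reading local tables.

```python
# Toy data: every name and address below is made up for this example.
ROOT = {"com": "tld-com"}                          # root knows the TLD servers
TLD = {"tld-com": {"example.com": "ns-example"}}    # TLD knows the authoritative servers
AUTHORITATIVE = {"ns-example": {"example.com": "192.0.2.10"}}

cache = {}  # answers the resolver has already seen


def resolve(domain: str) -> str:
    """Walk root -> TLD -> authoritative server, caching the answer."""
    if domain in cache:
        return cache[domain]                  # known answer: stop here
    tld = domain.rsplit(".", 1)[-1]           # "example.com" -> "com"
    tld_server = ROOT[tld]                    # ask the root where the TLD server is
    name_server = TLD[tld_server][domain]     # ask the TLD for the authoritative server
    address = AUTHORITATIVE[name_server][domain]  # ask for the IP address itself
    cache[domain] = address                   # remember it for next time
    return address


print(resolve("example.com"))
```

An unknown domain raises `KeyError` in this sketch, which stands in for the error the browser would display.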
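To show what a GET request looks like on the wire, here is a minimal sketch that builds the text of an HTTP/1.1 request by hand. The function name `build_get_request` is hypothetical, and real browsers send many more headers than the two shown.

```python
def build_get_request(host: str, path: str = "/") -> bytes:
    """Build the raw bytes of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",  # method, resource, protocol version
        f"Host: {host}",         # required header in HTTP/1.1
        "Connection: close",     # ask the server to close after responding
        "",                      # blank line marks the end of the headers
        "",
    ]
    # HTTP separates lines with carriage return + line feed.
    return "\r\n".join(lines).encode("ascii")


print(build_get_request("holbertonschool.com").decode("ascii"))
```

Replacing `GET` with another method name such as `HEAD` changes what the server is asked to do with the same resource.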

HTTPS/SSL

HTTPS stands for HyperText Transfer Protocol Secure; it is a secure version of HTTP. HTTPS requests and responses are encrypted, which assures users that their data can’t be read or altered by third parties while in transit. Another component in securing websites is the SSL certificate. SSL stands for Secure Sockets Layer; it is the predecessor of Transport Layer Security, or TLS, which is the protocol used today, although the certificates are still commonly called SSL certificates. For browsers to trust it, the certificate must be issued by a trusted Certificate Authority, such as Let’s Encrypt. When a website has this certificate, you will see a little lock icon next to the website’s name in the address bar, and in some cases it will turn green.
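The certificate check a client performs can be sketched with Python's standard `ssl` module. The two function names are made up for this example, and `open_tls_connection` is only defined, not called, because it needs network access.

```python
import socket
import ssl


def make_client_context() -> ssl.SSLContext:
    """Create a TLS context that verifies the server's certificate."""
    # The default context trusts the system's Certificate Authorities,
    # requires a valid certificate and checks that it matches the hostname.
    return ssl.create_default_context()


def open_tls_connection(host: str, port: int = 443) -> ssl.SSLSocket:
    """Open a TCP connection and wrap it in TLS (requires network access)."""
    raw = socket.create_connection((host, port), timeout=5)
    # The handshake fails here if the certificate is not trusted.
    return make_client_context().wrap_socket(raw, server_hostname=host)


context = make_client_context()
print(context.verify_mode == ssl.CERT_REQUIRED)  # certificate is mandatory
print(context.check_hostname)                    # name must match the certificate
```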

TCP/IP

TCP/IP stands for Transmission Control Protocol/Internet Protocol. It is a set of standardized rules that allow computers to communicate on a network such as the internet. TCP and IP are two separate computer network protocols, but they are often used together. IP handles addressing and routing, that is, where the data is sent, while TCP handles the way data is delivered, received, ordered and error-checked over the network.

When a client sends a request to a server, the data is broken into packets. A packet is a small parcel of information that gets transmitted over the network. The web server responds by sending back other packets. All packets are sent using TCP and tracked, so that lost or corrupted data can be detected and sent again.
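The idea of splitting data into numbered, checksummed packets can be sketched in a few lines. This is a simplification, not real TCP: TCP numbers bytes rather than packets and uses its own 16-bit checksum, while this example uses CRC32 from `zlib`. The function names and the packet size of 8 bytes are arbitrary choices.

```python
import zlib


def to_packets(data: bytes, size: int = 8) -> list:
    """Split data into (sequence number, checksum, payload) tuples."""
    packets = []
    for seq, start in enumerate(range(0, len(data), size)):
        payload = data[start:start + size]
        packets.append((seq, zlib.crc32(payload), payload))
    return packets


def reassemble(packets) -> bytes:
    """Put packets back in order and verify each checksum."""
    ordered = sorted(packets, key=lambda packet: packet[0])
    for seq, checksum, payload in ordered:
        if zlib.crc32(payload) != checksum:
            raise ValueError(f"packet {seq} is corrupted")
    return b"".join(payload for _, _, payload in ordered)


message = b"GET / HTTP/1.1\r\nHost: example.com\r\n\r\n"
packets = to_packets(message)
packets.reverse()  # simulate packets arriving out of order
print(reassemble(packets) == message)
```

The sequence numbers let the receiver restore the original order, and the checksum lets it notice a damaged packet.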

Firewall

To protect themselves from hackers and attacks, servers are often equipped with a firewall. A firewall is software that sets rules about what traffic can enter and leave the network. When the browser asks for the website at a specific IP address, that request has to be processed by a firewall, which decides whether it is safe or a threat to the server. The client side can also be protected: a firewall or the browser’s own safety checks may decide whether the IP address returned by the DNS request is potentially dangerous.
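A firewall rule set can be pictured as a function that accepts or rejects each incoming request. The sketch below is only an illustration: the blocked network, the allowed ports and the name `is_allowed` are assumptions made for this example, and real firewalls work at the operating system or network level with far richer rules.

```python
import ipaddress

# Hypothetical rules, chosen for illustration only.
BLOCKED_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]
ALLOWED_PORTS = {80, 443}  # HTTP and HTTPS


def is_allowed(source_ip: str, dest_port: int) -> bool:
    """Return True if a request from source_ip to dest_port may pass."""
    source = ipaddress.ip_address(source_ip)
    # Reject anything coming from a blocked network.
    if any(source in network for network in BLOCKED_NETWORKS):
        return False
    # Otherwise accept only traffic aimed at an open port.
    return dest_port in ALLOWED_PORTS


print(is_allowed("198.51.100.7", 443))
print(is_allowed("203.0.113.5", 443))
print(is_allowed("198.51.100.7", 22))
```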

Load-balancer

If a website were to live on a single server, it would have a Single Point of Failure (SPOF), meaning it would take only one attack or failure to bring the website down. As a result, most websites live on multiple servers, organized in clusters behind load-balancers. A load-balancer is software that distributes network requests between several servers using a load-balancing algorithm. A commonly used load-balancer is HAProxy, and a common algorithm is round-robin, which distributes requests evenly by cycling through the servers in order. Once the requests have been distributed, they are processed by one or more web servers.
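Round-robin is simple enough to sketch directly. The class below is a toy model of the algorithm, not how HAProxy is implemented, and the class name and server names are placeholders.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hand out servers one after another, starting over at the end."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        # cycle() repeats the list forever in the same order.
        self._servers = cycle(servers)

    def pick(self) -> str:
        """Return the server that should handle the next request."""
        return next(self._servers)


balancer = RoundRobinBalancer(["web-01", "web-02", "web-03"])
for request_id in range(5):
    print(request_id, balancer.pick())
```

With three servers, the fourth request goes back to the first server, so each one receives the same share of the traffic.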

Application server

Application servers are used by dynamic websites that users interact with, for example to save information or to log in. Application servers are software programs responsible for running applications, communicating with databases and managing user information, among many other things. They work alongside web servers and can serve a dynamic application together with the static content from the web server.
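To contrast dynamic content with static files, here is a minimal sketch of an application written against WSGI, the standard Python interface between web servers and applications. The greeting logic and the `name` query parameter are invented for this example. An application server would call `application` once per request.

```python
from html import escape
from urllib.parse import parse_qs


def application(environ, start_response):
    """Build a page whose content depends on the incoming request."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    # Fall back to a default when no name is given.
    name = query.get("name", ["visitor"])[0]
    # escape() keeps user input from being interpreted as HTML.
    body = f"<h1>Hello, {escape(name)}!</h1>".encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

Unlike a static HTML file, the response body here changes with every request, which is the part a plain web server delegates to the application server.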

Database server

A database is an organized collection of data. There are different types of databases; the most commonly used are relational databases. A relational database stores data in the form of tables. The tables may be linked to each other via primary and foreign keys.

A database server is the software that allows interaction with the database. It makes many tasks possible, such as storing, analyzing and manipulating data.

In short, first your browser looks for the IP address using the domain name “holbertonschool.com”. Then it sends an HTTPS request to the hosting servers. The request first has to go through the firewall; once it does, a secure connection is established between the two machines. The load-balancer then forwards the request to one of the servers using a load-balancing algorithm. The chosen web server receives the request, looks for the requested files and sends them back in an HTTPS response to the browser. Finally, the browser receives the packets of data and renders them in a readable format.
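Tables linked by primary and foreign keys can be sketched with Python's built-in `sqlite3` module. Note the assumption: SQLite is an embedded database library rather than a separate database server, so it only stands in for one here. The table names, columns and sample rows are invented for illustration.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with two linked tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # enforce foreign keys in SQLite
    conn.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )
    # posts.user_id is a foreign key pointing at the users primary key.
    conn.execute(
        "CREATE TABLE posts ("
        " id INTEGER PRIMARY KEY,"
        " user_id INTEGER NOT NULL REFERENCES users(id),"
        " title TEXT NOT NULL)"
    )
    conn.execute("INSERT INTO users (id, name) VALUES (1, 'alice')")
    conn.execute("INSERT INTO posts (user_id, title) VALUES (1, 'Hello')")
    return conn


def titles_by_author(conn: sqlite3.Connection) -> list:
    """Join the two tables through the key that links them."""
    rows = conn.execute(
        "SELECT users.name, posts.title"
        " FROM posts JOIN users ON users.id = posts.user_id"
    )
    return rows.fetchall()


print(titles_by_author(build_demo_db()))
```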
