What is the Internet?

The History

The Internet is a global packet-switched computer network built on multiple protocols and technologies within the TCP/IP model. Its roots go back to ARPANET, a US DoD computer network designed in the 1960s for sharing information for military and academic use. ARPANET was decommissioned in 1990, just as the new World Wide Web (WWW) was emerging. In 1989 an English computer scientist, Tim Berners-Lee, invented the WWW, and in 1990 he built the first web browser, which used a protocol known as Hypertext Transfer Protocol (HTTP). This paved the way for the information age. However, let’s backtrack to 1981, when RFC 791 was released by the University of Southern California, specifying a protocol named Internet Protocol version 4 (IPv4). The combination of these concepts and protocols built the Internet we know and use today.

TCP/IP Model

Earlier I stated that the Internet was built on multiple protocols within the TCP/IP model, so I’ll now explain what these protocols are and what the TCP/IP model is. A protocol is a standardized set of rules, implemented in software or hardware, that defines how devices within a computer network perform specific functions or exchange data.

There are 5 layers defined within the TCP/IP model. The first (layer 1) is the physical layer. This is the physical medium over which data is transmitted (e.g. copper, fiber-optic strands, the air for radio waves). The second (layer 2) is the data-link layer. This is the layer of protocols that interpret the signals from layer 1 as bits and control how those bits are grouped into frames and passed between the transmitting and receiving devices (e.g. protocols IEEE 802.3 wired Ethernet, IEEE 802.11 wireless LAN, better known as Wi-Fi). The third (layer 3) is the network layer. These are the protocols that allow for routing of packets of data (data from layers 4 and 5) by adding a header containing a source and destination IP address (e.g. protocols IPv4, IPv6). The fourth (layer 4) is the transport layer. These are the protocols that control the transmission of, and provide services to, the data from layer 5 (e.g. protocols TCP, UDP). The fifth and last (layer 5) is the application layer. This is a large suite of protocols that allow computer applications to communicate on a computer network (e.g. protocols HTTP, DNS, DHCP).

Starting from layer 5, the process of sending data from a computer application goes down the stack of layers within the TCP/IP model. The data from each layer above gets encapsulated into what is known as the Protocol Data Unit (PDU) of the layer below. Except for layers 5 and 1, each layer has a name for its PDUs. Layer 4 PDUs are called segments when using Transmission Control Protocol (TCP) and datagrams when using User Datagram Protocol (UDP). Layer 3 PDUs are called packets. Layer 2 PDUs are called frames. Starting from layer 1, the process of receiving data from another device goes up the stack of layers within the TCP/IP model. The data from the lower layer gets decapsulated, meaning the header information is stripped to expose the data of the layer above.
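
To make encapsulation and decapsulation a little more concrete, here is a rough Python sketch of my own. It is not how a real network stack is written; the PDU class, the function names, and all of the addresses and ports are made up for illustration. It only shows the idea of each layer wrapping the data from the layer above in its own header.

```python
from dataclasses import dataclass


@dataclass
class PDU:
    """One protocol data unit: a header plus the payload from the layer above."""
    layer: str
    header: dict
    payload: object  # raw application data, or the PDU of the layer above


def encapsulate(app_data: bytes) -> PDU:
    """Wrap application data on the way down the stack (layer 4, then 3, then 2)."""
    # Illustrative header values only; a real stack fills these in itself.
    segment = PDU("segment", {"src_port": 49152, "dst_port": 80}, app_data)
    packet = PDU("packet", {"src_ip": "192.168.1.100", "dst_ip": "203.0.113.10"}, segment)
    frame = PDU("frame", {"src_mac": "aa:aa:aa:aa:aa:aa", "dst_mac": "bb:bb:bb:bb:bb:bb"}, packet)
    return frame


def decapsulate(frame: PDU) -> bytes:
    """Strip one header per layer on the way up until the application data is exposed."""
    pdu = frame
    while isinstance(pdu, PDU):
        pdu = pdu.payload
    return pdu


frame = encapsulate(b"GET / HTTP/1.1")
print(frame.layer, "->", frame.payload.layer, "->", frame.payload.payload.layer)
print(decapsulate(frame))
```

The frame holds the packet, the packet holds the segment, and the segment holds the application data, which is exactly what the receiving device unwraps in reverse order.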

The Process of Connecting to a Website

Now that I’ve explained the history and the model for understanding modern computer networking, I can explain what happens when you turn on your computer or smart device, open your browser, and type in a web address.

LAN and DHCP

When you turn on your computer or smart device, a multitude of actions happen. Your computer will first attempt to connect to your Local Area Network (LAN). If hard-wired with an Ethernet cable, the wired Network Interface Card (NIC) on your computer will transmit electrical signals over the copper pairs within the cable using the layer 2 protocol IEEE 802.3. If connecting wirelessly, the wireless NIC on your computer will first transmit radio waves over the air, exchanging management frames to associate with the Wireless Access Point (your home router) using the protocol IEEE 802.11.

After the initial connection, your computer will send a broadcast to all devices on the network with a message called a Dynamic Host Configuration Protocol (DHCP) Discover. Your home router (which is actually a multi-purpose networking device that acts as a Firewall/Router, Wireless Access Point, Ethernet Switch, DHCP Server, and sometimes DNS Server) will then respond with a DHCP Offer message, containing an IP address (e.g. 192.168.1.100) and other network information. Your computer will then send a message called a DHCP Request back to your router, containing the IP information received in the router’s DHCP Offer. Your router will respond with a message called a DHCP Acknowledgment (ACK), and your computer will then assign the IP address to the NIC, along with other information like the default gateway, subnet mask, and Domain Name System (DNS) server(s). At this point your computer is fully connected to the LAN and is ready to access the Internet.
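
The four DHCP messages above can be sketched as a toy exchange in Python. This is only a simulation of the logic under my own assumptions: ToyDhcpServer, its method names, and the address pool are hypothetical, and nothing is sent over a real network.

```python
class ToyDhcpServer:
    """Hands out addresses from a small pool; not a real DHCP implementation."""

    def __init__(self, pool, gateway, dns):
        self.free = list(pool)
        self.offers = {}  # client MAC -> offered IP
        self.leases = {}  # client MAC -> acknowledged IP
        self.options = {"subnet_mask": "255.255.255.0", "gateway": gateway, "dns": dns}

    def on_discover(self, mac):
        """Discover -> Offer: reserve the next free address for this client."""
        ip = self.offers.get(mac) or self.free.pop(0)
        self.offers[mac] = ip
        return {"type": "OFFER", "ip": ip, **self.options}

    def on_request(self, mac, requested_ip):
        """Request -> Acknowledge if the client asks for what it was offered."""
        if self.offers.get(mac) != requested_ip:
            return {"type": "NAK"}
        self.leases[mac] = requested_ip
        return {"type": "ACK", "ip": requested_ip, **self.options}


server = ToyDhcpServer(
    ["192.168.1.100", "192.168.1.101"], gateway="192.168.1.1", dns=["192.168.1.1"]
)
mac = "aa:aa:aa:aa:aa:aa"              # made-up client MAC address

offer = server.on_discover(mac)         # client broadcasts Discover, router Offers
ack = server.on_request(mac, offer["ip"])  # client Requests, router Acknowledges

print(offer["type"], offer["ip"])
print(ack["type"], ack["ip"], ack["gateway"], ack["dns"])
```

Once the acknowledgment arrives, the client has everything it needs: its IP address, subnet mask, default gateway, and DNS server list.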

DNS

When you open your browser and type in an address, like facebook.com, your device will first need to send what is known as a DNS query. The reason is that, on layer 3 of the TCP/IP model, headers with source and destination IP addresses are used to transmit the packets of data. Your device must resolve the domain name in the URL, e.g. facebook.com, to an IP address, basically translating it from a human-readable format to a computer-readable format. Your computer knows where to send this DNS query because of the DNS server list that your router provided through DHCP. Depending on your router, the DNS server will either be the IP address of your router (if the router is acting as a DNS forwarder) or a public DNS server, which your router typically learns from your Internet Service Provider (ISP). In the latter case, your computer will send the packet containing the DNS query to the default gateway (the IP address of your router). This is because the DNS server exists outside your LAN, in the Wide Area Network (WAN) known as the Internet.
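
For the curious, here is roughly what the DNS query itself looks like on the wire. This is a minimal sketch that builds the bytes of a query for an A (IPv4 address) record following the message layout in RFC 1035; the function name and the query ID are my own choices, and the code only builds the message without sending it anywhere.

```python
import struct


def build_dns_query(name: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query for an A record (RFC 1035 wire format)."""
    # Header: ID, flags (0x0100 = recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Question name: each label is prefixed by its length, ending with a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii") for label in name.split(".")
    ) + b"\x00"
    # QTYPE 1 = A (IPv4 address), QCLASS 1 = IN (Internet).
    question = qname + struct.pack("!HH", 1, 1)
    return header + question


query = build_dns_query("facebook.com")
print(len(query), "bytes:", query.hex())
```

In practice your operating system builds and sends this message for you, normally inside a UDP datagram addressed to port 53 of the DNS server.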

Ethernet Frames and ARP

However, before your computer can transmit the packet containing your DNS query, it must first know the MAC address of your router. This is because, on layer 2 of the TCP/IP model, the Ethernet protocol uses headers with source and destination MAC addresses to transmit frames. Since your computer knows the IP address of the router, it will transmit a broadcast known as an ARP (Address Resolution Protocol) request to resolve, or translate, the IP address to the MAC address. This broadcast is sent to all devices on the LAN, basically with a message stating, “Who is this IP address?” The router should be the only device to respond with its MAC address. Now that the MAC address of the router is known, your computer will generate a frame containing the packet of the DNS query, adding the router’s MAC address in the header as the destination MAC, and send it to the router.
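
Here is a small Python sketch of that lookup and of the Ethernet header that results. The ToyLan class, the resolve helper, and the MAC addresses are hypothetical stand-ins I made up; a real ARP request is a broadcast frame on the wire, which this sketch only imitates with a dictionary.

```python
import struct


def mac_to_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' into its 6 raw bytes."""
    return bytes.fromhex(mac.replace(":", ""))


def ethernet_frame(dst_mac: str, src_mac: str, ethertype: int, payload: bytes) -> bytes:
    """Ethernet II layout: destination MAC, source MAC, 2-byte EtherType, payload."""
    header = mac_to_bytes(dst_mac) + mac_to_bytes(src_mac) + struct.pack("!H", ethertype)
    return header + payload


class ToyLan:
    """Stands in for the LAN: maps each IP to the MAC that would answer an ARP request."""

    def __init__(self, hosts):
        self.hosts = hosts

    def arp_request(self, target_ip):
        # Imitates the broadcast "who has target_ip?"; only the owner replies.
        return self.hosts.get(target_ip)


def resolve(arp_cache: dict, lan: ToyLan, ip: str) -> str:
    """Return the MAC for an IP, asking the LAN only if it is not cached yet."""
    if ip not in arp_cache:
        mac = lan.arp_request(ip)
        if mac is None:
            raise LookupError(f"no ARP reply for {ip}")
        arp_cache[ip] = mac
    return arp_cache[ip]


lan = ToyLan({"192.168.1.1": "bb:bb:bb:bb:bb:bb"})  # the router
arp_cache = {}
router_mac = resolve(arp_cache, lan, "192.168.1.1")

# EtherType 0x0800 marks the payload as an IPv4 packet.
frame = ethernet_frame(router_mac, "aa:aa:aa:aa:aa:aa", 0x0800, b"<ip packet bytes>")
print(router_mac, frame[:14].hex())
```

The cache is the reason your computer does not have to broadcast an ARP request for every single frame it sends to the router.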

Routing and NAT

The router will decapsulate the frame, read the header of the packet for the destination IP address, look at its routing table (home routers will typically have just two routes: the directly connected route to the LAN and a default route to the ISP), then forward the packet in a new frame to your ISP. The data inside the packet also carries a source and destination port, which identify the application layer protocol being used (e.g. DNS is port 53 and HTTP is port 80).

Before routing the packet, your router will also perform Network Address Translation (NAT), specifically a form of dynamic NAT called Port Address Translation (PAT), to translate the source IP address within the header of the packet from your computer’s private IP address to the public IP address of the router’s interface connected to the modem. This is because only public IP addresses are routable within the Internet, and the remote host/server needs a proper IP address to respond to. NAT also exists largely because there are a limited number of public IPv4 addresses in the world, and it allows all devices within the LAN to share a single public IP address. Your router saves the NAT session, containing your computer’s private source IP address and source port.

Once the packet reaches the ISP’s network, it will continue to be routed through a series of carrier-class routers (they look nothing like what you have at home), likely across multiple ISPs, all performing route table lookups and making forwarding decisions. Eventually the packet reaches its destination at the DNS server. The DNS server will process the query, look within its database, and then respond with the IP address of the domain name you entered in your browser, with the response traversing the Internet back to your router. Your router will re-translate (using NAT) the destination IP address of the packet to your computer’s private IP address and forward it to your computer, based on the IP address and port information saved in the NAT session. Keep in mind your router will perform NAT on all packets transmitted to the Internet.
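
The two decisions a home router makes here, picking a route and translating addresses, can be sketched in Python. This is a simplified model under my own assumptions: lookup_route, ToyPat, the starting port 40000, and the addresses (taken from ranges reserved for documentation) are all hypothetical, and real routers track far more state per session.

```python
import ipaddress


def lookup_route(routes, dst_ip: str) -> str:
    """Return the next hop of the most specific (longest-prefix) matching route."""
    dst = ipaddress.ip_address(dst_ip)
    matches = [(net, hop) for net, hop in routes if dst in net]
    net, hop = max(matches, key=lambda m: m[0].prefixlen)
    return hop


class ToyPat:
    """Port Address Translation: many private (IP, port) pairs share one public IP."""

    def __init__(self, public_ip: str, first_port: int = 40000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.out = {}   # (private_ip, private_port) -> public_port
        self.back = {}  # public_port -> (private_ip, private_port)

    def translate_outbound(self, src_ip: str, src_port: int):
        """Rewrite the source of an outgoing packet and remember the session."""
        key = (src_ip, src_port)
        if key not in self.out:
            self.out[key] = self.next_port
            self.back[self.next_port] = key
            self.next_port += 1
        return self.public_ip, self.out[key]

    def translate_inbound(self, dst_port: int):
        """Map a reply arriving on a public port back to the private host."""
        return self.back[dst_port]


# A typical home routing table: the LAN itself plus a default route to the ISP.
routes = [
    (ipaddress.ip_network("192.168.1.0/24"), "directly connected"),
    (ipaddress.ip_network("0.0.0.0/0"), "ISP gateway"),
]
print(lookup_route(routes, "198.51.100.53"))  # a DNS server out on the Internet
print(lookup_route(routes, "192.168.1.50"))   # another device on the LAN

pat = ToyPat("203.0.113.5")
public_src = pat.translate_outbound("192.168.1.100", 51000)
print(public_src, "->", pat.translate_inbound(public_src[1]))
```

The saved session is what lets the router hand the DNS response back to the right computer even though every device on the LAN appears to the Internet as the same public IP address.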

TCP Connection and HTTP

Once your computer has the IP address of the remote web server, it will first need to establish a TCP connection with the server. TCP is a layer 4 protocol that is connection-oriented, meaning it negotiates and establishes a virtual circuit that provides acknowledgements, flow control, and error recovery. This is done by first performing what is known as a three-way handshake. Your computer will first send a TCP SYN (synchronize) message to the web server, the web server will respond with a TCP SYN/ACK (synchronize/acknowledge), and your computer will then send the final TCP ACK back to the web server, establishing a TCP session. The purpose of the TCP session is to ensure reliable delivery of information in case packets are dropped while traversing the Internet. This is done by giving a sequence number to the data in every segment transmitted and sending acknowledgements for the data received. If a segment is lost and the sequence number of the next received segment is higher than the one that was expected, the receiver keeps acknowledging only the data it received in order, which prompts the sender to re-transmit the lost segment.

After the TCP session is established, your computer will then send an HTTP GET request, containing the path from the URL entered in the browser along with the host name. Depending on what was entered, the web server will respond with the requested page and transmit all of the website data (e.g. HTML, CSS, JavaScript, images, videos). Your web browser (application) downloads (receives) the data, then parses and executes the code and displays the media (images, video, audio) on your screen. This is the rudimentary process of how information is transmitted from your home across the Internet and how you can get to your favorite websites.
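
As a final illustration, here is a Python sketch of the sequence and acknowledgement numbers in the three-way handshake, a very simplified acknowledgement rule, and the text of an HTTP GET request. The function names and the initial sequence numbers are my own inventions, and the acknowledgement logic is a deliberate simplification (it ignores out-of-order buffering, windows, and timers), so treat it as a sketch of the idea rather than how TCP is really implemented.

```python
def three_way_handshake(client_isn: int, server_isn: int):
    """Return the three handshake segments as (sender, flags, seq, ack) tuples."""
    syn = ("client", "SYN", client_isn, None)
    # A SYN consumes one sequence number, so each side acknowledges ISN + 1.
    syn_ack = ("server", "SYN/ACK", server_isn, client_isn + 1)
    ack = ("client", "ACK", client_isn + 1, server_isn + 1)
    return [syn, syn_ack, ack]


def next_ack(expected: int, seq: int, length: int) -> int:
    """Advance the acknowledgement only when the expected segment arrives."""
    if seq == expected:
        return expected + length
    return expected  # a gap: keep acknowledging the last in-order data


def http_get(host: str, path: str = "/") -> bytes:
    """Build the text of a minimal HTTP/1.1 GET request."""
    request = f"GET {path} HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
    return request.encode("ascii")


handshake = three_way_handshake(client_isn=1000, server_isn=5000)
for segment in handshake:
    print(segment)

expected = 5001                            # first data byte the client expects
expected = next_ack(expected, 5001, 100)   # in-order segment arrives
after_gap = next_ack(expected, 5201, 100)  # segment 5101 was lost on the way
print(expected, after_gap)

print(http_get("facebook.com").decode("ascii"))
```

When a segment goes missing, the acknowledgement number stays put, and that repeated acknowledgement is the signal for the sender to transmit the missing data again.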