Network Fundamentals for the Cloud

September 25, 2023

Computer networking is all around us. Since the days of dial-up internet, the number of internet-connected devices has grown rapidly on a global scale. Despite this, across the globe there are broadly three types of computer network: LAN, MAN and WAN (Local, Metropolitan and Wide Area Networks).

The above diagram does a good job of representing the geographical elements of these networks, but it is worth elaborating a bit further. LANs are usually used in office locations and connected by physical cables and local wifi. MANs are typically interconnected LANs with a private network link and a connection to the internet. MAN use cases are mostly reserved for governmental or state institutions, along with huge corporates - I'd imagine Alphabet Inc in Silicon Valley might use one, for example. WANs, meanwhile, connect devices across a large geographic area, spanning multiple cities or countries, and are used to connect LANs. You could argue that the biggest WAN in the world is the internet itself.

Some of the fundamental building blocks of these networks are;

Routers - connect different networks and filter incoming and outbound data.
Switches - create an internal network, useful due to the limits of IPv4[!]
Network Interface Card (NIC)/Media Access Control (MAC) addresses - every NIC is given an individual MAC address.

IT Professionals use the OSI model as a language to discuss networking. Interestingly, the OSI model is conceptual, but really this isn't a new feature in computing. Just like our file structures on operating systems like Linux and Windows operate in a folder structure, it's about the rules of how the architecture was designed to function.

The best way to learn the OSI model is with a mnemonic. My favourite one so far is definitely;

All People Seem To Need Data Processing

Already being a fan of ByteByteGo content I'm not surprised they had the very best visual representation of the OSI model available. Here you can clearly see the demarcations and all the relevant troubleshooting paths we may wish to explore. When working with other IT professionals collaboratively, the layer of discussion will ultimately determine what actions are pursued.

Networks by extension have topologies. We can think of these as logical, or physical topologies.

The diagram above shows logical topologies. Each has its own pros and cons, such as the star topology having a single point of failure. But what we need to know is that the most common globally is the hybrid model. The AWS Virtual Private Cloud (VPC) replicates the traditional IT system, but in the cloud.

Touching briefly on network management, the models we are most likely to encounter are client-server and peer-to-peer (P2P). Essentially these define a set of rules about how data is distributed on our networks. Now we will move on to a larger topic in networking: protocols. Protocols dictate the rules under which our networks operate and exist as global standards. Two of the most important are TCP and UDP.

• TCP is known as a 'connection-oriented' protocol, as a formal connection between devices is made, data is transferred and only then is the connection closed. It's also known as a "reliable" protocol, because all data sent is acknowledged and anything lost is resent, giving it error recovery and making it well suited to something like large files. It also has what is known as "flow control", where the receiver can determine how much data is being sent.

• With TCP, there is something called the TCP handshake. This handshake is comprised of three messages:

• Synchronize (SYN)

• Synchronize/Acknowledge (SYN/ACK)

• Acknowledge (ACK)

During this handshake, the protocol establishes parameters that support the data transfer between two hosts. For example:

• Host A sends a SYN packet to Host B.
• Host B sends a SYN back to Host A with an ACK attached, acknowledging that it received Host A's SYN.

• Host A sends the final ACK to Host B, letting it know the SYN/ACK message was received.
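The three steps above can be sketched in a few lines of Python. This is a toy model rather than real networking code: the Segment class, the function name and the starting sequence numbers (100 and 300) are all made up for illustration. The one detail added beyond the steps above is that each acknowledgement number is the other side's sequence number plus one.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Segment:
    flags: str                  # "SYN", "SYN/ACK" or "ACK"
    seq: int                    # sender's sequence number
    ack: Optional[int] = None   # next sequence number expected from the peer


def three_way_handshake(client_isn: int, server_isn: int) -> List[Segment]:
    """Return the three segments exchanged between Host A and Host B, in order."""
    # Step 1: Host A opens with a SYN carrying its initial sequence number.
    syn = Segment("SYN", seq=client_isn)
    # Step 2: Host B replies with its own SYN and acknowledges Host A's SYN.
    syn_ack = Segment("SYN/ACK", seq=server_isn, ack=syn.seq + 1)
    # Step 3: Host A acknowledges Host B's SYN, completing the handshake.
    ack = Segment("ACK", seq=syn.seq + 1, ack=syn_ack.seq + 1)
    return [syn, syn_ack, ack]


for segment in three_way_handshake(client_isn=100, server_isn=300):
    print(segment)
```

In real life the operating system does all of this for us the moment an application opens a TCP connection.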

• UDP, on the other hand, is considered an "unreliable" protocol because there is no formal 'call' to connect to the other device and no acknowledgement of delivery. By extension, there is also no flow control, as this protocol acts as a "post and hope" type of delivery. It is often used for gaming and streaming, where its lower latency matters more than guaranteed delivery.
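To see the "post and hope" idea in practice, here is a minimal sketch using Python's standard socket module. It sends a single UDP datagram to itself over the loopback address, so nothing leaves the machine. The function name is my own, and the two second timeout is an arbitrary choice.

```python
import socket


def udp_send_and_receive(message: bytes) -> bytes:
    """Send one datagram over loopback and return what the receiver got."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        receiver.bind(("127.0.0.1", 0))   # port 0 lets the OS pick a free port
        receiver.settimeout(2.0)          # give up if the datagram never arrives
        # No connect() and no handshake: the datagram is simply sent.
        sender.sendto(message, receiver.getsockname())
        data, _sender_address = receiver.recvfrom(1024)
        return data
    finally:
        sender.close()
        receiver.close()


print(udp_send_and_receive(b"post and hope"))
```

Notice there is no acknowledgement step anywhere. If the datagram were lost, the sender would never find out.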

Now for perhaps the most well known of the protocols, the Internet Protocol (IP). The shortest description is that it simply provides us with the rules for relaying and routing data over the internet. We're going to focus on IPv4 (32-bit) for the moment.

IP addresses are divided into a network portion and a host portion, the latter identifying the specific computer on that network. The importance of IP classes will become apparent a little later.

Not all IP addresses are available. For example, you usually have a default broadcast address, such as 10.255.255.255, which is usually the last address available in the range. You also have the default router address, such as 10.0.0.1. The very first address, 10.0.0.0, is your network ID. These are assigned by convention.
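Python's standard ipaddress module can work these reserved addresses out for us. The sketch below assumes the whole 10.0.0.0/8 range is one network, which matches the example addresses above. The variable names are my own.

```python
import ipaddress

# Assumption: the entire 10.x.x.x range is treated as a single network.
network = ipaddress.ip_network("10.0.0.0/8")

network_id = network.network_address    # first address in the range
broadcast = network.broadcast_address   # last address in the range
first_host = network_id + 1             # conventionally given to the router

print(network_id, first_host, broadcast)
# 10.0.0.0 10.0.0.1 10.255.255.255
```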

Now in my infinite naivety, I always presumed ports referred to literal, physical ports. Anyway, the purpose of port numbers is to allow a device in the network to further communicate with devices or applications utilising the network. Ports will for the majority of instances follow convention.

Port numbers allow one server to receive many messages for several applications, and maybe several clients, at once. An IP address combined with a port number is also known as an endpoint. AWS's own Domain Name System (DNS) service is named Route 53, because DNS uses port 53! It's vital to become familiar with the major ports, as they are high up the list whenever the need to troubleshoot a networking issue comes up.

Now, for many reasons, it's really useful in building our understanding of IP to have the ability to convert the addresses into binary. Below is a very clear example about how we would convert each octet into binary.
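If you would rather let the computer do the conversion, here is a small sketch in Python. The function name ip_to_binary is my own. Each octet is converted separately and padded to 8 bits.

```python
def ip_to_binary(address: str) -> str:
    """Convert a dotted-decimal IPv4 address to dotted binary octets."""
    octets = address.split(".")
    # format(n, "08b") gives the binary digits, zero-padded to 8 bits.
    return ".".join(format(int(octet), "08b") for octet in octets)


print(ip_to_binary("192.168.0.1"))
# 11000000.10101000.00000000.00000001
```

For example, 192 is 128 + 64, so only the first two bits of the first octet are set.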

IPs can be designated as either static or dynamic, with static usually reserved for servers or printers that devices need to connect to often.

So what could be some of the benefits of using Amazon VPC?

Well, being able to spin up a practical data centre in a matter of moments is pretty incredible. It also means that on-premises infrastructure can be easily migrated onto the cloud and span multiple availability zones.

But to launch a VPC in AWS, we need to know what IPv4 address range we wish to employ. That is established by selecting a CIDR block for the network, but we will have more on this later. Every VPC created in AWS sits within a region, each of its subnets sits within a single availability zone, and our instances are launched into a subnet. The VPC then uses an internet gateway to talk to the internet, allowing us to isolate our EC2 instances. We have a route table assigned to the subnet, so if any device on the network wants to talk to address X, the traffic can be accurately directed via the internet gateway we've set up.

With a security group, we can define rules to allow or filter any kind of incoming or outgoing traffic we want, attached to each EC2 instance. Further to this, you can have a network access control list (NACL), which serves as a firewall applied at the subnet layer in our VPC. Operating a VPC means we of course enjoy the broader benefits of being on the cloud, such as redundancy and high availability, unlike traditional data centres. AWS recommends using the private IP ranges for a VPC, and they are as follows;

Class A IP addresses. Configurations range from 10.0.0.0 to 10.255.255.255.
Class B IP addresses. Configurations range from 172.16.0.0 to 172.31.255.255.
Class C IP addresses. Configurations range from 192.168.0.0 to 192.168.255.255.
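As a quick sanity check, the three ranges above can be written as CIDR blocks and tested with Python's standard ipaddress module. This is only a sketch, and the names PRIVATE_RANGES and in_private_range are my own.

```python
import ipaddress

# The three private ranges listed above, written as CIDR blocks.
PRIVATE_RANGES = [
    ipaddress.ip_network("10.0.0.0/8"),       # 10.0.0.0 to 10.255.255.255
    ipaddress.ip_network("172.16.0.0/12"),    # 172.16.0.0 to 172.31.255.255
    ipaddress.ip_network("192.168.0.0/16"),   # 192.168.0.0 to 192.168.255.255
]


def in_private_range(address: str) -> bool:
    """Return True if the address falls inside one of the private ranges."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in PRIVATE_RANGES)


print(in_private_range("172.31.255.255"))   # True
print(in_private_range("172.32.0.1"))       # False
```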

So as mentioned earlier, when we create a VPC, we have to specify the IPv4 address range.

This VPC address range could be as large as /16 (65,536 addresses) or as small as /28 (16 addresses) (these two limits are specific to AWS, however). To make further sense of this, we need to know about the split between network and host;

Now the above diagram builds a little further on what will lead us to understanding how to determine CIDR blocks. For the moment, we will build our understanding of the other network technologies.

Now remember that a VPC mirrors traditional IT infrastructure. We'll briefly run down all the parts shown in the diagram above and articulate them a little further. An Amazon VPC is a logically isolated environment for your resources within the cloud, and it is where we determine our region.

• Subnets: Just like a switch, subnets allocate a range of IP addresses within our network, or in our case the VPC. We will do a deep dive into this and CIDR blocks shortly. Subnets are either public or private. A public subnet is associated with a route table that has a route to the internet gateway, so it is accessible from the internet as well as from within the VPC. A private subnet has no traffic routed directly to an internet gateway. However, a private subnet can have its traffic routed to a NAT gateway, as shown in the diagram above. Without a NAT gateway, a private subnet can only reach traffic inside the VPC, but with one it is able to communicate out to the internet.

• Route table: The route table contains rules which determine how the VPC routes its network traffic. This traffic could be sent to the internet gateway, a VPC endpoint or a Network Address Translation (NAT) gateway. A route table contains routes for your subnet and directs traffic using the rules defined within it. A subnet can be associated with only one route table; however, a route table can be associated with multiple subnets.

• Internet gateway: Just like we need a modem to connect to the internet, so does our VPC. In AWS we need to create this resource and attach it to our VPC. Once attached, we still need to add it to our route table in order for resources to reach the internet. It has two jobs: perform network address translation (NAT) and be the target for routing the VPC's traffic to the internet. An IGW's route in a route table typically has the destination 0.0.0.0/0.

• Security groups and Network Access Control Lists (NACLs) work as the firewall within your VPC. Security groups work at the instance level and are stateful, which means the reply to any traffic they allow is automatically let through; by default they block all inbound traffic. NACLs work at the subnet level and are stateless, which means replies must be explicitly allowed by a rule of their own; the default NACL allows all traffic.
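Going back to the route table for a moment, the way it picks a target can be sketched in Python. When more than one route matches a destination, the most specific one (the longest prefix) wins. The route table contents and the names ROUTES and pick_target below are hypothetical, chosen to look like a simple public subnet.

```python
import ipaddress

# Hypothetical route table for a public subnet: destination CIDR -> target.
ROUTES = {
    "10.0.0.0/16": "local",             # traffic that stays inside the VPC
    "0.0.0.0/0": "internet-gateway",    # everything else
}


def pick_target(destination: str, routes: dict) -> str:
    """Return the target of the most specific route matching the destination."""
    ip = ipaddress.ip_address(destination)
    matches = [
        ipaddress.ip_network(cidr)
        for cidr in routes
        if ip in ipaddress.ip_network(cidr)
    ]
    if not matches:
        raise LookupError(f"no route to {destination}")
    # The longest prefix is the most specific match.
    best = max(matches, key=lambda network: network.prefixlen)
    return routes[str(best)]


print(pick_target("10.0.1.5", ROUTES))        # local
print(pick_target("93.184.216.34", ROUTES))   # internet-gateway
```

A private subnet's table would simply lack the 0.0.0.0/0 route to the internet gateway, or point it at a NAT gateway instead.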

What is the difference between stateful and stateless firewalls?

In the diagram above, we see that with a stateless firewall the connection between client and server isn't determined by keeping a record of the request the client (Jack) made to the SGC Web Server, but instead is determined by rule #2. That is to say, the firewall checks back against its own designated rule base rather than making a decision based on the "state" of the connection. If a hacker takes control of the SGC Web Server, Jack's firewall would not filter any of the data it sends at all.

In contrast, using a stateful firewall, at the point the client (Jack) makes a request to the SGC Web Server the firewall will create a session table noting this data flow, consequently expecting a response back from the web server. This response from the SGC server will pass through despite not being in the access control list, because of the logged session table. This also means that if a hacker gained access to the SGC Web Server and sent malicious data to Jack, it would now be filtered, because it neither matches the rules nor has an active data flow logged in the session table.
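The difference is easier to see in a toy model. The sketch below is heavily simplified: rules are just (source, destination) pairs, sessions never expire, and the class names are my own. Real firewalls also track ports, protocols and connection teardown.

```python
class StatelessFirewall:
    """Decides purely from a fixed rule base."""

    def __init__(self, allowed: set):
        self.allowed = allowed   # set of (source, destination) pairs

    def permits(self, source: str, destination: str) -> bool:
        return (source, destination) in self.allowed


class StatefulFirewall(StatelessFirewall):
    """Also remembers flows it has allowed, so replies can pass."""

    def __init__(self, allowed: set):
        super().__init__(allowed)
        self.sessions = set()    # the "session table"

    def permits(self, source: str, destination: str) -> bool:
        if (source, destination) in self.allowed:
            self.sessions.add((source, destination))   # log the new flow
            return True
        # Not in the rules: only allow it as a reply to a logged flow.
        return (destination, source) in self.sessions


# Stateless: replies need their own rule, so the server can always reach Jack.
stateless = StatelessFirewall({("jack", "sgc"), ("sgc", "jack")})
print(stateless.permits("sgc", "jack"))   # True, even with no request made

# Stateful: only Jack's outbound request is in the rules.
stateful = StatefulFirewall({("jack", "sgc")})
print(stateful.permits("sgc", "jack"))    # False, nothing was requested
print(stateful.permits("jack", "sgc"))    # True, and the flow is logged
print(stateful.permits("sgc", "jack"))    # True, it is a reply
```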

Now, time for a deep dive into IP subnetting.

Subnetting is the process of logically partitioning a single physical network into multiple smaller subnetworks, or subnets. Organisations use subnetting to conceal network complexity and reduce network traffic by adding subnets without adding a new network number. When a single network number must be used across many segments of a local area network (LAN), subnetting becomes essential.

A subnet mask uses its own 32-bit number to mask the IP address and further enable the subnetting process. Subnet masks decide which hosts are on the local network and which are outside. Hosts can talk to other hosts when they're on the same network, but they must communicate with a router to talk to hosts on external networks. This now brings us onto Classless Inter-Domain Routing (CIDR) notation;

To make a CIDR range, we need the network id and defined subnet mask.

192.168.0.0 - Network ID
255.255.255.0 - Defined subnet mask

The purpose of developing CIDR notation was to express the above two lines in a single line.
Expressing this information more succinctly allows routers to organise IP addresses into multiple subnets more efficiently. As we discussed above, IPv4 is called 32-bit addressing because its 4 octets of 8 bits each give us 32 binary digits.

i.e. A default subnet mask of 255.255.255.0 in binary would be:

11111111.11111111.11111111.00000000

A CIDR notation for the above would be 192.168.0.0/24, as the /24 defines the subnet mask.
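The number after the slash is simply the count of 1 bits in the subnet mask. Here is a minimal sketch of that conversion in Python, with a function name of my own choosing. It assumes the mask is a normal one, with all its 1 bits on the left.

```python
import ipaddress


def to_cidr(network_id: str, subnet_mask: str) -> str:
    """Combine a network ID and subnet mask into CIDR notation."""
    # Turn the dotted mask into one 32-bit integer, then count its 1 bits.
    mask_bits = int(ipaddress.ip_address(subnet_mask))
    prefix_length = bin(mask_bits).count("1")
    return f"{network_id}/{prefix_length}"


print(to_cidr("192.168.0.0", "255.255.255.0"))
# 192.168.0.0/24
```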

So let's repeat this exercise with a Class A address;

10.0.0.0 - Network ID
255.0.0.0 - Defined subnet mask

So here we can easily work out that the CIDR notation would be 10.0.0.0/8
That's because only 8 bits are dedicated to the network portion of the address!

(Please note that because AWS limits VPCs to between /16 and /28, we can't actually use a /8 in AWS!!)

We can even borrow more of the host bits of an IP address to create even more subnets.
e.g. 192.100.10.0/28
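To see what borrowing host bits does, the sketch below splits a network into /28 subnets using Python's ipaddress module. The /24 parent network is my own assumption for illustration, since only the /28 is given above.

```python
import ipaddress

# Assumption: the /28 above was carved out of a /24 parent network.
parent = ipaddress.ip_network("192.100.10.0/24")

# Borrowing 4 host bits (24 -> 28) gives 2**4 = 16 smaller subnets.
subnets = list(parent.subnets(new_prefix=28))

print(len(subnets))                 # 16
print(subnets[0], subnets[1])       # 192.100.10.0/28 192.100.10.16/28
print(subnets[0].num_addresses)     # 16
```

So each bit we borrow doubles the number of subnets and halves the addresses in each one.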

So when we launch a VPC on AWS, we need to have some awareness of the number of networks and hosts our needs will require. The chart above gives us examples of the number of host IPs available in different CIDR ranges.
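The numbers in a chart like that come from one small formula: a prefix of length n leaves 32 - n host bits, giving 2^(32 - n) addresses. A quick sketch, with the prefix lengths picked by me as examples:

```python
def address_count(prefix_length: int) -> int:
    """Total IPv4 addresses in a block with the given prefix length."""
    return 2 ** (32 - prefix_length)


for prefix_length in (16, 20, 24, 28):
    print(f"/{prefix_length}: {address_count(prefix_length)} addresses")
# /16: 65536 addresses
# /20: 4096 addresses
# /24: 256 addresses
# /28: 16 addresses
```

These are totals. As noted earlier, some addresses in each range, such as the network ID and broadcast address, are not available to hosts.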

The Domain Name System (DNS) is how we're able to simply type www.google.com into our web browsers and land at the right IP address! So it's super essential to the everyday goings-on of the web. Even my own little iframe website is on a DNS! 😎

Jumping back to some protocols;

ICMP is the Internet Control Message Protocol, and is essentially what sits behind the ping command we can run on Windows! Then there is the Dynamic Host Configuration Protocol (DHCP), which is what automatically issues our devices with an IP address.

Then, for sites that require security measures, there are SSL/TLS certificates. SSL/TLS creates a secure connection between the server and the user's computer whilst data is exchanged over the internet. This exchange relies on encryption, authentication, and integrity. Encryption stops third parties seeing the text. These certificates are issued by approved certification authorities and are then held locally on a server. When we as the client send a request, we also download the certificate to authenticate that we're connecting to a legitimate identity/server.

Touching a little bit back to Linux now, as we're working with networks, it makes sense that we have some tools and commands to monitor bandwidth usage, network performance and what our network configurations are. Below are just some common tools we can use;

• ping, essentially the ICMP protocol we mentioned just above. We send a packet to the designated IP and, if it is received, it echoes back a message with the round-trip time of the packet.

• dig, this command sends a query to a DNS server and shows the actual IP address behind a domain name.

• traceroute, this command is a little like ping, however it also outputs the journey and number of hops that the network path has taken, useful for troubleshooting connectivity issues. All networks will experience some level of packet loss, but too much can be a problem. All packets are given a "time to live" and sometimes too many hops can cause the loss. This is exactly what this command is for.

• telnet, is like ping for ports. This tool will check if a service runs on the remote device and whether it responds to requests. You can use telnet to test individual ports and see whether they are open or not.

• tcpdump, this is a packet sniffing and analysis tool that helps us troubleshoot connectivity issues in Linux. It captures the TCP/IP packets in our network traffic for us to analyze.
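The "ping for ports" idea behind telnet can also be sketched in Python with the standard socket module. This is not a replacement for any of the tools above, just an illustration of what a port check does under the hood. The function name and the default two second timeout are my own choices.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection performs the TCP handshake for us.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

For example, calling port_is_open with a web server's address and port 443 would tell us whether anything is listening for HTTPS there.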

Wireless technology is an ever-growing part of today’s communications. This is particularly related to the growing internet of things. As a result, additional protocols have been designed to manage and operate wireless networks. When we talk about the internet of things (IoT), we mean devices or things that are connected to the internet so they can share and collect data. These can include things like a smartwatch or, for more industry related uses, sensors used in agricultural production.

• Wired Equivalent Privacy (WEP): This adds security and protection to wireless networks by encrypting data that would otherwise be available for others to capture.

• Wi-Fi Protected Access (WPA): WPA was introduced to replace WEP. But whilst both WEP and WPA are similar, WPA has better security and user authorization.

• MQTT (Message Queuing Telemetry Transport): this is an extremely lightweight messaging protocol for constrained devices, usually running over TCP/IP and often secured with TLS.

• Bluetooth Low Energy (BLE): This protocol emphasises low energy consumption. This technology (BLE) is used mostly for mobile apps and is appropriate for IoT projects. Initially developed for periodic data transfer over short ranges it is now a solution used in many areas such as healthcare, fitness, and security.

• 5G: Moving on from 4G, this will provide download speeds up to 10 Gbps, but is still being rolled out globally.

• AWS IoT Core service allows devices using various technologies and protocols to connect to AWS.

• AWS provides a service called WorkSpaces, which gives users access to powerful remote computers through a virtual desktop.
