
AWS networking


Understanding network fundamentals is important when creating an AWS infrastructure for your application. In this post, I will go through the important topics related to AWS networking fundamentals.

Regions and Availability Zones

 Amazon EC2 is hosted in multiple locations worldwide. These locations are composed of Regions and Availability Zones.

AWS global infrastructure
According to the image, there are several AWS Regions spread across the world. In the Singapore Region, there are 3 Availability Zones. If we create our instances in a single zone and that zone suffers a failure, our whole application stops working. To overcome this, we can create our instances in several isolated zones and achieve fault tolerance and stability.

Regions are independent. Availability Zones are isolated but connected through low-latency links.

Elastic IP addresses can also be used to recover from failure by rapidly remapping the address to an instance in another Availability Zone.

Your AWS account determines the Regions that are available to you. You can see the list of regions available for your account from the console. When creating an EC2 resource, you can specify the region for the resource. By doing so you can put the instance closer to your customers or fulfill certain legal requirements.

Instances can also be migrated from one zone to another based on your requirements.
Check out this cool map of the AWS global infrastructure: www.infrastructure.aws

VPC

In AWS, we can deploy many things such as EC2 instances, RDS databases, load balancers and many more. When you create an AWS account, you are assigned a default VPC. Everything we create goes into this VPC by default, and you can create more VPCs as well. A VPC spans all the Availability Zones in the Region.

AWS is like the ocean and a VPC is like a sea that we can create from our AWS account. We can put our AWS services in the sea. Technically speaking, a VPC is a logically isolated section of the AWS Cloud. It is a virtual network where you can keep your AWS resources.

In a VPC you can define IP ranges, create public subnets, create private subnets, create NAT gateways and create internet gateways. Each of these components has a purpose and meaning. We need these to create an AWS infrastructure for our system. We will look at these components briefly later in this post.

VPC

A list of VPCs can be seen from the VPC dashboard. If you haven't created new VPCs, only the default one will be listed.
VPC dashboard
I needed to study these concepts when I wanted to set up a network load balancer for a REST API, and I thought sharing my findings would help others.

Subnets

First, let's look at the below image that shows a list of subnets in the Default VPC.
Subnets
In this image, there are 3 subnets (Highlighted in green).
  1. DF-subnet-1b
  2. DF-subnet-1a
  3. DF-subnet-1c
Before going into VPC subnetting, let's briefly look at the fundamentals of IPv4.

A public IP address is an IP address that can be reached over the internet. A private IP address is assigned to a computer inside a private network so that it is not directly exposed to the internet. For example, your home router has a public IP, and each computer on the home network gets a private IP address.
Go to https://whatismyipaddress.com/ to see your public IP.
Run ipconfig in the Windows command prompt to see your private IP.
ipconfig
There are 2^32 (about 4.3 billion) IPv4 addresses in total, and some portions of the IPv4 space are reserved for specific uses. The following IP blocks are reserved for private IP addresses.

Class  Starting IP Address  Ending IP Address  # of Hosts
A      10.0.0.0             10.255.255.255     16,777,216
B      172.16.0.0           172.31.255.255     1,048,576
C      192.168.0.0          192.168.255.255    65,536

You will notice that the IPv4 address of my computer is between 192.168.0.0 and 192.168.255.255. Classless Inter-Domain Routing (CIDR) notation is used to define a range of IPs.
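
As a rough sketch of the check described above, Python's standard ipaddress module can tell which of the three private blocks an address belongs to. The sample addresses (192.168.1.20 and 8.8.8.8) and the helper name private_block_of are only illustrative assumptions, not values from my setup.

```python
import ipaddress

# The three private IPv4 blocks listed above, written in CIDR notation.
PRIVATE_BLOCKS = [
    ipaddress.ip_network("10.0.0.0/8"),      # Class A private range
    ipaddress.ip_network("172.16.0.0/12"),   # Class B private range
    ipaddress.ip_network("192.168.0.0/16"),  # Class C private range
]


def private_block_of(address):
    """Return the private block that contains `address`, or None if it is public."""
    ip = ipaddress.ip_address(address)
    for block in PRIVATE_BLOCKS:
        if ip in block:
            return block
    return None


# Hypothetical sample addresses: one home-network style IP and one public IP.
print(private_block_of("192.168.1.20"))
print(private_block_of("8.8.8.8"))

# The "# of Hosts" column is simply the size of each block.
for block in PRIVATE_BLOCKS:
    print(block, block.num_addresses)
```

The "/8", "/12" and "/16" suffixes say how many leading bits of the address are fixed, which is why the block sizes come out as 2^24, 2^20 and 2^16.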

171.31.0.0/16 has 65,534 usable IPv4 addresses, from 171.31.0.1 to 171.31.255.254. 171.31.255.255 is used as the broadcast address. You can use an online calculator (http://jodies.de) to calculate all these IPs.
IP calculator
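
If you prefer not to use an online calculator, the same numbers can be reproduced with a few lines of Python. This is a minimal sketch using only the standard ipaddress module; the variable names are my own.

```python
import ipaddress

net = ipaddress.ip_network("171.31.0.0/16")

# In classic IPv4 subnetting, the first address identifies the network
# and the last one is the broadcast address, so neither is assigned to a host.
first_host = net.network_address + 1
last_host = net.broadcast_address - 1
usable_hosts = net.num_addresses - 2

print("Network:  ", net.network_address)
print("Netmask:  ", net.netmask)
print("First:    ", first_host)
print("Last:     ", last_host)
print("Broadcast:", net.broadcast_address)
print("Usable:   ", usable_hosts)
```
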
These tutorials will help you to understand more about IPs and CIDR notation.
  1. Network IDs and Subnet Masks: https://www.youtube.com/watch?v=XQ3T14SIlV4
  2. https://www.digitalocean.com/community/tutorials/understanding-ip-addresses-subnets-and-cidr-notation-for-networking#understanding-ip-addresses
  3. https://www.iplocation.net/public-vs-private-ip-address

That is the end of the IPv4 fundamentals; now back to VPC subnetting.

When you create a VPC, you must specify a range of IPv4 addresses for the VPC in the form of a Classless Inter-Domain Routing (CIDR) block. My default VPC's CIDR is 172.31.0.0/16. That means this VPC spans 65,536 IP addresses.

CIDR
The process of dividing a network into smaller network sections is called subnetting. When you subnet a VPC, its addresses are divided between the subnets using the same CIDR notation. The CIDR blocks of the subnets cannot overlap: each subnet must have clear boundaries, and this is validated when you create the subnets. AWS reserves five addresses in every subnet (the first four and the last one), so a /20 subnet with 4,096 addresses has 4,091 usable IPs.
  1. DF-subnet-1a's CIDR is 172.31.0.0/20, which means 4,091 usable IPs.
  2. DF-subnet-1b's CIDR is 172.31.16.0/20, which means 4,091 usable IPs.
  3. DF-subnet-1c's CIDR is 172.31.32.0/20, which means 4,091 usable IPs.
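
The subnet arithmetic above can be checked with a short script. This is only a sketch: it uses the CIDR blocks of my three subnets, and it applies the AWS rule that five addresses in every subnet are reserved (the classic "minus two" count would give 4,094 instead). The helper name overlapping_pairs is hypothetical.

```python
import ipaddress
from itertools import combinations

vpc = ipaddress.ip_network("172.31.0.0/16")
subnets = {
    "DF-subnet-1a": ipaddress.ip_network("172.31.0.0/20"),
    "DF-subnet-1b": ipaddress.ip_network("172.31.16.0/20"),
    "DF-subnet-1c": ipaddress.ip_network("172.31.32.0/20"),
}

# AWS reserves the first four addresses and the last address of every subnet.
AWS_RESERVED_PER_SUBNET = 5


def overlapping_pairs(networks):
    """Return the names of every pair of subnets whose CIDR blocks overlap."""
    return [
        (name_a, name_b)
        for (name_a, net_a), (name_b, net_b) in combinations(networks.items(), 2)
        if net_a.overlaps(net_b)
    ]


for name, net in subnets.items():
    inside_vpc = net.subnet_of(vpc)
    usable = net.num_addresses - AWS_RESERVED_PER_SUBNET
    print(name, net, "inside VPC:", inside_vpc, "usable in AWS:", usable)

print("Overlapping pairs:", overlapping_pairs(subnets))

# A /16 VPC can be cut into sixteen non-overlapping /20 subnets.
print("Possible /20 subnets:", len(list(vpc.subnets(new_prefix=20))))
```

This is roughly the validation AWS performs when you create a subnet: the block must sit inside one of the VPC's CIDRs and must not overlap an existing subnet.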

But you will notice that these numbers differ from the number of available IPs in the "Subnets" figure. That is because, in my default VPC, some of the IPs are already in use by AWS resources (EC2 instances) and a few other AWS services in those subnets.

See the private IPs, VPC ID and subnet ID of the EC2 instance in the image below.
EC2 Instance
When creating a subnet, AWS recommends that you create CIDR blocks from the private IPv4 address ranges. Lastly, a secondary CIDR can be added to the VPC in case you need more IPs.

Vpc with 2 CIDR blocks

This VPC has 2 CIDRs and 3 subnets. The primary CIDR has 2 subnets (Subnet 1, Subnet 2) and the secondary CIDR has 1 subnet (Subnet 3). You can disassociate any CIDR block from the VPC except the primary CIDR.
The official AWS documentation has a section that begins "To add a CIDR block to your VPC, the following rules apply:", followed by a long list of rules. It might seem troublesome to go through them all, but it is always good to spend some time and read the docs at least once.

Public and Private Subnets (NAT)

Public subnets are subnets that are connected to the internet (their route table has a route to an internet gateway), while private subnets are not directly connected to the internet.
Public and private subnets are important when designing secure applications. Instead of exposing our databases and internal APIs and protecting them with a username/password or some other mechanism, it is always better not to expose them at all.
A common example is a multi-tier website, with the web servers in a public subnet and the database servers in a private subnet. To implement this, we need a few other AWS components such as subnets, NAT gateways, internet gateways, and route tables.

Private and public subnets (this diagram is not 100% accurate, but it shows the basic idea; we can draw a more accurate one later)

Subnet Routing and Route tables

We learned that public subnets are connected to the internet while private subnets are not, which makes private subnets more secure. Each subnet must be associated with a route table.
A route table contains a set of rules (routes) that determine where network traffic from your subnet is directed.
There are two types of route tables: main and custom.

When you create a VPC, it automatically has a main route table. It controls the routing for all subnets that are not associated with any other (custom) route table. You can modify the main route table, and you can replace it as well, but when you replace it you have to review the subnet associations too. You can also create custom route tables. A new custom route table initially contains only the local routes for the VPC's CIDR blocks, and other routes can be added as needed.

When you create a new subnet, it is automatically associated with the main route table. Later, you can create a custom route table for the new subnet to control its network traffic. The AWS documentation describes it like this:

A subnet can be explicitly associated with custom route table, or implicitly or explicitly associated with the main route table.
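
The association rule quoted above boils down to a simple lookup: use the explicit association if there is one, otherwise fall back to the main route table. Here is a minimal sketch of that rule. All the IDs (rtb-main, rtb-custom-1, subnet-aaa, subnet-bbb) are hypothetical placeholders, and this models the idea only, not the AWS API.

```python
def effective_route_table(subnet_id, explicit_associations, main_route_table_id):
    """Return the ID of the route table that governs traffic leaving a subnet.

    explicit_associations maps subnet IDs to route table IDs. A subnet that
    is missing from the mapping is implicitly associated with the main table.
    """
    return explicit_associations.get(subnet_id, main_route_table_id)


# Hypothetical IDs, for illustration only.
MAIN_ROUTE_TABLE = "rtb-main"
explicit = {"subnet-aaa": "rtb-custom-1"}

print(effective_route_table("subnet-aaa", explicit, MAIN_ROUTE_TABLE))
print(effective_route_table("subnet-bbb", explicit, MAIN_ROUTE_TABLE))
```

This is also why replacing the main route table matters: every subnet without an explicit association silently switches to the new one.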

When talking about securing services behind subnets, route tables are the key.

Check this route table.
Route table

Let's study the route table in the image above to understand how it works. Each route has a destination and a target. According to the table,

10.0.0.0/26 and 10.2.0.0/24 have local routes, which enable those IPs to communicate within the VPC. These local routes are added to the route table by default and cannot be modified or deleted. The target of 0.0.0.0/0 is igw-0f9151e81637e5ea1.
"0.0.0.0/0" represents all IPv4 addresses and its target is the internet gateway (igw-0f9151e81637e5ea1), so the subnets associated with this route table can access the internet. Since these subnets have internet access, we must not host our databases in them. (We will talk about internet gateways later.)
To allow internet access over IPv6, you must create a separate route with a destination CIDR of ::/0, which represents all IPv6 addresses.

AWS uses the most specific route in your route table that matches the traffic to determine how to route it. Because of this, even though 0.0.0.0/0 matches every IPv4 address, in this route table it does not apply to addresses in 10.0.0.0/26 and 10.2.0.0/24.
Those two blocks have their own, more specific routes to control their traffic.
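
The "most specific route wins" rule is a longest-prefix match, and it is easy to simulate. The sketch below uses the three routes from the route table above. The function select_target and the sample destination addresses are my own, and this is a simplified model of the rule rather than how AWS implements it.

```python
import ipaddress

# Destination and target pairs taken from the route table above.
ROUTES = [
    ("10.0.0.0/26", "local"),
    ("10.2.0.0/24", "local"),
    ("0.0.0.0/0", "igw-0f9151e81637e5ea1"),
]


def select_target(destination, routes):
    """Return the target of the most specific route that matches `destination`."""
    ip = ipaddress.ip_address(destination)
    matches = []
    for cidr, target in routes:
        network = ipaddress.ip_network(cidr)
        if ip in network:
            matches.append((network.prefixlen, target))
    if not matches:
        return None  # no matching route: the traffic is not routed
    # The longest prefix is the most specific match.
    return max(matches, key=lambda match: match[0])[1]


print(select_target("10.0.0.5", ROUTES))       # inside 10.0.0.0/26
print(select_target("10.2.0.200", ROUTES))     # inside 10.2.0.0/24
print(select_target("10.0.0.64", ROUTES))      # just outside 10.0.0.0/26
print(select_target("93.184.216.34", ROUTES))  # a public address
```

Note that 10.0.0.64 falls outside 10.0.0.0/26 (which ends at 10.0.0.63), so only the 0.0.0.0/0 route matches it.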

Let's check another route table.
Route table without internet gateway

If you check this table, you will notice that the target of 0.0.0.0/0 is a network address translation (NAT) gateway (nat-0048652b1192ff5b9). This means the subnets associated with this route table have controlled access to the internet, which makes them a good place for our secure services. By controlled access, I mean that instances can connect out to the internet, but the internet cannot initiate a connection to the subnet.
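
Comparing the two route tables, the practical difference is the target of the 0.0.0.0/0 route. The sketch below labels a route table based on that target's ID prefix (igw- or nat-). The function name, the labels, and the local route 10.0.0.0/26 in the second table are assumptions made for illustration.

```python
def classify_route_table(routes):
    """Label a route table by the target of its IPv4 default route.

    routes maps destination CIDR blocks to target IDs.
    """
    default_target = routes.get("0.0.0.0/0")
    if default_target is None:
        return "isolated (no internet route)"
    if default_target.startswith("igw-"):
        return "public (internet gateway)"
    if default_target.startswith("nat-"):
        return "private (outbound only, via NAT gateway)"
    return "other"


public_table = {
    "10.0.0.0/26": "local",
    "10.2.0.0/24": "local",
    "0.0.0.0/0": "igw-0f9151e81637e5ea1",
}
# The local route here is assumed; only the NAT gateway ID comes from the figure.
private_table = {
    "10.0.0.0/26": "local",
    "0.0.0.0/0": "nat-0048652b1192ff5b9",
}

print(classify_route_table(public_table))
print(classify_route_table(private_table))
print(classify_route_table({"10.0.0.0/26": "local"}))
```
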

Everything we discussed here concerns IPv4 addresses; route tables must include separate routes for IPv6 traffic.

Internet Gateways

An internet gateway is a horizontally scaled, redundant, and highly available VPC component that allows communication between instances in your VPC and the internet.
It serves two purposes:
  • Provide a target in your VPC route tables for internet-routable traffic. This can be achieved in a few steps.
    • Attach an internet gateway to your VPC.
    • Ensure that your subnet's route table points to the internet gateway.
    • Ensure that instances in your subnet have a globally unique IP address (for example, EC2 instances can have Elastic IP addresses).
    • Ensure that your network access control and security group rules allow the relevant traffic to flow to and from your instance.
  • Perform network address translation (NAT) for instances that have public IPv4 addresses. To communicate over the internet using IPv4, your instance must have a public IPv4 address or an Elastic IP address associated with a private IPv4 address on the instance. The instance itself only knows about its private IP address; the internet gateway provides the one-to-one translation between the private address and the public/Elastic IP address when traffic crosses to or from the internet.
    Internet gateway NAT
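
To make the one-to-one translation concrete, here is a toy model of what the internet gateway does with addresses. It is only a sketch: the Packet and OneToOneNat names are hypothetical, the addresses are examples (10.0.0.5 as the private IP, 203.0.113.10 as the Elastic IP), and real gateways handle much more than source and destination fields.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str


class OneToOneNat:
    """Toy model of one-to-one NAT between private and public IPv4 addresses."""

    def __init__(self, private_to_public):
        self.private_to_public = dict(private_to_public)
        self.public_to_private = {
            public: private for private, public in self.private_to_public.items()
        }

    def outbound(self, packet):
        """Rewrite the private source address to its public address."""
        public = self.private_to_public.get(packet.src)
        if public is None:
            return None  # no public address mapped: the instance cannot reach the internet
        return Packet(src=public, dst=packet.dst)

    def inbound(self, packet):
        """Rewrite the public destination address back to the private address."""
        private = self.public_to_private.get(packet.dst)
        if private is None:
            return None  # nothing is mapped to this public address
        return Packet(src=packet.src, dst=private)


# Example addresses only: 10.0.0.5 is the instance, 203.0.113.10 its Elastic IP.
igw = OneToOneNat({"10.0.0.5": "203.0.113.10"})

print(igw.outbound(Packet(src="10.0.0.5", dst="198.51.100.20")))
print(igw.inbound(Packet(src="198.51.100.20", dst="203.0.113.10")))
print(igw.outbound(Packet(src="10.0.0.9", dst="198.51.100.20")))
```

Because the mapping is one-to-one, traffic can be initiated from either side, which is exactly what makes a subnet behind an internet gateway public.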

We want to implement highly available, redundant, robust, and secure services, and AWS helps us achieve this easily. Since we use multiple instances to run our services, APIs, and databases, some might wonder whether sending all the traffic through one internet gateway creates a bottleneck.
The answer is no. According to AWS, internet gateways are horizontally scaled, redundant, and highly available VPC components, so they scale with our traffic.

IPv6 addresses are globally unique and therefore public by default, so there is no need for public or Elastic IPs.

NAT Gateways

Network address translation (NAT) gateways enable instances in a private subnet to connect to the internet or other AWS services, while preventing the internet from initiating a connection with those instances.
A NAT gateway does this by replacing the source IPv4 address of the instance with the NAT device's IP address on outbound traffic, and translating the address back on the responses.
This kind of NAT does not apply to IPv6 traffic; for IPv6 we have to use an egress-only internet gateway instead.
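
The difference from the internet gateway is that many private instances share one public address, and only connections started from the inside get a mapping. The toy model below shows that idea using port translation. The class SourceNat, the starting port 1024, and all addresses are assumptions for illustration; it is a simplified sketch, not a description of how AWS builds its NAT gateways.

```python
import itertools


class SourceNat:
    """Toy model of many-to-one source NAT with port translation."""

    def __init__(self, public_ip, first_port=1024):
        self.public_ip = public_ip
        self._next_port = itertools.count(first_port)
        self._port_by_flow = {}  # (private ip, private port) -> public port
        self._flow_by_port = {}  # public port -> (private ip, private port)

    def outbound(self, private_ip, private_port):
        """Translate an outgoing connection and remember the mapping."""
        flow = (private_ip, private_port)
        if flow not in self._port_by_flow:
            port = next(self._next_port)
            self._port_by_flow[flow] = port
            self._flow_by_port[port] = flow
        return (self.public_ip, self._port_by_flow[flow])

    def inbound(self, public_port):
        """Translate a response back, or return None for unsolicited traffic."""
        return self._flow_by_port.get(public_port)


# Example addresses only: two private instances behind one Elastic IP.
nat = SourceNat("198.51.100.7")

print(nat.outbound("10.0.1.5", 40000))
print(nat.outbound("10.0.1.6", 40000))
print(nat.inbound(1024))  # a reply to the first connection
print(nat.inbound(9999))  # nobody opened this port, so it is dropped
```

The last call is the important one: traffic arriving on a port with no existing mapping has nowhere to go, which is why the internet cannot initiate a connection to instances behind a NAT gateway.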

You need a public subnet and an Elastic IP when creating a NAT gateway. Once done, you can configure the route table of your private subnet to gain internet access via the NAT gateway.

If you use multiple Availability Zones, the best practice is to create a NAT gateway in each zone.
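
One way to picture this best practice: every private subnet gets a default route to the NAT gateway in its own zone, so the failure of one zone does not cut off the others. The sketch below builds such a plan. The zone names, subnet IDs and NAT gateway IDs are hypothetical placeholders, and the function only produces a dictionary, it does not call AWS.

```python
def plan_private_routes(nat_gateway_by_zone, zone_by_private_subnet):
    """Map each private subnet to a default route via the NAT gateway in its own zone."""
    plan = {}
    for subnet_id, zone in zone_by_private_subnet.items():
        if zone not in nat_gateway_by_zone:
            raise ValueError(f"no NAT gateway in zone {zone} for {subnet_id}")
        plan[subnet_id] = {"0.0.0.0/0": nat_gateway_by_zone[zone]}
    return plan


# Hypothetical IDs, for illustration only.
nat_gateways = {
    "ap-southeast-1a": "nat-aaa",
    "ap-southeast-1b": "nat-bbb",
}
private_subnets = {
    "subnet-private-1": "ap-southeast-1a",
    "subnet-private-2": "ap-southeast-1b",
}

for subnet_id, routes in plan_private_routes(nat_gateways, private_subnets).items():
    print(subnet_id, routes)
```
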

NAT gateway

VPC Endpoints

We know that we can create multiple instances in a VPC. Since they all share the same network, they do not need public IPs to communicate with each other: the local routes in the route tables carry that traffic over private IPs.

The same idea extends to AWS services outside your VPC. VPC endpoints let instances connect privately to supported AWS services (Amazon S3, for example) without a public IP, an internet gateway, or a NAT device. As I mentioned earlier when talking about internet gateways, there is no need to worry about bottlenecks, because VPC endpoints are horizontally scaled, redundant, and highly available components.

VPC endpoints

Conclusion

We looked at some of the major components of a network in an AWS infrastructure. Most of the time I referenced the AWS documentation because it is well written. If you pay attention to the details, you can teach yourself almost all of the AWS technologies.

The image below has most of the components we studied today. Try to read and understand it.


Follow this tutorial and try to deploy an API using Docker and a framework of your choice instead of AWS Lambda. You will learn a lot.
https://aws.amazon.com/quickstart/architecture/serverless-cicd-for-enterprise/

References

  • https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-regions-availability-zones.html#concepts-regions-availability-zones
  • https://docs.aws.amazon.com/vpc/latest/userguide/VPC_Subnets.html
  • https://www.youtube.com/watch?v=hiKPPy584Mg
  • https://www.digitalocean.com/community/tutorials/understanding-ip-addresses-subnets-and-cidr-notation-for-networking#understanding-ip-addresses
  • https://www.iplocation.net/public-vs-private-ip-address
  • https://stackoverflow.com/questions/2437169/what-is-the-total-amount-of-public-ipv4-addresses
  • https://docs.aws.amazon.com/vpc/latest/userguide/VPC_Scenario2.html
  • https://docs.aws.amazon.com/vpc/latest/userguide/VPC_Route_Tables.html
  • https://docs.aws.amazon.com/vpc/latest/userguide/VPC_Internet_Gateway.html
  • https://docs.aws.amazon.com/vpc/latest/userguide/vpc-nat-gateway.html
  • https://docs.aws.amazon.com/vpc/latest/userguide/vpc-endpoints.html
