One of the most useful skills to learn when working with AWS is how to troubleshoot network connectivity issues. Whether you’re a Solutions Architect, SysOps Engineer, or a Developer, chances are you’ll encounter network connectivity issues at some point in your cloud journey. Learning how to quickly identify and solve networking issues in AWS is a great skill that will serve you well throughout your career.
Here are my top 5 tips that I’ve learned along the way.
1. Be methodical
VPCs can be really complex, so when troubleshooting network connectivity issues, it pays to be methodical in your approach.
Any component within a VPC could be misconfigured, and could be the potential cause of a connectivity issue.
Consider a scenario where you have an EC2 instance that needs to perform a yum update, but the command fails because the instance is not able to reach the internet.
When troubleshooting an issue like this, it helps to draw a diagram showing how the VPC is configured. This will help you identify every component between the source of the network traffic (the instance) and its destination (the internet).
2. Identify likely causes of your network connectivity issue
Once you have identified all of the components involved, you can then examine the configuration of each of them in a methodical way, to identify the cause of the issue.
It also helps to list the possible causes of the problem you are seeing. For an instance that cannot connect to the internet, potential causes include:
- Instance health
- Instance configuration
- Subnet configuration
- Security group or network ACL configuration
- Route table configuration
- Internet or NAT gateway configuration
- Request configuration and protocol
With an exhaustive list on hand, you can then check the configuration of each component to identify the root cause. Process of elimination!
Work from the inside out
My suggestion is to work from the inside out: start at the instance and move outward toward the source or destination of the traffic, or go in the opposite direction if that makes more sense for you. At each component, check that nothing in its configuration would prevent network communication.
So in this case, I would first check the configuration of the instance, then the security group, the network ACL, the route table, the NAT gateway, and finally the request itself. If any component is not configured correctly, fix it and re-test.
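As a rough illustration of that inside-out order, here is a minimal Python sketch. The `config` dictionary and its keys are hypothetical stand-ins for values you would read from the console or the EC2 API; they are not real AWS field names.

```python
from typing import Optional

# Hypothetical snapshot of the settings that matter for outbound internet access.
config = {
    "instance_running": True,
    "sg_allows_outbound": True,
    "nacl_allows_outbound": True,
    # Network ACLs are stateless, so return traffic needs its own inbound rule.
    "nacl_allows_inbound_ephemeral": True,
    "route_table_default_route": None,  # expected: a NAT or internet gateway ID
    "nat_gateway_state": "available",
}

# Checks ordered from the instance outward; each returns True when healthy.
CHECKS = [
    ("instance", lambda c: c["instance_running"]),
    ("security group", lambda c: c["sg_allows_outbound"]),
    ("network ACL", lambda c: c["nacl_allows_outbound"]
        and c["nacl_allows_inbound_ephemeral"]),
    ("route table", lambda c: c["route_table_default_route"] is not None),
    ("NAT gateway", lambda c: c["nat_gateway_state"] == "available"),
]


def first_failing_component(cfg: dict) -> Optional[str]:
    """Return the first component (inside out) that fails its check, or None."""
    for name, check in CHECKS:
        if not check(cfg):
            return name
    return None


print(first_failing_component(config))  # prints "route table"
```

Stopping at the first failure mirrors the fix-and-re-test loop: correct that component, run the checks again, and continue outward.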
3. Take a shortcut
Think outside the box! Or, take a look at what’s already in the box. If you have an example of a working configuration you can compare to, that can sometimes be a quick shortcut to finding the root cause of the network connectivity problem. Compare the two configurations, and if you identify a difference between them, you have likely identified your root cause.
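If both configurations can be captured as nested dictionaries (for example, from exported settings), the comparison can be automated. The sketch below is one possible approach; the `working` and `broken` structures and their keys are made up for illustration.

```python
def diff_configs(working: dict, broken: dict, prefix: str = "") -> list:
    """Return (path, working_value, broken_value) for every setting that differs."""
    differences = []
    for key in sorted(set(working) | set(broken)):
        path = f"{prefix}{key}"
        left, right = working.get(key), broken.get(key)
        if isinstance(left, dict) and isinstance(right, dict):
            # Recurse into nested sections such as a route table or security group.
            differences.extend(diff_configs(left, right, prefix=f"{path}."))
        elif left != right:
            differences.append((path, left, right))
    return differences


# Hypothetical configurations for a working and a failing instance.
working = {
    "route_table": {"default_route": "nat-working-example"},
    "security_group": {"egress": ["all traffic to 0.0.0.0/0"]},
}
broken = {
    "route_table": {"default_route": None},
    "security_group": {"egress": ["all traffic to 0.0.0.0/0"]},
}

for path, good, bad in diff_configs(working, broken):
    print(f"{path}: working={good!r} broken={bad!r}")
```

A difference is a lead rather than proof, so it is still worth fixing it and re-testing to confirm.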
4. Enable VPC flow logs
Flow logs capture information about the IP traffic going to and from the network interfaces in your VPC. You can enable them from the VPC console for an entire VPC, for a subnet, or for an individual elastic network interface (ENI).
You can configure a flow log to record accepted traffic, rejected traffic, or all traffic, and you can deliver the logs to S3 or to CloudWatch Logs. This is a great way to determine whether your requests are being blocked by your network configuration.
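Flow logs can also be created programmatically. The following is a minimal sketch using the boto3 `create_flow_logs` call, assuming boto3 is installed, credentials are configured, and the S3 bucket already exists and permits log delivery. The helper names, VPC ID, and bucket ARN are placeholders.

```python
def build_flow_log_request(vpc_id: str, bucket_arn: str,
                           traffic_type: str = "REJECT") -> dict:
    """Build the keyword arguments for a VPC-level flow log delivered to S3."""
    if traffic_type not in {"ACCEPT", "REJECT", "ALL"}:
        raise ValueError(f"unsupported traffic type: {traffic_type}")
    return {
        "ResourceIds": [vpc_id],
        "ResourceType": "VPC",
        "TrafficType": traffic_type,
        "LogDestinationType": "s3",
        "LogDestination": bucket_arn,
    }


def create_vpc_flow_log(vpc_id: str, bucket_arn: str,
                        traffic_type: str = "REJECT") -> dict:
    """Create the flow log; this makes a real AWS API call."""
    import boto3  # imported lazily so the request builder works without the SDK

    ec2 = boto3.client("ec2")
    return ec2.create_flow_logs(
        **build_flow_log_request(vpc_id, bucket_arn, traffic_type)
    )


# Placeholder identifiers; substitute your own before calling create_vpc_flow_log.
request = build_flow_log_request(
    "vpc-0123456789abcdef0", "arn:aws:s3:::example-flow-log-bucket"
)
print(request["TrafficType"])  # prints "REJECT"
```

Recording only rejected traffic keeps log volume down when the goal is to find what is being blocked; switch to `ALL` if you also need to confirm what is getting through.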
Each flow log entry records an action of ACCEPT or REJECT, so a rejected SSH connection attempt shows up clearly as a REJECT record with destination port 22.
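To make the record layout concrete, here is a small Python sketch that parses a record in the default (version 2) flow log format and checks whether it is a rejected SSH attempt. The sample record is invented for illustration, with a placeholder account ID, interface ID, and addresses; it is not taken from a real log.

```python
# Field order of the default (version 2) flow log format.
FIELDS = (
    "version", "account_id", "interface_id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log_status",
)


def parse_flow_log_record(line: str) -> dict:
    """Split a space-separated default-format record into named fields."""
    values = line.split()
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    return dict(zip(FIELDS, values))


def is_rejected_ssh(record: dict) -> bool:
    """True for a rejected TCP (protocol 6) connection to port 22."""
    return (
        record["action"] == "REJECT"
        and record["protocol"] == "6"
        and record["dstport"] == "22"
    )


# Illustrative record only: all identifiers and addresses are placeholders.
sample = (
    "2 123456789012 eni-0123456789abcdef0 203.0.113.12 10.0.1.25 "
    "49152 22 6 1 40 1700000000 1700000060 REJECT OK"
)

record = parse_flow_log_record(sample)
print(record["srcaddr"], "->", record["dstaddr"], record["action"])
print(is_rejected_ssh(record))  # prints "True"
```

A REJECT here means a security group or network ACL denied the traffic, which narrows the search to those two components.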
5. If all else fails, use the VPC Reachability Analyzer
The VPC Reachability Analyzer is a great tool that enables you to analyze a network path and determine if it is reachable. If the path is not reachable, it will provide an explanation why. So, this is an awesome way to help identify a misconfiguration in your network that is causing a connectivity issue.
To use the tool, you simply provide the traffic source, destination, protocol, and optionally a destination port. Reachability Analyzer will perform the analysis and report whether the path is reachable or not. If the path is not reachable, it’ll tell you which component is blocking the traffic, so you can investigate and fix the problem. In our example below, Reachability Analyzer has identified the security group that is causing the traffic to be blocked.
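[Figure: Reachability Analyzer result identifying the security group that is blocking the traffic. From AWS VPC Reachability Analyzer Console]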
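The same analysis can be started from code. Below is a minimal sketch using boto3, assuming the SDK is installed and credentials are configured. The function names and resource IDs are placeholders, and the explanation code in the sample dictionary is only an illustrative value.

```python
import time


def summarize_analysis(analysis: dict) -> str:
    """Turn a Reachability Analyzer result into a one-line summary."""
    if analysis.get("NetworkPathFound"):
        return "reachable"
    codes = [
        item.get("ExplanationCode", "UNKNOWN")
        for item in analysis.get("Explanations", [])
    ]
    if codes:
        return "not reachable: " + ", ".join(codes)
    return "not reachable"


def analyze_path(source_id: str, destination_id: str, port: int = 22) -> dict:
    """Create a path, run one analysis, and wait for the result (real API calls)."""
    import boto3  # imported lazily so summarize_analysis works without the SDK

    ec2 = boto3.client("ec2")
    path = ec2.create_network_insights_path(
        Source=source_id, Destination=destination_id,
        Protocol="tcp", DestinationPort=port,
    )
    path_id = path["NetworkInsightsPath"]["NetworkInsightsPathId"]
    started = ec2.start_network_insights_analysis(NetworkInsightsPathId=path_id)
    analysis_id = started["NetworkInsightsAnalysis"]["NetworkInsightsAnalysisId"]
    while True:
        result = ec2.describe_network_insights_analyses(
            NetworkInsightsAnalysisIds=[analysis_id]
        )["NetworkInsightsAnalyses"][0]
        if result["Status"] != "running":
            return result
        time.sleep(5)  # poll until the analysis finishes


# Illustrative result shape; the explanation code is an example value.
example = {
    "NetworkPathFound": False,
    "Explanations": [{"ExplanationCode": "ENI_SG_RULES_MISMATCH"}],
}
print(summarize_analysis(example))
```

Calling `analyze_path("i-0123456789abcdef0", "igw-0123456789abcdef0", port=443)` with your own resource IDs would run a real analysis, which AWS bills per analysis.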
Network troubleshooting is an awesome skill to develop, which comes in handy time and time again. Of course, this article just scratches the surface of the kind of issues you might encounter when working with AWS.
Want to learn more about troubleshooting AWS in general? We’ve just launched a brand new course which examines best practices for troubleshooting IAM, Lambda, S3, CloudFormation, as well as Networking. To get a deeper dive into these other troubleshooting topics, check out Hands-On AWS Troubleshooting.