Introduction: Understanding the Docker Ecosystem Requires a Good Knowledge of Container Networking
This series will guide you through the most crucial container networking concepts. You don't need to be a Docker expert to understand the concepts introduced here, though a basic understanding of networking, Docker, and Kubernetes is required. You can fast-track to the second part by going to Docker Networking Part II.
Docker is a tool designed to create, build, and run isolated environments called containers. It is widely used to package applications so that they run inside lightweight containers. To master Docker, you must understand how to create images, run them, secure your containers, work with the Docker file system, and manage Docker networks.
Docker networking may be the most confusing part of your learning journey. During the last few years, this technology has developed a dynamic ecosystem. Technologies like Docker Compose, Docker Swarm, and Kubernetes solved many problems in the containerization ecosystem but introduced new challenges, notably in networking. A good understanding of the Docker ecosystem implies a good knowledge of networking.
Containers: Do One Thing and Do It Well
When we run an application in a container, say WordPress, we can build a single image that ships a web server (Nginx or Apache), PHP (or PHP-FPM), and a MySQL/MariaDB database.
This solution eliminates many networking problems. In this case, you can use a process manager like supervisord to ensure your processes are running.
However, this is not a good practice since you will add more layers to your images. To use supervisord, you must install it and ship its configuration with the container. Building and running lightweight containers, including only the essential processes and software packages, is good practice.
Moreover, running multiple processes in a single container is an anti-pattern. A good practice is to run a single process inside each container.
For the WordPress case, you should have a container for the webserver (Apache or Nginx), a PHP container, and another database container.
These containers must communicate with each other: the web server receives a request and forwards it to the PHP container, and if the latter needs data, it requests it from the database container. The return path must be taken into account as well. If you run these containers on different hosts, they should be able to send and receive traffic from each other, even though they are not on the same host. You should also apply a minimum set of security standards to traffic between hosts. You may also need to scale the web server or the application containers and route traffic to them through a load balancer.
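As a rough illustration of the single-host case, the sketch below builds the Docker CLI commands that would place three containers on one user-defined bridge network, where Docker's embedded DNS lets containers reach each other by name. The network name `wp-net`, the container names, the published port, and the placeholder password are assumptions made for this example, not values from this article. The script only prints the commands and executes nothing, and a working WordPress stack would also need web server and PHP configuration that is omitted here.

```python
import shlex

# Hypothetical names chosen for illustration; none come from the article.
NETWORK = "wp-net"
SERVICES = [
    # (container name, image, extra `docker run` arguments)
    ("db", "mariadb", ["-e", "MARIADB_ROOT_PASSWORD=change-me"]),
    ("php", "php:fpm", []),
    ("web", "nginx", ["-p", "8080:80"]),
]


def build_commands(network, services):
    """Return the docker CLI invocations as argument lists (nothing is executed)."""
    # One user-defined network shared by every container in the stack.
    commands = [["docker", "network", "create", network]]
    for name, image, extra in services:
        commands.append(
            ["docker", "run", "-d", "--name", name, "--network", network, *extra, image]
        )
    return commands


if __name__ == "__main__":
    for cmd in build_commands(NETWORK, SERVICES):
        print(shlex.join(cmd))
```

Because all three containers join the same network, the web container could address the PHP container as `php` and the PHP container could address the database as `db`, without hard-coded IP addresses. Only the web container publishes a port to the host in this sketch.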
Even if this use case seems basic, we can see how networking is essential in running the whole stack.
Orchestration: A Layer of Complexity?
You may run standalone containers, but in highly available environments, particularly production, you will need orchestration platforms to manage these containers.
Using orchestration systems like Kubernetes will undoubtedly solve many problems to which containers alone cannot provide satisfying solutions.
Let's identify some of them:
- Scaling up/down the number of containers composing a service
- Load balancing across different containers of the same service
- Transfer of containers from one node (VM) to another
- Exposure of service to other services and the Internet
- Service discovery between containers and services
- Deployment of containers
If you examine most of these use cases and features, networking is a common theme.
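To make the networking theme concrete, here is a toy sketch of three of the items above: scaling, service discovery, and load balancing. It is an in-memory model with a simple round-robin policy, not how Kubernetes or any other orchestrator implements these features. The class name `ServiceRegistry`, its methods, and the addresses are all hypothetical.

```python
from itertools import cycle


class ServiceRegistry:
    """Toy in-memory registry: service name -> list of container addresses."""

    def __init__(self):
        self._endpoints = {}
        self._cursors = {}

    def scale(self, service, addresses):
        """Set the replicas backing a service (scaling up or down)."""
        if not addresses:
            raise ValueError("a service needs at least one endpoint")
        self._endpoints[service] = list(addresses)
        # Restart the rotation whenever the replica set changes.
        self._cursors[service] = cycle(self._endpoints[service])

    def resolve(self, service):
        """Service discovery: return every known address for a service."""
        return list(self._endpoints[service])

    def pick(self, service):
        """Load balancing: return the next address in round-robin order."""
        return next(self._cursors[service])


if __name__ == "__main__":
    registry = ServiceRegistry()
    registry.scale("web", ["10.0.0.2:80", "10.0.0.3:80"])
    print([registry.pick("web") for _ in range(3)])
    registry.scale("web", ["10.0.0.2:80", "10.0.0.3:80", "10.0.0.4:80"])
    print(registry.resolve("web"))
```

Callers only ever refer to the service name `web`. The set of addresses behind that name can change as the service scales, which is the kind of indirection an orchestrator's networking layer has to provide.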
Orchestration is a must, but it adds an extra layer of networking. Besides the inter-container networking in the same node, there are multiple types of networking in Kubernetes, like master-to-cluster networking, cluster-to-master networking, Internet-to-service networking, service-to-pod networking, pod-to-pod networking, container-to-container networking, etc. If we want to go into more detail, we can also consider node, kubelet, kube-proxy, and DNS networking.
Kubernetes is considered the modern data center operating system; one would expect to see this networking complexity in a platform of this magnitude. At the same time, one of the most complex and probably the most critical parts of Kubernetes is networking. There is no mastery of Kubernetes without embracing its networking system.
Container Networking Is Complex but Not Complicated
There is an aphorism in the Zen of Python that says:
"Simple is better than complex. Complex is better than complicated."
In IT, complexity refers to the number of components of a system and the level of interactions between them. Complicated, by contrast, means a high level of difficulty.
Container networking, notably in orchestrated systems, is complex but not complicated. Moreover, this complexity is sometimes necessary to create abstract systems and generic solutions to common problems. Joe Beda, a Kubernetes developer, said:
"Kubernetes is a complex system. It does a lot and brings new abstractions [...] as engineers, we tend to discount the complexity we build ourselves vs. complexity we need to learn."
Beda went on to say that when you create a complex deployment system with Jenkins, Bash, Puppet/Chef/Salt/Ansible, AWS, Terraform, etc., you end up with a unique brand of complexity that you are comfortable with. The system grew organically, so it doesn't feel complex, but it is difficult to bring new people on to help with an organically grown system like this. They may know some of the tools, but how you put them together is unique.
This is where Kubernetes adds value. It provides a set of abstractions that solves a standard set of problems. As people build understanding and skills around those problems, they are more productive in more situations.
Although container networking is complex, it is approachable if you have basic knowledge of networking and the time to invest in learning new skills.
In Part II, we will dive deep into the technical details. We will understand how Docker container networking works when we run it in standalone mode, how multi-container networking works, and their differences. We will also discover how networking is managed across multiple hosts and the fascinating Kubernetes networking world.