I’ve been working with container networking a bunch this week. When learning about new unfamiliar stuff (like container networking / virtual ethernet devices / bridges / iptables), I often realize that I don’t fully understand something much more fundamental.
This week, that thing was: network interfaces!!
You know, when you run
ifconfig and it lists devices like
wlan0, or whatever. Those.
This is a thing I thought I understood but it turns out there are at least 2 things I didn’t know about them.
I’m not going to try to give you a crisp definition, instead we’re going to make some observations, do some experiments, ask some questions, and make some guesses.
What happens if you don’t have any network interfaces?
I was messing around with network namespaces, and I created a new one with:
sudo ip netns add ns1
It turns out that when you create a new network namespace, it doesn’t have any network interfaces at all! What does that mean? Let’s explore and see what it looks like:
We can run commands inside this new network namespace with
sudo ip netns exec ns1 COMMAND. I’m just going to run a shell inside this network namespace, and then
try out some things.
So let’s start with
sudo ip netns exec ns1 bash
$ sudo ip netns exec ns1 bash $ ifconfig (no output)
That makes sense, this is a new network namespace so there are no network interfaces set up yet. Still inside that network namespace, let’s try to make a webserver and connect to it.
$ nc -l 8900 & # make a server on port 8900 $ netstat -tulpn # list open ports Active Internet connections (only servers) Proto Recv-Q Send-Q Local Address PID/Program name tcp 0 0 0.0.0.0:8900 2918/nc $ curl localhost:8900 curl: (7) Couldn't connect to server
Okay, so this is sort of interesting. I can create a server on port 8900 with
nc -l 8900. And netstat shows that that server exists. But when I try to
curl localhost:8900, nothing happens!
What if I try to create a server listening on 127.0.0.1?
sudo nc -l 127.0.0.1 8080 nc: Cannot assign requested address
Doesn’t work. Makes sense.
I think what’s happening here is:
nc -l 8900is listening on 0.0.0.0:8900, which means “all network interfaces”
- but there are no network interfaces
- so when we do
curl localhost:8900, no packets actually get sent (when I ran tcpdump, no packets show up)
ncnever receives any packets
Let’s do an experiment to try to confirm our hypotheses: let’s add a network
interface! The idea is that if we have a
lo network interface, then
localhost:8900 will actually send packets,
nc will receive them, and
everything will work.
$ ip link set dev lo up # this sets uo the 'lo' loopback interface $ curl localhost:8900 # BAM! this totally works! # the backgrounded netcat prints out this output: GET / HTTP/1.1 User-Agent: curl/7.35.0 Host: localhost:8900 Accept: */*
This is rad. What we know now:
- if you don’t have any network interfaces, you can’t do any networking (but you can start servers on 0.0.0.0 and netstat shows those servers)
- when we add a network interface, our server starts working right away (without having to restart the server)
A packet can appear multiple times in tcpdump
Something I’ve been observing recently but haven’t fully understood is – sometimes I’ll be on a machine which has
- virtual network interfaces for each container (
- a bridge interface (
- and a “real” network interface to the outside world (
When containers send packets to the outside world and I’m running
sudo tcpdump -i any, I’ll see those packets 3 times.
I know a few more things about how tcpdump works:
- I can run
sudo tcpdump -i cni0to listen on a specific interface. When I do that, the packets appear only once
- tcpdump happens at the “beginning” of the network stack. I think that means that packets are captured by tcpdump when packets enter a network interface
What does “enter a network interface” actually mean, though? I tried to look at this 20,000 word article on the iinux network stack and I think I have a workable theory!
What happens when a packet is created?
Okay, so I skimmed Monitoring and Tuning the Linux Networking Stack: Receiving Data and I think I have a working hypothesis for how packets
- get assigned network interfaces
- get captured by tcpdump
- can be assigned more than one network interface
First thing first, this document refers to “network interfaces” as “network devices”. I think those are the same thing.
So!! Let’s say I create a packet on my computer.
step 0: iptables prerouting rules
step 1: the packet gets routed.
Routing a packet means “assigning it a network device”.
Let’s do a tiny experiment in routing – I have 3 interfaces on my computer right now
$ ifconfig docker0 Link encap:Ethernet HWaddr 02:42:ef:ab:0d:ac inet addr:172.17.0.1 Bcast:0.0.0.0 Mask:255.255.0.0 enp0s25 Link encap:Ethernet HWaddr 3c:97:0e:55:b3:7g inet addr:192.168.1.213 Bcast:192.168.1.255 Mask:255.255.255.0 lo Link encap:Local Loopback inet addr:127.0.0.1 Mask:255.0.0.0
and here are the routes:
$ sudo ip route list table all default via 192.168.1.1 dev enp0s25 proto static metric 100 169.254.0.0/16 dev docker0 scope link metric 1000 linkdown 172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown 192.168.1.0/24 dev enp0s25 proto kernel scope link src 192.168.1.213 metric 100 local 127.0.0.0/8 dev lo table local proto kernel scope host src 127.0.0.1 local 127.0.0.1 dev lo table local proto kernel scope host src 127.0.0.1 local 172.17.0.1 dev docker0 table local proto kernel scope host src 172.17.0.1
So – if I make a request to 172.17.0.1 (
curl 172.17.0.1:8080), it seems like that would end up on the
docker0 device. Right? Wrong, apparently.
If I run
tcpdump -i lo packets to 172.17.0.1 show up, and if I run
tcpdump -i docker0,
the packets don’t show up. So it seems right now, on my machine,
packets sent to 172.17.0.1 go through the
The reason they get sent to
lo instead of
docker0 is that there’s a route
for 172.17.0.1 in my route table that says
local – the same reasons that
127.0.0.1 get sent to
step 2 tcpdump gets the packet
This is pretty straightforward – once there’s a network device attached to the packet, then tcpdump gets the packet.
That’s all I know for now!
ok so what do we know about network interfaces?
Here’s what I think so far:
- they can be physical network interfaces (like
eth0) or virtual interfaces (like
- you can list them with
ip link list
- if you don’t have any network interfaces, your packets don’t enter the linux network stack at all really. To go through the network stack you need network interfaces.
- When you send a packet to an IP address, your route table decides which network interface that packet goes through. This is one of the first things that happens in the network stack.
- tcpdump captures packets after they’re routed (assigned an interface) Though there’s a
PREROUTINGchain in iptables that happens before routing!`
Some of this is probably wrong, let me know what! I’m on Twitter as always (https://twitter.com/b0rk)