Linux namespaces, Pods, and containers

A memo examining how Linux namespaces, Pods, and containers relate to each other.

The checks are run on an EKS 1.24 cluster (k below is a shell alias for kubectl).

$ k get node
NAME                                             STATUS   ROLES    AGE   VERSION
ip-10-0-10-67.ap-northeast-1.compute.internal    Ready    <none>   85m   v1.24.7-eks-fb459a0
ip-10-0-10-76.ap-northeast-1.compute.internal    Ready    <none>   85m   v1.24.7-eks-fb459a0
ip-10-0-11-179.ap-northeast-1.compute.internal   Ready    <none>   85m   v1.24.7-eks-fb459a0

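Start ordinary Pods for testing: pod1 with two containers (c1 and c2) and pod2 with one (c3), both pinned to the same node with nodeName.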

cat << EOF > pod1.yaml
apiVersion: v1
kind: Pod
metadata:
  labels:
    run: pod1
  name: pod1
spec:
  nodeName: ip-10-0-10-76.ap-northeast-1.compute.internal
  containers:
  - image: nginx
    name: c1
  - image: nginx
    name: c2
    command:
    - sleep
    - infinity
EOF
k apply -f pod1.yaml
cat << EOF > pod2.yaml
apiVersion: v1
kind: Pod
metadata:
  labels:
    run: pod2
  name: pod2
spec:
  nodeName: ip-10-0-10-76.ap-northeast-1.compute.internal
  containers:
  - image: nginx
    name: c3
EOF
k apply -f pod2.yaml

Check the Pods.

$ k get po -o wide
NAME   READY   STATUS    RESTARTS   AGE     IP            NODE                                            NOMINATED NODE   READINESS GATES
pod1   2/2     Running   0          10s     10.0.10.156   ip-10-0-10-76.ap-northeast-1.compute.internal   <none>           <none>
pod2   1/1     Running   0          3m11s   10.0.10.32    ip-10-0-10-76.ap-northeast-1.compute.internal   <none>           <none>

Log in to the worker node and install crictl.

VERSION="v1.26.0"
wget https://github.com/kubernetes-sigs/cri-tools/releases/download/$VERSION/crictl-$VERSION-linux-amd64.tar.gz
sudo tar zxvf crictl-$VERSION-linux-amd64.tar.gz -C /usr/local/bin
rm -f crictl-$VERSION-linux-amd64.tar.gz

An ordinary user did not have sufficient permissions, so work as root.

sudo -i
PATH=/usr/local/bin:$PATH

List the containers.

[root@ip-10-0-10-76 ~]# crictl ps
WARN[0000] runtime connect using default endpoints: [unix:///var/run/dockershim.sock unix:///run/containerd/containerd.sock unix:///run/crio/crio.sock unix:///var/run/cri-dockerd.sock]. As the default settings are now deprecated, you should set the endpoint instead.
WARN[0000] image connect using default endpoints: [unix:///var/run/dockershim.sock unix:///run/containerd/containerd.sock unix:///run/crio/crio.sock unix:///var/run/cri-dockerd.sock]. As the default settings are now deprecated, you should set the endpoint instead.
CONTAINER           IMAGE               CREATED             STATE               NAME                     ATTEMPT             POD ID              POD
e403fe5ffa665       a99a39d070bfd       27 seconds ago      Running             c2                       0                   bb248149db8aa       pod1
302bc156a8851       a99a39d070bfd       29 seconds ago      Running             c1                       0                   bb248149db8aa       pod1
e26060a138fa0       a99a39d070bfd       3 minutes ago       Running             c3                       0                   44aa61e83205b       pod2
39a2e1ffb3802       e89557324d1dc       About an hour ago   Running             aws-cloudwatch-metrics   0                   ba00864e6c645       aws-cloudwatch-metrics-rz8gg
a5df991ddd9b5       5bad0186aac16       2 hours ago         Running             aws-node                 0                   4c89a915897ac       aws-node-h8zpz
34a11679a69b7       45d382f80f905       2 hours ago         Running             liveness-probe           0                   7277b4c16e132       ebs-csi-node-fbj5c
7c6bf9d37cb96       23b72c0353f68       2 hours ago         Running             node-driver-registrar    0                   7277b4c16e132       ebs-csi-node-fbj5c
45312cd447464       b0bf1b95ca27d       2 hours ago         Running             ebs-plugin               0                   7277b4c16e132       ebs-csi-node-fbj5c
101e615d876fb       c78e7b825058a       2 hours ago         Running             coredns                  0                   efd0307fb7b4a       coredns-5fc8d4cdcf-cpmbk
6669bfb0ed765       c78e7b825058a       2 hours ago         Running             coredns                  0                   35b82e7e655a9       coredns-5fc8d4cdcf-h82vt
41168828c2868       04beb3b811d34       2 hours ago         Running             kube-proxy               0                   4aed797955139       kube-proxy-xk924
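
The WARN lines above appear because crictl is probing its list of default endpoints. One way to avoid them is to pin the endpoint in crictl's config file. The following is a minimal sketch, assuming the default config path /etc/crictl.yaml and the containerd socket that kubelet on this node is started with (unix:///run/containerd/containerd.sock). The helper name write_crictl_config is made up for illustration.

```shell
# Hypothetical helper: write a crictl config that pins the CRI endpoints.
# $1 = config file path (defaults to /etc/crictl.yaml, crictl's default location)
write_crictl_config() {
  conf="${1:-/etc/crictl.yaml}"
  # Assumption: containerd socket used on this node
  sock="unix:///run/containerd/containerd.sock"
  cat > "$conf" << EOF
runtime-endpoint: $sock
image-endpoint: $sock
EOF
}

# Example (as root on the worker node):
#   write_crictl_config
#   crictl ps
```

With the endpoints set this way, crictl should connect to containerd directly instead of trying each default socket in turn.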

pod1 runs two containers (c1 and c2). Check their PIDs; the first "pid" value in each output is the process ID as seen from the host.

[root@ip-10-0-10-76 ~]# crictl inspect 302bc156a8851 | grep pid
WARN[0000] runtime connect using default endpoints: [unix:///var/run/dockershim.sock unix:///run/containerd/containerd.sock unix:///run/crio/crio.sock unix:///var/run/cri-dockerd.sock]. As the default settings are now deprecated, you should set the endpoint instead.
    "pid": 25202,
            "pid": 1
            "type": "pid"
[root@ip-10-0-10-76 ~]# crictl inspect e403fe5ffa665 | grep pid
WARN[0000] runtime connect using default endpoints: [unix:///var/run/dockershim.sock unix:///run/containerd/containerd.sock unix:///run/crio/crio.sock unix:///var/run/cri-dockerd.sock]. As the default settings are now deprecated, you should set the endpoint instead.
    "pid": 25296,
            "pid": 1
            "type": "pid"
[root@ip-10-0-10-76 ~]#
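
grep pid matches three different fields, and only the first one is the host PID. To extract just that value, crictl inspect can render a Go template instead of the full JSON. A small sketch follows; the wrapper name container_pid is hypothetical, and it assumes the runtime reports the PID under .info.pid, which is consistent with the containerd output above.

```shell
# Hypothetical wrapper: print the host PID of a container, given its container ID.
# Assumes the runtime exposes the PID as .info.pid in `crictl inspect` output.
container_pid() {
  crictl inspect --output go-template --template '{{.info.pid}}' "$1"
}

# Example (as root on the worker node):
#   ls -l "/proc/$(container_pid 302bc156a8851)/ns"
```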

Check the PID of pod2's single container (c3) as well.

[root@ip-10-0-10-76 ~]# crictl inspect e26060a138fa0 | grep pid
WARN[0000] runtime connect using default endpoints: [unix:///var/run/dockershim.sock unix:///run/containerd/containerd.sock unix:///run/crio/crio.sock unix:///var/run/cri-dockerd.sock]. As the default settings are now deprecated, you should set the endpoint instead.
    "pid": 22813,
            "pid": 1
            "type": "pid"

Look at the namespaces of each of these processes.

[root@ip-10-0-10-76 ~]# ls -l /proc/25202/ns
total 0
lrwxrwxrwx 1 root root 0 Jan 11 08:57 cgroup -> cgroup:[4026531835]
lrwxrwxrwx 1 root root 0 Jan 11 08:57 ipc -> ipc:[4026532688]
lrwxrwxrwx 1 root root 0 Jan 11 08:57 mnt -> mnt:[4026532690]
lrwxrwxrwx 1 root root 0 Jan 11 08:57 net -> net:[4026532610]
lrwxrwxrwx 1 root root 0 Jan 11 08:57 pid -> pid:[4026532691]
lrwxrwxrwx 1 root root 0 Jan 11 08:57 pid_for_children -> pid:[4026532691]
lrwxrwxrwx 1 root root 0 Jan 11 08:57 user -> user:[4026531837]
lrwxrwxrwx 1 root root 0 Jan 11 08:57 uts -> uts:[4026532687]
[root@ip-10-0-10-76 ~]# ls -l /proc/25296/ns
total 0
lrwxrwxrwx 1 root root 0 Jan 11 08:57 cgroup -> cgroup:[4026531835]
lrwxrwxrwx 1 root root 0 Jan 11 08:57 ipc -> ipc:[4026532688]
lrwxrwxrwx 1 root root 0 Jan 11 08:57 mnt -> mnt:[4026532775]
lrwxrwxrwx 1 root root 0 Jan 11 08:57 net -> net:[4026532610]
lrwxrwxrwx 1 root root 0 Jan 11 08:57 pid -> pid:[4026532776]
lrwxrwxrwx 1 root root 0 Jan 11 08:57 pid_for_children -> pid:[4026532776]
lrwxrwxrwx 1 root root 0 Jan 11 08:57 user -> user:[4026531837]
lrwxrwxrwx 1 root root 0 Jan 11 08:57 uts -> uts:[4026532687]
[root@ip-10-0-10-76 ~]# ls -l /proc/22813/ns
total 0
lrwxrwxrwx 1 root root 0 Jan 11 08:57 cgroup -> cgroup:[4026531835]
lrwxrwxrwx 1 root root 0 Jan 11 08:57 ipc -> ipc:[4026532771]
lrwxrwxrwx 1 root root 0 Jan 11 08:57 mnt -> mnt:[4026532773]
lrwxrwxrwx 1 root root 0 Jan 11 08:57 net -> net:[4026532693]
lrwxrwxrwx 1 root root 0 Jan 11 08:57 pid -> pid:[4026532774]
lrwxrwxrwx 1 root root 0 Jan 11 08:57 pid_for_children -> pid:[4026532774]
lrwxrwxrwx 1 root root 0 Jan 11 08:57 user -> user:[4026531837]
lrwxrwxrwx 1 root root 0 Jan 11 08:57 uts -> uts:[4026532770]

For comparison, also look at the namespaces of the host shell itself (bash).

[root@ip-10-0-10-76 ~]# ls -l /proc/$$/ns
total 0
lrwxrwxrwx 1 root root 0 Jan 11 08:40 cgroup -> cgroup:[4026531835]
lrwxrwxrwx 1 root root 0 Jan 11 08:40 ipc -> ipc:[4026531839]
lrwxrwxrwx 1 root root 0 Jan 11 08:40 mnt -> mnt:[4026531840]
lrwxrwxrwx 1 root root 0 Jan 11 08:40 net -> net:[4026531992]
lrwxrwxrwx 1 root root 0 Jan 11 08:40 pid -> pid:[4026531836]
lrwxrwxrwx 1 root root 0 Jan 11 08:40 pid_for_children -> pid:[4026531836]
lrwxrwxrwx 1 root root 0 Jan 11 08:40 user -> user:[4026531837]
lrwxrwxrwx 1 root root 0 Jan 11 08:40 uts -> uts:[4026531838]

Put the results in a table.

namespace         c1 (pod1)   c2 (pod1)   c3 (pod2)   bash
cgroup            4026531835  4026531835  4026531835  4026531835
ipc               4026532688  4026532688  4026532771  4026531839
mnt               4026532690  4026532775  4026532773  4026531840
net               4026532610  4026532610  4026532693  4026531992
pid               4026532691  4026532776  4026532774  4026531836
pid_for_children  4026532691  4026532776  4026532774  4026531836
user              4026531837  4026531837  4026531837  4026531837
uts               4026532687  4026532687  4026532770  4026531838
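
The table above was assembled by hand from the ls output. As a rough sketch, the same rows could be generated by reading the /proc/<pid>/ns symlinks directly. The function name ns_table is hypothetical, and reading other users' processes is assumed to require root, as in the rest of this memo.

```shell
# Print one row per namespace type and one column per PID,
# using the inode number from each /proc/<pid>/ns/<type> symlink.
ns_table() {
  for ns in cgroup ipc mnt net pid pid_for_children user uts; do
    row="$ns"
    for pid in "$@"; do
      # readlink prints e.g. "net:[4026532610]"; keep only the digits
      id="$(readlink "/proc/$pid/ns/$ns" 2>/dev/null | tr -cd '0-9')"
      # Show "-" when the link cannot be read
      row="$row ${id:--}"
    done
    echo "$row"
  done
}

# Example with the PIDs from this memo (as root on the worker node):
#   ns_table 25202 25296 22813 $$
```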

In summary:

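namespace         Other container, same Pod  Container in another Pod  Summary
cgroup            same                       same                      not isolated (same as host)
ipc               same                       different                 per Pod
mnt               different                  different                 per container
net               same                       different                 per Pod
pid               different                  different                 per container
pid_for_children  different                  different                 per container
user              same                       same                      not isolated (same as host)
uts               same                       different                 per Pod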
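
The same/different judgments in the summary can also be checked mechanically for any two processes. The following is a sketch only; ns_compare is a hypothetical name, and it simply compares the symlink targets under /proc/<pid>/ns.

```shell
# Report, per namespace type, whether two processes are in the same namespace.
# $1, $2 = PIDs to compare
ns_compare() {
  for ns in cgroup ipc mnt net pid user uts; do
    a="$(readlink "/proc/$1/ns/$ns" 2>/dev/null)"
    b="$(readlink "/proc/$2/ns/$ns" 2>/dev/null)"
    if [ -z "$a" ] || [ -z "$b" ]; then
      echo "$ns unknown"      # link unreadable (no permission or no such PID)
    elif [ "$a" = "$b" ]; then
      echo "$ns shared"
    else
      echo "$ns separate"
    fi
  done
}

# Example (as root on the worker node): compare c1 and c2 of pod1
#   ns_compare 25202 25296
```

Given the values recorded above, comparing c1 and c2 would be expected to report mnt and pid as separate and the rest as shared.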

crictl ps does not list it, but each Pod should also have a pause container.

[root@ip-10-0-10-76 ~]# ps -ef | grep pause
root      3464     1  1 Jan11 ?        00:14:53 /usr/bin/kubelet --cloud-provider aws --image-credential-provider-config /etc/eks/ecr-credential-provider/ecr-credential-provider-config --image-credential-provider-bin-dir /etc/eks/ecr-credential-provider --config /etc/kubernetes/kubelet/kubelet-config.json --kubeconfig /var/lib/kubelet/kubeconfig --container-runtime remote --container-runtime-endpoint unix:///run/containerd/containerd.sock --node-ip=10.0.10.76 --pod-infra-container-image=602401143452.dkr.ecr.ap-northeast-1.amazonaws.com/eks/pause:3.5 --v=2 --node-labels=eks.amazonaws.com/nodegroup-image=ami-0a29a25cb84e79542,eks.amazonaws.com/capacityType=ON_DEMAND,eks.amazonaws.com/nodegroup=managed-ondemand-20230111062322101300000001 --max-pods=58
65535     3709  3663  0 Jan11 ?        00:00:00 /pause
65535     4640  4607  0 Jan11 ?        00:00:00 /pause
65535     4687  4599  0 Jan11 ?        00:00:00 /pause
root      4777  4707  0 00:55 pts/0    00:00:00 grep --color=auto pause
root     19591 19567  0 Jan11 ?        00:00:00 /pause
65535    20238 20208  0 Jan11 ?        00:00:00 /pause
65535    22745 22720  0 Jan11 ?        00:00:00 /pause
65535    25134 25109  0 Jan11 ?        00:00:00 /pause
65535    30715 30687  0 Jan11 ?        00:00:00 /pause

Checking the pause processes one by one confirms that PID 25134 holds the same ipc/net/uts namespaces as the containers of pod1.

[root@ip-10-0-10-76 ~]# ls -l /proc/25134/ns
total 0
lrwxrwxrwx 1 65535 65535 0 Jan 12 01:03 cgroup -> cgroup:[4026531835]
lrwxrwxrwx 1 65535 65535 0 Jan 11 08:53 ipc -> ipc:[4026532688]
lrwxrwxrwx 1 65535 65535 0 Jan 12 01:03 mnt -> mnt:[4026532686]
lrwxrwxrwx 1 65535 65535 0 Jan 11 08:53 net -> net:[4026532610]
lrwxrwxrwx 1 65535 65535 0 Jan 12 01:03 pid -> pid:[4026532689]
lrwxrwxrwx 1 65535 65535 0 Jan 12 01:03 pid_for_children -> pid:[4026532689]
lrwxrwxrwx 1 65535 65535 0 Jan 12 01:03 user -> user:[4026531837]
lrwxrwxrwx 1 65535 65535 0 Jan 11 08:53 uts -> uts:[4026532687]
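
Rather than inspecting each pause process by hand, the search can be scripted by scanning /proc for processes whose namespace link matches that of a known container process. A minimal sketch, with same_ns_pids as a hypothetical helper name:

```shell
# List PIDs whose namespace of the given type matches that of a target PID.
# $1 = target PID, $2 = namespace type (defaults to net)
same_ns_pids() {
  ns="${2:-net}"
  want="$(readlink "/proc/$1/ns/$ns")" || return 1
  for link in /proc/[0-9]*/ns/"$ns"; do
    # Unreadable links (other users' processes, exited PIDs) are skipped
    if [ "$(readlink "$link" 2>/dev/null)" = "$want" ]; then
      pid="${link#/proc/}"
      echo "${pid%%/*}"
    fi
  done
}

# Example (as root on the worker node): show each match with its command name
#   for p in $(same_ns_pids 25202 net); do echo "$p $(cat /proc/$p/comm)"; done
```

Note that this lists every process in the namespace, so for pod1 the nginx worker processes and sleep would be expected to appear alongside pause.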

The parent-child relationships between the processes look like this.

[root@ip-10-0-10-76 ~]# pstree -p 25109
containerd-shim(25109)─┬─nginx(25202)─┬─nginx(25273)
                       │              ├─nginx(25274)
                       │              ├─nginx(25275)
                       │              └─nginx(25277)
                       ├─pause(25134)
                       ├─sleep(25296)
                       ├─{containerd-shim}(25110)
                       ├─{containerd-shim}(25111)
                       ├─{containerd-shim}(25112)
                       ├─{containerd-shim}(25113)
                       ├─{containerd-shim}(25114)
                       ├─{containerd-shim}(25115)
                       ├─{containerd-shim}(25116)
                       ├─{containerd-shim}(25117)
                       ├─{containerd-shim}(25118)
                       ├─{containerd-shim}(25176)
                       ├─{containerd-shim}(28809)
                       └─{containerd-shim}(8915)
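
The tree shows pause, nginx (c1), and sleep (c2) as direct children of the same containerd-shim process. Where pstree is not installed, the parent PIDs can be read from /proc instead. A small sketch, with parent_pids as a hypothetical helper name:

```shell
# Print "<pid> <parent pid>" for each given PID, read from /proc/<pid>/status.
parent_pids() {
  for pid in "$@"; do
    ppid="$(awk '/^PPid:/ {print $2}' "/proc/$pid/status")"
    echo "$pid $ppid"
  done
}

# Example with the PIDs from this memo:
#   parent_pids 25134 25202 25296
```

With the values above, all three would be expected to report 25109 as their parent.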
