Container Namespace - 2

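      Containers use namespaces to build a relatively isolated environment of their own. From the previous article, "Container Namespace - 1", we know that CentOS 7 does not enable the user namespace by default.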

      

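      The figure above shows that the container's user namespace has the same ID as the host's: 4026531837. Next, let's briefly walk through the Docker code that sets up namespaces.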
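The comparison above can also be done programmatically. The following is a minimal sketch, assuming a Linux host where /proc/self/ns/user is a symlink whose target looks like "user:[4026531837]". The helper names parseNsInode and sameNamespace are hypothetical and only serve this illustration.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// parseNsInode extracts the numeric ID from a namespace link target
// such as "user:[4026531837]". (Hypothetical helper for illustration.)
func parseNsInode(link string) (uint64, error) {
	start := strings.Index(link, "[")
	end := strings.LastIndex(link, "]")
	if start < 0 || end <= start {
		return 0, fmt.Errorf("unexpected namespace link %q", link)
	}
	return strconv.ParseUint(link[start+1:end], 10, 64)
}

// sameNamespace reports whether two link targets name the same namespace.
func sameNamespace(a, b string) bool {
	ia, errA := parseNsInode(a)
	ib, errB := parseNsInode(b)
	return errA == nil && errB == nil && ia == ib
}

func main() {
	// Read the namespace link of the current process.
	link, err := os.Readlink("/proc/self/ns/user")
	if err != nil {
		fmt.Println("cannot read namespace link:", err)
		return
	}
	id, err := parseNsInode(link)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("user namespace ID: %d\n", id)
}
```

Running this once on the host and once inside the container should print the same ID when the container has no user namespace of its own.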

 

vendor/github.com/containerd/containerd/oci/spec_unix.go defines the default namespaces and capabilities.

func defaultCaps() []string {
        return []string{
                "CAP_CHOWN",
                "CAP_DAC_OVERRIDE",
                "CAP_FSETID",
                "CAP_FOWNER",
                "CAP_MKNOD",
                "CAP_NET_RAW",
                "CAP_SETGID",
                "CAP_SETUID",
                "CAP_SETFCAP",
                "CAP_SETPCAP",
                "CAP_NET_BIND_SERVICE",
                "CAP_SYS_CHROOT",
                "CAP_KILL",
                "CAP_AUDIT_WRITE",
        }
}

func defaultNamespaces() []specs.LinuxNamespace {
        return []specs.LinuxNamespace{
                {
                        Type: specs.PIDNamespace,
                },
                {
                        Type: specs.IPCNamespace,
                },
                {
                        Type: specs.UTSNamespace,
                },
                {
                        Type: specs.MountNamespace,
                },
                {
                        Type: specs.NetworkNamespace,
                },
        }
}

While a container is being created and started, a run-time config.json is generated under the following directory:

/run/docker/containerd/daemon/io.containerd.runtime.v1.linux/moby/<container-id>/config.json

Pretty-print the file (jq is a tool for processing JSON inputs):

jq . config.json > /opt/config.json

The namespace-related configuration in /opt/config.json:

  "namespaces": [
      {
        "type": "mount"
      },
      {
        "type": "network"
      },
      {
        "type": "uts"
      },
      {
        "type": "pid"
      },
      {
        "type": "ipc"
      }
    ],
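To check such a file without reading it by eye, the namespace list can be extracted in a few lines. This is a minimal sketch, assuming the layout of the OCI runtime spec, where the list sits under the top-level "linux" key. The sample document and the helper names namespaceTypes and hasNamespace are made up for illustration and are not taken from Docker's code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Only the fields needed here; the real config.json has many more.
type namespace struct {
	Type string `json:"type"`
	Path string `json:"path,omitempty"`
}

type spec struct {
	Linux struct {
		Namespaces []namespace `json:"namespaces"`
	} `json:"linux"`
}

// sampleConfig mirrors the namespace fragment shown above (assumed layout).
const sampleConfig = `{"linux": {"namespaces": [
	{"type": "mount"}, {"type": "network"}, {"type": "uts"},
	{"type": "pid"}, {"type": "ipc"}]}}`

// namespaceTypes returns the namespace types listed in a config.json document.
func namespaceTypes(data []byte) ([]string, error) {
	var s spec
	if err := json.Unmarshal(data, &s); err != nil {
		return nil, err
	}
	types := make([]string, 0, len(s.Linux.Namespaces))
	for _, ns := range s.Linux.Namespaces {
		types = append(types, ns.Type)
	}
	return types, nil
}

// hasNamespace reports whether want appears in types.
func hasNamespace(types []string, want string) bool {
	for _, t := range types {
		if t == want {
			return true
		}
	}
	return false
}

func main() {
	types, err := namespaceTypes([]byte(sampleConfig))
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println("namespaces:", types)
	fmt.Println("user namespace configured:", hasNamespace(types, "user"))
}
```

With the five entries above, no "user" entry is found, which is consistent with the container sharing the host's user namespace.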

 

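The namespace setup logic is in:

vendor/github.com/opencontainers/runc/libcontainer/nsenter/nsexec.c

The excerpt below is the join path: it enters namespaces that already exist and are given by path, using setns. Namespaces listed without a path, as in the config.json above, are created through clone flags elsewhere in the same file.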

        if (config.namespaces)
                join_namespaces(config.namespaces);

void join_namespaces(char *nslist)
{
        struct namespace_t {
                int fd;
                int ns;
                char type[PATH_MAX];
                char path[PATH_MAX];
        } *namespaces = NULL;

        /* ... (excerpt: declarations and the parsing of nslist are omitted) ... */

                /* Prepare the namespace: open its path, record the fd and the flag. */
                fd = open(path, O_RDONLY);
                if (fd < 0)
                        bail("failed to open %s", path);

                ns->fd = fd;
                ns->ns = nsflag(namespace);
                strncpy(ns->path, path, PATH_MAX - 1);

        /* ... */

        for (i = 0; i < num; i++) {
                struct namespace_t ns = namespaces[i];

                /* Enter the namespace with setns. */
                if (setns(ns.fd, ns.ns) < 0)
                        bail("failed to setns to %s", ns.path);

                close(ns.fd);
        }

        free(namespaces);
}

The code deserves a closer reading; this is only a rough outline of the process.
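In the excerpt, nsflag(namespace) turns a namespace name into the kernel flag that setns expects. The sketch below illustrates the same idea in Go for the type names used in config.json. It is not runc's implementation: the constants are the CLONE_NEW* values from the Linux headers, and the names nsFlags and cloneFlags are hypothetical.

```go
package main

import "fmt"

// CLONE_NEW* flag values as defined in the Linux <sched.h> header.
const (
	cloneNewNS   = 0x00020000 // mount namespace
	cloneNewUTS  = 0x04000000
	cloneNewIPC  = 0x08000000
	cloneNewUser = 0x10000000
	cloneNewPID  = 0x20000000
	cloneNewNet  = 0x40000000
)

// nsFlags maps config.json namespace types to kernel flags (illustrative).
var nsFlags = map[string]uint32{
	"mount":   cloneNewNS,
	"uts":     cloneNewUTS,
	"ipc":     cloneNewIPC,
	"user":    cloneNewUser,
	"pid":     cloneNewPID,
	"network": cloneNewNet,
}

// cloneFlags ORs together the flags for the given namespace types.
func cloneFlags(types []string) (uint32, error) {
	var flags uint32
	for _, t := range types {
		f, ok := nsFlags[t]
		if !ok {
			return 0, fmt.Errorf("unknown namespace type %q", t)
		}
		flags |= f
	}
	return flags, nil
}

func main() {
	flags, err := cloneFlags([]string{"mount", "network", "uts", "pid", "ipc"})
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("clone flags: %#x\n", flags)
	fmt.Println("user namespace requested:", flags&cloneNewUser != 0)
}
```

For the five default namespaces the user bit stays clear, matching the observation that the container does not get its own user namespace.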

 

Since root inside a Docker container is effectively "map root to root" (the container shares the host's user namespace), it should have superuser privileges. Let's check.

[root@centos opt]# unshare -m --mount-proc -u -i -n -p -U -r -f  /bin/bash
[root@centos opt]# ip link show
1: lo: <LOOPBACK> mtu 65536 qdisc noop state DOWN mode DEFAULT
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
[root@centos opt]# ip link set dev lo down
[root@centos opt]# ip link set dev lo up

[root@centos opt]# docker run -ti centos /bin/bash
[root@f45f03e236ec /]# ip link set lo down
RTNETLINK answers: Operation not permitted

Why does the Docker container report "RTNETLINK answers: Operation not permitted" when changing the state of the loopback interface?

[root@f45f03e236ec /]# mount -t tmpfs -o size=20m tmpfs /tmp1
mount: permission denied       // mount is not allowed either

So how are this superuser's privileges being restricted?

 

To answer that, we need to talk about Linux capabilities.
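As a small preview, the defaultCaps list quoted earlier already hints at the answer. The sketch below builds the capability bitmask for that list and checks two capabilities that are not in it: CAP_NET_ADMIN, which the kernel requires for changing interface state, and CAP_SYS_ADMIN, which mount requires. It assumes the bit numbers from the Linux capability header; capMask and hasCap are hypothetical helper names.

```go
package main

import "fmt"

// Capability bit numbers as defined in <linux/capability.h>.
var capBits = map[string]uint{
	"CAP_CHOWN":            0,
	"CAP_DAC_OVERRIDE":     1,
	"CAP_FOWNER":           3,
	"CAP_FSETID":           4,
	"CAP_KILL":             5,
	"CAP_SETGID":           6,
	"CAP_SETUID":           7,
	"CAP_SETPCAP":          8,
	"CAP_NET_BIND_SERVICE": 10,
	"CAP_NET_ADMIN":        12,
	"CAP_NET_RAW":          13,
	"CAP_SYS_CHROOT":       18,
	"CAP_SYS_ADMIN":        21,
	"CAP_MKNOD":            27,
	"CAP_AUDIT_WRITE":      29,
	"CAP_SETFCAP":          31,
}

// defaultCaps repeats the list returned by defaultCaps() in spec_unix.go.
var defaultCaps = []string{
	"CAP_CHOWN", "CAP_DAC_OVERRIDE", "CAP_FSETID", "CAP_FOWNER",
	"CAP_MKNOD", "CAP_NET_RAW", "CAP_SETGID", "CAP_SETUID",
	"CAP_SETFCAP", "CAP_SETPCAP", "CAP_NET_BIND_SERVICE",
	"CAP_SYS_CHROOT", "CAP_KILL", "CAP_AUDIT_WRITE",
}

// capMask ORs together the bits of the named capabilities.
// Names missing from capBits are ignored.
func capMask(names []string) uint64 {
	var mask uint64
	for _, n := range names {
		if bit, ok := capBits[n]; ok {
			mask |= 1 << bit
		}
	}
	return mask
}

// hasCap reports whether the named capability is set in mask.
func hasCap(mask uint64, name string) bool {
	bit, ok := capBits[name]
	return ok && mask&(1<<bit) != 0
}

func main() {
	mask := capMask(defaultCaps)
	fmt.Printf("default capability mask: %016x\n", mask)
	fmt.Println("CAP_NET_ADMIN:", hasCap(mask, "CAP_NET_ADMIN"))
	fmt.Println("CAP_SYS_ADMIN:", hasCap(mask, "CAP_SYS_ADMIN"))
}
```

The mask can be compared with the CapEff line in /proc/self/status inside a container. Both checks print false for the default list, which lines up with the two "not permitted" errors above.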
