首先声明pipework这个工具的下载地址:https://pan.baidu.com/s/1aQl1a6h2PF5zXJRcrdjLPA 提取码为:pipe。
首先我们需要弄清楚docker的独立IP之意义-----跨宿主机容器之间的组网。通常,容器之间如果在同一个宿主机,那么,容器之间的通信是没有什么问题的,但如果是不同的宿主机的容器,那么通信是非常困难的。
那么,这不得不提到docker的四种网络模式。Docker有以下4种网络模式:
a,host模式
众所周知,Docker使用了Linux的Namespaces技术来进行资源隔离,如PID Namespace隔离进程,Mount Namespace隔离文件系统,Network Namespace隔离网络等。一个Network Namespace提供了一份独立的网络环境,包括网卡、路由、Iptable规则等都与其他的Network Namespace隔离。一个Docker容器一般会分配一个独立的Network Namespace。但如果启动容器的时候使用host模式,那么这个容器将不会获得一个独立的Network Namespace,而是和宿主机共用一个Network Namespace。容器将不会虚拟出自己的网卡,配置自己的IP等,而是使用宿主机的IP和端口。
因此,一个宿主机只能启动一个镜像的一个容器,并且启动容器的时候不需要暴露端口,也就是不需要-p 选项了。如果通过一个相同的镜像启动bridge模式的容器将不会成功。如下所示:
[root@centos7 ~]# docker run -itd -e MYSQL_ROOT_PASSWORD=123456 --name mysql --net=host hub.c.163.com/library/mysql
c27aa617a54ca517db4c5b209bea0e99d367b9499daeb3ce130d742db8ce4c00
[root@centos7 ~]# docker run -itd -p 3306:3306 -e MYSQL_ROOT_PASSWORD=123456 --name mysql1 hub.c.163.com/library/mysql
2460ac141e8c38e4991fbdc6cf00fb39a190bdefc5b944c06b241aa1ece0f768
docker: Error response from daemon: driver failed programming external connectivity on endpoint mysql1 (b43b91b680f0311b937ed65d4bd65976fa79794b3d3c5a939f9bbe4ba33c9875): Error starting userland proxy: listen tcp 0.0.0.0:3306: bind: address already in use.
当然,你有别的不一样端口的镜像,是可以启动的。我想,可以这么说host,它是一个独占模式,毕竟宿主机端口就那么一个独一无二的。
b,container模式,
这个模式指定新创建的容器和已经存在的一个容器共享一个Network Namespace,而不是和宿主机共享。新创建的容器不会创建自己的网卡,配置自己的IP,而是和一个指定的容器共享IP、端口范围等。同样,两个容器除了网络方面,其他的如文件系统、进程列表等还是隔离的。两个容器的进程可以通过lo网卡设备通信。
也就是说,使用这个模式必须有父容器已经启动,然后子容器继承父容器的IP和端口,当然,我更倾向于认为这个是继承模式。
c,none模式
无网络模式,这种模式所启动的容器将没有IP,路由,dns,但端口是暴露的,这个模式将是pipework等等更高阶的网络配置前提。
d,bridge模式
常用的也是默认的模式是bridge模式,因此,在我们日常的使用以及管理docker容器时这一模式通常不需要指定,都默认啦。此模式会为每一个容器分配Network Namespace、设置IP等,并将一个主机上的Docker容器连接到一个虚拟网桥上。这一模式是容器网络VLAN的基础。但,这不在本文的讨论范围内。
下面大略的分析一下pipework的源码,源码内容如下:
#!/bin/sh
# This code should (try to) follow Google's Shell Style Guide
# (https://google.github.io/styleguide/shell.xml)
set -e
case "$1" in
--wait)
WAIT=1
;;
--direct-phys)
DIRECT_PHYS=1
shift
;;
esac
IFNAME=$1
# default value set further down if not set here
CONTAINER_IFNAME=
if [ "$2" = "-i" ]; then
CONTAINER_IFNAME=$3
shift 2
fi
if [ "$2" = "-l" ]; then
LOCAL_IFNAME=$3
shift 2
fi
#inet or inet6
FAMILY_FLAG="-4"
if [ "$2" = "-a" ]; then
FAMILY_FLAG="-$3"
shift 2
fi
GUESTNAME=$2
IPADDR=$3
MACADDR=$4
case "$MACADDR" in
*@*)
VLAN="${MACADDR#*@}"
VLAN="${VLAN%%@*}"
MACADDR="${MACADDR%%@*}"
;;
*)
VLAN=
;;
esac
# did they ask to generate a custom MACADDR?
# generate the unique string
case "$MACADDR" in
U:*)
macunique="${MACADDR#*:}"
# now generate a 48-bit hash string from $macunique
MACADDR=$(echo $macunique|md5sum|sed 's/^\(..\)\(..\)\(..\)\(..\)\(..\).*$/02:\1:\2:\3:\4:\5/')
;;
esac
[ "$IPADDR" ] || [ "$WAIT" ] || {
echo "Syntax:"
echo "pipework <hostinterface> [-i containerinterface] [-l localinterfacename] [-a addressfamily] <guest> <ipaddr>/<subnet>[@default_gateway] [macaddr][@vlan]"
echo "pipework <hostinterface> [-i containerinterface] [-l localinterfacename] <guest> dhcp [macaddr][@vlan]"
echo "pipework mac:<hostinterface_macaddress> [-i containerinterface] [-l localinterfacename] [-a addressfamily] <guest> <ipaddr>/<subnet>[@default_gateway] [macaddr][@vlan]"
echo "pipework mac:<hostinterface_macaddress> [-i containerinterface] [-l localinterfacename] <guest> dhcp [macaddr][@vlan]"
echo "pipework route <guest> <route_command>"
echo "pipework rule <guest> <rule_command>"
echo "pipework tc <guest> <tc_command>"
echo "pipework --wait [-i containerinterface]"
exit 1
}
# Succeed if the given utility is installed. Fail otherwise.
# For explanations about `which` vs `type` vs `command`, see:
# http://stackoverflow.com/questions/592620/check-if-a-program-exists-from-a-bash-script/677212#677212
# (Thanks to @chenhanxiao for pointing this out!)
installed () {
command -v "$1" >/dev/null 2>&1
}
# Google Styleguide says error messages should go to standard error.
warn () {
echo "$@" >&2
}
die () {
status="$1"
shift
warn "$@"
exit "$status"
}
if echo $IFNAME | grep -q '^mac:'; then
mac_address=$(echo $IFNAME | cut -c5-)
ifmatch=
for iftest in /sys/class/net/*; do
if [ -f $iftest/address ] && [ "$mac_address" = "$(cat $iftest/address)" ]; then
ifmatch="$(basename $iftest)"
break
fi
done
if [ -z "$ifmatch" ]; then
die 1 "Mac address $mac_address specified for interface but no host interface matched."
else
IFNAME=$ifmatch
fi
fi
# First step: determine type of first argument (bridge, physical interface...),
# Unless "--wait" is set (then skip the whole section)
if [ -z "$WAIT" ]; then
if [ -d "/sys/class/net/$IFNAME" ]
then
if [ -d "/sys/class/net/$IFNAME/bridge" ]; then
IFTYPE=bridge
BRTYPE=linux
elif installed ovs-vsctl && ovs-vsctl list-br|grep -q "^${IFNAME}$"; then
IFTYPE=bridge
BRTYPE=openvswitch
elif [ "$(cat "/sys/class/net/$IFNAME/type")" -eq 32 ]; then # InfiniBand IPoIB interface type 32
IFTYPE=ipoib
# The IPoIB kernel module is fussy, set device name to ib0 if not overridden
CONTAINER_IFNAME=${CONTAINER_IFNAME:-ib0}
PKEY=$VLAN
else IFTYPE=phys
fi
else
case "$IFNAME" in
br*)
IFTYPE=bridge
BRTYPE=linux
;;
ovs*)
if ! installed ovs-vsctl; then
die 1 "Need OVS installed on the system to create an ovs bridge"
fi
IFTYPE=bridge
BRTYPE=openvswitch
;;
route*)
IFTYPE=route
;;
rule*)
IFTYPE=rule
;;
tc*)
IFTYPE=tc
;;
dummy*)
IFTYPE=dummy
;;
*) die 1 "I do not know how to setup interface $IFNAME." ;;
esac
fi
fi
# Set the default container interface name to eth1 if not already set
CONTAINER_IFNAME=${CONTAINER_IFNAME:-eth1}
[ "$WAIT" ] && {
while true; do
# This first method works even without `ip` or `ifconfig` installed,
# but doesn't work on older kernels (e.g. CentOS 6.X). See #128.
grep -q '^1$' "/sys/class/net/$CONTAINER_IFNAME/carrier" && break
# This method hopefully works on those older kernels.
ip link ls dev "$CONTAINER_IFNAME" && break
sleep 1
done > /dev/null 2>&1
exit 0
}
[ "$IFTYPE" = bridge ] && [ "$BRTYPE" = linux ] && [ "$VLAN" ] && {
die 1 "VLAN configuration currently unsupported for Linux bridge."
}
[ "$IFTYPE" = ipoib ] && [ "$MACADDR" ] && {
die 1 "MACADDR configuration unsupported for IPoIB interfaces."
}
# Second step: find the guest (for now, we only support LXC containers)
while read _ mnt fstype options _; do
[ "$fstype" != "cgroup" ] && continue
echo "$options" | grep -qw devices || continue
CGROUPMNT=$mnt
done < /proc/mounts
[ "$CGROUPMNT" ] || {
die 1 "Could not locate cgroup mount point."
}
# Try to find a cgroup matching exactly the provided name.
M=$(find "$CGROUPMNT" -name "$GUESTNAME")
N=$(echo "$M" | wc -l)
if [ -z "$M" ] ; then
N=0
fi
case "$N" in
0)
# If we didn't find anything, try to lookup the container with Docker.
if installed docker; then
RETRIES=3
while [ "$RETRIES" -gt 0 ]; do
DOCKERPID=$(docker inspect --format='{{ .State.Pid }}' "$GUESTNAME")
DOCKERCID=$(docker inspect --format='{{ .ID }}' "$GUESTNAME")
DOCKERCNAME=$(docker inspect --format='{{ .Name }}' "$GUESTNAME")
[ "$DOCKERPID" != 0 ] && break
sleep 1
RETRIES=$((RETRIES - 1))
done
[ "$DOCKERPID" = 0 ] && {
die 1 "Docker inspect returned invalid PID 0"
}
[ "$DOCKERPID" = "<no value>" ] && {
die 1 "Container $GUESTNAME not found, and unknown to Docker."
}
else
die 1 "Container $GUESTNAME not found, and Docker not installed."
fi
;;
1) true ;;
2) # LXC >=3.1.0 returns two entries from the cgroups mount instead of one.
echo "$M" | grep -q "lxc\.monitor" || die 1 "Found more than one container matching $GUESTNAME."
;;
*) die 1 "Found more than one container matching $GUESTNAME." ;;
esac
# only check IPADDR if we are not in a route mode
[ "$IFTYPE" != route ] && [ "$IFTYPE" != rule ] && [ "$IFTYPE" != tc ] && {
case "$IPADDR" in
# Let's check first if the user asked for DHCP allocation.
dhcp|dhcp:*)
# Use Docker-specific strategy to run the DHCP client
# from the busybox image, in the network namespace of
# the container.
if ! [ "$DOCKERPID" ]; then
warn "You asked for a Docker-specific DHCP method."
warn "However, $GUESTNAME doesn't seem to be a Docker container."
warn "Try to replace 'dhcp' with another option?"
die 1 "Aborting."
fi
DHCP_CLIENT=${IPADDR%%:*}
;;
udhcpc|udhcpc:*|udhcpc-f|udhcpc-f:*|dhcpcd|dhcpcd:*|dhclient|dhclient:*|dhclient-f|dhclient-f:*)
DHCP_CLIENT=${IPADDR%%:*}
# did they ask for the client to remain?
DHCP_FOREGROUND=
[ "${DHCP_CLIENT%-f}" != "${DHCP_CLIENT}" ] && {
DHCP_FOREGROUND=true
}
DHCP_CLIENT=${DHCP_CLIENT%-f}
if ! installed "$DHCP_CLIENT"; then
die 1 "You asked for DHCP client $DHCP_CLIENT, but I can't find it."
fi
;;
# Alright, no DHCP? Then let's see if we have a subnet *and* gateway.
*/*@*)
GATEWAY="${IPADDR#*@}" GATEWAY="${GATEWAY%%@*}"
IPADDR="${IPADDR%%@*}"
;;
# No gateway? We need at least a subnet, anyway!
*/*) : ;;
# ... No? Then stop right here.
*)
warn "The IP address should include a netmask."
die 1 "Maybe you meant $IPADDR/24 ?"
;;
esac
}
# If a DHCP method was specified, extract the DHCP options.
if [ "$DHCP_CLIENT" ]; then
case "$IPADDR" in
*:*) DHCP_OPTIONS="${IPADDR#*:}" ;;
esac
fi
if [ "$DOCKERPID" ]; then
NSPID=$DOCKERPID
else
NSPID=$(head -n 1 "$(find "$CGROUPMNT" -name "$GUESTNAME" | head -n 1)/tasks")
[ "$NSPID" ] || {
# it is an alternative way to get the pid
NSPID=$(lxc-info -n "$GUESTNAME" | grep PID | grep -Eo '[0-9]+')
[ "$NSPID" ] || {
die 1 "Could not find a process inside container $GUESTNAME."
}
}
fi
# Check if an incompatible VLAN device already exists
[ "$IFTYPE" = phys ] && [ "$VLAN" ] && [ -d "/sys/class/net/$IFNAME.VLAN" ] && {
ip -d link show "$IFNAME.$VLAN" | grep -q "vlan.*id $VLAN" || {
die 1 "$IFNAME.VLAN already exists but is not a VLAN device for tag $VLAN"
}
}
[ ! -d /var/run/netns ] && mkdir -p /var/run/netns
rm -f "/var/run/netns/$NSPID"
ln -s "/proc/$NSPID/ns/net" "/var/run/netns/$NSPID"
# Check if we need to create a bridge.
[ "$IFTYPE" = bridge ] && [ ! -d "/sys/class/net/$IFNAME" ] && {
[ "$BRTYPE" = linux ] && {
(ip link add dev "$IFNAME" type bridge > /dev/null 2>&1) || (brctl addbr "$IFNAME")
ip link set "$IFNAME" up
}
[ "$BRTYPE" = openvswitch ] && {
ovs-vsctl add-br "$IFNAME"
}
}
[ "$IFTYPE" != "route" ] && [ "$IFTYPE" != "dummy" ] && [ "$IFTYPE" != "rule" ] && [ "$IFTYPE" != "tc" ] && MTU=$(ip link show "$IFNAME" | awk '{print $5}')
# If it's a bridge, we need to create a veth pair
[ "$IFTYPE" = bridge ] && {
if [ -z "$LOCAL_IFNAME" ]; then
LOCAL_IFNAME="v${CONTAINER_IFNAME}pl${NSPID}"
fi
GUEST_IFNAME="v${CONTAINER_IFNAME}pg${NSPID}"
# Does the link already exist?
if ip link show "$LOCAL_IFNAME" >/dev/null 2>&1; then
# link exists, is it in use?
if ip link show "$LOCAL_IFNAME" up | grep -q "UP"; then
echo "Link $LOCAL_IFNAME exists and is up"
exit 1
fi
# delete the link so we can re-add it afterwards
ip link del "$LOCAL_IFNAME"
fi
ip link add name "$LOCAL_IFNAME" mtu "$MTU" type veth peer name "$GUEST_IFNAME" mtu "$MTU"
case "$BRTYPE" in
linux)
(ip link set "$LOCAL_IFNAME" master "$IFNAME" > /dev/null 2>&1) || (brctl addif "$IFNAME" "$LOCAL_IFNAME")
;;
openvswitch)
if ! ovs-vsctl list-ports "$IFNAME" | grep -q "^${LOCAL_IFNAME}$"; then
ovs-vsctl add-port "$IFNAME" "$LOCAL_IFNAME" ${VLAN:+tag="$VLAN"} \
-- set Interface "$LOCAL_IFNAME" \
external-ids:pipework.interface="$LOCAL_IFNAME" \
external-ids:pipework.bridge="$IFNAME" \
${DOCKERCID:+external-ids:pipework.containerid="$DOCKERCID"} \
${DOCKERCNAME:+external-ids:pipework.containername="$DOCKERCNAME"} \
${NSPID:+external-ids:pipework.nspid="$NSPID"} \
${VLAN:+external-ids:pipework.vlan="$VLAN"}
fi
;;
esac
ip link set "$LOCAL_IFNAME" up
}
# If it's a physical interface, create a macvlan subinterface
[ "$IFTYPE" = phys ] && {
[ "$VLAN" ] && {
[ ! -d "/sys/class/net/${IFNAME}.${VLAN}" ] && {
ip link add link "$IFNAME" name "$IFNAME.$VLAN" mtu "$MTU" type vlan id "$VLAN"
}
ip link set "$IFNAME" up
IFNAME=$IFNAME.$VLAN
}
if [ ! -z "$DIRECT_PHYS" ]; then
GUEST_IFNAME=$IFNAME
else
GUEST_IFNAME=ph$NSPID$CONTAINER_IFNAME
ip link add link "$IFNAME" dev "$GUEST_IFNAME" mtu "$MTU" type macvlan mode bridge
fi
ip link set "$IFNAME" up
}
# If it's an IPoIB interface, create a virtual IPoIB interface (the IPoIB
# equivalent of a macvlan device)
#
# Note: no macvlan subinterface nor Ethernet bridge can be created on top of an
# IPoIB interface. InfiniBand is not Ethernet. IPoIB is an IP layer on top of
# InfiniBand, without an intermediate Ethernet layer.
[ "$IFTYPE" = ipoib ] && {
GUEST_IFNAME="${IFNAME}.${NSPID}"
# If a partition key is provided, use it
[ "$PKEY" ] && {
GUEST_IFNAME="${IFNAME}.${PKEY}.${NSPID}"
PKEY="pkey 0x$PKEY"
}
ip link add link "$IFNAME" name "$GUEST_IFNAME" type ipoib $PKEY
ip link set "$IFNAME" up
}
# If its a dummy interface, create a dummy interface.
[ "$IFTYPE" = dummy ] && {
GUEST_IFNAME=du$NSPID$CONTAINER_IFNAME
ip link add dev "$GUEST_IFNAME" type dummy
}
# If the `route` command was specified ...
if [ "$IFTYPE" = route ]; then
# ... discard the first two arguments and pass the rest to the route command.
shift 2
ip netns exec "$NSPID" ip route "$@"
elif [ "$IFTYPE" = rule ] ; then
shift 2
ip netns exec "$NSPID" ip rule "$@"
elif [ "$IFTYPE" = tc ] ; then
shift 2
ip netns exec "$NSPID" tc "$@"
else
# Otherwise, run normally.
ip link set "$GUEST_IFNAME" netns "$NSPID"
ip netns exec "$NSPID" ip link set "$GUEST_IFNAME" name "$CONTAINER_IFNAME"
[ "$MACADDR" ] && ip netns exec "$NSPID" ip link set dev "$CONTAINER_IFNAME" address "$MACADDR"
# When using any of the DHCP methods, we start a DHCP client in the
# network namespace of the container. With the 'dhcp' method, the
# client used is taken from the Docker busybox image (therefore
# requiring no specific client installed on the host). Other methods
# use a locally installed client.
case "$DHCP_CLIENT" in
dhcp)
docker run -d --net container:$GUESTNAME --cap-add NET_ADMIN \
busybox udhcpc -i "$CONTAINER_IFNAME" -x "hostname:$GUESTNAME" \
$DHCP_OPTIONS \
>/dev/null
;;
udhcpc)
DHCP_Q="-q"
[ "$DHCP_FOREGROUND" ] && {
DHCP_OPTIONS="$DHCP_OPTIONS -f"
}
ip netns exec "$NSPID" "$DHCP_CLIENT" -qi "$CONTAINER_IFNAME" \
-x "hostname:$GUESTNAME" \
-p "/var/run/udhcpc.$GUESTNAME.pid" \
$DHCP_OPTIONS
[ ! "$DHCP_FOREGROUND" ] && {
rm "/var/run/udhcpc.$GUESTNAME.pid"
}
;;
dhclient)
ip netns exec "$NSPID" "$DHCP_CLIENT" "$CONTAINER_IFNAME" \
-pf "/var/run/dhclient.$GUESTNAME.pid" \
-lf "/etc/dhclient/dhclient.$GUESTNAME.leases" \
$DHCP_OPTIONS
# kill dhclient after get ip address to prevent device be used after container close
[ ! "$DHCP_FOREGROUND" ] && {
kill "$(cat "/var/run/dhclient.$GUESTNAME.pid")"
rm "/var/run/dhclient.$GUESTNAME.pid"
}
;;
dhcpcd)
ip netns exec "$NSPID" "$DHCP_CLIENT" -q "$CONTAINER_IFNAME" -h "$GUESTNAME"
;;
"")
if installed ipcalc; then
eval $(ipcalc -b $IPADDR)
ip netns exec "$NSPID" ip "$FAMILY_FLAG" addr add "$IPADDR" brd "$BROADCAST" dev "$CONTAINER_IFNAME"
else
ip netns exec "$NSPID" ip "$FAMILY_FLAG" addr add "$IPADDR" dev "$CONTAINER_IFNAME"
fi
[ "$GATEWAY" ] && {
ip netns exec "$NSPID" ip "$FAMILY_FLAG" route delete default >/dev/null 2>&1 && true
}
ip netns exec "$NSPID" ip "$FAMILY_FLAG" link set "$CONTAINER_IFNAME" up
[ "$GATEWAY" ] && {
ip netns exec "$NSPID" ip "$FAMILY_FLAG" route get "$GATEWAY" >/dev/null 2>&1 || \
ip netns exec "$NSPID" ip "$FAMILY_FLAG" route add "$GATEWAY/32" dev "$CONTAINER_IFNAME"
ip netns exec "$NSPID" ip "$FAMILY_FLAG" route replace default via "$GATEWAY" dev "$CONTAINER_IFNAME"
}
;;
esac
# Give our ARP neighbors a nudge about the new interface
if installed arping; then
IPADDR=$(echo "$IPADDR" | cut -d/ -f1)
ip netns exec "$NSPID" arping -c 1 -A -I "$CONTAINER_IFNAME" "$IPADDR" > /dev/null 2>&1 || true
else
echo "Warning: arping not found; interface may not be immediately reachable"
fi
fi
# Remove NSPID to avoid `ip netns` catch it.
rm -f "/var/run/netns/$NSPID"
# vim: set tabstop=2 shiftwidth=2 softtabstop=2 expandtab :
该源码为489行,不得不说专业人士写的脚本确实很专业,注释通俗易懂,那么就从第一行源代码开始吧!!!!!!!!!
set -e :
set -e : 执行的时候如果出现了返回值为非零,整个脚本 就会立即退出 ,作用范围只限于脚本执行的当前进行,不作用于其创建的子进程,这两点很重要!~~可以圈起来,要考。
set +e: 执行的时候如果出现了返回值为非零将会继续执行下面的脚本
顺带提一下 set -x ,这个是脚本的调试模式。也就是set -x会在执行每一行 shell 脚本时,把执行的内容输出来。它可以让你看到当前执行的情况,里面涉及的变量也会被替换成实际的值。默认是set +x 也就是调试模式关闭是默认的,如果你本来就不想调试,那么就不用写任何set,或者你写一个废话 set +x。
在本脚本中,使用的是sed -e ,其实也就是保证脚本的健壮性,仅此而已。因为,我们这个脚本是配置类脚本(配置容器的网络,对吧),如果前面都错了,那么后面没有必须继续进行下去的。
该脚本的7到40行,是定义脚本所使用的参数,--wait,--direct-phys,-i,-l ,-a 这么五个参数,并且根据你所使用参数进行了参数漂移,
该脚本的55行到60行,是生成容器内的网卡mac
未完待续!~~~~~~~
,