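Multi-Cluster Networking: Integrating Submariner
Introduction
In cloud and container environments, organizations often run several Kubernetes clusters to meet different business needs. Production and test workloads may live in separate clusters so that testing cannot affect production, or data centers in different regions may each run their own cluster to improve processing efficiency and response times.
These clusters usually still need to talk to each other to share resources and call each other's services. A microservice application may spread its services across clusters that must call one another, or data may need to be synchronized and backed up between data centers in different regions.
Submariner addresses this cross-cluster networking problem. It lets Pods and Services in different clusters communicate as if they were in the same cluster. The rest of this article walks through installing and configuring Submariner.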
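Installation
Prerequisites
Make sure the following prerequisites are met before installing Submariner; each one is required for Submariner to work correctly.
- At least two Kubernetes clusters: each cluster has at least one node, and the clusters can reach each other over the network. Submariner builds its tunnels between clusters, so without connectivity between them it cannot work.
- Pod CIDRs and Service CIDRs should not overlap between clusters; otherwise Globalnet is required. These CIDRs are the ranges from which each cluster allocates Pod and Service IP addresses, so overlapping ranges lead to address conflicts that break cross-cluster routing.
- A network plugin (CNI) is already installed in each cluster, for example Calico (used in this article) or Flannel. The plugin handles traffic inside the cluster, and Submariner builds on it for cross-cluster traffic. With Calico, two further conditions apply:
  - Install the Calico API Server: the ippools.projectcalico.org/v3 resource is needed rather than ippools.crd.projectcalico.org/v1, and the v3 API is served by the Calico API Server.
  - Change the Calico encapsulation mode to VXLAN (the default is IPIP).
- Set the kube-proxy mode to iptables if it is currently ipvs.
- Disable nodelocaldns: Submariner does not support it. nodelocaldns is a node-local DNS cache for Kubernetes; because of a defect in how Submariner modifies the nodelocaldns configuration file, disabling it avoids the problem.
- Add a label: pick one node in each cluster and label it with submariner.io/gateway=true. Submariner uses the labeled nodes as Gateway nodes, which forward traffic between clusters.
- Add an annotation (only when the private networks cannot reach each other): if the two clusters cannot reach each other over their private networks, annotate the Gateway nodes with their public IP: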
kubectl annotate node --all gateway.submariner.io/public-ip=ipv4:<public-ip>
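(Replace <public-ip> with the actual public IP of the Gateway node. Note that --all applies the annotation to every node in the cluster; to annotate only the Gateway node, pass its name instead of --all.)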
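The non-overlap requirement for Pod and Service CIDRs can be checked before installing. The following is a minimal sketch in bash, assuming IPv4 CIDRs written as a.b.c.d/len. The helper names ip_to_int and cidrs_overlap are made up for illustration and are not part of Submariner.

```bash
#!/usr/bin/env bash

# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  local IFS=.
  local -a o
  read -r -a o <<< "$1"
  echo $(( (o[0] << 24) | (o[1] << 16) | (o[2] << 8) | o[3] ))
}

# Return 0 (true) when two IPv4 CIDRs share at least one address.
cidrs_overlap() {
  local a_ip=${1%/*} a_len=${1#*/} b_ip=${2%/*} b_len=${2#*/}
  # Two CIDR blocks overlap exactly when they agree on the shorter prefix.
  local len=$(( a_len < b_len ? a_len : b_len ))
  local mask=$(( len == 0 ? 0 : (0xFFFFFFFF << (32 - len)) & 0xFFFFFFFF ))
  (( ( $(ip_to_int "$a_ip") & mask ) == ( $(ip_to_int "$b_ip") & mask ) ))
}
```

For example, cidrs_overlap 10.96.0.0/22 10.96.4.0/22 returns a non-zero status because the two ranges share no addresses, while cidrs_overlap 10.244.0.0/16 10.244.8.0/24 returns 0 because the second range lies inside the first.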
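Planning
Before installing, plan the clusters: record each cluster's kubeconfig (context) path, Gateway node, role and public IP so that the later steps go smoothly. The plan used in this article is:
Cluster | Context path | Node | Role | Public IP |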
---|---|---|---|---|
cluster | ~/.kube/config | node1 | broker && operator | 10.53.23.11 |
cluster1 | ~/.kube/config-1 | node2 | operator | 10.54.10.7 |
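Note: unless stated otherwise, the following commands are run on node1 of the cluster named cluster, and the Submariner version is v0.20.0.
Install submariner-broker (Helm)
With the plan in place, install submariner-broker first.
The Broker is one of Submariner's core components: it is the shared API through which the participating clusters exchange the cluster and endpoint information they need to connect to one another.
Installing with Helm makes the Submariner components easy to deploy and manage. The installation commands are: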
# add repo
helm repo add submariner-latest https://submariner-io.github.io/submariner-charts/charts
# export env
export BROKER_NS=submariner-k8s-broker
# install
helm install "${BROKER_NS}" submariner-latest/submariner-k8s-broker \
--create-namespace \
--namespace "${BROKER_NS}" \
--version 0.20.0
Install submariner-operator (Helm)
After the Broker, install submariner-operator.
The operator deploys and monitors the Submariner components and keeps them running. It has to be installed on both clusters, cluster and cluster1.
Install on cluster:
# cluster
export BROKER_NS=submariner-k8s-broker
export SUBMARINER_NS=submariner-operator
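# PSK: generate a random value once or pin a fixed one; every cluster must use the same PSK
export SUBMARINER_PSK=$(LC_CTYPE=C tr -dc 'a-zA-Z0-9' < /dev/urandom | fold -w 64 | head -n 1)
# the command above does not work on macOS; a fixed value can be used instead (example value, replace with your own)
export SUBMARINER_PSK='yxywUMWl85AHqVi0aoVbzPlLEiBb2EnLmZNCF5HxqNHbT44PPnSmOpTHpqyR5nN9'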
# broker param
export KUBECONFIG=~/.kube/config
# address (ip:port) used to reach the Broker cluster's API server
export SUBMARINER_BROKER_URL=$(kubectl -n default get endpoints kubernetes -o jsonpath="{.subsets[0].addresses[0].ip}:{.subsets[0].ports[?(@.name=='https')].port}")
# CA certificate of the Broker cluster's API server (kept base64-encoded)
export SUBMARINER_BROKER_CA=$(kubectl -n "${BROKER_NS}" get secrets submariner-k8s-broker-client-token -o jsonpath="{.data['ca\.crt']}")
# token used to authenticate to the Broker cluster's API server
export SUBMARINER_BROKER_TOKEN=$(kubectl -n "${BROKER_NS}" get secrets submariner-k8s-broker-client-token -o jsonpath="{.data.token}"|base64 --decode)
export KUBECONFIG=~/.kube/config
# set cluster id
export CLUSTER_ID=cluster
# get current cluster pod cidr and service cidr
export CLUSTER_CIDR=$(kubectl -n kube-system get configmap kubeadm-config -o jsonpath='{.data.ClusterConfiguration}' | grep podSubnet | awk '{print $2}')
export SERVICE_CIDR=$(kubectl -n kube-system get configmap kubeadm-config -o jsonpath='{.data.ClusterConfiguration}' | grep serviceSubnet | awk '{print $2}')
# install: make sure every variable holds the correct value before running this
helm install submariner-operator submariner-latest/submariner-operator \
--version 0.20.0 \
--create-namespace \
--namespace "${SUBMARINER_NS}" \
--set ipsec.psk="${SUBMARINER_PSK}" \
--set broker.server="${SUBMARINER_BROKER_URL}" \
--set broker.token="${SUBMARINER_BROKER_TOKEN}" \
--set broker.namespace="${BROKER_NS}" \
--set broker.ca="${SUBMARINER_BROKER_CA}" \
--set broker.insecure=true \
--set submariner.clusterId="${CLUSTER_ID}" \
--set submariner.clusterCidr="${CLUSTER_CIDR}" \
--set submariner.serviceCidr="${SERVICE_CIDR}" \
--set submariner.natEnabled="true" \
--set submariner.images.repository="swr.cn-east-3.myhuaweicloud.com/lomtom-common" \
--set operator.image.repository="swr.cn-east-3.myhuaweicloud.com/lomtom-common/submariner-operator"
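Both install blocks fall back to one fixed PSK because the tr-based generator does not run on macOS, and the two clusters end up sharing that value. As a hedged alternative, the sketch below generates a random 64-character key using only od and tr, which are available on both Linux and macOS. The function name generate_psk is hypothetical. The output is lowercase hex rather than mixed-case alphanumerics, which is an assumption that any 64-character alphanumeric string is acceptable as the PSK.

```bash
#!/usr/bin/env bash

# Print a random 64-character PSK (32 random bytes rendered as lowercase hex).
generate_psk() {
  # od prints the bytes as space-separated hex pairs over several lines;
  # tr removes the spaces and newlines so a single 64-character string remains.
  od -An -tx1 -N32 /dev/urandom | tr -d ' \n'
}
```

Run it once, for example SUBMARINER_PSK=$(generate_psk), and export the resulting value unchanged on the second cluster so that both sides use the same key.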
Install on cluster1:
# cluster1
export BROKER_NS=submariner-k8s-broker
export SUBMARINER_NS=submariner-operator
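# PSK: must be the same value that was used on the first cluster
export SUBMARINER_PSK=$(LC_CTYPE=C tr -dc 'a-zA-Z0-9' < /dev/urandom | fold -w 64 | head -n 1)
# the command above does not work on macOS; a fixed value can be used instead (example value, replace with your own)
export SUBMARINER_PSK='yxywUMWl85AHqVi0aoVbzPlLEiBb2EnLmZNCF5HxqNHbT44PPnSmOpTHpqyR5nN9'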
# broker param
export KUBECONFIG=~/.kube/config
# address (ip:port) used to reach the Broker cluster's API server
export SUBMARINER_BROKER_URL=$(kubectl -n default get endpoints kubernetes -o jsonpath="{.subsets[0].addresses[0].ip}:{.subsets[0].ports[?(@.name=='https')].port}")
# CA certificate of the Broker cluster's API server (kept base64-encoded)
export SUBMARINER_BROKER_CA=$(kubectl -n "${BROKER_NS}" get secrets submariner-k8s-broker-client-token -o jsonpath="{.data['ca\.crt']}")
# token used to authenticate to the Broker cluster's API server
export SUBMARINER_BROKER_TOKEN=$(kubectl -n "${BROKER_NS}" get secrets submariner-k8s-broker-client-token -o jsonpath="{.data.token}"|base64 --decode)
export KUBECONFIG=~/.kube/config-1
# set cluster id
export CLUSTER_ID=cluster1
# get current cluster pod cidr and service cidr
export CLUSTER_CIDR=$(kubectl -n kube-system get configmap kubeadm-config -o jsonpath='{.data.ClusterConfiguration}' | grep podSubnet | awk '{print $2}')
export SERVICE_CIDR=$(kubectl -n kube-system get configmap kubeadm-config -o jsonpath='{.data.ClusterConfiguration}' | grep serviceSubnet | awk '{print $2}')
# install: make sure every variable holds the correct value before running this
helm install submariner-operator submariner-latest/submariner-operator \
--version 0.20.0 \
--create-namespace \
--namespace "${SUBMARINER_NS}" \
--set ipsec.psk="${SUBMARINER_PSK}" \
--set broker.server="${SUBMARINER_BROKER_URL}" \
--set broker.token="${SUBMARINER_BROKER_TOKEN}" \
--set broker.namespace="${BROKER_NS}" \
--set broker.ca="${SUBMARINER_BROKER_CA}" \
--set broker.insecure=true \
--set submariner.clusterId="${CLUSTER_ID}" \
--set submariner.clusterCidr="${CLUSTER_CIDR}" \
--set submariner.serviceCidr="${SERVICE_CIDR}" \
--set submariner.natEnabled="true" \
--set submariner.images.repository="swr.cn-east-3.myhuaweicloud.com/lomtom-common" \
--set operator.image.repository="swr.cn-east-3.myhuaweicloud.com/lomtom-common/submariner-operator"
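The install blocks ask you to confirm that every variable holds a correct value before running helm install. A small guard can catch the most common failure, an empty variable caused by a kubectl lookup that returned nothing. This is a minimal sketch; require_vars is a hypothetical helper name, and it only checks that the variables are non-empty, not that their values are valid.

```bash
#!/usr/bin/env bash

# Succeed only if every named variable is set and non-empty.
require_vars() {
  local name missing=0
  for name in "$@"; do
    # ${!name} is bash indirect expansion: the value of the variable named by $name.
    if [ -z "${!name:-}" ]; then
      echo "missing or empty: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

A possible use is to chain it in front of the install: require_vars SUBMARINER_PSK SUBMARINER_BROKER_URL SUBMARINER_BROKER_TOKEN SUBMARINER_BROKER_CA CLUSTER_ID CLUSTER_CIDR SERVICE_CIDR && helm install ...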
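Install the subctl tool
subctl is Submariner's command-line tool and makes managing a Submariner deployment more convenient. Skip this step if you do not need it.
There are two ways to install subctl:
Option 1: automatic install
# Option 1: automatic install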
curl -Ls https://get.submariner.io | VERSION=0.20.0 bash
export PATH=$PATH:~/.local/bin
echo export PATH=\$PATH:~/.local/bin >> ~/.profile
Option 2: manual install
# Option 2: manual install
# amd64
wget https://github.com/submariner-io/releases/releases/download/v0.20.0/subctl-v0.20.0-linux-amd64.tar.gz
tar -xvf subctl-v0.20.0-linux-amd64.tar.gz
cp subctl-v0.20.0/subctl /usr/local/bin/
# arm64
wget https://github.com/submariner-io/releases/releases/download/v0.20.0/subctl-v0.20.0-linux-arm64.tar.gz
tar -xvf subctl-v0.20.0-linux-arm64.tar.gz
cp subctl-v0.20.0/subctl /usr/local/bin/
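The two manual variants differ only in the architecture suffix of the tarball name. The sketch below derives the download URL from the machine architecture, following the URL pattern used in the commands above. The function names subctl_arch and subctl_url are hypothetical, and only the two architectures shown in this article are handled.

```bash
#!/usr/bin/env bash

# Map a `uname -m` value to the architecture suffix used in the release file names.
subctl_arch() {
  case "$1" in
    x86_64|amd64)  echo amd64 ;;
    aarch64|arm64) echo arm64 ;;
    *) echo "unsupported architecture: $1" >&2; return 1 ;;
  esac
}

# Print the release tarball URL for a version and, optionally, an architecture.
# When no architecture is given, the local machine's `uname -m` is used.
subctl_url() {
  local version=$1 arch
  arch=$(subctl_arch "${2:-$(uname -m)}") || return 1
  echo "https://github.com/submariner-io/releases/releases/download/v${version}/subctl-v${version}-linux-${arch}.tar.gz"
}
```

With this in place, wget "$(subctl_url 0.20.0)" fetches the tarball that matches the current machine.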
Verify the installation
After completing the steps above, verify that Submariner is working. There are two things to check: the Broker and the operator.
- Verify the Broker
  - Check that all clusters have joined the Broker cluster
kubectl -n submariner-k8s-broker get clusters.submariner.io
NAME AGE
cluster 35s
cluster1 39s
  - Check that the CRDs have been created
# kubectl get crds | grep -iE 'submariner|multicluster.x-k8s.io'
clusters.submariner.io 2025-05-15T08:09:35Z
endpoints.submariner.io 2025-05-15T08:09:35Z
gateways.submariner.io 2025-05-15T08:09:35Z
serviceexports.multicluster.x-k8s.io 2025-05-15T08:09:35Z
serviceimports.multicluster.x-k8s.io 2025-05-15T08:09:35Z
- Verify the operator
  - Check that the operator pods are running
kubectl get pod -n submariner-operator
NAME READY STATUS RESTARTS AGE
submariner-gateway-p27px 1/1 Running 0 21s
submariner-lighthouse-agent-5b678544d4-prm9k 1/1 Running 0 21s
submariner-lighthouse-coredns-56db555d7b-4m4c2 1/1 Running 0 20s
submariner-lighthouse-coredns-56db555d7b-9k6ld 1/1 Running 0 20s
submariner-metrics-proxy-pp5hd 1/1 Running 0 21s
submariner-operator-785df79474-4k8fc 1/1 Running 0 39s
submariner-routeagent-g9cn5 1/1 Running 0 21s
  - Run subctl show all to check the connections between the clusters. The output looks like this:
# subctl show all
Cluster "kubernetes"
✓ Detecting broker(s)
✓ No brokers found
✓ Showing Connections
GATEWAY CLUSTER REMOTE IP NAT CABLE DRIVER SUBNETS STATUS RTT avg.
node1 cluster1 10.54.10.7 yes libreswan 10.96.4.0/22, 100.128.0.0/10 connected
✓ Showing Endpoints
CLUSTER ENDPOINT IP PUBLIC IP CABLE DRIVER TYPE
cluster 192.168.23.22 10.53.23.11 libreswan local
cluster1 192.168.80.221 10.54.10.7 libreswan remote
✓ Showing Gateways
NODE HA STATUS SUMMARY
node active All connections (1) are established
✓ Showing Network details
Discovered network details via Submariner:
Network plugin: calico
Service CIDRs: [10.96.0.0/22]
Cluster CIDRs: [100.64.0.0/10]
✓ Showing versions
COMPONENT REPOSITORY CONFIGURED RUNNING ARCH
submariner-gateway swr.cn-east-3.myhuaweicloud.com/lomtom-common 0.20.0 release-0.20-f0a5355cabfc amd64
submariner-routeagent swr.cn-east-3.myhuaweicloud.com/lomtom-common 0.20.0 release-0.20-f0a5355cabfc amd64
submariner-metrics-proxy swr.cn-east-3.myhuaweicloud.com/lomtom-common 0.20.0 release-0.20-8fde9372397b amd64
submariner-operator swr.cn-east-3.myhuaweicloud.com/lomtom-common 0.20.0 release-0.20-44970648cf5c amd64
submariner-lighthouse-agent swr.cn-east-3.myhuaweicloud.com/lomtom-common 0.20.0 release-0.20-c9e76a4aee91 amd64
submariner-lighthouse-coredns swr.cn-east-3.myhuaweicloud.com/lomtom-common 0.20.0 release-0.20-c9e76a4aee91 amd64
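Points to check:
- Under Showing Connections, STATUS is connected, which means the tunnel between the clusters is established.
- Under Showing Endpoints, TYPE is local or remote, ENDPOINT IP is the node's IP, and PUBLIC IP is the node's public IP.
- Under Showing Gateways, HA STATUS is active, and the connection count in SUMMARY equals the number of other clusters (all clusters except this one).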
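The Broker check can be scripted so that it fails loudly when a cluster is missing. The sketch below reads the table printed by kubectl get clusters.submariner.io on stdin and compares the NAME column with the expected cluster IDs. all_clusters_joined is a hypothetical helper name, and the sketch assumes the default table output with a single header line.

```bash
#!/usr/bin/env bash

# Read a "NAME AGE" table on stdin and succeed only if every expected
# cluster ID (given as arguments) appears in the NAME column.
all_clusters_joined() {
  local names name
  names=$(awk 'NR > 1 { print $1 }')   # drop the header line, keep the NAME column
  for name in "$@"; do
    # -x matches the whole line, -F treats the name as a literal string
    if ! grep -qxF -- "$name" <<< "$names"; then
      echo "cluster not joined: $name" >&2
      return 1
    fi
  done
}
```

For the plan in this article the call would be: kubectl -n submariner-k8s-broker get clusters.submariner.io | all_clusters_joined cluster cluster1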
Verify cross-cluster service access
- Deploy an nginx service
export KUBECONFIG=~/.kube/config
# create namespace
NAMESPACE=nginx-test
kubectl create namespace $NAMESPACE
# create deployment and service
kubectl -n $NAMESPACE create deployment nginx --image=swr.cn-east-3.myhuaweicloud.com/lomtom-common/nginx-unprivileged:stable-alpine
kubectl -n $NAMESPACE expose deployment nginx --port 8080
# expose service
subctl export service --namespace $NAMESPACE nginx
- Access it from the other cluster
export KUBECONFIG=~/.kube/config-1
# create namespace
NAMESPACE=nginx-test
kubectl create namespace $NAMESPACE
# create pod
kubectl run tmp-shell --rm -i --tty --image swr.cn-east-3.myhuaweicloud.com/lomtom-common/nettest:0.20.0 -- /bin/bash
# run the following commands inside the tmp-shell pod
curl nginx.nginx-test.svc.clusterset.local:8080
dig nginx.nginx-test.svc.clusterset.local
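The name used in the curl and dig commands follows the pattern service.namespace.svc.clusterset.local. The sketch below builds that name and retries a command a few times, on the assumption (not stated in the steps above) that a freshly exported service may take a moment to become resolvable in the other cluster. Both helper names, clusterset_fqdn and retry, are hypothetical.

```bash
#!/usr/bin/env bash

# Build the clusterset DNS name for an exported service.
clusterset_fqdn() {
  local service=$1 namespace=$2
  echo "${service}.${namespace}.svc.clusterset.local"
}

# Run a command until it succeeds, at most <attempts> times,
# sleeping <delay> seconds after each failed attempt.
retry() {
  local attempts=$1 delay=$2 i
  shift 2
  for (( i = 1; i <= attempts; i++ )); do
    "$@" && return 0
    sleep "$delay"
  done
  return 1
}
```

Inside the tmp-shell pod this could be used as: retry 10 3 curl -sf "$(clusterset_fqdn nginx nginx-test):8080"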
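Troubleshooting
If services are unreachable after installation, work through the following checks:
- Confirm that every prerequisite is met
- Run subctl diagnose all and make sure every check passes. The output looks like this: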
# subctl diagnose all
Cluster "kubernetes"
✓ Checking Submariner support for the Kubernetes version
✓ Kubernetes version "v1.24.0" is supported
✓ Non-Globalnet deployment detected - checking that cluster CIDRs do not overlap
✓ Checking DaemonSet "submariner-gateway"
✓ Checking DaemonSet "submariner-routeagent"
✓ Checking DaemonSet "submariner-metrics-proxy"
✓ Checking Deployment "submariner-lighthouse-agent"
✓ Checking Deployment "submariner-lighthouse-coredns"
✓ Checking the status of all Submariner pods
✓ Checking that gateway metrics are accessible from non-gateway nodes
✓ Skipping this check as it's a single node cluster
✓ Checking Submariner support for the CNI network plugin
✓ The detected CNI network plugin ("calico") is supported
✓ Calico CNI detected, checking if the Submariner IPPool pre-requisites are configured
✓ Checking gateway connections
✓ Checking route agent connections
✓ There are no remote endpoint connections on route agent "node"
✓ Checking Submariner support for the kube-proxy mode
✓ The kube-proxy mode is supported
✓ Checking that firewall configuration allows intra-cluster VXLAN traffic
✓ Skipping this check as it's a single node cluster
✓ Checking that services have been exported properly
Skipping inter-cluster firewall check as it requires two kubeconfigs. Please run "subctl diagnose firewall inter-cluster" command manually.
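- If the service is reachable by Pod IP or Service IP but not by its domain name, check that the CoreDNS configuration is correct and whether a different DNS service is in use
- Check the gateway logs
  - The Broker may be unreachable because of an invalid Broker certificate
  - Access may fail because the private IPs cannot reach each other and no public IP was specified
  - …
- Check the route agent logs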
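When the diagnose output is long, counting the failed checks is quicker than reading every line. The sketch below counts lines that start with a failure marker. The output shown above only contains passing checks marked ✓, so the default marker ✗ is an assumption about how subctl prints failures; pass a different marker as the first argument if your version uses another symbol. count_failed_checks is a hypothetical helper name.

```bash
#!/usr/bin/env bash

# Count the lines on stdin that start (after optional indentation) with a failure marker.
# The default marker is an assumption, not something confirmed by the output above.
count_failed_checks() {
  local marker=${1:-✗}
  # grep -c prints the number of matching lines but exits 1 when that number is 0,
  # so "|| true" keeps the function's exit status clean.
  grep -c -- "^[[:space:]]*${marker}" || true
}
```

A possible use: subctl diagnose all 2>&1 | count_failed_checks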
Uninstall
Option 1: one-step uninstall
subctl uninstall
Option 2: manual uninstall
# delete submariner
kubectl delete submariners.submariner.io -n submariner-operator submariner
# delete operator
helm uninstall submariner-operator -n submariner-operator
# delete broker
helm uninstall submariner-k8s-broker -n submariner-k8s-broker
# delete crd
for CRD in `kubectl get crds | grep -iE 'submariner|multicluster.x-k8s.io'| awk '{print $1}'`; do kubectl delete crd $CRD; done
# delete clusterrole and clusterrolebinding
roles="submariner-operator submariner-operator-globalnet submariner-lighthouse submariner-networkplugin-syncer"
kubectl delete clusterrole,clusterrolebinding $roles --ignore-not-found
# delete namespace
kubectl delete namespace submariner-k8s-broker submariner-operator
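The CRD cleanup loop above matches any line that contains "submariner", which could also catch unrelated resources. A stricter filter, sketched here, keeps only names that end in .submariner.io or .multicluster.x-k8s.io, the two API groups listed in the verification output. submariner_crds is a hypothetical helper name.

```bash
#!/usr/bin/env bash

# Read `kubectl get crds` output on stdin and print only the CRD names
# that belong to the submariner.io or multicluster.x-k8s.io API groups.
submariner_crds() {
  awk '$1 ~ /(\.submariner\.io|\.multicluster\.x-k8s\.io)$/ { print $1 }'
}
```

Printing the names first makes it easy to review them before deleting: kubectl get crds | submariner_crds, and then pipe the same output into while read -r crd; do kubectl delete crd "$crd"; done.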
Reference
Change the Calico network mode
# kubectl edit installations.operator.tigera.io default
# set encapsulation to VXLAN
apiVersion: operator.tigera.io/v1
kind: Installation
metadata:
  name: default
spec:
  calicoNetwork:
    bgp: Disabled
    ipPools:
    - blockSize: 26
      cidr: 10.244.0.0/16
      encapsulation: VXLAN
      natOutgoing: Enabled
      nodeSelector: all()
Change the kube-proxy mode
# kubectl edit configmap -n kube-system kube-proxy
# set mode to iptables, then restart kube-proxy
# inside the kube-proxy pod: run kube-proxy --cleanup
# on the host: sudo ipvsadm --clear
# restart kubelet
apiVersion: v1
kind: ConfigMap
metadata:
  name: kube-proxy
  namespace: kube-system
data:
  config.conf: |-
    apiVersion: kubeproxy.config.k8s.io/v1alpha1
    bindAddress: 0.0.0.0
    bindAddressHardFail: false
    mode: iptables
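To confirm which mode kube-proxy is configured with, before or after the edit, the mode key can be read from the ConfigMap. The sketch below extracts it from config.conf text on stdin. kube_proxy_mode is a hypothetical helper name, and treating an empty or missing mode as iptables reflects kube-proxy's Linux default, which is worth confirming for your Kubernetes version.

```bash
#!/usr/bin/env bash

# Read a kube-proxy config.conf on stdin and print the configured proxy mode.
kube_proxy_mode() {
  local mode
  # Match the first "mode:" line, tolerating indentation and optional double quotes.
  mode=$(sed -n 's/^[[:space:]]*mode:[[:space:]]*"\{0,1\}\([a-z]*\)"\{0,1\}[[:space:]]*$/\1/p' | head -n 1)
  # An empty or missing value is reported as iptables (assumed Linux default).
  echo "${mode:-iptables}"
}
```

A possible use: kubectl -n kube-system get configmap kube-proxy -o jsonpath='{.data.config\.conf}' | kube_proxy_mode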
Modify nodelocaldns
# kubectl edit configmap -n kube-system nodelocaldns
# specify nodelocaldns at install time, add the bind parameter, remove "lighthouse.server: |" and change the Corefile to the following
apiVersion: v1
kind: ConfigMap
metadata:
  name: nodelocaldns
  namespace: kube-system
data:
  Corefile: |
    clusterset.local:53 {
        bind 169.254.25.10 # fixed nodelocaldns address
        forward . 10.234.30.44 # generated at runtime
    }
