Kustomize #
- 管理多个环境的yaml资源
- 资源配置可以复用
- 相比helm,不需要学习额外语法
创建一个base文件夹
在里面有一个deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: myngx
namespace: default
spec:
selector:
matchLabels:
app: myngx
replicas: 1
template:
metadata:
labels:
app: myngx
spec:
containers:
- name: ngx1
image: nginx:1.18-alpine
imagePullPolicy: IfNotPresent
ports:
- containerPort: 80
有个service.yaml
apiVersion: v1
kind: Service
metadata:
name: myngx-svc
namespace: default
spec:
ports:
- port: 80
targetPort: 80
selector: #service通过selector和pod建立关联
app: myngx
然后创建一个kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- service.yaml
- deployment.yaml
进入到这个文件夹
kubectl kustomize
# 就会把deployment和service合并到了一块儿
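渲染没问题的话,也可以直接用kubectl自带的-k参数预览和应用(示意,kubectl 1.14+已内置kustomize):
# 预览渲染结果
kubectl kustomize .
# 直接应用到集群
kubectl apply -k .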
抽取公共部分示例 #
比如上面这个例子,把deployment和service中的namespace都去掉,然后在kustomization.yaml里加入namespace,像下面这样;也可以顺便修改image的内容。
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: default
images:
- name: nginx
newTag: 1.19-alpine
newName: mysql
resources:
- service.yaml
- deployment.yaml
使用overlay创建多“环境” #
创建一个文件夹叫overlays
然后创建dev、prod文件夹
然后创建kustomization.yaml
dev文件夹中
namespace: dev
bases:
- ../../base
prod文件夹中
namespace: prod
bases:
- ../../base
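此时整体目录结构大致如下(示意):
.
├── base
│   ├── deployment.yaml
│   ├── kustomization.yaml
│   └── service.yaml
└── overlays
    ├── dev
    │   └── kustomization.yaml
    └── prod
        └── kustomization.yaml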
使用patch:配置修改 #
比方说我们希望在dev中,对replica的数量做一个修改
在dev文件夹中创建一个replica.yaml
在dev的kustomization.yaml中加入内容
namespace: dev
bases:
- ../../base
patchesStrategicMerge:
- replica.yaml
在replica.yaml中加入
apiVersion: apps/v1
kind: Deployment
metadata:
name: myngx
spec:
replicas: 2
接下来想修改containerPort(比如从80改成9090)。如果继续按上面patchesStrategicMerge的方式写,端口会被追加到列表里而不是被替换,因为ports是一个列表,所以需要按照下面的JSON patch方式修改。
namespace: dev
bases:
- ../../base
patchesStrategicMerge:
- replica.yaml
patchesJson6902:
- target:
group: apps
version: v1
kind: Deployment
name: myngx
path: port.yaml
然后创建一个port.yaml
- op: replace
path: /spec/template/spec/containers/0/ports/0/containerPort
value: 9090
- op: replace
path: /spec/replicas
value: 3
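改完后可以在项目根目录渲染一下dev这个overlay,确认端口和副本数确实被替换了(示意):
kubectl kustomize overlays/dev
# 或者直接应用
kubectl apply -k overlays/dev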
configmap生成器 #
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: default
images:
- name: nginx
newTag: 1.19-alpine
newName: mysql
resources:
- service.yaml
- deployment.yaml
# 去掉多余的哈希
generatorOptions:
disableNameSuffixHash: true
configMapGenerator:
- name: myconfig
literals:
- MySQL=mysqluri
- Password=123
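加上生成器后应用一下,就能看到名为myconfig的ConfigMap被一起创建出来(示意):
kubectl apply -k .
kubectl get configmap myconfig -o yaml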
二进制部署k8s1.20 #
master
etcd、kube-apiserver、kube-controller-manager、kube-scheduler、docker
node
kube-proxy、kubelet、docker
前置工作
- 关闭且禁用防火墙
- 关闭SELinux
- 关闭swap
- 同步系统时间
- 安装docker环境(19.03+ < 20)
查看系统日志命令: tail -f /var/log/messages
创建一个文件夹专门来放入二进制可执行文件
sudo mkdir /usr/k8s
然后配置系统环境变量
sudo vi /etc/profile
在最后一行加入
export K8SBIN=/usr/k8s
export PATH=$PATH:$K8SBIN
然后执行source /etc/profile
证书
- APIServer证书
- kubectl用于API服务器交互身份认证
- kubelet和api交互或对外服务也需要证书
- etcd的交互、内部通信需要证书
- scheduler、controller-manager、kube-proxy等都要
Etcd #
下载cfssl工具
wget https://pkg.cfssl.org/R1.2/cfssl_linux-amd64 https://pkg.cfssl.org/R1.2/cfssljson_linux-amd64 https://pkg.cfssl.org/R1.2/cfssl-certinfo_linux-amd64
chmod +x cfssl_linux-amd64 cfssljson_linux-amd64 cfssl-certinfo_linux-amd64
sudo mkdir /usr/cfssl
sudo cp cfssl_linux-amd64 /usr/cfssl/cfssl
sudo cp cfssljson_linux-amd64 /usr/cfssl/cfssljson
sudo cp cfssl-certinfo_linux-amd64 /usr/cfssl/cfssl-certinfo
# 把/usr/cfssl加入环境变量
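做法和前面加/usr/k8s一样(示意):
sudo vi /etc/profile
# 在最后一行加入
export PATH=$PATH:/usr/cfssl
# 生效
source /etc/profile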
生成etcd证书
首先创建出文件夹
mkdir -p ~/certs/{etcd,k8s}
进入到etcd中
输入下面命令,创建执行配置文件
cat > ca-config.json << EOF
{
"signing": {
"default": {
"expiry": "87600h"
},
"profiles": {
"www": {
"expiry": "87600h",
"usages": [
"signing",
"key encipherment",
"server auth",
"client auth"
]
}
}
}
}
EOF
然后创建证书请求文件
cat > ca-csr.json << EOF
{
"CN": "etcd CA",
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"L": "Beijing",
"ST": "Beijing"
}
]
}
EOF
使用下面命令生成出证书
cfssl gencert -initca ca-csr.json | cfssljson -bare ca -
这里主要生成两个文件ca-key.pem
和ca.pem
然后创建etcd https证书请求文件
cat > server-csr.json << EOF
{
"CN": "etcd",
"hosts": [
"192.168.0.13",
"192.168.0.106"
],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"L": "BeiJing",
"ST": "BeiJing"
}
]
}
EOF
接着来签名证书
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=www server-csr.json | cfssljson -bare server
这里主要生成两个文件server-key.pem
和server.pem
然后来创建文件夹,存放k8s中etcd相关的配置和证书文件
sudo mkdir -p /etc/k8s/etcd/{config,certs}
# 接着创建一个目录来存放etcd数据文件
sudo mkdir /var/lib/etcd
创建etcd配置文件,写入下面内容
sudo vi /etc/k8s/etcd/config/etcd.conf
# 注意下面是别人本机的内网地址,需要对这个做修改的。
#[Member]
ETCD_NAME="etcd1"
ETCD_DATA_DIR="/var/lib/etcd"
ETCD_LISTEN_PEER_URLS="https://192.168.0.13:2380"
ETCD_LISTEN_CLIENT_URLS="https://192.168.0.13:2379"
#[Clustering]
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://192.168.0.13:2380"
ETCD_ADVERTISE_CLIENT_URLS="https://192.168.0.13:2379"
ETCD_INITIAL_CLUSTER="etcd1=https://192.168.0.13:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER_STATE="new"
把刚刚在~/certs/etcd中生成的证书拷贝过去
sudo cp *.pem /etc/k8s/etcd/certs
接着配置systemd管理etcd
sudo vi /usr/lib/systemd/system/etcd.service
贴入下面内容
[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target
[Service]
Type=notify
EnvironmentFile=/etc/k8s/etcd/config/etcd.conf
ExecStart=/usr/k8s/etcd \
--cert-file=/etc/k8s/etcd/certs/server.pem \
--key-file=/etc/k8s/etcd/certs/server-key.pem \
--peer-cert-file=/etc/k8s/etcd/certs/server.pem \
--peer-key-file=/etc/k8s/etcd/certs/server-key.pem \
--trusted-ca-file=/etc/k8s/etcd/certs/ca.pem \
--peer-trusted-ca-file=/etc/k8s/etcd/certs/ca.pem \
--logger=zap
Restart=on-failure
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
然后启动、开机启动
systemctl daemon-reload
systemctl start etcd
systemctl enable etcd
然后来测试一下是否可以了
sudo /usr/k8s/etcdctl \
--endpoints=https://192.168.0.13:2379 \
--cert=/etc/k8s/etcd/certs/server.pem \
--cacert=/etc/k8s/etcd/certs/ca.pem \
--key=/etc/k8s/etcd/certs/server-key.pem \
member list
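如果member list能正常返回,也可以顺便用endpoint health确认一下健康状态(示意):
sudo /usr/k8s/etcdctl \
--endpoints=https://192.168.0.13:2379 \
--cert=/etc/k8s/etcd/certs/server.pem \
--cacert=/etc/k8s/etcd/certs/ca.pem \
--key=/etc/k8s/etcd/certs/server-key.pem \
endpoint health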
Apiserver #
生成证书流程:
- 生成配置文件(json)
- 生成证书请求配置(json)
- 使用命令生成ca证书
- 生成https证书请求文件
- 利用ca签发https证书
先进到之前创建的/certs/k8s
目录
创建配置文件
cat > ca-config.json << EOF
{
"signing": {
"default": {
"expiry": "87600h"
},
"profiles": {
"kubernetes": {
"expiry": "87600h",
"usages": [
"signing",
"key encipherment",
"server auth",
"client auth"
]
}
}
}
}
EOF
生成证书请求配置
cat > ca-csr.json << EOF
{
"CN": "kubernetes",
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"L": "Beijing",
"ST": "Beijing",
"O": "k8s",
"OU": "System"
}
]
}
EOF
生成ca证书
cfssl gencert -initca ca-csr.json | cfssljson -bare ca -
重点是ca-key.pem和ca.pem
生成https证书请求文件,注意下面的ip需要做修改的
cat > server-csr.json << EOF
{
"CN": "kubernetes",
"hosts": [
"10.0.0.1",
"127.0.0.1",
"192.168.0.13",
"192.168.0.106",
"kubernetes",
"kubernetes.default",
"kubernetes.default.svc",
"kubernetes.default.svc.cluster",
"kubernetes.default.svc.cluster.local"
],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"L": "BeiJing",
"ST": "BeiJing",
"O": "k8s",
"OU": "System"
}
]
}
EOF
利用ca签发https证书
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes server-csr.json | cfssljson -bare server
重点是server-key.pem和server.pem
接下来拷贝可执行文件到/usr/k8s
里
sudo cp ~/tools/kubernetes/server/bin/kube-apiserver /usr/k8s
sudo cp ~/tools/kubernetes/server/bin/kubectl /usr/k8s
创建k8s相关的配置文件目录、证书目录和日志目录
sudo mkdir -p /etc/k8s/{configs,logs,certs}
# 拷贝证书到证书目录里
sudo cp ~/certs/k8s/*.pem /etc/k8s/certs
接着来编辑service
sudo vi /usr/lib/systemd/system/kube-apiserver.service
[Unit]
Description=Kubernetes API Server
Documentation=https://github.com/kubernetes/kubernetes
[Service]
EnvironmentFile=/etc/k8s/configs/kube-apiserver.conf
ExecStart=/usr/k8s/kube-apiserver \
--logtostderr=false \
--v=4 \
--log-dir=/etc/k8s/logs \
--etcd-servers=https://192.168.0.13:2379 \
--bind-address=192.168.0.13 \
--secure-port=6443 \
--advertise-address=192.168.0.13 \
--allow-privileged=true \
--service-cluster-ip-range=10.0.0.0/24 \
--enable-admission-plugins=NamespaceLifecycle,LimitRanger,ServiceAccount,ResourceQuota,NodeRestriction \
--authorization-mode=RBAC,Node \
--enable-bootstrap-token-auth=true \
--token-auth-file=/etc/k8s/configs/token.csv \
--service-node-port-range=30000-32767 \
--kubelet-client-certificate=/etc/k8s/certs/server.pem \
--kubelet-client-key=/etc/k8s/certs/server-key.pem \
--tls-cert-file=/etc/k8s/certs/server.pem \
--tls-private-key-file=/etc/k8s/certs/server-key.pem \
--client-ca-file=/etc/k8s/certs/ca.pem \
--service-account-key-file=/etc/k8s/certs/ca-key.pem \
--service-account-signing-key-file=/etc/k8s/certs/ca-key.pem \
--service-account-issuer=kubernetes.default.svc \
--etcd-cafile=/etc/k8s/etcd/certs/ca.pem \
--etcd-certfile=/etc/k8s/etcd/certs/server.pem \
--etcd-keyfile=/etc/k8s/etcd/certs/server-key.pem \
--audit-log-maxage=30 \
--audit-log-maxbackup=3 \
--audit-log-maxsize=100 \
--audit-log-path=/etc/k8s/logs/k8s-audit.log
Restart=on-failure
[Install]
WantedBy=multi-user.target
接下来我们生成一个token
head -c 16 /dev/urandom | od -An -t x | tr -d ' '
把生成的内容复制
创建并编辑token
sudo vi /etc/k8s/configs/token.csv
# 贴入刚刚生成的token
239309f7162e1fefdfa8ff63932fdbc4,10001,"system:node-bootstrapper"
各种启动
systemctl daemon-reload
systemctl restart kube-apiserver
systemctl enable kube-apiserver
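启动后可以先确认一下服务状态,有问题就看日志(示意):
systemctl status kube-apiserver
# 启动失败时跟踪日志排查
journalctl -u kube-apiserver -f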
kubectl #
进入到/certs/k8s
生成一个admin的请求文件
cat > admin-csr.json << EOF
{
"CN": "admin",
"hosts": [],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "BeiJing",
"L": "BeiJing",
"O": "system:masters"
}
]
}
EOF
然后用CA签发这个证书
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes admin-csr.json | cfssljson -bare admin
设置集群
kubectl config set-cluster kubernetes \
--certificate-authority=/etc/k8s/certs/ca.pem \
--embed-certs=true \
--server=https://192.168.0.13:6443
设置客户端认证参数
kubectl config set-credentials kube-admin \
--client-certificate=/home/shenyi/certs/k8s/admin.pem \
--client-key=/home/shenyi/certs/k8s/admin-key.pem \
--embed-certs=true
设置上下文参数
kubectl config set-context kube-admin@kubernetes \
--cluster=kubernetes \
--user=kube-admin
设置默认上下文
kubectl config use-context kube-admin@kubernetes
各种启动、开机启动
systemctl daemon-reload
systemctl start kube-apiserver
systemctl enable kube-apiserver
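然后验证一下kubectl能否用刚配置的上下文正常访问集群(示意):
kubectl cluster-info
kubectl get cs
kubectl get ns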
controller-manager #
目录说明
可执行目录
/usr/k8s
配置文件在
/etc/k8s/{configs,logs,certs}
下载下来的k8s文件在
~/tools/kubernetes/server/bin/
接下来拷贝今天要装的两个可执行程序
sudo cp ~/tools/kubernetes/server/bin/kube-controller-manager /usr/k8s
sudo cp ~/tools/kubernetes/server/bin/kube-scheduler /usr/k8s
配置证书-请求文件
vi kube-controller-manager-csr.json (贴入如下内容)
{
"CN": "system:kube-controller-manager",
"key": {
"algo": "rsa",
"size": 2048
},
"hosts": [
"127.0.0.1",
"192.168.0.13"
],
"names": [
{
"C": "CN",
"ST": "BeiJing",
"L": "BeiJing",
"O": "system:kube-controller-manager"
}
]
}
接下来生成证书
cfssl gencert -ca=ca.pem \
-ca-key=ca-key.pem \
-config=ca-config.json \
-profile=kubernetes kube-controller-manager-csr.json | cfssljson -bare kube-controller-manager
生成kube-controller-manager.pem和kube-controller-manager-key.pem
拷贝到/etc/k8s/certs
里
sudo cp kube-controller-manager.pem /etc/k8s/certs
sudo cp kube-controller-manager-key.pem /etc/k8s/certs
创建配置文件
vi /etc/k8s/configs/kube-controller-manager.kubeconfig
创建服务管理
sudo vi /usr/lib/systemd/system/kube-controller-manager.service
[Unit]
Description=Kubernetes Controller Manager
Documentation=https://github.com/kubernetes/kubernetes
[Service]
ExecStart=/usr/k8s/kube-controller-manager \
--logtostderr=false \
--kubeconfig=/etc/k8s/configs/kube-controller-manager.kubeconfig \
--v=4 \
--log-dir=/etc/k8s/logs \
--leader-elect=false \
--master=https://192.168.0.13:6443 \
--bind-address=127.0.0.1 \
--allocate-node-cidrs=true \
--cluster-cidr=10.244.0.0/16 \
--service-cluster-ip-range=10.0.0.0/24 \
--cluster-signing-cert-file=/etc/k8s/certs/ca.pem \
--cluster-signing-key-file=/etc/k8s/certs/ca-key.pem \
--root-ca-file=/etc/k8s/certs/ca.pem \
--service-account-private-key-file=/etc/k8s/certs/ca-key.pem \
--client-ca-file=/etc/k8s/certs/ca.pem \
--tls-cert-file=/etc/k8s/certs/kube-controller-manager.pem \
--tls-private-key-file=/etc/k8s/certs/kube-controller-manager-key.pem \
--cluster-signing-duration=87600h0m0s \
--use-service-account-credentials=true
Restart=on-failure
[Install]
WantedBy=multi-user.target
弄好以后把他们启动起来
systemctl daemon-reload
systemctl restart kube-controller-manager
systemctl enable kube-controller-manager
kube-scheduler #
将程序拷贝过来
sudo cp ~/tools/kubernetes/server/bin/kube-scheduler /usr/k8s
配置请求证书
vi kube-scheduler-csr.json # (贴入如下内容)
{
"CN": "system:kube-scheduler",
"key": {
"algo": "rsa",
"size": 2048
},
"hosts": [
"127.0.0.1",
"192.168.0.13"
],
"names": [
{
"C": "CN",
"ST": "BeiJing",
"L": "BeiJing",
"O": "system:kube-scheduler"
}
]
}
然后生成证书
cfssl gencert -ca=ca.pem \
-ca-key=ca-key.pem \
-config=ca-config.json \
-profile=kubernetes kube-scheduler-csr.json | cfssljson -bare kube-scheduler
# 把证书拷贝过去
sudo cp kube-scheduler.pem /etc/k8s/certs
sudo cp kube-scheduler-key.pem /etc/k8s/certs
创建一个kubeconfig文件
vi /etc/k8s/configs/kube-scheduler.kubeconfig
apiVersion: v1
clusters:
- cluster:
certificate-authority: /etc/k8s/certs/ca.pem
server: https://192.168.0.13:6443
name: kubernetes
contexts:
- context:
cluster: kubernetes
user: system:kube-scheduler
name: default
current-context: default
kind: Config
preferences: {}
users:
- name: system:kube-scheduler
user:
client-certificate: /etc/k8s/certs/kube-scheduler.pem
client-key: /etc/k8s/certs/kube-scheduler-key.pem
创建服务文件
sudo vi /usr/lib/systemd/system/kube-scheduler.service
[Unit]
Description=Kubernetes Scheduler
Documentation=https://github.com/kubernetes/kubernetes
[Service]
ExecStart=/usr/k8s/kube-scheduler \
--logtostderr=true \
--v=4 \
--kubeconfig=/etc/k8s/configs/kube-scheduler.kubeconfig \
--log-dir=/etc/k8s/logs \
--master=https://192.168.0.13:6443 \
--bind-address=127.0.0.1 \
--tls-cert-file=/etc/k8s/certs/kube-scheduler.pem \
--tls-private-key-file=/etc/k8s/certs/kube-scheduler-key.pem \
--client-ca-file=/etc/k8s/certs/ca.pem
Restart=on-failure
[Install]
WantedBy=multi-user.target
弄好后设置各种启动
systemctl daemon-reload
systemctl restart kube-scheduler
systemctl enable kube-scheduler
Kubelet #
- 和apiserver交互,管理pod(容器)
- 向master上报node使用情况
- 监控容器和节点资源
kubelet不使用前面那种手工签发的证书,因为节点的增删比较频繁,手工为每个节点签发、更新证书太麻烦。所以kubelet采用TLS bootstrapping机制:kubelet先用一个token连接apiserver,由集群自动为它签发证书。
拷贝程序
sudo cp ~/tools/kubernetes/server/bin/kubelet /usr/k8s
配置kubelet的config
vi /etc/k8s/configs/kubelet-config.yaml
kind: KubeletConfiguration
apiVersion: kubelet.config.k8s.io/v1beta1
address: 0.0.0.0
port: 10250
readOnlyPort: 10255
cgroupDriver: systemd
clusterDNS:
- 10.0.0.2
clusterDomain: cluster.local
failSwapOn: true
authentication:
anonymous:
enabled: false
webhook:
cacheTTL: 2m0s
enabled: true
x509:
clientCAFile: /etc/k8s/certs/ca.pem
authorization:
mode: Webhook
webhook:
cacheAuthorizedTTL: 5m0s
cacheUnauthorizedTTL: 20s
evictionHard:
imagefs.available: 15%
memory.available: 100Mi
nodefs.available: 10%
nodefs.inodesFree: 5%
maxOpenFiles: 1000000
maxPods: 110
修改之前的token.csv,补上kubelet-bootstrap用户名字段
239309f7162e1fefdfa8ff63932fdbc4,"kubelet-bootstrap",10001,"system:node-bootstrapper"
在/etc/k8s/certs里面创建一个kubelet文件夹(存放自动签发的证书)
执行下面命令,把用户和clusterrole进行绑定
kubectl create clusterrolebinding kubelet-bootstrap \
--clusterrole=system:node-bootstrapper \
--user=kubelet-bootstrap
然后重启一下apiserver
systemctl restart kube-apiserver
去/etc/k8s/configs里面创建bootstrap.kubeconfig
apiVersion: v1
clusters:
- cluster:
certificate-authority: /etc/k8s/certs/ca.pem
server: https://192.168.0.13:6443
name: kubernetes
contexts:
- context:
cluster: kubernetes
user: kubelet-bootstrap
name: default
current-context: default
kind: Config
preferences: {}
users:
- name: kubelet-bootstrap
user:
token: 239309f7162e1fefdfa8ff63932fdbc4
然后设置kubelet的启动参数
vi /usr/lib/systemd/system/kubelet.service
[Unit]
Description=Kubernetes Kubelet
After=docker.service
[Service]
ExecStart=/usr/k8s/kubelet \
--logtostderr=false \
--v=4 \
--log-dir=/etc/k8s/logs \
--hostname-override=jtthink4 \
--network-plugin=cni \
--kubeconfig=/etc/k8s/configs/kubelet.kubeconfig \
--bootstrap-kubeconfig=/etc/k8s/configs/bootstrap.kubeconfig \
--config=/etc/k8s/configs/kubelet-config.yaml \
--cert-dir=/etc/k8s/certs/kubelet \
--pod-infra-container-image=mirrorgooglecontainers/pause-amd64:3.0
Restart=on-failure
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
弄好后,各种开机自启
systemctl daemon-reload
systemctl start kubelet
systemctl enable kubelet
使用kubectl批准证书申请
kubectl get csr
找到最新的那个node
然后复制一下他的名字
执行下面操作
kubectl certificate approve xxx
然后再执行kubectl get csr
可以发现状态变成了approved
修改docker
sudo vi /etc/docker/daemon.json
# 加入
{
"exec-opts":["native.cgroupdriver=systemd"]
}
重启docker和containerd
systemctl daemon-reload && systemctl restart docker
systemctl restart containerd
systemctl restart kubelet
看一下driver是不是systemd
docker info | grep Cgroup
然后执行kubectl get node
就可以获取结果了
kube-proxy #
- 实现service的通信与负载均衡机制
- 为pod创建代理服务
- 实现service到pod的请求路由和转发
拷贝可执行程序
sudo cp ~/tools/kubernetes/server/bin/kube-proxy /usr/k8s
创建证书请求文件
vi kube-proxy-csr.json
{
"CN": "system:kube-proxy",
"hosts": [],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"L": "BeiJing",
"ST": "BeiJing",
"O": "kube-proxy",
"OU": "System"
}
]
}
然后来生成证书
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes kube-proxy-csr.json | cfssljson -bare kube-proxy
将证书拷贝到指定目录
sudo cp kube-proxy.pem /etc/k8s/certs
sudo cp kube-proxy-key.pem /etc/k8s/certs
配置kubeconfig
kubectl config set-cluster kubernetes \
--certificate-authority=/etc/k8s/certs/ca.pem \
--server=https://192.168.0.13:6443 \
--embed-certs=true \
--kubeconfig=kube-proxy.kubeconfig
kubectl config set-credentials kube-proxy \
--client-certificate=/etc/k8s/certs/kube-proxy.pem \
--client-key=/etc/k8s/certs/kube-proxy-key.pem \
--embed-certs=true \
--kubeconfig=kube-proxy.kubeconfig
kubectl config set-context default \
--cluster=kubernetes \
--user=kube-proxy \
--kubeconfig=kube-proxy.kubeconfig
kubectl config use-context default --kubeconfig=kube-proxy.kubeconfig
接着把生成的文件拷贝到/etc/k8s/configs里面
sudo cp kube-proxy.kubeconfig /etc/k8s/configs
配置kube-proxy-config.yml文件
vi /etc/k8s/configs/kube-proxy-config.yml # (贴入下面内容)
kind: KubeProxyConfiguration
apiVersion: kubeproxy.config.k8s.io/v1alpha1
bindAddress: 0.0.0.0
metricsBindAddress: 0.0.0.0:10249
clientConnection:
kubeconfig: /etc/k8s/configs/kube-proxy.kubeconfig
hostnameOverride: jtthink4
clusterCIDR: 10.0.0.0/24
创建service服务文件
vi /usr/lib/systemd/system/kube-proxy.service #贴入下面内容
[Unit]
Description=Kubernetes Proxy
After=network.target
[Service]
ExecStart=/usr/k8s/kube-proxy \
--logtostderr=false \
--v=4 \
--log-dir=/etc/k8s/logs \
--config=/etc/k8s/configs/kube-proxy-config.yml
Restart=on-failure
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
然后各种启动
systemctl daemon-reload
systemctl start kube-proxy
systemctl enable kube-proxy
Cni(flannel) #
有两种方式安装
- 二进制安装 使用etcd存储
- 以pod运行,flannel集成到k8s集群里面,直接从集群获取网段数据
先在之前apiserver的证书请求文件里多加10.0.0.1这个IP,因为flannel需要使用这个虚拟ip
重新签发证书
vi server-csr.json
# 加入ip 10.0.0.1
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes server-csr.json | cfssljson -bare server
# 覆盖,拷贝证书到证书目录里
sudo cp ~/certs/k8s/server-key.pem /etc/k8s/certs
sudo cp ~/certs/k8s/server.pem /etc/k8s/certs
# 重启api server
sudo systemctl restart kube-apiserver
下载cni插件
# 到github里面下载
# 下载后解压 并放到/usr/k8s/cni
sudo mkdir -p /usr/k8s/cni
sudo tar zxvf cni-plugins-linux-amd64-v0.9.1.tgz -C /usr/k8s/cni
修改kubelet启动参数
vi /usr/lib/systemd/system/kubelet.service
--network-plugin=cni
# cni插件目录
--cni-bin-dir=/usr/k8s/cni
# --cni-conf-dir默认是/etc/cni/net.d目录(可以不改)
# 然后重启kubelet
systemctl daemon-reload
systemctl restart kubelet
下载flannel镜像,并导入
docker load < flanneld-v0.13.1-rc1-amd64.docker
其他操作
kubectl label node jtthink4 node-role.kubernetes.io/master=
kubectl apply -f flannel.txt
kubectl taint nodes --all node-role.kubernetes.io/master-
# 切换到root添加防火墙转发
cat > /etc/sysctl.d/k8s.conf << EOF
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
modprobe br_netfilter
sysctl --system
执行下面的操作
kubectl apply -f kubelet-rbac.yaml
kubectl apply -f kube-flannel.yml
# 这两个在flannel官网有
然后执行kubectl get pods -A
可以看到kube-flannel跑起来了
CoreDNS #
用于k8s中service服务名解析
apiVersion: v1
kind: ServiceAccount
metadata:
name: coredns
namespace: kube-system
labels:
kubernetes.io/cluster-service: "true"
addonmanager.kubernetes.io/mode: Reconcile
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
labels:
kubernetes.io/bootstrapping: rbac-defaults
addonmanager.kubernetes.io/mode: Reconcile
name: system:coredns
rules:
- apiGroups:
- ""
resources:
- endpoints
- services
- pods
- namespaces
verbs:
- list
- watch
- apiGroups:
- ""
resources:
- nodes
verbs:
- get
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
annotations:
rbac.authorization.kubernetes.io/autoupdate: "true"
labels:
kubernetes.io/bootstrapping: rbac-defaults
addonmanager.kubernetes.io/mode: EnsureExists
name: system:coredns
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: system:coredns
subjects:
- kind: ServiceAccount
name: coredns
namespace: kube-system
---
apiVersion: v1
kind: ConfigMap
metadata:
name: coredns
namespace: kube-system
labels:
addonmanager.kubernetes.io/mode: EnsureExists
data:
Corefile: |
.:53 {
errors
health {
lameduck 5s
}
ready
kubernetes cluster.local. in-addr.arpa ip6.arpa {
pods insecure
fallthrough in-addr.arpa ip6.arpa
ttl 30
}
prometheus :9153
forward . /etc/resolv.conf {
max_concurrent 1000
}
cache 30
loop
reload
loadbalance
}
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: coredns
namespace: kube-system
labels:
k8s-app: kube-dns
kubernetes.io/cluster-service: "true"
addonmanager.kubernetes.io/mode: Reconcile
kubernetes.io/name: "CoreDNS"
spec:
# replicas: not specified here:
# 1. In order to make Addon Manager do not reconcile this replicas parameter.
# 2. Default is 1.
# 3. Will be tuned in real time if DNS horizontal auto-scaling is turned on.
strategy:
type: RollingUpdate
rollingUpdate:
maxUnavailable: 1
selector:
matchLabels:
k8s-app: kube-dns
template:
metadata:
labels:
k8s-app: kube-dns
spec:
securityContext:
seccompProfile:
type: RuntimeDefault
priorityClassName: system-cluster-critical
serviceAccountName: coredns
affinity:
podAntiAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 100
podAffinityTerm:
labelSelector:
matchExpressions:
- key: k8s-app
operator: In
values: ["kube-dns"]
topologyKey: kubernetes.io/hostname
tolerations:
- key: "CriticalAddonsOnly"
operator: "Exists"
nodeSelector:
kubernetes.io/os: linux
containers:
- name: coredns
#image: k8s.gcr.io/coredns:1.7.0
image: coredns/coredns:1.7.0
imagePullPolicy: IfNotPresent
resources:
limits:
#memory: __DNS__MEMORY__LIMIT__
memory: 250Mi
requests:
cpu: 100m
memory: 70Mi
args: [ "-conf", "/etc/coredns/Corefile" ]
volumeMounts:
- name: config-volume
mountPath: /etc/coredns
readOnly: true
ports:
- containerPort: 53
name: dns
protocol: UDP
- containerPort: 53
name: dns-tcp
protocol: TCP
- containerPort: 9153
name: metrics
protocol: TCP
livenessProbe:
httpGet:
path: /health
port: 8080
scheme: HTTP
initialDelaySeconds: 60
timeoutSeconds: 5
successThreshold: 1
failureThreshold: 5
readinessProbe:
httpGet:
path: /ready
port: 8181
scheme: HTTP
securityContext:
allowPrivilegeEscalation: false
capabilities:
add:
- NET_BIND_SERVICE
drop:
- all
readOnlyRootFilesystem: true
dnsPolicy: Default
volumes:
- name: config-volume
configMap:
name: coredns
items:
- key: Corefile
path: Corefile
---
apiVersion: v1
kind: Service
metadata:
name: kube-dns
namespace: kube-system
annotations:
prometheus.io/port: "9153"
prometheus.io/scrape: "true"
labels:
k8s-app: kube-dns
kubernetes.io/cluster-service: "true"
addonmanager.kubernetes.io/mode: Reconcile
kubernetes.io/name: "CoreDNS"
spec:
selector:
k8s-app: kube-dns
clusterIP: 10.0.0.2
ports:
- name: dns
port: 53
protocol: UDP
- name: dns-tcp
port: 53
protocol: TCP
- name: metrics
port: 9153
protocol: TCP
然后apply一下这个内容
接着来测试一下这个内容是否成功
apiVersion: apps/v1
kind: Deployment
metadata:
name: ngx1
spec:
selector:
matchLabels:
app: nginx
replicas: 1
template:
metadata:
labels:
app: nginx
spec:
containers:
- name: ngx1
image: nginx:1.18-alpine
imagePullPolicy: IfNotPresent
ports:
- containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
name: nginx-svc
labels:
app: nginx
spec:
type: ClusterIP
ports:
- port: 80
targetPort: 80
selector:
app: nginx
进入到容器后,curl这个service,看看是否成功,成功了就是有效的。
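测试方式大致如下,先找到ngx1对应的pod,再进到容器里用服务名访问(示意,pod名以实际为准):
kubectl get pods
# 进入容器(pod名换成实际的)
kubectl exec -it ngx1-xxxx -- sh
# 容器内用服务名解析和访问
nslookup nginx-svc
curl http://nginx-svc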
子节点kubelet #
前置工作:防火墙、SELinux、Swap关闭、安装docker
开启转发
cat > /etc/sysctl.d/k8s.conf<<EOF
net.ipv4.ip_forward=1
net.bridge.bridge-nf-call-ip6tables=1
net.bridge.bridge-nf-call-iptables=1
EOF
modprobe br_netfilter
sysctl --system
部署kubelet
mkdir /usr/k8s # 放可执行程序
mkdir -p /etc/k8s/{certs,configs,logs} # 放ca证书、配置和日志
mkdir /usr/k8s/cni # 插件目录
mkdir /etc/k8s/certs/kubelet # (申请证书后的存放位置)
# 从主机拷贝CA到子节点
scp /etc/k8s/certs/ca.pem root@192.168.0.106:/etc/k8s/certs
其他内容拷贝
# 拷贝可执行程序
scp /usr/k8s/kubelet root@192.168.0.106:/usr/k8s
# 拷贝cni插件
scp /usr/k8s/cni/* root@192.168.0.106:/usr/k8s/cni
# 拷贝kubelet-config.yaml
scp /etc/k8s/configs/kubelet-config.yaml root@192.168.0.106:/etc/k8s/configs/
# 拷贝kubectl和kubeconfig配置文件
scp /usr/k8s/kubectl root@192.168.0.106:/bin
scp /home/shenyi/.kube/config root@192.168.0.106:/home/shenyi/.kube/
拷贝bootstrap配置文件
scp /etc/k8s/configs/bootstrap.kubeconfig root@192.168.0.106:/etc/k8s/configs
需要设置一下hosts
vi /etc/hosts
然后设置ip和主机名
然后设置service文件配置
sudo vi /usr/lib/systemd/system/kubelet.service
[Unit]
Description=Kubernetes Kubelet
After=docker.service
[Service]
ExecStart=/usr/k8s/kubelet \
--logtostderr=false \
--v=4 \
--log-dir=/etc/k8s/logs \
--hostname-override=jtthink3 \
--network-plugin=cni \
--cni-bin-dir=/usr/k8s/cni \
--kubeconfig=/etc/k8s/configs/kubelet.kubeconfig \
--bootstrap-kubeconfig=/etc/k8s/configs/bootstrap.kubeconfig \
--config=/etc/k8s/configs/kubelet-config.yaml \
--cert-dir=/etc/k8s/certs/kubelet \
--pod-infra-container-image=mirrorgooglecontainers/pause-amd64:3.0
Restart=on-failure
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
接着各种开启
systemctl daemon-reload
systemctl start kubelet
systemctl enable kubelet
然后去主机批准申请
kubectl get csr
# 找到最新的那个node,执行如下命令
kubectl certificate approve node-csr-xxxxx
然后使用kubectl get nodes就可以看到了
子节点kube-proxy #
这个只需要拷贝就可以了
# 拷贝可执行程序
scp /usr/k8s/kube-proxy root@192.168.0.106:/usr/k8s
# 拷贝kubeconfig配置文件
scp /etc/k8s/configs/kube-proxy.kubeconfig root@192.168.0.106:/etc/k8s/configs
# 拷贝kube-proxy配置文件(修改下hostname)
scp /etc/k8s/configs/kube-proxy-config.yml root@192.168.0.106:/etc/k8s/configs
创建服务文件
vi /usr/lib/systemd/system/kube-proxy.service #贴入下面内容
[Unit]
Description=Kubernetes Proxy
After=network.target
[Service]
ExecStart=/usr/k8s/kube-proxy \
--logtostderr=false \
--v=4 \
--log-dir=/etc/k8s/logs \
--config=/etc/k8s/configs/kube-proxy-config.yml
Restart=on-failure
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
systemctl daemon-reload
systemctl start kube-proxy
systemctl enable kube-proxy
k8s高可用集群(其他的后面用到了再看吧) #
高可用:一个节点挂了,别的节点依旧可以提供服务。
服务器配置 #
# 关闭防火墙
systemctl stop firewalld
systemctl disable firewalld
# 关闭selinux
sed -i 's/enforcing/disabled/' /etc/selinux/config # 永久
setenforce 0 # 临时
# 关闭swap
swapoff -a # 临时
sed -ri 's/.*swap.*/#&/' /etc/fstab # 永久
# 将桥接的IPv4流量传递到iptables到链:
cat >> /etc/sysctl.d/k8s.conf << EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
sysctl --system # 生效
# 时间同步
cat >> /etc/hosts << EOF
192.168.33.12 master-1
192.168.33.13 master-2
192.168.33.14 master-3
192.168.33.15 node-1
EOF
yum install ntpdate -y
ntpdate time.windows.com
VIP #
配置nginx
yum install epel-release -y
yum install nginx keepalived -y
vi /etc/nginx/nginx.conf
stream{
log_format main '$remote_addr $upstream_addr - [$time_local] $status $upstream_bytes_sent';
access_log /var/log/nginx/k8s-access.log main;
upstream k8s-apiserver{
server 192.168.33.12:6443;
server 192.168.33.13:6443;
server 192.168.33.14:6443;
}
server{
listen 6443;
proxy_pass k8s-apiserver;
}
}
然后重启nginx
service nginx restart
配置keepalived(master)
cat > /etc/keepalived/keepalived.conf << EOF
global_defs{
notification_email{
acassen@firewall.loc
failover@firewall.loc
sysadmin@firewall.loc
}
notification_email_from Alexandre.Cassen@firewall.loc
smtp_server 127.0.0.1
smtp_connect_timeout 30
router_id NGINX_MASTER
}
vrrp_script check_nginx{
script "/etc/keepalived/check_nginx.sh"
}
vrrp_instance VI_1{
state MASTER
interface eth1 # 修改为实际网卡名
virtual_router_id 51 # VRRP 路由ID实例,每个实例是唯一的
priority 100 # 优先级,备服务器设置90
advert_int 1 # 指定VRRP 心跳包通告间隔时间,默认1秒
authentication{
auth_type PASS
auth_pass 1111
}
# 虚拟IP
virtual_ipaddress{
192.168.33.20/24
}
track_script{
check_nginx
}
}
EOF
配置keepalived(slave)
cat > /etc/keepalived/keepalived.conf << EOF
global_defs{
notification_email{
acassen@firewall.loc
failover@firewall.loc
sysadmin@firewall.loc
}
notification_email_from Alexandre.Cassen@firewall.loc
smtp_server 127.0.0.1
smtp_connect_timeout 30
router_id NGINX_BACKUP
}
vrrp_script check_nginx{
script "/etc/keepalived/check_nginx.sh"
}
vrrp_instance VI_1{
state BACKUP
interface eth1 # 修改为实际网卡名
virtual_router_id 51 # VRRP 路由ID实例,每个实例是唯一的
priority 90 # 优先级,备服务器设置90
advert_int 1 # 指定VRRP 心跳包通告间隔时间,默认1秒
authentication{
auth_type PASS
auth_pass 1111
}
# 虚拟IP
virtual_ipaddress{
192.168.33.20/24
}
track_script{
check_nginx
}
}
EOF
keepalived心跳脚本
cat > /etc/keepalived/check_nginx.sh << "EOF"
#!/bin/bash
count=$(ss -antp |grep 6443 |egrep -cv "grep|$$")
if [ "$count" -eq 0 ];then
exit 1
else
exit 0
fi
EOF
chmod +x /etc/keepalived/check_nginx.sh
开启服务
service keepalived start
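启动后可以在master上确认VIP是否已经绑定到网卡上(示意,网卡名以实际为准):
ip addr show eth1 | grep 192.168.33.20
# 停掉master上的nginx,再看VIP是否漂移到备机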
Webhook #
在本地运行webhook
服务端
package main
import (
"encoding/json"
"k8s.io/klog/v2"
"log"
"myhook/lib"
"io/ioutil"
"k8s.io/api/admission/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"net/http"
)
func main() {
http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
log.Println(r.RequestURI)
var body []byte
if r.Body != nil {
if data, err := ioutil.ReadAll(r.Body); err == nil {
body = data
}
}
//第二步
reqAdmissionReview := v1.AdmissionReview{} //请求
rspAdmissionReview := v1.AdmissionReview{ //响应 ---只构建了一部分
TypeMeta: metav1.TypeMeta{
Kind: "AdmissionReview",
APIVersion: "admission.k8s.io/v1",
},
}
//第三步。 把body decode 成对象
deserializer := lib.Codecs.UniversalDeserializer()
if _, _, err := deserializer.Decode(body, nil, &reqAdmissionReview); err != nil {
klog.Error(err)
rspAdmissionReview.Response = lib.ToV1AdmissionResponse(err)
} else {
rspAdmissionReview.Response = lib.AdmitPods(reqAdmissionReview) //我们的业务
}
rspAdmissionReview.Response.UID = reqAdmissionReview.Request.UID
respBytes,_ := json.Marshal(rspAdmissionReview)
w.Write(respBytes)
})
//server := &http.Server{
// Addr: ":443",
// TLSConfig: lib.ConfigTLS(cofig)
//}
//server.ListenAndServeTLS("","")
http.ListenAndServe(":8080",nil )
}
客户端
package main
import (
"context"
"fmt"
"k8s.io/client-go/kubernetes"
"k8s.io/client-go/rest"
"log"
"strings"
)
var body string =`
{
"apiVersion": "admission.k8s.io/v1",
"kind": "AdmissionReview",
"request": {
"uid": "705ab4f5-6393-11e8-b7cc-42010a800002",
"kind": {"group":"","version":"v1","kind":"pods"},
"resource": {"group":"","version":"v1","resource":"pods"},
"name": "mypod",
"namespace": "default",
"operation": "CREATE",
"object": {"apiVersion":"v1","kind":"Pod","metadata":{"name":"shenyi","namespace":"default"}},
"userInfo": {
"username": "admin",
"uid": "014fbff9a07c",
"groups": ["system:authenticated","my-admin-group"],
"extra": {
"some-key":["some-value1", "some-value2"]
}
},
"dryRun": false
}
}
`
func main() {
mainconfig:=&rest.Config{
Host:"http://localhost:8080",
}
c,err:=kubernetes.NewForConfig(mainconfig)
if err!=nil{
log.Fatal(err)
}
result:=c.AdmissionregistrationV1().RESTClient().Post().Body(strings.NewReader(body)).
Do(context.Background())
b,_:=result.Raw()
fmt.Println(string(b))
}
主要webhook逻辑,写在了pods.go里面
package lib
import (
"fmt"
"k8s.io/api/admission/v1"
corev1 "k8s.io/api/core/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/klog/v2"
)
func AdmitPods(ar v1.AdmissionReview) *v1.AdmissionResponse {
podResource := metav1.GroupVersionResource{Group: "", Version: "v1", Resource: "pods"}
if ar.Request.Resource != podResource {
err := fmt.Errorf("expect resource to be %s", podResource)
klog.Error(err)
return ToV1AdmissionResponse(err)
}
raw := ar.Request.Object.Raw
pod := corev1.Pod{}
deserializer := Codecs.UniversalDeserializer()
if _, _, err := deserializer.Decode(raw, nil, &pod); err != nil {
klog.Error(err)
return ToV1AdmissionResponse(err)
}
reviewResponse := v1.AdmissionResponse{}
if pod.Name=="shenyi"{
reviewResponse.Allowed = false
reviewResponse.Result = &metav1.Status{Code:503,Message: "pod name cannot be shenyi"}
}else{
reviewResponse.Allowed = true
}
return &reviewResponse
}
部署webhook到k8s中,例子:禁止特殊pod名称 #
package main
import (
"encoding/json"
"k8s.io/klog/v2"
"log"
"myhook/lib"
"io/ioutil"
"k8s.io/api/admission/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"net/http"
)
func main() {
http.HandleFunc("/pods", func(w http.ResponseWriter, r *http.Request) {
log.Println(r.RequestURI)
var body []byte
if r.Body != nil {
if data, err := ioutil.ReadAll(r.Body); err == nil {
body = data
}
}
//第二步
reqAdmissionReview := v1.AdmissionReview{} //请求
rspAdmissionReview := v1.AdmissionReview{ //响应 ---只构建了一部分
TypeMeta: metav1.TypeMeta{
Kind: "AdmissionReview",
APIVersion: "admission.k8s.io/v1",
},
}
//第三步。 把body decode 成对象
deserializer := lib.Codecs.UniversalDeserializer()
if _, _, err := deserializer.Decode(body, nil, &reqAdmissionReview); err != nil {
klog.Error(err)
rspAdmissionReview.Response = lib.ToV1AdmissionResponse(err)
} else {
rspAdmissionReview.Response = lib.AdmitPods(reqAdmissionReview) //我们的业务
}
rspAdmissionReview.Response.UID = reqAdmissionReview.Request.UID
respBytes,err := json.Marshal(rspAdmissionReview)
if err!=nil{
klog.Error(err)
}else{
if _, err := w.Write(respBytes); err != nil {
klog.Error(err)
}
}
})
tlsConfig:=lib.Config{
CertFile:"/etc/webhook/certs/tls.crt",
KeyFile:"/etc/webhook/certs/tls.key",
}
server := &http.Server{
Addr: ":443",
TLSConfig: lib.ConfigTLS(tlsConfig),
}
server.ListenAndServeTLS("","")
//http.ListenAndServe(":8080",nil )
}
然后执行交叉编译,编译成linux可执行文件
set GOOS=linux
set GOARCH=amd64
go build -o myhook main.go
创建文件夹/hook/certs
生成证书
vi ca-config.json
{
"signing": {
"default": {
"expiry": "8760h"
},
"profiles": {
"server": {
"usages": ["signing"],
"expiry": "8760h"
}
}
}
}
# 生成证书的请求文件
vi ca-csr.json
{
"CN": "Kubernetes",
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "zh",
"L": "bj",
"O": "bj",
"OU": "CA"
}
]
}
# 生成证书
cfssl gencert -initca ca-csr.json | cfssljson -bare ca
# 服务端证书请求文件
vi server-csr.json
{
"CN": "admission",
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "zh",
"L": "bj",
"O": "bj",
"OU": "bj"
}
]
}
# 签发证书
cfssl gencert \
-ca=ca.pem \
-ca-key=ca-key.pem \
-config=ca-config.json \
-hostname=myhook.kube-system.svc \
-profile=server \
server-csr.json | cfssljson -bare server
创建webhook的yaml
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
name: myhook
webhooks:
- clientConfig:
caBundle: |
service:
name: myhook
namespace: kube-system
path: /pods
failurePolicy: Fail
sideEffects: NoneOnDryRun
name: myhook.jtthink.com
admissionReviewVersions: ["v1", "v1beta1"]
namespaceSelector: {}
rules:
- apiGroups: [""]
apiVersions: ["v1"]
operations: ["CREATE"]
resources: ["pods"]
这里的caBundle
内容使用这个命令cat ca.pem | base64
来获取
创建密文,将server.pem和server-key.pem导入到secret中
kubectl create secret tls myhook --cert=server.pem --key=server-key.pem -n kube-system
pod的创建如下,这里需要注意,要映射这个secret
apiVersion: apps/v1
kind: Deployment
metadata:
name: myhook
namespace: kube-system
spec:
replicas: 1
selector:
matchLabels:
app: myhook
template:
metadata:
labels:
app: myhook
spec:
nodeName: jtthink4
containers:
- name: myhook
image: alpine:3.12
imagePullPolicy: IfNotPresent
command: ["/app/myhook"]
volumeMounts:
- name: hooktls
mountPath: /etc/webhook/certs
readOnly: true
- name: app
mountPath: /app
ports:
- containerPort: 443
volumes:
- name: app
hostPath:
path: /home/shenyi/app
- name: hooktls
secret:
secretName: myhook
---
apiVersion: v1
kind: Service
metadata:
name: myhook
namespace: kube-system
labels:
app: myhook
spec:
type: ClusterIP
ports:
- port: 443
targetPort: 443
selector:
app: myhook
最后应用下面这个测试的pod
apiVersion: v1
kind: Pod
metadata:
name: shenyi
namespace: default
spec:
containers:
- name: nginx
image: nginx:1.18-alpine
imagePullPolicy: IfNotPresent
ports:
- containerPort: 80
运行的时候就可以发现,这个pod没法被创建
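提交时大致会看到类似下面的报错(示意,文件名和具体文案以实际为准):
kubectl apply -f pod.yaml
# Error from server: admission webhook "myhook.jtthink.com" denied the request: pod name cannot be shenyi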
提交pod时修改镜像源 #
将image从1.18
改成1.19
package lib
import (
"fmt"
"k8s.io/api/admission/v1"
corev1 "k8s.io/api/core/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/klog/v2"
)
func patchImage() []byte{
str:=`[
{
"op" : "replace" ,
"path" : "/spec/containers/0/image" ,
"value" : "nginx:1.19-alpine"
}
]`
return []byte(str)
}
func AdmitPods(ar v1.AdmissionReview) *v1.AdmissionResponse {
podResource := metav1.GroupVersionResource{Group: "", Version: "v1", Resource: "pods"}
if ar.Request.Resource != podResource {
err := fmt.Errorf("expect resource to be %s", podResource)
klog.Error(err)
return ToV1AdmissionResponse(err)
}
raw := ar.Request.Object.Raw
pod := corev1.Pod{}
deserializer := Codecs.UniversalDeserializer()
if _, _, err := deserializer.Decode(raw, nil, &pod); err != nil {
klog.Error(err)
return ToV1AdmissionResponse(err)
}
reviewResponse := v1.AdmissionResponse{}
if pod.Name=="shenyi"{
reviewResponse.Allowed = false
reviewResponse.Result = &metav1.Status{Code:503,Message: "pod name cannot be shenyi"}
}else{
reviewResponse.Allowed = true
reviewResponse.Patch=patchImage()
jsonPt:=v1.PatchTypeJSONPatch
reviewResponse.PatchType=&jsonPt
}
return &reviewResponse
}
指定命名空间修改pod镜像源 #
只有针对打了标签的namespace创建pod才会进行修改镜像
标签如下:
key是:jtthink-injection
value是:enabled
# 打标签
kubectl label namespace default jtthink-injection=enabled
# 删除标签
kubectl label namespace default jtthink-injection-
修改admconfig.yaml
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
name: myhook
webhooks:
- clientConfig:
caBundle: |
service:
name: myhook
namespace: kube-system
path: /pods
failurePolicy: Fail
sideEffects: NoneOnDryRun
name: myhook.jtthink.com
admissionReviewVersions: ["v1", "v1beta1"]
namespaceSelector:
matchExpressions:
- key: jtthink-injection
operator: In
values: ["enabled","1"]
rules:
- apiGroups: [""]
apiVersions: ["v1"]
operations: ["CREATE"]
resources: ["pods"]
创建pod时自动注入容器 #
package lib
import (
"fmt"
"k8s.io/api/admission/v1"
corev1 "k8s.io/api/core/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/klog/v2"
)
func patchImage() []byte{
str:=`[
{
"op" : "replace" ,
"path" : "/spec/containers/0/image" ,
"value" : "nginx:1.19-alpine"
},
{
"op" : "add",
"path" : "/spec/initConatiners",
"value" : [{
"name" : "myinit",
"image" : "busybox:1.28",
"command" : ["sh","-c","echo The app is running!"]
}]
}
]`
return []byte(str)
}
func AdmitPods(ar v1.AdmissionReview) *v1.AdmissionResponse {
podResource := metav1.GroupVersionResource{Group: "", Version: "v1", Resource: "pods"}
if ar.Request.Resource != podResource {
err := fmt.Errorf("expect resource to be %s", podResource)
klog.Error(err)
return ToV1AdmissionResponse(err)
}
raw := ar.Request.Object.Raw
pod := corev1.Pod{}
deserializer := Codecs.UniversalDeserializer()
if _, _, err := deserializer.Decode(raw, nil, &pod); err != nil {
klog.Error(err)
return ToV1AdmissionResponse(err)
}
reviewResponse := v1.AdmissionResponse{}
if pod.Name=="shenyi"{
reviewResponse.Allowed = false
reviewResponse.Result = &metav1.Status{Code:503,Message: "pod name cannot be shenyi"}
}else{
reviewResponse.Allowed = true
reviewResponse.Patch=patchImage()
jsonPt:=v1.PatchTypeJSONPatch
reviewResponse.PatchType=&jsonPt
}
return &reviewResponse
}
k8s网络原理 #
- 一个node上面多个pod如何访问
- 多个node上面多个pod网络连接又是怎么样的
- docker0的网卡删了会怎么样
Operator #
operator包含:CRD+webhook+controller
通过CRD扩展kubernetes API
- 自动创建、管理和配置应用实例
- 以deployment的形式部署到k8s中
k8s控制器会监视资源的创建/更新/删除事件,并触发Reconcile函数作为响应。整个调整过程被称为“Reconcile Loop(调协循环)”或者“Sync Loop(同步循环)”
controller根据内置逻辑,通过k8s API进行持续调整,直到当前状态变成所需状态。
- 本地开发时需要在k8s服务器上用kubectl proxy开一个代理,然后把kubeconfig配置文件下载下来,修改apiserver地址为代理地址(云服务器场景)
- 在kubeadm安装时,kubeadm init xxxx
加入这个参数--apiserver-cert-extra-sans=172.16.100.10,47.110.100.20(这里填本机IP)
敲命令
kubebuilder init --domain jtthink.com
kubebuilder create api --group myapp --version v1 --kind Redis
然后就可以有自己的资源配置文件redis.yaml
apiVersion: myapp.jtthink.com/v1
kind: Redis
metadata:
name: myredis
spec:
xxxx
xxxx
比方说,我们想要加一个port
就到redis_types.go里面的RedisSpec 加入 Port int
然后在yaml文件里面的spec字段里面加入port
执行make install
命令,安装crd
使用kubectl get crd
命令就可以获取到redis.myapp.jtthink.com
这个crd了
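CRD装好后,可以提交一个示例CR验证一下(示意,字段以自己定义的为准):
kubectl apply -f redis.yaml
kubectl get redis
kubectl describe redis myredis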
控制器 #
我们找到redis_controller
func (r *RedisReconciler) Reconcile(ctx context.Context, req ctrl.Request)(ctrl.Result,error){
_ = log.FromContext(ctx)
redis := &myappv1.Redis{}
if err:= r.Get(ctx,req.NamespacedName,redis); err != nil{
fmt.Println(err)
}else{
fmt.Println(redis)
}
return ctrl.Result{}, nil
}
使用make run
本地调试
初步发布到k8s中 #
两条命令
make docker-build docker-push IMG=xxxx/jtredis:v1
make deploy IMG=xxxx/jtredis:v1
自定义资源字段验证的基本方法 #
这里要去查kubebuilder的文档,在crd里面添加注释就可以,基本验证
比方说对上面的port,设置为port最小值为1
type RedisSpec struct{
//+kubebuilder:validation:Minimum:=1
//+kubebuilder:validation:Maximum:=40000
Port int `json:"port,omitempty"`
}
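加好验证标记后需要重新生成并安装CRD,再用一个非法的port试一下(示意):
make manifests
make install
# 把redis.yaml里的port改成0再apply,应该会被校验拒绝
kubectl apply -f redis.yaml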
创建webhook进行深入验证 #
创建webhook
kubebuilder create webhook --group myapp --version v1 --kind Redis --defaulting --programmatic-validation
创建好了以后就可以在v1里有redis_webhook.go里面做修改
安装
kubectl apply -f https://github.com/jetstack/cert-manager/releases/download/v1.4.0/cert-manager.yaml
# 验证下
kubectl get pods -A | grep cert
需要webhook就打开
(&myappv1.Redis{}).SetupWebhookWithManager(mgr)
接着去kustomization.yaml中打开 - ../webhook 和 - ../certmanager,
- manager_webhook_patch.yaml、- webhookcainjection_patch.yaml 也都打开,
然后把vars里面的所有内容放开注释。
接着修改manager.yaml
中的imagePullPolicy,修改为IfNotPresent
然后执行命令
make install
make docker-build docker-push IMG=xxxx/jtredis:v1
make deploy IMG=xxx/jtredis:v1
实例1、提交资源和创建POD #
func CreateRedis(client client.Client, redisConfig *v1.Redis) error {
newpod := &corev1.Pod{}
newpod.Name = redisConfig.Name
newpod.Namespace = redisConfig.Namespace
newpod.Spec.Containers = []corev1.Container{
{
Name:redisConfig.Name,
Image:"redis:5-alpine",
ImagePullPolicy: corev1.PullIfNotPresent,
Ports:[]corev1.ContainerPort{
{
ContainerPort:int32(redisConfig.Spec.Port),
},
},
},
}
return client.Create(context.Background(),newpod)
}
func (r *RedisReconciler) Reconcile(ctx context.Context, req ctrl.Request)(ctrl.Result,error){
_ = log.FromContext(ctx)
redis := &myappv1.Redis{}
if err := r.Get(ctx, req.NamespacedName, redis) ; err != nil{
fmt.Println(err)
} else{
if err := helper.CreateRedis(r.Client,redis); err != nil{
fmt.Println(err)
}
return ctrl.Result{},nil
}
return ctrl.Result{}, nil
}
实例2、资源删除判断 #
finalizer终结器,用来预删除
删除finalizer
kubectl patch configmap/mymap --type json --patch='[{"op":"remove","path":"/metadata/finalizers"}]'
redis_types.go
type RedisSpec struct {
//+kubebuilder:validation:Minimum:=81
//+kubebuilder:validation:Maximum:=40000
Port int `json:"port,omitempty"`
//+kubebuilder:validation:Minimum:=1
//+kubebuilder:validation:Maximum:=100
Num int `json:"num,omitempty"`
}
redis_helper.go
func GetRedisPodNames(redisConfig *v1.Redis) []string{
podNames := make([]string, redisConfig.Spec.Num)
for i := 0; i < redisConfig.Spec.Num; i++{
podNames[i] = fmt.Sprintf("%s-%d",redisConfig.Name,i)
}
fmt.Println("podnames:",podNames)
return podNames
}
func IsExist(podName string, redis *v1.Redis) bool{
for _, po := range redis.Finalizers {
if podName == po{
return true
}
}
return false
}
func CreateRedis(client client.Client, redisConfig *v1.Redis, podName string)(string ,error){
if IsExist(podName, redisConfig){
return "",nil
}
newpod := &corev1.Pod{}
newpod.Name = podName
newpod.Namespace = redisConfig.Namespace
newpod.Spec.Containers = []corev1.Container{
{
Name:podName,
Image: "redis:5-alpine",
ImagePullPolicy: corev1.PullIfNotPresent,
Ports: []corev1.ContainerPort{
{
ContainerPort: int32(redisConfig.Spec.Port),
},
},
},
}
return podName, client.Create(context.Background(),newpod)
}
redis_controller.go
func (r *RedisReconciler) Reconcile(ctx context.Context, req ctrl.Request)(ctrl.Result,error){
_ = log.FromContext(ctx)
redis := &myappv1.Redis{}
if err := r.Get(ctx, req.NamespacedName, redis); err != nil{
return ctrl.Result{},err
}else{
// 正在删
if !redis.DeletionTimestamp.IsZero(){
return ctrl.Result{},r.clearRedis(ctx,redis)
}
// 开始创建
podNames := helper.GetRedisPodNames(redis)
isAppend := false
for _,po := range podNames{
pname,err := helper.CreateRedis(r.Client,redis,po)
if err != nil{
return ctrl.Result{},err
}
if pname==""{
continue
}
redis.Finalizers = append(redis.Finalizers,pname)
isAppend=true
}
if isAppend{
err := r.Client.Update(ctx,redis)
if err != nil{
return ctrl.Result{},err
}
}
}
return ctrl.Result{},nil
}
func (r *RedisReconciler) clearRedis(ctx context.Context, redis *myappv1.Redis) error{
podList := redis.Finalizers
for _, podName := range podList{
err := r.Client.Delete(ctx, &v1.Pod{
ObjectMeta: metav1.ObjectMeta{Name:podName,Namespace:redis.Namespace},
})
if err != nil{
return err
}
}
redis.Finalizers = []string{}
return r.Client.Update(ctx,redis)
}
实例3、副本收缩处理 #
func (r *RedisReconciler) Reconcile(ctx context.Context, req ctrl.Request)(ctrl.Result,error){
_ = log.FromContext(ctx)
redis := &myappv1.Redis{}
if err := r.Get(ctx, req.NamespacedName, redis); err != nil{
return ctrl.Result{},err
}else{
// 正在删
if !redis.DeletionTimestamp.IsZero(){
return ctrl.Result{},r.clearRedis(ctx,redis)
}
// 开始创建
podNames := helper.GetRedisPodNames(redis)
isEdit := false
for _,po := range podNames{
pname,err := helper.CreateRedis(r.Client,redis,po)
if err != nil{
return ctrl.Result{},err
}
if pname==""{
continue
}
redis.Finalizers = append(redis.Finalizers,pname)
isEdit=true
}
// 收缩副本
if len(redis.Finalizers) > len(podNames){
isEdit = true
err := r.rmIfSurplus(ctx,podNames,redis)
if err != nil{
return ctrl.Result{},err
}
}
if isEdit{
err := r.Client.Update(ctx,redis)
if err != nil{
return ctrl.Result{},err
}
}
}
return ctrl.Result{},nil
}
// 收缩副本
func (r *RedisReconciler) rmIfSurplus(ctx context.Context, poNames []string,redis *myappv1.Redis) error{
for i := 0 ; i < len(redis.Finalizers) - len(poNames); i++{
err := r.Client.Delete(ctx,&v1.Pod{
ObjectMeta: metav1.ObjectMeta{
Name:redis.Finalizers[len(poNames)+i],Namespace:redis.Namespace,
},
})
if err != nil{
return err
}
}
redis.Finalizers = poNames
return nil
}
实例4、监听CR创建的pod #
func (r *RedisReconciler) podDeleteHandler(event event.DeleteEvent, limitingInterface workqueue.RateLimitingInterface){
fmt.Println(event.Object.GetName())
}
func (r *RedisReconciler) SetupWithManager(mgr ctrl.Manager) error{
return ctrl.NewControllerManagedBy(mgr).
For(&myappv1.Redis{}).
Watches(&source.Kind{
Type:&v1.Pod{},
},handler.Funcs{DeleteFunc: r.podDeleteHandler}).
Complete(r)
}
实例5、自动重建手工删除的Pod #
设置owner reference
func GetRedisPodNames(redisConfig *v1.Redis) []string{
podNames := make([]string, redisConfig.Spec.Num)
for i := 0; i < redisConfig.Spec.Num; i++{
podNames[i] = fmt.Sprintf("%s-%d",redisConfig.Name,i)
}
fmt.Println("podnames:",podNames)
return podNames
}
func IsExist(podName string, redis *v1.Redis, client client.Client) bool{
err := client.Get(context.Background(),
types.NamespacedName{Namespace:redis.Namespace,Name:podName},&corev1.Pod{})
if err != nil{
return false
}
return true
}
func CreateRedis(client client.Client,
redisConfig *v1.Redis, podName string, schema *runtime.Scheme)(string ,error){
if IsExist(podName, redisConfig,client){
return "",nil
}
newpod := &corev1.Pod{}
newpod.Name = podName
newpod.Namespace = redisConfig.Namespace
newpod.Spec.Containers = []corev1.Container{
{
Name:podName,
Image: "redis:5-alpine",
ImagePullPolicy: corev1.PullIfNotPresent,
Ports: []corev1.ContainerPort{
{
ContainerPort: int32(redisConfig.Spec.Port),
},
},
},
}
err := controllerutil.SetControllerReference(redisConfig,newpod,schema)
if err != nil{
return "",err
}
err = client.Create(context.Background(),newpod)
if err != nil{
return "",err
}
return podName, nil
}
func (r *RedisReconciler) podDeleteHandler(event event.DeleteEvent, limitingInterface workqueue.RateLimitingInterface){
fmt.Println(event.Object.GetName())
for _ , ref := range event.Object.GetOwnerReferences(){
if ref.Kind == "redis" && ref.APIVersion == "myapp.jtthink.com/v1"{
limitingInterface.Add(reconcile.Request{
NamespacedName:types.NamespacedName{Name:ref.Name,
Namespace:event.Object.GetNamespace()}})
}
}
}
func (r *RedisReconciler) SetupWithManager(mgr ctrl.Manager) error{
return ctrl.NewControllerManagedBy(mgr).
For(&myappv1.Redis{}).
Watches(&source.Kind{
Type:&v1.Pod{},
},handler.Funcs{DeleteFunc: r.podDeleteHandler}).
Complete(r)
}
func (r *RedisReconciler) Reconcile(ctx context.Context, req ctrl.Request)(ctrl.Result,error){
_ = log.FromContext(ctx)
redis := &myappv1.Redis{}
if err := r.Get(ctx, req.NamespacedName, redis); err != nil{
return ctrl.Result{},err
}else{
// 正在删
if !redis.DeletionTimestamp.IsZero(){
return ctrl.Result{},r.clearRedis(ctx,redis)
}
// 开始创建
podNames := helper.GetRedisPodNames(redis)
isAppend := false
for _,po := range podNames{
pname,err := helper.CreateRedis(r.Client,redis,po)
if err != nil{
return ctrl.Result{},err
}
if pname==""{
continue
}
if controllerutil.ContainsFinalizer(redis,pname){
continue
}
redis.Finalizers = append(redis.Finalizers,pname)
isAppend=true
}
if isAppend{
err := r.Client.Update(ctx,redis)
if err != nil{
return ctrl.Result{},err
}
}
}
return ctrl.Result{},nil
}
func (r *RedisReconciler) clearRedis(ctx context.Context, redis *myappv1.Redis) error{
podList := redis.Finalizers
for _, podName := range podList{
err := r.Client.Delete(ctx, &v1.Pod{
ObjectMeta: metav1.ObjectMeta{Name:podName,Namespace:redis.Namespace},
})
if err != nil{
return err
}
}
redis.Finalizers = []string{}
return r.Client.Update(ctx,redis)
}
实例6、添加事件支持(event) #
在redis_controller.go
中
type RedisReconciler struct {
client.Client
Scheme *runtime.Scheme
EventRecord record.EventRecorder
}
然后到入口函数main.go
if err = (&controllers.RedisReconciler{
Client: mgr.GetClient(),
Scheme: mgr.GetScheme(),
EventRecord: mgr.GetEventRecorderFor("JtRedis"),
}).SetupWithManager(mgr); err != nil{
setupLog.Error("unable to create controller","controller","redis")
os.Exit(1)
}
比方说我们在收缩副本的时候加入event
// 收缩副本
if len(redis.Finalizers) > len(podNames){
r.EventRecord.Event(redis,corev1.EventTypeNormal,"Upgrade","replicas decrease")
isEdit = true
err := r.rmIfSurplus(ctx,podNames,redis)
if err != nil{
return ctrl.Result{},err
}
}
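事件记录之后,可以通过describe或者直接查events看到(示意):
kubectl describe redis myredis
# 或者
kubectl get events --field-selector involvedObject.name=myredis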
实例7、支持资源的状态展现 #
在redis_types.go
type RedisStatus struct{
RedisNum int `json:"num"`
}
func (r *RedisReconciler) Reconcile(ctx context.Context, req ctrl.Request)(ctrl.Result,error){
_ = log.FromContext(ctx)
redis := &myappv1.Redis{}
if err := r.Get(ctx, req.NamespacedName, redis); err != nil{
return ctrl.Result{},err
}else{
// 正在删
if !redis.DeletionTimestamp.IsZero(){
return ctrl.Result{},r.clearRedis(ctx,redis)
}
// 开始创建
podNames := helper.GetRedisPodNames(redis)
isEdit := false
for _,po := range podNames{
pname,err := helper.CreateRedis(r.Client,redis,po)
if err != nil{
return ctrl.Result{},err
}
if pname==""{
continue
}
redis.Finalizers = append(redis.Finalizers,pname)
isEdit=true
}
// 收缩副本
if len(redis.Finalizers) > len(podNames){
isEdit = true
err := r.rmIfSurplus(ctx,podNames,redis)
if err != nil{
return ctrl.Result{},err
}
}
if isEdit{
err := r.Client.Update(ctx,redis)
if err != nil{
return ctrl.Result{},err
}
redis.Status.RedisNum=len(redis.Finalizers)
err = r.Status().Update(ctx,redis)
if err != nil{
return ctrl.Result{},err
}
}
}
return ctrl.Result{},nil
}
在redis定义的上面加上注释
//+kubebuilder:printcolumn:JSONPath=".status.num",name=Num,type=integer
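重新make install之后,kubectl get就能看到NUM这一列了(示意):
make install
kubectl get redis
# NAME      NUM
# myredis   3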
实例8、初步进行集成测试 #
/*
Copyright 2021.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package controllers
import (
. "github.com/onsi/ginkgo"
. "github.com/onsi/gomega"
"k8s.io/client-go/kubernetes/scheme"
"k8s.io/client-go/rest"
"log"
"path/filepath"
ctrl "sigs.k8s.io/controller-runtime"
"sigs.k8s.io/controller-runtime/pkg/client"
"sigs.k8s.io/controller-runtime/pkg/envtest"
"sigs.k8s.io/controller-runtime/pkg/envtest/printer"
logf "sigs.k8s.io/controller-runtime/pkg/log"
"sigs.k8s.io/controller-runtime/pkg/log/zap"
"testing"
myappv1 "jtapp/api/v1"
//+kubebuilder:scaffold:imports
)
// These tests use Ginkgo (BDD-style Go testing framework). Refer to
// http://onsi.github.io/ginkgo/ to learn more about Ginkgo.
var cfg *rest.Config
var k8sClient client.Client
var testEnv *envtest.Environment
func TestAPIs(t *testing.T) {
RegisterFailHandler(Fail)
RunSpecsWithDefaultAndCustomReporters(t,
"Controller Suite",
[]Reporter{printer.NewlineReporter{}})
}
var _ = BeforeSuite(func() {
logf.SetLogger(zap.New(zap.WriteTo(GinkgoWriter), zap.UseDevMode(true)))
By("bootstrapping test environment")
testEnv = &envtest.Environment{
CRDDirectoryPaths: []string{filepath.Join("..", "config", "crd", "bases")},
//KubeAPIServerFlags: apiServerArgs,
}
cfg, err := testEnv.Start()
log.Println(cfg, err)
//time.Sleep(time.Second * 100000)
Expect(err).NotTo(HaveOccurred())
Expect(cfg).NotTo(BeNil())
err = myappv1.AddToScheme(scheme.Scheme)
Expect(err).NotTo(HaveOccurred())
//+kubebuilder:scaffold:scheme
k8sClient, err = client.New(cfg, client.Options{Scheme: scheme.Scheme})
Expect(err).NotTo(HaveOccurred())
Expect(k8sClient).NotTo(BeNil())
k8sManager, err := ctrl.NewManager(cfg, ctrl.Options{
Scheme: scheme.Scheme,
})
Expect(err).ToNot(HaveOccurred())
err = (&RedisReconciler{
Client: k8sManager.GetClient(),
Scheme: k8sManager.GetScheme(),
EventRecord: k8sManager.GetEventRecorderFor("JtRedis"),
}).SetupWithManager(k8sManager)
Expect(err).ToNot(HaveOccurred())
go func() {
err = k8sManager.Start(ctrl.SetupSignalHandler())
Expect(err).ToNot(HaveOccurred())
}()
}, 60)
var _ = AfterSuite(func() {
By("tearing down the test environment")
err := testEnv.Stop()
Expect(err).NotTo(HaveOccurred())
})
package controllers
import (
"context"
. "github.com/onsi/ginkgo"
. "github.com/onsi/gomega"
myappv1 "jtapp/api/v1"
)
var _ = Describe("test myredis", func() {
redis := &myappv1.Redis{}
redis.Namespace = "default"
redis.Spec.Port = 2377
redis.Spec.Num = 3
It("create myredis", func() {
Expect(k8sClient.Create(context.Background(), redis)).Should(Succeed())
})
})
安装测试环境
curl -sSLo envtest-bins.tar.gz "https://storage.googleapis.com/kubebuilder-tools/kubebuilder-tools-1.20.2-$(go env GOOS)-$(go env GOARCH).tar.gz"
运行测试
make test SKIP_FETCH_TOOLS=1 KUBEBUILDER_ASSETS=/Users/shenyi/Documents/projects/kb/testbin/bin KUBEBUILDER_ATTACH_CONTROL_PLANE_OUTPUT=1
Prometheus #
最简部署、kube-state-metrics部署 #
暂时体外部署
这个prometheus.yml里配置的就是Prometheus抓取指标的目标和抓取时间间隔等内容
docker run -d --name pm \
-p 9090:9090 \
-v /home/shenyi/prometheus/config:/config \
prom/prometheus:v2.30.0 --web.enable-lifecycle --config.file=/config/prometheus.yml
kube-state-metrics部署
官方提供的k8s内部各个组件的状态指标
然后到有k8s的服务器上装kube-state-metrics
deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app.kubernetes.io/name: kube-state-metrics
app.kubernetes.io/version: 2.2.1
name: kube-state-metrics
namespace: kube-system
spec:
replicas: 1
selector:
matchLabels:
app.kubernetes.io/name: kube-state-metrics
template:
metadata:
labels:
app.kubernetes.io/name: kube-state-metrics
app.kubernetes.io/version: 2.2.1
spec:
containers:
- image: bitnami/kube-state-metrics:2.2.1
livenessProbe:
httpGet:
path: /healthz
port: 8080
initialDelaySeconds: 5
timeoutSeconds: 5
name: kube-state-metrics
ports:
- containerPort: 8080
name: http-metrics
- containerPort: 8081
name: telemetry
readinessProbe:
httpGet:
path: /
port: 8081
initialDelaySeconds: 5
timeoutSeconds: 5
securityContext:
runAsUser: 65534
nodeSelector:
kubernetes.io/os: linux
serviceAccountName: kube-state-metrics
service.yaml
apiVersion: v1
kind: Service
metadata:
labels:
app.kubernetes.io/name: kube-state-metrics
app.kubernetes.io/version: 2.2.1
name: kube-state-metrics
namespace: kube-system
spec:
#clusterIP: None
type: NodePort
ports:
- name: http-metrics
port: 8080
targetPort: http-metrics
nodePort: 32280
- name: telemetry
port: 8081
targetPort: telemetry
selector:
app.kubernetes.io/name: kube-state-metrics
service-account.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
labels:
app.kubernetes.io/name: kube-state-metrics
app.kubernetes.io/version: 2.2.1
name: kube-state-metrics
namespace: kube-system
cluster-role.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
labels:
app.kubernetes.io/name: kube-state-metrics
app.kubernetes.io/version: 2.2.1
name: kube-state-metrics
rules:
- apiGroups:
- ""
resources:
- configmaps
- secrets
- nodes
- pods
- services
- resourcequotas
- replicationcontrollers
- limitranges
- persistentvolumeclaims
- persistentvolumes
- namespaces
- endpoints
verbs:
- list
- watch
- apiGroups:
- apps
resources:
- statefulsets
- daemonsets
- deployments
- replicasets
verbs:
- list
- watch
- apiGroups:
- batch
resources:
- cronjobs
- jobs
verbs:
- list
- watch
- apiGroups:
- autoscaling
resources:
- horizontalpodautoscalers
verbs:
- list
- watch
- apiGroups:
- authentication.k8s.io
resources:
- tokenreviews
verbs:
- create
- apiGroups:
- authorization.k8s.io
resources:
- subjectaccessreviews
verbs:
- create
- apiGroups:
- policy
resources:
- poddisruptionbudgets
verbs:
- list
- watch
- apiGroups:
- certificates.k8s.io
resources:
- certificatesigningrequests
verbs:
- list
- watch
- apiGroups:
- storage.k8s.io
resources:
- storageclasses
- volumeattachments
verbs:
- list
- watch
- apiGroups:
- admissionregistration.k8s.io
resources:
- mutatingwebhookconfigurations
- validatingwebhookconfigurations
verbs:
- list
- watch
- apiGroups:
- networking.k8s.io
resources:
- networkpolicies
- ingresses
verbs:
- list
- watch
- apiGroups:
- coordination.k8s.io
resources:
- leases
verbs:
- list
- watch
cluster-role-binding.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
labels:
app.kubernetes.io/name: kube-state-metrics
app.kubernetes.io/version: 2.2.1
name: kube-state-metrics
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: kube-state-metrics
subjects:
- kind: ServiceAccount
name: kube-state-metrics
namespace: kube-system
Then run kubectl get pods -n kube-system
and you should see the kube-state-metrics pod.
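As a quick sanity check from outside the cluster, you can fetch the metrics endpoint exposed by the NodePort service. A minimal Go sketch, assuming the node IP 192.168.0.53 and NodePort 32280 used in the scrape config below:
package main

import (
	"bufio"
	"fmt"
	"log"
	"net/http"
	"strings"
)

func main() {
	// NodePort of the kube-state-metrics Service defined above (address is an assumption).
	resp, err := http.Get("http://192.168.0.53:32280/metrics")
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	// Print only the kube_pod_* series to keep the output short.
	sc := bufio.NewScanner(resp.Body)
	for sc.Scan() {
		line := sc.Text()
		if strings.HasPrefix(line, "kube_pod_") {
			fmt.Println(line)
		}
	}
	if err := sc.Err(); err != nil {
		log.Fatal(err)
	}
}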
Prometheus scraping kube-state-metrics, node_exporter deployment #
Modify the Prometheus configuration file:
global:
scrape_interval: 5s
scrape_configs:
- job_name: 'prometheus-state-metrics'
static_configs:
- targets: ['192.168.0.53:32280']
- job_name: 'node-exporter'
static_configs:
- targets: ['192.168.0.53:9100']
Then reload Prometheus:
curl -X POST http://192.168.0.106:9090/-/reload
Installing and deploying node_exporter:
apiVersion: v1
kind: Namespace
metadata:
name: prometheus
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: node-exporter
namespace: prometheus
labels:
name: node-exporter
spec:
selector:
matchLabels:
name: node-exporter
template:
metadata:
labels:
name: node-exporter
spec:
hostPID: true
hostIPC: true
hostNetwork: true
containers:
- name: node-exporter
image: bitnami/node-exporter:1.2.2
ports:
- containerPort: 9100
resources:
requests:
cpu: 100m
memory: 100Mi
limits:
cpu: 1000m
memory: 1Gi
securityContext:
privileged: true
args:
- --path.procfs
- /host/proc
- --path.sysfs
- /host/sys
- --collector.filesystem.ignored-mount-points
- '"^/(sys|proc|dev|host|etc)($|/)"'
volumeMounts:
- name: dev
mountPath: /host/dev
- name: proc
mountPath: /host/proc
- name: sys
mountPath: /host/sys
- name: rootfs
mountPath: /rootfs
tolerations:
- key: "node-role.kubernetes.io/master"
operator: "Exists"
effect: "NoSchedule"
volumes:
- name: proc
hostPath:
path: /proc
- name: dev
hostPath:
path: /dev
- name: sys
hostPath:
path: /sys
- name: rootfs
hostPath:
path: /
Because node_exporter runs inside a container, some extra configuration is needed so it can see the host.
hostPID - controls whether containers in the Pod share the host's process ID namespace
hostIPC - controls whether containers in the Pod share the host's IPC (inter-process communication) namespace
hostNetwork: true - the Pod uses the host's network
privileged: true - the container runs in privileged mode (so it can access all devices on the host)
We mount the host's /dev, /proc and /sys directories into the container:
/dev - files related to devices (including peripherals such as printers, USB, serial/parallel ports)
/proc - CPU, memory, kernel information, etc.
/sys - driver information for hardware devices
Prometheus service auto-discovery #
kubernetes_sd_configs
Through the k8s API, Prometheus currently supports five discovery roles: Node, Service, Pod, Endpoints, and Ingress.
An external Prometheus needs the following:
- Create a ServiceAccount, ClusterRole, and ClusterRoleBinding for external access
- After creating the RBAC objects, copy the cluster's CA certificate and the ServiceAccount's token
- Save them on the server where Prometheus runs
apiVersion: v1
kind: ServiceAccount
metadata:
name: myprometheus
namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: myprometheus-clusterrole
rules:
- apiGroups:
- ""
resources:
- nodes
- services
- endpoints
- pods
- nodes/proxy
verbs:
- get
- list
- watch
- apiGroups:
- "extensions"
resources:
- ingresses
verbs:
- get
- list
- watch
- apiGroups:
- ""
resources:
- configmaps
- nodes/metrics
verbs:
- get
- nonResourceURLs:
- /metrics
verbs:
- get
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: myprometheus-clusterrolebinding
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: myprometheus-clusterrole
subjects:
- kind: ServiceAccount
name: myprometheus
namespace: kube-system
global:
scrape_interval: 5s
scrape_configs:
- job_name: 'prometheus-state-metrics'
static_configs:
- targets: ['192.168.0.53:32280']
- job_name: 'node-exporter'
static_configs:
- targets: ['192.168.0.53:9100','192.168.0.152:9100']
- job_name: 'k8s-node'
metrics_path: /metrics
kubernetes_sd_configs:
- api_server: https://192.168.0.53:6443/
role: node
bearer_token_file: /config/sa.token
tls_config:
ca_file: /config/ca.crt
# insecure_skip_verify: true
Because Prometheus runs outside the cluster, these fields have to be filled in.
The token has to be extracted from the ServiceAccount's secret:
kubectl -n kube-system describe secret \
$(kubectl -n kube-system describe sa myprometheus |grep 'Mountable secrets'| cut -f 2- -d ":" | tr -d " ") |grep -E '^token' | cut -f2 -d':' | tr -d '\t'
myprometheus is the name of our ServiceAccount.
The CA certificate is in /etc/kubernetes/pki by default; scp it over yourself.
Finally, reload the configuration: curl -X POST http://192.168.0.106:9090/-/reload
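If you want to check the token and CA certificate before wiring them into Prometheus, here is a minimal Go sketch that performs the same authenticated call the node discovery relies on (the file paths and apiserver address are the assumptions used in the config above):
package main

import (
	"crypto/tls"
	"crypto/x509"
	"fmt"
	"io"
	"log"
	"net/http"
	"os"
	"strings"
)

func main() {
	// Same files the scrape config above points at.
	token, err := os.ReadFile("/config/sa.token")
	if err != nil {
		log.Fatal(err)
	}
	caCert, err := os.ReadFile("/config/ca.crt")
	if err != nil {
		log.Fatal(err)
	}
	pool := x509.NewCertPool()
	pool.AppendCertsFromPEM(caCert)

	client := &http.Client{
		Transport: &http.Transport{TLSClientConfig: &tls.Config{RootCAs: pool}},
	}

	// Listing nodes is exactly what role: node discovery needs permission for.
	req, _ := http.NewRequest("GET", "https://192.168.0.53:6443/api/v1/nodes", nil)
	req.Header.Set("Authorization", "Bearer "+strings.TrimSpace(string(token)))

	resp, err := client.Do(req)
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()
	body, _ := io.ReadAll(resp.Body)
	fmt.Println(resp.Status, "response bytes:", len(body))
}
A 200 status here means the token and CA are usable by Prometheus as well.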
Prometheus service auto-discovery (2): relabeling #
The goal is to rewrite the discovered address, replacing the default port 10250 with 9100 (the node_exporter port).
global:
scrape_interval: 5s
scrape_configs:
- job_name: 'prometheus-state-metrics'
static_configs:
- targets: ['192.168.0.53:32280']
- job_name: 'node-exporter'
static_configs:
- targets: ['192.168.0.53:9100','192.168.0.152:9100']
- job_name: 'k8s-node'
metrics_path: /metrics
kubernetes_sd_configs:
- api_server: https://192.168.0.53:6443/
role: node
bearer_token_file: /config/sa.token
tls_config:
ca_file: /config/ca.crt
# insecure_skip_verify: true
relabel_configs:
- source_labels: [__address__]
regex: '(.*):10250'
replacement: '${1}:9100'
target_label: __address__
action: replace
Prometheus service auto-discovery (3): quick pod monitoring configuration (kubelet) #
First verify that cadvisor metrics can be fetched through the apiserver proxy:
TOKEN=`cat sa.token` && curl https://192.168.0.53:6443/api/v1/nodes/jtthink2/proxy/metrics/cadvisor \
--header "Authorization: Bearer $TOKEN" --cacert ca.crt
global:
scrape_interval: 5s
scrape_configs:
- job_name: 'prometheus-state-metrics'
static_configs:
- targets: ['192.168.0.53:32280']
- job_name: 'k8s-node'
metrics_path: /metrics
kubernetes_sd_configs:
- api_server: https://192.168.0.53:6443/
role: node
bearer_token_file: /config/sa.token
tls_config:
ca_file: /config/ca.crt
# insecure_skip_verify: true
relabel_configs:
- source_labels: [__address__]
regex: '(.*):10250'
replacement: '${1}:9100'
target_label: __address__
action: replace
- job_name: 'k8s-kubelet'
scheme: https
bearer_token_file: /config/sa.token
tls_config:
ca_file: /config/ca.crt
kubernetes_sd_configs:
- api_server: https://192.168.0.53:6443/
role: node
bearer_token_file: /config/sa.token
tls_config:
ca_file: /config/ca.crt
relabel_configs:
- target_label: __address__
replacement: 192.168.0.53:6443
- source_labels: [__meta_kubernetes_node_name]
regex: '(.+)'
replacement: '/api/v1/nodes/$1/proxy/metrics/cadvisor'
target_label: __metrics_path__
action: replace
Prometheus Adapter #
Kubernetes obtains resource usage metrics mainly through two kinds of APIs:
- resource metrics API: monitoring metrics provided by core components, such as container CPU and memory.
- custom metrics API: custom metrics.
The Prometheus Adapter pulls these values from Prometheus.
The adapter YAML files are placed in the /adapter folder; edit custom-metrics-apiserver-deployment.yaml and set the URL of the (external) Prometheus inside it.
Create the namespace:
kubectl create namespace custom-metrics
Edit custom-metrics-apiserver-deployment.yaml and change the Prometheus address it points at.
Generate the secret:
Go to the directory holding the k8s CA certificate.
If the cluster was installed with kubeadm, it is in /etc/kubernetes/pki on the master by default.
Run the following commands:
openssl genrsa -out serving.key 2048
openssl req -new -key serving.key -out serving.csr -subj "/CN=serving"
openssl x509 -req -in serving.csr -CA ./ca.crt -CAkey ./ca.key -CAcreateserial -out serving.crt -days 3650
kubectl create secret generic cm-adapter-serving-certs --from-file=serving.crt=./serving.crt --from-file=serving.key -n custom-metrics
kubectl apply -f .
# verify
kubectl get apiservice | grep custom-metrics
Manual check:
# CPU usage of a pod in the default namespace
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/myredis-bb8dbf5c5-lrbbk/cpu_usage"
# Memory
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/nginx-v2-5867c466f6-m9cfr/memory_usage_bytes"
Prometheus basic queries (1): basic concepts, instant vector queries #
PromQL (Prometheus Query Language) is Prometheus's data-query DSL. Expression types:
- Instant vector (values at scrape time)
- Range vector
- Scalar
- String
Querying container_memory_usage_bytes
returns the memory usage of containers in bytes.
Label matchers:
=: selects labels exactly equal to the provided string.
!=: selects labels not equal to the provided string.
=~: selects labels that match the provided regular expression.
!~: selects labels that do not match the provided regular expression.
For example:
container_memory_usage_bytes{namespace="default",pod=~"nginx.*"}
Or:
container_memory_usage_bytes{namespace="default",pod=~"(nginx|mysql).*"}
Usually one more condition is added (container="POD" is the pause container, which normally should not be displayed):
container_memory_usage_bytes{namespace="default",pod=~"nginx.*", container!="POD"}
API query:
http://121.36.219.114:9090/api/v1/query?query=container_memory_usage_bytes{namespace="default",pod=~"nginx.*"}
This is an instant query:
GET /api/v1/query
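The same instant query can be issued from code. A minimal Go sketch against the HTTP API (the Prometheus address is an assumption, and only the response fields actually used are modelled):
package main

import (
	"encoding/json"
	"fmt"
	"log"
	"net/http"
	"net/url"
)

// Minimal shape of the /api/v1/query response; only the fields used here.
type queryResult struct {
	Status string `json:"status"`
	Data   struct {
		ResultType string `json:"resultType"`
		Result     []struct {
			Metric map[string]string `json:"metric"`
			Value  []interface{}     `json:"value"` // [ <unix_time>, "<value>" ]
		} `json:"result"`
	} `json:"data"`
}

func main() {
	// Prometheus address is an assumption; adjust to your deployment.
	base := "http://192.168.0.106:9090/api/v1/query"
	q := url.Values{}
	q.Set("query", `container_memory_usage_bytes{namespace="default",pod=~"nginx.*",container!="POD"}`)

	resp, err := http.Get(base + "?" + q.Encode())
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	var r queryResult
	if err := json.NewDecoder(resp.Body).Decode(&r); err != nil {
		log.Fatal(err)
	}
	for _, s := range r.Data.Result {
		fmt.Println(s.Metric["pod"], s.Value)
	}
}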
Prometheus basic queries (2): implementing custom metrics with gin #
main.go code:
package main
import (
"github.com/gin-gonic/gin"
"github.com/prometheus/client_golang/prometheus"
"github.com/prometheus/client_golang/prometheus/promhttp"
"strconv"
)
func init() {
prometheus.MustRegister(prodsVisit)
}
// 计数器向量
var prodsVisit = prometheus.NewCounterVec(
prometheus.CounterOpts{
Name: "jtthink_prods_visit",
},
[]string{"prod_id"},
)
func main() {
r:=gin.New()
r.GET("/prods/visit", func(c *gin.Context) {
pid_str:=c.Query("pid")
_,err:=strconv.Atoi(pid_str)
if err!=nil{
c.JSON(400,gin.H{"message":"error pid"})
return // stop here, otherwise an invalid pid is still counted
}
prodsVisit.With(prometheus.Labels{
"prod_id":pid_str,
}).Inc()
c.JSON(200,gin.H{"message":"OK"})
})
r.GET("/metrics", gin.WrapH(promhttp.Handler()))
r.Run(":8080")
}
Counter: a monotonically increasing metric, e.g. number of requests, number of errors.
Gauge: a metric that can change arbitrarily, up or down; typical uses are goroutine count, CPU, memory, length of a business queue, etc.
Histogram (cumulative histogram):
samples observations over a period of time, e.g. request duration or response size; this often works together with a tracing system (such as jaeger, covered earlier).
Summary:
similar to a histogram, also based on sampled observations, but it provides quantiles of the sample values, the sum of all values, and the total sample count.
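To make the other metric types concrete, here is a minimal sketch (separate from the product API above) that registers a Gauge and a Histogram with client_golang; the metric names, values, and port are made up for illustration:
package main

import (
	"log"
	"math/rand"
	"net/http"
	"time"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

var (
	// Gauge: value can go up and down, e.g. current queue length.
	queueLen = prometheus.NewGauge(prometheus.GaugeOpts{
		Name: "jtthink_queue_length",
	})
	// Histogram: observe request durations into buckets.
	reqDuration = prometheus.NewHistogram(prometheus.HistogramOpts{
		Name:    "jtthink_request_duration_seconds",
		Buckets: prometheus.DefBuckets,
	})
)

func main() {
	prometheus.MustRegister(queueLen, reqDuration)

	// Simulate some activity in the background.
	go func() {
		for {
			queueLen.Set(float64(rand.Intn(100)))
			reqDuration.Observe(rand.Float64())
			time.Sleep(time.Second)
		}
	}()

	http.Handle("/metrics", promhttp.Handler())
	log.Fatal(http.ListenAndServe(":8081", nil))
}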
Create the deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
name: prodmetrics
namespace: default
spec:
selector:
matchLabels:
app: prodmetrics
replicas: 1
template:
metadata:
labels:
app: prodmetrics
spec:
nodeName: jtthink1
containers:
- name: prodmetrics
image: alpine:3.12
imagePullPolicy: IfNotPresent
workingDir: /app
command: ["./prodmetrics"]
volumeMounts:
- name: app
mountPath: /app
ports:
- containerPort: 8080
volumes:
- name: app
hostPath:
path: /home/shenyi/prometheus/prodmetrics
---
apiVersion: v1
kind: Service
metadata:
name: prodmetrics
namespace: default
spec:
type: NodePort
ports:
- port: 80
targetPort: 8080
nodePort: 31880
selector:
app: prodmetrics
Add the scrape job to the Prometheus config:
- job_name: 'prod-metrics'
static_configs:
- targets: ['192.168.0.53:31880']
Prometheus basic queries (3): range queries, aggregation operators #
A range query returns results over a time window.
jtthink_prods_visit[5m] queries the series over the past 5 minutes.
s - seconds
m - minutes
h - hours
d - days
w - weeks
y - years
jtthink_prods_visit{} offset 5m queries the value as of 5 minutes ago; this is called an offset.
Aggregation operators:
sum: sum
min: minimum
max: maximum
avg: average
stddev: standard deviation
stdvar: variance
count: number of elements
count_values: number of elements equal to a given value
bottomk: smallest k elements
topk: largest k elements
quantile: quantile
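Combining the two: an aggregation such as topk(3, sum(rate(jtthink_prods_visit[5m])) by (prod_id)) returns the three most visited products. Below is a minimal Go sketch that runs it as a range query over the last 30 minutes (GET /api/v1/query_range; the address and the exact expression are assumptions):
package main

import (
	"fmt"
	"io"
	"log"
	"net/http"
	"net/url"
	"strconv"
	"time"
)

func main() {
	// Range query over the last 30 minutes at a 1-minute step.
	end := time.Now()
	start := end.Add(-30 * time.Minute)

	q := url.Values{}
	q.Set("query", "topk(3, sum(rate(jtthink_prods_visit[5m])) by (prod_id))")
	q.Set("start", strconv.FormatInt(start.Unix(), 10))
	q.Set("end", strconv.FormatInt(end.Unix(), 10))
	q.Set("step", "60")

	resp, err := http.Get("http://192.168.0.106:9090/api/v1/query_range?" + q.Encode())
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	body, _ := io.ReadAll(resp.Body)
	fmt.Println(string(body))
}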
Auto-discovering business Services and scraping them #
global:
scrape_interval: 5s
scrape_configs:
- job_name: 'prometheus-state-metrics'
static_configs:
- targets: ['192.168.0.53:32280']
# - job_name: 'jtthink-prods'
# static_configs:
# - targets: ['192.168.0.53:31880']
- job_name: 'jtthink-svc-auto'
metrics_path: /metrics
kubernetes_sd_configs:
- api_server: https://192.168.0.53:6443/
role: service
bearer_token_file: /config/sa.token
tls_config:
ca_file: /config/ca.crt
relabel_configs:
- source_labels: [__meta_kubernetes_service_annotation_scrape]
regex: true
action: keep
- source_labels: [__meta_kubernetes_service_annotation_nodeport]
regex: '(.+)'
replacement: '192.168.0.53:${1}'
target_label: __address__
action: replace
- job_name: 'k8s-node'
metrics_path: /metrics
kubernetes_sd_configs:
- api_server: https://192.168.0.53:6443/
role: node
bearer_token_file: /config/sa.token
tls_config:
ca_file: /config/ca.crt
# insecure_skip_verify: true
relabel_configs:
- source_labels: [__address__]
regex: '(.*):10250'
replacement: '${1}:9100'
target_label: __address__
action: replace
- job_name: 'k8s-kubelet'
scheme: https
bearer_token_file: /config/sa.token
tls_config:
ca_file: /config/ca.crt
kubernetes_sd_configs:
- api_server: https://192.168.0.53:6443/
role: node
bearer_token_file: /config/sa.token
tls_config:
ca_file: /config/ca.crt
relabel_configs:
- target_label: __address__
replacement: 192.168.0.53:6443
- source_labels: [__meta_kubernetes_node_name]
regex: '(.+)'
replacement: '/api/v1/nodes/$1/proxy/metrics/cadvisor'
target_label: __metrics_path__
action: replace
apiVersion: apps/v1
kind: Deployment
metadata:
name: prodmetrics
namespace: default
spec:
selector:
matchLabels:
app: prodmetrics
replicas: 1
template:
metadata:
labels:
app: prodmetrics
spec:
nodeName: jtthink1
containers:
- name: prodmetrics
image: alpine:3.12
imagePullPolicy: IfNotPresent
workingDir: /app
command: ["./prodmetrics"]
volumeMounts:
- name: app
mountPath: /app
ports:
- containerPort: 8080
volumes:
- name: app
hostPath:
path: /home/shenyi/prometheus/prodmetrics
---
apiVersion: v1
kind: Service
metadata:
name: prodmetrics
namespace: default
annotations:
scrape: "true"
nodeport: "31880"
spec:
type: NodePort
ports:
- port: 80
targetPort: 8080
nodePort: 31880
selector:
app: prodmetrics
Prometheus Adapter: creating custom metrics #
Edit custom-metrics-apiserver-deployment.yaml
and tune the intervals so they roughly match Prometheus's scrape interval.
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: custom-metrics-apiserver
name: custom-metrics-apiserver
namespace: custom-metrics
spec:
replicas: 1
selector:
matchLabels:
app: custom-metrics-apiserver
template:
metadata:
labels:
app: custom-metrics-apiserver
name: custom-metrics-apiserver
spec:
serviceAccountName: custom-metrics-apiserver
containers:
- name: custom-metrics-apiserver
#image: gcr.io/k8s-staging-prometheus-adapter-amd64
#image: willdockerhub/prometheus-adapter:v0.9.0
image: directxman12/k8s-prometheus-adapter-amd64:v0.8.4
args:
- --secure-port=6443
- --tls-cert-file=/var/run/serving-cert/serving.crt
- --tls-private-key-file=/var/run/serving-cert/serving.key
- --logtostderr=true
- --prometheus-url=http://192.168.0.106:9090/
- --metrics-relist-interval=5s
- --metrics-max-age=5s
- --v=6
- --config=/etc/adapter/config.yaml
ports:
- containerPort: 6443
volumeMounts:
- mountPath: /var/run/serving-cert
name: volume-serving-cert
readOnly: true
- mountPath: /etc/adapter/
name: config
readOnly: true
- mountPath: /tmp
name: tmp-vol
volumes:
- name: volume-serving-cert
secret:
secretName: cm-adapter-serving-certs
- name: config
configMap:
name: adapter-config
- name: tmp-vol
emptyDir: {}
seriesQuery: determines the set of metrics to query
seriesFilters: used to filter metrics
is: <regex>, keeps metrics whose name matches the regex.
isNot: <regex>, keeps metrics whose name does not match the regex.
resources: associates the metric's labels with k8s resource types (which must be real resources), e.g. pod and namespace
(that is why pod and namespace need to be among the metric's labels)
name: used to rename the metric
matches: ^container_(.*)$ - regular expressions are supported
as: defaults to $1; an empty value also means $1
Next, configure custom-metrics-config-map.yaml:
apiVersion: v1
kind: ConfigMap
metadata:
name: adapter-config
namespace: custom-metrics
data:
config.yaml: |
rules:
- seriesQuery: '{__name__=~"^jtthink_.*",namespace!=""}'
seriesFilters: []
resources:
overrides:
namespace:
resource: namespace
svcname:
resource: service
name:
matches: ^jtthink_(.*)$
as: ""
metricsQuery: sum(rate(<<.Series>>{<<.LabelMatchers>>}[1m])) by (<<.GroupBy>>)
- seriesQuery: '{__name__=~"^container_.*",container!="POD",namespace!="",pod!=""}'
seriesFilters: []
resources:
overrides:
namespace:
resource: namespace
pod:
resource: pod
name:
matches: ^container_(.*)_seconds_total$
as: ""
metricsQuery: sum(rate(<<.Series>>{<<.LabelMatchers>>,container!="POD"}[1m])) by (<<.GroupBy>>)
- seriesQuery: '{__name__=~"^container_.*",container!="POD",namespace!="",pod!=""}'
seriesFilters:
- isNot: ^container_.*_seconds_total$
resources:
overrides:
namespace:
resource: namespace
pod:
resource: pod
name:
matches: ^container_(.*)_total$
as: ""
metricsQuery: sum(rate(<<.Series>>{<<.LabelMatchers>>,container!="POD"}[1m])) by (<<.GroupBy>>)
- seriesQuery: '{__name__=~"^container_.*",container!="POD",namespace!="",pod!=""}'
seriesFilters:
- isNot: ^container_.*_total$
resources:
overrides:
namespace:
resource: namespace
pod:
resource: pod
name:
matches: ^container_(.*)$
as: ""
metricsQuery: sum(<<.Series>>{<<.LabelMatchers>>,container!="POD"}) by (<<.GroupBy>>)
- seriesQuery: '{namespace!="",__name__!~"^container_.*"}'
seriesFilters:
- isNot: .*_total$
resources:
template: <<.Resource>>
name:
matches: ""
as: ""
metricsQuery: sum(<<.Series>>{<<.LabelMatchers>>}) by (<<.GroupBy>>)
- seriesQuery: '{namespace!="",__name__!~"^container_.*"}'
seriesFilters:
- isNot: .*_seconds_total
resources:
template: <<.Resource>>
name:
matches: ^(.*)_total$
as: ""
metricsQuery: sum(rate(<<.Series>>{<<.LabelMatchers>>}[1m])) by (<<.GroupBy>>)
- seriesQuery: '{namespace!="",__name__!~"^container_.*"}'
seriesFilters: []
resources:
template: <<.Resource>>
name:
matches: ^(.*)_seconds_total$
as: ""
metricsQuery: sum(rate(<<.Series>>{<<.LabelMatchers>>}[1m])) by (<<.GroupBy>>)
resourceRules:
cpu:
containerQuery: sum(rate(container_cpu_usage_seconds_total{<<.LabelMatchers>>}[1m])) by (<<.GroupBy>>)
nodeQuery: sum(rate(container_cpu_usage_seconds_total{<<.LabelMatchers>>, id='/'}[1m])) by (<<.GroupBy>>)
resources:
overrides:
instance:
resource: node
namespace:
resource: namespace
pod:
resource: pod
containerLabel: container
memory:
containerQuery: sum(container_memory_working_set_bytes{<<.LabelMatchers>>}) by (<<.GroupBy>>)
nodeQuery: sum(container_memory_working_set_bytes{<<.LabelMatchers>>,id='/'}) by (<<.GroupBy>>)
resources:
overrides:
instance:
resource: node
namespace:
resource: namespace
pod:
resource: pod
containerLabel: container
window: 1m
externalRules:
- seriesQuery: '{__name__=~"^.*_queue_(length|size)$",namespace!=""}'
resources:
overrides:
namespace:
resource: namespace
name:
matches: ^.*_queue_(length|size)$
as: "$0"
metricsQuery: max(<<.Series>>{<<.LabelMatchers>>})
- seriesQuery: '{__name__=~"^.*_queue$",namespace!=""}'
resources:
overrides:
namespace:
resource: namespace
name:
matches: ^.*_queue$
as: "$0"
metricsQuery: max(<<.Series>>{<<.LabelMatchers>>})
Scaling the workload with HPA #
kind: HorizontalPodAutoscaler
apiVersion: autoscaling/v2beta2
metadata:
name: prodhpa
spec:
scaleTargetRef:
# points at the Deployment we created earlier
apiVersion: apps/v1
kind: Deployment
name: prodmetrics
# specify the minimum
minReplicas: 1
maxReplicas: 3
metrics:
# we use the Object (Service) type here; the other metric types are Resource and Pods
- type: Object
object:
metric:
name: prods_visit
describedObject:
apiVersion: v1
kind: Service
name: prodmetrics
target:
type: Value
value: 8000m
In the HPA above, a target value of 8000m is Kubernetes quantity notation for 8 (m = 1/1000).
About scaling down:
You can add or change the following flag in the kube-controller-manager manifest (/etc/kubernetes/manifests/kube-controller-manager.yaml):
--horizontal-pod-autoscaler-downscale-stabilization=1m (the default is 5 minutes; normally do not change it)
AlertManager quick start and installation #
Docker deployment:
docker pull prom/alertmanager:v0.22.2
- Prometheus handles scraping and the configuration of alerting rules
- When an alert rule fires, the alert is forwarded to the separate Alertmanager component
- Alertmanager processes it and sends it out (email, DingTalk bot, WeChat, a custom webhook, etc.)
config.yaml
# root route
route:
# receiver of the root route (child routes can also be configured)
receiver: 'test-receiver'
# alerts received within this window are grouped into one notification sent to the receiver
group_wait: 30s
# interval between two notifications for the same group
group_interval: 1m
# interval before the same alert is sent again
repeat_interval: 2m
# grouping rule, covered later
group_by: [alertname]
# define all receivers
receivers:
# receiver name
- name: 'test-receiver'
# webhook type; email, DingTalk, WeChat, etc. can also be configured
webhook_configs:
- url: 'http://192.168.0.106:9091/'
testhook.go
package main
import (
"fmt"
"github.com/gin-gonic/gin"
"io/ioutil"
"log"
)
func main() {
r:=gin.New()// gin
r.POST("/", func(c *gin.Context) {
b,err:=ioutil.ReadAll(c.Request.Body)
if err!=nil{log.Println(err)}
fmt.Println("收到告警信息")
fmt.Println(string(b))
c.JSON(200,gin.H{"message":"OK"})
})
r.Run(":9090")
}
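Instead of printing the raw body, the webhook can decode the JSON that Alertmanager posts. Here is a sketch of a variant of testhook.go that models only a subset of the webhook payload fields (status, receiver, and the alerts with their labels and annotations); consult the Alertmanager webhook_config documentation for the full schema:
package main

import (
	"fmt"
	"log"

	"github.com/gin-gonic/gin"
)

// Subset of the Alertmanager webhook payload.
type notification struct {
	Status   string `json:"status"` // "firing" or "resolved"
	Receiver string `json:"receiver"`
	Alerts   []struct {
		Status      string            `json:"status"`
		Labels      map[string]string `json:"labels"`
		Annotations map[string]string `json:"annotations"`
		StartsAt    string            `json:"startsAt"`
		EndsAt      string            `json:"endsAt"`
	} `json:"alerts"`
}

func main() {
	r := gin.New()
	r.POST("/", func(c *gin.Context) {
		var n notification
		if err := c.ShouldBindJSON(&n); err != nil {
			log.Println(err)
			c.JSON(400, gin.H{"message": "bad payload"})
			return
		}
		for _, a := range n.Alerts {
			fmt.Println(n.Status, a.Labels["alertname"], a.Annotations["summary"])
		}
		c.JSON(200, gin.H{"message": "OK"})
	})
	r.Run(":9090")
}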
Run the webhook with docker:
docker run --name webhook -d \
-v /home/shenyi/prometheus/alertmanager/webhook:/app \
-p 9091:9090 \
alpine:3.12 /app/testhook
Start alertmanager:
docker run -d -p 9093:9093 \
-v /home/shenyi/prometheus/alertmanager/config/config.yaml:/etc/alertmanager/alertmanager.yml \
--name alertmanager prom/alertmanager:v0.22.2
Modify the Prometheus configuration:
alerting:
alertmanagers:
- static_configs:
- targets: ["192.168.0.106:9093"]
Reload:
curl -X POST http://192.168.0.106:9090/-/reload
AlertManager alert rule configuration #
groups:
- name: prods
rules:
- alert: prodsvisit
expr: sum(rate(jtthink_prods_visit[2m])) > 0.5
for: 10s
labels:
level: warning
annotations:
summary: "商品访问量飙升"
description: " 商品访问量飙升,预估值:{{ $value }}"
The rough template:
groups:
- name:
  rules:
  - alert: # alert name -- must be unique
    expr: # PromQL expression that defines when the rule fires
    for: # how long the condition must hold before it is pushed to Alertmanager
    labels:
      level: # custom label
    annotations: # annotations (custom)
      summary: # summary (custom)
      description: # description (custom)
Restart Prometheus (this time also mounting the rules directory):
docker run -d --name pm \
-p 9090:9090 \
-v /home/shenyi/prometheus/config:/config \
-v /home/shenyi/prometheus/rules:/rules \
prom/prometheus:v2.30.0 --web.enable-lifecycle --config.file=/config/prometheus.yml --web.enable-admin-api
rule_files:
- "/rules/prodsrule.yaml"
By default Prometheus evaluates these alerting rules every minute (this can be changed with the following setting):
global:
evaluation_interval: 1m
Reload:
curl -X POST http://192.168.0.106:9090/-/reload
Then hit the API we built to drive up the visit count:
http://121.36.252.154:31880/prods/visit?pid=102&num=10
Log collection #
fluent-bit
Responsible for parsing and filtering the data.
fluentd
Receives the data parsed by fluent-bit (and forwards it on, e.g. to an MQ such as Kafka, or into ELK).
Reference documentation:
https://docs.fluentbit.io/manual/installation/kubernetes
The files are mainly here: