JustDoIt
累计撰写 112 篇文章,累计收到 4 条评论。
搜索到 18 篇与 … 的结果
2022-04-02
备份与迁移k8s集群神器
前言

一般来说大家都用 etcd 备份来恢复 k8s 集群,但有时我们可能不小心删掉了一个 namespace,假设这个 ns 里面有上百个服务,瞬间没了,怎么办?当然可以用 CI/CD 系统重新发布,但时间会花费很久。这时候,VMware 的 Velero 就派上用场了。Velero 可以帮助我们:

- 灾备场景:提供备份、恢复 k8s 集群的能力
- 迁移场景:提供拷贝集群资源到其他集群的能力(复制同步开发、测试、生产环境的集群配置,简化环境配置)

下面我就介绍一下如何使用 Velero 完成备份和迁移。

Velero 地址:https://github.com/vmware-tanzu/velero
ACK 插件地址:https://github.com/AliyunContainerService/velero-plugin

下载 Velero 客户端

Velero 由客户端和服务端组成:服务端部署在目标 k8s 集群上,客户端则是运行在本地的命令行工具。

- 前往 Velero 的 Release 页面下载客户端,直接在 GitHub 上下载即可
- 解压 release 包
- 将 release 包中的二进制文件 velero 移动到 $PATH 中的某个目录下
- 执行 velero -h 测试

部署 velero-plugin 插件

拉取代码:

git clone https://github.com/AliyunContainerService/velero-plugin

配置修改

修改 `install/credentials-velero` 文件,将新建用户中获得的 `AccessKeyID` 和 `AccessKeySecret` 填入,这里的 OSS EndPoint 为之前 OSS 的访问域名:

ALIBABA_CLOUD_ACCESS_KEY_ID=<ALIBABA_CLOUD_ACCESS_KEY_ID>
ALIBABA_CLOUD_ACCESS_KEY_SECRET=<ALIBABA_CLOUD_ACCESS_KEY_SECRET>
ALIBABA_CLOUD_OSS_ENDPOINT=<ALIBABA_CLOUD_OSS_ENDPOINT>

修改 `install/01-velero.yaml`,将 OSS 配置填入:

---
apiVersion: v1
kind: ServiceAccount
metadata:
  namespace: velero
  name: velero
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    component: velero
  name: velero
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: velero
  namespace: velero
---
apiVersion: velero.io/v1
kind: BackupStorageLocation
metadata:
  labels:
    component: velero
  name: default
  namespace: velero
spec:
  config:
    region: cn-beijing
  objectStorage:
    bucket: k8s-backup-test
    prefix: test
  provider: alibabacloud
---
apiVersion: velero.io/v1
kind: VolumeSnapshotLocation
metadata:
  labels:
    component: velero
  name: default
  namespace: velero
spec:
  config:
    region: cn-beijing
  provider: alibabacloud
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: velero
  namespace: velero
spec:
  replicas: 1
  selector:
    matchLabels:
      deploy: velero
  template:
    metadata:
      annotations:
        prometheus.io/path: /metrics
        prometheus.io/port: "8085"
        prometheus.io/scrape: "true"
      labels:
        component: velero
        deploy: velero
    spec:
      serviceAccountName: velero
      containers:
      - name: velero
        # sync from velero/velero:v1.2.0
        image: registry.cn-hangzhou.aliyuncs.com/acs/velero:v1.2.0
        imagePullPolicy: IfNotPresent
        command:
        - /velero
        args:
        - server
        - --default-volume-snapshot-locations=alibabacloud:default
        env:
        - name: VELERO_SCRATCH_DIR
          value: /scratch
        - name: ALIBABA_CLOUD_CREDENTIALS_FILE
          value: /credentials/cloud
        volumeMounts:
        - mountPath: /plugins
          name: plugins
        - mountPath: /scratch
          name: scratch
        - mountPath: /credentials
          name: cloud-credentials
      initContainers:
      - image: registry.cn-hangzhou.aliyuncs.com/acs/velero-plugin-alibabacloud:v1.2-991b590
        imagePullPolicy: IfNotPresent
        name: velero-plugin-alibabacloud
        volumeMounts:
        - mountPath: /target
          name: plugins
      volumes:
      - emptyDir: {}
        name: plugins
      - emptyDir: {}
        name: scratch
      - name: cloud-credentials
        secret:
          secretName: cloud-credentials

k8s 部署 Velero 服务

# 新建 namespace
kubectl create namespace velero
# 部署 credentials-velero 的 secret
kubectl create secret generic cloud-credentials --namespace velero --from-file cloud=install/credentials-velero
# 部署 CRD
kubectl apply -f install/00-crds.yaml
# 部署 Velero
kubectl apply -f install/01-velero.yaml
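部署完成后,可以先确认 Velero 服务端已经正常跑起来。下面是一个简单的核对示例(沿用上文创建的 velero 命名空间,命令本身不在原文中,仅作补充):

# 查看 velero 的 Deployment 和 Pod 是否就绪
kubectl -n velero get deployment,pods
# 客户端二进制已放入 $PATH 后,velero version 会同时打印客户端和服务端版本,
# 能看到 Server 版本就说明客户端与集群内服务端连通正常
velero version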
备份测试

这里我们将使用 velero 备份集群内相关的 resource,当集群出现故障或误操作时,能够快速恢复这些 resource。首先用下面的 yaml 部署测试应用:

---
apiVersion: v1
kind: Namespace
metadata:
  name: nginx-example
  labels:
    app: nginx
---
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: nginx-deployment
  namespace: nginx-example
spec:
  replicas: 2
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - image: nginx:1.7.9
        name: nginx
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: nginx
  name: my-nginx
  namespace: nginx-example
spec:
  ports:
  - port: 80
    targetPort: 80
  selector:
    app: nginx

我们可以全量备份,也可以只备份需要备份的某一个 namespace,本处只备份 nginx-example 这一个 namespace:

[rsync@velero-plugin]$ kubectl get pods -n nginx-example
NAME                                READY   STATUS    RESTARTS   AGE
nginx-deployment-5c689d88bb-f8vsx   1/1     Running   0          6m31s
nginx-deployment-5c689d88bb-rt2zk   1/1     Running   0          6m32s
[rsync@velero]$ cd velero-v1.4.0-linux-amd64/
[rsync@velero-v1.4.0-linux-amd64]$ ll
total 56472
drwxrwxr-x 4 rsync rsync     4096 Jun  1 15:02 examples
-rw-r--r-- 1 rsync rsync    10255 Dec 10 01:08 LICENSE
-rwxr-xr-x 1 rsync rsync 57810814 May 27 04:33 velero
[rsync@velero-v1.4.0-linux-amd64]$ ./velero backup create nginx-backup --include-namespaces nginx-example --wait
Backup request "nginx-backup" submitted successfully.
Waiting for backup to complete. You may safely press ctrl-c to stop waiting - your backup will continue in the background.
.
Backup completed with status: Completed. You may check for more information using the commands `velero backup describe nginx-backup` and `velero backup logs nginx-backup`.
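备份完成后,可以顺手核对一下备份对象的状态和内容,下面是一个补充示例(沿用上面的 nginx-backup,非原文步骤):

# 列出当前所有备份及其状态
./velero backup get
# 查看这次备份具体包含了哪些资源,--details 会逐个列出对象
./velero backup describe nginx-backup --details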
删除 ns

[rsync@velero-v1.4.0-linux-amd64]$ kubectl delete namespaces nginx-example
namespace "nginx-example" deleted
[rsync@velero-v1.4.0-linux-amd64]$ kubectl get pods -n nginx-example
No resources found.
恢复

[rsync@velero-v1.4.0-linux-amd64]$ ./velero restore create --from-backup nginx-backup --wait
Restore request "nginx-backup-20200603180922" submitted successfully.
Waiting for restore to complete. You may safely press ctrl-c to stop waiting - your restore will continue in the background.
Restore completed with status: Completed. You may check for more information using the commands `velero restore describe nginx-backup-20200603180922` and `velero restore logs nginx-backup-20200603180922`.
[rsync@velero-v1.4.0-linux-amd64]$ kubectl get pods -n nginx-example
NAME                                READY   STATUS              RESTARTS   AGE
nginx-deployment-5c689d88bb-f8vsx   1/1     Running             0          5s
nginx-deployment-5c689d88bb-rt2zk   0/1     ContainerCreating   0          5s

可以看到已经恢复了。另外,集群迁移的操作方式和备份恢复也是一样的。
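迁移时的大致思路是:让目标集群的 Velero 指向同一个 OSS bucket,备份会自动同步到目标集群,然后直接在目标集群上 restore。下面是一个示意(假设目标集群已按前文步骤部署好 Velero,并配置了相同的 BackupStorageLocation):

# 在目标集群上确认备份已经从对象存储同步过来
velero backup get
# 直接从同一个备份恢复,即可把资源"迁移"到目标集群
velero restore create --from-backup nginx-backup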
下面看一个特殊的场景:再部署一个项目,看恢复时会不会删掉新部署的项目。新建一个 tomcat 容器:

[rsync@tomcat-test]$ kubectl get pods -n nginx-example
NAME                                READY   STATUS    RESTARTS   AGE
nginx-deployment-5c689d88bb-f8vsx   1/1     Running   0          65m
nginx-deployment-5c689d88bb-rt2zk   1/1     Running   0          65m
tomcat-test-sy-677ff78f6b-rc5vq     1/1     Running   0          7s

restore 一下:

[rsync@velero-v1.4.0-linux-amd64]$ ./velero restore create --from-backup nginx-backup
Restore request "nginx-backup-20200603191726" submitted successfully.
Run `velero restore describe nginx-backup-20200603191726` or `velero restore logs nginx-backup-20200603191726` for more details.
[rsync@velero-v1.4.0-linux-amd64]$ kubectl get pods -n nginx-example
NAME                                READY   STATUS    RESTARTS   AGE
nginx-deployment-5c689d88bb-f8vsx   1/1     Running   0          68m
nginx-deployment-5c689d88bb-rt2zk   1/1     Running   0          68m
tomcat-test-sy-677ff78f6b-rc5vq     1/1     Running   0          2m33s

可以看到没有覆盖。接下来删除 nginx 的 deployment,再 restore:
[rsync@velero-v1.4.0-linux-amd64]$ kubectl delete deployment nginx-deployment -n nginx-example
deployment.extensions "nginx-deployment" deleted
[rsync@velero-v1.4.0-linux-amd64]$ kubectl get pods -n nginx-example
NAME                              READY   STATUS    RESTARTS   AGE
tomcat-test-sy-677ff78f6b-rc5vq   1/1     Running   0          4m18s
[rsync@velero-v1.4.0-linux-amd64]$ ./velero restore create --from-backup nginx-backup
Restore request "nginx-backup-20200603191949" submitted successfully.
Run `velero restore describe nginx-backup-20200603191949` or `velero restore logs nginx-backup-20200603191949` for more details.
[rsync@velero-v1.4.0-linux-amd64]$ kubectl get pods -n nginx-example
NAME                                READY   STATUS              RESTARTS   AGE
nginx-deployment-5c689d88bb-f8vsx   1/1     Running             0          2s
nginx-deployment-5c689d88bb-rt2zk   0/1     ContainerCreating   0          2s
tomcat-test-sy-677ff78f6b-rc5vq     1/1     Running             0          4m49s

可以看到,对我们的 tomcat 项目是没有影响的。

结论:velero 恢复不是直接覆盖,而是会恢复当前集群中不存在的 resource;已有的 resource 不会回滚到之前的版本。如需回滚,需在 restore 之前提前删除现有的 resource。

高级用法

可以设置一个周期性定时备份:

# 每日 1 点进行备份
velero create schedule <SCHEDULE NAME> --schedule="0 1 * * *"
# 每日 1 点进行备份,备份保留 48 小时
velero create schedule <SCHEDULE NAME> --schedule="0 1 * * *" --ttl 48h
# 每 6 小时进行一次备份
velero create schedule <SCHEDULE NAME> --schedule="@every 6h"
# 每日对 web namespace 进行一次备份
velero create schedule <SCHEDULE NAME> --schedule="@every 24h" --include-namespaces web

定时备份的名称为:`<SCHEDULE NAME>-<TIMESTAMP>`,恢复命令为:`velero restore create --from-backup <SCHEDULE NAME>-<TIMESTAMP>`。

如需备份恢复持久卷,备份如下:

velero backup create nginx-backup-volume --snapshot-volumes --include-namespaces nginx-example

该备份会在集群所在 region 给云盘创建快照(当前还不支持 NAS 和 OSS 存储),快照恢复云盘只能在同 region 完成。恢复命令如下:

velero restore create --from-backup nginx-backup-volume --restore-volumes

删除备份

方法一,通过命令直接删除:

velero delete backups default-backup

方法二,设置备份自动过期,在创建备份时加上 TTL 参数:

velero backup create <BACKUP-NAME> --ttl <DURATION>

还可以为资源添加指定标签,添加了该标签的资源在备份时会被排除:

# 添加标签
kubectl label -n <ITEM_NAMESPACE> <RESOURCE>/<NAME> velero.io/exclude-from-backup=true
# 为 default namespace 添加标签
kubectl label -n default namespace/default velero.io/exclude-from-backup=true
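顺带一提,上面的周期性备份除了用 velero create schedule 命令,也可以用 Schedule 自定义资源以声明式方式维护。下面是一个示意清单(nginx-daily 这个名称是示例,字段以所装 Velero 版本的 CRD 为准):

cat <<EOF | kubectl apply -f -
apiVersion: velero.io/v1
kind: Schedule
metadata:
  name: nginx-daily          # 示例名称
  namespace: velero
spec:
  schedule: "0 1 * * *"      # 每日 1 点备份
  template:                  # 这里就是一份 BackupSpec
    includedNamespaces:
    - nginx-example
    ttl: 48h0m0s             # 备份保留 48 小时
EOF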
参考链接:https://yq.aliyun.com/articles/705007?spm=a2c4e.11163080.searchblog.140.1a8b2ec1TYJPbF

院长技术
2022年04月02日
56 阅读
0 评论
0 点赞
2022-04-02
K8s基于自定义指标实现自动扩容
基于自定义指标

除了基于 CPU 和内存来进行自动扩缩容之外,我们还可以根据自定义的监控指标来进行。这就需要使用 Prometheus Adapter:Prometheus 用于监控应用的负载和集群本身的各种指标,Prometheus Adapter 可以帮我们使用 Prometheus 收集的指标来制定扩缩容策略,这些指标通过 APIServer 暴露,HPA 资源对象也可以很轻易地直接使用。下面来看具体怎么实现。

部署应用

首先,我们部署一个示例应用,在该应用程序上测试基于 Prometheus 指标的自动缩放,资源清单文件如下所示(podinfo.yaml):

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: podinfo
spec:
  selector:
    matchLabels:
      app: podinfo
  replicas: 1
  template:
    metadata:
      labels:
        app: podinfo
      annotations:
        prometheus.io/scrape: 'true'
    spec:
      containers:
      - name: podinfod
        image: stefanprodan/podinfo:0.0.1
        imagePullPolicy: Always
        command:
        - ./podinfo
        - -port=9898
        - -logtostderr=true
        - -v=2
        volumeMounts:
        - name: metadata
          mountPath: /etc/podinfod/metadata
          readOnly: true
        ports:
        - containerPort: 9898
          protocol: TCP
        readinessProbe:
          httpGet:
            path: /readyz
            port: 9898
          initialDelaySeconds: 1
          periodSeconds: 2
          failureThreshold: 1
        livenessProbe:
          httpGet:
            path: /healthz
            port: 9898
          initialDelaySeconds: 1
          periodSeconds: 3
          failureThreshold: 2
        resources:
          requests:
            memory: "32Mi"
            cpu: "1m"
          limits:
            memory: "256Mi"
            cpu: "100m"
      volumes:
      - name: metadata
        downwardAPI:
          items:
          - path: "labels"
            fieldRef:
              fieldPath: metadata.labels
          - path: "annotations"
            fieldRef:
              fieldPath: metadata.annotations
---
apiVersion: v1
kind: Service
metadata:
  name: podinfo
  labels:
    app: podinfo
spec:
  type: NodePort
  ports:
  - port: 9898
    targetPort: 9898
    nodePort: 31198
    protocol: TCP
  selector:
    app: podinfo

接下来我们将 Prometheus-Adapter 安装到集群中,这里选用 helm 安装,当然也可以直接用 yaml 文件安装。

Prometheus-Adapter 规则

Prometheus-Adapter 规则大致可以分为以下几个部分:

- seriesQuery:查询 Prometheus 的语句,通过这个查询语句查询到的所有指标都可以用于 HPA。
- seriesFilters:查询到的指标可能存在不需要的,可以通过它过滤掉。
- resources:通过 seriesQuery 查询到的只是指标,如果需要查询某个 Pod 的指标,肯定要将它的名称和所在的命名空间作为指标的标签进行查询。resources 就是将指标的标签和 k8s 的资源类型关联起来,最常用的就是 pod 和 namespace。有两种添加标签的方式,一种是 overrides,另一种是 template。
  - overrides:它会将指标中的标签和 k8s 资源关联起来。上面示例中就是将指标中的 pod 和 namespace 标签和 k8s 中的 pod 和 namespace 关联起来,因为 pod 和 namespace 都属于核心 api 组,所以不需要指定 api 组。当我们查询某个 pod 的指标时,它会自动将 pod 的名称和名称空间作为标签加入到查询条件中。比如 pod: {group: "apps", resource: "deployment"} 这么写,表示的就是将指标中 podinfo 这个标签和 apps 这个 api 组中的 deployment 资源关联起来。
  - template:通过 go 模板的形式。比如 template: "kube_<<.Group>>_<<.Resource>>",表示假如 <<.Group>> 为 apps、<<.Resource>> 为 deployment,那么它就是将指标中 kube_apps_deployment 标签和 deployment 资源关联起来。
- name:用来给指标重命名。之所以要给指标重命名,是因为有些指标是只增的,比如以 total 结尾的指标,这些指标拿来做 HPA 是没有意义的,我们一般计算它的速率,以速率作为值,那么此时的名称就不能以 total 结尾了,所以要进行重命名。
  - matches:通过正则表达式来匹配指标名,可以进行分组。
  - as:默认值为 $1,也就是第一个分组。as 为空就是使用默认值的意思。
- metricsQuery:这就是 Prometheus 的查询语句了,前面的 seriesQuery 查询是获得 HPA 指标,当我们要查某个指标的值时就要通过它指定的查询语句进行了。可以看到查询语句使用了速率和分组,这就是解决上面提到的只增指标的问题。
  - Series:表示指标名称。
  - LabelMatchers:附加的标签,目前只有 pod 和 namespace 两种,因此我们要在之前使用 resources 进行关联。
  - GroupBy:就是 pod 名称,同样需要使用 resources 进行关联。

安装

新建 hpa-prome-adapter-values.yaml 文件覆盖默认的 Values 值,用来安装 Prometheus-Adapter(我用的是 helm2),文件如下:

rules:
  default: false
  custom:
  - seriesQuery: 'http_requests_total'
    resources:
      overrides:
        kubernetes_namespace:
          resource: namespace
        kubernetes_pod_name:
          resource: pod
    name:
      matches: "^(.*)_total"
      as: "${1}_per_second"
    metricsQuery: (sum(rate(<<.Series>>{<<.LabelMatchers>>}[1m])) by (<<.GroupBy>>))

prometheus:
  url: http://prometheus-clusterip.monitor.svc.cluster.local

安装:

helm repo add apphub https://apphub.aliyuncs.com/
helm install --name prome-adapter --namespace monitor -f hpa-prome-adapter-values.yaml apphub/prometheus-adapter
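安装命令执行后,可以先快速确认 custom.metrics.k8s.io 这个聚合 API 组已经注册上。下面是补充的检查示例(假设 Adapter 装在上文的 monitor 命名空间、release 名为 prome-adapter):

# 查看 API 组里是否出现了 custom.metrics.k8s.io
kubectl api-versions | grep custom.metrics.k8s.io
# 确认 adapter 的 Pod 正常运行
kubectl -n monitor get pods | grep prome-adapter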
"custom.metrics.k8s.io/v1beta1", "resources": [ { "name": "namespaces/http_requests_per_second", "singularName": "", "namespaced": false, "kind": "MetricValueList", "verbs": [ "get" ] }, { "name": "pods/http_requests_per_second", "singularName": "", "namespaced": true, "kind": "MetricValueList", "verbs": [ "get" ] } ] } 我们可以看到 http_requests_per_second 指标可用。 现在,让我们检查该指标的当前值:[root@prometheus]# kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/*/http_requests_per_second" | jq . { "kind": "MetricValueList", "apiVersion": "custom.metrics.k8s.io/v1beta1", "metadata": { "selfLink": "/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/%2A/http_requests_per_second" }, "items": [ { "describedObject": { "kind": "Pod", "namespace": "default", "name": "podinfo-5cdc457c8b-99xtw", "apiVersion": "/v1" }, "metricName": "http_requests_per_second", "timestamp": "2020-06-02T12:01:01Z", "value": "888m", "selector": null }, { "describedObject": { "kind": "Pod", "namespace": "default", "name": "podinfo-5cdc457c8b-b7pfz", "apiVersion": "/v1" }, "metricName": "http_requests_per_second", "timestamp": "2020-06-02T12:01:01Z", "value": "888m", "selector": null } ] } 下面部署hpa对象apiVersion: autoscaling/v2beta1 kind: HorizontalPodAutoscaler metadata: name: podinfo spec: scaleTargetRef: apiVersion: extensions/v1beta1 kind: Deployment name: podinfo minReplicas: 2 maxReplicas: 5 metrics: - type: Pods pods: metricName: http_requests_per_second targetAverageValue: 3 部署之后,可见:[root@prometheus-adapter]# kubectl get hpa NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE podinfo Deployment/podinfo 911m/10 2 5 2 70s [root@prometheus-adapter]# kubectl describe hpa Name: podinfo Namespace: default Labels: <none> Annotations: kubectl.kubernetes.io/last-applied-configuration: {"apiVersion":"autoscaling/v2beta1","kind":"HorizontalPodAutoscaler","metadata":{"annotations":{},"name":"podinfo","namespace":"default"},... CreationTimestamp: Tue, 02 Jun 2020 17:53:14 +0800 Reference: Deployment/podinfo Metrics: ( current / target ) "http_requests_per_second" on pods: 911m / 10 Min replicas: 2 Max replicas: 5 Deployment pods: 2 current / 2 desired Conditions: Type Status Reason Message ---- ------ ------ ------- AbleToScale True ScaleDownStabilized recent recommendations were higher than current one, applying the highest recent recommendation ScalingActive True ValidMetricFound the HPA was able to successfully calculate a replica count from pods metric http_requests_per_second 做一个ab压测:ab -n 2000 -c 5 http://sy.test.com:31198/ 观察下hpa变化:Events: Type Reason Age From Message ---- ------ ---- ---- ------- Normal SuccessfulRescale 9m29s horizontal-pod-autoscaler New size: 3; reason: pods metric http_requests_per_second above target Normal SuccessfulRescale 9m18s horizontal-pod-autoscaler New size: 4; reason: pods metric http_requests_per_second above target Normal SuccessfulRescale 3m34s horizontal-pod-autoscaler New size: 3; reason: All metrics below target Normal SuccessfulRescale 3m4s horizontal-pod-autoscaler New size: 2; reason: All metrics below target 发现触发扩容动作了,副本到了4,并且压测结束后,过了5分钟左右,又恢复到最小值2个。参考链接:https://github.com/directxman12/k8s-prometheus-adapter院长技术
参考链接:https://github.com/directxman12/k8s-prometheus-adapter

院长技术
2022年04月02日
81 阅读
0 评论
1 点赞
2022-03-31
k8s在线命令
kubectl 在线命令查询入口:https://kubernetes.io/docs/reference/generated/kubectl/kubectl-commands
2022年03月31日
116 阅读
0 评论
0 点赞