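Kubernetes 1.30: Validating Admission Policy Is Generally Available
On behalf of the Kubernetes project, I am excited to announce that ValidatingAdmissionPolicy has reached general availability as part of the Kubernetes 1.30 release. If you have not yet read about this new declarative alternative to validating admission webhooks, you may want to read our previous post about the feature. If you have already heard about ValidatingAdmissionPolicies and are eager to try them out, there is no better time than now.
Let's get a taste of ValidatingAdmissionPolicy by replacing a simple webhook.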
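Example admission webhook
First, let's look at an example of a simple webhook. Here is an excerpt from a webhook that enforces that runAsNonRoot, readOnlyRootFilesystem, allowPrivilegeEscalation, and privileged are set to their most restrictive values.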
// Excerpt: appsv1 is k8s.io/api/apps/v1 and errors is k8s.io/apimachinery/pkg/util/errors.
func verifyDeployment(deploy *appsv1.Deployment) error {
	var errs []error
	for i, c := range deploy.Spec.Template.Spec.Containers {
		if c.Name == "" {
			return fmt.Errorf("container %d has no name", i)
		}
		if c.SecurityContext == nil {
			errs = append(errs, fmt.Errorf("container %q does not have SecurityContext", c.Name))
			// Skip the field checks below: dereferencing a nil SecurityContext would panic.
			continue
		}
		if c.SecurityContext.RunAsNonRoot == nil || !*c.SecurityContext.RunAsNonRoot {
			errs = append(errs, fmt.Errorf("container %q must set RunAsNonRoot to true in its SecurityContext", c.Name))
		}
		if c.SecurityContext.ReadOnlyRootFilesystem == nil || !*c.SecurityContext.ReadOnlyRootFilesystem {
			errs = append(errs, fmt.Errorf("container %q must set ReadOnlyRootFilesystem to true in its SecurityContext", c.Name))
		}
		if c.SecurityContext.AllowPrivilegeEscalation != nil && *c.SecurityContext.AllowPrivilegeEscalation {
			errs = append(errs, fmt.Errorf("container %q must NOT set AllowPrivilegeEscalation to true in its SecurityContext", c.Name))
		}
		if c.SecurityContext.Privileged != nil && *c.SecurityContext.Privileged {
			errs = append(errs, fmt.Errorf("container %q must NOT set Privileged to true in its SecurityContext", c.Name))
		}
	}
	// NewAggregate returns nil when errs is empty.
	return errors.NewAggregate(errs)
}
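For background, see "What are admission webhooks?". To follow along with this walkthrough, see the full code of this webhook.
The policy
Now let's try to recreate the validation faithfully with a ValidatingAdmissionPolicy.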
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata:
  name: "pod-security.policy.example.com"
spec:
  failurePolicy: Fail
  matchConstraints:
    resourceRules:
    - apiGroups: ["apps"]
      apiVersions: ["v1"]
      operations: ["CREATE", "UPDATE"]
      resources: ["deployments"]
  validations:
  - expression: object.spec.template.spec.containers.all(c, has(c.securityContext) && has(c.securityContext.runAsNonRoot) && c.securityContext.runAsNonRoot)
    message: 'all containers must set runAsNonRoot to true'
  - expression: object.spec.template.spec.containers.all(c, has(c.securityContext) && has(c.securityContext.readOnlyRootFilesystem) && c.securityContext.readOnlyRootFilesystem)
    message: 'all containers must set readOnlyRootFilesystem to true'
  - expression: object.spec.template.spec.containers.all(c, !has(c.securityContext) || !has(c.securityContext.allowPrivilegeEscalation) || !c.securityContext.allowPrivilegeEscalation)
    message: 'all containers must NOT set allowPrivilegeEscalation to true'
  - expression: object.spec.template.spec.containers.all(c, !has(c.securityContext) || !has(c.securityContext.Privileged) || !c.securityContext.Privileged)
    message: 'all containers must NOT set privileged to true'
Create the policy with kubectl. Great, no complaints so far. But let's fetch the policy object and take a look at its status.
kubectl get -oyaml validatingadmissionpolicies/pod-security.policy.example.com
status:
  typeChecking:
    expressionWarnings:
    - fieldRef: spec.validations[3].expression
      warning: |
        apps/v1, Kind=Deployment: ERROR: <input>:1:76: undefined field 'Privileged'
         | object.spec.template.spec.containers.all(c, !has(c.securityContext) || !has(c.securityContext.Privileged) || !c.securityContext.Privileged)
         | ...........................................................................^
        ERROR: <input>:1:128: undefined field 'Privileged'
         | object.spec.template.spec.containers.all(c, !has(c.securityContext) || !has(c.securityContext.Privileged) || !c.securityContext.Privileged)
         | ...............................................................................................................................^
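The policy was type-checked against its matched type, apps/v1.Deployment. According to the fieldRef, the problem is in the expression at index 3 (indices start at 0), that is, the fourth validation. That expression accesses an undefined field, Privileged. Ah, it looks like a copy-and-paste error. The field name should be lowercase: privileged.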
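As a small aid for reading fieldRef values like the one above, here is a minimal Go sketch that extracts the zero-based index from a reference such as spec.validations[3].expression and prints its one-based position. The helper validationIndex is hypothetical, written for this walkthrough, and is not part of any Kubernetes library.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// fieldRefPattern matches references such as "spec.validations[3].expression".
var fieldRefPattern = regexp.MustCompile(`^spec\.validations\[(\d+)\]\.`)

// validationIndex returns the zero-based index of the validation that a
// fieldRef points at. Hypothetical helper, for illustration only.
func validationIndex(fieldRef string) (int, error) {
	m := fieldRefPattern.FindStringSubmatch(fieldRef)
	if m == nil {
		return 0, fmt.Errorf("unrecognized fieldRef %q", fieldRef)
	}
	return strconv.Atoi(m[1])
}

func main() {
	idx, err := validationIndex("spec.validations[3].expression")
	if err != nil {
		panic(err)
	}
	// A zero-based index of 3 refers to the fourth entry in spec.validations.
	fmt.Printf("index %d, entry number %d\n", idx, idx+1)
}
```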
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata:
  name: "pod-security.policy.example.com"
spec:
  failurePolicy: Fail
  matchConstraints:
    resourceRules:
    - apiGroups: ["apps"]
      apiVersions: ["v1"]
      operations: ["CREATE", "UPDATE"]
      resources: ["deployments"]
  validations:
  - expression: object.spec.template.spec.containers.all(c, has(c.securityContext) && has(c.securityContext.runAsNonRoot) && c.securityContext.runAsNonRoot)
    message: 'all containers must set runAsNonRoot to true'
  - expression: object.spec.template.spec.containers.all(c, has(c.securityContext) && has(c.securityContext.readOnlyRootFilesystem) && c.securityContext.readOnlyRootFilesystem)
    message: 'all containers must set readOnlyRootFilesystem to true'
  - expression: object.spec.template.spec.containers.all(c, !has(c.securityContext) || !has(c.securityContext.allowPrivilegeEscalation) || !c.securityContext.allowPrivilegeEscalation)
    message: 'all containers must NOT set allowPrivilegeEscalation to true'
  - expression: object.spec.template.spec.containers.all(c, !has(c.securityContext) || !has(c.securityContext.privileged) || !c.securityContext.privileged)
    message: 'all containers must NOT set privileged to true'
Check its status again, and you should see that all warnings have been cleared.
Next, let's create a namespace for our tests.
kubectl create namespace policy-test
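Then, I bind the policy to the namespace. For now, I set the action to Warn so that the policy prints warnings instead of rejecting requests. This is especially useful for collecting the results of all expressions during development and automated testing.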
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicyBinding
metadata:
  name: "pod-security.policy-binding.example.com"
spec:
  policyName: "pod-security.policy.example.com"
  validationActions: ["Warn"]
  matchResources:
    namespaceSelector:
      matchLabels:
        "kubernetes.io/metadata.name": "policy-test"
Test out policy enforcement.
kubectl create -n policy-test -f- <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: nginx
  name: nginx
spec:
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - image: nginx
        name: nginx
        securityContext:
          privileged: true
          allowPrivilegeEscalation: true
EOF
Warning: Validation failed for ValidatingAdmissionPolicy 'pod-security.policy.example.com' with binding 'pod-security.policy-binding.example.com': all containers must set runAsNonRoot to true
Warning: Validation failed for ValidatingAdmissionPolicy 'pod-security.policy.example.com' with binding 'pod-security.policy-binding.example.com': all containers must set readOnlyRootFilesystem to true
Warning: Validation failed for ValidatingAdmissionPolicy 'pod-security.policy.example.com' with binding 'pod-security.policy-binding.example.com': all containers must NOT set allowPrivilegeEscalation to true
Warning: Validation failed for ValidatingAdmissionPolicy 'pod-security.policy.example.com' with binding 'pod-security.policy-binding.example.com': all containers must NOT set privileged to true
Error from server: error when creating "STDIN": admission webhook "webhook.example.com" denied the request: [container "nginx" must set RunAsNonRoot to true in its SecurityContext, container "nginx" must set ReadOnlyRootFilesystem to true in its SecurityContext, container "nginx" must NOT set AllowPrivilegeEscalation to true in its SecurityContext, container "nginx" must NOT set Privileged to true in its SecurityContext]
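Looks great! The policy and the webhook give equivalent results. After testing a few other cases, and once we are confident in our policy, it may be time for some cleanup.
- In every expression, we repeat the access to object.spec.template.spec.containers and to each securityContext;
- There is a pattern of checking that a field is present before accessing it, which looks a bit verbose.
Fortunately, since Kubernetes 1.28, we have new solutions for both issues. Variable Composition allows us to extract repeated sub-expressions into their own variables. Kubernetes also enables the optional library for CEL, which is excellent for working with fields that are, you guessed it, optional.
With both features in mind, let's refactor the policy a bit.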
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata:
  name: "pod-security.policy.example.com"
spec:
  failurePolicy: Fail
  matchConstraints:
    resourceRules:
    - apiGroups: ["apps"]
      apiVersions: ["v1"]
      operations: ["CREATE", "UPDATE"]
      resources: ["deployments"]
  variables:
  - name: containers
    expression: object.spec.template.spec.containers
  - name: securityContexts
    expression: 'variables.containers.map(c, c.?securityContext)'
  validations:
  - expression: variables.securityContexts.all(c, c.?runAsNonRoot == optional.of(true))
    message: 'all containers must set runAsNonRoot to true'
  - expression: variables.securityContexts.all(c, c.?readOnlyRootFilesystem == optional.of(true))
    message: 'all containers must set readOnlyRootFilesystem to true'
  - expression: variables.securityContexts.all(c, c.?allowPrivilegeEscalation != optional.of(true))
    message: 'all containers must NOT set allowPrivilegeEscalation to true'
  - expression: variables.securityContexts.all(c, c.?privileged != optional.of(true))
    message: 'all containers must NOT set privileged to true'
The policy is now much cleaner and more readable. After updating the policy, you should see it work the same as before.
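To see why the optional-based expressions accept and reject the same inputs as the has()-based ones, it can help to model an optional boolean field as a Go *bool, much as the webhook does. The sketch below is only an analogy written for this walkthrough, not how the API server evaluates CEL; the function names isTrue and isNotTrue are made up.

```go
package main

import "fmt"

// isTrue models `c.?field == optional.of(true)`:
// the field must be present and set to true.
func isTrue(field *bool) bool {
	return field != nil && *field
}

// isNotTrue models `c.?field != optional.of(true)`:
// the field may be absent, or present and set to false.
func isNotTrue(field *bool) bool {
	return !isTrue(field)
}

func main() {
	yes, no := true, false
	cases := []struct {
		name  string
		field *bool // nil stands for an absent field
	}{
		{"absent", nil},
		{"false", &no},
		{"true", &yes},
	}
	for _, c := range cases {
		fmt.Printf("%-6s isTrue=%-5v isNotTrue=%v\n", c.name, isTrue(c.field), isNotTrue(c.field))
	}
}
```

In this model, runAsNonRoot and readOnlyRootFilesystem correspond to isTrue, while allowPrivilegeEscalation and privileged correspond to isNotTrue. A container without a securityContext behaves like one whose fields are all absent.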
Now let's change the policy binding from warning to actually denying requests that fail validation.
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicyBinding
metadata:
  name: "pod-security.policy-binding.example.com"
spec:
  policyName: "pod-security.policy.example.com"
  validationActions: ["Deny"]
  matchResources:
    namespaceSelector:
      matchLabels:
        "kubernetes.io/metadata.name": "policy-test"
Finally, remove the webhook. Now the result should contain only messages from the policy.
kubectl create -n policy-test -f- <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: nginx
  name: nginx
spec:
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - image: nginx
        name: nginx
        securityContext:
          privileged: true
          allowPrivilegeEscalation: true
EOF
The deployments "nginx" is invalid: : ValidatingAdmissionPolicy 'pod-security.policy.example.com' with binding 'pod-security.policy-binding.example.com' denied request: all containers must set runAsNonRoot to true
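Note that, by design, the policy stops evaluating after the first expression that causes the request to be denied. This differs from the case where the expressions only generate warnings, in which every failing expression is reported.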
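The difference between the two outputs above can be pictured with a simplified Go model. This is a sketch of the observed behavior only, not the API server's implementation; the validation type and the evaluate function are hypothetical names, and the passes field stands in for the result of evaluating a CEL expression.

```go
package main

import "fmt"

// validation pairs a precomputed check result with the message to report
// when the check fails.
type validation struct {
	message string
	passes  bool
}

// evaluate returns the messages reported for a request. With
// stopAtFirstFailure set (modeling Deny) it returns after the first failing
// validation; otherwise (modeling Warn) it collects every failure.
func evaluate(vs []validation, stopAtFirstFailure bool) []string {
	var msgs []string
	for _, v := range vs {
		if v.passes {
			continue
		}
		msgs = append(msgs, v.message)
		if stopAtFirstFailure {
			break
		}
	}
	return msgs
}

func main() {
	// The nginx Deployment used in this walkthrough fails all four checks.
	vs := []validation{
		{"all containers must set runAsNonRoot to true", false},
		{"all containers must set readOnlyRootFilesystem to true", false},
		{"all containers must NOT set allowPrivilegeEscalation to true", false},
		{"all containers must NOT set privileged to true", false},
	}
	fmt.Println("Warn:", evaluate(vs, false))
	fmt.Println("Deny:", evaluate(vs, true))
}
```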
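Set up monitoring
Unlike a webhook, a policy is not a dedicated process that can expose its own metrics. Instead, you can use metrics from the API server.
Here are some examples, in Prometheus Query Language, of common monitoring tasks.
To find the 95th percentile execution duration of the policy shown above: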
histogram_quantile(0.95, sum(rate(apiserver_validating_admission_policy_check_duration_seconds_bucket{policy="pod-security.policy.example.com"}[5m])) by (le))
To find the rate of policy evaluations:
rate(apiserver_validating_admission_policy_check_total{policy="pod-security.policy.example.com"}[5m])
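If you want to run such a query from a program rather than from the Prometheus UI, one option is the Prometheus HTTP API's instant query endpoint, /api/v1/query. The Go sketch below only builds the request URL and does not send it. The server address prometheus.example.com:9090 is a placeholder, the sketch assumes that server scrapes the API server's metrics, and buildQueryURL is a hypothetical helper.

```go
package main

import (
	"fmt"
	"net/url"
)

// buildQueryURL returns the instant-query URL for promQL on the Prometheus
// server at base. Hypothetical helper, for illustration only.
func buildQueryURL(base, promQL string) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	u.Path = "/api/v1/query"
	// Encode takes care of escaping braces, quotes, and brackets in the query.
	u.RawQuery = url.Values{"query": {promQL}}.Encode()
	return u.String(), nil
}

func main() {
	const promQL = `rate(apiserver_validating_admission_policy_check_total{policy="pod-security.policy.example.com"}[5m])`
	// Placeholder address; replace it with your own Prometheus server.
	u, err := buildQueryURL("http://prometheus.example.com:9090", promQL)
	if err != nil {
		panic(err)
	}
	fmt.Println(u)
}
```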
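You can read the metrics reference to learn more about the metrics above. The ValidatingAdmissionPolicy metrics are currently in alpha; more and better metrics are expected as they stabilize in future releases.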