
End-to-end tests broke after including the namespace in alerts #257

Open
rhmdnd opened this issue Jul 6, 2022 · 4 comments
Labels
lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness.

Comments

@rhmdnd
Contributor

rhmdnd commented Jul 6, 2022

We recently merged support for including the namespace in the NodeHasIntegrityFailure alert [0].

This helps users understand where the alert is coming from, but some assertions in the end-to-end tests appear to fail with the new format [1].

Opening this issue to track the work to get e2e tests running again.

[0] af58faa
[1] https://github.com/openshift/file-integrity-operator/blob/master/tests/e2e/e2e_test.go#L56

@mrogers950
Contributor

@rhmdnd I haven't seen this in CI; was it when running locally?

@rhmdnd
Contributor Author

rhmdnd commented Jul 13, 2022

I saw it in CI here:

https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/openshift_file-integrity-operator/256/pull-ci-openshift-file-integrity-operator-master-e2e-aws/1544768695480356864#1:build-log.txt%3A1320

Pasting the actual output so it's persisted with the issue:

 --- PASS: TestFileIntegrityConfigurationRevert (252.27s)
=== RUN   TestFileIntegrityConfigurationStatus
I0706 20:42:19.724622   12125 request.go:665] Waited for 1.094093155s due to client-side throttling, not priority and fairness, request: GET:https://api.ci-op-r83l2b9c-eb8a9.origin-ci-int-aws.dev.rhcloud.com:6443/api/v1?timeout=32s
I0706 20:42:29.924350   12125 request.go:665] Waited for 11.293718737s due to client-side throttling, not priority and fairness, request: GET:https://api.ci-op-r83l2b9c-eb8a9.origin-ci-int-aws.dev.rhcloud.com:6443/apis/discovery.k8s.io/v1?timeout=32s
I0706 20:42:40.123236   12125 request.go:665] Waited for 8.735896089s due to client-side throttling, not priority and fairness, request: GET:https://api.ci-op-r83l2b9c-eb8a9.origin-ci-int-aws.dev.rhcloud.com:6443/apis/discovery.k8s.io/v1beta1?timeout=32s
    client.go:47: resource type ServiceAccount with namespace/name (osdk-e2e-710d083a-14f6-4632-848b-aaee5fe71704/file-integrity-daemon) created
    client.go:47: resource type ServiceAccount with namespace/name (osdk-e2e-710d083a-14f6-4632-848b-aaee5fe71704/file-integrity-operator) created
    client.go:47: resource type Role with namespace/name (osdk-e2e-710d083a-14f6-4632-848b-aaee5fe71704/file-integrity-daemon) created
    client.go:47: resource type Role with namespace/name (osdk-e2e-710d083a-14f6-4632-848b-aaee5fe71704/file-integrity-operator) created
    client.go:47: resource type Role with namespace/name (osdk-e2e-710d083a-14f6-4632-848b-aaee5fe71704/leader-election-role) created
    client.go:47: resource type ClusterRole with namespace/name (/file-integrity-operator) created
    client.go:47: resource type ClusterRole with namespace/name (/file-integrity-operator-metrics) created
    client.go:47: resource type ClusterRole with namespace/name (/fileintegrity-editor-role) created
    client.go:47: resource type ClusterRole with namespace/name (/fileintegrity-viewer-role) created
    client.go:47: resource type RoleBinding with namespace/name (osdk-e2e-710d083a-14f6-4632-848b-aaee5fe71704/file-integrity-daemon) created
    client.go:47: resource type RoleBinding with namespace/name (osdk-e2e-710d083a-14f6-4632-848b-aaee5fe71704/file-integrity-operator) created
    client.go:47: resource type RoleBinding with namespace/name (osdk-e2e-710d083a-14f6-4632-848b-aaee5fe71704/leader-election-rolebinding) created
    client.go:47: resource type RoleBinding with namespace/name (osdk-e2e-710d083a-14f6-4632-848b-aaee5fe71704/prometheus-k8s) created
    client.go:47: resource type ClusterRoleBinding with namespace/name (/file-integrity-operator) created
    client.go:47: resource type ClusterRoleBinding with namespace/name (/file-integrity-operator-metrics) created
    client.go:47: resource type Deployment with namespace/name (osdk-e2e-710d083a-14f6-4632-848b-aaee5fe71704/file-integrity-operator) created
    helpers.go:272: Initialized cluster resources
    wait_util.go:59: Deployment available (1/1)
    client.go:47: resource type  with namespace/name (osdk-e2e-710d083a-14f6-4632-848b-aaee5fe71704/e2e-test-configstatus) created
    helpers.go:362: Created FileIntegrity: &{TypeMeta:{Kind: APIVersion:} ObjectMeta:{Name:e2e-test-configstatus GenerateName: Namespace:osdk-e2e-710d083a-14f6-4632-848b-aaee5fe71704 SelfLink: UID:9d1c1f88-5d21-4425-8ff1-9ed7cb8618f8 ResourceVersion:38637 Generation:1 CreationTimestamp:2022-07-06 20:42:53 +0000 UTC DeletionTimestamp:<nil> DeletionGracePeriodSeconds:<nil> Labels:map[] Annotations:map[] OwnerReferences:[] Finalizers:[] ClusterName: ManagedFields:[{Manager:e2e.test Operation:Update APIVersion:fileintegrity.openshift.io/v1alpha1 Time:2022-07-06 20:42:53 +0000 UTC FieldsType:FieldsV1 FieldsV1:{"f:spec":{".":{},"f:config":{".":{},"f:gracePeriod":{},"f:maxBackups":{}},"f:debug":{},"f:nodeSelector":{".":{},"f:node-role.kubernetes.io/worker":{}},"f:tolerations":{}}} Subresource:}]} Spec:{NodeSelector:map[node-role.kubernetes.io/worker:] Config:{Name: Namespace: Key: GracePeriod:20 MaxBackups:5} Debug:true Tolerations:[{Key:node-role.kubernetes.io/master Operator:Exists Value: Effect:NoSchedule TolerationSeconds:<nil>}]} Status:{Phase:}}
    helpers.go:839: Got (Active) result #1 out of 0 needed.
    helpers.go:850: FileIntegrity ready (Active)
    helpers.go:398: FileIntegrity deployed successfully
    helpers.go:899: Found FileIntegrityStatus event: Active
    helpers.go:839: Got (Active) result #1 out of 0 needed.
    helpers.go:850: FileIntegrity ready (Active)
    helpers.go:899: Found FileIntegrityStatus event: Initializing
    helpers.go:1606: error getting output exit status 7
    helpers.go:1581: metrics output:
        Warning: would violate PodSecurity "restricted:v1.24": allowPrivilegeEscalation != false (container "metrics-test" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (container "metrics-test" must set securityContext.capabilities.drop=["ALL"]), runAsNonRoot != true (pod or container "metrics-test" must set securityContext.runAsNonRoot=true), seccompProfile (pod or container "metrics-test" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")
        If you don't see a command prompt, try pressing enter.
        pod "metrics-test" deleted
        pod osdk-e2e-710d083a-14f6-4632-848b-aaee5fe71704/metrics-test terminated (Error)
        
    helpers.go:1629: 0
    helpers.go:1629: 1
    helpers.go:1629: 2
    helpers.go:1629: 3
    helpers.go:1629: 4
    e2e_test.go:449: unexpected metrics value
    helpers.go:1567: wrote logs for file-integrity-operator-65975b7b67-v79p6/self
time="2022-07-06T20:43:54Z" level=info msg="Skipping cleanup function since --skip-cleanup-error is true"
--- FAIL: TestFileIntegrityConfigurationStatus (96.24s)
=== RUN   TestFileIntegrityConfigurationIgnoreMissing
I0706 20:43:55.964928   12125 request.go:665] Waited for 1.029263949s due to client-side throttling, not priority and fairness, request: GET:https://api.ci-op-r83l2b9c-eb8a9.origin-ci-int-aws.dev.rhcloud.com:6443/apis/image.openshift.io/v1?timeout=32s
I0706 20:44:05.964958   12125 request.go:665] Waited for 11.028919399s due to client-side throttling, not priority and fairness, request: GET:https://api.ci-op-r83l2b9c-eb8a9.origin-ci-int-aws.dev.rhcloud.com:6443/apis/security.openshift.io/v1?timeout=32s
I0706 20:44:16.164959   12125 request.go:665] Waited for 8.536386266s due to client-side throttling, not priority and fairness, request: GET:https://api.ci-op-r83l2b9c-eb8a9.origin-ci-int-aws.dev.rhcloud.com:6443/apis/discovery.k8s.io/v1beta1?timeout=32s
    client.go:47: resource type ServiceAccount with namespace/name (osdk-e2e-c8457e75-55c1-4fef-b8a2-858fa90f5ad6/file-integrity-daemon) created
    client.go:47: resource type ServiceAccount with namespace/name (osdk-e2e-c8457e75-55c1-4fef-b8a2-858fa90f5ad6/file-integrity-operator) created
    client.go:47: resource type Role with namespace/name (osdk-e2e-c8457e75-55c1-4fef-b8a2-858fa90f5ad6/file-integrity-daemon) created
    client.go:47: resource type Role with namespace/name (osdk-e2e-c8457e75-55c1-4fef-b8a2-858fa90f5ad6/file-integrity-operator) created
    client.go:47: resource type Role with namespace/name (osdk-e2e-c8457e75-55c1-4fef-b8a2-858fa90f5ad6/leader-election-role) created
    helpers.go:264: failed to initialize cluster resources: clusterroles.rbac.authorization.k8s.io "file-integrity-operator" already exists
--- FAIL: TestFileIntegrityConfigurationIgnoreMissing (29.81s) 

@mrogers950
Contributor

Looks like the metrics failed due to the serving-cert not being available at startup, which is weird because we detect that much earlier and restart.

https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/pr-logs/pull/openshift_file-integrity-operator/256/pull-ci-openshift-file-integrity-operator-master-e2e-aws/1544768695480356864/artifacts/e2e-aws/test/artifacts/e2e-test-configstatus_file-integrity-operator-65975b7b67-v79p6_self.log

{"level":"error","ts":1657140177.7594256,"logger":"metrics","msg":"Metrics service failed","error":"open /var/run/secrets/serving-cert/tls.crt: no such file or directory"}

Let's keep this issue open for now in case it pops up again.

@mrogers950
Contributor

/lifecycle frozen

@openshift-ci openshift-ci bot added the lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. label Jul 14, 2022