@dims
Last active September 22, 2023 00:57
Example of a CI job moving from using bootstrap to pod utils
# periodic job in config/jobs/kubernetes-sigs/cluster-api-provider-gcp/cluster-api-provider-gcp-ci.yaml
- name: ci-cluster-api-provider-gcp-make-conformance-stable-k8s-ci-artifacts
  interval: 3h
  labels:
    preset-service-account: "true"
    preset-bazel-scratch-dir: "true"
    preset-bazel-remote-cache-enabled: "true"
    preset-dind-enabled: "true"
    preset-kind-volume-mounts: "true"
  decorate: true
  extra_refs:
  - org: kubernetes-sigs
    repo: cluster-api-provider-gcp
    base_ref: release-0.2
    path_alias: "sigs.k8s.io/cluster-api-provider-gcp"
  - org: kubernetes-sigs
    repo: cluster-api
    base_ref: master
    path_alias: "sigs.k8s.io/cluster-api"
  - org: kubernetes-sigs
    repo: image-builder
    base_ref: master
    path_alias: "sigs.k8s.io/image-builder"
  - org: kubernetes
    repo: kubernetes
    base_ref: master
    path_alias: k8s.io/kubernetes
  spec:
    containers:
    - image: gcr.io/k8s-testimages/kubekins-e2e:v20191105-e60677a-experimental
      env:
      - name: "CAPI_BRANCH"
        value: "stable"
      - name: "CABPK_BRANCH"
        value: "stable"
      command:
      - "runner.sh"
      - "./scripts/ci-e2e.sh"
      - "--use-ci-artifacts"
      # we need privileged mode in order to do docker in docker
      securityContext:
        privileged: true
      resources:
        requests:
          # these are both a bit below peak usage during build
          # this is mostly for building kubernetes
          memory: "9000Mi"
          # during the tests more like 3-20m is used
          cpu: 2000m
  annotations:
    testgrid-dashboards: sig-cluster-lifecycle-cluster-api-provider-gcp, sig-release-master-informing
    testgrid-tab-name: capg-conformance-stable-k8s-master
    testgrid-alert-email: kubernetes-sig-cluster-lifecycle-cluster-api-alerts@googlegroups.com
    testgrid-num-failures-to-alert: "2"

# periodic job in config/jobs/kubernetes-sigs/cluster-api-provider-gcp/cluster-api-provider-gcp-ci.yaml
- name: ci-cluster-api-provider-gcp-make-conformance-stable-k8s-ci-artifacts
  interval: 3h
  branches:
  - release-0.2
  labels:
    preset-service-account: "true"
    preset-bazel-scratch-dir: "true"
    preset-bazel-remote-cache-enabled: "true"
    preset-dind-enabled: "true"
    preset-kind-volume-mounts: "true"
  spec:
    containers:
    - image: gcr.io/k8s-testimages/kubekins-e2e:v20191103-6816af1-master
      env:
      - name: "CAPI_BRANCH"
        value: "stable"
      - name: "CABPK_BRANCH"
        value: "stable"
      args:
      - "--repo=k8s.io/kubernetes=master"
      - "--repo=sigs.k8s.io/cluster-api-provider-gcp=release-0.2"
      - "--repo=sigs.k8s.io/image-builder=master"
      - "--root=/go/src"
      - "--service-account=/etc/service-account/service-account.json"
      - "--upload=gs://kubernetes-jenkins/logs"
      - "--scenario=execute"
      - "--"
      - "bash"
      - "--"
      - "-c"
      - "cd ./../../sigs.k8s.io/cluster-api-provider-gcp && scripts/ci-e2e.sh --use-ci-artifacts"
      # we need privileged mode in order to do docker in docker
      securityContext:
        privileged: true
      resources:
        requests:
          # these are both a bit below peak usage during build
          # this is mostly for building kubernetes
          memory: "9000Mi"
          # during the tests more like 3-20m is used
          cpu: 2000m
  annotations:
    testgrid-dashboards: sig-cluster-lifecycle-cluster-api-provider-gcp, sig-release-master-informing
    testgrid-tab-name: capg-conformance-stable-k8s-master
    testgrid-alert-email: kubernetes-sig-cluster-lifecycle-cluster-api-alerts@googlegroups.com
    testgrid-num-failures-to-alert: "2"

Read this first: https://github.com/kubernetes/test-infra/blob/master/prow/pod-utilities.md

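Note the example above is a periodic job. The first listing is the job after the move to pod utils (decorate: true, extra_refs, command:); the second is the same job before the move (branches, bootstrap args:).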

  1. name - no change
  2. interval - no change
  3. branches - move to extra_refs. Make sure this entry is the first in the list of repos in extra_refs. (WARNING: for a periodic job the current working directory is the checkout of the first extra_refs entry, so it will be wrong otherwise)
  4. labels - no change
    • use preset-bazel-scratch-dir and preset-bazel-remote-cache-enabled if you are building something using bazel
    • use preset-dind-enabled and preset-kind-volume-mounts if you are running kind (also note the bump in resources.requests for cpu and memory; copy the values from an existing job)
  5. add decorate: true to tell prow that you are using pod utils
  6. for every repo that before.yaml pulled in using --repo=, add an entry in extra_refs.
    • Pay attention to the path_alias so it gets checked out in the correct directories on disk
    • be specific about base_ref as well.
    • Note the order of the repos again (see warning in 3. above)
  7. spec.containers.image - no change
  8. spec.containers.env - no change
  9. switch over from args: to command: (make sure you add runner.sh as the first param; this is the script that runs the command)
    • drop --repo, as we moved the repos to extra_refs
    • drop the --job, --root, --service-account, --upload and --scenario magic parameters; we don't need them
    • the bash -c stuff is not needed either; just pass the script you need as the second parameter in command and follow it with any other params your script needs
  10. securityContext, resources, annotations remain the same
  11. Make sure any logs you need are under $ARTIFACTS/logs/
  12. Make sure any references to /go/src are switched to /home/prow/go/src in your scripts
  13. if there is a "--timeout=105" entry in args above the "--" delimiter, use decoration_config like so:
    decoration_config:
      timeout: 105m
  14. Oh! for the image: the latest kubekins-e2e image is fine to use
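
The checklist above can be condensed into a skeleton. This is only a sketch under stated assumptions, not a job from the real config: the job name is a placeholder, the top-level periodics: key is the usual Prow config layout rather than something shown in this gist, and the repos, image and script are borrowed from the example at the top purely for illustration.

```yaml
periodics:
# hypothetical job name; pick your own
- name: ci-my-project-e2e
  interval: 3h
  decorate: true                # step 5: tell prow to use pod utils
  decoration_config:
    timeout: 105m               # step 13: replaces --timeout=105
  extra_refs:
  # step 3: the repo formerly listed under branches goes first,
  # because its checkout becomes the working directory
  - org: kubernetes-sigs
    repo: cluster-api-provider-gcp
    base_ref: release-0.2
    path_alias: sigs.k8s.io/cluster-api-provider-gcp
  # step 6: one entry per former --repo= argument
  - org: kubernetes
    repo: kubernetes
    base_ref: master
    path_alias: k8s.io/kubernetes
  spec:
    containers:
    - image: gcr.io/k8s-testimages/kubekins-e2e:v20191105-e60677a-experimental
      command:
      - runner.sh               # step 9: always the first element
      - ./scripts/ci-e2e.sh     # the script formerly run via bash -c
      - --use-ci-artifacts      # any further params for the script
```

The relative script path works only because the first extra_refs entry is the repo that contains scripts/ci-e2e.sh.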

For pre-submit jobs:

  1. No need to add extra_refs for the main repo; just ensure path_alias is correct.
  2. the current working directory will be the main repo (NOT the first entry in extra_refs)
  3. always_run: true means this CI job will run for every change
  4. optional: true means this CI job does not have to be green for the PR to merge
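
As a rough illustration of the four points above, a pre-submit job might look like the sketch below. The job name is hypothetical, and the presubmits: key with its org/repo nesting is the usual Prow config layout, which this gist does not show; the repos, image and script are again taken from the periodic example.

```yaml
presubmits:
  kubernetes-sigs/cluster-api-provider-gcp:   # org/repo the job is attached to
  # hypothetical job name
  - name: pull-my-project-e2e
    always_run: true      # point 3: run for every change
    optional: true        # point 4: need not be green for the PR to merge
    decorate: true
    # point 1: no extra_refs entry for the main repo, only its path_alias
    path_alias: sigs.k8s.io/cluster-api-provider-gcp
    branches:
    - release-0.2
    extra_refs:           # additional repos only
    - org: kubernetes
      repo: kubernetes
      base_ref: master
      path_alias: k8s.io/kubernetes
    spec:
      containers:
      - image: gcr.io/k8s-testimages/kubekins-e2e:v20191105-e60677a-experimental
        command:
        - runner.sh
        # point 2: resolved relative to the main repo checkout
        - ./scripts/ci-e2e.sh
```

Starting with optional: true keeps a broken job from blocking merges while you iterate.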

Tips:

  • Always start with a presubmit job, get that working and then change the periodic job, so it's easier to iterate
  • Keep the interval small (every hour) to get things working and then switch it over to the longer, desired interval
  • use https://prow.k8s.io/?job=my-favorite-job* to see the last few runs (yes, it supports wildcards)
  • if you don't see your repos cloned correctly check the clone-log.txt in the captured logs (click "Artifacts" link in the prow report for the failed job)

Finally, don't forget to check the YAML indentation! When in doubt, find another CI job that does something similar and see how that is set up!

rikatz commented Nov 13, 2020

Some comments:

  • If the job uses --scenario=kubernetes_e2e, it seems the command should look like this:
        command:
        - runner.sh
        - /workspace/scenarios/kubernetes_e2e.py

This way, runner.sh calls kubernetes_e2e.py, which calls kubetest, which does its magic :)

  • --job can also be dropped

rikatz commented Nov 14, 2020

An example of a migrated job (actually running in parallel with the old one) that uses a scenario can be found in https://github.com/kubernetes/test-infra/blob/master/config/jobs/kubernetes/sig-node/sig-node-presubmit.yaml where "pull-kubernetes-node-e2e" is the old job and "pull-kubernetes-node-e2e-podutil" is the new one (optional, and run manually)

dims commented Sep 22, 2023

thanks @rikatz i've updated the gist for the --job, looks like the job you added was removed along the way. So here's another one i just landed https://github.com/kubernetes/test-infra/pull/30797/files

it does use the trick you have for the python script but from the newly checked out location for test-infra.
