pipeline/docs/developers/README.md
Scott 6c132b937e Document bug with sidecar usage of nop container
Sidecars are stopped by having their Image field swapped out to the
`nop` image. When the nop image starts up in the sidecar container it is
supposed to immediately exit because `nop` doesn't include the sidecar's
command. However, when the `nop` image *does* contain the command that
the sidecar is running, the sidecar container will actually never stop
and the Task will eventually timeout.

For most sidecars this issue will not manifest - the `nop` container
that Tekton provides out of the box includes only a very limited set of
commands. However, if a Tekton operator overrides the `nop` image when
deploying the tekton controller (for example, because their organization
requires images configured for Tekton to be built on their org's own base
image) then there is a risk that `nop` will start offering more commands
and therefore introduce a higher risk that a sidecar's command will be
runnable by the `nop` image finally increasing the likelihood of Tasks
with sidecars running until timeout.

This issue is a known bug with the way sidecars operate at the moment
and is being tracked in https://github.com/tektoncd/pipeline/issues/1347
but should be documented clearly.
2019-10-25 15:15:24 -05:00


Developer docs

This document aims to help maintainers and developers of the project understand its complexity.

How are resources shared between tasks

A PipelineRun uses a PVC to share resources between tasks. The PipelineRun mounts the PVC volume at the path /pvc.

  • If a resource in a task is declared as output then the TaskRun controller adds a step to copy each output resource to the directory path /pvc/task_name/resource_name.

  • If an input resource includes a from condition then the TaskRun controller adds a step to copy the resource from the PVC directory path /pvc/previous_task/resource_name.
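
The two rules above can be illustrated with a minimal Pipeline sketch. It assumes the v1alpha1 API, and the Pipeline, Task, and resource names are hypothetical. Under these assumptions the output of build would be copied to /pvc/build/workspace, and test would copy its input from that same path.

```yaml
apiVersion: tekton.dev/v1alpha1    # assumption: v1alpha1 API
kind: Pipeline
metadata:
  name: demo-pipeline              # hypothetical name
spec:
  resources:
    - name: source-repo
      type: git
  tasks:
    - name: build
      taskRef:
        name: build-task           # hypothetical Task
      resources:
        outputs:
          - name: workspace        # declared as output: copied to /pvc/build/workspace
            resource: source-repo
    - name: test
      taskRef:
        name: test-task            # hypothetical Task
      resources:
        inputs:
          - name: workspace
            resource: source-repo
            from: [build]          # from condition: copied from /pvc/build/workspace
```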

An alternative is to use a GCS storage bucket to share the artifacts. This can be configured using a ConfigMap named config-artifact-bucket with the following attributes:

  • location: the address of the bucket (for example gs://mybucket)
  • bucket.service.account.secret.name: the name of the secret that will contain the credentials for the service account with access to the bucket
  • bucket.service.account.secret.key: the key in the secret that holds the required service account JSON. Configuring the bucket with a retention policy, after which files are deleted, is recommended.
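
A minimal sketch of such a ConfigMap follows. The ConfigMap name and the three keys come from the list above; the namespace, the Secret name, and the Secret key are assumptions chosen for illustration.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: config-artifact-bucket
  namespace: tekton-pipelines      # assumption: namespace where the controller is deployed
data:
  location: gs://mybucket
  # Hypothetical Secret holding the service account credentials.
  bucket.service.account.secret.name: gcs-artifact-credentials
  # Hypothetical key inside that Secret containing the service account JSON.
  bucket.service.account.secret.key: service-account.json
```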

Both options provide the same functionality to the pipeline. The choice depends on the infrastructure used. For example, on some Kubernetes platforms creating a persistent volume can be slower than uploading and downloading files to a bucket, and if the cluster runs in multiple zones, access to the persistent volume can fail.

How are inputs handled

Input resources, like source code (git) or artifacts, are placed at the path /workspace/task_resource_name. A resource definition in a task can specify a custom target directory. If targetPath is set on a task input then the controllers are responsible for adding container definitions that create the directories and fetch the versioned artifacts into that directory.
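
As an illustration, the sketch below declares a git input with a custom target directory. It assumes the v1alpha1 API, and the Task, resource, and image names are hypothetical. With targetPath set, the source would be fetched into /workspace/src/app; without it, into /workspace/source.

```yaml
apiVersion: tekton.dev/v1alpha1    # assumption: v1alpha1 API
kind: Task
metadata:
  name: build-from-source          # hypothetical name
spec:
  inputs:
    resources:
      - name: source
        type: git
        targetPath: src/app        # fetched into /workspace/src/app
  steps:
    - name: list-files
      image: ubuntu                # any image that provides ls
      command: ["ls"]
      args: ["/workspace/src/app"]
```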

How are outputs handled

Output resources, like source code (git) or artifacts (storage resource), are expected in the directory path /workspace/output/resource_name.

  • If the resource has an output "action", such as an upload to blob storage, then a container step is added for this action.

  • If a PVC volume is present (the TaskRun holds an owner reference to a PipelineRun) then a copy step is added as well.

  • If the resource is declared only as an output of the task, and not as an input, then the copy step copies the resource from /workspace/output/resource_name to the PVC path /pvc/task_name/resource_name, as in the following example.

    kind: Task
    metadata:
      name: get-gcs-task
      namespace: default
    spec:
      outputs:
        resources:
          - name: gcs-workspace
            type: storage
    
  • If the resource is declared only as an output of the task, and not as an input, and the resource is defined with a targetPath, then the copy step copies the resource from /workspace/outputstuff to the PVC path /pvc/task_name/resource_name, as in the following example.

    kind: Task
    metadata:
      name: get-gcs-task
      namespace: default
    spec:
      outputs:
        resources:
          - name: gcs-workspace
            type: storage
            targetPath: /workspace/outputstuff
    
  • If the resource is declared as both an input and an output of the task, and the input resource declares a custom target directory (random-space), then the copy step copies the resource from /workspace/random-space/ to the PVC path /pvc/task_name/resource_name, as in the following example.

    kind: Task
    metadata:
      name: get-gcs-task
      namespace: default
    spec:
      inputs:
        resources:
          - name: gcs-workspace
            type: storage
            targetPath: random-space
      outputs:
        resources:
          - name: gcs-workspace
            type: storage
    
  • If the resource is declared as both an input and an output of the task without a custom target directory, then the copy step copies the resource from /workspace/resource_name/ to the PVC path /pvc/task_name/resource_name, as in the following example.

    kind: Task
    metadata:
      name: get-gcs-task
      namespace: default
    spec:
      inputs:
        resources:
          - name: gcs-workspace
            type: storage
      outputs:
        resources:
          - name: gcs-workspace
            type: storage
    

Entrypoint rewriting and step ordering

The entrypoint binary is injected into the Task's containers and wraps each Task step to manage the execution order of the containers. The entrypoint binary has the following arguments:

  • wait_file - If specified, file to wait for
  • wait_file_content - If specified, wait until the file has non-zero size
  • post_file - If specified, file to write upon completion
  • entrypoint - The command to run in the image being wrapped

In the PodSpec created by the TaskRun, the entrypoint of each Task step is changed to the entrypoint binary with the arguments listed above, and a volume containing the binary and the file(s) is mounted.
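
As a rough illustration of this rewriting, a step container in the generated Pod might look like the sketch below. The flag names come from the argument list above. The binary location, the wait and post file paths, the volume name, the image, and the single-dash flag syntax are assumptions made for illustration and are not taken from the controller's source.

```yaml
# Hypothetical second step of a Task after entrypoint rewriting.
containers:
  - name: step-run-tests
    image: example.com/test-runner:latest    # hypothetical image
    command: ["/builder/tools/entrypoint"]   # assumed location of the injected binary
    args:
      - "-wait_file"
      - "/builder/tools/0"       # assumed: file written when the previous step completes
      - "-post_file"
      - "/builder/tools/1"       # assumed: file this step writes upon completion
      - "-entrypoint"
      - "/bin/run-tests"         # the command the step originally declared (hypothetical)
    volumeMounts:
      - name: tools              # assumed volume holding the binary and the signal files
        mountPath: /builder/tools
```

Chaining each step's post_file to the next step's wait_file is what would serialize containers that Kubernetes otherwise starts in parallel.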

If the image is in a private registry, the service account should include an ImagePullSecret.
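
A minimal sketch of such a service account is shown below. The ServiceAccount and Secret names are hypothetical, and the Secret is assumed to already exist in the same namespace with registry credentials.

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: build-bot                  # hypothetical service account used by the TaskRun
imagePullSecrets:
  - name: registry-credentials     # hypothetical Secret with the registry credentials
```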

Builder namespace on containers

The /builder/ namespace is reserved on containers for various system tools, such as the following:

  • The environment variable HOME is set to /builder/home, which is used by the builder tools and injected into all of the step containers
  • The default location for output images is /builder/output-images
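
For illustration, the HOME setting described above would appear on a step container roughly as follows. The container name and image are hypothetical.

```yaml
containers:
  - name: step-build                     # hypothetical step container
    image: example.com/builder:latest    # hypothetical image
    env:
      - name: HOME
        value: /builder/home             # injected into every step container
```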

Handling of injected sidecars

Tekton has to take some special steps to support sidecars that are injected into TaskRun Pods. Without intervention, sidecars typically run for the entire lifetime of a Pod, but in Tekton's case it is desirable for the sidecars to run only as long as the Steps take to complete. Tekton also needs to schedule the sidecars to start before a Task's Steps begin, in case the Steps rely on a sidecar's behaviour, for example to join an Istio service mesh. To handle all of this, Tekton Pipelines implements the following lifecycle for sidecar containers:

First, the Downward API is used to project an annotation on the TaskRun's Pod into the entrypoint container as a file. The annotation starts as an empty string, so the file projected by the Downward API has zero length. The entrypoint binary spins, waiting for that file to have non-zero size.

The sidecar containers start up. Once they are all in a ready state, the annotation is populated with the string "READY", which in turn populates the file projected by the Downward API. The entrypoint binary recognizes that the projected file has a non-zero size and allows the Task's steps to begin.
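
The projection described in the two paragraphs above can be sketched as a Pod fragment. The downwardAPI volume syntax is standard Kubernetes. The annotation key, the volume name, the mount path, and the container details are assumptions made for illustration.

```yaml
# Illustrative Pod fragment; names and paths are hypothetical.
metadata:
  annotations:
    tekton.dev/ready: ""           # assumed annotation key; empty until sidecars are ready
spec:
  volumes:
    - name: downward
      downwardAPI:
        items:
          - path: ready            # file stays zero-length while the annotation is empty
            fieldRef:
              fieldPath: metadata.annotations['tekton.dev/ready']
  containers:
    - name: step-first
      image: example.com/builder:latest
      volumeMounts:
        - name: downward
          mountPath: /builder/downward
      # The first step would wait on /builder/downward/ready (wait_file together
      # with wait_file_content) until the annotation is set to "READY".
```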

On completion of all steps in a Task, the TaskRun reconciler stops any sidecar containers by swapping the Image field of each sidecar container to the nop image. Kubernetes observes the change and relaunches the container with the updated container image. The nop container image exits immediately because it does not provide the command that the sidecar is configured to run. Kubernetes then considers the container Terminated, and the TaskRun's Pod stops.
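
The effect of the swap can be sketched as two versions of the same container entry. The container name, both image references, and the command are hypothetical; only the image field differs between the two.

```yaml
# Before: the sidecar as it runs alongside the steps.
containers:
  - name: sidecar-proxy
    image: example.com/proxy:1.0
    command: ["/usr/local/bin/proxy"]
---
# After all steps complete: the reconciler has changed only the image.
containers:
  - name: sidecar-proxy
    image: example.com/tekton/nop:latest   # hypothetical reference to the nop image
    command: ["/usr/local/bin/proxy"]      # expected to be absent from nop, so the container exits
```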

There is a known issue with this implementation of sidecar support. When the nop image does provide the sidecar's command, the sidecar continues to run even after nop has been swapped into the sidecar container's image field, and the Task eventually times out. See https://github.com/tektoncd/pipeline/issues/1347 for the issue tracking this bug. Until this issue is resolved, the best way to avoid it is to not override the nop image when deploying the Tekton controller, or to ensure that the overridden nop image contains as few commands as possible.