New Kubernetes GitOps Toolkit – Flux CD v2

I have been using the Flux CD operator for a few months to manage Kubernetes clusters in dev and prod, and it is a great tool. When I first reviewed Flux, I liked it for its simplicity, but it was missing some important features, such as the ability to synchronise based on tags instead of a single branch, and configuring the operator through its deployment wasn’t very intuitive and caused some headaches.

A few days ago I stumbled across the new Flux CD GitOps Toolkit, and it got my attention when I saw the new Flux v2 operator architecture. They’ve split the operator functions into separate controllers and use custom resources (CRDs) to configure the Source, Kustomize and Helm configuration.

The feature I was really waiting for is support for Semantic Versioning (semver) in the GitRepository source. With this I am able to create platform releases and separate non-prod and prod clusters more cleanly, which makes the deployment of configuration more controlled and flexible than it was with Flux v1.

You can see below the different release versions I’ve created in my cluster management repository:
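
If you want to create such release tags yourself, a plain git tagging workflow is enough. The commands below are only a minimal illustration; the tag name matches the 0.0.1 example used further down:

$ git tag -a 0.0.1 -m "platform release 0.0.1"
$ git push origin 0.0.1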

Below are two GitRepository examples: the first one syncs based on a static release tag 0.0.1, and the second syncs within a semver range >=0.0.1 <0.1.0:

---
apiVersion: source.toolkit.fluxcd.io/v1alpha1
kind: GitRepository
metadata:
  creationTimestamp: null
  name: gitops-system
  namespace: gitops-system
spec:
  interval: 1m0s
  ref:
    tag: 0.0.1
  secretRef:
    name: gitops-system
  url: ssh://github.com/berndonline/gitops-toolkit
status: {}
---
apiVersion: source.toolkit.fluxcd.io/v1alpha1
kind: GitRepository
metadata:
  creationTimestamp: null
  name: gitops-system
  namespace: gitops-system
spec:
  interval: 1m0s
  ref:
    semver: '>=0.0.1 <0.1.0'
  secretRef:
    name: gitops-system
  url: ssh://github.com/berndonline/gitops-toolkit
status: {}

There are also improvements to the Kustomize configuration: you can add additional overlays depending on your repository folder structure, or combine this with another GitRepository source. In my example repository I have a cluster-specific folder cluster-dev and a folder for common configuration:

.
|____cluster-dev
| |____kustomization.yaml
| |____hello-world_base
| | |____kustomization.yaml
| | |____deploy.yaml
|____common
  |____kustomization.yaml
  |____nginx-service.yaml
  |____nginx_base
    |____kustomization.yaml
    |____service.yaml
    |____nginx.yaml
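
The kustomization.yaml files referenced in the tree are plain Kustomize overlays. A minimal sketch of what cluster-dev/kustomization.yaml could look like (the actual content in my repository may differ):

# cluster-dev/kustomization.yaml - pulls in the hello-world base folder shown above
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - hello-world_base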

You can add multiple Kustomization custom resources, as you can see in my examples: one for the cluster-specific config and a second one for the common configuration, which can be applied to multiple clusters:

---
apiVersion: kustomize.toolkit.fluxcd.io/v1alpha1
kind: Kustomization
metadata:
  creationTimestamp: null
  name: cluster-conf
  namespace: gitops-system
spec:
  interval: 5m0s
  path: ./cluster-dev
  prune: true
  sourceRef:
    kind: GitRepository
    name: gitops-system
status: {}
---
apiVersion: kustomize.toolkit.fluxcd.io/v1alpha1
kind: Kustomization
metadata:
  creationTimestamp: null
  name: common-con
  namespace: gitops-system
spec:
  interval: 5m0s
  path: ./common
  prune: true
  sourceRef:
    kind: GitRepository
    name: gitops-system
status: {}

Let’s install the Flux CD GitOps Toolkit. The toolkit again comes with its own command-line utility, tk, which you use to install and configure the operators. You can find the available CLI versions on the GitHub release page.
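
Installing the CLI is just a matter of downloading the release archive and putting the binary on your PATH. A rough sketch; the repository path and archive name here are assumptions and <version> is a placeholder, so check the release page for the exact download URL:

$ # <version> is a placeholder, pick a release listed on the GitHub release page
$ curl -sL https://github.com/fluxcd/toolkit/releases/download/<version>/tk_<version>_linux_amd64.tar.gz | tar xz
$ sudo mv ./tk /usr/local/bin/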

Set up a new repository to store your k8s configuration:

$ git clone ssh://github.com/berndonline/gitops-toolkit
$ cd gitops-toolkit
$ mkdir -p ./cluster-dev/gitops-system

Generate the GitOps Toolkit manifests, store them under the gitops-system folder, and afterwards apply the configuration to your k8s cluster:

$ tk install --version=latest \
    --export > ./cluster-dev/gitops-system/toolkit-components.yaml
$ kubectl apply -f ./cluster-dev/gitops-system/toolkit-components.yaml 
namespace/gitops-system created
customresourcedefinition.apiextensions.k8s.io/alerts.notification.toolkit.fluxcd.io created
customresourcedefinition.apiextensions.k8s.io/gitrepositories.source.toolkit.fluxcd.io created
customresourcedefinition.apiextensions.k8s.io/helmcharts.source.toolkit.fluxcd.io created
customresourcedefinition.apiextensions.k8s.io/helmreleases.helm.toolkit.fluxcd.io created
customresourcedefinition.apiextensions.k8s.io/helmrepositories.source.toolkit.fluxcd.io created
customresourcedefinition.apiextensions.k8s.io/kustomizations.kustomize.toolkit.fluxcd.io created
customresourcedefinition.apiextensions.k8s.io/providers.notification.toolkit.fluxcd.io created
customresourcedefinition.apiextensions.k8s.io/receivers.notification.toolkit.fluxcd.io created
role.rbac.authorization.k8s.io/crd-controller-gitops-system created
rolebinding.rbac.authorization.k8s.io/crd-controller-gitops-system created
clusterrolebinding.rbac.authorization.k8s.io/cluster-reconciler-gitops-system created
service/notification-controller created
service/source-controller created
service/webhook-receiver created
deployment.apps/helm-controller created
deployment.apps/kustomize-controller created
deployment.apps/notification-controller created
deployment.apps/source-controller created
networkpolicy.networking.k8s.io/deny-ingress created

Check if all the pods are running and use the command tk check to see if the toolkit is working correctly:

$ kubectl get pod -n gitops-system
NAME                                       READY   STATUS    RESTARTS   AGE
helm-controller-64f846df8c-g4mhv           1/1     Running   0          19s
kustomize-controller-6d9745c8cd-n8tth      1/1     Running   0          19s
notification-controller-587c49f7fc-ldcg2   1/1     Running   0          18s
source-controller-689dcd8bd7-rzp55         1/1     Running   0          18s
$ tk check
► checking prerequisites
✔ kubectl 1.18.3 >=1.18.0
✔ Kubernetes 1.18.6 >=1.16.0
► checking controllers
✔ source-controller is healthy
✔ kustomize-controller is healthy
✔ helm-controller is healthy
✔ notification-controller is healthy
✔ all checks passed

Now you can create a GitRepository custom resource; the command generates an SSH key pair locally and displays the public key, which you need to add to your repository’s deploy keys:

$ tk create source git gitops-system \
  --url=ssh://github.com/berndonline/gitops-toolkit \
  --ssh-key-algorithm=ecdsa \
  --ssh-ecdsa-curve=p521 \
  --branch=master \
  --interval=1m
► generating deploy key pair
ecdsa-sha2-nistp521 xxxxxxxxxxx
Have you added the deploy key to your repository: y
► collecting preferred public key from SSH server
✔ collected public key from SSH server:
github.com ssh-rsa xxxxxxxxxxx
► applying secret with keys
✔ authentication configured
✚ generating source
► applying source
✔ source created
◎ waiting for git sync
✗ git clone error: remote repository is empty

Continue with adding the Kustomize configuration:

$ tk create kustomization gitops-system \
  --source=gitops-system \
  --path="./cluster-dev" \
  --prune=true \
  --interval=5m
✚ generating kustomization
► applying kustomization
✔ kustomization created
◎ waiting for kustomization sync
✗ Source is not ready

Afterwards you can add your Kubernetes manifests to the repository, and the operators will start synchronising it and applying the configuration you’ve defined.
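
For example, committing the cluster-dev and common folders from earlier and pushing them to the master branch (the branch the source was created with) is enough to trigger a sync on the next interval. A minimal sketch:

$ git add ./cluster-dev ./common
$ git commit -m "add cluster-dev and common configuration"
$ git push origin master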

You can export the Source and Kustomize configuration:

$ tk export source git gitops-system \
 > ./cluster-dev/gitops-system/toolkit-source.yaml
$ tk export kustomization gitops-system \
 > ./cluster-dev/gitops-system/toolkit-kustomization.yaml

That basically finishes the installation of the GitOps Toolkit. Below are some useful commands to manually reconcile the configured custom resources:

$ tk reconcile source git gitops-system
$ tk reconcile kustomization gitops-system

I was thinking of explaining how to set up a Kubernetes platform repository and do release versioning with the Flux GitOps Toolkit in one of my next articles. Please let me know if you have questions.

Synchronize Cluster Configuration using OpenShift Hive – SyncSets and SelectorSyncSets

It has been some time since my last post, but I want to continue my OpenShift Hive article series about Getting started with OpenShift Hive and how to Deploy OpenShift/OKD 4.x clusters using Hive. In this blog post I want to explain how you can use Hive to synchronise cluster configuration using SyncSets. There are two different types of SyncSets: the SyncSet (a namespaced custom resource), which you assign to a specific cluster name in the Cluster Deployment Reference, and the SelectorSyncSet (a cluster-wide custom resource), which uses the Cluster Deployment Selector, a label selector that applies configuration to a set of clusters matching the label across cluster namespaces.

Let’s look at the first example, a SyncSet (namespaced resource), shown below. In clusterDeploymentRefs you need to reference a cluster name that exists in the same namespace where you create the SyncSet. A SyncSet has sections for creating resources on a cluster and for applying patches to it. The last section, secretReferences, is used to copy secrets to a cluster without having them written in clear text in the SyncSet:

apiVersion: hive.openshift.io/v1
kind: SyncSet
metadata:
  name: example-syncset
  namespace: okd
spec:
  clusterDeploymentRefs:
  - name: okd
  resources:
  - apiVersion: v1
    kind: Namespace
    metadata:
      name: myproject
  patches:
  - kind: Config
    apiVersion: imageregistry.operator.openshift.io/v1
    name: cluster
    applyMode: AlwaysApply
    patch: |-
      { "spec": { "defaultRoute": true }}
    patchType: merge
  secretReferences:
  - source:
      name: mysecret
      namespace: okd
    target:
      name: mysecret
      namespace: myproject

The second example, a SelectorSyncSet (cluster-wide resource), is very similar to the previous one but more flexible: it uses a label selector, clusterDeploymentSelector, so the configuration can be applied to multiple clusters matching the label across cluster namespaces. This is a great use case for common or environment configuration that is the same for all OpenShift clusters:

---
apiVersion: hive.openshift.io/v1
kind: SelectorSyncSet
metadata:
  name: mygroup
spec:
  resources:
  - apiVersion: v1
    kind: Namespace
    metadata:
      name: myproject
  resourceApplyMode: Sync
  clusterDeploymentSelector:
    matchLabels:
      cluster-group: okd
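
For a cluster to pick this up, its ClusterDeployment needs to carry the matching label. As a hedged illustration, reusing the cluster name and namespace from the SyncSet example above:

$ oc label clusterdeployment okd cluster-group=okd -n okd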

The problem with SyncSets is that they can get pretty large and, depending on the amount of configuration, are complicated to write by hand. My colleague Matt wrote a syncset generator which solves this problem and automatically generates a SyncSet or SelectorSyncSet; please check out his GitHub repository:

$ wget -O syncset-gen https://github.com/matt-simons/syncset-gen/releases/download/v0.5/syncset-gen_linux_amd64 && chmod +x ./syncset-gen
$ sudo mv ./syncset-gen /usr/bin/
$ syncset-gen view -h
Parses a manifest directory and prints a SyncSet/SelectorSyncSet representation of the objects it contains.

Usage:
  ss view [flags]

Flags:
  -c, --cluster-name string   The cluster name used to match the SyncSet to a Cluster
  -h, --help                  help for view
  -p, --patches string        The directory of patch manifest files to use
  -r, --resources string      The directory of resource manifest files to use
  -s, --selector string       The selector key/value pair used to match the SelectorSyncSet to Cluster(s)

Next we need a repository to store the configuration for the OpenShift/OKD clusters. Below you can see a very simple example: the ./config folder contains common configuration, which is applied through a SelectorSyncSet with a clusterDeploymentSelector:

$ tree
.
└── config
    ├── patch
    │   └── cluster-version.yaml
    └── resource
        └── namespace.yaml
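
The files under ./config are plain Kubernetes manifests that syncset-gen picks up. For instance, config/resource/namespace.yaml could simply contain the namespace that appears in the generated output below (a minimal sketch):

# config/resource/namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: myproject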

To generate a SelectorSyncSet from the ./config folder, run syncset-gen with the following command options:

$ syncset-gen view okd-cluster-group-selectorsyncset --selector cluster-group/okd -p ./config/patch/ -r ./config/resource/
{
    "kind": "SelectorSyncSet",
    "apiVersion": "hive.openshift.io/v1",
    "metadata": {
        "name": "okd-cluster-group-selectorsyncset",
        "creationTimestamp": null,
        "labels": {
            "generated": "true"
        }
    },
    "spec": {
        "resources": [
            {
                "apiVersion": "v1",
                "kind": "Namespace",
                "metadata": {
                    "name": "myproject"
                }
            }
        ],
        "resourceApplyMode": "Sync",
        "patches": [
            {
                "apiVersion": "config.openshift.io/v1",
                "kind": "ClusterVersion",
                "name": "version",
                "patch": "{\"spec\": {\"channel\": \"stable-4.3\",\"desiredUpdate\": {\"version\": \"4.3.0\", \"image\": \"quay.io/openshift-release-dev/[email protected]:3a516480dfd68e0f87f702b4d7bdd6f6a0acfdac5cd2e9767b838ceede34d70d\"}}}",
                "patchType": "merge"
            },
            {
                "apiVersion": "rbac.authorization.k8s.io/v1",
                "kind": "ClusterRoleBinding",
                "name": "self-provisioners",
                "patch": "{\"subjects\": null}",
                "patchType": "merge"
            }
        ],
        "clusterDeploymentSelector": {
            "matchExpressions": [
                {
                    "key": "cluster-group/okd",
                    "operator": "Exists"
                }
            ]
        }
    },
    "status": {}
}

To debug SyncSets, use the commands below in the cluster deployment namespace; they show whether the configuration was applied successfully or failed to apply:

$ oc get syncsetinstance -n <namespace>
$ oc get syncsetinstances <syncsetinstance name> -o yaml

I hope this was useful to get you started using OpenShift Hive and SyncSets to apply configuration to OpenShift/OKD clusters. More information about SyncSets can be found in the OpenShift Hive repository.

Getting started with OpenShift Hive

If you don’t know OpenShift Hive, I recommend having a look at the video of my talk at Red Hat OpenShift Commons about OpenShift Hive, where I also talk about how you can provision and manage the lifecycle of OpenShift 4 clusters using the Kubernetes API and the OpenShift Hive operator.

The Hive operator has three main components: the admission controller, the Hive controllers and the Hive operator itself. For more information about the Hive architecture, visit the Hive docs.

You can use an OpenShift or a native Kubernetes cluster to run the operator; in my case I use an EKS cluster. Let’s go through the prerequisites required to generate the manifests and the hiveutil binary:

$ curl -s "https://raw.githubusercontent.com/\
> kubernetes-sigs/kustomize/master/hack/install_kustomize.sh"  | bash
$ sudo mv ./kustomize /usr/bin/
$ wget https://dl.google.com/go/go1.13.3.linux-amd64.tar.gz
$ tar -xvf go1.13.3.linux-amd64.tar.gz
$ sudo mv go /usr/local

To set up the Go environment, copy the content below and add it to your .profile:

export GOPATH="${HOME}/.go"
export PATH="$PATH:/usr/local/go/bin"
export PATH="$PATH:${GOPATH}/bin:${GOROOT}/bin"

Continue with installing the Go dependencies and clone the OpenShift Hive Github repository:

$ mkdir -p ~/.go/src/github.com/openshift/
$ go get github.com/golang/mock/mockgen
$ go get github.com/golang/mock/gomock
$ go get github.com/cloudflare/cfssl/cmd/cfssl
$ go get github.com/cloudflare/cfssl/cmd/cfssljson
$ cd ~/.go/src/github.com/openshift/
$ git clone https://github.com/openshift/hive.git
$ cd hive/
$ git checkout remotes/origin/master

Before we run make deploy, I would recommend modifying the Makefile so that we only generate the Hive manifests without deploying them to Kubernetes:

$ sed -i -e 's#oc apply -f config/crds# #' -e 's#kustomize build overlays/deploy | oc apply -f -#kustomize build overlays/deploy > hive.yaml#' Makefile
$ make deploy
# The apis-path is explicitly specified so that CRDs are not created for v1alpha1
go run tools/vendor/sigs.k8s.io/controller-tools/cmd/controller-gen/main.go crd --apis-path=pkg/apis/hive/v1
CRD files generated, files can be found under path /home/ubuntu/.go/src/github.com/openshift/hive/config/crds.
go generate ./pkg/... ./cmd/...
hack/update-bindata.sh
# Deploy the operator manifests:
mkdir -p overlays/deploy
cp overlays/template/kustomization.yaml overlays/deploy
cd overlays/deploy && kustomize edit set image registry.svc.ci.openshift.org/openshift/hive-v4.0:hive=registry.svc.ci.openshift.org/openshift/hivev1:hive
kustomize build overlays/deploy > hive.yaml
rm -rf overlays/deploy

Quick look at the content of the hive.yaml manifest:

$ cat hive.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: hive
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: hive-operator
  namespace: hive

...

---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    control-plane: hive-operator
    controller-tools.k8s.io: "1.0"
  name: hive-operator
  namespace: hive
spec:
  replicas: 1
  revisionHistoryLimit: 4
  selector:
    matchLabels:
      control-plane: hive-operator
      controller-tools.k8s.io: "1.0"
  template:
    metadata:
      labels:
        control-plane: hive-operator
        controller-tools.k8s.io: "1.0"
    spec:
      containers:
      - command:
        - /opt/services/hive-operator
        - --log-level
        - info
        env:
        - name: CLI_CACHE_DIR
          value: /var/cache/kubectl
        image: registry.svc.ci.openshift.org/openshift/hive-v4.0:hive
        imagePullPolicy: Always
        livenessProbe:
          failureThreshold: 1
          httpGet:
            path: /debug/health
            port: 8080
          initialDelaySeconds: 10
          periodSeconds: 10
        name: hive-operator
        resources:
          requests:
            cpu: 100m
            memory: 256Mi
        volumeMounts:
        - mountPath: /var/cache/kubectl
          name: kubectl-cache
      serviceAccountName: hive-operator
      terminationGracePeriodSeconds: 10
      volumes:
      - emptyDir: {}
        name: kubectl-cache

Now we can apply the Hive custom resource definitions (CRDs):

$ kubectl apply -f ./config/crds/
customresourcedefinition.apiextensions.k8s.io/checkpoints.hive.openshift.io created
customresourcedefinition.apiextensions.k8s.io/clusterdeployments.hive.openshift.io created
customresourcedefinition.apiextensions.k8s.io/clusterdeprovisions.hive.openshift.io created
customresourcedefinition.apiextensions.k8s.io/clusterimagesets.hive.openshift.io created
customresourcedefinition.apiextensions.k8s.io/clusterprovisions.hive.openshift.io created
customresourcedefinition.apiextensions.k8s.io/clusterstates.hive.openshift.io created
customresourcedefinition.apiextensions.k8s.io/dnszones.hive.openshift.io created
customresourcedefinition.apiextensions.k8s.io/hiveconfigs.hive.openshift.io created
customresourcedefinition.apiextensions.k8s.io/machinepools.hive.openshift.io created
customresourcedefinition.apiextensions.k8s.io/selectorsyncidentityproviders.hive.openshift.io created
customresourcedefinition.apiextensions.k8s.io/selectorsyncsets.hive.openshift.io created
customresourcedefinition.apiextensions.k8s.io/syncidentityproviders.hive.openshift.io created
customresourcedefinition.apiextensions.k8s.io/syncsets.hive.openshift.io created
customresourcedefinition.apiextensions.k8s.io/syncsetinstances.hive.openshift.io created

Then apply the hive.yaml manifest to deploy the OpenShift Hive operator and its components:

$ kubectl apply -f hive.yaml
namespace/hive created
serviceaccount/hive-operator created
clusterrole.rbac.authorization.k8s.io/hive-frontend created
clusterrole.rbac.authorization.k8s.io/hive-operator-role created
clusterrole.rbac.authorization.k8s.io/manager-role created
clusterrole.rbac.authorization.k8s.io/system:openshift:hive:hiveadmission created
rolebinding.rbac.authorization.k8s.io/extension-server-authentication-reader-hiveadmission created
clusterrolebinding.rbac.authorization.k8s.io/auth-delegator-hiveadmission created
clusterrolebinding.rbac.authorization.k8s.io/hive-frontend created
clusterrolebinding.rbac.authorization.k8s.io/hive-operator-rolebinding created
clusterrolebinding.rbac.authorization.k8s.io/hiveadmission-hive-hiveadmission created
clusterrolebinding.rbac.authorization.k8s.io/hiveapi-cluster-admin created
clusterrolebinding.rbac.authorization.k8s.io/manager-rolebinding created
deployment.apps/hive-operator created

For the Hive admission controller you need to generate an SSL certificate:

$ ./hack/hiveadmission-dev-cert.sh
~/Dropbox/hive/hiveadmission-certs ~/Dropbox/hive
2020/02/03 22:17:30 [INFO] generate received request
2020/02/03 22:17:30 [INFO] received CSR
2020/02/03 22:17:30 [INFO] generating key: ecdsa-256
2020/02/03 22:17:30 [INFO] encoded CSR
certificatesigningrequest.certificates.k8s.io/hiveadmission.hive configured
certificatesigningrequest.certificates.k8s.io/hiveadmission.hive approved
-----BEGIN CERTIFICATE-----
MIICaDCCAVCgAwIBAgIQHvvDPncIWHRcnDzzoWGjQDANBgkqhkiG9w0BAQsFADAv
MS0wKwYDVQQDEyRiOTk2MzhhNS04OWQyLTRhZTAtYjI4Ny1iMWIwOGNmOGYyYjAw
HhcNMjAwMjAzMjIxNTA3WhcNMjUwMjAxMjIxNTA3WjAhMR8wHQYDVQQDExZoaXZl
YWRtaXNzaW9uLmhpdmUuc3ZjMFkwEwYHKoZIzj0CAQYIKoZIzj0DAQcDQgAEea4N
UPbvzM3VdtOkdJ7lBytekRTvwGMqs9HgG14CtqCVCOFq8f+BeqqyrRbJsX83iBfn
gMc54moElb5kIQNjraNZMFcwDAYDVR0TAQH/BAIwADBHBgNVHREEQDA+ghZoaXZl
YWRtaXNzaW9uLmhpdmUuc3ZjgiRoaXZlYWRtaXNzaW9uLmhpdmUuc3ZjLmNsdXN0
ZXIubG9jYWwwDQYJKoZIhvcNAQELBQADggEBADhgT3tNnFs6hBIZFfWmoESe6nnZ
fy9GmlmF9qEBo8FZSk/LYvV0peOdgNZCHqsT2zaJjxULqzQ4zfSb/koYpxeS4+Bf
xwgHzIB/ylzf54wVkILWUFK3GnYepG5dzTXS7VHc4uiNJe0Hwc5JI4HBj7XdL3C7
cbPm7T2cBJi2jscoCWELWo/0hDxkcqZR7rdeltQQ+Uhz87LhTTqlknAMFzL7tM/+
pJePZMQgH97vANsbk97bCFzRZ4eABYSiN0iAB8GQM5M+vK33ZGSVQDJPKQQYH6th
Kzi9wrWEeyEtaWozD5poo9s/dxaLxFAdPDICkPB2yr5QZB+NuDgA+8IYffo=
-----END CERTIFICATE-----
secret/hiveadmission-serving-cert created
~/Dropbox/hive

Afterwards we can check that all the pods are running; this might take a few seconds:

$ kubectl get pods -n hive
NAME                                READY   STATUS    RESTARTS   AGE
hive-controllers-7c6ccc84b9-q7k7m   1/1     Running   0          31s
hive-operator-f9f4447fd-jbmkh       1/1     Running   0          55s
hiveadmission-6766c5bc6f-9667g      1/1     Running   0          27s
hiveadmission-6766c5bc6f-gvvlq      1/1     Running   0          27s

The Hive operator is successfully installed on your Kubernetes cluster but we are not finished yet. To create the required Cluster Deployment manifests we need to generate the hiveutil binary:

$ make hiveutil
go generate ./pkg/... ./cmd/...
hack/update-bindata.sh
go build -o bin/hiveutil github.com/openshift/hive/contrib/cmd/hiveutil

To generate the Hive Cluster Deployment manifests, just run the hiveutil command below; I output the definition as YAML with -o yaml:

$ bin/hiveutil create-cluster --base-domain=mydomain.example.com --cloud=aws mycluster -o yaml
apiVersion: v1
items:
- apiVersion: hive.openshift.io/v1
  kind: ClusterImageSet
  metadata:
    creationTimestamp: null
    name: mycluster-imageset
  spec:
    releaseImage: quay.io/openshift-release-dev/ocp-release:4.3.2-x86_64
  status: {}
- apiVersion: v1
  kind: Secret
  metadata:
    creationTimestamp: null
    name: mycluster-aws-creds
  stringData:
    aws_access_key_id: <-YOUR-AWS-ACCESS-KEY->
    aws_secret_access_key: <-YOUR-AWS-SECRET-KEY->
  type: Opaque
- apiVersion: v1
  data:
    install-config.yaml: <-BASE64-ENCODED-OPENSHIFT4-INSTALL-CONFIG->
  kind: Secret
  metadata:
    creationTimestamp: null
    name: mycluster-install-config
  type: Opaque
- apiVersion: hive.openshift.io/v1
  kind: ClusterDeployment
  metadata:
    creationTimestamp: null
    name: mycluster
  spec:
    baseDomain: mydomain.example.com
    clusterName: mycluster
    controlPlaneConfig:
      servingCertificates: {}
    installed: false
    platform:
      aws:
        credentialsSecretRef:
          name: mycluster-aws-creds
        region: us-east-1
    provisioning:
      imageSetRef:
        name: mycluster-imageset
      installConfigSecretRef:
        name: mycluster-install-config
  status:
    clusterVersionStatus:
      availableUpdates: null
      desired:
        force: false
        image: ""
        version: ""
      observedGeneration: 0
      versionHash: ""
- apiVersion: hive.openshift.io/v1
  kind: MachinePool
  metadata:
    creationTimestamp: null
    name: mycluster-worker
  spec:
    clusterDeploymentRef:
      name: mycluster
    name: worker
    platform:
      aws:
        rootVolume:
          iops: 100
          size: 22
          type: gp2
        type: m4.xlarge
    replicas: 3
  status:
    replicas: 0
kind: List
metadata: {}

I hope this post is useful in getting you started with OpenShift Hive. In my next article I will go through the details of the OpenShift 4 cluster deployment with Hive.

Read my new article about OpenShift / OKD 4.x Cluster Deployment using OpenShift Hive

How to back up OpenShift with Heptio Velero (Ark)

I have found an interesting open source tool called Heptio Velero, previously known as Heptio Ark, which is able to back up Kubernetes and OpenShift container platforms. The tool mainly works via the API and backs up namespace objects, and it can additionally create snapshots of PVs on Azure, AWS and GCP.

You use the ark command-line utility to create and restore backups.

The installation of Velero is super simple; just follow the steps below:

# Download and extract the latest Velero release from github
wget https://github.com/heptio/velero/releases/download/v0.10.1/ark-v0.10.1-linux-amd64.tar.gz
mkdir -p ./velero && tar -xzf ark-v0.10.1-linux-amd64.tar.gz -C ./velero/

# Move the ark binary to somewhere in your PATH
mv ./velero/ark /usr/sbin/

# The last two commands create namespace and applies configuration
oc create -f ./velero/config/common/00-prereqs.yaml
oc create -f ./velero/config/minio/

You can expose Minio to access the web console from the outside.

# Create route
oc expose service minio

# View access and secret key to login via the web console
oc describe deployment.apps/minio | grep -i Environment -A2
    Environment:
      MINIO_ACCESS_KEY:  minio
      MINIO_SECRET_KEY:  minio123

Here are a few command options for backing up objects:

# Create a backup for any object that matches the app=pod label selector:
ark backup create <backup-name> --selector <key>=<value> 

# Alternatively if you want to backup all objects except those matching the label backup=ignore:
ark backup create <backup-name> --selector 'backup notin (ignore)'

# Create regularly scheduled backups based on a cron expression using the app=pod label selector:
ark schedule create <backup-name> --schedule="0 1 * * *" --selector <key>=<value>

# Create a backup for a namespace:
ark backup create <backup-name> --include-namespaces <namespace-name>

Let’s do a backup and restore test; I have created a new OpenShift project with a simple hello-openshift build and deployment config:

$ ark backup create mybackup --include-namespaces myapplication
Backup request "mybackup" submitted successfully.
Run `ark backup describe mybackup` or `ark backup logs mybackup` for more details.
$ ark backup get
NAME          STATUS      CREATED                         EXPIRES   STORAGE LOCATION   SELECTOR
mybackup      Completed   2019-02-08 17:14:09 +0000 UTC   29d       default            

Once the backup has completed we can delete the project.

$ oc delete project myapplication
project.project.openshift.io "myapplication" deleted

Now let’s restore the project namespace from the previously created backup:

$ ark restore create --from-backup mybackup
Restore request "mybackup-20190208171745" submitted successfully.
Run `ark restore describe mybackup-20190208171745` or `ark restore logs mybackup-20190208171745` for more details.
$ ark restore get
NAME                         BACKUP        STATUS       WARNINGS   ERRORS    CREATED                         SELECTOR
mybackup-20190208171745      mybackup      InProgress   0          0         2019-02-08 17:17:45 +0000 UTC   
$ ark restore get
NAME                         BACKUP        STATUS      WARNINGS   ERRORS    CREATED                         SELECTOR
mybackup-20190208171745      mybackup      Completed   1          0         2019-02-08 17:17:45 +0000 UTC   

The project is back in the state it was when we created the backup.

$ oc get pods
NAME                     READY     STATUS    RESTARTS   AGE
hello-app-http-1-qn8jj   1/1       Running   0          2m
$ curl -k --insecure https://hello-app-http-myapplication.aio.hostgate.net/
Hello OpenShift!

There are a few issues around the restore which I have seen and want to explain; I’m not sure whether these are related to OpenShift in general or just to the latest 3.11 release. The secrets for the builder service account are missing or didn’t restore correctly and cannot be used:

$ oc get build
NAME                 TYPE      FROM         STATUS                               STARTED   DURATION
hello-build-http-1   Docker    Dockerfile   New (CannotRetrieveServiceAccount)
hello-build-http-2   Docker    Dockerfile   New
$ oc get events | grep Failed
1m          1m           2         hello-build-http.15816e39eefb637d         BuildConfig                                     Warning   BuildConfigInstantiateFailed   buildconfig-controller                                error instantiating Build from BuildConfig myapplication/hello-build-http (0): Error resolving ImageStreamTag hello-openshift-source:latest in namespace myapplication: imagestreams.image.openshift.io "hello-openshift-source" not found
1m          1m           6         hello-build-http.15816e39f446207f         BuildConfig                                     Warning   BuildConfigInstantiateFailed   buildconfig-controller                                error instantiating Build from BuildConfig myapplication/hello-build-http (0): Error resolving ImageStreamTag hello-openshift-source:latest in namespace myapplication: unable to find latest tagged image
1m          1m           1         hello-build-http.15816e3a49f21411         BuildConfig                                     Warning   BuildConfigInstantiateFailed   buildconfig-controller                                error instantiating Build from BuildConfig myapplication/hello-build-http (0): builds.build.openshift.io "hello-build-http-1" already exists
$ oc get secrets | grep builder
builder-token-5q646        kubernetes.io/service-account-token   4         5m

# OR
$ oc get build
NAME                 TYPE      FROM         STATUS                        STARTED   DURATION
hello-build-http-1   Docker    Dockerfile   Pending (MissingPushSecret)
hello-build-http-2   Docker    Dockerfile   New
$ oc get events | grep FailedMount
15m         19m          10        hello-build-http-1-build.15816cc22f35795c   Pod                                             Warning   FailedMount                    kubelet, ip-172-26-12-32.eu-west-1.compute.internal   MountVolume.SetUp failed for volume "builder-dockercfg-k55f6-push" : secrets "builder-dockercfg-k55f6" not found
15m         17m          2         hello-build-http-1-build.15816cdec9dc561a   Pod                                             Warning   FailedMount                    kubelet, ip-172-26-12-32.eu-west-1.compute.internal   Unable to mount volumes for pod "hello-build-http-1-build_myapplication(4c2f1113-2bb5-11e9-8a6b-0a007934f01e)": timeout expired waiting for volumes to attach or mount for pod "myapplication"/"hello-build-http-1-build". list of unmounted volumes=[builder-dockercfg-k55f6-push]. list of unattached volumes=[buildworkdir docker-socket crio-socket builder-dockercfg-k55f6-push builder-dockercfg-m6d2v-pull builder-token-sjvw5]
13m         13m          1         hello-build-http-1-build.15816d1e3e65ad2a   Pod                                             Warning   FailedMount                    kubelet, ip-172-26-12-32.eu-west-1.compute.internal   Unable to mount volumes for pod "hello-build-http-1-build_myapplication(4c2f1113-2bb5-11e9-8a6b-0a007934f01e)": timeout expired waiting for volumes to attach or mount for pod "myapplication"/"hello-build-http-1-build". list of unmounted volumes=[buildworkdir docker-socket crio-socket builder-dockercfg-k55f6-push builder-dockercfg-m6d2v-pull builder-token-sjvw5]. list of unattached volumes=[buildworkdir docker-socket crio-socket builder-dockercfg-k55f6-push builder-dockercfg-m6d2v-pull builder-token-sjvw5]
$ oc get secrets | grep builder
NAME                       TYPE                                  DATA      AGE
builder-dockercfg-m6d2v    kubernetes.io/dockercfg               1         5m
builder-token-4chx4        kubernetes.io/service-account-token   4         5m
builder-token-sjvw5        kubernetes.io/service-account-token   4         5m

The deployment config seems to be disconnected and doesn’t know the state of the running pod:

$ oc get dc
NAME             REVISION   DESIRED   CURRENT   TRIGGERED BY
hello-app-http   0          1         0         config,image(hello-openshift:latest)

Here are the steps to recover from this situation:

# First cancel all builds - the restore seems to have triggered a new build:
$ oc cancel-build $(oc get build --no-headers | awk '{ print $1 }')
build.build.openshift.io/hello-build-http-1 marked for cancellation, waiting to be cancelled
build.build.openshift.io/hello-build-http-2 marked for cancellation, waiting to be cancelled
build.build.openshift.io/hello-build-http-1 cancelled
build.build.openshift.io/hello-build-http-2 cancelled

# Delete all builds, otherwise you will later run into a problem with duplicate names:
$ oc delete build $(oc get build --no-headers | awk '{ print $1 }')
build.build.openshift.io "hello-build-http-1" deleted
build.build.openshift.io "hello-build-http-2" deleted

# Delete the project builder service account - this triggers OpenShift to re-create the builder
$ oc delete sa builder
serviceaccount "builder" deleted
$ oc get secrets | grep builder
builder-dockercfg-vwckw    kubernetes.io/dockercfg               1         24s
builder-token-dpgj9        kubernetes.io/service-account-token   4         24s
builder-token-lt7z2        kubernetes.io/service-account-token   4         24s

# Start the build and afterwards do a rollout for the deployment config:
$ oc start-build hello-build-http
build.build.openshift.io/hello-build-http-3 started
$ oc rollout latest dc/hello-app-http
deploymentconfig.apps.openshift.io/hello-app-http rolled out

After doing all this, your build and deployment config are back in sync.

$ oc get dc
NAME             REVISION   DESIRED   CURRENT   TRIGGERED BY
hello-app-http   3          1         1         config,image(hello-openshift:latest)

My feedback about Heptio Velero (Ark): apart from the restore issues with the build and deployment configs, I find the tool great, especially in scenarios where I accidentally deleted a namespace, or for DR where I need to recover a whole cluster. What really makes the tool worth it is the ability to create snapshots of PV disks on your cloud provider.

Check out the official documentation from Heptio for more information and if you like this article please leave a comment.

How to display OpenShift/Kubernetes namespace on bash prompt

A very short but useful post about how to display the current Kubernetes namespace on the bash command prompt. I got used to adding -n <namespace-name> when I execute an oc command, but it is still very useful to have the current namespace displayed on the command prompt, especially when troubleshooting issues, so you don’t get lost in the different platform namespaces.

Create a new file ~/.oc-prompt.sh in your user’s home folder.

#!/bin/bash
__oc_ps1()
{
    # Extract the namespace from the current kubeconfig context (format: namespace/cluster/user)
    CONTEXT=$(cat ~/.kube/config 2>/dev/null | grep -o '^current-context: [^/]*' | cut -d' ' -f2)

    if [ -n "$CONTEXT" ]; then
        echo "(ocp:${CONTEXT})"
    fi
}

Add the following lines to the end of your ~/.bashrc and reconnect your terminal session.

NORMAL="\[\033[00m\]"
BLUE="\[\033[01;34m\]"
YELLOW="\[\e[1;33m\]"
GREEN="\[\e[1;32m\]"

export PS1="${BLUE}\W ${GREEN}\u${YELLOW}\$(__oc_ps1)${NORMAL} \$ "
source ~/.oc-prompt.sh

The bash prompt then displays the current OpenShift/Kubernetes namespace.
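
For illustration, with the PS1 above the prompt looks roughly like this (the user, working directory and namespace are placeholders, and the colours are not reproduced here):

~ ubuntu(ocp:myproject) $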

Very useful when you need to administer a cluster with multiple namespaces.