OpenShift Hive v1.1.x – Latest updates & new features

Over a year has gone by since my first article about Getting started with OpenShift Hive and my talk at the RedHat OpenShift Gathering when the first stable OpenShift Hive v1 version got released. In between a lot has happened and OpenShift Hive v1.1.1 was released a few weeks ago. So I wanted to look into the new functionalities of OpenShift Hive.

  • Operator Lifecycle Manager (OLM) installation

Hive is now available through the Operator Hub community catalog and can be installed on both OpenShift or native Kubernetes cluster through the OLM. The install is straightforward by adding the operator-group and subscription manifests:

---
apiVersion: operators.coreos.com/v1alpha2
kind: OperatorGroup
metadata:
  name: operatorgroup
  namespace: hive
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: hive
  namespace: hive
spec:
  channel: alpha
  name: hive-operator
  source: operatorhubio-catalog
  sourceNamespace: olm

Alternatively the Hive subscription can be configured with a manual install plan. In this case the OLM will not automatically upgrade the Hive operator when a new version is released – I highly recommend this for production deployments!

---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: hive
  namespace: hive
spec:
  channel: alpha
  name: hive-operator
  installPlanApproval: Manual
  source: operatorhubio-catalog
  sourceNamespace: olm

After a few seconds you see an install plan being added.

$ k get installplan
NAME            CSV                    APPROVAL   APPROVED
install-9drmh   hive-operator.v1.1.0   Manual     false

Edit the install plan and set approved value to true – the OLM will start and install or upgrade the Hive operator automatically.

...
spec:
  approval: Manual
  approved: true
  clusterServiceVersionNames:
  - hive-operator.v1.1.0
  generation: 1
...

After the Hive operator is installed you need to apply the Hiveconfig object for the operator to install all of the needed Hive components. On non-OpenShift installs (native Kubernetes) you still need to generate Hiveadmission certificates for the admission controller pods to start otherwise they are missing the hiveadmission-serving-cert secret.

  • Hiveconfig – Velero backup and delete protection

There are a few small but also very useful changes in the Hiveconfig object. You can now enable the deleteProtection option which prevents administrators from accidental deletions of ClusterDeployments or SyncSets. Another great addition is that you can enable automatic configuration of Velero to backup your cluster namespaces, meaning you’re not required to configure backups separately.

---
apiVersion: hive.openshift.io/v1
kind: HiveConfig
metadata:
  name: hive
spec:
  logLevel: info
  targetNamespace: hive
  deleteProtection: enabled
  backup:
    velero:
      enabled: true
      namespace: velero

Backups are configured in the Velero namespace as specified in the Hiveconfig.

$ k get backups -n velero
NAME                              AGE
backup-okd-2021-03-26t11-57-32z   3h12m
backup-okd-2021-03-26t12-00-32z   3h9m
backup-okd-2021-03-26t12-35-44z   154m
backup-okd-2021-03-26t12-38-44z   151m
...

With the deletion protection enabled in the hiveconfig, the controller automatically adds the annotation hive.openshift.io/protected-delete: “true” to all resources and prevents these from accidental deletions:

$ k delete cd okd --wait=0
The ClusterDeployment "okd" is invalid: metadata.annotations.hive.openshift.io/protected-delete: Invalid value: "true": cannot delete while annotation is present
  • ClusterSync and Scaling Hive controller

To check applied resources through SyncSets and SelectorSyncSets, where Hive has previously used Syncsetnstance but these no longer exists. This now has move to ClusterSync to collect status information about applied resources:

$ k get clustersync okd -o yaml
apiVersion: hiveinternal.openshift.io/v1alpha1
kind: ClusterSync
metadata:
  name: okd
  namespace: okd
spec: {}
status:
  conditions:
  - lastProbeTime: "2021-03-26T16:13:57Z"
    lastTransitionTime: "2021-03-26T16:13:57Z"
    message: All SyncSets and SelectorSyncSets have been applied to the cluster
    reason: Success
    status: "False"
    type: Failed
  firstSuccessTime: "2021-03-26T16:13:57Z"
...

It is also possible to horizontally scale the Hive controller to change the synchronisation frequency for running larger OpenShift deployments.

---
apiVersion: hive.openshift.io/v1
kind: HiveConfig
metadata:
  name: hive
spec:
  logLevel: info
  targetNamespace: hive
  deleteProtection: enabled
  backup:
    velero:
      enabled: true
      namespace: velero
  controllersConfig:
    controllers:
    - config:
        concurrentReconciles: 10
        replicas: 3
      name: clustersync

Please checkout the scaling test script which I found in the Github repo, you can simulate fake clusters by adding the annotation “hive.openshift.io/fake-cluster=true” to your ClusterDeployment.

  • Hibernating clusters

RedHat introduced that you can hibernate (shutdown) clusters in OpenShift 4.5 when they are not needed and switch them easily back on when you need them. This is now possible with OpenShift Hive: you can hibernate and change the power state of a cluster deployment.

$ kubectl patch cd okd --type='merge' -p $'spec:\n powerState: Hibernating'

Checking the cluster deployment and power state change to stopping.

$ kubectl get cd
NAME   PLATFORM   REGION      CLUSTERTYPE   INSTALLED   INFRAID     VERSION   POWERSTATE   AGE
okd    aws        eu-west-1                 true        okd-jpqgb   4.7.0     Stopping     44m

After a couple of minutes the power state of the cluster nodes will change to hibernating.

$ kubectl get cd
NAME   PLATFORM   REGION      CLUSTERTYPE   INSTALLED   INFRAID     VERSION   POWERSTATE    AGE
okd    aws        eu-west-1                 true        okd-jpqgb   4.7.0     Hibernating   47m

In the AWS console you see the cluster instances as stopped.

When turning the cluster back online, change the power state in the cluster deployment to running.

$ kubectl patch cd okd --type='merge' -p $'spec:\n powerState: Running'

Again the power state changes to resuming.

$ kubectl get cd
NAME   PLATFORM   REGION      CLUSTERTYPE   INSTALLED   INFRAID     VERSION   POWERSTATE   AGE
okd    aws        eu-west-1                 true        okd-jpqgb   4.7.0     Resuming     49m

A few minutes later the cluster changes to running and is ready to use again.

$ k get cd
NAME   PLATFORM   REGION      CLUSTERTYPE   INSTALLED   INFRAID     VERSION   POWERSTATE   AGE
okd    aws        eu-west-1                 true        okd-jpqgb   4.7.0     Running      61m
  • Cluster pools

Cluster pools is something which came together with the hibernating feature which allows you to pre-provision OpenShift clusters without actually allocating them and after the provisioning they will hibernate until you claim a cluster. Again a nice feature and ideal use-case for ephemeral type development or integration test environments which allows you to have clusters ready to go to claim when needed and dispose them afterwards.

Create a ClusterPool custom resource which is similar to a cluster deployment.

apiVersion: hive.openshift.io/v1
kind: ClusterPool
metadata:
  name: okd-eu-west-1
  namespace: hive
spec:
  baseDomain: okd.domain.com
  imageSetRef:
    name: okd-4.7-imageset
  installConfigSecretTemplateRef: 
    name: install-config
  skipMachinePools: true
  platform:
    aws:
      credentialsSecretRef:
        name: aws-creds
      region: eu-west-1
  pullSecretRef:
    name: pull-secret
  size: 3

To claim a cluster from a pool, apply the ClusterClaim resource.

apiVersion: hive.openshift.io/v1
kind: ClusterClaim
metadata:
  name: okd-claim
  namespace: hive
spec:
  clusterPoolName: okd-eu-west-1
  lifetime: 8h

I haven’t tested this yet but will definitely start using this in the coming weeks. Have a look at the Hive documentation on using ClusterPool and ClusterClaim.

  • Cluster relocation

For me, having used OpenShift Hive for over one and half years to run OpenShift 4 cluster, this is a very useful functionality because at some point you might need to rebuild or move your management services to a new Hive cluster. The ClusterRelocator object gives you the option to do this.

$ kubectl create secret generic new-hive-cluster-kubeconfig -n hive --from-file=kubeconfig=./new-hive-cluster.kubeconfig

Create the ClusterRelocator object and specify the kubeconfig of the remote Hive cluster, and also add a clusterDeploymentSelector:

apiVersion: hive.openshift.io/v1
kind: ClusterRelocate
metadata:
  name: migrate
spec:
  kubeconfigSecretRef:
    namespace: hive
    name: new-hive-cluster-kubeconfig
  clusterDeploymentSelector:
    matchLabels:
      migrate: cluster

To move cluster deployments, add the label migrate=cluster to your OpenShift clusters you want to move.

$ kubectl label clusterdeployment okd migrate=cluster

The cluster deployment will move to the new Hive cluster and will be removed from the source Hive cluster without the de-provision. It’s important to keep in mind that you need to copy any other resources you need, such as secrets, syncsets, selectorsyncsets and syncidentiyproviders, before moving the clusters. Take a look at the Hive documentation for the exact steps.

  • Useful annotation

Pause SyncSets by adding the annotation “hive.openshift.io/syncset-pause=true” to the clusterdeployment which stops the reconcile of defined resources and great for troubleshooting.

In a cluster deployment you can set the option to preserve cluster on delete which allows the user to disconnect a cluster from Hive without de-provisioning it.

$ kubectl patch cd okd --type='merge' -p $'spec:\n preserveOnDelete: true'

This sums up the new features and functionalities you can use with the latest OpenShift Hive version.

OpenShift Hive – Deploy Single Node (All-in-One) OKD Cluster on AWS

The concept of a single-node or All-in-One OpenShift / Kubernetes cluster isn’t something new, years ago when I was working with OpenShift 3 and before that with native Kubernetes, we were using single-node clusters as ephemeral development environment, integrations testing for pull-request or platform releases. It was only annoying because this required complex Jenkins pipelines, provision the node first, then install prerequisites and run the openshift-ansible installer playbook. Not always reliable and not a great experience but it done the job.

This is possible as well with the new OpenShift/OKD 4 version and with the help from OpenShift Hive. The experience is more reliable and quicker than previously and I don’t need to worry about de-provisioning, I will let Hive delete the cluster after a few hours automatically.

It requires a few simple modifications in the install-config. You need to add the Availability Zone you want where the instance will be created. When doing this the VPC will only have two subnets, one public and one private subnet in eu-west-1. You can also install the single-node cluster into an existing VPC you just have to specify subnet ids. Change the compute worker node replicas zero and control-plane replicas to one. Make sure to have an instance size with enough CPU and memory for all OpenShift components because they need to fit onto the single node. The rest of the install-config is pretty much standard.

---
apiVersion: v1
baseDomain: k8s.domain.com
compute:
- name: worker
  platform:
    aws:
      zones:
      - eu-west-1a
      rootVolume:
        iops: 100
        size: 22
        type: gp2
      type: r4.xlarge
  replicas: 0
controlPlane:
  name: master
  platform:
    aws:
      zones:
      - eu-west-1a
      rootVolume:
        iops: 100
        size: 22
        type: gp2
      type: r5.2xlarge
  replicas: 1
metadata:
  creationTimestamp: null
  name: okd-aio
networking:
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  machineCIDR: 10.0.0.0/16
  networkType: OpenShiftSDN
  serviceNetwork:
  - 172.30.0.0/16
platform:
  aws:
    region: eu-west-1
pullSecret: ""
sshKey: ""

Create a new install-config secret for the cluster.

kubectl create secret generic install-config-aio -n okd --from-file=install-config.yaml=./install-config-aio.yaml

We will be using OpenShift Hive for the cluster deployment because the provision is more simplified and Hive can also apply any configuration using SyncSets or SelectorSyncSets which is needed. Add the annotation hive.openshift.io/delete-after: “2h” and Hive will automatically delete the cluster after 4 hours.

---
apiVersion: hive.openshift.io/v1
kind: ClusterDeployment
metadata:
  creationTimestamp: null
  annotations:
    hive.openshift.io/delete-after: "2h"
  name: okd-aio 
  namespace: okd
spec:
  baseDomain: k8s.domain.com
  clusterName: okd-aio
  controlPlaneConfig:
    servingCertificates: {}
  installed: false
  platform:
    aws:
      credentialsSecretRef:
        name: aws-creds
      region: eu-west-1
  provisioning:
    releaseImage: quay.io/openshift/okd:4.5.0-0.okd-2020-07-14-153706-ga
    installConfigSecretRef:
      name: install-config-aio
  pullSecretRef:
    name: pull-secret
  sshKey:
    name: ssh-key
status:
  clusterVersionStatus:
    availableUpdates: null
    desired:
      force: false
      image: ""
      version: ""
    observedGeneration: 0
    versionHash: ""

Apply the cluster deployment to your clusters namespace.

kubectl apply -f  ./clusterdeployment-aio.yaml

This is slightly faster than provision 6 nodes cluster and will take around 30mins until your ephemeral test cluster is ready to use.

Synchronize Cluster Configuration using OpenShift Hive – SyncSets and SelectorSyncSets

It has been some time since my last post but I want to continue my OpenShift Hive article series about Getting started with OpenShift Hive and how to Deploy OpenShift/OKD 4.x clusters using Hive. In this blog post I want to explain how you can use Hive to synchronise cluster configuration using SyncSets. There are two different types of SyncSets, the SyncSet (namespaced custom resource), which you assign to a specific cluster name in the Cluster Deployment Reference, and a SelectorSyncSet (cluster-wide custom resource) using the Cluster Deployment Selector, which uses a label selector to apply configuration to a set of clusters matching the label across cluster namespaces.

Let’s look at the first example of a SyncSet (namespaced resource), which you can see in the example below. In the clusterDeploymentRefs you need to match a cluster name which is created in the same namespace where you create the SyncSet. In SyncSet there are sections where you can create resources or apply patches to a cluster. The last section is secretReference which you use to apply secrets to a cluster without having them in clear text written in the SyncSet:

apiVersion: hive.openshift.io/v1
kind: SyncSet
metadata:
  name: example-syncset
  namespace: okd
spec:
  clusterDeploymentRefs:
  - name: okd
  resources:
  - apiVersion: v1
    kind: Namespace
    metadata:
      name: myproject
  patches:
  - kind: Config
    apiVersion: imageregistry.operator.openshift.io/v1
    name: cluster
    applyMode: AlwaysApply
    patch: |-
      { "spec": { "defaultRoute": true }}
    patchType: merge
  secretReferences:
  - source:
      name: mysecret
      namespace: okd
    target:
      name: mysecret
      namespace: myproject

The second SyncSet example for an SelectorSyncSet (cluster-wide resource) is very similar to the previous example but more flexible because you can use a label selector clusterDeploymentSelector and the configuration can be applied to multiple clusters matching the label across cluster namespaces. Great use-case for common or environment configuration which is the same for all OpenShift clusters:

---
apiVersion: hive.openshift.io/v1
kind: SelectorSyncSet
metadata:
  name: mygroup
spec:
  resources:
  - apiVersion: v1
    kind: Namespace
    metadata:
      name: myproject
  resourceApplyMode: Sync
  clusterDeploymentSelector:
    matchLabels:
      cluster-group: okd

The problem with SyncSets is that they can get pretty large and it is complicated to write them by yourself depending on the size of configuration. My colleague Matt wrote a syncset generator which solves the problem and automatically generates a  SelectorSyncSet, please checkout his github repository:

$ wget -O syncset-gen https://github.com/matt-simons/syncset-gen/releases/download/v0.5/syncset-gen_linux_amd64 && chmod +x ./syncset-gen
$ sudo mv ./syncset-gen /usr/bin/
$ syncset-gen view -h
Parses a manifest directory and prints a SyncSet/SelectorSyncSet representation of the objects it contains.

Usage:
  ss view [flags]

Flags:
  -c, --cluster-name string   The cluster name used to match the SyncSet to a Cluster
  -h, --help                  help for view
  -p, --patches string        The directory of patch manifest files to use
  -r, --resources string      The directory of resource manifest files to use
  -s, --selector string       The selector key/value pair used to match the SelectorSyncSet to Cluster(s)

Next we need a repository to store the configuration for the OpenShift/OKD clusters. Below you can see a very simple example. The ./config folder contains common configuration which is using a SelectorSyncSet with a clusterDeploymentSelector:

$ tree
.
└── config
    ├── patch
    │   └── cluster-version.yaml
    └── resource
        └── namespace.yaml

To generate a SelectorSyncSet from the ./config folder, run the syncset-gen and the following command options:

$ syncset-gen view okd-cluster-group-selectorsyncset --selector cluster-group/okd -p ./config/patch/ -r ./config/resource/
{
    "kind": "SelectorSyncSet",
    "apiVersion": "hive.openshift.io/v1",
    "metadata": {
        "name": "okd-cluster-group-selectorsyncset",
        "creationTimestamp": null,
        "labels": {
            "generated": "true"
        }
    },
    "spec": {
        "resources": [
            {
                "apiVersion": "v1",
                "kind": "Namespace",
                "metadata": {
                    "name": "myproject"
                }
            }
        ],
        "resourceApplyMode": "Sync",
        "patches": [
            {
                "apiVersion": "config.openshift.io/v1",
                "kind": "ClusterVersion",
                "name": "version",
                "patch": "{\"spec\": {\"channel\": \"stable-4.3\",\"desiredUpdate\": {\"version\": \"4.3.0\", \"image\": \"quay.io/openshift-release-dev/[email protected]:3a516480dfd68e0f87f702b4d7bdd6f6a0acfdac5cd2e9767b838ceede34d70d\"}}}",
                "patchType": "merge"
            },
            {
                "apiVersion": "rbac.authorization.k8s.io/v1",
                "kind": "ClusterRoleBinding",
                "name": "self-provisioners",
                "patch": "{\"subjects\": null}",
                "patchType": "merge"
            }
        ],
        "clusterDeploymentSelector": {
            "matchExpressions": [
                {
                    "key": "cluster-group/okd",
                    "operator": "Exists"
                }
            ]
        }
    },
    "status": {}
}

To debug SyncSets use the below command in the cluster deployment namespace which can give you a status of whether the configuration has successfully applied or if it has failed to apply:

$ oc get syncsetinstance -n <namespace>
$ oc get syncsetinstances <synsetinstance name> -o yaml

I hope this was useful to get you started using OpenShift Hive and SyncSets to apply configuration to OpenShift/OKD clusters. More information about SyncSets can be found in the OpenShift Hive repository.

OpenShift / OKD 4.x Cluster Deployment using OpenShift Hive

Before you continue to deploy an OpenShift or OKD cluster please check out my other posts about OpenShift Hive – API driven OpenShift cluster provisioning and management operator and Getting started with OpenShift Hive  because you need a running OpenShift Hive operator.

To install the OKD (OpenShift Origin Community Distribution) version we need a few things beforehand: a cluster namespace, AWS credentials, SSH keys, image pull secret, install-config, cluster image version and cluster deployment.

Let’s start to create the cluster namespace:

cat <<EOF | kubectl apply -f -
---
apiVersion: v1
kind: Namespace
metadata:
  name: okd

Create a secret with your ssh key:

$ kubectl create secret generic ssh-key -n okd --from-file=ssh-privatekey=/home/ubuntu/.ssh/id_rsa --from-file=ssh-publickey=/home/ubuntu/.ssh/id_rsa.pub

Create the AWS credential secret:

$ kubectl create secret generic aws-creds -n okd --from-literal=aws_secret_access_key=$AWS_SECRET_ACCESS_KEY --from-literal=aws_access_key_id=$AWS_ACCESS_KEY_ID

Create an image pull secret, this is not important for installing a OKD 4.x cluster but needs to be present otherwise Hive will not start the cluster deployment. If you have an RedHat Enterprise subscription for OpenShift then you need to add here your RedHat image pull secret:

$ kubectl create secret generic pull-secret -n okd --from-file=.dockerconfigjson=/home/ubuntu/.docker/config.json --type=kubernetes.io/dockerconfigjson 

Create a install-config.yaml for the cluster deployment and modify to your needs:

---
apiVersion: v1
baseDomain: kube.domain.com
compute:
- name: worker
  platform:
    aws:
      rootVolume:
        iops: 100
        size: 22
        type: gp2
      type: m4.xlarge
  replicas: 3
controlPlane:
  name: master
  platform:
    aws:
      rootVolume:
        iops: 100
        size: 22
        type: gp2
      type: m4.xlarge
replicas: 3
metadata:
  creationTimestamp: null
  name: okd
networking:
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  machineCIDR: 10.0.0.0/16
  networkType: OpenShiftSDN
  serviceNetwork:
  - 172.30.0.0/16
platform:
  aws:
    region: eu-west-1
pullSecret: ""
sshKey: ""

Create the install-config secret for the cluster deployment:

$ kubectl create secret generic install-config -n okd --from-file=install-config.yaml=./install-config.yaml

Create the ClusterImageSet for OKD. In my example I am using the latest OKD 4.4.0 release. More information about the available OKD release versions you find here: https://origin-release.svc.ci.openshift.org/

cat <<EOF | kubectl apply -f -
---
apiVersion: hive.openshift.io/v1
kind: ClusterImageSet
metadata:
  name: okd-4-4-0-imageset
spec:
  releaseImage: registry.svc.ci.openshift.org/origin/release:4.4.0-0.okd-2020-02-18-212654
EOF 

Below is an example of a RedHat Enterprise OpenShift 4 ClusterImageSet:

---
apiVersion: hive.openshift.io/v1
kind: ClusterImageSet
metadata:
  name: openshift-4-3-0-imageset
spec:
  releaseImage: quay.io/openshift-release-dev/ocp-release:4.3.0-x86_64

For Hive to start with the cluster deployment, we need to modify the manifest below and add the references to the previous created secrets, install-config and cluster imageset version:

cat <<EOF | kubectl apply -f -
---
apiVersion: hive.openshift.io/v1
kind: ClusterDeployment
metadata:
  creationTimestamp: null
  name: okd
  namespace: okd
spec:
  baseDomain: kube.domain.com
  clusterName: okd
  controlPlaneConfig:
    servingCertificates: {}
  installed: false
  platform:
    aws:
      credentialsSecretRef:
        name: aws-creds
      region: eu-west-1
  provisioning:
    imageSetRef:
      name: okd-4-4-0-imageset
    installConfigSecretRef:
      name: install-config 
  pullSecretRef:
    name: pull-secret
  sshKey:
    name: ssh-key
status:
  clusterVersionStatus:
    availableUpdates: null
    desired:
      force: false
      image: ""
      version: ""
    observedGeneration: 0
    versionHash: ""
EOF

Once you submitted the ClusterDeployment manifest, the Hive operator will start to deploy the cluster straightaway:

$ kubectl get clusterdeployments.hive.openshift.io -n okd
NAME   CLUSTERNAME   CLUSTERTYPE   BASEDOMAIN          INSTALLED   INFRAID     AGE
okd    okd                         kube.domain.com     false       okd-jcdkd   107s

Hive will create the provision (install) pod for the cluster deployment and inject the installer configuration:

$ kubectl get pods -n okd
NAME                          READY   STATUS    RESTARTS   AGE
okd-0-tbm9t-provision-c5hpf   1/3     Running   0          57s

You can view the logs to check the progress of the cluster deployment. You will see the terraform output for creating the infrastructure resources and feedback from the installer about the installation progress. At the end you will see when the installation completed successfully:

$ kubectl logs okd-0-tbm9t-provision-c5hpf -n okd -c hive -f
...
time="2020-02-23T13:31:41Z" level=debug msg="module.dns.aws_route53_zone.int: Creating..."
time="2020-02-23T13:31:42Z" level=debug msg="aws_ami_copy.main: Still creating... [3m40s elapsed]"
time="2020-02-23T13:31:51Z" level=debug msg="module.dns.aws_route53_zone.int: Still creating... [10s elapsed]"
time="2020-02-23T13:31:52Z" level=debug msg="aws_ami_copy.main: Still creating... [3m50s elapsed]"
time="2020-02-23T13:32:01Z" level=debug msg="module.dns.aws_route53_zone.int: Still creating... [20s elapsed]"
time="2020-02-23T13:32:02Z" level=debug msg="aws_ami_copy.main: Still creating... [4m0s elapsed]"
time="2020-02-23T13:32:11Z" level=debug msg="module.dns.aws_route53_zone.int: Still creating... [30s elapsed]"
time="2020-02-23T13:32:12Z" level=debug msg="aws_ami_copy.main: Still creating... [4m10s elapsed]"
time="2020-02-23T13:32:21Z" level=debug msg="module.dns.aws_route53_zone.int: Still creating... [40s elapsed]"
time="2020-02-23T13:32:22Z" level=debug msg="aws_ami_copy.main: Still creating... [4m20s elapsed]"
time="2020-02-23T13:32:31Z" level=debug msg="module.dns.aws_route53_zone.int: Still creating... [50s elapsed]"
time="2020-02-23T13:32:32Z" level=debug msg="aws_ami_copy.main: Still creating... [4m30s elapsed]"
time="2020-02-23T13:32:41Z" level=debug msg="module.dns.aws_route53_zone.int: Still creating... [1m0s elapsed]"
time="2020-02-23T13:32:41Z" level=debug msg="module.dns.aws_route53_zone.int: Creation complete after 1m0s [id=Z10411051RAEUMMAUH39E]"
time="2020-02-23T13:32:41Z" level=debug msg="module.dns.aws_route53_record.etcd_a_nodes[0]: Creating..."
time="2020-02-23T13:32:41Z" level=debug msg="module.dns.aws_route53_record.api_internal: Creating..."
time="2020-02-23T13:32:41Z" level=debug msg="module.dns.aws_route53_record.api_external_internal_zone: Creating..."
time="2020-02-23T13:32:41Z" level=debug msg="module.dns.aws_route53_record.etcd_a_nodes[2]: Creating..."
time="2020-02-23T13:32:41Z" level=debug msg="module.dns.aws_route53_record.etcd_a_nodes[1]: Creating..."
time="2020-02-23T13:32:42Z" level=debug msg="aws_ami_copy.main: Still creating... [4m40s elapsed]"
time="2020-02-23T13:32:51Z" level=debug msg="module.dns.aws_route53_record.etcd_a_nodes[0]: Still creating... [10s elapsed]"
time="2020-02-23T13:32:51Z" level=debug msg="module.dns.aws_route53_record.api_internal: Still creating... [10s elapsed]"
time="2020-02-23T13:32:51Z" level=debug msg="module.dns.aws_route53_record.api_external_internal_zone: Still creating... [10s elapsed]"
time="2020-02-23T13:32:51Z" level=debug msg="module.dns.aws_route53_record.etcd_a_nodes[2]: Still creating... [10s elapsed]"
time="2020-02-23T13:32:51Z" level=debug msg="module.dns.aws_route53_record.etcd_a_nodes[1]: Still creating... [10s elapsed]"
time="2020-02-23T13:32:52Z" level=debug msg="aws_ami_copy.main: Still creating... [4m50s elapsed]"
...
time="2020-02-23T13:34:43Z" level=debug msg="Apply complete! Resources: 123 added, 0 changed, 0 destroyed."
time="2020-02-23T13:34:43Z" level=debug msg="OpenShift Installer unreleased-master-2446-gc108297de972e1a6a5fb502a7668079d16e501f9-dirty"
time="2020-02-23T13:34:43Z" level=debug msg="Built from commit c108297de972e1a6a5fb502a7668079d16e501f9"
time="2020-02-23T13:34:43Z" level=info msg="Waiting up to 20m0s for the Kubernetes API at https://api.okd.kube.domain.com:6443..."
time="2020-02-23T13:35:13Z" level=debug msg="Still waiting for the Kubernetes API: Get https://api.okd.kube.domain.com:6443/version?timeout=32s: dial tcp 52.17.210.160:6443: connect: connection refused"
time="2020-02-23T13:35:50Z" level=debug msg="Still waiting for the Kubernetes API: Get https://api.okd.kube.domain.com:6443/version?timeout=32s: dial tcp 52.211.227.216:6443: connect: connection refused"
time="2020-02-23T13:36:20Z" level=debug msg="Still waiting for the Kubernetes API: Get https://api.okd.kube.domain.com:6443/version?timeout=32s: dial tcp 52.17.210.160:6443: connect: connection refused"
time="2020-02-23T13:36:51Z" level=debug msg="Still waiting for the Kubernetes API: Get https://api.okd.kube.domain.com:6443/version?timeout=32s: dial tcp 52.211.227.216:6443: connect: connection refused"
time="2020-02-23T13:37:58Z" level=debug msg="Still waiting for the Kubernetes API: Get https://api.okd.kube.domain.com:6443/version?timeout=32s: dial tcp 52.211.227.216:6443: connect: connection refused"
time="2020-02-23T13:38:00Z" level=debug msg="Still waiting for the Kubernetes API: the server could not find the requested resource"
time="2020-02-23T13:38:30Z" level=debug msg="Still waiting for the Kubernetes API: the server could not find the requested resource"
time="2020-02-23T13:38:58Z" level=debug msg="Still waiting for the Kubernetes API: Get https://api.okd.kube.domain.com:6443/version?timeout=32s: dial tcp 52.211.227.216:6443: connect: connection refused"
time="2020-02-23T13:39:28Z" level=debug msg="Still waiting for the Kubernetes API: Get https://api.okd.kube.domain.com:6443/version?timeout=32s: dial tcp 63.35.50.149:6443: connect: connection refused"
time="2020-02-23T13:39:36Z" level=info msg="API v1.17.1 up"
time="2020-02-23T13:39:36Z" level=info msg="Waiting up to 40m0s for bootstrapping to complete..."
...
time="2020-02-23T13:55:14Z" level=debug msg="Still waiting for the cluster to initialize: Working towards 4.4.0-0.okd-2020-02-18-212654: 97% complete"
time="2020-02-23T13:55:24Z" level=debug msg="Still waiting for the cluster to initialize: Working towards 4.4.0-0.okd-2020-02-18-212654: 99% complete"
time="2020-02-23T13:57:39Z" level=debug msg="Still waiting for the cluster to initialize: Working towards 4.4.0-0.okd-2020-02-18-212654: 99% complete, waiting on authentication, console, monitoring"
time="2020-02-23T13:57:39Z" level=debug msg="Still waiting for the cluster to initialize: Working towards 4.4.0-0.okd-2020-02-18-212654: 99% complete, waiting on authentication, console, monitoring"
time="2020-02-23T13:58:54Z" level=debug msg="Still waiting for the cluster to initialize: Working towards 4.4.0-0.okd-2020-02-18-212654: 99% complete"
time="2020-02-23T14:01:40Z" level=debug msg="Still waiting for the cluster to initialize: Working towards 4.4.0-0.okd-2020-02-18-212654: 100% complete, waiting on authentication"
time="2020-02-23T14:03:24Z" level=debug msg="Cluster is initialized"
time="2020-02-23T14:03:24Z" level=info msg="Waiting up to 10m0s for the openshift-console route to be created..."
time="2020-02-23T14:03:24Z" level=debug msg="Route found in openshift-console namespace: console"
time="2020-02-23T14:03:24Z" level=debug msg="Route found in openshift-console namespace: downloads"
time="2020-02-23T14:03:24Z" level=debug msg="OpenShift console route is created"
time="2020-02-23T14:03:24Z" level=info msg="Install complete!"
time="2020-02-23T14:03:24Z" level=info msg="To access the cluster as the system:admin user when using 'oc', run 'export KUBECONFIG=/output/auth/kubeconfig'"
time="2020-02-23T14:03:24Z" level=info msg="Access the OpenShift web-console here: https://console-openshift-console.apps.okd.kube.domain.com"
REDACTED LINE OF OUTPUT
time="2020-02-23T14:03:25Z" level=info msg="command completed successfully" installID=jcdkd
time="2020-02-23T14:03:25Z" level=info msg="saving installer output" installID=jcdkd
time="2020-02-23T14:03:25Z" level=debug msg="installer console log: level=info msg=\"Credentials loaded from default AWS environment variables\"\nlevel=info msg=\"Consuming Install Config from target directory\"\nlevel=warning msg=\"Found override for release image. Please be warned, this is not advised\"\nlevel=info msg=\"Consuming Master Machines from target directory\"\nlevel=info msg=\"Consuming Common Manifests from target directory\"\nlevel=info msg=\"Consuming OpenShift Install from target directory\"\nlevel=info msg=\"Consuming Worker Machines from target directory\"\nlevel=info msg=\"Consuming Openshift Manifests from target directory\"\nlevel=info msg=\"Consuming Master Ignition Config from target directory\"\nlevel=info msg=\"Consuming Worker Ignition Config from target directory\"\nlevel=info msg=\"Consuming Bootstrap Ignition Config from target directory\"\nlevel=info msg=\"Creating infrastructure resources...\"\nlevel=info msg=\"Waiting up to 20m0s for the Kubernetes API at https://api.okd.kube.domain.com:6443...\"\nlevel=info msg=\"API v1.17.1 up\"\nlevel=info msg=\"Waiting up to 40m0s for bootstrapping to complete...\"\nlevel=info msg=\"Destroying the bootstrap resources...\"\nlevel=error\nlevel=error msg=\"Warning: Resource targeting is in effect\"\nlevel=error\nlevel=error msg=\"You are creating a plan with the -target option, which means that the result\"\nlevel=error msg=\"of this plan may not represent all of the changes requested by the current\"\nlevel=error msg=configuration.\nlevel=error msg=\"\\t\\t\"\nlevel=error msg=\"The -target option is not for routine use, and is provided only for\"\nlevel=error msg=\"exceptional situations such as recovering from errors or mistakes, or when\"\nlevel=error msg=\"Terraform specifically suggests to use it as part of an error message.\"\nlevel=error\nlevel=error\nlevel=error msg=\"Warning: Applied changes may be incomplete\"\nlevel=error\nlevel=error msg=\"The plan was created with the -target option in effect, so some changes\"\nlevel=error msg=\"requested in the configuration may have been ignored and the output values may\"\nlevel=error msg=\"not be fully updated. Run the following command to verify that no other\"\nlevel=error msg=\"changes are pending:\"\nlevel=error msg=\"    terraform plan\"\nlevel=error msg=\"\\t\"\nlevel=error msg=\"Note that the -target option is not suitable for routine use, and is provided\"\nlevel=error msg=\"only for exceptional situations such as recovering from errors or mistakes, or\"\nlevel=error msg=\"when Terraform specifically suggests to use it as part of an error message.\"\nlevel=error\nlevel=info msg=\"Waiting up to 30m0s for the cluster at https://api.okd.kube.domain.com:6443 to initialize...\"\nlevel=info msg=\"Waiting up to 10m0s for the openshift-console route to be created...\"\nlevel=info msg=\"Install complete!\"\nlevel=info msg=\"To access the cluster as the system:admin user when using 'oc', run 'export KUBECONFIG=/output/auth/kubeconfig'\"\nlevel=info msg=\"Access the OpenShift web-console here: https://console-openshift-console.apps.okd.kube.domain.com\"\nREDACTED LINE OF OUTPUT\n" installID=vxghr9br
time="2020-02-23T14:03:25Z" level=info msg="install completed successfully" installID=jcdkd

After the installation of the cluster deployment has finished, the Installed value is set to True:

$ kubectl get clusterdeployments.hive.openshift.io  -n okd
NAME   CLUSTERNAME   CLUSTERTYPE   BASEDOMAIN          INSTALLED   INFRAID      AGE
okd    okd                         kube.domain.com     true        okd-jcdkd    54m

At this point you can start using the platform by getting the login credentials from the cluster credential secret Hive created during the installation:

$ kubectl get secrets -n okd okd-0-tbm9t-admin-password -o jsonpath='{.data.username}' | base64 -d
kubeadmin
$ kubectl get secrets -n okd okd-0-tbm9t-admin-password -o jsonpath='{.data.password}' | base64 -d
2T38d-aETpX-dj2YU-UBN4a

Log in via the command-line or the web console:

To delete the cluster simply delete the ClusterDeployment resources which initiates a cluster deprovision and will delete all related AWS resources. If the deprovision gets stuck, manually delete the uninstall finalizer allowing the cluster deployment to be deleted, but note that this may leave artifacts in your AWS account:

$ kubectl delete clusterdeployments.hive.openshift.io okd -n okd --wait=false
clusterdeployment.hive.openshift.io "okd" deleted

Please visit the OpenShift Hive documentation for more information about using Hive.

In the next article I will explain how you can use OpenShift Hive to create, update, delete, patch cluster resources using SyncSets.

Running Istio Service Mesh on OpenShift

In the Kubernetes/OpenShift community everyone is talking about Istio service mesh, so I wanted to share my experience about the installation and running a sample microservice application with Istio on OpenShift 3.11 and 4.0. Service mesh on OpenShift is still at least a few month away from being available generally to run in production but this gives you the possibility to start testing and exploring Istio. I have found good documentation about installing Istio on OCP and OKD have a look for more information.

To install Istio on OpenShift 3.11 you need to apply the node and master prerequisites you see below; for OpenShift 4.0 and above you can skip these steps and go directly to the istio-operator installation:

sudo bash -c 'cat << EOF > /etc/origin/master/master-config.patch
admissionConfig:
  pluginConfig:
    MutatingAdmissionWebhook:
      configuration:
        apiVersion: apiserver.config.k8s.io/v1alpha1
        kubeConfigFile: /dev/null
        kind: WebhookAdmission
    ValidatingAdmissionWebhook:
      configuration:
        apiVersion: apiserver.config.k8s.io/v1alpha1
        kubeConfigFile: /dev/null
        kind: WebhookAdmission
EOF'
        
sudo cp -p /etc/origin/master/master-config.yaml /etc/origin/master/master-config.yaml.prepatch
sudo bash -c 'oc ex config patch /etc/origin/master/master-config.yaml.prepatch -p "$(cat /etc/origin/master/master-config.patch)" > /etc/origin/master/master-config.yaml'
sudo su -
master-restart api
master-restart controllers
exit       

sudo bash -c 'cat << EOF > /etc/sysctl.d/99-elasticsearch.conf 
vm.max_map_count = 262144
EOF'

sudo sysctl vm.max_map_count=262144

The Istio installation is straight forward by starting first to install the istio-operator:

oc new-project istio-operator
oc new-app -f https://raw.githubusercontent.com/Maistra/openshift-ansible/maistra-0.9/istio/istio_community_operator_template.yaml --param=OPENSHIFT_ISTIO_MASTER_PUBLIC_URL=<-master-public-hostname->

Verify the operator deployment:

oc logs -n istio-operator $(oc -n istio-operator get pods -l name=istio-operator --output=jsonpath={.items..metadata.name})

Once the operator is running we can start deploying Istio components by creating a custom resource:

cat << EOF >  ./istio-installation.yaml
apiVersion: "istio.openshift.com/v1alpha1"
kind: "Installation"
metadata:
  name: "istio-installation"
  namespace: istio-operator
EOF

oc create -n istio-operator -f ./istio-installation.yaml

Check and watch the Istio installation progress which might take a while to complete:

oc get pods -n istio-system -w

# The installation of the core components is finished when you see:
...
openshift-ansible-istio-installer-job-cnw72   0/1       Completed   0         4m

Afterwards, to finish off the Istio installation, we need to install the Kiali web console:

bash <(curl -L https://git.io/getLatestKialiOperator)
oc get route -n istio-system -l app=kiali

Verifying that all Istio components are running:

$ oc get pods -n istio-system
NAME                                          READY     STATUS      RESTARTS   AGE
elasticsearch-0                               1/1       Running     0          9m
grafana-74b5796d94-4ll5d                      1/1       Running     0          9m
istio-citadel-db879c7f8-kfxfk                 1/1       Running     0          11m
istio-egressgateway-6d78858d89-58lsd          1/1       Running     0          11m
istio-galley-6ff54d9586-8r7cl                 1/1       Running     0          11m
istio-ingressgateway-5dcf9fdf4b-4fjj5         1/1       Running     0          11m
istio-pilot-7ccf64f659-ghh7d                  2/2       Running     0          11m
istio-policy-6c86656499-v45zr                 2/2       Running     3          11m
istio-sidecar-injector-6f696b8495-8qqjt       1/1       Running     0          11m
istio-telemetry-686f78b66b-v7ljf              2/2       Running     3          11m
jaeger-agent-k4tpz                            1/1       Running     0          9m
jaeger-collector-64bc5678dd-wlknc             1/1       Running     0          9m
jaeger-query-776d4d754b-8z47d                 1/1       Running     0          9m
kiali-5fd946b855-7lw2h                        1/1       Running     0          2m
openshift-ansible-istio-installer-job-cnw72   0/1       Completed   0          13m
prometheus-75b849445c-l7rlr                   1/1       Running     0          11m

Let’s start to deploy the microservice application example by using the Google Hipster Shop, it contains multiple microservices which is great to test with Istio:

# Create new project
oc new-project hipster-shop

# Set permissions to allow Istio to deploy the Envoy-Proxy side-car container
oc adm policy add-scc-to-user anyuid -z default -n hipster-shop
oc adm policy add-scc-to-user privileged -z default -n hipster-shop

# Create Hipster Shop deployments and Istio services
oc create -f https://raw.githubusercontent.com/berndonline/openshift-ansible/master/examples/istio-hipster-shop.yml
oc create -f https://raw.githubusercontent.com/berndonline/openshift-ansible/master/examples/istio-manifest.yml

# Wait and check that all pods are running before creating the load generator
oc get pods -n hipster-shop -w

# Create load generator deployment
oc create -f https://raw.githubusercontent.com/berndonline/openshift-ansible/master/examples/istio-loadgenerator.yml

As you see below each pod has a sidecar container with the Istio Envoy proxy which handles pod traffic:

[[email protected] ~]$ oc get pods
NAME                                     READY     STATUS    RESTARTS   AGE
adservice-7894dbfd8c-g4m9v               2/2       Running   0          49m
cartservice-758d66c648-79fj4             2/2       Running   4          49m
checkoutservice-7b9dc8b755-h2b2v         2/2       Running   0          49m
currencyservice-7b5c5f48fc-gtm9x         2/2       Running   0          49m
emailservice-79578566bb-jvwbw            2/2       Running   0          49m
frontend-6497c5f748-5fc4f                2/2       Running   0          49m
loadgenerator-764c5547fc-sw6mg           2/2       Running   0          40m
paymentservice-6b989d657c-klp4d          2/2       Running   0          49m
productcatalogservice-5bfbf4c77c-cw676   2/2       Running   0          49m
recommendationservice-c947d84b5-svbk8    2/2       Running   0          49m
redis-cart-79d84748cf-cvg86              2/2       Running   0          49m
shippingservice-6ccb7d8ff7-66v8m         2/2       Running   0          49m
[[email protected] ~]$

The Kiali web console answers the question about what microservices are part of the service mesh and how are they connected which gives you a great level of detail about the traffic flows:

Detailed traffic flow view:

The Isito installation comes with Jaeger which is an open source tracing tool to monitor and troubleshoot transactions:

Enough about this, lets connect to our cool Hipster Shop and happy shopping:

Additionally there is another example, the Istio Bookinfo if you want to try something smaller and less complex:

oc new-project myproject

oc adm policy add-scc-to-user anyuid -z default -n myproject
oc adm policy add-scc-to-user privileged -z default -n myproject

oc apply -n myproject -f https://raw.githubusercontent.com/Maistra/bookinfo/master/bookinfo.yaml
oc apply -n myproject -f https://raw.githubusercontent.com/Maistra/bookinfo/master/bookinfo-gateway.yaml
export GATEWAY_URL=$(oc get route -n istio-system istio-ingressgateway -o jsonpath='{.spec.host}')
curl -o /dev/null -s -w "%{http_code}\n" http://$GATEWAY_URL/productpage

curl -o destination-rule-all.yaml https://raw.githubusercontent.com/istio/istio/release-1.0/samples/bookinfo/networking/destination-rule-all.yaml
oc apply -f destination-rule-all.yaml

curl -o destination-rule-all-mtls.yaml https://raw.githubusercontent.com/istio/istio/release-1.0/samples/bookinfo/networking/destination-rule-all-mtls.yaml
oc apply -f destination-rule-all-mtls.yaml

oc get destinationrules -o yaml

I hope this is a useful article for getting started with Istio service mesh on OpenShift.