Kubernetes Cluster API – Provision workload clusters on AWS

The past few months I have been following the progress of the Kubernetes Cluster API which is part of the Kubernetes SIG (special interest group) Cluster-Lifecycle because they made good progress and wanted to try out the AWS provider version to deploy Kubeadm clusters. There are multiple infrastructure / cloud providers available which can be used, have a look at supported providers.

RedHat has based the Machine API Operator for the OpenShift 4 platform on the Kubernetes Cluster API and forked some of the cloud provider integrations but in OpenShift 4 this has a different use-case for the cluster to managed itself without the need of a central management cluster. I actually like RedHat’s concept and adaptation of the Cluster API and I hope we will see something similar in the upstream project.

Bootstrapping workload clusters are pretty straight forward but before we can start with deploying the workload cluster we need a central Kubernetes management cluster for running the Cluster API components for your selected cloud provider. In The Cluster API Book for example they use a KinD (Kubernetes in Docker) cluster to provision the workload clusters.

To deploy the Cluster API components you need the clusterctl (Cluster API) and clusterawsadm (Cluster API AWS Provider) command-line utilities.

curl -L https://github.com/kubernetes-sigs/cluster-api/releases/download/v0.3.14/clusterctl-linux-amd64 -o clusterctl
chmod +x ./clusterctl
sudo mv ./clusterctl /usr/local/bin/clusterctl
curl -L https://github.com/kubernetes-sigs/cluster-api-provider-aws/releases/download/v0.6.4/clusterawsadm-linux-amd64 -o clusterawsadm
chmod +x ./clusterawsadm
sudo mv ./clusterawsadm /usr/local/bin/clusterawsadm

Let’s start to prepare to initialise the management cluster. You need a AWS IAM service account and in my example I enabled the experimental features-gates for MachinePool and ClusterResourceSets before running clusterawsadm to apply the required AWS IAM configuration.

$ export AWS_ACCESS_KEY_ID='<-YOUR-ACCESS-KEY->'
$ export AWS_SECRET_ACCESS_KEY='<-YOUR-SECRET-ACCESS-KEY->'
$ export EXP_MACHINE_POOL=true
$ export EXP_CLUSTER_RESOURCE_SET=true
$ clusterawsadm bootstrap iam create-cloudformation-stack
Attempting to create AWS CloudFormation stack cluster-api-provider-aws-sigs-k8s-io
I1206 22:23:19.620891  357601 service.go:59] AWS Cloudformation stack "cluster-api-provider-aws-sigs-k8s-io" already exists, updating

Following resources are in the stack: 

Resource                  |Type                                                                                |Status
AWS::IAM::InstanceProfile |control-plane.cluster-api-provider-aws.sigs.k8s.io                                  |CREATE_COMPLETE
AWS::IAM::InstanceProfile |controllers.cluster-api-provider-aws.sigs.k8s.io                                    |CREATE_COMPLETE
AWS::IAM::InstanceProfile |nodes.cluster-api-provider-aws.sigs.k8s.io                                          |CREATE_COMPLETE
AWS::IAM::ManagedPolicy   |arn:aws:iam::552276840222:policy/control-plane.cluster-api-provider-aws.sigs.k8s.io |CREATE_COMPLETE
AWS::IAM::ManagedPolicy   |arn:aws:iam::552276840222:policy/nodes.cluster-api-provider-aws.sigs.k8s.io         |CREATE_COMPLETE
AWS::IAM::ManagedPolicy   |arn:aws:iam::552276840222:policy/controllers.cluster-api-provider-aws.sigs.k8s.io   |CREATE_COMPLETE
AWS::IAM::Role            |control-plane.cluster-api-provider-aws.sigs.k8s.io                                  |CREATE_COMPLETE
AWS::IAM::Role            |controllers.cluster-api-provider-aws.sigs.k8s.io                                    |CREATE_COMPLETE
AWS::IAM::Role            |nodes.cluster-api-provider-aws.sigs.k8s.io                                          |CREATE_COMPLETE

This might take a few minutes before you can continue and run clusterctl to initialise the Cluster API components on your Kubernetes management cluster with the option –watching-namespace where you can apply the cluster deployment manifests.

$ export AWS_B64ENCODED_CREDENTIALS=$(clusterawsadm bootstrap credentials encode-as-profile)

WARNING: `encode-as-profile` should only be used for bootstrapping.

$ clusterctl init --infrastructure aws --watching-namespace k8s
Fetching providers
Installing cert-manager Version="v0.16.1"
Waiting for cert-manager to be available...
Installing Provider="cluster-api" Version="v0.3.14" TargetNamespace="capi-system"
Installing Provider="bootstrap-kubeadm" Version="v0.3.14" TargetNamespace="capi-kubeadm-bootstrap-system"
Installing Provider="control-plane-kubeadm" Version="v0.3.14" TargetNamespace="capi-kubeadm-control-plane-system"
Installing Provider="infrastructure-aws" Version="v0.6.3" TargetNamespace="capa-system"

Your management cluster has been initialized successfully!

You can now create your first workload cluster by running the following:

  clusterctl config cluster [name] --kubernetes-version [version] | kubectl apply -f -

Now we have finished deploying the needed Cluster API components and are ready to create your first Kubernetes workload cluster. I go through the different custom resources and configuration options for the cluster provisioning. This starts with the cloud infrastructure configuration as you see in the example below for the VPC setup. You don’t have to use all three Availability Zone and can start with a single AZ in a region.

---
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
kind: AWSCluster
metadata:
  name: cluster-1
  namespace: k8s
spec:
  region: eu-west-1
  sshKeyName: default
  networkSpec:
    vpc:
      cidrBlock: "10.0.0.0/23"
    subnets:
    - availabilityZone: eu-west-1a
      cidrBlock: "10.0.0.0/27"
      isPublic: true
    - availabilityZone: eu-west-1b
      cidrBlock: "10.0.0.32/27"
      isPublic: true
    - availabilityZone: eu-west-1c
      cidrBlock: "10.0.0.64/27"
      isPublic: true
    - availabilityZone: eu-west-1a
      cidrBlock: "10.0.1.0/27"
    - availabilityZone: eu-west-1b
      cidrBlock: "10.0.1.32/27"
    - availabilityZone: eu-west-1c
      cidrBlock: "10.0.1.64/27"

Alternatively you can also provision the workload cluster into an existing VPC, in this case your cloud infrastructure configuration looks slightly different and you need to specify VPC and subnet IDs.

---
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
kind: AWSCluster
metadata:
  name: cluster-1
  namespace: k8s
spec:
  region: eu-west-1
  sshKeyName: default
  networkSpec:
    vpc:
      id: vpc-0425c335226437144
    subnets:
    - id: subnet-0261219d564bb0dc5
    - id: subnet-0fdcccba78668e013
...

Next we define the Kubeadm control-plane configuration and start with the AWS Machine Template to define the instance type and custom node configuration. Then follows the Kubeadm control-plane config referencing the machine template and amounts of replicas and Kubernetes control-plane version:

---
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
kind: AWSMachineTemplate
metadata:
  name: cluster-1
  namespace: k8s
spec:
  template:
    spec:
      iamInstanceProfile: control-plane.cluster-api-provider-aws.sigs.k8s.io
      instanceType: t3.small
      sshKeyName: default
---
apiVersion: controlplane.cluster.x-k8s.io/v1alpha3
kind: KubeadmControlPlane
metadata:
  name: cluster-1-control-plane
  namespace: k8s
spec:
  infrastructureTemplate:
    apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
    kind: AWSMachineTemplate
    name: cluster-1-control-plane
  kubeadmConfigSpec:
    clusterConfiguration:
      apiServer:
        extraArgs:
          cloud-provider: aws
      controllerManager:
        extraArgs:
          cloud-provider: aws
    initConfiguration:
      nodeRegistration:
        kubeletExtraArgs:
          cloud-provider: aws
        name: '{{ ds.meta_data.local_hostname }}'
    joinConfiguration:
      nodeRegistration:
        kubeletExtraArgs:
          cloud-provider: aws
        name: '{{ ds.meta_data.local_hostname }}'
  replicas: 1
  version: v1.20.4

We continue with the data-plane (worker) nodes which also starts with the AWS machine template, additionally we need a Kubeadm Config Template and then the Machine Deployment for the worker nodes with a number of replicas and used Kubernetes version.

---
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
kind: AWSMachineTemplate
metadata:
  name: cluster-1-data-plane-0
  namespace: k8s
spec:
  template:
    spec:
      iamInstanceProfile: nodes.cluster-api-provider-aws.sigs.k8s.io
      instanceType: t3.small
      sshKeyName: default
---
apiVersion: bootstrap.cluster.x-k8s.io/v1alpha3
kind: KubeadmConfigTemplate
metadata:
  name: cluster-1-data-plane-0
  namespace: k8s
spec:
  template:
    spec:
      joinConfiguration:
        nodeRegistration:
          kubeletExtraArgs:
            cloud-provider: aws
          name: '{{ ds.meta_data.local_hostname }}'
---
apiVersion: cluster.x-k8s.io/v1alpha3
kind: MachineDeployment
metadata:
  name: cluster-1-data-plane-0
  namespace: k8s
spec:
  clusterName: cluster-1
  replicas: 1
  selector:
    matchLabels: null
  template:
    metadata:
      labels:
        "nodepool": "nodepool-0"
    spec:
      bootstrap:
        configRef:
          apiVersion: bootstrap.cluster.x-k8s.io/v1alpha3
          kind: KubeadmConfigTemplate
          name: cluster-1-data-plane-0
      clusterName: cluster-1
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
        kind: AWSMachineTemplate
        name: cluster-1-data-plane-0
      version: v1.20.4

A workload cluster can be very easily upgraded by changing the .spec.version in the MachineDeployment and KubeadmControlPlane configuration. You can’t jump over a Kubernetes versions and can only upgrade to the next available version example: v1.18.4 to v1.19.8 or v1.19.8 to v1.20.4. See the list of supported AMIs and Kubernetes versions for the AWS provider.

At the beginning we enabled the feature-gates when we were initialising the management cluster to allow us to use ClusterResourceSets. This is incredible useful because I can define a set of resources which gets applied during the provisioning of the cluster. This only get executed one time during the bootstrap and will be not reconciled afterwards. In the configuration you see the reference to two configmaps for adding the Calico CNI plugin and the Nginx Ingress controller.

---
apiVersion: addons.cluster.x-k8s.io/v1alpha3
kind: ClusterResourceSet
metadata:
  name: cluster-1-crs-0
  namespace: k8s
spec:
  clusterSelector:
    matchLabels:
      cluster.x-k8s.io/cluster-name: cluster-1
  resources:
  - kind: ConfigMap
    name: calico-cni
  - kind: ConfigMap
    name: nginx-ingress

Example of the two configmaps which contain the YAML manifests:

apiVersion: v1
kind: ConfigMap
metadata:
  creationTimestamp: null
  name: calico-cni
  namespace: k8s
data:
  calico.yaml: |+
    ---
    # Source: calico/templates/calico-config.yaml
    # This ConfigMap is used to configure a self-hosted Calico installation.
    kind: ConfigMap
    apiVersion: v1
    metadata:
      name: calico-config
      namespace: kube-system
...
---
apiVersion: v1
data:
  deploy.yaml: |+
    ---
    apiVersion: v1
    kind: Namespace
    metadata:
      name: ingress-nginx
      labels:
        app.kubernetes.io/name: ingress-nginx
        app.kubernetes.io/instance: ingress-nginx
...

Without ClusterResourceSet you would need to manually apply the CNI and ingress controller manifests which is not great because you need the CNI plugin for all nodes to go into Ready state.

$ kubectl --kubeconfig=./cluster-1.kubeconfig   apply -f https://docs.projectcalico.org/v3.15/manifests/calico.yaml
$ kubectl --kubeconfig=./cluster-1.kubeconfig apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v0.41.2/deploy/static/provider/aws/deploy.yaml

Finally after we have created the configuration of the workload cluster we can apply cluster manifest with the option for setting custom clusterNetwork and specify with service and pod IP range.

---
apiVersion: cluster.x-k8s.io/v1alpha3
kind: Cluster
metadata:
  name: cluster-1
  namespace: k8s
  labels:
    cluster.x-k8s.io/cluster-name: cluster-1
spec:
  clusterNetwork:
    services:
      cidrBlocks:
      - 172.30.0.0/16
    pods:
      cidrBlocks:
      - 10.128.0.0/14
  controlPlaneRef:
    apiVersion: controlplane.cluster.x-k8s.io/v1alpha3
    kind: KubeadmControlPlane
    name: cluster-1-control-plane
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
    kind: AWSCluster
    name: cluster-1

The provisioning of the workload cluster will take around 10 to 15 mins and you can follow the progress by checking the status of different configurations we have applied previously.

You can scale both Kubeadm control-plane and MachineDeployment afterwards to change the size of your cluster. MachineDeployment can be scaled down to zero to save cost.

$ kubectl scale KubeadmControlPlane cluster-1-control-plane --replicas=1
$ kubectl scale MachineDeployment cluster-1-data-plane-0 --replicas=0

After the provisioning is completed you can get kubeconfig of the cluster from the secret which got created during the bootstrap:

$ kubectl --namespace=k8s get secret cluster-1-kubeconfig    -o jsonpath={.data.value} | base64 --decode    > cluster-1.kubeconfig

Example check the node state.

$ kubectl --kubeconfig=./cluster-1.kubeconfig get nodes

When your cluster is provisioned and nodes are in Ready state you can apply the MachineHealthCheck for the data-plane (worker) nodes. This automatically remediate unhealthy nodes and provisions new nodes to join them into the cluster.

---
apiVersion: cluster.x-k8s.io/v1alpha3
kind: MachineHealthCheck
metadata:
  name: cluster-1-node-unhealthy-5m
  namespace: k8s
spec:
  # clusterName is required to associate this MachineHealthCheck with a particular cluster
  clusterName: cluster-1
  # (Optional) maxUnhealthy prevents further remediation if the cluster is already partially unhealthy
  maxUnhealthy: 40%
  # (Optional) nodeStartupTimeout determines how long a MachineHealthCheck should wait for
  # a Node to join the cluster, before considering a Machine unhealthy
  nodeStartupTimeout: 10m
  # selector is used to determine which Machines should be health checked
  selector:
    matchLabels:
      nodepool: nodepool-0 
  # Conditions to check on Nodes for matched Machines, if any condition is matched for the duration of its timeout, the Machine is considered unhealthy
  unhealthyConditions:
  - type: Ready
    status: Unknown
    timeout: 300s
  - type: Ready
    status: "False"
    timeout: 300s

I hope this is a useful article for getting started with the Kubernetes Cluster API.

OpenShift Hive – Deploy Single Node (All-in-One) OKD Cluster on AWS

The concept of a single-node or All-in-One OpenShift / Kubernetes cluster isn’t something new, years ago when I was working with OpenShift 3 and before that with native Kubernetes, we were using single-node clusters as ephemeral development environment, integrations testing for pull-request or platform releases. It was only annoying because this required complex Jenkins pipelines, provision the node first, then install prerequisites and run the openshift-ansible installer playbook. Not always reliable and not a great experience but it done the job.

This is possible as well with the new OpenShift/OKD 4 version and with the help from OpenShift Hive. The experience is more reliable and quicker than previously and I don’t need to worry about de-provisioning, I will let Hive delete the cluster after a few hours automatically.

It requires a few simple modifications in the install-config. You need to add the Availability Zone you want where the instance will be created. When doing this the VPC will only have two subnets, one public and one private subnet in eu-west-1. You can also install the single-node cluster into an existing VPC you just have to specify subnet ids. Change the compute worker node replicas zero and control-plane replicas to one. Make sure to have an instance size with enough CPU and memory for all OpenShift components because they need to fit onto the single node. The rest of the install-config is pretty much standard.

---
apiVersion: v1
baseDomain: k8s.domain.com
compute:
- name: worker
  platform:
    aws:
      zones:
      - eu-west-1a
      rootVolume:
        iops: 100
        size: 22
        type: gp2
      type: r4.xlarge
  replicas: 0
controlPlane:
  name: master
  platform:
    aws:
      zones:
      - eu-west-1a
      rootVolume:
        iops: 100
        size: 22
        type: gp2
      type: r5.2xlarge
  replicas: 1
metadata:
  creationTimestamp: null
  name: okd-aio
networking:
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  machineCIDR: 10.0.0.0/16
  networkType: OpenShiftSDN
  serviceNetwork:
  - 172.30.0.0/16
platform:
  aws:
    region: eu-west-1
pullSecret: ""
sshKey: ""

Create a new install-config secret for the cluster.

kubectl create secret generic install-config-aio -n okd --from-file=install-config.yaml=./install-config-aio.yaml

We will be using OpenShift Hive for the cluster deployment because the provision is more simplified and Hive can also apply any configuration using SyncSets or SelectorSyncSets which is needed. Add the annotation hive.openshift.io/delete-after: “2h” and Hive will automatically delete the cluster after 4 hours.

---
apiVersion: hive.openshift.io/v1
kind: ClusterDeployment
metadata:
  creationTimestamp: null
  annotations:
    hive.openshift.io/delete-after: "2h"
  name: okd-aio 
  namespace: okd
spec:
  baseDomain: k8s.domain.com
  clusterName: okd-aio
  controlPlaneConfig:
    servingCertificates: {}
  installed: false
  platform:
    aws:
      credentialsSecretRef:
        name: aws-creds
      region: eu-west-1
  provisioning:
    releaseImage: quay.io/openshift/okd:4.5.0-0.okd-2020-07-14-153706-ga
    installConfigSecretRef:
      name: install-config-aio
  pullSecretRef:
    name: pull-secret
  sshKey:
    name: ssh-key
status:
  clusterVersionStatus:
    availableUpdates: null
    desired:
      force: false
      image: ""
      version: ""
    observedGeneration: 0
    versionHash: ""

Apply the cluster deployment to your clusters namespace.

kubectl apply -f  ./clusterdeployment-aio.yaml

This is slightly faster than provision 6 nodes cluster and will take around 30mins until your ephemeral test cluster is ready to use.

Mozilla SOPS and GitOps Toolkit (Flux CD v2) to decrypt and apply Kubernetes secrets

Using GitOps way of working and tools like the GitOps toolkit (Flux CD v2) is great for applying configuration to your Kubernetes clusters but what about secrets and how can you store them securely in your repository? The perfect tool for this is Mozilla’s SOPS which uses a cloud based KMS, HashiCorp Vault or a PGP key to encrypt and decrypt your secrets and store them in encrypted form with the rest of your configuration in a code repostory. There is a guide in the Flux documentation about how to use SOPS but I did this slightly differently with a Google Cloud KMS.

Start by downloading the latest version of the Mozilla SOPS command-line binary. This is what makes SOPS so easy to use, there is not much you need to encrypt or decrypt secrets apart for an KMS system or a simple PGP key.

sudo wget -O /usr/local/bin/sops https://github.com/mozilla/sops/releases/download/v3.7.1/sops-v3.7.1.linux
sudo chmod 755 /usr/local/bin/sops

Next create the Google Cloud KMS, which I am using in my example.

$ gcloud auth application-default login
Go to the following link in your browser:

    https://accounts.google.com/o/oauth2/auth?code_challenge=xxxxxxx&prompt=select_account&code_challenge_method=S256&access_type=offline&redirect_uri=urn%3Aietf%3Awg%3Aoauth%3A2.0%3Aoob&response_type=code&client_id=xxxxxxxxx-xxxxxxxxxxxxxxxxxx.apps.googleusercontent.com&scope=https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fuserinfo.email+https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fcloud-platform+https%3A%2F%2Fwww.googleapis.com%2Fauth%2Faccounts.reauth


Enter verification code: xxxxxxxxxxxxxxx

Credentials saved to file: [/home/ubuntu/.config/gcloud/application_default_credentials.json]

These credentials will be used by any library that requests Application Default Credentials (ADC).
$ gcloud kms keyrings create sops --location global
$ gcloud kms keys create sops-key --location global --keyring sops --purpose encryption
$ gcloud kms keys list --location global --keyring sops
NAME                                                                           PURPOSE          ALGORITHM                    PROTECTION_LEVEL  LABELS  PRIMARY_ID  PRIMARY_STATE
projects/kubernetes-xxxxxx/locations/global/keyRings/sops/cryptoKeys/sops-key  ENCRYPT_DECRYPT  GOOGLE_SYMMETRIC_ENCRYPTION  SOFTWARE                  1           ENABLED

To encrypt secrets you need to create a .sops.yaml file in root your code repository.

creation_rules:
  - path_regex: \.yaml$
    gcp_kms: projects/kubernetes-xxxxxx/locations/global/keyRings/sops/cryptoKeys/sops-key
    encrypted_regex: ^(data|stringData)$

Let’s create a simple Kubernetes secret for testing.

$ cat secret.yaml
---
apiVersion: v1
kind: Secret
metadata:
  name: mysecret
  namespace: default
type: Opaque
data:
  username: YWRtaW4=
  password: MWYyZDFlMmU2N2Rm

Encrypt your secret.yaml using SOPS with the following example.

$ sops -e secret.yaml
apiVersion: v1
kind: Secret
metadata:
    name: mysecret
    namespace: default
type: Opaque
data:
    username: ENC[AES256_GCM,data:<-HASH->,type:str]
    password: ENC[AES256_GCM,data:<-HASH->,type:str]
sops:
    kms: []
    gcp_kms:
    -   resource_id: projects/kubernetes-xxxxxx/locations/global/keyRings/sops/cryptoKeys/sops-key
        created_at: '2021-03-01T17:25:29Z'
        enc: <-HASH->
    azure_kv: []
    lastmodified: '2021-03-01T17:25:29Z'
    mac: ENC[AES256_GCM,data:<-HASH->,type:str]
    pgp: []
    encrypted_regex: ^(data|stringData)$
    version: 3.5.0

Alternatively you can encrypt and replace the file in-place.

$ sops -i -e secret.yaml

To decrypt the yaml file use sops -d or replace in-place using sops -i -d.

$ sops -d secret.yaml
apiVersion: v1
kind: Secret
metadata:
    name: mysecret
    namespace: default
type: Opaque
data:
    username: YWRtaW4=
    password: MWYyZDFlMmU2N2Rm

You can also edit an encrypted file with the default terminal editor by directly using the sops command without any options.

$ sops secret.yaml
File has not changed, exiting.

Let’s use the Flux CD Kustomize controller for this to decrypt Kubernetes secrets and apply to the specified namespace. First you need to create a GCP service account for Flux and grant the permission to decrypt.

Download the GCP json authentication file for the service account and create a new secret in the Flux namespace.

$ kubectl create secret generic gcp-auth -n gotk-system --from-file=./sops-gcp
$ kubectl get secrets -n gotk-system gcp-auth -o yaml
apiVersion: v1
data:
  sops-gcp: <-BASE64-ENCODED-GCP-AUTH-JSON->
kind: Secret
metadata:
  creationTimestamp: "2021-03-01T17:34:11Z"
  name: gcp-auth
  namespace: gotk-system
  resourceVersion: "1879000"
  selfLink: /api/v1/namespaces/gotk-system/secrets/gcp-auth
  uid: 10a14c1f-19a6-41a2-8610-694b12efefee
type: Opaque

You need to update the kustomize-controller deployment and add the volume mount for the sops GCP secret and the environment variable with the value where to find the Google application credential file. This is where my example is different to what is documented because I am not using integrated cloud authentication because my cluster is running locally.

...
    spec:
      containers:
...
      - env:
        - name: GOOGLE_APPLICATION_CREDENTIALS
          value: /tmp/.gcp/credentials/sops-gcp
        name: manager
        volumeMounts:
        - mountPath: /tmp/.gcp/credentials
          name: sops-gcp
          readOnly: true
      volumes:
      - name: sops-gcp
        secret:
          defaultMode: 420
          secretName: sops-gcp
...

In the Kustomize object you enable the sops decryption provider and the controller automatically decrypts and applies secrets in the next reconcile loop.

apiVersion: kustomize.toolkit.fluxcd.io/v1beta1
kind: Kustomization
metadata:
  name: cluster
  namespace: gotk-system
spec:
  decryption:
    provider: sops
  interval: 5m0s
  path: ./clusters/cluster-dev
  prune: true
  sourceRef:
    kind: GitRepository
    name: github-source

This takes a few minutes until the sync is completed and we can find out if the example secret got created correctly.

$ kubectl get secrets mysecret -n default -o yaml
apiVersion: v1
data:
  password: MWYyZDFlMmU2N2Rm
  username: YWRtaW4=
kind: Secret
metadata:
  name: mysecret
  namespace: default
  resourceVersion: "3439293"
  selfLink: /api/v1/namespaces/default/secrets/mysecret
  uid: 4a009675-3c89-448b-bb86-6211cec3d4ea
type: Opaque

This is how to use SOPS and Flux CD v2 to decrypt and apply Kubernetes secrets using GitOps.

OpenShift / OKD 4.x Cluster Deployment using OpenShift Hive

Before you continue to deploy an OpenShift or OKD cluster please check out my other posts about OpenShift Hive – API driven OpenShift cluster provisioning and management operator and Getting started with OpenShift Hive  because you need a running OpenShift Hive operator.

To install the OKD (OpenShift Origin Community Distribution) version we need a few things beforehand: a cluster namespace, AWS credentials, SSH keys, image pull secret, install-config, cluster image version and cluster deployment.

Let’s start to create the cluster namespace:

cat <<EOF | kubectl apply -f -
---
apiVersion: v1
kind: Namespace
metadata:
  name: okd

Create a secret with your ssh key:

$ kubectl create secret generic ssh-key -n okd --from-file=ssh-privatekey=/home/ubuntu/.ssh/id_rsa --from-file=ssh-publickey=/home/ubuntu/.ssh/id_rsa.pub

Create the AWS credential secret:

$ kubectl create secret generic aws-creds -n okd --from-literal=aws_secret_access_key=$AWS_SECRET_ACCESS_KEY --from-literal=aws_access_key_id=$AWS_ACCESS_KEY_ID

Create an image pull secret, this is not important for installing a OKD 4.x cluster but needs to be present otherwise Hive will not start the cluster deployment. If you have an RedHat Enterprise subscription for OpenShift then you need to add here your RedHat image pull secret:

$ kubectl create secret generic pull-secret -n okd --from-file=.dockerconfigjson=/home/ubuntu/.docker/config.json --type=kubernetes.io/dockerconfigjson 

Create a install-config.yaml for the cluster deployment and modify to your needs:

---
apiVersion: v1
baseDomain: kube.domain.com
compute:
- name: worker
  platform:
    aws:
      rootVolume:
        iops: 100
        size: 22
        type: gp2
      type: m4.xlarge
  replicas: 3
controlPlane:
  name: master
  platform:
    aws:
      rootVolume:
        iops: 100
        size: 22
        type: gp2
      type: m4.xlarge
replicas: 3
metadata:
  creationTimestamp: null
  name: okd
networking:
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  machineCIDR: 10.0.0.0/16
  networkType: OpenShiftSDN
  serviceNetwork:
  - 172.30.0.0/16
platform:
  aws:
    region: eu-west-1
pullSecret: ""
sshKey: ""

Create the install-config secret for the cluster deployment:

$ kubectl create secret generic install-config -n okd --from-file=install-config.yaml=./install-config.yaml

Create the ClusterImageSet for OKD. In my example I am using the latest OKD 4.4.0 release. More information about the available OKD release versions you find here: https://origin-release.svc.ci.openshift.org/

cat <<EOF | kubectl apply -f -
---
apiVersion: hive.openshift.io/v1
kind: ClusterImageSet
metadata:
  name: okd-4-4-0-imageset
spec:
  releaseImage: registry.svc.ci.openshift.org/origin/release:4.4.0-0.okd-2020-02-18-212654
EOF 

Below is an example of a RedHat Enterprise OpenShift 4 ClusterImageSet:

---
apiVersion: hive.openshift.io/v1
kind: ClusterImageSet
metadata:
  name: openshift-4-3-0-imageset
spec:
  releaseImage: quay.io/openshift-release-dev/ocp-release:4.3.0-x86_64

For Hive to start with the cluster deployment, we need to modify the manifest below and add the references to the previous created secrets, install-config and cluster imageset version:

cat <<EOF | kubectl apply -f -
---
apiVersion: hive.openshift.io/v1
kind: ClusterDeployment
metadata:
  creationTimestamp: null
  name: okd
  namespace: okd
spec:
  baseDomain: kube.domain.com
  clusterName: okd
  controlPlaneConfig:
    servingCertificates: {}
  installed: false
  platform:
    aws:
      credentialsSecretRef:
        name: aws-creds
      region: eu-west-1
  provisioning:
    imageSetRef:
      name: okd-4-4-0-imageset
    installConfigSecretRef:
      name: install-config 
  pullSecretRef:
    name: pull-secret
  sshKey:
    name: ssh-key
status:
  clusterVersionStatus:
    availableUpdates: null
    desired:
      force: false
      image: ""
      version: ""
    observedGeneration: 0
    versionHash: ""
EOF

Once you submitted the ClusterDeployment manifest, the Hive operator will start to deploy the cluster straightaway:

$ kubectl get clusterdeployments.hive.openshift.io -n okd
NAME   CLUSTERNAME   CLUSTERTYPE   BASEDOMAIN          INSTALLED   INFRAID     AGE
okd    okd                         kube.domain.com     false       okd-jcdkd   107s

Hive will create the provision (install) pod for the cluster deployment and inject the installer configuration:

$ kubectl get pods -n okd
NAME                          READY   STATUS    RESTARTS   AGE
okd-0-tbm9t-provision-c5hpf   1/3     Running   0          57s

You can view the logs to check the progress of the cluster deployment. You will see the terraform output for creating the infrastructure resources and feedback from the installer about the installation progress. At the end you will see when the installation completed successfully:

$ kubectl logs okd-0-tbm9t-provision-c5hpf -n okd -c hive -f
...
time="2020-02-23T13:31:41Z" level=debug msg="module.dns.aws_route53_zone.int: Creating..."
time="2020-02-23T13:31:42Z" level=debug msg="aws_ami_copy.main: Still creating... [3m40s elapsed]"
time="2020-02-23T13:31:51Z" level=debug msg="module.dns.aws_route53_zone.int: Still creating... [10s elapsed]"
time="2020-02-23T13:31:52Z" level=debug msg="aws_ami_copy.main: Still creating... [3m50s elapsed]"
time="2020-02-23T13:32:01Z" level=debug msg="module.dns.aws_route53_zone.int: Still creating... [20s elapsed]"
time="2020-02-23T13:32:02Z" level=debug msg="aws_ami_copy.main: Still creating... [4m0s elapsed]"
time="2020-02-23T13:32:11Z" level=debug msg="module.dns.aws_route53_zone.int: Still creating... [30s elapsed]"
time="2020-02-23T13:32:12Z" level=debug msg="aws_ami_copy.main: Still creating... [4m10s elapsed]"
time="2020-02-23T13:32:21Z" level=debug msg="module.dns.aws_route53_zone.int: Still creating... [40s elapsed]"
time="2020-02-23T13:32:22Z" level=debug msg="aws_ami_copy.main: Still creating... [4m20s elapsed]"
time="2020-02-23T13:32:31Z" level=debug msg="module.dns.aws_route53_zone.int: Still creating... [50s elapsed]"
time="2020-02-23T13:32:32Z" level=debug msg="aws_ami_copy.main: Still creating... [4m30s elapsed]"
time="2020-02-23T13:32:41Z" level=debug msg="module.dns.aws_route53_zone.int: Still creating... [1m0s elapsed]"
time="2020-02-23T13:32:41Z" level=debug msg="module.dns.aws_route53_zone.int: Creation complete after 1m0s [id=Z10411051RAEUMMAUH39E]"
time="2020-02-23T13:32:41Z" level=debug msg="module.dns.aws_route53_record.etcd_a_nodes[0]: Creating..."
time="2020-02-23T13:32:41Z" level=debug msg="module.dns.aws_route53_record.api_internal: Creating..."
time="2020-02-23T13:32:41Z" level=debug msg="module.dns.aws_route53_record.api_external_internal_zone: Creating..."
time="2020-02-23T13:32:41Z" level=debug msg="module.dns.aws_route53_record.etcd_a_nodes[2]: Creating..."
time="2020-02-23T13:32:41Z" level=debug msg="module.dns.aws_route53_record.etcd_a_nodes[1]: Creating..."
time="2020-02-23T13:32:42Z" level=debug msg="aws_ami_copy.main: Still creating... [4m40s elapsed]"
time="2020-02-23T13:32:51Z" level=debug msg="module.dns.aws_route53_record.etcd_a_nodes[0]: Still creating... [10s elapsed]"
time="2020-02-23T13:32:51Z" level=debug msg="module.dns.aws_route53_record.api_internal: Still creating... [10s elapsed]"
time="2020-02-23T13:32:51Z" level=debug msg="module.dns.aws_route53_record.api_external_internal_zone: Still creating... [10s elapsed]"
time="2020-02-23T13:32:51Z" level=debug msg="module.dns.aws_route53_record.etcd_a_nodes[2]: Still creating... [10s elapsed]"
time="2020-02-23T13:32:51Z" level=debug msg="module.dns.aws_route53_record.etcd_a_nodes[1]: Still creating... [10s elapsed]"
time="2020-02-23T13:32:52Z" level=debug msg="aws_ami_copy.main: Still creating... [4m50s elapsed]"
...
time="2020-02-23T13:34:43Z" level=debug msg="Apply complete! Resources: 123 added, 0 changed, 0 destroyed."
time="2020-02-23T13:34:43Z" level=debug msg="OpenShift Installer unreleased-master-2446-gc108297de972e1a6a5fb502a7668079d16e501f9-dirty"
time="2020-02-23T13:34:43Z" level=debug msg="Built from commit c108297de972e1a6a5fb502a7668079d16e501f9"
time="2020-02-23T13:34:43Z" level=info msg="Waiting up to 20m0s for the Kubernetes API at https://api.okd.kube.domain.com:6443..."
time="2020-02-23T13:35:13Z" level=debug msg="Still waiting for the Kubernetes API: Get https://api.okd.kube.domain.com:6443/version?timeout=32s: dial tcp 52.17.210.160:6443: connect: connection refused"
time="2020-02-23T13:35:50Z" level=debug msg="Still waiting for the Kubernetes API: Get https://api.okd.kube.domain.com:6443/version?timeout=32s: dial tcp 52.211.227.216:6443: connect: connection refused"
time="2020-02-23T13:36:20Z" level=debug msg="Still waiting for the Kubernetes API: Get https://api.okd.kube.domain.com:6443/version?timeout=32s: dial tcp 52.17.210.160:6443: connect: connection refused"
time="2020-02-23T13:36:51Z" level=debug msg="Still waiting for the Kubernetes API: Get https://api.okd.kube.domain.com:6443/version?timeout=32s: dial tcp 52.211.227.216:6443: connect: connection refused"
time="2020-02-23T13:37:58Z" level=debug msg="Still waiting for the Kubernetes API: Get https://api.okd.kube.domain.com:6443/version?timeout=32s: dial tcp 52.211.227.216:6443: connect: connection refused"
time="2020-02-23T13:38:00Z" level=debug msg="Still waiting for the Kubernetes API: the server could not find the requested resource"
time="2020-02-23T13:38:30Z" level=debug msg="Still waiting for the Kubernetes API: the server could not find the requested resource"
time="2020-02-23T13:38:58Z" level=debug msg="Still waiting for the Kubernetes API: Get https://api.okd.kube.domain.com:6443/version?timeout=32s: dial tcp 52.211.227.216:6443: connect: connection refused"
time="2020-02-23T13:39:28Z" level=debug msg="Still waiting for the Kubernetes API: Get https://api.okd.kube.domain.com:6443/version?timeout=32s: dial tcp 63.35.50.149:6443: connect: connection refused"
time="2020-02-23T13:39:36Z" level=info msg="API v1.17.1 up"
time="2020-02-23T13:39:36Z" level=info msg="Waiting up to 40m0s for bootstrapping to complete..."
...
time="2020-02-23T13:55:14Z" level=debug msg="Still waiting for the cluster to initialize: Working towards 4.4.0-0.okd-2020-02-18-212654: 97% complete"
time="2020-02-23T13:55:24Z" level=debug msg="Still waiting for the cluster to initialize: Working towards 4.4.0-0.okd-2020-02-18-212654: 99% complete"
time="2020-02-23T13:57:39Z" level=debug msg="Still waiting for the cluster to initialize: Working towards 4.4.0-0.okd-2020-02-18-212654: 99% complete, waiting on authentication, console, monitoring"
time="2020-02-23T13:57:39Z" level=debug msg="Still waiting for the cluster to initialize: Working towards 4.4.0-0.okd-2020-02-18-212654: 99% complete, waiting on authentication, console, monitoring"
time="2020-02-23T13:58:54Z" level=debug msg="Still waiting for the cluster to initialize: Working towards 4.4.0-0.okd-2020-02-18-212654: 99% complete"
time="2020-02-23T14:01:40Z" level=debug msg="Still waiting for the cluster to initialize: Working towards 4.4.0-0.okd-2020-02-18-212654: 100% complete, waiting on authentication"
time="2020-02-23T14:03:24Z" level=debug msg="Cluster is initialized"
time="2020-02-23T14:03:24Z" level=info msg="Waiting up to 10m0s for the openshift-console route to be created..."
time="2020-02-23T14:03:24Z" level=debug msg="Route found in openshift-console namespace: console"
time="2020-02-23T14:03:24Z" level=debug msg="Route found in openshift-console namespace: downloads"
time="2020-02-23T14:03:24Z" level=debug msg="OpenShift console route is created"
time="2020-02-23T14:03:24Z" level=info msg="Install complete!"
time="2020-02-23T14:03:24Z" level=info msg="To access the cluster as the system:admin user when using 'oc', run 'export KUBECONFIG=/output/auth/kubeconfig'"
time="2020-02-23T14:03:24Z" level=info msg="Access the OpenShift web-console here: https://console-openshift-console.apps.okd.kube.domain.com"
REDACTED LINE OF OUTPUT
time="2020-02-23T14:03:25Z" level=info msg="command completed successfully" installID=jcdkd
time="2020-02-23T14:03:25Z" level=info msg="saving installer output" installID=jcdkd
time="2020-02-23T14:03:25Z" level=debug msg="installer console log: level=info msg=\"Credentials loaded from default AWS environment variables\"\nlevel=info msg=\"Consuming Install Config from target directory\"\nlevel=warning msg=\"Found override for release image. Please be warned, this is not advised\"\nlevel=info msg=\"Consuming Master Machines from target directory\"\nlevel=info msg=\"Consuming Common Manifests from target directory\"\nlevel=info msg=\"Consuming OpenShift Install from target directory\"\nlevel=info msg=\"Consuming Worker Machines from target directory\"\nlevel=info msg=\"Consuming Openshift Manifests from target directory\"\nlevel=info msg=\"Consuming Master Ignition Config from target directory\"\nlevel=info msg=\"Consuming Worker Ignition Config from target directory\"\nlevel=info msg=\"Consuming Bootstrap Ignition Config from target directory\"\nlevel=info msg=\"Creating infrastructure resources...\"\nlevel=info msg=\"Waiting up to 20m0s for the Kubernetes API at https://api.okd.kube.domain.com:6443...\"\nlevel=info msg=\"API v1.17.1 up\"\nlevel=info msg=\"Waiting up to 40m0s for bootstrapping to complete...\"\nlevel=info msg=\"Destroying the bootstrap resources...\"\nlevel=error\nlevel=error msg=\"Warning: Resource targeting is in effect\"\nlevel=error\nlevel=error msg=\"You are creating a plan with the -target option, which means that the result\"\nlevel=error msg=\"of this plan may not represent all of the changes requested by the current\"\nlevel=error msg=configuration.\nlevel=error msg=\"\\t\\t\"\nlevel=error msg=\"The -target option is not for routine use, and is provided only for\"\nlevel=error msg=\"exceptional situations such as recovering from errors or mistakes, or when\"\nlevel=error msg=\"Terraform specifically suggests to use it as part of an error message.\"\nlevel=error\nlevel=error\nlevel=error msg=\"Warning: Applied changes may be incomplete\"\nlevel=error\nlevel=error msg=\"The plan was created with the -target option in effect, so some changes\"\nlevel=error msg=\"requested in the configuration may have been ignored and the output values may\"\nlevel=error msg=\"not be fully updated. Run the following command to verify that no other\"\nlevel=error msg=\"changes are pending:\"\nlevel=error msg=\"    terraform plan\"\nlevel=error msg=\"\\t\"\nlevel=error msg=\"Note that the -target option is not suitable for routine use, and is provided\"\nlevel=error msg=\"only for exceptional situations such as recovering from errors or mistakes, or\"\nlevel=error msg=\"when Terraform specifically suggests to use it as part of an error message.\"\nlevel=error\nlevel=info msg=\"Waiting up to 30m0s for the cluster at https://api.okd.kube.domain.com:6443 to initialize...\"\nlevel=info msg=\"Waiting up to 10m0s for the openshift-console route to be created...\"\nlevel=info msg=\"Install complete!\"\nlevel=info msg=\"To access the cluster as the system:admin user when using 'oc', run 'export KUBECONFIG=/output/auth/kubeconfig'\"\nlevel=info msg=\"Access the OpenShift web-console here: https://console-openshift-console.apps.okd.kube.domain.com\"\nREDACTED LINE OF OUTPUT\n" installID=vxghr9br
time="2020-02-23T14:03:25Z" level=info msg="install completed successfully" installID=jcdkd

After the installation of the cluster deployment has finished, the Installed value is set to True:

$ kubectl get clusterdeployments.hive.openshift.io  -n okd
NAME   CLUSTERNAME   CLUSTERTYPE   BASEDOMAIN          INSTALLED   INFRAID      AGE
okd    okd                         kube.domain.com     true        okd-jcdkd    54m

At this point you can start using the platform by getting the login credentials from the cluster credential secret Hive created during the installation:

$ kubectl get secrets -n okd okd-0-tbm9t-admin-password -o jsonpath='{.data.username}' | base64 -d
kubeadmin
$ kubectl get secrets -n okd okd-0-tbm9t-admin-password -o jsonpath='{.data.password}' | base64 -d
2T38d-aETpX-dj2YU-UBN4a

Log in via the command-line or the web console:

To delete the cluster simply delete the ClusterDeployment resources which initiates a cluster deprovision and will delete all related AWS resources. If the deprovision gets stuck, manually delete the uninstall finalizer allowing the cluster deployment to be deleted, but note that this may leave artifacts in your AWS account:

$ kubectl delete clusterdeployments.hive.openshift.io okd -n okd --wait=false
clusterdeployment.hive.openshift.io "okd" deleted

Please visit the OpenShift Hive documentation for more information about using Hive.

In the next article I will explain how you can use OpenShift Hive to create, update, delete, patch cluster resources using SyncSets.

Getting started with OpenShift Hive

If you don’t know OpenShift Hive I recommend having a look at the video of my talk at RedHat OpenShift Commons about OpenShift Hive where I also talk about how you can provision and manage the lifecycle of OpenShift 4 clusters using the Kubernetes API and the OpenShift Hive operator.

The Hive operator has three main components the admission controller,  the Hive controller and the Hive operator itself. For more information about the Hive architecture visit the Hive docs:

You can use an OpenShift or native Kubernetes cluster to run the operator, in my case I use a EKS cluster. Let’s go through the prerequisites which are required to generate the manifests and the hiveutil:

$ curl -s "https://raw.githubusercontent.com/\
> kubernetes-sigs/kustomize/master/hack/install_kustomize.sh"  | bash
$ sudo mv ./kustomize /usr/bin/
$ wget https://dl.google.com/go/go1.13.3.linux-amd64.tar.gz
$ tar -xvf go1.13.3.linux-amd64.tar.gz
$ sudo mv go /usr/local

To setup the Go environment copy the content below and add to your .profile:

export GOPATH="${HOME}/.go"
export PATH="$PATH:/usr/local/go/bin"
export PATH="$PATH:${GOPATH}/bin:${GOROOT}/bin"

Continue with installing the Go dependencies and clone the OpenShift Hive Github repository:

$ mkdir -p ~/.go/src/github.com/openshift/
$ go get github.com/golang/mock/mockgen
$ go get github.com/golang/mock/gomock
$ go get github.com/cloudflare/cfssl/cmd/cfssl
$ go get github.com/cloudflare/cfssl/cmd/cfssljson
$ cd ~/.go/src/github.com/openshift/
$ git clone https://github.com/openshift/hive.git
$ cd hive/
$ git checkout remotes/origin/master

Before we run make deploy I would recommend modifying the Makefile that we only generate the Hive manifests without deploying them to Kubernetes:

$ sed -i -e 's#oc apply -f config/crds# #' -e 's#kustomize build overlays/deploy | oc apply -f -#kustomize build overlays/deploy > hive.yaml#' Makefile
$ make deploy
# The apis-path is explicitly specified so that CRDs are not created for v1alpha1
go run tools/vendor/sigs.k8s.io/controller-tools/cmd/controller-gen/main.go crd --apis-path=pkg/apis/hive/v1
CRD files generated, files can be found under path /home/ubuntu/.go/src/github.com/openshift/hive/config/crds.
go generate ./pkg/... ./cmd/...
hack/update-bindata.sh
# Deploy the operator manifests:
mkdir -p overlays/deploy
cp overlays/template/kustomization.yaml overlays/deploy
cd overlays/deploy && kustomize edit set image registry.svc.ci.openshift.org/openshift/hive-v4.0:hive=registry.svc.ci.openshift.org/openshift/hivev1:hive
kustomize build overlays/deploy > hive.yaml
rm -rf overlays/deploy

Quick look at the content of the hive.yaml manifest:

$ cat hive.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: hive
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: hive-operator
  namespace: hive

...

---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    control-plane: hive-operator
    controller-tools.k8s.io: "1.0"
  name: hive-operator
  namespace: hive
spec:
  replicas: 1
  revisionHistoryLimit: 4
  selector:
    matchLabels:
      control-plane: hive-operator
      controller-tools.k8s.io: "1.0"
  template:
    metadata:
      labels:
        control-plane: hive-operator
        controller-tools.k8s.io: "1.0"
    spec:
      containers:
      - command:
        - /opt/services/hive-operator
        - --log-level
        - info
        env:
        - name: CLI_CACHE_DIR
          value: /var/cache/kubectl
        image: registry.svc.ci.openshift.org/openshift/hive-v4.0:hive
        imagePullPolicy: Always
        livenessProbe:
          failureThreshold: 1
          httpGet:
            path: /debug/health
            port: 8080
          initialDelaySeconds: 10
          periodSeconds: 10
        name: hive-operator
        resources:
          requests:
            cpu: 100m
            memory: 256Mi
        volumeMounts:
        - mountPath: /var/cache/kubectl
          name: kubectl-cache
      serviceAccountName: hive-operator
      terminationGracePeriodSeconds: 10
      volumes:
      - emptyDir: {}
        name: kubectl-cache

Now we can apply the Hive custom resource definition (crds):

$ kubectl apply -f ./config/crds/
customresourcedefinition.apiextensions.k8s.io/checkpoints.hive.openshift.io created
customresourcedefinition.apiextensions.k8s.io/clusterdeployments.hive.openshift.io created
customresourcedefinition.apiextensions.k8s.io/clusterdeprovisions.hive.openshift.io created
customresourcedefinition.apiextensions.k8s.io/clusterimagesets.hive.openshift.io created
customresourcedefinition.apiextensions.k8s.io/clusterprovisions.hive.openshift.io created
customresourcedefinition.apiextensions.k8s.io/clusterstates.hive.openshift.io created
customresourcedefinition.apiextensions.k8s.io/dnszones.hive.openshift.io created
customresourcedefinition.apiextensions.k8s.io/hiveconfigs.hive.openshift.io created
customresourcedefinition.apiextensions.k8s.io/machinepools.hive.openshift.io created
customresourcedefinition.apiextensions.k8s.io/selectorsyncidentityproviders.hive.openshift.io created
customresourcedefinition.apiextensions.k8s.io/selectorsyncsets.hive.openshift.io created
customresourcedefinition.apiextensions.k8s.io/syncidentityproviders.hive.openshift.io created
customresourcedefinition.apiextensions.k8s.io/syncsets.hive.openshift.io created
customresourcedefinition.apiextensions.k8s.io/syncsetinstances.hive.openshift.io created

And continue to apply the hive.yaml manifest for deploying the OpenShift Hive operator and its components:

$ kubectl apply -f hive.yaml
namespace/hive created
serviceaccount/hive-operator created
clusterrole.rbac.authorization.k8s.io/hive-frontend created
clusterrole.rbac.authorization.k8s.io/hive-operator-role created
clusterrole.rbac.authorization.k8s.io/manager-role created
clusterrole.rbac.authorization.k8s.io/system:openshift:hive:hiveadmission created
rolebinding.rbac.authorization.k8s.io/extension-server-authentication-reader-hiveadmission created
clusterrolebinding.rbac.authorization.k8s.io/auth-delegator-hiveadmission created
clusterrolebinding.rbac.authorization.k8s.io/hive-frontend created
clusterrolebinding.rbac.authorization.k8s.io/hive-operator-rolebinding created
clusterrolebinding.rbac.authorization.k8s.io/hiveadmission-hive-hiveadmission created
clusterrolebinding.rbac.authorization.k8s.io/hiveapi-cluster-admin created
clusterrolebinding.rbac.authorization.k8s.io/manager-rolebinding created
deployment.apps/hive-operator created

For the Hive admission controller you need to generate a SSL certifcate:

$ ./hack/hiveadmission-dev-cert.sh
~/Dropbox/hive/hiveadmission-certs ~/Dropbox/hive
2020/02/03 22:17:30 [INFO] generate received request
2020/02/03 22:17:30 [INFO] received CSR
2020/02/03 22:17:30 [INFO] generating key: ecdsa-256
2020/02/03 22:17:30 [INFO] encoded CSR
certificatesigningrequest.certificates.k8s.io/hiveadmission.hive configured
certificatesigningrequest.certificates.k8s.io/hiveadmission.hive approved
-----BEGIN CERTIFICATE-----
MIICaDCCAVCgAwIBAgIQHvvDPncIWHRcnDzzoWGjQDANBgkqhkiG9w0BAQsFADAv
MS0wKwYDVQQDEyRiOTk2MzhhNS04OWQyLTRhZTAtYjI4Ny1iMWIwOGNmOGYyYjAw
HhcNMjAwMjAzMjIxNTA3WhcNMjUwMjAxMjIxNTA3WjAhMR8wHQYDVQQDExZoaXZl
YWRtaXNzaW9uLmhpdmUuc3ZjMFkwEwYHKoZIzj0CAQYIKoZIzj0DAQcDQgAEea4N
UPbvzM3VdtOkdJ7lBytekRTvwGMqs9HgG14CtqCVCOFq8f+BeqqyrRbJsX83iBfn
gMc54moElb5kIQNjraNZMFcwDAYDVR0TAQH/BAIwADBHBgNVHREEQDA+ghZoaXZl
YWRtaXNzaW9uLmhpdmUuc3ZjgiRoaXZlYWRtaXNzaW9uLmhpdmUuc3ZjLmNsdXN0
ZXIubG9jYWwwDQYJKoZIhvcNAQELBQADggEBADhgT3tNnFs6hBIZFfWmoESe6nnZ
fy9GmlmF9qEBo8FZSk/LYvV0peOdgNZCHqsT2zaJjxULqzQ4zfSb/koYpxeS4+Bf
xwgHzIB/ylzf54wVkILWUFK3GnYepG5dzTXS7VHc4uiNJe0Hwc5JI4HBj7XdL3C7
cbPm7T2cBJi2jscoCWELWo/0hDxkcqZR7rdeltQQ+Uhz87LhTTqlknAMFzL7tM/+
pJePZMQgH97vANsbk97bCFzRZ4eABYSiN0iAB8GQM5M+vK33ZGSVQDJPKQQYH6th
Kzi9wrWEeyEtaWozD5poo9s/dxaLxFAdPDICkPB2yr5QZB+NuDgA+8IYffo=
-----END CERTIFICATE-----
secret/hiveadmission-serving-cert created
~/Dropbox/hive

Afterwards we can check if all the pods are running, this might take a few seconds:

$ kubectl get pods -n hive
NAME                                READY   STATUS    RESTARTS   AGE
hive-controllers-7c6ccc84b9-q7k7m   1/1     Running   0          31s
hive-operator-f9f4447fd-jbmkh       1/1     Running   0          55s
hiveadmission-6766c5bc6f-9667g      1/1     Running   0          27s
hiveadmission-6766c5bc6f-gvvlq      1/1     Running   0          27s

The Hive operator is successfully installed on your Kubernetes cluster but we are not finished yet. To create the required Cluster Deployment manifests we need to generate the hiveutil binary:

$ make hiveutil
go generate ./pkg/... ./cmd/...
hack/update-bindata.sh
go build -o bin/hiveutil github.com/openshift/hive/contrib/cmd/hiveutil

To generate Hive Cluster Deployment manifests just run the following hiveutil command below, I output the definition with -o into yaml:

$ bin/hiveutil create-cluster --base-domain=mydomain.example.com --cloud=aws mycluster -o yaml
apiVersion: v1
items:
- apiVersion: hive.openshift.io/v1
  kind: ClusterImageSet
  metadata:
    creationTimestamp: null
    name: mycluster-imageset
  spec:
    releaseImage: quay.io/openshift-release-dev/ocp-release:4.3.2-x86_64
  status: {}
- apiVersion: v1
  kind: Secret
  metadata:
    creationTimestamp: null
    name: mycluster-aws-creds
  stringData:
    aws_access_key_id: <-YOUR-AWS-ACCESS-KEY->
    aws_secret_access_key: <-YOUR-AWS-SECRET-KEY->
  type: Opaque
- apiVersion: v1
  data:
    install-config.yaml: <-BASE64-ENCODED-OPENSHIFT4-INSTALL-CONFIG->
  kind: Secret
  metadata:
    creationTimestamp: null
    name: mycluster-install-config
  type: Opaque
- apiVersion: hive.openshift.io/v1
  kind: ClusterDeployment
  metadata:
    creationTimestamp: null
    name: mycluster
  spec:
    baseDomain: mydomain.example.com
    clusterName: mycluster
    controlPlaneConfig:
      servingCertificates: {}
    installed: false
    platform:
      aws:
        credentialsSecretRef:
          name: mycluster-aws-creds
        region: us-east-1
    provisioning:
      imageSetRef:
        name: mycluster-imageset
      installConfigSecretRef:
        name: mycluster-install-config
  status:
    clusterVersionStatus:
      availableUpdates: null
      desired:
        force: false
        image: ""
        version: ""
      observedGeneration: 0
      versionHash: ""
- apiVersion: hive.openshift.io/v1
  kind: MachinePool
  metadata:
    creationTimestamp: null
    name: mycluster-worker
  spec:
    clusterDeploymentRef:
      name: mycluster
    name: worker
    platform:
      aws:
        rootVolume:
          iops: 100
          size: 22
          type: gp2
        type: m4.xlarge
    replicas: 3
  status:
    replicas: 0
kind: List
metadata: {}

I hope this post is useful in getting you started with OpenShift Hive. In my next article I will go through the details of the OpenShift 4 cluster deployment with Hive.

Read my new article about OpenShift / OKD 4.x Cluster Deployment using OpenShift Hive