Using Operator Lifecycle Manager and create custom Operator Catalog for Kubernetes

In the beginning of 2019 RedHat announced the launch of the OperatorHub.io and a lot of things have happened since then; OpenShift version 4 got released which is fully managed by Kubernetes operators and other vendors started to release their own operators to deploy their applications to Kubernetes. Even creating your own operators is becoming more popular and state-of-the-art if you run your own Kubernetes clusters.

I want to go into the details of how the Operator Lifecycle Manager (OLM) works and how you can create your own operator Catalog Server and use with Kubernetes to install operators but before we start let’s look how this works. The OLM is responsible for installing and managing the lifecycle of Kubernetes operators and uses a CatalogSource from which it installs the operators.

We should start by creating our own Catalog Server which is straightforward: we need to clone the community-operator repository and build a new catalog-server container image. If you would like to modify which operators should be in the catalog just delete the operators in the ./upstream-community-operators/ folder or add your own operators, in my example I only want to keep the SysDig operator:

git clone https://github.com/operator-framework/community-operators
cd community-operators/
rm ci.Dockerfile
mv upstream.Dockerfile Dockerfile
cd upstream-community-operators/
rm -rfv !("sysdig")
cd ..

Now we can build the new catalog container image and push to the registry:

$ docker build . --rm -t berndonline/catalog-server
Sending build context to Docker daemon  23.34MB
Step 1/10 : FROM quay.io/operator-framework/upstream-registry-builder:v1.3.0 as builder
 ---> e08ceacda476
Step 2/10 : COPY upstream-community-operators manifests
 ---> 9e4b4e98a968
Step 3/10 : RUN ./bin/initializer -o ./bundles.db
 ---> Running in b11415b71497
time="2020-01-11T15:14:02Z" level=info msg="loading Bundles" dir=manifests
time="2020-01-11T15:14:02Z" level=info msg=directory dir=manifests file=manifests load=bundles
time="2020-01-11T15:14:02Z" level=info msg=directory dir=manifests file=sysdig load=bundles
time="2020-01-11T15:14:02Z" level=info msg=directory dir=manifests file=1.4.0 load=bundles
time="2020-01-11T15:14:02Z" level=info msg="found csv, loading bundle" dir=manifests file=sysdig-operator.v1.4.0.clusterserviceversion.yaml load=bundles
time="2020-01-11T15:14:02Z" level=info msg="loading bundle file" dir=manifests file=sysdig-operator.v1.4.0.clusterserviceversion.yaml load=bundle
time="2020-01-11T15:14:02Z" level=info msg="loading bundle file" dir=manifests file=sysdigagents.sysdig.com.crd.yaml load=bundle
time="2020-01-11T15:14:02Z" level=info msg=directory dir=manifests file=1.4.7 load=bundles
time="2020-01-11T15:14:02Z" level=info msg="found csv, loading bundle" dir=manifests file=sysdig-operator.v1.4.7.clusterserviceversion.yaml load=bundles
time="2020-01-11T15:14:02Z" level=info msg="loading bundle file" dir=manifests file=sysdig-operator.v1.4.7.clusterserviceversion.yaml load=bundle
time="2020-01-11T15:14:02Z" level=info msg="loading bundle file" dir=manifests file=sysdigagents.sysdig.com.crd.yaml load=bundle
time="2020-01-11T15:14:02Z" level=info msg="loading Packages and Entries" dir=manifests
time="2020-01-11T15:14:02Z" level=info msg=directory dir=manifests file=manifests load=package
time="2020-01-11T15:14:02Z" level=info msg=directory dir=manifests file=sysdig load=package
time="2020-01-11T15:14:02Z" level=info msg=directory dir=manifests file=1.4.0 load=package
time="2020-01-11T15:14:02Z" level=info msg=directory dir=manifests file=1.4.7 load=package
Removing intermediate container b11415b71497
 ---> d3e1417fd1ee
Step 4/10 : FROM scratch
 --->
Step 5/10 : COPY --from=builder /build/bundles.db /bundles.db
 ---> 32c4b0ba7422
Step 6/10 : COPY --from=builder /build/bin/registry-server /registry-server
 ---> 5607183f50e7
Step 7/10 : COPY --from=builder /bin/grpc_health_probe /bin/grpc_health_probe
 ---> 6e612705cab1
Step 8/10 : EXPOSE 50051
 ---> Running in 5930349a782e
Removing intermediate container 5930349a782e
 ---> 2a0e6d01f7f5
Step 9/10 : ENTRYPOINT ["/registry-server"]
 ---> Running in 1daf50f151ae
Removing intermediate container 1daf50f151ae
 ---> 9fe3fed7cc2a
Step 10/10 : CMD ["--database", "/bundles.db"]
 ---> Running in 154a8d3bb346
Removing intermediate container 154a8d3bb346
 ---> f4d99376cbef
Successfully built f4d99376cbef
Successfully tagged berndonline/catalog-server:latest
$ docker push berndonline/catalog-server
The push refers to repository [docker.io/berndonline/catalog-server]
0516ee590bf5: Pushed
3bbd78f51bb3: Pushed
e4bd72ca23da: Pushed
latest: digest: sha256:b2251ebb6049a1ea994fd710c9182c89866255011ee50fd2a6eeb55c6de2fa21 size: 947

Next we need to install the Operator Lifecycle Manager, go to the release page in Github and install the latest version. First this will add the Custom Resource Definitions for OLM and afterwards deploys the required OLM operator resources:

kubectl apply -f https://github.com/operator-framework/operator-lifecycle-manager/releases/download/0.13.0/crds.yaml
kubectl apply -f https://github.com/operator-framework/operator-lifecycle-manager/releases/download/0.13.0/olm.yaml

Next we need to add the new CatalogSource and delete the default OperatorHub one to limit which operator can be installed:

cat <<EOF | kubectl apply -n olm -f -
---
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: custom-catalog
  namespace: olm
spec:
  sourceType: grpc
  image: docker.io/berndonline/catalog-server:latest
  displayName: Custom Operators
  publisher: techbloc.net
EOF

kubectl delete catalogsource operatorhubio-catalog -n olm

Do a quick check to make sure that the OLM components are running, you will see a pod with the custom-catalog which we previously created:

$ kubectl get pods -n olm
NAME                               READY   STATUS    RESTARTS   AGE
catalog-operator-5bdf7fc7b-wcbcs   1/1     Running   0          100s
custom-catalog-4hrbg               1/1     Running   0          32s
olm-operator-5ff565fcfc-2j9gt      1/1     Running   0          100s
packageserver-7fcbddc745-6s666     1/1     Running   0          88s
packageserver-7fcbddc745-jkfxs     1/1     Running   0          88s

Now we can look for the available operator manifests and see that our Custom Operator catalog only has the SysDig operator available:

$ kubectl get packagemanifests
NAME     CATALOG            AGE
sysdig   Custom Operators   36s

To install the SysDig operator we need to create the namespace, the operator group and the subscription which will instruct OLM to install the SysDig operator:

cat <<EOF | kubectl apply -f -
---
apiVersion: v1
kind: Namespace
metadata:
  name: sysdig
---
apiVersion: operators.coreos.com/v1alpha2
kind: OperatorGroup
metadata:
  name: operatorgroup
  namespace: sysdig
spec:
  targetNamespaces:
  - sysdig
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: sysdig
  namespace: sysdig
spec:
  channel: stable
  name: sysdig
  source: custom-catalog
  sourceNamespace: olm
EOF

At the end we need to check if the OLM installed the SysDig operator:

# Check that the subscription is created
$ kubectl get sub -n sysdig
NAME     PACKAGE   SOURCE           CHANNEL
sysdig   sysdig    custom-catalog   stable

# Check that OLM created an InstallPlan for installing the operator
$ kubectl get ip -n sysdig
NAME            CSV                      APPROVAL    APPROVED
install-sf6dl   sysdig-operator.v1.4.7   Automatic   true

# Check that the InstallPlan created the Cluster Service Version and installed the operator
$ kubectl get csv -n sysdig
NAME                     DISPLAY                 VERSION   REPLACES                 PHASE
sysdig-operator.v1.4.7   Sysdig Agent Operator   1.4.7     sysdig-operator.v1.4.0   Succeeded

# Check that the SysDig operator is running
$ kubectl get pod -n sysdig
NAME                               READY   STATUS    RESTARTS   AGE
sysdig-operator-74c9f665d9-bb8l9   1/1     Running   0          46s

Now you can install the SysDig agent by adding the following custom resource:

cat <<EOF | kubectl apply -f -
---
apiVersion: sysdig.com/v1alpha1
kind: SysdigAgent
metadata:
  name: agent
  namespace: sysdig
spec:
  ebpf:
    enabled: true
  secure:
    enabled: true
  sysdig:
    accessKey: XXXXXXX
EOF

To delete the SysDig operator just delete the namespace or run the following commands to delete subscription, operator group and cluster service version:

kubectl delete sub sysdig -n sysdig
kubectl delete operatorgroup operatorgroup -n sysdig
kubectl delete csv sysdig-operator.v1.4.7 -n sysdig

Thinking ahead you can let the Flux-CD operator manage all the resources and only use GitOps to apply cluster configuration:

I hope this article is interesting and useful, if you want to read more information about the Operator Lifecycle Manager please read the olm-book which has some useful information.

January 12, 2020February 12, 2020

How to manage Kubernetes clusters the GitOps way with Flux CD

Kubernetes is becoming more and more popular, and so is managing clusters at scale. This article is about how to manage Kubernetes clusters the GitOps way using the Flux CD operator.

Flux can monitor container image and code repositories that you specify and trigger deployments to automatically change the configuration state of your Kubernetes cluster. The cluster configuration is centrally managed and stored in declarative form in Git, and there is no need for an administrator to manually apply manifests, the Flux operator synchronise to apply or delete the cluster configuration.

Before we start deploying the operator we need to install the fluxctl command-line utility and create the namespace:

sudo wget -O /usr/local/bin/fluxctl https://github.com/fluxcd/flux/releases/download/1.18.0/fluxctl_linux_amd64
sudo chmod 755 /usr/local/bin/fluxctl
kubectl create ns flux

Deploying the Flux operator is straight forward and requires a few options like git repository and git path. The path is important for my example because it tells the operator in which folder to look for manifests:

$ fluxctl install [email protected] [email protected]:berndonline/flux-cd.git --git-path=clusters/gke,common/stage --manifest-generation=true --git-branch=master --namespace=flux --registry-disable-scanning | kubectl apply -f -
deployment.apps/memcached created
service/memcached created
serviceaccount/flux created
clusterrole.rbac.authorization.k8s.io/flux created
clusterrolebinding.rbac.authorization.k8s.io/flux created
deployment.apps/flux created
secret/flux-git-deploy created

After you have applied the configuration, wait until the Flux pods are up and running:

$ kubectl get pods -n flux
NAME                       READY   STATUS    RESTARTS   AGE
flux-85cd9cd746-hnb4f      1/1     Running   0          74m
memcached-5dcd7579-d6vwh   1/1     Running   0          20h

The last step is to get the Flux operator deploy keys and copy the output to add to your Git repository:

fluxctl identity --k8s-fwd-ns flux

Now you are ready to synchronise the Flux operator with the repository. By default Flux automatically synchronises every 5 minutes to apply configuration changes:

$ fluxctl sync --k8s-fwd-ns flux
Synchronizing with [email protected]:berndonline/flux-cd.git
Revision of master to apply is 726944d
Waiting for 726944d to be applied ...
Done.

You are able to list workloads which are managed by the Flux operator:

$ fluxctl list-workloads --k8s-fwd-ns=flux -a
WORKLOAD                             CONTAINER         IMAGE                            RELEASE  POLICY
default:deployment/hello-kubernetes  hello-kubernetes  paulbouwer/hello-kubernetes:1.5  ready    automated

How do we manage the configuration for multiple Kubernetes clusters?

I want to show you a simple example using Kustomize to manage multiple clusters across two environments (staging and production) with Flux. Basically you have a single repository and multiple clusters synchronising the configuration depending how you configure the –git-path variable of the Flux operator. The option –manifest-generation enables Kustomize for the operator and it is required to add a .flux.yaml to run Kustomize build on the cluster directories and to apply the generated manifests.

Let’s look at the repository file and folder structure. We have the base folder containing the common deployment configuration, the common folder with the environment separation for stage and prod overlays and the clusters folder which contains more cluster specific configuration:

├── .flux.yaml 
├── base
│   └── common
│       ├── deployment.yaml
│       ├── kustomization.yaml
│       ├── namespace.yaml
│       └── service.yaml
├── clusters
│   ├── eks
|   |   ├── eks-app1
│   │   |   ├── deployment.yaml
|   |   |   ├── kustomization.yaml
│   │   |   └── service.yaml
|   |   └── kustomization.yaml
│   ├── gke
|   |   ├── gke-app1
│   │   |   ├── deployment.yaml
|   |   |   ├── kustomization.yaml
│   │   |   └── service.yaml
|   |   ├── gke-app2
│   │   |   ├── deployment.yaml
|   |   |   ├── kustomization.yaml
│   │   |   └── service.yaml
|   |   └── kustomization.yaml
└── common
    ├── prod
    |   ├── prod.yaml
    |   └── kustomization.yaml
    └── stage
        ├──  team1
        |    ├── deployment.yaml
        |    ├── kustomization.yaml
        |    ├── namespace.yaml
        |    └── service.yaml
        ├── stage.yaml
        └── kustomization.yaml

If you are new to Kustomize I would recommend reading the article Kustomize – The right way to do templating in Kubernetes.

The last thing we need to do is to deploy the Flux operator to the two Kubernetes clusters. The only difference between both is the git-path variable which points the operator to the cluster and common directories were Kustomize applies the overlays based what is specified in kustomize.yaml. More details about the configuration you find in my example repository: https://github.com/berndonline/flux-cd

Flux config for Google GKE staging cluster:

fluxctl install [email protected] [email protected]:berndonline/flux-cd.git --git-path=clusters/gke,common/stage --manifest-generation=true --git-branch=master --namespace=flux | kubectl apply -f -

Flux config for Amazon EKS production cluster:

fluxctl install [email protected] [email protected]:berndonline/flux-cd.git --git-path=clusters/eks,common/prod --manifest-generation=true --git-branch=master --namespace=flux | kubectl apply -f -

After a few minutes the configuration is applied to the two clusters and you can validate the configuration.

Google GKE stage workloads:

$ fluxctl list-workloads --k8s-fwd-ns=flux -a
WORKLOAD                   CONTAINER         IMAGE                            RELEASE  POLICY
common:deployment/common   hello-kubernetes  paulbouwer/hello-kubernetes:1.5  ready    automated
default:deployment/gke1    hello-kubernetes  paulbouwer/hello-kubernetes:1.5  ready    
default:deployment/gke2    hello-kubernetes  paulbouwer/hello-kubernetes:1.5  ready    
team1:deployment/team1     hello-kubernetes  paulbouwer/hello-kubernetes:1.5  ready

$ kubectl get svc --all-namespaces | grep LoadBalancer
common        common                 LoadBalancer   10.91.14.186   35.240.53.46     80:31537/TCP    16d
default       gke1                   LoadBalancer   10.91.7.169    35.195.241.46    80:30218/TCP    16d
default       gke2                   LoadBalancer   10.91.10.239   35.195.144.68    80:32589/TCP    16d
team1         team1                  LoadBalancer   10.91.1.178    104.199.107.56   80:31049/TCP    16d

GKE common stage application:

Amazon EKS prod workloads:

$ fluxctl list-workloads --k8s-fwd-ns=flux -a
WORKLOAD                          CONTAINER         IMAGE                                                                RELEASE  POLICY
common:deployment/common          hello-kubernetes  paulbouwer/hello-kubernetes:1.5                                      ready    automated
default:deployment/eks1           hello-kubernetes  paulbouwer/hello-kubernetes:1.5                                      ready

$ kubectl get svc --all-namespaces | grep LoadBalancer
common        common       LoadBalancer   10.100.254.171   a4caafcbf2b2911ea87370a71555111a-958093179.eu-west-1.elb.amazonaws.com    80:32318/TCP    3m8s
default       eks1         LoadBalancer   10.100.170.10    a4caeada52b2911ea87370a71555111a-1261318311.eu-west-1.elb.amazonaws.com   80:32618/TCP    3m8s

EKS common prod application:

I hope this article is useful to get started with GitOps and the Flux operator. In the future, I would like to see Flux being able to watch git tags which will make it easier to promote changes and manage clusters with version tags.

For more technical information have a look at the Flux CD documentation.

April 28, 2019April 28, 2019

Create and run Ansible Operator on OpenShift

Since RedHat announced the new OpenShift version 4.0 they said it will be a very different experience to install and operate the platform, mostly because of Operators managing the components of the cluster. A few month back RedHat officially released the Operator-SDK and the Operator Hub to create your own operators and to share them.

I did some testing around the Ansible Operator which I wanted to share in this article but before we dig into creating our own operator we need to first install operator-sdk:

# Make sure you are able to use docker commands
sudo groupadd docker
sudo usermod -aG docker centos
ls -l /var/run/docker.sock
sudo chown root:docker /var/run/docker.sock

# Download Go
wget https://dl.google.com/go/go1.10.3.linux-amd64.tar.gz
sudo tar -C /usr/local -xzf go1.10.3.linux-amd64.tar.gz

# Modify bash_profile
vi ~/.bash_profile
export PATH=$PATH:/usr/local/go/bin:$HOME/go
export GOPATH=$HOME/go

# Load bash_profile
source ~/.bash_profile

# Install Go dep
mkdir -p /home/centos/go/bin
curl https://raw.githubusercontent.com/golang/dep/master/install.sh | sh
sudo cp /home/centos/go/bin/dep /usr/local/go/bin/

# Download and install operator framework
mkdir -p $GOPATH/src/github.com/operator-framework
cd $GOPATH/src/github.com/operator-framework
git clone https://github.com/operator-framework/operator-sdk
cd operator-sdk
git checkout master
make dep
make install
sudo cp /home/centos/go/bin/operator-sdk /usr/local/bin/

Let’s start creating our Ansible Operator using the operator-sdk command line which create a blank operator template which we will modify. You can create three different types of operators: Go, Helm or Ansible – check out the operator-sdk repository:

operator-sdk new helloworld-operator --api-version=hello.world.com/v1alpha1 --kind=Helloworld --type=ansible --cluster-scoped
cd ./helloworld-operator/

I am using the Ansible k8s module to create a Hello OpenShift deployment configuration in tasks/main.yml.

---
# tasks file for helloworld

- name: create deployment config
  k8s:
    definition:
      apiVersion: apps.openshift.io/v1
      kind: DeploymentConfig
      metadata:
        name: '{{ meta.name }}'
        labels:
          app: '{{ meta.name }}'
        namespace: '{{ meta.namespace }}'
...

Please have a look at my Github repository openshift-helloworld-operator for more details.

After we have modified the Ansible Role we can start and build operator which will create container we can afterwards push to a container registry like Docker Hub:

$ operator-sdk build berndonline/openshift-helloworld-operator:v0.1
INFO[0000] Building Docker image berndonline/openshift-helloworld-operator:v0.1
Sending build context to Docker daemon   192 kB
Step 1/3 : FROM quay.io/operator-framework/ansible-operator:v0.5.0
Trying to pull repository quay.io/operator-framework/ansible-operator ...
v0.5.0: Pulling from quay.io/operator-framework/ansible-operator
a02a4930cb5d: Already exists
1bdeea372afe: Pull complete
3b057581d180: Pull complete
12618e5abaa7: Pull complete
6f75beb67357: Pull complete
b241f86d9d40: Pull complete
e990bcb94ae6: Pull complete
3cd07ac53955: Pull complete
3fdda52e2c22: Pull complete
0fd51cfb1114: Pull complete
feaebb94b4da: Pull complete
4ff9620dce03: Pull complete
a428b645a85e: Pull complete
5daaf234bbf2: Pull complete
8cbdd2e4d624: Pull complete
fa8517b650e0: Pull complete
a2a83ad7ba5a: Pull complete
d61b9e9050fe: Pull complete
Digest: sha256:9919407a30b24d459e1e4188d05936b52270cafcd53afc7d73c89be02262f8c5
Status: Downloaded newer image for quay.io/operator-framework/ansible-operator:v0.5.0
 ---> 1e857f3522b5
Step 2/3 : COPY roles/ ${HOME}/roles/
 ---> 6e073916723a
Removing intermediate container cb3f89ba1ed6
Step 3/3 : COPY watches.yaml ${HOME}/watches.yaml
 ---> 8f0ee7ba26cb
Removing intermediate container 56ece5b800b2
Successfully built 8f0ee7ba26cb
INFO[0018] Operator build complete.

$ docker push berndonline/openshift-helloworld-operator:v0.1
The push refers to a repository [docker.io/berndonline/openshift-helloworld-operator]
2233d56d407b: Pushed
d60aa100721d: Pushed
a3a57fad5e76: Pushed
ab38e57f8581: Pushed
79b113b67633: Pushed
9cf5b154cadd: Pushed
b191ffbd3c8d: Pushed
5e21ced2d28b: Pushed
cdadb746680d: Pushed
d105c72f21c1: Pushed
1a899839ab25: Pushed
be81e9b31e54: Pushed
63d9d56008cb: Pushed
56a62cb9d96c: Pushed
3f9dc45a1d02: Pushed
dac20332f7b5: Pushed
24f8e5ff1817: Pushed
1bdae1c8263a: Pushed
bc08b53be3d4: Pushed
071d8bd76517: Mounted from openshift/origin-node
v0.1: digest: sha256:50fb222ec47c0d0a7006ff73aba868dfb3369df8b0b16185b606c10b2e30b111 size: 4495

After we have pushed the container to the registry we can continue on OpenShift and create the operator project together with the custom resource definition:

oc new-project helloworld-operator
oc create -f deploy/crds/hello_v1alpha1_helloworld_crd.yaml

Before we apply the resources let’s review and edit operator image configuration to point to our newly create operator container image:

$ cat deploy/operator.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: helloworld-operator
spec:
  replicas: 1
  selector:
    matchLabels:
      name: helloworld-operator
  template:
    metadata:
      labels:
        name: helloworld-operator
    spec:
      serviceAccountName: helloworld-operator
      containers:
        - name: helloworld-operator
          # Replace this with the built image name
          image: berndonline/openshift-helloworld-operator:v0.1
          imagePullPolicy: Always
          env:
            - name: WATCH_NAMESPACE
              value: ""
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: OPERATOR_NAME
              value: "helloworld-operator"

$ cat deploy/role_binding.yaml
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: helloworld-operator
subjects:
- kind: ServiceAccount
  name: helloworld-operator
  # Replace this with the namespace the operator is deployed in.
  namespace: helloworld-operator
roleRef:
  kind: ClusterRole
  name: helloworld-operator
  apiGroup: rbac.authorization.k8s.io

$ cat deploy/role_user.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  creationTimestamp: null
  name: helloworld-operator-execute
rules:
- apiGroups:
  - hello.world.com
  resources:
  - '*'
  verbs:
  - '*'

Afterwards we can deploy the required resources:

oc create -f deploy/operator.yaml \
          -f deploy/role_binding.yaml \
          -f deploy/role.yaml \
          -f deploy/service_account.yaml

Create a cluster-role for the custom resource definition and add bind user to a cluster-role to be able to create a custom resource:

oc create -f deploy/role_user.yaml 
oc adm policy add-cluster-role-to-user helloworld-operator-execute berndonline

If you forget to do this you will see the following error message:

Now we can login as your openshift user and create the custom resource in the namespace myproject:

$ oc create -n myproject -f deploy/crds/hello_v1alpha1_helloworld_cr.yaml
helloworld.hello.world.com/hello-openshift created
$ oc describe Helloworld/hello-openshift -n myproject
Name:         hello-openshift
Namespace:    myproject
Labels:       
Annotations:  
API Version:  hello.world.com/v1alpha1
Kind:         Helloworld
Metadata:
  Creation Timestamp:  2019-03-16T15:33:25Z
  Generation:          1
  Resource Version:    19692
  Self Link:           /apis/hello.world.com/v1alpha1/namespaces/myproject/helloworlds/hello-openshift
  UID:                 d6ce75d7-4800-11e9-b6a8-0a238ec78c2a
Spec:
  Size:  1
Status:
  Conditions:
    Last Transition Time:  2019-03-16T15:33:25Z
    Message:               Running reconciliation
    Reason:                Running
    Status:                True
    Type:                  Running
Events:

You can also create the custom resource via the web console:

You will get a security warning which you need to confirm to apply the custom resource:

After a few minutes the operator will create the deploymentconfig and will deploy the hello-openshift pod:

$ oc get dc
NAME              REVISION   DESIRED   CURRENT   TRIGGERED BY
hello-openshift   1          1         1         config,image(hello-openshift:latest)

$ oc get pods
NAME                      READY     STATUS    RESTARTS   AGE
hello-openshift-1-pjhm4   1/1       Running   0          2m

We can modify custom resource and change the spec size to three:

$ oc edit Helloworld/hello-openshift
...
spec:
  size: 3
...

$ oc describe Helloworld/hello-openshift
Name:         hello-openshift
Namespace:    myproject
Labels:       
Annotations:  
API Version:  hello.world.com/v1alpha1
Kind:         Helloworld
Metadata:
  Creation Timestamp:  2019-03-16T15:33:25Z
  Generation:          2
  Resource Version:    24902
  Self Link:           /apis/hello.world.com/v1alpha1/namespaces/myproject/helloworlds/hello-openshift
  UID:                 d6ce75d7-4800-11e9-b6a8-0a238ec78c2a
Spec:
  Size:  3
Status:
  Conditions:
    Last Transition Time:  2019-03-16T15:33:25Z
    Message:               Running reconciliation
    Reason:                Running
    Status:                True
    Type:                  Running
Events:                    
~ centos(ocp: myproject) $

The operator will change the deployment config and change the desired state to three pods:

$ oc get dc
NAME              REVISION   DESIRED   CURRENT   TRIGGERED BY
hello-openshift   1          3         3         config,image(hello-openshift:latest)

$ oc get pods
NAME                      READY     STATUS    RESTARTS   AGE
hello-openshift-1-pjhm4   1/1       Running   0          32m
hello-openshift-1-qhqgx   1/1       Running   0          3m
hello-openshift-1-qlb2q   1/1       Running   0          3m

To clean-up and remove the deployment config you need to delete the custom resource

oc delete Helloworld/hello-openshift -n myproject
oc adm policy remove-cluster-role-from-user helloworld-operator-execute berndonline

I hope this is a good and simple example to show how powerful operators are on OpenShift / Kubernetes.

April 14, 2019

Running Istio Service Mesh on OpenShift

In the Kubernetes/OpenShift community everyone is talking about Istio service mesh, so I wanted to share my experience about the installation and running a sample microservice application with Istio on OpenShift 3.11 and 4.0. Service mesh on OpenShift is still at least a few month away from being available generally to run in production but this gives you the possibility to start testing and exploring Istio. I have found good documentation about installing Istio on OCP and OKD have a look for more information.

To install Istio on OpenShift 3.11 you need to apply the node and master prerequisites you see below; for OpenShift 4.0 and above you can skip these steps and go directly to the istio-operator installation:

sudo bash -c 'cat << EOF > /etc/origin/master/master-config.patch
admissionConfig:
  pluginConfig:
    MutatingAdmissionWebhook:
      configuration:
        apiVersion: apiserver.config.k8s.io/v1alpha1
        kubeConfigFile: /dev/null
        kind: WebhookAdmission
    ValidatingAdmissionWebhook:
      configuration:
        apiVersion: apiserver.config.k8s.io/v1alpha1
        kubeConfigFile: /dev/null
        kind: WebhookAdmission
EOF'
        
sudo cp -p /etc/origin/master/master-config.yaml /etc/origin/master/master-config.yaml.prepatch
sudo bash -c 'oc ex config patch /etc/origin/master/master-config.yaml.prepatch -p "$(cat /etc/origin/master/master-config.patch)" > /etc/origin/master/master-config.yaml'
sudo su -
master-restart api
master-restart controllers
exit       

sudo bash -c 'cat << EOF > /etc/sysctl.d/99-elasticsearch.conf 
vm.max_map_count = 262144
EOF'

sudo sysctl vm.max_map_count=262144

The Istio installation is straight forward by starting first to install the istio-operator:

oc new-project istio-operator
oc new-app -f https://raw.githubusercontent.com/Maistra/openshift-ansible/maistra-0.9/istio/istio_community_operator_template.yaml --param=OPENSHIFT_ISTIO_MASTER_PUBLIC_URL=<-master-public-hostname->

Verify the operator deployment:

oc logs -n istio-operator $(oc -n istio-operator get pods -l name=istio-operator --output=jsonpath={.items..metadata.name})

Once the operator is running we can start deploying Istio components by creating a custom resource:

cat << EOF >  ./istio-installation.yaml
apiVersion: "istio.openshift.com/v1alpha1"
kind: "Installation"
metadata:
  name: "istio-installation"
  namespace: istio-operator
EOF

oc create -n istio-operator -f ./istio-installation.yaml

Check and watch the Istio installation progress which might take a while to complete:

oc get pods -n istio-system -w

# The installation of the core components is finished when you see:
...
openshift-ansible-istio-installer-job-cnw72   0/1       Completed   0         4m

Afterwards, to finish off the Istio installation, we need to install the Kiali web console:

bash <(curl -L https://git.io/getLatestKialiOperator)
oc get route -n istio-system -l app=kiali

Verifying that all Istio components are running:

$ oc get pods -n istio-system
NAME                                          READY     STATUS      RESTARTS   AGE
elasticsearch-0                               1/1       Running     0          9m
grafana-74b5796d94-4ll5d                      1/1       Running     0          9m
istio-citadel-db879c7f8-kfxfk                 1/1       Running     0          11m
istio-egressgateway-6d78858d89-58lsd          1/1       Running     0          11m
istio-galley-6ff54d9586-8r7cl                 1/1       Running     0          11m
istio-ingressgateway-5dcf9fdf4b-4fjj5         1/1       Running     0          11m
istio-pilot-7ccf64f659-ghh7d                  2/2       Running     0          11m
istio-policy-6c86656499-v45zr                 2/2       Running     3          11m
istio-sidecar-injector-6f696b8495-8qqjt       1/1       Running     0          11m
istio-telemetry-686f78b66b-v7ljf              2/2       Running     3          11m
jaeger-agent-k4tpz                            1/1       Running     0          9m
jaeger-collector-64bc5678dd-wlknc             1/1       Running     0          9m
jaeger-query-776d4d754b-8z47d                 1/1       Running     0          9m
kiali-5fd946b855-7lw2h                        1/1       Running     0          2m
openshift-ansible-istio-installer-job-cnw72   0/1       Completed   0          13m
prometheus-75b849445c-l7rlr                   1/1       Running     0          11m

Let’s start to deploy the microservice application example by using the Google Hipster Shop, it contains multiple microservices which is great to test with Istio:

# Create new project
oc new-project hipster-shop

# Set permissions to allow Istio to deploy the Envoy-Proxy side-car container
oc adm policy add-scc-to-user anyuid -z default -n hipster-shop
oc adm policy add-scc-to-user privileged -z default -n hipster-shop

# Create Hipster Shop deployments and Istio services
oc create -f https://raw.githubusercontent.com/berndonline/openshift-ansible/master/examples/istio-hipster-shop.yml
oc create -f https://raw.githubusercontent.com/berndonline/openshift-ansible/master/examples/istio-manifest.yml

# Wait and check that all pods are running before creating the load generator
oc get pods -n hipster-shop -w

# Create load generator deployment
oc create -f https://raw.githubusercontent.com/berndonline/openshift-ansible/master/examples/istio-loadgenerator.yml

As you see below each pod has a sidecar container with the Istio Envoy proxy which handles pod traffic:

[centos@ip-172-26-1-167 ~]$ oc get pods
NAME                                     READY     STATUS    RESTARTS   AGE
adservice-7894dbfd8c-g4m9v               2/2       Running   0          49m
cartservice-758d66c648-79fj4             2/2       Running   4          49m
checkoutservice-7b9dc8b755-h2b2v         2/2       Running   0          49m
currencyservice-7b5c5f48fc-gtm9x         2/2       Running   0          49m
emailservice-79578566bb-jvwbw            2/2       Running   0          49m
frontend-6497c5f748-5fc4f                2/2       Running   0          49m
loadgenerator-764c5547fc-sw6mg           2/2       Running   0          40m
paymentservice-6b989d657c-klp4d          2/2       Running   0          49m
productcatalogservice-5bfbf4c77c-cw676   2/2       Running   0          49m
recommendationservice-c947d84b5-svbk8    2/2       Running   0          49m
redis-cart-79d84748cf-cvg86              2/2       Running   0          49m
shippingservice-6ccb7d8ff7-66v8m         2/2       Running   0          49m
[centos@ip-172-26-1-167 ~]$

The Kiali web console answers the question about what microservices are part of the service mesh and how are they connected which gives you a great level of detail about the traffic flows:

Detailed traffic flow view:

The Isito installation comes with Jaeger which is an open source tracing tool to monitor and troubleshoot transactions:

Enough about this, lets connect to our cool Hipster Shop and happy shopping:

Additionally there is another example, the Istio Bookinfo if you want to try something smaller and less complex:

oc new-project myproject

oc adm policy add-scc-to-user anyuid -z default -n myproject
oc adm policy add-scc-to-user privileged -z default -n myproject

oc apply -n myproject -f https://raw.githubusercontent.com/Maistra/bookinfo/master/bookinfo.yaml
oc apply -n myproject -f https://raw.githubusercontent.com/Maistra/bookinfo/master/bookinfo-gateway.yaml
export GATEWAY_URL=$(oc get route -n istio-system istio-ingressgateway -o jsonpath='{.spec.host}')
curl -o /dev/null -s -w "%{http_code}\n" http://$GATEWAY_URL/productpage

curl -o destination-rule-all.yaml https://raw.githubusercontent.com/istio/istio/release-1.0/samples/bookinfo/networking/destination-rule-all.yaml
oc apply -f destination-rule-all.yaml

curl -o destination-rule-all-mtls.yaml https://raw.githubusercontent.com/istio/istio/release-1.0/samples/bookinfo/networking/destination-rule-all-mtls.yaml
oc apply -f destination-rule-all-mtls.yaml

oc get destinationrules -o yaml

I hope this is a useful article for getting started with Istio service mesh on OpenShift.

April 7, 2019

Getting started with OpenShift 4.0 Container Platform

I had a first look at OpenShift 4.0 and I wanted to share some information from what I have seen so far. The installation of the cluster is super easy and RedHat did a lot to improve the overall experience of the installation process to the previous OpenShift v3.x Ansible based installation and moving towards ephemeral cluster deployments.

There are a many changes under the hood and it’s not as obvious as Bootkube for the self-hosted/healing control-plane, MachineSets and the many internal operators to install and manage the OpenShift components ( api server, scheduler, controller manager, cluster-autoscaler, cluster-monitoring, web-console, dns, ingress, networking, node-tuning, and authentication ).

For the OpenShift 4.0 developer preview you need an RedHat account because you require a pull-secret for the cluster installation. For more information please visit: https://cloud.openshift.com/clusters/install

First we need to download the openshift-installer binary:

wget https://github.com/openshift/installer/releases/download/v0.16.1/openshift-install-linux-amd64
mv openshift-install-linux-amd64 openshift-install
chmod +x openshift-install

Then we create the install-configuration, it is required that you already have AWS account credentials and an Route53 DNS domain set-up:

$ ./openshift-install create install-config
INFO Platform aws
INFO AWS Access Key ID *********
INFO AWS Secret Access Key [? for help] *********
INFO Writing AWS credentials to "/home/centos/.aws/credentials" (https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-files.html)
INFO Region eu-west-1
INFO Base Domain paas.domain.com
INFO Cluster Name cluster1
INFO Pull Secret [? for help] *********

Let’s look at the install-config.yaml

apiVersion: v1beta4
baseDomain: paas.domain.com
compute:
- name: worker
  platform: {}
  replicas: 3
controlPlane:
  name: master
  platform: {}
  replicas: 3
metadata:
  creationTimestamp: null
  name: ew1
networking:
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  machineCIDR: 10.0.0.0/16
  networkType: OpenShiftSDN
  serviceNetwork:
  - 172.30.0.0/16
platform:
  aws:
    region: eu-west-1
pullSecret: '{"auths":{...}'

Now we can continue to create the OpenShift v4 cluster which takes around 30mins to complete. At the end of the openshift-installer you see the auto-generate credentials to connect to the cluster:

$ ./openshift-install create cluster
INFO Consuming "Install Config" from target directory
INFO Creating infrastructure resources...
INFO Waiting up to 30m0s for the Kubernetes API at https://api.cluster1.paas.domain.com:6443...
INFO API v1.12.4+0ba401e up
INFO Waiting up to 30m0s for the bootstrap-complete event...
INFO Destroying the bootstrap resources...
INFO Waiting up to 30m0s for the cluster at https://api.cluster1.paas.domain.com:6443 to initialize...
INFO Waiting up to 10m0s for the openshift-console route to be created...
INFO Install complete!
INFO Run 'export KUBECONFIG=/home/centos/auth/kubeconfig' to manage the cluster with 'oc', the OpenShift CLI.
INFO The cluster is ready when 'oc login -u kubeadmin -p jMTSJ-F6KYy-mVVZ4-QVNPP' succeeds (wait a few minutes).
INFO Access the OpenShift web-console here: https://console-openshift-console.apps.cluster1.paas.domain.com
INFO Login to the console with user: kubeadmin, password: jMTSJ-F6KYy-mVVZ4-QVNPP

The web-console has a very clean new design which I really like in addition to all the great improvements.

Under administration -> cluster settings you can explore the new auto-upgrade functionality of OpenShift 4.0:

You choose the new version to upgrade and everything else happens in the background which is a massive improvement to OpenShift v3.x where you had to run the ansible installer for this.

In the background the cluster operator upgrades the different platform components one by one.

Slowly you will see that the components move to the new build version.

Finished cluster upgrade:

You can only upgrade from one version 4.0.0-0.9 to the next version 4.0.0-0.10. It is not possible to upgrade and go straight from x-0.9 to x-0.11.

But let’s deploy the Google Hipster Shop example and expose the frontend-external service for some more testing:

oc login -u kubeadmin -p jMTSJ-F6KYy-mVVZ4-QVNPP https://api.cluster1.paas.domain.com:6443 --insecure-skip-tls-verify=true
oc new-project myproject
oc create -f https://raw.githubusercontent.com/berndonline/openshift-ansible/master/examples/hipster-shop.yml
oc expose svc frontend-external

Getting the hostname for the exposed service:

$ oc get route
NAME                HOST/PORT                                                   PATH      SERVICES            PORT      TERMINATION   WILDCARD
frontend-external   frontend-external-myproject.apps.cluster1.paas.domain.com             frontend-external   http                    None

Use the browser to connect to our Hipster Shop:

It’s also very easy to destroy the cluster as it is to create it, as you seen previously:

$ ./openshift-install destroy cluster
INFO Disassociated                                 arn="arn:aws:ec2:eu-west-1:552276840222:route-table/rtb-083e2da5d1183efa7" id=rtbassoc-01d27db162fa45402
INFO Disassociated                                 arn="arn:aws:ec2:eu-west-1:552276840222:route-table/rtb-083e2da5d1183efa7" id=rtbassoc-057f593640067efc0
INFO Disassociated                                 arn="arn:aws:ec2:eu-west-1:552276840222:route-table/rtb-083e2da5d1183efa7" id=rtbassoc-05e821b451bead18f
INFO Disassociated                                 IAM instance profile="arn:aws:iam::552276840222:instance-profile/ocp4-bgx4c-worker-profile" arn="arn:aws:ec2:eu-west-1:552276840222:instance/i-0f64a911b1ffa3eff" id=i-0f64a911b1ffa3eff name=ocp4-bgx4c-worker-profile role=ocp4-bgx4c-worker-role
INFO Deleted                                       IAM instance profile="arn:aws:iam::552276840222:instance-profile/ocp4-bgx4c-worker-profile" arn="arn:aws:ec2:eu-west-1:552276840222:instance/i-0f64a911b1ffa3eff" id=i-0f64a911b1ffa3eff name=0xc00090f9a8
INFO Deleted                                       arn="arn:aws:ec2:eu-west-1:552276840222:instance/i-0f64a911b1ffa3eff" id=i-0f64a911b1ffa3eff
INFO Deleted                                       arn="arn:aws:ec2:eu-west-1:552276840222:instance/i-00b5eedc186ba26a7" id=i-00b5eedc186ba26a7
...
INFO Deleted                                       arn="arn:aws:ec2:eu-west-1:552276840222:security-group/sg-016d4c7d435a1c97f" id=sg-016d4c7d435a1c97f
INFO Deleted                                       arn="arn:aws:ec2:eu-west-1:552276840222:subnet/subnet-076348368858e9a82" id=subnet-076348368858e9a82
INFO Deleted                                       arn="arn:aws:ec2:eu-west-1:552276840222:vpc/vpc-00c611ae1b9b8e10a" id=vpc-00c611ae1b9b8e10a
INFO Deleted                                       arn="arn:aws:ec2:eu-west-1:552276840222:dhcp-options/dopt-0ce8b6a1c31e0ceac" id=dopt-0ce8b6a1c31e0ceac

The install experience is great for OpenShift 4.0 which makes it very easy for everyone to create and get started quickly with an enterprise container platform. From the operational perspective I still need to see how to run the new platform because all the operators are great and makes it an easy to use cluster but what happens when one of the operators goes rogue and debugging this I am most interested in.

Over the coming weeks I will look into more detail around OpenShift 4.0 and the different new features, I am especially interested in Service Mesh.