Getting started with Kubernetes Operators in Golang

In the past few weeks I started to learn Golang and beginners like me can make quick progress once you understand the structure and some basics about the programming language. I felt that from all the learning and reading I’ve done on Golang and Kubernetes operators, I had enough knowledge to start writing my own Kubernetes operator in Go.

At the beginning of last year, RedHat released the operator-sdk which helps to create the scaffolding for writing your own operators in Ansible, Helm or natively in Golang. There has been quite a few changes along the way around the operator-sdk and it is maturing a lot over the course of the past year.

The instructions on how to install Golang can be found on the Golang website and we need the latest version of the operator-sdk:

$ wget https://github.com/operator-framework/operator-sdk/releases/download/v1.2.0/operator-sdk-v1.2.0-x86_64-linux-gnu
$ mv operator-sdk-v1.2.0-x86_64-linux-gnu operator-sdk
$ sudo mv operator-sdk /usr/local/bin/

Create a new folder and start to initialise the project. You see that I have already set the option --domain so all API groups will be <-group->.helloworld.io. The --repo option allows me to create the project folder outside of my $GOPATH environment. Infos about the folder structure you can find in the Kubebuilder documentation:

$ mkdir k8s-helloworld-operator
$ cd k8s-helloworld-operator
$ operator-sdk init --domain=helloworld.io --repo=github.com/berndonline/k8s-helloworld-operator

The last thing we need before we start writing the operator is to create a new API and Controller and this will scaffold the operator API at api/v1alpha1/operator_types.go and the controller at controllers/operator_controller.go.

$ operator-sdk create api --group app --version v1alpha1 --kind Operator
Create Resource [y/n]
y
Create Controller [y/n]
y
Writing scaffold for you to edit...
api/v1alpha1/operator_types.go
controllers/operator_controller.go
...
  • Define your API

Define your API for the operator custom resource by editing the Go type definitions at api/v1alpha1/operator_types.go

// OperatorSpec defines the desired state of Operator
type OperatorSpec struct {
	// INSERT ADDITIONAL SPEC FIELDS - desired state of cluster
	// Important: Run "make" to regenerate code after modifying this file

	// Foo is an example field of Operator. Edit Operator_types.go to remove/update
	Size     int32  `json:"size"`
	Image    string `json:"image"`
	Response string `json:"response"`
}
// OperatorStatus defines the observed state of Operator
type OperatorStatus struct {
	// INSERT ADDITIONAL STATUS FIELD - define observed state of cluster
	// Important: Run "make" to regenerate code after modifying this file
	Nodes []string `json:"nodes"`
}

// Operator is the Schema for the operators API
// +kubebuilder:subresource:status
type Operator struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec   OperatorSpec   `json:"spec,omitempty"`
	Status OperatorStatus `json:"status,omitempty"`
}

After modifying the _types.go file you always need to run the following command to update the generated code for that resource type:

$ make generate 
/home/ubuntu/.go/bin/controller-gen object:headerFile="hack/boilerplate.go.txt" paths="./..."
  • Generate Custom Resource Definition (CRD) manifests

In the previous step we defined the API with spec and status fields of the CRD manifests, which can be generated and updated with the following command:

$ make manifests
/home/ubuntu/.go/bin/controller-gen "crd:trivialVersions=true" rbac:roleName=manager-role webhook paths="./..." output:crd:artifacts:config=config/crd/bases

This makefile will invoke the controller-gen to generate the CRD manifests at config/crd/bases/app.helloworld.io_operators.yaml and below you see my custom resource example for the operator:

apiVersion: app.helloworld.io/v1alpha1
kind: Operator
metadata:
  name: operator-sample
spec:
  size: 1
  response: "Hello, World!"
  image: "ghcr.io/berndonline/k8s/go-helloworld:latest"
  • Controller

In the beginning when I created the API, the operator-sdk automatically created the controller file for me at controllers/operator_controller.go which we now start to modify and add the Golang code. I will not go into every detail because the different resources you will create will all look very similar and repeat like you will see in example code. I will mainly focus on the Deployment for my Helloworld container image which I want to deploy using the operator.

Let’s start looking at the deploymentForOperator function which defines and returns the Kubernetes Deployment object. You see there that I invoke an imported Golang packages like &appsv1.Deployment and the import is defined at the top of the controller file. You can find details about this in the Go Doc reference: godoc.org/k8s.io/api/apps/v1

// deploymentForOperator returns a operator Deployment object
func (r *OperatorReconciler) deploymentForOperator(m *appv1alpha1.Operator) *appsv1.Deployment {
	ls := labelsForOperator(m.Name)
	replicas := m.Spec.Size

	dep := &appsv1.Deployment{
		ObjectMeta: metav1.ObjectMeta{
			Name:      m.Name,
			Namespace: m.Namespace,
		},
		Spec: appsv1.DeploymentSpec{
			Replicas: &replicas,
			Selector: &metav1.LabelSelector{
				MatchLabels: ls,
			},
			Template: corev1.PodTemplateSpec{
				ObjectMeta: metav1.ObjectMeta{
					Labels: ls,
				},
				Spec: corev1.PodSpec{
					Containers: []corev1.Container{{
						Image:           m.Spec.Image,
						ImagePullPolicy: "Always",
						Name:            "helloworld",
						Ports: []corev1.ContainerPort{{
							ContainerPort: 8080,
							Name:          "operator",
						}},
						Env: []corev1.EnvVar{{
							Name:  "RESPONSE",
							Value: m.Spec.Response,
						}},
						EnvFrom: []corev1.EnvFromSource{{
							ConfigMapRef: &corev1.ConfigMapEnvSource{
								LocalObjectReference: corev1.LocalObjectReference{
									Name: m.Name,
								},
							},
						}},
						VolumeMounts: []corev1.VolumeMount{{
							Name:      m.Name,
							ReadOnly:  true,
							MountPath: "/helloworld/",
						}},
					}},
					Volumes: []corev1.Volume{{
						Name: m.Name,
						VolumeSource: corev1.VolumeSource{
							ConfigMap: &corev1.ConfigMapVolumeSource{
								LocalObjectReference: corev1.LocalObjectReference{
									Name: m.Name,
								},
							},
						},
					}},
				},
			},
		},
	}

	// Set Operator instance as the owner and controller
	ctrl.SetControllerReference(m, dep, r.Scheme)
	return dep
}

We have defined the deploymentForOperator function and now we can look into the Reconcile function and add the step to check if the deployment already exists and, if not, to create the new deployment:

// Check if the deployment already exists, if not create a new one
found := &appsv1.Deployment{}
err = r.Get(ctx, types.NamespacedName{Name: operator.Name, Namespace: operator.Namespace}, found)
if err != nil && errors.IsNotFound(err) {
	// Define a new deployment
	dep := r.deploymentForOperator(operator)
	log.Info("Creating a new Deployment", "Deployment.Namespace", dep.Namespace, "Deployment.Name", dep.Name)
	err = r.Create(ctx, dep)
	if err != nil {
		log.Error(err, "Failed to create new Deployment", "Deployment.Namespace", dep.Namespace, "Deployment.Name", dep.Name)
		return ctrl.Result{}, err
	}
	// Deployment created successfully - return and requeue
	return ctrl.Result{Requeue: true}, nil
} else if err != nil {
	log.Error(err, "Failed to get Deployment")
	return ctrl.Result{}, err
}

Unfortunately this isn’t enough because this will only check if the deployment exists or not and create a new deployment, but it will not update the deployment if the custom resource is changed.

We need to add two more steps to check if the created Deployment Spec.Template matches the Spec.Template from the  deploymentForOperator function and the Deployment Spec.Replicas the defined size from the custom resource. I will make use of the defined variable found := &appsv1.Deployment{} from the previous step when I checked if the deployment exists.

// Check if the deployment Spec.Template, matches the found Spec.Template
deploy := r.deploymentForOperator(operator)
if !equality.Semantic.DeepDerivative(deploy.Spec.Template, found.Spec.Template) {
	found = deploy
	log.Info("Updating Deployment", "Deployment.Namespace", found.Namespace, "Deployment.Name", found.Name)
	err := r.Update(ctx, found)
	if err != nil {
		log.Error(err, "Failed to update Deployment", "Deployment.Namespace", found.Namespace, "Deployment.Name", found.Name)
		return ctrl.Result{}, err
	}
	return ctrl.Result{Requeue: true}, nil
}

// Ensure the deployment size is the same as the spec
size := operator.Spec.Size
if *found.Spec.Replicas != size {
	found.Spec.Replicas = &size
	err = r.Update(ctx, found)
	if err != nil {
		log.Error(err, "Failed to update Deployment", "Deployment.Namespace", found.Namespace, "Deployment.Name", found.Name)
		return ctrl.Result{}, err
	}
	// Spec updated - return and requeue
	return ctrl.Result{Requeue: true}, nil
}

The SetupWithManager() function in controllers/operator_controller.go specifies how the controller is built to watch a custom resource and other resources that are owned and managed by that controller.

func (r *OperatorReconciler) SetupWithManager(mgr ctrl.Manager) error {
	return ctrl.NewControllerManagedBy(mgr).
		For(&appv1alpha1.Operator{}).
		Owns(&appsv1.Deployment{}).
		Owns(&corev1.ConfigMap{}).
		Owns(&corev1.Service{}).
		Owns(&networkingv1beta1.Ingress{}).
		Complete(r)
}

Basically that’s all I need to write for the controller to deploy my Helloworld container image using an Kubernetes operator. In my code example you will find that I also create a Kubernetes Service, Ingress and ConfigMap but you see that this mostly repeats what I have done with the Deployment object.

  • RBAC permissions

Before we can start running the operator, we need to define the RBAC permissions the controller needs to interact with the resources it manages otherwise your controller will not work. These are specified via [RBAC markers] like these:

// +kubebuilder:rbac:groups=app.helloworld.io,resources=operators,verbs=get;list;watch;create;update;patch;delete
// +kubebuilder:rbac:groups=app.helloworld.io,resources=operators/status,verbs=get;update;patch
// +kubebuilder:rbac:groups=app.helloworld.io,resources=operators/finalizers,verbs=update
// +kubebuilder:rbac:groups=apps,resources=deployments,verbs=get;list;watch;create;update;patch;delete
// +kubebuilder:rbac:groups=core,resources=services,verbs=get;list;watch;create;update;patch;delete
// +kubebuilder:rbac:groups=core,resources=configmaps,verbs=get;list;watch;create;update;patch;delete
// +kubebuilder:rbac:groups=networking.k8s.io,resources=ingresses,verbs=get;list;watch;create;update;patch;delete
// +kubebuilder:rbac:groups=core,resources=pods,verbs=get;list;watch

The ClusterRole manifest at config/rbac/role.yaml is generated from the above markers via controller-gen with the following command:

$ make manifests 
/home/ubuntu/.go/bin/controller-gen "crd:trivialVersions=true" rbac:roleName=manager-role webhook paths="./..." output:crd:artifacts:config=config/crd/bases
  • Running the Operator

We need a Kubernetes cluster and admin privileges to run the operator. I will use Kind which will run a lightweight Kubernetes cluster in your local Docker engine, which is all I need to run and test my Helloworld operator:

$ ./scripts/create-kind-cluster.sh 
Creating cluster "kind" ...
 ✓ Ensuring node image (kindest/node:v1.19.1) 🖼 
 ✓ Preparing nodes 📦  
 ✓ Writing configuration 📜 
 ✓ Starting control-plane 🕹️ 
 ✓ Installing CNI 🔌 
 ✓ Installing StorageClass 💾 
Set kubectl context to "kind-kind"
You can now use your cluster with:

kubectl cluster-info --context kind-kind

Have a question, bug, or feature request? Let us know! https://kind.sigs.k8s.io/#community 🙂

Before running the operator the custom resource Definition must be registered with the Kubernetes apiserver:

$ make install
/home/ubuntu/.go/bin/controller-gen "crd:trivialVersions=true" rbac:roleName=manager-role webhook paths="./..." output:crd:artifacts:config=config/crd/bases
/usr/bin/kustomize build config/crd | kubectl apply -f -
Warning: apiextensions.k8s.io/v1beta1 CustomResourceDefinition is deprecated in v1.16+, unavailable in v1.22+; use apiextensions.k8s.io/v1 CustomResourceDefinition
customresourcedefinition.apiextensions.k8s.io/operators.app.helloworld.io created

We can now run the operator locally on my workstation:

$ make run
/home/ubuntu/.go/bin/controller-gen object:headerFile="hack/boilerplate.go.txt" paths="./..."
go fmt ./...
go vet ./...
/home/ubuntu/.go/bin/controller-gen "crd:trivialVersions=true" rbac:roleName=manager-role webhook paths="./..." output:crd:artifacts:config=config/crd/bases
go run ./main.go
2020-11-22T18:12:49.023Z	INFO	controller-runtime.metrics	metrics server is starting to listen	{"addr": ":8080"}
2020-11-22T18:12:49.024Z	INFO	setup	starting manager
2020-11-22T18:12:49.025Z	INFO	controller-runtime.manager	starting metrics server	{"path": "/metrics"}
2020-11-22T18:12:49.025Z	INFO	controller	Starting EventSource	{"reconcilerGroup": "app.helloworld.io", "reconcilerKind": "Operator", "controller": "operator", "source": "kind source: /, Kind="}
2020-11-22T18:12:49.126Z	INFO	controller	Starting EventSource	{"reconcilerGroup": "app.helloworld.io", "reconcilerKind": "Operator", "controller": "operator", "source": "kind source: /, Kind="}
2020-11-22T18:12:49.226Z	INFO	controller	Starting EventSource	{"reconcilerGroup": "app.helloworld.io", "reconcilerKind": "Operator", "controller": "operator", "source": "kind source: /, Kind="}
2020-11-22T18:12:49.327Z	INFO	controller	Starting EventSource	{"reconcilerGroup": "app.helloworld.io", "reconcilerKind": "Operator", "controller": "operator", "source": "kind source: /, Kind="}
2020-11-22T18:12:49.428Z	INFO	controller	Starting EventSource	{"reconcilerGroup": "app.helloworld.io", "reconcilerKind": "Operator", "controller": "operator", "source": "kind source: /, Kind="}
2020-11-22T18:12:49.528Z	INFO	controller	Starting Controller	{"reconcilerGroup": "app.helloworld.io", "reconcilerKind": "Operator", "controller": "operator"}
2020-11-22T18:12:49.528Z	INFO	controller	Starting workers	{"reconcilerGroup": "app.helloworld.io", "reconcilerKind": "Operator", "controller": "operator", "worker count": 1}

Let’s open a new terminal and apply the custom resource example:

$ kubectl apply -f config/samples/app_v1alpha1_operator.yaml 
operator.app.helloworld.io/operator-sample created

Going back to the terminal where the operator is running, you see the log messages that it invoke the different functions to deploy the defined resource objects:

2020-11-22T18:15:30.412Z	INFO	controllers.Operator	Creating a new Deployment	{"operator": "default/operator-sample", "Deployment.Namespace": "default", "Deployment.Name": "operator-sample"}
2020-11-22T18:15:30.446Z	INFO	controllers.Operator	Creating a new ConfigMap	{"operator": "default/operator-sample", "ConfigMap.Namespace": "default", "ConfigMap.Name": "operator-sample"}
2020-11-22T18:15:30.453Z	INFO	controllers.Operator	Creating a new Service	{"operator": "default/operator-sample", "Service.Namespace": "default", "Service.Name": "operator-sample"}
2020-11-22T18:15:30.470Z	INFO	controllers.Operator	Creating a new Ingress	{"operator": "default/operator-sample", "Ingress.Namespace": "default", "Ingress.Name": "operator-sample"}
2020-11-22T18:15:30.927Z	DEBUG	controller	Successfully Reconciled	{"reconcilerGroup": "app.helloworld.io", "reconcilerKind": "Operator", "controller": "operator", "name": "operator-sample", "namespace": "default"}
2020-11-22T18:15:30.927Z	DEBUG	controller	Successfully Reconciled	{"reconcilerGroup": "app.helloworld.io", "reconcilerKind": "Operator", "controller": "operator", "name": "operator-sample", "namespace": "default"}
2020-11-22T18:15:33.776Z	DEBUG	controller	Successfully Reconciled	{"reconcilerGroup": "app.helloworld.io", "reconcilerKind": "Operator", "controller": "operator", "name": "operator-sample", "namespace": "default"}
2020-11-22T18:15:35.181Z	DEBUG	controller	Successfully Reconciled	{"reconcilerGroup": "app.helloworld.io", "reconcilerKind": "Operator", "controller": "operator", "name": "operator-sample", "namespace": "default"}

In the default namespace where I applied the custom resource you will see the deployed resources by the operator:

$ kubectl get operators.app.helloworld.io 
NAME              AGE
operator-sample   6m11s
$ kubectl get all
NAME                                   READY   STATUS    RESTARTS   AGE
pod/operator-sample-767897c4b9-8zwsd   1/1     Running   0          2m59s

NAME                      TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
service/kubernetes        ClusterIP   10.96.0.1               443/TCP    29m
service/operator-sample   ClusterIP   10.96.199.188           8080/TCP   2m59s

NAME                              READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/operator-sample   1/1     1            1           2m59s

NAME                                         DESIRED   CURRENT   READY   AGE
replicaset.apps/operator-sample-767897c4b9   1         1         1       2m59s

There is not much else to do other than to build the operator image and push to an image registry so that I can run the operator on a Kubernetes cluster.

$ make docker-build IMG=ghcr.io/berndonline/k8s/helloworld-operator:latest
$ make docker-push IMG=ghcr.io/berndonline/k8s/helloworld-operator:latest
$ kustomize build config/default | kubectl apply -f -

I hope this article is useful for getting you started on writing your own Kubernetes operator in Golang.

How to manage Kubernetes clusters the GitOps way with Flux CD

Kubernetes is becoming more and more popular, and so is managing clusters at scale. This article is about how to manage Kubernetes clusters the GitOps way using the Flux CD operator.

Flux can monitor container image and code repositories that you specify and trigger deployments to automatically change the configuration state of your Kubernetes cluster. The cluster configuration is centrally managed and stored in declarative form in Git, and there is no need for an administrator to manually apply manifests, the Flux operator synchronise to apply or delete the cluster configuration.

Before we start deploying the operator we need to install the fluxctl command-line utility and create the namespace:

sudo wget -O /usr/local/bin/fluxctl https://github.com/fluxcd/flux/releases/download/1.18.0/fluxctl_linux_amd64
sudo chmod 755 /usr/local/bin/fluxctl
kubectl create ns flux

Deploying the Flux operator is straight forward and requires a few options like git repository and git path. The path is important for my example because it tells the operator in which folder to look for manifests:

$ fluxctl install [email protected] [email protected]:berndonline/flux-cd.git --git-path=clusters/gke,common/stage --manifest-generation=true --git-branch=master --namespace=flux --registry-disable-scanning | kubectl apply -f -
deployment.apps/memcached created
service/memcached created
serviceaccount/flux created
clusterrole.rbac.authorization.k8s.io/flux created
clusterrolebinding.rbac.authorization.k8s.io/flux created
deployment.apps/flux created
secret/flux-git-deploy created

After you have applied the configuration, wait until the Flux pods are up and running:

$ kubectl get pods -n flux
NAME                       READY   STATUS    RESTARTS   AGE
flux-85cd9cd746-hnb4f      1/1     Running   0          74m
memcached-5dcd7579-d6vwh   1/1     Running   0          20h

The last step is to get the Flux operator deploy keys and copy the output to add to your Git repository:

fluxctl identity --k8s-fwd-ns flux

Now you are ready to synchronise the Flux operator with the repository. By default Flux automatically synchronises every 5 minutes to apply configuration changes:

$ fluxctl sync --k8s-fwd-ns flux
Synchronizing with [email protected]:berndonline/flux-cd.git
Revision of master to apply is 726944d
Waiting for 726944d to be applied ...
Done.

You are able to list workloads which are managed by the Flux operator:

$ fluxctl list-workloads --k8s-fwd-ns=flux -a
WORKLOAD                             CONTAINER         IMAGE                            RELEASE  POLICY
default:deployment/hello-kubernetes  hello-kubernetes  paulbouwer/hello-kubernetes:1.5  ready    automated

How do we manage the configuration for multiple Kubernetes clusters?

I want to show you a simple example using Kustomize to manage multiple clusters across two environments (staging and production) with Flux. Basically you have a single repository and multiple clusters synchronising the configuration depending how you configure the –git-path variable of the Flux operator. The option –manifest-generation enables Kustomize for the operator and it is required to add a .flux.yaml to run Kustomize build on the cluster directories and to apply the generated manifests.

Let’s look at the repository file and folder structure. We have the base folder containing the common deployment configuration, the common folder with the environment separation for stage and prod overlays and the clusters folder which contains more cluster specific configuration:

├── .flux.yaml 
├── base
│   └── common
│       ├── deployment.yaml
│       ├── kustomization.yaml
│       ├── namespace.yaml
│       └── service.yaml
├── clusters
│   ├── eks
|   |   ├── eks-app1
│   │   |   ├── deployment.yaml
|   |   |   ├── kustomization.yaml
│   │   |   └── service.yaml
|   |   └── kustomization.yaml
│   ├── gke
|   |   ├── gke-app1
│   │   |   ├── deployment.yaml
|   |   |   ├── kustomization.yaml
│   │   |   └── service.yaml
|   |   ├── gke-app2
│   │   |   ├── deployment.yaml
|   |   |   ├── kustomization.yaml
│   │   |   └── service.yaml
|   |   └── kustomization.yaml
└── common
    ├── prod
    |   ├── prod.yaml
    |   └── kustomization.yaml
    └── stage
        ├──  team1
        |    ├── deployment.yaml
        |    ├── kustomization.yaml
        |    ├── namespace.yaml
        |    └── service.yaml
        ├── stage.yaml
        └── kustomization.yaml

If you are new to Kustomize I would recommend reading the article Kustomize – The right way to do templating in Kubernetes.

The last thing we need to do is to deploy the Flux operator to the two Kubernetes clusters. The only difference between both is the git-path variable which points the operator to the cluster and common directories were Kustomize applies the overlays based what is specified in kustomize.yaml. More details about the configuration you find in my example repository: https://github.com/berndonline/flux-cd

Flux config for Google GKE staging cluster:

fluxctl install [email protected] [email protected]:berndonline/flux-cd.git --git-path=clusters/gke,common/stage --manifest-generation=true --git-branch=master --namespace=flux | kubectl apply -f -

Flux config for Amazon EKS production cluster:

fluxctl install [email protected] [email protected]:berndonline/flux-cd.git --git-path=clusters/eks,common/prod --manifest-generation=true --git-branch=master --namespace=flux | kubectl apply -f -

After a few minutes the configuration is applied to the two clusters and you can validate the configuration.

Google GKE stage workloads:

$ fluxctl list-workloads --k8s-fwd-ns=flux -a
WORKLOAD                   CONTAINER         IMAGE                            RELEASE  POLICY
common:deployment/common   hello-kubernetes  paulbouwer/hello-kubernetes:1.5  ready    automated
default:deployment/gke1    hello-kubernetes  paulbouwer/hello-kubernetes:1.5  ready    
default:deployment/gke2    hello-kubernetes  paulbouwer/hello-kubernetes:1.5  ready    
team1:deployment/team1     hello-kubernetes  paulbouwer/hello-kubernetes:1.5  ready
$ kubectl get svc --all-namespaces | grep LoadBalancer
common        common                 LoadBalancer   10.91.14.186   35.240.53.46     80:31537/TCP    16d
default       gke1                   LoadBalancer   10.91.7.169    35.195.241.46    80:30218/TCP    16d
default       gke2                   LoadBalancer   10.91.10.239   35.195.144.68    80:32589/TCP    16d
team1         team1                  LoadBalancer   10.91.1.178    104.199.107.56   80:31049/TCP    16d

GKE common stage application:

Amazon EKS prod workloads:

$ fluxctl list-workloads --k8s-fwd-ns=flux -a
WORKLOAD                          CONTAINER         IMAGE                                                                RELEASE  POLICY
common:deployment/common          hello-kubernetes  paulbouwer/hello-kubernetes:1.5                                      ready    automated
default:deployment/eks1           hello-kubernetes  paulbouwer/hello-kubernetes:1.5                                      ready
$ kubectl get svc --all-namespaces | grep LoadBalancer
common        common       LoadBalancer   10.100.254.171   a4caafcbf2b2911ea87370a71555111a-958093179.eu-west-1.elb.amazonaws.com    80:32318/TCP    3m8s
default       eks1         LoadBalancer   10.100.170.10    a4caeada52b2911ea87370a71555111a-1261318311.eu-west-1.elb.amazonaws.com   80:32618/TCP    3m8s

EKS common prod application:

I hope this article is useful to get started with GitOps and the Flux operator. In the future, I would like to see Flux being able to watch git tags which will make it easier to promote changes and manage clusters with version tags.

For more technical information have a look at the Flux CD documentation.

Create and run Ansible Operator on OpenShift

Since RedHat announced the new OpenShift version 4.0 they said it will be a very different experience to install and operate the platform, mostly because of Operators managing the components of the cluster. A few month back RedHat officially released the Operator-SDK and the Operator Hub to create your own operators and to share them.

I did some testing around the Ansible Operator which I wanted to share in this article but before we dig into creating our own operator we need to first install operator-sdk:

# Make sure you are able to use docker commands
sudo groupadd docker
sudo usermod -aG docker centos
ls -l /var/run/docker.sock
sudo chown root:docker /var/run/docker.sock

# Download Go
wget https://dl.google.com/go/go1.10.3.linux-amd64.tar.gz
sudo tar -C /usr/local -xzf go1.10.3.linux-amd64.tar.gz

# Modify bash_profile
vi ~/.bash_profile
export PATH=$PATH:/usr/local/go/bin:$HOME/go
export GOPATH=$HOME/go

# Load bash_profile
source ~/.bash_profile

# Install Go dep
mkdir -p /home/centos/go/bin
curl https://raw.githubusercontent.com/golang/dep/master/install.sh | sh
sudo cp /home/centos/go/bin/dep /usr/local/go/bin/

# Download and install operator framework
mkdir -p $GOPATH/src/github.com/operator-framework
cd $GOPATH/src/github.com/operator-framework
git clone https://github.com/operator-framework/operator-sdk
cd operator-sdk
git checkout master
make dep
make install
sudo cp /home/centos/go/bin/operator-sdk /usr/local/bin/

Let’s start creating our Ansible Operator using the operator-sdk command line which create a blank operator template which we will modify. You can create three different types of operators: Go, Helm or Ansible – check out the operator-sdk repository:

operator-sdk new helloworld-operator --api-version=hello.world.com/v1alpha1 --kind=Helloworld --type=ansible --cluster-scoped
cd ./helloworld-operator/

I am using the Ansible k8s module to create a Hello OpenShift deployment configuration in tasks/main.yml.

---
# tasks file for helloworld

- name: create deployment config
  k8s:
    definition:
      apiVersion: apps.openshift.io/v1
      kind: DeploymentConfig
      metadata:
        name: '{{ meta.name }}'
        labels:
          app: '{{ meta.name }}'
        namespace: '{{ meta.namespace }}'
...

Please have a look at my Github repository openshift-helloworld-operator for more details.

After we have modified the Ansible Role we can start and build operator which will create container we can afterwards push to a container registry like Docker Hub:

$ operator-sdk build berndonline/openshift-helloworld-operator:v0.1
INFO[0000] Building Docker image berndonline/openshift-helloworld-operator:v0.1
Sending build context to Docker daemon   192 kB
Step 1/3 : FROM quay.io/operator-framework/ansible-operator:v0.5.0
Trying to pull repository quay.io/operator-framework/ansible-operator ...
v0.5.0: Pulling from quay.io/operator-framework/ansible-operator
a02a4930cb5d: Already exists
1bdeea372afe: Pull complete
3b057581d180: Pull complete
12618e5abaa7: Pull complete
6f75beb67357: Pull complete
b241f86d9d40: Pull complete
e990bcb94ae6: Pull complete
3cd07ac53955: Pull complete
3fdda52e2c22: Pull complete
0fd51cfb1114: Pull complete
feaebb94b4da: Pull complete
4ff9620dce03: Pull complete
a428b645a85e: Pull complete
5daaf234bbf2: Pull complete
8cbdd2e4d624: Pull complete
fa8517b650e0: Pull complete
a2a83ad7ba5a: Pull complete
d61b9e9050fe: Pull complete
Digest: sha256:9919407a30b24d459e1e4188d05936b52270cafcd53afc7d73c89be02262f8c5
Status: Downloaded newer image for quay.io/operator-framework/ansible-operator:v0.5.0
 ---> 1e857f3522b5
Step 2/3 : COPY roles/ ${HOME}/roles/
 ---> 6e073916723a
Removing intermediate container cb3f89ba1ed6
Step 3/3 : COPY watches.yaml ${HOME}/watches.yaml
 ---> 8f0ee7ba26cb
Removing intermediate container 56ece5b800b2
Successfully built 8f0ee7ba26cb
INFO[0018] Operator build complete.

$ docker push berndonline/openshift-helloworld-operator:v0.1
The push refers to a repository [docker.io/berndonline/openshift-helloworld-operator]
2233d56d407b: Pushed
d60aa100721d: Pushed
a3a57fad5e76: Pushed
ab38e57f8581: Pushed
79b113b67633: Pushed
9cf5b154cadd: Pushed
b191ffbd3c8d: Pushed
5e21ced2d28b: Pushed
cdadb746680d: Pushed
d105c72f21c1: Pushed
1a899839ab25: Pushed
be81e9b31e54: Pushed
63d9d56008cb: Pushed
56a62cb9d96c: Pushed
3f9dc45a1d02: Pushed
dac20332f7b5: Pushed
24f8e5ff1817: Pushed
1bdae1c8263a: Pushed
bc08b53be3d4: Pushed
071d8bd76517: Mounted from openshift/origin-node
v0.1: digest: sha256:50fb222ec47c0d0a7006ff73aba868dfb3369df8b0b16185b606c10b2e30b111 size: 4495

After we have pushed the container to the registry we can continue on OpenShift and create the operator project together with the custom resource definition:

oc new-project helloworld-operator
oc create -f deploy/crds/hello_v1alpha1_helloworld_crd.yaml

Before we apply the resources let’s review and edit operator image configuration to point to our newly create operator container image:

$ cat deploy/operator.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: helloworld-operator
spec:
  replicas: 1
  selector:
    matchLabels:
      name: helloworld-operator
  template:
    metadata:
      labels:
        name: helloworld-operator
    spec:
      serviceAccountName: helloworld-operator
      containers:
        - name: helloworld-operator
          # Replace this with the built image name
          image: berndonline/openshift-helloworld-operator:v0.1
          imagePullPolicy: Always
          env:
            - name: WATCH_NAMESPACE
              value: ""
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: OPERATOR_NAME
              value: "helloworld-operator"

$ cat deploy/role_binding.yaml
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: helloworld-operator
subjects:
- kind: ServiceAccount
  name: helloworld-operator
  # Replace this with the namespace the operator is deployed in.
  namespace: helloworld-operator
roleRef:
  kind: ClusterRole
  name: helloworld-operator
  apiGroup: rbac.authorization.k8s.io

$ cat deploy/role_user.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  creationTimestamp: null
  name: helloworld-operator-execute
rules:
- apiGroups:
  - hello.world.com
  resources:
  - '*'
  verbs:
  - '*'

Afterwards we can deploy the required resources:

oc create -f deploy/operator.yaml \
          -f deploy/role_binding.yaml \
          -f deploy/role.yaml \
          -f deploy/service_account.yaml

Create a cluster-role for the custom resource definition and add bind user to a cluster-role to be able to create a custom resource:

oc create -f deploy/role_user.yaml 
oc adm policy add-cluster-role-to-user helloworld-operator-execute berndonline

If you forget to do this you will see the following error message:

Now we can login as your openshift user and create the custom resource in the namespace myproject:

$ oc create -n myproject -f deploy/crds/hello_v1alpha1_helloworld_cr.yaml
helloworld.hello.world.com/hello-openshift created
$ oc describe Helloworld/hello-openshift -n myproject
Name:         hello-openshift
Namespace:    myproject
Labels:       
Annotations:  
API Version:  hello.world.com/v1alpha1
Kind:         Helloworld
Metadata:
  Creation Timestamp:  2019-03-16T15:33:25Z
  Generation:          1
  Resource Version:    19692
  Self Link:           /apis/hello.world.com/v1alpha1/namespaces/myproject/helloworlds/hello-openshift
  UID:                 d6ce75d7-4800-11e9-b6a8-0a238ec78c2a
Spec:
  Size:  1
Status:
  Conditions:
    Last Transition Time:  2019-03-16T15:33:25Z
    Message:               Running reconciliation
    Reason:                Running
    Status:                True
    Type:                  Running
Events:                    

You can also create the custom resource via the web console:

You will get a security warning which you need to confirm to apply the custom resource:

After a few minutes the operator will create the deploymentconfig and will deploy the hello-openshift pod:

$ oc get dc
NAME              REVISION   DESIRED   CURRENT   TRIGGERED BY
hello-openshift   1          1         1         config,image(hello-openshift:latest)

$ oc get pods
NAME                      READY     STATUS    RESTARTS   AGE
hello-openshift-1-pjhm4   1/1       Running   0          2m

We can modify custom resource and change the spec size to three:

$ oc edit Helloworld/hello-openshift
...
spec:
  size: 3
...

$ oc describe Helloworld/hello-openshift
Name:         hello-openshift
Namespace:    myproject
Labels:       
Annotations:  
API Version:  hello.world.com/v1alpha1
Kind:         Helloworld
Metadata:
  Creation Timestamp:  2019-03-16T15:33:25Z
  Generation:          2
  Resource Version:    24902
  Self Link:           /apis/hello.world.com/v1alpha1/namespaces/myproject/helloworlds/hello-openshift
  UID:                 d6ce75d7-4800-11e9-b6a8-0a238ec78c2a
Spec:
  Size:  3
Status:
  Conditions:
    Last Transition Time:  2019-03-16T15:33:25Z
    Message:               Running reconciliation
    Reason:                Running
    Status:                True
    Type:                  Running
Events:                    
~ centos(ocp: myproject) $

The operator will change the deployment config and change the desired state to three pods:

$ oc get dc
NAME              REVISION   DESIRED   CURRENT   TRIGGERED BY
hello-openshift   1          3         3         config,image(hello-openshift:latest)

$ oc get pods
NAME                      READY     STATUS    RESTARTS   AGE
hello-openshift-1-pjhm4   1/1       Running   0          32m
hello-openshift-1-qhqgx   1/1       Running   0          3m
hello-openshift-1-qlb2q   1/1       Running   0          3m

To clean-up and remove the deployment config you need to delete the custom resource

oc delete Helloworld/hello-openshift -n myproject
oc adm policy remove-cluster-role-from-user helloworld-operator-execute berndonline

I hope this is a good and simple example to show how powerful operators are on OpenShift / Kubernetes.

Part three: Ansible URI module and PUT or POST

This will be the last part of my short series on the Ansible URI module and this time I will explain and show examples about when to use PUT or POST when interacting with REST APIs. I make use of the JSON_QUERY filter which I have explained in my previous article.

What is the difference between POST and PUT?

  • PUT – The PUT method is idempotent and needs the universal unique identifier (uuid) to update an API object. Example PUT /api/service/{{ object-uuid }}. The HTTP return code is 200.

  • POST – Is not idempotent and used to create an API object and an unique identifier is not needed for this. In this case the uuid is server-side generated.  Example POST /api/service/. The HTTP return code is 201.

I am again using the example from AVI Network Software Load Balancers and their REST API.

---
password: 123
api_version: 17.2.13
openshift:
  name: openshift-cloud-provider
openshift_cloud_json: "{{ lookup('template','openshift_cloud_json.j2') }}"

(Optional) Set ansible_host variable to IP address. I have had issues in the past using the DNS name and the task below overrides the variable with the IP address of the host:

- block:
  - name: Resolve hostname
    shell: dig +short "{{ ansible_host }}"
    changed_when: false
    register: dig_output
  
  - name: Set ansible_host to IP address
    set_fact:
      ansible_host: "{{ dig_output.stdout }}"
  when: ( inventory_hostname == groups ["controller"][0] )

Let’s start creating an object using POST and afterwards updating the existing object using PUT. The problem with POST is, that it is not idempotent so we need to first check if the object exists before creating it. We need to do this because creating the same object twice could be an issue:

- block: 
  - name: Avi | OpenShift | Check cloud config
    uri:
      url: "https://{{ ansible_host }}/api/cloud/?name={{ openshift.name }}" 
      method: GET 
      user: "{{ username }}" 
      password: "{{ password }}" 
      return_content: yes 
      body_format: json 
      force_basic_auth: yes 
      validate_certs: false 
      status_code: 200 
      timeout: 180 
      headers:
        X-Avi-Version: "{{ api_version }}" 
    register: check

  - name: Avi | OpenShift | Create cloud config
    uri:
      url: "https://{{ ansible_host }}/api/cloud/" 
      method: POST 
      user: "{{ username }}" 
      password: "{{ password }}" 
      return_content: yes 
      body: "{{ openshift_cloud_json }}" 
      body_format: json 
      force_basic_auth: yes 
      validate certs: false 
      status_code: 201 
      timeout: 180 
      headers:
        X-Avi-Version: "{{ api_version }}"
    when: check.json.count == 0 
  when: ( inventory_hostname == groups ["controller"][0] ) and update_config is undefined

Let’s continue with the example and using PUT to update the configuration of an existing object. To do this you need to define a extra variable update_config=true for the tasks below to be executed:

- block: 
  - name: Avi | OpenShift | Check cloud config
    uri:
      url: "https://{{ ansible_host }}/api/cloud/" 
      method: GET 
      user: "{{ username }}" 
      password: "{{ password }}" 
      return_content: yes 
      body_format: json 
      force_basic_auth: yes 
      validate_certs: false 
      status_code: 200 
      timeout: 180 
      headers:
        X-Avi-Version: "{{ api_version }}" 
    register: check

  - name: Avi | Set_fact for OpenShift name 
    set_fact:
      openshift_cloud_name: "[?name=='{{ openshift.name }}').uuid"
      
  - name: Avi | Set_fact for OpenShift uuid
    set_fact:
      openshift_cloud_uuid: "{{ check.json.results | json_query(penshift_cloud_name) }}" 
      
  - name: Avi | OpenShift | Update cloud config
    uri:
      url: "https://{{ ansible_host }}/api/cloud/{{ openshift_cloud_uuid [0] }}" 
      method: PUT 
      user: "{{ username }}" 
      password: "{{ password }}" 
      return_content: yes 
      body: "{{ openshift_cloud_json }}" 
      body_format: json 
      force_basic_auth: yes 
      validate_certs: false 
      status_code: 200 
      timeout: 180 
      headers:
        X-Avi-Version: "{{ api_version }}" 
    when: ( inventory_hostname == groups ("controller"][0] ) and update_config is defined

Here you find the links to the other articles about Ansible URI module:

Please share your feedback and leave a comment.

Part two: Ansible URI module and json_query filter

In my previous article I tried to explain how to use the Ansible URI Module and using the Jinja2 template engine to generate the JSON content. In part two I want to explain how to use the json_query filter. I will use the example with AVI Networks Load Balancers but this can be with any device with an REST API.

First we need to get the output from two objects, for both we don’t know the UUIDs and the first two tasks are to collect the configuration from the API using GET and register the output:

- block:
  - name: Avi | Get OpenShift cloud configuration
    uri:
      url: "https://{{ ansible_host }}/api/cloud/"
      method: GET
      user: "{{ avi_username }}"
      password: "{{ avi_password }}"
      return_content: yes
      force_basic_auth: yes
      validate_certs: false
      status_code: 200
      timeout: 180
      headers:
        X-Avi-Version: "{{ api_version }}"
    register: openshift_cloud 
   
  - name: Avi | Get OpenShift Service Engine group
    uri:
      url: "https://{{ ansible_host }}/api/serviceenginegroup/"
      method: GET
      user: "{{ avi_username }}"
      password: "{{ avi_password }}"
      return_content: yes
      force_basic_auth: yes
      validate_certs: false
      status_code: 200
      timeout: 180
      headers:
        X-Avi-Version: "{{ api_version }}"
    register: openshift_segroup
  when: '( inventory_hostname == groups["controller"][0] )'

The two variables openshift_cloud and openshift_segroup contain JSON content with all configuration details. For the OpenShift cloud object I don’t know the UUID, the only reference is the object name “OpenShift Cloud” which I know because I had previously created the object. I am using the Ansible module Set_Fact for specifying the query and writing the output into a new variable openshift_cloud_uuid:

- block:
  - name: Avi | set_fact for OpenShift cloud query
    set_fact:
      openshift_cloud_query: "[?name=='OpenShift Cloud'].uuid"
  
  - name: Avi | set_fact for OpenShift UUID
    set_fact:
      openshift_cloud_uuid: "{{ openshift_cloud.json.results | json_query(openshift_cloud_query) }}"
  when: '( inventory_hostname == groups["controller"][0] )' 

We now have the openshift_cloud_uuid of the OpenShift cloud configuration so let’s continue with the second object of the Service Engine group which is trickier because I don’t know the UUID or the object name. The Service Engine group was automatically set-up in the background when the OpenShift cloud object got created but I know the reference to the OpenShift cloud object and I use the json_query filter and set_fact again:

- block:
  - name: Avi | set_fact for Service Engine group query
    set_fact:
      openshift_segroup_query: "[?cloud_ref=='https://{{ ansible_host }}/api/cloud/{{ openshift_cloud_uuid[0] }}'].uuid"
  
  - name: Avi | set_fact for Service Engine group UUID
    set_fact:
      openshift_segroup_uuid: "{{ openshift_segroup.json.results | json_query(openshift_segroup_query) }}"
  when: '( inventory_hostname == groups["controller"][0] )'

Right now we know the openshift_cloud_uuid and the openshift_segroup_uuid, we use this to load a new Jinja2 template to update the Service Engine group object. See below the Jinja2 template openshift_segroup_json.j2:

{
  ...
  "name": "Default-Group",
  "tenant_ref": "https://{{ ansible_host }}/api/tenant/admin",
  "cloud_ref": "https://{{ ansible_host }}/api/cloud/{{ openshift_cloud_uuid[0] }}",
  ...
  YOUR CHANGES
  ...
}

The last part of this exercise is to load the j2 template and push the json content against the API to update the object using PUT:

- block:
  - name: Avi | set_fact to load Service Engine group json template
    set_fact:
      openshift_segroup_json: "{{ lookup('template', 'openshift_segroup_json.j2') }}"
  
  - name: Avi| Update OpenShift Service Engine group configuration
    uri:
      url: "https://{{ ansible_host }}/api/serviceenginegroup/{{ openshift_segroup_uuid[0] }}"
      method: PUT
      user: "{{ avi_username }}"
      password: "{{ avi_password }}"
      return_content: yes
      force_basic_auth: yes
      validate_certs: false
      body: "{{ openshift_segroup_json }}"
      body_format: json
      status_code: 200
      timeout: 180
      headers:
        X-Avi-Version: "{{ api_version }}"
  when: '( inventory_hostname == groups["controller"][0] )'

I hope this article is helpful on how to use the Ansible URI module and the json_query filter to extract information and update an API object. Please share your feedback and leave a comment.

Here you find the links to the other articles about Ansible URI module: