Using HashiCorp Terraform to deploy Amazon AWS VPC

Before I start deploying the AWS VPC with HashiCorp’s Terraform, I want to explain the design of the Virtual Private Cloud. The main focus is redundancy: if one Availability Zone (AZ) becomes unavailable, it should not interrupt traffic or cause outages in your network. A NAT gateway, for example, runs in a single AZ, so you need to make sure that these services are spread over multiple AZs.

AWS VPC network overview:

Before you start using Terraform you need to install the binary; it is also very useful to install the AWS command line interface. Don’t forget to configure the AWS CLI with your access key and secret key (for example with “aws configure”).

pip install awscli --upgrade --user
wget https://releases.hashicorp.com/terraform/0.11.7/terraform_0.11.7_linux_amd64.zip
unzip terraform_0.11.7_linux_amd64.zip
sudo mv terraform /usr/local/bin/

Terraform is a great product for managing infrastructure as code. It is independent of any single cloud provider, so in my example there is no need to use AWS CloudFormation. My repository for the Terraform files can be found here: https://github.com/berndonline/aws-terraform

Let’s start with the variables file, which defines the needed settings for deploying the VPC. Basically you only need to change the variables to deploy the VPC to another AWS region:

...
variable "aws_region" {
  description = "AWS region to launch servers."
  default     = "eu-west-1"
}
...
variable "vpc_cidr" {
  default     = "10.0.0.0/20"
  description = "The VPC CIDR range"
}
variable "public_subnet_a" {
  default     = "10.0.0.0/24"
  description = "Public subnet AZ A"
}
variable "public_subnet_b" {
  default     = "10.0.4.0/24"
  description = "Public subnet AZ B"
}
variable "public_subnet_c" {
  default     = "10.0.8.0/24"
  description = "Public subnet AZ C"
}
...
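Before changing these ranges for another region or environment, it helps to confirm that every subnet still sits inside the VPC range and that no two subnets overlap. Here is a minimal sketch using Python's standard ipaddress module. The function name check_subnets and the dictionary layout are my own for illustration and are not part of the repository:

```python
import ipaddress


def check_subnets(vpc_cidr, subnet_cidrs):
    """Return a list of problems: subnets outside the VPC range or overlapping each other."""
    vpc = ipaddress.ip_network(vpc_cidr)
    subnets = {name: ipaddress.ip_network(cidr) for name, cidr in subnet_cidrs.items()}
    problems = []

    # Every subnet must be contained in the VPC range.
    for name, net in subnets.items():
        if not net.subnet_of(vpc):
            problems.append(f"{name} ({net}) is outside {vpc}")

    # No pair of subnets may overlap.
    names = sorted(subnets)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if subnets[first].overlaps(subnets[second]):
                problems.append(f"{first} overlaps {second}")
    return problems


# Values copied from the variables file above.
PUBLIC_SUBNETS = {
    "public_subnet_a": "10.0.0.0/24",
    "public_subnet_b": "10.0.4.0/24",
    "public_subnet_c": "10.0.8.0/24",
}

if __name__ == "__main__":
    print(check_subnets("10.0.0.0/20", PUBLIC_SUBNETS) or "all subnets fit")
```

An empty list means the layout is consistent. The /20 range ends at 10.0.15.255, so a subnet such as 10.0.16.0/24 would be reported as outside the VPC.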

The vpc.tf file is the Terraform template which deploys the private and public subnets, the internet gateway, multiple NAT gateways and the different routing tables and adds the needed routes towards the internet:

# Create a VPC to launch our instances into
resource "aws_vpc" "default" {
  cidr_block           = "${var.vpc_cidr}"
  enable_dns_support   = true
  enable_dns_hostnames = true
  tags {
    Name = "VPC"
  }
}

# Public subnet in the first available AZ of the region
resource "aws_subnet" "PublicSubnetA" {
  vpc_id            = "${aws_vpc.default.id}"
  cidr_block        = "${var.public_subnet_a}"
  availability_zone = "${data.aws_availability_zones.available.names[0]}"
  tags {
    Name = "Public Subnet A"
  }
}
...
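To see why the template creates one NAT gateway and one private route table per AZ, here is a small Python model of the design. It is only a sketch: the AZ labels "a", "b" and "c" are placeholders, and the mapping assumes that each private route table points at the NAT gateway in its own AZ, which matches the resource names in the apply output but is simplified here.

```python
# Default route of each private subnet, keyed by AZ label (placeholder labels).
PER_AZ_ROUTES = {"a": "public_nat_a", "b": "public_nat_b", "c": "public_nat_c"}

# Hypothetical alternative: all private subnets share one NAT gateway.
SHARED_ROUTES = {"a": "public_nat_a", "b": "public_nat_a", "c": "public_nat_a"}

# AZ in which each NAT gateway lives.
NAT_GATEWAY_AZ = {"public_nat_a": "a", "public_nat_b": "b", "public_nat_c": "c"}


def zones_with_egress(routes, nat_az, failed_az):
    """Return the surviving AZs whose private subnets still reach a working NAT gateway."""
    return sorted(
        az
        for az, nat in routes.items()
        if az != failed_az and nat_az[nat] != failed_az
    )


if __name__ == "__main__":
    print(zones_with_egress(PER_AZ_ROUTES, NAT_GATEWAY_AZ, "a"))
    print(zones_with_egress(SHARED_ROUTES, NAT_GATEWAY_AZ, "a"))
```

With one NAT gateway per AZ, losing AZ "a" leaves "b" and "c" with working outbound access. With a single shared gateway in "a", the same failure cuts off every private subnet.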

In the main.tf you define which provider to use:

# Specify the provider and access details
provider "aws" {
  region = "${var.aws_region}"
}

# Declare the data source
data "aws_availability_zones" "available" {}

Now let’s start deploying the environment. First you need to initialise Terraform with “terraform init”:

[email protected]:~/aws-terraform$ terraform init

Initializing provider plugins...
- Checking for available provider plugins on https://releases.hashicorp.com...
- Downloading plugin for provider "aws" (1.25.0)...

The following providers do not have any version constraints in configuration,
so the latest version was installed.

To prevent automatic upgrades to new major versions that may contain breaking
changes, it is recommended to add version = "..." constraints to the
corresponding provider blocks in configuration, with the constraint strings
suggested below.

* provider.aws: version = "~> 1.25"

Terraform has been successfully initialized!

You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.

If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.
[email protected]:~/aws-terraform$
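The init output suggests pinning the provider with version = "~> 1.25". The "~>" operator lets only the rightmost version component grow, so "~> 1.25" accepts 1.25 and later 1.x releases but not 2.0. The following Python sketch illustrates that rule. It is my own simplified illustration, not Terraform's implementation, and the function names are made up:

```python
def parse_version(text):
    """Turn '1.25.0' into the tuple (1, 25, 0)."""
    return tuple(int(part) for part in text.strip().split("."))


def satisfies_pessimistic(version, constraint):
    """Check a version against a '~> X.Y' or '~> X.Y.Z' style constraint."""
    base = parse_version(constraint.replace("~>", ""))
    if len(base) < 2:
        raise ValueError("this sketch expects at least MAJOR.MINOR")
    candidate = parse_version(version)
    # Pad short versions so that '1.25' compares like '1.25.0'.
    candidate += (0,) * (len(base) - len(candidate))
    # Upper bound: bump the second-to-last component of the constraint.
    upper = base[:-2] + (base[-2] + 1,)
    return candidate >= base and candidate[:len(upper)] < upper


if __name__ == "__main__":
    for version in ("1.24.9", "1.25.0", "1.60.0", "2.0.0"):
        print(version, satisfies_pessimistic(version, "~> 1.25"))
```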

Next, let’s do a dry run “terraform plan” to see all changes Terraform would apply:

[email protected]:~/aws-terraform$ terraform plan
Refreshing Terraform state in-memory prior to plan...
The refreshed state will be used to calculate this plan, but will not be
persisted to local or remote state storage.

data.aws_availability_zones.available: Refreshing state...

------------------------------------------------------------------------

An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  + aws_eip.natgw_a
      id:                                          <computed>
      allocation_id:                               <computed>
      association_id:                              <computed>
      domain:                                      <computed>
      instance:                                    <computed>
      network_interface:                           <computed>
      private_ip:                                  <computed>
      public_ip:                                   <computed>
      vpc:                                         "true"

...

  + aws_vpc.default
      id:                                          <computed>
      assign_generated_ipv6_cidr_block:            "false"
      cidr_block:                                  "10.0.0.0/20"
      default_network_acl_id:                      <computed>
      default_route_table_id:                      <computed>
      default_security_group_id:                   <computed>
      dhcp_options_id:                             <computed>
      enable_classiclink:                          <computed>
      enable_classiclink_dns_support:              <computed>
      enable_dns_hostnames:                        "true"
      enable_dns_support:                          "true"
      instance_tenancy:                            "default"
      ipv6_association_id:                         <computed>
      ipv6_cidr_block:                             <computed>
      main_route_table_id:                         <computed>
      tags.%:                                      "1"
      tags.Name:                                   "VPC"


Plan: 27 to add, 0 to change, 0 to destroy.

------------------------------------------------------------------------

Note: You didn't specify an "-out" parameter to save this plan, so Terraform
can't guarantee that exactly these actions will be performed if
"terraform apply" is subsequently run.

[email protected]:~/aws-terraform$
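If you run the plan from a script, the summary line is the part worth checking, for example to stop when a plan unexpectedly wants to destroy resources. Here is a small Python sketch that extracts the three counters. It assumes the summary wording shown in the output above and would need adjusting if the format differs; the function name plan_counts is my own:

```python
import re

# Matches the summary line as printed in the plan output above.
PLAN_SUMMARY = re.compile(r"Plan: (\d+) to add, (\d+) to change, (\d+) to destroy\.")


def plan_counts(output):
    """Return the add/change/destroy counters from plan output, or None if absent."""
    match = PLAN_SUMMARY.search(output)
    if match is None:
        return None
    add, change, destroy = (int(value) for value in match.groups())
    return {"add": add, "change": change, "destroy": destroy}


if __name__ == "__main__":
    sample = "...\nPlan: 27 to add, 0 to change, 0 to destroy.\n"
    print(plan_counts(sample))
```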

Because nothing is deployed yet, Terraform would apply 27 changes, so let’s do this by running “terraform apply”. Terraform checks the state, asks you to confirm, and then applies the changes:

[email protected]:~/aws-terraform$ terraform apply
data.aws_availability_zones.available: Refreshing state...

An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  + aws_eip.natgw_a
      id:                                          <computed>
      allocation_id:                               <computed>
      association_id:                              <computed>
      domain:                                      <computed>
      instance:                                    <computed>
      network_interface:                           <computed>
      private_ip:                                  <computed>
      public_ip:                                   <computed>
      vpc:                                         "true"

...

  + aws_vpc.default
      id:                                          <computed>
      assign_generated_ipv6_cidr_block:            "false"
      cidr_block:                                  "10.0.0.0/20"
      default_network_acl_id:                      <computed>
      default_route_table_id:                      <computed>
      default_security_group_id:                   <computed>
      dhcp_options_id:                             <computed>
      enable_classiclink:                          <computed>
      enable_classiclink_dns_support:              <computed>
      enable_dns_hostnames:                        "true"
      enable_dns_support:                          "true"
      instance_tenancy:                            "default"
      ipv6_association_id:                         <computed>
      ipv6_cidr_block:                             <computed>
      main_route_table_id:                         <computed>
      tags.%:                                      "1"
      tags.Name:                                   "VPC"


Plan: 27 to add, 0 to change, 0 to destroy.

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes

aws_eip.natgw_c: Creating...
  allocation_id:     "" => "<computed>"
  association_id:    "" => "<computed>"
  domain:            "" => "<computed>"
  instance:          "" => "<computed>"
  network_interface: "" => "<computed>"
  private_ip:        "" => "<computed>"
  public_ip:         "" => "<computed>"
  vpc:               "" => "true"
aws_eip.natgw_a: Creating...
  allocation_id:     "" => "<computed>"
  association_id:    "" => "<computed>"
  domain:            "" => "<computed>"
  instance:          "" => "<computed>"
  network_interface: "" => "<computed>"
  private_ip:        "" => "<computed>"
  public_ip:         "" => "<computed>"
  vpc:               "" => "true"

...

aws_route_table_association.PrivateSubnetB: Creation complete after 0s (ID: rtbassoc-174ba16c)
aws_nat_gateway.public_nat_c: Still creating... (1m40s elapsed)
aws_nat_gateway.public_nat_c: Still creating... (1m50s elapsed)
aws_nat_gateway.public_nat_c: Creation complete after 1m56s (ID: nat-093319a1fa62c3eda)
aws_route_table.private_route_c: Creating...
  propagating_vgws.#:                         "" => ""
  route.#:                                    "" => "1"
  route.4170986711.cidr_block:                "" => "0.0.0.0/0"
  route.4170986711.egress_only_gateway_id:    "" => ""
  route.4170986711.gateway_id:                "" => ""
  route.4170986711.instance_id:               "" => ""
  route.4170986711.ipv6_cidr_block:           "" => ""
  route.4170986711.nat_gateway_id:            "" => "nat-093319a1fa62c3eda"
  route.4170986711.network_interface_id:      "" => ""
  route.4170986711.vpc_peering_connection_id: "" => ""
  tags.%:                                     "" => "1"
  tags.Name:                                  "" => "Private Route C"
  vpc_id:                                     "" => "vpc-fdffb19b"
aws_route_table.private_route_c: Creation complete after 1s (ID: rtb-d64632af)
aws_route_table_association.PrivateSubnetC: Creating...
  route_table_id: "" => "rtb-d64632af"
  subnet_id:      "" => "subnet-17da194d"
aws_route_table_association.PrivateSubnetC: Creation complete after 1s (ID: rtbassoc-35749e4e)

Apply complete! Resources: 27 added, 0 changed, 0 destroyed.
[email protected]:~/aws-terraform$

Terraform successfully applied all the changes so let’s have a quick look in the AWS web console:

If you change the environment and run “terraform apply” again, Terraform deploys only the changes you have made. In my example below I changed nothing, so Terraform does nothing: the state it tracks already matches what I have defined in vpc.tf:

[email protected]:~/aws-terraform$ terraform apply
aws_eip.natgw_c: Refreshing state... (ID: eipalloc-7fa0eb42)
aws_vpc.default: Refreshing state... (ID: vpc-fdffb19b)
aws_eip.natgw_a: Refreshing state... (ID: eipalloc-3ca7ec01)
aws_eip.natgw_b: Refreshing state... (ID: eipalloc-e6bbf0db)
data.aws_availability_zones.available: Refreshing state...
aws_subnet.PublicSubnetC: Refreshing state... (ID: subnet-d6e4278c)
aws_subnet.PrivateSubnetC: Refreshing state... (ID: subnet-17da194d)
aws_subnet.PrivateSubnetA: Refreshing state... (ID: subnet-6ea62708)
aws_subnet.PublicSubnetA: Refreshing state... (ID: subnet-1ab0317c)
aws_network_acl.all: Refreshing state... (ID: acl-c75f9ebe)
aws_internet_gateway.gw: Refreshing state... (ID: igw-27652940)
aws_subnet.PrivateSubnetB: Refreshing state... (ID: subnet-ab59c8e3)
aws_subnet.PublicSubnetB: Refreshing state... (ID: subnet-4a51c002)
aws_route_table.public_route_b: Refreshing state... (ID: rtb-a45d29dd)
aws_route_table.public_route_a: Refreshing state... (ID: rtb-5b423622)
aws_route_table.public_route_c: Refreshing state... (ID: rtb-0453277d)
aws_nat_gateway.public_nat_b: Refreshing state... (ID: nat-0376fc652d362a3b1)
aws_nat_gateway.public_nat_a: Refreshing state... (ID: nat-073ed904d4cf2d30e)
aws_route_table_association.PublicSubnetA: Refreshing state... (ID: rtbassoc-b14ba1ca)
aws_route_table_association.PublicSubnetB: Refreshing state... (ID: rtbassoc-277d975c)
aws_route_table.private_route_a: Refreshing state... (ID: rtb-0745317e)
aws_route_table.private_route_b: Refreshing state... (ID: rtb-a15a2ed8)
aws_route_table_association.PrivateSubnetB: Refreshing state... (ID: rtbassoc-174ba16c)
aws_route_table_association.PrivateSubnetA: Refreshing state... (ID: rtbassoc-60759f1b)
aws_nat_gateway.public_nat_c: Refreshing state... (ID: nat-093319a1fa62c3eda)
aws_route_table_association.PublicSubnetC: Refreshing state... (ID: rtbassoc-307e944b)
aws_route_table.private_route_c: Refreshing state... (ID: rtb-d64632af)
aws_route_table_association.PrivateSubnetC: Refreshing state... (ID: rtbassoc-35749e4e)

Apply complete! Resources: 0 added, 0 changed, 0 destroyed.
[email protected]:~/aws-terraform$
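The “0 added, 0 changed, 0 destroyed” result comes from comparing the configuration with the recorded state. Here is a heavily simplified Python sketch of that comparison. It is not Terraform's actual algorithm, and the resource dictionaries are invented for illustration:

```python
def diff_state(desired, state):
    """Compare desired resources with recorded state.

    Both arguments map a resource address to a dict of attributes.
    Returns three sorted lists: addresses to add, to change and to destroy.
    """
    desired_names = set(desired)
    state_names = set(state)
    to_add = sorted(desired_names - state_names)
    to_destroy = sorted(state_names - desired_names)
    to_change = sorted(
        name for name in desired_names & state_names if desired[name] != state[name]
    )
    return to_add, to_change, to_destroy


if __name__ == "__main__":
    config = {"aws_vpc.default": {"cidr_block": "10.0.0.0/20"}}
    # Nothing deployed yet: everything is an addition.
    print(diff_state(config, {}))
    # State matches the configuration: nothing to do.
    print(diff_state(config, dict(config)))
```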

To remove the environment, run “terraform destroy”:

[email protected]:~/aws-terraform$ terraform destroy
aws_eip.natgw_c: Refreshing state... (ID: eipalloc-7fa0eb42)
data.aws_availability_zones.available: Refreshing state...
aws_eip.natgw_a: Refreshing state... (ID: eipalloc-3ca7ec01)
aws_vpc.default: Refreshing state... (ID: vpc-fdffb19b)
aws_eip.natgw_b: Refreshing state... (ID: eipalloc-e6bbf0db)

...

An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
  - destroy

Terraform will perform the following actions:

  - aws_eip.natgw_a

  - aws_eip.natgw_b

  - aws_eip.natgw_c

...

Plan: 0 to add, 0 to change, 27 to destroy.

Do you really want to destroy?
  Terraform will destroy all your managed infrastructure, as shown above.
  There is no undo. Only 'yes' will be accepted to confirm.

  Enter a value: yes

aws_network_acl.all: Destroying... (ID: acl-c75f9ebe)
aws_route_table_association.PrivateSubnetA: Destroying... (ID: rtbassoc-60759f1b)
aws_route_table_association.PublicSubnetC: Destroying... (ID: rtbassoc-307e944b)
aws_route_table_association.PublicSubnetA: Destroying... (ID: rtbassoc-b14ba1ca)
aws_route_table_association.PublicSubnetB: Destroying... (ID: rtbassoc-277d975c)
aws_route_table_association.PrivateSubnetC: Destroying... (ID: rtbassoc-35749e4e)
aws_route_table_association.PrivateSubnetB: Destroying... (ID: rtbassoc-174ba16c)
aws_route_table_association.PrivateSubnetB: Destruction complete after 0s

...

aws_internet_gateway.gw: Destroying... (ID: igw-27652940)
aws_eip.natgw_c: Destroying... (ID: eipalloc-7fa0eb42)
aws_subnet.PrivateSubnetC: Destroying... (ID: subnet-17da194d)
aws_subnet.PrivateSubnetC: Destruction complete after 1s
aws_eip.natgw_c: Destruction complete after 1s
aws_internet_gateway.gw: Still destroying... (ID: igw-27652940, 10s elapsed)
aws_internet_gateway.gw: Destruction complete after 11s
aws_vpc.default: Destroying... (ID: vpc-fdffb19b)
aws_vpc.default: Destruction complete after 0s

Destroy complete! Resources: 27 destroyed.
[email protected]:~/aws-terraform$

I hope this article was informative and explains how to deploy a VPC with Terraform. In the coming weeks I will add additional functions like deploying EC2 Instances and Load Balancing.

Please share your feedback and leave a comment.

Getting started with OpenShift Container Platform

In recent months I have spent a lot of time on networking and automation, but I want to shift more towards running modern container platforms like Kubernetes or OpenShift. Both rely on networking services, and as I shared in one of my previous articles about the AVI software load balancer, it all fits nicely into networking in my opinion.

But before we start, please have a look at my previous article about Deploying OpenShift Origin Cluster using Ansible to create a small OpenShift platform for testing.

Create a bash completion file for oc commands:

[[email protected] ~]# oc completion bash > /etc/bash_completion.d/oc
[[email protected] ~]# . /etc/bash_completion.d/oc

  • Let’s start by logging in to OpenShift with a normal user account:

[[email protected] ~]# oc login https://console.lab.hostgate.net:8443/
The server is using a certificate that does not match its hostname: x509: certificate is valid for lab.hostgate.net, not console.lab.hostgate.net
You can bypass the certificate check, but any data you send to the server could be intercepted by others.
Use insecure connections? (y/n): y

Authentication required for https://console.lab.hostgate.net:8443 (openshift)
Username: demo
Password:
Login successful.

[[email protected] ~]#

Instead of a username and password, you can use a token, which you can get from the web console:

oc login https://console.lab.hostgate.net:8443 --token=***hash token***

  • Now create the project where we want to run our web application:

[[email protected] ~]# oc new-project webapp
Now using project "webapp" on server "https://console.lab.hostgate.net:8443".

You can add applications to this project with the 'new-app' command. For example, try:

    oc new-app centos/ruby-22-centos7~https://github.com/openshift/ruby-ex.git

to build a new example application in Ruby.
[[email protected] ~]#

Afterwards we need to create a build configuration. In my example we use an external Dockerfile and do not start the build directly:

[[email protected] ~]#  oc new-build --name webapp-build --binary
warning: Cannot find git. Ensure that it is installed and in your path. Git is required to work with git repositories.
    * A Docker build using binary input will be created
      * The resulting image will be pushed to image stream "webapp-build:latest"
      * A binary build was created, use 'start-build --from-dir' to trigger a new build

--> Creating resources with label build=webapp-build ...
    imagestream "webapp-build" created
    buildconfig "webapp-build" created
--> Success
[[email protected] ~]#

Create Dockerfile:

[[email protected] ~]# vi Dockerfile

Copy and paste the line below into the Dockerfile:

FROM openshift/hello-openshift

Let’s continue and start the build from the Dockerfile we created previously:

[[email protected] ~]#  oc start-build webapp-build --from-file=Dockerfile --follow
Uploading file "Dockerfile" as binary input for the build ...
build "webapp-build-1" started
Receiving source from STDIN as file Dockerfile
Pulling image openshift/hello-openshift ...
Step 1/3 : FROM openshift/hello-openshift
 ---> 7af3297a3fb4
Step 2/3 : ENV "OPENSHIFT_BUILD_NAME" "webapp-build-1" "OPENSHIFT_BUILD_NAMESPACE" "webapp"
 ---> Running in 422f63f69364
 ---> 2cd93085ec93
Removing intermediate container 422f63f69364
Step 3/3 : LABEL "io.openshift.build.name" "webapp-build-1" "io.openshift.build.namespace" "webapp"
 ---> Running in 0c3e6cce6f0b
 ---> cf178dda8238
Removing intermediate container 0c3e6cce6f0b
Successfully built cf178dda8238
Pushing image docker-registry.default.svc:5000/webapp/webapp-build:latest ...
Push successful
[[email protected] ~]#

Alternatively you can directly inject the Dockerfile options in a single command and the build would start immediately:

[[email protected] ~]#  oc new-build --name webapp-build -D $'FROM openshift/hello-openshift'

  • Create the web application:

[[email protected] ~]# oc new-app webapp-build
warning: Cannot find git. Ensure that it is installed and in your path. Git is required to work with git repositories.
--> Found image cf178dd (4 minutes old) in image stream "webapp/webapp-build" under tag "latest" for "webapp-build"

    * This image will be deployed in deployment config "webapp-build"
    * Ports 8080/tcp, 8888/tcp will be load balanced by service "webapp-build"
      * Other containers can access this service through the hostname "webapp-build"

--> Creating resources ...
    deploymentconfig "webapp-build" created
    service "webapp-build" created
--> Success
    Application is not exposed. You can expose services to the outside world by executing one or more of the commands below:
     'oc expose svc/webapp-build'
    Run 'oc status' to view your app.
[[email protected] ~]#

As you see below, we are currently running a single pod:

[[email protected] ~]#  oc get pod -o wide
NAME                   READY     STATUS      RESTARTS   AGE       IP            NODE
webapp-build-1-build   0/1       Completed   0          8m        10.131.0.27   origin-node-1
webapp-build-1-znk98   1/1       Running     0          3m        10.131.0.29   origin-node-1
[[email protected] ~]#

Let’s check out endpoints and services:

[[email protected] ~]# oc get ep
NAME           ENDPOINTS                           AGE
webapp-build   10.131.0.29:8080,10.131.0.29:8888   1m
[[email protected] ~]# oc get svc
NAME           CLUSTER-IP     EXTERNAL-IP   PORT(S)             AGE
webapp-build   172.30.64.97   <none>        8080/TCP,8888/TCP   1m
[[email protected] ~]#

Running a single pod is not great for redundancy, so let’s scale out:

[[email protected] ~]# oc scale --replicas=5 dc/webapp-build
deploymentconfig "webapp-build" scaled
[[email protected] ~]#  oc get pod -o wide
NAME                   READY     STATUS      RESTARTS   AGE       IP            NODE
webapp-build-1-4fb98   1/1       Running     0          15s       10.130.0.47   origin-node-2
webapp-build-1-build   0/1       Completed   0          9m        10.131.0.27   origin-node-1
webapp-build-1-dw6ww   1/1       Running     0          15s       10.131.0.30   origin-node-1
webapp-build-1-lswhg   1/1       Running     0          15s       10.131.0.31   origin-node-1
webapp-build-1-z4nk9   1/1       Running     0          15s       10.130.0.46   origin-node-2
webapp-build-1-znk98   1/1       Running     0          4m        10.131.0.29   origin-node-1
[[email protected] ~]#

We can check our endpoints and services again, and see that we have more endpoints and still one service:

[[email protected] ~]# oc get ep
NAME           ENDPOINTS                                                        AGE
webapp-build   10.130.0.46:8080,10.130.0.47:8080,10.131.0.29:8080 + 7 more...   4m
[[email protected] ~]# oc get svc
NAME           CLUSTER-IP     EXTERNAL-IP   PORT(S)             AGE
webapp-build   172.30.64.97   <none>        8080/TCP,8888/TCP   4m
[[email protected] ~]#
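The “+ 7 more...” in the endpoint list is simple arithmetic: every running pod contributes one endpoint per service port, and the listing above prints only the first three. A short Python sketch of the count, using the pod IPs and ports from the output above (the function names are my own):

```python
# Pod IPs of the five running replicas and the two service ports, taken from the output above.
POD_IPS = ["10.130.0.46", "10.130.0.47", "10.131.0.29", "10.131.0.30", "10.131.0.31"]
PORTS = [8080, 8888]


def endpoints(pod_ips, ports):
    """Build one ip:port endpoint per pod and service port."""
    return [f"{ip}:{port}" for ip in pod_ips for port in ports]


def hidden_count(total, shown=3):
    """Number of endpoints summarised as '+ N more...' when only `shown` are printed."""
    return max(total - shown, 0)


if __name__ == "__main__":
    all_endpoints = endpoints(POD_IPS, PORTS)
    print(len(all_endpoints), hidden_count(len(all_endpoints)))
```

Five pods times two ports gives ten endpoints, three of which are printed, which leaves the seven that are summarised.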

OpenShift uses an internal DNS service called SkyDNS to expose services for internal communication:

[[email protected] ~]# dig webapp-build.webapp.svc.cluster.local

; <<>> DiG 9.9.4-RedHat-9.9.4-61.el7 <<>> webapp-build.webapp.svc.cluster.local
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 20933
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;webapp-build.webapp.svc.cluster.local. IN A

;; ANSWER SECTION:
webapp-build.webapp.svc.cluster.local. 30 IN A	172.30.64.97

;; Query time: 1 msec
;; SERVER: 10.255.1.214#53(10.255.1.214)
;; WHEN: Sat Jun 30 08:58:19 UTC 2018
;; MSG SIZE  rcvd: 71

[[email protected] ~]#
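The name queried above follows the pattern service, project, "svc", then the cluster domain. A tiny Python helper makes the pattern explicit. The function name is hypothetical, and the default cluster domain cluster.local is taken from the dig query above:

```python
def service_fqdn(service, project, cluster_domain="cluster.local"):
    """Build the internal DNS name of a service inside the cluster."""
    return f"{service}.{project}.svc.{cluster_domain}"


if __name__ == "__main__":
    print(service_fqdn("webapp-build", "webapp"))
```

From inside the cluster, a name built this way should resolve to the service's cluster IP, as the dig answer shows for 172.30.64.97.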

  • Let’s expose our web application so that it is accessible from the outside world:

[[email protected] ~]# oc expose svc webapp-build
route "webapp-build" exposed
[[email protected] ~]#

Connect with a browser to the URL you see under routes:

Modify the WebApp and inject variables via a config map into our application:

[[email protected] ~]# oc create configmap webapp-map --from-literal=RESPONSE="My first OpenShift WebApp"
configmap "webapp-map" created
[[email protected] ~]#

Afterwards we need to add the previously created config map to the environment of our deployment config:

[[email protected] ~]# oc env dc/webapp-build --from=configmap/webapp-map
deploymentconfig "webapp-build" updated
[[email protected] ~]#
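The config map keys end up as environment variables in the container, so the application only needs to read RESPONSE from its environment. The sketch below shows the pattern in Python. It is my own illustration and not the source of the hello-openshift image, and the fallback text is an assumed placeholder:

```python
import os


def response_text(environ=None, default="Hello OpenShift!"):
    """Return the RESPONSE variable if it is set and non-empty, otherwise a default.

    The default string is an assumption for illustration only.
    """
    if environ is None:
        environ = os.environ
    return environ.get("RESPONSE") or default


if __name__ == "__main__":
    print(response_text())
```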

Now when we check our web application again, you see that the new variable is injected into the pod and displayed:

I will share more about running OpenShift Container Platform and my experience in the coming months. I hope you find this article useful and please share your feedback and leave a comment.

Ansible Playbook for deploying AVI Controller nodes and Service Engines

After my first blog post about software-defined load balancing with AVI Networks, here is how to automatically deploy the AVI controller and service engines via Ansible.

Here are the links to my repositories; AVI Vagrant environment: https://github.com/berndonline/avi-lab-vagrant and AVI Ansible Playbook: https://github.com/berndonline/avi-lab-provision

Make sure that your Vagrant environment is running:

[email protected]:~/avi-lab-vagrant$ vagrant status
Current machine states:

avi-controller-1          running (libvirt)
avi-controller-2          running (libvirt)
avi-controller-3          running (libvirt)
avi-se-1                  running (libvirt)
avi-se-2                  running (libvirt)

This environment represents multiple VMs. The VMs are all listed
above with their current state. For more information about a specific
VM, run `vagrant status NAME`.

I needed to modify the ansible.cfg to integrate a filter plugin:

[defaults]
inventory = ./.vagrant/provisioners/ansible/inventory/vagrant_ansible_inventory
host_key_checking=False

library = /home/berndonline/avi-lab-provision/lib
filter_plugins = /home/berndonline/avi-lab-provision/lib/filter_plugins
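For readers who have not written one before: an Ansible filter plugin is a Python file in the filter_plugins directory that contains a class named FilterModule, whose filters() method maps filter names to functions. Here is a minimal sketch. The filter controller_ips and its use of the ansible_host variable are hypothetical and are not the plugin from my repository:

```python
def controller_ips(hostvars, group_hosts):
    """Return the ansible_host address of every host in group_hosts, in order.

    Hypothetical example filter; assumes each host defines ansible_host.
    """
    return [hostvars[host]["ansible_host"] for host in group_hosts]


class FilterModule(object):
    """Ansible discovers filter plugins through a class with this name."""

    def filters(self):
        # Map the name used in templates to the Python function.
        return {"controller_ips": controller_ips}
```

In a template or task this hypothetical filter would be used as {{ hostvars | controller_ips(groups['avi-controller']) }}.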

The controller installation is actually very simple; I took it from the official AVI Ansible role. I added a second role to check once the controller nodes have booted successfully:

---
- hosts: avi-controller
  user: '{{ ansible_ssh_user }}'
  gather_facts: "true"
  roles:
    - {role: ansible-role-avicontroller, become: true}
    - {role: avi-post-controller, become: false}

There’s one important thing to know before we run the playbook. When you have an AVI subscription you get custom container images with a predefined default password, which makes it easier to fully automate the cluster setup. You find the default password variable in group_vars/all.yml, where you also set whether the password should be changed.

Let’s execute the Ansible playbook. It takes a bit of time for the three nodes to boot up:

[email protected]:~/avi-lab-vagrant$ ansible-playbook ../avi-lab-provision/playbooks/avi-controller-install.yml

PLAY [avi-controller] *********************************************************************************************************************************************

TASK [Gathering Facts] ********************************************************************************************************************************************
ok: [avi-controller-3]
ok: [avi-controller-2]
ok: [avi-controller-1]

TASK [ansible-role-avicontroller : Avi Controller | Deployment] ***************************************************************************************************
included: /home/berndonline/avi-lab-provision/roles/ansible-role-avicontroller/tasks/docker/main.yml for avi-controller-1, avi-controller-2, avi-controller-3

TASK [ansible-role-avicontroller : Avi Controller | Services | systemd | Check if Avi Controller installed] *******************************************************
included: /home/berndonline/avi-lab-provision/roles/ansible-role-avicontroller/tasks/docker/services/systemd/check.yml for avi-controller-1, avi-controller-2, avi-controller-3

TASK [ansible-role-avicontroller : Avi Controller | Check if Avi Controller installed] ****************************************************************************
ok: [avi-controller-3]
ok: [avi-controller-2]
ok: [avi-controller-1]

TASK [ansible-role-avicontroller : Avi Controller | Services | init.d | Check if Avi Controller installed] ********************************************************
skipping: [avi-controller-1]
skipping: [avi-controller-2]
skipping: [avi-controller-3]

TASK [ansible-role-avicontroller : Avi Controller | Check minimum requirements] ***********************************************************************************
included: /home/berndonline/avi-lab-provision/roles/ansible-role-avicontroller/tasks/docker/requirements.yml for avi-controller-1, avi-controller-2, avi-controller-3

TASK [ansible-role-avicontroller : Avi Controller | Requirements | Check for docker] ******************************************************************************
ok: [avi-controller-2]
ok: [avi-controller-3]
ok: [avi-controller-1]

...

TASK [avi-post-controller : wait for cluster nodes up] ************************************************************************************************************
FAILED - RETRYING: wait for cluster nodes up (30 retries left).
FAILED - RETRYING: wait for cluster nodes up (30 retries left).
FAILED - RETRYING: wait for cluster nodes up (30 retries left).

...

FAILED - RETRYING: wait for cluster nodes up (7 retries left).
FAILED - RETRYING: wait for cluster nodes up (8 retries left).
FAILED - RETRYING: wait for cluster nodes up (7 retries left).
FAILED - RETRYING: wait for cluster nodes up (7 retries left).
ok: [avi-controller-2]
ok: [avi-controller-3]
ok: [avi-controller-1]

PLAY RECAP ********************************************************************************************************************************************************
avi-controller-1           : ok=36   changed=6    unreachable=0    failed=0
avi-controller-2           : ok=35   changed=5    unreachable=0    failed=0
avi-controller-3           : ok=35   changed=5    unreachable=0    failed=0

[email protected]:~/avi-lab-vagrant$
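The “FAILED - RETRYING” lines are expected: the post-controller role polls the nodes until they answer, with a limited number of retries. The same pattern in plain Python looks roughly like the sketch below. The delay value and the check function are placeholders, since the role's actual settings are not shown here:

```python
import time


def wait_until(check, retries=30, delay=10, sleep=time.sleep):
    """Call check() until it returns True, at most `retries` times.

    Returns the number of attempts used, or raises TimeoutError.
    The delay between attempts is an assumed placeholder value.
    """
    for attempt in range(1, retries + 1):
        if check():
            return attempt
        print(f"FAILED - RETRYING ({retries - attempt} retries left).")
        sleep(delay)
    raise TimeoutError(f"condition not met after {retries} attempts")


if __name__ == "__main__":
    # Simulated check that succeeds on the third call; no real waiting.
    answers = iter([False, False, True])
    print(wait_until(lambda: next(answers), sleep=lambda seconds: None))
```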

We are not finished yet: we still need to set basic settings like NTP and DNS, and configure the three-node AVI controller cluster with another playbook:

---
- hosts: localhost
  connection: local
  roles:
    - {role: avi-cluster-setup, become: false}
    - {role: avi-change-password, become: false, when: avi_change_password == true}

The first role uses the REST API to make the configuration changes and requires the AVI Ansible SDK role. For this reason it is very useful to use the custom subscription images, because you know the default password; otherwise you need to modify the main setup.json file.

Let’s run the AVI cluster setup playbook:

[email protected]:~/avi-lab-vagrant$ ansible-playbook ../avi-lab-provision/playbooks/avi-cluster-setup.yml

PLAY [localhost] **************************************************************************************************************************************************

TASK [Gathering Facts] ********************************************************************************************************************************************
ok: [localhost]

TASK [ansible-role-avisdk : Checking if avisdk python library is present] *****************************************************************************************
ok: [localhost] => {
    "msg": "Please make sure avisdk is installed via pip. 'pip install avisdk --upgrade'"
}

TASK [avi-cluster-setup : set AVI dns and ntp facts] **************************************************************************************************************
ok: [localhost]

TASK [avi-cluster-setup : set AVI cluster facts] ******************************************************************************************************************
ok: [localhost]

TASK [avi-cluster-setup : configure ntp and dns controller nodes] *************************************************************************************************
changed: [localhost]

TASK [avi-cluster-setup : configure AVI cluster] ******************************************************************************************************************
changed: [localhost]

TASK [avi-cluster-setup : wait for cluster become active] *********************************************************************************************************
FAILED - RETRYING: wait for cluster become active (30 retries left).
FAILED - RETRYING: wait for cluster become active (29 retries left).
FAILED - RETRYING: wait for cluster become active (28 retries left).

...

FAILED - RETRYING: wait for cluster become active (14 retries left).
FAILED - RETRYING: wait for cluster become active (13 retries left).
FAILED - RETRYING: wait for cluster become active (12 retries left).
ok: [localhost]

TASK [avi-change-password : change default admin password on cluster build when subscription] *********************************************************************
skipping: [localhost]

PLAY RECAP ********************************************************************************************************************************************************
localhost                  : ok=7    changed=2    unreachable=0    failed=0

[email protected]:~/avi-lab-vagrant$
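
The "wait for cluster become active" task in the output above is a retry loop: the check is repeated until it succeeds or the retries run out. Here is a minimal Python sketch of that pattern, not the role's actual implementation. The check callable, the function name and the delay are assumptions for illustration; only the 30 retries mirror the output above.

```python
import time


def wait_until(check, retries=30, delay=10.0, sleep=time.sleep):
    """Call check() until it returns True or the retries are used up.

    Returns the number of attempts made and raises TimeoutError if every
    attempt failed. `check` is a hypothetical callable supplied by the
    caller, for example one that asks the controller for its cluster state.
    """
    for attempt in range(1, retries + 1):
        if check():
            return attempt
        if attempt < retries:
            sleep(delay)  # pause before the next attempt
    raise TimeoutError("condition not met after %d attempts" % retries)


if __name__ == "__main__":
    # Simulated cluster that reports active on the fourth check.
    states = iter([False, False, False, True])
    attempts = wait_until(lambda: next(states), retries=30, delay=0)
    print("cluster active after", attempts, "attempts")
```

Passing the check as a callable keeps the loop independent of how the cluster state is actually queried.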

We can check in the web console to see whether the cluster has booted and is set up correctly:

Last but not least, we need the Ansible playbook for the AVI Service Engine installation, which relies on the official AVI Ansible SE role:

---
- hosts: avi-se
  user: '{{ ansible_ssh_user }}'
  gather_facts: "true"
  roles:
    - {role: ansible-role-avise, become: true}

Let’s run the playbook for the service engines installation:

[email protected]:~/avi-lab-vagrant$ ansible-playbook ../avi-lab-provision/playbooks/avi-se-install.yml

PLAY [avi-se] *****************************************************************************************************************************************************

TASK [Gathering Facts] ********************************************************************************************************************************************
ok: [avi-se-2]
ok: [avi-se-1]

TASK [ansible-role-avisdk : Checking if avisdk python library is present] *****************************************************************************************
ok: [avi-se-1] => {
    "msg": "Please make sure avisdk is installed via pip. 'pip install avisdk --upgrade'"
}
ok: [avi-se-2] => {
    "msg": "Please make sure avisdk is installed via pip. 'pip install avisdk --upgrade'"
}

TASK [ansible-role-avise : Avi SE | Set facts] ********************************************************************************************************************
skipping: [avi-se-1]
skipping: [avi-se-2]

TASK [ansible-role-avise : Avi SE | Deployment] *******************************************************************************************************************
included: /home/berndonline/avi-lab-provision/roles/ansible-role-avise/tasks/docker/main.yml for avi-se-1, avi-se-2

TASK [ansible-role-avise : Avi SE | Check minimum requirements] ***************************************************************************************************
included: /home/berndonline/avi-lab-provision/roles/ansible-role-avise/tasks/docker/requirements.yml for avi-se-1, avi-se-2

TASK [ansible-role-avise : Avi SE | Requirements | Check for docker] **********************************************************************************************
ok: [avi-se-2]
ok: [avi-se-1]

TASK [ansible-role-avise : Avi SE | Requirements | Set facts] *****************************************************************************************************
ok: [avi-se-1]
ok: [avi-se-2]

TASK [ansible-role-avise : Avi SE | Requirements | Validate Parameters] *******************************************************************************************
ok: [avi-se-1] => {
    "changed": false,
    "msg": "All assertions passed"
}
ok: [avi-se-2] => {
    "changed": false,
    "msg": "All assertions passed"
}

...

TASK [ansible-role-avise : Avi SE | Services | systemd | Start the service since it's not running] ****************************************************************
changed: [avi-se-1]
changed: [avi-se-2]

RUNNING HANDLER [ansible-role-avise : Avi SE | Services | systemd | Daemon reload] ********************************************************************************
ok: [avi-se-2]
ok: [avi-se-1]

RUNNING HANDLER [ansible-role-avise : Avi SE | Services | Restart the avise service] ******************************************************************************
changed: [avi-se-2]
changed: [avi-se-1]

PLAY RECAP ********************************************************************************************************************************************************
avi-se-1                   : ok=47   changed=7    unreachable=0    failed=0
avi-se-2                   : ok=47   changed=7    unreachable=0    failed=0

[email protected]:~/avi-lab-vagrant$

After a few minutes you see the AVI Service Engines automatically register on the controller cluster, and you are ready to start on the detailed load balancing configuration:

Please share your feedback and leave a comment.

Software defined Load Balancing with AVI Networks

Throughout my career I have used various load balancing platforms, from commercial products like F5 or Citrix NetScaler to open source software like HAProxy. All of them do their job of balancing traffic between servers, but the biggest problem is scalability: yes, you can deploy more load balancers, but the configuration is statically bound to each appliance.

AVI Networks has a very interesting concept that moves away from the traditional idea of load balancing and solves this problem by decoupling the control plane from the data plane. The load balancing Service Engines basically just forward traffic and can be scaled out more easily when needed. Another nice advantage is that these Service Engines are container based and can run on basically every type of infrastructure, from bare metal and VMs to modern containerized platforms like Kubernetes or OpenShift:

All the AVI components run as container images on any type of infrastructure or platform architecture, which makes them very easy to deploy on-premises or in the cloud.

The Service Engines on hypervisors or bare-metal servers need network cards which support Intel’s DPDK for better packet forwarding. Have a look at the AVI Linux server deployment guide: https://avinetworks.com/docs/latest/installing-avi-vantage-for-a-linux-server-cloud/

Here is a basic step-by-step guide on how to install the AVI Vantage Controller and additional Service Engines. Have a look at the AVI Knowledge Base, where the install is explained in detail: https://avinetworks.com/docs/latest/installing-avi-vantage-for-a-linux-server-cloud/

Here is the link to my Vagrant environment: https://github.com/berndonline/avi-lab-vagrant

Let’s start with the manual AVI Controller installation:

[[email protected] ~]$ sudo ./avi_baremetal_setup.py
AviVantage Version Tag: 17.2.11-9014
Found disk with largest capacity at [/]

Welcome to Avi Initialization Script

Pre-requisites: This script assumes the below utilities are installed:
                  docker (yum -y install docker/apt-get install docker.io)
Supported Vers: OEL - 6.5,6.7,6.9,7.0,7.1,7.2,7.3,7.4 Centos/RHEL - 7.0,7.1,7.2,7.3,7.4, Ubuntu - 14.04,16.04

Do you want to run Avi Controller on this Host [y/n] y
Do you want to run Avi SE on this Host [y/n] n
Enter The Number Of Cores For Avi Controller. Range [4, 4] 4
Please Enter Memory (in GB) for Avi Controller. Range [12, 7]
Please enter directory path for Avi Controller Config (Default [/opt/avi/controller/data/])
Please enter disk size (in GB) for Avi Controller Config (Default [30G]) 10
Do you have separate partition for Avi Controller Metrics ? If yes, please enter directory path, else leave it blank
Do you have separate partition for Avi Controller Client Logs ? If yes, please enter directory path, else leave it blank
Please enter Controller IP (Default [10.255.1.232])
Enter the Controller SSH port. (Default [5098])
Enter the Controller system-internal portal port. (Default [8443])
AviVantage Version Tag: 17.2.11-9014
AviVantage Version Tag: 17.2.11-9014
Run SE           : No
Run Controller   : Yes
Controller Cores : 4
Memory(GB)       : 7
Disk(GB)         : 10
Controller IP    : 10.255.1.232
Disabling Avi Services...
Loading Avi CONTROLLER Image. Please Wait..
Installation Successful. Starting Services..
[[email protected] ~]$
[[email protected] ~]$ sudo systemctl start avicontroller

Or as a single command without interactive mode:

[[email protected] ~]$ sudo ./avi_baremetal_setup.py -c -cd 10 -cc 4 -cm 7 -i 10.255.1.232
AviVantage Version Tag: 17.2.11-9014
Found disk with largest capacity at [/]
AviVantage Version Tag: 17.2.11-9014
AviVantage Version Tag: 17.2.11-9014
Run SE           : No
Run Controller   : Yes
Controller Cores : 4
Memory(GB)       : 7
Disk(GB)         : 10
Controller IP    : 10.255.1.232
Disabling Avi Services...
Loading Avi CONTROLLER Image. Please Wait..
Installation Successful. Starting Services..
[[email protected] ~]$
[[email protected] ~]$ sudo systemctl start avicontroller
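
If you want to drive the non-interactive install from a script, the command line can be assembled from a few parameters. This is only a sketch: the function name and its defaults are my own, and the meaning of each flag is inferred from the summary the installer prints above (Run Controller, Disk, Cores, Memory, Controller IP) rather than taken from the installer documentation.

```python
def build_setup_command(controller_ip, cores=4, memory_gb=7, disk_gb=10,
                        script="./avi_baremetal_setup.py"):
    """Return the argument list for a non-interactive controller install.

    Hypothetical helper; the flags are the ones used in the command above.
    """
    return [
        "sudo", script,
        "-c",                    # run the Avi Controller on this host
        "-cd", str(disk_gb),     # controller disk size in GB
        "-cc", str(cores),       # number of controller cores
        "-cm", str(memory_gb),   # controller memory in GB
        "-i", controller_ip,     # controller IP address
    ]


if __name__ == "__main__":
    print(" ".join(build_setup_command("10.255.1.232")))
```

The list form can be handed directly to subprocess.run() without going through a shell.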

The installer basically installed a container image on the server which runs the AVI Controller:

[[email protected] ~]$ sudo docker ps
CONTAINER ID        IMAGE                                                 COMMAND                  CREATED              STATUS              PORTS                                                                                                                                    NAMES
c689435f74fd        avinetworks/controller:17.2.11-9014                   "/opt/avi/scripts/do…"   About a minute ago   Up About a minute   0.0.0.0:80->80/tcp, 0.0.0.0:443->443/tcp, 0.0.0.0:5054->5054/tcp, 0.0.0.0:5098->5098/tcp, 0.0.0.0:8443->8443/tcp, 0.0.0.0:161->161/udp   avicontroller
[[email protected] ~]$

Next you can connect via the web console to change the password and finalise the configuration of DNS, NTP and SMTP:

When you get to the Orchestrator integration menu, you can enter the details the controller needs to install additional Service Engines:

In the meantime the AVI Controller installs the specified Service Engines in the background, which automatically appear once this is completed under the infrastructure menu:

Like the AVI Controller, the Service Engines run as a container image:

[[email protected] ~]$ sudo docker ps
CONTAINER ID        IMAGE                                         COMMAND                  CREATED             STATUS              PORTS               NAMES
2c6b207ed376        avinetworks/se:17.2.11-9014                   "/opt/avi/scripts/do…"   51 seconds ago      Up 50 seconds                           avise
[[email protected] ~]$

The next article will be about automatically deploying the AVI Controller and Service Engines via Ansible, and looking into how to integrate AVI with OpenShift.

Please share your feedback and leave a comment.

NetBox Open Source DCIM and IPAM tool

I wanted to share some information about an open source tool I found some time ago which helps you to keep track of your infrastructure assets and configuration items. The name is NetBox, which is a DCIM (datacenter infrastructure management) and IPAM (IP address management) tool. NetBox was started by the network engineering team at DigitalOcean, specifically to address the needs of network and infrastructure engineers.

We all know that documentation is something no one wants to do, and no one has time for. What makes NetBox interesting is that it not only focuses on infrastructure documentation with a clean web console, it also comes with a REST API to push changes, and it can be used as a dynamic inventory for Ansible.

Here are a few screenshots showing the look and feel of NetBox:

The rack overview:

The IPAM module:

Here is an example of how to add a device via the REST API. This is very useful if you use ZTP (zero touch provisioning) and want to add your switches or servers to NetBox automatically, or in your automation scripts when you deploy configurations:

[email protected]:~$ curl -X POST -H "Authorization: Token fde02a67ca0c248bf5695bbf5cd56975add33655" -H "Content-Type: application/json" -H "Accept: application/json; indent=4" http://localhost:80/api/dcim/devices/ --data '{ "name": "server-9", "display_name": "server-9", "device_type": 5, "device_role": 8 , "site": 1 }'
{
    "id": 21,
    "name": "server-9",
    "device_type": 5,
    "device_role": 8,
    "tenant": null,
    "platform": null,
    "serial": "",
    "asset_tag": null,
    "site": 1,
    "rack": null,
    "position": null,
    "face": null,
    "status": 1,
    "primary_ip4": null,
    "primary_ip6": null,
    "cluster": null,
    "virtual_chassis": null,
    "vc_position": null,
    "vc_priority": null,
    "comments": "",
    "created": "2018-04-16",
    "last_updated": "2018-04-16T14:40:47.787862Z"
}
[email protected]:~$
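
In an automation script the same call can be built with Python's standard library instead of curl. This is a minimal sketch under a few assumptions: the function name and the token placeholder are illustrative, the endpoint and headers are the ones from the curl example, and only the name, device_type, device_role and site fields are sent. The request is built but not sent, so you can inspect it first.

```python
import json
import urllib.request


def build_device_request(base_url, token, name, device_type, device_role, site):
    """Build (but do not send) a POST request that creates a NetBox device."""
    payload = {
        "name": name,
        "device_type": device_type,
        "device_role": device_role,
        "site": site,
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/dcim/devices/",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": "Token " + token,
            "Content-Type": "application/json",
            "Accept": "application/json; indent=4",
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_device_request("http://localhost:80", "<your-api-token>",
                               "server-9", device_type=5, device_role=8, site=1)
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To actually send it: urllib.request.urlopen(req)
```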

In the web console you see the device I have just added via the REST API:

On the main NetBox GitHub repository page you will find links to an Ansible role and a Vagrant environment.

I am personally very interested in using NetBox as dynamic inventory with Ansible. I will write a separate article about this in the coming weeks.
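
As a rough idea of what a dynamic inventory involves: an Ansible inventory script prints JSON that maps group names to host lists, plus a "_meta" section with host variables. The sketch below groups devices by role in that layout. The input is a simplified, hypothetical list of dicts with "name" and "role" keys; real NetBox API responses carry more fields and are shaped differently, so treat this as an illustration of the output format only.

```python
import json


def to_ansible_inventory(devices):
    """Group device names by role in Ansible's dynamic inventory JSON layout.

    `devices` is a simplified, hypothetical list of dicts with "name" and
    "role" keys, not the raw NetBox API response.
    """
    inventory = {"_meta": {"hostvars": {}}}
    for device in devices:
        group = inventory.setdefault(device["role"], {"hosts": []})
        group["hosts"].append(device["name"])
        # No per-host variables in this sketch, so each entry stays empty.
        inventory["_meta"]["hostvars"][device["name"]] = {}
    return inventory


if __name__ == "__main__":
    sample = [
        {"name": "server-9", "role": "server"},
        {"name": "leaf-1", "role": "switch"},
    ]
    print(json.dumps(to_ansible_inventory(sample), indent=4))
```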

Please share your feedback and leave a comment.