
Introduction to Hashicorp Nomad

2022-07-20 · 9 min read · Martin Ahrer

In the past two years I successfully architected and developed a new product for a customer and brought it to production. This product uses Hashicorp's job scheduler Nomad for managing its workloads.

Nomad is an excellent alternative to Kubernetes. It is known for its simplicity in terms of usage and maintainability, and it can schedule both containerized and non-containerized applications.

So in case your development is not delivering container images, Nomad can schedule plain Java workloads (e.g. a Spring Boot application) or even native binaries (e.g. a Spring Boot native application).
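To illustrate the non-container case: a minimal job using Nomad's java task driver could look roughly like the sketch below. The artifact URL, jar name, and JVM options are placeholders, not taken from a real project.

job "springboot-jar" {
  datacenters = ["dc1"]

  group "app" {
    task "app" {
      # The java driver runs a jar directly on the client, no container image needed.
      driver = "java"

      # Download the application jar into the task's local/ directory
      # (the URL is a placeholder).
      artifact {
        source = "https://example.com/releases/app.jar"
      }

      config {
        jar_path    = "local/app.jar"
        jvm_options = ["-Xmx512m"]
      }
    }
  }
}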

I have worked with Kubernetes in the past, and I love Kubernetes for its flexibility and power. However, operating Kubernetes and developing applications for it is quite expensive, and not every customer is willing to take that on. Besides, probably the most popular way of running Kubernetes is a managed cloud offering, which may not fit, for example, when you just can't have your data in the cloud.

However, in this article I don't want to discuss the pros and cons of Nomad versus Kubernetes; for that I refer you to https://www.nomadproject.io/docs/nomad-vs-kubernetes. There you will also find my reasons for going with Nomad over Kubernetes.

My intention is to start a series of posts showing how to deploy Nomad and how to develop applications that run on it. By the end of this series you will know how to run Nomad in production.

Recently Nomad 1.3 was released, making it even more attractive to run because it simplifies Nomad even further. Nomad 1.3 added basic service discovery; before, a typical setup included an integration with Hashicorp's Consul, which provides very powerful service discovery (and more). For Nomad 1.4 it has already been announced that the built-in service discovery will be extended with health checks (as available with Consul services) for service registrations.

But now let's move on and show how to run a simple Nomad setup for local development and then deploy a simple container-based workload.

Run Nomad locally

For running Nomad we have to run the Nomad agent, which can run in server or client mode. For installing the agent, go ahead and read the Nomad installation tutorial. Also, it won't hurt to read up on the Nomad vocabulary to better follow along.
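For example, on macOS the agent can be installed via Homebrew from Hashicorp's official tap (just one of the options covered in the installation tutorial):

brew tap hashicorp/tap
brew install hashicorp/tap/nomad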

For local development we can run an agent in so-called dev mode, which activates server and client within a single process. This is the most trivial setup and is well documented in https://learn.hashicorp.com/tutorials/nomad/get-started-run?in=nomad/get-started.
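In dev mode, starting a combined server/client agent boils down to a single command:

nomad agent -dev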

I have chosen to show a setup that gives us multiple clients (i.e. nodes) without needing to start slow VMs. This is a perfectly valid setup for demos or even for local development. We will set up one server agent and multiple client agents. Here we will run all agents on the same machine (hence some very special settings), but in production those agents would run on separate machines (possibly virtual machines). For a production-ready setup see the Nomad reference architecture.

The following sections were implemented with Nomad 1.3, but thanks to Nomad's excellent compatibility policies they will very likely work with other versions. I have used rather basic configuration items, so the setup should work with any recent version.

Run the server agent

The following configuration server.hcl spins up a Nomad agent in server mode (single node).

server.hcl
log_level = "DEBUG"

data_dir = "/tmp/nomad/server1"

name = "server1"

# Enable the server
server {
  enabled = true

  # Self-elect, should be 3 or 5 for production
  bootstrap_expect = 1
}

The following command starts the server agent.

nomad agent -config server.hcl

In order to schedule a workload, an agent in client mode must be added. Therefore, we add a client configuration that connects to the server agent on IP address 127.0.0.1 (localhost). As we run multiple agents on a single machine, we have to change the HTTP port, which is 4646 by default and already allocated by the server agent. So we choose port 5656 for the first client. For any further client we would just have to adjust the client name, data directory, and port (a sketch of a second client configuration is shown further below).

The following configuration client1.hcl spins up a Nomad agent in client mode.

client1.hcl
log_level = "DEBUG"

data_dir = "/tmp/nomad/client1"

name = "client1"

client {
  enabled = true

  servers = ["127.0.0.1"]
}

ports {
  http = 5656
}

# Because we will potentially have two clients talking to the same
# Docker daemon, we have to disable the dangling container cleanup,
# otherwise they will stop each other's work thinking it was orphaned.
plugin "docker" {
  config {
    gc {
      dangling_containers {
        enabled = false
      }
    }
  }
}

Do not disable the garbage collector in production settings. It is disabled in this setup as we are running multiple clients on the same machine.

The following command starts the client agent.

nomad agent -config client1.hcl
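As mentioned above, any further client only differs in its name, data directory, and HTTP port. A second client configuration could therefore look like the following sketch (client2.hcl, with port 5657 chosen arbitrarily), started again with nomad agent -config client2.hcl.

client2.hcl
log_level = "DEBUG"

data_dir = "/tmp/nomad/client2"

name = "client2"

client {
  enabled = true

  servers = ["127.0.0.1"]
}

ports {
  http = 5657
}

# Same reasoning as for client1: multiple clients share one Docker daemon,
# so the dangling container cleanup stays disabled in this local setup.
plugin "docker" {
  config {
    gc {
      dangling_containers {
        enabled = false
      }
    }
  }
}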

Deploy a containerized application

After we have brought up a Nomad server and a client, we want to deploy a very simple application, which is delivered as a container image. It's a JVM application built with Spring Boot. For the following it is sufficient to know that this application offers an HTTP service on port 8080 (it embeds a Tomcat servlet engine via Spring Boot) and that it is fully configurable through environment variables.

Job specification continuousdelivery.hcl
variable "api_image_tag" {
    type = string
}

variable "registry_auth_username" {
    type = string
}

variable "registry_auth_password" {
    type = string
}

job "continuousdelivery" {
    datacenters = ["dc1"]
    type = "service"

    group "api" {
        count = 3
        network {
            port "http" {
                to = "8080"
            }
            port "management_http" {
                to = "8081"
            }
        }

        task "api" {
            driver = "docker"
            config {
                image = "registry.gitlab.com/martinahrer/continuousdelivery:${var.api_image_tag}"
                auth {
                    username = "${var.registry_auth_username}"
                    password = "${var.registry_auth_password}"
                }
                ports = [
                    "http",
                    "management_http"
                ]
                dns_servers = [
                    "1.1.1.1",
                    "1.0.0.1"
                ]
            }
            resources {
                memory = 1024
            }
            env {
                SERVER_PORT             = "8080"
                MANAGEMENT_SERVER_PORT  = "8081"
                SPRING_JPA_GENERATE_DDL = true
            }

            service {
                name        = "continuousdelivery-api"
                provider    = "nomad"
                port        = "http"
                tags        = [ "api" ]
            }
        }
    }
}

The job specification uses Hashicorp's configuration language HCL2, which is found across many Hashicorp products. It declares a few variables for externalized configuration and a job.

A job consists of one or more groups, and a group can contain one or more tasks.

A task is an instance of the application to be deployed (the workload). Tasks can be grouped and all tasks of a group will always be scheduled to the same client (node).

I will not go into too much detail with these object types as I plan to release future posts explaining much more of Nomad’s powerful concepts. But in case you can’t resist, you can read about job specifications in the Nomad job specification documentation.

The above job specification configures a task named api that is implemented by a container image containing the Spring Boot based JVM application. For pulling the image, the task's config key provides the image registry location and credentials for accessing the registry. These credentials are provided as variables, i.e. as externalized configuration.

The config key also provides a mapping of ports (e.g. HTTP ports for accessing the application) to be exposed by the workload. These port mappings were declared earlier in the group's network key, which lets us refer to them by name.

In addition, the task adds a resources key for allocating client resources such as CPU, memory, etc. The env key passes configuration to the workload as environment variables. And finally, within the task, a service key registers the task instances with Nomad's built-in service registry.

Finally, we have the group's count key (and optionally an update key, which is not shown in the specification above). The count key expresses that we want 3 instances of the group's tasks (the application in this case).
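The deployment output further below shows Auto Revert = true, which suggests the original job also carried an update key. Such a stanza is not part of the specification as shown here, but a sketch controlling rolling updates could look like this:

group "api" {
    count = 3

    # Roll out one allocation at a time and revert to the last good
    # version automatically if the deployment fails.
    update {
        max_parallel = 1
        auto_revert  = true
    }

    # ... network, task, etc. as shown above
}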

For deploying the job specification we will use the Nomad binary again invoking the job run sub-command.

nomad job run \ (1)
    --var-file credentials.hcl \ (2)
    --var-file application.hcl \ (3)
    continuousdelivery.hcl
(1) The run sub-command sends the job specification to the Nomad server to evaluate and deploy
(2) Provides variables for the credentials of the container image registry
(3) Provides variables for the task image
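The contents of the two variable files are not shown in this post; roughly, they assign values to the variables declared at the top of the job specification (the values below are placeholders):

credentials.hcl
registry_auth_username = "REGISTRY_USER"
registry_auth_password = "REGISTRY_PASSWORD"

application.hcl
api_image_tag = "latest"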

After that command we have to give Nomad a bit of time to pull the container image and to create the deployment and allocations. We can query the progress and status with the following command:

> nomad job status continuousdelivery

ID            = continuousdelivery
Name          = continuousdelivery
Submit Date   = 2022-07-20T16:23:40+02:00
Type          = service
Priority      = 50
Datacenters   = dc1
Namespace     = default
Status        = running
Periodic      = false
Parameterized = false

Summary
Task Group  Queued  Starting  Running  Failed  Complete  Lost  Unknown
api         0       0         3        0       0         0     0

Latest Deployment
ID          = b77b8089
Status      = successful
Description = Deployment completed successfully

Deployed
Task Group  Auto Revert  Desired  Placed  Healthy  Unhealthy  Progress Deadline
api         true         3        3       3        0          2022-07-20T16:34:20+02:00

Allocations
ID        Node ID   Task Group  Version  Desired  Status   Created    Modified
1b5a2511  b949630b  api         1        run      running  1h37m ago  1h30m ago
64a2bc2c  b949630b  api         1        run      running  1h37m ago  1h30m ago
6a6807c6  b949630b  api         1        run      running  1h37m ago  1h30m ago

The job status tells us that the deployment has created 3 allocations according to the group's count = 3 specification. We can also read from the job status that all allocations have been scheduled to the same client (node), as we have started only a single client. If we had multiple clients, the scheduler would have tried to spread the allocations across the available clients.

Finally, let's look at how we can access the workload, which exposes a REST API. When creating an allocation, the scheduler assigns a random port to each port exposed by the allocation, and we can query these ports through the Nomad CLI. The console output below shows the Nomad service registry information, i.e. the socket address of each allocation's HTTP port.

>  nomad service info continuousdelivery-api

Job ID              Address             Tags   Node ID   Alloc ID
continuousdelivery  192.168.1.18:26181  [api]  b949630b  1b5a2511
continuousdelivery  192.168.1.18:27114  [api]  b949630b  64a2bc2c
continuousdelivery  192.168.1.18:25454  [api]  b949630b  6a6807c6
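As a quick smoke test, one of the registered addresses can be called directly, for example with curl (assuming the application answers on its root path or on whatever REST endpoint it exposes):

curl -i http://192.168.1.18:26181/

The dynamically assigned port (26181 in this case) is forwarded to the container's port 8080 as declared in the group's network key.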

So this was just a bird's-eye view of Nomad. Nomad can solve very complex problems but is still very simple to run and use. In future posts I plan to demonstrate solutions to more complex problems such as

  • east-west load-balancing,

  • ingress,

  • canary deployments,

  • blue/green deployments,

  • integrating with Consul,

  • integrating external load-balancers (like traefik),

  • continuous deployment,

  • …

I hope I have caught your attention, and that you will read my upcoming posts related to Nomad.