Getting Started as a Developer
Development Principles
Before you start developing on these projects, please take the time to read this section on best practice. These principles are not just about good practice but also about good security, so it is essential that every developer understands that it is their responsibility to uphold the standards of these projects.
Everything Open Source
A decision has been made that all components of this project will be made open source under an MIT License. Each repository should have this license at the top level of the project. In addition, all container images will be made available on Docker Hub as public repositories. Exceptions may occur when we are interacting with proprietary software provided by external providers as needed for the project; however, preference should be given to secure open source software over proprietary software.
This allows us to maximise collaboration with other organisations and individuals in order to improve the services we’re offering.
Logging
All microservices should write their logs to STDOUT and STDERR on their respective containers. This allows logs to be consumed by centralised logging platforms and analysed accordingly, as well as accessed via kubectl for real-time diagnostics. It is the responsibility of individual organisations to implement their own centralised logging in their own environments.
Kubernetes Native by Default
Where possible, microservices should be Kubernetes native, leveraging custom resource definitions (CRDs) and the Kubernetes control plane APIs.
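As a minimal sketch of what this means in practice, a Kubernetes-native service usually ships its own custom resource definition so that its behaviour can be driven declaratively through the control plane. The group, kind and schema below are purely illustrative and not taken from any existing project:

apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  # hypothetical resource; the name must be <plural>.<group>
  name: dnsrecords.example.xlscsde.nhs.uk
spec:
  group: example.xlscsde.nhs.uk
  scope: Namespaced
  names:
    plural: dnsrecords
    singular: dnsrecord
    kind: DnsRecord
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                hostname:
                  type: string

The operator then watches these resources via the control plane API and reconciles the cluster towards the declared state.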
Security by Design
Encryption in transit
Services should be designed with security in mind as early as possible. Traffic between services should be encrypted in transit using the appropriate TLS standards. Where applicable, it should be possible to supply custom security certificates to the services via mounted secrets.
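As a sketch of the mounted-secrets approach, assuming a kubernetes.io/tls secret named my-service-tls already exists (the names here are illustrative), a certificate can be supplied to a container like this:

volumeMounts:
  - name: tls-certs
    mountPath: /etc/tls
    readOnly: true
volumes:
  - name: tls-certs
    secret:
      # hypothetical secret holding tls.crt and tls.key
      secretName: my-service-tls

The service then reads its certificate and key from /etc/tls rather than having them baked into the image.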
Least Privilege
Access to services should be given based upon the principle of least privilege. Users and services should be given only the permissions required to perform their role.
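For example, a minimal sketch of a namespaced Role that grants a service read-only access to secrets and nothing else (the name and namespace are illustrative):

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: secrets-reader      # hypothetical
  namespace: my-namespace   # hypothetical
rules:
  - apiGroups: [""]
    resources: ["secrets"]
    verbs: ["get", "list"]

This would then be bound to the service's service account with a RoleBinding, granting nothing beyond what the service needs.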
Network Connectivity
External networks should always be treated as hostile, rather than trusted; as a result, network policies and firewall rules should be in place to protect the services exposed by the solution.
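As an illustrative starting point, a default-deny network policy blocks all ingress to a namespace unless another policy explicitly allows it (the namespace name is a placeholder):

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: my-namespace   # hypothetical
spec:
  podSelector: {}           # applies to all pods in the namespace
  policyTypes:
    - Ingress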
Keeping things Generic
As we are developing open source solutions, it is important that every service we build is designed in a manner that is as generic as possible, meaning that it will work in any environment. As a result, we should not put any environment specifics into applications or services. Environment configuration should be handled by the deployment on its relevant cluster, as close to the cluster configuration as possible.
For example, if we are building a new service such as the aks-dns-operator, the code flows through the following layers:
flowchart LR
Service[Python Service] --> Image[Docker Image] --> Helm[Helm Chart] --> Flux[Flux Configuration] --> IAC[Terraform Deployment]
In this example, the application references a number of environment variables:
- PRIVATE_DNS_ZONE
- DNS_PREFIX
- AZ_SUBSCRIPTION_ID
- RESOURCE_GROUP_NAME
Each of these has a default value of an empty string. The same variables are then referenced in the docker image definition, but only so that the docker image metadata records which variables are available.
The helm chart then has a section for environment variables in the values file, but this is also left empty.
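A sketch of what that might look like in the chart's values.yaml (the exact key names depend on the chart in question):

# values.yaml (sketch) - environment variables are intentionally left empty here;
# they are populated by the flux configuration at deployment time
env: []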
In our flux configuration we start to see some of these environment variables being defined:
env:
  - name: AZURE_CLIENT_ID
    value: ${azure_client_id}
  - name: AZURE_TENANT_ID
    value: ${azure_tenant_id}
  - name: DNS_PREFIX
    value: ${environment_dns_prefix}
  - name: AZ_SUBSCRIPTION_ID
    value: ${dns_subscription_id}
  - name: PRIVATE_DNS_ZONE
    value: ${private_dnz_zone}
  - name: RESOURCE_GROUP_NAME
    value: ${dns_resource_group}
However, you will note that these values come from substitution in the flux configuration and actually originate from an environment configuration created by the terraform:
module "kubernetes_cluster_configuration" {
...
cluster_configuation = {
"azure_client_id" = module.kubernetes_cluster.kubelet_identity_client_id
"azure_keyvault_name" = module.key_vault.name
"azure_storage_account" = module.storage_account.name
"azure_tenant_id" = data.azurerm_client_config.current.tenant_id
"private_dnz_zone" = var.dns_zone
"dns_prefix" = var.dns_prefix
"dns_resource_group" = var.private_zone_resource_group_name
"dns_subscription_id" = var.hub_subscription_id
"azure_subscription_id" = data.azurerm_client_config.current.subscription_id
"azure_sql_server" = module.sql_server.name
"azure_resource_group" = module.resource_group.name
"azure_location" = var.location
}
}
So the source data comes from terraform and is built automatically by the system.
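As an illustration only, the environment configuration that terraform builds from this map typically lands in the cluster as a ConfigMap along the following lines; the name, namespace and values below are placeholders rather than details of the actual deployment:

apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-config      # hypothetical name
  namespace: flux-system    # hypothetical namespace
data:
  azure_client_id: "00000000-0000-0000-0000-000000000000"
  azure_keyvault_name: "example-keyvault"
  azure_tenant_id: "00000000-0000-0000-0000-000000000000"
  private_dnz_zone: "example.internal"
  # ...remaining keys from the cluster_configuation map above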
Secrets
Never store any secrets inside a repository; secrets should be stored in a secure vault such as Azure Key Vault and distributed to the relevant services. A default key vault is configured as part of the terraform deployment for the Azure environment. This can be accessed using the user managed identity via the CSI driver for secrets. These secrets can then be distributed by the secrets distributor service. For local development environments, an equivalent secret can be created and mounted into the secrets distributor service.
As described in the section entitled Keeping Things Generic, we should keep these as close to the environment as possible.
An example of this is the cookie secret used by JupyterHub. This appears in the helm chart used for JupyterHub as the following value:
hub:
  cookieSecret: "******"
This is passed in via the flux configuration to the HelmRelease resource in the valuesFrom section:
...
spec:
  valuesFrom:
    - kind: Secret
      name: jupyter-cookie
      valuesKey: jupyterhub_cookie_secret
      targetPath: hub.cookieSecret
This in turn comes from a secret called jupyter-cookie. This secret is created by the secrets distributor using the following custom resource:
apiVersion: xlscsde.nhs.uk/v1
kind: SecretsDistribution
metadata:
  name: jupyter-cookie
  annotations:
    xlscsde.nhs.uk/secretUsage: "Jupyter Cookie Secret"
spec:
  name: jupyter-cookie
  secrets:
    - from: JupyterCookieSecret
      to: jupyterhub_cookie_secret
This in turn pulls from a secret served by the secrets distributor called JupyterCookieSecret. That secret is actually stored in key vault and mounted into the secrets distributor as a volume using the CSI driver:
volumeMounts:
  - name: secrets-store-inline
    mountPath: /mnt/secrets
    readOnly: true
volumes:
  - name: secrets-store-inline
    csi:
      driver: secrets-store.csi.k8s.io
      readOnly: true
      volumeAttributes:
        secretProviderClass: "keyvault-sync"
This in turn refers to a SecretProviderClass called keyvault-sync, which exists in the cluster configuration:
apiVersion: secrets-store.csi.x-k8s.io/v1
kind: SecretProviderClass
metadata:
  name: keyvault-sync
  namespace: secrets-distributor
  annotations:
    xlscsde.nhs.uk/secretUsage: "Key Vault Sync"
spec:
  provider: azure
  parameters:
    usePodIdentity: "false"
    useVMManagedIdentity: "true"
    userAssignedIdentityID: ${azure_client_id}
    keyvaultName: ${azure_keyvault_name}
    tenantId: "${azure_tenant_id}"
    objects: |
      array:
        ...
        - |
          objectName: JupyterCookieSecret
          objectType: secret
This in turn gets what it needs to connect to the key vault from the substitutions, which again come from the environment config ConfigMap deployed by terraform.
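A sketch of how those substitutions are typically wired up in a flux Kustomization, assuming the environment config is delivered as the ConfigMap sketched earlier (the resource names and path here are illustrative):

apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: cluster-config-sync   # hypothetical
  namespace: flux-system
spec:
  interval: 10m
  prune: true
  sourceRef:
    kind: GitRepository
    name: flux-system         # hypothetical git source
  path: ./clusters/example    # hypothetical path
  postBuild:
    substituteFrom:
      - kind: ConfigMap
        name: cluster-config  # hypothetical name for the environment config

Flux then replaces the ${...} placeholders in the manifests it applies with the matching keys from that ConfigMap.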
Repository Splitting
In order to keep the solution as modular as possible, each repository should contain a specific category and type of code. If, for example, we're building a service such as the aks-dns-operator, we should split the various components into their own repositories:
- Docker Image (this can include the application itself)
- Helm Chart
- Flux Configuration
This is preferable to having one repository for everything. The rationale behind this is that we may have contributors in the future who wish to contribute to the docker image, but not to the helm chart or the flux configuration. It allows us to compartmentalise everything and maintain separate versions of each component, allowing us to move in different directions at a later date if necessary.
Repository Naming
Please ensure that each repository is named according to its type and purpose:
Flux repositories
Flux repositories should be named as follows:
iac-flux-{name}
Helm repositories
Helm repositories should be named as follows:
iac-helm-{name}
Container Images
Docker container images should be named as follows:
docker-{name}
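For example, applying these conventions to the aks-dns-operator service used earlier would give repositories along the lines of:
- docker-aks-dns-operator
- iac-helm-aks-dns-operator
- iac-flux-aks-dns-operator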
Checking out the code
TODO: Add commands to checkout the code
QUESTIONS: Do we want to recommend that they fork the repositories from the beginning? Do we want to include a CLI command for them to do so?