Planning container operations

What should you consider when planning your container operations?

Slowly but surely, the ops part of DevOps is maturing into something boring. In this case boring is good: maturity in IT operations means security and predictability. The technologies needed for container operations have gone from obscure entries on GitHub to products and services that businesses can rely on, with support broadly available. An abundance of new products and services has also become available, so that people other than the best scripting professionals can manage the critical Kubernetes estate. After all, that Kubernetes estate is now where the action is in new application development.

Why does this matter?

Cloud Native application operations are complicated and not becoming easier, while the same requirements for security and compliance still apply across the IT landscape. The first part of the problem is difficult to solve: there is a limited supply of specialists who understand container operations platforms and are willing to maintain them. Supported technologies with a well-thought-out user experience ease those tasks and help ops stay out of the devs' way, but someone on the team still needs a thorough understanding of the infrastructure to make the right decisions when developing and troubleshooting. This leads to the second point. Container operations are typically set up like this: when the business requirement for a new digitalization effort appears, the developers not only develop the solution but also set up the infrastructure needed to build, deploy and run cloud native applications.

This is great until the business comes up with the next project and the devs' attention is required elsewhere. Even if cloud native has advantages over applications running in virtual machines, there are still ongoing maintenance tasks and development items. Cloud native apps have vulnerabilities too, logs must be collected, and controls must be set up to meet ISO requirements (or equivalent).

Maintaining the container operations infrastructure is a necessary part of any Cloud Native application setup and, in a mature operation, it is not the developer’s task.

What should be considered, then? At Datalounges we suggest you consider at least the following items:

Kubernetes infrastructure management, observability, infrastructure containers, storage services, container lifecycle management, access controls and developer services.

Each of these items has considerable depth, and we at Datalounges would be happy to discuss your specific requirements or questions in detail. For the time being, here is a brief overview of some of these topics.

Managing the Kubernetes infrastructure

Kubernetes is the central piece of the immutable infrastructure. It runs and orchestrates containers and provides the magic necessary for cloud native applications. It is also software. Kubernetes needs to be configured, managed, updated and maintained. It is the central infrastructure component and yes, it also has needs.

Datalounges recommends the industry’s leading platform for Kubernetes management, Rancher, as-a-service for the task.

Partnering with SUSE allows customers to deploy Kubernetes clusters to public and private clouds, to maintain those infrastructures and to secure them with policies, all as a Datalounges service.

Observability of cloud native applications and infrastructure

What, then, is happening in the highly automated infrastructure? And more importantly, what is happening with the cloud native applications deployed in that infrastructure? Collecting logs, metrics and monitoring data from the infrastructure enables a better understanding of the systems and complements Rancher management.

It is also important to store that data somewhere other than the volumes of those systems (and preferably outside that same cloud) and to archive it for possible inspection to satisfy compliance requirements. Visualizing the data in views for Dev, Ops and Biz purposes is then the added value of observability, allowing the different stakeholders to stay on top of their business applications the way they want.

For this multi-cloud, multi-cluster and multi-use-case observability, Datalounges uses another Datalounges SaaS service named R4DAR. You can read more about R4DAR here: https://www.datalounges.com/r4dar/

Cloud native infrastructure containers

To make applications run similarly from cluster to cluster, a number of containerized services are necessary for the infrastructure. It makes sense to deploy some of them from Rancher, but it is likely that many will need to be tailored to match the requirements of the computing environment, either by your own team or by someone who does this task for you.

Datalounges has prepared these basic services for Tuuli Kubernetes, and those same containers can be used in your container production environment of choice. For example, an Ingress manages external access to your services. Datalounges provides its customers with a fully configured and maintained NGINX ingress to make sure devs and ops can expect similar application behavior across clusters, namespaces and computing platforms. The same applies to other necessary services such as certificate management, private registry services, storage management and more.
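As a rough illustration of what an Ingress definition contains, the sketch below builds a Kubernetes `networking.k8s.io/v1` Ingress manifest as a plain Python dictionary and prints it as JSON, a format `kubectl apply -f` also accepts. The host, namespace, service name and port are hypothetical placeholders, and the `nginx` ingress class name is an assumption about how the controller is installed. This is not the configuration Datalounges ships.

```python
import json


def build_ingress(name, namespace, host, service, port, ingress_class="nginx"):
    """Return a networking.k8s.io/v1 Ingress manifest as a dict.

    Every value is supplied by the caller; nothing here refers to a
    real cluster. The default ingress class name is an assumption.
    """
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "Ingress",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "ingressClassName": ingress_class,
            "rules": [
                {
                    "host": host,
                    "http": {
                        "paths": [
                            {
                                # Route every path under "/" to one Service.
                                "path": "/",
                                "pathType": "Prefix",
                                "backend": {
                                    "service": {
                                        "name": service,
                                        "port": {"number": port},
                                    }
                                },
                            }
                        ]
                    },
                }
            ],
        },
    }


if __name__ == "__main__":
    # Hypothetical application: all names below are placeholders.
    manifest = build_ingress("web", "demo", "app.example.com", "web-svc", 8080)
    print(json.dumps(manifest, indent=2))
```

Generating the manifest from one function is one way to keep the definition identical across clusters and namespaces; only the arguments change.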

Datalounges is your partner when selecting and designing this soft fabric that surrounds applications and makes DevOps possible.

Storage and services

So what kind of storage services do containers require? Are they not meant to be stateless or ephemeral? Certainly, some containers only write into a temporary scratch space that lasts for the lifetime of the container. However, that is not the only kind of container anymore. As applications become more diverse, so do the requirements for storage. It is necessary to have a software-defined storage layer that can interact with the fluid nature of container workloads when they require persistent volumes.

Kubernetes will manage a significant part of the problem, but to complement it Datalounges uses Ceph storage to provide the cloud native services a container infrastructure may require. This includes block, object and file services, taking systematic backups (also from the Public Cloud), organizing disaster recovery scenarios and using snapshots to stay up to date with the data.
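To make the persistent volume idea concrete, here is a minimal sketch of how a workload asks Kubernetes for storage through a PersistentVolumeClaim, again built as a Python dictionary. The storage class name `ceph-block` is a hypothetical placeholder for whatever class the cluster administrator has defined, and the claim name, namespace and size are illustrative only.

```python
import json


def build_pvc(name, namespace, size, storage_class, access_mode="ReadWriteOnce"):
    """Return a v1 PersistentVolumeClaim manifest as a dict.

    storage_class must match a StorageClass that exists in the cluster;
    the value used in the example below is a made-up placeholder.
    """
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "accessModes": [access_mode],
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
        },
    }


if __name__ == "__main__":
    claim = build_pvc("orders-db-data", "demo", "10Gi", "ceph-block")
    print(json.dumps(claim, indent=2))
```

The claim only states what the application needs; which storage backend satisfies it is decided by the storage class, which is what lets the same application definition move between clusters.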

Even with stateless containers, the versioning used, the setup that continuously builds containers and the storage required for registry artifacts are worth backing up, and they are essential for recovery.

Container lifecycle management

Patching containers is a challenging topic. As they are not VMs, managing the configuration and maintaining secure, up-to-date versions of the base image and other packages is different from the widely adopted policies used for business systems. It is a complicated problem, and not entirely a technical one.

Even if containers are not full operating systems, the base images containers are built from still have a lot of Linux components that need updating from time to time as vulnerabilities get announced. This applies to other necessary building blocks of the application as well. As the only meaningful way to have container packages updated is to rebuild them, the process requires that the developer or someone with that role in the ops team is involved. There is an organizational process angle to this problem that Datalounges has also investigated, but that topic will be covered in another post.

From a service perspective it is important to make sure that

  • vulnerabilities that affect containers are identified
  • base images are kept up to date so that new images do not suffer from the vulnerabilities found
  • responsibilities for rebuilding and deploying containers have been agreed upon, so that a rebuild is initiated when necessary

Datalounges uses an industry-leading private registry (Harbor) to scan all images for vulnerabilities and has made available up-to-date, scanned base images from which to rebuild containers.
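The three checks listed above can be sketched as a small decision routine. The example assumes a hypothetical inventory in which each application image records the base image digest it was built from, the findings reported by a scanner, and the team that agreed to own rebuilds. The record layout, image names, digests and vulnerability identifiers are all made up for illustration and do not reflect Harbor's API or data model.

```python
from dataclasses import dataclass, field


@dataclass
class ImageRecord:
    """One application image in a hypothetical inventory."""
    name: str
    base_image: str
    base_digest: str  # digest of the base image this image was built from
    vulnerabilities: list = field(default_factory=list)  # scanner findings
    owner: str = "unassigned"  # team agreed to be responsible for rebuilds


def images_to_rebuild(images, current_base_digests):
    """Return (image, reasons) pairs for images that should be rebuilt.

    An image is flagged when the scan reported vulnerabilities, or when
    its base image now has a different digest than the one it was built
    from.
    """
    flagged = []
    for img in images:
        reasons = []
        if img.vulnerabilities:
            reasons.append("vulnerabilities: " + ", ".join(img.vulnerabilities))
        latest = current_base_digests.get(img.base_image)
        if latest is not None and latest != img.base_digest:
            reasons.append("base image updated")
        if reasons:
            flagged.append((img, reasons))
    return flagged


if __name__ == "__main__":
    # Placeholder data only; the identifiers are not real findings.
    inventory = [
        ImageRecord("shop/api", "base/python", "sha256:old",
                    ["CVE-EXAMPLE-1"], owner="team-a"),
        ImageRecord("shop/web", "base/nginx", "sha256:same", owner="team-b"),
    ]
    latest_digests = {"base/python": "sha256:new", "base/nginx": "sha256:same"}
    for image, why in images_to_rebuild(inventory, latest_digests):
        print(f"{image.name} -> notify {image.owner}: {'; '.join(why)}")
```

Keeping the owner next to each image is the point of the third bullet: detection is only useful if it is clear who initiates the rebuild.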

Access controls

An important component of providing applications and services is how access is managed. This covers user access to applications, Dev access to developer infrastructure, privileged user access to the infrastructure and, of course, how customers access the applications. Another angle to consider for security and compliance is policy-based management of the access that containerized applications have to the infrastructure. Each consideration is a significant building block and requires its own attention.

When considering this access challenge, the requirements of Datalounges customers converge on a cloud native technology, Keycloak. As an access service, Keycloak is a versatile offering with options to maintain and manage services for admins and applications.

Keycloak is also well supported by Rancher and recommended by Datalounges to serve your access requirements. The same applies to OPA (Open Policy Agent) for admission control, which allows the use of policies and controls across the Kubernetes infrastructure.
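OPA policies are written in its own policy language, Rego. The Python sketch below is therefore not an OPA policy; it only illustrates the kind of decision an admission policy makes, here rejecting Pods whose images do not come from an approved registry. The registry host and Pod contents are hypothetical.

```python
def admission_violations(pod, allowed_registries):
    """Return human-readable violations for a Pod manifest (a dict).

    Checks regular and init containers. An image is accepted only if
    its reference starts with one of the approved registry hosts
    followed by a slash.
    """
    spec = pod.get("spec", {})
    containers = spec.get("containers", []) + spec.get("initContainers", [])
    violations = []
    for container in containers:
        image = container.get("image", "")
        approved = any(image.startswith(reg + "/") for reg in allowed_registries)
        if not approved:
            violations.append(
                f"container {container.get('name', '?')!r} uses image "
                f"{image!r} from an unapproved registry"
            )
    return violations


if __name__ == "__main__":
    # Hypothetical Pod with one compliant and one non-compliant container.
    pod = {
        "kind": "Pod",
        "spec": {
            "containers": [
                {"name": "api", "image": "registry.example.com/shop/api:1.2.3"},
                {"name": "sidecar", "image": "docker.io/library/busybox:latest"},
            ]
        },
    }
    problems = admission_violations(pod, ["registry.example.com"])
    print("allowed" if not problems else "denied:\n" + "\n".join(problems))
```

An admission controller would deny the request whenever the list of violations is non-empty, which is how one policy can be enforced uniformly across clusters.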

Developer services

Today the developer is often also responsible for the cloud native infrastructure. Devs select the computing environment, set up automations, select versioning and come up with the base image to get the apps up and running. This tends to happen because the tooling is not available in their own infrastructure and ordering a delivery from the service provider takes time. Hence the quickest way to get to deploying is in the Public Cloud.

Datalounges research suggests that this is the most common scenario. And it's great, because it serves the developer and, through that, the requirements of the business. As mentioned at the beginning of the article, as use cases scale and applications mature, so do the requirements for compliance and security. Datalounges research found that Devs build great ops infrastructures, but have challenges maintaining them. Similarly, ops teams struggle with containers because their services are built to support VMs.

There are many opportunities to separate the responsibilities of Dev and Ops teams for a handover. According to the interviews conducted for the study, most of the common ground is found in versioning. Checking in code and using, for example, Git as the tool to select base images and build containers, with a CI/CD pipeline that drops containers into the registry based on naming conventions, provides an excellent option for separating Dev and Ops responsibilities.

To enable this, Datalounges has developed an integration between GitLab and the Harbor registry to make Git the interface to cloud services for containers.
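The idea of dropping containers into the registry based on naming conventions can be sketched as a function a CI/CD job might call to decide where an image goes. The convention shown (release tags become version tags, branches become `<branch>-<short sha>` tags) and all host and project names are hypothetical; this is not a description of the Datalounges GitLab and Harbor integration.

```python
import re

TAG_PREFIX = "refs/tags/v"
BRANCH_PREFIX = "refs/heads/"


def image_reference(registry, project, repo, git_ref, commit_sha):
    """Map a Git ref to an image reference under a made-up convention.

    refs/tags/v1.2.3     -> <registry>/<project>/<repo>:1.2.3
    refs/heads/<branch>  -> <registry>/<project>/<repo>:<branch slug>-<short sha>
    """
    base = f"{registry}/{project}/{repo}"
    if git_ref.startswith(TAG_PREFIX):
        version = git_ref[len(TAG_PREFIX):]
        return f"{base}:{version}"
    if git_ref.startswith(BRANCH_PREFIX):
        branch = git_ref[len(BRANCH_PREFIX):]
        # Image tags cannot contain "/", so reduce the branch name to a slug.
        slug = re.sub(r"[^a-z0-9]+", "-", branch.lower()).strip("-")
        return f"{base}:{slug}-{commit_sha[:8]}"
    raise ValueError(f"unsupported Git ref: {git_ref}")


if __name__ == "__main__":
    sha = "0123456789abcdef0123456789abcdef01234567"  # placeholder commit
    print(image_reference("registry.example.com", "shop", "api",
                          "refs/tags/v1.2.3", sha))
    print(image_reference("registry.example.com", "shop", "api",
                          "refs/heads/feature/Login_Page", sha))
```

With a rule like this, developers only interact with Git, while ops can apply registry policies (scanning, retention, promotion) based on where an image lands.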

Together, these components create a loosely coupled set of services that, depending on the use case and the maturity level of the adopter, allow either a full-control PaaS deployment to serve developers or the adoption of individual technology services based on business need.

Learn more about the solutions and combinations. Talk to Datalounges.