In this blog series on Kubernetes, we’ve already covered:
- The basic setup for building applications in Kubernetes
- How to set up processes using pods and controllers
- Configuring Kubernetes networking services to allow pods to communicate reliably
- How to identify and manage the environment-specific configurations to make applications portable between environments
In this series’ final installment, I’ll explain how to provision storage to a Kubernetes application.
Step 4: Provisioning Storage
The final component we want to think about when we build applications for Kubernetes is storage. Remember, a container’s filesystem is transient, and any data kept there is at risk of being deleted along with your container if that container ever exits or is rescheduled. If we want to guarantee that data lives beyond the short lifecycle of a container, we must write it out to external storage.
Any container that generates or collects valuable data should be pushing that data out to stable external storage. In our web app example, the database tier should be pushing its on-disk contents out to external storage so they can survive a catastrophic failure of our database pods.
Similarly, any container that requires the provisioning of a lot of data should be getting that data from an external storage location. We can even leverage external storage to push stateful information out of our containers, making them stateless and therefore easier to schedule and route to.
Decision #5: What data does your application gather or use that should live longer than the lifecycle of a pod?
The full Kubernetes storage model has a number of moving parts:
- Container Storage Interface (CSI) Plugins can be thought of as the driver for your external storage.
- StorageClass objects take a CSI driver and add some metadata that typically configures how storage on that backend will be treated.
- PersistentVolume (PV) objects represent an actual bucket of storage, as parameterized by a StorageClass.
- PersistentVolumeClaim (PVC) objects allow a pod to ask for a PersistentVolume to be provisioned to it.
- Finally, we met Volumes earlier in this series. In the case of storage, we can populate a volume with the contents of the external storage captured by a PV and requested by a PVC, provision that volume to a pod and finally mount its contents into a container in that pod.
Managing all these components can be cumbersome during development, but as in our discussion of configuration, Kubernetes volumes provide a convenient abstraction by defining how and where to mount external storage into your containers. They form the start of what I like to think of as the “storage frontend” in Kubernetes—these are the components most closely integrated with your pods and which won’t change from environment to environment.
All the other components, from the CSI driver all the way through the PVC, make up what I like to think of as the “storage backend”. They can be torn out and replaced as you move between environments without affecting your code, containers, or the controller definitions that deploy them.
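To make the frontend/backend split concrete, here is a minimal sketch in Python that builds a Deployment manifest as a plain dictionary and prints it as JSON, which kubectl accepts alongside YAML. The names used here (the `db` deployment, the `db-data` claim, the `postgres:16` image and the mount path) are assumptions chosen for illustration, not values from this series. The point to notice is that the pod refers to its storage only by claim name.

```python
import json

# Hypothetical names for illustration only.
CLAIM_NAME = "db-data"
MOUNT_PATH = "/var/lib/postgresql/data"


def database_deployment(claim_name: str, mount_path: str) -> dict:
    """Build a Deployment whose pod mounts external storage through a PVC."""
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": "db"},
        "spec": {
            "replicas": 1,
            "selector": {"matchLabels": {"app": "db"}},
            "template": {
                "metadata": {"labels": {"app": "db"}},
                "spec": {
                    "containers": [{
                        "name": "db",
                        # Placeholder image; a real Postgres container also
                        # needs credentials, ideally supplied from a Secret.
                        "image": "postgres:16",
                        # Where the volume appears inside the container.
                        "volumeMounts": [
                            {"name": "data", "mountPath": mount_path}
                        ],
                    }],
                    # The volume is the "frontend": it names a claim and
                    # says nothing about which backend satisfies it.
                    "volumes": [{
                        "name": "data",
                        "persistentVolumeClaim": {"claimName": claim_name},
                    }],
                },
            },
        },
    }


if __name__ == "__main__":
    print(json.dumps(database_deployment(CLAIM_NAME, MOUNT_PATH), indent=2))
```

Saving the script under a name of your choice and running something like `python deployment.py | kubectl apply -f -` would submit the manifest. As long as each environment provides a claim with the same name, this definition should not need to change.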
Note that on a single-node cluster (like the one created for you by Docker Desktop on your development machine), you can create hostPath-backed PersistentVolumes, which provision persistent storage from your local disk without setting up any CSI plugins or special storage classes. This is an easy way to get started developing your application without getting bogged down in the full storage model described above, effectively deferring the choice and setup of CSI plugins and StorageClasses until you’re ready to move off your dev machine and into a larger cluster.
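As a rough sketch of that development setup, the Python below builds a hostPath-backed PersistentVolume and a matching PersistentVolumeClaim, then prints both as a single JSON list for kubectl. The object names, the `/tmp/k8s-dev-data` path, the 1Gi size and the `manual` class label are all assumptions for illustration. The shared `storageClassName` is just a label that pairs the claim with this hand-made volume; no StorageClass object needs to exist for it.

```python
import json

# Arbitrary label used only to pair the PV with the PVC (an assumption).
STORAGE_CLASS = "manual"


def hostpath_volume(name: str, host_path: str, size: str) -> dict:
    """A PersistentVolume backed by a directory on the node's local disk."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolume",
        "metadata": {"name": name},
        "spec": {
            "capacity": {"storage": size},
            "accessModes": ["ReadWriteOnce"],
            "storageClassName": STORAGE_CLASS,
            "hostPath": {"path": host_path},
        },
    }


def volume_claim(name: str, size: str) -> dict:
    """A PersistentVolumeClaim that can bind to the volume above."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            "storageClassName": STORAGE_CLASS,
            "resources": {"requests": {"storage": size}},
        },
    }


def as_list(*items: dict) -> dict:
    """Wrap several manifests in one List object so they apply together."""
    return {"apiVersion": "v1", "kind": "List", "items": list(items)}


if __name__ == "__main__":
    manifests = as_list(
        hostpath_volume("dev-pv", "/tmp/k8s-dev-data", "1Gi"),
        volume_claim("db-data", "1Gi"),
    )
    print(json.dumps(manifests, indent=2))
```

Because hostPath storage lives on one node's disk, this only makes sense on a single-node cluster, which is exactly the development case described above.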
The simple hostPath PVs mentioned above are appropriate for early development and proof-of-principle work, but they will need to be replaced with more powerful storage solutions before you get to production. This will require you to look into the “backend” components of Kubernetes’ storage solution, namely StorageClasses and CSI plugins.
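For a sense of what that backend swap might look like, here is a hedged Python sketch that builds a StorageClass and a claim that requests storage from it. The provisioner name `csi.example.com` and the `type: ssd` parameter are hypothetical placeholders: real values depend entirely on the CSI driver installed in your cluster, so check that driver's documentation. The class name `fast` and claim name `db-data` are likewise assumptions.

```python
import json


def storage_class(name: str, provisioner: str, parameters: dict) -> dict:
    """A StorageClass that hands provisioning to a CSI driver."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        # Must match the name the installed CSI driver registers under.
        "provisioner": provisioner,
        # Driver-specific settings; the keys are defined by the driver.
        "parameters": parameters,
        # Keep the underlying volume when the claim is deleted.
        "reclaimPolicy": "Retain",
        # Delay provisioning until a pod using the claim is scheduled.
        "volumeBindingMode": "WaitForFirstConsumer",
    }


def dynamic_claim(name: str, class_name: str, size: str) -> dict:
    """A PVC that asks the named StorageClass to provision a volume."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            "storageClassName": class_name,
            "resources": {"requests": {"storage": size}},
        },
    }


if __name__ == "__main__":
    # "csi.example.com" and {"type": "ssd"} are hypothetical placeholders.
    manifests = {
        "apiVersion": "v1",
        "kind": "List",
        "items": [
            storage_class("fast", "csi.example.com", {"type": "ssd"}),
            dynamic_claim("db-data", "fast", "20Gi"),
        ],
    }
    print(json.dumps(manifests, indent=2))
```

With a class like this in place, no one has to create PersistentVolumes by hand: the driver provisions one when the claim is used. If the claim keeps the name your pods already reference, the pod and controller definitions can stay as they are.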
In this series, I’ve walked you through the basic Kubernetes tooling you’ll need to containerize a wide variety of applications, and provided you with next-step pointers on where to look for more advanced information. Try working through the stages of containerizing workloads, networking them together, modularizing their config, and provisioning them with storage to get fluent with the ideas above.
Kubernetes provides powerful solutions for all four of these areas, and a well-built app will leverage all four of them. If you’d like more guidance and technical details on how to operationalize these ideas, you can explore the Docker Training team’s workshop offerings, and check back for new Training content landing regularly.
After mastering the basics of building a Kubernetes application, ask yourself, “How well does this application fit the values of portability, scalability and shareability we started with?” Containers themselves are engineered to easily move between clusters and users, but what about the entire application you just built? How can we move that around while still preserving its integrity and not invalidating any unit and integration testing you’ll perform on it?
Docker App sets out to solve that problem by packaging applications in an integrated bundle that can be moved around as easily as a single image. Stay tuned to this blog and Docker Training for more guidance on how to use this emerging format to share your Kubernetes applications seamlessly.
To learn more about Kubernetes storage and Kubernetes in general:
- Read the Kubernetes documentation on PersistentVolumes and PersistentVolumeClaims.
- Find out more about running Kubernetes on Docker Enterprise and Docker Desktop.
- Check out Play with Kubernetes, powered by Docker.
We will also be offering training on Kubernetes starting in early 2020. In the training, we’ll provide more specific examples and hands-on exercises. To get notified when the training is available, sign up here: