Configure Dotmesh in Kubernetes
This guide explains the ins and outs of the Dotmesh Kubernetes YAML file. It is not a guide to simply installing and using Dotmesh with Kubernetes; you can find that in the Installation guide. This guide is for people who want to modify the YAML and do non-standard installations.
We assume you have read the Concrete Architecture guide, and understand what the different components of a Dotmesh cluster are.
Using customised YAML.
The Dotmesh YAML is split between two files.
The first file is a ConfigMap that, as you might expect, provides configuration used by the Dotmesh Operator to configure Dotmesh on your cluster; that’s the one you’re most likely to need to modify.
The second file actually sets up the components of the cluster, and is less likely to need changing. However, this guide will explain both in detail, to enable customised setups.
We provide pre-customised versions of both YAML files:
- Core YAML for Kubernetes 1.7
- Core YAML for Kubernetes 1.8
- Core YAML for Kubernetes 1.8 on AKS (Azure)
Getting the YAML ready for customisation
Grab the most appropriate base YAML for your situation, and customise it like so:
$ curl https://get.dotmesh.io/yaml/configmap.yaml > configmap-default.yaml
$ curl https://get.dotmesh.io/yaml/dotmesh-k8s-1.8.yaml > dotmesh-default.yaml
$ cp configmap-default.yaml configmap-customised.yaml
$ cp dotmesh-default.yaml dotmesh-customised.yaml
# ...edit configmap-customised.yaml and/or dotmesh-customised.yaml...
$ kubectl apply -f configmap-customised.yaml
$ kubectl apply -f dotmesh-customised.yaml
The ConfigMap has the following keys:
- nodeSelector: A selector for the nodes that should run Dotmesh. If it’s left as the empty string, then Dotmesh will be installed on every node.
- upgradesUrl: The URL of the checkpoint server. The Dotmesh server on each node will periodically make an API call to this URL to find out if a new version is available; if so, a message will be presented to users when they run dm version. To turn this off so that we don’t know you’re running Dotmesh, set it to the empty string.
- upgradesIntervalSeconds: How many seconds to wait between checks of the checkpoint server.
- flexvolumeDriverDir: The directory on the nodes (in the host filesystem) where FlexVolume driver plugins need to be installed. This varies between cloud providers; this is the line that changes between the vanilla, GKE and AKS versions of the ConfigMap YAML.
- poolName: The name of the ZFS pool to use for backend storage.
- logAddress: The IP address of a syslog server to send log messages to. If left as the empty string, then logging will go to standard output (which is recommended).
- storageMode: This is reserved for future expansion. Leave it as local.
- local.poolSizePerNode: How large a pool file to create on each node. Defaults to 10G for a ten gigabyte pool.
- local.poolLocation: The location on the host filesystem where the pool file will be created.
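Putting those keys together, a customised ConfigMap might look roughly like the sketch below. The metadata name and the values for upgradesIntervalSeconds, poolName and local.poolLocation are illustrative assumptions, not defaults taken from the shipped file; for anything you do not intend to change, keep what your downloaded configmap-default.yaml contains. The flexvolumeDriverDir shown is the stock Kubernetes FlexVolume plugin directory, which may differ on your provider.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: configuration          # assumption: keep the name from configmap-default.yaml
  namespace: dotmesh
data:
  # ConfigMap values are always strings, so numbers are quoted too.
  nodeSelector: ""             # empty string: run Dotmesh on every node
  upgradesUrl: ""              # empty string: disable the new-version check
  upgradesIntervalSeconds: "14400"   # illustrative value only
  flexvolumeDriverDir: "/usr/libexec/kubernetes/kubelet-plugins/volume/exec"
  poolName: "pool"             # illustrative value only
  logAddress: ""               # empty string: log to standard output
  storageMode: "local"         # reserved for future expansion; leave as local
  local.poolSizePerNode: "10G" # documented default, a ten gigabyte pool
  local.poolLocation: "/var/lib/dotmesh"   # illustrative value only
```

Editing a copy of the shipped file, rather than writing one from scratch, avoids accidentally dropping a key the operator expects.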
Components of the Core YAML.
The YAML is a List, composed of a series of different objects. We’ll summarise them, then look at each in detail.
All namespaced objects are in the dotmesh namespace; but ClusterRoles, ClusterRoleBindings and StorageClasses are not namespaced in Kubernetes.
The following objects comprise the core Dotmesh cluster:
Then the following comprise the Dynamic Provisioner:
dotmesh-etcd-cluster etcd cluster.
This is the ServiceAccount that will be used to run the Dotmesh server. You shouldn’t need to change this.
This is the ServiceAccount that will be used to run the Dotmesh operator. You shouldn’t need to change this.
This is the role Dotmesh will run under. You shouldn’t need to change this file.
If you are running Kubernetes >= 1.8 then RBAC is probably enabled, and you need to bind the cluster-admin role to your user.
Here is an example of adding that role for a gcloud user running a GKE cluster:
$ kubectl create clusterrolebinding cluster-admin-binding \
    --clusterrole cluster-admin \
    --user "$(gcloud config get-value core/account)"
The --user value can be replaced with a local user (e.g. root) or another user depending on where your cluster is deployed.
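If you would rather keep this binding in a file alongside the rest of your YAML, the same thing can be written declaratively. This is a sketch of an equivalent manifest, not something shipped with Dotmesh; the user name is a placeholder you would replace with your own account.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: cluster-admin-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
  - apiGroup: rbac.authorization.k8s.io
    kind: User
    name: you@example.com      # placeholder: the account that runs kubectl
```

Applying it with kubectl apply -f has the same effect as the kubectl create clusterrolebinding command, and can be re-run safely.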
This simply binds the dotmesh ServiceAccount to the ClusterRole. You shouldn’t need to change this.
This simply binds the dotmesh ServiceAccount to the ClusterRole (so that it can manage pods and PVCs). You shouldn’t need to change this.
This is the service used to access the Dotmesh server on port 32607. It’s needed both for internal connectivity between nodes in the cluster, and to allow other clusters to push/pull from this one.
This runs the Dotmesh operator, which then creates a Dotmesh server pod for every node in your cluster (every node that matches the nodeSelector in the ConfigMap, at any rate).
The operator references the dotmesh-operator ServiceAccount in order to be able to create and destroy pods and PVCs.
The Dotmesh server pods it creates reference the dotmesh ServiceAccount in order to obtain the privileges they need. They also refer to the dotmesh secret (also in the dotmesh namespace) to configure the initial API key and admin password. This secret is not created by the YAML; you have to provide it yourself, as we wouldn’t dream of shipping you a default API key! This gets mounted into the container filesystem at
This is the ServiceAccount that will be used to run the Dotmesh provisioner. You shouldn’t need to change this.
This is the role the Dotmesh provisioner will run under. You shouldn’t need to change this.
This simply binds the dotmesh-provisioner ServiceAccount to the dotmesh-provisioner-runner ClusterRole. You shouldn’t need to change this.
This actually runs the dynamic provisioner. Only one replica of it needs to run somewhere in the cluster; it just looks for Dotmesh PVCs and creates corresponding PVs, so it doesn’t need to run on every node.
It references the dotmesh secret from the dotmesh namespace, in order to obtain the cluster’s admin API key so it can communicate with the Dotmesh server.
This defines a default dotmesh StorageClass that, when referenced from a PersistentVolumeClaim, will cause Dotmesh to manage the volume. In the parameters section, there’s a single configurable parameter: dotmeshNamespace is the default namespace for dots accessed through this StorageClass. You probably don’t need to change it unless you’re doing something interesting.
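As an illustration of when you might change it, the sketch below defines a second StorageClass whose dots land in a different namespace. The StorageClass name, the namespace value and the provisioner string are all hypothetical; in particular, copy the real provisioner value from the dotmesh StorageClass in the shipped YAML rather than using the placeholder here.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: dotmesh-alice                 # hypothetical name for the extra class
provisioner: example.com/placeholder  # placeholder: copy from the shipped dotmesh StorageClass
parameters:
  dotmeshNamespace: alice             # hypothetical namespace for dots created via this class
```

PVCs that set storageClassName to this class would then default to the alice namespace instead of admin, without needing a per-claim annotation.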
A PersistentVolumeClaim using a Dotmesh StorageClass has the following structure:
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: example-pvc
  annotations:
    dotmeshNamespace: admin
    dotmeshName: example
    dotmeshSubdot: logging_db
spec:
  storageClassName: dotmesh
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
The interesting parts:
- spec.storageClassName must reference a suitably configured Dotmesh StorageClass, or Dotmesh won’t manage this volume.
- metadata.annotations.dotmeshNamespace can be used to override the namespace in which the dot is kept. The default is inherited from the StorageClass, or defaults to admin if none is specified there, so it needn’t be specified here unless you’re doing something strange.
- metadata.annotations.dotmeshName is the name of the dot. If you don’t specify it, then the name of the PVC (in this case, example-pvc) will be used.
- metadata.annotations.dotmeshSubdot is the name of the subdot to use. If left unspecified, then the default of __default__ will be used. Use __root__ to reference the root of the dot, or any other name to use a specific subdot.
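To see those defaults at work, here is a sketch of the smallest claim that should still be managed by Dotmesh, followed by a pod that mounts it. It assumes the default dotmesh StorageClass is installed; the PVC name, pod name, image and mount path are made up for the example.

```yaml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: webapp-data        # no dotmeshName annotation, so the dot is named webapp-data
  # no dotmeshNamespace annotation: inherited from the StorageClass, else admin
  # no dotmeshSubdot annotation: the __default__ subdot is used
spec:
  storageClassName: dotmesh
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
---
kind: Pod
apiVersion: v1
metadata:
  name: webapp             # hypothetical pod name
spec:
  containers:
    - name: app
      image: busybox
      command: ["sh", "-c", "sleep 3600"]
      volumeMounts:
        - name: data
          mountPath: /data # hypothetical mount path inside the container
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: webapp-data
```

Adding a dotmeshSubdot annotation of __root__ to the claim would instead expose the root of the dot at the mount path.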