Drive Dotmesh from your shell
Let’s take a look at the
dm command-line tool in more detail. This
is a reference guide - a tutorial is where to go if
you want a quick start guide. We’re also going to assume you’re
familiar with Dotmesh concepts here. As a reference, it’s
written so you can dive straight into the heading for the command you
want - but it’s also been laid out in a suitable order for reading
top-to-bottom, should you want to become a Dotmesh command-line expert
and impress your friends.
How to read the examples.
In the examples given in this guide, anything in
CAPITALS is
something you need to replace with your own text. For instance, if
you’re told to type
dm commit -m 'MESSAGE', that means you need to
put your own message in instead of
MESSAGE. Anything in
[square brackets] is optional; the text will describe the consequences of
missing it out. Lists of things separated by
| mean
that you get to choose one of them.
Examples look like this:
$ cat hello.txt # Text YOU type looks like this
Hello!
Text the computer echoes back to you looks like this.
If any bits of the output need calling out, we'll highlight them like this.
If you just type
dm on its own, it will give you basic command-line help.
dm uses the “subcommands” pattern, where one command provides
lots of different functions through different subcommands given on the
command line. For instance,
dm version will print out the versions
of Dotmesh components. Some subcommands have subcommands of their own (such as
dm dot delete to delete a Dot).
You can get further information about any command by typing
dm COMMAND --help. Running
dm on its own lists the available commands:
$ dm
dotmesh (dm) is like git for your data in Docker.

This is the client. Configure it to talk to a dotmesh cluster with 'dm remote
add'. Create a dotmesh cluster with 'dm cluster init'.

Usage:
  dm [command]

Available Commands:
  branch      List branches
  checkout    Switch or make branches
  clone       Make a complete copy of a remote dot
  cluster     Install a dotmesh server on a docker host, creating or joining a cluster
  commit      Record changes to a dot
  debug       Make API calls
  dot         Manage dots
  init        Create an empty dot
  list        Enumerate dots on the current remote
  log         Show commit logs
  pull        Pull new commits from a remote dot to a local copy of that dot
  push        Push new commits from the current dot and branch to a remote dot (creating it if necessary)
  remote      List remote clusters. Use dm remote -v to see remotes
  reset       Reset current HEAD to the specified state
  s3          Commands that handle S3 connections
  switch      Change which dot is active
  version     Show the Dotmesh version information

Flags:
  -c, --config string   Config file to use (default "~/.dotmesh/config")
      --verbose         Display details of RPC requests and responses to the dotmesh server

Use "dm [command] --help" for more information about a command.
For instance,
dm version --help:
$ dm version --help
Show the Dotmesh version information

Usage:
  dm version [flags]

Global Flags:
  -c, --config string   Config file to use (default "~/.dotmesh/config")
      --verbose         Display details of RPC requests and responses to the dotmesh server
The configuration file.
dm stores some local state in a config file. You never need to edit
this file -
dm will manage it for you. By default, it’s located at
$HOME/.dotmesh/config, but all
dm subcommands accept a
--config PATH flag, to make
dm use a different config file.
You can view the contents of the RPC requests between the
dm client and the
dotmesh server by using the
--verbose flag. It will print the contents of
the JSON request and response bodies to standard output.
Connecting to clusters.
dm communicates with Dotmesh clusters using the Dotmesh
API. In order to do anything interesting, it needs a
username, a hostname to connect to, and an API key to use. These login
details for a cluster are called a “remote”, and a list of remotes is
stored in the configuration file.
One of the remotes in the config file is marked as the “current
remote”. That’s the one
dm will use, until told otherwise.
If you create a local cluster using
dm cluster init, then the
username will always be
admin; the admin user is created by
cluster init and automatically saved under a remote called
local,
which is current to begin with, so
dm commands will just work out
of the box. But if you need to connect to an existing cluster, or you
want to use the
dm command directly against the Hub, you’re going to
need to add a remote yourself.
The following commands are for managing the list of remotes stored in your local configuration file.
Add a new remote:
dm remote add NAME USER@HOSTNAME[:PORT].
$ dm remote add test firstname.lastname@example.org
API key: Paste your API key here, it won't be echoed!
Remote added.
On a Kubernetes-based cluster, the admin API key is stored, base64-encoded, in the
dotmesh secret in the
dotmesh namespace:
$ kubectl get secret dotmesh -n dotmesh -o yaml
apiVersion: v1
data:
  dotmesh-admin-password.txt: Y29ycmVjdGhvcnNlYmF0dGVyeXN0YXBsZQo=
  dotmesh-api-key.txt: VlZLR1lDQzNHNEs1RzJRTTNHTElWVEVDVlNCV1dKWkQK
kind: Secret
metadata:
  creationTimestamp: 2018-01-17T15:03:11Z
  name: dotmesh
  namespace: dotmesh
  resourceVersion: "418"
  selfLink: /api/v1/namespaces/dotmesh/secrets/dotmesh
  uid: 88c31d8b-fb97-11e7-b1fe-0242cd52be10
type: Opaque
$ echo VlZLR1lDQzNHNEs1RzJRTTNHTElWVEVDVlNCV1dKWkQK | base64 -d
VVKGYCC3G4K5G2QM3GLIVTECVSBWWJZD
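The two manual steps above (read the secret, then decode one field) can be combined. The following is only a sketch: `api_key_from_secret` is a hypothetical helper, not part of `dm`, and it assumes the secret is printed as YAML with the `dotmesh-api-key.txt` field on a line of its own, as in the example, and that a `base64` supporting `-d` is available.

```shell
# Trimmed copy of the secret shown in the example; only the data fields matter here.
sample_secret='apiVersion: v1
data:
  dotmesh-admin-password.txt: Y29ycmVjdGhvcnNlYmF0dGVyeXN0YXBsZQo=
  dotmesh-api-key.txt: VlZLR1lDQzNHNEs1RzJRTTNHTElWVEVDVlNCV1dKWkQK
kind: Secret'

# api_key_from_secret (hypothetical helper): reads the secret's YAML on stdin
# and prints the decoded value of the dotmesh-api-key.txt field.
api_key_from_secret() {
  awk '$1 == "dotmesh-api-key.txt:" { print $2 }' | base64 -d
}

# Run against the sample data rather than a live cluster.
printf '%s\n' "$sample_secret" | api_key_from_secret
```

Against a real cluster, the same helper could be fed from `kubectl get secret dotmesh -n dotmesh -o yaml`.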
The admin API key from an existing Docker-based cluster created with
dm cluster init can be found from the Dotmesh config file where
cluster init was run, with the following command:
$ cat ~/.dotmesh/config | jq -r .Remotes.local.ApiKey
VVKGYCC3G4K5G2QM3GLIVTECVSBWWJZD
For S3 remotes, please see
dm s3 remote add.
List remotes:
dm remote -v.
$ dm remote -v
  hub email@example.com
  test firstname.lastname@example.org
* local email@example.com
All the remotes in the config file are listed, one per line. Each line
has the name of the remote, followed by the username and the hostname in
USER@HOSTNAME form. The API keys are not printed out.
Note that the current remote is marked with a
* at the start of the line.
Remove a remote:
dm remote rm NAME.
$ dm remote -v
  hub firstname.lastname@example.org
  test email@example.com
* local firstname.lastname@example.org
$ dm remote rm test
$ dm remote -v
  hub email@example.com
* local firstname.lastname@example.org
Select the current remote:
dm remote switch NAME.
$ dm remote -v
  hub email@example.com
* local firstname.lastname@example.org
$ dm remote switch hub
$ dm remote -v
* hub email@example.com
  local firstname.lastname@example.org
Compare client and remote versions:
dm version.
$ dm version
Current remote: local (use 'dm remote -v' to list and 'dm remote switch' to switch)
Client:
    Version: release-0.1.0
Server:
    Version: release-0.1.0
The current dot.
You often need to perform lots of operations on a single dot, so rather than specifying the name of the dot in every command, each remote in the config file has a “current dot”. That means that if you switch remotes, the current dot will change, and will change back if you return to the original remote. The current dot for each remote is stored in the configuration file.
List the available dots:
dm list [-H|--scripting].
$ dm list
Current remote: local (use 'dm remote -v' to list and 'dm remote switch' to switch)

  DOT             BRANCH  SERVER            CONTAINERS  SIZE       COMMITS  DIRTY
* important_data  master  504954d09db78174              19.00 kiB  0        19.00 kiB
  test_data       master  504954d09db78174              19.00 kiB  0        19.00 kiB
Note that the current dot is marked with a
* at the start of the line. The fields are:
- The dot name.
- The currently selected branch on that dot.
- The ID of the server that’s currently managing that dot.
- The names of any containers currently using this dot.
- The size of the dot.
- How many commits have been made on this branch of the dot.
- How much data has been generated or modified since the last commit.
If you’re writing a script, you can also obtain this information in a more parseable format (without headings or prettification of numbers, and with a single tab between each field) using
dm list -H or
dm list --scripting - but it might be easier to use the API if you’re doing anything more complicated.
$ dm list -H
important_data master 504954d09db78174 19456 0 19456
test_data master 504954d09db78174 19456 0 19456
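As a rough illustration of consuming the scripting format, the sketch below picks out dots that have uncommitted data. It assumes the tab-separated fields come in the same order as the headings of the normal listing (DOT, BRANCH, SERVER, CONTAINERS, SIZE, COMMITS, DIRTY), with an empty CONTAINERS field when no container is using the dot. `dirty_dots` is a hypothetical helper, not part of `dm`.

```shell
# Sample `dm list -H` output from the example, rebuilt with a single tab
# between fields; the CONTAINERS field (4th) is empty for both dots.
sample_list=$(printf '%s\t%s\t%s\t%s\t%s\t%s\t%s\n' \
  important_data master 504954d09db78174 '' 19456 0 19456 \
  test_data master 504954d09db78174 '' 19456 0 19456)

# dirty_dots (hypothetical helper): reads scripting-format lines on stdin and
# prints "NAME<TAB>DIRTY_BYTES" for every dot whose 7th field is above zero.
dirty_dots() {
  awk -F '\t' '$7 > 0 { print $1 "\t" $7 }'
}

# Run against the sample data rather than a live cluster.
printf '%s\n' "$sample_list" | dirty_dots
```

With a running cluster, the input would come from `dm list -H` instead of the sample variable.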
Select a different current dot:
dm switch DOT.
Remember, each remote has a different list of dots - so the current dot is particular to each remote.
$ dm list
Current remote: local (use 'dm remote -v' to list and 'dm remote switch' to switch)

  DOT             BRANCH  SERVER            CONTAINERS  SIZE       COMMITS  DIRTY
* important_data  master  504954d09db78174              19.00 kiB  0        19.00 kiB
  test_data       master  504954d09db78174              19.00 kiB  0        19.00 kiB
$ dm switch test_data
$ dm list
Current remote: local (use 'dm remote -v' to list and 'dm remote switch' to switch)

  DOT             BRANCH  SERVER            CONTAINERS  SIZE       COMMITS  DIRTY
  important_data  master  504954d09db78174              19.00 kiB  0        19.00 kiB
* test_data       master  504954d09db78174              19.00 kiB  0        19.00 kiB
Create an empty dot:
dm init DOT.
$ dm init staging_data
$ dm list
Current remote: local (use 'dm remote -v' to list and 'dm remote switch' to switch)

  DOT             BRANCH  SERVER            CONTAINERS  SIZE       COMMITS  DIRTY
  important_data  master  504954d09db78174              19.00 kiB  0        19.00 kiB
* staging_data    master  504954d09db78174              19.00 kiB  0        19.00 kiB
  test_data       master  504954d09db78174              19.00 kiB  0        19.00 kiB
A newly created dot has no subdots, but it starts off with a small amount of “dirty” data because basic filesystem metadata has been created.
Delete a dot:
dm dot delete [-f|--force] DOT.
You will be prompted for confirmation, unless you specify the
-f or
--force flag.
$ dm dot delete staging_data
Please confirm that you really want to delete the dot staging_data, including all branches and commits? (enter Y to continue): Y
Examine a dot:
dm dot show [-H|--scripting] DOT.
$ dm dot show test_data
Dot admin/test_data:
Master branch ID: e05cf6bf-46b9-4e34-6e08-01bc9f323a72
Dot is current.
Dot size: 19.00 kiB (19.00 kiB dirty)
Branches:
* master
Tracks dot alaric/test_data on remote hub
The results show:
- The full name of the dot, including a namespace.
- The master branch ID, which isn’t something you generally need when using the command line, but is useful for debugging your API apps.
- If this dot is the current dot, it will display
Dot is current.
- The size of the dot, and the amount of generated/modified “dirty” data since the last commit.
- The list of all the branches of the dot, with the current branch marked with a
*.
- The default upstream dot on each remote that has one configured for this dot.
You can get all that data in a form more amenable to scripting with the
--scripting flag:
$ dm dot show --scripting test_data
namespace admin
name test_data
masterBranchId e05cf6bf-46b9-4e34-6e08-01bc9f323a72
current
size 19456
dirty 19456
currentBranch master
branch master
defaultUpstreamDot hub alaric/test_data
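A script might pull a single value out of that output, for example to check for uncommitted data before a reset. This is a sketch under assumptions: it supposes each record is on its own line with the key first and whitespace before the value, which the flattened example suggests but does not confirm. `dot_field` is a hypothetical helper, not part of `dm`.

```shell
# Sample `dm dot show --scripting test_data` output from the example,
# assuming one record per line.
sample_show='namespace admin
name test_data
masterBranchId e05cf6bf-46b9-4e34-6e08-01bc9f323a72
current
size 19456
dirty 19456
currentBranch master
branch master
defaultUpstreamDot hub alaric/test_data'

# dot_field KEY (hypothetical helper): reads scripting output on stdin and
# prints whatever follows KEY on each line whose first word is exactly KEY.
dot_field() {
  awk -v key="$1" '$1 == key { $1 = ""; sub(/^[ \t]+/, ""); print }'
}

# Run against the sample data rather than a live cluster.
printf '%s\n' "$sample_show" | dot_field dirty
printf '%s\n' "$sample_show" | dot_field defaultUpstreamDot
```

Matching the first word exactly keeps `branch` and `currentBranch` apart; a key that appears on several lines, such as `branch`, yields one output line per occurrence.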
When you clone a dot from the Hub or another cluster,
dm stores the
association between your local dot and the original remote dot in the
configuration file.
Likewise, if you push a dot to another cluster, or pull updates to it
from another cluster,
dm will remember that association if it didn’t
already have one for that remote.
Each dot may have a “default upstream dot” for each remote in your configuration. There can’t be two default upstreams of a dot on any remote, but there might be none!
The list of upstream dots for a dot can be viewed with
dm dot show
DOT. Upstream dots may be assigned or re-assigned with
dm dot set-upstream [DOT] REMOTE REMOTE-DOT.
These commands can be a little confusing, because they involve two
remotes at once. There is always a current remote selected with
dm remote switch that is the “target” of your commands; that’s the
“local cluster” from the perspective of these commands. The command
line for transfer commands always names a second remote, which is the
“remote cluster” we are transferring dots to and from.
dm clone [--local-name LOCAL-DOT] REMOTE DOT [BRANCH]
In this example, we’ll clone the dot
alice/testing_data from the
Hub, and call it
new_data.
$ dm clone --local-name new_data hub alice/testing_data
Pulling admin/new_data from hub:alice/testing_data
Calculating...
finished 9.50 KB / 9.50 KB [==========================] 100.00% 0.43 MiB/s (1/1)
Done!
If we run
dm dot show on
new_data, we’ll see that
alice/testing_data is the default upstream dot for it on
hub:
$ dm dot show new_data
Dot admin/new_data:
Master branch ID: c78bb46e-0d52-43e9-70bc-f2b78ace0f9d
Dot size: 19.00 kiB (all clean)
Branches:
* master
Tracks dot alice/testing_data on remote hub
If you omit the
--local-name LOCAL-DOT part, then the dot will just
have the same name as the remote one - in this case,
testing_data.
If you are cloning an S3 bucket and only want to select a subset of the files, please see
dm s3 clone-subset.
dm pull REMOTE [DOT [BRANCH]] [--remote-name REMOTE-DOT]
This command pulls new commits and branches from a remote dot into your local cluster.
If you only specify a
REMOTE name, then it will attempt to pull
updates to all branches of the current dot from that remote. If you specify
--remote-name REMOTE-DOT, it will pull from
REMOTE-DOT on the remote cluster. If not, and there is a default
upstream dot for that remote, it will pull from that dot. Otherwise,
it will pull from a dot with the same name as the current dot on your
local cluster, in the namespace corresponding to your username on the
remote cluster (eg, your Hub username).
If you specify a
REMOTE name and a
DOT name, then it will perform
the same steps, but with the local dot being the one named rather than
the current dot.
If you specify a
REMOTE name, a
DOT name and a
BRANCH, then it
will only pull new commits on the named branch, rather than trying to
pull commits for every branch.
$ dm pull hub
Pulling admin/new_data from hub:alice/testing_data
Calculating...
finished 9.50 KB / 9.50 KB [==========================] 100.00% 0.43 MiB/s (1/1)
Done!
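The order in which the remote dot is chosen for a pull can be summarised in a few lines of shell. This is purely an illustration of the rules described above, not how `dm` is implemented: `resolve_remote_dot` and its four arguments are hypothetical, and an empty string stands for "not given" or "not configured".

```shell
# resolve_remote_dot REMOTE_NAME_FLAG DEFAULT_UPSTREAM REMOTE_USER LOCAL_DOT
# (hypothetical helper) prints the remote dot a pull would use:
#   1. the --remote-name value, if one was given;
#   2. otherwise the default upstream dot for that remote, if configured;
#   3. otherwise the local dot's name in the remote user's namespace.
resolve_remote_dot() {
  remote_name_flag=$1
  default_upstream=$2
  remote_user=$3
  local_dot=$4
  if [ -n "$remote_name_flag" ]; then
    printf '%s\n' "$remote_name_flag"
  elif [ -n "$default_upstream" ]; then
    printf '%s\n' "$default_upstream"
  else
    printf '%s/%s\n' "$remote_user" "$local_dot"
  fi
}

# No flag, but a default upstream is configured (as in the clone example).
resolve_remote_dot "" "alice/testing_data" "alice" "new_data"
# No flag and no default upstream: fall back to the user's namespace.
resolve_remote_dot "" "" "alice" "new_data"
```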
dm push REMOTE [--remote-name DOT]
This command pushes the current branch of the current dot to the
REMOTE. If the destination dot already exists, local
commits that aren’t present in the destination will be pushed up,
bringing it up to date. If the destination does not already exist, it
will be created and all the commits on the current branch (and other
branches that the current branch depends upon) will be pushed up.
If
--remote-name is specified, then that is the name of the
destination dot on the remote cluster. Otherwise, if the current dot
has a default upstream dot for that remote, that will be the
destination dot. If not, the destination dot name will be the same as
the current dot’s name, but in your user’s namespace on the remote.
$ dm push hub
Pushing admin/new_data to hub:alice/testing_data
Calculating...
finished 9.50 KB / 9.50 KB [==========================] 100.00% 0.38 MiB/s (1/1)
Done!
Set the upstream dot:
dm dot set-upstream [DOT] REMOTE REMOTE-DOT
You can set the upstream dot for any given remote using this
command. If you omit the
DOT, then the current dot is used.
$ dm dot set-upstream new_data production bob/test_data
$ dm dot show new_data
Dot admin/new_data:
Master branch ID: c78bb46e-0d52-43e9-70bc-f2b78ace0f9d
Dot size: 19.00 kiB (all clean)
Branches:
* master
Tracks dot alice/testing_data on remote hub
Tracks dot bob/test_data on remote production
These commands deal with the contents of a dot: branches and commits.
dm commit -m 'MESSAGE' [--metadata fieldname=value].
This command takes the “dirty” changes to the current dot since the
last commit (or the creation of the dot), and makes them into a new
commit with the given
MESSAGE.
$ dm commit -m "A well-written commit message"
You can also pass extra metadata fields that are added to the commit
by using the
--metadata flag. You can pass multiple metadata fields,
each using the format
fieldname=value:
$ dm commit -m "A well-written commit message" --metadata fruit=apples --metadata color=red
This command lists the commits on the current branch.
$ dm log
commit c96eefda-6940-499a-411c-22521f4a3452
author: admin
date: 1516898188388491967
fruit: apples
color: red

A well-written commit message

commit e568407c-5ea3-42bc-48e8-6e375c121d2b
author: admin
date: 1516898511693726664
fruit: apples
color: red

A poorly-written commit message
Note the commit IDs (highlighted) - they are needed to do a
dm reset.
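If you need those IDs in a script, they can be filtered out of the log text. The sketch below assumes the layout shown in the example, where each commit starts with a line consisting of the word `commit` followed by the ID; `commit_ids` is a hypothetical helper, not part of `dm`.

```shell
# Sample `dm log` output from the example, assuming one field per line.
sample_log='commit c96eefda-6940-499a-411c-22521f4a3452
author: admin
date: 1516898188388491967
fruit: apples
color: red

A well-written commit message

commit e568407c-5ea3-42bc-48e8-6e375c121d2b
author: admin
date: 1516898511693726664
fruit: apples
color: red

A poorly-written commit message'

# commit_ids (hypothetical helper): reads log text on stdin and prints one
# commit ID per line. Requiring exactly two words and a hex-and-hyphen ID
# keeps most message lines that happen to start with "commit" out.
commit_ids() {
  awk '$1 == "commit" && NF == 2 && $2 ~ /^[0-9a-f-]+$/ { print $2 }'
}

# Run against the sample data rather than a live cluster.
printf '%s\n' "$sample_log" | commit_ids
```

An ID selected this way could then be passed to `dm reset`.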
List the branches:
This command lists the branches in the current dot.
$ dm branch
  version_1
* master
Note how the current branch is indicated with a leading
*.
Create a branch:
dm checkout -b BRANCH.
This command creates a new branch, starting with the current branch, and makes the new branch current.
$ dm checkout -b version_2
dm checkout BRANCH.
This command makes a different branch current. If there are running containers using this dot that haven’t been pinned to a specific branch, they will be stopped before the change and restarted afterwards, using the new branch.
$ dm checkout version_1
Roll back commits:
dm reset [--hard] COMMIT.
This command rolls back the state of the current branch to a given
commit ID (which must be from this branch!). To get the commit IDs, use
dm log.
The command won’t let you roll back if there are uncommitted changes,
unless you specify
--hard to override it.
$ dm reset c96eefda-6940-499a-411c-22521f4a3452
These commands are for managing a Dotmesh cluster built using Docker. If you’re using Kubernetes, you don’t need these commands - the Dotmesh Kubernetes integration handles all of this for you!
Create a cluster:
dm cluster init [--port PORTNUM].
This command creates a new single-node Dotmesh cluster. You can force the cluster to be exposed on a specific port by specifying the port flag.
If a ZFS pool called
pool already exists, it will be used for Dot
storage. Otherwise, Dotmesh will default to creating a pool based on a
file in
/var/lib/dotmesh. The file will be ten gibibytes in size.
The newly-created cluster will be automatically configured as a remote called
local in your
dm configuration file.
$ dm cluster init
Checking suitable Docker is installed... assuming post-semver Docker client is sufficient.
assuming post-semver Docker server is sufficient.
Checking dotmesh isn't running... done.
Pulling dotmesh-server docker image... done.
Registering new cluster... got URL:
https://discovery.dotmesh.io/da045bfb125bb69f7f55902ed0409494
Generating PKI assets... done.
If you want more than one node in your cluster, run this on other nodes:
dm cluster join https://discovery.dotmesh.io/da045bfb125bb69f7f55902ed0409494:DYJNVRS2PNJVBTQ44P3KVAC7LWKV325X
This is the last time this secret will be printed, so keep it safe!
Guessing docker host's IPv4 address (should be routable from other cluster nodes)... got: 192.168.1.34,192.168.1.33,10.192.0.1,172.18.0.1,172.19.0.1.
Guessing unique name for docker host (using hostname, must be unique wrt other cluster nodes)... got: nixos.
Starting etcd... done.
Succeeded setting initial admin password to 'UMY5XI6WFMHKAMNO2HGWGN3MHQ74KMUH' - writing it to /home/alaric/.dotmesh/admin-password.txt
Configuring dm CLI to authenticate to dotmesh server /home/alaric/.dotmesh/config... done.
Starting dotmesh server... done.
Waiting for dotmesh server to come up... done.
Note the join command, highlighted in the example above. Keep a copy of that - you can’t get it again, and you’ll need it if you want to add any more nodes to your cluster.
Join a cluster:
dm cluster join DISCOVERY-URL.
This command sets up a Dotmesh node on your computer, and joins it
into an existing cluster using the
DISCOVERY-URL printed out when
the original cluster was created.
If you specify a pool
PATH, then files will be created in the
directory pointed at by
PATH to store the actual dots.
If, instead, you specify a
ZFSPOOL, then the dots will be stored in
the ZFS pool with that name, which you must have created yourself. Use
this option if you have dedicated disk partitions for Dotmesh to use.
If you specify neither, then Dotmesh will default to creating a pool based on a file in
/var/lib/dotmesh.
The cluster will be automatically configured as a remote called
local in your
dm configuration file.
$ dm cluster join https://discovery.dotmesh.io/1e52c023dfaa2f9e812ec7835bdd0540:OWSWZGRMUCBT5FFFD5NJIVCP5QQSQXVH
Checking suitable Docker is installed... yes, got 1.12.6.
Checking dotmesh isn't running... done.
Pulling dotmesh-server docker image... done.
Downloading PKI assets... done!
Guessing docker host's IPv4 address (should be routable from other cluster nodes)... got: 10.192.0.2.
Guessing unique name for docker host (using hostname, must be unique wrt other cluster nodes)... got: cluster-1516891762883170057-0-node-0.
Starting etcd... done.
Succeeded getting initial admin API key 'E3M6NJBGEEIWEKSPH7E4XLQAKQBQPBAB'
Configuring dm CLI to authenticate to dotmesh server /root/.dotmesh/config... done.
Starting dotmesh server... done.
Waiting for dotmesh server to come up.... done.
Upgrade your node:
dm cluster upgrade.
This command stops the Dotmesh server on the current node, downloads
the Dotmesh server Docker image corresponding to the version of the
dm client you’re using, and starts it up. You would normally upgrade
Dotmesh on your node by downloading a new
dm client binary and running
dm cluster upgrade with it. You can use
dm version to
check the client and server versions (make sure you’ve selected the
right remote).
$ dm cluster upgrade
Pulling dotmesh-server docker image... done.
Stopping dotmesh-server...done.
Stopping dotmesh-server-inner...done.
Starting dotmesh server... done.
Remove Dotmesh from your node:
dm cluster reset.
This command stops the Dotmesh server on the current node, then deletes its resources. It doesn’t delete the Dot data itself, but it does destroy the local copy of the Dot metadata!
$ dm cluster reset
Destroying all dotmesh data... done.
Deleting dotmesh-etcd container... done.
Deleting dotmesh-server containers... done.
Deleting dotmesh-server-inner containers... done.
Deleting dotmesh socket... done.
Deleting dotmesh-etcd-data local volume... done.
Deleting dotmesh-kernel-modules local volume... done.
Deleting 'local' remote... done.
Deleting cached PKI assets... done.
Add a new S3 remote:
dm s3 remote add NAME ACCESS_KEY:SECRET_KEY[@HOST:PORT].
$ dm s3 remote add test access_key:secret
S3 remote added.
Invoking this command will check that Dotmesh is able to list buckets using the access key and secret supplied - if it cannot connect it will fail with an appropriate error.
You can then manage S3 buckets using
dm clone,
dm push and
dm pull as if they were Dotmesh servers, but you will not be able to make an S3 remote your current remote. You can also clone a subset of an S3 bucket using
dm s3 clone-subset.
It is recommended that you enable versioning on your S3 bucket in order for Dotmesh to be able to discern changes easily.
Clone a section of an S3 bucket:
dm s3 clone-subset REMOTE BUCKET PREFIXES [--local-name LOCAL-DOT].
This command will clone only a selection of files from an S3 bucket, as dictated by PREFIXES.
$ dm s3 clone-subset --local-name new_data s3 test directory_1/
Pulling admin/new_data from s3:/test
Calculating...
finished 9.50 KB / 9.50 KB [==========================] 100.00% 0.43 MiB/s (1/1)
Done!
You can also use multiple prefixes, separating them by a comma:
$ dm s3 clone-subset --local-name new_data s3 test directory_1/,hello-
Pulling admin/new_data from s3:/test
Calculating...
finished 9.50 KB / 9.50 KB [==========================] 100.00% 0.43 MiB/s (2/2)
Done!
When pulling or pushing a volume cloned in this way, only files which begin with these prefixes will be updated.
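To make the prefix rule concrete, here is a small sketch of how a comma-separated PREFIXES value selects files. It only illustrates "begins with one of these prefixes" as described above; it is not how Dotmesh itself performs the match, and `matches_prefixes` and the file names used are hypothetical.

```shell
# matches_prefixes KEY PREFIXES (hypothetical helper): succeeds when KEY
# begins with at least one of the comma-separated PREFIXES.
matches_prefixes() {
  key=$1
  rest=$2
  while [ -n "$rest" ]; do
    prefix=${rest%%,*}              # first remaining prefix
    case $key in
      "$prefix"*) return 0 ;;       # quoted, so the prefix is matched literally
    esac
    case $rest in
      *,*) rest=${rest#*,} ;;       # drop the prefix just tested
      *)   rest= ;;                 # that was the last one
    esac
  done
  return 1
}

# Hypothetical file names checked against the prefixes from the example.
for key in directory_1/report.csv hello-world.txt other/notes.txt; do
  if matches_prefixes "$key" "directory_1/,hello-"; then
    echo "$key: included"
  else
    echo "$key: skipped"
  fi
done
```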