Published: Wed 10 October 2018
Updated: Wed 10 October 2018
By Bharat Kunwar
Stig Telfer
In Data.
tags: beegfs deployment data baremetal cluster
BeeGFS is a parallel file system suitable for High Performance
Computing, with a proven track record in the scalable storage space.
In this article, we explore how the different components of BeeGFS
fit together and how we have incorporated them into an Ansible role
for a seamless storage cluster deployment experience.
We've previously described ways of integrating OpenStack and
High-Performance Data. In this post we'll focus on some practical
details of how to dynamically provision BeeGFS filesystems and/or
clients running in cloud environments. There are actually no
dependencies on OpenStack here, although we do like to draw our
Ansible inventory from Cluster-as-a-Service infrastructure.
As described here, BeeGFS comprises several components, which will be
familiar concepts to those working in the parallel file system space:
Management service: for registering and watching all other services
Storage service: for storing the distributed file contents
Metadata service: for storing access permissions and striping info
Client service: for mounting the file system to access stored data
Admon service (optional): for presenting administration and
monitoring options through a graphical user interface.
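Each of these components runs as a systemd service once deployed. As a
quick sanity check after deployment, their status can be queried on the
relevant hosts (a minimal sketch, assuming the standard service names
from the upstream BeeGFS packages):
# on the management host
systemctl status beegfs-mgmtd
# on metadata and storage hosts
systemctl status beegfs-meta beegfs-storage
# on the clients
systemctl status beegfs-client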
Introducing our Ansible role for BeeGFS...
We have an Ansible role published on Ansible Galaxy which handles the
end-to-end deployment of BeeGFS. It takes care of details all the way
from deployment of management, storage and metadata servers to setting
up client nodes and mounting the storage point. To install, simply run:
ansible-galaxy install stackhpc.beegfs
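Alternatively, if you manage roles through a requirements file, a
minimal requirements.yml along these lines should work:
---
- src: stackhpc.beegfs
followed by:
ansible-galaxy install -r requirements.yml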
There is a README
that describes the role parameters and example usage.
An Ansible inventory is organised into groups, each representing a
different role within the filesystem (or its clients). An example
inventory-beegfs file with two hosts bgfs1 and bgfs2 may
look like this:
[leader]
bgfs1 ansible_host=172.16.1.1 ansible_user=centos

[follower]
bgfs2 ansible_host=172.16.1.2 ansible_user=centos

[cluster:children]
leader
follower

[cluster_beegfs_mgmt:children]
leader

[cluster_beegfs_mds:children]
leader

[cluster_beegfs_oss:children]
leader
follower

[cluster_beegfs_client:children]
leader
follower
Through controlling the membership of each inventory group, it is
possible to create a variety of configurations: for example,
client-only deployments, server-only deployments, or hyperconverged
use cases in which the filesystem servers are also the clients
(as above).
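For instance, a disaggregated variant might keep the management,
metadata and storage services on the bgfs hosts but mount the
filesystem from a separate set of compute hosts (the compute host
names and addresses below are purely illustrative):
[cluster_beegfs_mgmt]
bgfs1 ansible_host=172.16.1.1 ansible_user=centos

[cluster_beegfs_mds]
bgfs1

[cluster_beegfs_oss]
bgfs1
bgfs2 ansible_host=172.16.1.2 ansible_user=centos

[cluster_beegfs_client]
compute1 ansible_host=172.16.1.10 ansible_user=centos
compute2 ansible_host=172.16.1.11 ansible_user=centos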
A minimal Ansible playbook, which we shall refer to as beegfs.yml, to
configure the cluster may look something like this:
---
- hosts:
    - cluster_beegfs_mgmt
    - cluster_beegfs_mds
    - cluster_beegfs_oss
    - cluster_beegfs_client
  roles:
    - role: stackhpc.beegfs
      beegfs_state: present
      beegfs_enable:
        mgmt: "{{ inventory_hostname in groups['cluster_beegfs_mgmt'] }}"
        oss: "{{ inventory_hostname in groups['cluster_beegfs_oss'] }}"
        meta: "{{ inventory_hostname in groups['cluster_beegfs_mds'] }}"
        client: "{{ inventory_hostname in groups['cluster_beegfs_client'] }}"
        admon: no
      beegfs_mgmt_host: "{{ groups['cluster_beegfs_mgmt'] | first }}"
      beegfs_oss:
        - dev: "/dev/sdb"
          port: 8003
        - dev: "/dev/sdc"
          port: 8103
        - dev: "/dev/sdd"
          port: 8203
      beegfs_client:
        path: "/mnt/beegfs"
        port: 8004
      beegfs_interfaces:
        - "ib0"
      beegfs_fstype: "xfs"
      beegfs_force_format: no
      beegfs_rdma: yes
...
To create a BeeGFS cluster spanning the two nodes defined in the
inventory, run the playbook. The same playbook handles both the setup
and the teardown of the BeeGFS storage cluster components, depending
on whether the beegfs_state flag is set to present or absent:
# build cluster
ansible-playbook beegfs.yml -i inventory-beegfs -e beegfs_state=present
# teardown cluster
ansible-playbook beegfs.yml -i inventory-beegfs -e beegfs_state=absent
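After the build, it is worth checking that all services have registered
with the management service. One way to do this (a hedged sketch using
the standard BeeGFS command-line tools, run from any client node) is:
# list the metadata and storage nodes registered with the management service
beegfs-ctl --listnodes --nodetype=meta
beegfs-ctl --listnodes --nodetype=storage
# display the connections this client has established
beegfs-net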
The playbook is designed to fail if the path specified for the BeeGFS
storage service under beegfs_oss is already being used by another
service. To override this behaviour, pass the extra option -e
beegfs_force_format=yes. Be warned that this will cause data loss: it
formats the disk if a block device is specified, and also erases
management and metadata server data if there is an existing BeeGFS
deployment.
Highlights of the Ansible role for BeeGFS:
The idempotent role will leave state unchanged if the configuration
has not changed compared to the previous deployment.
The tuning parameters recommended by the BeeGFS maintainers themselves
for optimal storage server performance are set automatically.
The role can be used to deploy both storage-as-a-service and
hyperconverged
architecture by the nature of how roles are ascribed to hosts in
the Ansible inventory. For example, the hyperconverged case would
have storage and client services running on the same nodes while
in the disaggregated case, the clients are not aware of storage
servers.
Other things we learnt along the way:
BeeGFS is sensitive to hostnames: it expects them to be consistent and
permanent, and services refuse to start if the hostname changes. This
is worth being mindful of during the initial setup.
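Pinning a static hostname up front avoids this; for example (a trivial
sketch using the standard systemd tooling):
# set a static hostname that persists across reboots
hostnamectl set-hostname bgfs1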
This is unrelated to BeeGFS specifically, but under instruction from
Dell we had to set the -K flag when formatting NVMe devices, to
prevent mkfs from discarding blocks. Otherwise the disk would
disappear with the following error message:
[ 7926.276759] nvme nvme3: Removing after probe failure status: -19
[ 7926.349051] nvme3n1: detected capacity change from 3200631791616 to 0
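For reference, the manual equivalent (assuming XFS, as selected by
beegfs_fstype above; the device path is illustrative) looks like this:
# -K: do not discard blocks at mkfs time
mkfs.xfs -K /dev/nvme0n1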
Looking Ahead
The simplicity of BeeGFS deployment and configuration makes it a
great fit for automated cloud-native deployments. We have seen
a lot of potential in the performance of BeeGFS, and we hope to be
publishing more details from our tests in a future post.
We are also investigating the current state of Kubernetes integration,
using the emerging CSI driver API to support the attachment of
BeeGFS filesystems to Kubernetes-orchestrated containerised workloads.
Watch this space!
In the meantime, if you would like to get in touch, we would love to
hear from you. Reach out to us via Twitter or directly via our
contact page.