Projects are an excellent way to separate OpenStack resources, but out of the
box, that separation is logical rather than physical. Virtual machines will
schedule wherever they fit, provisioning on any hypervisor that offers up the
correct combination of resources and traits for their given flavor. But what if
you want to tightly control the compute resources of different projects?
Perhaps a lower-priority project should squash its VMs onto hypervisors with
much higher overcommit ratios. Maybe important projects always need to have
compute resources available. What you need is a way of isolating projects
to groups of hypervisors.
Nova configuration
For this guide, we're going to assume we have a project called
top-priority, and we'll call the aggregate of private hypervisors
private-hosts.
Before making changes at the OpenStack level, set the following in your
nova.conf file for the Nova Scheduler:
[scheduler]
limit_tenants_to_placement_aggregate = True
enable_isolated_aggregate_filtering = True
According to the Nova documentation,
limit_tenants_to_placement_aggregate lets the scheduler restrict tenants to
specific placement aggregates. Now that's a lot of big words, so let's break it
down a bit further. When a user creates a server instance in OpenStack, the
scheduler decides which hypervisor host it should be allocated to. The
scheduler is almost endlessly configurable and understanding exactly how it
works is a rabbit hole we'll try to avoid for now. There are only really three
things you need to know:
- Placement aggregates are logical groups of resource providers. In this case,
it's going to be the private-hosts group that we want to be isolating.
- "Tenant" is just an old term for "Project" in OpenStack. So what that option
really does is just let us limit projects to only use certain
hypervisors.
- That's only half the story though. While this first option makes sure
top-priority VMs stay on the private-hosts, it does nothing to block
other projects from also putting their VMs on private-hosts. For that,
we need our second option.
Again, according to the Nova documentation,
enable_isolated_aggregate_filtering allows the scheduler to restrict hosts
in aggregates based on matching required traits in the aggregate metadata and
the instance flavor/image. What this means is that we can have a set of
required traits on our aggregate, so it will block any images or flavors
without those traits. This might not sound very helpful at first, but OpenStack
images and flavors can be private to a project. If we create a set of private
flavors with some unique traits, we can tell the private-hosts to only
allow those flavors in. And voila! We can now exclusively isolate projects to
hypervisors!
OpenStack configuration
Now that we know it's all possible in theory, we just need to put it into
practice.
It's worth noting that administrator access is assumed for all the OpenStack
commands below. We'll also assume you're already using a Python virtual
environment with regular OpenStack CLI access configured.
Step 0 is to install the placement client. It's a separate package to the
normal OpenStack client:
pip install osc-placement
Placement used to be part of Nova, but has grown to be its own OpenStack
service for tracking resource provider inventories and usage.
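If the plugin installed correctly, the extra placement commands should appear in the regular client. Listing the resource providers is a quick sanity check (assuming your CLI credentials already point at the right cloud):
openstack resource provider list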
Step 0.1 is to take note of the hypervisors we want to isolate (the list below
shows both the hostname, which the Nova aggregate commands use, and the UUID,
which the placement commands use later on), along with the ID of the
top-priority project.
openstack --os-compute-api-version=2.53 hypervisor list
openstack project show top-priority -c name -c id
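If you'd rather not copy UUIDs around by hand, the project ID can be stashed in a shell variable and substituted into the commands below. This is just a convenience sketch; the variable name is arbitrary:
TOP_PRIORITY_ID=$(openstack project show top-priority -f value -c id)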
The first real step is to create that private-hosts
aggregate, and add the hosts we want to isolate:
openstack aggregate create private-hosts
openstack aggregate add host private-hosts <hypervisor1 hostname>
openstack aggregate add host private-hosts <hypervisor2 hostname>
...
It's worth noting that aggregates in Nova are not the same as aggregates in
Placement! That being said, since the Rocky release the nova-api service
will attempt to keep them in sync. When an administrator adds or removes a host
to/from a Nova host aggregate, the change should be mirrored in placement. If
they do get out of sync, there's a nova-manage command
to bring them back together manually.
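For reference, that command is run from a host where nova-manage and the Nova configuration are available:
nova-manage placement sync_aggregates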
Limiting the project to that aggregate is as easy as setting a single property
on the aggregate:
openstack aggregate set --property filter_tenant_id=<top-priority UUID> private-hosts
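To confirm the property stuck, inspect the aggregate; the properties column should now include the filter_tenant_id entry next to the hosts added earlier:
openstack aggregate show private-hosts -c name -c hosts -c properties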
It's worth noting that if the top-priority project spawns a sequel, it is
very easy to put both projects on the same set of hosts. The
filter_tenant_id key can be suffixed with any string to add multiple
projects to the same aggregate, such as filter_tenant_id2=<top-priority-2
UUID>.
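As a concrete (and purely hypothetical) example, pinning a sequel project to the same aggregate is just another property:
openstack aggregate set --property filter_tenant_id2=<top-priority-2 UUID> private-hosts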
Now to block any other projects from using those hosts, we need a custom trait.
Custom traits must always start with CUSTOM_ so in this case we'll call it
CUSTOM_PRIVATE_HOSTS.
openstack --os-placement-api-version 1.6 trait create CUSTOM_PRIVATE_HOSTS
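A quick way to double-check it exists is to filter the full trait list for custom entries:
openstack --os-placement-api-version 1.6 trait list -f value | grep ^CUSTOM_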
For this example, we'll put the trait on a test flavor. In production, it can
be slapped on any combination of images or flavors you want. The trait just
needs to be present somewhere on the VM for it to be scheduled properly.
openstack flavor create --vcpus 2 --ram 8192 --disk 30 --private --project top-priority --property trait:CUSTOM_PRIVATE_HOSTS=required private-flavor
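It's worth verifying that the trait property actually landed on the flavor (and, since the flavor is private, that it only shows up in flavor list from within top-priority):
openstack flavor show private-flavor -c name -c properties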
Now for the awkward bit. The trait needs to be added to the aggregate and
every hypervisor individually. Crazy, I know, but that's just the way it is.
Be careful setting traits on hypervisors using the CLI: the trait set command
overwrites everything previously set. Standard traits will just be re-discovered
automatically, but anything custom needs to be explicitly added again. The
example below uses some bash trickery to get around this and simply appends the
new trait:
# Apply the trait to each hypervisor
traits=$(openstack --os-placement-api-version 1.6 resource provider trait list -f value <hypervisor UUID> | sed 's/^/--trait /')
openstack --os-placement-api-version 1.6 resource provider trait set $traits --trait CUSTOM_PRIVATE_HOSTS <hypervisor UUID>
# Apply the trait to the aggregate
openstack --os-compute-api-version 2.53 aggregate set --property trait:CUSTOM_PRIVATE_HOSTS=required private-hosts
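Since every hypervisor in the aggregate needs the same treatment, it can help to wrap the two resource provider commands in a loop. This is just a sketch using the same placeholder UUIDs as above:
# Append CUSTOM_PRIVATE_HOSTS to each hypervisor's existing traits
for uuid in <hypervisor1 UUID> <hypervisor2 UUID>; do
    traits=$(openstack --os-placement-api-version 1.6 resource provider trait list -f value $uuid | sed 's/^/--trait /')
    openstack --os-placement-api-version 1.6 resource provider trait set $traits --trait CUSTOM_PRIVATE_HOSTS $uuid
done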
Hypervisors do have the alternative option of using a provider config file.
This can be a much more manageable solution for larger deployments since the
configuration can be version-controlled and deployment tools such as Kolla
Ansible
allow the configuration to be deployed to groups of hosts en masse. Below is a
very simple example of a provider.yml file which would also set the custom
trait.
meta:
  schema_version: '1.0'
providers:
  - identification:
      name: $COMPUTE_NODE
    traits:
      additional:
        - 'CUSTOM_PRIVATE_HOSTS'
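By default nova-compute reads these files from /etc/nova/provider_config/ (the [compute] provider_config_location option) and needs a restart to pick up changes. Once it's back up, the custom trait should show up alongside the standard ones:
openstack --os-placement-api-version 1.6 resource provider trait list <hypervisor UUID>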
And that's it! private-flavor is blocked outside of top-priority and is
mandatory inside it (any other flavor will fail to schedule). So
top-priority will never share a hypervisor with another project again.
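A reasonable smoke test from within the top-priority project is to boot one VM with the private flavor and one with any public flavor; the first should land on a private host, the second should fail to schedule. The image and network names here are placeholders:
openstack server create --flavor private-flavor --image <image> --network <network> --wait test-private
openstack server create --flavor <public flavor> --image <image> --network <network> --wait test-public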
The upsides, the downsides, and the middlesides
A setup like this isn't perfect. The most obvious flaw is that you'll need a
new set of flavors or images for your private project, which could mean a whole
lot of unnecessary duplication.
Another issue is that it's not very easy to revert these changes once you've
started creating VMs. The hypervisor configuration is very easy to remove: a
few simple commands will remove the traits and take the host out of the
aggregate. The flavor configuration, however, is different. Editing a flavor to
remove a trait won't remove the trait from any existing VMs. The worst part is
that there's no way to tell that these ghost traits still exist, except for
manually inspecting the Nova database. That can leave you in the awkward
situation of having an unschedulable VM. It will keep running where it is, but
any migration operations will fail because no valid hosts exist. There are two
ways to get out of this trap. One is to never edit your flavors (which is good
advice in any case) and instead resize to a new flavor that doesn't have the
old traits. The other is to dust off your SQL skills and manually edit the
database to remove the offending trait. Neither solution is perfect: the first
requires a restart of the VM, and the second is a sketchy workaround at best
and a typo away from catastrophe at worst.
One alternative would be to enable placement_aggregate_required_for_tenants
for Nova and attach every project to an aggregate. That moves all of the
configuration away from images and flavors and onto the aggregates, which are
far easier to lock down.
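For reference, that alternative is driven by the scheduler options below; the caveat is that any project not attached to an aggregate via filter_tenant_id will then fail to schedule at all:
[scheduler]
limit_tenants_to_placement_aggregate = True
placement_aggregate_required_for_tenants = True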
That approach solves the first problem but exacerbates the second, and also
doesn't scale particularly well. Our original method is nice because all the
extra config doesn't interfere with any defaults. Projects and hypervisors
without any of the extra config will still behave as normal. With the
alternative approach, all new hypervisors and projects need to be added to
aggregates, and the complexity eternally grows with the size of the cloud.
Unfortunately, there's just no one-size-fits-all solution. It would be lovely
if a single configuration option existed, but alas it does not. If it
particularly bothers you, dear reader, you are more than welcome to make an
upstream contribution. For now, we must work with what we have.