Thanks Curtis, Robert, David and Mohammed for your responses.
As a follow up question, do you use any deployment automation tools for setting up the HA control plane?
I can see the value of deploying each service in a separate virtual environment or container, but automating such a deployment requires developing some new tools. OpenStack-Ansible is one potential deployment tool that I am aware of, but it had limited support for CentOS.
Here we currently have a physical loadbalancer which provides all the HA logic.
The services are installed on VMs managed by Puppet.
IMHO a loadbalancer is the right place to solve HA since you only have to do it once.
(Depending on your Neutron implementation you might also need something for Neutron)
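To make the "loadbalancer does the HA" idea concrete, here is a minimal sketch of what that logic looks like for a single service. This is a hypothetical haproxy.cfg fragment (we use a physical loadbalancer, not HAProxy, and the IPs, ports, and server names below are made up for illustration):

```
# Illustrative only: HA for the Keystone API across two controllers.
listen keystone_public
    bind 10.0.0.10:5000
    balance roundrobin
    option httpchk GET /v3
    server ctl1 10.0.0.11:5000 check inter 2000 rise 2 fall 3
    server ctl2 10.0.0.12:5000 check inter 2000 rise 2 fall 3
```

The point is that failover is handled once, centrally, via health checks; the API services themselves stay stateless and need no HA logic of their own.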
To rant a bit about deployment/automation:
I am not necessarily a fan of managing this with Puppet, since module dependencies can become a nightmare even with tools like r10k.
E.g. you want to deploy a newer version of Keystone, and this requires a newer openstack-common Puppet module.
Now you have a change that affects everything (the other OpenStack Puppet modules also use openstack-common), and those might now need upgrades as well.
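To illustrate the dependency cascade, here is a hypothetical Puppetfile fragment as r10k would consume it (module names and refs are illustrative, not our actual pins):

```
# Bumping one module...
mod 'keystone',
  :git => 'https://github.com/openstack/puppet-keystone',
  :ref => 'stable/newer'    # bumped for the new Keystone

# ...forces a bump of the shared dependency, which every other
# OpenStack module also pins against:
mod 'openstack-common',
  :git => 'https://example.org/puppet-openstack-common',
  :ref => 'stable/newer'    # now nova, neutron, glance, ... may all need retesting
```

One service upgrade turns into a change set that touches the whole control plane.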
To solve this kind of thing we are looking at containers and investigated two possible deployment scenarios:
OpenStack-Helm (with containers built by Kolla), which is pretty nice and uses k8s.
The problem is that it is still early days for both k8s and helm.
Things that stuck out most:
* Helm: Nice for a deployment from scratch.
Integration with our environment is a bit of a pain (e.g. if you want to start with just one service).
It would need a lot of work to fit it into our environment, and not everything would be easy to get into upstream.
It is still a very early implementation and needs quite a bit of TLC.
If you can live with what comes out of the box it might be a nice solution.
* K8S: It is a relatively complex product and it is still missing some features especially for self-hosted installations.
After some deliberation, we decided to go with the "hashi" stack (with Kolla-built containers).
This stack has more of a Unix philosophy: simple processes that each do one thing well:
* Nomad - scheduling
* Consul - service discovery and KV store
* Vault - secret management
* Fabio - zero-conf loadbalancer that integrates with Consul
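To show how these pieces fit together, here is a hypothetical Nomad job spec for one service. The image name, tag, datacenter, and paths are illustrative, not our real setup; the `urlprefix-` tag is how Fabio discovers routes from Consul:

```
# Illustrative Nomad job: run a Kolla-built Keystone container,
# register it in Consul with a health check, expose it via Fabio.
job "keystone" {
  datacenters = ["dc1"]
  type        = "service"

  group "api" {
    count = 2

    task "keystone" {
      driver = "docker"

      config {
        image = "kolla/centos-binary-keystone:example-tag"  # illustrative
        port_map {
          api = 5000
        }
      }

      resources {
        network {
          port "api" {}
        }
      }

      service {
        name = "keystone"
        port = "api"
        tags = ["urlprefix-/identity"]  # Fabio routes /identity here
        check {
          type     = "http"
          path     = "/v3"
          interval = "10s"
          timeout  = "2s"
        }
      }
    }
  }
}
```

Nomad schedules the container, Consul tracks its health, and Fabio picks up the route automatically; no component needs to know about the others beyond Consul.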
In general this stack is really easy for everyone to understand (work with it for half a day and you really understand what is going on under the hood).
There are no overlay networks :)
Lots of the stuff can break without much impact. E.g. Nomad is only relevant when you want to start/stop containers; it can crash or be turned off the rest of the time.
Another pro for this is that there is a significant amount of knowledge around these products in house.
To give an example of the difference in complexity, compare the deployment of k8s itself with that of the hashi stack:
* deployment of k8s with kargo: you have a very large playbook which takes 30 minutes to run to set up a cluster.
* deployment of the hashi stuff: just one binary for each component with one config file; you are basically done in a few minutes, if even that.
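To back up the "one binary, one config file" claim, here is roughly what a Consul server config looks like (a sketch with illustrative values; Consul also accepts these settings as command-line flags):

```
{
  "server": true,
  "bootstrap_expect": 3,
  "datacenter": "dc1",
  "data_dir": "/var/lib/consul",
  "bind_addr": "10.0.0.11"
}
```

Drop that in a file, run `consul agent -config-file=<file>`, repeat the pattern for Nomad and Vault, and the cluster is up. That is the entire deployment story, versus a 30-minute playbook run.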