On Fri, Jul 7, 2017 at 6:50 PM, James Slagle email@example.com wrote:
I proposed a session for the PTG
(https://etherpad.openstack.org/p/tripleo-ptg-queens) about forming a
common plan and vision around Ansible in TripleO.
I think it's important however that we kick this discussion off more
broadly before the PTG, so that we can hopefully have some agreement
for deeper discussions and prototyping when we actually meet in
Thanks for starting this James, it's a topic that I've also been
giving quite a lot of thought to lately (and as you've seen, have
pushed some related patches) so it's good to get some broader
Right now, we have multiple uses of Ansible in TripleO:
(0) tripleo-quickstart which follows the common and well accepted
approach to bundling a set of Ansible playbooks/roles.
FWIW I agree with Giulio that quickstart is a separate case, and while
I also agree with David that there's plenty of scope for improving the
oooq user experience, I'm going to focus on the TripleO deployment
aspects below.
(1) Mistral calling Ansible. This is the approach used by
tripleo-validations where Mistral directly executes ansible playbooks
using a dynamic inventory. The inventory is constructed from the
server related stack outputs of the overcloud stack.
(2) Ansible running playbooks against localhost triggered by the
heat-config Ansible hook. This approach is used by
tripleo-heat-templates for upgrade tasks and various tasks for
(3) Mistral calling Heat calling Mistral calling Ansible. In this
approach, we have Mistral resources in tripleo-heat-templates that are
created as part of the overcloud stack and in turn, the created
Mistral action executions run ansible. This has been prototyped with
using ceph-ansible to install Ceph as part of the overcloud
deployment, and some of the work has already landed. There are also
proposed WIP patches using this approach to install Kubernetes.
There are also some ideas forming around pulling the Ansible playbooks
and vars out of Heat so that they can be rerun (or run initially)
independently from the Heat SoftwareDeployment delivery mechanism:
(5) Another idea I'd like to prototype is a local tool that runs on
the undercloud and pulls all of the SoftwareDeployment data out of
Heat as the stack is being created and generates corresponding Ansible
playbooks to apply those deployments. Once a given playbook is
generated by the tool, the tool would signal back to Heat that the
deployment is complete. Heat then creates the whole stack without
actually applying a single deployment to an overcloud node. At that
point, Ansible (or Mistral->Ansible for an API) would be used to do
the actual deployment of the Overcloud with the Undercloud as the
Yeah so my idea with (4), and subsequent patches such as is to
gradually move the deploy steps performed to configure services (on
baremetal and in containers) to a single ansible playbook.
There's currently still heat orchestration around the host preparation
(although this is performed via ansible) and iteration over each step
(where we re-apply the same deploy-steps playbook with an incrementing
step variable, but this could be replaced by e.g. an ansible or mistral
loop), but my idea was to enable end-to-end configuration of nodes via
ansible-playbook, without the need for any special tools (i.e. we
refactor t-h-t enough that no extra tooling is required), and to make
deploy-steps-playbook.yaml the only method of deployment (for both the
baremetal and container cases).
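To make the "re-apply the same playbook with an incrementing step
variable" idea concrete, here is a rough sketch of what a plain loop
outside of heat might look like. It only builds the ansible-playbook
command lines and does not run them. The inventory path, the number of
steps, and the assumption that steps start at 1 are placeholders for
illustration, not what t-h-t actually defines.

```python
import shlex


def build_step_commands(playbook, inventory, max_step):
    """Return one ansible-playbook command line per deploy step.

    max_step and the 1-based numbering are assumptions for this sketch;
    the real step count is defined by tripleo-heat-templates.
    """
    commands = []
    for step in range(1, max_step + 1):
        commands.append([
            "ansible-playbook",
            "-i", inventory,
            # Pass the current step to the playbook as an extra var.
            "-e", "step={}".format(step),
            playbook,
        ])
    return commands


if __name__ == "__main__":
    # "inventory.yaml" and max_step=5 are placeholders.
    for cmd in build_step_commands("deploy-steps-playbook.yaml",
                                   "inventory.yaml", 5):
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

The same loop could equally live in a wrapper playbook or a mistral
workflow; the point is only that nothing in the iteration itself needs
heat.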
All of this work has merit as we investigate longer term plans, and
it's all at different stages with some being for dev/CI (0), some
being used already in production (1 and 2), some just at the
experimental stage (3 and 4), and some does not exist other than an
I'd like to get the remaining work for (4) done so it's a supportable
option for minor updates. There's still a bit more t-h-t refactoring
required to enable it, but I think we're already pretty close to being
able to run end-to-end ansible for most of the PostDeploy steps
without any special tooling.
Note this related patch from Matthieu:
I think we'll need to go further here but it's a starting point which
shows how we could expose ansible tasks from the heat stack outputs as
a first step to enabling standalone configuration via ansible (or
My intent with this mail is to start a discussion around what we've
learned from these approaches and start discussing a consolidated plan
around Ansible. And I'm not saying that whatever we come up with
should only use Ansible a certain way. Just that we ought to look at
how users/operators interact with Ansible and TripleO today and try
and come up with the best solution(s) going forward.
I think that (1) has been pretty successful, and my idea with (5)
would use a similar approach once the playbooks were generated.
Further, my idea with (5) would give us a fully backwards compatible
solution with our existing template interfaces from
tripleo-heat-templates. Longer term (or even in parallel for some
time), the generated playbooks could stop being generated (and just
exist in git), and we could consider moving away from Heat more
Yeah I think working towards aligning more TripleO configuration with
the approach taken by tripleo-validations is fine, and we can e.g add
more heat generated data about the nodes to the dynamic ansible
We've been gradually adding data there, which I hope will enable a
cleaner "split stack", where the nodes are deployed via heat, then
ansible can do the configuration based on data exposed via stack
outputs (which again is a pattern that I think has been proven to work
quite well for tripleo-validations, and is also something I've been
using locally for dev testing quite successfully).
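As a rough illustration of the "configuration based on data exposed via
stack outputs" pattern, the sketch below turns some already-fetched
output data into the JSON structure Ansible expects from a dynamic
inventory. The shape of role_hosts, the hostnames, and the addresses
are invented for the example; the real overcloud stack outputs are
shaped differently, and the tripleo-validations inventory does more
than this.

```python
import json


def inventory_from_outputs(role_hosts, ssh_user):
    """Build Ansible dynamic-inventory data from stack output data.

    role_hosts is a hypothetical, already-parsed stack output of the
    form {role_name: {hostname: ip_address}}.
    """
    inventory = {"_meta": {"hostvars": {}}}
    for role, hosts in sorted(role_hosts.items()):
        # One inventory group per role, with shared connection vars.
        inventory[role] = {
            "hosts": sorted(hosts),
            "vars": {"ansible_user": ssh_user},
        }
        # Per-host vars go under _meta so Ansible does not need to
        # call the inventory script once per host.
        for name, ip in hosts.items():
            inventory["_meta"]["hostvars"][name] = {"ansible_host": ip}
    return inventory


if __name__ == "__main__":
    # Placeholder data standing in for real stack outputs.
    outputs = {
        "Controller": {"controller-0": "192.0.2.10"},
        "Compute": {"compute-0": "192.0.2.20"},
    }
    print(json.dumps(inventory_from_outputs(outputs, "heat-admin"),
                     indent=2))
```

Adding more heat generated data about the nodes then just means adding
more keys under the group or host vars.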
I recognize that saying "moving away from Heat" may be quite
controversial. While it's not 100% the same discussion as what we are
doing with Ansible, I think it is a big part of the discussion and if
we want to continue with Heat as the primary orchestration tool in
Yeah, I think the first step is to focus on a clean "split stack"
model where the nodes/networks etc are still deployed via heat, then
ansible handles the configuration of the nodes.
In the long term I could see benefits in a "tripleo lite" model,
where, say, we only used mistral+Ironic+ansible, but IMO we're not at
the point yet where that's achievable, primarily because there's
coupling between the heat parameter interfaces and multiple
integrations we can't break (e.g users with environment files,
tripleo-ui, vendor integrations, etc).
It's a good discussion to kick off regardless though, so personally
I'd like to focus on these as the first "baby steps":
How to perform end-to-end configuration via ansible (outside of
heat, but probably still using data and possibly playbooks generated
How to deploy nodes directly via Ironic, with a mistral workflow
(e.g no Nova and potentially no Neutron?), I started that in
https://review.openstack.org/#/c/313048/ but could use some help
I've been hearing a lot of feedback from various operators about how
difficult the baremetal deployment is with Heat. While feedback about
Ironic is generally positive, a lot of the negative feedback is around
the Heat->Nova->Ironic interaction. And, if we also move more towards
Ansible for the service deployment, I wonder if there is still a long
term place for Heat at all.
So while there are plenty of valid complaints, one observation is that
Heat always gets blamed because it's the operator-visible interface,
but quite often the problem is actually in e.g. Nova or some other
non-heat component. For example, "No valid host found" is often
perceived as a heat problem by new users when in reality it's not.
That said, there are valid complaints around the SoftwareDeployment
approach and operator familiarity vs some more traditional tool such
Personally, I'm pretty apprehensive about the approach taken in (3). I
feel that it is a lot of complexity that could be done more simply if we
took a step back and thought more about a longer term approach. I
recognize that it's mostly an experiment/POC at this stage, and I'm
not trying to directly knock down the approach. It's just that when I
start to see more patches (Kubernetes installation) using the same
approach, I figure it's worth discussing more broadly vs trying to
have a discussion by -1'ing patch reviews, etc.
I agree, I think the approach in (3) is a stopgap until we can define
a cleaner approach with fewer layers.
IMO the first step towards that is likely to be a "split stack" which
outputs heat data, then deployment configuration is performed via
mistral->ansible just like we already do in (1).
I'm interested in all feedback of course. And I plan to take a shot at
working on the prototype I mentioned in (5) if anyone would like to
collaborate around that.
I'm very happy to collaborate, and this is quite closely related to
the investigations I've been doing around enabling minor updates for
Let's sync up about it, but as I mentioned above I'm not yet fully sold
on a new translation tool, vs just more t-h-t refactoring to enable
output of data directly consumable via ansible-playbook (which can
then be run by operators, or heat, or mistral, or whatever).
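For the sake of comparison, this is roughly the kind of translation
step a tool in the style of (5) would need to do. It's only a sketch:
the record format ("name", "server", "script") is made up for
illustration and is not the real SoftwareDeployment schema, and mapping
each record to a shell task is an assumption, not an existing design.

```python
import json


def deployments_to_playbook(deployments):
    """Group hypothetical deployment records into one play per server.

    Each record is assumed to look like
    {"name": ..., "server": ..., "script": ...}; this is not the real
    Heat SoftwareDeployment schema.
    """
    plays = {}
    for dep in deployments:
        play = plays.setdefault(dep["server"], {
            "hosts": dep["server"],
            "become": True,
            "tasks": [],
        })
        # Tasks keep the order in which the deployments were seen.
        play["tasks"].append({
            "name": dep["name"],
            "shell": dep["script"],
        })
    return [plays[server] for server in sorted(plays)]


if __name__ == "__main__":
    # Placeholder records standing in for data pulled out of heat.
    sample = [
        {"name": "step1", "server": "controller-0", "script": "echo one"},
        {"name": "step2", "server": "controller-0", "script": "echo two"},
        {"name": "step1", "server": "compute-0", "script": "echo one"},
    ]
    print(json.dumps(deployments_to_playbook(sample), indent=2))
```

If t-h-t emitted the task lists directly, most of this function would
go away, which is really the trade-off I'd like us to discuss.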
I think if we can form some broad agreement before the PTG, we have a
chance at making some meaningful progress during Queens.
Agreed, although we probably do need to make some more progress on
some aspects of this for container minor updates that we'll need for