tl;dr everything looks great, and memory usage has dropped by about 64%
since the initial Newton release of Heat.
I re-ran my analysis of Heat memory usage in the tripleo-heat-templates
gate. (This is based on the gate-tripleo-ci-centos-7-ovb-nonha job.)
Here's a pretty picture:
There is one major caveat here: for the period marked in grey where it
says "Only 2 engine workers", the job was configured to use only 2
heat-engine worker processes instead of 4, so this is not an
apples-to-apples comparison. The initial drop at the beginning and the
subsequent bounce at the end are artifacts of this change. Note that the
stable/newton branch is still using only 2 engine workers.
The rapidly increasing usage on the left is due to increases in the
complexity of the templates during the Newton cycle. It's clear that if
there has been any similar complexity growth during Ocata, it has had a
tiny effect on memory consumption in comparison.
I tracked down most of the step changes to identifiable patches:
2016-10-07: 2.44GiB -> 1.64GiB
- https://review.openstack.org/382068/ merged, making ResourceInfo
classes more memory-efficient. Judging by the stable branch (where this
and the following patch were merged at different times), this was
responsible for dropping the memory usage from 2.44GiB -> 1.83GiB.
(Which seems like a disproportionately large change?)
- https://review.openstack.org/#/c/382377/ merged, so we no longer
create multiple yaql contexts. (This was responsible for the drop from
1.83GiB -> 1.64GiB.)
2016-10-17: 1.62GiB -> 0.93GiB
- https://review.openstack.org/#/c/386696/ merged, reducing the number
of engine workers on the undercloud to 2.
2016-10-19: 0.93GiB -> 0.73GiB (variance also seemed to drop after this)
- https://review.openstack.org/#/c/386247/ merged (on 2016-10-16),
avoiding loading all nested stacks in a single process simultaneously
much of the time.
- https://review.openstack.org/#/c/383839/ merged (on 2016-10-16),
switching output calculations to RPC to avoid almost all simultaneous
loading of all nested stacks.
2016-11-08: 0.76GiB -> 0.70GiB
- This one is a bit of a mystery???
2016-11-22: 0.69GiB -> 0.50GiB
- https://review.openstack.org/#/c/398476/ merged, improving the
efficiency of resource listing?
2016-12-01: 0.49GiB -> 0.88GiB
- https://review.openstack.org/#/c/399619/ merged, returning the
number of engine workers on the undercloud to 4.
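For anyone who wants to sanity-check the numbers, here is a rough sketch
that tabulates the step changes listed above and works out the relative
change for each one, plus the overall change from the first "before"
figure to the last "after" figure (both measured with 4 engine workers).
The figures are copied from the list; the names STEPS and percent_change
are just mine for illustration.

```python
# (date, GiB before, GiB after), copied from the list of step changes above
STEPS = [
    ("2016-10-07", 2.44, 1.64),
    ("2016-10-17", 1.62, 0.93),
    ("2016-10-19", 0.93, 0.73),
    ("2016-11-08", 0.76, 0.70),
    ("2016-11-22", 0.69, 0.50),
    ("2016-12-01", 0.49, 0.88),
]


def percent_change(before, after):
    """Relative change from before to after, in percent (negative = drop)."""
    return (after - before) / before * 100.0


for day, before, after in STEPS:
    change = percent_change(before, after)
    print(f"{day}: {before:.2f} -> {after:.2f} GiB ({change:+.1f}%)")

# First and last data points both use 4 engine workers, so this is the
# like-for-like comparison behind the headline figure.
overall = percent_change(STEPS[0][1], STEPS[-1][2])
print(f"overall: {overall:+.1f}%")
```

The overall figure comes out at roughly -64%, which is where the number
in the tl;dr comes from.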
It's not an exact science because IIUC there's a delay between a patch
merging in Heat and it being used in subsequent t-h-t gate jobs. For
example, the change to getting outputs over RPC landed the day before the
instack-undercloud patch that cut the number of engine workers, but the
effects don't show up until 2 days after. I'd love to figure out what
happened on the 8th of November, but I can't correlate it to anything
obvious. The attribution of the change on the 22nd also seems dubious,
but the timing adds up (including on stable/newton).
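To make the lag concrete, here is a tiny sketch of the date arithmetic
for the RPC outputs example. The dates are the ones quoted above (merge
on 2016-10-16, worker reduction showing up on 2016-10-17, memory drop
showing up on 2016-10-19); the variable names are only for illustration.

```python
from datetime import date

rpc_outputs_merged = date(2016, 10, 16)   # Heat patch merged
workers_cut_visible = date(2016, 10, 17)  # engine workers reduced to 2
rpc_effect_visible = date(2016, 10, 19)   # memory drop appears in the gate

# Days between the Heat patch merging and its effect showing up
merge_to_effect = (rpc_effect_visible - rpc_outputs_merged).days

# Days between the worker reduction and the RPC effect showing up
workers_to_effect = (rpc_effect_visible - workers_cut_visible).days

print(f"merge -> visible effect: {merge_to_effect} days")
print(f"worker cut -> visible effect: {workers_to_effect} days")
```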
It's fair to say that none of the other patches we merged in an attempt
to reduce memory usage had any discernible effect :D
It's worth reiterating that TripleO still disables convergence in the
undercloud, so these are all tests of the legacy code path. It would be
great if we could set up a non-voting job on t-h-t with convergence
enabled and start tracking memory use over time there too. As a first
step, maybe we could at least add an experimental job on Heat to give us
The next big improvement to memory use is likely to come from
https://review.openstack.org/#/c/407326/ or something like it (though I
don't think we have a firm decision on whether we'd apply this to
non-convergence stacks). Hopefully that will deliver a nice speed boost
for convergence too.