On 2017-11-15 00:37:26 +0000 (+0000), Fox, Kevin M wrote:
> One idea is the principle at the root of chaos monkey: if something
> is hard, do it frequently. If upgrading is hard, we need to be doing
> it constantly so the pain gets largely eliminated. One option would
> be to discourage devs from standing up a fresh devstack all the
> time and have them upgrade existing ones instead. If it's hard,
> it's likely someone will chip in to make it less hard.
This is also the idea behind running grenade in CI. The previous
OpenStack release is deployed, an attempt at a representative (if
small) dataset is loaded into it, and then it is upgraded to the
release under development with the proposed change applied and
exercised to make sure the original resources built under the
earlier release are still in working order. We can certainly do more
to make this a better representation of "The Real World" within the
resource constraints of our continuous integration, but we do at
least have a framework in place to attempt it.
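The flow described above can be reduced to a skeleton, with each grenade stage shrunk to a labeled placeholder (this paraphrases the text; it is not the real grenade code):

```python
# Skeleton of the grenade-style upgrade test described above. Each
# stage is a placeholder standing in for real deployment tooling.

STEPS = [
    ("deploy", "previous stable release (the 'old' side)"),
    ("load", "small but representative dataset: servers, volumes, networks"),
    ("upgrade", "development branch with the proposed change applied"),
    ("verify", "resources built under the old release still work"),
]

def run_upgrade_test(steps):
    """Run each stage in order, stopping at the first failure."""
    completed = []
    for name, description in steps:
        # A real implementation would shell out to deployment tooling
        # here and raise if the stage failed.
        completed.append(name)
    return completed

print(run_upgrade_test(STEPS))
```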
> Another is devstack in general. The tooling used by devs and that
> used by ops are so different as to isolate the devs from ops'
> pain. If devs used more ops-ish tooling, they would hit the same
> issues and be more likely to find solutions that work for both
> parties.
Keep in mind that DevStack was developed to have a quick framework
anyone could use to locally deploy an all-in-one OpenStack from
source. It was not actually developed for CI automation, to the
extent that we developed a separate wrapper project to make DevStack
usable within our CI (the now somewhat archaically-named
devstack-gate project). It's certainly possible to replace that with
a more mainstream deployment tool, I think, so long as it maintains
the primary qualities we rely on: 1. rapid deployment, 2. can work
on a single system with fairly limited resources, 3. can deploy from
source and incorporate proposed patches, 4. pluggable/extensible so
that new services can be easily integrated even before they're
officially part of OpenStack.
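Qualities 2 through 4 show up directly in DevStack's configuration: a local.conf can point services at source repositories (including proposed patches) and hook in out-of-tree services as plugins. A minimal illustrative fragment, with placeholder passwords and URLs:

```ini
[[local|localrc]]
ADMIN_PASSWORD=secret
DATABASE_PASSWORD=$ADMIN_PASSWORD
RABBIT_PASSWORD=$ADMIN_PASSWORD
SERVICE_PASSWORD=$ADMIN_PASSWORD

# Services deploy from git, so proposed patches can be incorporated
# by pointing a service at a different repo or branch:
NOVA_REPO=https://git.openstack.org/openstack/nova
NOVA_BRANCH=master

# Out-of-tree services hook in via the plugin interface:
enable_plugin ironic https://git.openstack.org/openstack/ironic master
```

stack.sh reads this file and deploys the all-in-one environment accordingly.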
> A third one is supporting multiple version upgrades in the gate. I
> rarely have a problem with a cloud whose database is one version
> back. I have seen lots of issues with databases that contain data
> from back when the cloud was instantiated and then upgraded
> multiple times.
I believe this will be necessary anyway if we want to officially
support so-called "fast forward" upgrades, since anything that's not
tested is assumed to be (and in fact usually is) broken.
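The multi-hop case is essentially an ordering constraint: every intermediate release's schema migrations must run, in sequence, before the final one. A hedged sketch of that constraint (the release names are real OpenStack series, but the migration step itself is a placeholder for each release's "db sync"):

```python
# Sketch of the "fast forward" upgrade case: a database created several
# releases ago must pass through each intermediate release's schema
# migrations in order, never skipping one.

RELEASES = ["newton", "ocata", "pike", "queens"]

def migrations_between(current, target):
    """Every release after `current`, up to and including `target`."""
    start = RELEASES.index(current) + 1
    end = RELEASES.index(target) + 1
    return RELEASES[start:end]

def fast_forward(current, target):
    applied = []
    for release in migrations_between(current, target):
        # Real tooling would check out this release and run its
        # schema migrations (e.g. a "db sync") here.
        applied.append(release)
    return applied

print(fast_forward("newton", "queens"))
```

Exercising every hop like this in CI is precisely what keeps the multi-release path from silently rotting.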
> Another option is trying to unify/detangle the upgrade procedure.
> Upgrading the compute kit should be one or two commands if you can
> live with the defaults, not weeks of poring through release notes,
> finding correct orderings in pages of text, and testing vigorously
> on test systems.
This also sounds like a defect in our current upgrade testing, if
we're somehow embedding upgrade automation in our testing without
providing the same tools to easily perform those steps in production.
> How about some tool that does the following: dump the database to
> somewhere temporary, iterate over all the upgrade job components,
> and see if it will successfully not corrupt your database? That
> takes a while to do manually. Ideally it could even upload stack
> traces back to a bug tracker for attention.
Without a clearer definition of "successfully not corrupt your
database" suitable for automated checking, I don't see how this one
is realistic. Do we have a database validation tool now? If we do,
is it deficient in some way? If we don't, what specifically should
it be checking? It seems like something we would want to run at
the end of all our upgrade tests as well.
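As a thought experiment, the "rehearse on a copy, then validate" loop can at least be made concrete. A minimal sketch using Python's sqlite3 as a stand-in for the real MySQL/PostgreSQL backends; the schema, the migration, and the validation checks are all invented for illustration:

```python
# "Rehearse the upgrade on a copy": duplicate the database, apply the
# migration to the copy only, then run sanity checks on the result.
import sqlite3

def make_source_db():
    # Stand-in for the live pre-upgrade database.
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE instances (id INTEGER PRIMARY KEY, name TEXT)")
    db.executemany("INSERT INTO instances (name) VALUES (?)",
                   [("vm1",), ("vm2",)])
    db.commit()
    return db

def copy_db(src):
    # Rehearse against a throwaway copy, never the live database.
    dst = sqlite3.connect(":memory:")
    src.backup(dst)
    return dst

def migrate(db):
    # Stand-in for one release's schema migrations.
    db.execute("ALTER TABLE instances ADD COLUMN host TEXT DEFAULT 'unknown'")
    db.commit()

def validate(db):
    # One possible definition of "successfully not corrupted":
    # storage-level integrity intact, and no rows lost.
    assert db.execute("PRAGMA integrity_check").fetchone()[0] == "ok"
    assert db.execute("SELECT COUNT(*) FROM instances").fetchone()[0] == 2

live = make_source_db()
rehearsal = copy_db(live)
migrate(rehearsal)
validate(rehearsal)
print("rehearsal upgrade validated")
```

The hard part, as noted above, is the `validate` step: everything else is mechanical, but someone has to decide what "not corrupted" means for each service's schema.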
OpenStack Development Mailing List (not for usage questions)