Re: [Openstack-operators] [nova] Looking for feedback on a spec to limit max_count in multi-create requests

Hello Matt,

Starting 1000 instances in production already works for me. We are on
OpenStack Newton.
I described my configuration here:

If things blow up for you with hundreds, there is probably a
regression somewhere.



2017-10-06 23:43 GMT+02:00 Matt Riedemann

I've been chasing something weird I was seeing in devstack when creating
hundreds of instances in a single request: at some limit, things blew up
in an unexpected way during scheduling and all instances were put into
ERROR state. Given the environment I was running in, this shouldn't have
been happening, and today we figured out what was actually happening. To
summarize, we retry scheduling requests on RPC timeout, so you can have
scheduler_max_attempts greenthreads running concurrently trying to schedule
1000 instances and melt your scheduler.
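
A rough way to picture the retry amplification described above is the toy
model below. It is only a sketch and not Nova's actual code: the function
name peak_concurrent_attempts and the numbers used (a 60 second RPC timeout,
3 attempts) are assumptions chosen for illustration. It assumes every attempt
takes the same time and that a timed-out attempt keeps running in the
scheduler after the caller has given up on it.

```python
def peak_concurrent_attempts(work_seconds, rpc_timeout, max_attempts):
    """Toy model: peak number of scheduling attempts running at once.

    work_seconds  -- time one scheduling attempt needs (assumed constant)
    rpc_timeout   -- how long the caller waits before retrying
    max_attempts  -- retry limit (stands in for scheduler_max_attempts)
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    if work_seconds <= rpc_timeout:
        # The first attempt answers in time, so nothing is retried.
        return 1
    # Every attempt times out, so a new one starts each rpc_timeout seconds
    # while the earlier ones are still busy in the scheduler.
    intervals = [
        (i * rpc_timeout, i * rpc_timeout + work_seconds)
        for i in range(max_attempts)
    ]
    # Concurrency only rises when an attempt starts, so checking the start
    # times is enough to find the peak.
    return max(
        sum(1 for start, end in intervals if start <= t < end)
        for t, _ in intervals
    )


if __name__ == "__main__":
    # One slow request that needs 150s against a 60s timeout and 3 attempts.
    print(peak_concurrent_attempts(150, 60, 3))
```

The model leaves out contention: in practice each extra concurrent attempt
would likely slow the others down, so the real overlap could be worse than
this estimate.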

I've started a spec which goes into the details of the actual issue:

It also proposes a solution, but I don't feel it's the greatest solution, so
there are also some alternatives in there.

I'm really interested in operator feedback on this because I assume that
people are dealing with stuff like this in production already, and have had
to come up with ways to solve it.




