
[openstack-dev] [nova] key_pair update on rebuild (a whole lot of conversations)

0 votes

There is currently a spec up for being able to specify a new key_pair
name during the rebuild operation in Nova -
https://review.openstack.org/#/c/375221/
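
As a rough illustration of what the spec is asking for, here is a minimal sketch of a rebuild request body. The `rebuild` action and its `imageRef` field are the existing server action; the `key_name` field is the addition under discussion and is an assumption here, as is the helper name `build_rebuild_body`.

```python
import json


def build_rebuild_body(image_ref, key_name=None):
    """Build the JSON body for a rebuild server action.

    `key_name` is the field proposed in the spec; treat it as an
    assumption, not a description of what the API accepts today.
    """
    rebuild = {"imageRef": image_ref}
    if key_name is not None:
        # Only include the proposed field when the caller asks for a new keypair.
        rebuild["key_name"] = key_name
    return {"rebuild": rebuild}


if __name__ == "__main__":
    # IMAGE_UUID and new-key are placeholders, not real resources.
    print(json.dumps(build_rebuild_body("IMAGE_UUID", "new-key"), indent=2))
```

The body would be sent as a POST to the server's action URL; nothing else about the rebuild call changes in this sketch.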

For those not completely familiar with Nova operations, rebuild triggers
the "reset this vm to initial state" by throwing out all the disks, and
rebuilding them from the initial glance images. It does however keep the
IP address and device models when you do that. So it's useful for
ephemeral but repeating workloads, where you'd rather not have the
network information change out from under you.

The spec is a little vague about when this becomes really useful,
because this will not save you from "I lost my private key, and I have
important data on that disk". Because the disk is destroyed. That's the
point of rebuild. We once added this preserve_ephemeral flag to rebuild
for TripleO on ironic, but it's so nasty we've scoped it to only work
with ironic backends. Ephemeral should mean ephemeral.

Rebuild bypasses the scheduler. A rebuilt server stays on the same host
as it was before, which means the operation has a good chance of being
faster than a DELETE + CREATE, as the image cache on that host should
already have the base image for your instance.

A bunch of data was collected today in a lot of different IRC channels
(#openstack-nova, #openstack-infra, #openstack-operators).

= OpenStack Operators =

mnaser said that for their customers this would be useful. Keys get lost
often, but keeping the IP is actually valuable. They would also like this.

penick said that for their existing environment, they have a workflow
where this would be useful. But they are moving away from using nova for
key distribution because in Nova keys are user owned, which actually
works poorly given that everything else is project owned. So they are
building something to do key distribution after boot in the guest not
using nova's metadata.

Lots of people said they didn't use nova's keypair interfaces, they just
did it all in config management after the fact.

= Also on reboot? =

Because the reason people said they wanted it was: "I lost my private
key", the question at PTG was "does that mean you want it on reboot?"

But as we dive through the constraints of that, people that build "pet"
VMs typically delete or disable cloud-init (or similar systems) after
first boot. Without that kind of agent, this isn't going to work anyway.

So also on reboot seems very fragile and not very useful.

= Infra =

We asked the infra team if this is useful to them; the answer was no.
What would be useful to them is if keypairs could be updated. They use a
symbolic name for a keypair but want to do regular key rotation. Right
now they do this by deleting then recreating keypairs, but that does
mean there is a window where there is no keypair with that name, so
server creates fail.
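
The window described above can be sketched with a toy in-memory store. Everything here is hypothetical: `KeypairStore` is not a Nova class, and the `update` method stands in for the in-place update that infra would like but that does not exist in this discussion.

```python
class KeypairStore:
    """Hypothetical in-memory stand-in for a keypair service keyed by name."""

    def __init__(self):
        self._keys = {}

    def create(self, name, public_key):
        if name in self._keys:
            raise ValueError("keypair %r already exists" % name)
        self._keys[name] = public_key

    def delete(self, name):
        del self._keys[name]

    def get(self, name):
        # Raises KeyError when the name does not resolve, like a failed lookup.
        return self._keys[name]

    def update(self, name, public_key):
        # Hypothetical in-place update: the name never stops resolving.
        if name not in self._keys:
            raise KeyError(name)
        self._keys[name] = public_key


def name_exists(store, name):
    try:
        store.get(name)
        return True
    except KeyError:
        return False


def rotate_by_recreate(store, name, new_key):
    """Delete then create; report whether the name resolved in between."""
    store.delete(name)
    visible_in_between = name_exists(store, name)  # a server create here would fail
    store.create(name, new_key)
    return visible_in_between
```

With delete-then-create there is always a moment where the symbolic name points at nothing; an update operation would close that gap, which is the rotation feature being asked for.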

It is agreed that something supporting key rotation in the future would
be handy, but that's not in this scope.

= Barbican =

In the tradition of making a simple fix a generic one, it does look like
there is a longer term part of this where Nova should really be able to
specify a Barbican resource url for a key so that things like rotation
could be dealt with in a system that specializes in that. It also would
address the very weird oddity of user vs. project scoping.

That's a bigger, more nebulous thing. Other folks would need to be
engaged on that one.

= Where I think we are? =

I think with all this data we're at the following:

Q: Should we add this to rebuild?
A: Yes, probably - after some enhancement to the spec *


    • we really should have much better use cases about the situations it
      is expected to be used in. We spend a lot of time 2 and 3 years out
      trying to figure out how anyone would ever use a feature, and adding
      another one without this doesn't seem good

Q: should this also be on reboot?
A: NO - it would be too fragile

I also think figuring out a way to get Nova out of the key storage
business (which it really shouldn't be in) would be good. So if anyone
wants to tackle Nova using Barbican for keys, that would be ++. Rebuild
doesn't wait on that, but Barbican urls for keys seems like a much
better world to be in.

-Sean

--
Sean Dague
http://dague.net


OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
asked Oct 4, 2017 in openstack-dev by Sean_Dague (66,200 points)   4 8 14

6 Responses

0 votes

On 10/3/2017 3:16 PM, Sean Dague wrote:
There is currently a spec up for being able to specify a new key_pair
name during the rebuild operation in Nova -
https://review.openstack.org/#/c/375221/

For those not completely familiar with Nova operations, rebuild triggers
the "reset this vm to initial state" by throwing out all the disks, and
rebuilding them from the initial glance images. It does however keep the
IP address and device models when you do that. So it's useful for
ephemeral but repeating workloads, where you'd rather not have the
network information change out from under you.

We also talked quite a bit about rebuild with volume-backed instances
today, and the fact the root disk isn't replaced during rebuild in that
case, for which there are many reported bugs...

The spec is a little vague about when this becomes really useful,
because this will not save you from "I lost my private key, and I have
important data on that disk". Because the disk is destroyed. That's the
point of rebuild. We once added this preserve_ephemeral flag to rebuild
for TripleO on ironic, but it's so nasty we've scoped it to only work
with ironic backends. Ephemeral should mean ephemeral.

Rebuild bypasses the scheduler. A rebuilt server stays on the same host
as it was before, which means the operation has a good chance of being
faster than a DELETE + CREATE, as the image cache on that host should
already have the base image for your instance.

It also means no chances for NoValidHost or resource claim failures.

Sean, thanks for summarizing the various discussions had today. I've
also included the operators list on this.

--

Thanks,

Matt


responded Oct 3, 2017 by mriedemos_at_gmail.c (15,720 points)   2 4 5
0 votes

I think new-keypair-on-rebuild makes sense for some forms of key rotation
as well. For example, I've worked with a big data ironic customer who uses
rebuild to deploy new OS images onto their ironic managed machines.
Presumably if they wanted to do a keypair rotation they'd do it in a very
similar way.

So yes, I think you've reached the right conclusion here. Thanks for your
work Sean.

Michael

responded Oct 4, 2017 by Michael_Still (16,180 points)   3 5 13
0 votes

On 10/03/2017 03:16 PM, Sean Dague wrote:
= Where I think we are? =

I think with all this data we're at the following:

Q: Should we add this to rebuild
A: Yes, probably - after some enhancement to the spec *


    • we really should have much better use cases about the situations it
      is expected to be used in. We spend a lot of time 2 and 3 years out
      trying to figure out how anyone would ever use a feature, and adding
      another one without this doesn't seem good

Here's an example from my use: I create a Heat stack, then realize I
deployed some of the instances with the wrong keypair. I'd rather not
tear down the entire stack just to fix that, and being able to change
keys on rebuild would allow me to avoid doing so. I can rebuild a
Heat-owned instance without causing any trouble, but I can't re-create it.

I don't know how common this is, but it's definitely something that has
happened to me in the past.

responded Oct 4, 2017 by Ben_Nemec (19,660 points)   2 3 3
0 votes

Excerpts from Sean Dague's message of 2017-10-03 16:16:48 -0400:

The spec is a little vague about when this becomes really useful,
because this will not save you from "I lost my private key, and I have
important data on that disk". Because the disk is destroyed. That's the
point of rebuild. We once added this preserve_ephemeral flag to rebuild
for trippleo on ironic, but it's so nasty we've scoped it to only work
with ironic backends. Ephemeral should mean ephemeral.

Let me take a moment to apologize for that feature. It was the worst idea
we had in TripleO, even worse than the name. ;)

Rebuild bypasses the scheduler. A rebuilt server stays on the same host
as it was before, which means the operation has a good chance of being
faster than a DELETE + CREATE, as the image cache on that host should
already have the base image for your instance.

There are some pros, but for the most part I'd rather train my users
to be creating new instances than train them to cling to fixed IPs and
single compute node resources. It's a big feature, and obviously we've
given it to users so they use it. But that doesn't mean it's the best
use of Nova development's time to be supporting it, nor is it the most
scalable way for users to interact with a cloud.

A trade-off, for instance, is that a rebuilding server is unavailable while
rebuilding. The user cannot choose how long that server is unavailable,
or choose to roll back and make it available if something goes wrong. It's
rebuilding until it isn't. A new server, spun up somewhere else, can be
fully prepared before any switch is made. One of the best things about
being a cloud operator is that you put more onus on the users to fix
their own problems, and give them lots of tools to do it. But while a
server is being rebuilt it is entirely the operator's problem.

Also as an operator, while I appreciate that it's quick on that compute
node, I'd rather new servers be scheduled to the places that my scheduler
rules say they should go. I will at times want to drain a compute node,
and the longer the pet servers stick around and are rebuilt, the more
likely I am to have to migrate them forcibly.

I also think figuring out a way to get Nova out of the key storage
business (which it really shouldn't be in) would be good. So if anyone
wants to tackle Nova using Barbican for keys, that would be ++. Rebuild
doesn't wait on that, but Barbican urls for keys seems like a much
better world to be in.

The keys are great. Barbican is a fantastic tool for storing secret
keys, but feels like a massive amount of overkill for this tiny blob of
public data.


responded Oct 4, 2017 by Clint_Byrum (40,940 points)   4 5 9
0 votes

We didn't talk about this during the spec review, but it just came up
when I was reviewing the patch.

https://review.openstack.org/#/c/379128/12/nova/api/openstack/compute/servers.py@908

The way this is currently written, it only ever updates the instance
key if the user passes in a new value. If they pass key_name=None, we
ignore that request. There is a subtle difference in the semantics here.

Should we allow unsetting an instance key during rebuild? I think it
would be easy enough to do that, but I'm not sure if it would (a) be
desirable or (b) cause problems. You can certainly create an instance
without a key so I don't think there would actually be problems,
especially if, for example, you're rebuilding with a new image that
already has known credentials set in it.

I'm thinking of something like revoking a key without having to specify
a new key in the process.
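
To make the difference in semantics concrete, here is a minimal sketch that is not taken from the patch under review. The function names, the `_UNSET` sentinel, and the dict-shaped instance are all assumptions for illustration. The first function mirrors the behavior described above, where `key_name=None` is ignored; the second treats an explicit `None` as a request to unset the key.

```python
_UNSET = object()  # sentinel meaning "key_name was not in the request at all"


def rebuild_key_current(instance, key_name=None):
    """Behavior as described: only a non-None value changes the key."""
    updated = dict(instance)
    if key_name is not None:
        updated["key_name"] = key_name
    return updated


def rebuild_key_with_unset(instance, key_name=_UNSET):
    """Alternative: omitted keeps the key, explicit None clears it."""
    updated = dict(instance)
    if key_name is not _UNSET:
        updated["key_name"] = key_name  # may be None, which unsets the key
    return updated


if __name__ == "__main__":
    server = {"uuid": "SERVER_UUID", "key_name": "old-key"}
    print(rebuild_key_current(server, key_name=None))
    print(rebuild_key_with_unset(server, key_name=None))
    print(rebuild_key_with_unset(server))
```

The sentinel is what lets the second form tell "the caller said nothing about the key" apart from "the caller asked for no key".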

Thoughts?

--

Thanks,

Matt


responded Nov 4, 2017 by mriedemos_at_gmail.c (15,720 points)   2 4 5
0 votes

Excerpts from Ben Nemec's message of 2017-10-03 23:05:49 -0500:

Here's an example from my use: I create a Heat stack, then realize I
deployed some of the instances with the wrong keypair. I'd rather not
tear down the entire stack just to fix that, and being able to change
keys on rebuild would allow me to avoid doing so. I can rebuild a
Heat-owned instance without causing any trouble, but I can't re-create it.

I don't know how common this is, but it's definitely something that has
happened to me in the past.

Sorry, but this is an argument to use Heat more; rebuild is totally
unnecessary.

In Heat, if you change the keypair and update the stack, it will create a new
instance with the right keypair and delete the old one (or you can make it use
rebuild, a feature I believe I developed actually). The updated IPs will be
rolled out to all resources that reference that instance's IP. If you have wait
conditions which depend on this instance, Heat will wait until they are
re-triggered before deleting the old instance. This is literally why Heat is a
cool thing, because it lets you use the cloud the way the cloud was intended to
be used.

If you use rebuild, while it is rebuilding, your service is unavailable. If you
use create/wait/delete you have a chance to automate the transition from the
old to new instance.
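
The create/wait/delete flow can be sketched as follows. This is a toy model under stated assumptions: `FakeCloud` and its methods are hypothetical and are not the Heat or Nova client APIs, and `switch_traffic` is a placeholder for whatever moves users from the old instance to the new one.

```python
import itertools


class FakeCloud:
    """Hypothetical in-memory cloud used only to show the ordering of steps."""

    def __init__(self):
        self.servers = {}
        self._ids = itertools.count(1)

    def create_server(self, name, key_name):
        server_id = next(self._ids)
        self.servers[server_id] = {
            "name": name, "key_name": key_name, "status": "BUILD"}
        return server_id

    def wait_active(self, server_id):
        # A real client would poll; here the server becomes ACTIVE at once.
        self.servers[server_id]["status"] = "ACTIVE"

    def delete_server(self, server_id):
        del self.servers[server_id]


def replace_server(cloud, old_id, new_key_name, switch_traffic):
    """Create a replacement, wait for it, switch over, then delete the old one."""
    old = cloud.servers[old_id]
    new_id = cloud.create_server(old["name"], new_key_name)
    cloud.wait_active(new_id)
    # The old server is still up here, so the switch can be checked or aborted.
    switch_traffic(old_id, new_id)
    cloud.delete_server(old_id)
    return new_id
```

The point of the ordering is that the old instance keeps serving until the new one is ready, which a rebuild cannot offer.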

responded Nov 4, 2017 by Clint_Byrum (40,940 points)   4 5 9
...