
[openstack-dev] [TripleO][Kolla][Heat][Higgins][Magnum][Kuryr] Gap analysis: Heat as a k8s orchestrator

0 votes

I spent a bit of time exploring the idea of using Heat as an external
orchestration layer on top of Kubernetes - specifically in the case of
TripleO controller nodes but I think it could be more generally useful
too - but eventually came to the conclusion it doesn't work yet, and
probably won't for a while. Nevertheless, I think it's helpful to
document a bit to help other people avoid going down the same path, and
also to help us focus on working toward the point where it is
possible, since I think there are other contexts where it would be
useful too.

We tend to refer to Kubernetes as a "Container Orchestration Engine", but
it does not actually do any orchestration, unless you count just
starting everything at roughly the same time as 'orchestration'. Which I
wouldn't. You generally handle any orchestration requirements between
services within the containers themselves, possibly using external
services like etcd to co-ordinate. (The Kubernetes project refers to this
as "choreography", and explicitly disclaims any attempt at orchestration.)

What Kubernetes does do is more like an actively-managed version of
Heat's SoftwareDeploymentGroup (emphasis on the Group). Brief recap:
SoftwareDeploymentGroup is a type of ResourceGroup; you give it a map of
resource names to server UUIDs and it creates a SoftwareDeployment for
each server. You have to generate the list of servers somehow to give it
(the easiest way is to obtain it from the output of another
ResourceGroup containing the servers). If e.g. a server goes down you
have to detect that externally, and trigger a Heat update that removes
it from the templates, redeploys a replacement server, and regenerates
the server list before a replacement SoftwareDeployment is created. In
contrast, Kubernetes runs on a cluster of servers, can use rules to
determine where to run containers, and can very quickly redeploy without
external intervention in response to a server or container falling over.
(It also does rolling updates, which Heat can also do, albeit in a
somewhat hacky way when it comes to SoftwareDeployments - which we're
planning to fix.)
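
To make the recap concrete, here is a minimal HOT sketch of that wiring;
the resource names, image, flavor, and config payload are illustrative
placeholders, not anything from TripleO's actual templates:

  heat_template_version: 2015-10-15

  resources:
    deploy_config:
      type: OS::Heat::SoftwareConfig
      properties:
        group: script
        config: |
          #!/bin/bash
          echo "configuring service"   # placeholder payload

    servers:
      type: OS::Heat::ResourceGroup
      properties:
        count: 3
        resource_def:
          type: OS::Nova::Server
          properties:
            image: controller-image    # placeholder
            flavor: m1.large           # placeholder

    deploy_group:
      type: OS::Heat::SoftwareDeploymentGroup
      properties:
        config: {get_resource: deploy_config}
        # refs_map yields the map of resource names to server UUIDs
        # that SoftwareDeploymentGroup expects.
        servers: {get_attr: [servers, refs_map]}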

So this seems like an opportunity: if the dependencies between services
could be encoded in Heat templates rather than baked into the containers,
then we could use Heat as the orchestration layer, following the
dependency-based style I outlined in [1]. (TripleO is already moving in
this direction with the way that composable roles use
SoftwareDeploymentGroups.) One caveat is that fully adopting this style
likely rules out, for all practical purposes, the current Pacemaker-based
HA solution. We'd need to move to a lighter-weight HA solution, but I
know that TripleO is considering that anyway.
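
As an illustration of the dependency-based style, service ordering can be
expressed with ordinary Heat dependencies between deployment groups. This
is a hypothetical fragment continuing the sketch above; database_config
and api_config are assumed SoftwareConfig resources:

    database_deploy:
      type: OS::Heat::SoftwareDeploymentGroup
      properties:
        config: {get_resource: database_config}
        servers: {get_attr: [servers, refs_map]}

    # Heat will not start the API deployment until the database
    # deployment has completed on every server - the ordering lives
    # in the template instead of inside the container images.
    api_deploy:
      type: OS::Heat::SoftwareDeploymentGroup
      depends_on: database_deploy
      properties:
        config: {get_resource: api_config}
        servers: {get_attr: [servers, refs_map]}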

What's more, though: assuming this could be made to work for a Kubernetes
cluster, a couple of remappings in the Heat environment file should get
you an otherwise-equivalent single-node non-HA deployment basically for
free. That's particularly exciting to me because there are definitely
deployments of TripleO that need HA clustering and deployments that
don't, and the latter wouldn't want to pay the complexity cost of running
Kubernetes when they make no real use of it.

So you'd have a Heat resource type for the controller cluster that maps
to either an OS::Nova::Server or (the equivalent of) an OS::Magnum::Bay,
and a bunch of software deployments that map to either an
OS::Heat::SoftwareDeployment that calls (I assume) docker-compose
directly, or a Kubernetes Pod resource to be named later.
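
The remapping could look something like the following pair of environment
files. The TripleO::* aliases and the OS::Kubernetes::Pod type are
hypothetical placeholders; in practice each alias would probably point at
a provider template that adapts the differing property schemas:

  # ha.env - Kubernetes-based HA deployment
  resource_registry:
    TripleO::ControllerCluster: OS::Magnum::Bay
    TripleO::ServiceDeployment: OS::Kubernetes::Pod   # does not exist yet

  # single-node.env - same templates, non-HA deployment
  resource_registry:
    TripleO::ControllerCluster: OS::Nova::Server
    TripleO::ServiceDeployment: OS::Heat::SoftwareDeployment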

The first obstacle is that we'd need that Kubernetes Pod resource in
Heat. Currently there is no such resource type, and the OpenStack API
that would be expected to provide it (Magnum's /containers endpoint) is
being deprecated, so that's not a long-term solution.[2] Some folks from
the Magnum community may or may not be working on a separate project
(which may or may not be called Higgins) to do that. It'd be some time
away though.

An alternative, though not a good one, would be to create a Kubernetes
resource type in Heat that has the credentials passed in somehow. I'm
very against that though. Heat is just not good at handling credentials
other than Keystone ones. We haven't ever created a resource type like
this before, except for the Docker one in /contrib that serves as a
prime example of what not to do. And if it doesn't make sense to wrap
an OpenStack API around this then IMO it isn't going to make any more
sense to wrap a Heat resource around it.

A third option might be a SoftwareDeployment, possibly on one of the
controller nodes themselves, that calls the k8s client. (We could create
a software deployment hook to make this easy.) That would suffer from
all of the same issues that TripleO currently has around having to
choose a server on which to deploy, though.
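
A rough sketch of that third option, assuming a plain script-based config
rather than a dedicated hook; the pod manifest path and parameter name
are placeholders for illustration:

  heat_template_version: 2015-10-15

  parameters:
    controller_server_id:
      type: string
      description: UUID of the server that will run the k8s client

  resources:
    k8s_apply_config:
      type: OS::Heat::SoftwareConfig
      properties:
        group: script
        config: |
          #!/bin/bash
          # Create the pod via the k8s client on the chosen node; the
          # manifest path is a placeholder for illustration.
          kubectl create -f /etc/kubernetes/manifests/service-pod.yaml

    k8s_apply:
      type: OS::Heat::SoftwareDeployment
      properties:
        config: {get_resource: k8s_apply_config}
        # The awkward part: we still have to pick one server.
        server: {get_param: controller_server_id}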

The second obstacle is networking. TripleO has some pretty complicated
networking requirements (specifically, network isolation for the various
services) that for now can't be supported when deploying a cluster with
Magnum. The Kuryr project is working on improved networking for Magnum,
but I don't know whether this use case would be covered.

There's also the issue that IIUC Magnum operates its Neutron L3 agents
in such a way that connectivity to the user nodes is guaranteed only if
Magnum itself is running in an HA cloud. This is a problematic
assumption in general, but it's particularly problematic in the case of
the TripleO undercloud, which is not HA and which we very much do not
want to be in the networking path for the overcloud controller nodes.
Again, I don't know if this will be resolved by Kuryr or when.

Magnum does offer the option to pass a custom template, and I assume
that would allow us to set up the networking the way we want it.
However, TripleO uses all kinds of tricks with the environment and
parameters, so there'd quite likely need to be some enhancements to both
Heat (in order to access the current environment from within a template)
and Magnum (to pass an environment along with the template) to support that.

At that point it's a legitimate question to ask what exactly Magnum is
buying us if TripleO has to maintain its own Kubernetes deployment
templates anyway. I can think of only two things: an easier transition
later if we do believe that the networking stuff will be resolved, and
the /containers API. And the /containers API is being deprecated.

In that sense, the Magnum/Higgins split could be a good thing for the
Heat+Kubernetes use case in the long term - if we had a
Keystone-authenticated API that allowed Heat to make use of any k8s
cluster, not just those deployed via Magnum, then Magnum could be cut
out of the loop in those cases where networking issues preclude its use.

In the short term, though, there seem to be a number of obstacles.
Perhaps some of the folks involved in the relevant projects could
comment on when/if those are likely to be resolved.

cheers,
Zane.

[1]
http://lists.openstack.org/pipermail/openstack-dev/2016-March/090055.html
[2] https://etherpad.openstack.org/p/newton-magnum-unified-abstraction


OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
asked May 27, 2016 in openstack-dev by Zane_Bitter

6 Responses

0 votes

Hi Zane,

I've been working on the k8s side of the equation recently...

See these two PRs:
https://github.com/kubernetes/kubernetes/pull/25391
https://github.com/kubernetes/kubernetes/pull/25624

I'm still hopeful these can make k8s 1.3 as experimental plugins. There
is Keystone username/password auth support in 1.2 & 1.3, but it is
unsuitable for Heat usage, and it does not support authorization at all.

Once these patches are in, Heat, Horizon, and Higgins should be able to
use the k8s API. I believe they are complete enough for testing now,
though, if you want to build it yourself.

There will also need to be a small patch to Magnum to set the right
flags to bind the deployed k8s to the local cloud, if you want to use
Magnum to deploy.

Once the patches are in, I was thinking about taking a stab at a Heat
resource for deployments, but if you can get to it before I can, that
would be great too. :)

Thanks,
Kevin


responded May 27, 2016 by Fox,_Kevin_M
0 votes

-----Original Message-----
From: Zane Bitter [mailto:zbitter@redhat.com]
Sent: May-27-16 6:31 PM
To: OpenStack Development Mailing List
Subject: [openstack-dev] [TripleO][Kolla][Heat][Higgins][Magnum][Kuryr]
Gap analysis: Heat as a k8s orchestrator

> An alternative, though not a good one, would be to create a Kubernetes
> resource type in Heat that has the credentials passed in somehow. I'm
> very against that though. Heat is just not good at handling credentials
> other than Keystone ones. We haven't ever created a resource type like
> this before, except for the Docker one in /contrib that serves as a
> prime example of what not to do. And if it doesn't make sense to wrap
> an OpenStack API around this then IMO it isn't going to make any more
> sense to wrap a Heat resource around it.

There are ways to alleviate the credential handling issue. First,
Kubernetes supports Keystone authentication [1]. Magnum has a BP [2] to
turn on this feature. In addition, there is a Kubernetes Python binding
[3] under development. By combining all these efforts, it is possible to
create a Kubernetes resource in Heat without handling credentials other
than the Keystone ones.

[1] http://kubernetes.io/docs/admin/authentication/
[2] https://blueprints.launchpad.net/magnum/+spec/keystone-for-k8s-bay
[3] https://github.com/openstack/python-k8sclient

> A third option might be a SoftwareDeployment, possibly on one of the
> controller nodes themselves, that calls the k8s client. (We could
> create a software deployment hook to make this easy.) That would
> suffer from all of the same issues that TripleO currently has around
> having to choose a server on which to deploy, though.

From my point of view, the Kubernetes Heat resource approach is possibly
more user-friendly than the SoftwareDeployment approach, because the
SoftwareDeployment and SoftwareDeploymentGroup resources are very
advanced and complex; it might take a while for users to figure out how
to use them. The requirement to build a custom image is another barrier
to entry. In Magnum, we explored the possibility of leveraging SD/SDG in
the Atomic-based COEs, but paused that direction until the os-* tools
have been fully containerized [4] so that those resources can work on
any OS.

[4] https://bugs.launchpad.net/magnum/+bug/1424969

> The second obstacle is networking. TripleO has some pretty complicated
> networking requirements (specifically, network isolation for the
> various services) that for now can't be supported when deploying a
> cluster with Magnum. The Kuryr project is working on improved
> networking for Magnum, but I don't know whether this use case would be
> covered.

Sorry, I don't get this. Mind elaborating on the details of your network
requirements?


> Magnum does offer the option to pass a custom template, and I assume
> that would allow us to set up the networking the way we want it.
> However, TripleO uses all kinds of tricks with the environment and
> parameters, so there'd quite likely need to be some enhancements to
> both Heat (in order to access the current environment from within a
> template) and Magnum (to pass an environment along with the template)
> to support that.

Magnum prefers to leverage the Heat conditionals feature instead of
environments, because we expect Heat conditionals to make our Heat
templates simpler and easier to maintain. If we can pass a parameter to
the Heat template and use conditionals to interpret it, I am not sure we
also need to support passing environments as well (it seems conditionals
can do whatever environments can do).
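
A rough sketch of that conditionals style, using the condition syntax
being proposed for Heat at the time; the parameter, condition, and
resource names (and the placeholder properties) are illustrative:

  heat_template_version: 2016-10-14

  parameters:
    is_ha:
      type: boolean
      default: false

  conditions:
    deploy_ha: {get_param: is_ha}
    single_node: {not: deploy_ha}

  resources:
    controller_bay:
      type: OS::Magnum::Bay
      condition: deploy_ha            # created only in the HA case
      properties:
        baymodel: k8s-baymodel        # placeholder baymodel name

    controller_server:
      type: OS::Nova::Server
      condition: single_node          # created only in the non-HA case
      properties:
        image: controller-image       # placeholder
        flavor: m1.large              # placeholder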

> At that point it's a legitimate question to ask what exactly Magnum is
> buying us if TripleO has to maintain its own Kubernetes deployment
> templates anyway. I can think of only two things: an easier transition
> later if we do believe that the networking stuff will be resolved, and
> the /containers API. And the /containers API is being deprecated.
>
> In that sense, the Magnum/Higgins split could be a good thing for the
> Heat+Kubernetes use case in the long term - if we had a
> Keystone-authenticated API that allowed Heat to make use of any k8s
> cluster, not just those deployed via Magnum, then Magnum could be cut
> out of the loop in those cases where networking issues preclude its use.

Wearing my Magnum PTL hat, I am sorry to hear that Magnum couldn't
resolve your problem immediately. Wearing my Higgins core hat, I am
thrilled that Higgins is under your consideration for the long term.

responded May 28, 2016 by hongbin.lu_at_huawei
0 votes

Hongbin,

Re: network coverage, he is talking about the best-practice way to
deploy an OpenStack cloud. I have a diagram here:

http://www.gliffy.com/go/publish/10486755

I think what Zane is getting at is that making the network diagram above
magically map into Kubernetes is not possible at present, and may never
be possible.

Regards
-steve

responded May 29, 2016 by Steven_Dake_(stdake)
0 votes

Quick question below.

On 5/28/16, 1:16 PM, "Hongbin Lu" hongbin.lu@huawei.com wrote:

> Wearing my Magnum PTL hat, I am sorry to hear that Magnum couldn't
> resolve your problem immediately. Wearing my Higgins core hat, I am
> thrilled that Higgins is under your consideration for the long term.

Who is the PTL and core team of Higgins? I didn't see an announcement on
the mailing list, although granted I probably missed it given the volume
we have :)

Regards
-steve

responded May 29, 2016 by Steven_Dake_(stdake)
0 votes

-----Original Message-----
From: Steven Dake (stdake) [mailto:stdake@cisco.com]
Sent: May-29-16 3:29 PM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev]
[TripleO][Kolla][Heat][Higgins][Magnum][Kuryr] Gap analysis: Heat as a
k8s orchestrator

> Quick question below.
>
> Who is the PTL and core team of Higgins? I didn't see an announcement
> on the mailing list, although granted I probably missed it given the
> volume we have :)

Here is the core team:
https://review.openstack.org/#/admin/groups/1382,members . There is no
official Higgins PTL, since there has been no PTL election yet. Right
now, I am coordinating the contributions and running the weekly team
meeting. You can consider me the temporary PTL until an official PTL is
elected. We will find the right time to hold a PTL election.

Regards
-steve

In the short term, though, there seems to be a number of obstacles.
Perhaps some of the folks involved in the relevant projects could
comment on when/if those are likely to be resolved.

cheers,
Zane.

[1]
http://lists.openstack.org/pipermail/openstack-dev/2016-
March/090055.html
[2]https://etherpad.openstack.org/p/newton-magnum-unified-
abstraction


__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
responded May 29, 2016 by hongbin.lu_at_huawei

On 29/05/16 08:16, Hongbin Lu wrote:

-----Original Message-----
From: Zane Bitter [mailto:zbitter@redhat.com]
Sent: May-27-16 6:31 PM
To: OpenStack Development Mailing List
Subject: [openstack-dev] [TripleO][Kolla][Heat][Higgins][Magnum][Kuryr]
Gap analysis: Heat as a k8s orchestrator

I spent a bit of time exploring the idea of using Heat as an external
orchestration layer on top of Kubernetes - specifically in the case of
TripleO controller nodes but I think it could be more generally useful
too - but eventually came to the conclusion it doesn't work yet, and
probably won't for a while. Nevertheless, I think it's helpful to
document a bit to help other people avoid going down the same path, and
also to help us focus on working toward the point where it is
possible, since I think there are other contexts where it would be
useful too.

We tend to refer to Kubernetes as a "Container Orchestration Engine"
but it does not actually do any orchestration, unless you count just
starting everything at roughly the same time as 'orchestration'. Which
I wouldn't. You generally handle any orchestration requirements between
services within the containers themselves, possibly using external
services like etcd to co-ordinate. (The Kubernetes project refer to
this as "choreography", and explicitly disclaim any attempt at
orchestration.)

What Kubernetes does do is more like an actively-managed version of
Heat's SoftwareDeploymentGroup (emphasis on the Group). Brief recap:
SoftwareDeploymentGroup is a type of ResourceGroup; you give it a map
of resource names to server UUIDs and it creates a SoftwareDeployment
for each server. You have to generate the list of servers somehow to
give it (the easiest way is to obtain it from the output of another
ResourceGroup containing the servers). If e.g. a server goes down you
have to detect that externally, and trigger a Heat update that removes
it from the templates, redeploys a replacement server, and regenerates
the server list before a replacement SoftwareDeployment is created. In
contrast, Kubernetes is running on a cluster of servers, can use rules
to determine where to run containers, and can very quickly redeploy
without external intervention in response to a server or container
falling over. (It also does rolling updates, which Heat can also do
albeit in a somewhat hacky way when it comes to SoftwareDeployments -
which we're planning to fix.)
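A minimal sketch of the SoftwareDeploymentGroup pattern described above (image/flavor names are placeholders; refs_map is a newer ResourceGroup attribute, and older templates assemble the same map from group outputs instead):

heat_template_version: 2015-10-15

resources:
  servers:
    type: OS::Heat::ResourceGroup
    properties:
      count: 3
      resource_def:
        type: OS::Nova::Server
        properties:
          image: centos7      # placeholder
          flavor: m1.small    # placeholder
          user_data_format: SOFTWARE_CONFIG

  configure_service:
    type: OS::Heat::SoftwareConfig
    properties:
      group: script
      config: |
        #!/bin/sh
        echo "configure one service on each member here"

  deployments:
    type: OS::Heat::SoftwareDeploymentGroup
    properties:
      config: {get_resource: configure_service}
      # the map of resource names to server UUIDs, taken from the group
      servers: {get_attr: [servers, refs_map]}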

So this seems like an opportunity: if the dependencies between services
could be encoded in Heat templates rather than baked into the
containers then we could use Heat as the orchestration layer following
the dependency-based style I outlined in [1]. (TripleO is already
moving in this direction with the way that composable-roles uses
SoftwareDeploymentGroups.) One caveat is that fully using this style
likely rules out for all practical purposes the current Pacemaker-based
HA solution. We'd need to move to a lighter-weight HA solution, but I
know that TripleO is considering that anyway.

What's more though, assuming this could be made to work for a
Kubernetes cluster, a couple of remappings in the Heat environment file
should get you an otherwise-equivalent single-node non-HA deployment
basically for free. That's particularly exciting to me because there
are definitely deployments of TripleO that need HA clustering and
deployments that don't and which wouldn't want to pay the complexity
cost of running Kubernetes when they don't make any real use of it.

So you'd have a Heat resource type for the controller cluster that maps
to either an OS::Nova::Server or (the equivalent of) an OS::Magnum::Bay,
and a bunch of software deployments that map to either a
OS::Heat::SoftwareDeployment that calls (I assume) docker-compose
directly or a Kubernetes Pod resource to be named later.
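The switch between the two deployment styles would then live in the environment; a hypothetical sketch (none of these TripleO-namespaced type names exist today):

# k8s-backed HA variant
resource_registry:
  OS::TripleO::ControllerCluster: templates/controller-k8s-bay.yaml
  OS::TripleO::ServiceDeployment: templates/k8s-pod-deployment.yaml

# single-node non-HA variant: the same names remapped to plain resources
# resource_registry:
#   OS::TripleO::ControllerCluster: templates/controller-nova-server.yaml
#   OS::TripleO::ServiceDeployment: OS::Heat::SoftwareDeployment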

The first obstacle is that we'd need that Kubernetes Pod resource in
Heat. Currently there is no such resource type, and the OpenStack API
that would be expected to provide it (Magnum's /container endpoint) is
being deprecated, so that's not a long-term solution.[2] Some folks
from the Magnum community may or may not be working on a separate
project (which may or may not be called Higgins) to do that. It'd be
some time away though.

An alternative, though not a good one, would be to create a Kubernetes
resource type in Heat that has the credentials passed in somehow. I'm
very against that though. Heat is just not good at handling credentials
other than Keystone ones. We haven't ever created a resource type like
this before, except for the Docker one in /contrib that serves as a
prime example of what not to do. And if it doesn't make sense to wrap
an OpenStack API around this then IMO it isn't going to make any more
sense to wrap a Heat resource around it.
There are ways to alleviate the credential-handling issue. First, Kubernetes supports Keystone authentication [1], and Magnum has a BP [2] to turn on this feature. In addition, there is a Kubernetes Python binding [3] under development. By combining all these efforts, it should be possible to create a Kubernetes resource in Heat without handling credentials other than Keystone ones.

[1] http://kubernetes.io/docs/admin/authentication/
[2] https://blueprints.launchpad.net/magnum/+spec/keystone-for-k8s-bay
[3] https://github.com/openstack/python-k8sclient

A third option might be a SoftwareDeployment, possibly on one of the
controller nodes themselves, that calls the k8s client. (We could
create a software deployment hook to make this easy.) That would suffer
from all of the same issues that TripleO currently has about having to
choose a server on which to deploy though.
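A minimal sketch of that third option, assuming the k8s client is already present on the chosen node (parameter names are hypothetical):

heat_template_version: 2015-10-15

parameters:
  chosen_controller:
    type: string
    description: UUID of the node that runs the client (the awkward choice)
  pod_manifest:
    type: string

resources:
  create_pod_config:
    type: OS::Heat::SoftwareConfig
    properties:
      group: script
      inputs:
        - name: manifest
      config: |
        #!/bin/sh
        # script-hook inputs arrive as environment variables
        echo "$manifest" | kubectl create -f -

  create_pod:
    type: OS::Heat::SoftwareDeployment
    properties:
      config: {get_resource: create_pod_config}
      server: {get_param: chosen_controller}
      input_values:
        manifest: {get_param: pod_manifest}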
From my point of view, the Kubernetes Heat resource approach is possibly more user-friendly than the SoftwareDeployment approach.
Having kubernetes accept a keystone token would likely meet TripleO's
requirements and justify creating a heat resource which interacts
directly with kubernetes. A general solution would also need some
multi-tenancy separation - and that is what magnum does.

So we could have an in-tree kubernetes resource as long as we don't do
what we did for the docker resource and allow the endpoint/auth to be
specified via resource properties. The endpoint would have to come from
a magnum resource, or the keystone catalog, or something else beyond the
influence of the template author.
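A hypothetical sketch of what that could look like (OS::Kubernetes::Pod does not exist today, and the api_address attribute on the bay is an assumption):

resources:
  bay:
    type: OS::Magnum::Bay
    properties:
      baymodel: k8s-baymodel   # placeholder
      node_count: 3

  web_pod:
    type: OS::Kubernetes::Pod   # hypothetical resource type
    properties:
      # endpoint derived from the bay resource, not typed in by the
      # template author; auth would come from the stack's Keystone context
      endpoint: {get_attr: [bay, api_address]}
      definition:
        apiVersion: v1
        kind: Pod
        metadata:
          name: web
        spec:
          containers:
            - name: web
              image: nginx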

That is because SoftwareDeployment and SoftwareDeploymentGroup resources are very advanced and complex; it might take a while for users to figure out how to use them. The requirement to build a custom image is another barrier to entry. In Magnum, we explored the possibility of leveraging SD/SDG in the Atomic-based COEs, but stopped going in that direction until the os-* tools have been fully containerized [4] so that those resources can work on any OS.

[4] https://bugs.launchpad.net/magnum/+bug/1424969
There have been containerized heat-agent options for quite some time
now. heat-templates hosts one using docker-compose:
https://github.com/openstack/heat-templates/tree/master/hot/software-config/heat-container-agent

But likely a better starting point is the one being developed in
tripleo-common:
https://github.com/openstack/tripleo-common/tree/master/heat_docker_agent

It uses docker-compose but will soon switch to using docker directly,
while using an identical configuration format.

The secondary obstacle is networking. TripleO has some pretty
complicated networking requirements (specifically network isolation for
the various services) that for now can't be supported when deploying a
cluster with Magnum. The Kuryr project is working on improved
networking for Magnum, but I don't know whether this is a use-case that
would be covered.
Sorry, I don't get this. Mind elaborating the details of your network requirements?
IAmNotANetworkingExpert, but here is my understanding of the requirements.

TripleO uses neutron to define an overlay network. The architecture of
this network is completely flexible based on the deployer's specific
requirements for network isolation of various traffic classes. Each node
has corresponding os-net-config data deployed to it to configure its X
interfaces onto Y isolated networks (using bonding if X != Y).
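For illustration, a minimal os-net-config sketch in that style, trunking two isolated networks over a bond of two NICs (all names, VLAN IDs and addresses are placeholders):

network_config:
  - type: ovs_bridge
    name: br-bond
    members:
      - type: ovs_bond
        name: bond1
        members:
          - type: interface
            name: nic2
          - type: interface
            name: nic3
      - type: vlan
        device: bond1
        vlan_id: 201
        addresses:
          - ip_netmask: 172.16.1.10/24   # e.g. internal_api
      - type: vlan
        device: bond1
        vlan_id: 202
        addresses:
          - ip_netmask: 172.16.2.10/24   # e.g. storage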

Since we already have the overhead of one overlay network, we have a
hard requirement that kubernetes not add another one. I believe this
rules out any solution involving flannel.

It sounds like full integration between Kuryr and Kubernetes is the best
chance of providing what we need, and I see two new repos that look like
they are intended to host this integration:
http://git.openstack.org/cgit/openstack/kuryr-kubernetes
http://git.openstack.org/cgit/openstack/kuryr-libnetwork

Tripleo defines neutron ports for the overcloud service VIPs and
configures an HA load balancer in the overcloud to back them. It would be
very nice if the kuryr-kubernetes integration could integrate kubernetes
service VIPs with already-created neutron ports.

There's also the issue that IIUC Magnum operates its Neutron L3 agents
in such a way that connectivity to the user nodes is guaranteed only if
Magnum itself is running in an HA cloud. This is a problematic
assumption in general, but it's particularly problematic in the case of
the TripleO undercloud, which is not HA and which we very much do not
want to be in the networking path for the overcloud controller nodes.
Again, I don't know if this will be resolved by Kuryr or when.

Magnum does offer the option to pass a custom template, and I assume
that would allow us to set up the networking the way we want it.
However, TripleO uses all kinds of tricks with the environment and
parameters, so there'd quite likely need to be some enhancements to
both Heat (in order to access the current environment from within a
template) and Magnum (to pass an environment along with the template)
to support that.
Magnum prefers to leverage the Heat conditionals feature instead of leveraging environments, because we expected Heat conditionals would make our Heat templates simpler and easier to maintain. If we can pass a parameter to a Heat template and use conditionals to interpret it, I am not sure we need to support passing environments as well (it seems conditionals can do whatever environments can do).

I assume that Magnum's HA requirement is due to the overlay network it
manages; needing an HA undercloud is a second reason why we wouldn't
want to add another overlay.

At that point it's a legitimate question to ask what exactly Magnum is
buying us if TripleO has to maintain its own Kubernetes deployment
templates anyway. I can think of only two things: an easier transition
later if we do believe that the networking stuff will be resolved, and
the /containers API. And the /containers API is being deprecated.

In that sense, the Magnum/Higgins split could be a good thing for the
Heat+Kubernetes use case in the long term - if we had a
Keystone-authenticated API that can allow Heat to make use of any k8s
cluster, not just those deployed via Magnum, then Magnum could be cut
out of the loop in those cases where networking issues preclude its use.
Wearing my Magnum PTL hat, I am sorry to hear Magnum couldn't resolve your problem immediately. Wearing my Higgins core hat, I am thrilled that Higgins is under your consideration in the long term.

I'd like to think that if kubernetes had a viable no-overlay
neutron-integrated networking option then Magnum would be prepared to
support it.

In this case TripleO could consider using Magnum to manage the cluster
via the Bay API. All I can see for achieving this is the Bay's
node_count. That is fine for adding nodes to the controller cluster.
Magnum is backed by Heat, so it shouldn't be hard to expose a mechanism
to scale down by removing specific nodes (see the sketch after this
list), but TripleO needs more than that. There may need to be REST API
exposure and some COE integration to do things like:
- evacuate running containers from a node in preparation for scale down,
replacement, or temporary removal for repair
- fencing a misbehaving node
- remove one node and add another in a single operation
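On the scale-down point, plain Heat can already target specific group members for removal via ResourceGroup removal_policies; a minimal sketch (image/flavor are placeholders):

resources:
  nodes:
    type: OS::Heat::ResourceGroup
    properties:
      count: 4                   # reduced from 5 in the same update
      removal_policies:
        - resource_list: ['2']   # name the specific member to drop
      resource_def:
        type: OS::Nova::Server
        properties:
          image: fedora-atomic   # placeholder
          flavor: m1.large       # placeholder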

Some discussion on whether it would be appropriate for Magnum to perform
these functions would be interesting. And if not Magnum then what?
TripleO has something now, but it's been quite a journey to get
heat/nova/ironic/pacemaker to work together in scale-down and node
replacement scenarios.

In the short term, though, there seems to be a number of obstacles.
Perhaps some of the folks involved in the relevant projects could
comment on when/if those are likely to be resolved.

cheers,
Zane.

[1] http://lists.openstack.org/pipermail/openstack-dev/2016-March/090055.html
[2] https://etherpad.openstack.org/p/newton-magnum-unified-abstraction



OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
responded May 30, 2016 by Steve_Baker
...