
[openstack-dev] [trove][all][tc] A proposal to rearchitect Trove

0 votes

Trove has evolved rapidly over the past several years, since its integration in
Icehouse, when it supported only single instances of a few databases. Today it
supports a dozen databases, including clusters and replication.

The user survey [1] indicates that while there is strong interest in the
project, there are few large production deployments known to the development
team.

Recent changes in the OpenStack community at large (company realignments,
acquisitions, layoffs) and the Trove community in particular, coupled with
a mounting burden of technical debt have prompted me to make this proposal
to re-architect Trove.

This email summarizes several of the issues that face the project, both
structurally and architecturally. This email does not claim to include a
detailed specification for what the new Trove would look like, merely the
recommendation that the community should come together and develop one so
that the project can be sustainable and useful to those who wish to use it
in the future.

TL;DR

Trove, with support for a dozen or so databases today, finds itself in a
bind: it has few developers and a code base carrying a significant amount of
technical debt.

Some architectural choices which the team made over the years have
consequences which make the project less than ideal for deployers.

Given that there are no major production deployments of Trove at present, we
have an opportunity to reset the project, learn from our v1, and come up with
a strong v2.

An important aspect of making this proposal work is that we seek to
eliminate the effort (planning and coding) involved in migrating existing
Trove v1 deployments to the proposed Trove v2. Effectively, with work
beginning on Trove v2 as proposed here, Trove v1 as released with Pike will
be marked as deprecated and users will have to migrate to Trove v2 when it
becomes available.

While I would very much like to continue to support the users on Trove v1
through this transition, the simple fact is that absent community
participation this will be impossible. Furthermore, given that there are no
production deployments of Trove at this time, it seems pointless to build
that upgrade path from Trove v1 to Trove v2; it would be the proverbial
bridge from nowhere.

This (previous) statement is, I realize, contentious. There are those who
have told me that an upgrade path must be provided, and there are those who
have told me of unnamed deployments of Trove that would suffer. To this,
all I can say is that if an upgrade path is of value to you, then please
commit the development resources to participate in the community to make
that possible. But equally, preventing a v2 of Trove or delaying it will
only make the v1 that we have today less valuable.

We have learned a lot from v1, and the hope is that we can address that in
v2. Some of the more significant things that I have learned are:

  • We should adopt a versioned front-end API from the very beginning; making
    the REST API versioned is not a ‘v2 feature’ (see the sketch after this list)

  • A guest agent running on a tenant instance, with connectivity to a shared
    management message bus is a security loophole; encrypting traffic,
    per-tenant-passwords, and any other scheme is merely lipstick on a security
    hole

  • Reliance on Nova for compute resources is fine, but dependence on Nova VM
    specific capabilities (like instance rebuild) is not; it makes things like
    containers or bare-metal second class citizens

  • A fair portion of what Trove does is resource orchestration; don’t
    reinvent the wheel; there’s Heat for that. Admittedly, Heat wasn’t as far
    along when Trove got started, but that’s not the case today, and we have an
    opportunity to fix that now

  • A similarly significant portion of what Trove does is to implement a
    state-machine that will perform specific workflows involved in implementing
    database specific operations. This makes the Trove taskmanager a stateful
    entity. Some of the operations could take a fair amount of time. This is a
    serious architectural flaw

  • Tenants should not ever be able to directly interact with the underlying
    storage and compute used by database instances; that should be the default
    configuration, not an untested deployment alternative

  • The CI should test all databases that are considered to be ‘supported’
    without excessive use of resources in the gate; better code modularization
    will help determine the tests which can safely be skipped in testing changes

  • Clusters should be first-class citizens, not an afterthought;
    single-instance databases may be the ‘special case’, not the other way around

  • The project must provide guest images (or at least complete tooling for
    deployers to build these); while the project can’t distribute operating
    systems and database software, the current deployment model merely impedes
    adoption

  • Clusters spanning OpenStack deployments are a real thing that must be
    supported
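
To make the first bullet concrete, here is a minimal sketch of what a
version-prefixed API could look like. This is illustrative only (it assumes
Flask and made-up resource names, not actual Trove code), but it shows version
discovery at the root and a /v1.0/ prefix from day one:

    # Illustrative sketch only: version discovery at '/', plus a /v1.0/ prefix
    # on every resource, so a future /v2.0/ can coexist with /v1.0/.
    from flask import Flask, jsonify

    app = Flask(__name__)

    API_VERSIONS = [
        {"id": "v1.0", "status": "CURRENT",
         "links": [{"rel": "self", "href": "/v1.0/"}]},
    ]

    @app.route("/")
    def list_versions():
        # Clients negotiate a version here before touching any resources.
        return jsonify({"versions": API_VERSIONS})

    @app.route("/v1.0/instances", methods=["GET"])
    def list_instances_v1():
        # Placeholder handler; a real service would query its datastore registry.
        return jsonify({"instances": []})

    if __name__ == "__main__":
        app.run(port=8779)  # 8779 is the port the Trove API traditionally uses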

This might sound harsh; that isn’t the intent. Each of these is the
consequence of one or more perfectly rational decisions. Some of those
decisions have had unintended consequences, and others were made knowing
that we would be incurring some technical debt; debt we have not had the
time or resources to address. Fixing all of these is not impossible; it just
takes the dedication of resources by the community.

I do not have a complete design for what the new Trove would look like. For
example, I don’t know how we will interact with other projects (like Heat).
Many questions remain to be explored and answered.

Would it suffice to just use the existing Heat resources and build
templates around those, or would it be better to implement custom Trove
resources and then orchestrate things based on those resources?
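
For illustration of the first option (existing resources only), a sketch
along these lines would be enough to express a single-instance database
entirely in terms of resource types Heat already ships. It assumes
python-heatclient and keystoneauth1, and the endpoint, image, flavor, and
network names are placeholders:

    # Sketch only: provision compute + storage for one DB instance using only
    # existing Heat resource types (no custom Trove resources).
    from keystoneauth1.identity import v3
    from keystoneauth1 import session
    from heatclient import client as heat_client

    auth = v3.Password(auth_url="http://controller:5000/v3",  # placeholder endpoint
                       username="trove", password="secret",
                       project_name="service",
                       user_domain_name="Default", project_domain_name="Default")
    heat = heat_client.Client("1", session=session.Session(auth=auth))

    template = {
        "heat_template_version": "2016-10-14",
        "resources": {
            "db_volume": {"type": "OS::Cinder::Volume",
                          "properties": {"size": 10}},
            "db_server": {"type": "OS::Nova::Server",
                          "properties": {"flavor": "m1.medium",
                                         "image": "trove-mysql-guest",  # hypothetical image
                                         "networks": [{"network": "private"}]}},
            "db_attach": {"type": "OS::Cinder::VolumeAttachment",
                          "properties": {"instance_uuid": {"get_resource": "db_server"},
                                         "volume_id": {"get_resource": "db_volume"}}},
        },
    }

    # Heat then owns the create/rollback/delete orchestration for this stack.
    heat.stacks.create(stack_name="trove-mysql-1", template=template)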

Would Trove implement the workflows required for multi-stage database
operations by itself, or would it rely on some other project (say Mistral)
for this? Is Mistral really a workflow service, or just cron on steroids? I
don’t know the answer but I would like to find out.
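
For context on the workflow question, the kind of multi-stage operation at
issue can be modelled as an explicit, persisted state machine, whether that
lives inside Trove or in an external engine like Mistral. This is only a toy
sketch (the step functions and the in-memory "persistence" are stand-ins), but
it shows why saving state after every step matters for long-running operations:

    # Toy sketch of a resumable workflow: the current state is persisted after
    # every step, so a restarted worker can resume instead of holding the whole
    # multi-stage operation in the memory of one task manager process.
    STEPS = ["CREATE_VOLUME", "CREATE_SERVER", "CONFIGURE_DB", "ACTIVE"]

    def create_volume(ctx):
        ctx["volume_id"] = "vol-placeholder"   # stand-in for a Cinder call

    def create_server(ctx):
        ctx["server_id"] = "srv-placeholder"   # stand-in for a Nova call

    def configure_db(ctx):
        ctx["db_ready"] = True                 # stand-in for datastore setup

    HANDLERS = {"CREATE_VOLUME": create_volume,
                "CREATE_SERVER": create_server,
                "CONFIGURE_DB": configure_db}

    def save(task):
        print("persisted:", task["state"])     # a real system writes a DB row here

    def run(task):
        """Advance the task to ACTIVE, persisting progress after each step."""
        while task["state"] != "ACTIVE":
            HANDLERS[task["state"]](task["context"])
            task["state"] = STEPS[STEPS.index(task["state"]) + 1]
            save(task)

    run({"state": "CREATE_VOLUME", "context": {}})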

While we don’t have the answers to these questions, I think this is a
conversation that we must have, one that we must decide on, and then as a
community commit the resources required to make a Trove v2 which delivers
on the mission of the project; “To provide scalable and reliable Cloud
Database as a Service provisioning functionality for both relational and
non-relational database engines, and to continue to improve its
fully-featured and extensible open source framework.”[2]

Thanks,

-amrith

[1] https://www.openstack.org/assets/survey/April2017SurveyReport.pdf
[2] https://wiki.openstack.org/wiki/Trove#Mission_Statement


OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
asked Jul 18, 2017 in openstack-dev by amrith.kumar_at_gmai (3,580 points)   2 3

41 Responses

0 votes

Fox, Kevin M wrote:
[...]
If you build a Tessmaster clone just to do MariaDB, then you share nothing with the other communities and have to reinvent the wheel, yet again. Operators' load increases because the tool doesn't function like other tools.

If you rely on a container orchestration engine that's already cross-cloud and that can be easily deployed by a user or cloud operator, and fill in the gaps with what Trove wants to support (easy management of DBs), you get to reuse a lot of the commons, and the user's slight increase in investment in dealing with the bit of extra plumbing in there allows other things to also be easily added to their cluster. It's very rare that a user would need to deploy/manage only a database. The net load on the operator decreases, not increases.
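
As a purely illustrative sketch of the quoted "fill in the gaps" idea, a thin
DB-management service could drive an existing Kubernetes cluster through the
official kubernetes Python client instead of orchestrating VMs itself; the
image, namespace, and password below are placeholders, and this assumes a
reachable cluster and kubeconfig:

    # Sketch only: create a single-replica MariaDB StatefulSet on an existing
    # Kubernetes cluster, leaving scheduling/restarts/storage to the COE.
    from kubernetes import client, config

    config.load_kube_config()                      # assumes a local kubeconfig
    labels = {"app": "mariadb"}

    container = client.V1Container(
        name="mariadb",
        image="mariadb:10.1",                      # placeholder image tag
        env=[client.V1EnvVar(name="MYSQL_ROOT_PASSWORD", value="not-a-real-secret")],
        ports=[client.V1ContainerPort(container_port=3306)],
    )

    statefulset = client.V1StatefulSet(
        metadata=client.V1ObjectMeta(name="mariadb", labels=labels),
        spec=client.V1StatefulSetSpec(
            service_name="mariadb",
            replicas=1,
            selector=client.V1LabelSelector(match_labels=labels),
            template=client.V1PodTemplateSpec(
                metadata=client.V1ObjectMeta(labels=labels),
                spec=client.V1PodSpec(containers=[container]),
            ),
        ),
    )

    client.AppsV1Api().create_namespaced_stateful_set(namespace="default",
                                                      body=statefulset)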

I think the user-side tool could totally deploy on Kubernetes clusters
-- if that was the only possible target, that would make it a Kubernetes
tool more than an open infrastructure tool, but that's definitely a
possibility. I'm not sure work is needed there, though; there are already
tools (or charts) doing that?

For a server-side approach where you want to provide a DB-provisioning
API, I fear that making the functionality depend on K8s would mean
TroveV2/Hoard would not only depend on Heat and Nova, but also on
something that would deploy a Kubernetes cluster (Magnum?), which would
likely hurt its adoption (and reusability in simpler setups). Since
databases would just work perfectly well in VMs, it feels like a
gratuitous dependency addition?

We generally need to be very careful about creating dependencies between
OpenStack projects. On one side there are base services (like Keystone)
that we said it was alright to depend on, but depending on anything else
is likely to reduce adoption. Magnum adoption suffers from its
dependency on Heat. If Heat starts depending on Zaqar, we make the
problem worse. I understand it's a hard trade-off: you want to reuse
functionality rather than reinvent it in every project... we just need
to recognize the cost of doing that.

--
Thierry Carrez (ttx)


OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
responded Jun 22, 2017 by Thierry_Carrez (57,480 points)   3 8 13
0 votes

My $0.02.

That view of dependencies is why Kubernetes development is outpacing OpenStack's and some users are leaving, IMO. Not trying to be mean here, just trying to shine some light on this issue.

Kubernetes at its core has essentially something kind of equivalent to keystone (k8s RBAC), nova (container mgmt), cinder (pv/pvc/storageclasses), heat with convergence (deployments/daemonsets/etc), barbican (secrets), designate (kube-dns), and octavia (kube-proxy/svc/ingress) in one unit. Ops don't have to work hard to get all of it, users can assume it's all there, and devs don't have many silos to cross to implement features that touch multiple pieces.

This core functionality being combined has allowed them to land features that are really important to users but has proven difficult for OpenStack to do because of the silos. OpenStack's general pattern has been: stand up a new service for a new feature, then no one wants to depend on it, so it's ignored and each silo reimplements a lesser version of it itself.

The OpenStack commons then continues to suffer.

We need to stop this destructive cycle.

OpenStack needs to figure out how to increase its commons. Both internally and externally. etcd as a common service was a step in the right direction.

I think k8s needs to be another common service all the others can rely on. That could greatly simplify the rest of the OpenStack projects as a lot of its functionality no longer has to be implemented in each project.

We also need a way to break down the silo walls and allow more cross project collaboration for features. I fear the new push for letting projects run standalone will make this worse, not better, further fracturing OpenStack.

Thanks,
Kevin


From: Thierry Carrez [thierry@openstack.org]
Sent: Thursday, June 22, 2017 12:58 AM
To: openstack-dev@lists.openstack.org
Subject: Re: [openstack-dev] [trove][all][tc] A proposal to rearchitect Trove

[...]


OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
responded Jun 22, 2017 by Fox,_Kevin_M (29,360 points)   1 3 4
0 votes

2017-06-22 18:59 GMT+03:00 Fox, Kevin M Kevin.Fox@pnnl.gov:

[...]
This core functionality being combined has allowed them to land features
that are really important to users but has proven difficult for OpenStack
to do because of the silos. OpenStack's general pattern has been: stand up
a new service for a new feature, then no one wants to depend on it, so it's
ignored and each silo reimplements a lesser version of it itself.

Totally agree

[...]

--
Best regards,
Andrey Kurilin.


OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
responded Jun 22, 2017 by andr.kurilin_at_gmai (300 points)  
0 votes

On 06/22/2017 11:59 AM, Fox, Kevin M wrote:
My $0.02.

That view of dependencies is why Kubernetes development is outpacing OpenStack's and some users are leaving, IMO. Not trying to be mean here, just trying to shine some light on this issue.

Kubernetes at its core has essentially something kind of equivalent to keystone (k8s RBAC), nova (container mgmt), cinder (pv/pvc/storageclasses), heat with convergence (deployments/daemonsets/etc), barbican (secrets), designate (kube-dns), and octavia (kube-proxy/svc/ingress) in one unit. Ops don't have to work hard to get all of it, users can assume it's all there, and devs don't have many silos to cross to implement features that touch multiple pieces.

I think it's kind of hysterical that you're advocating a monolithic
approach when the thing you're advocating (k8s) is all about enabling
non-monolithic microservices architectures.

Look, the fact of the matter is that OpenStack's mission is larger than
that of Kubernetes. And to say that "Ops don't have to work hard" to get
and maintain a Kubernetes deployment (which, frankly, tends to be dozens
of Kubernetes deployments, one for each tenant/project/namespace) is
completely glossing over the fact that by abstracting away the
infrastructure (k8s' "cloud provider" concept), Kubernetes developers
simply get to ignore some of the hardest and trickiest parts of operations.

So, let's try to compare apples to apples, shall we?

It sounds like the end goal that you're advocating -- more than anything
else -- is an easy-to-install package of OpenStack services that
provides a Kubernetes-like experience for application developers.

I 100% agree with that goal. 100%.

But pulling Neutron, Cinder, Keystone, Designate, Barbican, and Octavia
back into Nova is not the way to do that. You're trying to solve a
packaging and installation problem with a code structure solution.

In fact, if you look at the Kubernetes development community, you see
the opposite direction being taken: they have broken out and are
actively breaking out large pieces of the Kubernetes repository/codebase
into separate repositories and addons/plugins. And this is being done to
accelerate development of Kubernetes in very much the same way that
splitting services out of Nova was done to accelerate the development of
those various pieces of infrastructure code.

This core functionality being combined has allowed them to land features that are really important to users but has proven difficult for OpenStack to do because of the silos. OpenStack's general pattern has been: stand up a new service for a new feature, then no one wants to depend on it, so it's ignored and each silo reimplements a lesser version of it itself.

I disagree. I believe the reason Kubernetes is able to land features
that are "really important to users" is primarily due to the following
reasons:

1) The Kubernetes technical leadership strongly resists pressure from
vendors to add yet-another-specialized-feature to the codebase. This
ability to say "No" pays off in spades with regards to stability and focus.

2) The mission of Kubernetes is much smaller than OpenStack. If the
OpenStack community were able to say "OpenStack is a container
orchestration system", and not "OpenStack is a ubiquitous open source
cloud operating system", we'd probably be able to deliver features in a
more focused fashion.

The OpenStack commons then continues to suffer.

We need to stop this destructive cycle.

OpenStack needs to figure out how to increase its commons. Both internally and externally. etcd as a common service was a step in the right direction.

I think k8s needs to be another common service all the others can rely on. That could greatly simplify the rest of the OpenStack projects as a lot of its functionality no longer has to be implemented in each project.

I don't disagree with the goal of being able to rely on Kubernetes for
many things. But relying on Kubernetes doesn't solve the "I want some
easy-to-install infrastructure" problem. Nor does it solve the types of
advanced networking scenarios that the NFV community requires.

We also need a way to break down the silo walls and allow more cross project collaboration for features. I fear the new push for letting projects run standalone will make this worse, not better, further fracturing OpenStack.

Perhaps you are referring to me with the above? As I said on Twitter,
"Make your #OpenStack project usable by and useful for things outside of
the OpenStack ecosystem. Fewer deps. Do one thing well. Solid APIs."

I don't think that the above leads to "further fracturing OpenStack". I
think it leads to solid, reusable components.

Best,
-jay

[...]


OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
responded Jun 22, 2017 by Jay_Pipes (59,760 points)   3 11 14
0 votes

tl;dr - I think Trove's successor has a future, but there are two
conflicting ideas presented and Trove should pick one or the other.

Excerpts from Amrith Kumar's message of 2017-06-18 07:35:49 -0400:

We have learned a lot from v1, and the hope is that we can address that in
v2. Some of the more significant things that I have learned are:

  • We should adopt a versioned front-end API from the very beginning; making
    the REST API versioned is not a ‘v2 feature’

+1

  • A guest agent running on a tenant instance, with connectivity to a shared
    management message bus is a security loophole; encrypting traffic,
    per-tenant-passwords, and any other scheme is merely lipstick on a security
    hole

This is a broad statement, and I'm not sure I understand the actual risk
you're presenting here as "a security loophole".

How else would you administer a database server than through some kind
of agent? Whether that agent is a python daemon of our making, sshd, or
whatever kubernetes component lets you change things, they're all
administrative pieces that sit next to the resource.

  • Reliance on Nova for compute resources is fine, but dependence on Nova VM
    specific capabilities (like instance rebuild) is not; it makes things like
    containers or bare-metal second class citizens

I wholeheartedly agree that rebuild is a poor choice for database
servers. In fact, I believe it is a completely non-scalable feature that
should not even exist in Nova.

This is kind of a "we shouldn't be this". What should we be running
database clusters on?

  • A fair portion of what Trove does is resource orchestration; don’t
    reinvent the wheel, there’s Heat for that. Admittedly, Heat wasn’t as far
    along when Trove got started but that’s not the case today and we have an
    opportunity to fix that now

Yeah. You can do that. I'm not really sure what it gets you at this
level. There was an effort a few years ago to use Heat for Trove and
some other pieces, but they fell short at the point where they had to
ask Heat for a few features like, oddly enough, rebuild confirmation
after test. Also, it increases friction to your project if your project
requires Heat in a cloud. That's a whole new service that one would have
to choose to expose or not to users and manage just for Trove. That's
a massive dependency, and it should come with something significant. I
don't see what it actually gets you when you already have to keep track
of your resources for cluster membership purposes anyway.

  • A similarly significant portion of what Trove does is to implement a
    state-machine that will perform specific workflows involved in implementing
    database specific operations. This makes the Trove taskmanager a stateful
    entity. Some of the operations could take a fair amount of time. This is a
    serious architectural flaw

A state-driven workflow is unavoidable if you're going to do cluster
manipulation. So you can defer this off to Mistral or some other
workflow engine, but I don't think it's an architectural flaw that
Trove does it. Clusters have states. They have to be tracked. Do that
well and your users will be happy.

  • Tenants should not ever be able to directly interact with the underlying
    storage and compute used by database instances; that should be the default
    configuration, not an untested deployment alternative

Agreed. There's no point in having an "inside the cloud" service if
you're just going to hand them the keys to the VMs and volumes anyway.

The point of something like Trove is to be able to retain control at the
operator level, and only give users the interface you promised,
optimized without the limitations of the cloud.

  • The CI should test all databases that are considered to be ‘supported’
    without excessive use of resources in the gate; better code modularization
    will help determine the tests which can safely be skipped in testing changes

Take the same approach as the other driver-hosting things. If it's
in-tree, it has to have a gate test.

  • Clusters should be first class citizens not an afterthought, single
    instance databases may be the ‘special case’, not the other way around

+1

  • The project must provide guest images (or at least complete tooling for
    deployers to build these); while the project can’t distribute operating
    systems and database software, the current deployment model merely impedes
    adoption

IIRC the project provides dib elements and a basic command line to build
images for your cloud, yes? Has that not worked out?

  • Clusters spanning OpenStack deployments are a real thing that must be
    supported

This is the most problematic thing you asserted. There are two basic
desires I see that drive a Trove adoption:

1) I need database clusters and I don't know how to do them right.
2) I need high performance/availability/capacity databases and my
cloud's standard VM flavors/hosts/networks/disks/etc. stand in the way
of that.

For the OpenStack-spanning cluster thing, (1) is fine. But (1) can and
probably should be handled by things like Helm, Juju, Ansible, Habitat,
Docker Compose, etc.

(2) is much more likely to draw people into an official "inside the cloud"
Trove deployment. Let the operators install Ironic, wire up some baremetal
with huge disks or powerful RAID controllers or an infiniband mesh,
and build their own images with tuned kernels and tightly controlled
builds of MySQL/MariaDB/Postgres/MongoDB/etc.

Don't let the users know anything about the computers their database
cluster runs on. They get cluster access details, and knobs that
are workload specific. But not all the knobs, just the knobs that an
operator can't possibly know. And in return you give them highly capable
databases.

But (2) is directly counter to (1). I would say pick one, and focus on
that for Trove. To me, (2) is the more interesting story. (1) is a place
to let 1000 flowers bloom (in many cases they already have, and just
need porting from AWS/GCE/Azure/DigitalOcean to OpenStack). If you want
to run cross cloud, you are accepting the limitations of multi-cloud,
and should likely be building cloud-native apps that don't rely on a
beefy database cluster.


OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
responded Jun 22, 2017 by Clint_Byrum (40,940 points)   4 5 9
0 votes

-----Original Message-----
From: Zane Bitter [mailto:zbitter@redhat.com]
Sent: June-20-17 4:57 PM
To: openstack-dev@lists.openstack.org
Subject: Re: [openstack-dev] [trove][all][tc] A proposal to rearchitect
Trove

On 20/06/17 11:45, Jay Pipes wrote:

Good discussion, Zane. Comments inline.

++

On 06/20/2017 11:01 AM, Zane Bitter wrote:

On 20/06/17 10:08, Jay Pipes wrote:

On 06/20/2017 09:42 AM, Doug Hellmann wrote:

Does "service VM" need to be a first-class thing? Akanda creates
them, using a service user. The VMs are tied to a "router" which
is
the billable resource that the user understands and interacts with
through the API.

Frankly, I believe all of these types of services should be built
as
applications that run on OpenStack (or other) infrastructure. In
other words, they should not be part of the infrastructure itself.

There's really no need for a user of a DBaaS to have access to the
host or hosts the DB is running on. If the user really wanted that,
they would just spin up a VM/baremetal server and install the thing
themselves.

Hey Jay,
I'd be interested in exploring this idea with you, because I think
everyone agrees that this would be a good goal, but at least in my
mind it's not obvious what the technical solution should be.
(Actually, I've read your email a bunch of times now, and I go back
and forth on which one you're actually advocating for.) The two
options, as I see it, are as follows:

1) The database VMs are created in the user's tena^W project. They
connect directly to the tenant's networks, are governed by the
user's
quota, and are billed to the project as Nova VMs (on top of whatever
additional billing might come along with the management services). A
[future] feature in Nova (https://review.openstack.org/#/c/438134/)
allows the Trove service to lock down access so that the user cannot
actually interact with the server using Nova, but must go through
the
Trove API. On a cloud that doesn't include Trove, a user could run
Trove as an application themselves and all it would have to do
differently is not pass the service token to lock down the VM.

alternatively:

2) The database VMs are created in a project belonging to the
operator of the service. They're connected to the user's network
through some mechanism, and isolated from other users' databases running in
the same project through some other mechanism.
Trove has its own quota management and billing. The user cannot
interact with the server using Nova since it is owned by a different
project. On a cloud that doesn't include Trove, a user could run
Trove as an application themselves, by giving it credentials for
their own project and disabling all of the cross-tenant networking
stuff.

None of the above :)

Don't think about VMs at all. Or networking plumbing. Or volume
storage or any of that.

OK, but somebody has to ;)

Think only in terms of what a user of a DBaaS really wants. At the
end
of the day, all they want is an address in the cloud where they can
point their application to write and read data from.

Do they want that data connection to be fast and reliable? Of course,
but how that happens is irrelevant to them.

Do they want that data to be safe and backed up? Of course, but how
that happens is irrelevant to them.

Fair enough. The world has changed a lot since RDS (which was the model
for Trove) was designed, it's certainly worth reviewing the base
assumptions before embarking on a new design.

The problem with many of these high-level *aaS projects is that they
consider their user to be a typical tenant of general cloud
infrastructure -- focused on launching VMs and creating volumes and
networks etc. And the discussions around the implementation of these
projects always comes back to minutia about how to set up secure
communication channels between a control plane message bus and the
service VMs.

Incidentally, the reason that discussions always come back to that is
because OpenStack isn't very good at it, which is a huge problem not
only for the *aaS projects but for user applications in general running
on OpenStack.

If we had fine-grained authorisation and ubiquitous multi-tenant
asynchronous messaging in OpenStack then I firmly believe that we, and
application developers, would be in much better shape.

If you create these projects as applications that run on cloud
infrastructure (OpenStack, k8s or otherwise),

I'm convinced there's an interesting idea here, but the terminology
you're using doesn't really capture it. When you say 'as applications
that run on cloud infrastructure', it sounds like you mean they should
run in a Nova VM, or in a Kubernetes cluster somewhere, rather than on
the OpenStack control plane. I don't think that's what you mean though,
because you can (and IIUC Rackspace does) deploy OpenStack services
that way already, and it has no real effect on the architecture of
those services.

then the discussions focus
instead on how the real end-users -- the ones that actually call the
APIs and utilize the service -- would interact with the APIs and not
the underlying infrastructure itself.

Here's an example to think about...

What if a provider of this DBaaS service wanted to jam 100 database
instances on a single VM and provide connectivity to those database
instances to 100 different tenants?

Would those tenants know if those databases were all serviced from a
single database server process running on the VM?

You bet they would when one (or all) of the other 99 decided to run a
really expensive query at an inopportune moment :)

Or 100 containers each
running a separate database server process? Or 10 containers running
10 database server processes each?

No, of course not. And the tenant wouldn't care at all, because the

Well, if they had any kind of regulatory (or even performance)
requirements then the tenant might care really quite a lot. But I take
your point that many might not and it would be good to be able to offer
them lower cost options.

point of the DBaaS service is to get a database. It isn't to get one
or more VMs/containers/baremetal servers.

I'm not sure I entirely agree here. There are two kinds of DBaaS. One
is a data API: a multitenant database a la DynamoDB. Those are very
cool, and I'm excited about the potential to reduce the granularity of
billing to a minimum, in much the same way Swift does for storage, and
I'm sad that OpenStack's attempt in this space (MagnetoDB) didn't work
out. But Trove is not that.

People use Trove because they want to use a particular database, but
still have all the upgrades, backups, &c. handled for them. Given that
the choice of database is explicitly not abstracted away from them,
things like how many different VMs/containers/baremetal servers the
database is running on are very much relevant IMHO, because what you
want depends on both the database and how you're trying to use it. And
because (afaik) none of them have native multitenancy, it's necessary
that no tenant should have to share with any other.

Essentially Trove operates at a moderate level of abstraction -
somewhere between managing the database + the infrastructure it runs on
yourself and just an API endpoint you poke data into. It also operates
at the coarse end of a granularity spectrum running from
VMs->Containers->pay as you go.

It's reasonable to want to move closer to the middle of the granularity
spectrum. But you can't go all the way to the high abstraction/fine
grained ends of the spectra (which turn out to be equivalent) without
becoming something qualitatively different.

At the end of the day, I think Trove is best implemented as a hosted
application that exposes an API to its users that is entirely
separate
from the underlying infrastructure APIs like Cinder/Nova/Neutron.

This is similar to Kevin's k8s Operator idea, which I support but in
a
generic fashion that isn't specific to k8s.

In the same way that k8s abstracts the underlying infrastructure (via
its "cloud provider" concept), I think that Trove and similar
projects
need to use a similar abstraction and focus on providing a different
API to their users that doesn't leak the underlying infrastructure
API
concepts out.

OK, so trying to summarise (stop me if I'm getting it wrong):
essentially you support option (2) because it is a closed abstraction.
Trove has its own quota management, billing, &c. and the user can't see
the VM, so the operator is free to substitute a different backend that
allocates compute capacity in finer-grained increments than Nova does.

Interestingly, that's only an issue because there is no finer-grained
compute resource than a VM available through the OpenStack API. If
there were an OpenStack API (or even just a Keystone-authenticated API)
to a shared, multitenant container orchestration cluster, this wouldn't
be an issue. But apart from OpenShift, I can't think of any cloud

[Hongbin Lu] I just wanted to clarify that there is such an OpenStack API: Zun. Zun's API is container-centric and would give you a finer-grained compute resource than a VM, namely a container. Zun is Keystone-authenticated and multi-tenant, and it can bundle with Heat [1] (or Senlin in the future) to provide equivalent container orchestration functionality.

[1] https://review.openstack.org/#/c/437810/

service that's doing that - AWS, Google, OpenStack are all using the
model where the COE cluster is deployed on VMs that are owned by a
particular tenant. Of all the things you could run in containers on
shared servers, databases have arguably the most to lose (performance,
security) and the least to gain (since they're by definition stateful).
So my question is:
if this is such a good idea for databases, why isn't anybody doing it
for everything container-based? i.e. instead of Magnum/Zun should we
just be working on a Keystone auth gateway for OpenShift (a.k.a. the
one thing that everyone had hitherto agreed was definitely out of
scope :D )?

Until then it seems to me that the tradeoff is between decoupling it
from the particular cloud it's running on so that users can optionally
deploy it standalone (essentially Vish's proposed solution for the *aaS
services from many moons ago) vs. decoupling it from OpenStack in
general so that the operator has more flexibility in how to deploy.

I'd love to be able to cover both - from a user using it standalone to
spin up and manage a DB in containers on a shared PaaS, through to a
user accessing it as a service to provide a DB running on a dedicated
VM or bare metal server, and everything in between. I don't know if
such a thing is feasible. I suspect we're going to have to talk a lot
about VMs and network plumbing and volume storage :)

cheers,
Zane.

Best,
-jay

Of course the current situation, as Amrith alluded to, where the
default is option (1) except without the lock-down feature in Nova,
though some operators are deploying option (2) but it's not tested
upstream... clearly that's the worst of all possible worlds, and
AIUI
nobody disagrees with that.

To my mind, (1) sounds more like "applications that run on OpenStack
(or other) infrastructure", since it doesn't require stuff like the
admin-only cross-project networking that makes it effectively "part
of the infrastructure itself" - as evidenced by the fact that
unprivileged users can run it standalone with little more than a
simple auth middleware change. But I suspect you are going to use
similar logic to argue for (2)? I'd be interested to hear your
thoughts.

cheers,
Zane.



OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
responded Jun 22, 2017 by hongbin.lu_at_huawei (11,620 points)   2 3 4
0 votes

No, I'm not necessarily advocating a monolithic approach.

I'm saying that they have decided to start with functionality and accept what's needed to get the task done. There aren't really such strong walls between the various pieces of functionality (RBAC/secrets/kubelet/etc.). They don't spawn off a whole new project just to add functionality; they do so only when needed. They also don't balk at one feature depending on another.

RBAC is important, so they implemented it. SSL cert management was important, so they added that. Adding a feature that restricts secret downloads to only the physical nodes that need them could then reuse the RBAC system and SSL cert management.

Their SIGs are oriented more toward features/functionality (or categories thereof), not so much toward specific components: we need to do X; X may involve changes to components A and B.

OpenStack now tends to start with A and B, and we try to work backwards towards implementing X, which is hard due to the strong walls and unclear ownership of the feature. And the general solution has been to try to make C but not commit to C being in the core, so users can't depend on it, which hasn't proven to be a very successful pattern.

You're right, they are breaking up their code base as needed, like Nova did. I'm coming around to that being a pretty good approach to some things. Starting things is simpler, and if something ends up not needing its own whole project, then it doesn't get one; if it needs one, then it gets one. It's not, by default, "start a whole new project with a DB user, DB schema, API, scheduler, etc." And the project might not end up with daemons split up in exactly the way you would expect if you prematurely broke off a project without knowing exactly how it might integrate with everything else.

Maybe the porcelain API that's been discussed for a while is part of the solution: initial stuff can be prototyped/started there and broken off as needed into separate projects and moved around without the user needing to know where it ends up.

You're right that OpenStack's scope is much greater, and I think that the commons are even more important in that case. If it doesn't have a solid base, every project has to re-implement its own base. That takes a huge amount of manpower all around. It's not sustainable.

I guess we've gotten pretty far away from discussing Trove at this point.

Thanks,
Kevin


From: Jay Pipes [jaypipes@gmail.com]
Sent: Thursday, June 22, 2017 10:05 AM
To: openstack-dev@lists.openstack.org
Subject: Re: [openstack-dev] [trove][all][tc] A proposal to rearchitect Trove

[...]


OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
responded Jun 22, 2017 by Fox,_Kevin_M (29,360 points)   1 3 4
0 votes

(Top posting. Deal with it ;)

You're both right!

Making OpenStack monolithic is not the answer. In fact, rearranging Git
repos has nothing to do with the answer.

But back in the day we had a process (incubation) for adding stuff to
OpenStack that it made sense to depend on being there. It was a highly
imperfect process. We got rid of that process with the big tent reform,
but didn't really replace it with anything at all. Tags never evolved
into a replacement as I hoped they would.

So now we have a bunch of things that are integral to building a
"Kubernetes-like experience for application developers" - secret
storage, DNS, load balancing, asynchronous messaging - that exist but
are not in most clouds. (Not to mention others like fine-grained
authorisation control that are completely MIA.)

Instead of trying to drive adoption of all of that stuff, we are either
just giving up or reinventing bits of it, badly, in multiple places. The
biggest enemy of "do one thing and do it well" is when a thing that you
need to do was chosen by a project in another silo as their "one thing",
but you don't want to just depend on that project because it's not
widely adopted.

I'm not saying this is an easy problem. It's something that the
proprietary public cloud providers don't face: if you have only one
cloud then you can just design everything to be as tightly integrated as
it needs to be. When you have multiple clouds and the components are
optional you have to do a bit more work. But if those components are
rarely used at all then you lose the feedback loop that helps create a
single polished implementation and everything else has to choose between
not integrating, or implementing just the bits it needs itself so that
whatever smaller feedback loop does manage to form, the benefits are
contained entirely within the silo. OpenStack is arguably the only cloud
project that has to deal with this. (Azure is also going into the same
market, but they already have the feedback loop set up because they run
their own public cloud built from the components.) Figuring out how to
empower the community to solve this problem is our #1 governance concern
IMHO.

In my view, one of the keys is to stop thinking of OpenStack as an
abstraction layer over a bunch of vendor technologies. If you think of
Nova as an abstraction layer over libvirt/Xen/HyperV, and Keystone as an
abstraction layer over LDAP/ActiveDirectory, and Cinder/Neutron as an
abstraction layer over a bunch of storage/network vendors, then two
things will happen. The first is unrelenting "pressure from vendors to
add yet-another-specialized-feature to the codebase" that you won't be
able to push back against because you can't point to a competing vision.
And the second is that you will never build an integrated,
application-centric cloud, because the integration bit needs to happen
at the layer above the backends we are abstracting.

We need to think of those things as the compute, authn, block storage
and networking components of an integrated, application-centric cloud.
And to remember that by no means are those the only components it will
need - "The mission of Kubernetes is much smaller than OpenStack";
there's a lot we need to do.

So no, the strength of k8s isn't in having a monolithic git repo (and I
don't think that's what Kevin was suggesting). That's actually a
slow-motion train-wreck waiting to happen. Its strength is being able to
do all of this stuff and still be easy enough to install, so that
there's no question of trying to build bits of it without relying on
shared primitives.

cheers,
Zane.

On 22/06/17 13:05, Jay Pipes wrote:
On 06/22/2017 11:59 AM, Fox, Kevin M wrote:

My $0.02.

That view of dependencies is why Kubernetes development is outpacing
OpenStack's and some users are leaving, IMO. Not trying to be mean here,
but trying to shine some light on this issue.

Kubernetes at its core has essentially something kind of equivalent to
keystone (k8s rbac), nova (container mgmt), cinder
(pv/pvc/storageclasses), heat with convergence
(deployments/daemonsets/etc), barbican (secrets), designate
(kube-dns), and octavia (kube-proxy,svc,ingress) in one unit. Ops don't
have to work hard to get all of it, users can assume it's all there,
and devs don't have many silos to cross to implement features that
touch multiple pieces.

I think it's kind of hysterical that you're advocating a monolithic
approach when the thing you're advocating (k8s) is all about enabling
non-monolithic microservices architectures.

Look, the fact of the matter is that OpenStack's mission is larger than
that of Kubernetes. And to say that "Ops don't have to work hard" to get
and maintain a Kubernetes deployment (which, frankly, tends to be dozens
of Kubernetes deployments, one for each tenant/project/namespace) is
completely glossing over the fact that by abstracting away the
infrastructure (k8s' "cloud provider" concept), Kubernetes developers
simply get to ignore some of the hardest and trickiest parts of operations.

So, let's try to compare apples to apples, shall we?

It sounds like the end goal that you're advocating -- more than anything
else -- is an easy-to-install package of OpenStack services that
provides a Kubernetes-like experience for application developers.

I 100% agree with that goal. 100%.

But pulling Neutron, Cinder, Keystone, Designate, Barbican, and Octavia
back into Nova is not the way to do that. You're trying to solve a
packaging and installation problem with a code structure solution.

In fact, if you look at the Kubernetes development community, you see
the opposite direction being taken: they have broken out and are
actively breaking out large pieces of the Kubernetes repository/codebase
into separate repositories and addons/plugins. And this is being done to
accelerate development of Kubernetes in very much the same way that
splitting services out of Nova was done to accelerate the development of
those various pieces of infrastructure code.

Having this core functionality combined has allowed them to land
features that are really important to users, something that has proven
difficult for OpenStack to do because of the silos. OpenStack's general
pattern has been: stand up a new service for a new feature, then no one
wants to depend on it, so it's ignored and each silo reimplements a
lesser version of it themselves.

I disagree. I believe Kubernetes is able to land features that are
"really important to users" primarily for the following reasons:

1) The Kubernetes technical leadership strongly resists pressure from
vendors to add yet-another-specialized-feature to the codebase. This
ability to say "No" pays off in spades with regards to stability and focus.

2) The mission of Kubernetes is much smaller than OpenStack. If the
OpenStack community were able to say "OpenStack is a container
orchestration system", and not "OpenStack is a ubiquitous open source
cloud operating system", we'd probably be able to deliver features in a
more focused fashion.

The OpenStack commons then continues to suffer.

We need to stop this destructive cycle.

OpenStack needs to figure out how to increase its commons. Both
internally and externally. etcd as a common service was a step in the
right direction.
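
For illustration, here is roughly what "etcd as a common service" looks
like from a consuming project, assuming the community etcd3 Python
client and an etcd endpoint on localhost; the key names are made up:

    import etcd3

    # Talk to the shared etcd the deployment already runs as a base
    # service, instead of each project shipping its own backend.
    etcd = etcd3.client(host="127.0.0.1", port=2379)

    # Shared key/value state...
    etcd.put("/trove/instances/db-1/status", "BUILDING")
    value, _meta = etcd.get("/trove/instances/db-1/status")
    print(value.decode())  # -> BUILDING

    # ...and distributed locking, something each project otherwise
    # reinvents (or avoids) on its own.
    with etcd.lock("trove-db-1-resize", ttl=30):
        pass  # do the thing that must not run concurrently

(In practice an OpenStack project would more likely go through tooz,
which has an etcd3 driver, but the point is the same: the primitive is
shared rather than re-implemented per project.)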

I think k8s needs to be another common service all the others can rely
on. That could greatly simplify the rest of the OpenStack projects, as
a lot of that functionality would no longer have to be implemented in
each project.

I don't disagree with the goal of being able to rely on Kubernetes for
many things. But relying on Kubernetes doesn't solve the "I want some
easy-to-install infrastructure" problem. Nor does it solve the types of
advanced networking scenarios that the NFV community requires.

We also need a way to break down the silo walls and allow more cross
project collaboration for features. I fear the new push for letting
projects run standalone will make this worse, not better, further
fracturing OpenStack.

Perhaps you are referring to me with the above? As I said on Twitter,
"Make your #OpenStack project usable by and useful for things outside of
the OpenStack ecosystem. Fewer deps. Do one thing well. Solid APIs."

I don't think that the above leads to "further fracturing OpenStack". I
think it leads to solid, reusable components.

Best,
-jay

Thanks,
Kevin


From: Thierry Carrez [thierry@openstack.org]
Sent: Thursday, June 22, 2017 12:58 AM
To: openstack-dev@lists.openstack.org
Subject: Re: [openstack-dev] [trove][all][tc] A proposal to
rearchitect Trove

Fox, Kevin M wrote:

[...]
If you build a Tessmaster clone just to do mariadb, then you share
nothing with the other communities and have to reinvent the wheel,
yet again. Operators' load increases because the tool doesn't function
like other tools.

If you rely on a container orchestration engine that's already
cross-cloud and can be easily deployed by a user or cloud operator,
and fill in the gaps with what Trove wants to support (easy management
of DBs), you get to reuse a lot of the commons, and the user's slight
increase in investment in dealing with the bit of extra plumbing
allows other things to also be easily added to their cluster. It's
very rare that a user would need to deploy/manage only a database.
The net load on the operator decreases, not increases.

I think the user-side tool could totally deploy on Kubernetes clusters
-- if that were the only possible target it would make it a Kubernetes
tool more than an open infrastructure tool, but that's definitely a
possibility. I'm not sure work is needed there, though; aren't there
already tools (or charts) doing that?

For a server-side approach where you want to provide a DB-provisioning
API, I fear that making the functionality depend on K8s would mean
TroveV2/Hoard would not only depend on Heat and Nova, but also on
something that would deploy a Kubernetes cluster (Magnum?), which would
likely hurt its adoption (and reusability in simpler setups). Since
databases work perfectly well in VMs, it feels like a gratuitous
dependency addition?

We generally need to be very careful about creating dependencies between
OpenStack projects. On one side there are base services (like Keystone)
that we said it was alright to depend on, but depending on anything else
is likely to reduce adoption. Magnum adoption suffers from its
dependency on Heat. If Heat starts depending on Zaqar, we make the
problem worse. I understand it's a hard trade-off: you want to reuse
functionality rather than reinvent it in every project... we just need
to recognize the cost of doing that.

--
Thierry Carrez (ttx)


OpenStack Development Mailing List (not for usage questions)
Unsubscribe:
OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


responded Jun 22, 2017 by Zane_Bitter (21,640 points)   4 6 9
0 votes

Zane Bitter wrote:
But back in the day we had a process (incubation) for adding stuff to
OpenStack that it made sense to depend on being there. It was a highly
imperfect process. We got rid of that process with the big tent reform,
but didn't really replace it with anything at all. Tags never evolved
into a replacement as I hoped they would.

So now we have a bunch of things that are integral to building a
"Kubernetes-like experience for application developers" - secret
storage, DNS, load balancing, asynchronous messaging - that exist but
are not in most clouds.

Yet another tangent in that thread, but you seem to regret a past that
never happened. The "integrated release" was never about stuff that you
can "depend on being there". It was about things that were tested to
work well together, and released together. Projects were incubating
until they were deemed mature-enough (and embedded-enough in our
community) that it was fine for other projects to take the hit to be
tested with them, and take the risk of being released together. I don't
blame you for thinking otherwise: since the integrated release was the
only answer we gave, everyone assumed it answered their specific
question[1]. And that was why we needed to get rid of it.

If it was really about stuff you can "depend on being there" then most
OpenStack clouds would have had Swift, Ceilometer, Trove and Sahara.

Stuff you can "depend on being there" is a relatively-new concept:
https://governance.openstack.org/tc/reference/base-services.html

Yes, we can (and should) add more of those when they are relevant to
most OpenStack deployments, otherwise projects will never start
depending on Barbican and continue to NIH secrets management locally.
But since any addition comes with a high operational cost, we need to
consider them very carefully.

We should also consider use cases and group projects together (a concept
we are starting to call "constellations"). Yes, it would be great if, when
you have an IaaS/Compute use case, you could assume Designate is part of
the mix.

[1] https://ttx.re/facets-of-the-integrated-release.html

--
Thierry Carrez (ttx)


OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
responded Jun 23, 2017 by Thierry_Carrez (57,480 points)   3 8 13
0 votes

On 23/06/17 05:31, Thierry Carrez wrote:
Zane Bitter wrote:

But back in the day we had a process (incubation) for adding stuff to
OpenStack that it made sense to depend on being there. It was a highly
imperfect process. We got rid of that process with the big tent reform,
but didn't really replace it with anything at all. Tags never evolved
into a replacement as I hoped they would.

So now we have a bunch of things that are integral to building a
"Kubernetes-like experience for application developers" - secret
storage, DNS, load balancing, asynchronous messaging - that exist but
are not in most clouds.

Yet another tangent in that thread, but you seem to regret a past that
never happened.

It kind of did. The TC used to require that new projects graduating into
OpenStack didn't reimplement anything that an existing project in the
integrated release already did; e.g. Sahara and Trove were required to
use Heat for orchestration rather than rolling their own orchestration.
The very strong implication was that once something was officially
included in OpenStack you didn't develop the same thing again. It's true
that nothing was ever enforced against existing projects (the only
review was at incubation/graduation), but then again I can't think of a
situation where it would have come up at that time.

The "integrated release" was never about stuff that you
can "depend on being there". It was about things that were tested to
work well together, and released together. Projects were incubating
until they were deemed mature-enough (and embedded-enough in our
community) that it was fine for other projects to take the hit to be
tested with them, and take the risk of being released together. I don't
blame you for thinking otherwise: since the integrated release was the
only answer we gave, everyone assumed it answered their specific
question[1]. And that was why we needed to get rid of it.

I agree and I supported getting rid of it. But not all of the roles it
fulfilled (intended or otherwise) were replaced with anything. One of
the things that fell by the wayside was the sense some of us had that we
were building an integrated product with flexible deployment options,
rather than a series of disconnected islands.

If it was really about stuff you can "depend on being there" then most
OpenStack clouds would have had Swift, Ceilometer, Trove and Sahara.

Stuff you can "depend on being there" is a relatively-new concept:
https://governance.openstack.org/tc/reference/base-services.html

Yes, we can (and should) add more of those when they are relevant to
most OpenStack deployments, otherwise projects will never start
depending on Barbican and continue to NIH secrets management locally.
But since any addition comes with a high operational cost, we need to
consider them very carefully.

+1

We should also consider use cases and group projects together (a concept
we are starting to call "constellations"). Yes, it would be great if, when
you have an IaaS/Compute use case, you could assume Designate is part of
the mix.

+1

[1] https://ttx.re/facets-of-the-integrated-release.html


OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
responded Jun 23, 2017 by Zane_Bitter (21,640 points)   4 6 9
...