
[openstack-dev] [tc] Active or passive role with our database layer

0 votes

Hi all!

As the discussion around PostgreSQL has progressed, it has become clear to
me that there is a decently deep philosophical question on which we do
not currently share either definition or agreement. I believe that the
lack of clarity on this point is one of the things that makes the
PostgreSQL conversation difficult.

I believe the question is between these two things:

  • Should OpenStack assume the existence of an external database service
    that it treats as a black box on the other side of a connection string?

  • Should OpenStack take an active and/or opinionated role in managing
    the database service?

A potentially obvious question about that (asked by Mike Bayer in a
different thread) is: "what do you mean by managing?"

What I mean by managing is doing all of the things you can do related to
database operational controls short of installing the software, writing
the basic db config files to disk and stopping and starting the
services. It means being much more prescriptive about what types of
config we support, validating config settings that cannot be overridden
at runtime and refusing to operate if they are unworkable.

Why would we want to be 'more active'? When managing and tuning
databases, there are some things that are driven by the environment and
some things that are driven by the application.

Things that are driven by the environment include things like the amount
of RAM actually available, whether or not the machines running the
database are dedicated or shared, firewall settings, selinux settings
and what versions of software are available.

Things that are driven by the application are things like character set
and collation, schema design, data types, schema upgrade and HA strategies.

One might argue that HA strategies are an operator concern, but in
reality the set of workable HA strategies is tightly constrained by how
the application works, and pairing an application expecting one HA
strategy with a deployment implementing a different one can have
negative results ranging from unexpected downtime to data corruption.

For example: An HA strategy using slave promotion and a VIP that points
at the current write master paired with an application incorrectly
configured to do such a thing can lead to writes to the wrong host after
a failover event and an application that seems to be running fine until
the data turns up weird after a while.

For the areas in which the characteristics of the database are tied
closely to the application behavior, there is a constrained set of valid
choices at the database level. Sometimes that constrained set only has
one member.

The approach to those is what I'm talking about when I ask the question
about "external" or "active".

In the "external" approach, we document the expectations and then write
the code assuming that the database is set up appropriately. We may
provide some helper tools, such as 'nova-manage db sync' and
documentation on the sequence of steps the operator should take.

In the "active" approach, we still document expectations, but we also
validate them. If they are not what we expect but can be changed at
runtime, we change them overriding conflicting environmental config, and
if we can't, we hard-stop indicating an unsuitable environment. Rather
than providing helper tools, we perform the steps needed ourselves, in
the order they need to be performed, ensuring that they are done in the
manner in which they need to be done.
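
To make the distinction concrete, here is a rough sketch (purely
illustrative, not code from any project; the required settings and the
connection URL are invented) of what "validate, fix what we can at
runtime, hard-stop on what we can't" could look like with SQLAlchemy
against MySQL:

    import sqlalchemy as sa

    def validate_database_environment(engine):
        # Illustrative policy only: require utf8mb4 connections and refuse
        # to run against a server using case-folded table names.
        with engine.connect() as conn:
            charset = conn.execute(
                sa.text("SELECT @@session.character_set_connection")).scalar()
            if charset != 'utf8mb4':
                # Changeable at runtime, so the "active" approach fixes it.
                conn.execute(sa.text("SET NAMES utf8mb4"))

            lctn = conn.execute(
                sa.text("SELECT @@global.lower_case_table_names")).scalar()
            if int(lctn) != 0:
                # Not changeable without a server restart: hard-stop rather
                # than run in an environment we consider unworkable.
                raise RuntimeError(
                    "lower_case_table_names=%s is not supported; fix my.cnf "
                    "and restart the database" % lctn)

    # engine = sa.create_engine("mysql+pymysql://user:pass@dbhost/nova")
    # validate_database_environment(engine)

A real implementation would hook connection creation so every pooled
connection is covered, but the shape is the same.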

Some examples:

  • Character Sets / Collations

We currently enforce at testing time that all database migrations are
explicit about InnoDB. We also validate in oslo.db that table character
sets have the string 'utf8' in them (only on MySQL). We do not have any
check for case-sensitive or case-insensitive collations (these affect
sorting and comparison operations). Because we don't, different server
config settings or different database backends for different clouds can
actually behave differently through the REST API.
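
For illustration, a check for that gap could look roughly like the
following on MySQL (a sketch only, not something oslo.db does today;
the schema name would be whatever database the service uses):

    import sqlalchemy as sa

    def find_case_insensitive_tables(engine, schema):
        query = sa.text("""
            SELECT table_name, table_collation
            FROM information_schema.tables
            WHERE table_schema = :schema
        """)
        with engine.connect() as conn:
            rows = conn.execute(query, {"schema": schema}).fetchall()
        # MySQL collation names ending in '_ci' are case-insensitive;
        # '_cs' and '_bin' collations are not. Views have a NULL collation.
        return [(name, collation) for name, collation in rows
                if collation is not None and collation.endswith('_ci')]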

To deal with that:

First we'd have to decide whether case sensitive or case insensitive was
what we wanted. If we decided we wanted case sensitive, we could add an
enforcement of that in oslo.db, and write migrations to get from case
insensitive indexes to case sensitive indexes on tables where we
detected that a case insensitive collation had been used. If we decided
we wanted to stick with case insensitive we could similarly add code to
enforce it on MySQL. To enforce it actively on PostgreSQL, we'd need to
either switch our code that's using comparisons to use the sqlalchemy
case-insensitive versions explicitly, or maybe write some sort of
overloaded driver for PG that turns all comparisons into
case-insensitive, which would wrap both sides of comparisons in lower()
calls (which has some indexing concerns, but let's ignore that for the
moment). We could also take the 'external' approach and just document it,
then define API tests and try to tie the insensitive behavior in the API
to Interop Compliance. I'm not 100% sure how a db operator would
remediate this - but PG has some fancy computed index features - so
maybe it would be possible.
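
As a rough illustration of that lower()-wrapping idea (a sketch only;
the table and column are invented, and the indexing concerns are being
ignored as noted):

    import sqlalchemy as sa

    metadata = sa.MetaData()
    instances = sa.Table(
        'instances', metadata,
        sa.Column('id', sa.Integer, primary_key=True),
        sa.Column('hostname', sa.String(255)),
    )

    def lookup_by_hostname_ci(conn, hostname):
        # Wrap both sides of the comparison in lower() so the result is
        # case-insensitive regardless of the backend's collation.
        query = instances.select().where(
            sa.func.lower(instances.c.hostname) == sa.func.lower(hostname))
        return conn.execute(query).fetchall()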

A similar issue lurks with the fact that MySQL unicode storage is 3-byte
by default and 4-byte is opt-in. We could take the 'external' approach
and document it and assume the operator has configured their my.cnf with
the appropriate default, or take an 'active' approach where we override
it in all the models and make migrations to get us from 3 to 4 byte.
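
As a sketch of the 'active' end of that (an Alembic-style migration
purely for illustration; the table name is invented, and a real
migration would also have to watch InnoDB index key-length limits once
columns become 4-byte):

    from alembic import op

    def upgrade():
        bind = op.get_bind()
        if bind.engine.name != 'mysql':
            # Only meaningful on MySQL/MariaDB backends.
            return
        op.execute(
            "ALTER TABLE instances CONVERT TO CHARACTER SET utf8mb4 "
            "COLLATE utf8mb4_general_ci")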

  • Schema Upgrades

The way you roll out online schema changes is highly dependent on your
database architecture.

Just limiting to the MySQL world:

If you do Galera, you can roll them out in Total Order or Rolling
fashion. Total Order locks basically everything while it's happening, so
isn't a candidate for "online". In rolling you apply the schema change
to one node at a time. If you do that, the application has to be able to
deal with both forms of the table, and you have to deal with ensuring
that data can replicate appropriately while the schema change is happening.
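
As a very rough sketch of the rolling variant (hosts and credentials
are placeholders; the relevant Galera knob is wsrep_OSU_method, where
TOI is Total Order Isolation and RSU applies the change only to the
local, temporarily desynced node):

    import pymysql

    def apply_ddl_rolling(nodes, ddl):
        # Apply the DDL to one node at a time in Rolling Schema Upgrade
        # mode. The application must tolerate both the old and new table
        # definitions for the duration of the rollout.
        for host in nodes:
            conn = pymysql.connect(host=host, user='root', password='secret')
            try:
                with conn.cursor() as cur:
                    cur.execute("SET SESSION wsrep_OSU_method='RSU'")
                    cur.execute(ddl)
                    cur.execute("SET SESSION wsrep_OSU_method='TOI'")
                conn.commit()
            finally:
                conn.close()

    # apply_ddl_rolling(['db1', 'db2', 'db3'],
    #                   "ALTER TABLE instances ADD COLUMN display_name VARCHAR(255)")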

If you do DRBD active/passive or a single-node deployment you only have
one upgrade operation to perform, but you will only lock certain things
- depending on what schema change operations you were performing.

If you do master/slave, you can roll out the schema change to your
slaves one at a time, wait for them all to catch up, then promote a
slave taking the current master out of commission - update the old
master and then put it into the slave pool. Like Galera rolling, the
app needs to be able to handle old and new versions and the replication
stream needs to be able to replicate between the versions.

Making sure that the stream is able to replicate puts a set of
limitations on the types of schema changes you can perform, but it is an
understandable constrained set.

In either approach the OpenStack service has to be able to talk to both
old and new versions of the schema. And in either approach we need to
make sure to limit the schema change operations to the set that can be
accomplished in an online fashion. We also have to be careful to not
start writing values to new columns until all of the nodes have been
updated, because the replication stream can't replicate the new column
value to nodes that don't have the new column.
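
As a tiny illustration of that last point (the flag and the column are
invented purely for illustration):

    # Flipped by a deployer step (external approach) or by the upgrade
    # driver itself (active approach) once every node has the new column.
    SCHEMA_UPGRADE_COMPLETE = False

    def build_update_values(hostname, display_name=None):
        values = {'hostname': hostname}
        if SCHEMA_UPGRADE_COMPLETE and display_name is not None:
            # Only start writing the new column once replication can carry
            # it to every node; older nodes cannot accept it.
            values['display_name'] = display_name
        return values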

In either approach we can decide to limit the number of architectures we
support for "online" upgrades.

In an 'external' approach, we make sure to do those things, we write
documentation and we assume the database will be updated appropriately.
We can document that if the deployer chooses to do Total Order on
Galera, they will not have online upgrades. There will also have to be a
deployer step to let the services know that they can start writing
values to the new schema format once the upgrade is complete.

In an 'active' approach, we can notice that we have an update available
to run, and we can drive it from code. We can check for Galera, and if
it's there we can run the upgrade in Rolling fashion one node at a time
with no work needed on the part of the deployer. Since we're driving the
upgrade, we know when it's done, so we can signal ourselves to start
using the new version. We'd obviously have to pick the set of acceptable
architectures we can handle consistently orchestrating.

  • Versions

It's worth noting that behavior for schema updates and other things
change over time with backend database version. We set minimum versions
of other things, like libvirt and OVS - so we might also want to set
minimum versions for what we can support in the database. That way we
can know for a given release of OpenStack what DDL operations are safe
to use for a rolling upgrade and what are not. That means detecting such
a version and potentially refusing to perform an upgrade if the version
isn't acceptable. That reduces the operator's ability to choose what
version of the database software to run, but increases our ability to be
able to provide tooling and operations that we can be confident will work.
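
As a tiny sketch of what that detection could look like (the minimum
version here is invented; the point is only the refuse-to-upgrade
behavior):

    import sqlalchemy as sa

    MIN_MYSQL_VERSION = (5, 6, 0)  # invented number, purely illustrative

    def check_minimum_db_version(engine):
        if engine.name != 'mysql':
            return
        with engine.connect() as conn:
            version_str = conn.execute(sa.text("SELECT VERSION()")).scalar()
        # Strip suffixes like '-log' or '-MariaDB' before parsing.
        numeric = version_str.split('-')[0]
        version = tuple(int(part) for part in numeric.split('.')[:3])
        if version < MIN_MYSQL_VERSION:
            raise RuntimeError(
                "Database version %s is below the minimum required for a "
                "rolling upgrade; refusing to run the schema migration"
                % version_str)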

== Summary ==

These are just a couple of examples - but I hope they're at least mildly
useful to explain some of the sorts of issues at hand - and why I think
we need to clarify what our intent is separate from the issue of what
databases we "support".

Some operations have one and only one "right" way to be done. For those
operations if we take an 'active' approach, we can implement them once
and not make all of our deployers and distributors each implement and
run them. However, there is a cost to that. Automatic and prescriptive
behavior has a higher dev cost that is proportional to the number of
supported architectures. This then implies a need to limit deployer
architecture choices.

On the other hand, taking an 'external' approach allows us to federate
the work of supporting the different architectures to the deployers.
This means more work on the deployer's part, but also potentially a
greater amount of freedom on their part to deploy supporting services
the way they want. It means that some of the things that have been
requested of us - such as easier operation and an increase in the number
of things that can be upgraded with no-downtime - might become
prohibitively costly for us to implement.

I honestly think that both are acceptable choices we can make and that
for any given topic there are middle grounds to be found at any given
moment in time.

BUT - without a decision as to what our long-term philosophical intent
in this space is that is clear and understandable to everyone, we cannot
have successful discussions about the impact of implementation choices,
since we will not have a shared understanding of the problem space or
the solutions we're talking about.

For my part - I hear complaints that OpenStack is 'difficult' to operate
and requests for us to make it easier. This is why I have been
advocating some actions that are clearly rooted in an 'active' worldview.

Finally, this is focused on the database layer but similar questions
arise in other places. What is our philosophy on prescriptive/active
choices on our part coupled with automated action and ease of operation
vs. expanded choices for the deployer at the expense of configuration
and operational complexity. For now let's see if we can answer it for
databases, and see where that gets us.

Thanks for reading.

Monty


OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
asked May 23, 2017 in openstack-dev by Monty_Taylor (22,780 points)   2 5 7

15 Responses

0 votes

On 05/21/2017 03:38 PM, Monty Taylor wrote:
documentation on the sequence of steps the operator should take.

In the "active" approach, we still document expectations, but we also
validate them. If they are not what we expect but can be changed at
runtime, we change them overriding conflicting environmental config, and
if we can't, we hard-stop indicating an unsuitable environment. Rather
than providing helper tools, we perform the steps needed ourselves, in
the order they need to be performed, ensuring that they are done in the
manner in which they need to be done.

we do this in places like tripleo. The MySQL configs and such are
checked into the source tree, including details like
innodb_file_per_table, timeouts used by haproxy, etc. I know tripleo
is not a service itself the way Nova is, but it's also not exactly
something we hand off to the operators to figure out from scratch either.

We do some of it in oslo.db as well. We set things like MySQL SQL_MODE.
We try to make sure the unicode-ish flags are set up and that we're
using utf-8 encoding.
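
Roughly speaking (a simplified sketch, not oslo.db's actual code), that
kind of per-connection setup is done by hooking new connections and
issuing the SET there:

    import sqlalchemy as sa
    from sqlalchemy import event

    engine = sa.create_engine("mysql+pymysql://user:pass@dbhost/keystone")

    @event.listens_for(engine, "connect")
    def set_session_flags(dbapi_conn, connection_record):
        # Runs once for every new DBAPI connection in the pool, so every
        # session the service uses gets a predictable SQL_MODE.
        cursor = dbapi_conn.cursor()
        cursor.execute("SET SESSION sql_mode = 'TRADITIONAL'")
        cursor.close()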

Some examples:

  • Character Sets / Collations

We currently enforce at testing time that all database migrations are
explicit about InnoDB. We also validate in oslo.db that table character
sets have the string 'utf8' in them (only on MySQL). We do not have any
check for case-sensitive or case-insensitive collations (these affect
sorting and comparison operations). Because we don't, different server
config settings or different database backends for different clouds can
actually behave differently through the REST API.

To deal with that:

First we'd have to decide whether case sensitive or case insensitive was
what we wanted. If we decided we wanted case sensitive, we could add an
enforcement of that in oslo.db, and write migrations to get from case
insensitive indexes to case sensitive indexes on tables where we
detected that a case insensitive collation had been used. If we decided
we wanted to stick with case insensitive we could similarly add code to
enforce it on MySQL. To enforce it actively on PostgreSQL, we'd need to
either switch our code that's using comparisons to use the sqlalchemy
case-insensitive versions explicitly, or maybe write some sort of
overloaded driver for PG that turns all comparisons into
case-insensitive, which would wrap both sides of comparisons in lower()
calls (which has some indexing concerns, but let's ignore that for the
moment). We could also take the 'external' approach and just document it,
then define API tests and try to tie the insensitive behavior in the API
to Interop Compliance. I'm not 100% sure how a db operator would
remediate this - but PG has some fancy computed index features - so
maybe it would be possible.

let's make the case sensitivity explicitly enforced!

A similar issue lurks with the fact that MySQL unicode storage is 3-byte
by default and 4-byte is opt-in. We could take the 'external' approach
and document it and assume the operator has configured their my.cnf with
the appropriate default, or take an 'active' approach where we override
it in all the models and make migrations to get us from 3 to 4 byte.

let's force MySQL to use utf8mb4! Although I am curious what is the
actual use case we want to hit here (which gets into, zzzeek is ignorant
as to which unicode glyphs actually live in 4-byte utf8 characters).

  • Schema Upgrades

The way you roll out online schema changes is highly dependent on your
database architecture.

Just limiting to the MySQL world:

If you do Galera, you can roll them out in Total Order or Rolling
fashion. Total Order locks basically everything while it's happening, so
isn't a candidate for "online". In rolling you apply the schema change
to one node at a time. If you do that, the application has to be able to
deal with both forms of the table, and you have to deal with ensuring
that data can replicate appropriately while the schema change is happening.

Galera replicates DDL operations. If I add a column on a node, it pops
up on the other nodes too in a similar way as transactions are
replicated, i.e. nearly synchronously. I would assume it has to do
this in the context of its usual transaction ordering, even though
MySQL doesn't do transactional DDL, so that if the cluster sees
transaction A, schema change B, transaction C that depends on B, that
ordering is serialized appropriately. However, even if it doesn't do
that, the rolling upgrades we do don't start the services talking to the
new schema structures until the DDL changes are complete, and Galera is
near-synchronous replication.

Also speaking to the "active" question, we certainly have all kinds of
logic in Openstack (the optimistic update strategy in particular) that
take "Galera" into account. And of course we have Galera config inside
of tripleo. So that's kind of the "active" approach, I think.

If you do DRBD active/passive or a single-node deployment you only have
one upgrade operation to perform, but you will only lock certain things
- depending on what schema change operations you were performing.

If you do master/slave, you can roll out the schema change to your
slaves one at a time, wait for them all to catch up, then promote a
slave taking the current master out of commission - update the old
master and then put it into the slave pool. Like Galera rolling, the
app needs to be able to handle old and new versions and the replication
stream needs to be able to replicate between the versions.

Making sure that the stream is able to replicate puts a set of
limitations on the types of schema changes you can perform, but it is an
understandable constrained set.

My current thinking for online upgrades is that the schema changes and the
application speaking to those schema changes are at least isolated
states of the openstack cluster as a whole. That's at least how it
seems to work right now. Also right now, Openstack has almost no code
I'm aware of that takes advantage of true master / asynchronous slaves.
While support has been kind of stuck in oslo.db for years (and in
enginefacade I added new decorators that allow you to declare a method
as safe to run in a "slave"), applications are hardly using this feature
at all. I vaguely recall one obscure feature in Nova maybe using it for
something. But last I checked, even if you configure Openstack with a
"master" and "slave" database URL (which we support!), 90% of everything
is on the "master" anyway (perhaps some projects that I never look at do
in fact use the "slave", please let me know as I should probably be more
familiar with that).
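
To illustrate the general idea for anyone following along (a generic
sketch with two plain SQLAlchemy engines and placeholder URLs, not the
actual enginefacade API):

    import sqlalchemy as sa

    master = sa.create_engine("mysql+pymysql://user:pass@db-master/nova")
    slave = sa.create_engine("mysql+pymysql://user:pass@db-replica/nova")

    def count_instances_from_slave():
        # A read-only query declared safe to run against an asynchronous
        # replica; it may see slightly stale data.
        with slave.connect() as conn:
            return conn.execute(
                sa.text("SELECT COUNT(*) FROM instances")).scalar()

    def create_instance(hostname):
        # Writes always go to the master.
        with master.begin() as conn:
            conn.execute(
                sa.text("INSERT INTO instances (hostname) VALUES (:h)"),
                {"h": hostname})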

In either approach the OpenStack service has to be able to talk to both
old and new versions of the schema. And in either approach we need to
make sure to limit the schema change operations to the set that can be
accomplished in an online fashion. We also have to be careful to not
start writing values to new columns until all of the nodes have been
updated, because the replication stream can't replicate the new column
value to nodes that don't have the new column.

This is...what everyone (except keystone w/ the evil triggers) does
already, I thought?

In either approach we can decide to limit the number of architectures we
support for "online" upgrades.

In an 'external' approach, we make sure to do those things, we write
documentation and we assume the database will be updated appropriately.
We can document that if the deployer chooses to do Total Order on
Galera, they will not have online upgrades. There will also have to be a
deployer step to let the services know that they can start writing
values to the new schema format once the upgrade is complete.

In an 'active' approach, we can notice that we have an update available
to run, and we can drive it from code. We can check for Galera, and if
it's there we can run the upgrade in Rolling fashion one node at a time
with no work needed on the part of the deployer. Since we're driving the
upgrade, we know when it's done, so we can signal ourselves to start
using the new version. We'd obviously have to pick the set of acceptable
architectures we can handle consistently orchestrating.

  • Versions

It's worth noting that behavior for schema updates and other things
change over time with backend database version. We set minimum versions
of other things, like libvirt and OVS - so we might also want to set
minimum versions for what we can support in the database.

agree though so far I don't think we've hit too many features that have
an issue here, the MySQL/Mariadb 5.x set of features are ubiquitous now
and that's pretty much what we target. In the Postgresql world, they
are crazy with the new syntaxes every release (to my dismay having to
support them all) but none of these are really appropriate for Openstack
as long as we are targeting MySQL also.

That way we
can know for a given release of OpenStack what DDL operations are safe
to use for a rolling upgrade and what are not. That means detecting such
a version and potentially refusing to perform an upgrade if the version
isn't acceptable. That reduces the operator's ability to choose what
version of the database software to run, but increases our ability to be
able to provide tooling and operations that we can be confident will work.

We definitely make sure that if we put a migration directive somewhere,
it's going to work on the MySQL/MariaDB's that are in general use. I
think there might have even been some behavior recently that was perhaps
on the 5.5/5.6 border but I can't recall.

== Summary ==

These are just a couple of examples - but I hope they're at least mildly
useful to explain some of the sorts of issues at hand - and why I think
we need to clarify what our intent is separate from the issue of what
databases we "support".

Some operations have one and only one "right" way to be done. For those
operations if we take an 'active' approach, we can implement them once
and not make all of our deployers and distributors each implement and
run them. However, there is a cost to that. Automatic and prescriptive
behavior has a higher dev cost that is proportional to the number of
supported architectures. This then implies a need to limit deployer
architecture choices.

On the other hand, taking an 'external' approach allows us to federate
the work of supporting the different architectures to the deployers.
This means more work on the deployer's part, but also potentially a
greater amount of freedom on their part to deploy supporting services
the way they want. It means that some of the things that have been
requested of us - such as easier operation and an increase in the number
of things that can be upgraded with no-downtime - might become
prohibitively costly for us to implement.

I think right now we are doing a "hybrid". If you're on a MySQL
variant, you get the cadillac version and if you're going with
Postgresql, you get the stick shift. I'm not endorsing this but it
does seem to work to some extent.

I honestly think that both are acceptable choices we can make and that
for any given topic there are middle grounds to be found at any given
moment in time.

ok i just said that

BUT - without a decision as to what our long-term philosophical intent
in this space is that is clear and understandable to everyone, we cannot
have successful discussions about the impact of implementation choices,
since we will not have a shared understanding of the problem space or
the solutions we're talking about.

For my part - I hear complaints that OpenStack is 'difficult' to operate
and requests for us to make it easier. This is why I have been
advocating some actions that are clearly rooted in an 'active' worldview.

I think this goes to a point I typed on the etherpad in the Boston
session: I don't think that MySQL defaulting to 3-byte utf8, or a
deployer who happens to use Postgresql suddenly getting case-sensitive
comparisons, are the big reasons openstack is "difficult". I find
openstack to be really difficult but setting the db connection URL and
running the "db-manage" scripts is kind of the easiest part of it (but
of course, I'm super biased on that).

Finally, this is focused on the database layer but similar questions
arise in other places. What is our philosophy on prescriptive/active
choices on our part coupled with automated action and ease of operation
vs. expanded choices for the deployer at the expense of configuration
and operational complexity. For now let's see if we can answer it for
databases, and see where that gets us.

Thanks for reading.

Monty


OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
responded May 22, 2017 by Mike_Bayer (15,260 points)   1 5 6
0 votes

On 05/21/2017 10:09 PM, Mike Bayer wrote:

A similar issue lurks with the fact that MySQL unicode storage is
3-byte by default and 4-byte is opt-in. We could take the 'external'
approach and document it and assume the operator has configured their
my.cnf with the appropriate default, or take an 'active' approach
where we override it in all the models and make migrations to get us
from 3 to 4 byte.

let's force MySQL to use utf8mb4! Although I am curious what is the
actual use case we want to hit here (which gets into, zzzeek is ignorant
as to which unicode glyphs actually live in 4-byte utf8 characters).

There are sets of existing CJK ideographs in the 4-byte range, and the
reality is that all the world's languages are still not encoded in
unicode, so more Asian languages probably land in here in the future.

We've had specific bug reports in Nova on this, but it's actually a lot
to dig out of because that db migration seems expensive.

-Sean

--
Sean Dague
http://dague.net


OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
responded May 22, 2017 by Sean_Dague (66,200 points)   4 8 14
0 votes

On Sun, 21 May 2017, Monty Taylor wrote:

As the discussion around PostgreSQL has progressed, it has become clear to me
that there is a decently deep philosophical question on which we do not
currently share either definition or agreement. I believe that the lack of
clarity on this point is one of the things that makes the PostgreSQL
conversation difficult.

Good analysis. I think this does hit to at least some of the core
differences, maybe even most. And as with so many other things we do
in OpenStack, because we have landed somewhere in the middle between
the two positions we find ourselves in a pickle (see, for example,
the different needs for and attitudes to orchestration underlying
this thread [1]).

You're right to say we need to pick one and move in that direction
but our standard struggles with reaching agreement across the entire
community, especially on an opinionated position, will need to be
overcome. Writing about it to make it visible is a good start.

In the "external" approach, we document the expectations and then write the
code assuming that the database is set up appropriately. We may provide some
helper tools, such as 'nova-manage db sync' and documentation on the sequence
of steps the operator should take.

In the "active" approach, we still document expectations, but we also
validate them. If they are not what we expect but can be changed at runtime,
we change them overriding conflicting environmental config, and if we can't,
we hard-stop indicating an unsuitable environment. Rather than providing
helper tools, we perform the steps needed ourselves, in the order they need
to be performed, ensuring that they are done in the manner in which they need
to be done.

I think there's a middle ground here which is "externalize but
validate" which is:

  • document expectations
  • validate them
  • do not change at runtime, but tell people what's wrong

Some operations have one and only one "right" way to be done. For those
operations if we take an 'active' approach, we can implement them once and
not make all of our deployers and distributors each implement and run them.
However, there is a cost to that. Automatic and prescriptive behavior has a
higher dev cost that is proportional to the number of supported
architectures. This then implies a need to limit deployer architecture
choices.

That "higher dev cost" is one of my objections to the 'active'
approach but it is another implication that worries me more. If we
limit deployer architecture choices at the persistence layer then it
seems very likely that we will be tempted to build more and more
power and control into the persistence layer rather than in the
so-called "business" layer. In my experience this is a recipe for
ossification. The persistence layer needs to be dumb and
replaceable.

On the other hand, taking an 'external' approach allows us to federate the
work of supporting the different architectures to the deployers. This means
more work on the deployer's part, but also potentially a greater amount of
freedom on their part to deploy supporting services the way they want. It
means that some of the things that have been requested of us - such as easier
operation and an increase in the number of things that can be upgraded with
no-downtime - might become prohibitively costly for us to implement.

That's not necessarily the case. Consider that in an external
approach, where the persistence layer is opaque to the application, it
means that third parties (downstream consumers, the market, the
invisible hand, etc) have the option to do all kinds of wacky stuff.
Probably avec containers™.

In that model, the core functionality is simple and adequate but not
deluxe. Deluxe is an after-market add on.

BUT - without a decision as to what our long-term philosophical intent in
this space is that is clear and understandable to everyone, we cannot have
successful discussions about the impact of implementation choices, since we
will not have a shared understanding of the problem space or the solutions
we're talking about.

Yes.

For my part - I hear complaints that OpenStack is 'difficult' to operate and
requests for us to make it easier. This is why I have been advocating some
actions that are clearly rooted in an 'active' worldview.

If OpenStack were more of a monolith instead of a system with 3 to
many different databases, along with some optional number of other
ways to do other kinds of (short term) persistence, I would find the
'active' model a good option. If we were to start over I'd say let's
do that.

But as it stands implementing actually useful 'active' management of
the database feels like a very large amount of work that will take
so long that by the time we complete it it will be not just out of
date but also limit us.

External but validate feels much more viable. What we really want is
that people can get reasonably good results without trying that hard
and great (but also various) results with a bit of effort.

So that means it ought to be possible to do enough OpenStack to
think it is cool with whatever database I happen to have handy. And
then once I dig it I should be able to manage it effectively using
the solutions that are best for my environment.

Finally, this is focused on the database layer but similar questions arise in
other places. What is our philosophy on prescriptive/active choices on our
part coupled with automated action and ease of operation vs. expanded choices
for the deployer at the expense of configuration and operational complexity.
For now let's see if we can answer it for databases, and see where that gets
us.

I continue to think that this issue is somewhat special at the
persistence layer because of the balance of who it impacts the most:
the deployers, developers, and distributors more than the users[2].
Making global conclusions about external and active based on this
issue may be premature.

Thanks for reading.

Thanks for writing. You've done a lot of writing lately. Is good.

[1] http://lists.openstack.org/pipermail/openstack-operators/2017-May/013464.html

[2] That our database choices impacts the users (e.g., the case and encoding
things at the API layer) is simply a mistake that we all made together, a
bug to be fixed, not an architectural artifact.
--
Chris Dent ┬──┬◡ノ(° -°ノ) https://anticdent.org/
freenode: cdent tw: @anticdent
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

responded May 23, 2017 by cdent_plus_os_at_ant (12,800 points)   2 2 4
0 votes

On 05/23/2017 07:23 AM, Chris Dent wrote:

Some operations have one and only one "right" way to be done. For
those operations if we take an 'active' approach, we can implement
them once and not make all of our deployers and distributors each
implement and run them. However, there is a cost to that. Automatic
and prescriptive behavior has a higher dev cost that is proportional
to the number of supported architectures. This then implies a need to
limit deployer architecture choices.

That "higher dev cost" is one of my objections to the 'active'
approach but it is another implication that worries me more. If we
limit deployer architecture choices at the persistence layer then it
seems very likely that we will be tempted to build more and more
power and control into the persistence layer rather than in the
so-called "business" layer. In my experience this is a recipe for
ossification. The persistence layer needs to be dumb and
replaceable.

Why?

Do you have an example of an Open Source project that (after it was
widely deployed) replaced their core storage engine for their existing
users?

I do get that when building more targeted things, this might be a value,
but I don't see that as a useful design constraint for OpenStack.

-Sean

--
Sean Dague
http://dague.net


OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
responded May 23, 2017 by Sean_Dague (66,200 points)   4 8 14
0 votes

On Tue, 23 May 2017, Sean Dague wrote:

Do you have an example of an Open Source project that (after it was
widely deployed) replaced their core storage engine for their existing
users?

That's not the point here. The point is that new deployments may
choose to use a different one and old ones can choose to change if
they like (but don't have to) if storage is abstracted.

The notion of a "core storage engine" is not something that I see as
currently existing in OpenStack. It is clear it is something that at
least you and likely several other people would like to see.

But it is most definitely not something we have now and as I
responded to Monty, getting there from where we are now would be a
huge undertaking with as yet unproven value [1].

I do get that when building more targeted things, this might be a value,
but I don't see that as a useful design constraint for OpenStack.

Completely the opposite from my point of view. When something is as
frameworky as OpenStack is (perhaps accidentally and probably
unfortunately) then of course replaceable DBs are the norm,
expected, useful and potentially required to satisfy more use cases.

Adding specialization (tier 1?) is probably something we want and
want to encourage but it is not something we should build into the
"core" of the "product".

But there's that philosophical disagreement again. I'm not sure we
can resolve that. What I'm hoping is that by starting the ball
rolling other people will join in and people like you and me can
step out of the way.

[1] Of the issues described elsewhere in the thread the only one
which seems to be a bit of a sticking point is the trigger thing, and
there's significant disagreement on that being "okay".

--
Chris Dent ┬──┬◡ノ(° -°ノ) https://anticdent.org/
freenode: cdent tw: @anticdent
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

responded May 23, 2017 by cdent_plus_os_at_ant (12,800 points)   2 2 4
0 votes

Comments below..

On 5/21/2017 1:38 PM, Monty Taylor wrote:
Hi all!

As the discussion around PostgreSQL has progressed, it has become clear
to me that there is a decently deep philosophical question on which we
do not currently share either definition or agreement. I believe that
the lack of clarity on this point is one of the things that makes the
PostgreSQL conversation difficult.

I believe the question is between these two things:

  • Should OpenStack assume the existence of an external database
    service that it treats as a black box on the other side of a
    connection string?

  • Should OpenStack take an active and/or opinionated role in managing
    the database service?

A potentially obvious question about that (asked by Mike Bayer in a
different thread) is: "what do you mean by managing?"

What I mean by managing is doing all of the things you can do related
to database operational controls short of installing the software,
writing the basic db config files to disk and stopping and starting
the services. It means being much more prescriptive about what types
of config we support, validating config settings that cannot be
overridden at runtime and refusing to operate if they are unworkable.

I think it's helpful and important for us to have automation tooling
like tripleo, puppet, etc. that can stand up a MySQL database. But we
also have to realize that as shops mature, they will deploy more
complicated database topologies, clustered configurations, and
replication scenarios. So I think we shouldn't go overboard with being
prescriptive. We also have to realize that in the enterprise space,
databases are usually deployed and managed by a separate database team,
which means less control over that layer. So we shouldn't force people
into this model. We should provide best practice documentation, examples
(tripleo, puppet, ansible, etc.), and leave it up to the operator.

Why would we want to be 'more active'? When managing and tuning
databases, there are some things that are driven by the environment
and some things that are driven by the application.

Things that are driven by the environment include things like the
amount of RAM actually available, whether or not the machines running
the database are dedicated or shared, firewall settings, selinux
settings and what versions of software are available.

This is a good example of an area that we should focus on documenting
best practices and leave it to the operator to implement. Guidelines
around cpu, memory, security settings, tunables, etc. are what's needed
here. Today, there isn't really any guidance or best practices on even
sizing the database(s) for a given deployment size.

Things that are driven by the application are things like character
set and collation, schema design, data types, schema upgrade and HA
strategies.

These are things that we can have a bit more control or direction on.

One might argue that HA strategies are an operator concern, but in
reality the set of workable HA strategies is tightly constrained by
how the application works, and pairing an application expecting
one HA strategy with a deployment implementing a different one can
have negative results ranging from unexpected downtime to data
corruption.

For example: An HA strategy using slave promotion and a VIP that
points at the current write master paired with an application
incorrectly configured to do such a thing can lead to writes to the
wrong host after a failover event and an application that seems to be
running fine until the data turns up weird after a while.

This is definitely a more complicated area that becomes more and more
specific to the clustering technology being used. Galera vs. MySQL
Cluster is a good example. Galera has an active/passive architecture
where the above issues become a concern for sure. MySQL Cluster (NDB),
on the other hand, has an active/active architecture, so losing a node
only affects uncommitted transactions, which could easily be addressed
with a retry. These topologies will become more complicated as people start
looking at cross regional replication and DR.

For the areas in which the characteristics of the database are tied
closely to the application behavior, there is a constrained set of
valid choices at the database level. Sometimes that constrained set
only has one member.

The approach to those is what I'm talking about when I ask the
question about "external" or "active".

In the "external" approach, we document the expectations and then
write the code assuming that the database is set up appropriately. We
may provide some helper tools, such as 'nova-manage db sync' and
documentation on the sequence of steps the operator should take.

In the "active" approach, we still document expectations, but we also
validate them. If they are not what we expect but can be changed at
runtime, we change them overriding conflicting environmental config,
and if we can't, we hard-stop indicating an unsuitable environment.
Rather than providing helper tools, we perform the steps needed
ourselves, in the order they need to be performed, ensuring that they
are done in the manner in which they need to be done.

This might be a trickier situation, especially if the database(s) are in
a separate or dedicated environment that the OpenStack service processes
don't have access to. Of course for SQL commands, this isn't a problem.
But changing the configuration files and restarting the database may be
a harder thing to expect.

Some examples:

  • Character Sets / Collations

We currently enforce at testing time that all database migrations are
explicit about InnoDB. We also validate in oslo.db that table
character sets have the string 'utf8' in them (only on MySQL). We do
not have any check for case-sensitive or case-insensitive collations
(these affect sorting and comparison operations). Because we don't,
different server config settings or different database backends for
different clouds can actually behave differently through the REST API.

To deal with that:

First we'd have to decide whether case sensitive or case insensitive
was what we wanted. If we decided we wanted case sensitive, we could
add an enforcement of that in oslo.db, and write migrations to get
from case insensitive indexes to case sensitive indexes on tables
where we detected that a case insensitive collation had been used. If
we decided we wanted to stick with case insensitive we could similarly
add code to enforce it on MySQL. To enforce it actively on
PostgreSQL, we'd need to either switch our code that's using
comparisons to use the sqlalchemy case-insensitive versions
explicitly, or maybe write some sort of overloaded driver for PG that
turns all comparisons into case-insensitive, which would wrap both
sides of comparisons in lower() calls (which has some indexing
concerns, but let's ignore that for the moment). We could also take the
'external' approach and just document it, then define API tests and
try to tie the insensitive behavior in the API to Interop Compliance.
I'm not 100% sure how a db operator would remediate this - but PG has
some fancy computed index features - so maybe it would be possible.

I think that abstraction with oslo.db would be the right path here. But
you are also right that we need to have a consistent compliance policy
at the API layer. We may fix things down at the DB level with oslo.db,
but everything on top of that needs to also fall in line. There is a
very high chance that there are hard-coded workarounds or assumptions in
the services and APIs today.

A similar issue lurks with the fact that MySQL unicode storage is
3-byte by default and 4-byte is opt-in. We could take the 'external'
approach and document it and assume the operator has configured their
my.cnf with the appropriate default, or take an 'active' approach
where we override it in all the models and make migrations to get us
from 3 to 4 byte.

I think an active approach on this would be ideal, just like the utf8
and InnoDB settings are today. FYI, not all services are enforcing these
in a consistent manner today. Another example of something that should be
abstracted at the oslo.db layer and get the human element out of the way.

  • Schema Upgrades

The way you roll out online schema changes is highly dependent on your
database architecture.

Just limiting to the MySQL world:

If you do Galera, you can roll them out in Total Order or Rolling
fashion. Total Order locks basically everything while it's happening,
so isn't a candidate for "online". In rolling you apply the schema
change to one node at a time. If you do that, the application has to
be able to deal with both forms of the table, and you have to deal
with ensuring that data can replicate appropriately while the schema
change is happening.

If you do DRBD active/passive or a single-node deployment you only
have one upgrade operation to perform, but you will only lock certain
things - depending on what schema change operations you were performing.

If you do master/slave, you can roll out the schema change to your
slaves one at a time, wait for them all to catch up, then promote a
slave taking the current master out of commission - update the old
master and then put it into the slave pool. Like Galera rolling, the
app needs to be able to handle old and new versions and the
replication stream needs to be able to replicate between the versions.

Making sure that the stream is able to replicate puts a set of
limitations on the types of schema changes you can perform, but it is
an understandable constrained set.

In either approach the OpenStack service has to be able to talk to
both old and new versions of the schema. And in either approach we
need to make sure to limit the schema change operations to the set
that can be accomplished in an online fashion. We also have to be
careful to not start writing values to new columns until all of the
nodes have been updated, because the replication stream can't
replicate the new column value to nodes that don't have the new column.

This is another area where something like MySQL Cluster (NDB) would
operate differently because it's an active/active architecture. So
limiting the number of online changes while a table is locked across the
cluster would be very important. There are also the timeouts for the
applications to consider, something that could be abstracted again with
oslo.db.

In either approach we can decide to limit the number of architectures
we support for "online" upgrades.

In an 'external' approach, we make sure to do those things, we write
documentation and we assume the database will be updated
appropriately. We can document that if the deployer chooses to do
Total Order on Galera, they will not have online upgrades. There will
also have to be a deployer step to let the services know that they can
start writing values to the new schema format once the upgrade is
complete.

In an 'active' approach, we can notice that we have an update
available to run, and we can drive it from code. We can check for
Galera, and if it's there we can run the upgrade in Rolling fashion
one node at a time with no work needed on the part of the deployer.
Since we're driving the upgrade, we know when it's done, so we can
signal ourselves to start using the new version. We'd obviously have
to pick the set of acceptable architectures we can handle consistently
orchestrating.

This would be an interesting idea to expand into an autonomic orchestration
framework within the control plane to handle the database upgrades
online and the restarting of the dependent services in the correct
order. If we only focus on the database piece, it may not be as
interesting for operators.

  • Versions

It's worth noting that behavior for schema updates and other things
change over time with backend database version. We set minimum
versions of other things, like libvirt and OVS - so we might also want
to set minimum versions for what we can support in the database. That
way we can know for a given release of OpenStack what DDL operations
are safe to use for a rolling upgrade and what are not. That means
detecting such a version and potentially refusing to perform an
upgrade if the version isn't acceptable. That reduces the operator's
ability to choose what version of the database software to run, but
increases our ability to be able to provide tooling and operations
that we can be confident will work.

Validating the MySQL database version is a good idea. The features do
change over time. A good example is how in 5.7, you'll get warnings
about duplicate indexes being dropped in a future release which will
definitely affect multiple services today.

== Summary ==

These are just a couple of examples - but I hope they're at least
mildly useful to explain some of the sorts of issues at hand - and why
I think we need to clarify what our intent is separate from the issue
of what databases we "support".

Some operations have one and only one "right" way to be done. For
those operations if we take an 'active' approach, we can implement
them once and not make all of our deployers and distributors each
implement and run them. However, there is a cost to that. Automatic
and prescriptive behavior has a higher dev cost that is proportional
to the number of supported architectures. This then implies a need to
limit deployer architecture choices.

On the other hand, taking an 'external' approach allows us to federate
the work of supporting the different architectures to the deployers.
This means more work on the deployer's part, but also potentially a
greater amount of freedom on their part to deploy supporting services
the way they want. It means that some of the things that have been
requested of us - such as easier operation and an increase in the
number of things that can be upgraded with no-downtime - might become
prohibitively costly for us to implement.

I honestly think that both are acceptable choices we can make and that
for any given topic there are middle grounds to be found at any given
moment in time.

BUT - without a decision as to what our long-term philosophical intent
in this space is that is clear and understandable to everyone, we
cannot have successful discussions about the impact of implementation
choices, since we will not have a shared understanding of the problem
space or the solutions we're talking about.

For my part - I hear complaints that OpenStack is 'difficult' to
operate and requests for us to make it easier. This is why I have been
advocating some actions that are clearly rooted in an 'active' worldview.

Finally, this is focused on the database layer but similar questions
arise in other places. What is our philosophy on prescriptive/active
choices on our part coupled with automated action and ease of
operation vs. expanded choices for the deployer at the expense of
configuration and operational complexity. For now let's see if we can
answer it for databases, and see where that gets us.

Thanks for reading.

Monty


OpenStack Development Mailing List (not for usage questions)
Unsubscribe:
OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
responded May 23, 2017 by Octave_J._Orgeron (1,520 points)   1
0 votes

On 05/23/2017 01:10 PM, Octave J. Orgeron wrote:
Comments below..

On 5/21/2017 1:38 PM, Monty Taylor wrote:

For example: An HA strategy using slave promotion and a VIP that
points at the current write master paired with an application
incorrectly configured to do such a thing can lead to writes to the
wrong host after a failover event and an application that seems to be
running fine until the data turns up weird after a while.

This is definitely a more complicated area that becomes more and more
specific to the clustering technology being used. Galera vs. MySQL
Cluster is a good example. Galera has an active/passive architecture
where the above issues become a concern for sure.

This is not my understanding; Galera is multi-master, and if you lose a
node, you don't lose any committed transactions; the writesets are
validated as acceptable by, and pushed out to, all nodes before your
commit succeeds. There's an option to make it wait until all those
writesets are fully written to disk as well, but even with that option
flipped off, if you COMMIT to one node and then that node explodes, you
lose nothing: your writesets have been verified and will be accepted by
all the other nodes.

active/active is the second bullet point on the main homepage:
http://galeracluster.com/products/

In the "active" approach, we still document expectations, but we also
validate them. If they are not what we expect but can be changed at
runtime, we change them overriding conflicting environmental config,
and if we can't, we hard-stop indicating an unsuitable environment.
Rather than providing helper tools, we perform the steps needed
ourselves, in the order they need to be performed, ensuring that they
are done in the manner in which they need to be done.

This might be a trickier situation, especially if the database(s) are in
a separate or dedicated environment that the OpenStack service processes
don't have access to. Of course for SQL commands, this isn't a problem.
But changing the configuration files and restarting the database may be
a harder thing to expect.

nevertheless the HA setup within tripleo does do this, currently using
Pacemaker and resource agents. This is within the scope of at least
parts of Openstack.

In either approach the OpenStack service has to be able to talk to
both old and new versions of the schema. And in either approach we
need to make sure to limit the schema change operations to the set
that can be accomplished in an online fashion. We also have to be
careful to not start writing values to new columns until all of the
nodes have been updated, because the replication stream can't
replicate the new column value to nodes that don't have the new column.

This is another area where something like MySQL Cluster (NDB) would
operate differently because it's an active/active architecture. So
limiting the number of online changes while a table is locked across the
cluster would be very important. There is also the timeouts for the
applications to consider, something that could be abstracted again with
oslo.db.

So the DDL we do on Galera, to confirm but also clarify Monty's point,
is under the realm of "total order isolation", which means it's going to
hold up the whole cluster while DDL is applied to all nodes. Monty
says this disqualifies it as an "online upgrade", which is because if
you emitted DDL that had to run default values into a million rows then
your whole cluster would temporarily have to wait for that to happen; we
handle that by making sure we don't do migrations with that kind of data
requirement and while yes, the DB has to wait for a schema change to
apply, they are at least very short (in theory). For practical
purposes, it is mostly an "online" style of migration because all the
services that talk to the database can keep on talking to the database
without being stopped, upgraded to new software version, and restarted,
which IMO is what's really hard about "online" upgrades. It does mean
that services will just have a little more latency while operations
proceed. Maybe we need a new term called "quasi-online" or something
like that.

Facebook has released a Python version of their "online" schema
migration tool for MySQL which does the full blown "create a new, blank
table" approach, i.e. a new table which contains the newer version of the
schema, so that nothing at all stops or slows down at all. And then to manage
between the two tables while everything is running it also makes a
"change capture" table to keep track of what's going on, and then to
wire it all together it uses...triggers!
https://github.com/facebookincubator/OnlineSchemaChange/wiki/How-OSC-works.
Crazy Facebook kids. How we know that "make two more tables and wire
it all together with new triggers" is in fact more performant than just
"add a column to the table", I'm not sure, nor how/when they make that
determination. I don't see an Openstack cluster as quite the same
thing as hosting a site like Facebook so I lean towards the more liberal
interpretation of "online upgrades".

  • Versions

It's worth noting that behavior for schema updates and other things
change over time with backend database version. We set minimum
versions of other things, like libvirt and OVS - so we might also want
to set minimum versions for what we can support in the database. That
way we can know for a given release of OpenStack what DDL operations
are safe to use for a rolling upgrade and what are not. That means
detecting such a version and potentially refusing to perform an
upgrade if the version isn't acceptable. That reduces the operator's
ability to choose what version of the database software to run, but
increases our ability to be able to provide tooling and operations
that we can be confident will work.

Validating the MySQL database version is a good idea. The features do
change over time. A good example is how, in 5.7, you'll get warnings
that duplicate indexes will be dropped in a future release, which will
definitely affect multiple services today.
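
A version floor like that is cheap to enforce at startup. A hedged
sketch of the idea, with the 5.7 minimum and the connection details
chosen purely for illustration (not something oslo.db actually does
today):

# Illustrative startup check: refuse to run against a MySQL server
# older than some project-defined minimum.
import sqlalchemy as sa

MIN_MYSQL_VERSION = (5, 7, 0)


def assert_min_db_version(engine):
    with engine.connect() as conn:
        raw = conn.execute(sa.text("SELECT VERSION()")).scalar()
    # VERSION() returns strings like '5.7.18-log'; keep the numeric part.
    version = tuple(int(p) for p in raw.split('-')[0].split('.')[:3])
    if version < MIN_MYSQL_VERSION:
        raise RuntimeError(
            "Database server %s is older than the minimum supported %s"
            % (raw, '.'.join(str(v) for v in MIN_MYSQL_VERSION)))


engine = sa.create_engine('mysql+pymysql://user:pass@db-vip/nova')
assert_min_db_version(engine)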

== Summary ==

These are just a couple of examples - but I hope they're at least
mildly useful to explain some of the sorts of issues at hand - and why
I think we need to clarify what our intent is separate from the issue
of what databases we "support".

Some operations have one and only one "right" way to be done. For
those operations if we take an 'active' approach, we can implement
them once and not make all of our deployers and distributors each
implement and run them. However, there is a cost to that. Automatic
and prescriptive behavior has a higher dev cost that is proportional
to the number of supported architectures. This then implies a need to
limit deployer architecture choices.

On the other hand, taking an 'external' approach allows us to federate
the work of supporting the different architectures to the deployers.
This means more work on the deployer's part, but also potentially a
greater amount of freedom on their part to deploy supporting services
the way they want. It means that some of the things that have been
requested of us - such as easier operation and an increase in the
number of things that can be upgraded with no downtime - might become
prohibitively costly for us to implement.

I honestly think that both are acceptable choices we can make and that
for any given topic there are middle grounds to be found at any given
moment in time.

BUT - without a decision as to what our long-term philosophical intent
in this space is that is clear and understandable to everyone, we
cannot have successful discussions about the impact of implementation
choices, since we will not have a shared understanding of the problem
space or the solutions we're talking about.

For my part - I hear complaints that OpenStack is 'difficult' to
operate and requests for us to make it easier. This is why I have been
advocating some actions that are clearly rooted in an 'active' worldview.

Finally, this is focused on the database layer, but similar questions
arise in other places. What is our philosophy on prescriptive/active
choices on our part, coupled with automated action and ease of
operation, vs. expanded choices for the deployer at the expense of
configuration and operational complexity? For now let's see if we can
answer it for databases, and see where that gets us.

Thanks for reading.

Monty


responded May 23, 2017 by Mike_Bayer

On 05/23/2017 07:23 AM, Chris Dent wrote:
That "higher dev cost" is one of my objections to the 'active'
approach but it is another implication that worries me more. If we
limit deployer architecture choices at the persistence layer then it
seems very likely that we will be tempted to build more and more
power and control into the persistence layer rather than in the
so-called "business" layer. In my experience this is a recipe for
ossification. The persistence layer needs to be dumb and
replaceable.

Err, in my experience, having a completely dumb persistence layer --
i.e. one that tries to assuage the differences between, say, relational
and non-relational stores -- is a recipe for disaster. The developer
just ends up writing join constructs in that business layer instead of
using a relational data store the way it is intended to be used. Same
for aggregate operations. [1]

Now, if what you're referring to is "don't use vendor-specific
extensions in your persistence layer", then yes, I agree with you.

Best,
-jay

[1] Witness the join constructs in Golang in Kubernetes as they work
around etcd not being a relational data store:

https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/deployment/deployment_controller.go#L528-L556

Instead of a single SQL statement:

SELECT p.* FROM pods AS p
JOIN deployments AS d
ON p.deployment_id = d.id
WHERE d.name = $name;

the deployments controller code has to read every Pod message from etcd
and loop through them, returning the list of Pods that match the
deployment searched for.

Similarly, the Kubernetes API does not support any aggregate (SUM,
GROUP BY, etc.) functionality. Instead, clients are required to perform
these kinds of calculations/operations in memory. This is because etcd,
being an (awesome) key/value store, is not designed for aggregate
operations (just as Cassandra or CockroachDB do not allow most
aggregate operations).

My point here is not to denigrate Kubernetes. Far from it. They (to
date) have a relatively shallow relational schema and doing join and
index maintenance [2] operations in client-side code has so far been a
cost that the project has been OK carrying. The point I'm trying to make
is that the choice of data store semantics (relational or not, columnar
or not, eventually-consistent or not, etc) does make a difference to
the architecture of a project, its deployment and the amount of code
that the project needs to keep to properly handle its data schema.
There's no way -- in my experience -- to make a "persistence layer" that
papers over these differences and ends up being useful.

[2] In Kubernetes, all services are required to keep all relevant data
in memory:

https://github.com/kubernetes/community/blob/master/contributors/design-proposals/principles.md

This means that code that maintains a bunch of in-memory indexes of
various data objects ends up being placed into every component. Here's
an example of this in the kubelet (the equivalent-ish of the
nova-compute daemon) pod manager, keeping an index of pods and mirrored
pods in memory:

https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/pod/pod_manager.go#L104-L114

https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/pod/pod_manager.go#L159-L181


responded May 23, 2017 by Jay_Pipes

On Tue, 23 May 2017, Jay Pipes wrote:

Err, in my experience, having a completely dumb persistence layer -- i.e.
one that tries to assuage the differences between, say, relational and
non-relational stores -- is a recipe for disaster. The developer just ends up
writing join constructs in that business layer instead of using a relational
data store the way it is intended to be used. Same for aggregate operations.
[1]

Now, if what you're referring to is "don't use vendor-specific extensions in
your persistence layer", then yes, I agree with you.

If you've committed to doing an RDBMS then, yeah, stick with
relational, but dumb relational. Since that's where we are [3] in
OpenStack, then we should go with that.

[3] Of course sometimes I'm sad that we made that commitment and wish
instead we had an abstract storage interface, one implementation of
which was stupid text files on disk, another of which was generic
sqlalchemy, and another raw SQL extracted wholesale from the mind of
jaypipes, optimized for Drizzle 8.x. But then I'm often sad about
completely unrealistic things.

--
Chris Dent ┬──┬◡ノ(° -°ノ) https://anticdent.org/
freenode: cdent tw: @anticdent

responded May 23, 2017 by cdent_plus_os_at_ant

On May 23, 2017, at 1:43 PM, Jay Pipes jaypipes@gmail.com wrote:

[1] Witness the join constructs in Golang in Kubernetes as they work around etcd not being a relational data store:

Maybe it’s just me, but I found that Go code more understandable than some of the SQL we are using in the placement engine. :)

I assume that the SQL in a relational engine is faster than the same thing in code, but is that difference significant? For extremely large data sets I think that the database processing may be rate limiting, but is that the case here? Sometimes it seems that we are overly obsessed with optimizing data handling when the amount of data is relatively small. A few million records should be fast enough using just about anything.

-- Ed Leafe


responded May 23, 2017 by Ed_Leafe
...