settingsLogin | Registersettings

[openstack-dev] [keystone] Adding foreign keys between subsystems

0 votes

[tl;dr I want to remove the artificial restriction of not allowing FKs between
subsystems and I want to stop FK enforcement in code.]

The keystone code architecture is pretty simple. The data and functionality are
divided up into subsystems. Each subsystem can be configured to use a different
backend datastore. Of course, there are always exceptions to the rule like how
the federation and identity subsystems are highly coupled in the data model.

On the surface this flexible model sounds good, but there are some interesting
consequences. First, you can't tell from looking at the data model that there
actually is a lot of coupling between the subsystems. So instead of looking at
our sqlalchemy models to see relationships, we must look throughout the code
for where a particular primary key is used and look for enforcement. (Hopefully
we enforce it in all of the right places.) Additionally, a DBA/data architect/
whenever can't see the relationship at all by looking at the database.

Second, this has polluted our code and causes erroneous API errors. We have added
lots of get_*() calls in our code that checks for the existence of IDs in
another subsystem. In some cases we probably don't do the check and thus would
allow bad data to be stored. The check often causes 404s instead of 400s when
bad data is provided.

So I'd like us to be more deliberate in defining which parts of the data model
are truly independent and a separate backend datastore would make sense. For
instance, we know we want to support LDAP for identity (although this still puts
identity info into a SQL database) and catalog is very isolated from the rest of
the data model.

As a side effect this means that if deployers wished to use a separate backend
for something they would need to also implement it for the other highly coupled
subsystems. They would also have to provide any FK enforcement that their own
datastore does not provide.

Thoughts?

--
david stanek
web: https://dstanek.com
twitter: https://twitter.com/dstanek


OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
asked Apr 12, 2017 in openstack-dev by dstanek_at_dstanek.c (2,440 points)   1 2

5 Responses

0 votes

On Wed, Apr 12, 2017 at 9:28 AM, David Stanek dstanek@dstanek.com wrote:

[tl;dr I want to remove the artificial restriction of not allowing FKs
between
subsystems and I want to stop FK enforcement in code.]

The keystone code architecture is pretty simple. The data and
functionality are
divided up into subsystems. Each subsystem can be configured to use a
different
backend datastore. Of course, there are always exceptions to the rule like
how
the federation and identity subsystems are highly coupled in the data
model.

On the surface this flexible model sounds good, but there are some
interesting
consequences. First, you can't tell from looking at the data model that
there
actually is a lot of coupling between the subsystems. So instead of
looking at
our sqlalchemy models to see relationships, we must look throughout the
code
for where a particular primary key is used and look for enforcement.
(Hopefully
we enforce it in all of the right places.) Additionally, a DBA/data
architect/
whenever can't see the relationship at all by looking at the database.

Second, this has polluted our code and causes erroneous API errors. We
have added
lots of get_*() calls in our code that checks for the existence of IDs in
another subsystem. In some cases we probably don't do the check and thus
would
allow bad data to be stored. The check often causes 404s instead of 400s
when
bad data is provided.

Having these cleaned up would be awesome. I imagine we'd also see some sort
of performance benefit as a result, too.

So I'd like us to be more deliberate in defining which parts of the data
model
are truly independent and a separate backend datastore would make sense.
For
instance, we know we want to support LDAP for identity (although this
still puts
identity info into a SQL database) and catalog is very isolated from the
rest of
the data model.

As a side effect this means that if deployers wished to use a separate
backend
for something they would need to also implement it for the other highly
coupled
subsystems. They would also have to provide any FK enforcement that their
own
datastore does not provide.

So for deployers, this would mean that if today they only deploy keystone
backed with LDAP for only identity, tomorrow they will have to ensure
that LDAP has all the proper things for other subsystems that use to have
an in-code constraint with identity (i.e. assignment). I wonder how many
folks that would be? What would an upgrade look like for deployments
wishing to stick to LDAP? I imagine we'd be raising the bar for that
particular upgrade.

Thoughts?

--
david stanek
web: https://dstanek.com
twitter: https://twitter.com/dstanek


OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
responded Apr 12, 2017 by Lance_Bragstad (11,080 points)   2 3 6
0 votes

Just to illustrate the discussion, we have a bug fix that currently tries
to drop a FK between the federation and identity subsystems [1].

The previous fix for this bug (which has been merged and had the backport
abandoned) took advantage of the FK in order to cascade delete federated
users when a protocol or an identity provider is deleted [2].

[1] https://review.openstack.org/#/c/445505/
[2] https://review.openstack.org/#/c/420893/

On Wed, Apr 12, 2017 at 11:58 AM, Lance Bragstad lbragstad@gmail.com
wrote:

On Wed, Apr 12, 2017 at 9:28 AM, David Stanek dstanek@dstanek.com wrote:

[tl;dr I want to remove the artificial restriction of not allowing FKs
between
subsystems and I want to stop FK enforcement in code.]

The keystone code architecture is pretty simple. The data and
functionality are
divided up into subsystems. Each subsystem can be configured to use a
different
backend datastore. Of course, there are always exceptions to the rule
like how
the federation and identity subsystems are highly coupled in the data
model.

On the surface this flexible model sounds good, but there are some
interesting
consequences. First, you can't tell from looking at the data model that
there
actually is a lot of coupling between the subsystems. So instead of
looking at
our sqlalchemy models to see relationships, we must look throughout the
code
for where a particular primary key is used and look for enforcement.
(Hopefully
we enforce it in all of the right places.) Additionally, a DBA/data
architect/
whenever can't see the relationship at all by looking at the database.

Second, this has polluted our code and causes erroneous API errors. We
have added
lots of get_*() calls in our code that checks for the existence of IDs in
another subsystem. In some cases we probably don't do the check and thus
would
allow bad data to be stored. The check often causes 404s instead of 400s
when
bad data is provided.

Having these cleaned up would be awesome. I imagine we'd also see some
sort of performance benefit as a result, too.

So I'd like us to be more deliberate in defining which parts of the data
model
are truly independent and a separate backend datastore would make sense.
For
instance, we know we want to support LDAP for identity (although this
still puts
identity info into a SQL database) and catalog is very isolated from the
rest of
the data model.

As a side effect this means that if deployers wished to use a separate
backend
for something they would need to also implement it for the other highly
coupled
subsystems. They would also have to provide any FK enforcement that their
own
datastore does not provide.

So for deployers, this would mean that if today they only deploy keystone
backed with LDAP for only identity, tomorrow they will have to ensure
that LDAP has all the proper things for other subsystems that use to have
an in-code constraint with identity (i.e. assignment). I wonder how many
folks that would be? What would an upgrade look like for deployments
wishing to stick to LDAP? I imagine we'd be raising the bar for that
particular upgrade.

Thoughts?

--
david stanek
web: https://dstanek.com
twitter: https://twitter.com/dstanek



OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscrib
e
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

--
Rodrigo Duarte Sousa
Senior Quality Engineer @ Red Hat
MSc in Computer Science
http://rodrigods.com


OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
responded Apr 12, 2017 by Rodrigo_Duarte (1,760 points)   2
0 votes

On 12-Apr 14:30, Rodrigo Duarte wrote:
Just to illustrate the discussion, we have a bug fix that currently tries
to drop a FK between the federation and identity subsystems [1].

[1] https://review.openstack.org/#/c/445505/

I think this highlights one of my problems with the current architecture. I see that
you've removed the FK and added delete logic to do what the data layer would be doing
for you. I didn't see any added getuser() checks to make sure the userid being used
in creates/updates is valid. Are we already checking that somewhere else or is this
introducing a new bug?

--
david stanek
web: https://dstanek.com
twitter: https://twitter.com/dstanek


OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
responded Apr 12, 2017 by dstanek_at_dstanek.c (2,440 points)   1 2
0 votes

On Wed, Apr 12, 2017 at 2:47 PM, David Stanek dstanek@dstanek.com wrote:

On 12-Apr 14:30, Rodrigo Duarte wrote:

Just to illustrate the discussion, we have a bug fix that currently tries
to drop a FK between the federation and identity subsystems [1].

[1] https://review.openstack.org/#/c/445505/

I think this highlights one of my problems with the current architecture.
I see that
you've removed the FK and added delete logic to do what the data layer
would be doing
for you. I didn't see any added getuser() checks to make sure the userid
being used
in creates/updates is valid. Are we already checking that somewhere else
or is this
introducing a new bug?

The review [1] is dropping the idpid and idpid + protocolid FKs, not the
user
id one.

--
david stanek
web: https://dstanek.com
twitter: https://twitter.com/dstanek


OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

--
Rodrigo Duarte Sousa
Senior Quality Engineer @ Red Hat
MSc in Computer Science
http://rodrigods.com


OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
responded Apr 12, 2017 by Rodrigo_Duarte (1,760 points)   2
0 votes

On 12-Apr 15:25, Rodrigo Duarte wrote:
On Wed, Apr 12, 2017 at 2:47 PM, David Stanek dstanek@dstanek.com wrote:

On 12-Apr 14:30, Rodrigo Duarte wrote:

Just to illustrate the discussion, we have a bug fix that currently tries
to drop a FK between the federation and identity subsystems [1].

[1] https://review.openstack.org/#/c/445505/

I think this highlights one of my problems with the current architecture.
I see that
you've removed the FK and added delete logic to do what the data layer
would be doing
for you. I didn't see any added getuser() checks to make sure the userid
being used
in creates/updates is valid. Are we already checking that somewhere else
or is this
introducing a new bug?

The review [1] is dropping the idpid and idpid + protocolid FKs, not the
user
id one.

Right, misremembered. Just run s/userid/idpid/ and s/getuser/getidp/ and you'll
have the same issues.

--
david stanek
web: https://dstanek.com
twitter: https://twitter.com/dstanek


OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
responded Apr 12, 2017 by dstanek_at_dstanek.c (2,440 points)   1 2
...