
[openstack-dev] [tempest][nova][defcore] Add option to disable some strict response checking for interop testing


Last year, in response to Nova micro-versioning and extension updates[1],
the QA team added strict API schema checking to Tempest to ensure that
no additional properties were added to Nova API responses[2][3]. In the
last year, at least three vendors participating in the OpenStack Powered
Trademark program have been impacted by this change, two of which
reported this to the DefCore Working Group mailing list earlier this year[4].
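Tempest enforces this with response schemas that disallow additional properties (see [8]). As a minimal stand-in for that check (illustrative only, not the actual Tempest code), the schema and property names below are simplified:

```python
# Minimal sketch of strict response checking: with additionalProperties
# disabled, any key not named in the schema fails validation.
def check_response(schema, response):
    """Return the sorted list of unexpected properties; empty means pass."""
    allowed = set(schema["properties"])
    if schema.get("additionalProperties", True):
        return []  # non-strict schemas tolerate extra keys
    return sorted(set(response) - allowed)

# Schema roughly in the shape of a Nova 'create server' response schema.
server_schema = {
    "properties": {"id": {}, "links": {}, "adminPass": {}},
    "additionalProperties": False,
}

# A vendor response carrying an out-of-band extra property.
response = {"id": "abc", "links": [], "adminPass": "x", "vendor:rack": "r1"}

print(check_response(server_schema, response))  # ['vendor:rack']
```

Before the strict checking landed, the extra `vendor:rack` key would simply have been ignored; afterwards it fails the test.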

The DefCore Working Group determines guidelines for the OpenStack Powered
program, which includes capabilities with associated functional tests
from Tempest that must be passed, and designated sections with associated
upstream code [5][6]. In determining these guidelines, the working group
attempts to balance the future direction of development with lagging
indicators of deployments and user adoption.

After a tremendous amount of consideration, I believe that the DefCore
Working Group needs to implement a temporary waiver for the strict API
checking requirements that were introduced last year, to give downstream
deployers more time to catch up with the strict micro-versioning
requirements determined by the Nova/Compute team and enforced by the
Tempest/QA team.

My reasoning behind this is that while the change that enabled strict
checking was discussed publicly in the developer community and took
some time to be implemented, it still landed quickly and broke several
existing deployments overnight. As Tempest has moved forward with
bug and UX fixes (some in part to support the interoperability testing
efforts of the DefCore Working Group), using older versions of Tempest
where this strict checking is not enforced is no longer a viable solution
for downstream deployers. The TC has passed a resolution to advise
DefCore to use Tempest as the single source of capability testing[7],
but this naturally introduces tension between the competing goals of
maintaining upstream functional testing and also tracking lagging
indicators.

My proposal for addressing this problem approaches it at two levels:

  • For the short term, I will submit a blueprint and patch to tempest that
    allows configuration of a grey-list of Nova APIs where strict response
    checking on additional properties will be disabled. So, for example,
    if the 'create servers' API call returned extra properties on that call,
    the strict checking on this line[8] would be disabled at runtime.
    Use of this code path will emit a deprecation warning, and the
    code will be scheduled for removal in 2017 directly after the release
    of the 2017.01 guideline. Vendors would be required to submit the
    grey-list of APIs with additional response data that would be
    published to their marketplace entry.

  • Longer term, vendors will be expected to work with upstream to update
    the API for returning additional data that is compatible with
    API micro-versioning as defined by the Nova team, and the waiver would
    no longer be allowed after the release of the 2017.01 guideline.
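A rough sketch of how such a grey-list might behave follows; the function and option names here are illustrative, not the actual Tempest blueprint:

```python
# Hypothetical grey-list behavior: API names listed in a configured
# grey-list have additionalProperties checking relaxed at runtime,
# with a deprecation warning noting the waiver's removal date.
import copy
import warnings

def relax_schema(schema, api_name, greylist):
    """Return the schema, with strict checking disabled if grey-listed."""
    if api_name not in greylist:
        return schema
    warnings.warn(
        "%s is grey-listed: strict response checking disabled; "
        "scheduled for removal after the 2017.01 guideline" % api_name,
        DeprecationWarning,
    )
    relaxed = copy.deepcopy(schema)  # leave the shipped schema untouched
    relaxed["additionalProperties"] = True
    return relaxed

server_schema = {"properties": {"id": {}}, "additionalProperties": False}
relaxed = relax_schema(server_schema, "create-servers", {"create-servers"})
print(relaxed["additionalProperties"])  # True
```

The deep copy keeps the relaxation a runtime-only effect, so the strict schema remains the source of truth once the waiver expires.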

For the next half-year, I feel that this approach strengthens interoperability
by accurately capturing the current state of OpenStack deployments and
client tools. Before this change, additional properties on responses
weren't explicitly disallowed, and vendors and deployers took advantage
of this in production. While this is behavior that the Nova and QA teams
want to stop, it will take a bit more time to reach downstream. Also, as
of right now, as far as I know the only client that does strict response
checking for Nova responses is the Tempest client. Currently, additional
properties in responses are ignored and do not break existing client
functionality. There is currently little to no harm done to downstream
users by temporarily allowing additional data to be returned in responses.
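For example, a typical client only reads the keys it knows about, so an extra property in a decoded response is silently ignored (illustrative, not any particular SDK's code):

```python
# Extra response properties don't break a client that only accesses
# the fields it expects.
import json

raw = '{"id": "s1", "status": "ACTIVE", "vendor:extra": 42}'
server = json.loads(raw)
print(server["id"], server["status"])  # s1 ACTIVE
```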

Thanks,

Chris Hoge
Interop Engineer
OpenStack Foundation

[1] https://specs.openstack.org/openstack/nova-specs/specs/kilo/implemented/api-microversions.html
[2] http://lists.openstack.org/pipermail/openstack-dev/2015-February/057613.html
[3] https://review.openstack.org/#/c/156130
[4] http://lists.openstack.org/pipermail/defcore-committee/2016-January/000986.html
[5] http://git.openstack.org/cgit/openstack/defcore/tree/2015.07.json
[6] http://git.openstack.org/cgit/openstack/defcore/tree/2016.01.json
[7] http://git.openstack.org/cgit/openstack/governance/tree/resolutions/20160504-defcore-test-location.rst
[8] http://git.openstack.org/cgit/openstack/tempest-lib/tree/tempest_lib/api_schema/response/compute/v2_1/servers.py#n39


OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
asked Jun 14, 2016 in openstack-dev by chris_at_openstack.o (3,260 points)   2 3

54 Responses


Excerpts from Chris Hoge's message of 2016-06-14 10:57:05 -0700:

Last year, in response to Nova micro-versioning and extension updates[1],
the QA team added strict API schema checking to Tempest to ensure that
no additional properties were added to Nova API responses[2][3]. In the
last year, at least three vendors participating in the OpenStack Powered
Trademark program have been impacted by this change, two of which
reported this to the DefCore Working Group mailing list earlier this year[4].

The DefCore Working Group determines guidelines for the OpenStack Powered
program, which includes capabilities with associated functional tests
from Tempest that must be passed, and designated sections with associated
upstream code [5][6]. In determining these guidelines, the working group
attempts to balance the future direction of development with lagging
indicators of deployments and user adoption.

After a tremendous amount of consideration, I believe that the DefCore
Working Group needs to implement a temporary waiver for the strict API
checking requirements that were introduced last year, to give downstream
deployers more time to catch up with the strict micro-versioning
requirements determined by the Nova/Compute team and enforced by the
Tempest/QA team.

My reasoning behind this is that while the change that enabled strict
checking was discussed publicly in the developer community and took
some time to be implemented, it still landed quickly and broke several
existing deployments overnight. As Tempest has moved forward with
bug and UX fixes (some in part to support the interoperability testing
efforts of the DefCore Working Group), using older versions of Tempest
where this strict checking is not enforced is no longer a viable solution
for downstream deployers. The TC has passed a resolution to advise
DefCore to use Tempest as the single source of capability testing[7],
but this naturally introduces tension between the competing goals of
maintaining upstream functional testing and also tracking lagging
indicators.

My proposal for addressing this problem approaches it at two levels:

  • For the short term, I will submit a blueprint and patch to tempest that
    allows configuration of a grey-list of Nova APIs where strict response
    checking on additional properties will be disabled. So, for example,
    if the 'create servers' API call returned extra properties on that call,
    the strict checking on this line[8] would be disabled at runtime.
    Use of this code path will emit a deprecation warning, and the
    code will be scheduled for removal in 2017 directly after the release
    of the 2017.01 guideline. Vendors would be required to submit the
    grey-list of APIs with additional response data that would be
    published to their marketplace entry.

  • Longer term, vendors will be expected to work with upstream to update
    the API for returning additional data that is compatible with
    API micro-versioning as defined by the Nova team, and the waiver would
    no longer be allowed after the release of the 2017.01 guideline.

For the next half-year, I feel that this approach strengthens interoperability
by accurately capturing the current state of OpenStack deployments and
client tools. Before this change, additional properties on responses
weren't explicitly disallowed, and vendors and deployers took advantage
of this in production. While this is behavior that the Nova and QA teams
want to stop, it will take a bit more time to reach downstream. Also, as
of right now, as far as I know the only client that does strict response
checking for Nova responses is the Tempest client. Currently, additional
properties in responses are ignored and do not break existing client
functionality. There is currently little to no harm done to downstream
users by temporarily allowing additional data to be returned in responses.

Thanks for putting this proposal together, Chris. The configuration
option you describe makes sense as a temporary solution to the
issue, and the timeline you propose (combined with the past year
since the change went in) should be plenty of time to handle upgrades.

Doug

Thanks,

Chris Hoge
Interop Engineer
OpenStack Foundation



responded Jun 14, 2016 by Doug_Hellmann (87,520 points)   3 4 9

On Tue, Jun 14, 2016 at 10:57:05AM -0700, Chris Hoge wrote:
Last year, in response to Nova micro-versioning and extension updates[1],
the QA team added strict API schema checking to Tempest to ensure that
no additional properties were added to Nova API responses[2][3]. In the
last year, at least three vendors participating in the OpenStack Powered
Trademark program have been impacted by this change, two of which
reported this to the DefCore Working Group mailing list earlier this year[4].

The DefCore Working Group determines guidelines for the OpenStack Powered
program, which includes capabilities with associated functional tests
from Tempest that must be passed, and designated sections with associated
upstream code [5][6]. In determining these guidelines, the working group
attempts to balance the future direction of development with lagging
indicators of deployments and user adoption.

After a tremendous amount of consideration, I believe that the DefCore
Working Group needs to implement a temporary waiver for the strict API
checking requirements that were introduced last year, to give downstream
deployers more time to catch up with the strict micro-versioning
requirements determined by the Nova/Compute team and enforced by the
Tempest/QA team.

I'm very much opposed to this being done. If we're actually concerned with
interoperability, and with verifying that things behave in the same manner
across multiple clouds, then doing this would be a big step backwards. The
fundamental disconnect here is that the vendors who have implemented
out-of-band extensions, or were taking advantage of previously available
places to inject extra attributes, believe that doing so means they're
interoperable, which is quite far from reality. The API is not a place for
vendor differentiation.

As a user of several clouds myself, I can say that having random gorp in a
response makes it much more difficult to use my code against multiple clouds.
I have to determine which returned properties are specific to that vendor's
cloud, and if I actually need to depend on them for anything, it makes
whatever code I'm writing incompatible with any other cloud (unless I
special-case that block for each cloud). Sean Dague wrote a good post covering
a lot of this a year ago, when microversions were starting to pick up steam:

https://dague.net/2015/06/05/the-nova-api-in-kilo-and-beyond-2

I'd recommend giving it a read; he explains the user-first perspective more
clearly there.
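The special-casing described above ends up looking something like this; the cloud names and extension keys below are made up for illustration:

```python
# Illustrative only: the per-cloud special-casing a client is forced
# into once vendors add their own response properties.
def server_rack(server, cloud):
    """Fetch a rack identifier that only exists as vendor-specific gorp."""
    if cloud == "vendor-a":
        return server.get("vendorA:rack")    # one vendor's spelling
    if cloud == "vendor-b":
        return server.get("vb-ext:rack_id")  # another vendor's spelling
    return None  # no portable way to ask for this

print(server_rack({"id": "s1", "vendorA:rack": "r7"}, "vendor-a"))  # r7
```

Every new cloud means another branch, which is exactly the portability cost being argued here.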

I believe Tempest in this case is doing the right thing from an
interoperability perspective by ensuring that the API is actually the API,
not an API with extra bits a vendor decided to add. I don't think a cloud or
product that does this to the API should be considered an interoperable
OpenStack cloud, and failing the tests is the correct behavior.

-Matt Treinish

My reasoning behind this is that while the change that enabled strict
checking was discussed publicly in the developer community and took
some time to be implemented, it still landed quickly and broke several
existing deployments overnight. As Tempest has moved forward with
bug and UX fixes (some in part to support the interoperability testing
efforts of the DefCore Working Group), using older versions of Tempest
where this strict checking is not enforced is no longer a viable solution
for downstream deployers. The TC has passed a resolution to advise
DefCore to use Tempest as the single source of capability testing[7],
but this naturally introduces tension between the competing goals of
maintaining upstream functional testing and also tracking lagging
indicators.

My proposal for addressing this problem approaches it at two levels:

  • For the short term, I will submit a blueprint and patch to tempest that
    allows configuration of a grey-list of Nova APIs where strict response
    checking on additional properties will be disabled. So, for example,
    if the 'create servers' API call returned extra properties on that call,
    the strict checking on this line[8] would be disabled at runtime.
    Use of this code path will emit a deprecation warning, and the
    code will be scheduled for removal in 2017 directly after the release
    of the 2017.01 guideline. Vendors would be required to submit the
    grey-list of APIs with additional response data that would be
    published to their marketplace entry.

  • Longer term, vendors will be expected to work with upstream to update
    the API for returning additional data that is compatible with
    API micro-versioning as defined by the Nova team, and the waiver would
    no longer be allowed after the release of the 2017.01 guideline.

For the next half-year, I feel that this approach strengthens interoperability
by accurately capturing the current state of OpenStack deployments and
client tools. Before this change, additional properties on responses
weren't explicitly disallowed, and vendors and deployers took advantage
of this in production. While this is behavior that the Nova and QA teams
want to stop, it will take a bit more time to reach downstream. Also, as
of right now, as far as I know the only client that does strict response
checking for Nova responses is the Tempest client. Currently, additional
properties in responses are ignored and do not break existing client
functionality. There is currently little to no harm done to downstream
users by temporarily allowing additional data to be returned in responses.

Thanks,

Chris Hoge
Interop Engineer
OpenStack Foundation




responded Jun 14, 2016 by Matthew_Treinish (11,200 points)   2 5 5

Excerpts from Matthew Treinish's message of 2016-06-14 14:21:27 -0400:

On Tue, Jun 14, 2016 at 10:57:05AM -0700, Chris Hoge wrote:

Last year, in response to Nova micro-versioning and extension updates[1],
the QA team added strict API schema checking to Tempest to ensure that
no additional properties were added to Nova API responses[2][3]. In the
last year, at least three vendors participating in the OpenStack Powered
Trademark program have been impacted by this change, two of which
reported this to the DefCore Working Group mailing list earlier this year[4].

The DefCore Working Group determines guidelines for the OpenStack Powered
program, which includes capabilities with associated functional tests
from Tempest that must be passed, and designated sections with associated
upstream code [5][6]. In determining these guidelines, the working group
attempts to balance the future direction of development with lagging
indicators of deployments and user adoption.

After a tremendous amount of consideration, I believe that the DefCore
Working Group needs to implement a temporary waiver for the strict API
checking requirements that were introduced last year, to give downstream
deployers more time to catch up with the strict micro-versioning
requirements determined by the Nova/Compute team and enforced by the
Tempest/QA team.

I'm very much opposed to this being done. If we're actually concerned with
interoperability, and with verifying that things behave in the same manner
across multiple clouds, then doing this would be a big step backwards. The
fundamental disconnect here is that the vendors who have implemented
out-of-band extensions, or were taking advantage of previously available
places to inject extra attributes, believe that doing so means they're
interoperable, which is quite far from reality. The API is not a place for
vendor differentiation.

This is a temporary measure to address the fact that a large number
of existing tests changed their behavior, rather than new tests
being added to enforce the new requirement. The result is that
deployments which previously passed these tests may no longer pass,
and in fact we have several cases where that's true with deployers
who are trying to maintain their own standard of backwards
compatibility for their end users.

We have basically three options.

  1. Tell deployers who are trying to do the right thing for their immediate
    users that they can't use the trademark.

  2. Flag the related tests or remove them from the DefCore enforcement
    suite entirely.

  3. Be flexible about giving consumers of Tempest time to meet the
    new requirement by providing a way to disable the checks.

Option 1 goes against our own backwards compatibility policies.

Option 2 gives us no winners and actually reduces the interoperability
guarantees we already have in place.

Option 3 applies our usual community standard of slowly rolling
forward while maintaining compatibility as broadly as possible.

No one is suggesting that a permanent, or even open-ended, exception
be granted.

Doug

As a user of several clouds myself, I can say that having random gorp in a
response makes it much more difficult to use my code against multiple clouds.
I have to determine which returned properties are specific to that vendor's
cloud, and if I actually need to depend on them for anything, it makes
whatever code I'm writing incompatible with any other cloud (unless I
special-case that block for each cloud). Sean Dague wrote a good post covering
a lot of this a year ago, when microversions were starting to pick up steam:

https://dague.net/2015/06/05/the-nova-api-in-kilo-and-beyond-2

I'd recommend giving it a read; he explains the user-first perspective more
clearly there.

I believe Tempest in this case is doing the right thing from an
interoperability perspective by ensuring that the API is actually the API,
not an API with extra bits a vendor decided to add. I don't think a cloud or
product that does this to the API should be considered an interoperable
OpenStack cloud, and failing the tests is the correct behavior.

-Matt Treinish

My reasoning behind this is that while the change that enabled strict
checking was discussed publicly in the developer community and took
some time to be implemented, it still landed quickly and broke several
existing deployments overnight. As Tempest has moved forward with
bug and UX fixes (some in part to support the interoperability testing
efforts of the DefCore Working Group), using older versions of Tempest
where this strict checking is not enforced is no longer a viable solution
for downstream deployers. The TC has passed a resolution to advise
DefCore to use Tempest as the single source of capability testing[7],
but this naturally introduces tension between the competing goals of
maintaining upstream functional testing and also tracking lagging
indicators.

My proposal for addressing this problem approaches it at two levels:

  • For the short term, I will submit a blueprint and patch to tempest that
    allows configuration of a grey-list of Nova APIs where strict response
    checking on additional properties will be disabled. So, for example,
    if the 'create servers' API call returned extra properties on that call,
    the strict checking on this line[8] would be disabled at runtime.
    Use of this code path will emit a deprecation warning, and the
    code will be scheduled for removal in 2017 directly after the release
    of the 2017.01 guideline. Vendors would be required to submit the
    grey-list of APIs with additional response data that would be
    published to their marketplace entry.

  • Longer term, vendors will be expected to work with upstream to update
    the API for returning additional data that is compatible with
    API micro-versioning as defined by the Nova team, and the waiver would
    no longer be allowed after the release of the 2017.01 guideline.

For the next half-year, I feel that this approach strengthens interoperability
by accurately capturing the current state of OpenStack deployments and
client tools. Before this change, additional properties on responses
weren't explicitly disallowed, and vendors and deployers took advantage
of this in production. While this is behavior that the Nova and QA teams
want to stop, it will take a bit more time to reach downstream. Also, as
of right now, as far as I know the only client that does strict response
checking for Nova responses is the Tempest client. Currently, additional
properties in responses are ignored and do not break existing client
functionality. There is currently little to no harm done to downstream
users by temporarily allowing additional data to be returned in responses.

Thanks,

Chris Hoge
Interop Engineer
OpenStack Foundation



responded Jun 14, 2016 by Doug_Hellmann (87,520 points)   3 4 9

On Tue, Jun 14, 2016 at 02:41:10PM -0400, Doug Hellmann wrote:
Excerpts from Matthew Treinish's message of 2016-06-14 14:21:27 -0400:

On Tue, Jun 14, 2016 at 10:57:05AM -0700, Chris Hoge wrote:

Last year, in response to Nova micro-versioning and extension updates[1],
the QA team added strict API schema checking to Tempest to ensure that
no additional properties were added to Nova API responses[2][3]. In the
last year, at least three vendors participating in the OpenStack Powered
Trademark program have been impacted by this change, two of which
reported this to the DefCore Working Group mailing list earlier this year[4].

The DefCore Working Group determines guidelines for the OpenStack Powered
program, which includes capabilities with associated functional tests
from Tempest that must be passed, and designated sections with associated
upstream code [5][6]. In determining these guidelines, the working group
attempts to balance the future direction of development with lagging
indicators of deployments and user adoption.

After a tremendous amount of consideration, I believe that the DefCore
Working Group needs to implement a temporary waiver for the strict API
checking requirements that were introduced last year, to give downstream
deployers more time to catch up with the strict micro-versioning
requirements determined by the Nova/Compute team and enforced by the
Tempest/QA team.

I'm very much opposed to this being done. If we're actually concerned with
interoperability, and with verifying that things behave in the same manner
across multiple clouds, then doing this would be a big step backwards. The
fundamental disconnect here is that the vendors who have implemented
out-of-band extensions, or were taking advantage of previously available
places to inject extra attributes, believe that doing so means they're
interoperable, which is quite far from reality. The API is not a place for
vendor differentiation.

This is a temporary measure to address the fact that a large number
of existing tests changed their behavior, rather than new tests
being added to enforce the new requirement. The result is that
deployments which previously passed these tests may no longer pass,
and in fact we have several cases where that's true with deployers
who are trying to maintain their own standard of backwards
compatibility for their end users.

That's not what happened, though. The API hasn't changed, and the tests
haven't really changed either. We made our enforcement of Nova's APIs a bit
stricter to ensure nothing unexpected appeared. For the most part, these
tests work on any version of OpenStack. (We only test in the gate on
supported stable releases, but I don't expect things to have drastically
shifted on older releases.) It also doesn't matter which version of the API
you run, v2.0 or v2.1. Literally the only case where they ever fail is when
you run something extra, not from the community, either as an extension
(which themselves are going away [1]) or another service that wraps Nova or
imitates Nova. I'm personally not comfortable saying those extras are ever
part of the OpenStack APIs.

We have basically three options.

  1. Tell deployers who are trying to do the right thing for their immediate
    users that they can't use the trademark.

  2. Flag the related tests or remove them from the DefCore enforcement
    suite entirely.

  3. Be flexible about giving consumers of Tempest time to meet the
    new requirement by providing a way to disable the checks.

Option 1 goes against our own backwards compatibility policies.

I don't think backwards compatibility policies really apply to what we define
as the set of tests that, as a community, we say a vendor has to pass to call
their product OpenStack. From my perspective, as a community we either take a
hard stance on this and say that to be considered an interoperable cloud (and
to get the trademark) you have to actually have an interoperable product, or
we don't. We slowly ratchet up the requirements every 6 months; there isn't
any implied backwards compatibility in doing that. You passed under the older
guidelines, but not under the newer, stricter ones.

Also, even if I did think it applied, we're not talking about a change that
would break it. The change was introduced a year and a half ago during Kilo
and landed a year ago during Liberty:

https://review.openstack.org/#/c/156130/

That's way longer than our normal deprecation period of 3 months and a release
boundary.

Option 2 gives us no winners and actually reduces the interoperability
guarantees we already have in place.

Option 3 applies our usual community standard of slowly rolling
forward while maintaining compatibility as broadly as possible.

Except in this case there isn't actually any compatibility being maintained.
We're saying that we can't make the requirements for interoperability testing
stricter until all the vendors who were passing in the past are able to pass
the stricter version.

No one is suggesting that a permanent, or even open-ended, exception
be granted.

Sure, I agree a permanent or open-ended exception would be even worse. But I
still think as a community we need to draw a hard line in the sand here. Just
because this measure is temporary doesn't make it any more palatable.

By doing this, even as a temporary measure, we're saying it's OK to call
something an OpenStack API when you add random gorp to the responses, which
as a community we've very clearly said is the exact opposite of the case, and
which the testing reflects. I still contend that just because some vendors
were running old versions of Tempest and old versions of OpenStack, where
their incompatible API changes weren't caught, doesn't mean they should be
given a pass now.

-Matt Treinish

[1] http://lists.openstack.org/pipermail/openstack-dev/2016-June/097285.html

Doug

As a user of several clouds myself, I can say that having random gorp in a
response makes it much more difficult to use my code against multiple clouds.
I have to determine which returned properties are specific to that vendor's
cloud, and if I actually need to depend on them for anything, it makes
whatever code I'm writing incompatible with any other cloud (unless I
special-case that block for each cloud). Sean Dague wrote a good post covering
a lot of this a year ago, when microversions were starting to pick up steam:

https://dague.net/2015/06/05/the-nova-api-in-kilo-and-beyond-2

I'd recommend giving it a read; he explains the user-first perspective more
clearly there.

I believe Tempest in this case is doing the right thing from an
interoperability perspective by ensuring that the API is actually the API,
not an API with extra bits a vendor decided to add. I don't think a cloud or
product that does this to the API should be considered an interoperable
OpenStack cloud, and failing the tests is the correct behavior.

-Matt Treinish

My reasoning behind this is that while the change that enabled strict
checking was discussed publicly in the developer community and took
some time to be implemented, it still landed quickly and broke several
existing deployments overnight. As Tempest has moved forward with
bug and UX fixes (some in part to support the interoperability testing
efforts of the DefCore Working Group), using older versions of Tempest
where this strict checking is not enforced is no longer a viable solution
for downstream deployers. The TC has passed a resolution to advise
DefCore to use Tempest as the single source of capability testing[7],
but this naturally introduces tension between the competing goals of
maintaining upstream functional testing and also tracking lagging
indicators.

My proposal for addressing this problem approaches it at two levels:

  • For the short term, I will submit a blueprint and patch to tempest that
    allows configuration of a grey-list of Nova APIs where strict response
    checking on additional properties will be disabled. So, for example,
    if the 'create servers' API call returned extra properties on that call,
    the strict checking on this line[8] would be disabled at runtime.
    Use of this code path will emit a deprecation warning, and the
    code will be scheduled for removal in 2017 directly after the release
    of the 2017.01 guideline. Vendors would be required to submit the
    grey-list of APIs with additional response data that would be
    published to their marketplace entry.

  • Longer term, vendors will be expected to work with upstream to update
    the API for returning additional data that is compatible with
    API micro-versioning as defined by the Nova team, and the waiver would
    no longer be allowed after the release of the 2017.01 guideline.
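The short-term mechanism above could be gated roughly as follows (the option name, grey-list format, and function are my assumptions for illustration, not the actual blueprint):

```python
import warnings

# Hypothetical grey-list of API names whose strict response checking is
# relaxed at runtime; in the real proposal this would come from tempest
# configuration rather than a module-level constant.
RESPONSE_CHECK_GREYLIST = {"create_server"}

def strict_checking_enabled(api_name, greylist=RESPONSE_CHECK_GREYLIST):
    """Return False (after warning) if strict checking is waived for this API."""
    if api_name in greylist:
        warnings.warn(
            "Strict response checking is disabled for '%s'; this waiver is "
            "scheduled for removal after the 2017.01 guideline." % api_name,
            DeprecationWarning)
        return False
    return True

print(strict_checking_enabled("create_server"))  # False, with a DeprecationWarning
print(strict_checking_enabled("list_servers"))   # True
```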

For the next half-year, I feel that this approach strengthens interoperability
by accurately capturing the current state of OpenStack deployments and
client tools. Before this change, additional properties on responses
weren't explicitly disallowed, and vendors and deployers took advantage
of this in production. While this is behavior that the Nova and QA teams
want to stop, it will take a bit more time to reach downstream. Also, as
of right now, as far as I know the only client that does strict response
checking for Nova responses is the Tempest client. Currently, additional
properties in responses are ignored and do not break existing client
functionality. There is currently little to no harm done to downstream
users by temporarily allowing additional data to be returned in responses.
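The claim that extra properties don't break typical clients can be illustrated with a sketch of how most client code consumes a response, pulling out only the fields it needs (field names are illustrative):

```python
import json

# Sketch of why most existing clients tolerate extra properties: they
# extract only the fields they care about and ignore the rest.
def parse_server(raw):
    body = json.loads(raw)
    return {"id": body["id"], "status": body.get("status")}

# The same client code yields the same result with or without vendor extras:
upstream = parse_server('{"id": "abc", "status": "ACTIVE"}')
vendor = parse_server('{"id": "abc", "status": "ACTIVE", "x:zone": "z1"}')
assert upstream == vendor
```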

Thanks,

Chris Hoge
Interop Engineer
OpenStack Foundation

[1] https://specs.openstack.org/openstack/nova-specs/specs/kilo/implemented/api-microversions.html
[2] http://lists.openstack.org/pipermail/openstack-dev/2015-February/057613.html
[3] https://review.openstack.org/#/c/156130
[4] http://lists.openstack.org/pipermail/defcore-committee/2016-January/000986.html
[5] http://git.openstack.org/cgit/openstack/defcore/tree/2015.07.json
[6] http://git.openstack.org/cgit/openstack/defcore/tree/2016.01.json
[7] http://git.openstack.org/cgit/openstack/governance/tree/resolutions/20160504-defcore-test-location.rst
[8] http://git.openstack.org/cgit/openstack/tempest-lib/tree/tempest_lib/api_schema/response/compute/v2_1/servers.py#n39


OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

responded Jun 14, 2016 by Matthew_Treinish

On Jun 14, 2016, at 11:21 AM, Matthew Treinish mtreinish@kortar.org wrote:

On Tue, Jun 14, 2016 at 10:57:05AM -0700, Chris Hoge wrote:

Last year, in response to Nova micro-versioning and extension updates[1],
the QA team added strict API schema checking to Tempest to ensure that
no additional properties were added to Nova API responses[2][3]. In the
last year, at least three vendors participating in the OpenStack Powered
Trademark program have been impacted by this change, two of which
reported this to the DefCore Working Group mailing list earlier this year[4].

The DefCore Working Group determines guidelines for the OpenStack Powered
program, which includes capabilities with associated functional tests
from Tempest that must be passed, and designated sections with associated
upstream code [5][6]. In determining these guidelines, the working group
attempts to balance the future direction of development with lagging
indicators of deployments and user adoption.

After a tremendous amount of consideration, I believe that the DefCore
Working Group needs to implement a temporary waiver for the strict API
checking requirements that were introduced last year, to give downstream
deployers more time to catch up with the strict micro-versioning
requirements determined by the Nova/Compute team and enforced by the
Tempest/QA team.

I'm very much opposed to this being done. If we're actually concerned with
interoperability and with verifying that things behave in the same manner between
multiple clouds, then doing this would be a big step backwards. The fundamental disconnect
here is that the vendors who have implemented out of band extensions or were
taking advantage of previously available places to inject extra attributes
believe that doing so means they're interoperable, which is quite far from
reality. The API is not a place for vendor differentiation.

Yes, it’s bad practice, but it’s also a reality, and I honestly believe that
vendors have received the message and are working on changing.

As a user of several clouds myself I can say that having random gorp in a
response makes it much more difficult to use my code against multiple clouds. I
have to determine which properties being returned are specific to that vendor's
cloud and if I actually need to depend on them for anything it makes whatever
code I'm writing incompatible for using against any other cloud. (unless I
special case that block for each cloud) Sean Dague wrote a good post where a lot
of this was covered a year ago when microversions was starting to pick up steam:

https://dague.net/2015/06/05/the-nova-api-in-kilo-and-beyond-2

I'd recommend giving it a read, he explains the user first perspective more
clearly there.

I believe Tempest in this case is doing the right thing from an interoperability
perspective and ensuring that the API is actually the API. Not an API with extra
bits a vendor decided to add.

A few points on this, though. Right now, Nova is the only API where this is
enforced, and only by the Tempest clients. While this may change in the
future, I don’t think it accurately represents the reality of what’s
happening in the ecosystem.

As mentioned before, we also need to balance the lagging nature of
DefCore as an interoperability guideline with the needs of testing
upstream changes. I’m not asking for a permanent change that
undermines the goals of Tempest for QA, rather a temporary
upstream modification that recognizes the challenges faced by
vendors in the market right now, and gives them room to continue
to align themselves with upstream. Without this, the two other
alternatives are to:

  • Have some vendors leave the Powered program unnecessarily,
    weakening it.
  • Force DefCore to adopt non-upstream testing, either as a fork
    or an independent test suite.

Neither seems ideal to me.

One of my goals is to transparently strengthen the ties between
upstream and downstream development. There is a deadline
built into this proposal, and my intention is to enforce it.

I don't think a cloud or product that does this
to the api should be considered an interoperable OpenStack cloud and failing the
tests is the correct behavior.

I think it’s more nuanced than this, especially right now.
Only additions to responses will be considered, not changes.
These additions will be clearly labelled as variations,
signaling the differences to users. Existing clients in use
will not break. Correct behavior will eventually be enforced,
and this would be clearly signaled by both the test tool and
through the administrative program.


responded Jun 14, 2016 by chris_at_openstack.o

On Tue, 14 Jun 2016, Matthew Treinish wrote:

By doing this, even as a temporary measure, we're saying it's ok to call things
an OpenStack API when you add random gorp to the responses. Which is something we've
very clearly said as a community is the exact opposite of the case, which the
testing reflects. I still contend that just because some vendors were running old
versions of tempest and old versions of openstack where their incompatible API
changes weren't caught doesn't mean they should be given a pass now.

Yes. Thanks.

--
Chris Dent (╯°□°)╯︵┻━┻ http://anticdent.org/
freenode: cdent tw: @anticdent

responded Jun 14, 2016 by cdent_plus_os_at_ant

On Tue, Jun 14, 2016 at 12:19:54PM -0700, Chris Hoge wrote:

On Jun 14, 2016, at 11:21 AM, Matthew Treinish mtreinish@kortar.org wrote:

On Tue, Jun 14, 2016 at 10:57:05AM -0700, Chris Hoge wrote:

Last year, in response to Nova micro-versioning and extension updates[1],
the QA team added strict API schema checking to Tempest to ensure that
no additional properties were added to Nova API responses[2][3]. In the
last year, at least three vendors participating in the OpenStack Powered
Trademark program have been impacted by this change, two of which
reported this to the DefCore Working Group mailing list earlier this year[4].

The DefCore Working Group determines guidelines for the OpenStack Powered
program, which includes capabilities with associated functional tests
from Tempest that must be passed, and designated sections with associated
upstream code [5][6]. In determining these guidelines, the working group
attempts to balance the future direction of development with lagging
indicators of deployments and user adoption.

After a tremendous amount of consideration, I believe that the DefCore
Working Group needs to implement a temporary waiver for the strict API
checking requirements that were introduced last year, to give downstream
deployers more time to catch up with the strict micro-versioning
requirements determined by the Nova/Compute team and enforced by the
Tempest/QA team.

I'm very much opposed to this being done. If we're actually concerned with
interoperability and with verifying that things behave in the same manner between
multiple clouds, then doing this would be a big step backwards. The fundamental disconnect
here is that the vendors who have implemented out of band extensions or were
taking advantage of previously available places to inject extra attributes
believe that doing so means they're interoperable, which is quite far from
reality. The API is not a place for vendor differentiation.

Yes, it’s bad practice, but it’s also a reality, and I honestly believe that
vendors have received the message and are working on changing.

They might be working on this, but this change was coming for quite some
time; it shouldn't be a surprise to anyone at this point. I mean seriously, it's
been in tempest for 1 year, and it took 6 months to land. Also, let's say we set
a hard deadline on this new option to disable the enforcement and then enforce it.
If we later implement a similar change on keystone, are we going to have to do the
same thing again when vendors who have custom things running there fail?

As a user of several clouds myself I can say that having random gorp in a
response makes it much more difficult to use my code against multiple clouds. I
have to determine which properties being returned are specific to that vendor's
cloud and if I actually need to depend on them for anything it makes whatever
code I'm writing incompatible for using against any other cloud. (unless I
special case that block for each cloud) Sean Dague wrote a good post where a lot
of this was covered a year ago when microversions was starting to pick up steam:

https://dague.net/2015/06/05/the-nova-api-in-kilo-and-beyond-2

I'd recommend giving it a read, he explains the user first perspective more
clearly there.

I believe Tempest in this case is doing the right thing from an interoperability
perspective and ensuring that the API is actually the API. Not an API with extra
bits a vendor decided to add.

A few points on this, though. Right now, Nova is the only API where this is
enforced, and only by the Tempest clients. While this may change in the
future, I don’t think it accurately represents the reality of what’s
happening in the ecosystem.

This in itself doesn't make a difference. There is a disparity in the level of
testing across all the projects. Nova happens to be further along with regard
to API stability and testing compared to a lot of projects, so it's not
really a surprise that this is coming up there first. It's only a
matter of time for other projects to follow nova's example and implement similar
enforcement.

As mentioned before, we also need to balance the lagging nature of
DefCore as an interoperability guideline with the needs of testing
upstream changes. I’m not asking for a permanent change that
undermines the goals of Tempest for QA, rather a temporary
upstream modification that recognizes the challenges faced by
vendors in the market right now, and gives them room to continue
to align themselves with upstream. Without this, the two other
alternatives are to:

  • Have some vendors leave the Powered program unnecessarily,
    weakening it.
  • Force DefCore to adopt non-upstream testing, either as a fork
    or an independent test suite.

Neither seems ideal to me.

It might not be ideal for a vendor to leave the program, but I think it's a
necessary consequence of evolving the guidelines to become stricter over time.
What we define as the minimum requirements for interoperability, and by extension
use of the trademark, will continue to evolve. Every time we add additional tests,
more stringent checking, or change something, inevitably someone is going to fail
no matter how slowly we roll it out.

There's a limit to how accommodating we should be here. This change has been in
the wild for a year, and also took 6 months to land. The issue in question
will literally only ever cause a failure if you add something extra, not from
OpenStack, to the API. All versions of the nova API (maybe excepting really old
releases like <= folsom) should get past this check without any issue. I still
fail to see how a vendor failing the guidelines here is a bad thing. Isn't this
what we're supposed to be doing?

Also, defcore already has a mechanism for slowly rolling out changes like this. The
guidelines contain a tempest sha1 (for better or worse):

https://git.openstack.org/cgit/openstack/defcore/tree/2016.01.json#n113

If the defcore committee still feels there needs to be a more gradual roll out
than 1 yr (which I strongly disagree with), then the minimum sha1 should be set more
conservatively to a point before the change in question. Yes, that means old bugs
will still be present in tempest, but I don't think we can have it both ways here.
Either we say you have to pass stricter requirements or we don't. We added
idempotent ids to tempest exactly for this reason, so you can keep track of tests
as things change.
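The idempotent-id mechanism mentioned here works roughly like this (a standalone sketch of the idea; Tempest's actual decorator lives in its own modules, and the UUID below is made up):

```python
import uuid

# Sketch of Tempest's idempotent-id idea: tag each test with a stable
# UUID so external consumers (such as the DefCore guidelines) can keep
# referencing a test even if it is later renamed or moved.
def idempotent_id(test_id):
    uuid.UUID(test_id)  # raises ValueError for a malformed id
    def decorator(fn):
        fn.test_id = test_id
        return fn
    return decorator

@idempotent_id('9a438d88-10c6-4bcd-8b5b-5b6e25e1346f')
def test_create_server():
    return "ok"

print(test_create_server.test_id)
```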

One of my goals is to transparently strengthen the ties between
upstream and downstream development. There is a deadline
built into this proposal, and my intention is to enforce it.

My argument is that the deadline has already passed. We've been enforcing this
in tempest for 1 year already. It's only coming up now because some vendors didn't
pay attention to what was happening in the community or to the changes that were
incoming in the testing guidelines, and now they are stuck. From my perspective this
will always happen no matter how gradually we make changes and how much we
advertise them.

I don't think a cloud or product that does this
to the api should be considered an interoperable OpenStack cloud and failing the
tests is the correct behavior.

I think it’s more nuanced than this, especially right now.
Only additions to responses will be considered, not changes.
These additions will be clearly labelled as variations,
signaling the differences to users. Existing clients in use
will not break. Correct behavior will eventually be enforced,
and this would be clearly signaled by both the test tool and
through the administrative program.

You're making large assumptions about how the APIs are actually consumed here.
You can't assume that only one of the clients you know about is being used to
talk to APIs. For example, I have a bunch of code I wrote a while ago that uses
the tempest clients with [1] to interact with clouds. That code would fail the
second I talked to a cloud with these extra bits enabled. Granted that's a bit
of a contrived example, but if I'm dealing with the api at a lower level (using
my hypothetical hand built fortran client) it's perfectly reasonable to assume
that I start on vendor A's "openstack" cloud see the extra params in the
response and assume they're everywhere and make my code depend on that. Then
when I use a cloud deployed on the 3 spare machines in my basement from the
latest release tarballs everything starts failing without any indication where
that extra parameter went. That's the kind of experience we're trying to avoid.

Also, there is no guarantee that the extra fields are clearly marked. If we
disable this checking, literally anything can be added to the responses from nova
and still pass, because we're no longer explicitly checking for it. For example,
I could add a top-level field to the server response, "useful: True", for things
that use my proprietary hypervisor and "useful: False" for libvirt guests. There
is nothing stopping me from writing an extension that does that, adding it to
the API, and then passing all the tests. Nothing would catch this if we disable
the strict validation.

My fundamental concern here is that we're optimizing for the wrong set of
priorities. As a community, do we want to prioritize enforcing interoperability
with guidelines we define and develop in the open, so that the things we
say are openstack behave for a user in the manner we've developed in the
community? Or do we want to optimize for ensuring that vendors who are
continually slow to adapt never fail guidelines they've passed in the past?
I'm all for doing a slow roll out of changes, to give people a
chance to adapt as new constraints are added; doing otherwise would be reckless.
But I feel in this case the time for that has passed. I also don't think we should
add workarounds to avoid adding constraints as things move forward; we should set
a reasonable minimum version of tempest to use.

-Matt Treinish

[1] https://github.com/mtreinish/mesocyclone



responded Jun 14, 2016 by Matthew_Treinish

On 06/14/2016 01:57 PM, Chris Hoge wrote:

My proposal for addressing this problem approaches it at two levels:

  • For the short term, I will submit a blueprint and patch to tempest that
    allows configuration of a grey-list of Nova APIs where strict response
    checking on additional properties will be disabled. So, for example,
    if the 'create servers' API call returned extra properties on that call,
    the strict checking on this line[8] would be disabled at runtime.
    Use of this code path will emit a deprecation warning, and the
    code will be scheduled for removal in 2017 directly after the release
    of the 2017.01 guideline. Vendors would be required to submit the
    grey-list of APIs with additional response data that would be
    published to their marketplace entry.

To understand more. Will there be a visible asterisk with their
registration that says they require a grey-list?

  • Longer term, vendors will be expected to work with upstream to update
    the API for returning additional data that is compatible with
    API micro-versioning as defined by the Nova team, and the waiver would
    no longer be allowed after the release of the 2017.01 guideline.

For the next half-year, I feel that this approach strengthens interoperability
by accurately capturing the current state of OpenStack deployments and
client tools. Before this change, additional properties on responses
weren't explicitly disallowed, and vendors and deployers took advantage
of this in production. While this is behavior that the Nova and QA teams
want to stop, it will take a bit more time to reach downstream. Also, as
of right now, as far as I know the only client that does strict response
checking for Nova responses is the Tempest client. Currently, additional
properties in responses are ignored and do not break existing client
functionality. There is currently little to no harm done to downstream
users by temporarily allowing additional data to be returned in responses.

In general I'm ok with this, as long as three things are true:

1) registrations that need the grey list are visually indicated quite
clearly and publicly that they needed it to pass.

2) 2017.01 is a firm cutoff.

3) We have evidence that folks that are having challenges with the
strict enforcement have made getting compliant a top priority.

3 is the one where I don't have any data either way. But I didn't see
any specs submissions (which are required for API changes in Nova) for
Newton that would indicate anyone is working on this. For 2017.01 to be a
hard stop, that means folks are either deleting this from their
interface, or proposing it in Ocata. Which is a really short runway if this
stuff isn't super straightforward and already agreed upstream.

So I'm provisionally ok with this, if folks in the know feel like 3 is
covered.

-Sean

P.S. The Tempest changes pretty much just anticipate the Nova changes
which are deleting all these facilities in Newton -
https://specs.openstack.org/openstack/nova-specs/specs/newton/approved/api-no-more-extensions.html
- so in some ways we aren't doing folks a ton of favors letting them
delay too far because they are about to hit a brick wall on the code side.

-Sean

--
Sean Dague
http://dague.net


responded Jun 14, 2016 by Sean_Dague

Excerpts from Matthew Treinish's message of 2016-06-14 15:12:45 -0400:

On Tue, Jun 14, 2016 at 02:41:10PM -0400, Doug Hellmann wrote:

Excerpts from Matthew Treinish's message of 2016-06-14 14:21:27 -0400:

On Tue, Jun 14, 2016 at 10:57:05AM -0700, Chris Hoge wrote:

Last year, in response to Nova micro-versioning and extension updates[1],
the QA team added strict API schema checking to Tempest to ensure that
no additional properties were added to Nova API responses[2][3]. In the
last year, at least three vendors participating in the OpenStack Powered
Trademark program have been impacted by this change, two of which
reported this to the DefCore Working Group mailing list earlier this year[4].

The DefCore Working Group determines guidelines for the OpenStack Powered
program, which includes capabilities with associated functional tests
from Tempest that must be passed, and designated sections with associated
upstream code [5][6]. In determining these guidelines, the working group
attempts to balance the future direction of development with lagging
indicators of deployments and user adoption.

After a tremendous amount of consideration, I believe that the DefCore
Working Group needs to implement a temporary waiver for the strict API
checking requirements that were introduced last year, to give downstream
deployers more time to catch up with the strict micro-versioning
requirements determined by the Nova/Compute team and enforced by the
Tempest/QA team.

I'm very much opposed to this being done. If we're actually concerned with
interoperability and with verifying that things behave in the same manner between
multiple clouds, then doing this would be a big step backwards. The fundamental disconnect
here is that the vendors who have implemented out of band extensions or were
taking advantage of previously available places to inject extra attributes
believe that doing so means they're interoperable, which is quite far from
reality. The API is not a place for vendor differentiation.

This is a temporary measure to address the fact that a large number
of existing tests changed their behavior, rather than having new
tests added to enforce this new requirement. The result is deployments
that previously passed these tests may no longer pass, and in fact
we have several cases where that's true with deployers who are
trying to maintain their own standard of backwards-compatibility
for their end users.

That's not what happened though. The API hasn't changed and the tests haven't
really changed either. We made our enforcement on Nova's APIs a bit stricter to
ensure nothing unexpected appeared. For the most part these tests work on any version
of OpenStack. (we only test it in the gate on supported stable releases, but I
don't expect things to have drastically shifted on older releases) It also
doesn't matter which version of the API you run, v2.0 or v2.1. Literally, the
only case it ever fails is when you run something extra, not from the community,
either as an extension (which themselves are going away [1]) or another service
that wraps nova or imitates nova. I'm personally not comfortable saying those
extras are ever part of the OpenStack APIs.

We have basically three options.

  1. Tell deployers who are trying to do the right thing for their immediate
    users that they can't use the trademark.

  2. Flag the related tests or remove them from the DefCore enforcement
    suite entirely.

  3. Be flexible about giving consumers of Tempest time to meet the
    new requirement by providing a way to disable the checks.

Option 1 goes against our own backwards compatibility policies.

I don't think backwards compatibility policies really apply to what we define
as the set of tests that, as a community, we are saying a vendor has to pass to
say they're OpenStack. From my perspective, as a community we take a hard
stance on this and say that to be considered an interoperable cloud (and to get the
trademark) you have to actually have an interoperable product. We slowly ratchet
up the requirements every 6 months; there isn't any implied backwards
compatibility in doing that. You passed in the past but not under the newer, stricter
guidelines.

Also, even if I did think it applied, we're not talking about a change which
would fall into breaking that. The change was introduced a year and half ago
during kilo and landed a year ago during liberty:

https://review.openstack.org/#/c/156130/

That's way longer than our normal deprecation period of 3 months and a release
boundary.

Option 2 gives us no winners and actually reduces the interoperability
guarantees we already have in place.

Option 3 applies our usual community standard of slowly rolling
forward while maintaining compatibility as broadly as possible.

Except in this case there isn't actually any compatibility being maintained.
We're saying that we can't make the requirements for interoperability testing
stricter until all the vendors who were passing in the past are able to pass
the stricter version.

No one is suggesting that a permanent, or even open-ended, exception
be granted.

Sure, I agree a permanent or open-ended exception would be even worse. But I
still think as a community we need to draw a hard line in the sand here. Just
because this measure is temporary doesn't make it any more palatable.

By doing this, even as a temporary measure, we're saying it's ok to call things
an OpenStack API when you add random gorp to the responses. Which is something we've
very clearly said as a community is the exact opposite of the case, which the
testing reflects. I still contend that just because some vendors were running old
versions of tempest and old versions of openstack where their incompatible API
changes weren't caught doesn't mean they should be given a pass now.

Nobody is saying random gorp is OK, and I'm not sure "line in the
sand" rhetoric is really constructive. The issue is not with the
nature of the API policies, it's with the implementation of those
policies and how they were rolled out.

DefCore defines its rules using named tests in Tempest. If these
new enforcement policies had been applied by adding new tests to
Tempest, then DefCore could have added them using its processes
over a period of time and we wouldn't have had any issues. That's
not what happened. Instead, the behavior of a bunch of existing
tests changed. As a result, deployments that have not changed fail
tests that they used to pass, without any action being taken on the
deployer's part. We've moved the goal posts on our users in a way
that was not easily discoverable, because it couldn't be tracked
through the (admittedly limited) process we have in place for doing
that tracking.

So, we want a way to get the test results back to their existing
status, which will then let us roll adoption forward smoothly instead
of lurching from "pass" to "fail" to "pass".

We should, separately, address the process issues and the limitations
this situation has exposed. That may mean changing the way DefCore
defines its policies, or tracks things, or uses Tempest. For
example, in the future, we may want to tie versions of Tempest to
versions of the trademark more closely, so that it's possible for
someone running the Mitaka version of OpenStack to continue to use
the Mitaka version of Tempest and not have to upgrade Tempest in
order to retain their trademark (maybe that's how it already works?).
We may also need to consider that test implementation details may
change, and have a review process within DefCore to help expose
those changes to make them clearer to deployers.

Fixing the process issue may also mean changing the way we implement
things in Tempest. In this case, adding a flag helps move ahead
more smoothly. Perhaps we adopt that as a general policy in the
future when we make underlying behavioral changes like this to
existing tests. Perhaps instead we have a policy that we do not
change the behavior of existing tests in such significant ways, at
least if they're tagged as being used by DefCore. I don't know --
those are things we need to discuss.

Doug

-Matt Treinish

[1] http://lists.openstack.org/pipermail/openstack-dev/2016-June/097285.html

Doug

As a user of several clouds myself I can say that having random gorp in a
response makes it much more difficult to use my code against multiple clouds. I
have to determine which properties being returned are specific to that vendor's
cloud and if I actually need to depend on them for anything it makes whatever
code I'm writing incompatible for using against any other cloud. (unless I
special case that block for each cloud) Sean Dague wrote a good post where a lot
of this was covered a year ago when microversions was starting to pick up steam:

https://dague.net/2015/06/05/the-nova-api-in-kilo-and-beyond-2

I'd recommend giving it a read, he explains the user first perspective more
clearly there.

I believe Tempest in this case is doing the right thing from an interoperability
perspective and ensuring that the API is actually the API. Not an API with extra
bits a vendor decided to add. I don't think a cloud or product that does this
to the api should be considered an interoperable OpenStack cloud and failing the
tests is the correct behavior.

-Matt Treinish

My reasoning behind this is that while the change that enabled strict
checking was discussed publicly in the developer community and took
some time to be implemented, it still landed quickly and broke several
existing deployments overnight. As Tempest has moved forward with
bug and UX fixes (some in part to support the interoperability testing
efforts of the DefCore Working Group), using an older version of Tempest
where this strict checking is not enforced is no longer a viable solution
for downstream deployers. The TC has passed a resolution to advise
DefCore to use Tempest as the single source of capability testing[7],
but this naturally introduces tension between the competing goals of
maintaining upstream functional testing and also tracking lagging
indicators.

My proposal for addressing this problem approaches it at two levels:

  • For the short term, I will submit a blueprint and patch to tempest that
    allows configuration of a grey-list of Nova APIs where strict response
    checking on additional properties will be disabled. So, for example,
    if the 'create servers' API call returned extra properties on that call,
    the strict checking on this line[8] would be disabled at runtime.
    Use of this code path will emit a deprecation warning, and the
    code will be scheduled for removal in 2017 directly after the release
    of the 2017.01 guideline. Vendors would be required to submit the
    grey-list of APIs with additional response data that would be
    published to their marketplace entry.

  • Longer term, vendors will be expected to work with upstream to update
    the API for returning additional data that is compatible with
    API micro-versioning as defined by the Nova team, and the waiver would
    no longer be allowed after the release of the 2017.01 guideline.
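A minimal sketch of how the short-term grey-list flag might behave. The option
name, API names, and mechanics here are assumptions for illustration, not the
actual blueprint or tempest.conf option:

```python
import warnings

# In practice this set would be read from a tempest.conf option; the
# name and contents here are hypothetical.
RESPONSE_CHECK_GREYLIST = {"create_server"}

def validate_no_extra_properties(api_name, response, allowed_keys):
    """Fail on undocumented response properties unless the API is grey-listed.

    Grey-listed APIs pass but emit a deprecation warning, since the
    waiver is scheduled for removal after the 2017.01 guideline.
    """
    extra = sorted(set(response) - set(allowed_keys))
    if not extra:
        return []
    if api_name in RESPONSE_CHECK_GREYLIST:
        warnings.warn(
            "%s returned undocumented properties %s; grey-listing is "
            "deprecated and scheduled for removal after the 2017.01 "
            "guideline" % (api_name, extra),
            DeprecationWarning)
        return extra
    raise AssertionError(
        "%s returned additional properties: %s" % (api_name, extra))
```

A non-grey-listed API with extra properties still fails hard; a grey-listed one
passes but leaves a deprecation trail, which is the kind of record that could be
published alongside a vendor's marketplace entry.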

For the next half-year, I feel that this approach strengthens interoperability
by accurately capturing the current state of OpenStack deployments and
client tools. Before this change, additional properties on responses
weren't explicitly disallowed, and vendors and deployers took advantage
of this in production. While this is behavior that the Nova and QA teams
want to stop, it will take a bit more time to reach downstream. Also, as
of right now, as far as I know the only client that does strict response
checking for Nova responses is the Tempest client. Currently, additional
properties in responses are ignored and do not break existing client
functionality. There is currently little to no harm done to downstream
users by temporarily allowing additional data to be returned in responses.

Thanks,

Chris Hoge
Interop Engineer
OpenStack Foundation

[1] https://specs.openstack.org/openstack/nova-specs/specs/kilo/implemented/api-microversions.html
[2] http://lists.openstack.org/pipermail/openstack-dev/2015-February/057613.html
[3] https://review.openstack.org/#/c/156130
[4] http://lists.openstack.org/pipermail/defcore-committee/2016-January/000986.html
[5] http://git.openstack.org/cgit/openstack/defcore/tree/2015.07.json
[6] http://git.openstack.org/cgit/openstack/defcore/tree/2016.01.json
[7] http://git.openstack.org/cgit/openstack/governance/tree/resolutions/20160504-defcore-test-location.rst
[8] http://git.openstack.org/cgit/openstack/tempest-lib/tree/tempest_lib/api_schema/response/compute/v2_1/servers.py#n39


OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
responded Jun 14, 2016 by Doug_Hellmann (87,520 points)
0 votes

Excerpts from Sean Dague's message of 2016-06-14 17:29:06 -0400:

On 06/14/2016 01:57 PM, Chris Hoge wrote:

My proposal for addressing this problem approaches it at two levels:

  • For the short term, I will submit a blueprint and patch to tempest that
    allows configuration of a grey-list of Nova APIs where strict response
    checking on additional properties will be disabled. So, for example,
    if the 'create servers' API call returned extra properties on that call,
    the strict checking on this line[8] would be disabled at runtime.
    Use of this code path will emit a deprecation warning, and the
    code will be scheduled for removal in 2017 directly after the release
    of the 2017.01 guideline. Vendors would be required to submit the
    grey-list of APIs with additional response data that would be
    published to their marketplace entry.

To understand more. Will there be a visible asterisk with their
registration that says they require a grey-list?

  • Longer term, vendors will be expected to work with upstream to update
    the API for returning additional data that is compatible with
    API micro-versioning as defined by the Nova team, and the waiver would
    no longer be allowed after the release of the 2017.01 guideline.

For the next half-year, I feel that this approach strengthens interoperability
by accurately capturing the current state of OpenStack deployments and
client tools. Before this change, additional properties on responses
weren't explicitly disallowed, and vendors and deployers took advantage
of this in production. While this is behavior that the Nova and QA teams
want to stop, it will take a bit more time to reach downstream. Also, as
of right now, as far as I know the only client that does strict response
checking for Nova responses is the Tempest client. Currently, additional
properties in responses are ignored and do not break existing client
functionality. There is currently little to no harm done to downstream
users by temporarily allowing additional data to be returned in responses.

In general I'm ok with this, as long as three things are true:

1) registrations that need the grey list are marked quite clearly and
publicly as having needed it to pass.

I like that. Chris' proposal was that the information would need to be
submitted with the application, and I think publishing it makes sense.
I'd like to see the whole list, either which APIs had to be flagged or
at least which tests, whichever we can do.

2) 2017.01 is a firm cutoff.

3) We have evidence that folks that are having challenges with the
strict enforcement have made getting compliant a top priority.

3 is the one where I don't have any data either way. But I didn't see
any spec submissions (which are required for API changes in Nova) for
Newton that would indicate anyone is working on this. For 2017 to be a
hard stop, that means folks are either deleting this from their
interface or proposing it in Ocata, which is a really short runway if this
stuff isn't super straightforward and already agreed upstream.

So I'm provisionally ok with this, if folks in the know feel like 3 is
covered.

-Sean

P.S. The Tempest changes pretty much just anticipate the Nova changes
which are deleting all these facilities in Newton -
https://specs.openstack.org/openstack/nova-specs/specs/newton/approved/api-no-more-extensions.html
- so in some ways we aren't doing folks a ton of favors letting them
delay too far because they are about to hit a brick wall on the code side.

-Sean


responded Jun 14, 2016 by Doug_Hellmann (87,520 points)
...