
[openstack-dev] [nova] Future of the Nova API

Hi,

There has recently been some speculation around the V3 API and whether
we should go forward with it or instead backport many of the changes
to the V2 API. I believe that the core of the concern is the extra
maintenance and test burden that supporting two APIs means and the
length of time before we are able to deprecate the V2 API and return
to maintaining only one (well two including EC2) API again.

This email is rather long so here's the TL;DR version:

  • We want to make backwards incompatible changes to the API
    and whether we do it in-place with V2 or by releasing V3
    we'll have some form of dual API support burden.

  • Not making backwards incompatible changes means:

    • retaining an inconsistent API
    • not being able to fix numerous input validation issues
    • having to forever proxy for glance/cinder/neutron with all
      the problems that entails.

  • Backporting V3 infrastructure changes to V2 would be a
    considerable amount of programmer/review time
  • The V3 API as-is has:

    • lower maintenance
    • is easier to understand and use (consistent).
    • Much better input validation which is baked-in (json-schema)
      rather than ad-hoc and incomplete.
  • Whilst we have existing users of the API we will also have a lot more
    users in the future. It would be much better to allow them to use
    the API we want to get to as soon as possible, rather than trying
    to evolve the V2 API and forcing them along the transition that they
    could otherwise avoid.

  • We already have feature parity for the V3 API (nova-network being
    the exception due to the very recent unfreezing of it), novaclient
    support, and a reasonable transition path for V2 users.

  • Proposed way forward:

    • Release the V3 API in Juno with nova-network and tasks support
    • Feature freeze the V2 API when the V3 API is released
    • Set the timeline for deprecation of V2 so users have a lot
      of warning
    • Fallback for those who really don't want to move after
      deprecation is an API service which translates between V2 and V3
      requests, but removes the dual API support burden from Nova.

End TL;DR.

Although it's late in the development cycle I think it's important to
discuss this now rather than wait until the summit because if we go
ahead with the V3 API there is exploratory work around nova-network
that we would like to do before summit and it also affects how we look
at significant effort applying changes to the V2 API now. I'd also
prefer to hit the ground running with work we know we need to do in Juno
as soon as it opens rather than wait until the summit has finished.

Firstly I'd like to step back a bit and ask the question whether we
ever want to fix up the various problems with the V2 API that involve
backwards incompatible changes. These range from inconsistent naming
through the urls and data expected and returned, to poor and
inconsistent input validation and removal of all the proxying Nova
does to cinder, glance and neutron. I believe the answer to this is
yes - inconsistencies in the API make it harder to use (eg do I have an
instance or a server, and a project or a tenant just to name a
couple) and more error prone, and proxying has caused several issues
that were painful for us to fix.

So at some point existing users of the API will need to make changes
or we'll effectively have to carry two APIs, whether it be inside the
V2 API code or split between a V2 implementation and V3
implementation. I think the number of changes required often makes it
easier to maintain such a split in different files rather than a
single one to avoid large slabs of if/else which makes the underlying
code harder to understand and more error prone in the eventual
removal. It's also more difficult to use decorators like we have for
input validation in V3 if the function also has to support the old
behaviour.
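
As a rough illustration of the decorator point above, here is a minimal sketch of decorator-based input validation. It is not Nova's actual code: the schema format (field name mapped to a Python type), the `validated` decorator, `ValidationError` and `create_server` are all hypothetical names, and the tiny hand-rolled checker only stands in for a real JSON Schema library.

```python
import functools

# Hypothetical schema format: field name -> required Python type.
# A real implementation would use a JSON Schema library instead.
CREATE_SERVER_SCHEMA = {"name": str, "flavor_ref": str}


class ValidationError(Exception):
    """Raised when a request body does not match its schema."""


def validated(schema):
    """Reject bodies with unknown, missing or mistyped fields before the handler runs."""
    def decorator(handler):
        @functools.wraps(handler)
        def wrapper(body):
            unknown = set(body) - set(schema)
            if unknown:
                raise ValidationError(f"unexpected fields: {sorted(unknown)}")
            for field, expected in schema.items():
                if field not in body:
                    raise ValidationError(f"missing field: {field}")
                if not isinstance(body[field], expected):
                    raise ValidationError(f"{field} must be {expected.__name__}")
            return handler(body)
        return wrapper
    return decorator


@validated(CREATE_SERVER_SCHEMA)
def create_server(body):
    # The handler only ever sees input that already passed validation.
    return {"server": {"name": body["name"], "flavor_ref": body["flavor_ref"]}}
```

The handler body stays free of validation logic. If the same function also had to accept the old, looser input, the decorator could no longer reject anything outright, which is the difficulty described above.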

One approach that has been suggested for retaining the V2 API is to
gradually over time mark individual interfaces deprecated, support the
new behaviour in parallel and then after the deprecation period remove
the original behaviour.

With this I think we have to consider that if OpenStack continues to
be successful then, although we already have an existing user base of
the V2 API, with every release we have more and more new users
coming in. In fact in a few cycles we may have more post-Icehouse
users than pre-Icehouse ones. But by taking this gradual approach we
are basically saying to new users of the API that, although we know
where we are headed with the API, they can't actually write
against it yet and instead have to use the V2 API. And then every
cycle they will need to update their apps as we slowly deprecate parts
and replace them with the new interface. This seems to be a rather
hostile approach to people considering using our API. Also every
release we delay releasing the V3 API or defer making backwards
incompatible changes to V2 (if that's the route we take), the more
users we put into this situation of having to rework the software they
use to continue to access our API in the future.

Another side effect of this is that new features such as tasks (which was
one of the significant reasons for not releasing the V3 API in
Icehouse) could not be designed the way we want in the
V2 API because they require changes to the core API. So at least in the
short term we'd end up with a suboptimal API for tasks, and then go
through the pain of moving to the API we really want.

One thing to note is that the transition from V2 to V3 for users of
the API does not have to be a big-bang thing. For example, an
application can quite legitimately create a server using the V2 API,
attach a volume using the V3 API, detach it using the V2 API and
delete the server using the V3 API. The API is just a fairly thin layer on
top of the rest of Nova. So existing users of the API can decide
whether they want to tackle the job of moving from the V2 to V3 API in
one big step or in smaller ones over time.

We have also done a considerable amount of work in the V3 API in terms
of infrastructure which is not always visible to the API user, such as
an improved plugin structure, better isolation between plugins,
versioning, better error handling, better input validation etc., which
would all have to be backported to the V2 API. All in all I think if
you compare V2 and V3 API code, the latter is a lot cleaner and easier
to maintain. Backporting is a non-trivial amount of work that would take
a lot of both programmer and reviewer time in Juno, and perhaps overflow
into Koala. That time has already been spent on the V3 API.

What I think our plan for the API transition should be is:

  • Release the V3 API in Juno (it is probably too late to release it in
    Icehouse at this stage now without a bunch of FFEs, but
    theoretically we could make the skeleton of the tasks API changes in
    Icehouse without functional tasks which would allow tasks to be
    added in a backwards compatible manner in Juno). The Juno release
    would have both full task and nova-network support.

  • Feature freeze the V2 API development in Juno or at the very least
    when the V3 API is marked as current/supported (bug fixes are of
    course ok). I have several reasons for wanting to do this. The first
    is to avoid the burden of actively developing two APIs. We already
    have feature parity in V3 with V2 - with the exception of
    nova-network which was deliberately left out of the V3 API because
    it was considered deprecated, but has recently been re-opened for
    development. So an exception for V2 API nova network development
    would seem reasonable until the V3 API supports it fully.

    I think it's a pretty unusual situation where a project decides to
    continue significant feature development on both the latest version
    and the next most recent version. We don't for example allow feature
    development in Havana. And I don't think the situation needs to be
    any different for V2 once V3 is available. We already have feature
    parity and we have a reasonable transition plan for existing users
    so I think it's quite reasonable to say to users that if they want
    new features they need to access them via the new API.

    Although as mentioned above, it's possible to use the V3 API for new
    features even if a user is using the V2 API for everything else. So
    any new features could still be accessed by people who want to modify
    their existing V2 API based programs only to take advantage of the
    new functionality, without modifying the rest. Also deployers, if they
    wished, could through policy only expose the parts of the V3 API that
    they want (even though the core would have to be loaded it doesn't
    necessarily have to be accessible to everyone).

  • At summit or the Juno midcycle meetup decide on a release when we will
    remove the V2 API so existing users of the V2 API have plenty of
    warning. They can decide either to do a gradual transition or a
    big-bang move to the V3 API. Ultimately where we set the date for
    the removal of the V2 API is going to be a balance between needs of
    users and what we can afford to spend on maintenance of it. But
    whether we try to evolve the V2 API or move to the V3 API we still
    need to make that decision. If we choose the V3 API we can also
    still choose to gradually deprecate the V2 over time (eg remove
    support for rarely used extensions earlier and point users to the V3
    API for that functionality).

  • For those who really really don't want to move off the V2 API
    when the V2 API is removed, there is the possibility of writing a
    separate translation service which takes V2 API REST requests and
    translates them to V3 API requests and also does the proxying that
    the V2 API currently does for glance/neutron/cinder. It's not a
    trivial job, but should be possible and does remove the burden of
    supporting two APIs in the Nova tree.
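
To make the translation-service idea a little more concrete, here is a minimal sketch of the request-rewriting part only. It is purely illustrative: the key renames in `V2_TO_V3_KEYS` and the `/v2/` and `/v3/` URL prefixes are hypothetical stand-ins rather than the real V2 or V3 formats, and the proxying to glance/neutron/cinder is not shown.

```python
# Hypothetical renames from V2-style names to V3-style names; the real
# mapping is not given in this thread.
V2_TO_V3_KEYS = {"tenant_id": "project_id", "instance_id": "server_id"}


def translate_v2_body(body):
    """Recursively rename V2-style keys in a JSON-like body to V3-style keys."""
    if isinstance(body, dict):
        return {V2_TO_V3_KEYS.get(key, key): translate_v2_body(value)
                for key, value in body.items()}
    if isinstance(body, list):
        return [translate_v2_body(item) for item in body]
    return body


def translate_v2_path(path):
    """Rewrite a hypothetical '/v2/...' URL prefix to '/v3/...'."""
    prefix = "/v2/"
    if path.startswith(prefix):
        return "/v3/" + path[len(prefix):]
    return path
```

A real service would also have to translate responses back into the V2 format and map error codes, which is where most of the non-trivial work would likely be.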

In summary, we need to fix the problems with the V2 API, which requires
backwards incompatible changes. I believe releasing the V3
API in Juno and setting a deprecation timeframe for V2 well in advance
is the best overall solution we have for both our users (both current
and future) and us as developers - it balances the needs of users of
our API and what the easiest path for us as developers is. Attempting
to "evolve" the V2 API is on the surface attractive because we don't
in the short term need to make any decisions around deprecation, but it
will in the longer term involve more pain for everyone.

Chris

asked Feb 24, 2014 in openstack-dev by Christopher_Yeoh (8,400 points)   1 3 4
retagged Feb 25, 2015 by admin

99 Responses

On Mon, 2014-02-24 at 17:20 +1030, Christopher Yeoh wrote:
- Proposed way forward:
- Release the V3 API in Juno with nova-network and tasks support
- Feature freeze the V2 API when the V3 API is released
- Set the timeline for deprecation of V2 so users have a lot
of warning
- Fallback for those who really don't want to move after
deprecation is an API service which translates between V2 and V3
requests, but removes the dual API support burden from Nova.

And when do you think we can begin the process of deprecating the V3 API
and removing API extensions and XML "translation" support?

Best,
-jay

responded Feb 24, 2014 by Jay_Pipes (59,760 points)   3 10 14

On Mon, 24 Feb 2014 02:06:50 -0500
Jay Pipes wrote:

On Mon, 2014-02-24 at 17:20 +1030, Christopher Yeoh wrote:

  • Proposed way forward:

    • Release the V3 API in Juno with nova-network and tasks support
    • Feature freeze the V2 API when the V3 API is released
    • Set the timeline for deprecation of V2 so users have a lot
      of warning
    • Fallback for those who really don't want to move after
      deprecation is an API service which translates between V2 and V3
      requests, but removes the dual API support burden from Nova.

And when do you think we can begin the process of deprecating the V3
API and removing API extensions and XML "translation" support?

So did you mean V2 API here? I don't understand why you think the V3
API would need deprecating any time soon.

XML support has already been removed from the V3 API and I think the
patch to mark XML as deprecated for the V2 API and eventual removal in
Juno has already landed. So at least for part of the V2 API a one cycle
deprecation period has been seen as reasonable.

When it comes to API extensions I think that is actually more a
question of policy than anything else. The actual implementation behind
the scenes of a plugin architecture makes a lot of sense whether we
have extensions or not. It forces a good level of isolation between API
features and clarity of interaction where it's needed - all of which
is much better from a maintenance point of view.

Now whether we have parts of the API which are optional or not is
really a policy decision as to whether we will force deployers to use
all of the plugins or a subset (eg currently the "core"). There is
the technical support for doing so in the V3 API (essentially what is
used to enforce the core of the API). And a major API version bump is
not required to change this. Perhaps this part falls in to the
DefCore discussions :-)

Chris

responded Feb 24, 2014 by Christopher_Yeoh (8,400 points)   1 3 4

  • We want to make backwards incompatible changes to the API
    and whether we do it in-place with V2 or by releasing V3
    we'll have some form of dual API support burden.

IMHO, the cost of maintaining both APIs (which are largely duplicated)
for almost any amount of time outweighs the cost of localized changes.

  • Not making backwards incompatible changes means:

    • retaining an inconsistent API
    • not being able to fix numerous input validation issues
    • have to forever proxy for glance/cinder/neutron with all
      the problems that entails.

The neutron stickiness aside, I don't see a problem leaving the proxying
in place for the foreseeable future. I think that it's reasonable to
mark them as deprecated, encourage people not to use them, and maybe
even (with a core api version to mark the change) say that they're not
supported anymore.

I also think that breaking our users because we decided to split A into
B and C on the backend kind of sucks. I imagine that continuing to do
that at the API layer (when we're clearly going to keep doing it on the
backend) is going to earn us a bit of a reputation.

  • Backporting V3 infrastructure changes to V2 would be a
    considerable amount of programmer/review time

While acknowledging that you (and others) have done that for v3 already,
I have to think that such an effort is much less costly than maintaining
two complete overlapping pieces of API code.

  • The V3 API as-is has:

    • lower maintenance
    • is easier to understand and use (consistent).
    • Much better input validation which is baked-in (json-schema)
      rather than ad-hoc and incomplete.

In case it's not clear, there is no question that the implementation of
v3 is technically superior in my mind. So, thanks for that :)

IMHO, it is also:

  • twice the code
  • different enough to be annoying to convert existing clients to use
  • not currently different enough to justify the pain

  • Proposed way forward:

    • Release the V3 API in Juno with nova-network and tasks support
    • Feature freeze the V2 API when the V3 API is released
    • Set the timeline for deprecation of V2 so users have a lot
      of warning

This feels a lot like holding our users hostage in order to get them to
move. We're basically saying "We tweaked a few things, fixed some
spelling errors, and changed some date stamp formats. You will have to
port your client, or no new features for you!" That's obviously a little
hyperbolic, but I think that deployers of APIv2 would probably feel like
that's the story they have to give to their users.

Firstly I'd like to step back a bit and ask the question whether we
ever want to fix up the various problems with the V2 API that involve
backwards incompatible changes. These range from inconsistent naming
through the urls and data expected and returned, to poor and
inconsistent input validation and removal of all the proxying Nova
does to cinder, glance and neutron. I believe the answer to this is
yes - inconsistencies in the API make it harder to use (eg do I have a
instance or a server, and a project or a tenant just to name a
couple) and more error prone and proxying has caused several painful to
fix issues for us.

I naively think that we could figure out a way to move things forward
without having to completely break older clients. It's clear that other
services (with much larger and more widely-used APIs) are able to do it.

That said, I think the corollary to the above question is: do we ever
want to knowingly break an existing client for either of:

  1. arbitrary user-invisible backend changes in implementation or
    service organization
  2. purely cosmetic aspects like spelling, naming, etc

IMHO, we should do whatever we can to avoid breaking them except for the
most extreme cases.

--Dan

responded Feb 24, 2014 by Dan_Smith (9,860 points)   1 2 4

On 02/24/2014 01:50 AM, Christopher Yeoh wrote:
Hi,

There has recently been some speculation around the V3 API and whether
we should go forward with it or instead backport many of the changes
to the V2 API. I believe that the core of the concern is the extra
maintenance and test burden that supporting two APIs means and the
length of time before we are able to deprecate the V2 API and return
to maintaining only one (well two including EC2) API again.

Yes, this is a major concern. It has taken an enormous amount of work
to get to where we are, and v3 isn't done. It's a good time to
re-evaluate whether we are on the right path.

The more I think about it, the more I think that our absolute top goal
should be to maintain a stable API for as long as we can reasonably do
so. I believe that's what is best for our users. I think if you gave
people a choice, they would prefer an inconsistent API that works for
years over dealing with non-backwards compatible jumps to get a nicer
looking one.

The v3 API and its unit tests are roughly 25k lines of code. This also
doesn't include the changes necessary in novaclient or tempest. That's
just our code. It explodes out from there into every SDK, and then
end user apps. This should not be taken lightly.

This email is rather long so here's the TL;DR version:

  • We want to make backwards incompatible changes to the API
    and whether we do it in-place with V2 or by releasing V3
    we'll have some form of dual API support burden.

  • Not making backwards incompatible changes means:

    • retaining an inconsistent API

I actually think this isn't so bad, as discussed above.

- not being able to fix numerous input validation issues

I'm not convinced, actually. Surely we can do a lot of cleanup here.
Perhaps you have some examples of what we couldn't do in the existing API?

If it's a case of wanting to be more strict, some would argue that the
current behavior isn't so bad (see robustness principle [1]):

"Be conservative in what you do, be liberal in what you accept from
others (often reworded as "Be conservative in what you send, be
liberal in what you accept")."

There's a decent counter argument to this, too. However, I still fall
back on it being best to just not break existing clients above all else.

- have to forever proxy for glance/cinder/neutron with all
  the problems that entails.

I don't think I'm as bothered by the proxying as others are. Perhaps
it's not architecturally pretty, but it's worth it to maintain
compatibility for our users.

  • Backporting V3 infrastructure changes to V2 would be a
    considerable amount of programmer/review time

Agreed, but so is the ongoing maintenance and development of v3.

  • The V3 API as-is has:

    • lower maintenance
    • is easier to understand and use (consistent).
    • Much better input validation which is baked-in (json-schema)
      rather than ad-hoc and incomplete.

So here's the rub ... with the exception of the consistency bits, none
of this is visible to users, which makes me think we should be able to
do all of this on v2.

  • Whilst we have existing users of the API we also have a lot more
    users in the future. It would be much better to allow them to use
    the API we want to get to as soon as possible, rather than trying
    to evolve the V2 API and forcing them along the transition that they
    could otherwise avoid.

I'm not sure I understand this. A key point is that I think any
evolving of the V2 API has to be backwards compatible, so there's no
forcing them along involved.

  • We already have feature parity for the V3 API (nova-network being
    the exception due to the very recent unfreezing of it), novaclient
    support, and a reasonable transition path for V2 users.

  • Proposed way forward:

    • Release the V3 API in Juno with nova-network and tasks support
    • Feature freeze the V2 API when the V3 API is released
    • Set the timeline for deprecation of V2 so users have a lot
      of warning
    • Fallback for those who really don't want to move after
      deprecation is an API service which translates between V2 and V3
      requests, but removes the dual API support burden from Nova.

One of my biggest principles with a new API is that we should not have
to force a migration with a strict timeline like this. If we haven't
built something compelling enough to get people to want to migrate as
soon as they are able, then we haven't done our job. Deprecation of the
old thing should only be done when we feel it's no longer wanted or used
by the vast majority. I just don't see that happening any time soon.

We have a couple of ways forward right now.

1) Continue as we have been, and plan to release v3 once we have a
compelling enough feature set.

2) Take what we have learned from v3 and apply it to v2. For example:

  • The plugin infrastructure is an internal implementation detail that
    can be done with the existing API.

  • extension versioning is a concept we can add to v2

  • we've also been discussing the concept of a core minor version, to
    reflect updates to the core that are backwards compatible. This
    seems doable in v2.

  • revisit a new major API when we get to the point of wanting to
    effectively do a re-write, where we are majorly re-thinking the
    way our API is designed (from an external perspective, not internal
    implementation).
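
The core minor version idea in the list above could be sketched roughly as follows. This is an assumption-laden illustration, not an agreed design: the "major.minor" string format, `ApiVersion`, `parse_version` and `server_supports` are hypothetical names.

```python
from typing import NamedTuple


class ApiVersion(NamedTuple):
    major: int
    minor: int


def parse_version(text):
    """Parse a 'major.minor' string such as '2.3' (the format is an assumption)."""
    major, minor = text.split(".")
    return ApiVersion(int(major), int(minor))


def server_supports(server, client_needs):
    """Minor bumps are backwards compatible, so a server satisfies a client
    when the major versions match and the server minor is at least as new."""
    return (server.major == client_needs.major
            and server.minor >= client_needs.minor)
```

Under this scheme a client written against 2.1 keeps working on a 2.3 server, while a client that needs 2.3 can detect that a 2.1 server is too old.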

[1] http://en.wikipedia.org/wiki/Robustness_principle

--
Russell Bryant

responded Feb 24, 2014 by Russell_Bryant (19,240 points)   2 3 8

On Mon, 2014-02-24 at 20:22 +1030, Christopher Yeoh wrote:
On Mon, 24 Feb 2014 02:06:50 -0500
Jay Pipes wrote:

On Mon, 2014-02-24 at 17:20 +1030, Christopher Yeoh wrote:

  • Proposed way forward:

    • Release the V3 API in Juno with nova-network and tasks support
    • Feature freeze the V2 API when the V3 API is released
    • Set the timeline for deprecation of V2 so users have a lot
      of warning
    • Fallback for those who really don't want to move after
      deprecation is an API service which translates between V2 and V3
      requests, but removes the dual API support burden from Nova.

And when do you think we can begin the process of deprecating the V3
API and removing API extensions and XML "translation" support?

So did you mean V2 API here? I don't understand why you think the V3
API would need deprecating any time soon.

No, I meant v3.

XML support has already been removed from the V3 API and I think the
patch to mark XML as deprecated for the V2 API and eventual removal in
Juno has already landed. So at least for part of the V2 API a one cycle
deprecation period has been seen as reasonable.

OK, very sorry, I must have missed that announcement. I did not realize
that XML support had already been removed from v3.

When it comes to API extensions I think that is actually more a
question of policy than anything else. The actual implementation behind
the scenes of a plugin architecture makes a lot of sense whether we
have extensions or not.

An API extension is not a plugin. And I'm not arguing against a plugin
architecture -- the difference is that a driver/plugin architecture
enables a single public API to have different backend implementations.

Please see my diatribe on that here:

https://www.mail-archive.com/openstack-dev@lists.openstack.org/msg13660.html

It forces a good level of isolation between API
features and clarity of interaction where it's needed - all of which
is much better from a maintenance point of view.

Sorry, I have to violently disagree with you on that one. The API
extensions (in Nova, Neutron, Keystone, et al) have muddied the code
immeasurably and bled implementation into the public API -- something
that is antithetical to good public API design.

Drivers and plugins belong in the implementation layer. Not in the
public API layer.

Now whether we have parts of the API which are optional or not is
really a policy decision as to whether we will force deployers to use
all of the plugins or a subset (eg currently the "core").

It's not about "forcing" providers to support all of the public API.
It's about providing a single, well-documented, consistent HTTP REST API
for consumers of that API. Whether a provider chooses to, for example,
deploy with nova-network or Neutron, or Xen vs. KVM, or support block
migration for that matter should have no effect on the public API. The
fact that those choices currently do affect the public API that is
consumed by the client is a major indication of the weakness of the API.

There is
the technical support for doing so in the V3 API (essentially what is
used to enforce the core of the API). And a major API version bump is
not required to change this. Perhaps this part falls in to the
DefCore discussions :-)

I don't see how this discussion falls into the DefCore discussion.

Best,
-jay

responded Feb 24, 2014 by Jay_Pipes (59,760 points)   3 10 14

On 2/24/2014 10:13 AM, Russell Bryant wrote:
On 02/24/2014 01:50 AM, Christopher Yeoh wrote:

Hi,

There has recently been some speculation around the V3 API and whether
we should go forward with it or instead backport many of the changes
to the V2 API. I believe that the core of the concern is the extra
maintenance and test burden that supporting two APIs means and the
length of time before we are able to deprecate the V2 API and return
to maintaining only one (well two including EC2) API again.

Yes, this is a major concern. It has taken an enormous amount of work
to get to where we are, and v3 isn't done. It's a good time to
re-evaluate whether we are on the right path.

The more I think about it, the more I think that our absolute top goal
should be to maintain a stable API for as long as we can reasonably do
so. I believe that's what is best for our users. I think if you gave
people a choice, they would prefer an inconsistent API that works for
years over dealing with non-backwards compatible jumps to get a nicer
looking one.

The v3 API and its unit tests are roughly 25k lines of code. This also
doesn't include the changes necessary in novaclient or tempest. That's
just our code. It explodes out from there into every SDK, and then
end user apps. This should not be taken lightly.

This email is rather long so here's the TL;DR version:

  • We want to make backwards incompatible changes to the API
    and whether we do it in-place with V2 or by releasing V3
    we'll have some form of dual API support burden.

    • Not making backwards incompatible changes means:

      • retaining an inconsistent API

I actually think this isn't so bad, as discussed above.

 - not being able to fix numerous input validation issues

I'm not convinced, actually. Surely we can do a lot of cleanup here.
Perhaps you have some examples of what we couldn't do in the existing API?

If it's a case of wanting to be more strict, some would argue that the
current behavior isn't so bad (see robustness principle [1]):

 "Be conservative in what you do, be liberal in what you accept from
 others (often reworded as "Be conservative in what you send, be
 liberal in what you accept")."

There's a decent counter argument to this, too. However, I still fall
back on it being best to just not break existing clients above all else.

 - have to forever proxy for glance/cinder/neutron with all
   the problems that entails.

I don't think I'm as bothered by the proxying as others are. Perhaps
it's not architecturally pretty, but it's worth it to maintain
compatibility for our users.

+1 to this, I think this is also related to what Jay Pipes is saying in
his reply:

"Whether a provider chooses to, for example,
deploy with nova-network or Neutron, or Xen vs. KVM, or support block
migration for that matter should have no effect on the public API. The
fact that those choices currently do affect the public API that is
consumed by the client is a major indication of the weakness of the API."

As a consumer, I don't want to have to know which V2 APIs work and which
don't depending on if I'm using nova-network or Neutron.

  • Backporting V3 infrastructure changes to V2 would be a
    considerable amount of programmer/review time

Agreed, but so is the ongoing maintenance and development of v3.

  • The V3 API as-is has:

    • lower maintenance
    • is easier to understand and use (consistent).
    • Much better input validation which is baked-in (json-schema)
      rather than ad-hoc and incomplete.

So here's the rub ... with the exception of the consistency bits, none
of this is visible to users, which makes me think we should be able to
do all of this on v2.

  • Whilst we have existing users of the API we also have a lot more
    users in the future. It would be much better to allow them to use
    the API we want to get to as soon as possible, rather than trying
    to evolve the V2 API and forcing them along the transition that they
    could otherwise avoid.

I'm not sure I understand this. A key point is that I think any
evolving of the V2 API has to be backwards compatible, so there's no
forcing them along involved.

  • We already have feature parity for the V3 API (nova-network being
    the exception due to the very recent unfreezing of it), novaclient
    support, and a reasonable transition path for V2 users.

  • Proposed way forward:

    • Release the V3 API in Juno with nova-network and tasks support
    • Feature freeze the V2 API when the V3 API is released

      • Set the timeline for deprecation of V2 so users have a lot
        of warning
      • Fallback for those who really don't want to move after
        deprecation is an API service which translates between V2 and V3
        requests, but removes the dual API support burden from Nova.

One of my biggest principles with a new API is that we should not have
to force a migration with a strict timeline like this. If we haven't
built something compelling enough to get people to want to migrate as
soon as they are able, then we haven't done our job. Deprecation of the
old thing should only be done when we feel it's no longer wanted or used
by the vast majority. I just don't see that happening any time soon.

We have a couple of ways forward right now.

1) Continue as we have been, and plan to release v3 once we have a
compelling enough feature set.

2) Take what we have learned from v3 and apply it to v2. For example:

  • The plugin infrastructure is an internal implementation detail that
    can be done with the existing API.

  • extension versioning is a concept we can add to v2

  • we've also been discussing the concept of a core minor version, to
    reflect updates to the core that are backwards compatible. This
    seems doable in v2.

  • revisit a new major API when we get to the point of wanting to
    effectively do a re-write, where we are majorly re-thinking the
    way our API is designed (from an external perspective, not internal
    implementation).
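
To make the "core minor version" idea a bit more concrete, here is a minimal sketch of the compatibility rule it implies: a client written against some minor version keeps working against any server with the same major version and an equal or newer minor version. The "major.minor" string format and the function names are assumptions made for illustration, not something Nova defines.

```python
def parse_version(text):
    """Parse a hypothetical "major.minor" version string into two ints."""
    major, minor = text.split(".")
    return int(major), int(minor)


def is_compatible(server_version, client_version):
    """Return True if a client written for client_version can use the server.

    Minor bumps are assumed to be backwards compatible, so the server only
    needs the same major version and a minor version at least as new as
    the one the client was written against.
    """
    s_major, s_minor = parse_version(server_version)
    c_major, c_minor = parse_version(client_version)
    return s_major == c_major and s_minor >= c_minor
```

Under this rule a major bump is the only thing that can break a client, which is why it would be reserved for the re-write case in the last bullet.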

[1] http://en.wikipedia.org/wiki/Robustness_principle

--

Thanks,

Matt Riedemann

responded Feb 24, 2014 by Matt_Riedemann (48,320 points)   3 7 21
0 votes

On Mon, 24 Feb 2014 07:56:19 -0800
Dan Smith wrote:

  • We want to make backwards incompatible changes to the API
    and whether we do it in-place with V2 or by releasing V3
    we'll have some form of dual API support burden.

IMHO, the cost of maintaining both APIs (which are largely duplicated)
for almost any amount of time outweighs the cost of localized changes.

The API layer is actually quite a thin layer on top of the rest
of Nova. Most of the logic in the API code is really just checking
incoming data, calling the underlying nova logic and then massaging
what is returned into the correct format. So as soon as you change the
format, the cost of localised changes is pretty much the same as
duplicating the APIs. In fact I'd argue in many cases it's more, because
code readability is a lot worse and techniques like using
jsonschema decorators for input validation are a lot harder to
implement. And unit and tempest tests still need to be duplicated.
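
As a rough illustration of the decorator approach to input validation, here is a minimal Python sketch. It is not Nova's actual code: `check` is a tiny hand-written stand-in for a real jsonschema validator, and `validated`, `RENAME_SCHEMA` and `rename_server` are hypothetical names invented for the example.

```python
import functools


class ValidationError(Exception):
    """Raised when a request body does not match its schema."""


def check(body, schema):
    # Stand-in for a jsonschema validator. Supports only "required",
    # "properties" with a "type", and "additionalProperties".
    types = {"string": str, "integer": int, "object": dict}
    if not isinstance(body, dict):
        raise ValidationError("body must be an object")
    for name in schema.get("required", []):
        if name not in body:
            raise ValidationError("missing field: %s" % name)
    props = schema.get("properties", {})
    for name, value in body.items():
        if name not in props:
            if not schema.get("additionalProperties", True):
                raise ValidationError("unexpected field: %s" % name)
            continue
        expected = types[props[name]["type"]]
        # bool is a subclass of int in Python; this sketch has no
        # "boolean" type, so booleans are always rejected.
        if isinstance(value, bool) or not isinstance(value, expected):
            raise ValidationError("wrong type for field: %s" % name)


def validated(schema):
    """Decorator that validates the request body before the handler runs."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(body):
            check(body, schema)
            return func(body)
        return wrapper
    return decorator


# Hypothetical schema for a hypothetical "rename server" request.
RENAME_SCHEMA = {
    "required": ["name"],
    "properties": {"name": {"type": "string"}},
    "additionalProperties": False,
}


@validated(RENAME_SCHEMA)
def rename_server(body):
    # The handler can assume the body is well-formed.
    return {"server": {"name": body["name"]}}
```

The schema sits beside the handler instead of being interleaved with it, so a misspelled field such as `nmae` is rejected before the handler runs.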

The neutron stickiness aside, I don't see a problem leaving the
proxying in place for the foreseeable future. I think that it's
reasonable to mark them as deprecated, encourage people not to use
them, and maybe even (with a core api version to mark the change) say
that they're not supported anymore.

I don't understand why this is also not seen as forcing people off V2
to V3 which is being given as a reason for not being able to set a
reasonable deprecation time for V2. This will require major changes for
people using the V2 API to change how they use it.

I also think that breaking our users because we decided to split A
into B and C on the backend kind of sucks. I imagine that continuing
to do that at the API layer (when we're clearly going to keep doing
it on the backend) is going to earn us a bit of a reputation.

In all the discussions we've (as in the Nova group) had over the API
there has been a pretty clear consensus that proxying is quite
suboptimal (there are caching issues etc) and the long term goal is to
remove it from Nova. Why the change now?

  • Backporting V3 infrastructure changes to V2 would be a
    considerable amount of programmer/review time

While acknowledging that you (and others) have done that for v3
already, I have to think that such an effort is much less costly than
maintaining two complete overlapping pieces of API code.

I strongly disagree here. I think you're overestimating the
amount of maintenance effort this involves and significantly
underestimating how much effort and review time a backport is going to
take.

  • twice the code
  • different enough to be annoying to convert existing clients to use
  • not currently different enough to justify the pain

For starters, it's not twice the code, because we don't do things like
proxying and because we are able to logically separate out input
validation with jsonschema.

v2 API: ~14600 LOC
v3 API: ~7300 LOC (~8600 LOC if nova-network as-is added back in,
though the actual increase would almost certainly be a lot smaller)

And that's with a lot of the jsonschema patches not landed. So it's
actually getting smaller. Long term, which looks better from a
maintenance point of view?

And I think you're continuing to look at it solely from the point of
view of pain for existing users of the API and not considering the pain
for new users who have to work out how to use the API. Just one
simple example: how many people new to the API get confused about
what they are meant to send when it asks for an instance_uuid when
they've never received one? Is it a server uuid, and if so what's the
difference? Do I have to do some sort of conversion? There are similar
issues around project and tenant. And when writing code they have to
remember that for this part of the API they pass it as server_uuid, in
another as instance_uuid, or maybe it's just id? All of these looked at
individually may look like small costs or barriers to using the API but
they all add up and they end up being imposed on a lot of people.

This feels a lot like holding our users hostage in order to get them
to move. We're basically saying "We tweaked a few things, fixed some
spelling errors, and changed some date stamp formats. You will have to
port your client, or no new features for you!" That's obviously a
little hyperbolic, but I think that deployers of APIv2 would probably
feel like that's the story they have to give to their users.

And how is, say, removing proxying or making any backwards incompatible
change any different? This sort of situation is very common with
major library version upgrades. If you want new features you have to
port to the new library version, which requires changes to your app
(that's why it's a major library version, not a minor one).

I naively think that we could figure out a way to move things forward
without having to completely break older clients. It's clear that
other services (with much larger and more widely-used APIs) are able
to do it.

Well, if you never deprecate, the only way to do it is to maintain the
old API forever (including tests), and just take the hit on all that
involves.

That said, I think the corollary to the above question is: do we ever
want to knowingly break an existing client for either of:

  1. arbitrary user-invisible backend changes in implementation or
    service organization
  2. purely cosmetic aspects like spelling, naming, etc

IMHO, we should do whatever we can to avoid breaking them except for
the most extreme cases.

What about the tasks API? We discussed that at the mid-cycle summit and
decided that the alternative backwards compatible way of doing it was
too ugly and we didn't want to do that. But that's exactly what we'd be
doing if we implemented them in the v2 API, and it would be a
feature which ends up looking bolted on because of the otherwise
significant non backwards compatible API changes we can't make.

The v2 API "evolved" over a long period of time and we weren't
historically very good at reviewing the APIs before they were added. And
both we and our users are paying for it now. I think we're much better
at that now. Bumping the major version gives us the opportunity to
eventually get rid of that technical debt even if it costs us a bit
more in the short term.

Chris

responded Feb 24, 2014 by Christopher_Yeoh (8,400 points)   1 3 4
0 votes

On the topic of backwards incompatible changes:

I strongly believe that breaking current clients that use the APIs directly is the worst option possible. The arguments about needing to know which APIs work based upon which backend drivers are used are all valid, but making an incompatible API change when we've made the contract that the current API will be stable is a very bad approach. Breaking current clients isn't just breaking "novaclient", it would also break any customers that are developing directly against the API. In the case of cloud deployments with real-world production loads on them (and custom development around the APIs), upgrading between major versions is already difficult to orchestrate (timing, approvals, etc). If we add in the need to re-work large swaths of code due to API changes, it will become even more onerous and perhaps drive deployers to forgo upgrades in favor of stability.

If the perception is that we don't have stable APIs (especially when we are ostensibly versioning them), driving adoption of OpenStack becomes significantly more difficult. Difficulty in driving further adoption would be a big negative to both the project and the community.

TL;DR, "don't break the contract". If we are seriously making incompatible changes (and we will be regardless of the direction) the only reasonable option is a new major version.
--
Morgan Fainberg
Principal Software Engineer
Core Developer, Keystone
m at metacloud.com

On February 24, 2014 at 10:16:31, Matt Riedemann (mriedem at linux.vnet.ibm.com) wrote:

On 2/24/2014 10:13 AM, Russell Bryant wrote:
On 02/24/2014 01:50 AM, Christopher Yeoh wrote:

Hi,

There has recently been some speculation around the V3 API and whether
we should go forward with it or instead backport many of the changes
to the V2 API. I believe that the core of the concern is the extra
maintenance and test burden that supporting two APIs means and the
length of time before we are able to deprecate the V2 API and return
to maintaining only one (well two including EC2) API again.

Yes, this is a major concern. It has taken an enormous amount of work
to get to where we are, and v3 isn't done. It's a good time to
re-evaluate whether we are on the right path.

The more I think about it, the more I think that our absolute top goal
should be to maintain a stable API for as long as we can reasonably do
so. I believe that's what is best for our users. I think if you gave
people a choice, they would prefer an inconsistent API that works for
years over dealing with non-backwards compatible jumps to get a nicer
looking one.

The v3 API and its unit tests are roughly 25k lines of code. This also
doesn't include the changes necessary in novaclient or tempest. That's
just our code. It explodes out from there into every SDK, and then
end user apps. This should not be taken lightly.

This email is rather long so here's the TL;DR version:

- We want to make backwards incompatible changes to the API
and whether we do it in-place with V2 or by releasing V3
we'll have some form of dual API support burden.
- Not making backwards incompatible changes means:
- retaining an inconsistent API

I actually think this isn't so bad, as discussed above.

  • not being able to fix numerous input validation issues

I'm not convinced, actually. Surely we can do a lot of cleanup here.
Perhaps you have some examples of what we couldn't do in the existing API?

If it's a case of wanting to be more strict, some would argue that the
current behavior isn't so bad (see robustness principle [1]):

"Be conservative in what you do, be liberal in what you accept from
others (often reworded as "Be conservative in what you send, be
liberal in what you accept")."

There's a decent counter argument to this, too. However, I still fall
back on it being best to just not break existing clients above all else.

  • have to forever proxy for glance/cinder/neutron with all
    the problems that entails.

I don't think I'm as bothered by the proxying as others are. Perhaps
it's not architecturally pretty, but it's worth it to maintain
compatibility for our users.

+1 to this, I think this is also related to what Jay Pipes is saying in
his reply:

"Whether a provider chooses to, for example,
deploy with nova-network or Neutron, or Xen vs. KVM, or support block
migration for that matter should have no effect on the public API. The
fact that those choices currently do affect the public API that is
consumed by the client is a major indication of the weakness of the API."

As a consumer, I don't want to have to know which V2 APIs work and which
don't depending on if I'm using nova-network or Neutron.

  • Backporting V3 infrastructure changes to V2 would be a
    considerable amount of programmer/review time

Agreed, but so is the ongoing maintenance and development of v3.


- The V3 API as-is has:
- lower maintenance
- is easier to understand and use (consistent).
- Much better input validation which is baked-in (json-schema)
rather than ad-hoc and incomplete.

So here's the rub ... with the exception of the consistency bits, none
of this is visible to users, which makes me think we should be able to
do all of this on v2.

  • Whilst we have existing users of the API we also have a lot more
    users in the future. It would be much better to allow them to use
    the API we want to get to as soon as possible, rather than trying
    to evolve the V2 API and forcing them along the transition that they
    could otherwise avoid.

I'm not sure I understand this. A key point is that I think any
evolving of the V2 API has to be backwards compatible, so there's no
forcing them along involved.

  • We already have feature parity for the V3 API (nova-network being
    the exception due to the very recent unfreezing of it), novaclient
    support, and a reasonable transition path for V2 users.
  • Proposed way forward:
  • Release the V3 API in Juno with nova-network and tasks support
  • Feature freeze the V2 API when the V3 API is released
  • Set the timeline for deprecation of V2 so users have a lot
    of warning
  • Fallback for those who really don't want to move after
    deprecation is an API service which translates between V2 and V3
    requests, but removes the dual API support burden from Nova.

One of my biggest principles with a new API is that we should not have
to force a migration with a strict timeline like this. If we haven't
built something compelling enough to get people to want to migrate as
soon as they are able, then we haven't done our job. Deprecation of the
old thing should only be done when we feel it's no longer wanted or used
by the vast majority. I just don't see that happening any time soon.

We have a couple of ways forward right now.

1) Continue as we have been, and plan to release v3 once we have a
compelling enough feature set.

2) Take what we have learned from v3 and apply it to v2. For example:

- The plugin infrastructure is an internal implementation detail that
can be done with the existing API.

- extension versioning is a concept we can add to v2

- we've also been discussing the concept of a core minor version, to
reflect updates to the core that are backwards compatible. This
seems doable in v2.

- revisit a new major API when we get to the point of wanting to
effectively do a re-write, where we are majorly re-thinking the
way our API is designed (from an external perspective, not internal
implementation).

[1] http://en.wikipedia.org/wiki/Robustness_principle

--

Thanks,

Matt Riedemann


responded Feb 24, 2014 by m_at_metacloud.com (580 points)   1
0 votes

The API layer is actually quite a thin layer on top of the
rest of Nova. Most of the logic in the API code is really just
checking incoming data, calling the underlying nova logic and then
massaging what is returned into the correct format. So as soon as you
change the format, the cost of localised changes is pretty much the
same as duplicating the APIs. In fact I'd argue in many cases it's
more, because code readability is a lot worse and techniques like
using jsonschema decorators for input validation are a lot harder
to implement. And unit and tempest tests still need to be
duplicated.

Making any change to the backend takes double the effort with two trees
that it would with one API. I agree that changing/augmenting the format
of a call means some localized "if this then that" code, but that's
minor compared to what it takes to do things on the backend, IMHO.

I don't understand why this is also not seen as forcing people off
V2 to V3 which is being given as a reason for not being able to set
a reasonable deprecation time for V2. This will require major changes
for people using the V2 API to change how they use it.

Well, deprecating them doesn't require the change. Removing them does. I
think we can probably keep the proxying in a deprecated form for a very
long time, hopefully encouraging new users to "do it right" without
breaking existing users who don't care. Hopefully losing out on the
functionality they miss by not talking directly to Neutron (for example)
will be a good carrot to avoid using the proxy APIs.

In all the discussions we've (as in the Nova group) had over the API
there has been a pretty clear consensus that proxying is quite
suboptimal (there are caching issues etc) and the long term goal is
to remove it from Nova. Why the change now?

This is just MHO, of course. I don't think I've been party to those
conversations. I understand why the proxying is bad, but that's a
different issue from whether we drop it and break our users.

I strongly disagree here. I think you're overestimating the amount of
maintenance effort this involves and significantly underestimating
how much effort and review time a backport is going to take.

Fair enough. I'm going from my experience over the last few cycles of
changing how the API communicates with the backend. This is something
we'll have to continue to evolve over time, and right now it
Sucks Big Time(tm) :)

  • twice the code
    For starters, It's not twice the code because we don't do things
    like proxying and because we are able to logically separate out
    input validation jsonschema.

You're right, I should have said "twice the code for changes between the
API and the backend".

Just one simple example: how many people new to the API get
confused about what they are meant to send when it asks for an
instance_uuid when they've never received one? Is it a server uuid,
and if so what's the difference? Do I have to do some sort of
conversion? There are similar issues around project and tenant. And
when writing code they have to remember that for this part of the API
they pass it as server_uuid, in another as instance_uuid, or maybe
it's just id? All of these looked at individually may look like small
costs or barriers to using the API but they all add up and they end
up being imposed on a lot of people.

Yup, it's ugly, no doubt. I think that particular situation is probably
(hopefully?) covered up by the various client libraries (and/or docs)
that we have. If not, I think it's probably something we can improve
from an experience perspective on that end. But yeah, I know the public
API docs would still have that ambiguity.

And how is, say, removing proxying or making any backwards
incompatible change any different?

It's not. That's why I said "maybe remove it some day" :)

Well, if you never deprecate, the only way to do it is to maintain the
old API forever (including tests), and just take the hit on all that
involves.

Sure. Hopefully people that actually deploy and support our API will
chime in here about whether they think that effort is worth not telling
their users to totally rewrite their clients.

If we keep v2 and v3, I think we start in icehouse with a very large
surface, which will increase over time. If we don't, then we start with
v2 and end up with only the delta over time.

What about the tasks API? We discussed that at the mid-cycle summit
and decided that the alternative backwards compatible way of doing it
was too ugly and we didn't want to do that. But that's exactly what
we'd be doing if we implemented them in the v2 API, and it would be a
feature which ends up looking bolted on because of the otherwise
significant non backwards compatible API changes we can't make.

If we version the core API and let the client declare the version it
speaks in a header, we could iterate on that interface right? If they're
version <X, return the server object and a task header, if >=X return
the task. We could also let the client declare support in their
accept-type header:

Accept: application/json;type=task

Which would mean "I can take a JSON task instead of a server". Better
than "if you want a task, rewrite your client against v3" IMHO.
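
A minimal sketch of how that negotiation might look, assuming a hypothetical `X-Core-API-Version` request header, a hypothetical `X-Task-Id` response header, and a hypothetical cut-over version `TASKS_MIN_VERSION` standing in for X. None of these names come from Nova; only the `Accept: application/json;type=task` form is taken from the suggestion above.

```python
TASKS_MIN_VERSION = 2  # hypothetical "X": first core version returning tasks


def parse_accept(header):
    """Split an Accept header value into (media_type, params) pairs."""
    result = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        if not parts[0]:
            continue
        params = {}
        for part in parts[1:]:
            key, sep, value = part.partition("=")
            if sep:
                params[key.strip().lower()] = value.strip()
        result.append((parts[0].lower(), params))
    return result


def wants_task(headers):
    """Decide whether the client can take a task instead of a server.

    `headers` is a plain dict here, so lookups are case-sensitive; a real
    web framework would normally handle header case for you.
    """
    version = int(headers.get("X-Core-API-Version", "1"))
    if version >= TASKS_MIN_VERSION:
        return True
    for media_type, params in parse_accept(headers.get("Accept", "")):
        if media_type == "application/json" and params.get("type") == "task":
            return True
    return False


def build_response(headers, server, task):
    """Return (body, extra_response_headers) for a server action."""
    if wants_task(headers):
        return {"task": task}, {}
    # Older clients still get the server object, plus a task header.
    return {"server": server}, {"X-Task-Id": task["id"]}
```

An old client that sends neither header keeps getting the response shape it already understands, which is the whole point of doing this inside v2.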

I recognize that tasks were going to be the first big win for v3, but
honestly it feels minor in the context of what we do about v2/v3 long
term, so I don't want to get bogged down in the details of how to do it.

--Dan

responded Feb 24, 2014 by Dan_Smith (9,860 points)   1 2 4
0 votes

On 02/24/2014 05:01 PM, Morgan Fainberg wrote:
On the topic of backwards incompatible changes:

I strongly believe that breaking current clients that use the APIs
directly is the worst option possible. The arguments about needing
to know which APIs work based upon which backend drivers are used are
all valid, but making an incompatible API change when we've made the
contract that the current API will be stable is a very bad approach.
Breaking current clients isn't just breaking "novaclient", it would also
break any customers that are developing directly against the API. In the
case of cloud deployments with real-world production loads on them (and
custom development around the APIs), upgrading between major versions is
already difficult to orchestrate (timing, approvals, etc). If we add in
the need to re-work large swaths of code due to API changes, it will
become even more onerous and perhaps drive deployers to forgo
upgrades in favor of stability.

If the perception is that we don't have stable APIs (especially when we
are ostensibly versioning them), driving adoption of OpenStack becomes
significantly more difficult. Difficulty in driving further adoption
would be a big negative to both the project and the community.

TL;DR, "don't break the contract". If we are seriously making
incompatible changes (and we will be regardless of the direction) the
only reasonable option is a new major version.

FWIW, I do not consider non backwards compatible changes to be on the
table for the existing API. Evolving it would have to be done in a
backwards compatible way. I'm completely in agreement with that.

--
Russell Bryant

responded Feb 24, 2014 by Russell_Bryant (19,240 points)   2 3 8
...