
[openstack-dev] [chef] Making the Kitchen Great Again: A Retrospective on OpenStack & Chef


The HTML version is here:
https://s.cassiba.com/2017/02/14/making-the-kitchen-great-again-a-retrospective-on-openstack-chef

This was influenced by Graham Hayes' State of the Project for Designate:
http://graham.hayes.ie/posts/openstack-designate-where-we-are/

I have been asked recently "what is going on with the OpenStack-Chef project?",
"how is the state of the cookbooks?", and "hey sc, how are those integration
tests coming?". Having been the PTL for the Newton and Ocata cycles, yet not
having shipped a release, is unthinkable, and deserves at least a sentence
or two.

It goes without saying that this is disheartening and depressing to me and
to everybody who has devoted their time to making the cookbooks a solid
and viable method for deploying OpenStack. OpenStack-Chef is among the
oldest[1] and most mature solutions for deploying OpenStack, though it is
not the most feature-rich.

TL;DR if you don't want to keep going -
OpenStack-Chef is not in a good place and is not sustainable.

OpenStack-Chef has always been a small project with a big responsibility.
The Chef approach to OpenStack historically has required a level of
investment within the Chef ecosystem, which is a hard enough sell when you
started out with Puppet or Ansible. Despite the unicorns and rainbows of
being Chef cookbooks, OpenStack-Chef always asserted itself as an OpenStack
project first, up to and including joining the Big Tent, whatever it takes.
To beat that drum, we are OpenStack.

There is no cool factor in deploying and managing OpenStack using Chef,
unless you've been running Chef, because insert Xzibit meme here and jokes
about turtles. That is, until you break something with automation; then it's
applause or facepalm. Usually both. At the same time.

As with any kitchen, it must be stocked and well maintained, and
OpenStack-Chef is no exception. Starting out, there was a vibrant community
producing organic, free-range code. Automation is invisible, assumed to be
there in the background. Once it's in place, it isn't touched again unless
it breaks. Upgrades in complex deployments can be fraught with error, even
when automated.

As has been seen in previous surveys[2], once an operator has chosen an
OpenStack release, some tend not to upgrade for the next cycle or three, until
the immediate bugs are worked out. Though there are now multinode and upgrade
scenarios supported with the Puppet OpenStack and TripleO projects, they do
not use Chef, so Chef deployers do not directly benefit from any of this
testing.

Being a deployment project, we are responsible not for one aspect of
the OpenStack project but for as many as can reasonably be supported.

We were very fortunate in the beginning, having support from public cloud
providers, as well as large private cloud providers. Stackalytics shows a
vibrant history, a veritable who's-who of OpenStack contributors, too many to
name. They've all moved on, working on other things.

As a previous PTL for the project once joked, the Chef approach to OpenStack
was the "other deployment tool that nobody uses". As time has gone by, that has
only become more true.

There are a few of us still cooking away, creating new recipes and cookbooks. The
pilot lights are still lit and there's usually something simmering away on the
back burner, but there is no shouting of orders, and not every dish gets tasted.
We think there might be rats, too, but we’re too shorthanded to maintain the traps.

We have yet to see many (meaningful) contributions from the community, however.
We have some amazing deployers that file bugs, and if they can, push up a patch.
It delights me when someone other than a core weighs in on a review. Those
contributions are highly appreciated and incredibly valuable, but they are
very tactical. A project cannot live on tactical contributions alone.

October 2015

https://s.cassiba.com/images/oct-2015-deployment-decisions.png

Where does that leave OpenStack-Chef? Let's take a look at the numbers:

+----------+---------+
| Cycle    | Commits |
+----------+---------+
| Havana   | 557     |
| Icehouse | 692     |
| Juno     | 424     |
| Kilo     | 474     |
| Liberty  | 259     |
| Mitaka   | 85      |
| Newton   | 112     |
| Ocata    | 78      |
+----------+---------+

As of this writing, Newton has not yet branched. Yes, you read that
correctly. This means the Ocata cycle has gone to ensuring that Newton just
functions, in a virtual quasi-vacuum, without input from larger-scale
deployments, which are running releases older than Newton and reporting bugs
we've fixed in master. Supporting Newton required implementing support for
Ubuntu 16.04, as well as client and underlying cookbook changes, due to
deprecations that started prior to Newton. Here is the output from berks viz
for a top-down view into the complexity on just the Chef side.

https://s.cassiba.com/images/openstack-chef-dependency-graph.png
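
For the unfamiliar, berks viz draws that graph from each cookbook's declared
dependencies. A minimal sketch of the kind of metadata.rb declarations it
walks (the names and versions below are illustrative, not the actual project
metadata):

    # metadata.rb -- illustrative sketch of cookbook dependency declarations
    name    'openstack-compute'
    license 'Apache-2.0'
    version '14.0.0'

    # Each depends line becomes an edge in the berks viz graph; the
    # transitive closure of these edges is the picture linked above.
    depends 'openstack-common'
    depends 'openstack-identity'
    depends 'openstack-image'
    depends 'openstack-network'

Berkshelf resolves those declarations and, with Graphviz installed, berks viz
renders the result as the dependency graph.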

For the Pike cycle, Jan Klare will be reprising the role of PTL. I do not
intend to speak for him, but there are a few paths forward in the Big Tent:

  • Branching stable/newton and stable/ocata with the quickness.
  • Improving OpenStack CI to the point of being able to trust it again for
    testing patches, as well as extending testing scenarios (including multinode).

For branching stable/newton, the external CI has been proving useful in
building overall confidence in cutting a release. We're way behind schedule,
but nearly there. I have begun working on implementing some basic multinode
gates, as our allinone no longer fits within the confines of the 8GB instances.
But, it's Chef, so triangle wheels, yo. Some of the cross-project efforts
translate to Chef, but not all. With square spinners.

So... how did this happen?


As it was with Designate, so it is with OpenStack-Chef: there is no single
reason or cause that brought us to this point.

The main catalyst was internal support shifting, which impacted the sponsored
developers and contributors. OpenStack-Chef became less and less a priority,
and one by one they shifted to other focuses. At the Austin 2016 Summit, we said
farewell to all but the PTL and one core. This put OpenStack-Chef in a bad place
given its mission and scope, but onward we go.

Due to the volume of work done by this small group and the lack of feedback
during development, it became more and more difficult to tell when a release
could be considered "done". We could no longer trust our CI framework, as the
developers with intrinsic knowledge had been refocused, leaving us with little
more than commit history to go on.

Users were okay with leaving us work, which we added to the heap. This, with
the departure of contributors, resulted in the majority of the development
being funded by just two companies, which left the project at risk from changes
in direction by those companies. Without regular feedback or guidance beyond
release notes and the occasional chat in another project's channel, the focus
shifted away from features to just ship it: as long as it passes allinone, and
multinode locally if there's time. Does it pass lint/unit/style? Fuck it. Ship
it, deal with the fallout. Yeah. This is bad on so many levels.
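For context, the lint/unit/style layers mentioned above are the sort of thing
a cookbook Rakefile wires together. A hypothetical sketch, assuming the usual
tools of the era (task names and tool choices are illustrative, not the
project's actual Rakefile):

    # Rakefile -- hypothetical sketch of the lint/unit/style layers
    require 'rspec/core/rake_task'
    require 'rubocop/rake_task'
    require 'foodcritic'

    # Style: Ruby style checks via RuboCop
    RuboCop::RakeTask.new(:style)

    # Lint: Chef-specific correctness checks via Foodcritic
    FoodCritic::Rake::LintTask.new(:lint)

    # Unit: ChefSpec examples under spec/
    RSpec::Core::RakeTask.new(:unit) do |t|
      t.pattern = 'spec/**/*_spec.rb'
    end

    task default: [:style, :lint, :unit]

These are the cheap gates; the allinone and multinode converges are the
expensive ones that got skipped when time ran out.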

The Big Tent really did not do as much as advertised for OpenStack-Chef, as
harsh as that opinion sounds. Larger, more well-funded projects have since
created processes, frameworks and test suites that were developed for their own
use cases, not necessarily taking into account Chef's own blend of automation.
That left us having to discover how to make fire on our own to make the
cookbooks work on each supported platform and release of OpenStack. In the Big
Tent, we were effectively left to our own devices. Just another OpenStack
project. We numbered nine cores when we moved from StackForge to the Big Tent.
That was our developer peak, though we did not know it yet.

Initially, the cookbooks had a very heavy dependency: Chef Server. If not Chef
Server, then Chef Solo, which had its own quirks, and nobody liked Chef Solo
anyway. Not even Chef Solo liked itself. During the Juno cycle, we switched to
the Chef Development Kit, which gave us chef-provisioning. This decreased
turnaround time for testing submitted patches, and boosted confidence all
around. Until Juno, it was difficult to run functional tests against the
cookbooks. That's when we discovered how to create fire. We could run
OpenStack! In virtual machines! On our laptops! OMG you guys! Suddenly,
OpenStack on the laptop became easy: push-button, single command, automated. We
could test a patch without a long spin-up. With that came integration gates
and periodic jobs. From days to minutes. We were cooking with gas! But... let's
not make those integration gates voting... yet.
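To illustrate what chef-provisioning made possible, here is a minimal sketch
in the spirit of our allinone setup (the driver choice and run list below are
illustrative assumptions, not the project's actual provisioning recipe):

    # allinone_sketch.rb -- minimal chef-provisioning example (illustrative)
    require 'chef/provisioning'
    require 'chef/provisioning/vagrant_driver'

    # Provision against a local Vagrant-managed virtual machine.
    with_driver 'vagrant'

    # One converge stands up a single-node OpenStack inside the VM.
    machine 'allinone' do
      recipe 'openstack_sketch::allinone'  # hypothetical run list
      converge true
    end

Run through chef-client in local mode, that was the single-command,
push-button experience.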

Mitaka brought a significant overhaul and simplification, with the introduction
of a multinode chef-provisioning recipe and more modular cookbooks. The pieces
finally existed, but the damage had been done, and unfortunately, this momentum
did not last. Internal priorities changed within the companies sponsoring
developers, many of whom could not be fully committed in the first place, and
we started shedding contributors, which happens. This is usually the point
where someone comes in to play either Grim Reaper or lifesaver. By Austin, our
numbers had waned until just two cores remained, Jan and myself. We could
hardly have been in worse locations to communicate, he in Germany and I in
California.

In the Newton and Ocata cycles, development progressed in a lurching fashion
without a team. Due to the overhaul in Mitaka, patches from the outside
community slowed in frequency; many of those contributors continued deploying
and running on older, EOL branches, or got frustrated enough to switch to
other automation flavors.
The remaining team had little overlapping time to communicate, being on
different continents in conflicting time zones. What was difficult to do with a
larger team spread across three continents and five time zones became
impossible with just two. Day jobs increasingly took priority over
OpenStack-Chef care and feeding, and some cookbooks started to go rancid
(sorry, Ironic, Sahara, Swift and Trove; nobody was able to support a
deployment with you). Interaction within the development team was limited to an
hour or two a day, eventually down to once or twice a week if we had time. Day
jobs proceeded to consume the development team, with sporadic development as
the months ticked on.

Over the Newton cycle, one cookbook was offered, EC2, the most desired
integration API, though with inadequate coverage for our support matrix. In the
end, it was not integrated, given the time and commitment required to support
such a feature with the inadequate resources at our disposal. During Ocata, the
project had one cookbook contributed from the community, Murano, which could be
integrated, and which grew an appendage in the form of the client cookbook.
That is the closest anyone has gotten to new features since the Mitaka cycle.
We added one core reviewer during this time.

Communication is a big part of any project, particularly a geographically
diverse effort like OpenStack. Prior to the Big Tent, we held weekly meetings
using Hangouts, which were open and publicized for the mailing list subscribers
and channel denizens. Upon joining the Big Tent, we gave up the regular
face time in favor of text-based IRC meetings, per governance. Without the
high-bandwidth requirement of a weekly video call, one by one, cores had
day-job meetings do what they do and take priority. In the Newton cycle, we
relinquished our weekly time slot after it became apparent that neither of us
could make the meetings. We have not held a scheduled meeting since, as it is
next to impossible to carve out adequate overlapping time.

We still have many of these problems to this day. Documentation is a mess or
nonexistent. Despite the flexibility of the tooling, users have but two
representative deployment examples: allinone or a rudimentary multinode.
OpenStack-Chef gained modularity at the expense of features, and there was an
overwhelming non-reaction to the deprecation of those features.

All of this results in a project that is not very friendly to new users, and
Chef does not look as attractive a deployment option as other, more
feature-rich flavors. There are real business decisions behind this. One need
only look at the steady usage decline in the surveys to see how negatively
things appear to existing and new users. This, for a project that has roots in
the very cubicles where OpenStack was born.[1]

But it's not all bad


For all the negativity, this story has upsides. I call on the people who
actually use this software in their deployments, in whatever shape it's in, to
not abandon OpenStack-Chef, or retire it to bitrot. We need help, not funeral
arrangements. Share your pain, so that we may find a way forward together, not
alone.

October 2016

https://s.cassiba.com/images/oct-2016-deployment-decisions.png

In my time with the project, I’ve gotten to know that there are some pretty big
names that leverage Chef in their deployments, and some of them even use it for
OpenStack. Some cookbook forks also exist, all serving to solve the same
problems that face OpenStack-Chef. Without feedback from real-world
deployments, OpenStack-Chef will continue to wither on the vine. This
fragmentation harms more than it solves.

What do we need? It’s easier to list what we don’t need. Developers, tooling,
documentation, testing: any and all are welcome and greatly appreciated. We
don’t ask for much, but what we are able to do is limited by our size and the
time available.

We need developers with time and funding budgeted for contributing within the
OpenStack ecosystem. We need better representation at events such as the PTG.
No OpenStack-Chef members will be attending the first PTG, though we intend to
meet up in Boston, maybe. Given our physical locations relative to Atlanta, it
made sense to stay home and communicate over IRC/code review. It doesn’t mean
our development has ceased; we’re just too few and too far away for it to make
sense.

We also need help from cross-project teams to understand and assimilate the
work they have done on the problems that we are also working to solve. We are
all working toward the same goal, though with a different dialect and coat of
arms.

OpenStack needs choice in how to OpenStack. Limiting to a few dialects of
automation makes things look less like an ecosystem and more like a distro,
which is fine, if that’s how people want to go about it. We can talk about that,
too.

I am happy to talk to anyone about how they can help. OpenStack-Chef has roots
in the pioneers of OpenStack, and, in my (not so humble) opinion, is way too
nifty to just let fall by the wayside.

I do not have team visuals to represent how the team has grown and shrunk over
the years, so let me leave you with some mental ones. At the end of Mitaka,
we numbered nine. At the start of Ocata, we numbered just two. In Boston, we
anticipate there will be, hopefully, all three of us, representing the sixteen
active subprojects that constitute OpenStack-Chef.

[1]: [openstack-compute::default commit history](https://github.com/openstack/cookbook-openstack-compute/commits/eol-grizzly/recipes/default.rb)
[2]: [April 2016 OpenStack User Survey: greater than 50% of production deployments were still on releases circa Kilo or older](https://www.openstack.org/assets/survey/April-2016-User-Survey-Report.pdf)
[3]: [OpenStack Survey Report](https://www.openstack.org/analytics)



asked Feb 16, 2017 in openstack-dev by s_at_cassiba.com (1,200 points)  

27 Responses


Samuel Cassiba wrote:
[...]
TL;DR if you don't want to keep going -
OpenStack-Chef is not in a good place and is not sustainable.
[...]

Thanks for sharing, Sam.

I think that part of the reason for the situation is that we grew the
number of options for deploying OpenStack. We originally only had Puppet
and Chef, but now there is Ansible, Juju, and the various
Kolla-consuming container-oriented approaches. There is a gravitational
attraction effect at play (more users -> more contributors) which
currently benefits Puppet, Ansible and Kolla, at the expense of
less-popular community-driven efforts like OpenStackChef and
OpenStackSalt. I expect this effect to continue. I have mixed feelings
about it: on one hand it reduces the available technical options, but on
the other it allows us to focus and raise quality...

There is one question I wanted to ask you in terms of community. We
maintain in OpenStack a number of efforts that bridge two communities,
and where the project could set up its infrastructure / governance in
one or the other. In the case of OpenStackChef, you could have set up
shop on the Chef community side, rather than on the OpenStack community
side. Would you say that living on the OpenStack community side helped
you or hurt you? Did you get enough help / visibility to balance the
constraints? Do you think you would have been more, less or equally
successful if you had set up shop more on the Chef community side?

--
Thierry Carrez (ttx)



responded Feb 15, 2017 by Thierry_Carrez (57,480 points)

On Feb 15, 2017, at 02:07, Thierry Carrez thierry@openstack.org wrote:

Samuel Cassiba wrote:

[...]
TL;DR if you don't want to keep going -
OpenStack-Chef is not in a good place and is not sustainable.
[...]

Thanks for sharing, Sam.

Thanks for taking the time to read and respond. This was as hard to write as it was to read. As time went on, it became apparent that this retrospective needed to exist. It was not written lightly, and does not aim to point fingers.

I think that part of the reasons for the situation is that we grew the
number of options for deploying OpenStack. We originally only had Puppet
and Chef, but now there is Ansible, Juju, and the various
Kolla-consuming container-oriented approaches. There is a gravitational
attraction effect at play (more users -> more contributors) which
currently benefits Puppet, Ansible and Kolla, to the expense of
less-popular community-driven efforts like OpenStackChef and
OpenStackSalt. I expect this effect to continue. I have mixed feelings
about it: on one hand it reduces available technical options, but on the
other it allows to focus and raise quality…

You have a very valid point. One need only look at the trends over the cycles in the User Survey to see this shift in most places. Ansible wins due to sheer simplicity for new deployments, but there are also real business decisions behind automation flavors at certain business sizes, which leave those businesses effectively married to whichever flavor they chose. That shift impacts Puppet’s overall user base as well, though they had, and still have, the luxury of maintaining sponsored support at higher numbers.

Chef’s sponsored support has numbered far fewer. It casts an extremely negative image on OpenStack when someone looks for help at odd hours, or asks something somewhere that none of us have time to track. The answer to that is the point of making noise: to generate conversation about avenues and solutions. I could have kept my fingers aimed at LP, Gerrit and IRC in an attempt to bury my head in the sand. We’re way past the point of denial, perhaps too far, but as long as the results of the User Survey show Chef, there are still users to support, for now. Operators and deployers will be looking to the source of truth, wherever that is, and right now that source of truth is OpenStack.

There is one question I wanted to ask you in terms of community. We
maintain in OpenStack a number of efforts that bridge two communities,
and where the project could set up its infrastructure / governance in
one or the other. In the case of OpenStackChef, you could have set up
shop on the Chef community side, rather than on the OpenStack community
side. Would you say that living on the OpenStack community side helped
you or hurt you ? Did you get enough help / visibility to balance the
constraints ? Do you think you would have been more, less or equally
successful if you had set up shop more on the Chef community side ?

We set up under Stackforge, later OpenStack, because the cookbooks evolved alongside OpenStack, as far back as 2011, before my time in the cookbooks. The earliest commits on the now-EOL Grizzly branch are quite enlightening; if only Stackalytics had the visuals. Maybe I’m biased, but that’s worth something.

You’re absolutely correct that we could have pushed more to set up the Chef side of things, and in fact we made several concerted efforts to integrate into the Chef community, up to and including having sponsored contributors, even a PTL. When exploring the Chef side, we found that we faced as much or more friction with the ecosystem, requiring more fundamental changes than we could influence. Chef (the ecosystem) has many great things, but Chef doesn’t OpenStack. Maybe that was the writing on the wall.

I keep one foot in both Chef and OpenStack, to keep myself as informed as time allows me. It’s clear that even Chef’s long-term cookbook support community is ill equipped to handle OpenStack. The problem? We’re too complex and too far integrated, and none of them know OpenStack. Where does that leave us?

--
Best,

Samuel Cassiba

--
Thierry Carrez (ttx)



responded Feb 15, 2017 by s_at_cassiba.com (1,200 points)  

On Wed, Feb 15, 2017 at 9:02 AM, Samuel Cassiba s@cassiba.com wrote:

On Feb 15, 2017, at 02:07, Thierry Carrez thierry@openstack.org wrote:

Samuel Cassiba wrote:

[...]
TL;DR if you don't want to keep going -
OpenStack-Chef is not in a good place and is not sustainable.
[...]

Thanks for sharing, Sam.

Thanks for taking the time to read and respond. This was as hard to write as it was to read. As time went on, it became apparent that this retrospective needed to exist. It was not written lightly, and does not aim to point fingers.

I think that part of the reasons for the situation is that we grew the
number of options for deploying OpenStack. We originally only had Puppet
and Chef, but now there is Ansible, Juju, and the various
Kolla-consuming container-oriented approaches. There is a gravitational
attraction effect at play (more users -> more contributors) which
currently benefits Puppet, Ansible and Kolla, to the expense of
less-popular community-driven efforts like OpenStackChef and
OpenStackSalt. I expect this effect to continue. I have mixed feelings
about it: on one hand it reduces available technical options, but on the
other it allows to focus and raise quality…

You have a very valid point. One need only look at the trends over the cycles in the User Survey to see this shift in most places. Ansible wins due to sheer simplicity for new deployments, but there are also real business decisions that go behind automation flavors at certain business sizes. This leaves them effectively married to whichever flavor chosen. That shift impacts Puppet’s overall user base, as well, though they had and still have the luxury of maintaining sponsored support at higher numbers.

To chime in on the Puppet side, we've seen a decrease in contributors
over the last several cycles, and I have a feeling we'll be in the same
boat in the near future. The number of modules that we have to try to
manage, versus the number of folks we have contributing, is getting to
an unmanageable state. I believe the only way we've gotten to where we
are is due to the use within Fuel and TripleO. As those projects evolve,
it directly impacts the ability of the Puppet modules to remain relevant.
Some could argue that's just the way it goes and technologies evolve,
which is true. But it's also a loss for many of the newer methods, as
they are losing all of the historical knowledge and understanding that
went with it and why some patterns work better than others. The software
wheel, it's getting reinvented every day.

Chef’s sponsored support has numbered far fewer. It casts an extremely negative image on OpenStack when someone looks for help at odd hours, or asks something somewhere that none of us have time to track. The answer to that is the point of making noise, to generate conversation about avenues and solutions. I could have kept my fingers aiming at LP, Gerrit and IRC in an attempt to bury my head in the sand. We’re way past the point of denial, perhaps too far, but as long as the results of the User Survey shows Chef, there are still users to support, for now. Operators and deployers will be looking to the source of truth, wherever that is, and right now that source of truth is OpenStack.

There is one question I wanted to ask you in terms of community. We
maintain in OpenStack a number of efforts that bridge two communities,
and where the project could set up its infrastructure / governance in
one or the other. In the case of OpenStackChef, you could have set up
shop on the Chef community side, rather than on the OpenStack community
side. Would you say that living on the OpenStack community side helped
you or hurt you ? Did you get enough help / visibility to balance the
constraints ? Do you think you would have been more, less or equally
successful if you had set up shop more on the Chef community side ?

We set up under Stackforge, later OpenStack, because the cookbooks evolved alongside OpenStack, as far back as 2011, before my time in the cookbooks. The earliest commits on the now EOL Grizzly branch were quite enlightening, if only Stackalytics had the visuals. Maybe I’m biased, but that’s worth something.

You’re absolutely correct that we could have pushed more to set up the Chef side of things, and in fact we made several concerted efforts to integrate into the Chef community, up to and including having sponsored contributors, even a PTL. When exploring the Chef side, we found that we faced as much or more friction with the ecosystem, requiring more fundamental changes than we could influence. Chef (the ecosystem) has many great things, but Chef doesn’t OpenStack. Maybe that was the writing on the wall.

I keep one foot in both Chef and OpenStack, to keep myself as informed as time allows me. It’s clear that even Chef’s long-term cookbook support community is ill equipped to handle OpenStack. The problem? We’re too complex and too far integrated, and none of them know OpenStack. Where does that leave us?

This is a concern I share with others, who have voiced it around
features/changes in the OpenStack projects and their impacts on the
deployment tools (and ultimately the end user). OpenStack continues to
get more complex, and the support tooling that has grown around it
(ansible, puppet, chef, charms) is struggling to keep up and stay
maintained, because there is not an easy way to stay informed on all
the changes for the entire ecosystem (and there are a ton of changes).
"Read the release notes" is not a great answer for the end user.
Neither is "Read the release notes for every project deployed and
figure it out". I see this as a community and culture problem within
OpenStack, and not necessarily a problem with the folks trying to
write the tooling to manage it.

Thanks,
-Alex

--
Best,

Samuel Cassiba

--
Thierry Carrez (ttx)


responded Feb 15, 2017 by aschultz_at_redhat.c (5,800 points)

On Feb 15, 2017, at 08:49, Alex Schultz aschultz@redhat.com wrote:

On Wed, Feb 15, 2017 at 9:02 AM, Samuel Cassiba s@cassiba.com wrote:

On Feb 15, 2017, at 02:07, Thierry Carrez thierry@openstack.org wrote:

Samuel Cassiba wrote:

[...]
TL;DR if you don't want to keep going -
OpenStack-Chef is not in a good place and is not sustainable.
[...]

Thanks for sharing, Sam.

Thanks for taking the time to read and respond. This was as hard to write as it was to read. As time went on, it became apparent that this retrospective needed to exist. It was not written lightly, and does not aim to point fingers.

I think that part of the reasons for the situation is that we grew the
number of options for deploying OpenStack. We originally only had Puppet
and Chef, but now there is Ansible, Juju, and the various
Kolla-consuming container-oriented approaches. There is a gravitational
attraction effect at play (more users -> more contributors) which
currently benefits Puppet, Ansible and Kolla, to the expense of
less-popular community-driven efforts like OpenStackChef and
OpenStackSalt. I expect this effect to continue. I have mixed feelings
about it: on one hand it reduces available technical options, but on the
other it allows to focus and raise quality…

You have a very valid point. One need only look at the trends over the cycles in the User Survey to see this shift in most places. Ansible wins due to sheer simplicity for new deployments, but there are also real business decisions that go behind automation flavors at certain business sizes. This leaves them effectively married to whichever flavor chosen. That shift impacts Puppet’s overall user base, as well, though they had and still have the luxury of maintaining sponsored support at higher numbers.

To chime in on the Puppet side, we've seen a decrease in contributors
over the last several cycles and I have a feeling we'll be in the same
boat in the near future. The amount of modules that we have to try
and manage versus the amount of folks that we have contributing is
getting to an unmanageable state. I believe the only way we've gotten
to where we have been is due to the use within Fuel and TripleO. As
those projects evolve, it directly impacts the ability for the Puppet
modules to remain relevant. Some could argue that's just the way it
goes and technologies evolve which is true. But it's also a loss for
many of the newer methods as they are losing all of the historical
knowledge and understanding that went with it and why some patterns
work better than others. The software wheel, it's getting reinvented
every day.

Thank you for your perspective from the Puppet side. The Survey data alone
paints a certain narrative, and not one I think people want. If OpenStack
deployment choice comes down to a popularity contest, the direct result is
fewer avenues back into OpenStack.

Fewer people will think to pick OpenStack as a viable option if it simply
doesn’t support their design, which means less exposure for non-core
projects, less feedback for core projects, rinse, repeat. Developers can and
would coalesce around but a couple of the most popular options, which works
if that’s the way things are intended to go. With that, the OpenStack story
starts to read less like an ecosystem and more like a distro, bordering on
echo chamber. I don’t think anyone signed up for that. On the other hand,
fewer deployment options allow for a more singular focus. Without all that
choice clouding decision-making, one has no way to OpenStack but those few
methods that everyone uses.

Chef’s sponsored support has numbered far fewer. It casts an extremely negative image on OpenStack when someone looks for help at odd hours, or asks something somewhere that none of us have time to track. The answer to that is the point of making noise, to generate conversation about avenues and solutions. I could have kept my fingers aiming at LP, Gerrit and IRC in an attempt to bury my head in the sand. We’re way past the point of denial, perhaps too far, but as long as the results of the User Survey shows Chef, there are still users to support, for now. Operators and deployers will be looking to the source of truth, wherever that is, and right now that source of truth is OpenStack.

There is one question I wanted to ask you in terms of community. We
maintain in OpenStack a number of efforts that bridge two communities,
and where the project could set up its infrastructure / governance in
one or the other. In the case of OpenStackChef, you could have set up
shop on the Chef community side, rather than on the OpenStack community
side. Would you say that living on the OpenStack community side helped
you or hurt you ? Did you get enough help / visibility to balance the
constraints ? Do you think you would have been more, less or equally
successful if you had set up shop more on the Chef community side ?

We set up under Stackforge, later OpenStack, because the cookbooks evolved alongside OpenStack, as far back as 2011, before my time in the cookbooks. The earliest commits on the now EOL Grizzly branch were quite enlightening, if only Stackalytics had the visuals. Maybe I’m biased, but that’s worth something.

You’re absolutely correct that we could have pushed more to set up the Chef side of things, and in fact we made several concerted efforts to integrate into the Chef community, up to and including having sponsored contributors, even a PTL. When exploring the Chef side, we found that we faced as much or more friction with the ecosystem, requiring more fundamental changes than we could influence. Chef (the ecosystem) has many great things, but Chef doesn’t OpenStack. Maybe that was the writing on the wall.

I keep one foot in both Chef and OpenStack, to keep myself as informed as time allows me. It’s clear that even Chef’s long-term cookbook support community is ill equipped to handle OpenStack. The problem? We’re too complex and too far integrated, and none of them know OpenStack. Where does that leave us?

This is a shared concern that I have along with others who have voiced
the concern around features/changes in the OpenStack projects and
their impacts on the deployment tools (and ultimately the end user).
OpenStack continues to get more complex and the support tooling that
has grown around it (ansible, puppet, chef, charms) is struggling to
keep up and maintain because there is not an easy way to stay informed
on all the changes for the entire ecosystem (and there are a ton of
changes). "Read the release notes" is not a great answer for the end
user. Neither is "Read the release notes for every project deployed
and figure it out". I see this problem is a community and culture
problem within OpenStack on and not necessarily a problem with the
folks trying to write the tooling to manage it.

For the cookbooks, every core and non-core project that is supported has to
be tracked. In addition, each supported platform must be tracked for quirks
and idiosyncrasies, because they always have them.

Then there are the cross-project teams that do the packaging, as well as the
teams that do not necessarily ship releases, all of which must be tracked for
variances in testing methods, mirrors outside the scope of infra, external
dependencies, etc. It can be slightly overwhelming at times, even for someone
reasonably seasoned. Scale that process, for every ecosystem in which one
desires to exist, by an order of magnitude.

There’s definitely a general undercurrent to all of this, and it’s bigger
than any one person or team to solve. We definitely can’t “read the release
notes” for this.

--
Best,

Samuel Cassiba

Thanks,
-Alex

--
Best,

Samuel Cassiba

--
Thierry Carrez (ttx)



responded Feb 16, 2017 by s_at_cassiba.com (1,200 points)  

For the cookbooks, every core and non-core project that is supported has to
be tracked. In addition to that, each platform that is supported must be
tracked, for quirks and idiosyncrasies, because they always have them.

Then, there are the cross-project teams that do the packaging, as well as
the teams that do not necessarily ship releases that must be tracked, for
variances in testing methods, mirrors outside the scope of infra, external
dependencies, etc. It can be slightly overwhelming and overloading at times,
even to someone reasonably seasoned. Scale that process, for every ecosystem
in which one desires to exist, by an order of magnitude.

There’s definitely a general undercurrent to all of this, and it’s bigger
than any one person or team to solve. We definitely can’t “read the release
notes” for this.

Radical idea: have each project (not libraries) contain a Dockerfile
that builds the project into a deployable unit (or multiple Dockerfiles
for projects with multiple components), and then it becomes the project's
responsibility to ensure that the right code is in that Dockerfile to
move from release to release (whether that be a piece of code that does
a configuration migration).

This is basically what kolla is doing (except kolla itself contains all
the Dockerfiles and deployment tooling as well), and though I won't
comment on the kolla perspective, if each project managed its own
Dockerfiles that wouldn't seem like a bad thing... (it may have been
proposed before).

Such a thing could move the responsibility (of at least the packaging
components and dependencies) onto the projects themselves. I've been in
the boat of trying to do all the packaging and tracking variances, and I
know it's some kind of hell; shifting the responsibility onto the
projects themselves may be a better solution (or at least one people
can discuss).
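To make the idea concrete from a Chef deployer's angle: if, say, Keystone
shipped its own image, a cookbook could consume it with the community docker
cookbook's resources. A hypothetical sketch (the image name, tag and port are
illustrative assumptions, not an existing published image):

    # Hypothetical recipe consuming a project-built image via the
    # community docker cookbook's docker_image/docker_container resources.
    docker_image 'keystone' do
      repo 'openstack/keystone'   # hypothetical per-project image
      tag  'ocata'
      action :pull
    end

    docker_container 'keystone' do
      repo 'openstack/keystone'
      tag  'ocata'
      port '5000:5000'            # keystone API port
      restart_policy 'always'
      action :run
    end

That would shift "how do I build keystone" onto the keystone tree, leaving
the deployment tools to handle wiring and configuration.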


responded Feb 16, 2017 by harlowja_at_fastmail (16,200 points)

On Thu, Feb 16, 2017 at 5:26 AM Joshua Harlow harlowja@fastmail.com wrote:

Radical idea, have each project (not libraries) contain a dockerfile
that builds the project into a deployable unit (or multiple dockerfiles
for projects with multiple components) and then it becomes the projects
responsibility for ensuring that the right code is in that dockerfile to
move from release to release (whether that be a piece of code that does
a configuration migration).

I've wondered about that approach, but worried about having the Docker
engine as a new dependency for each OpenStack node. Would that matter?
(Or are there other reasons why OpenStack nodes commonly already have
Docker on them?)


responded Feb 16, 2017 by neil_at_tigera.io (3,740 points)

On 16/02/2017 10:17, Neil Jerram wrote:

On Thu, Feb 16, 2017 at 5:26 AM Joshua Harlow <harlowja@fastmail.com> wrote:

Radical idea, have each project (not libraries) contain a dockerfile
that builds the project into a deployable unit (or multiple dockerfiles
for projects with multiple components) and then it becomes the projects
responsibility for ensuring that the right code is in that dockerfile to
move from release to release (whether that be a piece of code that does
a configuration migration).

I've wondered about that approach, but worried about having the Docker
engine as a new dependency for each OpenStack node. Would that matter?
(Or are there other reasons why OpenStack nodes commonly already have
Docker on them?)

And one could claim that each project should also maintain its Ansible
playbooks. And one could claim that each project should also maintain
its Chef cookbooks. And one could claim that each project should also
maintain its Puppet manifests.

I surely understand the problem stated here, and how difficult it is for a
deployment tool team to cope with the requirements that every project creates
each time it writes an upgrade impact.

For better or worse, as a service project developer, the only way to signal
a change is to write a release note. I'm not at all seasoned in the quirks
and specifics of any particular deployment tool, and it's always a hard time
figuring out whether what I write can break other things.

What could be the solution to that distributed-services problem? Well,
understanding each other's problems is certainly one of the solutions.
Getting more communication between teams can also certainly help. Having
consistent behaviours between heterogeneous deployment tools could also
be a thing.

That's an iterative approach, and it takes time. Sure, and that's
frustrating. But, please, keep in mind that we are all going in the same
direction.

-S


responded Feb 16, 2017 by Sylvain_Bauza (14,100 points)

Excerpts from Sylvain Bauza's message of 2017-02-16 10:55:14 +0100:

On 16/02/2017 10:17, Neil Jerram wrote:

On Thu, Feb 16, 2017 at 5:26 AM Joshua Harlow <harlowja@fastmail.com> wrote:

Radical idea, have each project (not libraries) contain a dockerfile
that builds the project into a deployable unit (or multiple dockerfiles
for projects with multiple components) and then it becomes the projects
responsibility for ensuring that the right code is in that dockerfile to
move from release to release (whether that be a piece of code that does
a configuration migration).

I've wondered about that approach, but worried about having the Docker
engine as a new dependency for each OpenStack node. Would that matter?
(Or are there other reasons why OpenStack nodes commonly already have
Docker on them?)

And one could claim that each project should also maintain its Ansible
playbooks. And one could claim that each project should also maintain
its Chef cookbooks. And one could claim that each project should also
maintain its Puppet manifests.

I surely understand the problem that it is stated here and how it is
difficult for a deployment tool team to cope with the requirements that
every project makes every time it writes an upgrade impact.

For the good or worst, as a service project developer, the only way to
signal the change is to write a release note. I'm not at all seasoned by
all the quirks and specifics of a specific deployment tool, and it's
always hard time for figuring out if what I write can break other things.

What could be the solution to that distributed services problem ? Well,
understanding each other problem is certainly one of the solutions.
Getting more communication between teams can also certainly help. Having
consistent behaviours between heteregenous deployment tools could also
be a thing.

Right. The liaison program used by other cross-project teams is
designed to deal with this communication gap by identifying someone
to focus on ensuring the communication happens. Perhaps we need to
apply that idea to some of the deployment projects as well.

Doug

That's an iterative approach, and that takes time. Sure, and that's
frustrating. But, please, keep in mind we all go into the same direction.

-S


responded Feb 16, 2017 by Doug_Hellmann (87,520 points)

-----Original Message-----
From: Doug Hellmann doug@doughellmann.com
Reply: OpenStack Development Mailing List (not for usage questions)

Date: February 16, 2017 at 07:06:25
To: openstack-dev openstack-dev@lists.openstack.org
Subject:  Re: [openstack-dev] [chef] Making the Kitchen Great Again: A
Retrospective on OpenStack & Chef

Excerpts from Sylvain Bauza's message of 2017-02-16 10:55:14 +0100:

On 16/02/2017 10:17, Neil Jerram wrote:

On Thu, Feb 16, 2017 at 5:26 AM Joshua Harlow <harlowja@fastmail.com> wrote:

Radical idea, have each project (not libraries) contain a dockerfile
that builds the project into a deployable unit (or multiple dockerfiles
for projects with multiple components) and then it becomes the projects
responsibility for ensuring that the right code is in that dockerfile to
move from release to release (whether that be a piece of code that does
a configuration migration).

I've wondered about that approach, but worried about having the Docker
engine as a new dependency for each OpenStack node. Would that matter?
(Or are there other reasons why OpenStack nodes commonly already have
Docker on them?)

And one could claim that each project should also maintain its Ansible
playbooks. And one could claim that each project should also maintain
its Chef cookbooks. And one could claim that each project should also
maintain its Puppet manifests.

I surely understand the problem that it is stated here and how it is
difficult for a deployment tool team to cope with the requirements that
every project makes every time it writes an upgrade impact.

For the good or worst, as a service project developer, the only way to
signal the change is to write a release note. I'm not at all seasoned by
all the quirks and specifics of a specific deployment tool, and it's
always hard time for figuring out if what I write can break other things.

What could be the solution to that distributed services problem ? Well,
understanding each other problem is certainly one of the solutions.
Getting more communication between teams can also certainly help. Having
consistent behaviours between heteregenous deployment tools could also
be a thing.

Right. The liaison program used by other cross-project teams is
designed to deal with this communication gap by identifying someone
to focus on ensuring the communication happens. Perhaps we need to
apply that idea to to some of the deployment projects as well.

I know the OpenStack-Ansible project went out of its way to try to
create a liaison program with the OpenStack services it works on. The
only engagement of it that I'm aware of has been from other
Rackspace employees whose management has told them to work on the
Ansible roles to ship the project they work on.

It seems like a lot of people view "developing OpenStack" as more
attractive than "making it easy to deploy OpenStack" (via the
deployment tools in the tent). Doing both is really something project
teams should do, but frankly I don't think shoving all of the
deployment tooling in-repo makes that any better. If we put our
tooling in repo, we'll likely then start collecting RPM/Debian/etc.
packaging in repo as well.

I think what would help improve willing bidirectional communication
between service and deployment/packaging teams would be an
understanding that without the deployment/packaging teams, the service
teams will likely be out of a job. People will/can not deploy
OpenStack without the tooling these other teams provide.

Cheers,
--
Ian Cordasco


responded Feb 16, 2017 by sigmavirus24_at_gmai (8,720 points)

On Thu, Feb 16, 2017 at 7:54 AM, Ian Cordasco sigmavirus24@gmail.com wrote:
It seems like a lot of people view "developing OpenStack" as more
attractive than "making it easy to deploy OpenStack" (via the
deployment tools in the tent). Doing both is really something project
teams should do, but I don't think shoving all of the deployment
tooling in repo makes that any better frankly. If we put our tooling
in repo, we'll likely then start collecting RPM/Debian/etc. packaging
in repo as well.

Part of it is the "deployment isn't shiny/new/features", but there
is also the historical view lingering that we ship tarballs (and
realistically, git repo tags) so as not to pick favorites and to keep
deployment and packaging out of the project repos directly.

We have >5 different tools that are "mainstream" to consider; not even
the large project teams have enough expertise to maintain those. I
had a hard enough time myself keeping both rpm and apt up to date in
previous projects, much less recipes, playbooks and the odd shell
script.

It is also hard to come into a community and demand that other
projects suddenly have to support your project just because you showed
up. (This works both ways: when we add a "Big Tent" project, all of
the downstreams are suddenly expected to add it.)

The solution I see is to have a mechanism by which important things
can be communicated from the project developers that can be consumed
by the packager/deployer community. Today, however sub-optimal it
seems to be, that is release notes. Maybe adding a specific section
for packaging would be helpful, but in practice that adds one more
thing to remember to write to a list that is already hard to get devs
to do.

As Doug mentions elsewhere in this thread, the liaison approach has
worked in other areas, and it may be a useful approach here. But I know
that as the PTL of a very small project I do not want 5 more hats to
wear. Maybe one.

We have encouraged the packaging groups to work together in the past,
with mixed results, but it seems to me that a lot of the things that
need to be discovered and learned in new releases will be similar for
all of the downstream consumers. Maybe if that downstream community
could reach a critical mass and suggest a single common way to
communicate the deployment considerations in a release it would get
more traction. And a single project liaison for the collective rather
than one for each deployment project.

I think what would help improve willing bidirectional communication
between service and deployment/packaging teams would be an
understanding that without the deployment/packaging teams, the service
teams will likely be out of a job. People will/can not deploy
OpenStack without the tooling these other teams provide.

True in a literal sense. This is also true for our users; without them,
nobody will deploy a cloud in the first place. That has not changed
anything, unfortunately: we still regularly give our users miserable
experiences, because it is too much effort to participate with the
(now comatose) UX team, or to prioritize a common log format, or to
reduce the number of knobs available to twiddle that make each cloud
a unique snowflake.

We can not sustain being all things to all people. Given the
contraction of investment being considered and implemented by many of
our foundation member companies, we will have to make some hard
decisions about where to spend the resources we have left. Some of
those decisions are made for us by those member companies promoting
their own priorities (your example of Ansible contributions is one).
But as a community we have an opportunity to express our desires and
potentially influence those decisions.

dt

--

Dean Troyer
dtroyer@gmail.com


responded Feb 16, 2017 by Dean_Troyer (13,100 points)
...