[openstack-dev] [nova] bug triage experimentation

0 votes

The Nova bug backlog is just over 800 open bugs, which, while not
terrible by historical standards, remains too large for us to use
collectively to figure out where things stand. We've had a few recent
issues where we only happened to discover upgrade bugs, filed 4 months
earlier, that needed fixes and backports.

Historically we've tried to solve the bug backlog with volunteers alone.
We've had many a brave person dive in here and burn out after 4 to 6
months, and we're currently without a bug lead. Having done a big giant
purge in the past
(http://lists.openstack.org/pipermail/openstack-dev/2014-September/046517.html),
I know how daunting this all can be.

I don't think that people can solve the bug triage problem at the
workload it currently creates. We've got to reduce the part of that
workload that needs a smart human.

But I think that we can also learn some lessons from what active GitHub
projects do.

1 Bot away bad states

There are known bad states of bugs: In Progress with no open patch, and
Assigned but not In Progress. We can just bot these away with scripts.
Even better would be to react immediately to bugs in those states, which
helps train folks in how to use our workflow. I've got some starter
scripts for this up at https://github.com/sdague/nova-bug-tools
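As a rough illustration of the two bad states named above, here is a minimal Python sketch. It is not taken from nova-bug-tools: the Bug class, its fields, and the bad_state helper are hypothetical, and a real bot would have to fill them in from the tracker and from code review.

```python
from dataclasses import dataclass
from typing import Optional

# Open statuses in which work has not started yet (assumed subset of the
# Launchpad status list; closed statuses are deliberately left out).
NOT_STARTED = {"New", "Incomplete", "Confirmed", "Triaged"}


@dataclass
class Bug:
    """Hypothetical, simplified view of a bug; not a Launchpad API type."""
    id: int
    status: str
    assignee: Optional[str]   # None when nobody is assigned
    open_patches: int         # open reviews that reference this bug


def bad_state(bug: Bug) -> Optional[str]:
    """Return a short reason if the bug is in a known bad state, else None."""
    if bug.status == "In Progress" and bug.open_patches == 0:
        return "In Progress with no open patch"
    if bug.assignee is not None and bug.status in NOT_STARTED:
        return "Assigned but not In Progress"
    return None


bugs = [
    Bug(1, "In Progress", "alice", 0),
    Bug(2, "Confirmed", "bob", 0),
    Bug(3, "In Progress", "carol", 1),
    Bug(4, "New", None, 0),
]
flagged = {b.id: bad_state(b) for b in bugs if bad_state(b)}
print(flagged)
```

What the bot then does with a flagged bug (reset the status, drop the assignee, leave a comment) is a policy choice the sketch leaves open.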

2 Use tag based workflow

One lesson from GitHub projects is that the GitHub tracker has no
workflow. Issues are open or closed. Workflow has to be invented by
every team based on a set of tags. Sometimes that's annoying, but often
it's super handy, because it allows the workflow to change without
having to change the meaning of things like "Confirmed vs. Triaged" in
your mind.

We can probably tag for information we know we need a lot more easily.
I'm considering something like

  • needs.system-version
  • needs.openstack-version
  • needs.logs
  • needs.subteam-feedback
  • has.system-version
  • has.openstack-version
  • has.reproduce

For some of these, a bot can process the bug text, tell whether that
info was provided, and comment on how to provide what is missing. Some
of this would be human, but having official tags would probably help.
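A minimal sketch of how a bot might derive some of these tags from the bug text. The tag names come from the list above; the regular expressions and the suggest_tags helper are illustrative assumptions, not a vetted rule set, and real reports would need much broader patterns.

```python
import re

# Illustrative heuristics only (assumed): a few release names, a few
# distro names followed by a number, and common signs of pasted logs.
OPENSTACK_VERSION = re.compile(r"\b(mitaka|newton|ocata|pike|master)\b", re.IGNORECASE)
SYSTEM_VERSION = re.compile(r"\b(ubuntu|centos|rhel|fedora|debian|sles)\s*\d+", re.IGNORECASE)
LOG_EVIDENCE = re.compile(r"Traceback \(most recent call last\)|\b(ERROR|TRACE)\b")


def suggest_tags(description: str) -> set:
    """Guess has.* / needs.* tags from the free text of a bug report."""
    tags = set()
    if OPENSTACK_VERSION.search(description):
        tags.add("has.openstack-version")
    else:
        tags.add("needs.openstack-version")
    if SYSTEM_VERSION.search(description):
        tags.add("has.system-version")
    else:
        tags.add("needs.system-version")
    # The proposed list has needs.logs but no has.logs, so only flag absence.
    if not LOG_EVIDENCE.search(description):
        tags.add("needs.logs")
    return tags


complete = ("Live migration fails on Ubuntu 16.04 with Ocata.\n"
            "Traceback (most recent call last):\n  ...")
sparse = "nova boot does not work"
print(sorted(suggest_tags(complete)))
print(sorted(suggest_tags(sparse)))
```

Tags like needs.subteam-feedback or has.reproduce are left to humans here, since nothing in the text reliably signals them.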

3 machine assisted functional tagging

I'm playing around with some things that might be useful in mapping new
bugs into existing functional buckets like: libvirt, volumes, etc. We'll
see how useful it ends up being.

4 reporting on smaller slices

Build some tooling to report on the status and change over time of bugs
under various tags. This will help visualize how we are doing
(hopefully) and where the biggest piles of issues are.

The intent is that the normal unit of interaction would be one of these
smaller piles, be it the 76 libvirt bugs, the 61 volumes bugs, or the 36
vmware bugs. It would also highlight the rates of change in these piles,
and what's getting attention and what is not.
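A small sketch of what per-tag reporting could look like, assuming two snapshots of open bugs are available as (bug id, tags) pairs. The data below is made up and the function names are hypothetical.

```python
from collections import Counter


def count_by_tag(bugs):
    """bugs: iterable of (bug_id, tags) pairs -> Counter of open bugs per tag."""
    counts = Counter()
    for _bug_id, tags in bugs:
        counts.update(set(tags))  # set() so a repeated tag counts once per bug
    return counts


def change_report(previous, current):
    """Return {tag: (current_count, delta)} for every tag in either snapshot."""
    return {tag: (current[tag], current[tag] - previous[tag])
            for tag in sorted(set(previous) | set(current))}


# Made-up snapshots of open bugs, one week apart.
last_week = [(1, ["libvirt"]), (2, ["libvirt", "volumes"]), (3, ["vmware"])]
this_week = [(1, ["libvirt"]), (2, ["libvirt", "volumes"]), (4, ["volumes"])]

report = change_report(count_by_tag(last_week), count_by_tag(this_week))
for tag, (count, delta) in report.items():
    print(f"{tag:10s} {count:3d} ({delta:+d})")
```

Storing one such count per tag per day would be enough to plot the rate of change for each pile.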

This is going to be kind of an ongoing experiment, but as we currently
have no one spearheading bug triage, it seemed like a good time to try
this out.

Comments and other suggestions are welcome. The tooling will have the
nova flow in mind, but I'm trying to make all the scripts take a project
name as a parameter, so anyone can use them. It's a little hack and
slash right now while I discover what the right patterns are.

-Sean

--
Sean Dague
http://dague.net


OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
asked Jul 26, 2017 in openstack-dev by Sean_Dague (66,200 points)   4 8 14

25 Responses

0 votes

Sean,

This sounds amazing and Swift could definitely use some [automated]
assistance here. It would help if you could throw out a WIP somewhere.

First thought that comes to mind tho .... storyboard.o.o :\

-Clay

responded Jun 23, 2017 by Clay_Gerrard (5,800 points)   1 2 2
0 votes

On 23/06/2017 18:52, Sean Dague wrote:

I don't think that people can currently solve the bug triage problem at
the current workload that it creates. We've got to reduce the smart
human part of that workload.

Thanks for sharing ideas, Sean.

But, I think that we can also learn some lessons from what active github
projects do.

1 Bot away bad states

There are known bad states of bugs - In Progress with no open patch,
Assigned but not In Progress. We can just bot these away with scripts.
Even better would be to react immediately on bugs like those, that helps
to train folks how to use our workflow. I've got some starter scripts
for this up at - https://github.com/sdague/nova-bug-tools

Sometimes, for reasons I never figured out, I noticed the Gerrit hook
not working (i.e. it did not amend the Launchpad bug with the Gerrit
URL), so some of the bugs I was looking at were actually being worked on
(and I had the same experience myself, although my commit message was
correctly marked, AFAIR).

Either way, what you propose sounds reasonable to me. If you care enough
about fixing a bug to make yourself its owner, that also means you
commit yourself to a resolution sooner rather than later (even if I do
fail to apply that to myself...).

2 Use tag based workflow

One lesson from github projects, is the github tracker has no workflow.
Issues are opened or closed. Workflow has to be invented by every team
based on a set of tags. Sometimes that's annoying, but often times it's
super handy, because it allows the tracker to change workflows and not
try to change the meaning of things like "Confirmed vs. Triaged" in your
mind.

We can probably tag for information we know we need a lot easier. I'm
considering something like

  • needs.system-version
  • needs.openstack-version
  • needs.logs
  • needs.subteam-feedback
  • has.system-version
  • has.openstack-version
  • has.reproduce

Some of these a bot can process the text on and tell if that info was
provided, and comment how to provide the updated info. Some of this
would be human, but with official tags, it would probably help.

The tags you propose seem to me to relate to the "Incomplete" vs.
"Confirmed" state of the bug.

If I'm not able to triage the bug because I'm missing information like
the release version or more logs, I mark the bug as Incomplete.
I could add those tags, but I don't see where a programmatic approach
could help us.

If I understand correctly, you're rather trying to identify what's
missing in the bug report to provide a clear path to resolution, so we
could mark the bug as Triaged, right? If so, I'd not propose those tags,
for the reason I just gave, but rather other tags like (disclaimer, I
suck at naming things):

  • rootcause.found
  • needs.rootcause.analysis
  • is.regression
  • reproduced.locally

3 machine assisted functional tagging

I'm playing around with some things that might be useful in mapping new
bugs into existing functional buckets like: libvirt, volumes, etc. We'll
see how useful it ends up being.

Log parsing could certainly help. If someone is able to provide a clear
stack trace of some root exception, we get the impacted functional
bucket for free in 80% of cases.

I'm not a fan of identifying a domain by text recognition (just because
someone mentions libvirt doesn't mean it's a libvirt bug, tho), so
that's why I'd rather see some log analysis like I mentioned.
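To make the log analysis idea concrete, here is a minimal sketch that picks a bucket from the file paths in a Python traceback rather than from keywords in the prose. The path-to-bucket mapping, the helper name, and the sample traceback are all assumptions made up for illustration.

```python
import re
from collections import Counter

# Assumed mapping from source path fragments to functional buckets.
BUCKETS = [
    ("nova/virt/libvirt/", "libvirt"),
    ("nova/virt/vmwareapi/", "vmware"),
    ("nova/volume/", "volumes"),
]
# Matches the frame lines of a standard Python traceback.
FRAME = re.compile(r'File "([^"]+)", line \d+')


def bucket_from_traceback(text):
    """Return the bucket hit by the most traceback frames, or None."""
    hits = Counter()
    for path in FRAME.findall(text):
        for fragment, bucket in BUCKETS:
            if fragment in path:
                hits[bucket] += 1
    return hits.most_common(1)[0][0] if hits else None


# Made-up sample, shaped like a pasted traceback.
sample = '''Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 100, in build
    yield resources
  File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 200, in spawn
    self._create_domain()
'''
print(bucket_from_traceback(sample))
print(bucket_from_traceback("nova is broken, I think it's libvirt"))
```

The second call returns None: a report that only mentions libvirt in prose is not bucketed, which is the distinction drawn above.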

4 reporting on smaller slices

Build some tooling to report on the status and change over time of bugs
under various tags. This will help visualize how we are doing
(hopefully) and where the biggest piles of issues are.

The intent is the normal unit of interaction would be one of these
smaller piles. Be they the 76 libvirt bugs, 61 volumes bugs, or 36
vmware bugs. It would also highlight the rates of change in these piles,
and what's getting attention and what is not.

I do wonder if Markus already wrote such reporting tools. AFAIR, he had
a couple of very interesting reports (and he also worked hard on the
bug taxonomy), so we could potentially leverage those.

-Sylvain

responded Jun 26, 2017 by Sylvain_Bauza (14,100 points)   1 3 5
0 votes

On 06/26/2017 04:49 AM, Sylvain Bauza wrote:

The tags you propose seem to me related to an "Incomplete" vs.
"Confirmed" state of the bug.

If I'm not able to triage the bug because I'm missing information like
the release version or more logs, I mark the bug as Incomplete.
I could add those tags, but I don't see where a programmatic approach
could help us.

We always want that information, and the odds of us getting it from a
user decline over time. When we end up looking at bugs that are a year
old, it becomes a big guessing game about their relevance.

The theory here is that tags like that would be applied by a bot
immediately after the bug is filed. Catching the owner within 5 minutes
of their filing the bug with a response that says "we need the
following" means we should get a pretty decent attach rate on that
information. And then you don't have to spend 10 minutes of real human
time realizing that you really need that before moving forward.

-Sean

--
Sean Dague
http://dague.net


responded Jun 26, 2017 by Sean_Dague (66,200 points)   4 8 14
0 votes

On 26.06.2017 10:49, Sylvain Bauza wrote:

I do wonder if Markus already wrote such reporting tools. AFAIR, he had
a couple of very interesting reports (and he also worked hard on the
bug taxonomy), so we could potentially leverage those.

-Sylvain

The things I had/have are:

http://markuszoeller.github.io/posts/2016/03/06/grafana-graphite-statsd/#collect-and-push-custom-metrics

Most of the time in the past, I asked for logs, versions and
configuration. Tooling for that could save some back-and-forth.


--
Regards, Markus Zoeller (markus_z)


responded Jun 26, 2017 by mzoeller_at_linux.vn (3,220 points)   3 5
0 votes

On 06/23/2017 11:52 AM, Sean Dague wrote:

1 Bot away bad states

There are known bad states of bugs - In Progress with no open patch,
Assigned but not In Progress. We can just bot these away with scripts.
Even better would be to react immediately on bugs like those, that helps
to train folks how to use our workflow. I've got some starter scripts
for this up at - https://github.com/sdague/nova-bug-tools

Just saw the update on https://bugs.launchpad.net/nova/+bug/1698010 and
I don't agree that assigned but not in progress is an invalid state. If
it persists for a period of time then sure, but to me assigning yourself
a bug is a signal that you're working on it and nobody else needs to.
Otherwise you end up with multiple people working a bug without
realizing someone else already was. I've seen that happen more than once.

Would it be possible to only un-assign such bugs if they've been in that
state for a week? At that point it seems safe to say the assignee has
either moved on or that the bug is tricky and additional input would be
useful anyway.

Otherwise, big +1 to a bug bot. I need to try running it against the
~700 open TripleO bugs...

responded Jun 28, 2017 by Ben_Nemec (19,660 points)   2 3 3
0 votes

On 06/28/2017 10:33 AM, Ben Nemec wrote:

Just saw the update on https://bugs.launchpad.net/nova/+bug/1698010 and
I don't agree that assigned but not in progress is an invalid state. If
it persists for a period of time then sure, but to me assigning yourself
a bug is a signal that you're working on it and nobody else needs to.
Otherwise you end up with multiple people working a bug without
realizing someone else already was. I've seen that happen more than once.

The other case, where folks assign themselves and never do anything,
happens about 100 times a month.

We don't live in an exclusive lock environment: anyone can push a fix
for a bug, and gerrit assigns it to them. I don't see why we'd treat LP
any differently. Yes, this sometimes leads to duplicate fixes; however,
in the current model it's far more frequent for bugs to be blocked away
as "assigned" when no one is working on them.

A future version might be smarter and give folks a 7 day window or
something, but parsing back the history to understand the right logic
there is tricky enough that it's a future enhancement at best.
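For illustration only, the window check itself is simple once the assignment time is known. The sketch below assumes that timestamp has already been recovered from the bug history, which is the tricky part; the function and parameter names are hypothetical.

```python
from datetime import datetime, timedelta

GRACE = timedelta(days=7)  # the "7 day window" floated above


def should_unassign(status, assigned_at, now, grace=GRACE):
    """True if the bug has sat assigned, but not In Progress, longer than grace.

    assigned_at is assumed to be known already; extracting it from the
    bug's activity history is not shown here.
    """
    if assigned_at is None or status == "In Progress":
        return False
    return now - assigned_at > grace


now = datetime(2017, 6, 28)
print(should_unassign("Confirmed", datetime(2017, 6, 27), now))    # assigned 1 day ago
print(should_unassign("Confirmed", datetime(2017, 6, 20), now))    # assigned 8 days ago
print(should_unassign("In Progress", datetime(2017, 6, 1), now))   # already in progress
```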

-Sean

--
Sean Dague
http://dague.net


responded Jun 28, 2017 by Sean_Dague (66,200 points)   4 8 14
0 votes

On 28.06.2017 16:50, Sean Dague wrote:

The other case, where folks assign themselves and never do anything,
happens about 100 times a month.

We don't live in an exclusive lock environment, anyone can push a fix
for a bug, and gerrit assigns it to them. I don't see why we'd treat LP
any differently. Yes, this sometimes leads to duplicate fixes, however
in the current model it's far more frequent for bugs to be blocked away
as "assigned" when no one is working on them.

A future version might be smarter and give folks a 7 day window or
something, but parsing back the history to understand the right logic
there is tricky enough that it's a future enhancement at best.
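
For what it's worth, the grace-window check itself could look roughly like the sketch below. It assumes the hard part (digging the assignment date out of the bug history) is already done; the `Bug` record, `assigned_at` field and `should_unassign` name are made up for illustration and are not the Launchpad API or anything in nova-bug-tools.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

GRACE_PERIOD = timedelta(days=7)  # assumed window, per the "7 day" idea above


@dataclass
class Bug:
    # Hypothetical, simplified record; not the Launchpad API object.
    status: str
    assignee: Optional[str]
    assigned_at: Optional[datetime]  # when the current assignee was set


def should_unassign(bug: Bug, now: datetime,
                    grace: timedelta = GRACE_PERIOD) -> bool:
    """True if the bug is assigned, not In Progress, and past the grace window."""
    if bug.assignee is None or bug.status == "In Progress":
        return False
    if bug.assigned_at is None:
        # Unknown assignment date: leave the bug alone rather than guess.
        return False
    return now - bug.assigned_at > grace
```

Leaving bugs alone when the assignment date is unknown is a deliberately conservative choice, so a bot built this way would only touch bugs it can date.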

-Sean

+1 That happened so frequently I made a query for that:
http://45.55.105.55:8082/bugs-dashboard.html#tabInProgressStale

After poking people to get the actual state, 99% of the time the answer
was "I couldn't work on it, please remove my assignment".

--
Regards, Markus Zoeller (markus_z)


responded Jun 29, 2017 by mzoeller_at_linux.vn (3,220 points)   3 5

On Wed, Jun 28, 2017 at 7:33 AM, Ben Nemec openstack@nemebean.com wrote:

On 06/23/2017 11:52 AM, Sean Dague wrote:

1 Bot away bad states

There are known bad states of bugs - In Progress with no open patch,
Assigned but not In Progress. We can just bot these away with scripts.
Even better would be to react immediately on bugs like those, that helps
to train folks how to use our workflow. I've got some starter scripts
for this up at - https://github.com/sdague/nova-bug-tools

Just saw the update on https://bugs.launchpad.net/nova/+bug/1698010 and I
don't agree that assigned but not in progress is an invalid state. If it
persists for a period of time then sure, but to me assigning yourself a bug
is a signal that you're working on it and nobody else needs to. Otherwise
you end up with multiple people working a bug without realizing someone else
already was. I've seen that happen more than once.

Would it be possible to only un-assign such bugs if they've been in that
state for a week? At that point it seems safe to say the assignee has
either moved on or that the bug is tricky and additional input would be
useful anyway.

Otherwise, big +1 to a bug bot. I need to try running it against the ~700
open TripleO bugs...

I just tried; please send complaints to me if I broke something.

Sean, the tool is really awesome and I was wondering if we could move
it to https://github.com/openstack-infra/release-tools so we
centralize the tools.

Thanks,

2 Use tag based workflow

One lesson from github projects is that the github tracker has no workflow.
Issues are opened or closed. Workflow has to be invented by every team
based on a set of tags. Sometimes that's annoying, but often it's
super handy, because it allows the tracker to change workflows and not
try to change the meaning of things like "Confirmed vs. Triaged" in your
mind.

We can probably tag for information we know we need a lot easier. I'm
considering something like

  • needs.system-version
  • needs.openstack-version
  • needs.logs
  • needs.subteam-feedback
  • has.system-version
  • has.openstack-version
  • has.reproduce

For some of these, a bot can process the bug text, tell whether that info
was provided, and comment on how to provide the missing info. Some of this
would be human, but with official tags, it would probably help.
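
A rough sketch of the bot side of this, using a few of the tag names from the list above. The regexes and the `suggest_tags` helper are assumptions made up for illustration; real bug reports vary a lot, so the patterns would need tuning against actual bugs.

```python
import re

# Assumed patterns; only a guess at what reporters typically write.
RELEASE_RE = re.compile(r"\b(newton|ocata|pike|master)\b", re.IGNORECASE)
TRACEBACK_RE = re.compile(r"Traceback \(most recent call last\)")


def suggest_tags(description: str) -> set:
    """Return has.*/needs.* tags suggested by the bug description text."""
    tags = set()
    if RELEASE_RE.search(description):
        tags.add("has.openstack-version")
    else:
        tags.add("needs.openstack-version")
    if not TRACEBACK_RE.search(description):
        # No traceback found, so ask the reporter for logs.
        tags.add("needs.logs")
    return tags
```

A bot could then apply the suggested tags and leave a comment explaining how to supply whatever is still marked needs.*.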

3 machine assisted functional tagging

I'm playing around with some things that might be useful in mapping new
bugs into existing functional buckets like: libvirt, volumes, etc. We'll
see how useful it ends up being.

4 reporting on smaller slices

Build some tooling to report on the status and change over time of bugs
under various tags. This will help visualize how we are doing
(hopefully) and where the biggest piles of issues are.

The intent is the normal unit of interaction would be one of these
smaller piles. Be they the 76 libvirt bugs, 61 volumes bugs, or 36
vmware bugs. It would also highlight the rates of change in these piles,
and what's getting attention and what is not.
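
As a minimal sketch of that kind of report, assuming each open bug has already been reduced to its list of tags (the `count_by_tag` and `change_over_time` helpers are hypothetical names, not part of any existing tool):

```python
from collections import Counter


def count_by_tag(bugs):
    """Count open bugs per tag; `bugs` is an iterable of tag lists."""
    counts = Counter()
    for tags in bugs:
        counts.update(set(tags))  # count each tag once per bug
    return counts


def change_over_time(before, after):
    """Per-tag difference between two snapshots (after minus before)."""
    all_tags = sorted(set(before) | set(after))
    return {tag: after.get(tag, 0) - before.get(tag, 0) for tag in all_tags}
```

Taking one snapshot per week and diffing consecutive snapshots would show which piles are growing and which are getting attention.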

This is going to be kind of an ongoing experiment, but as we currently
have no one spearheading bug triage, it seemed like a good time to try
this out.

Comments and other suggestions are welcomed. The tooling will have the
nova flow in mind, but I'm trying to make it so it takes a project name
as params on all the scripts, so anyone can use it. It's a little hack
and slash right now to discover what the right patterns are.

    -Sean



--
Emilien Macchi


responded Jul 5, 2017 by emilien_at_redhat.co (36,940 points)   2 6 9

On Wed, Jun 28, 2017 at 7:33 AM, Ben Nemec openstack@nemebean.com wrote:

On 06/23/2017 11:52 AM, Sean Dague wrote:

1 Bot away bad states

There are known bad states of bugs - In Progress with no open patch,
Assigned but not In Progress. We can just bot these away with scripts.
Even better would be to react immediately on bugs like those, that helps
to train folks how to use our workflow. I've got some starter scripts
for this up at - https://github.com/sdague/nova-bug-tools

Just saw the update on https://bugs.launchpad.net/nova/+bug/1698010 and I
don't agree that assigned but not in progress is an invalid state. If it
persists for a period of time then sure, but to me assigning yourself a bug
is a signal that you're working on it and nobody else needs to. Otherwise
you end up with multiple people working a bug without realizing someone else
already was. I've seen that happen more than once.

I agree with you, Ben. I only ran this query against old bugs and then
stopped, so bugs filed after March of this year should not have been
modified (let me know if any were, and I'll fix it).

A grace period of 7 days is probably a good idea.

Would it be possible to only un-assign such bugs if they've been in that
state for a week? At that point it seems safe to say the assignee has
either moved on or that the bug is tricky and additional input would be
useful anyway.

Otherwise, big +1 to a bug bot. I need to try running it against the ~700
open TripleO bugs...


--
Emilien Macchi


responded Jul 5, 2017 by emilien_at_redhat.co (36,940 points)   2 6 9

On Fri, Jun 23, 2017 at 9:52 AM, Sean Dague sean@dague.net wrote:
Comments and other suggestions are welcomed. The tooling will have the
nova flow in mind, but I'm trying to make it so it takes a project name
as params on all the scripts, so anyone can use it. It's a little hack
and slash right now to discover what the right patterns are.

I also believe that some of the scripts could be transformed into
native features of Storyboard where bugs could be auto-triaged
periodically without human intervention.
Maybe it would convince more OpenStack projects to leave Launchpad and
adopt Storyboard?
I would certainly be one of those and would propose such a change for
TripleO & related projects.

Thanks,

    -Sean

--
Sean Dague
http://dague.net



--
Emilien Macchi


responded Jul 5, 2017 by emilien_at_redhat.co (36,940 points)   2 6 9
...