settingsLogin | Registersettings

[openstack-dev] [all][infra][tc][ptl] Scaling up code review process (subdir cores)

0 votes

Hi stackers,

Issue
-------

Projects are becoming bigger and bigger overtime.
More and more people would like to contribute code and usually core
reviewers
team can't scale enough. It's very hard to find people that understand full
project and have enough time to do code reviews. As a result team is very
small under heavy load and many maintainers just get burned out.

We have to solve this issue to move forward.

Idea
------

Let's introduce subsystems cores.

Really it's hard to find cores that understand whole project, but it's
quite simple to find people that can maintain subsystems of project.

How To
-----------

Gerrit is not so simple as it looks and it has really neat features ;)

For example we can write own rules about who can put +2 and merge patch
based on changes files.

We can make special "subdirectory core" ACL group.
People from such ACL group will be able to merge changes that touch only
files from some specific subdirs.

As a result with proper organization of directories in project we can scale
up review process without losing quality.

Thoughts?

Best regards,
Boris Pavlovic


OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
asked Jun 2, 2015 in openstack-dev by boris_at_pavlovic.me (6,900 points)   1 4 7

42 Responses

0 votes

I like this idea. I agree, as things grow it is probably easier to find
folks that know certain areas a project rather than the full scope.

This could be a good way to handle the load and delegate some pieces
(such as driver reviews) to a different set of people.

On 06/02/2015 04:24 PM, Boris Pavlovic wrote:
Hi stackers,

Issue
-------

Projects are becoming bigger and bigger overtime.
More and more people would like to contribute code and usually core
reviewers
team can't scale enough. It's very hard to find people that understand
full project and have enough time to do code reviews. As a result team
is very small under heavy load and many maintainers just get burned out.

We have to solve this issue to move forward.

Idea
------

Let's introduce subsystems cores.

Really it's hard to find cores that understand whole project, but it's
quite simple to find people that can maintain subsystems of project.

How To
-----------
*
*
Gerrit is not so simple as it looks and it has really neat features ;)

For example we can write own rules about who can put +2 and merge
patch based on changes files.

We can make special "subdirectory core" ACL group.
People from such ACL group will be able to merge changes that touch
only files from some specific subdirs.

As a result with proper organization of directories in project we can
scale up review process without losing quality.


OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
responded Jun 2, 2015 by Sean_McGinnis (11,820 points)   2 3 6
0 votes

On 6/2/15, 16:24, "Boris Pavlovic" boris@pavlovic.me wrote:

Hi stackers,

Issue


Projects are becoming bigger and bigger overtime.
More and more people would like to contribute code and usually core
reviewers
team can't scale enough. It's very hard to find people that understand
full project and have enough time to do code reviews. As a result team is
very small under heavy load and many maintainers just get burned out.

We have to solve this issue to move forward.

Idea


Let's introduce subsystems cores.

Really it's hard to find cores that understand whole project, but it's
quite simple to find people that can maintain subsystems of project.

How To


Gerrit is not so simple as it looks and it has really neat features ;)

For example we can write own rules about who can put +2 and merge patch
based on changes files.

We can make special "subdirectory core" ACL group.
People from such ACL group will be able to merge changes that touch only
files from some specific subdirs.

As a result with proper organization of directories in project we can
scale up review process without losing quality.

Thoughts?

Best regards,
Boris Pavlovic

I like this very much. I recall there was a session at the summit about
this that Thierry and Kyle led. If I recall correctly, the discussion
mentioned that it wasn't (at this point in time) possible to use gerrit
the way you describe it, but perhaps people were mistaken?

If we can do this exactly as you describe it, that would be awesome. If
there's a problem in limiting people to what files they can approve
changes for, then an alteration might be that those people get +2 but not
+W. This provides a signal to whomever has +W that the review is very much
ready to be merged. Does that sound fair?

Cheers,
Ian


OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
responded Jun 2, 2015 by Ian_Cordasco (5,340 points)   1 3 4
0 votes

On 2 June 2015 at 23:59, Ian Cordasco ian.cordasco@rackspace.com wrote:

On 6/2/15, 16:24, "Boris Pavlovic" boris@pavlovic.me wrote:

Hi stackers,

Issue


Projects are becoming bigger and bigger overtime.
More and more people would like to contribute code and usually core
reviewers
team can't scale enough. It's very hard to find people that understand
full project and have enough time to do code reviews. As a result team is
very small under heavy load and many maintainers just get burned out.

We have to solve this issue to move forward.

Idea


Let's introduce subsystems cores.

Really it's hard to find cores that understand whole project, but it's
quite simple to find people that can maintain subsystems of project.

How To


Gerrit is not so simple as it looks and it has really neat features ;)

For example we can write own rules about who can put +2 and merge patch
based on changes files.

We can make special "subdirectory core" ACL group.
People from such ACL group will be able to merge changes that touch only
files from some specific subdirs.

As a result with proper organization of directories in project we can
scale up review process without losing quality.

Thoughts?

Best regards,
Boris Pavlovic

I like this very much. I recall there was a session at the summit about
this that Thierry and Kyle led.

Indeed, and Kyle has already transformed that into facts [1]

If I recall correctly, the discussion
mentioned that it wasn't (at this point in time) possible to use gerrit
the way you describe it, but perhaps people were mistaken?

I recall that too, and I also recall fungi stating the same thing back in
Paris.
Gerrit doesn't really have a concept of subsystems, as far as I can
understand; in theory gerrit could be changed to support this, but that's
another discussion.
The networking community is currently adopting multiple repositories to
this aim. This has worked very well for 3rd party plugins, and quite well
for advanced services.
For the 'neutron' proper project, which is however large enough to identify
multiple subsystems in it, the lieutenant mode described in [1] will be
enforced with a bit of common sense - from what I gather. If you're a core
for subsystem X, nominated by its lieutenant, you're not supposed to +/-2
patches that only marginally affect your subsystem or do not affect it at
all.

If we can do this exactly as you describe it, that would be awesome.

If
there's a problem in limiting people to what files they can approve
changes for, then an alteration might be that those people get +2 but not
+W. This provides a signal to whomever has +W that the review is very much
ready to be merged. Does that sound fair?

neutron-specs adopts this approach (all cores can +2 but only a handful can
+A).
I think it works, in the assumption of a lieutenant systems, but for
projects with a large patch turnaround might constitute a bottleneck,
especially when there are gate-breaking issues that need to be approved
ASAP.
Generally speaking, I believe having 2 ties of cores (those with +A rights
and those without) is an experiment worth doing. I don't think it creates
an "elite" among developers; on the other hand, it gives SMEs a chance to
have a greater impact.

Cheers,
Ian

Salvatore

[1]
http://git.openstack.org/cgit/openstack/neutron/tree/doc/source/policies/core-reviewers.rst


OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
responded Jun 2, 2015 by Salvatore_Orlando (12,280 points)   2 5 8
0 votes

On 2015-06-02 21:59:34 +0000 (+0000), Ian Cordasco wrote:
I like this very much. I recall there was a session at the summit
about this that Thierry and Kyle led. If I recall correctly, the
discussion mentioned that it wasn't (at this point in time)
possible to use gerrit the way you describe it, but perhaps people
were mistaken?
[...]

It wasn't an option at the time. What's being conjectured now is
that with custom Prolog rules it might be possible to base Gerrit
label permissions on strict file subsets within repos. It's
nontrivial, as of yet I've seen no working demonstration, and we'd
still need the Infrastructure Team to feel comfortable supporting it
even if it does turn out to be technically possible. But even before
going down the path of automating/enforcing it anywhere in our
toolchain, projects interested in this workflow need to try to
mentally follow the proposed model and see if it makes social sense
for them.

It's also still not immediately apparent to me that this additional
complication brings any substantial convenience over having distinct
Git repositories under the control of separate but allied teams. For
example, the Infrastructure Project is now past 120 repos with more
than 70 core reviewers among those. In a hypothetical reality where
those were separate directory trees within a single repository, I'm
not coming up with any significant ways it would improve our current
workflow. That said, I understand other projects may have different
needs and challenges with their codebase we just don't face.
--
Jeremy Stanley


OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
responded Jun 2, 2015 by Jeremy_Stanley (56,700 points)   3 5 7
0 votes

On 3 June 2015 at 10:34, Jeremy Stanley fungi@yuggoth.org wrote:
On 2015-06-02 21:59:34 +0000 (+0000), Ian Cordasco wrote:

I like this very much. I recall there was a session at the summit
about this that Thierry and Kyle led. If I recall correctly, the
discussion mentioned that it wasn't (at this point in time)
possible to use gerrit the way you describe it, but perhaps people
were mistaken?
[...]

It wasn't an option at the time. What's being conjectured now is
that with custom Prolog rules it might be possible to base Gerrit
label permissions on strict file subsets within repos. It's
nontrivial, as of yet I've seen no working demonstration, and we'd
still need the Infrastructure Team to feel comfortable supporting it
even if it does turn out to be technically possible. But even before
going down the path of automating/enforcing it anywhere in our
toolchain, projects interested in this workflow need to try to
mentally follow the proposed model and see if it makes social sense
for them.

It's also still not immediately apparent to me that this additional
complication brings any substantial convenience over having distinct
Git repositories under the control of separate but allied teams. For
example, the Infrastructure Project is now past 120 repos with more
than 70 core reviewers among those. In a hypothetical reality where
those were separate directory trees within a single repository, I'm
not coming up with any significant ways it would improve our current
workflow. That said, I understand other projects may have different
needs and challenges with their codebase we just don't face.

We really don't need a technical solution to a social problem.

If someone isn't trusted enough to know the difference between
project/subsystemA and project/subsystemB, nor trusted enough to not
commit changes to subsystemB, pushing stuff out to a new repo, or
in-repo ACLs are not the solution. The solution is to work with them
to learn to trust them.

Further, there are plenty of cases where the 'subsystem' is
cross-cutting, not vertical - and in those cases its much much much
harder to just describe file boundaries where the thing is.

So I'd like us to really get our heads around the idea that folk are
able to make promises ('I will only commit changes relevant to the DB
abstraction/transaction management') and honour them. And if they
don't - well, remove their access. even with CD in the picture,
thats a wholly acceptable risk IMO.

-Rob

--
Robert Collins rbtcollins@hp.com
Distinguished Technologist
HP Converged Cloud


OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
responded Jun 2, 2015 by Robert_Collins (27,200 points)   4 6 12
0 votes

Jeremy,

the Infrastructure Project is now past 120 repos with more
than 70 core reviewers among those.

I dislike the idea of having 120 repos for single tool. It makes things
complicated for everybody:
documentation stuff, installation, maintaing, work that touches multiple
repos and so on..

So I would prefer to have single repo with many subcores.

Robert,

We really don't need a technical solution to a social problem.

We really need... It's like non voting jobs in our CI (everybody just
ignores).
Btw it will be hard for large core team to know each other.
Especially if we are speaking about various groups of cores that are
mantaining
only parts of systems. Keeping all this in heads will be hard task (it
should be automated)

Best regards,
Boris Pavlovic

On Wed, Jun 3, 2015 at 2:12 AM, Robert Collins robertc@robertcollins.net
wrote:

On 3 June 2015 at 10:34, Jeremy Stanley fungi@yuggoth.org wrote:

On 2015-06-02 21:59:34 +0000 (+0000), Ian Cordasco wrote:

I like this very much. I recall there was a session at the summit
about this that Thierry and Kyle led. If I recall correctly, the
discussion mentioned that it wasn't (at this point in time)
possible to use gerrit the way you describe it, but perhaps people
were mistaken?
[...]

It wasn't an option at the time. What's being conjectured now is
that with custom Prolog rules it might be possible to base Gerrit
label permissions on strict file subsets within repos. It's
nontrivial, as of yet I've seen no working demonstration, and we'd
still need the Infrastructure Team to feel comfortable supporting it
even if it does turn out to be technically possible. But even before
going down the path of automating/enforcing it anywhere in our
toolchain, projects interested in this workflow need to try to
mentally follow the proposed model and see if it makes social sense
for them.

It's also still not immediately apparent to me that this additional
complication brings any substantial convenience over having distinct
Git repositories under the control of separate but allied teams. For
example, the Infrastructure Project is now past 120 repos with more
than 70 core reviewers among those. In a hypothetical reality where
those were separate directory trees within a single repository, I'm
not coming up with any significant ways it would improve our current
workflow. That said, I understand other projects may have different
needs and challenges with their codebase we just don't face.

We really don't need a technical solution to a social problem.

If someone isn't trusted enough to know the difference between
project/subsystemA and project/subsystemB, nor trusted enough to not
commit changes to subsystemB, pushing stuff out to a new repo, or
in-repo ACLs are not the solution. The solution is to work with them
to learn to trust them.

Further, there are plenty of cases where the 'subsystem' is
cross-cutting, not vertical - and in those cases its much much much
harder to just describe file boundaries where the thing is.

So I'd like us to really get our heads around the idea that folk are
able to make promises ('I will only commit changes relevant to the DB
abstraction/transaction management') and honour them. And if they
don't - well, remove their access. even with CD in the picture,
thats a wholly acceptable risk IMO.

-Rob

--
Robert Collins rbtcollins@hp.com
Distinguished Technologist
HP Converged Cloud


OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
responded Jun 2, 2015 by boris_at_pavlovic.me (6,900 points)   1 4 7
0 votes

On Tue, Jun 2, 2015 at 4:12 PM, Robert Collins robertc@robertcollins.net
wrote:

On 3 June 2015 at 10:34, Jeremy Stanley fungi@yuggoth.org wrote:

On 2015-06-02 21:59:34 +0000 (+0000), Ian Cordasco wrote:

I like this very much. I recall there was a session at the summit
about this that Thierry and Kyle led. If I recall correctly, the
discussion mentioned that it wasn't (at this point in time)
possible to use gerrit the way you describe it, but perhaps people
were mistaken?
[...]

It wasn't an option at the time. What's being conjectured now is
that with custom Prolog rules it might be possible to base Gerrit
label permissions on strict file subsets within repos. It's
nontrivial, as of yet I've seen no working demonstration, and we'd
still need the Infrastructure Team to feel comfortable supporting it
even if it does turn out to be technically possible. But even before
going down the path of automating/enforcing it anywhere in our
toolchain, projects interested in this workflow need to try to
mentally follow the proposed model and see if it makes social sense
for them.

It's also still not immediately apparent to me that this additional
complication brings any substantial convenience over having distinct
Git repositories under the control of separate but allied teams. For
example, the Infrastructure Project is now past 120 repos with more
than 70 core reviewers among those. In a hypothetical reality where
those were separate directory trees within a single repository, I'm
not coming up with any significant ways it would improve our current
workflow. That said, I understand other projects may have different
needs and challenges with their codebase we just don't face.

We really don't need a technical solution to a social problem.

If someone isn't trusted enough to know the difference between
project/subsystemA and project/subsystemB, nor trusted enough to not
commit changes to subsystemB, pushing stuff out to a new repo, or
in-repo ACLs are not the solution. The solution is to work with them
to learn to trust them.

Further, there are plenty of cases where the 'subsystem' is
cross-cutting, not vertical - and in those cases its much much much
harder to just describe file boundaries where the thing is.

So I'd like us to really get our heads around the idea that folk are
able to make promises ('I will only commit changes relevant to the DB
abstraction/transaction management') and honour them. And if they
don't - well, remove their access. even with CD in the picture,
thats a wholly acceptable risk IMO.

With gerrit's great REST APIs it would be very easy to generate a report to
detect if someone breaks their promise and commits something outside of a
given sub-directory.

-Rob

--
Robert Collins rbtcollins@hp.com
Distinguished Technologist
HP Converged Cloud


OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
responded Jun 2, 2015 by Joe_Gordon (24,620 points)   2 5 8
0 votes

On 2015-06-03 02:36:20 +0300 (+0300), Boris Pavlovic wrote:
I dislike the idea of having 120 repos for single tool. It makes things
complicated for everybody:
documentation stuff, installation, maintaing, work that touches multiple
repos and so on..

So I would prefer to have single repo with many subcores.
[...]

Can you explain why having things in separate Git repositories is
more complicated than having them in separate directory hierarchies
in one Git repository plus using a turing-complete language to
identify who has permission to approve what across those?
--
Jeremy Stanley


OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
responded Jun 2, 2015 by Jeremy_Stanley (56,700 points)   3 5 7
0 votes

On 06/03/2015 07:24 AM, Boris Pavlovic wrote:
Really it's hard to find cores that understand whole project, but
it's quite simple to find people that can maintain subsystems of
project.

We are made wise not by the recollection of our past, but by the
responsibility for our future.
- George Bernard Shaw

Less authorities, mini-kingdoms and
turing-complete-rule-based-gerrit-subtree-git-commit-enforcement; more
empowerment of responsible developers and building trust.

-i


OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
responded Jun 3, 2015 by Ian_Wienand (3,620 points)   4 5
0 votes

On Tue, Jun 2, 2015 at 7:19 PM, Ian Wienand iwienand@redhat.com wrote:

On 06/03/2015 07:24 AM, Boris Pavlovic wrote:

Really it's hard to find cores that understand whole project, but
it's quite simple to find people that can maintain subsystems of
project.

We are made wise not by the recollection of our past, but by the
responsibility for our future.
- George Bernard Shaw

Less authorities, mini-kingdoms and
turing-complete-rule-based-gerrit-subtree-git-commit-enforcement; more
empowerment of responsible developers and building trust.

-i


OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

​All of the debate about the technical feasibility, additional repos aside,
the one question I always raise when topics like this come up is "how does
that really solve the problem". In other words, there's still a finite
number of folks that dedicate the time to be "subject matter experts" and
do the reviews.

Maybe this will help, I don't know. But I have the same argument as I made
in my spec to remove drivers from Cinder altogether, creating "another
repo" and moving things around just creates more overhead and does little
to address the lack of review resources.

I understand you're not proposing new repos Boris, although it was
mentioned in this thread.

I do think that we could probably try and do something like growing the
Lieutenant model that the Neutron team is hammering out. Not sure... but
seems like a good start; again assuming there are enough
qualified/interested Lieutenants. I'm not sure, but that's kind of how I
interpreted your proposal but one additional step of ACL's; is that
accurate?

Thanks,
John​


OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
responded Jun 3, 2015 by John_Griffith (8,640 points)   1 2 5
...