On Mon, Aug 8, 2016 at 1:47 PM, James Slagle email@example.com wrote:
On Mon, Aug 8, 2016 at 1:06 PM, Jeremy Stanley firstname.lastname@example.org wrote:
On 2016-08-08 11:47:56 -0400 (-0400), James Slagle wrote:
I suppose it's also possible that we might be pushing too strongly
down the multinode path? Is the general consensus in infra that they'd
like to help enable project teams to eventually add 3 and 4 (and maybe
more) node multinode jobs?
We've not outright rejected the idea, but do want to make sure that
there's been suitable due diligence done explaining how the things
you'll be able to test with >2 job nodes effectively can't be done with fewer.
Our current 2 node job uses the first node as the undercloud which
deploys an AIO Overcloud on the 2nd node. TripleO traditionally has
also been able to deploy standalone Compute, Cinder, Swift, and Ceph
nodes. Additionally in this cycle, a lot of work has gone into making
it fully customizable what services are deployed on which roles. You
can deploy nodes that are just API services, or just a DB server, or
rabbitmq, etc. In order to test the composability feature we need to
deploy to more than one node.
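As an illustrative sketch of the composability feature described above: in the
composable-services model, the list of services deployed on a role can be
overridden in a Heat environment file. The parameter and service resource names
below follow the tripleo-heat-templates conventions from that cycle, but the
exact (very narrow) service list is hypothetical, not a tested configuration:

```yaml
# Hypothetical environment file: deploy a role that runs only
# RabbitMQ (plus NTP for time sync), rather than the full default
# service list. Service names follow tripleo-heat-templates.
parameter_defaults:
  ControllerServices:
    - OS::TripleO::Services::RabbitMQ
    - OS::TripleO::Services::Ntp
```

Testing that such per-role service placement actually works is exactly the kind
of thing that needs more than one Overcloud node.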
Also, we'd need at least 3 Overcloud nodes to test that we can
successfully deploy a Pacemaker-managed cluster.
Also we want to be sure that projects who are interested
in multi-node jobs start with just 2 job nodes and get some initial
tests performing well and returning stable results before trying to
push past 2.
I think that the 2 node job that we've added has been stable. We've
worked a few issues out that we've hit depending on which cloud
provider we land on, but generally speaking it has been very stable.
We make use of the ovsvxlanbridge function from devstack-gate to
configure the private networking among the nodes. I think this was a
good first step since that has been a proven way in the devstack
multinode jobs. I'd like to move to using TripleO's os-net-config in
the future though, since that is the tool used in TripleO. The end
result of the network configuration would be the same (using ovs vxlan
bridges), we'd just use a different tool to get there.
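For comparison, a rough sketch of what an equivalent os-net-config
configuration might look like. The bridge name, interface name, and IP
addresses here are made up, and the exact ovs_tunnel option syntax should be
checked against the os-net-config docs; treat this as an assumption-laden
example, not a working config:

```yaml
# Hypothetical os-net-config config: an OVS bridge with a vxlan
# tunnel port to a peer job node, roughly matching what
# devstack-gate's ovs vxlan bridge setup produces. All names and
# IPs are illustrative only.
network_config:
  - type: ovs_bridge
    name: br-test
    use_dhcp: false
    addresses:
      - ip_netmask: 172.24.4.1/23
    members:
      - type: ovs_tunnel
        name: vxlan0
        tunnel_type: vxlan
        ovs_options:
          - options:remote_ip=10.0.0.2
```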
-- James Slagle
Reviving this thread to continue the discussion, with the hope that we
can set the stage to finalize a plan for what we want to tackle in
Ocata for tripleo-ci at the summit.

State of rh1 and rh2
Both rh1 and rh2 are OVB (OpenStack Virtual Baremetal) enabled
clouds. OVB allows us to treat OpenStack instances as baremetal
instances for traditional tripleo-ci testing (PXE booting, etc).
Currently only rh1 is enabled in nodepool. We could re-enable rh2 if
we wanted (the previous ntp issue is resolved now).
As Paul indicated, he's done significant work to bring these 2 clouds
in alignment with standard Infra tooling. If we wanted to move forward
with opening up these clouds to run other jobs besides tripleo-ci, we
could do that.
We've continued to add additional CI jobs using the multinode support
in nodepool and tripleo-ci, running on all the enabled clouds (except
rh1) in nodepool. We are still only using 2 nodes. I'd like to add
additional jobs and increase this to 3 nodes initially (probably
deploying ceph on the additional node), and then 4 nodes for an HA
deployment.
Becoming 3rd party CI
tripleo-ci becoming 3rd party CI continues to come up in discussion. I
agree that the OVB based tripleo-ci jobs align better with the 3rd
party CI model since they do require a specially configured OpenStack
cloud. However, the previous points about opening up rh1/rh2 for non
tripleo jobs and scaling out multinode jobs muddies the water for me a
bit when this topic comes up.
Given we'd like to scale out and add more multinode jobs, I'd like to
counter that by offering some capacity back to nodepool by opening up
the rh1/rh2 clouds to all job types.
However, if tripleo-ci becomes 3rd party CI, we need some
infrastructure to run that CI on and resources to set it up and
maintain the CI tooling. At that point, the TripleO team would be
trying to maintain a 3rd party CI system, and keep 2 public clouds
running for normal infra jobs. That may be possible to do, but it is
additional work.
Just to be clear, I'm not trying to say that if tripleo-ci becomes 3rd
party CI, we will "just take our cloud and go home" :-). We want to be
better aligned and integrated with infra tooling and jobs. Maintaining
a 3rd party CI system and 2 public clouds integrated with Infra's CI
system is additional work though, and like a lot of project teams, we
have to prioritize and make trade-offs.
Further, even if tripleo-ci becomes 3rd party CI for OVB jobs, and
there are capacity concerns about us scaling out our multinode jobs
onto the other enabled clouds in nodepool, we may still prioritize the
work to maintain these 2 clouds for Infra's general use.
We want to work more closely with Infra overall. But if there is
little perceived benefit in that there are no capacity concerns and no
concerns about us going to 3+ node multinode jobs, then I think we'd
probably just disable our clouds in nodepool, make tripleo-ci OVB jobs
3rd party, and press on that way.
Better alignment with infra CI tools
When the topic of 3rd party CI comes up, it is often accompanied with
the fact that tripleo-ci is not aligned with other infra tools
(devstack-gate, zuul-cloner, others?). We do plan to continue to
address these things and strive for better alignment. I'm not sure of
all the historical context around what the "original" plans were for
tripleo-ci, and it doesn't really matter all that much anyway.
If the repo needs some modernization to be in better alignment with
tooling or to take advantage of new features in nodepool/zuul (I know
Paul has some ideas around pipelining), then I think we will work on
those improvements.
Some of my goals for tripleo-ci are to continue to try and make it
easier to consume externally, and to use the TripleO production
tooling where possible.
However, we may not be able to align perfectly with how other jobs are
run given the nature of the project (we don't use source installs, pip
installs, devstack, etc). Using the same production tooling in
tripleo-ci that we expect TripleO users to also use when they deploy
is a goal of tripleo-ci.
As an example, we will likely not continue to use devstack-gate to
setup multinode networking, and instead use the tool TripleO uses for
production: os-net-config. Does that have any impact on the decision
of tripleo-ci becoming 3rd party CI or not? I'm not honestly sure what
the expectations are around things like that (e.g., must we use
something like devstack-gate for that?).
Personally, I would like to see more testing in the check and gate
queues with production deployment tools across the board (fuel, kolla,
tripleo, etc), because it makes all of OpenStack better when issues
are found earlier rather than later. I think the progress we've made
so far with the TripleO multinode jobs has proven that this is
valuable.

We have TripleO sessions proposed to talk about the state of CI: