
[openstack-dev] [tripleo] Help needed on debugging upgrade jobs on Pike

0 votes

Since we've got a promotion, we can now properly test upgrades from Ocata to Pike.
The upgrade jobs are now failing for various reasons, as you can see on:
https://review.openstack.org/#/c/500625/

I haven't filed a bug yet, but this is the kind of thing I see now:
http://logs.openstack.org/25/500625/20/check/legacy-tripleo-ci-centos-7-scenario002-multinode-oooq-container-upgrades/62e7f14/logs/undercloud/home/zuul/overcloud_upgrade_console.log.txt.gz#_2017-11-04_00_14_17

I'm requesting some help from the upgrades squad, in case they have
already seen these failures. It would be great to have the jobs passing at
some point, now that the framework is in place and we have had a promotion.

Thanks,
--
Emilien Macchi


OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
asked Nov 6, 2017 in openstack-dev by emilien_at_redhat.co (36,940 points)   2 6 9

5 Responses

0 votes

I think this is related to https://review.openstack.org/#/c/510577/
which introduced running os-net-config during the major upgrade
composable step. In environments without network isolation,
/etc/os-net-config/config.json doesn't exist, so the os-net-config
command fails. I filed https://bugs.launchpad.net/tripleo/+bug/1730328
to keep track of it.
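
To illustrate the failure mode described above, here is a minimal Python sketch of guarding the call on the presence of the config file. It is only an illustration, not the fix tracked in bug 1730328: the helper name `os_net_config_command` and the choice to skip when the file is missing are assumptions, and the sketch only builds the command line (using os-net-config's `-c` config-file option) rather than running it.

```python
import os
import tempfile

# Path named in the thread; absent when network isolation is not used.
DEFAULT_CONFIG = "/etc/os-net-config/config.json"


def os_net_config_command(config_path=DEFAULT_CONFIG):
    """Return the os-net-config argv to run, or None when there is no config.

    Hypothetical helper: instead of failing the upgrade step, the caller
    skips os-net-config when the config file does not exist.
    """
    if not os.path.isfile(config_path):
        return None
    return ["os-net-config", "-c", config_path]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "config.json")
        # No file yet: the helper signals that the call should be skipped.
        print(os_net_config_command(path))
        with open(path, "w") as handle:
            handle.write("{}")
        # File present: the helper returns the command to execute.
        print(os_net_config_command(path))
```

A caller would run the returned command only when it is not `None`, so environments without a config file are left untouched.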

responded Nov 6, 2017 by mariusc_at_redhat.co (160 points)  
0 votes

heh, beat me to it :) I was about to file that. Indeed from logs @ [0] you
can see the step3 ansible-playbook failing for
https://github.com/openstack/tripleo-heat-templates/blob/e463ca15fb2189fde7e7e2de136cfb2303d3171f/puppet/services/tripleo-packages.yaml#L56-L64

I had a poke at one of the other jobs too, since there are apparently
multiple issues. I found a different one
for legacy-tripleo-ci-centos-7-containers-multinode-upgrades and filed
https://bugs.launchpad.net/tripleo/+bug/1730349 for that. It looks like all
the upgrade_tasks pass there, but then it fails on docker-puppet.

[0]
http://logs.openstack.org/25/500625/20/check/legacy-tripleo-ci-centos-7-scenario002-multinode-oooq-container-upgrades/62e7f14/logs/subnode-2/var/log/messages.txt.gz#_Nov__4_00_13_55

thanks,

marios

responded Nov 6, 2017 by Marios_Andreou (3,200 points)   3 4
0 votes

I'm not sure if it's related to that ^ error in particular, but since we
landed the deploy/upgrade scenario separation [1], the upgrade job on Pike
has effectively been testing a non-pacemaker to pacemaker upgrade, which
won't work. Due to a chicken-and-egg issue with landing the related patches,
we could not set the dependencies properly. There's a patch fixing this
issue and making the Pike upgrade pacemaker-to-pacemaker [2]. This may
not solve all the issues, but I think we need it merged to at least have
a chance at a green result.

[1] https://review.openstack.org/#/c/500552
[2] https://review.openstack.org/#/c/512305

responded Nov 6, 2017 by Jiří Stránský (3,860 points)   2 3
0 votes

On 6.11.2017 11:17, Jiří Stránský wrote:
I'm not sure if it's related to that ^ error in particular

Yeah, the backport [2] seems to have fixed that issue. The upgrade has now
completed successfully, but the job failed on validation. I've +A'd the
backport as it gets us closer to green.

[2] https://review.openstack.org/#/c/512305

responded Nov 6, 2017 by Jiří Stránský (3,860 points)   2 3
0 votes

Thanks folks :-) you rock!

--
Emilien Macchi


responded Nov 7, 2017 by emilien_at_redhat.co (36,940 points)   2 6 9