We use Fuel for deployment, with a fairly simple network configuration
(controller and network node are the same) and OpenDaylight as the neutron
driver. However, we also have SR-IOV configured for some NICs, and there
might be something interesting here.
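The SR-IOV side is wired up in the usual way, roughly like this (the
interface and physnet names here are illustrative, not our exact values):

    # nova.conf on the compute nodes
    [DEFAULT]
    pci_passthrough_whitelist = { "devname": "eth3", "physical_network": "physnet2" }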
The instance was created with an SR-IOV port, and in the logs I see
"Assigning a pci device without numa affinity to instance
389109a4-540e-48d9-82b1-873b02cb4d31 which has numa topology". Shortly
after that, creation fails and the hypervisor seems to crash.
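For what it's worth, the host-side NUMA affinity of a VF can be checked
directly (the PCI address below is just an example):

    cat /sys/bus/pci/devices/0000:05:00.1/numa_node

If this prints -1, the platform isn't exposing a NUMA node for the device,
which is exactly the situation that warning describes.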
So today I tried to create an instance without an SR-IOV port but with
hw:cpu_policy=dedicated, and it worked fine. Then I did the same but added
an SR-IOV port, and I got the same crash (though not across all nodes this
time).
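For reference, the reproduction boils down to something like this (flavor,
network, and port names are placeholders):

    openstack flavor set m1.pinned --property hw:cpu_policy=dedicated
    neutron port-create sriov-net --binding:vnic_type direct --name sriov-port
    nova boot --flavor m1.pinned --nic port-id=<port-uuid> test-vm

Without the SR-IOV port the same boot succeeds; with it, the compute node
goes down.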
I assume we have some kind of misconfiguration somewhere, though the entire
hypervisor crashing doesn't seem correct either :-)
On 17 September 2017 at 00:32, Steve Gordon <firstname.lastname@example.org> wrote:
----- Original Message -----
From: "Tomas Brännström" email@example.com
Sent: Friday, September 15, 2017 5:56:34 AM
Subject: [Openstack] QEMU/KVM crash when mixing cpu_policy:dedicated and
non-dedicated
I just noticed a strange (?) issue when I tried to create an instance using
a flavor with hw:cpu_policy=dedicated. The instance failed with the error:
Unable to read from monitor: Connection reset by peer', u'code': 500,
u'details': u' File
"/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 1926,
File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line
in _build_and_run_instance\n instance_uuid=instance.uuid,
And all other instances were shut down, even those living on a compute
host other than the one the new instance was scheduled to. A quick
googling suggests that this could be due to the hypervisor crashing
(though why would it crash on unrelated compute hosts??).
Are there any more specific messages in the system logs or elsewhere?
Check /var/log/libvirt/* in particular; while I suspect it will be the
original source of the above message, it may have some additional useful
context.
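e.g. something along the lines of:

    grep -i error /var/log/libvirt/libvirtd.log /var/log/libvirt/qemu/*.log

around the time of the failed boot.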
The only odd thing here that I can think of was that the existing
instances did -not- use the dedicated cpu policy -- can there be problems
like this when attempting to mix dedicated and non-dedicated policies?
The main problem if you mix them on the same node is that Nova won't
account properly for this when placing guests; the current design assumes
that a node will be used either for "normal" instances (with CPU
overcommit) or "dedicated" instances (no CPU overcommit, pinning) and the
two will be separated via the use of host aggregates and flavors. This in
and of itself should not result in a QEMU crash, though it may eventually
result in issues w.r.t. balancing of scheduling/placement decisions. If
instances on other nodes went down at the same time I'd be looking for a
broader issue -- what is your storage and networking setup like?
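For completeness, that separation is usually done roughly as follows
(names are illustrative, and this assumes AggregateInstanceExtraSpecsFilter
is enabled in the scheduler):

    openstack aggregate create pinned-hosts
    openstack aggregate set --property pinned=true pinned-hosts
    openstack aggregate add host pinned-hosts compute-1
    openstack flavor set --property aggregate_instance_extra_specs:pinned=true m1.pinned

with the "normal" flavors and hosts tagged pinned=false so the two workload
types never share a node.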
Steve Gordon,
Principal Product Manager,
Red Hat OpenStack Platform