
[Openstack-operators] VM monitoring suggestions

0 votes

Hi,

We are currently exploring monitoring solutions for the VMs we deploy
for our customers in production. What I have been asked to deploy would
be something akin to how you can see openvz container usage: you get
memory usage, bandwidth, load and so forth for each container.

I know that ceilometer may be an option, but I believe operators use all
kinds of tools for their own resource usage monitoring. So what do you
people use?

(For this use case, we're looking for something that can be used without
installing an agent in the VM, which makes it impossible to get a VM's
load metric. I would be satisfied with cpu/memory/network/io metrics
though.)


OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
asked Nov 17, 2016 in openstack-operators by Jean-Philippe_Methot (600 points)   1 2 5

4 Responses

0 votes

We have some custom scripts that run on the hypervisors which poll:

virsh dominfo
virsh domiflist
etc

The memory stats with "virsh dommemstat" are, AFAIK, not accurate since
there's nothing triggering kvm / the vm to release unused memory. But all
other virsh stuff works well for us.

We don't record "load", but we do record CPU time.
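
As an illustration only (not the scripts mentioned above), a minimal Python sketch of this kind of polling could look like the following. It assumes `virsh dominfo` prints `Key: value` lines in the C locale and reports "CPU time" as seconds with a trailing `s`; the function names are made up for the example.

```python
import os
import subprocess


def parse_dominfo(text):
    """Turn the 'Key: value' lines printed by `virsh dominfo` into a dict."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip blank or malformed lines
            info[key.strip()] = value.strip()
    return info


def cpu_seconds(info):
    """Cumulative CPU time in seconds, e.g. '120.5s' -> 120.5."""
    return float(info["CPU time"].rstrip("s"))


def cpu_percent(prev_seconds, curr_seconds, interval_seconds):
    """Average CPU use between two polls, as a percentage of one core."""
    return 100.0 * (curr_seconds - prev_seconds) / interval_seconds


def poll_domain(name):
    """Run virsh on the hypervisor for one domain (needs libvirt access)."""
    out = subprocess.run(
        ["virsh", "dominfo", name],
        check=True, capture_output=True, text=True,
        env={**os.environ, "LC_ALL": "C"},  # assumption: force untranslated output
    ).stdout
    return parse_dominfo(out)
```

Because CPU time is a cumulative counter, a usage percentage only comes from the difference between two polls, which is what `cpu_percent` computes. A VM with several vCPUs can exceed 100%.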

The "nova diagnostics" command can also be helpful. We have a custom policy
in place to allow users to query their own instances. I think a few others
are doing this as well -- there was a past discussion about it.

Hope that helps,
Joe

responded Nov 17, 2016 by Joe_Topjian (5,780 points)   1 6 10
0 votes

Although I’d like to re-evaluate ceilometer at some point, we currently use something very simple, built on the infra we already had in place.

We use the collectd libvirt plugin and push the metrics to graphite.
https://collectd.org/wiki/index.php/Plugin:virt

If you use the following format you get the instance uuid in the metric name:
HostnameFormat "hostname uuid"

The resulting graphite keys are not exactly what we want (IIRC you get hostname_uuid instead of hostname.uuid).
We rewrite them a bit with the carbon-(c-)relay daemon into something more usable, so you get:
computenode.libvirt.UUID.metrics
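
To make the rewrite concrete, here is a small Python sketch of the transformation itself. It is not carbon-(c-)relay configuration, and it assumes the incoming key has the shape `<hostname>_<uuid>.<rest>`; the sample metric suffix in the usage note is hypothetical.

```python
import re

# Canonical 8-4-4-4-12 hex UUID, the form used for instance UUIDs.
UUID = r"[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}"

# Assumed incoming shape: <hostname>_<uuid>.<rest of the metric key>
KEY_RE = re.compile(r"^(?P<host>.+)_(?P<uuid>" + UUID + r")\.(?P<rest>.+)$")


def rewrite_key(key):
    """Rewrite 'host_uuid.rest' to 'host.libvirt.uuid.rest'.

    Keys that do not match the assumed shape are returned unchanged.
    """
    m = KEY_RE.match(key)
    if m is None:
        return key
    return "{host}.libvirt.{uuid}.{rest}".format(**m.groupdict())
```

For example, `rewrite_key("compute01_0a1b2c3d-1111-2222-3333-444455556666.virt.cpu")` yields `compute01.libvirt.0a1b2c3d-1111-2222-3333-444455556666.virt.cpu`. Anchoring on the fixed UUID format keeps the split correct even when the hostname itself contains underscores.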

We made a grafana dashboard where you can select the uuid and get the stats of the instance.

Pros:
* Simple, just needs collectd on compute nodes
* Graphite scales (with the proper setup)

Cons:
* No tenant-id in the metric name (I guess with some scripting you can make a mapping-tree in graphite)
* No metrics in Horizon. (We still have to make some time to integrate these metrics into horizon but that should be doable.)
* Just instance metrics, nothing else
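
On the tenant-id point, one way to picture the "mapping-tree" idea is the Python sketch below. It is purely hypothetical: it assumes keys in the `computenode.libvirt.UUID.metrics` layout, a UUID-to-tenant dict obtained elsewhere (for example from the Nova API), and an invented `tenants.` prefix.

```python
def tenant_aliases(metric_keys, instance_to_tenant):
    """Map 'node.libvirt.<uuid>.<metric>' keys to 'tenants.<tenant>.<uuid>.<metric>'.

    instance_to_tenant is a plain dict {instance_uuid: tenant_id}; how it is
    filled is out of scope here.
    """
    aliases = {}
    for key in metric_keys:
        parts = key.split(".")
        # Expect: node, 'libvirt', uuid, then one or more metric components.
        if len(parts) < 4 or parts[1] != "libvirt":
            continue
        uuid = parts[2]
        tenant = instance_to_tenant.get(uuid)
        if tenant is None:
            continue  # instance not present in the lookup table
        aliases[key] = ".".join(["tenants", tenant, uuid] + parts[3:])
    return aliases
```

The resulting pairs could then drive whatever mechanism creates the second tree (relay rewrite rules, symlinks, or dashboard variables); that part is left open.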

Cheers,
Robert van Leeuwen


responded Nov 21, 2016 by Van_Leeuwen,_Robert (1,740 points)   1 3
0 votes

Using the collectd virt plugin is a good place to start. For more
information, you can read this article:
https://community.rackspace.com/products/f/25/t/6800
Using Ceilometer with MongoDB for monitoring generates a heavy load
because of too many API requests (tested with Kilo and Liberty). We used
Ceilometer only for alerting purposes.
In newer release cycles, you should try MaaS or Ceilometer + Gnocchi + Aodh.

--
Best Regards!

Duc, Nguyen Cong
Skype: ducncvn@hotmail.com
Phone: (+84)948309446
Site: http://ducnc.github.io/


responded Dec 12, 2016 by Đức_Nguyễn_Công (180 points)  
0 votes

We have been using InfluxData's InfluxDB, Telegraf, and Kapacitor, with
Grafana for visualization [1]. You do not have to install Telegraf in the
VMs to gather data on them; however, you will need Telegraf on the compute
node and will more than likely have to write some custom code for Telegraf
to get out what you want.
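
As a rough sketch of what that custom code might involve (an assumption, not the setup described above): a script run by Telegraf's `exec` input with `data_format = "influx"` prints InfluxDB line protocol, which a helper like this could produce. The measurement, tag, and field names are invented for the example, and only numeric and boolean fields are handled.

```python
def _escape(value, chars):
    """Backslash-escape the given special characters in a key or tag value."""
    value = str(value)
    for ch in chars:
        value = value.replace(ch, "\\" + ch)
    return value


def to_line_protocol(measurement, tags, fields, timestamp_ns):
    """Format one point as InfluxDB line protocol.

    Tags and fields are sorted by key; integers get the 'i' suffix,
    other numbers are written as floats.
    """
    tag_part = "".join(
        ",{}={}".format(_escape(k, ", ="), _escape(v, ", ="))
        for k, v in sorted(tags.items())
    )
    field_items = []
    for k, v in sorted(fields.items()):
        if isinstance(v, bool):  # check bool first: bool is a subclass of int
            rendered = "true" if v else "false"
        elif isinstance(v, int):
            rendered = "{}i".format(v)
        else:
            rendered = repr(float(v))
        field_items.append("{}={}".format(_escape(k, ", ="), rendered))
    return "{}{} {} {}".format(
        _escape(measurement, ", "), tag_part, ",".join(field_items), timestamp_ns
    )
```

A per-VM collector on the compute node would print one such line per instance on each run, tagged with the instance UUID so that dashboards can filter on it.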

There is also a tool called netdata [2], which may be sufficient for what
you want to collect. Robert and Joe seem to have a sense of what you want
and may offer the best option.

[1] http://influxdata.com
[2] https://github.com/firehol/netdata

responded Dec 12, 2016 by Melvin_Hillsman (4,480 points)   1 2 2
...