
[openstack-dev] [vitrage] Collectd - to - Vitrage setup issues

0 votes

I am trying to get collectd to report some alarms to vitrage in a devstack setup.

I am using a devstack created on a late version of ocata.
My devstack with vitrage appears to be working ok otherwise;
e.g. I can create VMs, and raise fake alarms using “vitrage event post --type=compute.host.down ...” or with “aodh alarm create ... resource_id=instance-uuid” ... and they get reported fine in vitrage.

UNFORTUNATELY I am not seeing anything in vitrage from collectd, and
I don’t believe I’m seeing anything even from collectd itself, for example from the syslog output plugin.

I’ve attached the following files (not sure if these get distributed on the mailing list):
· /etc/collectd/collectd.conf <-- does this look ok?
· /etc/vitrage/vitrage.conf <-- does this look ok?
· /var/log/syslog ... around the time when I updated collectd.conf and vitrage.conf and restarted collectd and vitrage-graph
o QUESTIONS
• NOTE THE FOLLOWING ERRORS IN THE SYSLOG FILE ... where do I get the collectd_conf.yaml file from? I can’t see it in the devstack files for vitrage.
· Aug 23 14:09:31 localhost vitrage-graph[25962]: 2017-08-23 14:09:31.039 25962 ERROR vitrage.utils.file [-] File doesn't exist: /etc/vitrage/collectd_conf.yaml.
· Aug 23 14:09:31 localhost vitrage-graph[25962]: 2017-08-23 14:09:31.040 25962 ERROR vitrage.datasources.collectd.driver [-] failed in init 'NoneType' object has no attribute '__getitem__' : TypeError: 'NoneType' object has no attribute '__getitem__'
· Aug 23 14:09:31 localhost vitrage-graph[25962]: 2017-08-23 14:09:31.040 25962 ERROR vitrage.datasources.collectd.driver Traceback (most recent call last):
· Aug 23 14:09:31 localhost vitrage-graph[25962]: 2017-08-23 14:09:31.040 25962 ERROR vitrage.datasources.collectd.driver File "/opt/stack/vitrage/vitrage/datasources/collectd/driver.py", line 65, in _configuration_mapping
· Aug 23 14:09:31 localhost vitrage-graph[25962]: 2017-08-23 14:09:31.040 25962 ERROR vitrage.datasources.collectd.driver collectd_config_elements = collectd_config[COLLECTD_DATASOURCE]
· Aug 23 14:09:31 localhost vitrage-graph[25962]: 2017-08-23 14:09:31.040 25962 ERROR vitrage.datasources.collectd.driver TypeError: 'NoneType' object has no attribute '__getitem__'
· Aug 23 14:09:31 localhost vitrage-graph[25962]: 2017-08-23 14:09:31.040 25962 ERROR vitrage.datasources.collectd.driver

• IT DOESN’T SEEM LIKE collectd is actually getting any events anyway ... shouldn’t I see some collectd events being reported in /var/log/syslog from some of the monitoring plugins that are loaded?
· gregs-air:collectd-info gregwaines$ fgrep "localhost collectd" syslog
· Aug 23 13:56:07 localhost collectd[23267]: supervised by systemd, will signal readyness
· Aug 23 13:56:07 localhost collectd[23267]: Initialization complete, entering read-loop.
· Aug 23 13:56:07 localhost collectd[23267]: rrdtool plugin: Adjusting "RandomTimeout" to 0.000 seconds.
· Aug 23 14:09:05 localhost collectd[23267]: Exiting normally.
· Aug 23 14:09:05 localhost collectd[23267]: collectd: Stopping 5 read threads.
· Aug 23 14:09:05 localhost collectd[23267]: rrdtool plugin: Shutting down the queue thread.
· Aug 23 14:09:05 localhost collectd[23267]: collectd: Stopping 5 write threads.
· Aug 23 14:09:07 localhost collectd[25824]: supervised by systemd, will signal readyness
· Aug 23 14:09:07 localhost collectd[25824]: Initialization complete, entering read-loop.
· Aug 23 14:09:07 localhost collectd[25824]: rrdtool plugin: Adjusting "RandomTimeout" to 0.000 seconds.
· /etc/vitrage/templates/host_down_scenarios.yaml
· /etc/vitrage/templates/host_high_cpu_load_scenarios.yaml
o Am I supposed to have some templates that are specific to the collectd events/alarms that are being reported to vitrage?

Any other suggestions on things to look at in order to understand what’s wrong?

Greg.


OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

asked Sep 6, 2017 in openstack-dev by Waines,_Greg (2,700 points)   1 4 9

19 Responses

0 votes

Hi Greg,

I’m less familiar with the collectd configuration and the events that it sends.

Regarding the collectd_conf.yaml, it is definitely missing. You should add a /etc/vitrage/collectd_conf.yaml file that looks like this:

collectd:
 - collectd_host:
   type:
   name:
 - collectd_host: …

This file maps a Collectd resource to the corresponding resource in OpenStack. Only resources that are listed in this file will have their alarms imported to Vitrage.
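
The filtering described above can be pictured roughly as follows. This is only an illustrative sketch, not Vitrage's actual code: the helper names `build_lookup` and `resolve` are made up, the keys are assumed to be `collectd_host`, `type` and `name`, and the host and resource values are invented for the example.

```python
def build_lookup(mapping):
    """Index the mapping entries by their collectd host name."""
    return {entry["collectd_host"]: entry for entry in mapping}


def resolve(notification_host, lookup):
    """Return (type, name) for a listed host, or None if it is not listed."""
    entry = lookup.get(notification_host)
    if entry is None:
        # Hosts that are not in the mapping file are simply skipped.
        return None
    return entry["type"], entry["name"]


# Hypothetical mapping, shaped like the parsed list under "collectd:".
mapping = [
    {"collectd_host": "compute-1", "type": "nova.host", "name": "compute-1"},
]
lookup = build_lookup(mapping)
print(resolve("compute-1", lookup))
print(resolve("unlisted-host", lookup))
```

The point of the sketch is the second call: a notification from a host that is not listed resolves to nothing, so no alarm would be created for it.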

Next, you should add a reference to this file in /etc/vitrage/vitrage.conf:

[collectd]
config_file = /etc/vitrage/collectd_conf.yaml

Then you should restart vitrage-graph.
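
Before restarting, a quick check along these lines could confirm that the [collectd] section points at a file that actually exists. It is a sketch under a few assumptions: the option is taken to be spelled `config_file`, the file is read with Python's standard `configparser` rather than the oslo.config machinery that Vitrage itself uses, and the helper names are made up for illustration.

```python
import configparser
import os


def collectd_mapping_path(conf_text, option="config_file"):
    """Return the mapping-file path from the [collectd] section, or None."""
    # strict=False tolerates repeated keys; interpolation=None leaves '%' alone.
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(conf_text)
    return parser.get("collectd", option, fallback=None)


def describe_mapping(conf_text):
    """Say whether the mapping file is configured and present on disk."""
    path = collectd_mapping_path(conf_text)
    if path is None:
        return "no config_file set under [collectd]"
    if not os.path.isfile(path):
        return "missing file: " + path
    return "ok: " + path
```

Calling `describe_mapping(open("/etc/vitrage/vitrage.conf").read())` on the devstack host would then report which of the three cases applies.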

Let me know if it helped,
Ifat.

From: "Waines, Greg" Greg.Waines@windriver.com
Date: Wednesday, 23 August 2017 at 21:19

responded Aug 28, 2017 by ifat.afek_at_nokia.c (5,680 points)   2 2 3
0 votes

Hi Ifat,
thanks for the reply ... just got around to trying your suggestions.

This definitely helps ... I no longer get any errors on restarting collectd or vitrage-graph,
i.e. it appears to load the collectd and updated vitrage conf files correctly now.

I still don’t get any alarms in vitrage, though.
HOWEVER I suspect it may be my collectd setup now.
( WARNING I am NOT a collectd expert. ;) )

I suspect that the vitrage-collectd plugin only sends collectd NOTIFICATIONS or THRESHOLD events to vitrage,
i.e. it likely does NOT send plain statistic/status samples to vitrage.

I can see that collectd sampling is happening ... I have the logfile, csv and rrd plugins running, and samples are being captured in the specified directories / files.

I tried to set a threshold for CPU based on an example I had found on the web.
See the attached collectd.conf file.

BUT I am really not sure if the threshold configuration in my collectd.conf is correct or working ... is there a way to confirm this? ( any collectd experts out there? )
OR
is there an example collectd.conf that has notifications or thresholds (whatever vitrage needs) set up for something basic like CPU?

Greg.


responded Aug 30, 2017 by Waines,_Greg (2,700 points)   1 4 9
0 votes

Hi Greg,

Vitrage listens to Collectd notifications, not statistics.
Can you please turn on the debug option in /etc/vitrage/vitrage.conf (set “debug = true”), and send me the vitrage-graph.log?
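
To make the distinction concrete: read plugins produce a steady stream of samples, while a notification is a separate event, typically raised when a value crosses a threshold. The toy model below is not collectd's threshold plugin; it only illustrates the idea with a made-up `Threshold` class and made-up limits. The severity numbers match collectd's NOTIF_FAILURE (1), NOTIF_WARNING (2) and NOTIF_OKAY (4).

```python
NOTIF_FAILURE, NOTIF_WARNING, NOTIF_OKAY = 1, 2, 4


class Threshold:
    """Toy threshold: reports an event only when the state changes."""

    def __init__(self, warning_max, failure_max):
        self.warning_max = warning_max
        self.failure_max = failure_max
        self.state = NOTIF_OKAY

    def classify(self, value):
        """Map a sample value to a severity."""
        if value > self.failure_max:
            return NOTIF_FAILURE
        if value > self.warning_max:
            return NOTIF_WARNING
        return NOTIF_OKAY

    def feed(self, value):
        """Return the new severity if this sample changed state, else None."""
        new_state = self.classify(value)
        if new_state == self.state:
            return None
        self.state = new_state
        return new_state


cpu = Threshold(warning_max=80.0, failure_max=95.0)
samples = [10.0, 50.0, 85.0, 90.0, 99.0, 20.0]
events = []
for value in samples:
    severity = cpu.feed(value)
    if severity is not None:
        events.append((value, severity))
print(events)  # six samples, but only three state-change events
```

Six samples go in, but only the three state changes come out as events, which is why a setup can show plenty of sampled data and still deliver nothing to a consumer that listens for notifications.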

Thanks,
Ifat.

responded Aug 31, 2017 by ifat.afek_at_nokia.c (5,680 points)   2 2 3
0 votes

Hi Greg,

First of all, let’s make sure that the notification is generated by collectd. To do so, create a simple collectd python plugin that dumps notifications into the /tmp/python-notifications.dump file:

cat > /etc/collectd_notification_dump.py <<'EOF'
import collectd


def notify(n):
    # Append every field of the notification to the dump file.
    with open('/tmp/python-notifications.dump', 'a') as f:
        f.write('host: {}\n'.format(n.host))
        f.write('plugin: {}\n'.format(n.plugin))
        f.write('plugin_instance: {}\n'.format(n.plugin_instance))
        f.write('type: {}\n'.format(n.type))
        f.write('type_instance: {}\n'.format(n.type_instance))
        f.write('time: {}\n'.format(n.time))
        f.write('severity: {}\n'.format(n.severity))
        f.write('message: {}\n'.format(n.message))
        f.write('\n')


# Ask collectd to call notify() for every notification it dispatches.
collectd.register_notification(notify)
EOF

Then load the module from the python plugin section of /etc/collectd/collectd.conf:

<LoadPlugin python>
  Globals true
</LoadPlugin>
<Plugin python>
  ModulePath "/etc"
  LogTraces true
  Interactive false
  Import "collectd_notification_dump"
</Plugin>

Restart collectd. All collectd notifications will be dumped into the /tmp/python-notifications.dump file. E.g. if the collectd threshold plugin generates a notification, it will appear in the dump file. If not, there may be a problem with the threshold plugin configuration.
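
Once the dump file has some content, a small script can summarize it. The sketch below assumes the "key: value" layout written by the plugin above, with a blank line between notifications; `parse_dump` is a hypothetical helper, the sample values are invented, and a message that spans several lines would not be handled correctly.

```python
def parse_dump(text):
    """Split the dump into one dict per notification (blank line = separator)."""
    records, current = [], {}
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the notification collected so far.
            if current:
                records.append(current)
                current = {}
            continue
        key, _, value = line.partition(": ")
        current[key] = value
    if current:
        records.append(current)
    return records


# Made-up sample in the same layout as the dump file.
sample = (
    "host: devstack-vitrage\n"
    "plugin: cpu\n"
    "severity: 2\n"
    "message: example text\n"
    "\n"
)
for rec in parse_dump(sample):
    print(rec["host"], rec["plugin"], rec["severity"])
```

On the real host the input would be the contents of /tmp/python-notifications.dump; an empty result means collectd has not dispatched any notification yet.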

Thanks and Regards,
Volodymyr

responded Aug 31, 2017 by Mytnyk,_VolodymyrX (480 points)  
0 votes

Hi Ifat,
I actually have ‘debug = True’ in /etc/vitrage/vitrage.conf.
However, I don’t see vitrage-graph.log anywhere.
Where is it supposed to be? In /var/log/?
Greg.

root@devstack-vitrage:/# more /etc/vitrage/vitrage.conf

[DEFAULT]
debug = True
transport_url = rabbit://stackrabbit:admin@10.10.10.13:5672/

[oslo_policy]
policy_file = /etc/vitrage/policy.json

[service_credentials]
auth_url = http://10.10.10.13/identity
region_name = RegionOne
project_name = admin
password = admin
project_domain_id = default
user_domain_id = default
username = vitrage
auth_type = password

[datasources]
types = nova.host,nova.instance,nova.zone,static,static_physical,aodh,cinder.volume,neutron.network,neutron.port,doctor,collectd

[keystone_authtoken]
memcached_servers = 10.10.10.13:11211
signing_dir = /var/cache/vitrage
cafile = /opt/stack/data/ca-bundle.pem
project_domain_name = Default
project_name = admin
user_domain_name = Default
password = admin
username = vitrage
auth_url = http://10.10.10.13/identity
auth_type = password

[api]
pecan_debug = False

[collectd]
config_file = /etc/vitrage/collectd_conf.yaml

root@devstack-vitrage:/#

From: "Afek, Ifat (Nokia - IL/Kfar Sava)" ifat.afek@nokia.com
Date: Thursday, August 31, 2017 at 3:52 AM
To: Greg Waines Greg.Waines@windriver.com, "openstack-dev@lists.openstack.org" openstack-dev@lists.openstack.org
Cc: "TAHHAN, MARYAM" maryam.tahhan@intel.com
Subject: Re: [openstack-dev] [vitrage] Collectd - to - Vitrage setup issues

Hi Greg,

Vitrage listens to Collectd notifications, not statistics.
Can you please turn on the debug option in /etc/vitrage/vitrage.conf (set “debug = true”), and send me the vitrage-graph.log?

Thanks,
Ifat.

From: "Waines, Greg" Greg.Waines@windriver.com
Date: Wednesday, 30 August 2017 at 22:17
To: "OpenStack Development Mailing List (not for usage questions)" openstack-dev@lists.openstack.org
Cc: "Afek, Ifat (Nokia - IL/Kfar Sava)" ifat.afek@nokia.com, "TAHHAN, MARYAM" maryam.tahhan@intel.com
Subject: Re: [openstack-dev] [vitrage] Collectd - to - Vitrage setup issues

Hi Ifat,
thanks for the reply ... just got around to trying your suggestions.

This definitely helps ... I no longer get any errors on re-starting collectd or vitrage-graph.
i.e. it appears to load the collectd and updated vitrage conf files correctly now.

I still don’t get any alarms in vitrage.
HOWEVER, I suspect the problem is now in my collectd setup.
( WARNING: I am NOT a collectd expert. ;) )

I suspect that the vitrage-collectd plugin only sends collectd NOTIFICATIONS or THRESHOLD Events to vitrage.
i.e. it likely does NOT send just statistic/status samples to vitrage.

I can see that collectd sampling is happening ... I have logfile and csv and rrd plugins running and samples are being captured in the specified directories / files.

I tried to set a threshold for CPU based on an example I found on the web.
See the attached collectd.conf file.

BUT I’m really not sure whether the threshold configuration in my collectd.conf is correct or working ... is there a way to confirm this? (Any collectd experts out there?)
OR
Is there an example collectd.conf that has notifications or thresholds (whatever vitrage needs) set up for something basic like CPU?

Greg.

From: "Afek, Ifat (Nokia - IL/Kfar Sava)" ifat.afek@nokia.com
Reply-To: "openstack-dev@lists.openstack.org" openstack-dev@lists.openstack.org
Date: Monday, August 28, 2017 at 9:42 AM
To: "openstack-dev@lists.openstack.org" openstack-dev@lists.openstack.org
Subject: Re: [openstack-dev] [vitrage] Collectd - to - Vitrage setup issues

Hi Greg,

I’m less familiar with the collectd configuration and the events that it sends.

Regarding the collectd_conf.yaml, it is definitely missing. You should add a /etc/vitrage/collectd_conf.yaml file that looks like this:

collectd:
 - collectd_host:
   type:
   name:
 - collectd_host: …

This file maps a Collectd resource to the corresponding resource in OpenStack. Only resources that are listed in this file will have their alarms imported to Vitrage.
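The filtering described here can be pictured as a simple lookup. The sketch below is only an illustration, not Vitrage's implementation: the entry values are hypothetical, the key names follow the mapping file above (written as `collectd_host`, `type` and `name`), and plain string equality is assumed as the matching rule.

```python
# Hypothetical mapping entries; real values depend on your deployment.
MAPPING = [
    {"collectd_host": "compute-1", "type": "nova.host", "name": "compute-1"},
]


def resolve_resource(mapping, collectd_host):
    """Return (type, name) for a mapped collectd host, or None if unmapped."""
    for entry in mapping:
        if entry["collectd_host"] == collectd_host:
            return entry["type"], entry["name"]
    # An unmapped resource has no OpenStack counterpart, so its alarms
    # would not be imported.
    return None
```

Under these assumptions, a notification from a host that is missing from the file resolves to `None` and is dropped, which is one reason alarms may never show up.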

Next, you should add a reference to this file in /etc/vitrage/vitrage.conf:

[collectd]
config_file = /etc/vitrage/collectd_conf.yaml

Then you should restart vitrage-graph.

Let me know if it helped,
Ifat.

From: "Waines, Greg" Greg.Waines@windriver.com
Date: Wednesday, 23 August 2017 at 21:19

I am trying to get collectd to report some alarms to vitrage in a devstack setup,

I am using a devstack created on a late version of ocata.
And my devstack with vitrage appears to be working ok otherwise;
e.g. I can create VMs, and raise fake alarms using “vitrage event post --type=compute.host.down ...” or with “aodh alarm create ... resource_id=instance-uuid” ... and they get reported fine in vitrage.

UNFORTUNATELY not seeing anything in vitrage from collectd, and
don’t believe I’m seeing anything even from collectd, for example from the syslog output plugin.

I’ve attached the following files: ( not sure if these get distributed on mailing list )
· /etc/collectd/collectd.conf <-- do these look ok ?
· /etc/vitrage/vitrage.conf <-- do these look ok ?
· /var/log/syslog ... around the time when I updated collectd.conf and vitrage.conf and restarted collectd and vitrage-graph
o QUESTIONS
• NOTE THE FOLLOWING ERRORS IN THE SYSLOG FILE ... where do I get the collectd_conf.yaml file from ? Can’t see it in the devstack files for vitrage.
· Aug 23 14:09:31 localhost vitrage-graph[25962]: 2017-08-23 14:09:31.039 25962 ERROR vitrage.utils.file [-] File doesn't exist: /etc/vitrage/collectd_conf.yaml.
· Aug 23 14:09:31 localhost vitrage-graph[25962]: 2017-08-23 14:09:31.040 25962 ERROR vitrage.datasources.collectd.driver [-] failed in init 'NoneType' object has no attribute '__getitem__' : TypeError: 'NoneType' object has no attribute '__getitem__'
· Aug 23 14:09:31 localhost vitrage-graph[25962]: 2017-08-23 14:09:31.040 25962 ERROR vitrage.datasources.collectd.driver Traceback (most recent call last):
· Aug 23 14:09:31 localhost vitrage-graph[25962]: 2017-08-23 14:09:31.040 25962 ERROR vitrage.datasources.collectd.driver File "/opt/stack/vitrage/vitrage/datasources/collectd/driver.py", line 65, in _configuration_mapping
· Aug 23 14:09:31 localhost vitrage-graph[25962]: 2017-08-23 14:09:31.040 25962 ERROR vitrage.datasources.collectd.driver collectd_config_elements = collectd_config[COLLECTD_DATASOURCE]
· Aug 23 14:09:31 localhost vitrage-graph[25962]: 2017-08-23 14:09:31.040 25962 ERROR vitrage.datasources.collectd.driver TypeError: 'NoneType' object has no attribute '__getitem__'
· Aug 23 14:09:31 localhost vitrage-graph[25962]: 2017-08-23 14:09:31.040 25962 ERROR vitrage.datasources.collectd.driver

• IT DOESN’T SEEM LIKE collectd is actually getting any events anyway ... shouldn’t I see some collectd events being reported in /var/log/syslog from some of the monitoring plugins that are loaded ?
· gregs-air:collectd-info gregwaines$ fgrep "localhost collectd" syslog
· Aug 23 13:56:07 localhost collectd[23267]: supervised by systemd, will signal readyness
· Aug 23 13:56:07 localhost collectd[23267]: Initialization complete, entering read-loop.
· Aug 23 13:56:07 localhost collectd[23267]: rrdtool plugin: Adjusting "RandomTimeout" to 0.000 seconds.
· Aug 23 14:09:05 localhost collectd[23267]: Exiting normally.
· Aug 23 14:09:05 localhost collectd[23267]: collectd: Stopping 5 read threads.
· Aug 23 14:09:05 localhost collectd[23267]: rrdtool plugin: Shutting down the queue thread.
· Aug 23 14:09:05 localhost collectd[23267]: collectd: Stopping 5 write threads.
· Aug 23 14:09:07 localhost collectd[25824]: supervised by systemd, will signal readyness
· Aug 23 14:09:07 localhost collectd[25824]: Initialization complete, entering read-loop.
· Aug 23 14:09:07 localhost collectd[25824]: rrdtool plugin: Adjusting "RandomTimeout" to 0.000 seconds.
·
· /etc/vitrage/templates/host_down_scenarios.yaml
· /etc/vitrage/templates/host_high_cpu_load_scenarios.yaml
o Am I supposed to have some templates that are specific to the collectd events/alarms that are being reported to vitrage ?

Any other suggestions on things to look at in order to understand what’s wrong ?

Greg.


OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
responded Aug 31, 2017 by Waines,_Greg (2,700 points)   1 4 9
0 votes

Hey Volodymyr
... thanks for taking a look at this.

I tried your suggestion.
I believe you can only have one python plugin defined, so I commented out the vitrage related one.
I restarted collectd ... sudo systemctl restart collectd ... which issued no warnings or errors.

But /tmp/python-notifications.dump has not even been created.
SOOO I’m assuming that means there are no collectd NOTIFICATIONS being raised.
I’m sure that’s because I don’t have the correct plugin / threshold configuration set up for something like cpu, memory or disk to generate notifications.

Can you take a quick look at my collectd.conf file below?
I have a brief version with just the python and cpu and threshold plugins.
And I have the full file pasted below as well.

thanks in advance for helping me out,
Greg.

stack@devstack-vitrage:~$ cat /etc/collectdnotificationdump.py
import collectd


def notify(n):
    # Append every field of the notification to a dump file for inspection.
    with open('/tmp/python-notifications.dump', 'a') as f:
        f.write('host: {}\n'.format(n.host))
        f.write('plugin: {}\n'.format(n.plugin))
        f.write('plugin_instance: {}\n'.format(n.plugin_instance))
        f.write('type: {}\n'.format(n.type))
        f.write('type_instance: {}\n'.format(n.type_instance))
        f.write('time: {}\n'.format(n.time))
        f.write('severity: {}\n'.format(n.severity))
        f.write('message: {}\n'.format(n.message))
        f.write('\n')


# Ask collectd to call notify() for every notification it dispatches.
collectd.register_notification(notify)

stack@devstack-vitrage:~$
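Because the `collectd` module is only available inside the collectd daemon's embedded interpreter, a callback like this cannot be run directly from a shell. One way to exercise it on its own is to stub the module first. The sketch below is an assumption-laden illustration: the stub, the scratch file, the reduced field list and the fake notification values are all made up for the test and are not what collectd itself sends.

```python
import os
import sys
import tempfile
import types
from types import SimpleNamespace

# Stub standing in for the module collectd provides to its embedded interpreter.
stub = types.ModuleType("collectd")
callbacks = []
stub.register_notification = callbacks.append
sys.modules["collectd"] = stub

import collectd  # now resolves to the stub above

# Hypothetical scratch file used instead of /tmp/python-notifications.dump.
fd, DUMP_PATH = tempfile.mkstemp(suffix=".dump")
os.close(fd)


def notify(n):
    """Append a subset of the notification's fields to DUMP_PATH."""
    with open(DUMP_PATH, "a") as f:
        for field in ("host", "plugin", "type", "severity", "message"):
            f.write("{}: {}\n".format(field, getattr(n, field)))
        f.write("\n")


collectd.register_notification(notify)

# Fake notification with made-up values, shaped like the fields read above.
fake = SimpleNamespace(host="devstack-vitrage", plugin="cpu", type="percent",
                       severity=1, message="test notification")
for callback in callbacks:
    callback(fake)

with open(DUMP_PATH) as f:
    dumped = f.read()
```

If the file contents look right here but nothing appears when running under collectd, the callback itself is probably fine and the problem is more likely that no notification is being generated at all.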

stack@devstack-vitrage:~$ cat /etc/collectd/collectd.conf

LoadPlugin cpu
LoadPlugin python

   ReportByCpu true
   ReportByState true
   ValuesPercentage true

ModulePath "/etc"
LogTraces true
Interactive false
Import "collectdnotificationdump"

LoadPlugin "threshold"

    Instance "wait"
    FailureMax 12
    FailureMin 10
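As described in collectd-threshold(5), FailureMin and FailureMax are the lower and upper bounds of the acceptable range, and a value outside that range puts the metric into the FAILURE state. The following is a minimal sketch of that comparison only. The function name `classify` is hypothetical, and the sketch ignores the warning bounds and options such as Persist, Invert, Hits and Hysteresis. The enclosing blocks of the excerpt above are not visible here, so which metric the bounds apply to is not modelled either.

```python
def classify(value, failure_min=None, failure_max=None):
    """Return 'FAILURE' if value is outside [failure_min, failure_max], else 'OKAY'."""
    # Below the lower bound of acceptable values.
    if failure_min is not None and value < failure_min:
        return "FAILURE"
    # Above the upper bound of acceptable values.
    if failure_max is not None and value > failure_max:
        return "FAILURE"
    return "OKAY"
```

With the bounds from the excerpt (FailureMin 10, FailureMax 12), only values between 10 and 12 count as acceptable under this reading, so `classify(0.5, 10, 12)` and `classify(50, 10, 12)` are both failures. If the intent is "alarm when the value gets too high", setting only the upper bound may be closer to what is wanted.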

Greg.

p.s. ... here is my complete collectd.conf file

stack@devstack-vitrage:~$ cat /etc/collectd/collectd.conf

Config file for collectd(1).

#

Some plugins need additional configuration and are disabled by default.

Please read collectd.conf(5) for details.

#

You should also read /usr/share/doc/collectd-core/README.Debian.plugins

before enabling any more plugins.

#

Global

----------------------------------------------------------------------------

Global settings for the daemon.

#

Hostname "devstack-vitrage"
FQDNLookup true

BaseDir "/var/lib/collectd"

PluginDir "/usr/lib/collectd"

TypesDB "/usr/share/collectd/types.db" "/etc/collectd/my_types.db"

----------------------------------------------------------------------------

When enabled, plugins are loaded automatically with the default options

when an appropriate block is encountered.

Disabled by default.

----------------------------------------------------------------------------

AutoLoadPlugin false

----------------------------------------------------------------------------

When enabled, internal statistics are collected, using "collectd" as the

plugin name.

Disabled by default.

----------------------------------------------------------------------------

CollectInternalStats false

----------------------------------------------------------------------------

Interval at which to query values. This may be overwritten on a per-plugin

base by using the 'Interval' option of the LoadPlugin block:

Interval 60

----------------------------------------------------------------------------

Interval 15

MaxReadInterval 86400

Timeout 2

ReadThreads 5

WriteThreads 5

Limit the size of the write queue. Default is no limit. Setting up a limit

is recommended for servers handling a high volume of traffic.

WriteQueueLimitHigh 1000000

WriteQueueLimitLow 800000

#

Logging

----------------------------------------------------------------------------

Plugins which provide logging functions should be loaded first, so log

messages generated when loading or configuring other plugins can be

accessed.

#

LoadPlugin logfile
LoadPlugin syslog

LoadPlugin log_logstash

LogLevel "info"

File STDOUT

Timestamp true

PrintSeverity false

LogLevel "info"
File "/var/log/collectd.log"
Timestamp true

   LogLevel info
   NotifyLevel OKAY

LogLevel info

File "/var/log/collectd.json.log"

#

LoadPlugin section

----------------------------------------------------------------------------

Specify what features to activate.

#

LoadPlugin aggregation

LoadPlugin amqp

LoadPlugin apache

LoadPlugin apcups

LoadPlugin ascent

LoadPlugin barometer

LoadPlugin battery

LoadPlugin bind

LoadPlugin ceph

LoadPlugin cgroups

LoadPlugin conntrack

LoadPlugin contextswitch

LoadPlugin cpu

LoadPlugin cpufreq

LoadPlugin csv

LoadPlugin curl

LoadPlugin curl_json

LoadPlugin curl_xml

LoadPlugin dbi

LoadPlugin df
LoadPlugin disk

LoadPlugin dns

LoadPlugin drbd

LoadPlugin email

LoadPlugin entropy

LoadPlugin ethstat

LoadPlugin exec

LoadPlugin fhcount

LoadPlugin filecount

LoadPlugin fscache

LoadPlugin gmond

LoadPlugin hddtemp

LoadPlugin interface

LoadPlugin ipc

LoadPlugin ipmi

LoadPlugin iptables

LoadPlugin ipvs

LoadPlugin irq

LoadPlugin java

LoadPlugin load

LoadPlugin lvm

LoadPlugin madwifi

LoadPlugin mbmon

LoadPlugin md

LoadPlugin memcachec

LoadPlugin memcached

LoadPlugin memory

LoadPlugin modbus

LoadPlugin multimeter

LoadPlugin mysql

LoadPlugin netlink

LoadPlugin network

LoadPlugin nfs

LoadPlugin nginx

LoadPlugin notify_desktop

LoadPlugin notify_email

LoadPlugin ntpd

LoadPlugin numa

LoadPlugin nut

LoadPlugin olsrd

LoadPlugin openldap

LoadPlugin openvpn

LoadPlugin perl

LoadPlugin pinba

LoadPlugin ping

LoadPlugin postgresql

LoadPlugin powerdns

LoadPlugin processes

LoadPlugin protocols

LoadPlugin python

LoadPlugin redis

LoadPlugin rrdcached

LoadPlugin rrdtool

LoadPlugin sensors

LoadPlugin serial

LoadPlugin sigrok

LoadPlugin smart

LoadPlugin snmp

LoadPlugin statsd

LoadPlugin swap

LoadPlugin table

LoadPlugin tail

LoadPlugin tail_csv

LoadPlugin tcpconns

LoadPlugin teamspeak2

LoadPlugin ted

LoadPlugin thermal

LoadPlugin tokyotyrant

LoadPlugin turbostat

LoadPlugin unixsock

LoadPlugin uptime

LoadPlugin users

LoadPlugin uuid

LoadPlugin varnish

LoadPlugin virt

LoadPlugin vmem

LoadPlugin vserver

LoadPlugin wireless

LoadPlugin write_graphite

LoadPlugin write_http

LoadPlugin write_kafka

LoadPlugin write_log

LoadPlugin write_redis

LoadPlugin write_riemann

LoadPlugin write_sensu

LoadPlugin write_tsdb

LoadPlugin zfs_arc

LoadPlugin zookeeper

#

Plugin configuration

----------------------------------------------------------------------------

In this section configuration stubs for each plugin are provided. A desc-

ription of those options is available in the collectd.conf(5) manual page.

#
          #Host "unspecified"
          Plugin "cpu"
          PluginInstance "/[0,2,4,6,8]$/"
          Type "cpu"
          #TypeInstance "unspecified"

          SetPlugin "cpu"
          SetPluginInstance "even-%{aggregation}"

          GroupBy "Host"
          GroupBy "TypeInstance"

          CalculateNum false
          CalculateSum false
          CalculateAverage true
          CalculateMinimum false
          CalculateMaximum false
          CalculateStddev false

Host "localhost"

Port "5672"

VHost "/"

User "guest"

Password "guest"

Exchange "amq.fanout"

RoutingKey "collectd"

Persistent false

StoreRates false

ConnectionRetryDelay 0

URL "http://localhost/server-status?auto"

User "www-user"

Password "secret"

VerifyPeer false

VerifyHost false

CACert "/etc/ssl/ca.crt"

Server "apache"

#

URL "http://some.domain.tld/status?auto"

Host "some.domain.tld"

Server "lighttpd"

Host "localhost"

Port "3551"

ReportSeconds true

URL "http://localhost/ascent/status/"

User "www-user"

Password "secret"

VerifyPeer false

VerifyHost false

CACert "/etc/ssl/ca.crt"

Device "/dev/i2c-0";

Oversampling 512

PressureOffset 0.0

TemperatureOffset 0.0

Normalization 2

Altitude 238.0

TemperatureSensor "myserver/onewire-F10FCA000800/temperature"

ValuesPercentage false

ReportDegraded false

URL "http://localhost:8053/"

#

ParseTime false

#

OpCodes true

QTypes true

ServerStats true

ZoneMaintStats true

ResolverStats false

MemoryStats true

#

QTypes true

ResolverStats true

CacheRRSets true

#

Zone "127.in-addr.arpa/IN"

LongRunAvgLatency false

ConvertSpecialMetricTypes true

SocketPath "/var/run/ceph/ceph-osd.0.asok"

SocketPath "/var/run/ceph/ceph-osd.1.asok"

SocketPath "/var/run/ceph/ceph-mon.ceph1.asok"

SocketPath "/var/run/ceph/ceph-mds.ceph1.asok"

CGroup "libvirt"

IgnoreSelected false

   ReportByCpu true
   ReportByState true
   ValuesPercentage true

   DataDir "/var/lib/collectd/csv"
   StoreRates false

URL "http://finance.google.com/finance?q=NYSE%3AAMD"

User "foo"

Password "bar"

Digest false

VerifyPeer true

VerifyHost true

CACert "/path/to/ca.crt"

Header "X-Custom-Header: foobar"

Post "foo=bar"

#

MeasureResponseTime false

MeasureResponseCode false

Regex "]> *([0-9]\.[0-9]+) *"

DSType "GaugeAverage"

Type "stock_value"

Instance "AMD"

See: http://wiki.apache.org/couchdb/Runtime_Statistics

Instance "httpd"

Type "http_requests"

#

Type "httprequestmethods"

#

Type "httpresponsecodes"

Database status metrics:

Instance "dbs"

Type "gauge"

Type "counter"

Type "bytes"

Host "my_host"

Instance "some_instance"

User "collectd"

Password "thaiNg0I"

Digest false

VerifyPeer true

VerifyHost true

CACert "/path/to/ca.crt"

Header "X-Custom-Header: foobar"

Post "foo=bar"

#

Type "magic_level"

InstancePrefix "prefix-"

InstanceFrom "td[1]"

ValuesFrom "td[2]/span[@class=\"level\"]"

Statement "SELECT 'customers' AS ckey, COUNT(*) AS cvalue \

FROM customers_tbl"

MinVersion 40102

MaxVersion 50042

Type "gauge"

InstancePrefix "customer"

InstancesFrom "c_key"

ValuesFrom "c_value"

#

Driver "mysql"

DriverOption "host" "localhost"

DriverOption "username" "collectd"

DriverOption "password" "secret"

DriverOption "dbname" "custdb0"

SelectDB "custdb0"

Query "numofcustomers"

Query "..."

Host "..."

Device "/dev/sda1"

Device "192.168.0.2:/mnt/nfs"

MountPoint "/home"

FSType "ext3"

   # ignore rootfs; else, the root file-system would appear twice, causing
   # one of the updates to fail and spam the log
   FSType rootfs
   # ignore the usual virtual / temporary file-systems
   FSType sysfs
   FSType proc
   FSType devtmpfs
   FSType devpts
   FSType tmpfs
   FSType fusectl
   FSType cgroup
   IgnoreSelected true

ReportByDevice false

ReportInodes false

ValuesAbsolute true

ValuesPercentage false

Disk "hda"

Disk "/sda[23]/"

IgnoreSelected false

UseBSDName false

UdevNameAttr "DEVNAME"

Interface "eth0"

IgnoreSource "192.168.0.1"

SelectNumericQueryTypes false

SocketFile "/var/run/collectd-email"

SocketGroup "collectd"

SocketPerms "0770"

MaxConns 5

Interface "eth0"

Map "rxcsumoffloaderrors" "ifrxerrors" "checksumoffload"

Map "multicast" "if_multicast"

MappedOnly false

Exec user "/path/to/exec"

Exec "user:group" "/path/to/exec"

NotificationExec user "/path/to/exec"

NotificationExec "user" "/usr/lib/collectd/notify.sh"

ValuesAbsolute true

ValuesPercentage false

Instance "foodir"

Name "*.conf"

MTime "-5m"

Size "+10k"

Recursive true

IncludeHidden false

MCReceiveFrom "239.2.11.71" "8649"

#

Type "swap"

TypeInstance "total"

DataSource "value"

#

Type "swap"

TypeInstance "free"

DataSource "value"

Host "127.0.0.1"

Port 7634

   Interface "ens3"
   IgnoreSelected false

Sensor "some_sensor"

Sensor "another_one"

IgnoreSelected false

NotifySensorAdd false

NotifySensorRemove true

NotifySensorNotPresent false

Chain "table" "chain"

Chain6 "table" "chain"

Irq 7

Irq 8

Irq 9

IgnoreSelected true

JVMArg "-verbose:jni"

JVMArg "-Djava.class.path=/usr/share/collectd/java/collectd-api.jar"

#

LoadPlugin "org.collectd.java.GenericJMX"

# See /usr/share/doc/collectd/examples/GenericJMX.conf

# for an example config.

ReportRelative true

Interface "wlan0"

IgnoreSelected false

Source "SysFS"

WatchSet "None"

WatchAdd "node_octets"

WatchAdd "node_rssi"

WatchAdd "isrxacl"

WatchAdd "isscanactive"

Host "127.0.0.1"

Port 411

Device "/dev/md0"

IgnoreSelected false

Server "localhost"

Key "page_key"

Regex "(\d+) bytes sent"

ExcludeRegex ""

DSType CounterAdd

Type "ipt_octets"

Instance "type_instance"

Socket "/var/run/memcached.sock"

or:

Host "127.0.0.1"

Port "11211"

   ValuesAbsolute true
   ValuesPercentage false

RegisterBase 1234

RegisterCmd ReadHolding

RegisterType float

Type gauge

Instance "..."

#

Address "addr"

Port "1234"

Interval 60

#

Instance "foobar" # optional

Collect "data_name"

Host "database.serv.er"

Port "3306"

User "db_user"

Password "secret"

Database "db_name"

MasterStats true

ConnectTimeout 10

InnodbStats true

#

Alias "squeeze"

Host "localhost"

Socket "/var/run/mysql/mysqld.sock"

SlaveStats true

SlaveNotifications true

Interface "All"

VerboseInterface "All"

QDisc "eth0" "pfifo_fast-1:0"

Class "ppp0" "htb-1:10"

Filter "ppp0" "u32-1:0"

IgnoreSelected false

# client setup:

Server "ff18::efc0:4a42" "25826"

SecurityLevel Encrypt

Username "user"

Password "secret"

Interface "eth0"

ResolveInterval 14400

TimeToLive 128

#

# server setup:

Listen "ff18::efc0:4a42" "25826"

SecurityLevel Sign

AuthFile "/etc/collectd/passwd"

Interface "eth0"

MaxPacketSize 1452

#

# proxy setup (client and server as above):

Forward true

#

# statistics about the network plugin itself

ReportStats false

#

# "garbage collection"

CacheFlush 1800

URL "http://localhost/status?auto"

User "www-user"

Password "secret"

VerifyPeer false

VerifyHost false

CACert "/etc/ssl/ca.crt"

OkayTimeout 1000

WarningTimeout 5000

FailureTimeout 0

SMTPServer "localhost"

SMTPPort 25

SMTPUser "my-username"

SMTPPassword "my-password"

From "collectd@main0server.com"

# <WARNING/FAILURE/OK> on .

# Beware! Do not use not more than two placeholders (%)!

Subject "[collectd] %s on %s!"

Recipient "email1@domain1.net"

Recipient "email2@domain2.com"

Host "localhost"

Port 123

ReverseLookups false

IncludeUnitID true

UPS "upsname@hostname:port"

Host "127.0.0.1"

Port "2006"

CollectLinks "Summary"

CollectRoutes "Summary"

CollectTopology "Summary"

URL "ldap://localhost:389"

StartTLS false

VerifyHost true

CACert "/path/to/ca.crt"

Timeout -1

Version 3

StatusFile "/etc/openvpn/openvpn-status.log"

ImprovedNamingSchema false

CollectCompression true

CollectIndividualUsers true

CollectUserCount false

IncludeDir "/my/include/path"

BaseName "Collectd::Plugins"

EnableDebugger ""

LoadPlugin Monitorus

LoadPlugin OpenVZ

#

Foo "Bar"

Qux "Baz"

Address "::0"

Port "30002"

Host "host name"

Server "server name"

Script "script name"

Host "host.foo.bar"

Host "host.baz.qux"

Interval 1.0

Timeout 0.9

TTL 255

SourceAddress "1.2.3.4"

Device "eth0"

MaxMissed -1

Statement "SELECT magic FROM wizard WHERE host = $1;"

Param hostname

#

Type gauge

InstancePrefix "magic"

ValuesFrom "magic"

#

Statement "SELECT COUNT(type) AS count, type \

FROM (SELECT CASE \

WHEN resolved = 'epoch' THEN 'open' \

ELSE 'resolved' END AS type \

FROM tickets) type \

GROUP BY type;"

#

Type counter

InstancePrefix "rt36_tickets"

InstancesFrom "type"

ValuesFrom "count"

#

# See /usr/share/doc/collectd-core/examples/postgresql/collectd_insert.sql for details

Statement "SELECT collectd_insert($1, $2, $3, $4, $5, $6, $7, $8, $9);"

StoreRates true

#

Host "hostname"

Port 5432

User "username"

Password "secret"

#

SSLMode "prefer"

KRBSrvName "kerberosservicename"

#

Query magic

#

Interval 60

Service "service_name"

#

Query backend # predefined

Query rt36_tickets

#

Service "collectd_store"

Writer sqlstore

# see collectd.conf(5) for details

CommitInterval 30

Collect "latency"

Collect "udp-answers" "udp-queries"

Socket "/var/run/pdns.controlsocket"

Collect "questions"

Collect "cache-hits" "cache-misses"

Socket "/var/run/pdns_recursor.controlsocket"

LocalSocket "/opt/collectd/var/run/collectd-powerdns"

Process "name"

ProcessMatch "foobar" "/usr/bin/perl foobar\.pl.*"

Value "/^Tcp:/"

IgnoreSelected false

ModulePath "/path/to/your/python/modules"

LogTraces true

Interactive true

Import "spam"

#

spam "wonderful" "lovely"

    # ModulePath "/opt/stack/vitrage/vitrage/datasources/collectd/"
    # LogTraces true
    # Interactive false
    # Import "collectd_vitrage.vitrageplugin"
    # Import "collectd_vitrage.getsigchld"

#
#
# transport_url "rabbit://stackrabbit:admin@10.10.10.13:5672/"
#

ModulePath "/etc"
LogTraces true
Interactive false
Import "collectdnotificationdump"

Host "redis.example.com"

Port "6379"

Timeout 2000

DaemonAddress "unix:/var/run/rrdcached.sock"

DataDir "/var/lib/rrdcached/db/collectd"

CreateFiles true

CreateFilesAsync false

CollectStatistics true

#

The following settings are rather advanced

and should usually not be touched:

StepSize 10

HeartBeat 20

RRARows 1200

RRATimespan 158112000

XFF 0.1

   DataDir "/var/lib/collectd/rrd"

CacheTimeout 120

CacheFlush 900

WritesPerSecond 30

CreateFilesAsync false

RandomTimeout 0

#

The following settings are rather advanced

and should usually not be touched:

StepSize 10

HeartBeat 20

RRARows 1200

RRATimespan 158112000

XFF 0.1

SensorConfigFile "/etc/sensors3.conf"

Sensor "it8712-isa-0290/temperature-temp1"

Sensor "it8712-isa-0290/fanspeed-fan3"

Sensor "it8712-isa-0290/voltage-in8"

IgnoreSelected false

LogLevel 3

Driver "fluke-dmm"

MinimumInterval 10

Conn "/dev/ttyUSB2"

Driver "cem-dt-885x"

Conn "/dev/ttyUSB1"

Disk "/^[hs]d[a-f][0-9]?$/"

IgnoreSelected false

See /usr/share/doc/collectd/examples/snmp-data.conf.gz for a

comprehensive sample configuration.

Type "voltage"

Table false

Instance "input_line1"

Scale 0.1

Values "SNMPv2-SMI::enterprises.6050.5.4.1.1.2.1"

Type "users"

Table false

Instance ""

Shift -1

Values "HOST-RESOURCES-MIB::hrSystemNumUsers.0"

Type "if_octets"

Table true

InstancePrefix "traffic"

Instance "IF-MIB::ifDescr"

Values "IF-MIB::ifInOctets" "IF-MIB::ifOutOctets"

#

Address "192.168.0.2"

Version 1

Community "community_string"

Collect "std_traffic"

Inverval 120

Address "192.168.0.42"

Version 2

Community "another_string"

Collect "stdtraffic" "hrusers"

Address "192.168.0.3"

Version 1

Community "more_communities"

Collect "powerplusvoltgeinput"

Interval 300

Host "::"

Port "8125"

DeleteCounters false

DeleteTimers false

DeleteGauges false

DeleteSets false

TimerPercentile 90.0

TimerPercentile 95.0

TimerPercentile 99.0

TimerLower false

TimerUpper false

TimerSum false

TimerCount false

ReportByDevice false

ReportBytes true

Instance "slabinfo"

Separator " "

Type gauge

InstancePrefix "active_objs"

InstancesFrom 0

ValuesFrom 1

Type gauge

InstancePrefix "objperslab"

InstancesFrom 0

ValuesFrom 4

Instance "exim"

Interval 60

Regex "S=([1-9][0-9]*)"

DSType "CounterAdd"

Type "ipt_bytes"

Instance "total"

Regex "\<R=local_user\>"

ExcludeRegex "\<R=localuser\>.*mailspool defer"

DSType "CounterInc"

Type "counter"

Instance "local_user"

Type "percent"

Instance "dropped"

ValueFrom 1

Type "bytes"

Instance "wire-realtime"

ValueFrom 2

Type "alertspersecond"

ValueFrom 3

Type "kpacketswireper_sec.realtime"

ValueFrom 4

Instance "snort-eth0"

Interval 600

Collect "dropped" "mbps" "alerts" "kpps"

TimeFrom 0

ListeningPorts false

AllPortsSummary false

LocalPort "25"

RemotePort "25"

Host "127.0.0.1"

Port "51234"

Server "8767"

Device "/dev/ttyUSB0"

Retries 0

ForceUseProcfs false

Device "THRM"

IgnoreSelected false

Host "localhost"

Port "1978"

None of the following option should be set manually

This plugin automatically detect most optimal options

Only set values here if:

- The module ask you to

- You want to disable the collection of some data

- Your (intel) CPU is not supported (yet) by the module

- The module generate a lot of errors 'MSR offset 0x... read failed'

In the last two cases, please open a bug request

#

TCCActivationTemp "100"

CoreCstates "392"

PackageCstates "396"

SystemManagementInterrupt true

DigitalTemperatureSensor true

PackageThermalManagement true

RunningAveragePowerLimit "7"

SocketFile "/var/run/collectd-unixsock"

SocketGroup "collectd"

SocketPerms "0660"

DeleteSocket false

UUIDFile "/etc/uuid"

CollectBackend true

CollectBan false # Varnish 3 and above

CollectCache true

CollectConnections true

CollectDirectorDNS false # Varnish 3 only

CollectESI false

CollectFetch false

CollectHCB false

CollectObjects false

CollectPurge false # Varnish 2 only

CollectSession false

CollectSHM true

CollectSMA false # Varnish 2 only

CollectSMS false

CollectSM false # Varnish 2 only

CollectStruct false

CollectTotals false

CollectUptime false # Varnish 3 and above

CollectdVCL false

CollectVSM false # Varnish 4 only

CollectWorkers false

#

CollectCache true

Connection "xen:///"

RefreshInterval 60

Domain "name"

BlockDevice "name:device"

InterfaceDevice "name:device"

IgnoreSelected false

HostnameFormat name

InterfaceFormat name

PluginInstanceFormat name

Verbose false

Host "localhost"

Port "2003"

Protocol "tcp"

LogSendErrors true

Prefix "collectd"

Postfix "collectd"

StoreRates true

AlwaysAppendDS false

EscapeCharacter "_"

URL "http://example.com/collectd-post"

User "collectd"

Password "secret"

VerifyPeer true

VerifyHost true

CACert "/etc/ssl/ca.crt"

CAPath "/etc/ssl/certs/"

ClientKey "/etc/ssl/client.pem"

ClientCert "/etc/ssl/client.crt"

ClientKeyPass "secret"

SSLVersion "TLSv1"

Format "Command"

StoreRates false

BufferSize 4096

LowSpeedLimit 0

Timeout 0

Property "metadata.broker.list" "localhost:9092"

Format JSON

Host "localhost"

Port 5555

Protocol TCP

Batch true

BatchMaxSize 8192

StoreRates true

AlwaysAppendDS false

TTLFactor 2.0

Notifications true

CheckThresholds false

EventServicePrefix ""

Tag "foobar"

Attribute "foo" "bar"

Host "localhost"

Port 3030

StoreRates true

AlwaysAppendDS false

Notifications true

Metrics true

EventServicePrefix ""

MetricHandler "influx"

MetricHandler "default"

NotificationHandler "flapjack"

NotificationHandler "howling_monkey"

Tag "foobar"

Attribute "foo" "bar"

Host "localhost"

Port "4242"

HostTags "status=production"

StoreRates false

AlwaysAppendDS false

Host "localhost"

Port "2181"

LoadPlugin "threshold"

    Instance "wait"
    FailureMax 12
    FailureMin 10

   Filter "*.conf"

stack@devstack-vitrage:~$

From: "Mytnyk, VolodymyrX" volodymyrx.mytnyk@intel.com
Reply-To: "openstack-dev@lists.openstack.org" openstack-dev@lists.openstack.org
Date: Thursday, August 31, 2017 at 4:05 AM
To: "openstack-dev@lists.openstack.org" openstack-dev@lists.openstack.org
Cc: "TAHHAN, MARYAM" maryam.tahhan@intel.com
Subject: Re: [openstack-dev] [vitrage] Collectd - to - Vitrage setup issues

Hi Greg,

First of all, let’s make sure that the notification is generated by collectd. To do so, create a simple collectd python plugin that dumps notifications into the '/tmp/python-notifications.dump' file:

cat > /etc/collectdnotificationdump.py <<EOF
import collectd


def notify(n):
    # Append every field of the notification to a dump file for inspection.
    with open('/tmp/python-notifications.dump', 'a') as f:
        f.write('host: {}\n'.format(n.host))
        f.write('plugin: {}\n'.format(n.plugin))
        f.write('plugin_instance: {}\n'.format(n.plugin_instance))
        f.write('type: {}\n'.format(n.type))
        f.write('type_instance: {}\n'.format(n.type_instance))
        f.write('time: {}\n'.format(n.time))
        f.write('severity: {}\n'.format(n.severity))
        f.write('message: {}\n'.format(n.message))
        f.write('\n')


collectd.register_notification(notify)
EOF

ModulePath "/etc"
LogTraces true
Interactive false
Import "collectdnotificationdump"

Restart collectd. All collectd notifications will be dumped into the '/tmp/python-notifications.dump' file. E.g. if the collectd threshold plugin generates a notification, it will appear in the dump file. If not, there may be a problem with the configuration of the threshold plugin.
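Once the dump file exists, its contents can be read back programmatically. The sketch below assumes the layout written by the plugin above (one `key: value` line per field and a blank line between notifications). The function name `parse_dump` is hypothetical, and a notification message that itself contains a blank line would be split incorrectly.

```python
def parse_dump(text):
    """Split dump text into one dict per notification (blank-line separated)."""
    records = []
    for block in text.strip().split("\n\n"):
        record = {}
        for line in block.splitlines():
            # Each line is expected to look like "key: value".
            key, sep, value = line.partition(": ")
            if sep:
                record[key] = value
        if record:
            records.append(record)
    return records
```

For example, `parse_dump(open('/tmp/python-notifications.dump').read())` gives a list that can be filtered by `plugin` or `severity` to see whether the threshold plugin has fired.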

Thanks and Regards,
Volodymyr

From: Waines, Greg [mailto:Greg.Waines@windriver.com]
Sent: Wednesday, August 30, 2017 10:18 PM
To: OpenStack Development Mailing List (not for usage questions) openstack-dev@lists.openstack.org
Cc: Tahhan, Maryam maryam.tahhan@intel.com
Subject: Re: [openstack-dev] [vitrage] Collectd - to - Vitrage setup issues

Hi Ifat,
thanks for the reply ... just got around to trying your suggestions.

This definitely helps ... I no longer get any errors on re-starting collectd or vitrage-graph.
i.e. it appears to load the collectd and updated vitrage conf files correctly now.

Now still don’t get any alarms in vitrage.
HOWEVER I suspect it may be my collectd setup now.
( WARNING I am NOT a collectd expert. ;) )

I suspect that the vitrage-collectd plugin only sends collectd NOTIFICATIONS or THRESHOLD Events to vitrage.
i.e. it likely does NOT send just statistic/status samples to vitrage.

I can see that collectd sampling is happening ... I have logfile and csv and rrd plugins running and samples are being captured in the specified directories / files.

I tried to set threshold for CPU based on an example I had found on web.
See attached collectd.conf file .

BUT really not sure if the threshold configuration in my collectd.conf is correct or working ... is there a way to confirm this ? ( any collectd experts out there ? )
OR
Is there an example collectd.conf that has notifications or thresholds (whatever vitrage needs) setup for something basic like CPU ?

Greg.

From: "Afek, Ifat (Nokia - IL/Kfar Sava)" ifat.afek@nokia.com
Reply-To: "openstack-dev@lists.openstack.org" openstack-dev@lists.openstack.org
Date: Monday, August 28, 2017 at 9:42 AM
To: "openstack-dev@lists.openstack.org" openstack-dev@lists.openstack.org
Subject: Re: [openstack-dev] [vitrage] Collectd - to - Vitrage setup issues

Hi Greg,

I’m less familiar with the collectd configuration and the events that it sends.

Regarding the collectd_conf.yaml, it is definitely missing. You should add a /etc/vitrage/collectd_conf.yaml file that looks like this:

collectd:
 - collectd_host:
   type:
   name:
 - collectd_host: …

This file maps a Collectd resource to the corresponding resource in OpenStack. Only resources that are listed in this file will have their alarms imported to Vitrage.
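
A minimal sketch of the lookup this mapping implies, assuming the file has been loaded into a plain Python dictionary. The key is spelled `collectd_host` on the assumption that the archive dropped an underscore; the host name `compute-0`, the resource type `nova.host` and the resource name `host-0` are made-up placeholders, and the helper functions are hypothetical, not Vitrage code.

```python
# Hypothetical in-memory form of the mapping file described above.
# All host and resource names are illustrative placeholders.
MAPPING = {
    "collectd": [
        {"collectd_host": "compute-0", "type": "nova.host", "name": "host-0"},
    ]
}


def build_index(mapping):
    """Index the mapping entries by collectd host name."""
    return {entry["collectd_host"]: entry
            for entry in mapping.get("collectd", [])}


def resolve(index, collectd_host):
    """Return (type, name) for a listed host, or None if it is not listed."""
    entry = index.get(collectd_host)
    if entry is None:
        return None
    return entry["type"], entry["name"]


index = build_index(MAPPING)
print(resolve(index, "compute-0"))
print(resolve(index, "unknown-host"))
```

A host that is not listed resolves to None, which mirrors the statement that alarms are imported only for resources listed in the file.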

Next, you should add a reference to this file in /etc/vitrage/vitrage.conf:

[collectd]
config_file = /etc/vitrage/collectd_conf.yaml

Then you should restart vitrage-graph.
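
Before restarting, it can help to confirm that the section really points at an existing file. The following is only a sketch using the standard `configparser` module, not anything Vitrage ships. The option name is passed in explicitly because the archive may have dropped an underscore from it; `config_file` and the sample path are assumptions.

```python
import configparser
import os


def mapping_file_status(conf_text, option, section="collectd"):
    """Report whether [section] option points at an existing file."""
    # strict=False tolerates repeated options; interpolation=None keeps
    # literal '%' characters in values from raising errors.
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(conf_text)
    if not parser.has_section(section):
        return "missing section [{}]".format(section)
    if not parser.has_option(section, option):
        return "missing option {}".format(option)
    path = parser.get(section, option)
    if os.path.isfile(path):
        return "ok"
    return "file not found: {}".format(path)


# Hypothetical vitrage.conf fragment; the path is a placeholder.
SAMPLE = """
[collectd]
config_file = /nonexistent/mapping.yaml
"""

print(mapping_file_status(SAMPLE, "config_file"))
print(mapping_file_status("[DEFAULT]\nverbose = true\n", "config_file"))
```

To check a real installation, read the contents of /etc/vitrage/vitrage.conf and pass them as `conf_text`.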

Let me know if it helped,
Ifat.

From: "Waines, Greg" Greg.Waines@windriver.com
Date: Wednesday, 23 August 2017 at 21:19

I am trying to get collectd to report some alarms to vitrage in a devstack setup,

I am using a devstack created on a late version of ocata.
And my devstack with vitrage appears to be working ok otherwise;
e.g. I can create VMs, and raise fake alarms using “vitrage event post -type=compute.host.down ...” or with “aodh alarm create ... resource_id=instance-uuid” ... and they get reported fine in vitrage.

UNFORTUNATELY not seeing anything in vitrage from collectd, and
don’t believe I’m seeing anything even from collectd, for example from the syslog output plugin.

I’ve attached the following files: ( not sure if these get distributed on mailing list )
· /etc/collectd/collectd.conf <-- do these look ok ?
· /etc/vitrage/vitrage.conf <-- do these look ok ?
· /var/log/syslog ... around the time when I updated collectd.conf and vitrage.conf and restarted collectd and vitrage-graph
o QUESTIONS
• NOTE THE FOLLOWING ERRORS IN THE SYSLOG FILE ... where do I get the collectd_conf.yaml file from ? Can’t see it in the devstack files for vitrage.
· Aug 23 14:09:31 localhost vitrage-graph[25962]: 2017-08-23 14:09:31.039 25962 ERROR vitrage.utils.file [-] File doesn't exist: /etc/vitrage/collectd_conf.yaml.
· Aug 23 14:09:31 localhost vitrage-graph[25962]: 2017-08-23 14:09:31.040 25962 ERROR vitrage.datasources.collectd.driver [-] failed in init 'NoneType' object has no attribute '__getitem__' : TypeError: 'NoneType' object has no attribute '__getitem__'
· Aug 23 14:09:31 localhost vitrage-graph[25962]: 2017-08-23 14:09:31.040 25962 ERROR vitrage.datasources.collectd.driver Traceback (most recent call last):
· Aug 23 14:09:31 localhost vitrage-graph[25962]: 2017-08-23 14:09:31.040 25962 ERROR vitrage.datasources.collectd.driver File "/opt/stack/vitrage/vitrage/datasources/collectd/driver.py", line 65, in configurationmapping
· Aug 23 14:09:31 localhost vitrage-graph[25962]: 2017-08-23 14:09:31.040 25962 ERROR vitrage.datasources.collectd.driver collectdconfigelements = collectdconfig[COLLECTDDATASOURCE]
· Aug 23 14:09:31 localhost vitrage-graph[25962]: 2017-08-23 14:09:31.040 25962 ERROR vitrage.datasources.collectd.driver TypeError: 'NoneType' object has no attribute '__getitem__'
· Aug 23 14:09:31 localhost vitrage-graph[25962]: 2017-08-23 14:09:31.040 25962 ERROR vitrage.datasources.collectd.driver

• IT DOESN’T SEEM LIKE collectd is actually getting any events anyway ... shouldn’t I see some collectd events being reported in /var/log/syslog from some of the monitoring plugins that are loaded ?
· gregs-air:collectd-info gregwaines$ fgrep "localhost collectd" syslog
· Aug 23 13:56:07 localhost collectd[23267]: supervised by systemd, will signal readyness
· Aug 23 13:56:07 localhost collectd[23267]: Initialization complete, entering read-loop.
· Aug 23 13:56:07 localhost collectd[23267]: rrdtool plugin: Adjusting "RandomTimeout" to 0.000 seconds.
· Aug 23 14:09:05 localhost collectd[23267]: Exiting normally.
· Aug 23 14:09:05 localhost collectd[23267]: collectd: Stopping 5 read threads.
· Aug 23 14:09:05 localhost collectd[23267]: rrdtool plugin: Shutting down the queue thread.
· Aug 23 14:09:05 localhost collectd[23267]: collectd: Stopping 5 write threads.
· Aug 23 14:09:07 localhost collectd[25824]: supervised by systemd, will signal readyness
· Aug 23 14:09:07 localhost collectd[25824]: Initialization complete, entering read-loop.
· Aug 23 14:09:07 localhost collectd[25824]: rrdtool plugin: Adjusting "RandomTimeout" to 0.000 seconds.
·
· /etc/vitrage/templates/hostdownscenarios.yaml
· /etc/vitrage/templates/hosthighcpuloadscenarios.yaml
o Am I supposed to have some templates that are specific to the collectd events/alarms that are being reported to vitrage ?

Any other suggestions on things to look at in order to understand what’s wrong ?

Greg.


OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
responded Aug 31, 2017 by Waines,_Greg (2,700 points)   1 4 9
0 votes

Hi Greg,

“I believe you can only have one python plugin defined, so I commented out the vitrage related one.”

  • VM: no, you can use several python plugins at the same time. The LoadPlugin python should be mentioned only once in the configuration.

Also, judging by the complete configuration file you attached, a few more plugins are enabled besides cpu, python and threshold, such as df, disk, etc.

Please try my collectd configuration file, which enables only logfile, csv, python (for monitoring metrics and dumping the notifications), threshold (to generate the notification) and cpu (the source of the CPU notification). Just fix the highlighted path in the collectd config file below. Once it works for you, enable the vitrage collectd plugin (and any other plugins you are interested in) in the configuration.

To see what metrics are dispatched by collectd (example for cpu 0, idle):
tail -F /var/lib/collectd/csv/devstack-vitrage/cpu-0/percent-idle-2017-09-01
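
As an alternative to watching the file by eye, a small script can pick out the samples that fall below a limit. This is a sketch only: it assumes the csv plugin writes a header line `epoch,value` followed by one sample per line, and the sample rows below are invented for illustration.

```python
import csv
import io


def values_below(csv_text, limit):
    """Return (epoch, value) pairs whose value is below the limit."""
    reader = csv.DictReader(io.StringIO(csv_text))
    hits = []
    for row in reader:
        value = float(row["value"])
        if value < limit:
            hits.append((float(row["epoch"]), value))
    return hits


# Invented sample in the assumed "epoch,value" layout.
sample = "epoch,value\n1504255442.130,63.5\n1504255457.130,17.379679\n"
print(values_below(sample, 20.0))
```

If no sample ever drops below the configured limit, the threshold plugin has nothing to report, which is worth ruling out before suspecting the configuration.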

Once the notification is generated, you should see it in /tmp/python-notifications.dump (see the example below).

To stress the CPU 0 (and generate the notification), use this command:
taskset 0x01 yes > /dev/null

Thanks and Regards,
Volodymyr

/tmp/python-notifications.dump


host: devstack-vitrage
plugin: cpu
plugin_instance: 0
type: percent
type_instance: idle
time: 1504255457.13
severity: 1
message: Host devstack-vitrage, plugin cpu (instance 0) type percent (instance idle): Data source "value" is currently 17.379679. That is below the failure threshold of 20.000000.

host: devstack-vitrage
plugin: cpu
plugin_instance: 0
type: percent
type_instance: idle
time: 1504255472.13
severity: 4
message: Host devstack-vitrage, plugin cpu (instance 0) type percent (instance idle): All data sources are within range again. Current value of "value" is 63.533333.
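
The dump format above (one `key: value` pair per line, records separated by a blank line) is easy to read back when checking whether a failure was later cleared. A minimal sketch, assuming messages never span more than one line; the function name and the shortened sample text are illustrative.

```python
def parse_dump(text):
    """Split a notification dump into a list of dicts, one per record."""
    records = []
    for block in text.strip().split("\n\n"):
        record = {}
        for line in block.splitlines():
            # Split on the first ": " only, since messages contain colons.
            key, sep, value = line.partition(": ")
            if sep:
                record[key] = value
        if record:
            records.append(record)
    return records


SAMPLE = (
    "host: devstack-vitrage\nplugin: cpu\nseverity: 1\n"
    "message: Host devstack-vitrage: below the failure threshold\n"
    "\n"
    "host: devstack-vitrage\nplugin: cpu\nseverity: 4\n"
    "message: Host devstack-vitrage: within range again\n"
    "\n"
)

for record in parse_dump(SAMPLE):
    print(record["severity"], record["message"])
```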

collectd.conf


Hostname "devstack-vitrage"
FQDNLookup true
Interval 15

LoadPlugin logfile
LoadPlugin csv

<Plugin logfile>
  LogLevel "info"
  File "/work-dir/collectd/install/var/log/collectd.log"
  Timestamp true
</Plugin>

<Plugin csv>
  DataDir "/work-dir/collectd/install/var/lib/collectd/csv"
</Plugin>

LoadPlugin cpu

<Plugin cpu>
   ReportByCpu true
   ReportByState true
   ValuesPercentage true
</Plugin>

LoadPlugin python

<Plugin python>
  ModulePath "/etc"
  LogTraces true
  Interactive false
  Import "collectdnotificationdump"
</Plugin>

LoadPlugin "threshold"

    Instance "idle"
    FailureMin 20
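
To make the effect of `FailureMin 20` on the idle value concrete, here is a simplified model of the threshold check. It is a sketch, not collectd's implementation: it ignores options such as Hits, Hysteresis, Persist and Invert, and it assumes the series starts in the OKAY state. The severity numbers are collectd's notification constants (1 for failure, 2 for warning, 4 for okay), which agree with the dump shown earlier.

```python
NOTIF_FAILURE = 1
NOTIF_WARNING = 2  # listed for completeness; unused in this sketch
NOTIF_OKAY = 4


def classify(value, failure_min=None, failure_max=None):
    """Return the severity a single sample maps to."""
    if failure_min is not None and value < failure_min:
        return NOTIF_FAILURE
    if failure_max is not None and value > failure_max:
        return NOTIF_FAILURE
    return NOTIF_OKAY


def state_changes(values, **limits):
    """List (value, severity) pairs for the samples that change the state."""
    changes = []
    state = NOTIF_OKAY  # assumption: the series starts in the OKAY state
    for value in values:
        severity = classify(value, **limits)
        if severity != state:
            changes.append((value, severity))
            state = severity
    return changes


idle = [63.5, 17.379679, 18.0, 63.533333]
print(state_changes(idle, failure_min=20))
# prints [(17.379679, 1), (63.533333, 4)]
```

The two reported changes correspond to the two records in the dump: a failure when idle drops below 20, and an okay notification once it is back in range.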

/etc/collectdnotificationdump.py


import collectd


def notify(n):
    # Append every notification to the dump file, one field per line.
    with open('/tmp/python-notifications.dump', 'a') as f:
        f.write('host: {}\n'.format(n.host))
        f.write('plugin: {}\n'.format(n.plugin))
        f.write('plugin_instance: {}\n'.format(n.plugin_instance))
        f.write('type: {}\n'.format(n.type))
        f.write('type_instance: {}\n'.format(n.type_instance))
        f.write('time: {}\n'.format(n.time))
        f.write('severity: {}\n'.format(n.severity))
        f.write('message: {}\n'.format(n.message))
        f.write('\n')  # blank line separates records


# Ask collectd to call notify() for every notification it dispatches.
collectd.register_notification(notify)
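
The `collectd` module can only be imported inside the collectd daemon, so the script above cannot be run directly from a shell. A hedged sketch for checking the formatting logic on its own: it stands in a `SimpleNamespace` for the notification object, and the helper name `format_notification` and the field values are illustrative.

```python
from types import SimpleNamespace

# Attribute names of a collectd notification object that the dump uses.
FIELDS = ("host", "plugin", "plugin_instance", "type",
          "type_instance", "time", "severity", "message")


def format_notification(n):
    """Render one notification as 'key: value' lines plus a blank line."""
    lines = ["{}: {}".format(name, getattr(n, name)) for name in FIELDS]
    return "\n".join(lines) + "\n\n"


# Stand-in for the object collectd would pass to the callback.
fake = SimpleNamespace(
    host="devstack-vitrage", plugin="cpu", plugin_instance="0",
    type="percent", type_instance="idle", time=1504255457.13,
    severity=1, message="example message",
)

print(format_notification(fake), end="")
```

A misspelled attribute name raises AttributeError here immediately, which is easier to spot than a traceback in the collectd log.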

From: Waines, Greg [mailto:Greg.Waines@windriver.com]
Sent: Thursday, August 31, 2017 8:38 PM
To: OpenStack Development Mailing List (not for usage questions) openstack-dev@lists.openstack.org
Cc: Tahhan, Maryam maryam.tahhan@intel.com
Subject: Re: [openstack-dev] [vitrage] Collectd - to - Vitrage setup issues

Hey Volodymyr
... thanks for taking a look at this.

I tried your suggestion.
I believe you can only have one python plugin defined, so I commented out the vitrage related one.
I restarted collectd ... sudo systemctl restart collectd ... which issued no warnings or errors.

But /tmp/python-notifications.dump has not even been created.
SOOO I’m assuming that means there are no collectd NOTIFICATIONS being raised.
I’m sure that’s because I don’t have the correct plugin / threshold configuration for like cpu or memory or disk setup to generate notifications.

Can you take a quick look at my collectd.conf file below.
I have a brief version with just the python and cpu and threshold plugins.
And I have the full file pasted below as well.

thanks in advance for helping me out,
Greg.

stack@devstack-vitrage:~$ cat /etc/collectdnotificationdump.py
import collectd

def notify(n):
    f = open('/tmp/python-notifications.dump', 'a')
    f.write('host: {}\n'.format(n.host))
    f.write('plugin: {}\n'.format(n.plugin))
    f.write('plugin_instance: {}\n'.format(n.plugin_instance))
    f.write('type: {}\n'.format(n.type))
    f.write('type_instance: {}\n'.format(n.type_instance))
    f.write('time: {}\n'.format(n.time))
    f.write('severity: {}\n'.format(n.severity))
    f.write('message: {}\n'.format(n.message))
    f.write('\n')
    f.close()

collectd.register_notification(notify)

stack@devstack-vitrage:~$

stack@devstack-vitrage:~$ cat /etc/collectd/collectd.conf

LoadPlugin cpu
LoadPlugin python

<Plugin cpu>
   ReportByCpu true
   ReportByState true
   ValuesPercentage true
</Plugin>

<Plugin python>
  ModulePath "/etc"
  LogTraces true
  Interactive false
  Import "collectdnotificationdump"
</Plugin>

LoadPlugin "threshold"

    Instance "wait"
    FailureMax 12
    FailureMin 10

Greg.

p.s. ... here is my complete collectd.conf file

stack@devstack-vitrage:~$ cat /etc/collectd/collectd.conf

Config file for collectd(1).

#

Some plugins need additional configuration and are disabled by default.

Please read collectd.conf(5) for details.

#

You should also read /usr/share/doc/collectd-core/README.Debian.plugins

before enabling any more plugins.

#

Global

----------------------------------------------------------------------------

Global settings for the daemon.

#

Hostname "devstack-vitrage"
FQDNLookup true

BaseDir "/var/lib/collectd"

PluginDir "/usr/lib/collectd"

TypesDB "/usr/share/collectd/types.db" "/etc/collectd/my_types.db"

----------------------------------------------------------------------------

When enabled, plugins are loaded automatically with the default options

when an appropriate block is encountered.

Disabled by default.

----------------------------------------------------------------------------

AutoLoadPlugin false

----------------------------------------------------------------------------

When enabled, internal statistics are collected, using "collectd" as the

plugin name.

Disabled by default.

----------------------------------------------------------------------------

CollectInternalStats false

----------------------------------------------------------------------------

Interval at which to query values. This may be overwritten on a per-plugin

base by using the 'Interval' option of the LoadPlugin block:

Interval 60

----------------------------------------------------------------------------

Interval 15

MaxReadInterval 86400

Timeout 2

ReadThreads 5

WriteThreads 5

Limit the size of the write queue. Default is no limit. Setting up a limit

is recommended for servers handling a high volume of traffic.

WriteQueueLimitHigh 1000000

WriteQueueLimitLow 800000

#

Logging

----------------------------------------------------------------------------

Plugins which provide logging functions should be loaded first, so log

messages generated when loading or configuring other plugins can be

accessed.

#

LoadPlugin logfile
LoadPlugin syslog

LoadPlugin log_logstash

LogLevel "info"

File STDOUT

Timestamp true

PrintSeverity false

LogLevel "info"
File "/var/log/collectd.log"
Timestamp true

   LogLevel info
   NotifyLevel OKAY

LogLevel info

File "/var/log/collectd.json.log"

#

LoadPlugin section

----------------------------------------------------------------------------

Specify what features to activate.

#

LoadPlugin aggregation

LoadPlugin amqp

LoadPlugin apache

LoadPlugin apcups

LoadPlugin ascent

LoadPlugin barometer

LoadPlugin battery

LoadPlugin bind

LoadPlugin ceph

LoadPlugin cgroups

LoadPlugin conntrack

LoadPlugin contextswitch

LoadPlugin cpu

LoadPlugin cpufreq

LoadPlugin csv

LoadPlugin curl

LoadPlugin curl_json

LoadPlugin curl_xml

LoadPlugin dbi

LoadPlugin df
LoadPlugin disk

LoadPlugin dns

LoadPlugin drbd

LoadPlugin email

LoadPlugin entropy

LoadPlugin ethstat

LoadPlugin exec

LoadPlugin fhcount

LoadPlugin filecount

LoadPlugin fscache

LoadPlugin gmond

LoadPlugin hddtemp

LoadPlugin interface

LoadPlugin ipc

LoadPlugin ipmi

LoadPlugin iptables

LoadPlugin ipvs

LoadPlugin irq

LoadPlugin java

LoadPlugin load

LoadPlugin lvm

LoadPlugin madwifi

LoadPlugin mbmon

LoadPlugin md

LoadPlugin memcachec

LoadPlugin memcached

LoadPlugin memory

LoadPlugin modbus

LoadPlugin multimeter

LoadPlugin mysql

LoadPlugin netlink

LoadPlugin network

LoadPlugin nfs

LoadPlugin nginx

LoadPlugin notify_desktop

LoadPlugin notify_email

LoadPlugin ntpd

LoadPlugin numa

LoadPlugin nut

LoadPlugin olsrd

LoadPlugin openldap

LoadPlugin openvpn

LoadPlugin perl

LoadPlugin pinba

LoadPlugin ping

LoadPlugin postgresql

LoadPlugin powerdns

LoadPlugin processes

LoadPlugin protocols

LoadPlugin python

LoadPlugin redis

LoadPlugin rrdcached

LoadPlugin rrdtool

LoadPlugin sensors

LoadPlugin serial

LoadPlugin sigrok

LoadPlugin smart

LoadPlugin snmp

LoadPlugin statsd

LoadPlugin swap

LoadPlugin table

LoadPlugin tail

LoadPlugin tail_csv

LoadPlugin tcpconns

LoadPlugin teamspeak2

LoadPlugin ted

LoadPlugin thermal

LoadPlugin tokyotyrant

LoadPlugin turbostat

LoadPlugin unixsock

LoadPlugin uptime

LoadPlugin users

LoadPlugin uuid

LoadPlugin varnish

LoadPlugin virt

LoadPlugin vmem

LoadPlugin vserver

LoadPlugin wireless

LoadPlugin write_graphite

LoadPlugin write_http

LoadPlugin write_kafka

LoadPlugin write_log

LoadPlugin write_redis

LoadPlugin write_riemann

LoadPlugin write_sensu

LoadPlugin write_tsdb

LoadPlugin zfs_arc

LoadPlugin zookeeper

#

Plugin configuration

----------------------------------------------------------------------------

In this section configuration stubs for each plugin are provided. A desc-

ription of those options is available in the collectd.conf(5) manual page.

#
          #Host "unspecified"
          Plugin "cpu"
          PluginInstance "/[0,2,4,6,8]$/"
          Type "cpu"
          #TypeInstance "unspecified"

          SetPlugin "cpu"
          SetPluginInstance "even-%{aggregation}"

          GroupBy "Host"
          GroupBy "TypeInstance"

          CalculateNum false
          CalculateSum false
          CalculateAverage true
          CalculateMinimum false
          CalculateMaximum false
          CalculateStddev false

Host "localhost"

Port "5672"

VHost "/"

User "guest"

Password "guest"

Exchange "amq.fanout"

RoutingKey "collectd"

Persistent false

StoreRates false

ConnectionRetryDelay 0

URL "http://localhost/server-status?auto"

User "www-user"

Password "secret"

VerifyPeer false

VerifyHost false

CACert "/etc/ssl/ca.crt"

Server "apache"

#

URL "http://some.domain.tld/status?auto"

Host "some.domain.tld"

Server "lighttpd"

Host "localhost"

Port "3551"

ReportSeconds true

URL "http://localhost/ascent/status/"

User "www-user"

Password "secret"

VerifyPeer false

VerifyHost false

CACert "/etc/ssl/ca.crt"

Device "/dev/i2c-0";

Oversampling 512

PressureOffset 0.0

TemperatureOffset 0.0

Normalization 2

Altitude 238.0

TemperatureSensor "myserver/onewire-F10FCA000800/temperature"

ValuesPercentage false

ReportDegraded false

URL "http://localhost:8053/"

#

ParseTime false

#

OpCodes true

QTypes true

ServerStats true

ZoneMaintStats true

ResolverStats false

MemoryStats true

#

QTypes true

ResolverStats true

CacheRRSets true

#

Zone "127.in-addr.arpa/IN"

LongRunAvgLatency false

ConvertSpecialMetricTypes true

SocketPath "/var/run/ceph/ceph-osd.0.asok"

SocketPath "/var/run/ceph/ceph-osd.1.asok"

SocketPath "/var/run/ceph/ceph-mon.ceph1.asok"

SocketPath "/var/run/ceph/ceph-mds.ceph1.asok"

CGroup "libvirt"

IgnoreSelected false

   ReportByCpu true
   ReportByState true
   ValuesPercentage true

   DataDir "/var/lib/collectd/csv"
   StoreRates false

URL "http://finance.google.com/finance?q=NYSE%3AAMD"

User "foo"

Password "bar"

Digest false

VerifyPeer true

VerifyHost true

CACert "/path/to/ca.crt"

Header "X-Custom-Header: foobar"

Post "foo=bar"

#

MeasureResponseTime false

MeasureResponseCode false

Regex "]> *([0-9]\.[0-9]+) *"

DSType "GaugeAverage"

Type "stock_value"

Instance "AMD"

See: http://wiki.apache.org/couchdb/Runtime_Statistics

Instance "httpd"

Type "http_requests"

#

Type "httprequestmethods"

#

Type "httpresponsecodes"

Database status metrics:

Instance "dbs"

Type "gauge"

Type "counter"

Type "bytes"

Host "my_host"

Instance "some_instance"

User "collectd"

Password "thaiNg0I"

Digest false

VerifyPeer true

VerifyHost true

CACert "/path/to/ca.crt"

Header "X-Custom-Header: foobar"

Post "foo=bar"

#

Type "magic_level"

InstancePrefix "prefix-"

InstanceFrom "td[1]"

ValuesFrom "td[2]/span[@class=\"level\"]"

Statement "SELECT 'customers' AS ckey, COUNT(*) AS cvalue \

FROM customers_tbl"

MinVersion 40102

MaxVersion 50042

Type "gauge"

InstancePrefix "customer"

InstancesFrom "c_key"

ValuesFrom "c_value"

#

Driver "mysql"

DriverOption "host" "localhost"

DriverOption "username" "collectd"

DriverOption "password" "secret"

DriverOption "dbname" "custdb0"

SelectDB "custdb0"

Query "numofcustomers"

Query "..."

Host "..."

Device "/dev/sda1"

Device "192.168.0.2:/mnt/nfs"

MountPoint "/home"

FSType "ext3"

   # ignore rootfs; else, the root file-system would appear twice, causing
   # one of the updates to fail and spam the log
   FSType rootfs
   # ignore the usual virtual / temporary file-systems
   FSType sysfs
   FSType proc
   FSType devtmpfs
   FSType devpts
   FSType tmpfs
   FSType fusectl
   FSType cgroup
   IgnoreSelected true

ReportByDevice false

ReportInodes false

ValuesAbsolute true

ValuesPercentage false

Disk "hda"

Disk "/sda[23]/"

IgnoreSelected false

UseBSDName false

UdevNameAttr "DEVNAME"

Interface "eth0"

IgnoreSource "192.168.0.1"

SelectNumericQueryTypes false

SocketFile "/var/run/collectd-email"

SocketGroup "collectd"

SocketPerms "0770"

MaxConns 5

Interface "eth0"

Map "rxcsumoffloaderrors" "ifrxerrors" "checksumoffload"

Map "multicast" "if_multicast"

MappedOnly false

Exec user "/path/to/exec"

Exec "user:group" "/path/to/exec"

NotificationExec user "/path/to/exec"

NotificationExec "user" "/usr/lib/collectd/notify.sh"

ValuesAbsolute true

ValuesPercentage false

Instance "foodir"

Name "*.conf"

MTime "-5m"

Size "+10k"

Recursive true

IncludeHidden false

MCReceiveFrom "239.2.11.71" "8649"

#

Type "swap"

TypeInstance "total"

DataSource "value"

#

Type "swap"

TypeInstance "free"

DataSource "value"

Host "127.0.0.1"

Port 7634

   Interface "ens3"
   IgnoreSelected false

Sensor "some_sensor"

Sensor "another_one"

IgnoreSelected false

NotifySensorAdd false

NotifySensorRemove true

NotifySensorNotPresent false

Chain "table" "chain"

Chain6 "table" "chain"

Irq 7

Irq 8

Irq 9

IgnoreSelected true

JVMArg "-verbose:jni"

JVMArg "-Djava.class.path=/usr/share/collectd/java/collectd-api.jar"

#

LoadPlugin "org.collectd.java.GenericJMX"

# See /usr/share/doc/collectd/examples/GenericJMX.conf

# for an example config.

ReportRelative true

Interface "wlan0"

IgnoreSelected false

Source "SysFS"

WatchSet "None"

WatchAdd "node_octets"

WatchAdd "node_rssi"

WatchAdd "isrxacl"

WatchAdd "isscanactive"

Host "127.0.0.1"

Port 411

Device "/dev/md0"

IgnoreSelected false

Server "localhost"

Key "page_key"

Regex "(\d+) bytes sent"

ExcludeRegex ""

DSType CounterAdd

Type "ipt_octets"

Instance "type_instance"

Socket "/var/run/memcached.sock"

or:

Host "127.0.0.1"

Port "11211"

   ValuesAbsolute true
   ValuesPercentage false

RegisterBase 1234

RegisterCmd ReadHolding

RegisterType float

Type gauge

Instance "..."

#

Address "addr"

Port "1234"

Interval 60

#

Instance "foobar" # optional

Collect "data_name"

Host "database.serv.er"

Port "3306"

User "db_user"

Password "secret"

Database "db_name"

MasterStats true

ConnectTimeout 10

InnodbStats true

#

Alias "squeeze"

Host "localhost"

Socket "/var/run/mysql/mysqld.sock"

SlaveStats true

SlaveNotifications true

Interface "All"

VerboseInterface "All"

QDisc "eth0" "pfifo_fast-1:0"

Class "ppp0" "htb-1:10"

Filter "ppp0" "u32-1:0"

IgnoreSelected false

# client setup:

Server "ff18::efc0:4a42" "25826"

SecurityLevel Encrypt

Username "user"

Password "secret"

Interface "eth0"

ResolveInterval 14400

TimeToLive 128

#

# server setup:

Listen "ff18::efc0:4a42" "25826"

SecurityLevel Sign

AuthFile "/etc/collectd/passwd"

Interface "eth0"

MaxPacketSize 1452

#

# proxy setup (client and server as above):

Forward true

#

# statistics about the network plugin itself

ReportStats false

#

# "garbage collection"

CacheFlush 1800

URL "http://localhost/status?auto"

User "www-user"

Password "secret"

VerifyPeer false

VerifyHost false

CACert "/etc/ssl/ca.crt"

OkayTimeout 1000

WarningTimeout 5000

FailureTimeout 0

SMTPServer "localhost"

SMTPPort 25

SMTPUser "my-username"

SMTPPassword "my-password"

From "collectd@main0server.com"

# <WARNING/FAILURE/OK> on .

# Beware! Do not use not more than two placeholders (%)!

Subject "[collectd] %s on %s!"

Recipient "email1@domain1.net"

Recipient "email2@domain2.com"

Host "localhost"

Port 123

ReverseLookups false

IncludeUnitID true

UPS "upsname@hostname:port"

Host "127.0.0.1"

Port "2006"

CollectLinks "Summary"

CollectRoutes "Summary"

CollectTopology "Summary"

URL "ldap://localhost:389"

StartTLS false

VerifyHost true

CACert "/path/to/ca.crt"

Timeout -1

Version 3

StatusFile "/etc/openvpn/openvpn-status.log"

ImprovedNamingSchema false

CollectCompression true

CollectIndividualUsers true

CollectUserCount false

IncludeDir "/my/include/path"

BaseName "Collectd::Plugins"

EnableDebugger ""

LoadPlugin Monitorus

LoadPlugin OpenVZ

#

Foo "Bar"

Qux "Baz"

Address "::0"

Port "30002"

Host "host name"

Server "server name"

Script "script name"

Host "host.foo.bar"

Host "host.baz.qux"

Interval 1.0

Timeout 0.9

TTL 255

SourceAddress "1.2.3.4"

Device "eth0"

MaxMissed -1

Statement "SELECT magic FROM wizard WHERE host = $1;"

Param hostname

#

Type gauge

InstancePrefix "magic"

ValuesFrom "magic"

#

Statement "SELECT COUNT(type) AS count, type \

FROM (SELECT CASE \

WHEN resolved = 'epoch' THEN 'open' \

ELSE 'resolved' END AS type \

FROM tickets) type \

GROUP BY type;"

#

Type counter

InstancePrefix "rt36_tickets"

InstancesFrom "type"

ValuesFrom "count"

#

# See /usr/share/doc/collectd-core/examples/postgresql/collectd_insert.sql for details

Statement "SELECT collectd_insert($1, $2, $3, $4, $5, $6, $7, $8, $9);"

StoreRates true

#

Host "hostname"

Port 5432

User "username"

Password "secret"

#

SSLMode "prefer"

KRBSrvName "kerberosservicename"

#

Query magic

#

Interval 60

Service "service_name"

#

Query backend # predefined

Query rt36_tickets

#

Service "collectd_store"

Writer sqlstore

# see collectd.conf(5) for details

CommitInterval 30

Collect "latency"

Collect "udp-answers" "udp-queries"

Socket "/var/run/pdns.controlsocket"

Collect "questions"

Collect "cache-hits" "cache-misses"

Socket "/var/run/pdns_recursor.controlsocket"

LocalSocket "/opt/collectd/var/run/collectd-powerdns"

Process "name"

ProcessMatch "foobar" "/usr/bin/perl foobar\.pl.*"

Value "/^Tcp:/"

IgnoreSelected false

ModulePath "/path/to/your/python/modules"

LogTraces true

Interactive true

Import "spam"

#

spam "wonderful" "lovely"

    # ModulePath "/opt/stack/vitrage/vitrage/datasources/collectd/"
    # LogTraces true
    # Interactive false
    # Import "collectd_vitrage.vitrageplugin"
    # Import "collectd_vitrage.getsigchld"

#
#
# transport_url "rabbit://stackrabbit:admin@10.10.10.13:5672/"
#

ModulePath "/etc"
LogTraces true
Interactive false
Import "collectdnotificationdump"

Host "redis.example.com"

Port "6379"

Timeout 2000

DaemonAddress "unix:/var/run/rrdcached.sock"

DataDir "/var/lib/rrdcached/db/collectd"

CreateFiles true

CreateFilesAsync false

CollectStatistics true

#

The following settings are rather advanced

and should usually not be touched:

StepSize 10

HeartBeat 20

RRARows 1200

RRATimespan 158112000

XFF 0.1

   DataDir "/var/lib/collectd/rrd"

CacheTimeout 120

CacheFlush 900

WritesPerSecond 30

CreateFilesAsync false

RandomTimeout 0

#

The following settings are rather advanced

and should usually not be touched:

StepSize 10

HeartBeat 20

RRARows 1200

RRATimespan 158112000

XFF 0.1

SensorConfigFile "/etc/sensors3.conf"

Sensor "it8712-isa-0290/temperature-temp1"

Sensor "it8712-isa-0290/fanspeed-fan3"

Sensor "it8712-isa-0290/voltage-in8"

IgnoreSelected false

LogLevel 3

Driver "fluke-dmm"

MinimumInterval 10

Conn "/dev/ttyUSB2"

Driver "cem-dt-885x"

Conn "/dev/ttyUSB1"

Disk "/^[hs]d[a-f][0-9]?$/"

IgnoreSelected false

See /usr/share/doc/collectd/examples/snmp-data.conf.gz for a

comprehensive sample configuration.

Type "voltage"

Table false

Instance "input_line1"

Scale 0.1

Values "SNMPv2-SMI::enterprises.6050.5.4.1.1.2.1"

Type "users"

Table false

Instance ""

Shift -1

Values "HOST-RESOURCES-MIB::hrSystemNumUsers.0"

Type "if_octets"

Table true

InstancePrefix "traffic"

Instance "IF-MIB::ifDescr"

Values "IF-MIB::ifInOctets" "IF-MIB::ifOutOctets"

#

Address "192.168.0.2"

Version 1

Community "community_string"

Collect "std_traffic"

Inverval 120

Address "192.168.0.42"

Version 2

Community "another_string"

Collect "stdtraffic" "hrusers"

Address "192.168.0.3"

Version 1

Community "more_communities"

Collect "powerplusvoltgeinput"

Interval 300

Host "::"

Port "8125"

DeleteCounters false

DeleteTimers false

DeleteGauges false

DeleteSets false

TimerPercentile 90.0

TimerPercentile 95.0

TimerPercentile 99.0

TimerLower false

TimerUpper false

TimerSum false

TimerCount false

ReportByDevice false

ReportBytes true

Instance "slabinfo"

Separator " "

Type gauge

InstancePrefix "active_objs"

InstancesFrom 0

ValuesFrom 1

Type gauge

InstancePrefix "objperslab"

InstancesFrom 0

ValuesFrom 4

Instance "exim"

Interval 60

Regex "S=([1-9][0-9]*)"

DSType "CounterAdd"

Type "ipt_bytes"

Instance "total"

Regex "\<R=localuser\>"

ExcludeRegex "\<R=localuser\>.*mailspool defer"

DSType "CounterInc"

Type "counter"

Instance "local_user"

Type "percent"

Instance "dropped"

ValueFrom 1

Type "bytes"

Instance "wire-realtime"

ValueFrom 2

Type "alertspersecond"

ValueFrom 3

Type "kpacketswireper_sec.realtime"

ValueFrom 4

Instance "snort-eth0"

Interval 600

Collect "dropped" "mbps" "alerts" "kpps"

TimeFrom 0

ListeningPorts false

AllPortsSummary false

LocalPort "25"

RemotePort "25"

Host "127.0.0.1"

Port "51234"

Server "8767"

Device "/dev/ttyUSB0"

Retries 0

ForceUseProcfs false

Device "THRM"

IgnoreSelected false

Host "localhost"

Port "1978"

None of the following option should be set manually

This plugin automatically detect most optimal options

Only set values here if:

- The module ask you to

- You want to disable the collection of some data

- Your (intel) CPU is not supported (yet) by the module

- The module generate a lot of errors 'MSR offset 0x... read failed'

In the last two cases, please open a bug request

#

TCCActivationTemp "100"

CoreCstates "392"

PackageCstates "396"

SystemManagementInterrupt true

DigitalTemperatureSensor true

PackageThermalManagement true

RunningAveragePowerLimit "7"

SocketFile "/var/run/collectd-unixsock"

SocketGroup "collectd"

SocketPerms "0660"

DeleteSocket false

UUIDFile "/etc/uuid"

CollectBackend true

CollectBan false # Varnish 3 and above

CollectCache true

CollectConnections true

CollectDirectorDNS false # Varnish 3 only

CollectESI false

CollectFetch false

CollectHCB false

CollectObjects false

CollectPurge false # Varnish 2 only

CollectSession false

CollectSHM true

CollectSMA false # Varnish 2 only

CollectSMS false

CollectSM false # Varnish 2 only

CollectStruct false

CollectTotals false

CollectUptime false # Varnish 3 and above

CollectdVCL false

CollectVSM false # Varnish 4 only

CollectWorkers false

#

CollectCache true

Connection "xen:///"

RefreshInterval 60

Domain "name"

BlockDevice "name:device"

InterfaceDevice "name:device"

IgnoreSelected false

HostnameFormat name

InterfaceFormat name

PluginInstanceFormat name

Verbose false

Host "localhost"

Port "2003"

Protocol "tcp"

LogSendErrors true

Prefix "collectd"

Postfix "collectd"

StoreRates true

AlwaysAppendDS false

EscapeCharacter "_"

URL "http://example.com/collectd-post"

User "collectd"

Password "secret"

VerifyPeer true

VerifyHost true

CACert "/etc/ssl/ca.crt"

CAPath "/etc/ssl/certs/"

ClientKey "/etc/ssl/client.pem"

ClientCert "/etc/ssl/client.crt"

ClientKeyPass "secret"

SSLVersion "TLSv1"

Format "Command"

StoreRates false

BufferSize 4096

LowSpeedLimit 0

Timeout 0

Property "metadata.broker.list" "localhost:9092"

Format JSON

Host "localhost"

Port 5555

Protocol TCP

Batch true

BatchMaxSize 8192

StoreRates true

AlwaysAppendDS false

TTLFactor 2.0

Notifications true

CheckThresholds false

EventServicePrefix ""

Tag "foobar"

Attribute "foo" "bar"

Host "localhost"

Port 3030

StoreRates true

AlwaysAppendDS false

Notifications true

Metrics true

EventServicePrefix ""

MetricHandler "influx"

MetricHandler "default"

NotificationHandler "flapjack"

NotificationHandler "howling_monkey"

Tag "foobar"

Attribute "foo" "bar"

Host "localhost"

Port "4242"

HostTags "status=production"

StoreRates false

AlwaysAppendDS false

Host "localhost"

Port "2181"

LoadPlugin "threshold"

    Instance "wait"
    FailureMax 12
    FailureMin 10

   Filter "*.conf"

stack@devstack-vitrage:~$

From: "Mytnyk, VolodymyrX" volodymyrx.mytnyk@intel.com
Reply-To: "openstack-dev@lists.openstack.org" openstack-dev@lists.openstack.org
Date: Thursday, August 31, 2017 at 4:05 AM
To: "openstack-dev@lists.openstack.org" openstack-dev@lists.openstack.org
Cc: "TAHHAN, MARYAM" maryam.tahhan@intel.com
Subject: Re: [openstack-dev] [vitrage] Collectd - to - Vitrage setup issues

Hi Greg,

First of all, let’s make sure that the notification is generated by collectd. To do so, create a simple collectd python plugin that dumps notifications into the '/tmp/python-notifications.dump' file:

cat > /etc/collectdnotificationdump.py <<EOF
import collectd

def notify(n):
    f = open('/tmp/python-notifications.dump', 'a')
    f.write('host: {}\n'.format(n.host))
    f.write('plugin: {}\n'.format(n.plugin))
    f.write('plugin_instance: {}\n'.format(n.plugin_instance))
    f.write('type: {}\n'.format(n.type))
    f.write('type_instance: {}\n'.format(n.type_instance))
    f.write('time: {}\n'.format(n.time))
    f.write('severity: {}\n'.format(n.severity))
    f.write('message: {}\n'.format(n.message))
    f.write('\n')
    f.close()

collectd.register_notification(notify)
EOF

ModulePath "/etc"
LogTraces true
Interactive false
Import "collectdnotificationdump"

Restart collectd. All collectd notifications will be dumped to the '/tmp/python-notifications.dump' file. E.g. if the collectd threshold plugin generates a notification, it will appear in the dump file. If not, there may be a problem with the threshold plugin configuration.

Thanks and Regards,
Volodymyr

From: Waines, Greg [mailto:Greg.Waines@windriver.com]
Sent: Wednesday, August 30, 2017 10:18 PM
To: OpenStack Development Mailing List (not for usage questions) openstack-dev@lists.openstack.org
Cc: Tahhan, Maryam maryam.tahhan@intel.com
Subject: Re: [openstack-dev] [vitrage] Collectd - to - Vitrage setup issues

Hi Ifat,
thanks for the reply ... just got around to trying your suggestions.

This definitely helps ... I no longer get any errors on re-starting collectd or vitrage-graph.
i.e. it appears to load the collectd and updated vitrage conf files correctly now.

Now still don’t get any alarms in vitrage.
HOWEVER I suspect it may be my collectd setup now.
( WARNING I am NOT a collectd expert. ;) )

I suspect that the vitrage-collectd plugin only sends collectd NOTIFICATIONS or THRESHOLD Events to vitrage.
i.e. it likely does NOT send just statistic/status samples to vitrage.

I can see that collectd sampling is happening ... I have logfile and csv and rrd plugins running and samples are being captured in the specified directories / files.

I tried to set threshold for CPU based on an example I had found on web.
See attached collectd.conf file .

BUT really not sure if the threshold configuration in my collectd.conf is correct or working ... is there a way to confirm this ? ( any collectd experts out there ? )
OR
Is there an example collectd.conf that has notifications or thresholds (whatever vitrage needs) setup for something basic like CPU ?

Greg.

From: "Afek, Ifat (Nokia - IL/Kfar Sava)" ifat.afek@nokia.com
Reply-To: "openstack-dev@lists.openstack.org" openstack-dev@lists.openstack.org
Date: Monday, August 28, 2017 at 9:42 AM
To: "openstack-dev@lists.openstack.org" openstack-dev@lists.openstack.org
Subject: Re: [openstack-dev] [vitrage] Collectd - to - Vitrage setup issues

Hi Greg,

I’m less familiar with the collectd configuration and the events that it sends.

Regarding the collectd_conf.yaml, it is definitely missing. You should add a /etc/vitrage/collectd_conf.yaml file that looks like this:

collectd:
 - collectd_host:
   type:
   name:
 - collectd_host: …

This file maps a Collectd resource to the corresponding resource in OpenStack. Only resources that are listed in this file will have their alarms imported to Vitrage.

Next, you should add a reference to this file in /etc/vitrage/vitrage.conf:

[collectd]
config_file = /etc/vitrage/collectd_conf.yaml

Then you should restart vitrage-graph.
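
To illustrate the idea of the mapping file, here is a rough Python sketch of how such a mapping could be used to resolve a collectd host to an OpenStack resource. It is not the actual Vitrage code: the functions build_lookup and resolve are hypothetical, the key names are assumed to be collectd_host, type and name, and the values "compute-1" and "nova.host" are made-up examples.

```python
# Hypothetical, already-parsed contents of the mapping file.
mapping_file = {
    "collectd": [
        {"collectd_host": "compute-1", "type": "nova.host", "name": "compute-1"},
    ]
}


def build_lookup(conf):
    """Index the mapping entries by collectd host.

    A missing file is modelled as conf=None and yields an empty lookup
    instead of failing on None.
    """
    entries = (conf or {}).get("collectd") or []
    return {e["collectd_host"]: (e["type"], e["name"]) for e in entries}


def resolve(lookup, collectd_host):
    """Return (type, name) for a mapped host, or None if it is not listed."""
    return lookup.get(collectd_host)


lookup = build_lookup(mapping_file)
print(resolve(lookup, "compute-1"))
print(resolve(lookup, "some-unlisted-host"))
```

In this sketch, a host that is not listed resolves to None, which corresponds to its alarms not being imported.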

Let me know if it helped,
Ifat.

From: "Waines, Greg" Greg.Waines@windriver.com
Date: Wednesday, 23 August 2017 at 21:19

I am trying to get collectd to report some alarms to vitrage in a devstack setup,

I am using a devstack created on a late version of ocata.
And my devstack with vitrage appears to be working ok otherwise;
e.g. I can create VMs, and raise fake alarms using “vitrage event post -type=compute.host.down ...” or with “aodh alarm create ... resource_id=instance-uuid” ... and they get reported fine in vitrage.

UNFORTUNATELY not seeing anything in vitrage from collectd, and
don’t believe I’m seeing anything even from collectd, for example from the syslog output plugin.

I’ve attached the following files: ( not sure if these get distributed on mailing list )
· /etc/collectd/collectd.conf <-- do these look ok ?
· /etc/vitrage/vitrage.conf <-- do these look ok ?
· /var/log/syslog ... around the time when I updated collectd.conf and vitrage.conf and restarted collectd and vitrage-graph
o QUESTIONS
• NOTE THE FOLLOWING ERRORS IN THE SYSLOG FILE ... where do I get the collectd_conf.yaml file from ? Can’t see it in the devstack files for vitrage.
· Aug 23 14:09:31 localhost vitrage-graph[25962]: 2017-08-23 14:09:31.039 25962 ERROR vitrage.utils.file [-] File doesn't exist: /etc/vitrage/collectd_conf.yaml.
· Aug 23 14:09:31 localhost vitrage-graph[25962]: 2017-08-23 14:09:31.040 25962 ERROR vitrage.datasources.collectd.driver [-] failed in init 'NoneType' object has no attribute '__getitem__' : TypeError: 'NoneType' object has no attribute '__getitem__'
· Aug 23 14:09:31 localhost vitrage-graph[25962]: 2017-08-23 14:09:31.040 25962 ERROR vitrage.datasources.collectd.driver Traceback (most recent call last):
· Aug 23 14:09:31 localhost vitrage-graph[25962]: 2017-08-23 14:09:31.040 25962 ERROR vitrage.datasources.collectd.driver File "/opt/stack/vitrage/vitrage/datasources/collectd/driver.py", line 65, in configurationmapping
· Aug 23 14:09:31 localhost vitrage-graph[25962]: 2017-08-23 14:09:31.040 25962 ERROR vitrage.datasources.collectd.driver collectdconfigelements = collectdconfig[COLLECTDDATASOURCE]
· Aug 23 14:09:31 localhost vitrage-graph[25962]: 2017-08-23 14:09:31.040 25962 ERROR vitrage.datasources.collectd.driver TypeError: 'NoneType' object has no attribute '__getitem__'
· Aug 23 14:09:31 localhost vitrage-graph[25962]: 2017-08-23 14:09:31.040 25962 ERROR vitrage.datasources.collectd.driver

• IT DOESN’T SEEM LIKE collectd is actually getting any events anyways ... shouldn’t I see some collectd events being reported in /var/log/syslog from some of the monitoring plugins that are loaded ?
· gregs-air:collectd-info gregwaines$ fgrep "localhost collectd" syslog
· Aug 23 13:56:07 localhost collectd[23267]: supervised by systemd, will signal readyness
· Aug 23 13:56:07 localhost collectd[23267]: Initialization complete, entering read-loop.
· Aug 23 13:56:07 localhost collectd[23267]: rrdtool plugin: Adjusting "RandomTimeout" to 0.000 seconds.
· Aug 23 14:09:05 localhost collectd[23267]: Exiting normally.
· Aug 23 14:09:05 localhost collectd[23267]: collectd: Stopping 5 read threads.
· Aug 23 14:09:05 localhost collectd[23267]: rrdtool plugin: Shutting down the queue thread.
· Aug 23 14:09:05 localhost collectd[23267]: collectd: Stopping 5 write threads.
· Aug 23 14:09:07 localhost collectd[25824]: supervised by systemd, will signal readyness
· Aug 23 14:09:07 localhost collectd[25824]: Initialization complete, entering read-loop.
· Aug 23 14:09:07 localhost collectd[25824]: rrdtool plugin: Adjusting "RandomTimeout" to 0.000 seconds.
·
· /etc/vitrage/templates/hostdownscenarios.yaml
· /etc/vitrage/templates/hosthighcpuloadscenarios.yaml
o Am I supposed to have some templates that are specific to the collectd events/alarms that are being reported to vitrage ?

Any other suggestions on things to look at in order to understand what’s wrong ?

Greg.

