openATTIC logs errors: Failed to run "ceph.tasks.get_rbd_performance_data"

This document (7021202) is provided subject to the disclaimer at the end of this document.

Environment

SUSE Enterprise Storage 4

Situation

Many errors similar to the following are logged to "/var/log/openattic/openattic.log":

2017-07-28 11:22:11,151 1856 runsystemd INFO taskqueue.models#run_once - Running 2005: ceph.tasks.get_rbd_performance_data with [u'56a2cbx-56fb-3c88-83ec-49bd54gs5c28', u'pool_name', u'image_name'], {}. Estimated: None
2017-07-28 11:22:11,224 1856 runsystemd ERROR taskqueue.models#run_once - Failed to run "ceph.tasks.get_rbd_performance_data with [u'56a2cbx-56fb-3c88-83ec-49bd54gs5c28', u'pool_name', u'image_name'], {}" created "2017-07-28 11:22:07.633303"
Traceback (most recent call last):
  File "/usr/share/openattic/taskqueue/models.py", line 79, in run_once
    res = task.run_once()
  File "/usr/share/openattic/taskqueue/models.py", line 239, in run_once
    res = self.wrapper.call_now(*self.args, **self.kwargs)
  File "/usr/share/openattic/taskqueue/models.py", line 316, in call_now
    return self._orig_func(*args, **kwargs)
  File "/usr/share/openattic/ceph/tasks.py", line 74, in get_rbd_performance_data
    disk_usage = api.image_disk_usage(pool_name, image_name)
  File "/usr/share/openattic/ceph/librados.py", line 880, in image_disk_usage
    '--pool', pool_name, '--image', name, '--format', 'json'])
  File "/usr/lib64/python2.7/subprocess.py", line 219, in check_output
    raise CalledProcessError(retcode, cmd, output=output)
CalledProcessError: Command '['rbd', 'disk-usage', '--cluster', 'ceph', '--pool', u'pool_name', '--image', u'image_name', '--format', 'json']' returned non-zero exit status 2
2017-07-28 11:22:11,228 1856 runsystemd INFO taskqueue.models#finish_task - Task finished: Command '['rbd', 'disk-usage', '--cluster', 'ceph', '--pool', u'pool_name', '--image', u'image_name', '--format', 'json']' returned non-zero exit status 2
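
The failure can be reproduced manually by running the same command that openATTIC executes (the pool and image names below are the anonymized placeholders from the log above). If the pool or image no longer exists, the command fails with a non-zero exit status:

rbd disk-usage --cluster ceph --pool pool_name --image image_name --format json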

Resolution

There are two possible solutions:

A. Run the command "oaconfig install" again from the admin node. This will also re-create all the Icinga checks.
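
For example, as root on the admin node:

oaconfig install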

B. Remove the stale configuration files from "/etc/icinga/conf.d/". The RBD (RADOS Block Device) configuration files follow this naming pattern:

cephrbd_<fsid>_<pool name>_<RBD name>.cfg

For example, if the cluster fsid is "70e9c50b-e375-37c7-a35b-1af02442b751" and the pool and image for which the errors are logged are named "testpool" and "testimage", remove the following stale file from the "/etc/icinga/conf.d/" directory:

cephrbd_70e9c50b-e375-37c7-a35b-1af02442b751_testpool_testimage.cfg
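
The file-name components can be cross-checked with standard Ceph tools. A minimal sketch, using the example names above ("testpool" and "testimage" are hypothetical):

# Print the cluster fsid
ceph fsid
# List the images that still exist in the pool
rbd ls --pool testpool
# List all RBD check configuration files, then remove the stale one
ls /etc/icinga/conf.d/cephrbd_*.cfg
rm /etc/icinga/conf.d/cephrbd_70e9c50b-e375-37c7-a35b-1af02442b751_testpool_testimage.cfg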

After removing the relevant files, restart the following two services:

systemctl restart icinga.service npcd.service
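
To verify that both services are running again:

systemctl status icinga.service npcd.service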

Cause

When pools and their images are removed from a cluster, the corresponding Icinga configuration files may not be properly removed from the Icinga configuration. openATTIC then continues to schedule performance-data tasks for images that no longer exist, and the underlying "rbd disk-usage" call fails as shown in the log above.

Disclaimer

This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.

  • Document ID: 7021202
  • Creation Date: 15-Aug-2017
  • Modified Date: 03-Mar-2020
  • SUSE Enterprise Storage
