CEPH Scrubbing impact on client io and performance.

Ceph’s default IO priority and class for behind the scene disk operations should be considered required vs best efforts. For those of us who actually utilize our storage for services that require performance will quickly find that deep scrub grinds even the most powerful systems to a halt.

Below are the settings to run the scrub as the lowest possible priority. This REQUIRES CFQ as the scheduler for the spindle disk. Without CFQ you cannot prioritize IO. Since only 1 service utilizes these disk CFQ performance will be comparable to deadline and noop.

Inject the new settings for the existing OSD:
ceph tell osd.* injectargs '--osd_disk_thread_ioprio_priority 7'
ceph tell osd.* injectargs '--osd_disk_thread_ioprio_class idle'

Edit your ceph.conf on your storage nodes to automatically set the the priority at runtime.
#Reduce impact of scrub.
osd_disk_thread_ioprio_class = "idle"
osd_disk_thread_ioprio_priority = 7

You can go a step further and setup redhats optimizations for the system charactistics.
tuned-adm profile latency-performance
This information referenced from multiple sources.

Reference documentation.
http://dachary.org/?p=3268

Disable scrubbing in realtime to determine its impact on your running cluster.
http://dachary.org/?p=3157

A detailed analysis of the scrubbing io impact.
http://blog.simon.leinen.ch/2015/02/ceph-deep-scrubbing-impact.html

OSD Configuration Reference
http://ceph.com/docs/master/rados/configuration/osd-config-ref/

Redhat system tuning.
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/Performance_Tuning_Guide/sect-Red_Hat_Enterprise_Linux-Performance_Tuning_Guide-Tool_Reference-tuned_adm.html

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.