Debug logs
The main debugging tool for Ceph is the dout and derr logging functions. Collectively, these are referred to as “dout logging.”
Dout has several log facilities, which can be set at various log levels using the configuration management system. So it is possible to enable debugging just for the messenger, by setting debug_ms to 10, for example.
The dout macro avoids even generating log messages which are not going to be used, by enclosing them in an “if” statement. What this means is that if you have the debug level set at 0, and you run this code:
    dout(20) << "myfoo() = " << myfoo() << dendl;
myfoo() will not be called here.
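The “if” guard described above can be sketched in a few lines of C++. This is only an illustration of the technique, not Ceph's actual macro: my_dout, my_dendl, g_debug_level, g_myfoo_calls, g_log, and log_at_level_20 are hypothetical names invented for the example, and the real dout consults per-subsystem levels rather than one global integer.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical stand-ins for illustration; these are NOT Ceph's definitions.
static int g_debug_level = 0;     // plays the role of the configured debug level
static int g_myfoo_calls = 0;     // counts how often myfoo() actually runs
static std::ostringstream g_log;  // captures output in place of a real log sink

// The whole streaming expression becomes the "else" branch of an if
// statement, so when the level check fails nothing to the right of the
// macro is evaluated.
#define my_dout(level) \
  if ((level) > g_debug_level) {} else g_log

#define my_dendl '\n'

int myfoo() {
  ++g_myfoo_calls;
  return 42;
}

// Emits one message at level 20; a no-op when the level is below 20.
void log_at_level_20() {
  my_dout(20) << "myfoo() = " << myfoo() << my_dendl;
}
```

With g_debug_level left at 0, calling log_at_level_20() leaves g_myfoo_calls at 0 and g_log empty. After raising g_debug_level to 20, the same call runs myfoo() once and appends the message to g_log.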
Unfortunately, the performance of debug logging is relatively low. This is because there is a single, process-wide mutex which every debug output statement takes, and every debug output statement leads to a write() system call or a call to syslog(). There is also a computational overhead to using C++ streams to consider. So you will need to be parsimonious in your logging to get the best performance.
Sometimes, enabling logging can hide race conditions and other bugs by changing the timing of events. Keep this in mind when debugging.
Performance counters
Ceph daemons use performance counters to track key statistics like number of inodes pinned. Performance counters are essentially sets of integers and floats which can be set, incremented, and read using the PerfCounters API.
A PerfCounters object is usually associated with a single subsystem. It contains multiple counters. This object is thread-safe because it is protected by an internal mutex. You can create multiple PerfCounters objects.
Currently, three types of performance counters are supported: u64 counters, float counters, and long-run floating-point average counters. These are created by PerfCountersBuilder::add_u64, PerfCountersBuilder::add_fl, and PerfCountersBuilder::add_fl_avg, respectively. u64 and float counters simply provide a single value which can be updated, incremented, and read atomically. Floating-point average counters provide two values: the current total, and the number of times the total has been changed. This is intended to provide a long-run average value.
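To make the average counter concrete, here is a minimal sketch of a counter that stores a running total and a change count behind a mutex. It is not the Ceph PerfCounters implementation: the AvgCounter class and its member names are hypothetical, and the only ideas taken from the text above are the two stored values and the mutex protection.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>

// Hypothetical illustration; not part of the Ceph PerfCounters API.
class AvgCounter {
public:
  // Record one sample: update the total and the change count together.
  void add(double amount) {
    std::lock_guard<std::mutex> lock(mutex_);
    total_ += amount;
    ++count_;
  }

  // Current running total.
  double total() const {
    std::lock_guard<std::mutex> lock(mutex_);
    return total_;
  }

  // Number of times the total has been changed.
  std::uint64_t count() const {
    std::lock_guard<std::mutex> lock(mutex_);
    return count_;
  }

  // Long-run average: total divided by the number of changes,
  // or 0 if the counter has never been updated.
  double average() const {
    std::lock_guard<std::mutex> lock(mutex_);
    return count_ == 0 ? 0.0 : total_ / static_cast<double>(count_);
  }

private:
  mutable std::mutex mutex_;  // guards both fields so readers see a consistent pair
  double total_ = 0.0;
  std::uint64_t count_ = 0;
};
```

A reader that receives the total and the count separately can compute the same average itself, and can also subtract two snapshots to get the average over just the interval between them.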
Performance counter information can be read in JSON format from the administrative socket (admin_sock). This is implemented as a UNIX domain socket. The Ceph performance counter plugin for collectd shows an example of how to access this information. Another example can be found in the unit tests for the administrative sockets.