
This manual shows a few tools for debugging the cmc core when it crashes or freezes.



Analyze cmc core

strace

You can use strace to trace the running cmc process when you face an issue:

klapp-0130# strace -o cmc-strace.log -p $(cat ~<MYSITE>/tmp/run/cmc.pid)
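If the pid file is stale, strace fails to attach with an error that is easy to misread. The following is a minimal sketch that validates the pid first; the helper name cmc_pid is hypothetical and not part of Checkmk, and the pid file path is the one used above.

```shell
#!/bin/sh
# Hypothetical helper (not part of Checkmk): print the PID stored in a
# pid file, but only if it is numeric and that process is running.
cmc_pid() {
    pidfile=$1
    [ -r "$pidfile" ] || { echo "pid file not readable: $pidfile" >&2; return 1; }
    pid=$(cat "$pidfile")
    case $pid in
        ''|*[!0-9]*) echo "invalid pid in $pidfile" >&2; return 1 ;;
    esac
    # kill -0 sends no signal; it only checks that the process exists
    # and that we are allowed to signal it.
    kill -0 "$pid" 2>/dev/null || { echo "no running process $pid" >&2; return 1; }
    echo "$pid"
}
```

It could then be used like this: pid=$(cmc_pid ~<MYSITE>/tmp/run/cmc.pid) && strace -f -tt -o cmc-strace.log -p "$pid". Here -f also traces the other threads of the process and -tt adds timestamps to each line; both are optional.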


valgrind

You can use valgrind to start the cmc in debug mode, which gives you a full stack trace. If valgrind is not available on your system, either install it or run the cmc with -g only.

su - SITE
omd stop cmc
valgrind --num-callers=30 cmc -g

or 

su - SITE 
omd stop cmc
cmc -g
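The choice between the two variants above can be sketched as a small helper. This is only an illustration: the function name cmc_debug_cmd is hypothetical, and it does nothing more than print the command line to run.

```shell
#!/bin/sh
# Hypothetical helper: print the command line for starting the core in
# debug mode. Uses valgrind when it is found in PATH, plain "cmc -g"
# otherwise.
cmc_debug_cmd() {
    if command -v valgrind >/dev/null 2>&1; then
        echo "valgrind --num-callers=30 cmc -g"
    else
        echo "cmc -g"
    fi
}
```

As the site user, after omd stop cmc, running $(cmc_debug_cmd) then starts whichever variant is available.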

gdb

With gdb you can analyze the core dump, if Checkmk created one. Note: Checkmk only creates one if you enable core dumps in the global settings.

With the "r" (run) command you can start the cmc again and analyze it inside gdb.

gdb /omd/sites/<SITENAME>/bin/cmc --core=<PATH/TO/COREDUMP>
(gdb) r 


frozen cmc

When the cmc seems to be frozen and nothing happens, please run this command before restarting the cmc:

klapp-0130# gdb -p $(cat ~<MYSITE>/tmp/run/cmc.pid) --batch -ex 'set pagination off' -ex 'thread apply all backtrace'

or to write that to a file:

klapp-0130# gdb -p $(cat ~<MYSITE>/tmp/run/cmc.pid) --batch -ex 'set pagination off' -ex 'thread apply all backtrace' |& tee /home/anastasios/Downloads/cmccrash/gdb.txt

Analyze coredump file

By default, core dump creation is disabled. You can enable it via Setup → Global settings → Monitoring core → Enable core dumps.

After the cmc crashes, a core dump is written to ~/var/check_mk/core/.
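When several dumps have accumulated, you usually want the most recent one. A minimal sketch for picking it, assuming the directory contains only core files; the helper name newest_core is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: print the most recently modified regular file in
# the given directory. Returns non-zero if the directory has no files.
newest_core() {
    dir=$1
    newest=
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        # -nt ("newer than") compares modification times.
        if [ -z "$newest" ] || [ "$f" -nt "$newest" ]; then
            newest=$f
        fi
    done
    [ -n "$newest" ] && echo "$newest"
}
```

As the site user, the result can be passed straight to gdb, for example: gdb ~/bin/cmc --core="$(newest_core ~/var/check_mk/core)".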

gdb

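Open the core dump together with the binary in gdb. In the example below, gdb warns that the core file may not match the executable: the core was generated by python3 (cmk --discover-marked-hosts) and not by the cmc, so the backtrace shows only unresolved frames (??).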

gdb /omd/sites/at/bin/cmc --core=/home/anastasios/Downloads/core.python3.989.4b7ee3adffd14e31a0188aac0c215161.804036.1640164046000000 
GNU gdb (Ubuntu 9.2-0ubuntu1~20.04) 9.2
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /omd/sites/at/bin/cmc...

warning: core file may not match specified executable file.
[New LWP 804036]
Core was generated by `python3 /omd/sites/mysite/bin/cmk --discover-marked-hosts'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007f2b661be1fd in ?? ()
(gdb) where
#0  0x00007f2b661be1fd in ?? ()
#1  0x00007ffed8a75060 in ?? ()
#2  0x0000000000000000 in ?? ()


# Run it (if it's still crashing, you'll see it crash)
r 
# View the backtrace (call stack)
bt  
# Quit when done 
q
# Memory mappings
i proc m

# Listing all threads. This is really useful! 
thread apply all bt

Enable logging within gdb

set logging file gdb_log.txt
set logging on
set trace-commands on
show logging     # prove logging is on
flush
set print pretty on
bt               # view the backtrace
set logging off  
show logging     # prove logging is back off
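The same information can also be collected without an interactive session by putting the commands into a gdb command file. This is a sketch only; the helper name write_gdb_script and the selection of commands are assumptions, so adjust them to what you need.

```shell
#!/bin/sh
# Hypothetical helper: write a small gdb command file to the given path.
# The commands are the ones used interactively in this manual.
write_gdb_script() {
    cat > "$1" <<'EOF'
set pagination off
set print pretty on
info proc mappings
thread apply all bt
EOF
}
```

After write_gdb_script cmc-core.gdb, the file can be run with gdb --batch -x cmc-core.gdb /omd/sites/<SITENAME>/bin/cmc --core=<PATH/TO/COREDUMP> > gdb_log.txt. In batch mode gdb exits after the last command, so the redirected output takes the place of the logging commands.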


objdump

With objdump you can dump the full contents of the core file:

objdump -s /home/anastasios/Downloads/core.python3.989.4b7ee3adffd14e31a0188aac0c215161.804036.1640164046000000 >dump_sup8890.txt


file command

With the file command you can display basic information about the dump, such as the command that produced it:

file /home/anastasios/Downloads/core.python3.989.4b7ee3adffd14e31a0188aac0c215161.804036.1640164046000000 
/home/anastasios/Downloads/core.python3.989.4b7ee3adffd14e31a0188aac0c215161.804036.1640164046000000: ELF 64-bit LSB core file, x86-64, version 1 (SYSV), SVR4-style, from 'python3 /omd/sites/test/bin/cmk --discover-marked-hosts', real uid: 989, effective uid: 989, real gid: 1000, effective gid: 1000, execfn: '/omd/sites/test/bin/python3', platform: 'x86_64'
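The "from '...'" field in this output shows which program produced the core, which is worth checking before loading the dump together with the cmc binary. A minimal sketch for extracting it; both helper names are hypothetical, and the check assumes that a cmc core records a program name ending in "cmc":

```shell
#!/bin/sh
# Hypothetical helper: read `file` output on stdin and print the command
# line recorded in the "from '...'" field.
core_origin() {
    sed -n "s/.*, from '\([^']*\)'.*/\1/p"
}

# Hypothetical helper: succeed only if the first word of the recorded
# command line is cmc (with or without a leading path). That a cmc core
# looks like this is an assumption.
is_cmc_origin() {
    case ${1%% *} in
        cmc|*/cmc) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example: origin=$(file <PATH/TO/COREDUMP> | core_origin); is_cmc_origin "$origin" || echo "core was not written by cmc: $origin". For the dump shown above, this reports python3 as the origin.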