Recently we had an issue with the rsyslog daemon: now and then it crashed, and the only practical way to debug such an application (especially if it’s not something developed by your team) is a core dump. It’s not the kind of task that you do every day, and it took me a while to search and remember how I did it last time.
Enable Core Dump
The first step is to enable core dumps, which is pretty simple. Follow this guide and you can enable it in no time.
A core dump will only be generated for processes that were started after the above changes. Don’t forget to restart the daemon or application you have a problem with.
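Whether a process may write a core file is governed by a per-process resource limit (`RLIMIT_CORE`) that child processes inherit when they are started, which is why a daemon that is already running does not pick up the change. As a rough illustration, and not the method from the guide above, this Python sketch reads the limit of the current process and raises the soft limit up to the hard limit. The function names are made up for this example.

```python
import resource


def core_limit():
    """Return the (soft, hard) core-file size limit of this process, in bytes."""
    return resource.getrlimit(resource.RLIMIT_CORE)


def raise_core_limit():
    """Raise the soft limit to the hard limit (no root needed) and return the new limits."""
    _soft, hard = core_limit()
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
    return core_limit()


def describe(limit):
    """Render a limit as a byte count, or 'unlimited' when no limit is set."""
    return "unlimited" if limit == resource.RLIM_INFINITY else str(limit)


if __name__ == "__main__":
    soft, hard = raise_core_limit()
    # A soft limit of 0 normally means no core file is written for this process.
    print("soft:", describe(soft), "hard:", describe(hard))
```

The change only affects the process that calls it and anything it starts afterwards, which mirrors the restart requirement described above.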
Wait for the next crash!
Make coffee, enjoy life, and be prepared for the next crash :)
We are using Zenoss to monitor important services such as apache and rsyslog. If a process is not running, we get an SMS right away.
Install debugging packages
Make sure you have the gdb package installed.
GDB, the GNU Debugger, is the standard debugger for the GNU operating system.
If you are using CentOS, during debugging you might see some messages like
In order to debug and see the full stack trace, you first need to install yum-utils. Then install the debugging packages for the application with debuginfo-install. It installs the debugging symbols that are required to debug the rsyslog daemon.
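Before waiting for the next crash, it can be handy to confirm that the tools mentioned above are actually on the machine. This is a minimal sketch, assuming the executables are named `gdb` and `debuginfo-install`; `missing_tools` is a helper name invented for this example.

```python
import shutil


def missing_tools(tools):
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    # Assumed executable names; adjust them for your distribution.
    missing = missing_tools(["gdb", "debuginfo-install"])
    if missing:
        print("not installed:", ", ".join(missing))
    else:
        print("gdb and debuginfo-install are available")
```

On CentOS the missing pieces are typically installed with `yum install gdb yum-utils`, followed by `debuginfo-install rsyslog`.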
Read the Core dump
What you are going to get out of this core dump is the stack trace of the application and the exact line of code that caused the failure; interpreting it requires some programming skills. I’m going to explain how I debugged and read the stack trace for rsyslog, but you can follow the same steps for any other application.
gdb for the core dump
You need to run the following command to start gdb for that specific core dump.
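gdb takes the path of the executable followed by the path of the core file. The sketch below picks the most recently modified core file from a directory and builds that command line. Both paths in it are placeholders rather than values from my setup, and the helper names are invented for this example.

```python
import glob
import os


def newest_core(directory, pattern="core*"):
    """Return the most recently modified file matching `pattern` in `directory`, or None."""
    candidates = glob.glob(os.path.join(directory, pattern))
    return max(candidates, key=os.path.getmtime) if candidates else None


def gdb_command(binary, core):
    """Build the argument list for `gdb <binary> <core>`."""
    return ["gdb", binary, core]


if __name__ == "__main__":
    # Placeholder paths: use your own core directory and daemon binary.
    core = newest_core("/path/to/cores")
    if core is None:
        print("no core file found yet")
    else:
        print(" ".join(gdb_command("/sbin/rsyslogd", core)))
```

Running the printed command opens an interactive gdb session on that core file.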
Based on my configuration, the core dump is saved under
Now the gdb prompt is ready for a command.
If you are new to gdb, do yourself a favour and check this gdb crash course.
The first command that I usually run, especially in this situation, is where. It prints the stack trace and the line of code that was running when the crash happened.
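The same backtrace can also be captured without an interactive session, which is convenient when you want to attach it to a bug report. This is a sketch that relies on gdb's `--batch` and `-ex` options; the paths are placeholders and the function names are invented for this example.

```python
import subprocess


def backtrace_command(binary, core):
    """Build a gdb invocation that runs `where` against the core file and exits."""
    return ["gdb", "--batch", "-ex", "where", binary, core]


def backtrace(binary, core):
    """Run gdb and return everything it printed (stdout and stderr combined)."""
    result = subprocess.run(
        backtrace_command(binary, core),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    return result.stdout


if __name__ == "__main__":
    try:
        # Placeholder paths: substitute your daemon binary and core file.
        print(backtrace("/sbin/rsyslogd", "/path/to/core"))
    except FileNotFoundError:
        print("gdb is not installed")
```

If the debuginfo packages are missing, the output will mostly show addresses instead of function names and line numbers.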
What I can read from this output is that the program crashed in queue.c, in the function qDelLinkedList, at line 586. My gut feeling was that it had something to do with memory allocation. To dig deeper, I had to find the source code that matched the version of the application I was running. We found that the application crashed while calling free() on a variable. That looked like a dead end to me! Fortunately, a newer libc package was available on CentOS; we upgraded it, and so far everything has been working smoothly.
These steps are just a starting point for debugging a crash. I wish you a wonderful journey!