Enhanced Exception Handling Method (enhancedExceptionHandlingMethod)

Requirements

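When developing image-processing algorithms with OpenCV (2.x), program crashes are unavoidable. The crash usually happens at run time, and OpenCV throws a "cv::Exception" with a brief description of the error. Although the exception originates inside the OpenCV library, it is usually caused by the user calling an OpenCV function incorrectly. The description covers only the exception itself. It does not lay out how the exception arose (that requires inspecting the function calls on the stack), and it certainly does not report the values of key variables at the moment of failure. Once a program grows more complex or is deployed to production, a crash leaves the user with no further information about the program's state just before it crashed. That makes bug hunting hard, especially for intermittent failures that are difficult even to reproduce.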

This article tries to find a workable solution. The code below compiles with OpenCV 2.4.4 and g++ 4.6.3.

Approach

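After studying how C/C++ and OpenCV (2.x) handle errors and exceptions (see "Error Handling Methods in Programs"), a solution based on signal handling readily comes to mind.

Signals such as SIGFPE and SIGSEGV are common during algorithm development, so they are used as the examples here.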

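Note that SIGFPE here concerns integer division by zero. Floating-point division by zero does not raise the signal by default; the result is simply inf. Detecting floating-point division by zero requires <fenv.h>/<cfenv>, which belongs to C++11 and is not discussed further here.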
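For readers who do need the floating-point case, the following is a minimal sketch of what a <cfenv> check can look like. It is not part of the article's solution; the function name raises_div_by_zero is hypothetical, and the sketch assumes a C++11 compiler and a platform that implements the floating-point status flags.

```cpp
#include <cfenv>

// Hypothetical helper: returns true if dividing numerator by denominator
// raised the floating-point divide-by-zero status flag.
bool raises_div_by_zero(double numerator, double denominator) {
    // volatile keeps the compiler from folding the division at compile time.
    volatile double n = numerator;
    volatile double d = denominator;

    std::feclearexcept(FE_ALL_EXCEPT);  // start from a clean flag state
    volatile double result = n / d;     // the division runs at execution time
    (void)result;

    // The flag is set by the hardware; no signal is delivered by default.
    return std::fetestexcept(FE_DIVBYZERO) != 0;
}
```

Calling raises_div_by_zero(10.0, 0.0) reports the flag, while raises_div_by_zero(10.0, 2.0) does not. The check is a status query after the fact, which is why it does not fit the signal-based approach discussed in the rest of this article.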

Failed Attempt 1

#include <iostream>
#include <csignal>
#include <cstdio>

// Set by the handler and polled by main(); volatile so the loop re-reads it.
volatile sig_atomic_t signaled = 0;

void new_sigint_handler(int param) {
    // std::cout is not async-signal-safe; it is used here only for the demo.
    std::cout << "In new_sigint_handler" << std::endl;
    signaled = 1;
}

int main(int argc, char **argv) {
    std::cout << "Set user handler for SIGINT signal" << std::endl;
    __sighandler_t old_sigint_handler;  // glibc typedef for void (*)(int)
    old_sigint_handler = signal(SIGINT, new_sigint_handler);
    if (SIG_ERR == old_sigint_handler) {
        perror("Error on set SIGINT handler");
        return -1;
    } else if (SIG_DFL == old_sigint_handler) {
        std::cout << "Found old SIGINT handler: default handling" << std::endl;
    } else if (SIG_IGN == old_sigint_handler) {
        std::cout << "Found old SIGINT handler: ignore signal" << std::endl;
    }
    std::cout << "Set successfully" << std::endl;

    // Busy-wait until the handler reports that SIGINT arrived.
    while (!signaled) {
    }

    // Restore the handler that was installed before.
    std::cout << "Set old handler for SIGINT signal" << std::endl;
    __sighandler_t reset_sigint_handler;
    reset_sigint_handler = signal(SIGINT, old_sigint_handler);
    if (SIG_ERR == reset_sigint_handler) {
        perror("Error on reset SIGINT handler");
        return -2;
    }
    std::cout << "Set successfully" << std::endl;
    return 0;
}

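The user-defined function new_sigint_handler() tells the main program that SIGINT was caught by changing the value of the global variable signaled.

The program first registers the user-defined SIGINT handler with signal(), checks whether registration succeeded, and reports what the previous SIGINT handler was. It then enters the functional part, a busy loop that waits for SIGINT. When the user presses Ctrl-C, SIGINT is raised and caught by the user-defined handler. The handler changes signaled, the main program notices the change and leaves the loop. Finally, the original handler is restored.

Output: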

Set user handler for SIGINT signal
Found old SIGINT handler: default handling
Set successfully
^CIn new_sigint_handler
Set old handler for SIGINT signal
Set successfully

This approach looks reasonable, but it does not use the C++ exception mechanism, and the main program flow is inflexible and clumsy. Worse, with SIGFPE and SIGSEGV the program misbehaves:

#include <iostream>
#include <csignal>
#include <cstdio>
#include <unistd.h>  // sleep()

// Set by the handler and polled by main(); volatile so the loop re-reads it.
volatile sig_atomic_t signaled = 0;

void new_sigfpe_handler(int param) {
    // std::cout is not async-signal-safe; it is used here only for the demo.
    std::cout << "In new_sigfpe_handler" << std::endl;
    signaled = 1;
    sleep(1);  // slow the program down so the repeated calls are visible
}

int main(int argc, char **argv) {
    std::cout << "Set user handler for SIGFPE signal" << std::endl;
    __sighandler_t old_sigfpe_handler;  // glibc typedef for void (*)(int)
    old_sigfpe_handler = signal(SIGFPE, new_sigfpe_handler);
    if (SIG_ERR == old_sigfpe_handler) {
        perror("Error on set SIGFPE handler");
        return -1;
    } else if (SIG_DFL == old_sigfpe_handler) {
        std::cout << "Found old SIGFPE handler: default handling" << std::endl;
    } else if (SIG_IGN == old_sigfpe_handler) {
        std::cout << "Found old SIGFPE handler: ignore signal" << std::endl;
    }
    std::cout << "Set successfully" << std::endl;

    // Floating-point division by zero: no signal, the result is inf.
    float a = 0.0;
    float b = 10 / a;
    std::cout << "b = " << b << std::endl;
    // Integer division by zero: raises SIGFPE on this platform.
    // volatile keeps the compiler from folding the division away.
    volatile int c = 0;
    int d = 10 / c;
    while (!signaled) {
    }

    // Never reached: the handler is re-entered forever (see the output).
    std::cout << "Set old handler for SIGFPE signal" << std::endl;
    __sighandler_t reset_sigfpe_handler;
    reset_sigfpe_handler = signal(SIGFPE, old_sigfpe_handler);
    if (SIG_ERR == reset_sigfpe_handler) {
        perror("Error on reset SIGFPE handler");
        return -2;
    }
    std::cout << "Set successfully" << std::endl;
    return 0;
}

Output:

Set user handler for SIGFPE signal
Found old SIGFPE handler: default handling
Set successfully
b = inf
In new_sigfpe_handler
In new_sigfpe_handler
In new_sigfpe_handler
^C

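The sleep(1); call in new_sigfpe_handler() is there only to slow the program down, because the handler is called endlessly until the user presses Ctrl-C to terminate the program.

In other words, the system keeps detecting SIGFPE and keeps calling the user-defined handler. On this platform, when the handler returns, execution resumes at the faulting instruction, which faults again, so the main program never reaches the check that follows.

SIGSEGV behaves in a similar way.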

#include <iostream>
#include <csignal>
#include <cstdio>
#include <unistd.h>  // sleep()

// Set by the handler and polled by main(); volatile so the loop re-reads it.
volatile sig_atomic_t signaled = 0;

void new_sigsegv_handler(int param) {
    // std::cout is not async-signal-safe; it is used here only for the demo.
    std::cout << "In new_sigsegv_handler" << std::endl;
    signaled = 1;
    sleep(1);  // slow the program down so the repeated calls are visible
}

int main(int argc, char **argv) {
    std::cout << "Set user handler for SIGSEGV signal" << std::endl;
    __sighandler_t old_sigsegv_handler;  // glibc typedef for void (*)(int)
    old_sigsegv_handler = signal(SIGSEGV, new_sigsegv_handler);
    if (SIG_ERR == old_sigsegv_handler) {
        perror("Error on set SIGSEGV handler");
        return -1;
    } else if (SIG_DFL == old_sigsegv_handler) {
        std::cout << "Found old SIGSEGV handler: default handling" << std::endl;
    } else if (SIG_IGN == old_sigsegv_handler) {
        std::cout << "Found old SIGSEGV handler: ignore signal" << std::endl;
    }
    std::cout << "Set successfully" << std::endl;

    // Deliberately write far beyond the end of the array to provoke SIGSEGV.
    int a[10];
    for (int i = 0; i < 65535; ++i) {
        a[i] = i;
    }
    std::cout << "Check" << std::endl;
    for (int i = 50000; i < 50005; ++i) {
        std::cout << "a[" << i << "] = " << a[i] << std::endl;
    }
    while (!signaled) {
    }

    // Never reached: the handler is re-entered forever (see the output).
    std::cout << "Set old handler for SIGSEGV signal" << std::endl;
    __sighandler_t reset_sigsegv_handler;
    reset_sigsegv_handler = signal(SIGSEGV, old_sigsegv_handler);
    if (SIG_ERR == reset_sigsegv_handler) {
        perror("Error on reset SIGSEGV handler");
        return -2;
    }
    std::cout << "Set successfully" << std::endl;
    return 0;
}

Output:

Set user handler for SIGSEGV signal
Found old SIGSEGV handler: default handling
Set successfully
In new_sigsegv_handler
In new_sigsegv_handler
In new_sigsegv_handler
^C

The man page for signal contains the following passage:

According to POSIX, the behavior of a process is undefined after it ignores a SIGFPE, SIGILL, or SIGSEGV signal that was not generated by kill(2) or raise(3). Integer division by zero has undefined result. On some architectures it will generate a SIGFPE signal. (Also dividing the most negative integer by -1 may generate SIGFPE.) Ignoring this signal might lead to an endless loop.
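One conservative reading of this passage is that a handler for SIGSEGV or SIGFPE should never simply return. The sketch below illustrates the simplest such handler: it logs with write() and ends the process with _exit(), both of which are async-signal-safe. This is not the approach the article ends up with; the names fatal_signal_handler and run_faulting_child and the exit code 100 + signum are assumptions made for illustration, and the fault is provoked in a forked child so that the calling process survives.

```cpp
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cstring>

// Hypothetical handler restricted to async-signal-safe calls.
extern "C" void fatal_signal_handler(int signum) {
    const char msg[] = "fatal signal caught, exiting\n";
    ssize_t ignored = write(STDERR_FILENO, msg, sizeof(msg) - 1);
    (void)ignored;
    _exit(100 + signum);  // never return to the faulting instruction
}

// Forks a child that installs the handler and reads through a null pointer.
// Returns the child's exit code, or -1 if it did not exit normally.
int run_faulting_child() {
    pid_t pid = fork();
    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        struct sigaction sa;
        std::memset(&sa, 0, sizeof(sa));
        sa.sa_handler = fatal_signal_handler;
        sigemptyset(&sa.sa_mask);
        if (sigaction(SIGSEGV, &sa, NULL) != 0) {
            _exit(1);
        }
        volatile int *p = NULL;
        int value = *p;  // invalid read, expected to raise SIGSEGV
        _exit(value);    // not reached
    }
    int status = 0;
    if (waitpid(pid, &status, 0) != pid) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because the handler terminates the child, the endless re-entry seen above cannot occur. The price is that the program cannot continue, which is exactly the limitation the rest of the article tries to lift.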

Failed Attempt 2

Replace the global flag with a thrown exception, in the hope that the main program can catch it.

#include <iostream>
#include <csignal>
#include <cstdio>
#include <exception>

void new_sigint_handler(int param) {
    // std::cout is not async-signal-safe; it is used here only for the demo.
    std::cout << "In new_sigint_handler" << std::endl;
    // Throwing from a signal handler: this is the part that fails.
    throw(std::exception());
}

int main(int argc, char **argv) {
    std::cout << "Set user handler for SIGINT signal" << std::endl;
    __sighandler_t old_sigint_handler;  // glibc typedef for void (*)(int)
    old_sigint_handler = signal(SIGINT, new_sigint_handler);
    if (SIG_ERR == old_sigint_handler) {
        perror("Error on set SIGINT handler");
        return -1;
    } else if (SIG_DFL == old_sigint_handler) {
        std::cout << "Found old SIGINT handler: default handling" << std::endl;
    } else if (SIG_IGN == old_sigint_handler) {
        std::cout << "Found old SIGINT handler: ignore signal" << std::endl;
    }
    std::cout << "Set successfully" << std::endl;

    // Wait for Ctrl-C and hope the handler's exception arrives here.
    try {
        while (true) {
        }
    } catch (const std::exception &ex) {
        std::cout << "Catch: " << ex.what() << std::endl;
    }

    std::cout << "Set old handler for SIGINT signal" << std::endl;
    __sighandler_t reset_sigint_handler;
    reset_sigint_handler = signal(SIGINT, old_sigint_handler);
    if (SIG_ERR == reset_sigint_handler) {
        perror("Error on reset SIGINT handler");
        return -2;
    }
    std::cout << "Set successfully" << std::endl;
    return 0;
}

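Output:

Set user handler for SIGINT signal
Found old SIGINT handler: default handling
Set successfully
^CIn new_sigint_handler
terminate called after throwing an instance of 'std::exception'
  what():  std::exception
Aborted (core dumped)

This does not work either: the main program cannot catch the exception. Throwing out of an asynchronously invoked signal handler is not supported by the C++ standard, and here the runtime ends up calling std::terminate.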
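For an asynchronous signal such as SIGINT, one commonly used alternative is to let the handler only record the signal and to throw later from ordinary code. The sketch below illustrates that idea; it is not the method this article settles on, and the names record_signal, throw_if_signaled and demo_deferred_throw are hypothetical. std::raise(SIGINT) stands in for the user pressing Ctrl-C so that the example needs no input.

```cpp
#include <csignal>
#include <stdexcept>

namespace {
// Written by the handler, read by ordinary code.
volatile std::sig_atomic_t pending_signal = 0;
}

// The handler does nothing except record which signal arrived.
extern "C" void record_signal(int signum) {
    pending_signal = signum;
}

// Called from ordinary code: turns a recorded signal into an exception.
void throw_if_signaled() {
    if (pending_signal != 0) {
        pending_signal = 0;
        throw std::runtime_error("signal received");
    }
}

// Installs the handler, simulates Ctrl-C, and reports whether the
// exception was caught by the surrounding try block.
bool demo_deferred_throw() {
    void (*old_handler)(int) = std::signal(SIGINT, record_signal);
    if (old_handler == SIG_ERR) {
        return false;
    }
    bool caught = false;
    try {
        std::raise(SIGINT);       // stands in for the user pressing Ctrl-C
        for (;;) {
            throw_if_signaled();  // the loop polls instead of spinning blindly
        }
    } catch (const std::runtime_error &) {
        caught = true;
    }
    std::signal(SIGINT, old_handler);  // restore the previous handler
    return caught;
}
```

This only helps when the handler is allowed to return. For SIGFPE and SIGSEGV caused by a faulting instruction, returning re-executes the fault as shown in the first attempt, so a different mechanism is needed there.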

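Working Version

The experiments show that signal() can register a user-defined signal handler successfully, but what the program does after the handler finishes needs more explicit and more forceful control before the main program can carry on normally.

The C/C++ standard library provides the header <setjmp.h>/<csetjmp>, which gives the user exactly this kind of control.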
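Before signals enter the picture, it may help to see the bare control flow of setjmp and longjmp. The following is a minimal sketch with hypothetical names (recovery_point, bail_out, run_with_recovery): setjmp returns 0 when the recovery point is first set up and a non-zero value when control arrives there through longjmp.

```cpp
#include <csetjmp>

namespace {
std::jmp_buf recovery_point;  // saved execution context
int jump_code = 0;            // value handed over by the code that bails out
}

// Abandons the current call chain and resumes at the saved recovery point.
void bail_out(int code) {
    jump_code = code;
    std::longjmp(recovery_point, 1);  // never returns
}

// Returns 0 on normal completion, otherwise the code passed to bail_out().
int run_with_recovery(int code) {
    if (setjmp(recovery_point) != 0) {
        return jump_code;  // second pass: reached via longjmp
    }
    if (code != 0) {
        bail_out(code);    // simulates a failure deeper in the call chain
    }
    return 0;              // normal completion
}
```

Two caveats apply when this is used in C++ code. Local variables that change between setjmp and longjmp should be declared volatile, and longjmp does not run destructors of objects in the frames it skips, so the jump should not cross frames that own non-trivial objects.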

The code below registers three signal handlers at once. The main program first waits for Ctrl-C to raise SIGINT, and then triggers SIGFPE and SIGSEGV in turn.

#include <iostream>
#include <csignal>
#include <cstdio>
#include <csetjmp>
#include <exception>

// One jump buffer per signal; each is filled in by setjmp() in main().
jmp_buf int_env;
jmp_buf fpe_env;
jmp_buf segv_env;

// Each handler jumps back to the matching setjmp() instead of returning.
// std::cout is not async-signal-safe; it is used here only for the demo.
void new_sigint_handler(int param) {
    std::cout << "In new_sigint_handler" << std::endl;
    longjmp(int_env, SIGINT);
}

void new_sigfpe_handler(int param) {
    std::cout << "In new_sigfpe_handler" << std::endl;
    longjmp(fpe_env, SIGFPE);
}

void new_sigsegv_handler(int param) {
    std::cout << "In new_sigsegv_handler" << std::endl;
    longjmp(segv_env, SIGSEGV);
}

int main(int argc, char **argv) {
    std::cout << "Set user handler for SIGINT signal" << std::endl;
    __sighandler_t old_sigint_handler;  // glibc typedef for void (*)(int)
    old_sigint_handler = signal(SIGINT, new_sigint_handler);
    if (SIG_ERR == old_sigint_handler) {
        perror("Error on set SIGINT handler");
        return -1;
    } else if (SIG_DFL == old_sigint_handler) {
        std::cout << "Found old SIGINT handler: default handling" << std::endl;
    } else if (SIG_IGN == old_sigint_handler) {
        std::cout << "Found old SIGINT handler: ignore signal" << std::endl;
    }
    std::cout << "Set successfully" << std::endl;

    std::cout << "Set user handler for SIGFPE signal" << std::endl;
    __sighandler_t old_sigfpe_handler;
    old_sigfpe_handler = signal(SIGFPE, new_sigfpe_handler);
    if (SIG_ERR == old_sigfpe_handler) {
        perror("Error on set SIGFPE handler");
        return -2;
    } else if (SIG_DFL == old_sigfpe_handler) {
        std::cout << "Found old SIGFPE handler: default handling" << std::endl;
    } else if (SIG_IGN == old_sigfpe_handler) {
        std::cout << "Found old SIGFPE handler: ignore signal" << std::endl;
    }
    std::cout << "Set successfully" << std::endl;

    std::cout << "Set user handler for SIGSEGV signal" << std::endl;
    __sighandler_t old_sigsegv_handler;
    old_sigsegv_handler = signal(SIGSEGV, new_sigsegv_handler);
    if (SIG_ERR == old_sigsegv_handler) {
        perror("Error on set SIGSEGV handler");
        return -3;
    } else if (SIG_DFL == old_sigsegv_handler) {
        std::cout << "Found old SIGSEGV handler: default handling" << std::endl;
    } else if (SIG_IGN == old_sigsegv_handler) {
        std::cout << "Found old SIGSEGV handler: ignore signal" << std::endl;
    }
    std::cout << "Set successfully" << std::endl;

    // SIGINT: setjmp() returns 0 first, then SIGINT after the handler jumps back.
    try {
        int status = setjmp(int_env);
        if (status) {
            throw std::exception();  // thrown from main(), so it can be caught
        }
        while (true) {  // wait for Ctrl-C
        }
    } catch (const std::exception &ex) {
        std::cout << "Catch SIGINT: " << ex.what() << std::endl;
    }

    // SIGFPE: integer division by zero.
    try {
        int status = setjmp(fpe_env);
        if (status) {
            throw std::exception();
        }
        volatile int a = 0;  // volatile keeps the division from being folded away
        int b = 10 / a;
        while (true) {
        }
    } catch (const std::exception &ex) {
        std::cout << "Catch SIGFPE: " << ex.what() << std::endl;
    }

    // SIGSEGV: read through a null pointer.
    try {
        int status = setjmp(segv_env);
        if (status) {
            throw std::exception();
        }
        volatile int *a = NULL;  // volatile forces the invalid read to happen
        int b = 10 * (*a);
        while (true) {
        }
    } catch (const std::exception &ex) {
        std::cout << "Catch SIGSEGV: " << ex.what() << std::endl;
    }

    // Restore the three original handlers.
    std::cout << "Set old handler for SIGINT signal" << std::endl;
    __sighandler_t reset_sigint_handler;
    reset_sigint_handler = signal(SIGINT, old_sigint_handler);
    if (SIG_ERR == reset_sigint_handler) {
        perror("Error on reset SIGINT handler");
        return -4;
    }
    std::cout << "Set successfully" << std::endl;

    std::cout << "Set old handler for SIGFPE signal" << std::endl;
    __sighandler_t reset_sigfpe_handler;
    reset_sigfpe_handler = signal(SIGFPE, old_sigfpe_handler);
    if (SIG_ERR == reset_sigfpe_handler) {
        perror("Error on reset SIGFPE handler");
        return -5;
    }
    std::cout << "Set successfully" << std::endl;

    std::cout << "Set old handler for SIGSEGV signal" << std::endl;
    __sighandler_t reset_sigsegv_handler;
    reset_sigsegv_handler = signal(SIGSEGV, old_sigsegv_handler);
    if (SIG_ERR == reset_sigsegv_handler) {
        perror("Error on reset SIGSEGV handler");
        return -6;
    }
    std::cout << "Set successfully" << std::endl;

    return 0;
}

Output:

Set user handler for SIGINT signal
Found old SIGINT handler: default handling
Set successfully
Set user handler for SIGFPE signal
Found old SIGFPE handler: default handling
Set successfully
Set user handler for SIGSEGV signal
Found old SIGSEGV handler: default handling
Set successfully
^CIn new_sigint_handler
Catch SIGINT: std::exception
In new_sigfpe_handler
Catch SIGFPE: std::exception
In new_sigsegv_handler
Catch SIGSEGV: std::exception
Set old handler for SIGINT signal
Set successfully
Set old handler for SIGFPE signal
Set successfully
Set old handler for SIGSEGV signal
Set successfully

Checking the exit status shows that the program exited normally:

$ echo $?
0
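In the program above every signal is triggered only once. POSIX leaves it unspecified whether longjmp restores the signal mask, so after jumping out of a handler the signal may remain blocked and a second occurrence may not reach the handler. The following sketch, which is an addition and not part of the original program, uses the POSIX pair sigsetjmp/siglongjmp with sigaction to recover from the same signal repeatedly. The names jump_out_of_sigfpe and count_recoveries are hypothetical, and raise(SIGFPE) stands in for a real arithmetic fault.

```cpp
#include <setjmp.h>
#include <signal.h>
#include <cstring>

namespace {
sigjmp_buf fpe_env;  // execution context plus the saved signal mask
}

extern "C" void jump_out_of_sigfpe(int) {
    siglongjmp(fpe_env, 1);  // also restores the mask saved by sigsetjmp
}

// Raises SIGFPE `attempts` times and returns how many times control came
// back through the recovery point, or -1 if the handler cannot be installed.
int count_recoveries(int attempts) {
    struct sigaction sa, old_sa;
    std::memset(&sa, 0, sizeof(sa));
    sa.sa_handler = jump_out_of_sigfpe;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGFPE, &sa, &old_sa) != 0) {
        return -1;
    }

    // volatile: both variables change between sigsetjmp and siglongjmp.
    volatile int recovered = 0;
    volatile int i = 0;
    for (i = 0; i < attempts; i = i + 1) {
        if (sigsetjmp(fpe_env, 1) == 0) {  // 1 = save the current signal mask
            raise(SIGFPE);                 // stands in for a real fault
        } else {
            recovered = recovered + 1;     // arrived here via siglongjmp
        }
    }

    sigaction(SIGFPE, &old_sa, NULL);      // restore the previous disposition
    return recovered;
}
```

With this arrangement count_recoveries(3) recovers three times, because each siglongjmp puts back the signal mask that was in effect when sigsetjmp ran.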

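I have packaged this functionality as a library named "enhanced exception handling method", abbreviated "eEHM", and added stack-trace parsing so that the sequence of function calls is clearly visible. The library is available for download.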
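The source of eEHM is not reproduced in this article. As a rough illustration of how a call stack can be captured on Linux, the sketch below uses glibc's backtrace() and backtrace_symbols() from <execinfo.h>. Whether eEHM is built on these functions is an assumption, and the name capture_stack is hypothetical.

```cpp
#include <execinfo.h>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper: captures up to max_frames frames of the current
// call stack and returns one text description per frame.
std::vector<std::string> capture_stack(int max_frames) {
    std::vector<std::string> frames;
    if (max_frames <= 0) {
        return frames;
    }
    std::vector<void *> addresses(max_frames);
    int count = backtrace(&addresses[0], max_frames);  // raw return addresses
    char **symbols = backtrace_symbols(&addresses[0], count);
    if (symbols == NULL) {
        return frames;
    }
    for (int i = 0; i < count; ++i) {
        frames.push_back(symbols[i]);  // module, symbol and address as text
    }
    std::free(symbols);  // one malloc'ed block; strings are not freed one by one
    return frames;
}
```

Function names only appear in the text when the program is linked with -rdynamic; otherwise the entries show the module and the address, and a tool such as addr2line is needed to resolve them. Note also that backtrace_symbols() allocates memory, so calling it directly inside a signal handler is not async-signal-safe; backtrace_symbols_fd() exists for that situation.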

The following example shows how to use it with three signals enabled: SIGFPE, SIGSEGV and SIGINT. Sample code:

 1 #include <iostream>
 2 #include "include/eEHM.h"
 3 
 4 int main(int argc, char **argv) {
 5     eEHM *error_handling = NULL;  // stays NULL if construction throws
 6     try {
 7         error_handling = new eEHM();
 8         error_handling->SetUserHandler(SIGFPE);
 9         error_handling->SetUserHandler(SIGSEGV);
10         error_handling->SetUserHandler(SIGINT);
11     } catch (signal_error ex) {
12         std::cout << ex.what() << std::endl;
13     }
14 
15     try {
16         if (error_handling->isValid()) {
17             int a = 0;
18             int b = 10 / a;
19         }
20     } catch (signal_error ex) {
21         std::cout << ex.what() << std::endl;
22     }
23 
24     try {
25         if (error_handling->isValid()) {
26             int *a = NULL;
27             int b = 10 * (*a);
28         }
29     } catch (signal_error ex) {
30         std::cout << ex.what() << std::endl;
31     }
32 
33     try {
34         if (error_handling->isValid()) {
35             while (true) {
36             }
37         }
38     } catch (signal_error ex) {
39         std::cout << ex.what() << std::endl;
40     }
41 
42     return 0;
43 }

Expected output in Debug mode:

SIGFPE: backtrace 5 records:
  0:Debug/test() [0x402ba1]
    eEHM::GetTrace() at eEHM.cpp:172
  1:Debug/test() [0x401ee4]
    eEHM::SetUserHandler(int) at eEHM.cpp:106
  2:Debug/test() [0x4034b4]
    main at test.cpp:17
  3:/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xed) [0x7f518fb6a76d]
    ??
  4:Debug/test() [0x401619]
    _start at ??:0

SIGSEGV: backtrace 5 records:
  0:Debug/test() [0x402ba1]
    eEHM::GetTrace() at eEHM.cpp:172
  1:Debug/test() [0x40206d]
    eEHM::SetUserHandler(int) at eEHM.cpp:115
  2:Debug/test() [0x4034db]
    main at test.cpp:26
  3:/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xed) [0x7f518fb6a76d]
    ??
  4:Debug/test() [0x401619]
    _start at ??:0

^CSIGINT: backtrace 5 records:
  0:Debug/test() [0x402ba1]
    eEHM::GetTrace() at eEHM.cpp:172
  1:Debug/test() [0x4021f6]
    eEHM::SetUserHandler(int) at eEHM.cpp:124
  2:Debug/test() [0x403505]
    main at test.cpp:35
  3:/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xed) [0x7f518fb6a76d]
    ??
  4:Debug/test() [0x401619]
    _start at ??:0

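All three signals produce the stack trace at the point where the signal was triggered, and the sequence of function calls is clear. Because this is a Debug build, the source file and the approximate line number of the problem are available as well. The first two records come from inside the eEHM library, and the last two describe how the operating system started the program; in general they can be ignored. For that reason the eEHM library deliberately suppresses these four records.

Actual output in Debug mode:

SIGFPE: backtrace 5 records:
  2:Debug/test() [0x4034fc]
    main at test.cpp:17

SIGSEGV: backtrace 5 records:
  2:Debug/test() [0x403523]
    main at test.cpp:26

^CSIGINT: backtrace 5 records:
  2:Debug/test() [0x40354d]
    main at test.cpp:35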

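This is the simplified output. Only the part of the stack that relates to user code is printed, but the basic stack information (the total record count) and the original index of each record are preserved.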
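The filtering described here can be sketched as a small function that drops a fixed number of records at each end while keeping every surviving record's original index. This is an illustration of the idea, not eEHM's actual code; NumberedFrame and keep_user_frames are hypothetical names, and the counts to skip (two at the head, two at the tail) are taken from the description above.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// A stack record paired with its index in the unfiltered trace.
typedef std::pair<std::size_t, std::string> NumberedFrame;

// Drops skip_head records from the start and skip_tail records from the
// end of the trace, keeping the original index of every remaining record.
std::vector<NumberedFrame> keep_user_frames(
        const std::vector<std::string> &frames,
        std::size_t skip_head,
        std::size_t skip_tail) {
    std::vector<NumberedFrame> kept;
    if (frames.size() <= skip_head + skip_tail) {
        return kept;  // nothing but library and startup records
    }
    for (std::size_t i = skip_head; i < frames.size() - skip_tail; ++i) {
        kept.push_back(NumberedFrame(i, frames[i]));
    }
    return kept;
}
```

Applied to a five-record trace with skip_head = 2 and skip_tail = 2, only the record with index 2 remains, which matches the single "2:" entry per signal in the output above.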

Actual output in Release mode:

SIGFPE: backtrace 5 records:
  2:Release/test() [0x40350c]
    main at ??:0

SIGSEGV: backtrace 5 records:
  2:Release/test() [0x403533]
    main at ??:0

^CSIGINT: backtrace 5 records:
  2:Release/test() [0x40355d]
    main at ??:0

In Release mode, the file name and line number are not available.