Wednesday, February 04, 2009

做出自己的back trace function

相信有在用arm linux的, 應該對kernel panic不陌生吧~~ 在你對kernel做了一些無法挽救的錯事後, kernel叫了一聲"Oops~~",然後就死在路邊~~ 不過幸運的事, 通常kernel會在死之前留下一些"線索", 好讓你跟隨這些線索找出些端倪...

舉例來說, 我故意在我的init module 裡插入存取非法位址的動作, kernel果然就掛在那邊並且印出:

Unable to handle kernel paging request at virtual address dcc01120
pgd = c0004000
[dcc01120] *pgd=00000000
Internal error: Oops: 805 [#1]
Modules linked in:
CPU: 0
PC is at audio_aic32_init+0x44/0x1cc
LR is at 0x1
pc : [] lr : [<00000001>] Not tainted
sp : c0387fc4 ip : 60000013 fp : c0387fd4
r10: 00000000 r9 : 00000000 r8 : 00000000
r7 : c001f70c r6 : 00000000 r5 : c0386000 r4 : 00000000
r3 : fbbc0000 r2 : e1041128 r1 : 00000001 r0 : e1041120
Flags: nZCv IRQs on FIQs on Mode SVC_32 Segment kernel
Control: 5317F Table: 80004000 DAC: 00000017
Process swapper (pid: 1, stack limit = 0xc0386198)
Stack: (0xc0387fc4 to 0xc0388000)
7fc0: c001f6dc c0387ff4 c0387fd8 c0076290 c001d00c 00000000 00000000
7fe0: 00000000 00000000 00000000 c0387ff8 c008d738 c007620c 00000000 00000000
Backtrace:
[] (audio_aic32_init+0x0/0x1cc) from [] (init+0x94/0x1e0)
r4 = C001F6DC
[] (init+0x0/0x1e0) from [] (do_exit+0x0/0xdc0)
r7 = 00000000 r6 = 00000000 r5 = 00000000 r4 = 00000000
Code: e59f3174 e3a01001 e5801000 e2422d25 (e7801003)
<0>Kernel panic - not syncing: Attempted to kill init!


嘿嘿, 很明顯的kernel就是死在audio_aic32_init,

我們再去仔細看看這個function即可;
這麼說起來, kernel的這個機制還真好用, 能讓我們了解掛掉的時候,

是由哪幾個function call下來的,

那麼...我們有沒有辦法拿來用呢?

有時你會想知道, kernel到底何時, 又是從哪呼叫到這個function;
又有時你會想知道, kernel到底怎麼call到我們function裡的,

我明明只是在結構中加入 .probe= probe_function,
kernel卻能找到我的probe_function, 到底是從哪進入的呢,

當然你也可以一層一行的去追code去printk,
但是如果能善用kernel的這個"線索"function, 想必能省力不少

好, 既然要用這個工具,

讓我們先找看看他被放在kernel source tree的哪裡
用lxr search看看, 發現他是位於/arch/arm/kernel/traps.c
從die開始(但是如何跑到die的呢? 我猜是fault interrupt),接著die()->dump_backtrace()->c_backtrace(),
但是lxr search不到c_backtrace不到, 按照慣例,

應該是一個assembly functon,
去arch/arm/lib/裡找找, 果然找到一個backtrace.S,

c_backtrace就在裡面,
有興趣的可以去看看他是怎麼寫的, 我則只想知道怎麼用它

最簡單的方式就是 看dump_backtrace()是怎麼call c_backtrace()的, 我們跟著做, 這樣應該就能印出資訊吧
在dump_backtrace()裡:

157 static void dump_backtrace(struct pt_regs *regs, struct task_struct *tsk)
158 {
159 unsigned int fp;
160 int ok = 1;
161
162 printk("Backtrace: ");
163 fp = regs->ARM_fp;
164 if (!fp) {
165 printk("no frame pointer");
166 ok = 0;
167 } else if (verify_stack(fp)) {
168 printk("invalid frame pointer 0x%08x", fp);
169 ok = 0;
170 } else if (fp < (unsigned long)(tsk->thread_info + 1))
171 printk("frame pointer underflow");
172 printk("\n");
173
174 if (ok)
175 c_backtrace(fp, processor_mode(regs));
176 }


其中 c_backtrace需要2個參數, 一個是fp,

一個是processor目前的mode
processor 目前的mode倒是比較容易解決,

因為我們driver都是在system mode, 所以傳入固定0x1f即可

至於fp呢, 要取得就比較麻煩啦, 要先get register的值, 我的做法如下

static void load_regs( struct pt_regs *ptr )
{
asm volatile(
"stmia %0, {r0 - r15}\n\t"
:
: "r" (ptr)
: "memory"
);
}


最後則是寫成一個function 讓別人call啦~

void back_trace( void )
{
struct pt_regs *ptr;
unsigned int fp;
unsigned long flags;

ptr = kmalloc( sizeof( struct pt_regs ), GFP_KERNEL);

local_irq_save(flags);

printk("\n\nstart back trace...");
load_regs( ptr );
fp = ptr->ARM_fp;
c_backtrace(fp, SYS_MODE);
printk("back trace end...\n\n");

local_irq_restore(flags);

kfree ( ptr );
}
EXPORT_SYMBOL( back_trace);


如此, 我們只要在driver中加入back_trace();

就可以知道整個來龍去脈啦

舉個例子, 在audio 的probe function中加入back_trace();
則會印出

start back trace...
[] (back_trace+0x0/0x68) from [] (davinci_aic32_probe+0x3c/0x144)
r5 = C0293EE8 r4 = 00000000
[] (davinci_aic32_probe+0x0/0x144) from [] (audio_probe+0x24/0x2c)
r5 = C02471B4 r4 = C024725C
[] (audio_probe+0x0/0x2c) from [] (driver_probe_device+0x54/0x74)
[] (driver_probe_device+0x0/0x74) from [] (driver_attach+0x54/0x90)
r5 = C024725C r4 = C02471BC
[] (driver_attach+0x0/0x90) from [] (bus_add_driver+0x78/0x120)
r6 = C024725C r5 = C0293E48 r4 = C0243C34
[] (bus_add_driver+0x0/0x120) from [] (audio_register_codec+0xbc/0xe8)
[] (audio_register_codec+0x0/0xe8) from [] (audio_aic32_init+0x14/0x1c4)
r6 = 00000000 r5 = C0386000 r4 = C001F6DC
[] (audio_aic32_init+0x0/0x1c4) from [] (init+0x94/0x1e0)
r4 = C001F6DC
[] (init+0x0/0x1e0) from [] (do_exit+0x0/0xdc0)
r7 = 00000000 r6 = 00000000 r5 = 00000000 r4 = 00000000
back trace end...


如此,整個流程就清清楚楚了~~~

No comments: