[23] Lessons from shared interrupts and a bug in the PCIe DPC code

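I was recently validating the DPC and hot-plug features of a chip and found that whenever an SSD was surprise-removed or reinserted, dpc_irq was called, yet DPC Status showed that DPC had not been triggered. At first I assumed it was a chip bug. After reading up on shared interrupts, I found it was a bug in older versions of the DPC code. (This is a little embarrassing: back when I was grinding away at a certain more-than-Fortune-500 company, I hit the same problem. The interrupt service routine was called, the status reg was 0, and a log message was printed over and over. I did not understand how shared interrupts worked at the time, so I simply deleted the print. The change itself was fine; I just did not know why it was fine.)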

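Back to the current scenario. The CPU I am using supports AER, DPC, and hotplug, so the OS registers an AER interrupt handler, a DPC interrupt service routine, and a hotplug interrupt service routine, and all three use a shared interrupt. When I surprise-remove the SSD, the data link layer goes from active to inactive and generates a data link active change interrupt; when I insert the SSD, there is a presence change interrupt. In both cases the OS invokes all three interrupt service routines registered for the CPU port.
Older versions of the dpc_irq code have a bug: they treat the DPC interrupt as triggered whenever DPC Status is neither all 1s nor 0, and it so happens that bits 12:8 of DPC STATUS default to 1 1111b. As a result, when the SSD is removed or inserted, the log shows the DPC STATUS reg as 0x1F00.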
The DPC code bug has been fixed in newer OS versions and in the following patch:
https://patchwork.kernel.org/patch/10112467/
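
To make the failure concrete, here is a minimal user-space sketch in C of the two checks. The helper names (dpc_pending_old, dpc_pending_fixed) are hypothetical, and the "fixed" variant assumes the fix amounts to testing the DPC Interrupt Status bit (bit 3 of DPC Status) instead of testing for any nonzero value; treat it as an illustration of the idea, not as the kernel's actual code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: bit 3 of DPC Status is "DPC Interrupt Status". */
#define DPC_STATUS_INTERRUPT 0x0008u
/* Idle value seen in the log: bits 12:8 default to 1 1111b. */
#define DPC_STATUS_IDLE      0x1F00u

/* Old-style check as described above: anything that is neither 0
 * nor all 1s is taken to mean "DPC interrupt triggered". */
static bool dpc_pending_old(uint16_t status)
{
    return status != 0 && status != 0xFFFF;
}

/* Assumed shape of the fix: only claim the interrupt when the
 * DPC Interrupt Status bit is actually set. */
static bool dpc_pending_fixed(uint16_t status)
{
    if (status == 0xFFFF)   /* all 1s: the read itself is not trustworthy */
        return false;
    return (status & DPC_STATUS_INTERRUPT) != 0;
}
```

With the idle value 0x1F00, dpc_pending_old() wrongly reports a pending DPC interrupt while dpc_pending_fixed() does not, which matches the 0x1F00 prints seen during removal and insertion.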
As for shared interrupts, Stack Overflow has clear explanations:
https://stackoverflow.com/questions/14371513/for-a-shared-interrupt-line-how-do-i-find-which-interrupt-handler-to-use?answertab=active#tab-top

https://stackoverflow.com/questions/4732570/return-value-of-interrupt-handlers-in-linux-kernel
The kernel will sequentially invoke all the handlers for that particular shared line.

Exactly. Say Dev1 and Dev2 share IRQ10. When an interrupt is generated for IRQ10, all ISRs registered with this line will be invoked one by one.

In our scenario, say Dev2 was the one that generated the interrupt. If Dev1’s ISR is registered first, then Dev1’s ISR is called first. In that ISR, the interrupt status register is checked. If no interrupt bit is set (which is the case, because Dev2 raised the interrupt), then we can confirm that the interrupt was not generated by Dev1, so Dev1’s ISR should return IRQ_NONE to the kernel, which means “I did not handle that interrupt”. The kernel then continues to the next ISR (i.e. Dev2’s ISR), which in turn verifies that its corresponding device generated the interrupt; this handler should handle it and eventually return IRQ_HANDLED, which means “I handled this one”.

See the return values IRQ_NONE/IRQ_HANDLED for more information.

How does the handler know whether the corresponding device issued the interrupt?

By reading the interrupt status register only.

Is this information relayed through the interrupt controller that sits between the devices and the processor interrupt line?

I’m not sure about this. But the OS will take care of calling ISRs based on the return values from ISR.
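
The mechanism described in the answer can be sketched as a small user-space simulation in C. Everything here is hypothetical (struct sim_dev, sim_isr, sim_dispatch, and the SIM_IRQ_* values stand in for real devices, handlers, and IRQ_NONE/IRQ_HANDLED); it only models the rule that every handler on a shared line is called and each one decides from its own status register whether the interrupt is its own.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the kernel's IRQ_NONE / IRQ_HANDLED return values. */
typedef enum { SIM_IRQ_NONE = 0, SIM_IRQ_HANDLED = 1 } sim_irqreturn_t;

/* Hypothetical device: one status register, nonzero when the
 * device has an interrupt pending. */
struct sim_dev {
    uint32_t status;
    unsigned calls;     /* times the handler was invoked */
    unsigned handled;   /* times the handler claimed the interrupt */
};

/* Handler: check our own status register first; if nothing is
 * pending, the interrupt belongs to another device on the line. */
static sim_irqreturn_t sim_isr(struct sim_dev *dev)
{
    dev->calls++;
    if (dev->status == 0)
        return SIM_IRQ_NONE;
    dev->status = 0;    /* acknowledge the interrupt */
    dev->handled++;
    return SIM_IRQ_HANDLED;
}

/* Invoke every handler sharing the line, in registration order,
 * and return how many of them claimed the interrupt. */
static unsigned sim_dispatch(struct sim_dev *devs[], size_t n)
{
    unsigned claimed = 0;
    for (size_t i = 0; i < n; i++)
        if (sim_isr(devs[i]) == SIM_IRQ_HANDLED)
            claimed++;
    return claimed;
}
```

If Dev1 has status 0 and Dev2 has a pending bit, one call to sim_dispatch() invokes both handlers, but only Dev2 claims the interrupt. A handler that claims interrupts without checking the right bit, as the old dpc_irq did, breaks this contract.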
