This repository has been archived by the owner on Jan 29, 2019. It is now read-only.

From SourceForge: BSoD on detach. #5

Open
Oxalin opened this issue Jan 31, 2018 · 7 comments
Labels
bug Something isn't working
Milestone

Comments

@Oxalin
Owner

Oxalin commented Jan 31, 2018

Reported by: DanT
Date:
Link: https://sourceforge.net/p/usbip/discussion/418507/thread/7ff86875/?limit=25&page=1#f556

Description: BSoD on detach in "complete_pending_irp" in the function "bus_unplug_dev".

This may have been fixed by commit b7bfa2c, which I included from Daniel Mitchell's patch. There were at least two possible errors where IRQL was not being set correctly. However, I suspect another error could still occur in complete_pending_irp() in pnp.c.

@Oxalin
Owner Author

Oxalin commented Jan 31, 2018

What needs to be done:

  • Review complete_pending_irp() and make sure there is no code path on which IRQL is not set or not restored appropriately.

@Oxalin
Owner Author

Oxalin commented Feb 7, 2018

From investigation: it seems one of the previous developers hit a very similar bug elsewhere in the code, in process_write_irp(), where he had to wrap I/O calls in IRQL raising and lowering. We may need to do the same in other places, not just in complete_pending_irp().

@dennisdegryse

FYI: I recompiled with the version number accepted by the server side (111) and got the following stack trace from the kernel memory dump after detaching a Bluetooth dongle (BSoD):

nt!KeBugCheckEx
nt!KiBugCheckDispatch + 0x69
nt!KiPageFault + 0x519
USBIPEnum + 0x225e
nt!IoCancelIrp + 0x71
BTHUSB!UsbWrapCancelAllPingPongIrps + 0xe5
BTHUSB!USBStopInterruptTransfers + 0x50
BTHUSB!BthUsb_SetPipeState + 0xa0
BTHUSB!BthUsb_HandleStateChange + 0x6a
BTHUSB!BthUsb_PnpRemove + 0x7c
bthport!BthProcessStateChange + 0x132
bthport!BthProcessRemove + 0x147
bthport!BthHandleSurpriseRemoval + 0xa1
bthport!BthHandlePnp + 0x1a7
bthport!BthDispatchPnp + 0x61
nt!IofCallDriver + 0x59
nt!IopSynchronousCall + 0xe5
nt!IopRemoveDevice + 0xdf
nt!PnpSurpriseRemoveLockedDeviceNode + 0xba
nt!PnpDeleteLockedDeviceNode + 0xaf
nt!PnpDeleteLockedDeviceNodes + 0xb3
nt!PnpProcessQueryRemoveAndEject + 0x44a
nt!PnpProcessTargetDeviceEvent + 0xde
nt!PnpDeviceEventWorker + 0x29b
nt!ExpWorkerThread + 0xf5
nt!PspSystemThreadStartup + 0x47
nt!KiStartSystemThread + 0x16

@Oxalin
Owner Author

Oxalin commented Feb 26, 2018

Hi @dennisdegryse. Thank you for your trace. I'm pretty sure I've pinpointed where the problem is, but not why it is happening. I'm mostly working on the usbip-tools for now, but I'll take whatever you can feed me on the driver side: which OS are you testing on? Which commit are you compiling? How did you disconnect your device? Did it generate an "IRQL not less or equal" message, or a different one?

From what I can read, you triggered a surprise removal: the PnP code is called and processes a query-remove-and-eject request, then it goes on to delete device nodes, and one specific node is handled by PnpSurpriseRemoveLockedDeviceNode. It then calls the Bluetooth driver (bthport), which in turn handles the surprise removal by stopping transfers and canceling all IRPs. This is where it fails, inside USBIPEnum, with a page fault.

@Oxalin
Owner Author

Oxalin commented Feb 26, 2018

@dennisdegryse: also, could you attach the dump file?

@dennisdegryse

ATM I only have a full memory dump of 1.2GB, from a non-sandboxed environment (may contain info I don't want to leak). I'll set up a VM for a reproduction and new dump asap. Do you want the full memory dump or will a minidump suffice?

@Oxalin
Owner Author

Oxalin commented Feb 28, 2018

@dennisdegryse: If you were testing with master/HEAD, I pushed a fix a few minutes ago (well, I hope it will work).

IoCancelIrp() calls the driver's cancel IRP routine, which is cancel_irp(). At first, I thought we were hitting a wrongly assumed IRQL at DISPATCH_LEVEL. However, after digging in Microsoft's documentation, I think I have properly fixed the code.
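
For readers following along, here is a rough user-mode model of the cancel contract described above. It is not the fix that was pushed and it is not driver code: every name ending in `_model` or starting with `model_` is a hypothetical stand-in invented for this illustration, not a WDK definition. It only shows the documented behaviour that IoCancelIrp() takes the cancel spin lock, stores the previous IRQL in the IRP, and calls the cancel routine at DISPATCH_LEVEL with the lock still held, so the routine itself has to release the lock and restore that IRQL before completing the IRP.

```c
#include <assert.h>
#include <stddef.h>

/* User-mode stand-ins for kernel types; NOT the WDK definitions. */
typedef unsigned char kirql_model;
enum { PASSIVE_LEVEL_MODEL = 0, DISPATCH_LEVEL_MODEL = 2 };

typedef struct irp_model {
    int cancel;                                  /* mirrors Irp->Cancel */
    kirql_model cancel_irql;                     /* mirrors Irp->CancelIrql */
    void (*cancel_routine)(struct irp_model *);  /* driver cancel routine */
    int completed;                               /* set once the IRP is "completed" */
} irp_model;

static kirql_model current_irql = PASSIVE_LEVEL_MODEL;
static int cancel_lock_held = 0;

/* Models IoAcquireCancelSpinLock: raises to DISPATCH_LEVEL, returns old IRQL. */
static void model_acquire_cancel_lock(kirql_model *old_irql)
{
    assert(!cancel_lock_held);
    *old_irql = current_irql;
    current_irql = DISPATCH_LEVEL_MODEL;
    cancel_lock_held = 1;
}

/* Models IoReleaseCancelSpinLock: drops the lock and restores the given IRQL. */
static void model_release_cancel_lock(kirql_model restore_irql)
{
    assert(cancel_lock_held);
    cancel_lock_held = 0;
    current_irql = restore_irql;
}

/* Models IoCancelIrp: the cancel routine is entered with the lock held. */
static int model_io_cancel_irp(irp_model *irp)
{
    kirql_model old_irql;
    void (*routine)(irp_model *);

    model_acquire_cancel_lock(&old_irql);
    irp->cancel = 1;
    routine = irp->cancel_routine;
    irp->cancel_routine = NULL;          /* a cancel routine runs at most once */
    if (routine == NULL) {
        model_release_cancel_lock(old_irql);
        return 0;
    }
    irp->cancel_irql = old_irql;         /* the routine needs this to restore IRQL */
    routine(irp);                        /* the routine must release the lock */
    return 1;
}

/* Hypothetical driver cancel routine following the contract. */
static void model_cancel_irp(irp_model *irp)
{
    /* A real driver would unlink the IRP from its pending list here. */
    model_release_cancel_lock(irp->cancel_irql);
    irp->completed = 1;                  /* stands in for completing as cancelled */
}

/* Returns 1 if the IRP was cancelled and the lock and IRQL were restored. */
int simulate_detach(void)
{
    irp_model irp = { 0, 0, model_cancel_irp, 0 };
    int called = model_io_cancel_irp(&irp);

    return called && irp.completed && !cancel_lock_held
        && current_irql == PASSIVE_LEVEL_MODEL;
}
```

In this model, a cancel routine that returned without calling `model_release_cancel_lock()` would leave `current_irql` at the dispatch level and `simulate_detach()` would return 0, which is the kind of unbalanced IRQL this issue is concerned with.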

If you want, give it a try and let me know whether this fixes your problem (I still haven't worked on the server-side accepted version number (111)).

@Oxalin Oxalin added this to the 1.0.0 milestone Mar 5, 2018
@Oxalin Oxalin added the bug Something isn't working label Mar 5, 2018

2 participants