Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

no more HCI events after a few times (around 6) disconnect and connection #654

Open
rvt opened this issue Jan 11, 2025 · 7 comments
Open

Comments

@rvt
Copy link

rvt commented Jan 11, 2025

Describe the bug

I have been using sm_pairing_peripheral.c as my 'template' and use it to steam NMEA data from a ublox over BLE. Pairing and bonding works fine aswell as storing this on flash.
I am using this to steams data from an ublox (NMEA) to BLE, so it's a continues stream of NMEA data using notifications.

However, if I disconnect my device and reconnect a few times then after around 6 disconnects and re-connections the bluetooth stack stops responding and no more HCI events are received. No new connections are possible, no authentication is received and the log stays silent. Sometimes the stream might start for a second, then just stops. No disconnect events like ATT_EVENT_DISCONNECTED are received.

Other tasks the PICO is running keep running just fine for example reading serial data from a ublox etc, SPI interfaces and the WIFI web interface that the device is running.

From the wireshark I cannot see anything that stands out, but I am not well versed on bluetooth, or wireshark so I am not sure what to look at yet.

To Reproduce

Not sure yet what example I can use or what to use on the receiving device to reproduce. First I would like to get a feel of what os going on, and if others have seen this.

Expected behavior

I expect that I can always disconnect and reconnect any number of times.

HCI Packet Logs
See attached gz file.

Environment: (please complete the following information):

  • Current BTstack branch: Develop
  • Bluetooth Controller: cyw43 on the PICO RP2040
  • Remote device: Xiaomi Redmi Note 10
  • FreeRTOS latest
  • PICO SDK 2.1.0

out.pklg.gz

Configuration:

#ifndef _PICO_BTSTACK_BTSTACK_CONFIG_H
#define _PICO_BTSTACK_BTSTACK_CONFIG_H

// BTstack features that can be enabled
// #define ENABLE_LOG_INFO
// #define ENABLE_LOG_ERROR

// Enable ERROR LOG in BlueTooth
#define ENABLE_LOG_INTO_HCI_DUMP ( 1 )
#define ENABLE_LOG_DEBUG
#define ENABLE_LOG_ERROR
#define ENABLE_LOG_INFO
#define ENABLE_LOG_BTSTACK_EVENTS
#define HCI_DUMP_STDOUT_MAX_SIZE_ACL ( 12 )
#define ENABLE_PRINTF_HEXDUMP


#ifdef ENABLE_BLE
#define ENABLE_GATT_CLIENT_PAIRING
#define ENABLE_L2CAP_LE_CREDIT_BASED_FLOW_CONTROL_MODE
// #define ENABLE_LE_CENTRAL
#define ENABLE_LE_DATA_LENGTH_EXTENSION
#define ENABLE_LE_PERIPHERAL
#define ENABLE_LE_PRIVACY_ADDRESS_RESOLUTION
#define ENABLE_LE_SECURE_CONNECTIONS
#endif

#ifdef ENABLE_CLASSIC
#define ENABLE_L2CAP_ENHANCED_RETRANSMISSION_MODE
#define ENABLE_GOEP_L2CAP
#endif

#if defined (ENABLE_CLASSIC) && defined(ENABLE_BLE)
#define ENABLE_CROSS_TRANSPORT_KEY_DERIVATION
#endif

// BTstack configuration. buffers, sizes, ...
#define HCI_OUTGOING_PRE_BUFFER_SIZE 4
#define HCI_ACL_PAYLOAD_SIZE (1691 + 4)
#define HCI_ACL_CHUNK_SIZE_ALIGNMENT 4
#define MAX_NR_AVDTP_CONNECTIONS 1
#define MAX_NR_AVDTP_STREAM_ENDPOINTS 1
#define MAX_NR_AVRCP_CONNECTIONS 2
#define MAX_NR_BNEP_CHANNELS 1
#define MAX_NR_BNEP_SERVICES 1
#define MAX_NR_BTSTACK_LINK_KEY_DB_MEMORY_ENTRIES  3
#define MAX_NR_GATT_CLIENTS 3

#define MAX_NR_HCI_CONNECTIONS 3
#define MAX_NR_HID_HOST_CONNECTIONS 0
#define MAX_NR_HIDS_CLIENTS 0
#define MAX_NR_HFP_CONNECTIONS 1
#define MAX_NR_L2CAP_CHANNELS  3
#define MAX_NR_L2CAP_SERVICES  3
#define MAX_NR_RFCOMM_CHANNELS 1
#define MAX_NR_RFCOMM_MULTIPLEXERS 1
#define MAX_NR_RFCOMM_SERVICES 1
#define MAX_NR_SERVICE_RECORD_ITEMS 4
#define MAX_NR_SM_LOOKUP_ENTRIES 3
#define MAX_NR_WHITELIST_ENTRIES 16
#define MAX_NR_LE_DEVICE_DB_ENTRIES 16

// Limit number of ACL/SCO Buffer to use by stack to avoid cyw43 shared bus overrun
#define MAX_NR_CONTROLLER_ACL_BUFFERS 3
#define MAX_NR_CONTROLLER_SCO_PACKETS 3

// Enable and configure HCI Controller to Host Flow Control to avoid cyw43 shared bus overrun
#define ENABLE_HCI_CONTROLLER_TO_HOST_FLOW_CONTROL
#define HCI_HOST_ACL_PACKET_LEN 1024
#define HCI_HOST_ACL_PACKET_NUM 3
#define HCI_HOST_SCO_PACKET_LEN 120
#define HCI_HOST_SCO_PACKET_NUM 3

// Link Key DB and LE Device DB using TLV on top of Flash Sector interface
#define NVM_NUM_DEVICE_DB_ENTRIES 16
#define NVM_NUM_LINK_KEYS 16

// We don't give btstack a malloc, so use a fixed-size ATT DB.
#define MAX_ATT_DB_SIZE 512

// BTstack HAL configuration
#define HAVE_EMBEDDED_TIME_MS

// map btstack_assert onto Pico SDK assert()
#define HAVE_ASSERT

// Some USB dongles take longer to respond to HCI reset (e.g. BCM20702A).
#define HCI_RESET_RESEND_TIMEOUT_MS 1000

#define ENABLE_SOFTWARE_AES128
#define ENABLE_MICRO_ECC_FOR_LE_SECURE_CONNECTIONS

// #define HAVE_BTSTACK_STDIN

// To get the audio demos working even with HCI dump at 115200, this truncates long ACL packets
//#define HCI_DUMP_STDOUT_MAX_SIZE_ACL 100

#endif // _PICO_BTSTACK_BTSTACK_CONFIG_H

@mringwal
Copy link
Member

Hi there. I understand that you are using an u-blox GPS module (they also produce Bluetooth and Cellular modules).
In the log, there's nothing suspicious. There is "ACL out, size xxx" throughout the log.

Please disable ENABLE_LOG_DEBUG for logs to reduce clutter.

So, from the log, it looks like data is sent all the time, which would indicate that there might be a problem on the receiving side. Could you try a different remote device and/or try to get the HCI Snoop logs from this device.

However, you explain that you don't get any events on the Pico which would indicate a local problem. In that case, please check if you get an event when you turn off the remote device's Bluetooth, as this needs to generate an HCI Disconnect event. You can also try to use Polling mode instead of FreeRTOS.

Finally, if it really is an issue with the CYW43xx, it would help to have a way to reproduce this. The GATT Counter or GATT Streamer might be a good option for that.

@rvt
Copy link
Author

rvt commented Jan 12, 2025

Currently I am guessing a local problem as not even a disconnect is received anymore, it just 'stops'.
When I try to connect with a other device for example an iPAD not a single package is received, not even a pairing.

As for the data, it is correct that all data is received untill it just stops...

I did mention uBlox because that is what I am reading over serial port that is send over NMEA. The reason I am using a PICO is because I am already receiving and sending NMEA packages that contain traffic information send to a Electronic Flight Bag.

It's good info for me that there is nothing suspicious in the log, I was not 100% sure of that yet. I will come back with further information or a way to reproduce..

@rvt
Copy link
Author

rvt commented Jan 17, 2025

hell @mringwal,

I have created a repo demonstrating the issue, it contains a uf2 that can be put on a RP2040 PICO_W

https://github.com/rvt/bluetoothtest/releases/tag/0.0.1

I did notice it takes a bit more work to get it into the situatoi, so like 12..16 connections/disconnections.
I used Serial Bluetooth Terminal for my testing (https://play.google.com/store/apps/details?id=de.kai_morich.serial_bluetooth_terminal&hl=nl#) on Android. But I have seen the same using Skydemon on both iPhone and Android so my current best guess it's not the client.

The repo contains a readme on how to compile and a patch file as it needs v1.6.2 of the bluetooth stack that is not compatible with the PICO SDK 2.1.0 and it contains the patch for this issue: raspberrypi/pico-sdk#2101

in short to replicate:

  1. Download and copy the OpenAce.uf2 file to the pico
  2. after the firmware starts the green led will blink fast.
  3. with Serial Bluetooth Terminal connect and disconnect a few times.

If you have a terminal connected via OpenOCD where uart is connect to pin 16/17 you will also see hci output.
If you need a firwware to play with that outputs over USB uart, let me know and I will create that. The readme also contains a small instruction how to do this.

The code of the bluetooth can be found here: https://github.com/rvt/bluetoothtest/blob/main/src/lib/bluetooth/ace/bluetooth.cpp

The code looks a little bit chaotic, but that's because it's part of a larger codebase, this is just a stripped down version to bare minimum. The full code can be found here if you are curious : https://github.com/rvt/openace

UPDATE:
I think the issue was the mtu I have been using. I did change my code a bit and and did use the mtu from att_event_mtu_exchange_complete_get_MTU and remove 16 byte of it. With this change I got it running reliable.
I also change the configuration as found in pico-examples/pico_w/bt/standalone instead of the one in pico-examples/pico_w/bt/config.
When using MTU as given to me, I would get into the same issue. I think I mis understand how mtu works in this case and will read up on it..

@mringwal
Copy link
Member

Thanks for preparing the test.

Just to confirm: your Pico app mainly provides a GATT Service and you use Notifications to send data to the mobile phone. If you repeat { connect, send some data for some time, disconnect } often enough, it stops working, correct?

Oh.. just reading you got it stable.

To send notifications, you can send (att_event_mtu_exchange_complete_get_mtu() - 3). Could you verify that this works with either configuration?

@rvt
Copy link
Author

rvt commented Jan 21, 2025

@mringwal,

Just to confirm: your Pico app mainly provides a GATT Service and you use Notifications to send data to the mobile phone. If you repeat { connect, send some data for some time, disconnect } often enough, it stops working, correct?

Correct

Not even when I do att_event_mtu_exchange_complete_get_mtu() - 128. The MTU I am receiving was 512, which might be to large for the PICO with the CYW43? based on the above configuration.

When i change the configuration to this:

from:

#define HCI_ACL_PAYLOAD_SIZE (1691+4)
#define HCI_HOST_ACL_PACKET_LEN 1024

to:

#define HCI_ACL_PAYLOAD_SIZE (255+4)
#define HCI_HOST_ACL_PACKET_LEN (255+4)

Then I can keep connecting and disconnecting just fine. MTU will be set to 244 and that is the package length I am trying to send and it seems to keep working fine after.

This is based on pico-sdk develop with bluekitchen v1.6.2 and the small patch to add ${PICO_BTSTACK_PATH}/src/hci_event_builder.c to the makefile.

Hope that helps.

@mringwal
Copy link
Member

It's not clear what's going on. Please comment out HCI_DUMP_STDOUT_MAX_SIZE_ACL as well as ENABLE_LOG_DEBUG and get a new file that shows contains the hang. No need to shorten it, it's best if boot up is included.

Even with Date Length Extension, the max packet is 251 bytes, which results in a max ATT MTU of 248 resp. a usable payload for a ATT Notification of 244. BTstack will try to negotiate for ATT MTU of 248 in the best case, so it's irritating that you would get 512 from att_event_mtu_exchange_complete_get_mtu(). I'd like to check this part in the new log.

@rvt
Copy link
Author

rvt commented Jan 22, 2025

Attached new log: newlog.pklg.gz

for completeness: PICO-SDK 2.1.0 / bluekitchen v1.6.2

Added a log entry for the MTU exchange and shows on line 325 MTU Exchange complete, handle 0040, MTU 512

Line 1179 shows When you see this message, then the btstack stopped working that is where all communication stops, no connections works anymore.


I also did a quick test on picow_bt_example_gatt_streamer_server_background.uf2 (default code from gatt_streamer_server.c) that I compile straight from the examples. When a connection is made from Serial Bluetooth Terminal the MTU is also send as 512 (see below log), minus 3 is thus 509. Exactly what I did see in my version. However since test_data buffer is set to 200bytes, this is within the mentioned 248 bytes and thus everything will work (assuming).
I do see that the HCI_SUBEVENT_LE_DATA_LENGTH_CHANGE seems to indicate the correct MTU as it shows LE Connection 0x0040: data length change - max 251 bytes per packet.
Should context->test_data_len be set from HCI_SUBEVENT_LE_DATA_LENGTH_CHANGE??

I will also create a full log with streaming later this week for this example to see if I can trigger the issue with the default code. Just need to get the streaming to work as non of my tests applications seems to read the stream and I am not beable to trigger the issue I was seeing..

for completeness: PICO-SDK 2.1.0 / bluekitchen v1.6.2

BTstack up and running on 28:CD:C1:09:13:4E.                                                                                   
To start the streaming, please run the le_streamer_client example on other device, or use some GATT Explorer, e.g. LightBlue, BLExplr.
A: ATT connected, handle 0x0040, test data len 20
- LE Connection 0x0040: connected - connection interval 45.00 ms, latency 0
- LE Connection 0x0040: request 15 ms connection interval
- LE Connection 0x0040: data length change - max 251 bytes per packet
Write to 0x0021, len 2
- LE Connection 0x0040: connection update - connection interval 15.00 ms, latency 4
A: ATT disconnected, handle 0x0040
- LE Connection 0x0040: disconnect, reason 13
A: ATT connected, handle 0x0040, test data len 20
- LE Connection 0x0040: connected - connection interval 45.00 ms, latency 0
- LE Connection 0x0040: request 15 ms connection interval
- LE Connection 0x0040: data length change - max 251 bytes per packet
Write to 0x0021, len 2
A: ATT MTU = 509 => use test data of len 200
- LE Connection 0x0040: connection update - connection interval 15.00 ms, latency 4
- LE Connection 0x0040: data length change - max 251 bytes per packet
A: ATT disconnected, handle 0x0040
- LE Connection 0x0040: disconnect, reason 13
A: ATT connected, handle 0x0040, test data len 20
- LE Connection 0x0040: connected - connection interval 45.00 ms, latency 0
- LE Connection 0x0040: request 15 ms connection interval
- LE Connection 0x0040: data length change - max 251 bytes per packet
Write to 0x0021, len 2
A: ATT MTU = 509 => use test data of len 200
- LE Connection 0x0040: connection update - connection interval 15.00 ms, latency 4
- LE Connection 0x0040: data length change - max 251 bytes per packet
A: ATT disconnected, handle 0x0040
- LE Connection 0x0040: disconnect, reason 13

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants