Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bluetooth: Controller: build assert if comand buffer is too small #19869

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from
Draft
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions applications/connectivity_bridge/prj.conf
Original file line number Diff line number Diff line change
Expand Up @@ -75,3 +75,6 @@ CONFIG_HW_ID_LIBRARY=y
CONFIG_RESET_ON_FATAL_ERROR=y
CONFIG_ASSERT=y
CONFIG_POWEROFF=y

# BT
CONFIG_BT_BUF_CMD_TX_COUNT=3
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be moved to the .conf files in boards and be set along with the rest of the BT options

10 changes: 10 additions & 0 deletions doc/nrf/releases_and_maturity/known_issues.rst
Original file line number Diff line number Diff line change
Expand Up @@ -189,6 +189,16 @@ KRKNWK-14299: NRPA MAC address cannot be set in Zephyr
Bluetooth LE
============

.. rst-class:: v2-9-0-nRF54H20-1 v2-9-0 v2-8-0 v2-7-0 v2-6-2 v2-6-1 v2-6-0

DRGN-24352: Missing disconnection events when HCI Controller to host flow control is enabled
The :zephyr:code-sample:`bluetooth_hci_ipc` sample and :ref:`ipc_radio` application do not support stream flow control and drop bytes.
With the deferred disconnection complete generation added in the |NCS| v2.6.0, there are cases where the :kconfig:option:`CONFIG_BT_HCI_ACL_FLOW_CONTROL` Kconfig option has value ``y``, and the value of :kconfig:option:`CONFIG_BT_BUF_CMD_TX_COUNT` is smaller than (:kconfig:option:`CONFIG_BT_BUF_ACL_RX_COUNT` + 1).
This might result in dropped HCI command packets.
The visible symptom is a missing disconnect event and subsequent failure in connection, if a peripheral device falls out of range or is powered off.

**Workaround:** Make sure that the value of the :kconfig:option:`CONFIG_BT_BUF_ACL_RX_COUNT` Kconfig option is smaller than the one for :kconfig:option:`CONFIG_BT_BUF_CMD_TX_COUNT` in your project.

.. rst-class:: v2-9-0-nRF54H20-rc1 v2-9-0 v2-8-0 v2-7-0 v2-6-2 v2-6-1 v2-6-0
KyraLengfeld marked this conversation as resolved.
Show resolved Hide resolved

NCSDK-31095: Issues with the :kconfig:option:`CONFIG_SEGGER_SYSVIEW` Kconfig option
Expand Down
42 changes: 41 additions & 1 deletion subsys/bluetooth/controller/hci_internal.c
Original file line number Diff line number Diff line change
Expand Up @@ -877,6 +877,46 @@ static uint8_t link_control_cmd_put(uint8_t const * const cmd)
}
#endif

#if defined(CONFIG_BT_HCI_ACL_FLOW_CONTROL)
/*
* The Host will generate up to acl_pkts number of Host Number of Completed Packets command plus a
* number of normal HCI commands, as such we need to ensure the tx command buffer count is big
* enough to not block incoming ACKs from the host.
*
* When Controller to Host data flow control is supported, ensure that BT_BUF_CMD_TX_COUNT is
* greater than or equal to (BT_BUF_ACL_RX_COUNT + Ncmd), where Ncmd is the supported maximum
* Num_HCI_Command_Packets.
*
* The SDC controller (currently) does not support Num_HCI_Command_Packets > 1, which means Ncmd
* is always 1.
*/
#if defined(CONFIG_BT_CONN) && defined(CONFIG_BT_HCI_HOST)
BUILD_ASSERT(BT_BUF_ACL_RX_COUNT < CONFIG_BT_BUF_CMD_TX_COUNT,
Copy link
Contributor

@alwa-nordic alwa-nordic Jan 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CONFIG_BT_BUF_CMD_TX_COUNT when CONFIG_BT_HCI_HOST is very different from when BT_HCI_RAW. (Yes, it's unfortunate and confusing that it's overloaded like this.)

When BT_HCI_RAW, this config controls the capacity of bt_buf_get_tx(BT_BUF_CMD), and it's very clear what the only use of this capacity is. It used only to receive a commands from the Host, and the rate of these commands are controlled by the command flow control. This is usable to compute a statically known worst case buffer use, guaranteeing a K_NO_WAIT allocation. See my comment on documenting assumptions below.

OTOH, when CONFIG_BT_HCI_HOST, this config controls controls the capacity of bt_hci_cmd_create, which is used for anything and everything. That can include many concurrent calls from the application. This is not amendable to statically compute any guarantees in the general case. If we make an additional assumption that whole application and host is only ever allocating one command, and waiting for its completion before allocating the next one, we get an analogous guarantee that bt_hci_cmd_create never blocks. But what problem does that guarantee solve? bt_hci_cmd_create is always K_FOREVER. This also needs to be documented. I suspect this solves a deadlock, right? I'll post more if I deduce it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
BUILD_ASSERT(BT_BUF_ACL_RX_COUNT < CONFIG_BT_BUF_CMD_TX_COUNT,
BUILD_ASSERT(BT_BUF_ACL_RX_COUNT < BT_BUF_CMD_TX_COUNT,

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant CONFIG_BT_BUF_CMD_TX_COUNT in the comment above, will make it more clear.

"Too low HCI command buffers compared to ACL Rx buffers, \
reduce CONFIG_BT_BUF_ACL_RX_COUNT_EXTRA or increase CONFIG_BT_BUF_CMD_TX_COUNT.");
#else /* controller-only build */
BUILD_ASSERT(CONFIG_BT_BUF_CMD_TX_COUNT - 1 > 0,
"We need at least two HCI command buffers to avoid deadlocks.");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Semantically, this wants BT_HCI_RAW, not !CONFIG_BT_HCI_HOST.

#endif /* CONFIG_BT_CONN && CONFIG_BT_HCI_HOST */

/* Note: When host and controller are built separately, the RX Buffer Count communicated here from
* the host might be exeeding the controller's TX command buffer count. This may result in the
* controller sending up to host_total_num_acl_data_packets, it cannot receive the acknoledgements
* via its TX command buffers for.
* As we do not propagate the TX command buffer count information to the controller, the simplest
* approach is to align the communicated host RX Buffer Count to it here.
*/
static int sdc_hci_cmd_cb_host_buffer_size_wrapper(void *cmd_params)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
static int sdc_hci_cmd_cb_host_buffer_size_wrapper(void *cmd_params)
static int sdc_hci_cmd_cb_host_buffer_size_wrapper(const sdc_hci_cmd_cb_host_buffer_size_t *cmd_params)

Copy link
Contributor

@alwa-nordic alwa-nordic Jan 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does indeed form a guaranteed K_NO_WAIT allocation in HCI samples, with some added assumptions which need to be documented:

  • a) The controller ncmd is always 1 (or 0 or the host pretends it is).
  • The controller frees cmd buffer strictly before sending cmd complete/status.
  • There is no other command than Host Num Complete that ignores flow control.
  • The Host does not send "empty" Host Num Completed messages. (The spec allows zeros in both the number of handles and number of ACKs for that handle.)
  • (The check added by this PR.) The Controller does not send more ACL than it has room for Host Num Complete messages. (Because of (a), all but one pool in the cmd_tx_pool is always free for Host Num Complete).
  • The Controller frees the Host Num Complete buffer strictly before it considers the credits therein as obtained and allowing it to send more data.

Consider if some of these assumptions should be turned into specification by adding them to the Zephyr HCI interface specification include/zephyr/drivers/bluetooth.h.

{
sdc_hci_cmd_cb_host_buffer_size_t ctrl_cmd_params =
*(const sdc_hci_cmd_cb_host_buffer_size_t *)cmd_params;
KyraLengfeld marked this conversation as resolved.
Show resolved Hide resolved
ctrl_cmd_params.host_total_num_acl_data_packets = MIN(
ctrl_cmd_params.host_total_num_acl_data_packets, (CONFIG_BT_BUF_CMD_TX_COUNT - 1));

return sdc_hci_cmd_cb_host_buffer_size((const void *)&ctrl_cmd_params);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return sdc_hci_cmd_cb_host_buffer_size((const void *)&ctrl_cmd_params);
return sdc_hci_cmd_cb_host_buffer_size(&ctrl_cmd_params);

}
#endif /* CONFIG_BT_HCI_ACL_FLOW_CONTROL */

static uint8_t controller_and_baseband_cmd_put(uint8_t const * const cmd,
uint8_t * const raw_event_out,
uint8_t *param_length_out)
Expand All @@ -903,7 +943,7 @@ static uint8_t controller_and_baseband_cmd_put(uint8_t const * const cmd,
case SDC_HCI_OPCODE_CMD_CB_SET_CONTROLLER_TO_HOST_FLOW_CONTROL:
return sdc_hci_cmd_cb_set_controller_to_host_flow_control((void *)cmd_params);
case SDC_HCI_OPCODE_CMD_CB_HOST_BUFFER_SIZE:
return sdc_hci_cmd_cb_host_buffer_size((void *)cmd_params);
return sdc_hci_cmd_cb_host_buffer_size_wrapper((void *)cmd_params);
case SDC_HCI_OPCODE_CMD_CB_HOST_NUMBER_OF_COMPLETED_PACKETS:
return sdc_hci_cmd_cb_host_number_of_completed_packets((void *)cmd_params);
#endif
Expand Down