Objectives
- Learn the mbed CAN API and the basic underlying principles
- Learn about and use the different multitasking models on a CAN node.
CAN (Controller Area Network) is a communications standard used to network electronics in vehicles, including our solar car. While commercial vehicles also have other network standards to choose from (and some may use a combination of several), all the networked devices (including battery management system, motor controller, telemetry, MPPT) on our car communicate over a shared CAN bus.
A CAN bus consists of at least two devices (nodes) connected to a shared bus, and is a message-based, broadcast network:
- Message-based: transmissions occur in complete messages (also called frames).
- Broadcast: all nodes on the network can hear the transmissions of any node (though they need not act on irrelevant messages).
The two main operations are transmitting and receiving messages.
CAN messages carry these pieces of user data:
- ID field, either 11 bits (standard frame) or 29 bits (extended frame). A lower ID gives the message priority in the event of a collision (which happens when multiple nodes start transmitting simultaneously).
- Data, up to 8 bytes. The length is transmitted as part of the frame.
- Remote transmission request (RTR), one bit. This differentiates a remote request frame from a data frame, and more details are below.
The ID field is used to indicate the kind of data being transmitted (for example, "battery voltage") and the data field contains the actual data (continuing the example, 107.4 V encoded as 2 bytes fixed point). Message IDs must be unique on any single CAN bus. Because of the priority system, giving a lower ID to more important messages (or messages with stricter deadlines) is recommended.
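To make the "107.4 V encoded as 2 bytes fixed point" example concrete, here is a minimal host-side sketch. The scale of 0.1 V per count is an assumption for illustration (the scale actually used on the car is not stated here), and the function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Assumed scale, for illustration only: one count = 0.1 V.
constexpr double kVoltsPerCount = 0.1;

// Convert a voltage to an unsigned 16-bit fixed-point count.
uint16_t encodeVoltage(double volts) {
    assert(volts >= 0.0 && volts <= 6553.5);  // must fit in 16 bits at this scale
    return static_cast<uint16_t>(std::lround(volts / kVoltsPerCount));
}

// Convert a fixed-point count back to volts.
double decodeVoltage(uint16_t counts) {
    return counts * kVoltsPerCount;
}
```

With this assumed scale, 107.4 V becomes the count 1074 (0x0432), which fits in the two data bytes. Transmitter and receiver must agree on the scale, since it is not carried in the frame.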
Usually, nodes transmit data regularly (for example, battery voltage is automatically sent every second by the BMS), but the RTR field can be used to request specific data. Responding to RTR messages must be handled by application code, and it's not something we use.
CAN also provides these non-user data fields:
- Cyclic Redundancy Check (CRC), 15 bits, as a checksum so corrupted messages are discarded.
- Acknowledgement, one bit, asserted by any other node upon successful reception (including CRC check).
- If a message is not acknowledged by any other node, the transmitter will attempt to retransmit.
- However, it is impossible to tell from the acknowledgement bit if a message was received by a particular node.
- As acknowledgements are generated by the CAN controller peripheral (before user-level code), it is also impossible to tell if application code has processed and acted on a message correctly.
CAN controllers keep count of errors, and too many errors may result in a bus-off condition, where the controller disconnects from the bus. The CAN controller must be re-initialized (through user-level code) before communications can resume. The full list of error counter rules is complex and a more detailed explanation is here.
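As a rough illustration of how error counting leads to bus-off, the sketch below models only the transmit error counter, using the basic rules from the CAN specification: add 8 on a transmit error, subtract 1 on a successful transmission, become error-passive above 127, and go bus-off above 255. The receive counter and the various exceptions are deliberately left out, and the class name is hypothetical.

```cpp
#include <cstdint>

enum class ErrorState { ErrorActive, ErrorPassive, BusOff };

// Simplified model of a CAN controller's transmit error counter (TEC).
class TransmitErrorCounter {
public:
    void onTransmitError() {
        if (!busOff()) count_ += 8;
    }
    void onTransmitSuccess() {
        if (!busOff() && count_ > 0) count_ -= 1;
    }
    // Models user-level code re-initializing the controller after bus-off.
    void reinitialize() { count_ = 0; }

    ErrorState state() const {
        if (count_ > 255) return ErrorState::BusOff;
        if (count_ > 127) return ErrorState::ErrorPassive;
        return ErrorState::ErrorActive;
    }

private:
    bool busOff() const { return count_ > 255; }
    uint16_t count_ = 0;
};
```

In this model, 32 consecutive failed transmissions are enough to reach bus-off, and only `reinitialize()` brings the node back.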
This is only a very high-level overview.
The CAN bus itself consists of two wires, CANH and CANL as a differential pair, to which all nodes are attached. There are two bit levels, either dominant (0) or recessive (1). When multiple nodes transmit simultaneously (collision), a dominant bit takes priority over a recessive bit (the mechanism for ID-based arbitration). If no node is transmitting, the bus is held at a recessive level.
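The arbitration mechanism can be sketched in plain C++ by treating the bus as a wired-AND of every transmitting node's bit. This is a host-side simulation for 11-bit IDs only, and the function name is hypothetical. A node that sends a recessive bit but reads back a dominant one drops out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns the 11-bit ID that wins arbitration when all the given nodes
// start transmitting at the same time. IDs are sent MSB first.
uint16_t arbitrate(const std::vector<uint16_t>& ids) {
    assert(!ids.empty());
    std::vector<bool> contending(ids.size(), true);
    for (int bit = 10; bit >= 0; --bit) {
        // The bus is recessive (1) unless some contender drives dominant (0).
        bool busLevel = true;
        for (std::size_t i = 0; i < ids.size(); ++i) {
            if (contending[i] && ((ids[i] >> bit) & 1) == 0) busLevel = false;
        }
        // A node that sent recessive but sees dominant has lost and backs off.
        for (std::size_t i = 0; i < ids.size(); ++i) {
            bool sent = ((ids[i] >> bit) & 1) != 0;
            if (contending[i] && sent && !busLevel) contending[i] = false;
        }
    }
    for (std::size_t i = 0; i < ids.size(); ++i) {
        if (contending[i]) return ids[i];
    }
    return ids[0];  // not reached: at least one node always survives
}
```

Because 0 is dominant, the numerically lowest ID always survives every bit position, which is why lower IDs have higher priority.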
The CAN controller operates on two different, single-ended (non-differential), logic-level lines: TXD and RXD. RXD indicates the current bit level on the CANH/CANL lines, while TXD indicates the bit to transmit.
A CAN transceiver bridges the logic-level TXD/RXD lines and the bus-level CANH/CANL lines. While CAN controllers may be an on-chip peripheral on some microcontrollers, CAN transceivers are usually separate chips and may provide some degree of electrical isolation.
Add the following CAN circuit onto your breadboard, in addition to the existing switch and RGB LED circuit:
Here is one possible way to breadboard it. Be careful when connecting the pins that carry 5 V (highlighted in orange): not all pins are 5 V tolerant! As for the CAN-side wires, we will use the convention of yellow for CANH and green for CANL.
To connect multiple boards (or your board to the central node), connect all the CANH pins together and connect all the CANL pins together.
The central node is pre-programmed to respond to and generate CAN messages used in this lab. It also acts as a USB CAN network sniffer using the SLCAN protocol, appearing as a serial port (like lab1's serial "Hello World") to your host machine. Once you connect the central node's USB port to your computer, find the serial port like you did in lab1.
There are many protocol analyzer programs available, but for this lab we'll use USBtin, which is a simple, cross-platform GUI program for displaying CAN traffic.
Start the program, enter the serial port name in the top-left (ignore the baud rate and CAN controller mode) and hit "Connect".
If everything was configured correctly, you should see the messages that the central node transmits regularly, with id = 043h ("43 hexadecimal", equivalent to 0x43) and length (DLC) 2:
If there are no other active CAN nodes on the bus, CAN transmission fails (because no other node is ACKing) and the CAN sniffer won't pick up anything. In that case, try programming any of the other nodes on the bus with the code skeleton below. The programmed firmware doesn't need to actually do anything, only initialize the CAN peripheral at the proper baud rate.
You can also go into monitor mode, which displays one updating line per unique CAN ID:
No more spam in this mode, and higher data rate messages won't drown out lower data rate messages. For example, if you press the user button on the central node, you should see a message appear with id=0x42. However, you would be unlikely to notice that message in the torrent of trace mode.
At the bottom, you can configure a message to be transmitted. Try entering a message with id=0x42 and data length 0, then press "Send". The LED on the central node should pulse momentarily - this is basically what you'll be doing in the next section, but from your microcontroller.
The central node is configured to pulse its LED on for 0.5 seconds when receiving a CAN message with ID 0x42.
Objective: When the button is pressed, transmit a message that triggers a blink on the central node.
Start with this code skeleton in your src/main.cpp:
#include "mbed.h"
#include "ledutils.h"

DigitalOut led1(LED1);
RawSerial serial(SERIAL_TX, SERIAL_RX, 115200);
DigitalIn btn(D8, PullUp);
CAN can(D10, D2);  // RX, TX

int main() {
    // Initialize CAN controller at 1 Mbaud
    can.frequency(1000000);
    while (true) {
        /* YOUR CODE HERE */
    }
}
The first thing you need to do is to detect a button press. Unlike in the previous lab, where the application was only level-sensitive (cares about the state of the button, whether it is pressed or not), this application is edge-sensitive (we will define a button press as the up-to-down transition). The simplest solution is to, in the main loop, track the previous button state and compare the current button state against the previous state. A button press is when the previous button state is 1 (up) and the current button state is 0 (down). Remember to update the previous button state at the end of each loop.
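The previous-state comparison described above can be sketched as a small host-side class, so the logic can be checked without hardware. The class name is hypothetical. On the board, the `current` argument would come from a single read of `btn` per loop iteration.

```cpp
// Detects the up-to-down transition of an active-low button
// (1 = up / released, 0 = down / pressed).
class FallingEdgeDetector {
public:
    // Call once per loop with one fresh reading of the button.
    // Returns true only on the iteration where the button goes from up to down.
    bool update(bool current) {
        bool pressed = previous_ && !current;
        previous_ = current;  // remember this sample for the next iteration
        return pressed;
    }

private:
    bool previous_ = true;  // assume the button starts released
};
```

Holding the button down produces exactly one `true` result. Releasing and pressing again produces another.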
In keeping with efficient development practices, you may want to test your button press detector in isolation before stacking CAN on top of it. One simple method is to toggle an LED on each press. You may also have it print something to the serial console.
Problems? Common issues may be:
- Remember to declare your previous button state outside the while (true) main loop. Otherwise, it will re-initialize each time around the loop and won't be very useful as a persistent tracker.
- Are you losing edges? Make sure the button state compare and update happen atomically with regard to reading the button state. That is, if you read the button for the compare operation, then read it again to update the previous state, there's no guarantee that both reads return the same result, since they happen at different times.
- A solution around this is to read the button state (once per loop!) into a temporary current state variable, then use that variable in the comparison and update operations.
- Are you detecting multiple edges per press? This is a limitation of mechanical switches: they may bounce for a few milliseconds before they settle. If you sample fast enough, these may register as false edges.
- One solution is to filter in hardware. The most common approach is adding an RC filter.
- Debouncing can also be implemented in firmware. One approach is to wait for the switch signal to settle for some amount of time before changing the current state. This requires additional code (and a very small amount of compute), but no hardware (and hence, no recurring costs).
- For the purposes of this lab, ignore this effect.
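Although this lab ignores bounce, the firmware approach mentioned above (wait for the signal to settle before accepting a change) can be sketched as follows. This is one of several possible debouncing schemes, written as host-side C++ with a hypothetical class name. It assumes `update` is called at a fixed sampling period.

```cpp
// Accepts a new switch state only after it has been observed for
// `stableSamples` consecutive samples.
class Debouncer {
public:
    explicit Debouncer(int stableSamples, bool initial = true)
        : required_(stableSamples), count_(stableSamples),
          stable_(initial), candidate_(initial) {}

    // Feed one raw sample per sampling period; returns the debounced state.
    bool update(bool raw) {
        if (raw != candidate_) {
            candidate_ = raw;  // the signal changed: restart the settle count
            count_ = 1;
        } else if (count_ < required_) {
            ++count_;
        }
        if (count_ >= required_) stable_ = candidate_;
        return stable_;
    }

private:
    int required_;
    int count_;
    bool stable_;
    bool candidate_;
};
```

Feeding the debounced output (rather than the raw pin) into the edge detection logic would then yield one edge per physical press, at the cost of a few samples of latency.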
Compared to the button press, writing to the CAN bus is simple. First, create a message object using the CANMessage constructor, CANMessage(int _id, const char *_data, char _len = 8, CANType _type = CANData, CANFormat _format = CANStandard):
CANMessage msg(0x42, (char*)NULL, 0);
The constructor has quite a few arguments. For our purposes, we will generally leave type and format as the default (data frame, standard format). For our current message, we only care about the id field, which is set to 0x42. Since there is no data, we pass in a NULL pointer (cast as char* because NULL alone is ambiguous since Mbed offers two similar CANMessage constructors) and a length of 0.
Note that there is also an id-only CANMessage constructor. We don't use that because it creates a remote (RTR) frame, typically used to request data from another node.
Then, transmit that message using CAN::write(CANMessage):
can.write(msg);
Note that you can combine both operations into one line:
can.write(CANMessage(0x42, (char*)NULL, 0));
Have your edge detection code execute the above on a button press, and you should be done. Feel free to compare against the solution, too.
Note: this presents mbed's method for constructing CAN messages. Libraries (such as zephyr-common) may provide higher-level abstractions to build and pack data into CAN messages.
When the central node receives a CAN message with ID 0x41, it will pulse its LED with a length specified by the data field, in milliseconds. The blink length is the first 16-bit integer in the payload, in big-endian (network order) format.
Objective: When the button is pressed, transmit a message that triggers a blink on the central node for some interesting amount of time (not 500 ms as in the previous section).
Now, we will put meaningful data into the data pointer and length fields of the CANMessage constructor. Your goal will be to pack a 16-bit integer into the byte-oriented payload field. First, start by declaring the blink length (in this example, 1000 ms = 1 second):
uint16_t blinkLengthMs = 1000;
Then, declare a 2-byte array to store the re-packed payload:
uint8_t data[2];
Pack the data from blinkLengthMs into data, one byte at a time. Usually, this is accomplished by shifting the 8 bits to be packed into the least significant byte, then masking out the other bytes. For example, to get bits 15...8, we would write:
data[0] = (blinkLengthMs >> 8) & 0xff;
This right shifts blinkLengthMs by 8, so that bits 15...8 are now in bits 7...0. The masking isn't really necessary for a 16-bit integer, but would be if you were taking a byte in the middle of a larger (32-bit, for example) integer.
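To show where the mask does matter, here is a sketch of the same shift-and-mask pattern applied to a 32-bit value. The function name is hypothetical and the code is plain C++ with no mbed dependency.

```cpp
#include <cstdint>

// Write a 32-bit value into 4 bytes, most significant byte first
// (big-endian / network order).
void packUint32BigEndian(uint32_t value, uint8_t out[4]) {
    out[0] = (value >> 24) & 0xff;  // bits 31..24
    out[1] = (value >> 16) & 0xff;  // bits 23..16; mask drops bits 31..24
    out[2] = (value >> 8) & 0xff;   // bits 15..8; mask drops bits 31..16
    out[3] = value & 0xff;          // bits 7..0
}
```

For example, packing 0x12345678 yields the bytes 0x12, 0x34, 0x56, 0x78 in that order. Without the mask, the middle bytes would only come out right because the assignment to `uint8_t` silently truncates the value, so writing the mask keeps the intent explicit.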
After you've packed data[1], you can construct the CAN message and send it. A convenient one-liner is:
can.write(CANMessage(0x41, (char*)data, 2));
This creates a CAN message with id=0x41 and a payload of 2 bytes from data. Note that the CANMessage API inexplicably takes in a char* (rather than an unsigned char* or uint8_t* type), so the (char*) cast is required.
Run it, press the button, and check that the central node pulses its LED for the duration you programmed. A reference solution is also available for the curious.
You may notice that the central node is constantly cycling the hue of its RGB LED. It also broadcasts the RGB LED's hue regularly, allowing other nodes to synchronize with it.
Objective: Have your RGB LED track and mimic the central node's RGB LED.
Note: you may think another way to accomplish this objective is to run both nodes open-loop, but starting both at the same time. The reason this does not work well is because of clock drift: since each node has its own (non-synchronized) clock source, the timing will be slightly off and their hues will drift. While the onboard crystals provide good frequency tolerance and stability, they will still de-synchronize over long periods of time.
Start by declaring an RGB LED object. For convenience, an RGBPwmOut class has been provided in the ledutils.h header. Instantiate one as follows:
RGBPwmOut rgbLed(D9, D11, D12);
It has one API function, which sets the R, G, B PWM outputs using an input H, S, V. The hue is specified in centi-degrees [0, 36000), while the saturation and value are specified in 16-bit fixed point [0, 65535].
RGBPwmOut::hsv_uint16(uint16_t h_cdeg, uint16_t s, uint16_t v);
Messages can be read from a CAN object using CAN::read(CANMessage&). If there was a message pending, the function returns 1 and stores the message in the input (reference) argument. If there was no message pending, the function returns 0.
In lab1, you learned about pointers. C++ also has reference types, denoted with &, which act like pointers but use value notation. You can see its use in CAN::read, allowing the function to return both a read status (as its return value) and a message.
Note: some style guides discourage the use of references as output values, preferring to use pointer notation to make it explicit that an argument may be overwritten.
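The difference between the two output styles can be seen in a small host-side sketch. `Frame` and `OneSlotReceiver` are hypothetical stand-ins, not mbed types. The reference overload mirrors the shape of CAN::read, while the pointer overload shows what the style guides mentioned above would prefer.

```cpp
#include <cstdint>

struct Frame {
    uint32_t id;
};

// Hypothetical receiver holding at most one pending frame.
class OneSlotReceiver {
public:
    void deliver(const Frame& f) {
        slot_ = f;
        full_ = true;
    }

    // Reference output: the caller writes rx.read(frame), with no visible
    // hint at the call site that frame may be overwritten.
    int read(Frame& out) {
        if (!full_) return 0;
        out = slot_;
        full_ = false;
        return 1;
    }

    // Pointer output: the caller writes rx.read(&frame), and the '&'
    // makes the overwrite explicit.
    int read(Frame* out) { return read(*out); }

private:
    Frame slot_{};
    bool full_ = false;
};
```

Both overloads return a status and deliver the frame through the argument. Only the call-site notation differs.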
A common structure for checking for messages is:
CANMessage msg;
while (can.read(msg)) {
// do something based on the received message
}
CANMessage (of type CAN_Message) has the structure:
struct CAN_Message {
    unsigned int id;        // 29 bit identifier
    unsigned char data[8];  // Data field
    unsigned char len;      // Length of data field in bytes
    CANFormat format;       // 0 - STANDARD, 1 - EXTENDED IDENTIFIER
    CANType type;           // 0 - DATA FRAME, 1 - REMOTE FRAME
};
For this lab, we'll only consider id and data.
The central node broadcasts its hue, in centi-degrees, as a 16-bit integer in the payload of a CAN message with id=0x43. You will have to unpack the byte-oriented data from the CAN message into a 16-bit hue (essentially using the opposite process in lab 2.3) and write it to the hue of the RGB LED. Use saturation = 65535 and value (brightness) = 32767. Don't forget that the hue should only be changed upon receiving a message with id=0x43 - there may be other network traffic on the CAN network.
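The unpacking step, together with the ID check, can be sketched in host-side C++ as follows. The function and constant names are hypothetical, and the length check is an extra precaution rather than something the lab requires.

```cpp
#include <cstdint>

constexpr uint32_t kHueMessageId = 0x43;  // hue broadcast ID used in this lab

// Combine two payload bytes (big-endian) into one 16-bit value.
uint16_t unpackUint16BigEndian(const uint8_t* data) {
    return static_cast<uint16_t>((data[0] << 8) | data[1]);
}

// Returns true and writes `hue` only for a hue message with a full payload;
// any other message leaves `hue` untouched.
bool hueFromMessage(uint32_t id, const uint8_t* data, uint8_t len, uint16_t& hue) {
    if (id != kHueMessageId || len < 2) return false;
    hue = unpackUint16BigEndian(data);
    return true;
}
```

For example, the payload bytes 0x46, 0x50 unpack to 18000 centi-degrees, i.e. a hue of 180 degrees.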
Once you're done, you can compare against the reference solution.
Note: most CAN peripheral hardware provides several message boxes which messages can be received into - once a message is read out of a box, it can receive a new one. Essentially, this provides a small buffer of received messages before additional incoming messages must be dropped.
In our code, we sometimes use a CANRXTXBuffer object (instead of just CAN) - this presents the same API as mbed's CAN but adds a transparent software buffer.
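To illustrate what a bounded receive buffer does when it fills up, here is a minimal ring buffer sketch. It is not the CANRXTXBuffer implementation, and the class name is hypothetical. It also ignores interrupt safety, which a real driver-level buffer would need to handle.

```cpp
#include <array>
#include <cstddef>

// Fixed-capacity FIFO: new items are rejected (dropped) when full.
template <typename T, std::size_t N>
class RingBuffer {
public:
    // Returns false, leaving the buffer unchanged, if there is no room.
    bool push(const T& item) {
        if (count_ == N) return false;
        buf_[(head_ + count_) % N] = item;
        ++count_;
        return true;
    }

    // Returns false if the buffer is empty; otherwise writes the oldest item.
    bool pop(T& out) {
        if (count_ == 0) return false;
        out = buf_[head_];
        head_ = (head_ + 1) % N;
        --count_;
        return true;
    }

    std::size_t size() const { return count_; }

private:
    std::array<T, N> buf_{};
    std::size_t head_ = 0;   // index of the oldest item
    std::size_t count_ = 0;  // number of stored items
};
```

A larger buffer tolerates longer gaps between reads, but once it is full the outcome is the same as with the hardware message boxes: additional messages are lost.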
A microcontroller that can only do one thing is quite limiting, so let's put together the push-to-blink functionality (lab 2.2), the RGB LED hue update functionality (lab 2.4), and also the central node LED pulse functionality. Note that the central node code also implements the functionality from lab 2.2: a button press causes it to send a CAN message with id=0x42.
Objective: Upon receiving a CAN message with id=0x42, pulse an LED on for 0.5 s, while also updating the RGB LED hue from remote messages and sending a CAN message on a button press.
The simplest way would be to add another message handler in the can.read(msg) loop:
CANMessage msg;
while (can.read(msg)) {
    if (msg.id == 0x43) {
        // hue code here
    } else if (msg.id == 0x42) {
        led2 = 1;
        wait(0.5);  // blocking: nothing else runs for 0.5 s
        led2 = 0;
    }
}
Try it, and it works, but only kind of. The problem is that the wait is blocking - it doesn't return until the time has elapsed, so the system can't do other things (like update the RGB LED hue from received CAN messages or detect button presses). This is noticeable: during the LED pulse time, the RGB LED hue will freeze, then jump to the newest received value. Obviously, this looks bad, and this is bad.
Most microcontrollers have built-in timer peripherals (essentially counters that tick in the background, independently of what the CPU is doing) that can be queried for the current count. Mbed provides the Timer class as a hardware abstraction layer. A Timer has these operations:
- Timer::start(): starts the timer. Timers start off not ticking.
- Timer::reset(): resets the elapsed time to 0. Does not change the started / stopped state.
- Timer::stop(): self-explanatory.
- Timer::read(): reads the elapsed time, in seconds, as a float.
- Timer::read_ms(), Timer::read_us(): read the elapsed time as an integer, in milliseconds or microseconds respectively.
Given that, we could restructure our code so that upon receiving the LED pulse message, the LED is turned on and a timer is started and reset. Then, regularly in the main loop, the timer is read and compared against the threshold time (500 ms here). Once it goes over the threshold, the LED is turned off. Since no blocking operations are performed, the RGB LED hue will continue updating while the LED is on.
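The structure of this non-blocking approach can be sketched on the host by passing the current time in as a parameter, instead of reading an mbed Timer. The class name is hypothetical, and this is only one way to organize the logic. On the board, `nowMs` would come from something like a free-running Timer's read_ms().

```cpp
#include <cstdint>

// Tracks a fixed-length pulse without ever blocking.
class PulseTimer {
public:
    explicit PulseTimer(uint32_t durationMs) : durationMs_(durationMs) {}

    // Call when the pulse message arrives: (re)starts the pulse.
    void trigger(uint32_t nowMs) {
        startMs_ = nowMs;
        active_ = true;
    }

    // Call on every pass through the main loop.
    // Returns whether the LED should currently be on.
    bool update(uint32_t nowMs) {
        // Unsigned subtraction stays correct even if the time value wraps.
        if (active_ && (nowMs - startMs_) >= durationMs_) active_ = false;
        return active_;
    }

private:
    uint32_t durationMs_;
    uint32_t startMs_ = 0;
    bool active_ = false;
};
```

In the main loop, the message handler would call `trigger` and every iteration would assign the result of `update` to the LED, leaving the loop free to keep servicing CAN messages and the button.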
This style is called cooperative multitasking because the different tasks must co-operate and yield control to other tasks regularly. For example, when we have a blocking wait in the deceptively simple method, the LED pulsing prevented the RGB LED hue from updating. This type of multitasking is simple, explicit (almost nothing going on "behind the scenes"), but also vulnerable to poor code like blocking operations.
Implement the non-blocking pulsing LED, and verify that it works (RGB LED hue continues updating throughout the LED pulse such that you don't see visible hue discontinuities). Once you're done, you can check against the reference solution.
While we currently don't use threading in our codebase, it's still a common programming model, and it's worthwhile to learn its features and pitfalls.
Note: The solution code for this subpart won't work on the Nucleo F042K6 since it doesn't support the RTOS component. This subpart along with solution code for the BRAIN is left in for reference.
While cooperative multitasking accomplished our goals, it did so by spreading code around - for example, while we would interpret the LED blink as a logical unit, it actually ended up separated into two phases. In this section, we will explore a different approach to multitasking: threading and operating systems.
Objective: Lab 2.5, but using the mbed RTOS.
A thread is a sequence of instructions in a program. In the examples above, the entire program consisted of one thread, main(). However, with a scheduler (a typical component in an operating system), it is possible to run multiple threads at once - typically, the scheduler interleaves the threads in time onto one processor. Across threads within a process, resources like memory are shared.
There are many different models of communications between threads. The simplest, but most dangerous, is by manipulating shared memory. More structured communications channels are also available, like locks, semaphores, and queues. In this section, we will focus on queues as they are a robust yet conceptually simple channel.
A real-time operating system is an operating system that attempts to meet real-time constraints, generally prioritizing latency over throughput. There are two types: hard real-time systems are guaranteed to meet task deadlines, while soft real-time systems provide no such guarantees. The mbed RTOS consists of a task scheduler but provides no task deadline guarantees (or even any static timing analysis capability), and is a soft real-time system.
Multitasking in general is a hard problem, and this tutorial only covers the very basics.
While the RTOS library has been included in the standard build, we haven't used it ... until now. Let's start by structuring our threads:
- the main thread will handle CAN communications, dispatching notifications to other threads based on received data
- a LED thread will blink the LED upon receiving a notification
- a button thread will notify the CAN thread to transmit a message
First, include the RTOS header:
#include "rtos.h"
Then, declare two Thread objects (note: main is implicitly its own thread):
Thread ledThread;
Thread buttonThread;
Each Thread will run a function, the skeletons of which are provided:
void led_thread() {
    while (true) {
        // TODO: pulse LED upon receiving notification
    }
}
void button_thread() {
    bool lastButton = true;
    while (true) {
        bool thisButton = btn;  // read the button exactly once per loop
        if (thisButton != lastButton && !thisButton) {
            // TODO: enqueue a CAN message with id=0x42 for transmission upon button press
        }
        lastButton = thisButton;
        Thread::wait(5);
    }
}
One important note is the use of Thread::wait(uint32_t millisec), as opposed to bare wait. Thread::wait de-schedules the thread (allowing other threads to run as it is waiting), as opposed to spinning (which takes up compute resources doing nothing). As the button is sampled sufficiently slowly, there is no need to debounce.
With the Thread objects and functions, we are now ready to write our main function:
int main() {
    // Initialize CAN controller at 1 Mbaud
    can.frequency(1000000);
    ledThread.start(led_thread);
    buttonThread.start(button_thread);
    while (true) {
        // CAN receive handling
        CANMessage msg;
        while (can.read(msg)) {
            if (msg.id == 0x43) {
                uint16_t hue = (msg.data[0] << 8) | (msg.data[1] << 0);
                rgbLed.hsv_uint16(hue, 65535, 32767);
            } else if (msg.id == 0x42) {
                // TODO: notify the LED thread to pulse the LED
            }
        }
        // CAN transmit handling
        // TODO: transmit CAN messages
        Thread::yield();
    }
}
You'll notice two new constructs here:
- Thread::start(*fn) starts a Thread object at a given function. From then on, the argument function and the calling thread will run concurrently.
- Thread::yield() "yields" the current thread, allowing other threads a chance to run. While the operating system will pre-empt (and switch to another thread) if one thread has been running for too long, proactively yielding when a thread has no more work (for example, waiting on input when polling) is good practice.
Next, define the two communication channels. One will be a queue of pulse times for the LED, and another will be a queue of CAN messages to transmit:
Mail<CANMessage, 16> canTransmitQueue;
Mail<uint16_t, 1> ledQueue;
Mail is a queue that stores elements (not to be confused with Queue, which can only store pointers). The first type parameter is the data type that is stored, and the second parameter is the number of elements in the queue. In the example above, canTransmitQueue consists of up to 16 elements of type CANMessage.
Mail's API is kind of funny:
- To enqueue an element, you first need to allocate storage space using Mail::alloc(), which returns a pointer to an element that you need to initialize. Then, you can actually enqueue the element using Mail::put(elem). For example, to enqueue a wait time of 500 ms into ledQueue:
uint16_t* waitTime = ledQueue.alloc(osWaitForever);
*waitTime = 500;
ledQueue.put(waitTime);
Note: Mail::alloc() takes an optional parameter of the maximum time to wait for a free element buffer. In the above, we essentially do a blocking allocation, waiting indefinitely until one is available. If a finite time is passed in, alloc can fail and return NULL.
- To dequeue an element, use osEvent evt = Mail::get(), which returns an object of type osEvent. The object's status field indicates the event's meaning, and we are interested in the case when evt.status == osEventMail. For a mail event, we can get a pointer to the element using evt.value.p followed by a typecast. Remember to Mail::free() the pointer when done so the element memory can be re-used. An example for polling and reading ledQueue is:
osEvent evt = ledQueue.get();
if (evt.status == osEventMail) {
    uint16_t waitTime = *(uint16_t*)evt.value.p;
    ledQueue.free((uint16_t*)evt.value.p);
    led2 = 1;
    Thread::wait(waitTime);
    led2 = 0;
}
You can see that all the LED code is cleanly centralized into one location. Also note that Mail::get() takes an optional maximum wait time too, defaulting to forever (essentially blocking until a message is available).
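The alloc / put / get / free life cycle can be modeled on the host with a fixed pool of slots plus a FIFO of pointers. This is a single-threaded teaching model with a hypothetical class name, not the mbed Mail implementation. It has no blocking, no timeouts and no thread safety, and it assumes each allocated slot is put at most once.

```cpp
#include <array>
#include <cstddef>

template <typename T, std::size_t N>
class MailModel {
public:
    // Returns a free slot to fill in, or nullptr when all N slots are in use.
    T* alloc() {
        for (std::size_t i = 0; i < N; ++i) {
            if (!inUse_[i]) {
                inUse_[i] = true;
                return &slots_[i];
            }
        }
        return nullptr;
    }

    // Enqueue a slot previously returned by alloc().
    void put(T* item) {
        fifo_[(head_ + count_) % N] = item;
        ++count_;
    }

    // Dequeue the oldest slot, or nullptr if nothing is queued.
    T* get() {
        if (count_ == 0) return nullptr;
        T* item = fifo_[head_];
        head_ = (head_ + 1) % N;
        --count_;
        return item;
    }

    // Return a slot to the pool so alloc() can hand it out again.
    void free(T* item) {
        inUse_[static_cast<std::size_t>(item - slots_.data())] = false;
    }

private:
    std::array<T, N> slots_{};
    std::array<bool, N> inUse_{};
    std::array<T*, N> fifo_{};
    std::size_t head_ = 0;
    std::size_t count_ = 0;
};
```

The model makes one property easy to see: a slot stays reserved from alloc() until free(), even after it has been dequeued, so a consumer that forgets to free eventually starves every producer.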
Putting the two above examples together, our full code now looks like:
#include "mbed.h"
#include "rtos.h"
#include "ledutils.h"

RGBPwmOut rgbLed(P0_5, P0_6, P0_7);
DigitalOut led1(P0_3);
DigitalOut led2(P0_9);
DigitalIn btn(P0_4);
RawSerial serial(P0_8, NC, 115200);
CAN can(P0_28, P0_29);

Mail<CANMessage, 16> canTransmitQueue;

Thread ledThread;
Mail<uint16_t, 1> ledQueue;
void led_thread() {
    while (true) {
        osEvent evt = ledQueue.get();  // blocks until a pulse request arrives
        if (evt.status == osEventMail) {
            uint16_t waitTime = *(uint16_t*)evt.value.p;
            ledQueue.free((uint16_t*)evt.value.p);
            led2 = 1;
            Thread::wait(waitTime);
            led2 = 0;
        }
    }
}

Thread buttonThread;
void button_thread() {
    bool lastButton = true;
    while (true) {
        bool thisButton = btn;  // read the button exactly once per loop
        if (thisButton != lastButton && !thisButton) {
            // TODO: enqueue a CAN message with id=0x42 for transmission upon button press
        }
        lastButton = thisButton;
        Thread::wait(5);
    }
}

int main() {
    // Initialize CAN controller at 1 Mbaud
    can.frequency(1000000);
    ledThread.start(led_thread);
    buttonThread.start(button_thread);
    while (true) {
        // CAN receive handling
        CANMessage msg;
        while (can.read(msg)) {
            if (msg.id == 0x43) {
                uint16_t hue = (msg.data[0] << 8) | (msg.data[1] << 0);
                rgbLed.hsv_uint16(hue, 65535, 32767);
            } else if (msg.id == 0x42) {
                uint16_t* waitTime = ledQueue.alloc(osWaitForever);
                *waitTime = 500;
                ledQueue.put(waitTime);
            }
        }
        // CAN transmit handling
        // TODO: transmit CAN messages
        Thread::yield();
    }
}
Test that it works, that you can simultaneously blink the LED while still updating the RGB LED. Afterwards, fill out the two TODOs in the same style as the LED enqueue / dequeue operations.
Once you've given it a shot, check out the reference solution.
With great power comes great responsibility.
Now that we've implemented the same functionality in both cooperative multitasking and threaded forms, let's compare.
What did we gain with threading?
- Related sequences of instructions appear as a single unit in code, even if their execution is separated by other threads. This is apparent in the LED pulsing code.
- The appearance of multitasking without needing to handle the details manually, as with cooperative multitasking.
- Communication between tasks is made explicit with the use of Mail queues.
What pitfalls did we avoid?
- The use of Mail queues provides a principled way to communicate between threads. Threading actually exposes a shared memory model, where threads can read and write each other's memory, and not being careful and methodical can lead to subtle, non-deterministic bugs like race conditions.
- The Mbed RTOS has implementation limitations, like maximum number of threads and predefined maximum stack size. While this trivial example stayed within the limits, the RTOS isn't always able to produce a helpful error message on failure, making debugging painful.
What did we lose with threading?
- The actual execution behavior of the RTOS isn't predictable, especially with regard to very fine (millisecond-level) timing. While the thread scheduler gives the illusion of multiple threads executing simultaneously, the actual thread swapping occurs at human timescales (every few milliseconds) rather than machine timescales.
Overall, threading provides a different approach than cooperative multitasking and can be a powerful tool in certain situations, but only if you're aware of the pitfalls and shortcomings. If you're interested in using threading for your project, do talk to us so we can discuss whether it's appropriate and how to mitigate the potential issues.