-
Notifications
You must be signed in to change notification settings - Fork 13.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Heap: Improve debug caller address reporting #8720
base: master
Are you sure you want to change the base?
Conversation
lite refactoring of DEBUG_ESP_OOM printing Overall refactoring to make easier to follow. Added ABI changes includes atomic save of OOM debug info Added debug support for C++ "delete" operator Avoid identifying the wrapper code as the source of the Heap call. Each outer wrapper passes the caller's address down to the next level. Improves "caller" address reporting for OOM and Poison events. Previously, some OOM and Poison events reported info pointing to the thick heap wrapper rather than back to the caller. Update abi.cpp, heap.cpp and umm_local.c (umm_poison) to support caller parameter.
sketch or library. Enhanced DEBUG_ESP_OOM to track OOM events when C++ Exceptions: "enabled" is selected. Report more interesting higher level code address as caller instead of GCC C++ module. Enhanced poison neighbor reporting to distinguish between allocation vs. neighbor failing poison check. For draft version, added debug macro HEAP_DEBUG_PROBE_PSFLC_CB to aid in evaluating caller address results.
Added DEBUG_ESP_WITHINISR to provide notification for ISRs using C++ 'new'/'delete' operators or LIBC calls using _malloc_r, ... Moved block of port related changes for UMM_POISON... from `umm_malloc_cfg.h` to `umm_malloc_cfgport.h`. Support for specifying both UMM_POISON_CHECK and UMM_POISON_CHECK_LITE at the same time did not work as expected. UMM_POISON_CHECK_LITE dominated in the build. Fix: Keep things simple prevent combining UMM_POISON_CHECK and UMM_POISON_CHECK_LITE. Defines for UMM_POISON_CHECK, UMM_POISON_CHECK_LITE, and UMM_POISON_NONE are checked for mutual exclusivity. For sketches with extremely limited resources, added UMM_POISON_NONE to prevent UMM_POISON_CHECK_LITE from being added to debug builds. Make UMM_POISON_CHECK_LITE fail messages more specific about the location. At free distinguishes between neighbor and allocation. And, identify the caller's address.
pvPortMalloc(,,,true) case. Improved comments.
…nto pr-heap-refactor3
…nto pr-heap-refactor3
…nto pr-heap-refactor3
…nto pr-heap-refactor3
Let me know if you prefer I factor out experimental |
If it is still a draft, probably makes sense to leave it out of the PR to reduce loc |
Updated umm_poison move from umm_malloc_cfg.h to umm_malloc_cfgport.h
in postmortem and printing OOM from Heap wrappers. Fixed crash within postmortem when OOM file is NULL Fixed printing OOM file name when not in ISR.
removed experimental debug extension as requested added forgotten OOM count to the postmortem report
resolved conflicts with umm_malloc_cfg.h and umm_malloc_cfgport.h
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated this description to reflect the current PR state: not ready
Still true?
ets_printf_P(PSTR("\nlast failed alloc call: %08X(%d)@%S:%d\n"), | ||
(uint32_t)umm_last_fail_alloc_addr, umm_last_fail_alloc_size, | ||
umm_last_fail_alloc_file, umm_last_fail_alloc_line); | ||
ets_printf_P(PSTR("\nlast failed alloc call: 0x%08X(%d), File: %S:%d\n"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
%s
not %S
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
%S
, umm_last_fail_alloc_file
can be a PROGMEM
string
At least some SDK file name strings are in .irom.text
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No special case in newlib here? Implementation thinks of both 's' and 'S' as exactly same
https://github.com/earlephilhower/newlib-xtensa/blob/ebc967552ce827f21fc579fd8c437037c1b472ab/newlib/libc/stdio/nano-vfprintf_i.c#L217
printf(3) says it is '%ls', which is wide-char ('wchar_t') and non-applicable here
Synonym for ls. Don't use.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have always been confused about this duality. Some print functions give an error (or is it a warning?) about %S and PROGMEM strings while other functions don't, like Serial.printf_P.
%S
is used throughout core_esp_8266_postmortem.cpp
for handling the printing of PROGMEM strings.
The comment for the code you pointed out reads to me that Arduino intends to have %S for PROGMEM strings.
case 'S': // TODO: Verify cap-S under Arduino is "PROGMEM char*", not wchar_t
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
6280e98 / #5376
requested in #4223 (...which I would guess gets dragged down from AVR Core?)
merged in the original newlib 3.x.x code via igrr/newlib-xtensa#5
Add "%S" (capital-S) format that I've been told, but cannot verify, is used in Arduino to specify a PROGMEM string parameter in printfs, as an alias for "%s" since plain "%s" can now handle PROGMEM.
as-is both '%s' and even '%.*s' work seamlessly, no need for extra care
but it is true that it is used across the file, probably better served in a separate PR. at the very least for consistency
cores/esp8266/heap.cpp
Outdated
#define UMM_MALLOC_FL(s,f,l,c) umm_poison_malloc(s) | ||
#define UMM_CALLOC_FL(n,s,f,l,c) umm_poison_calloc(n,s) | ||
#define UMM_REALLOC_FL(p,s,f,l,c) umm_poison_realloc_flc(p,s,f,l,c) | ||
#define UMM_FREE_FL(p,f,l,c) umm_poison_free_flc(p,f,l,c) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would it make sense to use struct à la source_location
to the reduce number of args?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! That would be convenient.
Thoughts and Concerns:
- Can
source_location
be used from.c
modules? - Will function names be stored in
PROGMEM
or DRAM? - Their example shows a longer (full definition) string than we have used for the function name, which could cause build size issues for debug builds.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I am not able to build with #include <source_location>
Detecting libraries used...
/home/mhightow/Arduino/hardware/esp8266com/esp8266/tools/xtensa-lx106-elf/bin/xtensa-lx106-elf-g++ -D__ets__ -DICACHE_FLASH -U__STRICT_ANSI__ -D_GNU_SOURCE -DESP8266 -Os @/tmp/arduino_build_311930/core/build.opt -I/home/mhightow/Arduino/hardware/esp8266com/esp8266/tools/sdk/include -I/home/mhightow/Arduino/hardware/esp8266com/esp8266/tools/sdk/lwip2/include -I/home/mhightow/Arduino/hardware/esp8266com/esp8266/tools/sdk/libc/xtensa-lx106-elf/include -I/tmp/arduino_build_311930/core -c @/home/mhightow/Arduino/hardware/esp8266com/esp8266/tools/warnings/none-cppflags -g -free -fipa-pta -Werror=return-type -mlongcalls -mtext-section-literals -fno-rtti -falign-functions=4 -std=gnu++17 -ffunction-sections -fdata-sections -fexceptions -DMMU_IRAM_SIZE=0xC000 -DMMU_ICACHE_SIZE=0x4000 -DFP_IN_IROM -w -x c++ -E -CC -DNONOSDK22x_190703=1 -DF_CPU=80000000L -DLWIP_OPEN_SRC -DTCP_MSS=536 -DLWIP_FEATURES=1 -DLWIP_IPV6=0 -DDEBUG_ESP_PORT=Serial -DARDUINO=10819 -DARDUINO_ESP8266_NODEMCU_ESP12E -DARDUINO_ARCH_ESP8266 "-DARDUINO_BOARD=\"ESP8266_NODEMCU_ESP12E\"" "-DARDUINO_BOARD_ID=\"nodemcuv2\"" -DLED_BUILTIN=2 -DFLASHMODE_DIO -I/home/mhightow/Arduino/hardware/esp8266com/esp8266/cores/esp8266 -I/home/mhightow/Arduino/hardware/esp8266com/esp8266/variants/nodemcu /tmp/arduino_build_311930/sketch/SrcLoc.ino.cpp -o /dev/null
SrcLoc:3:10: fatal error: source_location: No such file or directoryAlternatives for source_location: []
3 | #include <source_location>
| ^~~~~~~~~~~~~~~~~
compilation terminated.
I also noticed that path ... esp8266com/esp8266/tools/sdk/libc/xtensa-lx106-elf/include
does not exist on my machine, even after rerunning ./get.py
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not available in C, as it would need default-arg constructed by the caller.
Not available in -std=c++17, part of the -std=c++2a
What I meant was copying the behaviour, at least in structure which is then passed to the func. May be significantly larger in size, on second though :/
i.e. static struct foo { .file = ..., .lineno = ...}; foo.caller = __builtin__(); func(..., &foo);
Plus, we do have a hack around __FILE__
which is not applied to source_location to put strings in progmem as default .rodata
is still in ram
cores/esp8266/abi.cpp
Outdated
|
||
extern "C" void __cxa_pure_virtual(void) __attribute__ ((__noreturn__)); | ||
extern "C" void __cxa_deleted_virtual(void) __attribute__ ((__noreturn__)); | ||
|
||
#if defined(__cpp_exceptions) && (defined(DEBUG_ESP_OOM) || defined(DEBUG_ESP_PORT) || defined(DEBUG_ESP_WITHINISR)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't 'new_handler' be independent of debugging?
since the idea is to allow plain new to sometimes avoid throwing (regardless of whether it builds w/ -fexceptions or without). exception-less builds obviously just call abort for would-be exception
ref new_op.cc and new_opnt.cc, nothrow variant just wraps the other in try catch
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't 'new_handler' be independent of debugging?
It has been a while since I did this, I think the increased build size for non-debug builds was a consideration.
It would add 136 bytes of IROM to the non-debug build so that it always builds with our revised new_handler
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will have to recheck too, then. Debug=enabled without exceptions are not handled though?
Plus there are more missing overloads from -std=c++17
https://en.cppreference.com/w/cpp/memory/new/operator_new
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Plus there are more missing overloads from -std=c++17
The current master does not support "new" align operations. Compiler reports:
/workdir/repo/gcc-gnu/libstdc++-v3/libsupc++/new_opa.cc:86: undefined reference to
memalign'`
No hits on a quick issue check for memalign
.
The library umm_malloc needs an enhancement to support an allocate-aligned option. Also, things will get complicated when incorporating the poison check feature.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've worked through adding an alignment option to the umm_malloc
library. In the process, I have addressed an unreported alignment issue. umm_malloc
allocated data addresses always landed on an 8-byte aligned - 4-byte.
Now I need to finish working through all the build options in abi.cpp
and heap.cpp
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(edit: comment thread from above)
Can this be a config error? GCC / newlib / hal / etc.
Reading cppreference and some of the internal GCC stuff from '-faligned-new=X', 'Fundamental alignment' and max_align_t seem to come from 'long double' and 'long long' which are afaik 4-byte ones.
Currently, it does indeed show 8-byte
constexpr auto a = alignof(long long); // 8
constexpr auto b = alignof(long double); // 8
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(edit: comment thread from above)
Can this be a config error? GCC / newlib / hal / etc.
Reading cppreference and some of the internal GCC stuff from '-faligned-new=X', 'Fundamental alignment' and max_align_t seem to come from 'long double' and 'long long' which are afaik 4-byte ones.
Currently, it does indeed show 8-byte
constexpr auto a = alignof(long long); // 8 constexpr auto b = alignof(long double); // 8
...which is sort-of expected, as run-of-the-mill xtensa isa pdf does specify alignment req for long double and long long as 8. compiler here works according to spec, stack and heap alignment would stay consistent vs. current approach
no mention / explicit alignment things mentioned in the machine config, besides the BIGGEST_ALIGNMENT
of 128 (bits)
https://github.com/gcc-mirror/gcc/tree/master/gcc/config/xtensa /ALIGN/
esp32 variants adhere to a similar standard. however, esp8266 rtos does use 4-byte heap alignment
(which I would assume works just fine for the most part)
Cleanup remnants from backport
Fixed postmortem build issue around optional oom count function Made corrections for build macro DEBUG_ESP_WITHINISR
Enhancements to the UMM_MALLOC library * Support for 8-byte aligned data addresses. * To accommodate any libraries or Sketches that may have workarounds for the old 4-byte alignment, a deprecated build option "-DUMM_LEGACY_ALIGN_4BYTE=1" was added. * Support the memalign() function needed for the C++17 "new" aligned operator. Enable with build option "-DUMM_ENABLE_MEMALIGN=1." * Build options "-DUMM_LEGACY_ALIGN_4BYTE=1" and "-DUMM_ENABLE_MEMALIGN=1" are mutually exclusive. * Allocations from memalign() internally appear and are handled like any other UMM_MALLOC memory allocation. * The existing free() function handles releasing memalign() memory allocations. * Function realloc() should not be called for aligned memory allocations. It can break the alignment. At worst, the alignment falls back to sizeof(umm_block), 8 bytes. * The UMM_POISON build option was extended to support memalign(). Modules heap.cpp and abi.cpp updated for build option "-DUMM_ENABLE_MEMALIGN=1". C++ "new" operator overloads are updated to support the alignment argument.
…alloc_cfgport.h Corrected last umm_block placement and heap_end Added build define DEV_DEBUG_ABI_CPP. Its intended use is for module code maintenance. Use DEV_DEBUG_ABI_CPP when debugging the new/delete overload wrappers in abi.cpp and heap.cpp.
#if UMM_ENABLE_MEMALIGN | ||
|
||
/* | ||
* Ensure alignment is power of 2; however, we allow zero. | ||
*/ | ||
|
||
if (0u != (alignment & (alignment - 1u))) { | ||
STATS__ALIGNMENT_ERROR(id_malloc, alignment); | ||
return ptr; | ||
} | ||
|
||
/* | ||
* Allocations default to 8-byte alignment same as __STDCPP_DEFAULT_NEW_ALIGNMENT__ | ||
* No need to execute extra alignment logic for small alignments. | ||
*/ | ||
|
||
static_assert(__STDCPP_DEFAULT_NEW_ALIGNMENT__ == 8u); | ||
static_assert(__STDCPP_DEFAULT_NEW_ALIGNMENT__ == sizeof(umm_block)); | ||
if (alignment <= sizeof(umm_block)) { | ||
alignment = 0u; // Use implementation default, 8. | ||
} | ||
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note of the -faligned-new=X
- https://gcc.gnu.org/onlinedocs/gcc/C_002b_002b-Dialect-Options.html#index-faligned-new
Shouldn't this default to 4-byte instead or alignof(umm block)? But, not sure I follow the size requirement for the block (...including the 4-byte adjustment happening below when initializing).
Or is this solving internal issue of UMM (i.e. 4-byte -> 8-byte) incorrectly handling some of allocation cases?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I add -faligned-new=4
or other non-8 values to the build the GCC compiler complains with:
cc1: warning: command-line option '-faligned-new=4' is valid for C++/ObjC++ but not for C
It will also complain with 16.
I will remove the static_asserts
. That will allow the use of -faligned-new=
umm_malloc allocates in blocks of 8 bytes.
Assume the total Heap range is in multiples of 8 bytes and starts at an 8-byte aligned address boundary.
It costs a total of 8 bytes of space (4 bytes at each end of the address range) for resulting data allocation addresses to be 8-byte aligned.
The umm_block structure is sized as two 4-byte fields.
The first holds two 16-bit block addresses to the previous and next umm_block.
The 2nd 4-byte field holds two 16-bit block addresses for the previous and next free umm_blocks.
Take the case of a minimum-sized allocation ocuping 1 umm_block, 8-bytes.
The 2nd 4-byte field becomes the start of the data address when allocated.
Once a block is allocated, the Heap overhead is 4 bytes leaving 4 bytes for data space. If we make sure the heap starts at an address that is 4-byte aligned but not 8-byte aligned, then the resulting data address will be 8-byte aligned. And, this will hold for all allocations.
ets_printf_P(PSTR("\nlast failed alloc call: %08X(%d)@%S:%d\n"), | ||
(uint32_t)umm_last_fail_alloc_addr, umm_last_fail_alloc_size, | ||
umm_last_fail_alloc_file, umm_last_fail_alloc_line); | ||
ets_printf_P(PSTR("\nlast failed alloc call: 0x%08X(%d), File: %S:%d\n"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No special case in newlib here? Implementation thinks of both 's' and 'S' as exactly same
https://github.com/earlephilhower/newlib-xtensa/blob/ebc967552ce827f21fc579fd8c437037c1b472ab/newlib/libc/stdio/nano-vfprintf_i.c#L217
printf(3) says it is '%ls', which is wide-char ('wchar_t') and non-applicable here
Synonym for ls. Don't use.
cores/esp8266/abi.cpp
Outdated
|
||
extern "C" void __cxa_pure_virtual(void) __attribute__ ((__noreturn__)); | ||
extern "C" void __cxa_deleted_virtual(void) __attribute__ ((__noreturn__)); | ||
|
||
#if defined(__cpp_exceptions) && (defined(DEBUG_ESP_OOM) || defined(DEBUG_ESP_PORT) || defined(DEBUG_ESP_WITHINISR)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(edit: comment thread from above)
Can this be a config error? GCC / newlib / hal / etc.
Reading cppreference and some of the internal GCC stuff from '-faligned-new=X', 'Fundamental alignment' and max_align_t seem to come from 'long double' and 'long long' which are afaik 4-byte ones.
Currently, it does indeed show 8-byte
constexpr auto a = alignof(long long); // 8
constexpr auto b = alignof(long double); // 8
Cleanup '//D' comments Improvements to new/delete debug logging
Removed '//D' block comments Removed redundant title sub-string from DEV_DEBUG_ABI_CPP. Moved some operation delete's with respect to build macros such that they were built when needed.
to esp8266/examples. I am not sure this is the best locations.
…te monitoring. Made example more instructive on build option error.
PR Status: ready
Heap: Improve debug caller address reporting to indicate the address in the sketch or library.
Lite refactoring of
heap.cpp
and expanded commentsDEBUG_ESP_OOM
printingOverall avoid identifying the wrapper code as the source of the Heap call. Each outer wrapper passes the caller's address down to the next level.
Improves "caller" address reporting for OOM and Poison events. Previously, some OOM and Poison events reported info pointing to the thick heap wrapper rather than back to the caller. Update
abi.cpp
,heap.cpp
andumm_local.c
(umm_poison
) to support caller parameters.Enhanced
DEBUG_ESP_OOM
to track OOM events when C++ Exceptions: "enabled" is selected. Report more interesting higher-level code addresses as the caller instead of GCC C++ module.Added
DEBUG_ESP_WITHINISR
to provide notification for ISRs using C++ "new"/"delete" operators orLIBC
calls using_malloc_r
, ...Enhanced poison neighbor reporting to distinguish between allocation vs. neighbor failing poison check.
Support for specifying both UMM_POISON_CHECK and UMM_POISON_CHECK_LITE at the same time did not work as expected. UMM_POISON_CHECK_LITE dominated in the build. Fix: Keep things simple and prevent combining UMM_POISON_CHECK and UMM_POISON_CHECK_LITE. Defines for UMM_POISON_CHECK, UMM_POISON_CHECK_LITE, and UMM_POISON_NONE are checked for mutual exclusivity.
For sketches with extremely limited resources, added UMM_POISON_NONE to prevent UMM_POISON_CHECK_LITE from being added to debug builds.
Moved port related changes for
UMM_POISON...
fromumm_malloc_cfg.h
toumm_malloc_cfgport.h
.Added support for aligned allocations. To preserve smaller build sizes, aligned allocations default to disable. When required, supply build option
-DUMM_ENABLE_MEMALIGN=1
.Heap Changes
Enhancements to the UMM_MALLOC library
-DUMM_LEGACY_ALIGN_4BYTE=1
was added.-DUMM_ENABLE_MEMALIGN=1
.-DUMM_LEGACY_ALIGN_4BYTE=1
and-DUMM_ENABLE_MEMALIGN=1
are mutually exclusive.memalign()
internally appear and are handled like any otherUMM_MALLOC
memory allocation.free()
function handles releasingmemalign()
memory allocations.realloc()
should not be called for aligned memory allocations. It can break the alignment. At worst, the alignment falls back tosizeof(umm_block)
, 8 bytes.Modules heap.cpp and abi.cpp updated for build option
-DUMM_ENABLE_MEMALIGN=1
. C++ "new" operator overloads were updated to support the alignment argument.(Keep this description updated to reflect the current PR )