Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prune the compiled code to its minimum expression #639

Draft
wants to merge 34 commits into
base: dev_branch
Choose a base branch
from

Conversation

JuanSapriza
Copy link
Contributor

@JuanSapriza JuanSapriza commented Feb 5, 2025

Objective

The objective of this PR is to minimize the code as much as possible.
Some new improvements will not affect compilation at all, such as adding --gc-sections, which simply removes unused sections and can free several kB.

Other changes might be more drastical and I will set them as optional and heavily disadviced configurations.

Background

Just leaving here as documentation. I am compiling the following code

int main() return 0;

The initial size was 24 kB. By removing the unused sections it goes down to 12 kB.

The remaining 12 kB are mostly due to:

  1. Interrupt handlers
  2. Standard libraries
    Screenshot from 2025-02-05 21-40-48
    Screenshot from 2025-02-05 21-40-38

Tasks

🟩 Remove unused sections
🟧 Not compile unused interrupt handlers ⚠️
🟩 Prune functions in syscalls to make the lighter
🟩 Remove unnecessary calls to standard libraries
🔲 Optionally make data come immediately after text
🔲 Reduce default data usage
🟩 Allow to have only 1 memory bank
🔲 Fix the use of COMPILER_FLAGS=-Os
❓ Move the power manager from 0x3000 (or any rounded address)
🔲 Remove handlers 32-63 from here
🔲 Protect against frequencies not multiple of 100 in the calculation of the NCO

Important ⚠️

This PR builds on top of #636. It is blocked by the merging of that one first (or i will have to cherry pick changes)

@davidmallasen davidmallasen added the software Software and application label Feb 6, 2025
@JuanSapriza
Copy link
Contributor Author

@davidmallasen @davideschiavone

With the latest modifications i see that the whole code + data of a small application could very well fit inside 1 memory bank. Yet there is the limitation of needing to have 2 at least.

Why is there such limitation?

@davideschiavone
Copy link
Member

davideschiavone commented Feb 6, 2025

@davidmallasen @davideschiavone

With the latest modifications i see that the whole code + data of a small application could very well fit inside 1 memory bank. Yet there is the limitation of needing to have 2 at least.

Why is there such limitation?

In principle there is not, it's legacy, if you want we can remove it - still, having 2 banks would be more efficient than one bank (Harvard architecture) so maybe two 16KB memories is better than one single 32kB - however, two 16kB of memories together are bigger than one bank of 32kB - so it's a trade-off - but only one bank would be very inefficient (2 cycles to fetch an instruction every time it is done at the same time of any other load or dma transaction, and 3/4 cycles for a lw/sw from the cve2 CPU)

@JuanSapriza
Copy link
Contributor Author

I agree, but for HEEPidermis, for example, there is a chance that accesses to memory from peripherals are sparse.
I will try to remove this restriction so we can simulate both options (two smaller or one bigger) once we have the full application.

@davideschiavone
Copy link
Member

I agree, but for HEEPidermis, for example, there is a chance that accesses to memory from peripherals are sparse. I will try to remove this restriction so we can simulate both options (two smaller or one bigger) once we have the full application.

having two small ones should be already possible (AFAIK) - maybe having only 1 is more tricky as the mcu-gen has probably been tailored around having at least 2, which is not just for the HW but for the generation of the linker scripts - nothing too hard to fix, but not just a parameter

@davideschiavone
Copy link
Member

btw - question @JuanSapriza , why do we have the print functions if we are not using it in your example? shouldn't that be pruned?

@JuanSapriza
Copy link
Contributor Author

Tomorrow i will test having two smaller ones. I think i tried it a while ago but dont remember the result. WIll also see how to have only one.

Regarding the printfs, it's sadly not as simple. The stdlibs are included in some of the drivers, so we might need to do some heavy prunning.

Today i managed to remove several things, but realized that the cost of making the compilation flow so much more complex and sensitive vs having 4kB of printf definitions was probably not worth it now.

@JuanSapriza
Copy link
Contributor Author

JuanSapriza commented Feb 7, 2025

With 2x16 kB banks everything fine. A Hello World:

Total space: 32.0 kB = Continuous: [16, 16] kB + Interleaved: [0] kB
Region 	 Start 	End	Sz(kB)	Usd(kB)	Req(kB)	Utilz(%) 
Code:  	  0.0	 16.0	 16.0	9.6	 12.3	76.8
Data:  	 16.0	 32.0	 16.0	5.5	  5.5	34.7

Cont 0 CCCCCCCCC------- 	56.2%
Cont 1 ddddd----------- 	31.2%

@JuanSapriza
Copy link
Contributor Author

@davideschiavone is suggesting to try this out. https://github.com/Velko/FsLibc/tree/master
I think it will help a lot bringing both code and data space down

@JuanSapriza
Copy link
Contributor Author

having two small ones should be already possible (AFAIK) - maybe having only 1 is more tricky as the mcu-gen has probably been tailored around having at least 2, which is not just for the HW but for the generation of the linker scripts - nothing too hard to fix, but not just a parameter

Just tried with 1 mem bank. Only needed to change a check in system.py where it was verifying that mem banks >=2 .... beyond that everything else worked out of the box jaahahah

@JuanSapriza
Copy link
Contributor Author

I went down to the syscalls, handlers and asserts and removed the use of <stdio.h>. This effectively frees 4 kB. Now i am at 8 kB of code + 5.3 kB of data that needs prunning.

Things that i need to check:

🔲 Changing this exception printings might break everything - we need tests over exceptions!
🔲 Not including unused files for compilation. The interrupt handlers are being overriden by their non-weak implementations, even if your application doesnt use that peripheral. This comes from the problem of including all files. e.g. if you comment the contents of dma.h you will get errors in the compilation from dma_sdk.h (??!?!?)
🔲 We have several circular dependencies on the drivers

@JuanSapriza
Copy link
Contributor Author

JuanSapriza commented Feb 7, 2025

Removed 1.7kB of program memory by adding a macro that resolved the UART NCo in compile time instead of before every printf.

This does not stop you from changing the baudrate in runtime, but you need to recompute the NCO.

LuigiGiuffrida98 and others added 2 commits February 19, 2025 14:25
* Don't disable global interrupts with DMA interrupt

* Add vsim option

* Add DMA functionality for SPI subaddressing

* Add setup functions to bsp

* Fix DMA subaddressing mode.

* Add tests for DMA subaddressing mode.

* Fix verible

* Debug in CI

* Delete commented parts in dma.sv and uncomment test sections in example_dma_subaddressing. Restore original version of example_power_manager.

* Add hardware fifo mode to DMA

This DMA mode allows data to be forwarded to an external accelerator with I/O fifos tightly coupled with the DMA

* Fix verible

* Revert "Debug in CI"

This reverts commit 560df9e.

* Fix CI

* Test CI

* Modify tb.vlt

Add testharness signals related to hw fifo mode to tb.vlt, as it may happen that they are actually undrive if no accelerator is coupled with the DMA

* Fix example_power_manager

dma.c change in DMA_CSR_REG_MIE_MASK in this PR causes the external interrupts to be disabled before the GPIO-related code portion in the example_power_manager. The previous version relied on the dma HAL to enable the external interrupts. The example has been modified to enable the external interrupts in the main.

* Revert "Fix CI"

This reverts commit ecdd737.

* Revert "Test CI"

This reverts commit a12dbea.

* Blacklist dma subaddress mode in CI

* Add documentation and image for new DMA modes

* Fix hw_fifo package imports and add copyright

* Fix hw_fifo_ctrl files

* Fix verible

---------

Co-authored-by: alessionaclerio22 <[email protected]>
* Added script to visualize memory usage after compilation

* Improved clarity

* Corrected the script that was not taking the values correctly in the vcase of no interleaved bus

* removed the need for numpy

* Corrected parsing of max memory size of code

* Improved way of computing start and end of text and data

* Added comments and simplified some functions

* Removed the parsing of the map file. Now parsing the output of readelf

* Fixed issues with representation

* Catch problems with flash_exec

* added readelf to the CI

* added apt insall binutils

* Simply quit the script if readelf not available

* removed all version

* corrected flash load

* Fixed display of interleaved space
@JuanSapriza
Copy link
Contributor Author

From the remaining 6.8 kB I see that the distribution is more or less as follows:

  • 3 kB of intrrupt handlers
  • 3 kB of libraries (basic syscalls)

I trust we can reduce the size of those interrupts based on the following:

Size of each fast interrupt handler

Each interrupt handler is dominated by the push/pop of all the registers.

  • Removing them by not declaring the function as an interrupt will reduce code size by 1kB.
  • But... breaks everything, naturally hahaha
  • Making the functions volatile will reduce in almost 1kB by only pushing the SP - but will end up crashing anyways.
  • I believe something can be done in the middle

image

  • Other option I can think of is trying not to define these functions if their peripheral is not used. This requires a better handling of source/header inclusions, which currently is a ducking mess 🦆

@JuanSapriza
Copy link
Contributor Author

In order to print an error message in the case where an interrupt that does not have an associated handler we were using the puts function. This function in turn used XXX and YYY function which resulted in the inclusion of 3 kB of code.

Sacrificing those messages the code can be reduced by 3kB when you do not use printf.
Removing puts required forcing the inclusion of some of the functions we had in syscalls.c

@JuanSapriza
Copy link
Contributor Author

Added a function _writestr in syscalls that allows us to simply write a string. Is a low-cost substitue to printf. Adds an overhead of ~100 Bytes, which feels acceptable vs 4 kB of using printf. Also, by using this in several places (like assert and handler), i reduced another 400 Bytes... so pays off hehe)

8 kB of code

int main(){
    printf("hi world");
    return 0;
}

3.7 kB of code

int main(){
    _writestr("hi world");
    return 0;
}

@JuanSapriza
Copy link
Contributor Author

JuanSapriza commented Feb 21, 2025

⚠️ Dangerous

I did something a little dangerous. Explanation below:

The handler functions of the fast interrupt controller (FIC) are defined added in the vectors.S and defined/declared in drivers/fast_intr_cntr/fast_intr_ctrl.c. There you have several handler_irq_fast_xxx for each fast interrupt connected to the core. Inside that function two things were done: 1) clear the interrupt flag, 2) call another handler: fic_irq_xxx.

void handler_irq_fast_dma(void)
{
    // The interrupt is cleared.
    clear_fast_interrupt(kDma_fic_e);
    // call the weak fic handler
    fic_irq_dma();
}

These handler_irq_fast_xxx was declared as __attribute__((interrupt)) - which made it back up all the registers upon entry and pop them back upon returning. This is safe and a good practice -- but... this means that we will ALWAYS have 14×2×17 instructions for pushing and popping even if the peripheral is not even used and the handler will be an empty handler! So we are always using ~2kB just for useless push/pop.

An opportunity to remove these arises from the second function call, the one to fic_irq_dma.
This is an empty weak implementation to be replaced by the application or driver: i.e. if you dont use a peripheral, this function remains empty, using very little space in memory.

__attribute__((weak, optimize("O0"))) void fic_irq_dma(void)
{
    /* Users should implement their non-weak version */
}

In the drivers you can find strong implementations like this one:

void fic_irq_dma(void)
{
    for (int i = 0; i < DMA_CH_NUM; i++)
    {
        if (dma_subsys_per[i].peri->TRANSACTION_IFR == 1)
        {
            dma_subsys_per[i].intrFlag = 1;
            dma_intr_handler_trans_done(i);
 . . . 

If we managed to transfer the register back-up safety to the strong implementation, then all those push/pop would only appear if you actually use the interrupt, and not by default.


My way of achieving this was pushing and popping only the registers necessary to clear the interrupt flag by making the handler "volatile", and then make the strong implementation "interrupt" to ensure that during the actual interrupt handling routine nothing is overridden.

I used these two macros to push/pop after entering the handler and before exiting. Also made the handler __attribute((naked)).

#define SAVE_REGISTERS()         \
    __asm volatile(              \
        "addi sp, sp, -16\n\t"   \
        "sw   a0, 12(sp)\n\t"   \
        "sw   ra, 8(sp)\n\t"     \
        "sw   a4, 4(sp)\n\t"     \
        "sw   a5, 0(sp)\n\t"     \
    )

#define RESTORE_REGISTERS()      \
    __asm volatile(              \
        "lw   a5, 0(sp)\n\t"     \
        "lw   a4, 4(sp)\n\t"     \
        "lw   ra, 8(sp)\n\t"     \
        "lw   a0, 12(sp)\n\t"   \
        "addi sp, sp, 16\n\t"    \
        "mret\n\t"              \
    )

well... this does not work.

My inital guess was that the stack is only 16 words deep, but I dont know where i could confirm or deny this. @davideschiavone @davidmallasen ideas? If that were the case, after doing some pushes i shouldnt be able to push the whole 16 registers again. I would get bullshit when popping and thus why i get llegal instruction (hart 0) at PC 0x00010460: 0xee909810 (PC much higher than the end of my code).

However, if i pop before calling the second handler (in which case the stack should be empty again), the program gets stuck forever.

ToDo find out where


So.. now... i removed the __attribute__((interrupt)) from the strong implementations and everything works fine. My guess: pure luck 🍀 This is a dangerous situation, right? When entering the strong implementation of the handler they only push/pop a few registers.


The advantage of this is that the code goes down from 3.8 kB -> 3.1 kB (not down 2 kB as some push/pop persist.

#define INTERRUPT_HANDLER_ABI __attribute__((aligned(4), naked))


/**
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is dangerous, i think this is wrong. When you jump to an async event (and return from it) you need to push to the stack all the registers you are gonna modify within the ISR and then restore them. How do you know that you are gonna use only those 5 registers ?

Especially if they are week, so could be redefined by users?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the handler the registers used are well known, the problem is indeed in the weak implementation.

If I managed to make all the registers be backed-up upon entering the weak implementation, that would be perfect. However, I could not manage to make it work. Can you think of anything that would prevent that?

@davideschiavone
Copy link
Member

Added a function _writestr in syscalls that allows us to simply write a string. Is a low-cost substitue to printf. Adds an overhead of ~100 Bytes, which feels acceptable vs 4 kB of using printf. Also, by using this in several places (like assert and handler), i reduced another 400 Bytes... so pays off hehe)

8 kB of code

int main(){
    printf("hi world");
    return 0;
}

3.7 kB of code

int main(){
    _writestr("hi world");
    return 0;
}

I would not add the _writesrtr function among the system calls. It's not standard. Regarding the interrupt puts, that's legacy from the core testbench, i would simply remove it or call exit or whatever dealing with the testbench. We don't need to print anything here

@JuanSapriza
Copy link
Contributor Author

I would not add the _writesrtr function among the system calls. It's not standard. Regarding the interrupt puts, that's legacy from the core testbench, i would simply remove it or call exit or whatever dealing with the testbench. We don't need to print anything here

Where would you add it? Needs to be somewhere low as handlers and assert use it.

puts was indeed removed in this branch.

@davideschiavone
Copy link
Member

davideschiavone commented Feb 21, 2025

I would not add the _writesrtr function among the system calls. It's not standard. Regarding the interrupt puts, that's legacy from the core testbench, i would simply remove it or call exit or whatever dealing with the testbench. We don't need to print anything here

Where would you add it? Needs to be somewhere low as handlers and assert use it.

puts was indeed removed in this branch.

Handlers should not print in general, but indeed assert could. For now leave it there

@davidmallasen davidmallasen changed the base branch from main to develop February 24, 2025 08:44
@JuanSapriza
Copy link
Contributor Author

I tested this on FPGA and also works with hello world and example_dma.

@JuanSapriza JuanSapriza marked this pull request as ready for review February 24, 2025 19:58
@davideschiavone
Copy link
Member

can you @JuanSapriza please remove the ISRs 32 to 63? https://github.com/esl-epfl/x-heep/blob/main/sw/device/lib/crt/vectors.S#L84 basically deleting from line 84 to 117 included

/*
REMOVING THIS CALL REDUCES CODE SIZE CONSIDERABLY
This will not print an error message in case of
jumping to an unknown error handler.
la a0, unknown_msg
jal ra, puts
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why don't we remove these puts? either replacing them with _write or simply handling them with the testbench? can be done here on in another Pr in the future, in this case pls open an issue

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would leave the comment to leave a record that they can be added if someone ever needs them, if you uncomment it it should still work. I can add this additional info if you think thats ok

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no I just mean removing the puts, not the ISRs - maybe replacing them with _write?

/*
THESE STRINGS ARE NOT LONGER NEEDED
Only the last two were used, and their
call was removed to reduce base code size
.section .rodata
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if they are not needed why don't we delete them?

}

__attribute__((weak)) void handler_ecall(void) {
printf("Environment call encountered\n");
//printf("Environment call encountered\n");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why commenting it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nah, this should be changed to a _writestr as in the others


// Requested baudrate is too high for the given clock frequency.
if (nco != nco_masked) {
if (uart->nco != nco_masked) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where is nco set and where the division is performed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should i add a comment on this?

@@ -49,20 +50,19 @@ pid_t _getpid (void);
int _isatty (int __fildes);
int _link (const char *__path1, const char *__path2);
_off_t _lseek (int __fildes, _off_t __offset, int __whence);
ssize_t _read (int __fd, void *__buf, size_t __nbyte);
int _read (int __fd, void *__buf, int __nbyte);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did you check this also with the OpenHW Group compiler used in HEEPatia? and with LLVM?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

with clang they compile (in the CI). I can check with the OHW compiler on HEEPatia. But i would say if this does not work with that compiler we open an issue later and i fix it quickly.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unfortunately I do not agree as as soon as X-HEEP gets revendorized in HEEPatia, everything goes broken - I am sure these changes may break it indeed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am checking now, will let you know in a while

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it works with the OHW compiler.
However, few things to fix:

  1. The app size scripts goes crazy when code is on the IL banks -- will do a PR later
  2. OHW compiler requires you to explicitly include "syscalls.h" -- will try running all the apps later

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I compiled all the apps from HEEPatia with make test TEST_FLAGS=--compile-only and 33 apps do not compile, most of them did already not compile on HEEPatia, and the others do not seem to be related to these changes.

Copy link
Member

@davideschiavone davideschiavone Feb 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is crazy, i thought @LuigiGiuffrida98 and @davidmallasen fixed them...
Let's discuss this tomorrow

// We assume that the lowest baudrate will be 9600, and the largest 256000.
// Given this range, by dividing by 100 it always remains integer and below 32-bits.
// This saves us the need of performing 64-bit divisions to compute NCO.
#define UART_NCO ((uint32_t)(((uint32_t)((uint32_t)(UART_BAUDRATE/100))<<20)/((uint32_t)(REFERENCE_CLOCK_Hz/100))))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what if we are using the FLL and we change the reference clock? did you try this in HEEPatia for example? I suggest you re-vendor this fork of X-HEEP in HEEPatia to try things out (tagging @davidmallasen for information)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is an interesting question. In theory, as long as the frequency is multiple of 100 it should work just fine. this could be not true if e.g. we use 32'768 Hz.

What i can do is add a macro that chooses the way of computing this: precise/light. Then, would this be chosen at which level? Not sure

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, a macro could be fine, like "TARGET_USE_FLL" or "USING_FIXED_CLK" etc. Keep in mind that this will increment the maintenance effort.... how big is the div64 bit function?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

almost 2kB!!! it would be 30% of the code of the minimum application

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I added an #ifdef in syscalls to prevent using a NCO that might not be defined.
That way if NCO is defined (you are using one of the FPGA targets, then we use the hardcoded NCO (because also the system clock will be hardcoded --i understand--). If that is not the case (e.g. you are targetting asic) then you compute the NCO with the 64-bit division.

    uart.base_addr   = mmio_region_from_addr((uintptr_t)UART_START_ADDRESS);
    uart.baudrate    = UART_BAUDRATE;
    uart.clk_freq_hz = soc_ctrl_get_frequency(&soc_ctrl);
    #ifdef UART_NCO
    uart.nco         = UART_NCO;
    #else
    uart.nco         = ((uint64_t)uart.baudrate << (NCO_WIDTH + 4)) / uart.clk_freq_hz;
    #endif

Thanks @davidmallasen !!

Copy link
Contributor Author

@JuanSapriza JuanSapriza Feb 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested this by removing the define for TARGET=pynq-z2 and compiles 👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also works fine on FPGA

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the only thing I do not like is that for chips (or any other thing) that will have a fixed clock frequency, we need to remember to define UART_NCO, which is little descriptive about it... we should have something more generic like "USE_FIXED_CLK_FREQ" or any other thing... we can discuss about it -

as I am on this topic, back at HEEPocrates time (with my old crappy Make flow for SW), the target folder was also exposed at HEEPocrates level (i.e. different than the X-HEEP one) while I think now it is lost, which is wrong... we should have a target for each CHIP (see indeed the fixed FLL thing) - for HEEPnosis the only thing I had to change was the USE_TARGET_FLL which we solved it by having the COMPILIER_FLAGS passed command line (thx @JuanSapriza ) - but in principle, it does not scale - we should have a TARGET for the chip itself, and another one (maybe called BOARD or whatever else) for the BOARD, being the testbench in case of X-HEEP_SIM, being something else in the X-HEEP_FPGA_PYNQ, etc... now they are a bit mixed, but if you see HEEPocrates it has many targets depending on the configuration of the testing board, the simulation, etc... we should re-think this - this thing I am writing here does not belong to this PR - it's just a not that I will now transfer to a Github issue - just posting it here to open the discussion on this PR tomorrow (pls @JuanSapriza add this PR in the heepidermis slides)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. Yesterday with @davidmallasen we reached a similar conclusion (or at least that crossed my mind).

@JuanSapriza
Copy link
Contributor Author

can you @JuanSapriza please remove the ISRs 32 to 63? https://github.com/esl-epfl/x-heep/blob/main/sw/device/lib/crt/vectors.S#L84 basically deleting from line 84 to 117 included

Not connected on ibex stands for we will never jump here unless its an error? then yes, i'll gladly do it

@JuanSapriza JuanSapriza marked this pull request as draft February 25, 2025 10:28
@davideschiavone
Copy link
Member

can you @JuanSapriza please remove the ISRs 32 to 63? https://github.com/esl-epfl/x-heep/blob/main/sw/device/lib/crt/vectors.S#L84 basically deleting from line 84 to 117 included

Not connected on ibex stands for we will never jump here unless its an error? then yes, i'll gladly do it

the CPU has at most 32 ISRs (or let's say, the ID of the last one is 31, but physically they are 19) so it is physically impossible to jump to ISR 32 to 63. Don't know why the code is there, must be a copy and paste from another testbench

@@ -83,38 +83,6 @@ vector_table:
j __no_irq_handler
// 64-32 : not connected on Ibex
j verification_irq_handler
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this the ISR number 32? i.e. the 33th? if so we need to delete this one as well :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but do you know why would it be different? (it did not jump to _no_irq_handler

@davideschiavone davideschiavone changed the base branch from develop to dev_branch February 27, 2025 17:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
software Software and application
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants