mirror of
https://github.com/espressif/esp-idf.git
synced 2025-10-03 10:30:58 +02:00
change(newlib): enable LIBC_OPTIMIZED_MISALIGNED_ACCESS by default
@@ -145,7 +145,7 @@ menu "LibC"
 
     config LIBC_OPTIMIZED_MISALIGNED_ACCESS
         bool "Use performance-optimized memXXX/strXXX functions on misaligned memory access"
-        default n
+        default y
         depends on ESP_ROM_HAS_SUBOPTIMAL_NEWLIB_ON_MISALIGNED_MEMORY
         help
             Enables performance-optimized implementations of memory and string functions
@@ -194,6 +194,7 @@ The following options will reduce IRAM usage of some ESP-IDF features:
     :SOC_GPSPI_SUPPORTED: - Enable :ref:`CONFIG_HEAP_PLACE_FUNCTION_INTO_FLASH`. Provided that :ref:`CONFIG_SPI_MASTER_ISR_IN_IRAM` is not enabled and the heap functions are not incorrectly used from ISRs, this option is safe to enable in all configurations.
     :esp32c2: - Enable :ref:`CONFIG_BT_RELEASE_IRAM`. Release BT text section and merge BT data, bss & text into a large free heap region when ``esp_bt_mem_release`` is called. This makes Bluetooth unavailable until the next restart, but saving ~22 KB or more of IRAM.
     - Disable :ref:`CONFIG_LIBC_LOCKS_PLACE_IN_IRAM` if no ISRs that run while cache is disabled (i.e. IRAM ISRs) use libc lock APIs.
+    :CONFIG_ESP_ROM_HAS_SUBOPTIMAL_NEWLIB_ON_MISALIGNED_MEMORY: - Disable :ref:`CONFIG_LIBC_OPTIMIZED_MISALIGNED_ACCESS` to save approximately 1000 bytes of IRAM, at the cost of reduced performance.
 
 .. only:: esp32
 
@@ -87,7 +87,6 @@ The following optimizations improve the execution of nearly all code, including
     :SOC_CPU_HAS_FPU: - Avoid using floating point arithmetic ``float``. Even though {IDF_TARGET_NAME} has a single precision hardware floating point unit, floating point calculations are always slower than integer calculations. If possible then use fixed point representations, a different method of integer representation, or convert part of the calculation to be integer only before switching to floating point.
     :not SOC_CPU_HAS_FPU: - Avoid using floating point arithmetic ``float``. On {IDF_TARGET_NAME} these calculations are emulated in software and are very slow. If possible, use fixed point representations, a different method of integer representation, or convert part of the calculation to be integer only before switching to floating point.
     - Avoid using double precision floating point arithmetic ``double``. These calculations are emulated in software and are very slow. If possible then use an integer-based representation, or single-precision floating point.
-    :CONFIG_ESP_ROM_HAS_SUBOPTIMAL_NEWLIB_ON_MISALIGNED_MEMORY: - Avoid misaligned 4-byte memory accesses in performance-critical code sections. For potential performance improvements, consider enabling :ref:`CONFIG_LIBC_OPTIMIZED_MISALIGNED_ACCESS`, which requires approximately 190 bytes of IRAM and 870 bytes of flash memory. Note that properly aligned memory operations will always execute at full speed without performance penalties.
 
 
 .. only:: esp32s2 or esp32s3 or esp32p4
@@ -107,3 +107,64 @@ The header ``<sys/signal.h>`` is no longer available in Picolibc. To ensure comp
 
     #include <sys/signal.h> /* fatal error: sys/signal.h: No such file or directory */
     #include <signal.h> /* Ok: standard and portable */
+
+.. only:: CONFIG_ESP_ROM_HAS_SUBOPTIMAL_NEWLIB_ON_MISALIGNED_MEMORY
+
+    RISC-V Chips and Misaligned Memory Access in LibC Functions
+    -----------------------------------------------------------
+
+    Espressif RISC-V chips can perform misaligned memory accesses with only a small
+    performance penalty compared to aligned accesses.
+
+    Previously, LibC functions that operate on memory (such as copy or comparison
+    functions) were implemented using byte-by-byte operations when a non-word-aligned
+    pointer was passed. Now, these functions use word (4-byte) load/store operations
+    whenever possible, resulting in a significant performance increase. These optimized
+    implementations are enabled by default via :ref:`CONFIG_LIBC_OPTIMIZED_MISALIGNED_ACCESS`,
+    which reduces the application’s memory budget (IRAM) by approximately 800–1000 bytes.
+
+    The table below shows benchmark results on the ESP32-C3 chip using 4096-byte buffers:
+
+    .. list-table:: Benchmark Results
+        :header-rows: 1
+        :widths: 20 20 20 20
+
+        * - Function
+          - Old (CPU cycles)
+          - Optimized (CPU cycles)
+          - Improvement (%)
+        * - memcpy
+          - 32873
+          - 4200
+          - 87.2
+        * - memcmp
+          - 57436
+          - 14722
+          - 74.4
+        * - memmove
+          - 49336
+          - 9237
+          - 81.3
+        * - strcpy
+          - 28678
+          - 16659
+          - 41.9
+        * - strcmp
+          - 36867
+          - 11146
+          - 69.8
+
+    .. note::
+        The results above apply to misaligned memory operations.
+        Performance for aligned memory operations remains unchanged.
+
+    Functions with Improved Performance
+    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+    - ``memcpy``
+    - ``memcmp``
+    - ``memmove``
+    - ``strcpy``
+    - ``strncpy``
+    - ``strcmp``
+    - ``strncmp``
@@ -87,7 +87,6 @@
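     :SOC_CPU_HAS_FPU: - Avoid using floating-point arithmetic ``float``. Although {IDF_TARGET_NAME} has a single-precision floating-point unit, floating-point calculations are always slower than integer calculations. Consider using a different integer representation for the calculation, such as fixed point, or performing part of the calculation with integers before switching to floating point.
     :not SOC_CPU_HAS_FPU: - Avoid using floating-point arithmetic ``float``. {IDF_TARGET_NAME} performs floating-point calculations through software emulation, so they are very slow. Consider using a different integer representation for the calculation, such as fixed point, or performing part of the calculation with integers before switching to floating point.
     - Avoid using double-precision floating-point arithmetic ``double``. {IDF_TARGET_NAME} performs double-precision floating-point calculations through software emulation, so they are very slow. Consider using an integer-based representation or single-precision floating point.
-    :CONFIG_ESP_ROM_HAS_SUBOPTIMAL_NEWLIB_ON_MISALIGNED_MEMORY: - Avoid misaligned 4-byte memory accesses in performance-critical code sections. To improve performance, consider enabling :ref:`CONFIG_LIBC_OPTIMIZED_MISALIGNED_ACCESS`. Enabling this option uses approximately 190 additional bytes of IRAM and 870 bytes of flash. Note that properly aligned memory operations always execute at full speed with no performance penalty.
 
 
 .. only:: esp32s2 or esp32s3 or esp32p4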