From: Simon Glass <simon.glass@canonical.com>

This series imports dlmalloc 2.8.6 from Doug Lea, replacing the old version 2.6.6 that U-Boot has been using since 2002.

The new version provides:

- Better memory efficiency with improved binning algorithms
- More robust overflow checking via MAX_REQUEST
- Somewhat cleaner codebase

All U-Boot-specific modifications from the historical commits have been ported to the new implementation, including:

- Pre-relocation malloc via malloc_simple
- Valgrind annotations
- Malloc testing infrastructure
- mcheck heap protection support
- Sandbox USE_DL_PREFIX support

The approach here is to leave the upstream code unchanged as much as possible, clearly marking U-Boot-specific changes with an #ifdef.

Unfortunately the code size is not great out-of-the-box, so the final part of the series includes some options to remove in-place realloc(), provide a simplified init, remove the tree structure for large blocks and a few other things. With these adjustments the new version is about 1K less code on Thumb2 (firefly-rk3288).

The new free() algorithm is more sophisticated but also larger. If needed we might be able to shrink it by a few hundred bytes. Of course SPL doesn't normally use free(), so the benefit might be minimal.

Another point worth mentioning is that the pre-inited av_[] array has been replaced with a BSS _sm_ struct, which reduces the image size by about 1.5K. One patch adjusts some imx8mp boards to deal with the larger BSS.

Some code-size stats: $ buildman -b mala imx8mp_venice firefly-rk3288 firefly-rk3399 -sS --step 0 Summary of 2 commits for 3 boards (3 threads, 11 jobs per thread) 01: backtrace: Strip the source tree prefix from filenames aarch64: w+ imx8mp_venice firefly-rk3399 40: doc: Add malloc documentation aarch64: (for 2/2 boards) all -3904.0 bss +864.0 data -2076.0 spl/u-boot-spl:all -654.0 spl/u-boot-spl:bss +316.0 spl/u-boot-spl:data -1034.0 spl/u-boot-spl:text +64.0 text -2692.0 arm: (for 1/1 boards) all -1436.0 data -1040.0 text -396.0 For the new malloc.h I have avoided including string.h so have added that to various places that need it. The existing common/dlmalloc.src file is left alone. In order to bring this in without losing functionality, I went through the patches applied to the original implementation over time. Where these commits were added, they are added as a cherry-pick, with the original commit hash. Here is a list of what was done with each U-Boot commit on top of the new common/dlmalloc.c and include/malloc.h: 1. 217c9dad827 2002-10-25 Initial revision - Ignored 2. 5b1d713721c 2002-11-03 Initial revision - Ignored 3. 8bde7f776c7 2003-06-27 * Code cleanup: - Ignored as we don't really want to change the style 4. d87080b721e 2006-03-31 GCC-4.x fixes: clean up global data pointer initialization for all boards. - Global data is not needed at this point 5. 81673e9ae14 2008-05-13 Make sure common.h is the first include. - common.h has been removed 6. f2302d4430e 2008-08-06 Fix merge problems - no merge problems to fix with the new code 7. 60a3f404acb 2009-06-13 malloc.h: protect it against multiple include - Already covered: new malloc.h has MALLOC_280_H include guards 8. 5e93bd1c9aa 2009-08-21 Consolidate arch-specific sbrk() implementations - Already covered: sbrk() and mem_malloc_init() in separate commit 9. d4e8ada0f6d 2009-08-21 Consolidate arch-specific mem_malloc_init() implementations - Already covered: mem_malloc_init() in separate commit 10. 
521af04d853 2009-09-21 Conditionally perform common relocation fixups - Not needed: Manual relocation removed in 4babaa0c28b 11. b4feeb4e8a1 2009-11-24 i386: Fix malloc initialization - Already covered: mem_malloc_init() is common, no arch-specific guards 12. 2740544881f 2010-01-15 malloc: return NULL if not initialized yet - Done: Add check in dlmalloc() to return NULL if not initialized 13. ae30b8c200d 2010-04-06 malloc: sbrk() should return MORECORE_FAILURE instead of NULL on failure - Already covered: sbrk() returns MFAIL on failure 14. ea882baf9c1 2010-06-20 New implementation for internal handling of environment variables. - Not needed: Just changes #if 0 to #ifdef DEBUG for old stats code 15. 1ba91ba2339 2010-10-14 dlmalloc.c: Fix gcc alias warning - Not needed: New dlmalloc has no strict-aliasing warnings 16. 2e5167ccad9 2010-10-28 Replace CONFIG_RELOC_FIXUP_WORKS by CONFIG_NEEDS_MANUAL_RELOC - Not needed: Manual relocation removed in 4babaa0c28b 17. 6163f5b4c88 2010-11-15 malloc: Fix issue with calloc memory possibly being non-zero - Already covered: sbrk() clears memory on negative increment 18. 21726a7afce 2011-06-29 Add assert() for debug assertions - Not needed: New dlmalloc uses U-Boot's global assert() 19. ea95cb73310 2011-09-10 utx8245: fix build breakage due to assert() - Not needed: New dlmalloc has different debug check functions 20. 213adf6dffe 2012-03-29 Malloc: Fix -Wundef warnings - Not needed: New malloc.h doesn't have these #if issues 21. 93691842e8d 2012-09-04 Fix strict-aliasing warning in dlmalloc - Not needed: New dlmalloc has no malloc_bin_reloc() 22. 00d0d2ad4e8 2012-06-03 malloc: remove extern declarations of malloc_bin_reloc() in board.c files - Not needed: New dlmalloc has no malloc_bin_reloc() 23. 199adb601ff 2012-10-29 common/misc: sparse fixes - Not needed: New dlmalloc uses proper NULL 24. 7b395232da8 2013-01-21 malloc: make malloc_bin_reloc static - Not needed: New dlmalloc has no malloc_bin_reloc() 25. 
472d546054d 2013-04-01 Consolidate bool type - Not needed: Just a comment change (True -> true) 26. d93041a4ca0 2014-07-10 Remove form-feeds from dlmalloc.c - Not needed: New dlmalloc doesn't have form-feeds 27. d59476b6446 2014-07-10 Add a simple malloc() implementation for pre-relocation - Done (updated): Redirect to malloc_simple before GD_FLG_FULL_MALLOC_INIT 28. 6d7601e7443 2014-07-10 sandbox: Always enable malloc debug - Done (updated): Combined with #64, use 'DEBUG 1' for new dlmalloc 29. 854d2b9753e 2014-10-29 dlmalloc: ensure gd is set for early alloc - Not needed: Reverted by #38 30. 868de51ddee 2014-08-26 malloc: Output region when debugging - Already covered: debug() message in mem_malloc_init() 31. c9356be3074 2014-11-10 dm: Split the simple malloc() implementation into its own file - Already covered: Redirect to malloc_simple.c via GD_FLG_FULL_MALLOC_INIT 32. 0aa8a4ad999 2015-03-04 dlmalloc: do memset in malloc init as new default config - Already covered: SYS_MALLOC_CLEAR_ON_INIT at line 6396 33. fb5cf7f16be 2015-02-27 Move initf_malloc() to a common place - Already covered: initf_malloc() at line 6357 34. 1eb0c03c219 2015-09-13 malloc_simple: Add Kconfig option for using only malloc_simple in the SPL - Not needed: Changes to Kconfig/malloc_simple.c, not dlmalloc.c 35. 4f144a41646 2016-01-25 malloc: work around some memalign fragmentation issues - Done (updated): Ported to internal_memalign() at line 4955 36. ee05fedc6c8 2016-02-04 malloc: solve dead code issue in memalign() - Not needed: New dlmalloc 2.8.6 has rewritten internal_memalign() 37. 2f0bcd4de1a 2016-03-05 malloc: use hidden visibility - Done (updated): Use DLMALLOC_EXPORT at line 546 38. deff6fb3a77 2016-03-05 malloc: remove !gd handling - Not needed: Reverts #29, we don't add gd check 39. 4eece2602b6 2016-04-21 common/dlmalloc.c: Delete content that was moved to malloc.h - Not needed: New dlmalloc doesn't have #if 0 code 40. 
034eda867f4 2016-04-25 malloc: improve memalign fragmentation fix - Done (updated): Combined with #35 in memalign workaround port 41. 4e33316f656 2017-05-25 malloc: Turn on DEBUG when enabling unit tests - Already covered: Combined with #28, #63 at line 555 42. f1896c45cb2 2017-07-24 spl: make SPL and normal u-boot stage use independent SYS_MALLOC_F_LEN - Already covered: Use CONFIG_IS_ENABLED and CONFIG_VAL at line 6410 43. a874cac3b45 2017-11-10 malloc: don't compare pointers to 0 - Not needed: New dlmalloc uses proper NULL comparisons 44. ee038c58d51 2018-05-18 malloc: Use malloc simple before malloc is fully initialized in memalign() - Already covered: memalign_simple redirect at line 5367 45. 7cbd2d2e327 2018-11-18 malloc_simple: Add logging of allocations - Not needed: Changes to malloc_simple.c, not dlmalloc.c 46. 4c6be01c271 2019-03-27 malloc: Fix memalign not honoring alignment prior to full malloc init - Already covered: Uses memalign_simple at line 5367 47. bb71a2d9dcd 2019-10-25 dlmalloc: calloc: fix zeroing early allocations - Done (updated): Port to dlcalloc() at line 4857 48. cfda60f99ae 2020-02-03 sandbox: Use a prefix for all allocation functions - Done: USE_DL_PREFIX and reverse mappings in malloc.h 49. be621c11b9f 2020-04-15 dlmalloc: remove unit test support in SPL - Already covered: CONFIG_IS_ENABLED(UNIT_TEST) at line 554 50. 9297e366d6a 2020-04-29 malloc: dlmalloc: add an ability for the malloc to be re-init/init multiple times - Not needed: No boards use CONFIG_SYS_MALLOC_DEFAULT_TO_INIT 51. f7ae49fc4f3 2020-05-10 common: Drop log.h from common header - Already covered: Includes log.h at line 559 52. 401d1c4f5d2 2020-10-30 common: Drop asm/global_data.h from common header - Already covered: Includes asm/global_data.h at line 557 53. c6bf4f38988 2021-02-10 malloc: adjust memcpy() and memset() definitions. - Not needed: New malloc.h doesn't declare memset/memcpy 54. 
c197f6e2792 2021-03-15 malloc: Export malloc_simple_info() - Not needed: Only changes malloc.h, not dlmalloc.c 55. 5ad9220bf7b 2021-05-29 malloc: add SPDX license identifiers - Not needed: New dlmalloc has MIT-0 license from upstream 56. bdaeea1b686 2022-03-23 malloc: Annotate allocator for valgrind - Done (updated): Valgrind annotations in dlmalloc(), dlfree(), dlrealloc() 57. 62d638386c1 2022-09-06 test: Support testing malloc() failures - Done: malloc_testing/malloc_max_allocs in dlmalloc() 58. f88d48cc74f 2023-02-27 dlmalloc: Fix a warning with clang-15 - Done: Add (void) to dlmalloc_stats() function definition 59. c9db9a2ef55 2023-08-25 dlmalloc: Add support for SPL_SYS_MALLOC_CLEAR_ON_INIT - Already covered: Uses CONFIG_IS_ENABLED() in mem_malloc_init() from #32 60. 6a595c2f67e 2023-09-06 common: malloc: Remove unused NEEDS_MANUAL_RELOC code bits - Not needed: NEEDS_MANUAL_RELOC has been removed 61. ac897385bbf 2023-10-02 Merge branch 'next' - Not needed: Merge commit, no dlmalloc changes 62. 3d6d5075146 2023-09-26 spl: Use SYS_MALLOC_F instead of SYS_MALLOC_F_LEN - Already covered: Uses CONFIG_IS_ENABLED(SYS_MALLOC_F) throughout 63. 1786861415f 2023-10-07 malloc: Enable assertions if UNIT_TEST is enabled - Done (updated): Combined with #28, use 'DEBUG 1' for new dlmalloc 64. c82ff481159 2024-03-31 mcheck: prepare +1 tier for mcheck-wrappers, in dl-*alloc commands - Done (updated): Added STATIC_IF_MCHECK and *_impl macros for dlmalloc 2.8.6 65. dfba071ddc3 2024-03-31 mcheck: Use memset/memcpy instead of MALLOC_ZERO/MALLOC_COPY for mcheck. - Done: Undef and redefine MALLOC_ZERO/MALLOC_COPY when mcheck enabled 66. 151493a8750 2024-03-31 mcheck: integrate mcheck into dlmalloc.c - Done: Added mcheck wrapper functions for dlmalloc, dlfree, dlrealloc, dlmemalign, dlcalloc 67. ae838768d79 2024-03-31 mcheck: support memalign - Done: Implemented dlmemalign wrapper with mcheck hooks 68. 
18c1bfafe0c 2024-03-31 mcheck: add pedantic mode support - Done: Added mcheck_pedantic_prehook() calls and mcheck_pedantic()/mcheck_check_all() API 69. a79fc7a79cc 2024-04-27 common: Remove <common.h> and add needed includes - Not needed: common.h has been removed 70. d678a59d2d7 2024-05-18 Revert "Merge patch series "arm: dts: am62-beagleplay: Fix Beagleplay Ethernet"" - Not needed: common.h has been removed 71. 03de305ec48 2024-05-20 Restore patch series "arm: dts: am62-beagleplay: Fix Beagleplay Ethernet" - Not needed: common.h has been removed 72. 910cef3d2fb 2024-07-13 common: Remove duplicate newlines - Not needed: New dlmalloc has its own formatting from upstream 73. 6627fbba203 2024-07-23 include: Remove duplicate newlines - Not needed: New malloc.h has its own formatting from upstream 74. 04894f5ad53 2024-07-30 malloc: Support testing with realloc() - Done: Combined with #57, malloc_testing check in dlrealloc() 75. 8642b2178d2 2024-08-02 dlmalloc: Fix integer overflow in request2size() - Not needed: New dlmalloc 2.8.6 uses MAX_REQUEST for robust overflow checks 76. 0a10b49206a 2024-08-02 dlmalloc: Fix integer overflow in sbrk() - Already covered: sbrk() checks bounds before memset in U-Boot section 77. 9b9368b5c4d 2024-08-02 dlmalloc: Make sure allocation size is within malloc area - Not needed: New dlmalloc 2.8.6 uses MAX_REQUEST for robust overflow checks 78. 41fecdc94e3 2024-10-21 common: Tidy up how malloc() is inited - Already covered: mem_malloc_init() uses map_sysmem in U-Boot section 79. 
22f87ef5304 2025-08-17 malloc: Avoid defining calloc() - Done: Added SYS_MALLOC_SIMPLE section to malloc.h with calloc redirect Simon Glass (37): test: hooks: Add a symlink for tasman treewide: Add missing string.h includes imx8mp: Increase the BSS limit for a few boards test: Use TOTAL_MALLOC_LEN for abuf and alist tests malloc: Rename dlmalloc.c to dlmalloc_old.c malloc: Rename malloc.h to malloc_old.h malloc: Import dlmalloc 2.8.6 malloc: Add mem_malloc_init() and sbrk() malloc: Add U-Boot configuration for dlmalloc 2.8.6 malloc: Fix assert warning malloc: return NULL if not initialized yet Add a simple malloc() implementation for pre-relocation malloc: Enable assertions if UNIT_TEST is enabled malloc: Reduce code size with INSECURE and NO_MALLINFO malloc: work around some memalign fragmentation issues malloc: use hidden visibility dlmalloc: calloc: fix zeroing early allocations sandbox: Use a prefix for all allocation functions malloc: Annotate allocator for valgrind test: Support testing malloc() failures dlmalloc: Fix a warning with clang-15 mcheck: prepare +1 tier for mcheck-wrappers, in dl-*alloc commands mcheck: Use memset/memcpy instead of MALLOC_ZERO/MALLOC_COPY for mcheck. 
mcheck: integrate mcheck into dlmalloc.c mcheck: support memalign mcheck: add pedantic mode support malloc: Avoid defining calloc() malloc: Set up the malloc() state in mem_malloc_init() malloc: Allow building dlmalloc with SPL_SYS_MALLOC_SIMPLE malloc: Add a way to control the size of dlmalloc malloc: Add NO_REALLOC_IN_PLACE option to reduce code size malloc: Add NO_TREE_BINS option to reduce code size malloc: Add SIMPLE_MEMALIGN to simplify memalign for code size malloc: Add SMALLCHUNKS_AS_FUNCS to convert macros to funcs test: Add some tests for dlmalloc malloc: Switch to the new malloc() implementation doc: Add malloc documentation Kconfig | 63 + arch/arm/mach-zynq/slcr.c | 1 + board/ti/common/cape_detect.c | 1 + boot/expo_build_cb.c | 1 + cmd/printf.c | 1 + common/Makefile | 4 + common/bouncebuf.c | 5 +- common/dlmalloc.c | 8277 ++++++++++++++----- common/dlmalloc_old.c | 2611 ++++++ common/iomux.c | 1 + common/menu.c | 1 + configs/imx8mp_data_modul_edm_sbc_defconfig | 2 +- configs/imx8mp_dhsom.config | 2 +- configs/imx8mp_venice_defconfig | 2 +- configs/venice2_defconfig | 1 + doc/arch/sandbox/sandbox.rst | 2 + doc/develop/index.rst | 1 + doc/develop/malloc.rst | 333 + drivers/crypto/fsl/desc_constr.h | 1 + drivers/crypto/fsl/error.c | 1 + drivers/crypto/fsl/fsl_blob.c | 1 + drivers/crypto/fsl/fsl_hash.c | 1 + drivers/dma/apbh_dma.c | 1 + drivers/fpga/versalpl.c | 1 + drivers/net/fsl-mc/dpio/qbman_portal.c | 1 + drivers/net/qe/uccf.c | 1 + drivers/spi/spi-mem-nodm.c | 1 + drivers/video/imx/ipu_common.c | 1 + include/malloc.h | 1515 ++-- include/malloc_old.h | 999 +++ lib/circbuf.c | 1 + lib/crypto/x509_helper.c | 2 + lib/dhry/dhry_1.c | 1 + lib/libavb/avb_sysdeps_posix.c | 1 + lib/linux_compat.c | 1 + lib/list_sort.c | 1 + lib/mbedtls/mscode_parser.c | 1 + lib/membuf.c | 1 + lib/strto.c | 1 + test/common/Makefile | 1 + test/common/malloc.c | 629 ++ test/hooks/bin/tasman | 1 + test/lib/abuf.c | 5 +- test/lib/alist.c | 3 +- 44 files changed, 11624 
insertions(+), 2858 deletions(-) create mode 100644 common/dlmalloc_old.c create mode 100644 doc/develop/malloc.rst create mode 100644 include/malloc_old.h create mode 100644 test/common/malloc.c create mode 120000 test/hooks/bin/tasman -- 2.43.0 base-commit: 05fd95e17deb4dea9f44e491ead363b29f013de6 branch: mala
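
The cover letter credits the new allocator with more robust overflow checking via MAX_REQUEST. As a rough illustration of the idea only, the sketch below pads a request up to an aligned chunk size and shows how capping the request first keeps that arithmetic from wrapping. The names (SKETCH_*, pad_unchecked(), pad_checked()) and the cap value are hypothetical and simplified; this is not dlmalloc's actual MAX_REQUEST definition or its request2size() code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-chunk header size and alignment, for illustration only */
#define SKETCH_OVERHEAD		sizeof(size_t)
#define SKETCH_ALIGN_MASK	((size_t)(2 * sizeof(size_t) - 1))

/* Largest request whose padded size still fits in a size_t (an assumption) */
#define SKETCH_MAX_REQUEST	(SIZE_MAX - SKETCH_OVERHEAD - SKETCH_ALIGN_MASK)

/* Pad a request without any check: wraps around for requests near SIZE_MAX */
static size_t pad_unchecked(size_t req)
{
	return (req + SKETCH_OVERHEAD + SKETCH_ALIGN_MASK) & ~SKETCH_ALIGN_MASK;
}

/* Pad a request only if it is below the cap; returns 1 on success, else 0 */
static int pad_checked(size_t req, size_t *padded)
{
	assert(padded);

	if (req >= SKETCH_MAX_REQUEST)
		return 0;	/* caller should fail the allocation */

	*padded = pad_unchecked(req);
	return 1;
}
```

Without the cap, a huge request can wrap to a tiny padded size and appear to succeed, which is the class of bug that the older request2size() fixes listed above were addressing.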
From: Simon Glass <simon.glass@canonical.com> Add a symlink to ellesmere so we can run tests on tasman. Signed-off-by: Simon Glass <simon.glass@canonical.com> --- test/hooks/bin/tasman | 1 + 1 file changed, 1 insertion(+) create mode 120000 test/hooks/bin/tasman diff --git a/test/hooks/bin/tasman b/test/hooks/bin/tasman new file mode 120000 index 00000000000..784d574a1e1 --- /dev/null +++ b/test/hooks/bin/tasman @@ -0,0 +1 @@ +ellesmere \ No newline at end of file -- 2.43.0
From: Simon Glass <simon.glass@canonical.com> Add string.h to files that use string functions like strdup, strcmp, strcpy, etc. These are implicitly available through the malloc.h header but that will soon change. For bouncebuf, take this opportunity to sort the headers correctly. Co-developed-by: Claude <noreply@anthropic.com> Signed-off-by: Simon Glass <simon.glass@canonical.com> --- arch/arm/mach-zynq/slcr.c | 1 + board/ti/common/cape_detect.c | 1 + boot/expo_build_cb.c | 1 + cmd/printf.c | 1 + common/bouncebuf.c | 5 +++-- common/iomux.c | 1 + common/menu.c | 1 + drivers/crypto/fsl/desc_constr.h | 1 + drivers/crypto/fsl/error.c | 1 + drivers/crypto/fsl/fsl_blob.c | 1 + drivers/crypto/fsl/fsl_hash.c | 1 + drivers/dma/apbh_dma.c | 1 + drivers/fpga/versalpl.c | 1 + drivers/net/fsl-mc/dpio/qbman_portal.c | 1 + drivers/net/qe/uccf.c | 1 + drivers/spi/spi-mem-nodm.c | 1 + drivers/video/imx/ipu_common.c | 1 + lib/circbuf.c | 1 + lib/crypto/x509_helper.c | 2 ++ lib/dhry/dhry_1.c | 1 + lib/libavb/avb_sysdeps_posix.c | 1 + lib/linux_compat.c | 1 + lib/list_sort.c | 1 + lib/mbedtls/mscode_parser.c | 1 + lib/membuf.c | 1 + lib/strto.c | 1 + 26 files changed, 29 insertions(+), 2 deletions(-) diff --git a/arch/arm/mach-zynq/slcr.c b/arch/arm/mach-zynq/slcr.c index ef877df0fe8..b69d5aad961 100644 --- a/arch/arm/mach-zynq/slcr.c +++ b/arch/arm/mach-zynq/slcr.c @@ -5,6 +5,7 @@ #include <asm/io.h> #include <malloc.h> +#include <string.h> #include <asm/arch/hardware.h> #include <asm/arch/sys_proto.h> diff --git a/board/ti/common/cape_detect.c b/board/ti/common/cape_detect.c index da805befabc..4984f7a3a22 100644 --- a/board/ti/common/cape_detect.c +++ b/board/ti/common/cape_detect.c @@ -8,6 +8,7 @@ #include <malloc.h> #include <i2c.h> #include <extension_board.h> +#include <string.h> #include <vsprintf.h> #include "cape_detect.h" diff --git a/boot/expo_build_cb.c b/boot/expo_build_cb.c index 442ad760e79..6dd3dbd92b8 100644 --- a/boot/expo_build_cb.c +++ b/boot/expo_build_cb.c @@ 
-14,6 +14,7 @@ #include <expo.h> #include <log.h> #include <malloc.h> +#include <string.h> #include <vsprintf.h> #include <asm/cb_sysinfo.h> diff --git a/cmd/printf.c b/cmd/printf.c index a1727ac15a2..52f21c8b842 100644 --- a/cmd/printf.c +++ b/cmd/printf.c @@ -89,6 +89,7 @@ #include <stddef.h> #include <stdio.h> #include <stdlib.h> +#include <string.h> #include <vsprintf.h> #define WANT_HEX_ESCAPES 0 diff --git a/common/bouncebuf.c b/common/bouncebuf.c index b2f87e4d939..5a7d3efa521 100644 --- a/common/bouncebuf.c +++ b/common/bouncebuf.c @@ -5,11 +5,12 @@ * Copyright (C) 2012 Marek Vasut <marex@denx.de> */ +#include <bouncebuf.h> #include <cpu_func.h> +#include <errno.h> #include <log.h> #include <malloc.h> -#include <errno.h> -#include <bouncebuf.h> +#include <string.h> #include <asm/cache.h> #include <linux/dma-mapping.h> diff --git a/common/iomux.c b/common/iomux.c index 1224c15eb71..e488934b29f 100644 --- a/common/iomux.c +++ b/common/iomux.c @@ -7,6 +7,7 @@ #include <console.h> #include <serial.h> #include <malloc.h> +#include <string.h> #if CONFIG_IS_ENABLED(CONSOLE_MUX) void iomux_printdevs(const int console) diff --git a/common/menu.c b/common/menu.c index 5a2126aa01a..b66803337d3 100644 --- a/common/menu.c +++ b/common/menu.c @@ -7,6 +7,7 @@ #include <ansi.h> #include <cli.h> #include <malloc.h> +#include <string.h> #include <errno.h> #include <linux/delay.h> #include <linux/list.h> diff --git a/drivers/crypto/fsl/desc_constr.h b/drivers/crypto/fsl/desc_constr.h index 209557c4ffa..ce938d49887 100644 --- a/drivers/crypto/fsl/desc_constr.h +++ b/drivers/crypto/fsl/desc_constr.h @@ -7,6 +7,7 @@ * Based on desc_constr.h file in linux drivers/crypto/caam */ +#include <string.h> #include <linux/compat.h> #include "desc.h" diff --git a/drivers/crypto/fsl/error.c b/drivers/crypto/fsl/error.c index dfcf5dbab35..9008dccb27c 100644 --- a/drivers/crypto/fsl/error.c +++ b/drivers/crypto/fsl/error.c @@ -9,6 +9,7 @@ #include <log.h> #include <malloc.h> +#include 
<string.h> #include <vsprintf.h> #include "desc.h" #include "jr.h" diff --git a/drivers/crypto/fsl/fsl_blob.c b/drivers/crypto/fsl/fsl_blob.c index 0ecd6befd25..32beb03e8ae 100644 --- a/drivers/crypto/fsl/fsl_blob.c +++ b/drivers/crypto/fsl/fsl_blob.c @@ -9,6 +9,7 @@ #include <malloc.h> #include <memalign.h> #include <fsl_sec.h> +#include <string.h> #include <asm/cache.h> #include <linux/errno.h> #include "jobdesc.h" diff --git a/drivers/crypto/fsl/fsl_hash.c b/drivers/crypto/fsl/fsl_hash.c index 79b32e2627c..ea90aece64b 100644 --- a/drivers/crypto/fsl/fsl_hash.c +++ b/drivers/crypto/fsl/fsl_hash.c @@ -8,6 +8,7 @@ #include <log.h> #include <malloc.h> #include <memalign.h> +#include <string.h> #include "jobdesc.h" #include "desc.h" #include "jr.h" diff --git a/drivers/dma/apbh_dma.c b/drivers/dma/apbh_dma.c index 331815c469f..89ff00540ae 100644 --- a/drivers/dma/apbh_dma.c +++ b/drivers/dma/apbh_dma.c @@ -16,6 +16,7 @@ #include <linux/list.h> #include <malloc.h> +#include <string.h> #include <linux/errno.h> #include <asm/io.h> #include <asm/arch/clock.h> diff --git a/drivers/fpga/versalpl.c b/drivers/fpga/versalpl.c index 1957e8dcaca..2fba888b8cc 100644 --- a/drivers/fpga/versalpl.c +++ b/drivers/fpga/versalpl.c @@ -8,6 +8,7 @@ #include <log.h> #include <asm/arch/sys_proto.h> #include <memalign.h> +#include <string.h> #include <versalpl.h> #include <zynqmp_firmware.h> #include <asm/cache.h> diff --git a/drivers/net/fsl-mc/dpio/qbman_portal.c b/drivers/net/fsl-mc/dpio/qbman_portal.c index f4e82b0507c..d338fac4def 100644 --- a/drivers/net/fsl-mc/dpio/qbman_portal.c +++ b/drivers/net/fsl-mc/dpio/qbman_portal.c @@ -5,6 +5,7 @@ #include <log.h> #include <malloc.h> +#include <string.h> #include <asm/arch/clock.h> #include <linux/bug.h> #include "qbman_portal.h" diff --git a/drivers/net/qe/uccf.c b/drivers/net/qe/uccf.c index badf4e5db3e..ab411361722 100644 --- a/drivers/net/qe/uccf.c +++ b/drivers/net/qe/uccf.c @@ -8,6 +8,7 @@ #include <malloc.h> #include <stdio.h> 
+#include <string.h> #include <linux/errno.h> #include <asm/io.h> #include <linux/immap_qe.h> diff --git a/drivers/spi/spi-mem-nodm.c b/drivers/spi/spi-mem-nodm.c index 6d9ab61769a..6a79fda625b 100644 --- a/drivers/spi/spi-mem-nodm.c +++ b/drivers/spi/spi-mem-nodm.c @@ -8,6 +8,7 @@ #include <malloc.h> #include <spi.h> #include <spi-mem.h> +#include <string.h> int spi_mem_exec_op(struct spi_slave *slave, const struct spi_mem_op *op) diff --git a/drivers/video/imx/ipu_common.c b/drivers/video/imx/ipu_common.c index bd1ef0a800d..40d578d3980 100644 --- a/drivers/video/imx/ipu_common.c +++ b/drivers/video/imx/ipu_common.c @@ -13,6 +13,7 @@ /* #define DEBUG */ #include <config.h> #include <log.h> +#include <string.h> #include <linux/delay.h> #include <linux/types.h> #include <linux/err.h> diff --git a/lib/circbuf.c b/lib/circbuf.c index 461c240f788..043b5a60d36 100644 --- a/lib/circbuf.c +++ b/lib/circbuf.c @@ -6,6 +6,7 @@ #include <log.h> #include <malloc.h> +#include <string.h> #include <circbuf.h> diff --git a/lib/crypto/x509_helper.c b/lib/crypto/x509_helper.c index 87e8ff67ae1..bf79d42cd60 100644 --- a/lib/crypto/x509_helper.c +++ b/lib/crypto/x509_helper.c @@ -5,6 +5,8 @@ * Copyright (C) 2012 Red Hat, Inc. All Rights Reserved. 
* Written by David Howells (dhowells@redhat.com) */ + +#include <string.h> #include <linux/err.h> #include <crypto/public_key.h> #include <crypto/x509_parser.h> diff --git a/lib/dhry/dhry_1.c b/lib/dhry/dhry_1.c index 275a89942ea..4287b57e316 100644 --- a/lib/dhry/dhry_1.c +++ b/lib/dhry/dhry_1.c @@ -44,6 +44,7 @@ char SCCSid[] = "@(#) @(#)dhry_1.c:3.4 -- 5/15/91 19:30:21"; #include <malloc.h> #include <stdio.h> +#include <string.h> #include "dhry.h" diff --git a/lib/libavb/avb_sysdeps_posix.c b/lib/libavb/avb_sysdeps_posix.c index 6ffdb0b7eb3..1fde82be4d8 100644 --- a/lib/libavb/avb_sysdeps_posix.c +++ b/lib/libavb/avb_sysdeps_posix.c @@ -7,6 +7,7 @@ #include <malloc.h> #include <stdarg.h> #include <stdlib.h> +#include <string.h> #include "avb_sysdeps.h" diff --git a/lib/linux_compat.c b/lib/linux_compat.c index 985e88eb397..4df9db689ed 100644 --- a/lib/linux_compat.c +++ b/lib/linux_compat.c @@ -1,6 +1,7 @@ #include <malloc.h> #include <memalign.h> +#include <string.h> #include <asm/cache.h> #include <linux/compat.h> diff --git a/lib/list_sort.c b/lib/list_sort.c index a6e54d5bc46..cf5cac17720 100644 --- a/lib/list_sort.c +++ b/lib/list_sort.c @@ -8,6 +8,7 @@ #include <linux/compat.h> #include <malloc.h> #include <linux/printk.h> +#include <string.h> #endif #include <linux/list.h> #include <linux/list_sort.h> diff --git a/lib/mbedtls/mscode_parser.c b/lib/mbedtls/mscode_parser.c index c3805c6503c..956a5a47243 100644 --- a/lib/mbedtls/mscode_parser.c +++ b/lib/mbedtls/mscode_parser.c @@ -8,6 +8,7 @@ #include <linux/kernel.h> #include <linux/err.h> +#include <string.h> #include <crypto/pkcs7.h> #include <crypto/mscode.h> diff --git a/lib/membuf.c b/lib/membuf.c index 207dff5625b..02b0cc8c6b0 100644 --- a/lib/membuf.c +++ b/lib/membuf.c @@ -9,6 +9,7 @@ #include <errno.h> #include <log.h> #include <malloc.h> +#include <string.h> #include <vsprintf.h> #include "membuf.h" diff --git a/lib/strto.c b/lib/strto.c index 206d1e91847..b7b3655f17f 100644 --- a/lib/strto.c +++ 
b/lib/strto.c @@ -11,6 +11,7 @@ #include <errno.h> #include <malloc.h> +#include <string.h> #include <vsprintf.h> #include <linux/ctype.h> -- 2.43.0
From: Simon Glass <simon.glass@canonical.com> These boards have quite a low BSS limit of 1K. The new dlmalloc needs about 1K of BSS (instead of 2K of data), since its state has moved out of the data region. Increase the limit by 0.5K for these boards: imx8mp_data_modul_edm_sbc imx8mp_dhcom_drc02 imx8mp_dhcom_pdk2 imx8mp_dhcom_pdk3 imx8mp_dhcom_picoitx imx8mp_venice Signed-off-by: Simon Glass <simon.glass@canonical.com> --- configs/imx8mp_data_modul_edm_sbc_defconfig | 2 +- configs/imx8mp_dhsom.config | 2 +- configs/imx8mp_venice_defconfig | 2 +- configs/venice2_defconfig | 1 + 4 files changed, 4 insertions(+), 3 deletions(-) diff --git a/configs/imx8mp_data_modul_edm_sbc_defconfig b/configs/imx8mp_data_modul_edm_sbc_defconfig index 8228bf59756..24764c89702 100644 --- a/configs/imx8mp_data_modul_edm_sbc_defconfig +++ b/configs/imx8mp_data_modul_edm_sbc_defconfig @@ -26,7 +26,7 @@ CONFIG_SPL_STACK=0x96fc00 CONFIG_SPL_TEXT_BASE=0x920000 CONFIG_SPL_HAS_BSS_LINKER_SECTION=y CONFIG_SPL_BSS_START_ADDR=0x96fc00 -CONFIG_SPL_BSS_MAX_SIZE=0x400 +CONFIG_SPL_BSS_MAX_SIZE=0x600 CONFIG_SYS_BOOTM_LEN=0x8000000 CONFIG_SYS_LOAD_ADDR=0x50000000 CONFIG_SPL=y diff --git a/configs/imx8mp_dhsom.config b/configs/imx8mp_dhsom.config index 3980c410266..cdabdbd3859 100644 --- a/configs/imx8mp_dhsom.config +++ b/configs/imx8mp_dhsom.config @@ -28,7 +28,7 @@ CONFIG_USE_PREBOOT=y CONFIG_FIT_EXTERNAL_OFFSET=0x3000 CONFIG_SPL_BOARD_INIT=y CONFIG_SPL_BOOTROM_SUPPORT=y -CONFIG_SPL_BSS_MAX_SIZE=0x400 +CONFIG_SPL_BSS_MAX_SIZE=0x600 CONFIG_SPL_BSS_START_ADDR=0x96fc00 CONFIG_SPL_CUSTOM_SYS_MALLOC_ADDR=0x4c000000 CONFIG_SPL_DM=y diff --git a/configs/imx8mp_venice_defconfig b/configs/imx8mp_venice_defconfig index 39b82063537..5365493ef30 100644 --- a/configs/imx8mp_venice_defconfig +++ b/configs/imx8mp_venice_defconfig @@ -20,7 +20,7 @@ CONFIG_SPL_STACK=0x960000 CONFIG_SPL_TEXT_BASE=0x920000 CONFIG_SPL_HAS_BSS_LINKER_SECTION=y CONFIG_SPL_BSS_START_ADDR=0x98fc00 -CONFIG_SPL_BSS_MAX_SIZE=0x400 
+CONFIG_SPL_BSS_MAX_SIZE=0x600 CONFIG_SYS_BOOTM_LEN=0x10000000 CONFIG_SYS_LOAD_ADDR=0x40480000 CONFIG_SPL=y diff --git a/configs/venice2_defconfig b/configs/venice2_defconfig index 3d80197ef38..a832f324dce 100644 --- a/configs/venice2_defconfig +++ b/configs/venice2_defconfig @@ -59,3 +59,4 @@ CONFIG_USB_ETHER_ASIX=y CONFIG_USB_GADGET=y CONFIG_CI_UDC=y CONFIG_USB_GADGET_DOWNLOAD=y +CONFIG_SPL_MAX_SIZE=0x28000 -- 2.43.0
From: Simon Glass <simon.glass@canonical.com> Several tests use CONFIG_SYS_MALLOC_LEN to test allocations that should fail due to exceeding pool size. However, the actual malloc pool size is TOTAL_MALLOC_LEN, which includes CONFIG_ENV_SIZE for boards that need to store the environment in RAM. The extra space accommodates: - the hash table allocated via calloc() - strdup() calls for each environment variable key - strdup() calls for each environment variable value This is an estimate and typically consumes less than CONFIG_ENV_SIZE, leaving more free space in the malloc pool than was reserved. On qemu-x86_64, CONFIG_ENV_SIZE is 0x40000, making the actual pool 0x240000 bytes. Tests expecting malloc(CONFIG_SYS_MALLOC_LEN) to fail might unexpectedly succeed since there's more space available. Update all tests to use TOTAL_MALLOC_LEN to correctly reflect the actual malloc pool size. Co-developed-by: Claude <noreply@anthropic.com> --- test/lib/abuf.c | 5 +++-- test/lib/alist.c | 3 ++- 2 files changed, 5 insertions(+), 3 deletions(-) diff --git a/test/lib/abuf.c b/test/lib/abuf.c index 9cbb627d0b6..e97bb8b66bc 100644 --- a/test/lib/abuf.c +++ b/test/lib/abuf.c @@ -5,6 +5,7 @@ */ #include <abuf.h> +#include <env_internal.h> #include <mapmem.h> #include <test/lib.h> #include <test/test.h> @@ -244,7 +245,7 @@ static int lib_test_abuf_large(struct unit_test_state *uts) /* Try an impossible size */ abuf_init(&buf); - ut_asserteq(false, abuf_realloc(&buf, CONFIG_SYS_MALLOC_LEN)); + ut_asserteq(false, abuf_realloc(&buf, TOTAL_MALLOC_LEN)); ut_assertnull(buf.data); ut_asserteq(0, buf.size); ut_asserteq(false, buf.alloced); @@ -264,7 +265,7 @@ static int lib_test_abuf_large(struct unit_test_state *uts) ut_assert(delta > 0); /* try to increase it */ - ut_asserteq(false, abuf_realloc(&buf, CONFIG_SYS_MALLOC_LEN)); + ut_asserteq(false, abuf_realloc(&buf, TOTAL_MALLOC_LEN)); ut_asserteq_ptr(ptr, buf.data); ut_asserteq(TEST_DATA_LEN, buf.size); ut_asserteq(true, buf.alloced); diff 
--git a/test/lib/alist.c b/test/lib/alist.c index 0bf24578d2e..108eaed8d92 100644 --- a/test/lib/alist.c +++ b/test/lib/alist.c @@ -5,6 +5,7 @@ */ #include <alist.h> +#include <env_internal.h> #include <string.h> #include <test/lib.h> #include <test/test.h> @@ -41,7 +42,7 @@ static int lib_test_alist_init(struct unit_test_state *uts) /* use an impossible size */ ut_asserteq(false, alist_init(&lst, obj_size, - CONFIG_SYS_MALLOC_LEN)); + TOTAL_MALLOC_LEN)); ut_assertnull(lst.data); ut_asserteq(0, lst.count); ut_asserteq(0, lst.alloc); -- 2.43.0
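
The pool-size arithmetic in the commit message above can be checked with a small sketch. It assumes the pool is simply the sum of the two config values, and it derives a CONFIG_SYS_MALLOC_LEN of 0x200000 from the qemu-x86_64 figures quoted (0x240000 - 0x40000). The EXAMPLE_* macros and request_must_fail() are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Values taken or derived from the qemu-x86_64 figures in the commit message */
#define EXAMPLE_ENV_SIZE		0x40000UL
#define EXAMPLE_SYS_MALLOC_LEN		0x200000UL	/* derived, see lead-in */
#define EXAMPLE_TOTAL_MALLOC_LEN	(EXAMPLE_SYS_MALLOC_LEN + EXAMPLE_ENV_SIZE)

/*
 * Returns 1 if a request can never be satisfied from a pool of pool_size
 * bytes. A request equal to the whole pool always fails, because the
 * allocator needs some bytes for its own bookkeeping.
 */
static int request_must_fail(size_t request, size_t pool_size)
{
	assert(pool_size > 0);

	return request >= pool_size;
}
```

With these numbers, a request for EXAMPLE_SYS_MALLOC_LEN bytes is not guaranteed to fail, since it is smaller than the pool, while a request for EXAMPLE_TOTAL_MALLOC_LEN bytes is.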
From: Simon Glass <simon.glass@canonical.com> Rename this file so that we can start to bring in the new one. Signed-off-by: Simon Glass <simon.glass@canonical.com> --- common/Makefile | 2 +- common/{dlmalloc.c => dlmalloc_old.c} | 0 2 files changed, 1 insertion(+), 1 deletion(-) rename common/{dlmalloc.c => dlmalloc_old.c} (100%) diff --git a/common/Makefile b/common/Makefile index 7270af457f5..ffa46ce5e06 100644 --- a/common/Makefile +++ b/common/Makefile @@ -71,7 +71,7 @@ obj-$(CONFIG_BOUNCE_BUFFER) += bouncebuf.o obj-$(CONFIG_$(PHASE_)SERIAL) += console.o obj-$(CONFIG_CROS_EC) += cros_ec.o -obj-y += dlmalloc.o +obj-y += dlmalloc_old.o obj-$(CONFIG_$(PHASE_)SYS_MALLOC_F) += malloc_simple.o obj-$(CONFIG_$(PHASE_)CYCLIC) += cyclic.o diff --git a/common/dlmalloc.c b/common/dlmalloc_old.c similarity index 100% rename from common/dlmalloc.c rename to common/dlmalloc_old.c -- 2.43.0
From: Simon Glass <simon.glass@canonical.com> Add mem_malloc_init() to initialise the malloc heap and sbrk() for heap extension. These are U-Boot-specific functions that manage the memory pool used by dlmalloc. Co-developed-by: Claude <noreply@anthropic.com> Signed-off-by: Simon Glass <simon.glass@canonical.com> --- common/dlmalloc.c | 51 +++++++++++++++++++++++++++++++++++++++++++++++ include/malloc.h | 27 +++++++++++++++++++++++++ 2 files changed, 78 insertions(+) diff --git a/common/dlmalloc.c b/common/dlmalloc.c index ba8ca5bd0ae..2be2262ebeb 100644 --- a/common/dlmalloc.c +++ b/common/dlmalloc.c @@ -542,6 +542,18 @@ MAX_RELEASE_CHECK_RATE default: 4095 unless not HAVE_MMAP #define DLMALLOC_EXPORT extern #endif +#ifdef __UBOOT__ +#include <mapmem.h> +#include <asm/global_data.h> + +DECLARE_GLOBAL_DATA_PTR; + +ulong mem_malloc_start; +ulong mem_malloc_end; +ulong mem_malloc_brk; + +#endif /* __UBOOT__ */ + #ifndef WIN32 #ifdef _WIN32 #define WIN32 1 @@ -6290,3 +6302,42 @@ History: structure of old version, but most details differ.) 
*/ + +/* --------------------- U-Boot additions --------------------- */ + +#ifdef __UBOOT__ + +void *sbrk(ptrdiff_t increment) +{ + ulong old = mem_malloc_brk; + ulong new = old + increment; + + /* mem_malloc_end points one byte past the end, so >= is correct */ + if ((new < mem_malloc_start) || (new >= mem_malloc_end)) + return (void *)MORECORE_FAILURE; + + /* + * if we are giving memory back make sure we clear it out since + * we set MORECORE_CLEARS to 1 + */ + if (increment < 0) + memset((void *)new, '\0', -increment); + + mem_malloc_brk = new; + + return (void *)old; +} + +void mem_malloc_init(ulong start, ulong size) +{ + mem_malloc_start = (ulong)map_sysmem(start, size); + mem_malloc_end = mem_malloc_start + size; + mem_malloc_brk = mem_malloc_start; + + debug("using memory %#lx-%#lx for malloc()\n", mem_malloc_start, + mem_malloc_end); +#if CONFIG_IS_ENABLED(SYS_MALLOC_CLEAR_ON_INIT) + memset((void *)mem_malloc_start, '\0', size); +#endif +} +#endif /* __UBOOT__ */ diff --git a/include/malloc.h b/include/malloc.h index 4608082d2e0..e0a5b732203 100644 --- a/include/malloc.h +++ b/include/malloc.h @@ -625,6 +625,33 @@ void mspace_inspect_all(mspace msp, void* arg); #endif /* MSPACES */ +/* --------------------- U-Boot additions --------------------- */ + +#ifdef __UBOOT__ +#include <linux/types.h> + +/* Memory pool boundaries */ +extern ulong mem_malloc_start; +extern ulong mem_malloc_end; +extern ulong mem_malloc_brk; + +/** + * mem_malloc_init() - Initialize the malloc() heap + * + * @start: Start address of heap memory region + * @size: Size of heap memory region in bytes + */ +void mem_malloc_init(ulong start, ulong size); + +/** + * sbrk() - Extend the heap + * + * @increment: Number of bytes to add (or remove if negative) + * Return: Previous break value on success, MORECORE_FAILURE on error + */ +void *sbrk(ptrdiff_t increment); +#endif /* __UBOOT__ */ + #ifdef __cplusplus }; /* end of extern "C" */ #endif -- 2.43.0
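For readers following the break-pointer logic, here is a minimal standalone sketch of the same idea. It is not the patch's code: the names pool, pool_init(), pool_sbrk() and SBRK_FAIL are hypothetical stand-ins for the mem_malloc_* globals, mem_malloc_init(), sbrk() and MORECORE_FAILURE, and it assumes the break begins at the bottom of the pool. It mirrors the patch's >= bound check, so the break can never reach the very end of the pool.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for mem_malloc_start/end/brk */
static unsigned char pool[256];
static uintptr_t pool_start, pool_end, pool_brk;

/* Plays the role of MORECORE_FAILURE in this sketch */
#define SBRK_FAIL ((void *)-1)

/* Record the pool bounds; the break starts at the bottom (assumption) */
static void pool_init(void)
{
	pool_start = (uintptr_t)pool;
	pool_end = pool_start + sizeof(pool);
	pool_brk = pool_start;
}

/* Move the break by 'increment' bytes, rejecting moves outside the pool */
static void *pool_sbrk(ptrdiff_t increment)
{
	uintptr_t old = pool_brk;
	uintptr_t new_brk = old + (uintptr_t)increment;

	/* Same bound check as the patch: a break equal to pool_end is refused */
	if (new_brk < pool_start || new_brk >= pool_end)
		return SBRK_FAIL;

	/* Memory handed back is zeroed, matching MORECORE_CLEARS = 1 */
	if (increment < 0)
		memset((void *)new_brk, 0, (size_t)-increment);

	pool_brk = new_brk;

	return (void *)old;
}
```

Calling pool_init() and then pool_sbrk(64) twice returns the pool base and base + 64; a further pool_sbrk(128) fails because it would move the break to the one-past-the-end address.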
From: Simon Glass <simon.glass@canonical.com> Add an #ifdef __UBOOT__ section to configure dlmalloc for U-Boot's embedded environment: - Disable mmap, set LACKS_* for unavailable headers - Include string.h and errno.h - Add ABORT definition using infinite loop - Define DEBUG 0 to avoid assert redefinition issues - Fix dlmalloc_footprint_limit() prototype (add void) - Fix dlmalloc_usable_size() to use const void * - Use MFAIL instead of MORECORE_FAILURE in sbrk() Co-developed-by: Claude <noreply@anthropic.com> Signed-off-by: Simon Glass <simon.glass@canonical.com> --- common/dlmalloc.c | 30 +++++++++++++++++++++++++----- 1 file changed, 25 insertions(+), 5 deletions(-) diff --git a/common/dlmalloc.c b/common/dlmalloc.c index 2be2262ebeb..480dd46c0cf 100644 --- a/common/dlmalloc.c +++ b/common/dlmalloc.c @@ -543,6 +543,26 @@ MAX_RELEASE_CHECK_RATE default: 4095 unless not HAVE_MMAP #endif #ifdef __UBOOT__ + +#define LACKS_FCNTL_H +#define LACKS_UNISTD_H +#define LACKS_SYS_PARAM_H +#define LACKS_SYS_MMAN_H +#define LACKS_SYS_TYPES_H +#define LACKS_SCHED_H +#define LACKS_TIME_H +#define HAVE_MMAP 0 +#define HAVE_MREMAP 0 +#define MORECORE_CONTIGUOUS 1 +#define MORECORE_CANNOT_TRIM 1 +#define MORECORE_CLEARS 1 +#define NO_MALLOC_STATS 1 +#define USE_LOCKS 0 +#define USE_SPIN_LOCKS 0 +#define MALLOC_FAILURE_ACTION +#define ABORT do {} while (1) + +#include <malloc.h> #include <mapmem.h> #include <asm/global_data.h> @@ -1020,7 +1040,7 @@ DLMALLOC_EXPORT size_t dlmalloc_max_footprint(void); guarantee that this number of bytes can actually be obtained from the system. 
*/ -DLMALLOC_EXPORT size_t dlmalloc_footprint_limit(); +DLMALLOC_EXPORT size_t dlmalloc_footprint_limit(void); /* malloc_set_footprint_limit(); @@ -1281,7 +1301,7 @@ DLMALLOC_EXPORT void dlmalloc_stats(void); p = malloc(n); assert(malloc_usable_size(p) >= 256); */ -size_t dlmalloc_usable_size(void*); +size_t dlmalloc_usable_size(const void*); #endif /* ONLY_MSPACES */ @@ -5400,9 +5420,9 @@ int dlmallopt(int param_number, int value) { return change_mparam(param_number, value); } -size_t dlmalloc_usable_size(void* mem) { +size_t dlmalloc_usable_size(const void* mem) { if (mem != 0) { - mchunkptr p = mem2chunk(mem); + mchunkptr p = mem2chunk((void*)mem); if (is_inuse(p)) return chunksize(p) - overhead_for(p); } @@ -6314,7 +6334,7 @@ void *sbrk(ptrdiff_t increment) /* mem_malloc_end points one byte past the end, so >= is correct */ if ((new < mem_malloc_start) || (new >= mem_malloc_end)) - return (void *)MORECORE_FAILURE; + return MFAIL; /* * if we are giving memory back make sure we clear it out since -- 2.43.0
From: Simon Glass <simon.glass@canonical.com> Include log.h early to prevent an assert() redefinition warning. Co-developed-by: Claude <noreply@anthropic.com> Signed-off-by: Simon Glass <simon.glass@canonical.com> --- common/dlmalloc.c | 1 + 1 file changed, 1 insertion(+) diff --git a/common/dlmalloc.c b/common/dlmalloc.c index 480dd46c0cf..ff13a779211 100644 --- a/common/dlmalloc.c +++ b/common/dlmalloc.c @@ -562,6 +562,7 @@ MAX_RELEASE_CHECK_RATE default: 4095 unless not HAVE_MMAP #define MALLOC_FAILURE_ACTION #define ABORT do {} while (1) +#include <log.h> #include <malloc.h> #include <mapmem.h> #include <asm/global_data.h> -- 2.43.0
From: Simon Glass <simon.glass@canonical.com> When malloc() was called before it was properly initialized (as would happen when used before relocation to RAM) it returned random, non-NULL values, which caused all kinds of difficult-to-debug subsequent errors. Make sure to return NULL when initialization has not been done yet. Signed-off-by: Simon Glass <simon.glass@canonical.com> (cherry picked from commit 2740544881f652566756815dda4da0bcd946e9de) --- common/dlmalloc.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/common/dlmalloc.c b/common/dlmalloc.c index ff13a779211..baa9b500e10 100644 --- a/common/dlmalloc.c +++ b/common/dlmalloc.c @@ -4582,6 +4582,11 @@ static void* tmalloc_small(mstate m, size_t nb) { #if !ONLY_MSPACES void* dlmalloc(size_t bytes) { +#ifdef __UBOOT__ + /* Return NULL if not initialized yet */ + if (!mem_malloc_start && !mem_malloc_end) + return NULL; +#endif /* Basic algorithm: If a small request (< 256 bytes minus per-chunk overhead): -- 2.43.0
From: Simon Glass <simon.glass@canonical.com> If we are to have driver model before relocation we need to support some way of calling memory allocation routines. The standard malloc() is pretty complicated: 1. It uses some BSS memory for its state, and BSS is not available before relocation 2. It supports algorithms for reducing memory fragmentation and improving performance of free(). Before relocation we could happily just not support free(). 3. It includes about 4KB of code (Thumb 2) and 1KB of data. However since this has been loaded anyway this is not really a problem. The simplest way to support pre-relocation malloc() is to reserve an area of memory and allocate it in increasing blocks as needed. This implementation does this. To enable it, you need to define the size of the malloc() pool as described in the README. It will be located above the pre-relocation stack on supported architectures. Note that this implementation is only useful on machines which have some memory available before dram_init() is called - this includes those that do no DRAM init (like tegra) and those that do it in SPL (quite a few boards). Enabling driver model prior to relocation for the rest of the boards is left for a later exercise. 
Changes from original commit: - Squash in commit 'malloc: Redirect to malloc_simple before relocation' - Modify dlmalloc/dlfree/dlrealloc/dlmemalign (new 2.8.6 names) - Add #ifdef __UBOOT__ wrapper around the checks - Redirect to malloc_simple()/memalign_simple() instead of embedding code - Add declarations for malloc_simple() and memalign_simple() to malloc.h - Move global_data.h include and DECLARE_GLOBAL_DATA_PTR to top - Add proper documentation for the two new functions Signed-off-by: Simon Glass <sjg@chromium.org> Co-developed-by: Claude <noreply@anthropic.com> Signed-off-by: Simon Glass <simon.glass@canonical.com> (cherry picked from commit d59476b6446799c21e64147d86483140154c1886) --- common/dlmalloc.c | 38 ++++++++++++++++++++++++++++++++++++++ include/malloc.h | 36 ++++++++++++++++++++++++++++++++++++ 2 files changed, 74 insertions(+) diff --git a/common/dlmalloc.c b/common/dlmalloc.c index baa9b500e10..f0b6db20f5c 100644 --- a/common/dlmalloc.c +++ b/common/dlmalloc.c @@ -565,6 +565,7 @@ MAX_RELEASE_CHECK_RATE default: 4095 unless not HAVE_MMAP #include <log.h> #include <malloc.h> #include <mapmem.h> +#include <vsprintf.h> #include <asm/global_data.h> DECLARE_GLOBAL_DATA_PTR; @@ -4583,6 +4584,11 @@ static void* tmalloc_small(mstate m, size_t nb) { void* dlmalloc(size_t bytes) { #ifdef __UBOOT__ +#if CONFIG_IS_ENABLED(SYS_MALLOC_F) + if (!(gd->flags & GD_FLG_FULL_MALLOC_INIT)) + return malloc_simple(bytes); +#endif + /* Return NULL if not initialized yet */ if (!mem_malloc_start && !mem_malloc_end) return NULL; @@ -4725,6 +4731,13 @@ void* dlmalloc(size_t bytes) { /* ---------------------------- free --------------------------- */ void dlfree(void* mem) { +#ifdef __UBOOT__ +#if CONFIG_IS_ENABLED(SYS_MALLOC_F) + /* free() is a no-op - all the memory will be freed on relocation */ + if (!(gd->flags & GD_FLG_FULL_MALLOC_INIT)) + return; +#endif +#endif /* Consolidate freed chunks with preceeding or succeeding bordering free chunks, if they exist, and then 
place in a bin. Intermixed @@ -5228,6 +5241,14 @@ static void internal_inspect_all(mstate m, #if !ONLY_MSPACES void* dlrealloc(void* oldmem, size_t bytes) { +#ifdef __UBOOT__ +#if CONFIG_IS_ENABLED(SYS_MALLOC_F) + if (!(gd->flags & GD_FLG_FULL_MALLOC_INIT)) { + /* This is harder to support and should not be needed */ + panic("pre-reloc realloc() is not supported"); + } +#endif +#endif void* mem = 0; if (oldmem == 0) { mem = dlmalloc(bytes); @@ -5304,6 +5325,12 @@ void* dlrealloc_in_place(void* oldmem, size_t bytes) { } void* dlmemalign(size_t alignment, size_t bytes) { +#ifdef __UBOOT__ +#if CONFIG_IS_ENABLED(SYS_MALLOC_F) + if (!(gd->flags & GD_FLG_FULL_MALLOC_INIT)) + return memalign_simple(alignment, bytes); +#endif +#endif if (alignment <= MALLOC_ALIGNMENT) { return dlmalloc(bytes); } @@ -6366,4 +6393,15 @@ void mem_malloc_init(ulong start, ulong size) memset((void *)mem_malloc_start, '\0', size); #endif } + +int initf_malloc(void) +{ +#if CONFIG_IS_ENABLED(SYS_MALLOC_F) + assert(gd->malloc_base); /* Set up by crt0.S */ + gd->malloc_limit = CONFIG_VAL(SYS_MALLOC_F_LEN); + gd->malloc_ptr = 0; +#endif + + return 0; +} #endif /* __UBOOT__ */ diff --git a/include/malloc.h b/include/malloc.h index e0a5b732203..d5cccc96e50 100644 --- a/include/malloc.h +++ b/include/malloc.h @@ -650,6 +650,42 @@ void mem_malloc_init(ulong start, ulong size); * Return: Previous break value on success, MORECORE_FAILURE on error */ void *sbrk(ptrdiff_t increment); + +/** + * malloc_simple() - Allocate memory from the simple malloc pool + * + * Allocates memory from a simple pool used before full malloc() is available. + * This is used before relocation when BSS is not yet available for dlmalloc's + * state. Memory allocated with this function cannot be freed. 
+ * + * @size: Number of bytes to allocate + * Return: Pointer to allocated memory, or NULL if pool is exhausted + */ +void *malloc_simple(size_t size); + +/** + * memalign_simple() - Allocate aligned memory from the simple malloc pool + * + * Allocates aligned memory from a simple pool used before full malloc() is + * available. This is used before relocation when BSS is not yet available + * for dlmalloc's state. Memory allocated with this function cannot be freed. + * + * @alignment: Required alignment (must be a power of 2) + * @bytes: Number of bytes to allocate + * Return: Pointer to allocated memory, or NULL if pool is exhausted + */ +void *memalign_simple(size_t alignment, size_t bytes); + +/** + * initf_malloc() - Set up the early malloc() pool + * + * Sets up the simple malloc() pool which is used before full malloc() + * is available after relocation. + * + * Return: 0 (always succeeds) + */ +int initf_malloc(void); + #endif /* __UBOOT__ */ #ifdef __cplusplus -- 2.43.0
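The "allocate in increasing blocks" scheme described in this patch can be illustrated with a small bump allocator. This is only a sketch under stated assumptions: struct early_pool, early_buf, early_memalign() and early_malloc() are hypothetical names that loosely model the gd->malloc_base, gd->malloc_limit and gd->malloc_ptr fields, and the real malloc_simple()/memalign_simple() may differ in detail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical miniature of gd->malloc_base / malloc_limit / malloc_ptr */
struct early_pool {
	unsigned char *base;	/* start of the reserved area */
	size_t limit;		/* size of the reserved area in bytes */
	size_t ptr;		/* offset of the next free byte */
};

/* Backing store for the sketch */
static unsigned char early_buf[128];

/* Bump-allocate 'bytes' at an 'align'-byte boundary (align: power of 2) */
static void *early_memalign(struct early_pool *p, size_t align, size_t bytes)
{
	uintptr_t addr = (uintptr_t)p->base + p->ptr;
	uintptr_t aligned = (addr + align - 1) & ~(uintptr_t)(align - 1);
	size_t new_ptr = (size_t)(aligned - (uintptr_t)p->base) + bytes;

	/* Fail on exhaustion or arithmetic wrap-around; pool is unchanged */
	if (new_ptr > p->limit || new_ptr < bytes)
		return NULL;

	p->ptr = new_ptr;

	return (void *)aligned;
}

/* Unaligned allocation; nothing is ever freed in this scheme */
static void *early_malloc(struct early_pool *p, size_t bytes)
{
	return early_memalign(p, 1, bytes);
}
```

Because the only state is a base, a limit and an offset, no BSS-resident bookkeeping is needed, which is the property the commit message relies on before relocation.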
From: Simon Glass <simon.glass@canonical.com> dlmalloc has some sanity checks it performs on free() which can help detect memory corruption. However, they are only enabled if DEBUG is defined before including common.h. Define DEBUG earlier if UNIT_TEST is enabled so that assertions are enabled in sandbox. Changes from original commit(s): - Combine commits 213adf6dffe and 1786861415f - Use 'DEBUG 1' instead of 'DEBUG' since new dlmalloc uses '#if DEBUG' Signed-off-by: Sean Anderson <seanga2@gmail.com> Reviewed-by: Simon Glass <sjg@chromium.org> Co-developed-by: Claude <noreply@anthropic.com> Signed-off-by: Simon Glass <simon.glass@canonical.com> (cherry picked from commit 6d7601e74437f3c915667a829ab722ba5174ec72) (cherry picked from commit 1786861415f4494a38630584a8fbc9c939a024ce) --- common/dlmalloc.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/common/dlmalloc.c b/common/dlmalloc.c index f0b6db20f5c..98de6523758 100644 --- a/common/dlmalloc.c +++ b/common/dlmalloc.c @@ -544,6 +544,10 @@ MAX_RELEASE_CHECK_RATE default: 4095 unless not HAVE_MMAP #ifdef __UBOOT__ +#if CONFIG_IS_ENABLED(UNIT_TEST) +#define DEBUG 1 +#endif + #define LACKS_FCNTL_H #define LACKS_UNISTD_H #define LACKS_SYS_PARAM_H -- 2.43.0
From: Simon Glass <simon.glass@canonical.com> Enable INSECURE=1 to skip runtime heap validation checks (except for sandbox), and NO_MALLINFO=1 to remove mallinfo support (except when unit tests are enabled). These reduce code size significantly. Co-developed-by: Claude <noreply@anthropic.com> Signed-off-by: Simon Glass <simon.glass@canonical.com> --- common/dlmalloc.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/common/dlmalloc.c b/common/dlmalloc.c index 98de6523758..268d3fea52a 100644 --- a/common/dlmalloc.c +++ b/common/dlmalloc.c @@ -563,6 +563,14 @@ MAX_RELEASE_CHECK_RATE default: 4095 unless not HAVE_MMAP #define NO_MALLOC_STATS 1 #define USE_LOCKS 0 #define USE_SPIN_LOCKS 0 + +#if !CONFIG_IS_ENABLED(UNIT_TEST) +#define NO_MALLINFO 1 +#endif +#if !CONFIG_IS_ENABLED(SANDBOX) +#define INSECURE 1 +#endif + #define MALLOC_FAILURE_ACTION #define ABORT do {} while (1) -- 2.43.0
From: Simon Glass <simon.glass@canonical.com> Use of memalign can trigger fragmentation issues where the over-sized allocation needed to guarantee alignment fails, even though the exact user-requested size would succeed and be properly aligned. If the padded allocation fails, try allocating exactly the user's requested size. If that happens to be aligned, return it. Otherwise, try a third allocation with just enough extra space to achieve alignment. Changes from original commits: - Port to dlmalloc 2.8.6 internal_memalign() instead of mEMALIGn() - Use internal_malloc/internal_free instead of mALLOc/fREe Signed-off-by: Stephen Warren <swarren@nvidia.com> Reviewed-by: Tom Rini <trini@konsulko.com> Acked-by: Lukasz Majewski <l.majewski@samsung.com> Co-developed-by: Claude <noreply@anthropic.com> Signed-off-by: Simon Glass <simon.glass@canonical.com> (cherry picked from 4f144a416469c6a29127b0656523ae628ea7cbaf) --- common/dlmalloc.c | 40 ++++++++++++++++++++++++++++++++++++++++ 1 file changed, 40 insertions(+) diff --git a/common/dlmalloc.c b/common/dlmalloc.c index 268d3fea52a..5a8e463671c 100644 --- a/common/dlmalloc.c +++ b/common/dlmalloc.c @@ -4975,6 +4975,46 @@ static void* internal_memalign(mstate m, size_t alignment, size_t bytes) { size_t nb = request2size(bytes); size_t req = nb + alignment + MIN_CHUNK_SIZE - CHUNK_OVERHEAD; mem = internal_malloc(m, req); +#ifdef __UBOOT__ + /* + * The attempt to over-allocate (with a size large enough to guarantee the + * ability to find an aligned region within allocated memory) failed. + * + * Try again, this time only allocating exactly the size the user wants. + * If the allocation now succeeds and just happens to be aligned, we can + * still fulfill the user's request. 
+ */ + if (mem == 0) { + size_t extra, extra2; + + mem = internal_malloc(m, bytes); + /* Aligned -> use it */ + if (mem != 0 && (((size_t)(mem)) & (alignment - 1)) == 0) + return mem; + /* + * Otherwise, try again, requesting enough extra space to be able to + * acquire alignment. + */ + if (mem != 0) { + internal_free(m, mem); + /* Add in extra bytes to match misalignment of unexpanded alloc */ + extra = alignment - (((size_t)(mem)) % alignment); + mem = internal_malloc(m, bytes + extra); + /* + * mem might not be the same as before. Validate that the previous + * value of extra still works for the current value of mem. + */ + if (mem != 0) { + extra2 = alignment - (((size_t)(mem)) % alignment); + if (extra2 > extra) { + internal_free(m, mem); + mem = 0; + } + } + } + /* Fall through to original NULL check and chunk splitting logic */ + } +#endif if (mem != 0) { mchunkptr p = mem2chunk(mem); if (PREACTION(m)) -- 2.43.0
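The retry arithmetic in this fallback can be checked in isolation. The sketch below uses two hypothetical helpers, align_extra() and retry_fits(), which are not part of the patch: the first computes the padding the patch calls extra, and the second expresses the extra2 <= extra test that decides whether the second allocation is kept.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes from 'addr' up to the next 'alignment' boundary (never zero) */
static size_t align_extra(uintptr_t addr, size_t alignment)
{
	return alignment - (addr % alignment);
}

/*
 * 'first' is the address of the exact-size trial allocation and 'second'
 * the address returned when asking for bytes + extra. The retry is only
 * usable if the padding needed at 'second' fits in the padding requested.
 */
static bool retry_fits(uintptr_t first, uintptr_t second, size_t alignment)
{
	return align_extra(second, alignment) <= align_extra(first, alignment);
}
```

For a 16-byte alignment, a trial block at 0x1008 needs 8 bytes of padding. A retry landing at 0x100c needs only 4, so it fits; one landing at 0x1004 needs 12 and must be freed, which is the mem = 0 path in the patch.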
From: Simon Glass <simon.glass@canonical.com> When running sandbox, U-Boot's malloc symbols can be hooked into the GOT before U-Boot code runs. This causes issues because the dynamic linker may call malloc/free before gd is initialized. Use hidden visibility for malloc symbols to prevent them from being hooked into the GOT, so only code in the U-Boot binary itself calls them; any other code calls the standard C library malloc(). Changes from original commit: - Use DLMALLOC_EXPORT mechanism instead of #pragma in malloc.h Cc: Rabin Vincent <rabin@rab.in> Signed-off-by: Stephen Warren <swarren@wwwdotorg.org> Reviewed-by: Tom Rini <trini@konsulko.com> Reviewed-by: Simon Glass <sjg@chromium.org> Co-developed-by: Claude <noreply@anthropic.com> Signed-off-by: Simon Glass <simon.glass@canonical.com> (cherry picked from commit 2f0bcd4de1a5b990e58d12cd0c7f9d7e9248fec4) --- common/dlmalloc.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/common/dlmalloc.c b/common/dlmalloc.c index 5a8e463671c..d53cbf2f2e1 100644 --- a/common/dlmalloc.c +++ b/common/dlmalloc.c @@ -538,6 +538,15 @@ MAX_RELEASE_CHECK_RATE default: 4095 unless not HAVE_MMAP #define DLMALLOC_VERSION 20806 #endif /* DLMALLOC_VERSION */ +/* + * For U-Boot, use hidden visibility to prevent malloc symbols from being + * hooked into the GOT, avoiding issues during early initialization before + * gd is set up. + */ +#ifdef __UBOOT__ +#define DLMALLOC_EXPORT extern __attribute__((visibility("hidden"))) +#endif + #ifndef DLMALLOC_EXPORT #define DLMALLOC_EXPORT extern #endif -- 2.43.0
From: Simon Glass <simon.glass@canonical.com> When full malloc is enabled and SYS_MALLOC_F is also enabled, the simple pre-reloc heap is used before relocation. The calloc_must_clear macro relies on chunk metadata which does not exist for simple malloc allocations. Use memset directly to zero out memory from simple malloc. Changes from original commit: - Port to dlcalloc() in dlmalloc 2.8.6 - Update memset() second arg to be a char Signed-off-by: Simon Goldschmidt <simon.k.r.goldschmidt@gmail.com> Reviewed-by: Tom Rini <trini@konsulko.com> Co-developed-by: Claude <noreply@anthropic.com> Signed-off-by: Simon Glass <simon.glass@canonical.com> (cherry picked from bb71a2d9dcd9c53aa4d4b8e4d26c24d9b59b74c3) --- common/dlmalloc.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/common/dlmalloc.c b/common/dlmalloc.c index d53cbf2f2e1..a07166206dc 100644 --- a/common/dlmalloc.c +++ b/common/dlmalloc.c @@ -4877,6 +4877,15 @@ void* dlcalloc(size_t n_elements, size_t elem_size) { req = MAX_SIZE_T; /* force downstream failure on overflow */ } mem = dlmalloc(req); +#ifdef __UBOOT__ +#if CONFIG_IS_ENABLED(SYS_MALLOC_F) + /* For pre-reloc simple malloc, just zero the memory directly */ + if (mem != 0 && !(gd->flags & GD_FLG_FULL_MALLOC_INIT)) { + memset(mem, '\0', req); + return mem; + } +#endif +#endif if (mem != 0 && calloc_must_clear(mem2chunk(mem))) memset(mem, 0, req); return mem; -- 2.43.0
From: Simon Glass <simon.glass@canonical.com> In order to allow use of both U-Boot's malloc() and the C library's version, set a prefix for the allocation functions so that they can co-exist. This is only done for sandbox. For other archs everything remains the same. Signed-off-by: Simon Glass <sjg@chromium.org> Co-developed-by: Claude <noreply@anthropic.com> Signed-off-by: Simon Glass <simon.glass@canonical.com> (cherry picked from cfda60f99ae237494e9341aad9676152d3bac3c9) --- include/malloc.h | 23 +++++++++++++++++++++++ 1 file changed, 23 insertions(+) diff --git a/include/malloc.h b/include/malloc.h index d5cccc96e50..76068032da7 100644 --- a/include/malloc.h +++ b/include/malloc.h @@ -60,6 +60,14 @@ extern "C" { #if !ONLY_MSPACES +/* + * Rename the U-Boot alloc functions so that sandbox can still use the system + * ones + */ +#ifdef CONFIG_SANDBOX +#define USE_DL_PREFIX +#endif + #ifndef USE_DL_PREFIX #define dlcalloc calloc #define dlfree free @@ -82,6 +90,21 @@ extern "C" { #define dlindependent_calloc independent_calloc #define dlindependent_comalloc independent_comalloc #define dlbulk_free bulk_free +#else /* USE_DL_PREFIX */ +/* Ensure that U-Boot actually uses dlmalloc versions */ +#define calloc(n, s) dlcalloc(n, s) +#define free(p) dlfree(p) +#define malloc(s) dlmalloc(s) +#define memalign(a, s) dlmemalign(a, s) +#define posix_memalign(p, a, s) dlposix_memalign(p, a, s) +#define realloc(p, s) dlrealloc(p, s) +#define valloc(s) dlvalloc(s) +#define pvalloc(s) dlpvalloc(s) +#define mallinfo() dlmallinfo() +#define mallopt(p, v) dlmallopt(p, v) +#define malloc_trim(s) dlmalloc_trim(s) +#define malloc_stats() dlmalloc_stats() +#define malloc_usable_size(p) dlmalloc_usable_size(p) #endif /* USE_DL_PREFIX */ #if !NO_MALLINFO -- 2.43.0
From: Simon Glass <simon.glass@canonical.com> Add valgrind annotations to track memory allocations: - VALGRIND_MALLOCLIKE_BLOCK in dlmalloc() at the postaction label - VALGRIND_FREELIKE_BLOCK in dlfree() for both pre-reloc and post-reloc paths - VALGRIND_RESIZEINPLACE_BLOCK/VALGRIND_MAKE_MEM_DEFINED in dlrealloc() and dlrealloc_in_place() when resizing in place - VALGRIND_MALLOCLIKE_BLOCK/VALGRIND_FREELIKE_BLOCK in dlrealloc() when allocating new memory Changes from original commit: - The new dlmalloc 2.8.6 uses a centralized "postaction" label pattern instead of multiple return points, allowing simpler annotation - Annotations placed at strategic points covering all allocation paths - dlrealloc_in_place() is a new function that needs annotations Signed-off-by: Sean Anderson <seanga2@gmail.com> Reviewed-by: Simon Glass <sjg@chromium.org> Co-developed-by: Claude <noreply@anthropic.com> Signed-off-by: Simon Glass <simon.glass@canonical.com> (cherry picked from commit bdaeea1b6863b0ec80f2d4bc15d50b8d16efa708) --- common/dlmalloc.c | 29 ++++++++++++++++++++++++++++- 1 file changed, 28 insertions(+), 1 deletion(-) diff --git a/common/dlmalloc.c b/common/dlmalloc.c index a07166206dc..9298fc445e4 100644 --- a/common/dlmalloc.c +++ b/common/dlmalloc.c @@ -588,6 +588,7 @@ MAX_RELEASE_CHECK_RATE default: 4095 unless not HAVE_MMAP #include <mapmem.h> #include <vsprintf.h> #include <asm/global_data.h> +#include <valgrind/memcheck.h> DECLARE_GLOBAL_DATA_PTR; @@ -4743,6 +4744,10 @@ void* dlmalloc(size_t bytes) { postaction: POSTACTION(gm); +#ifdef __UBOOT__ + if (mem) + VALGRIND_MALLOCLIKE_BLOCK(mem, bytes, SIZE_SZ, false); +#endif return mem; } @@ -4755,8 +4760,10 @@ void dlfree(void* mem) { #ifdef __UBOOT__ #if CONFIG_IS_ENABLED(SYS_MALLOC_F) /* free() is a no-op - all the memory will be freed on relocation */ - if (!(gd->flags & GD_FLG_FULL_MALLOC_INIT)) + if (!(gd->flags & GD_FLG_FULL_MALLOC_INIT)) { + VALGRIND_FREELIKE_BLOCK(mem, SIZE_SZ); return; + } #endif #endif /* @@ 
-4778,6 +4785,9 @@ void dlfree(void* mem) { #endif /* FOOTERS */ if (!PREACTION(fm)) { check_inuse_chunk(fm, p); +#ifdef __UBOOT__ + VALGRIND_FREELIKE_BLOCK(mem, SIZE_SZ); +#endif if (RTCHECK(ok_address(fm, p) && ok_inuse(p))) { size_t psize = chunksize(p); mchunkptr next = chunk_plus_offset(p, psize); @@ -5349,12 +5359,25 @@ void* dlrealloc(void* oldmem, size_t bytes) { if (newp != 0) { check_inuse_chunk(m, newp); mem = chunk2mem(newp); +#ifdef __UBOOT__ + if (mem == oldmem) { + VALGRIND_RESIZEINPLACE_BLOCK(oldmem, 0, bytes, SIZE_SZ); + VALGRIND_MAKE_MEM_DEFINED(oldmem, bytes); + } else { + VALGRIND_MALLOCLIKE_BLOCK(mem, bytes, SIZE_SZ, false); + VALGRIND_FREELIKE_BLOCK(oldmem, SIZE_SZ); + } +#endif } else { mem = internal_malloc(m, bytes); if (mem != 0) { size_t oc = chunksize(oldp) - overhead_for(oldp); memcpy(mem, oldmem, (oc < bytes)? oc : bytes); +#ifdef __UBOOT__ + VALGRIND_MALLOCLIKE_BLOCK(mem, bytes, SIZE_SZ, false); + VALGRIND_FREELIKE_BLOCK(oldmem, SIZE_SZ); +#endif internal_free(m, oldmem); } } @@ -5387,6 +5410,10 @@ void* dlrealloc_in_place(void* oldmem, size_t bytes) { if (newp == oldp) { check_inuse_chunk(m, newp); mem = oldmem; +#ifdef __UBOOT__ + VALGRIND_RESIZEINPLACE_BLOCK(oldmem, 0, bytes, SIZE_SZ); + VALGRIND_MAKE_MEM_DEFINED(oldmem, bytes); +#endif } } } -- 2.43.0
From: Simon Glass <simon.glass@canonical.com> It is helpful to test that out-of-memory checks work correctly in code that calls malloc(). Add a simple way to force failure after a given number of malloc() calls. Also add support for realloc() testing (from commit 04894f5ad53). Changes from original commits: - Variable declarations moved to top of U-Boot section (before dlmalloc()) - Adapted to new dlmalloc function names (dlmalloc/dlrealloc vs mALLOc/rEALLOc) Signed-off-by: Simon Glass <sjg@chromium.org> Reviewed-by: Sean Anderson <seanga2@gmail.com> Co-developed-by: Claude <noreply@anthropic.com> Signed-off-by: Simon Glass <simon.glass@canonical.com> (cherry picked from commit 62d638386c17d17b929ad10956c7f60825335a4e) (cherry picked from commit 04894f5ad53cab0ee03eb3bc1cc1682e22f5dd1b) --- common/dlmalloc.c | 24 ++++++++++++++++++++++++ include/malloc.h | 12 ++++++++++++ 2 files changed, 36 insertions(+) diff --git a/common/dlmalloc.c b/common/dlmalloc.c index 9298fc445e4..aacc9b5db3b 100644 --- a/common/dlmalloc.c +++ b/common/dlmalloc.c @@ -592,6 +592,9 @@ MAX_RELEASE_CHECK_RATE default: 4095 unless not HAVE_MMAP DECLARE_GLOBAL_DATA_PTR; +static bool malloc_testing; /* enable test mode */ +static int malloc_max_allocs; /* return NULL after this many calls to malloc() */ + ulong mem_malloc_start; ulong mem_malloc_end; ulong mem_malloc_brk; @@ -4614,6 +4617,11 @@ void* dlmalloc(size_t bytes) { /* Return NULL if not initialized yet */ if (!mem_malloc_start && !mem_malloc_end) return NULL; + + if (CONFIG_IS_ENABLED(UNIT_TEST) && malloc_testing) { + if (--malloc_max_allocs < 0) + return NULL; + } #endif /* Basic algorithm: @@ -5328,6 +5336,10 @@ void* dlrealloc(void* oldmem, size_t bytes) { panic("pre-reloc realloc() is not supported"); } #endif + if (CONFIG_IS_ENABLED(UNIT_TEST) && malloc_testing) { + if (--malloc_max_allocs < 0) + return NULL; + } #endif void* mem = 0; if (oldmem == 0) { @@ -6491,6 +6503,17 @@ void mem_malloc_init(ulong start, ulong size) #endif } 
+void malloc_enable_testing(int max_allocs) +{ + malloc_testing = true; + malloc_max_allocs = max_allocs; +} + +void malloc_disable_testing(void) +{ + malloc_testing = false; +} + int initf_malloc(void) { #if CONFIG_IS_ENABLED(SYS_MALLOC_F) @@ -6501,4 +6524,5 @@ int initf_malloc(void) return 0; } + #endif /* __UBOOT__ */ diff --git a/include/malloc.h b/include/malloc.h index 76068032da7..72db7fdb507 100644 --- a/include/malloc.h +++ b/include/malloc.h @@ -658,6 +658,18 @@ extern ulong mem_malloc_start; extern ulong mem_malloc_end; extern ulong mem_malloc_brk; +/** + * malloc_enable_testing() - Enable malloc failure testing + * + * @max_allocs: Number of allocations to allow before returning NULL + */ +void malloc_enable_testing(int max_allocs); + +/** + * malloc_disable_testing() - Disable malloc failure testing + */ +void malloc_disable_testing(void); + /** * mem_malloc_init() - Initialize the malloc() heap * -- 2.43.0
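The countdown mechanism is easy to try outside U-Boot. The following sketch wraps the C library's malloc() with the same decrement-and-fail logic; demo_enable_testing(), demo_disable_testing() and demo_malloc() are hypothetical names chosen for illustration and are not the U-Boot functions themselves.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

static bool demo_testing;	/* test mode enabled */
static int demo_max_allocs;	/* allocations still allowed to succeed */

/* Allow 'max_allocs' further allocations, then start returning NULL */
static void demo_enable_testing(int max_allocs)
{
	demo_testing = true;
	demo_max_allocs = max_allocs;
}

static void demo_disable_testing(void)
{
	demo_testing = false;
}

/* malloc() wrapper with the same countdown check as the patch */
static void *demo_malloc(size_t bytes)
{
	if (demo_testing && --demo_max_allocs < 0)
		return NULL;

	return malloc(bytes);
}
```

A test would typically call demo_enable_testing(n), run the code under test, check that it reports an out-of-memory error cleanly, and then call demo_disable_testing() so later tests are unaffected.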
From: Simon Glass <simon.glass@canonical.com> Add (void) to dlmalloc_stats() function definition to match its declaration and avoid the clang-15 warning about function declarations without prototypes. Signed-off-by: Tom Rini <trini@konsulko.com> Reviewed-by: Simon Glass <sjg@chromium.org> Co-developed-by: Claude <noreply@anthropic.com> Signed-off-by: Simon Glass <simon.glass@canonical.com> (cherry picked from commit f88d48cc74f0e78b14fed812101d94de65e43802) --- common/dlmalloc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/common/dlmalloc.c b/common/dlmalloc.c index aacc9b5db3b..03fd902c9f6 100644 --- a/common/dlmalloc.c +++ b/common/dlmalloc.c @@ -5553,7 +5553,7 @@ struct mallinfo dlmallinfo(void) { #endif /* NO_MALLINFO */ #if !NO_MALLOC_STATS -void dlmalloc_stats() { +void dlmalloc_stats(void) { internal_malloc_stats(gm); } #endif /* NO_MALLOC_STATS */ -- 2.43.0
From: Simon Glass <simon.glass@canonical.com> Add STATIC_IF_MCHECK macro and rename the main allocation functions to *_impl versions (dlmalloc_impl, dlfree_impl, dlrealloc_impl, dlmemalign_impl, dlcalloc_impl) to prepare for mcheck wrappers. When MCHECK_HEAP_PROTECTION is not defined, the *_impl macros map directly to the original function names, so behavior is unchanged. Changes from original commit: - Adapted to new dlmalloc 2.8.6 function names (dl* vs mALLOc, etc.) - Updated all internal calls to use *_impl versions Signed-off-by: Eugene Uriev <eugeneuriev@gmail.com> Co-developed-by: Claude <noreply@anthropic.com> Signed-off-by: Simon Glass <simon.glass@canonical.com> (cherry picked from commit c82ff481159d2cf7e637c709df84883e09bba588) --- common/dlmalloc.c | 40 ++++++++++++++++++++++++++++------------ 1 file changed, 28 insertions(+), 12 deletions(-) diff --git a/common/dlmalloc.c b/common/dlmalloc.c index 03fd902c9f6..972cadd2e2f 100644 --- a/common/dlmalloc.c +++ b/common/dlmalloc.c @@ -557,6 +557,17 @@ MAX_RELEASE_CHECK_RATE default: 4095 unless not HAVE_MMAP #define DEBUG 1 #endif +#ifdef MCHECK_HEAP_PROTECTION +#define STATIC_IF_MCHECK static +#else +#define STATIC_IF_MCHECK +#define dlmalloc_impl dlmalloc +#define dlfree_impl dlfree +#define dlrealloc_impl dlrealloc +#define dlmemalign_impl dlmemalign +#define dlcalloc_impl dlcalloc +#endif + #define LACKS_FCNTL_H #define LACKS_UNISTD_H #define LACKS_SYS_PARAM_H @@ -4607,7 +4618,8 @@ static void* tmalloc_small(mstate m, size_t nb) { #if !ONLY_MSPACES -void* dlmalloc(size_t bytes) { +STATIC_IF_MCHECK +void* dlmalloc_impl(size_t bytes) { #ifdef __UBOOT__ #if CONFIG_IS_ENABLED(SYS_MALLOC_F) if (!(gd->flags & GD_FLG_FULL_MALLOC_INIT)) @@ -4764,7 +4776,8 @@ void* dlmalloc(size_t bytes) { /* ---------------------------- free --------------------------- */ -void dlfree(void* mem) { +STATIC_IF_MCHECK +void dlfree_impl(void* mem) { #ifdef __UBOOT__ #if CONFIG_IS_ENABLED(SYS_MALLOC_F) /* free() is a no-op - all 
the memory will be freed on relocation */ @@ -4885,7 +4898,8 @@ void dlfree(void* mem) { #endif /* FOOTERS */ } -void* dlcalloc(size_t n_elements, size_t elem_size) { +STATIC_IF_MCHECK +void* dlcalloc_impl(size_t n_elements, size_t elem_size) { void* mem; size_t req = 0; if (n_elements != 0) { @@ -4894,7 +4908,7 @@ void* dlcalloc(size_t n_elements, size_t elem_size) { (req / n_elements != elem_size)) req = MAX_SIZE_T; /* force downstream failure on overflow */ } - mem = dlmalloc(req); + mem = dlmalloc_impl(req); #ifdef __UBOOT__ #if CONFIG_IS_ENABLED(SYS_MALLOC_F) /* For pre-reloc simple malloc, just zero the memory directly */ @@ -5328,7 +5342,8 @@ static void internal_inspect_all(mstate m, #if !ONLY_MSPACES -void* dlrealloc(void* oldmem, size_t bytes) { +STATIC_IF_MCHECK +void* dlrealloc_impl(void* oldmem, size_t bytes) { #ifdef __UBOOT__ #if CONFIG_IS_ENABLED(SYS_MALLOC_F) if (!(gd->flags & GD_FLG_FULL_MALLOC_INIT)) { @@ -5343,14 +5358,14 @@ void* dlrealloc(void* oldmem, size_t bytes) { #endif void* mem = 0; if (oldmem == 0) { - mem = dlmalloc(bytes); + mem = dlmalloc_impl(bytes); } else if (bytes >= MAX_REQUEST) { MALLOC_FAILURE_ACTION; } #ifdef REALLOC_ZERO_BYTES_FREES else if (bytes == 0) { - dlfree(oldmem); + dlfree_impl(oldmem); } #endif /* REALLOC_ZERO_BYTES_FREES */ else { @@ -5433,7 +5448,8 @@ void* dlrealloc_in_place(void* oldmem, size_t bytes) { return mem; } -void* dlmemalign(size_t alignment, size_t bytes) { +STATIC_IF_MCHECK +void* dlmemalign_impl(size_t alignment, size_t bytes) { #ifdef __UBOOT__ #if CONFIG_IS_ENABLED(SYS_MALLOC_F) if (!(gd->flags & GD_FLG_FULL_MALLOC_INIT)) @@ -5441,7 +5457,7 @@ void* dlmemalign(size_t alignment, size_t bytes) { #endif #endif if (alignment <= MALLOC_ALIGNMENT) { - return dlmalloc(bytes); + return dlmalloc_impl(bytes); } return internal_memalign(gm, alignment, bytes); } @@ -5449,7 +5465,7 @@ void* dlmemalign(size_t alignment, size_t bytes) { int dlposix_memalign(void** pp, size_t alignment, size_t bytes) { void* 
mem = 0; if (alignment == MALLOC_ALIGNMENT) - mem = dlmalloc(bytes); + mem = dlmalloc_impl(bytes); else { size_t d = alignment / sizeof(void*); size_t r = alignment % sizeof(void*); @@ -5473,14 +5489,14 @@ void* dlvalloc(size_t bytes) { size_t pagesz; ensure_initialization(); pagesz = mparams.page_size; - return dlmemalign(pagesz, bytes); + return dlmemalign_impl(pagesz, bytes); } void* dlpvalloc(size_t bytes) { size_t pagesz; ensure_initialization(); pagesz = mparams.page_size; - return dlmemalign(pagesz, (bytes + pagesz - SIZE_T_ONE) & ~(pagesz - SIZE_T_ONE)); + return dlmemalign_impl(pagesz, (bytes + pagesz - SIZE_T_ONE) & ~(pagesz - SIZE_T_ONE)); } void** dlindependent_calloc(size_t n_elements, size_t elem_size, -- 2.43.0
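The renaming pattern in this patch can be illustrated outside dlmalloc with a minimal sketch. All names below (DEMO_HEAP_PROTECTION, STATIC_IF_DEMO, demo_alloc, demo_alloc_impl) are hypothetical stand-ins, and the standard malloc() is used as the backing allocator purely for illustration. When the protection switch is off, the *_impl name is a macro alias for the public name, so only one function is emitted; when it is on, the implementation becomes static and a public wrapper is expected to call it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical switch standing in for MCHECK_HEAP_PROTECTION */
#ifdef DEMO_HEAP_PROTECTION
#define STATIC_IF_DEMO static
#else
#define STATIC_IF_DEMO
/* Without protection the _impl name simply aliases the public name */
#define demo_alloc_impl demo_alloc
#endif

/* The allocator body is written once, under the _impl name */
STATIC_IF_DEMO
void *demo_alloc_impl(size_t bytes)
{
	return malloc(bytes);
}

#ifdef DEMO_HEAP_PROTECTION
/* With protection, a public wrapper surrounds the now-static body */
void *demo_alloc(size_t bytes)
{
	/* pre/post hooks would go here in a real wrapper */
	return demo_alloc_impl(bytes);
}
#endif
```

Either way, callers see a single demo_alloc() symbol, which is why behaviour is unchanged when the switch is not defined.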
From: Simon Glass <simon.glass@canonical.com> These fast helpers sometimes breach mem-chunk boundaries. Thus they trigger mcheck alarm. Standard ones are accurate though. When MCHECK_HEAP_PROTECTION is enabled, redefine MALLOC_ZERO and MALLOC_COPY to use standard memset/memcpy instead of the optimized versions that may access memory outside allocated chunks. Signed-off-by: Eugene Uriev <eugeneuriev@gmail.com> Co-developed-by: Claude <noreply@anthropic.com> Signed-off-by: Simon Glass <simon.glass@canonical.com> (cherry picked from commit dfba071ddc3e609e61770b34ab0115fbce05edb2) --- common/dlmalloc.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/common/dlmalloc.c b/common/dlmalloc.c index 972cadd2e2f..4f88e48f4b0 100644 --- a/common/dlmalloc.c +++ b/common/dlmalloc.c @@ -559,6 +559,10 @@ MAX_RELEASE_CHECK_RATE default: 4095 unless not HAVE_MMAP #ifdef MCHECK_HEAP_PROTECTION #define STATIC_IF_MCHECK static +#undef MALLOC_COPY +#undef MALLOC_ZERO +static inline void MALLOC_ZERO(void *p, size_t sz) { memset(p, 0, sz); } +static inline void MALLOC_COPY(void *dest, const void *src, size_t sz) { memcpy(dest, src, sz); } #else #define STATIC_IF_MCHECK #define dlmalloc_impl dlmalloc -- 2.43.0
From: Simon Glass <simon.glass@canonical.com> Add mcheck wrapper functions for dlmalloc, dlfree, dlrealloc, dlmemalign, and dlcalloc. When MCHECK_HEAP_PROTECTION is enabled, these wrappers call the mcheck hooks around the internal *_impl functions to provide heap corruption detection. Also add the mcheck() and mprobe() API functions. Changes from original commit: - Adapted for dlmalloc 2.8.6 function names (dl* instead of mALLOc) - Updated function signatures (void* instead of Void_t*) Signed-off-by: Eugene Uriev <eugeneuriev@gmail.com> Co-developed-by: Claude <noreply@anthropic.com> Signed-off-by: Simon Glass <simon.glass@canonical.com> (cherry picked from commit 151493a875071448e2582489f6fa84d1630b3368) --- common/dlmalloc.c | 63 +++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 63 insertions(+) diff --git a/common/dlmalloc.c b/common/dlmalloc.c index 4f88e48f4b0..102b6c2bf8d 100644 --- a/common/dlmalloc.c +++ b/common/dlmalloc.c @@ -5591,6 +5591,69 @@ size_t dlmalloc_usable_size(const void* mem) { return 0; } +#ifdef MCHECK_HEAP_PROTECTION +#include "mcheck_core.inc.h" + +void *dlmalloc(size_t bytes) +{ + size_t fullsz = mcheck_alloc_prehook(bytes); + void *p = dlmalloc_impl(fullsz); + + if (!p) + return p; + return mcheck_alloc_posthook(p, bytes); +} + +void dlfree(void *mem) { dlfree_impl(mcheck_free_prehook(mem)); } + +void *dlrealloc(void *oldmem, size_t bytes) +{ + if (bytes == 0) { + if (oldmem) + dlfree(oldmem); + return NULL; + } + + if (oldmem == NULL) + return dlmalloc(bytes); + + void *p = mcheck_reallocfree_prehook(oldmem); + size_t newsz = mcheck_alloc_prehook(bytes); + + p = dlrealloc_impl(p, newsz); + if (!p) + return p; + return mcheck_alloc_noclean_posthook(p, bytes); +} + +void *dlmemalign(size_t alignment, size_t bytes) +{ + return NULL; +} + +/* dlpvalloc, dlvalloc redirect to dlmemalign, so they need no wrapping */ + +void *dlcalloc(size_t n, size_t elem_size) +{ + /* NB: no overflow check here */ + size_t fullsz = 
mcheck_alloc_prehook(n * elem_size); + void *p = dlcalloc_impl(1, fullsz); + + if (!p) + return p; + return mcheck_alloc_noclean_posthook(p, n * elem_size); +} + +/* mcheck API */ +int mcheck(mcheck_abortfunc_t f) +{ + mcheck_initialize(f, 0); + return 0; +} + +enum mcheck_status mprobe(void *__ptr) { return mcheck_mprobe(__ptr); } +#endif /* MCHECK_HEAP_PROTECTION */ + #endif /* !ONLY_MSPACES */ /* ----------------------------- user mspaces ---------------------------- */ -- 2.43.0
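The dlcalloc() wrapper in this patch carries the comment "NB: no overflow check here" and multiplies n by elem_size directly. As a hedged sketch only, not part of the patch, one common way to guard such a product in plain C is shown below. The helper name checked_mul_size() is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not in the patch): compute n * elem_size and report
 * overflow instead of silently wrapping around.
 */
static bool checked_mul_size(size_t n, size_t elem_size, size_t *out)
{
	assert(out != NULL);
	if (elem_size != 0 && n > SIZE_MAX / elem_size)
		return false;	/* product would not fit in size_t */
	*out = n * elem_size;
	return true;
}
```

A wrapper using such a helper could return NULL on overflow before calling any allocation hook. Whether that is desirable in the mcheck path is a design decision for the series, not something this sketch settles.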
From: Simon Glass <simon.glass@canonical.com> Implement the dlmemalign wrapper function for mcheck heap protection. Uses mcheck_memalign_prehook() and mcheck_memalign_posthook() to properly handle aligned allocations. Changes from original commit: - Uses dlmemalign/dlmemalign_impl instead of mEMALIGn/mEMALIGn_impl Signed-off-by: Eugene Uriev <eugeneuriev@gmail.com> Co-developed-by: Claude <noreply@anthropic.com> Signed-off-by: Simon Glass <simon.glass@canonical.com> (cherry picked from commit ae838768d79cbb834c4a8a5f4810df373e58b622) --- common/dlmalloc.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/common/dlmalloc.c b/common/dlmalloc.c index 102b6c2bf8d..c9eb18787e8 100644 --- a/common/dlmalloc.c +++ b/common/dlmalloc.c @@ -5628,7 +5628,12 @@ void *dlrealloc(void *oldmem, size_t bytes) void *dlmemalign(size_t alignment, size_t bytes) { - return NULL; + size_t fullsz = mcheck_memalign_prehook(alignment, bytes); + void *p = dlmemalign_impl(alignment, fullsz); + + if (!p) + return p; + return mcheck_memalign_posthook(alignment, p, bytes); } /* dlpvalloc, dlvalloc redirect to dlmemalign, so they need no wrapping */ -- 2.43.0
From: Simon Glass <simon.glass@canonical.com> Add mcheck_pedantic_prehook() calls to dlmalloc, dlrealloc, dlmemalign, and dlcalloc wrapper functions. Also add the mcheck_pedantic() and mcheck_check_all() API functions. The pedantic mode is runtime controlled, so the registry hooks are called on every allocation operation. Changes from original commit: - Uses dl* function names instead of mALLOc style names Signed-off-by: Eugene Uriev <eugeneuriev@gmail.com> Co-developed-by: Claude <noreply@anthropic.com> Signed-off-by: Simon Glass <simon.glass@canonical.com> (cherry picked from commit 18c1bfafe0ccdd3229d91bbb07ed942e9f233f93) --- common/dlmalloc.c | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/common/dlmalloc.c b/common/dlmalloc.c index c9eb18787e8..4ee7c6c133f 100644 --- a/common/dlmalloc.c +++ b/common/dlmalloc.c @@ -5596,6 +5596,7 @@ size_t dlmalloc_usable_size(const void* mem) { void *dlmalloc(size_t bytes) { + mcheck_pedantic_prehook(); size_t fullsz = mcheck_alloc_prehook(bytes); void *p = dlmalloc_impl(fullsz); @@ -5608,6 +5609,7 @@ void dlfree(void *mem) { dlfree_impl(mcheck_free_prehook(mem)); } void *dlrealloc(void *oldmem, size_t bytes) { + mcheck_pedantic_prehook(); if (bytes == 0) { if (oldmem) dlfree(oldmem); @@ -5628,6 +5630,7 @@ void *dlrealloc(void *oldmem, size_t bytes) void *dlmemalign(size_t alignment, size_t bytes) { + mcheck_pedantic_prehook(); size_t fullsz = mcheck_memalign_prehook(alignment, bytes); void *p = dlmemalign_impl(alignment, fullsz); @@ -5640,6 +5643,7 @@ void *dlmemalign(size_t alignment, size_t bytes) void *dlcalloc(size_t n, size_t elem_size) { + mcheck_pedantic_prehook(); /* NB: no overflow check here */ size_t fullsz = mcheck_alloc_prehook(n * elem_size); void *p = dlcalloc_impl(1, fullsz); @@ -5650,12 +5654,20 @@ void *dlcalloc(size_t n, size_t elem_size) } /* mcheck API */ +int mcheck_pedantic(mcheck_abortfunc_t f) +{ + mcheck_initialize(f, 1); + return 0; +} + int mcheck(mcheck_abortfunc_t f) { 
mcheck_initialize(f, 0); return 0; } +void mcheck_check_all(void) { mcheck_pedantic_check(); } + enum mcheck_status mprobe(void *__ptr) { return mcheck_mprobe(__ptr); } #endif /* MCHECK_HEAP_PROTECTION */ -- 2.43.0
From: Simon Glass <simon.glass@canonical.com> The C runtime calls malloc() before starting main(), e.g. to get some memory to use for dynamic linking. If CONFIG_TPL_SYS_MALLOC_SIMPLE is enabled, the calloc() symbol is defined within U-Boot. The C runtime may call that too. Add the SYS_MALLOC_SIMPLE section to the new malloc.h header to redirect malloc, realloc, calloc, and memalign to their simple implementations when SYS_MALLOC_SIMPLE is enabled. Add a COMPILING_DLMALLOC guard so that dlmalloc.c can include malloc.h without hitting the SYS_MALLOC_SIMPLE redirects, which would otherwise cause conflicts with the dlfree/free macro definitions. Changes from original commit: - Applied to new dlmalloc 2.8.6 malloc.h header structure Signed-off-by: Simon Glass <sjg@chromium.org> Co-developed-by: Claude <noreply@anthropic.com> Signed-off-by: Simon Glass <simon.glass@canonical.com> (cherry picked from commit 22f87ef53045c19df9a770c4101ed3ba744c1b35) --- include/malloc.h | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+) diff --git a/include/malloc.h b/include/malloc.h index 72db7fdb507..f8f0dbb9b70 100644 --- a/include/malloc.h +++ b/include/malloc.h @@ -68,6 +68,24 @@ extern "C" { #define USE_DL_PREFIX #endif +/* + * When using simple malloc (SPL/TPL), redirect to simple implementations. + * Skip this when compiling dlmalloc.c itself to avoid conflicts. 
+ */ +#if CONFIG_IS_ENABLED(SYS_MALLOC_SIMPLE) +#define malloc malloc_simple +#define realloc realloc_simple +#define calloc calloc_simple +#define memalign memalign_simple +#if IS_ENABLED(CONFIG_VALGRIND) +#define free free_simple +#else +static inline void free(void *ptr) {} +#endif +void *calloc(size_t nmemb, size_t size); +void *realloc_simple(void *ptr, size_t size); +#else /* !SYS_MALLOC_SIMPLE || COMPILING_DLMALLOC */ + #ifndef USE_DL_PREFIX #define dlcalloc calloc #define dlfree free @@ -106,6 +124,7 @@ extern "C" { #define malloc_stats() dlmalloc_stats() #define malloc_usable_size(p) dlmalloc_usable_size(p) #endif /* USE_DL_PREFIX */ +#endif /* !SYS_MALLOC_SIMPLE || COMPILING_DLMALLOC */ #if !NO_MALLINFO #ifndef HAVE_USR_INCLUDE_MALLOC_H -- 2.43.0
From: Simon Glass <simon.glass@canonical.com> Move the malloc state initialisation from lazy init in sys_alloc() to explicit init in mem_malloc_init(). This allows is_initialized() to always return true for U-Boot, eliminating runtime checks. The initialisation sets up: - least_addr, seg.base, seg.size, seg.sflags - magic, release_checks, mflags - bins (via init_bins) - top chunk (via init_top) - footprint tracking Add a simplified sys_alloc() for small builds. Since the heap is pre-allocated with fixed size, sys_alloc() only needs to extend via sbrk() if space remains: no mmap, no multiple segments, no complex merging. The helper functions mmap_alloc(), mmap_resize(), prepend_alloc() and add_segment() are thus compiled out for non-sandbox builds. This is controlled by a new SIMPLE_SYSALLOC option, which is the default. Sandbox retains full functionality for testing. With this, the new dlmalloc is approximately at parity with the old one, e.g. about 400 bytes less code on Thumb2 (firefly-rk3288). There is a strange case here with a small number of boards which set up the full malloc() when CONFIG_SYS_MALLOC_SIMPLE is enabled. This cannot work. With CONFIG_SPL_SYS_MALLOC_SIMPLE, all malloc()/free()/realloc() calls are redirected to simple implementations via macros in the malloc.h header. In this case, mem_malloc_init() doesn't need to init the dlmalloc state structure (gm) since it will never be used. Initing _gm_ pulls in the entire malloc_state BSS structure (~472 bytes) plus initialisation code (~128 bytes), adding ~600 bytes to SPL on boards that use full malloc (K3 platforms with CONFIG_K3_LOAD_SYSFW). Skip the _gm_ init when SYS_MALLOC_SIMPLE is enabled. 
These boards call mem_malloc_init() even though it will have no effect: $ ./tools/qconfig.py -f CONFIG_K3_LOAD_SYSFW SPL_SYS_MALLOC_SIMPLE -l am62ax_evm_r5 am62px_evm_r5 am62x_beagleplay_r5 am62x_evm_r5 am62x_evm_r5_ethboot am62x_lpsk_r5 am64x_evm_r5 am68_sk_r5 am69_sk_r5 j721s2_evm_r5 j722s_evm_r5 j784s4_evm_r5 phycore_am62x_r5 phycore_am62x_r5_usbdfu phycore_am64x_r5 verdin-am62_r5 Co-developed-by: Claude <noreply@anthropic.com> Signed-off-by: Simon Glass <simon.glass@canonical.com> --- common/dlmalloc.c | 88 +++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 86 insertions(+), 2 deletions(-) diff --git a/common/dlmalloc.c b/common/dlmalloc.c index 4ee7c6c133f..9330848d059 100644 --- a/common/dlmalloc.c +++ b/common/dlmalloc.c @@ -595,6 +595,11 @@ static inline void MALLOC_COPY(void *dest, const void *src, size_t sz) { memcpy( #define INSECURE 1 #endif +/* Use simplified sys_alloc for non-sandbox builds */ +#if !IS_ENABLED(CONFIG_SANDBOX) +#define SIMPLE_SYSALLOC 1 +#endif + #define MALLOC_FAILURE_ACTION #define ABORT do {} while (1) @@ -2719,7 +2724,12 @@ static struct malloc_state _gm_; #endif /* !ONLY_MSPACES */ +#if defined(__UBOOT__) && SIMPLE_SYSALLOC +/* U-Boot initializes in mem_malloc_init() so is_initialized() is always true */ +#define is_initialized(M) 1 +#else #define is_initialized(M) ((M)->top != 0) +#endif /* -------------------------- system alloc setup ------------------------- */ @@ -3903,6 +3913,7 @@ static void internal_malloc_stats(mstate m) { requirements (especially in memalign). 
*/ +#if !defined(__UBOOT__) || !SIMPLE_SYSALLOC /* Malloc using mmap */ static void* mmap_alloc(mstate m, size_t nb) { size_t mmsize = mmap_align(nb + SIX_SIZE_T_SIZES + CHUNK_ALIGN_MASK); @@ -3934,7 +3945,9 @@ static void* mmap_alloc(mstate m, size_t nb) { } return 0; } +#endif /* !defined(__UBOOT__) || !SIMPLE_SYSALLOC */ +#if !defined(__UBOOT__) || !NO_REALLOC_IN_PLACE /* Realloc using mmap */ static mchunkptr mmap_resize(mstate m, mchunkptr oldp, size_t nb, int flags) { size_t oldsize = chunksize(oldp); @@ -3969,12 +3982,13 @@ static mchunkptr mmap_resize(mstate m, mchunkptr oldp, size_t nb, int flags) { } return 0; } +#endif /* !NO_REALLOC_IN_PLACE */ /* -------------------------- mspace management -------------------------- */ /* Initialize top chunk and its size */ -static void init_top(mstate m, mchunkptr p, size_t psize) { +static void __maybe_unused init_top(mstate m, mchunkptr p, size_t psize) { /* Ensure alignment */ size_t offset = align_offset(chunk2mem(p)); p = (mchunkptr)((char*)p + offset); @@ -3989,7 +4003,7 @@ static void init_top(mstate m, mchunkptr p, size_t psize) { } /* Initialize bins for a new mstate that is otherwise zeroed out */ -static void init_bins(mstate m) { +static void __maybe_unused init_bins(mstate m) { /* Establish circular links for smallbins */ bindex_t i; for (i = 0; i < NSMALLBINS; ++i) { @@ -4017,6 +4031,7 @@ static void reset_on_error(mstate m) { } #endif /* PROCEED_ON_ERROR */ +#if !defined(__UBOOT__) || !SIMPLE_SYSALLOC /* Allocate chunk and prepend remainder with chunk in successor base. 
*/ static void* prepend_alloc(mstate m, char* newbase, char* oldbase, size_t nb) { @@ -4111,9 +4126,62 @@ static void add_segment(mstate m, char* tbase, size_t tsize, flag_t mmapped) { check_top_chunk(m, m->top); } +#endif /* !__UBOOT__ || !SIMPLE_SYSALLOC */ /* -------------------------- System allocation -------------------------- */ +#if defined(__UBOOT__) && SIMPLE_SYSALLOC +/* + * U-Boot simplified sys_alloc: The heap is pre-allocated with fixed size in + * mem_malloc_init(), so we can only extend via sbrk() if space remains. + * No mmap, no multiple segments, no complex merging needed. + */ +static void* sys_alloc(mstate m, size_t nb) { + char* tbase; + size_t asize; + size_t tsize; + + asize = granularity_align(nb + SYS_ALLOC_PADDING); + if (asize <= nb) + return NULL; /* wraparound */ + + tbase = (char *)CALL_MORECORE(asize); + if (tbase == CMFAIL) { + MALLOC_FAILURE_ACTION; + return NULL; + } + tsize = asize; + + m->footprint += tsize; + if (m->footprint > m->max_footprint) + m->max_footprint = m->footprint; + + /* Extend the top chunk - sbrk returns contiguous memory */ + if (tbase == m->seg.base + m->seg.size) { + m->seg.size += tsize; + init_top(m, m->top, m->topsize + tsize); + } else { + /* Non-contiguous - shouldn't happen with U-Boot's simple sbrk */ + return NULL; + } + + if (nb < m->topsize) { + size_t rsize = m->topsize -= nb; + mchunkptr p = m->top; + mchunkptr r = m->top = chunk_plus_offset(p, nb); + r->head = rsize | PINUSE_BIT; + set_size_and_pinuse_of_inuse_chunk(m, p, nb); + check_top_chunk(m, m->top); + check_malloced_chunk(m, chunk2mem(p), nb); + return chunk2mem(p); + } + + MALLOC_FAILURE_ACTION; + return NULL; +} + +#else /* !__UBOOT__ || !SIMPLE_SYSALLOC */ + /* Get memory from system using MORECORE or MMAP */ static void* sys_alloc(mstate m, size_t nb) { char* tbase = CMFAIL; @@ -4322,6 +4390,7 @@ static void* sys_alloc(mstate m, size_t nb) { MALLOC_FAILURE_ACTION; return 0; } +#endif /* !__UBOOT__ || !SIMPLE_SYSALLOC */ /* 
----------------------- system deallocation -------------------------- */ @@ -6601,6 +6670,21 @@ void mem_malloc_init(ulong start, ulong size) #if CONFIG_IS_ENABLED(SYS_MALLOC_CLEAR_ON_INIT) memset((void *)mem_malloc_start, '\0', size); #endif + +#if !CONFIG_IS_ENABLED(SYS_MALLOC_SIMPLE) + /* Initialize the malloc state so is_initialized() is true */ + gm->least_addr = (char *)mem_malloc_start; + gm->seg.base = (char *)mem_malloc_start; + gm->seg.size = size; + gm->seg.sflags = 0; /* not mmapped */ + gm->magic = mparams.magic; + gm->release_checks = MAX_RELEASE_CHECK_RATE; + gm->mflags = mparams.default_mflags; + init_bins(gm); + init_top(gm, (mchunkptr)mem_malloc_start, size - TOP_FOOT_SIZE); + gm->footprint = size; + gm->max_footprint = size; +#endif } void malloc_enable_testing(int max_allocs) -- 2.43.0
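The simplified sys_alloc() in this patch relies on two properties: the heap is a fixed, pre-allocated region, and each extension returned by sbrk() must start exactly where the current segment ends. The sketch below illustrates just those two checks in isolation. It is not U-Boot's sbrk() and does not mirror its signature; demo_heap, demo_sbrk(), demo_seg and demo_extend() are all hypothetical names, and the 256-byte region is an arbitrary choice.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fixed heap region, standing in for U-Boot's malloc area */
static unsigned char demo_heap[256];
static size_t demo_brk;	/* offset of the current break within demo_heap */

#define DEMO_SBRK_FAIL ((void *)-1)

/*
 * Minimal sbrk over a fixed region: hand out the next 'increment' bytes if
 * they fit, otherwise fail. Successive blocks are always contiguous.
 */
static void *demo_sbrk(size_t increment)
{
	void *old = demo_heap + demo_brk;

	if (increment > sizeof(demo_heap) - demo_brk)
		return DEMO_SBRK_FAIL;
	demo_brk += increment;
	return old;
}

/* Hypothetical single segment, loosely modelled on the seg.base/seg.size pair */
struct demo_seg {
	unsigned char *base;
	size_t size;
};

/* Grow the segment only if the new memory directly follows it */
static int demo_extend(struct demo_seg *seg, size_t nbytes)
{
	void *tbase = demo_sbrk(nbytes);

	if (tbase == DEMO_SBRK_FAIL)
		return -1;	/* no space left in the fixed region */
	if ((unsigned char *)tbase != seg->base + seg->size)
		return -1;	/* non-contiguous: not handled in the simple path */
	seg->size += nbytes;
	return 0;
}
```

With a single contiguous segment there is nothing to merge or prepend, which is the reason the patch can compile out the multi-segment helpers.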
From: Simon Glass <simon.glass@canonical.com> When building boards that use CONFIG_SPL_SYS_MALLOC_SIMPLE (like qemu-x86_64), we need to avoid a conflict between the stub free() function defined by malloc.h and the real free() defined by dlmalloc.c. Fix this by defining COMPILING_DLMALLOC in dlmalloc.c before including malloc.h and adding a guard to the latter. Co-developed-by: Claude <noreply@anthropic.com> Signed-off-by: Simon Glass <simon.glass@canonical.com> --- common/dlmalloc.c | 2 ++ include/malloc.h | 2 +- 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a/common/dlmalloc.c b/common/dlmalloc.c index 9330848d059..869473b2bd1 100644 --- a/common/dlmalloc.c +++ b/common/dlmalloc.c @@ -603,6 +603,8 @@ static inline void MALLOC_COPY(void *dest, const void *src, size_t sz) { memcpy( #define MALLOC_FAILURE_ACTION #define ABORT do {} while (1) +#define COMPILING_DLMALLOC + #include <log.h> #include <malloc.h> #include <mapmem.h> diff --git a/include/malloc.h b/include/malloc.h index f8f0dbb9b70..997651e5c9c 100644 --- a/include/malloc.h +++ b/include/malloc.h @@ -72,7 +72,7 @@ extern "C" { * When using simple malloc (SPL/TPL), redirect to simple implementations. * Skip this when compiling dlmalloc.c itself to avoid conflicts. */ -#if CONFIG_IS_ENABLED(SYS_MALLOC_SIMPLE) +#if CONFIG_IS_ENABLED(SYS_MALLOC_SIMPLE) && !defined(COMPILING_DLMALLOC) #define malloc malloc_simple #define realloc realloc_simple #define calloc calloc_simple -- 2.43.0
From: Simon Glass <simon.glass@canonical.com> There are quite a few options available which can reduce code size. Most of them only make sense in SPL. Add a Kconfig option to enable a smaller dlmalloc for U-Boot proper and SPL. Signed-off-by: Simon Glass <simon.glass@canonical.com> --- Kconfig | 47 +++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 47 insertions(+) diff --git a/Kconfig b/Kconfig index 357f027cc97..c4a65597035 100644 --- a/Kconfig +++ b/Kconfig @@ -464,6 +464,53 @@ config SYS_MALLOC_DEFAULT_TO_INIT If such a scenario is sought choose yes. +config SYS_MALLOC_SMALL + bool "Optimise malloc for code size" + help + Enable code-size optimisations for dlmalloc. This: + + - Disables tree bins for allocations >= 256 bytes, using simple + linked-list bins instead. This changes large-allocation performance + from O(log n) to O(n) but saves ~1.5-2KB. + + - Simplifies memalign() by removing fallback retry logic that attempts + multiple allocation strategies when initial over-allocation fails. + This saves ~100-150 bytes. + + - Disables in-place realloc optimisation, which resizes allocations + without copying if space permits. This saves ~200 bytes. + + - Uses static malloc parameters instead of runtime-configurable ones. + + These optimisations may increase fragmentation and reduce performance + for workloads with many large or aligned allocations, but are suitable + for most U-Boot use cases where code size is more important. + + If unsure, say N. + +config SPL_SYS_MALLOC_SMALL + bool "Optimise malloc for code size in SPL" + depends on SPL && !SPL_SYS_MALLOC_SIMPLE + default y + help + Enable code-size optimisations for dlmalloc in SPL. This: + + - Disables tree bins for allocations >= 256 bytes, using simple + linked-list bins instead. This changes large-allocation performance + from O(log n) to O(n) but saves ~1.5-2KB. + + - Simplifies memalign() by removing fallback retry logic. This saves + ~100-150 bytes. 
+ + - Disables in-place realloc optimisation. This saves ~200 bytes. + + - Uses static malloc parameters instead of runtime-configurable ones. + + SPL typically has predictable memory usage where these optimisations + have minimal impact, making the code size savings worthwhile. + + If unsure, say Y to minimize SPL code size. + config TOOLS_DEBUG bool "Enable debug information for tools" help -- 2.43.0
From: Simon Glass <simon.glass@canonical.com> Add a new NO_REALLOC_IN_PLACE option that disables in-place realloc optimization. When enabled via CONFIG_SYS_MALLOC_SMALL, realloc() always allocates new memory, copies data, and frees the old block instead of trying to extend the existing allocation. This saves about 500 bytes by eliminating try_realloc_chunk() and mmap_resize() functions. When unit tests are enabled, the extra realloc logic is included. Co-developed-by: Claude <noreply@anthropic.com> Signed-off-by: Simon Glass <simon.glass@canonical.com> --- common/dlmalloc.c | 28 ++++++++++++++++++++++++++-- 1 file changed, 26 insertions(+), 2 deletions(-) diff --git a/common/dlmalloc.c b/common/dlmalloc.c index 869473b2bd1..4439d278188 100644 --- a/common/dlmalloc.c +++ b/common/dlmalloc.c @@ -595,6 +595,10 @@ static inline void MALLOC_COPY(void *dest, const void *src, size_t sz) { memcpy( #define INSECURE 1 #endif +#if CONFIG_IS_ENABLED(SYS_MALLOC_SMALL) +#define NO_REALLOC_IN_PLACE 1 +#endif + /* Use simplified sys_alloc for non-sandbox builds */ #if !IS_ENABLED(CONFIG_SANDBOX) #define SIMPLE_SYSALLOC 1 @@ -807,6 +811,9 @@ ulong mem_malloc_brk; #ifndef NO_SEGMENT_TRAVERSAL #define NO_SEGMENT_TRAVERSAL 0 #endif /* NO_SEGMENT_TRAVERSAL */ +#ifndef NO_REALLOC_IN_PLACE +#define NO_REALLOC_IN_PLACE 0 +#endif /* NO_REALLOC_IN_PLACE */ /* mallopt tuning options. 
SVID/XPG defines four standard parameter @@ -3984,7 +3991,7 @@ static mchunkptr mmap_resize(mstate m, mchunkptr oldp, size_t nb, int flags) { } return 0; } -#endif /* !NO_REALLOC_IN_PLACE */ +#endif /* !defined(__UBOOT__) || !NO_REALLOC_IN_PLACE */ /* -------------------------- mspace management -------------------------- */ @@ -5002,6 +5009,7 @@ void* dlcalloc_impl(size_t n_elements, size_t elem_size) { /* ------------ Internal support for realloc, memalign, etc -------------- */ +#if !defined(__UBOOT__) || !NO_REALLOC_IN_PLACE /* Try to realloc; only in-place unless can_move true */ static mchunkptr try_realloc_chunk(mstate m, mchunkptr p, size_t nb, int can_move) { @@ -5081,6 +5089,7 @@ static mchunkptr try_realloc_chunk(mstate m, mchunkptr p, size_t nb, } return newp; } +#endif /* !defined(__UBOOT__) || !NO_REALLOC_IN_PLACE */ static void* internal_memalign(mstate m, size_t alignment, size_t bytes) { void* mem = 0; @@ -5444,8 +5453,9 @@ void* dlrealloc_impl(void* oldmem, size_t bytes) { } #endif /* REALLOC_ZERO_BYTES_FREES */ else { - size_t nb = request2size(bytes); mchunkptr oldp = mem2chunk(oldmem); +#if !defined(__UBOOT__) || !NO_REALLOC_IN_PLACE + size_t nb = request2size(bytes); #if ! FOOTERS mstate m = gm; #else /* FOOTERS */ @@ -5484,10 +5494,23 @@ void* dlrealloc_impl(void* oldmem, size_t bytes) { } } } +#else /* defined(__UBOOT__) && NO_REALLOC_IN_PLACE */ + mem = dlmalloc_impl(bytes); + if (mem != 0) { + size_t oc = chunksize(oldp) - overhead_for(oldp); + memcpy(mem, oldmem, (oc < bytes)? 
oc : bytes); +#ifdef __UBOOT__ + VALGRIND_MALLOCLIKE_BLOCK(mem, bytes, SIZE_SZ, false); + VALGRIND_FREELIKE_BLOCK(oldmem, SIZE_SZ); +#endif + dlfree_impl(oldmem); + } +#endif /* !defined(__UBOOT__) || !NO_REALLOC_IN_PLACE */ } return mem; } +#if !defined(__UBOOT__) || !NO_REALLOC_IN_PLACE void* dlrealloc_in_place(void* oldmem, size_t bytes) { void* mem = 0; if (oldmem != 0) { @@ -5522,6 +5545,7 @@ void* dlrealloc_in_place(void* oldmem, size_t bytes) { } return mem; } +#endif /* !defined(__UBOOT__) || !NO_REALLOC_IN_PLACE */ STATIC_IF_MCHECK void* dlmemalign_impl(size_t alignment, size_t bytes) { -- 2.43.0
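The NO_REALLOC_IN_PLACE path in this patch reduces realloc() to three steps: allocate a new block, copy the smaller of the old and new sizes, then free the old block. The sketch below shows the same sequence on top of the standard malloc(), assuming a hypothetical size header (union demo_hdr) that plays the role of chunksize() minus overhead. None of the demo_* names come from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical header recording the usable size; the union keeps alignment */
union demo_hdr {
	size_t size;
	max_align_t align;
};

static void *demo_alloc(size_t bytes)
{
	union demo_hdr *h;

	if (bytes > SIZE_MAX - sizeof(*h))
		return NULL;	/* header plus payload would overflow */
	h = malloc(sizeof(*h) + bytes);
	if (!h)
		return NULL;
	h->size = bytes;
	return h + 1;
}

static void demo_free(void *mem)
{
	if (mem)
		free((union demo_hdr *)mem - 1);
}

/* Always-move realloc: allocate, copy min(old, new) bytes, free the old block */
static void *demo_realloc_copy(void *oldmem, size_t bytes)
{
	void *mem;
	size_t oc;

	if (!oldmem)
		return demo_alloc(bytes);

	mem = demo_alloc(bytes);
	if (mem) {
		oc = ((union demo_hdr *)oldmem - 1)->size;
		memcpy(mem, oldmem, oc < bytes ? oc : bytes);
		demo_free(oldmem);
	}
	/* On failure the old block is left untouched and NULL is returned */
	return mem;
}
```

The cost of this approach is that the old and new blocks must coexist briefly, so peak heap usage during a resize is higher than with an in-place extension.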
From: Simon Glass <simon.glass@canonical.com> Add a new NO_TREE_BINS option to disable binary-tree bins for large allocations (>= 256 bytes). While this is an invasive change, it saves about 1.25K of code size on arm64 (as well as 250 bytes of data). When enabled, all large chunks use a simple doubly-linked list instead of tree bins, trading O(log n) performance for smaller code size. The tradeoff here is that large allocations use O(n) search instead of O(log n) and fragmentation could also become worse. So performance will suffer when lots of large allocations and frees are done. This is rare in SPL. Implementation: - Add a dedicated mchunkptr largebin field to malloc_state - Replace treebins[NTREEBINS] array with a single linked-list pointer - Implement simplified insert/unlink operations using largebin list - Update allocation functions (tmalloc_small/large) for linear search - Update heap checking functions (do_check_treebin, bin_find) to handle linked list traversal instead of tree traversal It is enabled by CONFIG_SYS_MALLOC_SMALL, i.e. by default in SPL. 
Co-developed-by: Claude <noreply@anthropic.com> Signed-off-by: Simon Glass <simon.glass@canonical.com> --- common/dlmalloc.c | 157 +++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 154 insertions(+), 3 deletions(-) diff --git a/common/dlmalloc.c b/common/dlmalloc.c index 4439d278188..13ae0e10918 100644 --- a/common/dlmalloc.c +++ b/common/dlmalloc.c @@ -597,6 +597,9 @@ static inline void MALLOC_COPY(void *dest, const void *src, size_t sz) { memcpy( #if CONFIG_IS_ENABLED(SYS_MALLOC_SMALL) #define NO_REALLOC_IN_PLACE 1 +#define NO_TREE_BINS 1 +#else +#define NO_TREE_BINS 0 #endif /* Use simplified sys_alloc for non-sandbox builds */ @@ -2686,7 +2689,11 @@ struct malloc_state { size_t release_checks; size_t magic; mchunkptr smallbins[(NSMALLBINS+1)*2]; +#if defined(__UBOOT__) && NO_TREE_BINS + mchunkptr largebin; /* Single linked list for all large chunks */ +#else tbinptr treebins[NTREEBINS]; +#endif size_t footprint; size_t max_footprint; size_t footprint_limit; /* zero means no limit */ @@ -2914,7 +2921,9 @@ static void do_check_mmapped_chunk(mstate m, mchunkptr p); static void do_check_inuse_chunk(mstate m, mchunkptr p); static void do_check_free_chunk(mstate m, mchunkptr p); static void do_check_malloced_chunk(mstate m, void* mem, size_t s); +#if !defined(__UBOOT__) || !NO_TREE_BINS static void do_check_tree(mstate m, tchunkptr t); +#endif static void do_check_treebin(mstate m, bindex_t i); static void do_check_smallbin(mstate m, bindex_t i); static void do_check_malloc_state(mstate m); @@ -2924,6 +2933,8 @@ static size_t traverse_and_check(mstate m); /* ---------------------------- Indexing Bins ---------------------------- */ +/* When NO_TREE_BINS is enabled, large chunks use a single linked list + in treebin[0] instead of the tree structure */ #define is_small(s) (((s) >> SMALLBIN_SHIFT) < NSMALLBINS) #define small_index(s) (bindex_t)((s) >> SMALLBIN_SHIFT) #define small_index2size(i) ((i) << SMALLBIN_SHIFT) @@ -3397,6 +3408,7 @@ static void 
do_check_malloced_chunk(mstate m, void* mem, size_t s) { } } +#if !defined(__UBOOT__) || !NO_TREE_BINS /* Check a tree and its subtrees. */ static void do_check_tree(mstate m, tchunkptr t) { tchunkptr head = 0; @@ -3447,9 +3459,28 @@ static void do_check_tree(mstate m, tchunkptr t) { } while (u != t); assert(head != 0); } +#endif /* Check all the chunks in a treebin. */ static void do_check_treebin(mstate m, bindex_t i) { +#if defined(__UBOOT__) && NO_TREE_BINS + /* With NO_TREE_BINS, only index 0 is used for the large bin list */ + if (i == 0) { + mchunkptr p = m->largebin; + if (p != 0) { + /* Check the linked list */ + mchunkptr start = p; + do { + do_check_any_chunk(m, p); + assert(!is_inuse(p)); + assert(!next_pinuse(p)); + assert(p->fd->bk == p); + assert(p->bk->fd == p); + p = p->fd; + } while (p != start); + } + } +#else tbinptr* tb = treebin_at(m, i); tchunkptr t = *tb; int empty = (m->treemap & (1U << i)) == 0; @@ -3457,6 +3488,7 @@ static void do_check_treebin(mstate m, bindex_t i) { assert(empty); if (!empty) do_check_tree(m, t); +#endif } /* Check all the chunks in a smallbin. 
*/ @@ -3498,6 +3530,18 @@ static int bin_find(mstate m, mchunkptr x) { } } else { +#if defined(__UBOOT__) && NO_TREE_BINS + /* With NO_TREE_BINS, all large chunks are in largebin list */ + if (m->largebin != 0) { + mchunkptr p = m->largebin; + mchunkptr start = p; + do { + if (p == x) + return 1; + p = p->fd; + } while (p != start); + } +#else bindex_t tidx; compute_tree_index(size, tidx); if (treemap_is_marked(m, tidx)) { @@ -3515,6 +3559,7 @@ static int bin_find(mstate m, mchunkptr x) { } while ((u = u->fd) != t); } } +#endif } return 0; } @@ -3744,6 +3789,53 @@ static void internal_malloc_stats(mstate m) { /* ------------------------- Operations on trees ------------------------- */ +#if defined(__UBOOT__) && NO_TREE_BINS +/* When tree bins are disabled, use a simple doubly-linked list for all large chunks */ +static void insert_large_chunk(mstate M, tchunkptr X, size_t S) { + mchunkptr XP = (mchunkptr)(X); + mchunkptr F = M->largebin; + (void)S; /* unused in NO_TREE_BINS mode */ + if (F == 0) { + M->largebin = XP; + XP->fd = XP->bk = XP; + } + else if (RTCHECK(ok_address(M, F))) { + mchunkptr B = F->bk; + if (RTCHECK(ok_address(M, B))) { + XP->fd = F; + XP->bk = B; + F->bk = XP; + B->fd = XP; + } + else { + CORRUPTION_ERROR_ACTION(M); + } + } + else { + CORRUPTION_ERROR_ACTION(M); + } +} + +static void unlink_large_chunk(mstate M, tchunkptr X) { + mchunkptr XP = (mchunkptr)(X); + mchunkptr F = XP->fd; + mchunkptr B = XP->bk; + if (F == XP) { + M->largebin = 0; + } + else if (RTCHECK(ok_address(M, F) && F->bk == XP && ok_address(M, B) && B->fd == XP)) { + F->bk = B; + B->fd = F; + if (M->largebin == XP) + M->largebin = F; + } + else { + CORRUPTION_ERROR_ACTION(M); + } +} + +#else /* !defined(__UBOOT__) || !NO_TREE_BINS */ + /* Insert chunk into tree */ #define insert_large_chunk(M, X, S) {\ tbinptr* H;\ @@ -3884,6 +3976,8 @@ static void internal_malloc_stats(mstate m) { }\ } +#endif /* !defined(__UBOOT__) || !NO_TREE_BINS */ + /* Relays to large vs small bin 
operations */ #define insert_chunk(M, P, S)\ @@ -4593,7 +4687,26 @@ static void dispose_chunk(mstate m, mchunkptr p, size_t psize) { static void* tmalloc_large(mstate m, size_t nb) { tchunkptr v = 0; size_t rsize = -nb; /* Unsigned negation */ +#if !defined(__UBOOT__) || !NO_TREE_BINS tchunkptr t; +#endif +#if defined(__UBOOT__) && NO_TREE_BINS + /* With NO_TREE_BINS, do a linear search through largebin list */ + if (m->largebin != 0) { + mchunkptr p = m->largebin; + mchunkptr start = p; + do { + size_t trem = chunksize(p) - nb; + if (trem < rsize) { + rsize = trem; + v = (tchunkptr)p; + if (rsize == 0) + break; + } + p = p->fd; + } while (p != start); + } +#else bindex_t idx; compute_tree_index(nb, idx); if ((t = *treebin_at(m, idx)) != 0) { @@ -4637,6 +4750,7 @@ static void* tmalloc_large(mstate m, size_t nb) { } t = leftmost_child(t); } +#endif /* If dv is a better fit, return 0 so malloc will use it */ if (v != 0 && rsize < (size_t)(m->dvsize - nb)) { @@ -4662,8 +4776,32 @@ static void* tmalloc_large(mstate m, size_t nb) { /* allocate a small request from the best fitting chunk in a treebin */ static void* tmalloc_small(mstate m, size_t nb) { - tchunkptr t, v; +#if !defined(__UBOOT__) || !NO_TREE_BINS + tchunkptr t; +#endif + tchunkptr v; size_t rsize; +#if defined(__UBOOT__) && NO_TREE_BINS + /* With NO_TREE_BINS, use largebin list for best fit search */ + if (m->largebin != 0) { + mchunkptr p = m->largebin; + mchunkptr best = p; + rsize = chunksize(p) - nb; + /* Scan the list for the best fit */ + mchunkptr start = p; + while ((p = p->fd) != start) { + size_t trem = chunksize(p) - nb; + if (trem < rsize) { + rsize = trem; + best = p; + } + } + v = (tchunkptr)best; + } + else { + return 0; + } +#else bindex_t i; binmap_t leastbit = least_bit(m->treemap); compute_bit2idx(leastbit, i); @@ -4677,6 +4815,7 @@ static void* tmalloc_small(mstate m, size_t nb) { v = t; } } +#endif if (RTCHECK(ok_address(m, v))) { mchunkptr r = chunk_plus_offset(v, nb); @@ -4794,7 
+4933,13 @@ void* dlmalloc_impl(size_t bytes) { goto postaction; } - else if (gm->treemap != 0 && (mem = tmalloc_small(gm, nb)) != 0) { + else if ( +#if defined(__UBOOT__) && NO_TREE_BINS + gm->largebin != 0 && +#else + gm->treemap != 0 && +#endif + (mem = tmalloc_small(gm, nb)) != 0) { check_malloced_chunk(gm, mem, nb); goto postaction; } @@ -4804,7 +4949,13 @@ void* dlmalloc_impl(size_t bytes) { nb = MAX_SIZE_T; /* Too big to allocate. Force failure (in sys alloc) */ else { nb = pad_request(bytes); - if (gm->treemap != 0 && (mem = tmalloc_large(gm, nb)) != 0) { + if ( +#if defined(__UBOOT__) && NO_TREE_BINS + gm->largebin != 0 && +#else + gm->treemap != 0 && +#endif + (mem = tmalloc_large(gm, nb)) != 0) { check_malloced_chunk(gm, mem, nb); goto postaction; } -- 2.43.0
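The linear best-fit search that replaces the tree walk in tmalloc_large() can be sketched in isolation. This is a minimal standalone illustration and not the U-Boot code: struct node, list_insert() and best_fit() are hypothetical names. Only the circular fd/bk list and the unsigned-negation initial value for rsize are taken from the patch above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a free chunk; not dlmalloc's malloc_chunk */
struct node {
	size_t size;
	struct node *fd;
	struct node *bk;
};

/* Append n to the circular list whose first element is *head */
static void list_insert(struct node **head, struct node *n)
{
	struct node *first = *head;

	assert(n != NULL);
	if (!first) {
		*head = n;
		n->fd = n->bk = n;
		return;
	}
	n->fd = first;
	n->bk = first->bk;
	first->bk->fd = n;
	first->bk = n;
}

/*
 * Return the smallest node with size >= nb (nb > 0), or NULL if none fits.
 * rsize starts at -nb (unsigned), so a node smaller than nb wraps around to
 * a remainder that is never below the current best.
 */
static struct node *best_fit(struct node *head, size_t nb)
{
	struct node *best = NULL, *p = head;
	size_t rsize = -nb;

	if (!head)
		return NULL;
	do {
		size_t rem = p->size - nb;

		if (rem < rsize) {
			rsize = rem;
			best = p;
			if (!rsize)
				break;	/* exact fit, stop early */
		}
		p = p->fd;
	} while (p != head);

	return best;
}
```

Every lookup visits each free large chunk at most once, which is the O(n) cost traded for the smaller code size.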
From: Simon Glass <simon.glass@canonical.com> Add a new SIMPLE_MEMALIGN option to remove the fallback-retry logic from memalign(), reducing code size in SPL. The fallback mechanism attempts multiple allocation strategies: 1. Over-allocate to guarantee finding aligned space 2. If that fails, allocate exact size and check if aligned 3. If not aligned, free and retry with calculated extra space While this fallback is useful in low-memory situations, SPL typically has predictable memory usage and sufficient heap space for the initial over-allocation to succeed. The fallback adds code complexity without obvious practical benefit. For example, this reduces SPL code size on imx8mp_venice by 74 bytes. Co-developed-by: Claude <noreply@anthropic.com> Signed-off-by: Simon Glass <simon.glass@canonical.com> --- common/dlmalloc.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/common/dlmalloc.c b/common/dlmalloc.c index 13ae0e10918..65bfb97e1db 100644 --- a/common/dlmalloc.c +++ b/common/dlmalloc.c @@ -597,9 +597,11 @@ static inline void MALLOC_COPY(void *dest, const void *src, size_t sz) { memcpy( #if CONFIG_IS_ENABLED(SYS_MALLOC_SMALL) #define NO_REALLOC_IN_PLACE 1 +#define SIMPLE_MEMALIGN 1 #define NO_TREE_BINS 1 #else #define NO_TREE_BINS 0 +#define SIMPLE_MEMALIGN 0 #endif /* Use simplified sys_alloc for non-sandbox builds */ @@ -5260,7 +5262,7 @@ static void* internal_memalign(mstate m, size_t alignment, size_t bytes) { size_t nb = request2size(bytes); size_t req = nb + alignment + MIN_CHUNK_SIZE - CHUNK_OVERHEAD; mem = internal_malloc(m, req); -#ifdef __UBOOT__ +#if defined(__UBOOT__) && !SIMPLE_MEMALIGN /* * The attempt to over-allocate (with a size large enough to guarantee the * ability to find an aligned region within allocated memory) failed. -- 2.43.0
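The single-attempt strategy that remains with SIMPLE_MEMALIGN (step 1 only, with no retry) can be illustrated with a generic over-allocation scheme on top of the C library malloc(). This is a sketch under stated assumptions, not dlmalloc's internal_memalign(): sketch_memalign() and sketch_memalign_free() are hypothetical helpers, and the raw pointer is stashed just below the aligned address rather than handled through chunk headers.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocate bytes aligned to a power-of-two alignment */
static void *sketch_memalign(size_t alignment, size_t bytes)
{
	uintptr_t raw, aligned;
	void *mem;

	assert(alignment && !(alignment & (alignment - 1)));

	/* Reject requests whose padded size would overflow size_t */
	if (bytes > SIZE_MAX - alignment - sizeof(void *))
		return NULL;

	/* Over-allocate: worst-case shift plus room for the saved pointer */
	mem = malloc(bytes + alignment - 1 + sizeof(void *));
	if (!mem)
		return NULL;	/* single attempt: no fallback retry */

	raw = (uintptr_t)mem + sizeof(void *);
	aligned = (raw + alignment - 1) & ~(uintptr_t)(alignment - 1);
	((void **)aligned)[-1] = mem;	/* remember what to pass to free() */

	return (void *)aligned;
}

/* Release a block obtained from sketch_memalign() */
static void sketch_memalign_free(void *ptr)
{
	if (ptr)
		free(((void **)ptr)[-1]);
}
```

If the over-allocation fails, the caller simply gets NULL, which mirrors the trade-off described in the commit message: less code, at the cost of failing earlier when the heap is nearly full.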
From: Simon Glass <simon.glass@canonical.com> The insert_small_chunk() and unlink_first_small_chunk() macros are expanded inline at multiple places in the code. Provide an option to convert these to functions, so the compiler can try to reduce code size. Add braces to the insert_chunk macro. This reduces SPL code size on imx8mp_venice by about 208 bytes. Co-developed-by: Claude <noreply@anthropic.com> Signed-off-by: Simon Glass <simon.glass@canonical.com> --- common/dlmalloc.c | 50 +++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 50 insertions(+) diff --git a/common/dlmalloc.c b/common/dlmalloc.c index 65bfb97e1db..54fd2e4a08a 100644 --- a/common/dlmalloc.c +++ b/common/dlmalloc.c @@ -599,6 +599,7 @@ static inline void MALLOC_COPY(void *dest, const void *src, size_t sz) { memcpy( #define NO_REALLOC_IN_PLACE 1 #define SIMPLE_MEMALIGN 1 #define NO_TREE_BINS 1 +#define SMALLCHUNKS_AS_FUNCS 1 #else #define NO_TREE_BINS 0 #define SIMPLE_MEMALIGN 0 @@ -632,6 +633,10 @@ ulong mem_malloc_brk; #endif /* __UBOOT__ */ +#ifndef SMALLCHUNKS_AS_FUNCS +#define SMALLCHUNKS_AS_FUNCS 0 +#endif + #ifndef WIN32 #ifdef _WIN32 #define WIN32 1 @@ -3714,6 +3719,25 @@ static void internal_malloc_stats(mstate m) { */ /* Link a free chunk into a smallbin */ +#if defined(__UBOOT__) && SMALLCHUNKS_AS_FUNCS +static void insert_small_chunk(mstate M, mchunkptr P, size_t S) { + bindex_t I = small_index(S); + mchunkptr B = smallbin_at(M, I); + mchunkptr F = B; + assert(S >= MIN_CHUNK_SIZE); + if (!smallmap_is_marked(M, I)) + mark_smallmap(M, I); + else if (RTCHECK(ok_address(M, B->fd))) + F = B->fd; + else { + CORRUPTION_ERROR_ACTION(M); + } + B->fd = P; + F->bk = P; + P->fd = F; + P->bk = B; +} +#else #define insert_small_chunk(M, P, S) {\ bindex_t I = small_index(S);\ mchunkptr B = smallbin_at(M, I);\ @@ -3731,6 +3755,7 @@ static void internal_malloc_stats(mstate m) { P->fd = F;\ P->bk = B;\ } +#endif /* Unlink a chunk from a smallbin */ #define unlink_small_chunk(M, P, S) {\ @@ -3759,6 
+3784,24 @@ static void internal_malloc_stats(mstate m) { } /* Unlink the first chunk from a smallbin */ +#if defined(__UBOOT__) && SMALLCHUNKS_AS_FUNCS +static void unlink_first_small_chunk(mstate M, mchunkptr B, mchunkptr P, bindex_t I) { + mchunkptr F = P->fd; + assert(P != B); + assert(P != F); + assert(chunksize(P) == small_index2size(I)); + if (B == F) { + clear_smallmap(M, I); + } + else if (RTCHECK(ok_address(M, F) && F->bk == P)) { + F->bk = B; + B->fd = F; + } + else { + CORRUPTION_ERROR_ACTION(M); + } +} +#else #define unlink_first_small_chunk(M, B, P, I) {\ mchunkptr F = P->fd;\ assert(P != B);\ @@ -3775,6 +3818,7 @@ static void internal_malloc_stats(mstate m) { CORRUPTION_ERROR_ACTION(M);\ }\ } +#endif /* Replace dv node, binning the old one */ /* Used only when dvsize known to be small */ @@ -3982,9 +4026,15 @@ static void unlink_large_chunk(mstate M, tchunkptr X) { /* Relays to large vs small bin operations */ +#if defined(__UBOOT__) && SMALLCHUNKS_AS_FUNCS +#define insert_chunk(M, P, S)\ + if (is_small(S)) { insert_small_chunk(M, P, S); }\ + else { tchunkptr TP = (tchunkptr)(P); insert_large_chunk(M, TP, S); } +#else #define insert_chunk(M, P, S)\ if (is_small(S)) insert_small_chunk(M, P, S)\ else { tchunkptr TP = (tchunkptr)(P); insert_large_chunk(M, TP, S); } +#endif #define unlink_chunk(M, P, S)\ if (is_small(S)) unlink_small_chunk(M, P, S)\ -- 2.43.0
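The reason insert_chunk needs braces once insert_small_chunk() becomes a function can be shown with a reduced example. While insert_small_chunk was a macro expanding to a `{ ... }` block, `if (is_small(S)) insert_small_chunk(M, P, S) else ...` was valid C. A function call in the same position needs its own semicolon, so each arm is wrapped in braces. The names below (insert_small(), insert_large(), IS_SMALL, INSERT_CHUNK) are hypothetical stand-ins, with counters in place of real bin updates.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical counters standing in for real bin updates */
static int small_inserts;
static int large_inserts;

static void insert_small(size_t size)
{
	assert(size > 0);
	small_inserts++;
}

static void insert_large(size_t size)
{
	assert(size > 0);
	large_inserts++;
}

#define IS_SMALL(s)	((s) < 256)

/*
 * Function-call form of the relay macro: each arm is a complete braced
 * statement, so the expansion is valid whether or not the caller adds a
 * trailing semicolon.
 */
#define INSERT_CHUNK(s) \
	if (IS_SMALL(s)) { insert_small(s); } \
	else { insert_large(s); }
```

Calling INSERT_CHUNK(100) takes the small path and INSERT_CHUNK(4096) the large one. The function bodies are emitted once instead of at every expansion site, which is where the code-size saving is expected to come from.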
From: Simon Glass <simon.glass@canonical.com> We know or assume that dlmalloc itself works correctly, but there is still the possibility that the U-Boot integration has bugs. Add a test suite for the malloc() implementation, covering: - Basic malloc/free operations - Edge cases (zero size, NULL pointer handling) - realloc() in various scenarios - memalign() with different alignments - Multiple allocations and fragmentation - malloc_enable_testing() failure simulation - Large allocations (1MB, 16MB) - Full pool allocation (CONFIG_SYS_MALLOC_LEN plus environment size) - Fill pool test with random sizes Co-developed-by: Claude <noreply@anthropic.com> Signed-off-by: Simon Glass <simon.glass@canonical.com> --- test/common/Makefile | 1 + test/common/malloc.c | 629 +++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 630 insertions(+) create mode 100644 test/common/malloc.c diff --git a/test/common/Makefile b/test/common/Makefile index a5df946396a..9674bbec030 100644 --- a/test/common/Makefile +++ b/test/common/Makefile @@ -13,5 +13,6 @@ obj-$(CONFIG_CONSOLE_PAGER) += console.o obj-$(CONFIG_CYCLIC) += cyclic.o obj-$(CONFIG_EVENT_DYNAMIC) += event.o obj-y += cread.o +obj-y += malloc.o obj-$(CONFIG_CONSOLE_PAGER) += pager.o obj-$(CONFIG_$(PHASE_)CMDLINE) += print.o diff --git a/test/common/malloc.c b/test/common/malloc.c new file mode 100644 index 00000000000..b114267dd83 --- /dev/null +++ b/test/common/malloc.c @@ -0,0 +1,629 @@ +// SPDX-License-Identifier: GPL-2.0+ +/* + * Tests for malloc() implementation + * + * Copyright 2025 Google LLC + * Written by Simon Glass <sjg@chromium.org> + */ + +#include <linux/sizes.h> +#include <malloc.h> +#include <mapmem.h> +#include <stdlib.h> +#include <asm/global_data.h> +#include <env_internal.h> +#include <test/common.h> +#include <test/test.h> +#include <test/ut.h> + +DECLARE_GLOBAL_DATA_PTR; + +/* + * get_alloced_size() - Get currently allocated memory size + * + * Return: Number of bytes currently allocated (not freed) + 
*/ +static int get_alloced_size(void) +{ + struct mallinfo info = mallinfo(); + + return info.uordblks; +} + +/* Test basic malloc() and free() */ +static int common_test_malloc_basic(struct unit_test_state *uts) +{ + int before; + void *ptr; + + before = get_alloced_size(); + + ptr = malloc(100); + ut_assertnonnull(ptr); + + ut_assert(get_alloced_size() >= before + 100); + + free(ptr); + + ut_asserteq(before, get_alloced_size()); + + return 0; +} +COMMON_TEST(common_test_malloc_basic, 0); + +/* Test malloc() with zero size and free(NULL) */ +static int common_test_malloc_zero(struct unit_test_state *uts) +{ + int before; + void *ptr; + + before = get_alloced_size(); + + ptr = malloc(0); + ut_assertnonnull(ptr); + free(ptr); + + ut_asserteq(before, get_alloced_size()); + + return 0; +} +COMMON_TEST(common_test_malloc_zero, 0); + +/* Test calloc() zeros memory */ +static int common_test_calloc(struct unit_test_state *uts) +{ + int before, i; + char *ptr; + + before = get_alloced_size(); + + ptr = calloc(100, 1); + ut_assertnonnull(ptr); + + for (i = 0; i < 100; i++) + ut_asserteq(0, ptr[i]); + + ut_assert(get_alloced_size() >= before + 100); + + free(ptr); + + ut_asserteq(before, get_alloced_size()); + + return 0; +} +COMMON_TEST(common_test_calloc, 0); + +/* Test realloc() to larger size */ +static int common_test_realloc_larger(struct unit_test_state *uts) +{ + char *ptr, *ptr2; + int before, i; + + before = get_alloced_size(); + + ptr = malloc(50); + ut_assertnonnull(ptr); + + for (i = 0; i < 50; i++) + ptr[i] = i; + + ptr2 = realloc(ptr, 100); + ut_assertnonnull(ptr2); + + /* + * Check original data preserved + */ + for (i = 0; i < 50; i++) + ut_asserteq(i, ptr2[i]); + + free(ptr2); + + ut_asserteq(before, get_alloced_size()); + + return 0; +} +COMMON_TEST(common_test_realloc_larger, 0); + +/* Test realloc() to smaller size */ +static int common_test_realloc_smaller(struct unit_test_state *uts) +{ + char *ptr, *ptr2; + int before, i; + + before = 
get_alloced_size(); + + ptr = malloc(100); + ut_assertnonnull(ptr); + + for (i = 0; i < 100; i++) + ptr[i] = i; + + ptr2 = realloc(ptr, 50); + ut_assertnonnull(ptr2); + + /* + * Check data preserved + */ + for (i = 0; i < 50; i++) + ut_asserteq(i, ptr2[i]); + + free(ptr2); + + ut_asserteq(before, get_alloced_size()); + + return 0; +} +COMMON_TEST(common_test_realloc_smaller, 0); + +/* Test realloc() with NULL pointer (should act like malloc) */ +static int common_test_realloc_null(struct unit_test_state *uts) +{ + int before; + void *ptr; + + before = get_alloced_size(); + + ptr = realloc(NULL, 100); + ut_assertnonnull(ptr); + ut_assert(get_alloced_size() >= before + 100); + + free(ptr); + + ut_asserteq(before, get_alloced_size()); + + return 0; +} +COMMON_TEST(common_test_realloc_null, 0); + +/* + * Test realloc() with zero size + * + * Standard dlmalloc behavior (without REALLOC_ZERO_BYTES_FREES or mcheck): + * realloc(ptr, 0) returns a minimum-sized allocation. + */ +static int common_test_realloc_zero(struct unit_test_state *uts) +{ + void *ptr, *ptr2; + int before; + + before = get_alloced_size(); + + ptr = malloc(100); + ut_assertnonnull(ptr); + ut_assert(get_alloced_size() >= before + 100); + + ptr2 = realloc(ptr, 0); + + /* + * dlmalloc returns a minimum-sized allocation for realloc(ptr, 0) + * since REALLOC_ZERO_BYTES_FREES is not enabled. + * It may realloc in-place or return a different pointer. 
+ */ + ut_assertnonnull(ptr2); + + free(ptr2); + + ut_asserteq(before, get_alloced_size()); + + return 0; +} +COMMON_TEST(common_test_realloc_zero, 0); + +/* Test memalign() with various alignments */ +static int common_test_memalign(struct unit_test_state *uts) +{ + int before; + void *ptr; + + before = get_alloced_size(); + + /* + * Test power-of-2 alignments + */ + ptr = memalign(16, 100); + ut_assertnonnull(ptr); + ut_asserteq(0, (ulong)ptr & 0xf); + free(ptr); + + ptr = memalign(256, 100); + ut_assertnonnull(ptr); + ut_asserteq(0, (ulong)ptr & 0xff); + free(ptr); + + ptr = memalign(4096, 100); + ut_assertnonnull(ptr); + ut_asserteq(0, (ulong)ptr & 0xfff); + free(ptr); + + ut_asserteq(before, get_alloced_size()); + + return 0; +} +COMMON_TEST(common_test_memalign, 0); + +/* Test multiple allocations */ +static int common_test_malloc_multiple(struct unit_test_state *uts) +{ + int expected = 0, before, i; + void *ptrs[10]; + + before = get_alloced_size(); + + /* Allocate multiple blocks */ + for (i = 0; i < 10; i++) { + ptrs[i] = malloc((i + 1) * 100); + ut_assertnonnull(ptrs[i]); + expected += (i + 1) * 100; + } + + /* Should have allocated at least the requested amount */ + ut_assert(get_alloced_size() >= before + expected); + + /* Free in reverse order */ + for (i = 9; i >= 0; i--) + free(ptrs[i]); + + ut_asserteq(before, get_alloced_size()); + + return 0; +} +COMMON_TEST(common_test_malloc_multiple, 0); + +/* Test malloc() failure when testing enabled */ +static int common_test_malloc_failure(struct unit_test_state *uts) +{ + void *ptr1, *ptr2, *ptr3; + int before; + + before = get_alloced_size(); + + /* Enable failure after 2 allocations */ + malloc_enable_testing(2); + + ptr1 = malloc(100); + ut_assertnonnull(ptr1); + + ptr2 = malloc(100); + ut_assertnonnull(ptr2); + + /* This should fail */ + ptr3 = malloc(100); + ut_assertnull(ptr3); + + malloc_disable_testing(); + + /* Should work again */ + ptr3 = malloc(100); + ut_assertnonnull(ptr3); + + free(ptr1); + 
free(ptr2); + free(ptr3); + + ut_asserteq(before, get_alloced_size()); + + return 0; +} +COMMON_TEST(common_test_malloc_failure, 0); + +/* Test realloc() failure when testing enabled */ +static int common_test_realloc_failure(struct unit_test_state *uts) +{ + void *ptr1, *ptr2; + int before; + + before = get_alloced_size(); + + ptr1 = malloc(50); + ut_assertnonnull(ptr1); + + /* Enable failure after 0 allocations */ + malloc_enable_testing(0); + + /* This should fail and return NULL, leaving ptr1 intact */ + ptr2 = realloc(ptr1, 100); + ut_assertnull(ptr2); + + malloc_disable_testing(); + + /* ptr1 should still be valid, try to realloc it */ + ptr2 = realloc(ptr1, 100); + ut_assertnonnull(ptr2); + + free(ptr2); + + ut_asserteq(before, get_alloced_size()); + + return 0; +} +COMMON_TEST(common_test_realloc_failure, 0); + +/* Test large allocation */ +static int common_test_malloc_large(struct unit_test_state *uts) +{ + int size = SZ_1M, before; + void *ptr; + + before = get_alloced_size(); + + ptr = malloc(size); + ut_assertnonnull(ptr); + memset(ptr, 0x5a, size); + + ut_assert(get_alloced_size() >= before + size); + + free(ptr); + + ut_asserteq(before, get_alloced_size()); + + return 0; +} +COMMON_TEST(common_test_malloc_large, 0); + +/* Test many small allocations (tests binning) */ +static int common_test_malloc_small_bins(struct unit_test_state *uts) +{ + int after_free, before, i; + void *ptrs[100]; + + before = get_alloced_size(); + + /* Allocate many small blocks of various sizes */ + for (i = 0; i < 100; i++) { + ptrs[i] = malloc((i % 32) + 8); + ut_assertnonnull(ptrs[i]); + } + + /* Free every other one to create fragmentation */ + for (i = 0; i < 100; i += 2) + free(ptrs[i]); + + after_free = get_alloced_size(); + + /* Allocate more to test reuse */ + for (i = 0; i < 100; i += 2) { + ptrs[i] = malloc((i % 32) + 8); + ut_assertnonnull(ptrs[i]); + } + + /* Should be back to roughly the same size (may vary due to overhead) */ + ut_assert(get_alloced_size() >= 
after_free); + + /* Free all */ + for (i = 0; i < 100; i++) + free(ptrs[i]); + + ut_asserteq(before, get_alloced_size()); + + return 0; +} +COMMON_TEST(common_test_malloc_small_bins, 0); + +/* Test alternating allocation sizes */ +static int common_test_malloc_alternating(struct unit_test_state *uts) +{ + void *small1, *large1, *small2, *large2; + int before; + + before = get_alloced_size(); + + small1 = malloc(32); + ut_assertnonnull(small1); + + large1 = malloc(8192); + ut_assertnonnull(large1); + + small2 = malloc(64); + ut_assertnonnull(small2); + + large2 = malloc(16384); + ut_assertnonnull(large2); + + ut_assert(get_alloced_size() >= before + 32 + 8192 + 64 + 16384); + + free(small1); + free(large1); + free(small2); + free(large2); + + ut_asserteq(before, get_alloced_size()); + + return 0; +} +COMMON_TEST(common_test_malloc_alternating, 0); + +/* Test malloc() with boundary sizes */ +static int common_test_malloc_boundaries(struct unit_test_state *uts) +{ + int before; + void *ptr; + + before = get_alloced_size(); + + /* Test allocation right at small/large boundary (typically 256 bytes) */ + ptr = malloc(256); + ut_assertnonnull(ptr); + free(ptr); + + ptr = malloc(255); + ut_assertnonnull(ptr); + free(ptr); + + ptr = malloc(257); + ut_assertnonnull(ptr); + free(ptr); + + ut_asserteq(before, get_alloced_size()); + + return 0; +} +COMMON_TEST(common_test_malloc_boundaries, 0); + +/* Test malloc_usable_size() */ +static int common_test_malloc_usable_size(struct unit_test_state *uts) +{ + int before, size; + void *ptr; + + before = get_alloced_size(); + + ptr = malloc(100); + ut_assertnonnull(ptr); + + size = malloc_usable_size(ptr); + /* Usable size should be at least the requested size */ + ut_assert(size >= 100); + + free(ptr); + + ut_asserteq(before, get_alloced_size()); + + return 0; +} +COMMON_TEST(common_test_malloc_usable_size, 0); + +/* Test mallinfo() returns reasonable values */ +static int common_test_mallinfo(struct unit_test_state *uts) +{ + void 
*ptr1, *ptr2, *ptr3; + struct mallinfo info; + int arena_before; + int used_after1; + int used_after2; + int before; + + before = get_alloced_size(); + + info = mallinfo(); + arena_before = info.arena; + + ptr1 = malloc(1024); + ut_assertnonnull(ptr1); + + info = mallinfo(); + /* Arena size should not change (it's the total heap size) */ + ut_asserteq(arena_before, info.arena); + /* Used memory should increase */ + ut_assert(info.uordblks >= before + 1024); + used_after1 = info.uordblks; + + ptr2 = malloc(2048); + ut_assertnonnull(ptr2); + + info = mallinfo(); + ut_asserteq(arena_before, info.arena); + ut_assert(info.uordblks >= used_after1 + 2048); + used_after2 = info.uordblks; + + ptr3 = malloc(512); + ut_assertnonnull(ptr3); + + info = mallinfo(); + ut_asserteq(arena_before, info.arena); + ut_assert(info.uordblks >= used_after2 + 512); + + free(ptr1); + free(ptr2); + free(ptr3); + + ut_asserteq(before, get_alloced_size()); + + return 0; +} +COMMON_TEST(common_test_mallinfo, 0); + +/* Test allocating a very large size */ +static int common_test_malloc_very_large(struct unit_test_state *uts) +{ + size_t size, before; + void *ptr; + + before = get_alloced_size(); + size = TOTAL_MALLOC_LEN - before - SZ_64K; + + ptr = malloc(size); + ut_assertnonnull(ptr); + ut_assert(get_alloced_size() >= before + size); + + free(ptr); + + ut_asserteq(before, get_alloced_size()); + + return 0; +} +COMMON_TEST(common_test_malloc_very_large, 0); + +/* Test allocating the full malloc pool size */ +static int common_test_malloc_full_pool(struct unit_test_state *uts) +{ + /* Try to allocate the full pool size - should fail due to overhead */ + ut_assertnull(malloc(TOTAL_MALLOC_LEN)); + + return 0; +} +COMMON_TEST(common_test_malloc_full_pool, 0); + +/* Test filling the entire malloc pool with allocations */ +static int common_test_malloc_fill_pool(struct unit_test_state *uts) +{ + int alloc_size, before, count, i, total; + const int ptr_table_size = 0x100000; + void **ptrs; + void 
*ptr; + + /* + * This is only really safe on sandbox since it uses up all memory and + * assumes that at least half of the malloc() pool is unallocated + */ + if (!IS_ENABLED(CONFIG_SANDBOX)) + return -EAGAIN; + + before = get_alloced_size(); + + /* Use memory outside malloc pool to store pointers */ + ptrs = map_sysmem(0x1000, ptr_table_size); + + /* Allocate until we run out of memory, using random sizes */ + count = 0; + total = 0; + while (1) { + /* Random size up to 1 MB */ + alloc_size = rand() % (SZ_1M); + ptr = malloc(alloc_size); + if (!ptr) + break; + ptrs[count++] = ptr; + total += alloc_size; + /* Safety check to avoid overflowing the pointer table */ + if (count >= ptr_table_size / sizeof(void *)) + break; + } + printf("count %d total %d ptr_table_size %d\n", count, total, + ptr_table_size); + + /* + * Should have allocated most of the pool - if we can't allocate + * 1MB, then at most 1MB is available, so we must have allocated + * at least (pool_size - 1MB) + */ + ut_assert(count > 0); + ut_assert(count < ptr_table_size / sizeof(void *)); + ut_assert(get_alloced_size() >= TOTAL_MALLOC_LEN - SZ_1M); + + /* Free all allocations */ + for (i = 0; i < count; i++) + free(ptrs[i]); + + /* Should be back to starting state */ + ut_asserteq(before, get_alloced_size()); + + /* Verify we can allocate large blocks again */ + ptr = malloc(TOTAL_MALLOC_LEN / 2); + ut_assertnonnull(ptr); + free(ptr); + + unmap_sysmem(ptrs); + + return 0; +} +COMMON_TEST(common_test_malloc_fill_pool, 0); -- 2.43.0
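The failure-simulation tests above rely on an allocation budget: after malloc_enable_testing(n), the next n allocations succeed and later ones return NULL until malloc_disable_testing() is called. A minimal standalone model of that behaviour might look like the following. It is an assumption-laden sketch and not U-Boot's implementation: the sketch_* names and the two static variables are hypothetical, and the C library malloc() stands in for dlmalloc.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical state; U-Boot's own bookkeeping may differ */
static bool fail_enabled;
static int allocs_left;

/* Allow max_allocs further allocations, then start failing */
static void sketch_enable_testing(int max_allocs)
{
	assert(max_allocs >= 0);
	fail_enabled = true;
	allocs_left = max_allocs;
}

/* Return to normal behaviour */
static void sketch_disable_testing(void)
{
	fail_enabled = false;
}

/* Allocate unless the injected budget is exhausted */
static void *sketch_malloc(size_t bytes)
{
	if (fail_enabled) {
		if (allocs_left <= 0)
			return NULL;
		allocs_left--;
	}

	return malloc(bytes);
}
```

With a budget of 2, the first two calls succeed and the third returns NULL, matching the sequence checked in common_test_malloc_failure(). A budget of 0 fails immediately, as in the realloc() failure test.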
From: Simon Glass <simon.glass@canonical.com> Add CONFIG_SYS_MALLOC_LEGACY to select the current allocator and adjust the header-file and Makefile rule to use the new dlmalloc implementation. Co-developed-by: Claude <noreply@anthropic.com> Signed-off-by: Simon Glass <simon.glass@canonical.com> --- Kconfig | 16 ++++++++++++++++ common/Makefile | 4 ++++ include/malloc.h | 19 +++++++------------ 3 files changed, 27 insertions(+), 12 deletions(-) diff --git a/Kconfig b/Kconfig index c4a65597035..378ecfb1867 100644 --- a/Kconfig +++ b/Kconfig @@ -511,6 +511,22 @@ config SPL_SYS_MALLOC_SMALL If unsure, say Y to minimize SPL code size. +config SYS_MALLOC_LEGACY + bool "Use legacy dlmalloc 2.6.6 instead of dlmalloc 2.8.6" + help + Select this option to use the older dlmalloc 2.6.6 implementation + instead of the newer 2.8.6 version. The legacy allocator uses a + simpler bin system, has larger code size in most cases and uses more + static data. + + The legacy allocator may have slightly worse fragmentation behavior + for some workloads but has been well-tested over many years in U-Boot. + + This option is provided for compatibility and testing. New boards + should use the default dlmalloc 2.8.6. + + If unsure, say N to use the modern allocator. 
+ config TOOLS_DEBUG bool "Enable debug information for tools" help diff --git a/common/Makefile b/common/Makefile index ffa46ce5e06..fdf4cff94f4 100644 --- a/common/Makefile +++ b/common/Makefile @@ -71,7 +71,11 @@ obj-$(CONFIG_BOUNCE_BUFFER) += bouncebuf.o obj-$(CONFIG_$(PHASE_)SERIAL) += console.o obj-$(CONFIG_CROS_EC) += cros_ec.o +ifdef CONFIG_SYS_MALLOC_LEGACY obj-y += dlmalloc_old.o +else +obj-y += dlmalloc.o +endif obj-$(CONFIG_$(PHASE_)SYS_MALLOC_F) += malloc_simple.o obj-$(CONFIG_$(PHASE_)CYCLIC) += cyclic.o diff --git a/include/malloc.h b/include/malloc.h index 997651e5c9c..73b2da0c383 100644 --- a/include/malloc.h +++ b/include/malloc.h @@ -1,15 +1,4 @@ /* SPDX-License-Identifier: GPL-2.0+ */ -/* - * Stub header to include the old malloc header - * - * This allows the old malloc implementation to be preserved while - * preparing for a new dlmalloc version. - */ - -#include <malloc_old.h> - -#if 0 /* not active yet */ - /* Default header file for malloc-2.8.x, written by Doug Lea and released to the public domain, as explained at @@ -32,6 +21,12 @@ * If MSPACES is defined, declarations for mspace versions are included. */ +#ifdef CONFIG_SYS_MALLOC_LEGACY + +#include <malloc_old.h> + +#else + #ifndef MALLOC_280_H #define MALLOC_280_H @@ -748,4 +743,4 @@ int initf_malloc(void); #endif /* MALLOC_280_H */ -#endif /* not active yet */ +#endif /* !CONFIG_SYS_MALLOC_LEGACY */ -- 2.43.0
From: Simon Glass <simon.glass@canonical.com> Add doc/develop/malloc.rst documenting U-Boot's dynamic memory allocation implementation: - Overview of pre/post-relocation malloc phases - dlmalloc 2.8.6 version and features - Data structure sizes (~500 bytes vs 1032 bytes in 2.6.6) - Configuration options for code-size optimization - Debugging features (mcheck, valgrind, malloc testing) - API reference Also add an introductory comment to dlmalloc.c summarising the U-Boot configuration. Co-developed-by: Claude <noreply@anthropic.com> Signed-off-by: Simon Glass <simon.glass@canonical.com> --- common/dlmalloc.c | 15 ++ doc/arch/sandbox/sandbox.rst | 2 + doc/develop/index.rst | 1 + doc/develop/malloc.rst | 333 +++++++++++++++++++++++++++++++++++ 4 files changed, 351 insertions(+) create mode 100644 doc/develop/malloc.rst diff --git a/common/dlmalloc.c b/common/dlmalloc.c index 54fd2e4a08a..c1c9d8a8938 100644 --- a/common/dlmalloc.c +++ b/common/dlmalloc.c @@ -1,4 +1,19 @@ // SPDX-License-Identifier: GPL-2.0+ +/* + * U-Boot Dynamic Memory Allocator + * + * This is Doug Lea's dlmalloc version 2.8.6, adapted for U-Boot. + * + * U-Boot Configuration: + * - Uses sbrk() via MORECORE (no mmap support) + * - Pre-relocation: redirects to malloc_simple.c + * - Post-relocation: full dlmalloc with heap from CONFIG_SYS_MALLOC_LEN + * - Sandbox keeps full features for testing; other boards use: + * INSECURE=1, NO_MALLINFO=1, NO_REALLOC_IN_PLACE=1 + * + * See doc/develop/malloc.rst for more information. + */ + /* Copyright 2023 Doug Lea diff --git a/doc/arch/sandbox/sandbox.rst b/doc/arch/sandbox/sandbox.rst index 9e9b027be8b..0d94c5a49cf 100644 --- a/doc/arch/sandbox/sandbox.rst +++ b/doc/arch/sandbox/sandbox.rst @@ -688,6 +688,8 @@ If sdl-config is on a different path from the default, set the SDL_CONFIG environment variable to the correct pathname before building U-Boot. +.. 
_sandbox_valgrind: + Using valgrind / memcheck ------------------------- diff --git a/doc/develop/index.rst b/doc/develop/index.rst index d325ad23897..c40ada5899f 100644 --- a/doc/develop/index.rst +++ b/doc/develop/index.rst @@ -51,6 +51,7 @@ Implementation global_data logging makefiles + malloc menus printf smbios diff --git a/doc/develop/malloc.rst b/doc/develop/malloc.rst new file mode 100644 index 00000000000..3c6b6ea65a4 --- /dev/null +++ b/doc/develop/malloc.rst @@ -0,0 +1,333 @@ +.. SPDX-License-Identifier: GPL-2.0-or-later + +Dynamic Memory Allocation +========================= + +U-Boot uses Doug Lea's malloc implementation (dlmalloc) for dynamic memory +allocation. This provides the standard C library functions malloc(), free(), +realloc(), calloc(), and memalign(). + +Overview +-------- + +U-Boot's malloc implementation has two phases: + +1. **Pre-relocation (simple malloc)**: Before U-Boot relocates itself to the + top of RAM, a simple malloc implementation is used. This allocates memory + from a small fixed-size pool and does not support free(). This is + controlled by CONFIG_SYS_MALLOC_F_LEN. + +2. **Post-relocation (full malloc)**: After relocation, the full dlmalloc + implementation is initialized with a larger heap. The heap size is + controlled by CONFIG_SYS_MALLOC_LEN. + +The transition between these phases is managed by the GD_FLG_FULL_MALLOC_INIT +flag in global_data. 
+ +dlmalloc Version +---------------- + +U-Boot uses dlmalloc version 2.8.6 (updated from 2.6.6 in 2025), which +provides: + +- Efficient memory allocation with low fragmentation +- Small bins for allocations up to 256 bytes (32 bins) +- Tree bins for larger allocations (32 bins) +- Best-fit allocation strategy +- Boundary tags for coalescing free blocks + +Data Structures +--------------- + +The allocator uses two main static structures: + +**malloc_state** (~944 bytes on 64-bit systems): + +- ``smallbins``: 33 pairs of pointers for small allocations (528 bytes) +- ``treebins``: 32 tree root pointers for large allocations (256 bytes) +- ``top``: Pointer to the top chunk (wilderness) +- ``dvsize``, ``topsize``: Sizes of designated victim and top chunks +- Bookkeeping: footprint tracking, bitmaps, segment info + +**malloc_params** (48 bytes on 64-bit systems): + +- Page size, granularity, thresholds for mmap and trim + +For comparison, the older dlmalloc 2.6.6 used a single 2064-byte ``av_`` array +on 64-bit systems. The 2.8.6 version uses about half the static data while +providing better algorithms. + +Kconfig Options +--------------- + +Main U-Boot (post-relocation) +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +``CONFIG_SYS_MALLOC_LEN`` + Hex value defining the size of the main malloc pool after relocation. + This is the heap available for driver model, file systems, and general + dynamic memory allocation. Default: 0x400000 (4 MB), varies by platform. + +``CONFIG_SYS_MALLOC_F`` + Bool to enable malloc() pool before relocation. Required for driver model + and many boot features. Default: y if DM is enabled. + +``CONFIG_SYS_MALLOC_F_LEN`` + Hex value for the size of pre-relocation malloc pool. This small pool is + used before DRAM is initialized. Default: 0x2000 (8 KB), varies by platform. + +``CONFIG_SYS_MALLOC_CLEAR_ON_INIT`` + Bool to zero the malloc pool on initialization. This slows boot but ensures + malloc returns zeroed memory. 
Disable for faster boot when using large + heaps. Default: y + +``CONFIG_SYS_MALLOC_DEFAULT_TO_INIT`` + Bool to call malloc_init() when mem_malloc_init() is called. Used when + moving malloc from one memory region to another. Default: n + +``CONFIG_SYS_MALLOC_BOOTPARAMS`` + Bool to malloc a buffer for bi_boot_params instead of using a fixed + location. Default: n + +``CONFIG_VALGRIND`` + Bool to annotate malloc operations for Valgrind memory debugging. Only + useful when running sandbox builds under Valgrind. See + :ref:`sandbox_valgrind` for details. Default: n + +``CONFIG_SYS_MALLOC_SMALL`` + Bool to enable code-size optimisations for dlmalloc. This option combines + several optimisations: + + - Disables tree bins for allocations >= 256 bytes, using simple linked-list + bins instead. This changes large-allocation performance from O(log n) to + O(n) but saves ~1.5-2KB. + - Simplifies memalign() by removing fallback retry logic. Saves ~100-150 bytes. + - Disables in-place realloc optimisation. Saves ~200 bytes. + - Uses static malloc parameters instead of runtime-configurable ones. + - Converts small chunk macros to functions to reduce code duplication. + + These optimisations may increase fragmentation and reduce performance for + workloads with many large or aligned allocations, but are suitable for most + U-Boot use cases where code size is more important. Default: n + +``CONFIG_SYS_MALLOC_LEGACY`` + Bool to use the legacy dlmalloc 2.6.6 implementation instead of the modern + dlmalloc 2.8.6. The legacy allocator has smaller code size (~450 bytes less) + but uses more static data (~500 bytes more on 64-bit). Provided for + compatibility and testing. New boards should use the modern allocator. + Default: n + +xPL Boot Phases +~~~~~~~~~~~~~~~ + +The SPL (Secondary Program Loader), TPL (Tertiary Program Loader), and VPL +(Verification Program Loader) boot phases each have their own malloc +configuration options. 
These are prefixed with ``SPL_``, ``TPL_``, or ``VPL_`` +and typically mirror the main U-Boot options. + +Similar to U-Boot proper, xPL phases can use simple malloc (``malloc_simple``) +for pre-DRAM allocation. However, unlike U-Boot proper which transitions from +simple malloc to full dlmalloc after relocation, xPL phases that enable +``CONFIG_SPL_SYS_MALLOC_SIMPLE`` (or equivalent) cannot transition to full +malloc within that phase, since the dlmalloc code is not included in the +binary. + +Note: When building with ``CONFIG_XPL_BUILD``, the code uses +``CONFIG_IS_ENABLED()`` macros to automatically select the appropriate +phase-specific option (e.g., ``CONFIG_IS_ENABLED(SYS_MALLOC_F)`` expands to +``CONFIG_SPL_SYS_MALLOC_F`` when building SPL). + +SPL (Secondary Program Loader) +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +``CONFIG_SPL_SYS_MALLOC_F`` + Bool to enable malloc() pool in SPL before DRAM is initialized. Required + for driver model in SPL. Default: y if SPL_FRAMEWORK and SYS_MALLOC_F. + +``CONFIG_SPL_SYS_MALLOC_F_LEN`` + Hex value for SPL pre-DRAM malloc pool size. Default: inherits from + CONFIG_SYS_MALLOC_F_LEN. + +``CONFIG_SPL_SYS_MALLOC_SIMPLE`` + Bool to use only malloc_simple functions in SPL instead of full dlmalloc. + The simple allocator is smaller (saves ~600 bytes) but cannot free() + memory. Default: n + +``CONFIG_SPL_SYS_MALLOC`` + Bool to enable a full malloc pool in SPL after DRAM is initialized. + Used with CONFIG_SPL_CUSTOM_SYS_MALLOC_ADDR. Default: n + +``CONFIG_SPL_HAS_CUSTOM_MALLOC_START`` + Bool to use a custom address for SPL malloc pool instead of the default + location. Requires CONFIG_SPL_CUSTOM_SYS_MALLOC_ADDR. Default: n + +``CONFIG_SPL_CUSTOM_SYS_MALLOC_ADDR`` + Hex address for SPL malloc pool when using custom location. + +``CONFIG_SPL_SYS_MALLOC_SIZE`` + Hex value for SPL malloc pool size when using CONFIG_SPL_SYS_MALLOC. + Default: 0x100000 (1 MB). 
+
+``CONFIG_SPL_SYS_MALLOC_CLEAR_ON_INIT``
+    Bool to zero SPL malloc pool on initialization. Useful when malloc pool
+    is in a region that must be zeroed before first use. Default: inherits
+    from CONFIG_SYS_MALLOC_CLEAR_ON_INIT.
+
+``CONFIG_SPL_SYS_MALLOC_SMALL``
+    Bool to enable code-size optimisations for dlmalloc in SPL. Enables the
+    same optimisations as CONFIG_SYS_MALLOC_SMALL (disables tree bins,
+    simplifies memalign, disables in-place realloc, uses static parameters,
+    converts small chunk macros to functions). SPL typically has predictable
+    memory usage where these optimisations have minimal impact, making the
+    code size savings worthwhile. Default: y
+
+``CONFIG_SPL_STACK_R_MALLOC_SIMPLE_LEN``
+    Hex value for malloc_simple heap size after switching to DRAM stack in SPL.
+    Only used when CONFIG_SPL_STACK_R and CONFIG_SPL_SYS_MALLOC_SIMPLE are
+    enabled. Provides a larger heap than the initial SRAM pool. Default:
+    0x100000 (1 MB).
+
+TPL (Tertiary Program Loader)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+``CONFIG_TPL_SYS_MALLOC_F``
+    Bool to enable malloc() pool in TPL. Default: y if TPL and SYS_MALLOC_F.
+
+``CONFIG_TPL_SYS_MALLOC_F_LEN``
+    Hex value for TPL malloc pool size. Default: inherits from
+    CONFIG_SPL_SYS_MALLOC_F_LEN.
+
+``CONFIG_TPL_SYS_MALLOC_SIMPLE``
+    Bool to use only malloc_simple in TPL instead of full dlmalloc. Saves
+    code size at the cost of no free() support. Default: n
+
+VPL (Verification Program Loader)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+``CONFIG_VPL_SYS_MALLOC_F``
+    Bool to enable malloc() pool in VPL. Default: y if VPL and SYS_MALLOC_F.
+
+``CONFIG_VPL_SYS_MALLOC_F_LEN``
+    Hex value for VPL malloc pool size. Default: inherits from
+    CONFIG_SPL_SYS_MALLOC_F_LEN.
+
+``CONFIG_VPL_SYS_MALLOC_SIMPLE``
+    Bool to use only malloc_simple in VPL.
Default: y
+
+dlmalloc Compile-Time Options
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+These options are set in the U-Boot section of ``common/dlmalloc.c``:
+
+``NO_MALLOC_STATS``
+    Disable malloc_stats() function. Default: 1 (disabled)
+
+``NO_MALLINFO``
+    Disable mallinfo() function. Default: 1 for non-sandbox builds
+
+``INSECURE``
+    Disable runtime heap validation checks. This reduces code size but removes
+    detection of heap corruption. Default: 1 for non-sandbox builds
+
+``NO_REALLOC_IN_PLACE``
+    Disable in-place realloc optimisation. Enabled by CONFIG_SYS_MALLOC_SMALL.
+    Saves ~200 bytes of code. Default: 0
+
+``NO_TREE_BINS``
+    Disable tree bins for large allocations (>= 256 bytes), using simple
+    linked-list bins instead. Enabled by CONFIG_SYS_MALLOC_SMALL. Saves
+    ~1.5-2KB but changes performance from O(log n) to O(n). Default: 0
+
+``SIMPLE_MEMALIGN``
+    Simplify memalign() by removing fallback retry logic. Enabled by
+    CONFIG_SYS_MALLOC_SMALL. Saves ~100-150 bytes. Default: 0
+
+``STATIC_MALLOC_PARAMS``
+    Use static malloc parameters instead of runtime-configurable ones.
+    Enabled by CONFIG_SYS_MALLOC_SMALL. Default: 0
+
+``SMALLCHUNKS_AS_FUNCS``
+    Convert small chunk macros (insert_small_chunk, unlink_first_small_chunk)
+    to functions to reduce code duplication. Enabled by CONFIG_SYS_MALLOC_SMALL.
+    Default: 0
+
+``SIMPLE_SYSALLOC``
+    Use simplified sys_alloc() that only supports contiguous sbrk() extension.
+    Enabled automatically for non-sandbox builds. Saves code by removing mmap
+    and multi-segment support. Default: 1 for non-sandbox, 0 for sandbox
+
+``MORECORE_CONTIGUOUS``
+    Assume sbrk() returns contiguous memory. Default: 1
+
+``MORECORE_CANNOT_TRIM``
+    Disable releasing memory back to the system. Default: 1
+
+``HAVE_MMAP``
+    Enable mmap() for large allocations. Default: 0 (U-Boot uses sbrk only)
+
+Code Size
+---------
+
+The dlmalloc 2.8.6 implementation is larger than the older 2.6.6 version due
+to its more sophisticated algorithms.
To minimise code size for
+resource-constrained systems, U-Boot provides several optimisation levels:
+
+**Default optimisations** (always enabled for non-sandbox builds):
+
+- INSECURE=1 (saves ~1100 bytes)
+- NO_MALLINFO=1 (saves ~200 bytes)
+- SIMPLE_SYSALLOC=1 (saves code by simplifying sys_alloc)
+
+**CONFIG_SYS_MALLOC_SMALL** (additional optimisations, default y for SPL):
+
+- NO_TREE_BINS=1 (saves ~1.5-2KB)
+- NO_REALLOC_IN_PLACE=1 (saves ~200 bytes)
+- SIMPLE_MEMALIGN=1 (saves ~100-150 bytes)
+- STATIC_MALLOC_PARAMS=1
+- SMALLCHUNKS_AS_FUNCS=1 (reduces code duplication)
+
+With default optimisations only, the code-size increase over dlmalloc 2.6.6
+is about 450 bytes, while data usage decreases by about 500 bytes.
+
+With CONFIG_SYS_MALLOC_SMALL enabled, significant additional code savings
+are achieved, making it suitable for size-constrained SPL builds.
+
+Sandbox builds retain full functionality for testing, including mallinfo()
+for memory-leak detection.
+
+Debugging
+---------
+
+For debugging heap issues, consider:
+
+1. **mcheck**: U-Boot includes mcheck support for detecting buffer overruns.
+   Enable CONFIG_MCHECK to use mcheck(), mcheck_pedantic(), and
+   mcheck_check_all().
+
+2. **Valgrind**: When running sandbox with Valgrind, the allocator includes
+   annotations to help detect memory errors. See :ref:`sandbox_valgrind`.
+
+3. **malloc testing**: Unit tests can use malloc_enable_testing() to simulate
+   allocation failures.
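Editorial sketch (not part of the patch): the failure-injection idea behind the malloc testing item can be illustrated in standalone C. This does not show U-Boot's implementation or the signature of malloc_enable_testing(), which the text above does not give. The names ``inject_enable()``, ``inject_disable()``, ``inject_malloc()``, ``alloc_pair()`` and ``run_demo()`` are hypothetical and exist only for this illustration. The assumed behaviour is that a fixed number of allocations succeed and every later one fails, so a test can exercise the error path of the code under test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical failure-injection state; not U-Boot's actual variables */
static bool inject_enabled;
static int allocs_remaining;

/* Let the next max_allocs allocations succeed, then fail all later ones */
static void inject_enable(int max_allocs)
{
	inject_enabled = true;
	allocs_remaining = max_allocs;
}

/* Return to normal behaviour, where allocations are passed straight through */
static void inject_disable(void)
{
	inject_enabled = false;
}

/* Wrapper around malloc() that returns NULL once the budget is used up */
static void *inject_malloc(size_t size)
{
	if (inject_enabled) {
		if (allocs_remaining <= 0)
			return NULL;
		allocs_remaining--;
	}
	return malloc(size);
}

/*
 * Example code under test: allocates two buffers and must release the
 * first one if the second allocation fails. Returns 0 on success, -1 on
 * failure (in which case both pointers are NULL).
 */
static int alloc_pair(void **a, void **b)
{
	*b = NULL;
	*a = inject_malloc(16);
	if (!*a)
		return -1;
	*b = inject_malloc(32);
	if (!*b) {
		free(*a);
		*a = NULL;
		return -1;
	}
	return 0;
}

/* A test in this style: force the second allocation to fail, then recover */
static void run_demo(void)
{
	void *a, *b;

	inject_enable(1);		/* first allocation works, second fails */
	assert(alloc_pair(&a, &b) == -1);
	assert(!a && !b);		/* error path cleaned up after itself */

	inject_disable();		/* normal behaviour again */
	assert(alloc_pair(&a, &b) == 0);
	free(a);
	free(b);
}
```

Calling ``run_demo()`` from a test harness exercises both the failing and the succeeding path. Counting allocations, rather than failing at random, keeps such a test deterministic.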
+
+API Reference
+-------------
+
+Standard C functions:
+
+- ``void *malloc(size_t size)`` - Allocate memory
+- ``void free(void *ptr)`` - Free allocated memory
+- ``void *realloc(void *ptr, size_t size)`` - Resize allocation
+- ``void *calloc(size_t nmemb, size_t size)`` - Allocate zeroed memory
+- ``void *memalign(size_t alignment, size_t size)`` - Aligned allocation
+
+Pre-relocation simple malloc (from malloc_simple.c):
+
+- ``void *malloc_simple(size_t size)`` - Simple bump allocator
+- ``void *memalign_simple(size_t alignment, size_t size)`` - Aligned version
+
+See Also
+--------
+
+- :doc:`memory` - Memory management overview
+- :doc:`global_data` - Global data and the GD_FLG_FULL_MALLOC_INIT flag
--
2.43.0