This was a class assignment in the 15-213 class at Carnegie Mellon. The staff had set up a test suite and an online leaderboard to rank the speed of each student's malloc implementation.
I figured out that the test cases allocated a disproportionate amount of X-byte blocks. I was able to get to the top by hardcoding a specific freelist just for X-byte blocks.
Learned a lesson about easily it is to game a benchmark :)
variadix 5 hours ago [-]
This is a useful lesson! Tuning an allocator for a single application reduces to optimizing for that application’s empirical allocation patterns (sizes, lifetimes, access, usage).
tkinom 21 hours ago [-]
Implemented my own specialized memory allocator 26+ yrs ago. (Y2K timeframe) Probably older than the most of the CMU students :-(
Use pre-allocated pools with array of indexes, free/allocation idx for alloc and free.
Con: Fixed pool size and fixed amount of memory can be allocated per pool.
Pro: constant cost operations per alloc/free via Atomic inc/dec of idx - no linklist tranversing ; Can be alloc in kernel space and free in user space (linux/QNX) and in multiple user processes when memory pools are in shmem; Run very will in SMP environment without any locks - all memory contentions were handled with atomic +/- alloc/free idx.
Same source code run in QNX, vxworks and linux (kernel and user space) at that time.
tnelsond4 15 hours ago [-]
I was trying to get my c wasm module down in size and emmalloc is pretty small, but it requires a lot of js glue to make it work, but using a small allocator like walloc requires no glue and it's insanely small. I got my module down from 27kb to 17kb.
I'm gonna read this article and try making my own allocator next.
ethancedwards8 16 hours ago [-]
We also did this at Harvard in CS 61, which is the Systems Programming and Machine Organization class currently taught by Eddie Kohler (and has been for a while). It's a good exercise :)
pillmillipedes 1 days ago [-]
[dead]
Rendered at 19:03:11 GMT+0000 (UTC) with Wasmer Edge.
I figured out that the test cases allocated a disproportionate amount of X-byte blocks. I was able to get to the top by hardcoding a specific freelist just for X-byte blocks.
Learned a lesson about easily it is to game a benchmark :)
Use pre-allocated pools with array of indexes, free/allocation idx for alloc and free.
Con: Fixed pool size and fixed amount of memory can be allocated per pool.
Pro: constant cost operations per alloc/free via Atomic inc/dec of idx - no linklist tranversing ; Can be alloc in kernel space and free in user space (linux/QNX) and in multiple user processes when memory pools are in shmem; Run very will in SMP environment without any locks - all memory contentions were handled with atomic +/- alloc/free idx.
Same source code run in QNX, vxworks and linux (kernel and user space) at that time.
I'm gonna read this article and try making my own allocator next.