Add support for medium size classes, [4KiB..32KiB], 2KiB apart by default.

Add the 'M' and 'm' MALLOC_OPTIONS flags, which control the maximum medium size
class.

Relax the cap on small/medium run size to arena_maxclass.

Reduce arena_run_reg_dalloc() integer division code complexity.

Increase the default chunk size from 1MiB to 4MiB.
Jason Evans 2009-12-29 00:09:15 -08:00
parent 6d7bb5357a
commit b2378168a4
2 changed files with 468 additions and 270 deletions

@@ -35,7 +35,7 @@
.\" @(#)malloc.3 8.1 (Berkeley) 6/4/93 .\" @(#)malloc.3 8.1 (Berkeley) 6/4/93
.\" $FreeBSD: head/lib/libc/stdlib/malloc.3 182225 2008-08-27 02:00:53Z jasone $ .\" $FreeBSD: head/lib/libc/stdlib/malloc.3 182225 2008-08-27 02:00:53Z jasone $
.\" .\"
.Dd November 13, 2009 .Dd November 19, 2009
.Dt JEMALLOC 3 .Dt JEMALLOC 3
.Os .Os
.Sh NAME .Sh NAME
@@ -228,7 +228,11 @@ will prevent any dirty unused pages from accumulating.
@roff_fill@negatively. @roff_fill@negatively.
.It K .It K
Double/halve the virtual memory chunk size. Double/halve the virtual memory chunk size.
The default chunk size is 1 MB. The default chunk size is 4 MiB.
.It M
Double/halve the size of the maximum medium size class.
The valid range is from one page to one half chunk.
The default value is 32 KiB.
.It N .It N
Double/halve the number of arenas. Double/halve the number of arenas.
The default number of arenas is two times the number of CPUs, or one if there The default number of arenas is two times the number of CPUs, or one if there
@@ -281,7 +285,7 @@ The default value is 128 bytes.
@roff_xmalloc@.It X @roff_xmalloc@.It X
@roff_xmalloc@Rather than return failure for any allocation function, display a @roff_xmalloc@Rather than return failure for any allocation function, display a
@roff_xmalloc@diagnostic message on @roff_xmalloc@diagnostic message on
@roff_xmalloc@.Dv stderr @roff_xmalloc@.Dv STDERR_FILENO
@roff_xmalloc@and cause the program to drop core (using @roff_xmalloc@and cause the program to drop core (using
@roff_xmalloc@.Xr abort 3 ) . @roff_xmalloc@.Xr abort 3 ) .
@roff_xmalloc@This option should be set at compile time by including the @roff_xmalloc@This option should be set at compile time by including the
@@ -335,9 +339,9 @@ However, it may make sense to reduce the number of arenas if an application
does not make much use of the allocation functions. does not make much use of the allocation functions.
.Pp .Pp
@roff_mag@In addition to multiple arenas, this allocator supports @roff_mag@In addition to multiple arenas, this allocator supports
@roff_mag@thread-specific caching for small objects (smaller than one page), in @roff_mag@thread-specific caching for small and medium objects, in order to make
@roff_mag@order to make it possible to completely avoid synchronization for most @roff_mag@it possible to completely avoid synchronization for most small and
@roff_mag@small allocation requests. @roff_mag@medium allocation requests.
@roff_mag@Such caching allows very fast allocation in the common case, but it @roff_mag@Such caching allows very fast allocation in the common case, but it
@roff_mag@increases memory usage and fragmentation, since a bounded number of @roff_mag@increases memory usage and fragmentation, since a bounded number of
@roff_mag@objects can remain allocated in each thread cache. @roff_mag@objects can remain allocated in each thread cache.
@@ -348,23 +352,27 @@ Chunks are always aligned to multiples of the chunk size.
This alignment makes it possible to find metadata for user objects very This alignment makes it possible to find metadata for user objects very
quickly. quickly.
.Pp .Pp
User objects are broken into three categories according to size: small, large, User objects are broken into four categories according to size: small, medium,
and huge. large, and huge.
Small objects are smaller than one page. Small objects are smaller than one page.
Medium objects range from one page to an upper limit determined at run time (see
the
.Dq M
option).
Large objects are smaller than the chunk size. Large objects are smaller than the chunk size.
Huge objects are a multiple of the chunk size. Huge objects are a multiple of the chunk size.
Small and large objects are managed by arenas; huge objects are managed Small, medium, and large objects are managed by arenas; huge objects are managed
separately in a single data structure that is shared by all threads. separately in a single data structure that is shared by all threads.
Huge objects are used by applications infrequently enough that this single Huge objects are used by applications infrequently enough that this single
data structure is not a scalability issue. data structure is not a scalability issue.
.Pp .Pp
Each chunk that is managed by an arena tracks its contents as runs of Each chunk that is managed by an arena tracks its contents as runs of
contiguous pages (unused, backing a set of small objects, or backing one large contiguous pages (unused, backing a set of small or medium objects, or backing
object). one large object).
The combination of chunk alignment and chunk page maps makes it possible to The combination of chunk alignment and chunk page maps makes it possible to
determine all metadata regarding small and large allocations in constant time. determine all metadata regarding small and large allocations in constant time.
.Pp .Pp
Small objects are managed in groups by page runs. Small and medium objects are managed in groups by page runs.
Each run maintains a bitmap that tracks which regions are in use. Each run maintains a bitmap that tracks which regions are in use.
@roff_tiny@Allocation requests that are no more than half the quantum (8 or 16, @roff_tiny@Allocation requests that are no more than half the quantum (8 or 16,
@roff_tiny@depending on architecture) are rounded up to the nearest power of @roff_tiny@depending on architecture) are rounded up to the nearest power of
@@ -380,10 +388,17 @@ Allocation requests that are more than the minumum cacheline-multiple size
class, but no more than the minimum subpage-multiple size class (see the class, but no more than the minimum subpage-multiple size class (see the
.Dq C .Dq C
option) are rounded up to the nearest multiple of the cacheline size (64). option) are rounded up to the nearest multiple of the cacheline size (64).
Allocation requests that are more than the minimum subpage-multiple size class Allocation requests that are more than the minimum subpage-multiple size class,
are rounded up to the nearest multiple of the subpage size (256). but no more than the maximum subpage-multiple size class are rounded up to the
Allocation requests that are more than one page, but small enough to fit in nearest multiple of the subpage size (256).
an arena-managed chunk (see the Allocation requests that are more than the maximum subpage-multiple size class,
but no more than the maximum medium size class (see the
.Dq M
option) are rounded up to the nearest medium size class; spacing is an
automatically determined power of two and ranges from the subpage size to the
page size.
Allocation requests that are more than the maximum medium size class, but small
enough to fit in an arena-managed chunk (see the
.Dq K .Dq K
option), are rounded up to the nearest run size. option), are rounded up to the nearest run size.
Allocation requests that are too large to fit in an arena-managed chunk are Allocation requests that are too large to fit in an arena-managed chunk are
@@ -444,7 +459,7 @@ The
variable allows the programmer to override the function which emits variable allows the programmer to override the function which emits
the text strings forming the errors and warnings if for some reason the text strings forming the errors and warnings if for some reason
the the
.Dv stderr .Dv STDERR_FILENO
file descriptor is not suitable for this. file descriptor is not suitable for this.
Please note that doing anything which tries to allocate memory in Please note that doing anything which tries to allocate memory in
this function is likely to result in a crash or deadlock. this function is likely to result in a crash or deadlock.
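
The rounding rule for medium requests described in the manual page changes above can be sketched as follows. This is an illustration only, not code from this commit: the 4 KiB page size, the macro names, and the function medium_round_up() are assumptions, while the 2 KiB spacing and the 32 KiB maximum are the defaults stated in the commit message.

```c
#include <stddef.h>

/* Hypothetical constants; only the defaults come from the commit message. */
#define SKETCH_PAGE_SIZE      ((size_t)4096)     /* assumed 4 KiB page */
#define SKETCH_MEDIUM_MIN     SKETCH_PAGE_SIZE   /* smallest medium class */
#define SKETCH_MEDIUM_MAX     ((size_t)32 << 10) /* default 32 KiB maximum */
#define SKETCH_MEDIUM_SPACING ((size_t)2 << 10)  /* default 2 KiB spacing */

/*
 * Round a request up to the nearest medium size class.
 * The caller is assumed to have already routed smaller requests to the
 * small size classes, so anything at or below one page maps to the
 * smallest medium class.  Returns 0 if the request is larger than the
 * maximum medium class (it would be a large or huge allocation).
 */
static size_t
medium_round_up(size_t size)
{
	if (size > SKETCH_MEDIUM_MAX)
		return 0;
	if (size <= SKETCH_MEDIUM_MIN)
		return SKETCH_MEDIUM_MIN;
	/* The spacing is a power of two, so masking rounds up to a multiple. */
	return (size + SKETCH_MEDIUM_SPACING - 1) & ~(SKETCH_MEDIUM_SPACING - 1);
}
```

Under these assumptions, a 5000-byte request would map to the 6144-byte class, and a request of exactly 32768 bytes would stay in the largest medium class.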

File diff suppressed because it is too large
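
The effect of the 'M' and 'm' flags on the maximum medium size class can be sketched as below. This is a rough model of the documented behavior, not the allocator's option parser: the function name, the 4 KiB page size, and the choice to ignore a flag that would leave the valid range (one page to one half chunk) are all assumptions, as is the convention that the uppercase flag doubles and the lowercase flag halves. The 32 KiB default and the 4 MiB chunk size are taken from the commit message.

```c
#include <stddef.h>

/* Hypothetical constants; defaults are taken from the commit message. */
#define SKETCH_PAGE_SIZE          ((size_t)4096)     /* assumed 4 KiB page */
#define SKETCH_CHUNK_SIZE         ((size_t)4 << 20)  /* default 4 MiB chunk */
#define SKETCH_MEDIUM_MAX_DEFAULT ((size_t)32 << 10) /* default 32 KiB */

/*
 * Apply a MALLOC_OPTIONS-style flag string to the maximum medium size.
 * 'M' doubles and 'm' halves the value; a flag that would push the value
 * outside [one page, one half chunk] is ignored (an assumption).
 * All other characters are skipped.
 */
static size_t
medium_max_from_options(const char *opts)
{
	size_t max = SKETCH_MEDIUM_MAX_DEFAULT;

	for (; *opts != '\0'; opts++) {
		if (*opts == 'M' && max * 2 <= SKETCH_CHUNK_SIZE / 2)
			max *= 2;
		else if (*opts == 'm' && max / 2 >= SKETCH_PAGE_SIZE)
			max /= 2;
	}
	return max;
}
```

With these assumptions, the string "M" would raise the limit to 64 KiB, and "mmm" would lower it to a single 4 KiB page, after which further 'm' flags would have no effect.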