Introduce a backport of C11 atomics
This introduces a backport of C11 atomics. It has four implementations; in
order of preference, they are:
- GCC/Clang __atomic builtins
- GCC/Clang __sync builtins
- MSVC _Interlocked builtins
- C11 atomics, from <stdatomic.h>
The primary advantages are:
- Close adherence to the standard API gives us a defined memory model.
- Type safety: atomic objects are now separate types from non-atomic ones, so
that it's impossible to mix up atomic and non-atomic updates (which is
undefined behavior that compilers are starting to take advantage of).
- Efficiency: we can specify ordering for operations, avoiding fences and
atomic operations on strongly ordered architectures (example:
`atomic_write_u32(ptr, val);` involves a CAS loop, whereas
`atomic_store(ptr, val, ATOMIC_RELEASE);` is a plain store).
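As a rough illustration of the type-safety and ordering points above, here is a minimal sketch written against plain C11 `<stdatomic.h>` rather than the backport itself. Every `demo_*` name is hypothetical and exists only for this example; the sketch assumes a compiler with C11 atomics support. The idea is that wrapping the value in a struct gives a distinct atomic type, and that each operation names its memory ordering explicitly.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a generated atomic type: wrapping the value in a
 * struct makes it a distinct type, so a plain `x = val` on it will not compile.
 */
typedef struct {
	_Atomic uint32_t repr;
} demo_atomic_u32_t;

/* Store with a caller-chosen ordering (hypothetical name). */
static inline void
demo_atomic_store_u32(demo_atomic_u32_t *a, uint32_t val, memory_order mo) {
	atomic_store_explicit(&a->repr, val, mo);
}

/* Load with a caller-chosen ordering (hypothetical name). */
static inline uint32_t
demo_atomic_load_u32(demo_atomic_u32_t *a, memory_order mo) {
	return atomic_load_explicit(&a->repr, mo);
}

/* Publish a value with release ordering, then read it back with acquire. */
static uint32_t
demo_publish_and_read(uint32_t val) {
	demo_atomic_u32_t slot;
	atomic_init(&slot.repr, 0);
	demo_atomic_store_u32(&slot, val, memory_order_release);
	uint32_t seen = demo_atomic_load_u32(&slot, memory_order_acquire);
	assert(seen == val);
	return seen;
}
```

Calling `demo_publish_and_read(42)` returns 42. On x86, compilers typically lower a release store like this one to an ordinary store instruction, which is the saving the efficiency bullet describes.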
This diff leaves in the current atomics API (implementing them in terms of the
backport). This lets us transition uses over piecemeal.
Testing:
This is by nature hard to test. I've manually tested the first three options on
Linux with gcc by futzing with the #defines manually, on FreeBSD with gcc and
clang, on MSVC, and on OS X with clang. All of these were x86 machines, though,
and we don't have any test infrastructure set up for non-x86 platforms.
2017-01-26 01:54:27 +08:00

#ifndef JEMALLOC_INTERNAL_ATOMIC_H
#define JEMALLOC_INTERNAL_ATOMIC_H

#include "jemalloc/internal/jemalloc_preamble.h"

#define JEMALLOC_U8_ATOMICS
#if defined(JEMALLOC_GCC_ATOMIC_ATOMICS)
#  include "jemalloc/internal/atomic_gcc_atomic.h"
#  if !defined(JEMALLOC_GCC_U8_ATOMIC_ATOMICS)
#    undef JEMALLOC_U8_ATOMICS
#  endif
#elif defined(JEMALLOC_GCC_SYNC_ATOMICS)
#  include "jemalloc/internal/atomic_gcc_sync.h"
#  if !defined(JEMALLOC_GCC_U8_SYNC_ATOMICS)
#    undef JEMALLOC_U8_ATOMICS
#  endif
#elif defined(_MSC_VER)
#  include "jemalloc/internal/atomic_msvc.h"
#elif defined(JEMALLOC_C11_ATOMICS)
#  include "jemalloc/internal/atomic_c11.h"
#else
#  error "Don't have atomics implemented on this platform."
#endif

#define ATOMIC_INLINE JEMALLOC_ALWAYS_INLINE

/*
 * This header gives more or less a backport of C11 atomics. The user can write
 * JEMALLOC_GENERATE_ATOMICS(type, short_type, lg_sizeof_type); to generate
 * counterparts of the C11 atomic functions for type, like so:
 *   JEMALLOC_GENERATE_ATOMICS(int *, pi, 3);
 * and then write things like:
 *   int *some_ptr;
 *   atomic_pi_t atomic_ptr_to_int;
 *   atomic_store_pi(&atomic_ptr_to_int, some_ptr, ATOMIC_RELAXED);
 *   int *prev_value = atomic_exchange_pi(&atomic_ptr_to_int, NULL,
 *       ATOMIC_ACQ_REL);
 *   assert(some_ptr == prev_value);
 * and expect things to work in the obvious way.
 *
 * Also included (with naming differences to avoid conflicts with the standard
 * library):
 *   atomic_fence(atomic_memory_order_t) (mimics C11's atomic_thread_fence).
 *   ATOMIC_INIT (mimics C11's ATOMIC_VAR_INIT).
 */

/*
 * Pure convenience, so that we don't have to type "atomic_memory_order_"
 * quite so often.
 */
#define ATOMIC_RELAXED atomic_memory_order_relaxed
#define ATOMIC_ACQUIRE atomic_memory_order_acquire
#define ATOMIC_RELEASE atomic_memory_order_release
#define ATOMIC_ACQ_REL atomic_memory_order_acq_rel
#define ATOMIC_SEQ_CST atomic_memory_order_seq_cst

/*
 * Another convenience -- simple atomic helper functions.
 * Note that each helper is a relaxed load followed by a separate relaxed
 * store, not a single atomic read-modify-write, so updates can be lost if
 * more than one thread writes concurrently.
 */
#define JEMALLOC_GENERATE_EXPANDED_INT_ATOMICS(type, short_type, \
    lg_size) \
JEMALLOC_GENERATE_INT_ATOMICS(type, short_type, lg_size) \
ATOMIC_INLINE void \
atomic_load_add_store_##short_type(atomic_##short_type##_t *a, \
    type inc) { \
	type oldval = atomic_load_##short_type(a, ATOMIC_RELAXED); \
	type newval = oldval + inc; \
	atomic_store_##short_type(a, newval, ATOMIC_RELAXED); \
} \
ATOMIC_INLINE void \
atomic_load_sub_store_##short_type(atomic_##short_type##_t *a, \
    type inc) { \
	type oldval = atomic_load_##short_type(a, ATOMIC_RELAXED); \
	type newval = oldval - inc; \
	atomic_store_##short_type(a, newval, ATOMIC_RELAXED); \
}

/*
 * Not all platforms have 64-bit atomics. If we do, this #define exposes that
 * fact.
 */
#if (LG_SIZEOF_PTR == 3 || LG_SIZEOF_INT == 3)
#  define JEMALLOC_ATOMIC_U64
#endif

JEMALLOC_GENERATE_ATOMICS(void *, p, LG_SIZEOF_PTR)

/*
 * There's no actual guarantee that sizeof(bool) == 1, but it's true on the only
 * platform that actually needs to know the size, MSVC.
 */
JEMALLOC_GENERATE_ATOMICS(bool, b, 0)

JEMALLOC_GENERATE_EXPANDED_INT_ATOMICS(unsigned, u, LG_SIZEOF_INT)

JEMALLOC_GENERATE_EXPANDED_INT_ATOMICS(size_t, zu, LG_SIZEOF_PTR)

JEMALLOC_GENERATE_EXPANDED_INT_ATOMICS(ssize_t, zd, LG_SIZEOF_PTR)

JEMALLOC_GENERATE_EXPANDED_INT_ATOMICS(uint8_t, u8, 0)

JEMALLOC_GENERATE_EXPANDED_INT_ATOMICS(uint32_t, u32, 2)

#ifdef JEMALLOC_ATOMIC_U64
JEMALLOC_GENERATE_EXPANDED_INT_ATOMICS(uint64_t, u64, 3)
#endif

#undef ATOMIC_INLINE

#endif /* JEMALLOC_INTERNAL_ATOMIC_H */