How ARM Ensures Atomicity: LDREX, STREX Explained
1. Introduction: Why Atomic Access Matters
As embedded developers, we often write C code that reads or modifies memory, such as RecivedFramesCounter++, and assume it will execute without interruption. In reality, that assumption rarely holds.
At the assembly level, such an operation is split into three steps: load, modify, and store. This sequence is not atomic: between any two of those steps, an interrupt service routine or an RTOS task switch can preempt our code, and a DMA transfer can modify the same memory.
When that happens, the data can be corrupted, breaking the logic of our code or causing undefined behavior.
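To make the failure concrete, here is a minimal host-side sketch of a lost update. It is not target code: the "interrupt" is an ordinary function call placed by hand between the load and the store, and the names frames_counter, fake_isr, and increment_with_interrupt are hypothetical stand-ins chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical shared counter, standing in for RecivedFramesCounter. */
static volatile uint32_t frames_counter = 0U;

/* Stand-in for an ISR that also increments the shared counter. */
static void fake_isr(void)
{
    frames_counter = frames_counter + 1U;
}

/* Performs "counter++" as three explicit steps and lets the simulated
 * ISR run between the load and the store. Returns the final value. */
uint32_t increment_with_interrupt(uint32_t start)
{
    frames_counter = start;

    uint32_t tmp = frames_counter;        /* 1. load                      */
    fake_isr();                           /* "interrupt" fires here       */
    assert(frames_counter == start + 1U); /* the ISR's update is in memory */
    tmp = tmp + 1U;                       /* 2. modify the stale copy     */
    frames_counter = tmp;                 /* 3. store: ISR update is lost */

    return frames_counter;
}
```

Two increments ran, but the counter only advances by one: starting from 0, the function returns 1 rather than 2, because the final store writes back a value computed from the stale copy.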
To address this, ARM introduced the exclusive load (LDREX) and exclusive store (STREX) instructions. Together, they allow us to build atomic memory operations efficiently.
2. What Are Atomic Operations?
If two observers in our system try to access the same memory location at the same time, an atomic operation guarantees that this shared access will never cause data corruption.
An atomic operation appears as an indivisible sequence, even if it takes multiple CPU instructions internally.
3. How ARM Provides Atomic Access: LDREX and STREX
To support atomic memory access in concurrent systems, ARM provides two special instructions:
LDREX – Load Exclusive
• Marks a memory address as exclusive for the current context.
• The processor sets a local exclusive monitor to track the address.
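• If another context writes to the same memory, or the monitor is reset in the meantime (for example by a CLREX instruction or, on Cortex-M, by exception entry or return), the exclusivity is cleared.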
STREX – Store Exclusive
• Attempts to store the modified value back to memory only if exclusivity is still valid.
• If another context interfered since LDREX, the STREX fails: memory is left unchanged and the status register receives 1.
• If there was no interference, STREX succeeds: the value is written and the status register receives 0.
This pair of instructions enables a read-modify-write operation that is only successful if no other context has accessed the memory in the middle of the sequence, as demonstrated by the diagram below.

4. Exclusive Monitors
To ensure exclusive access to memory, ARM processors rely on a special hardware block called the Exclusive Monitor.
The Exclusive Monitor works like a state machine that tracks memory addresses.
When you perform an LDREX, the monitor transitions from Open State to Exclusive State, marking the target address as being "watched."
If no other context accesses that memory, the STREX instruction will succeed, and the operation is considered atomic.
If another context modifies the same address, the monitor is cleared automatically, and STREX will fail, forcing the operation to retry safely.
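The Open/Exclusive state machine described above can be sketched as a small software model that runs on a host. This is a deliberate simplification for illustration only: real monitors are hardware, may track a whole address granule rather than one word, and can be cleared by events not modeled here. All names (exclusive_monitor_t, sim_ldrex, sim_strex, sim_interfering_write) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { MONITOR_OPEN, MONITOR_EXCLUSIVE } monitor_state_t;

/* Simplified model of one exclusive monitor (hypothetical type). */
typedef struct {
    monitor_state_t state;
    const volatile uint32_t *tagged; /* address currently being watched */
} exclusive_monitor_t;

/* Models LDREX: tag the address, move to Exclusive, return the value. */
uint32_t sim_ldrex(exclusive_monitor_t *m, const volatile uint32_t *addr)
{
    assert(m != NULL && addr != NULL);
    m->state = MONITOR_EXCLUSIVE;
    m->tagged = addr;
    return *addr;
}

/* Models STREX: store only if the monitor is still Exclusive for this
 * address. Returns 0 on success and 1 on failure, like the instruction. */
uint32_t sim_strex(exclusive_monitor_t *m, uint32_t value,
                   volatile uint32_t *addr)
{
    assert(m != NULL && addr != NULL);
    bool ok = (m->state == MONITOR_EXCLUSIVE) && (m->tagged == addr);

    /* In this model, every STREX attempt returns the monitor to Open. */
    m->state = MONITOR_OPEN;
    m->tagged = NULL;

    if (!ok) {
        return 1U; /* memory is left untouched */
    }
    *addr = value;
    return 0U;
}

/* Models an interfering context writing to the watched address. */
void sim_interfering_write(exclusive_monitor_t *m, volatile uint32_t *addr,
                           uint32_t value)
{
    assert(m != NULL && addr != NULL);
    *addr = value;
    if (m->tagged == addr) {
        m->state = MONITOR_OPEN;
        m->tagged = NULL;
    }
}
```

In this model, an LDREX followed directly by a STREX succeeds and returns 0, while an interfering write between the two makes the STREX return 1 and leaves the interfering value in memory, which is the situation where the caller has to retry.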

There are two types of exclusive monitors in ARM architecture:
• Local Monitor:
Used to ensure atomicity between threads or contexts running on the same CPU core.
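It tracks exclusive access locally. On Cortex-M cores it is cleared on exception entry and return, so an interrupt that preempts an LDREX/STREX sequence causes the STREX to fail. Writes by other bus masters, such as a DMA controller, are only detected if the system implements a global monitor for that memory.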
• Global Monitor:
Used to ensure atomicity between threads running on different cores in a multicore system.
It coordinates between cores and is cleared when another core writes to the same shared memory.

5. Atomic Sequence Implementation
The following function demonstrates how to implement an atomic increment using the CMSIS intrinsics __LDREXW and __STREXW, which map to the 32-bit LDREX and STREX instructions:
```c
#include <stdint.h>
/* __LDREXW and __STREXW are CMSIS-Core intrinsics; they are normally
 * made available by including your device header. */

void atomic_increment(volatile uint32_t *address)
{
    uint32_t value = 0U;
    uint32_t status = 0U;

    do
    {
        /* 1. Load the current value with exclusive access */
        value = __LDREXW(address);
        /* 2. Increment */
        value = value + 1U;
        /* 3. Try to store the modified value */
        status = __STREXW(value, address);
        /* 4. Check the result and retry if the store failed */
    } while (status != 0U);
}
```
6. Letting the Compiler Insert LDREX/STREX Automatically
Previously, we used LDREX/STREX instructions to build atomic access sequences manually.
However, for portability and code readability, the C11 standard introduced atomic operations in <stdatomic.h>.
These allow you to write portable atomic code that will compile to the correct hardware instructions (like LDREX/STREX on ARM) without needing to write assembly.
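The manual retry loop from the previous section has a portable counterpart in atomic_compare_exchange_weak, which is useful when no ready-made function covers the read-modify-write you need. The sketch below is an illustration, not part of the original example: atomic_increment_saturating is a hypothetical helper that increments a counter but never beyond a limit. On ARMv7-M targets, compilers typically lower such a loop to an LDREX/STREX sequence, though the exact output depends on the toolchain and options.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: atomically increments *counter unless it has
 * already reached `limit`. Returns the value observed before the update. */
uint32_t atomic_increment_saturating(_Atomic uint32_t *counter,
                                     uint32_t limit)
{
    assert(counter != NULL);

    uint32_t old = atomic_load(counter);
    uint32_t desired;

    do
    {
        if (old >= limit)
        {
            return old; /* already saturated, nothing to store */
        }
        desired = old + 1U;
        /* On failure, `old` is refreshed with the current value and the
         * loop retries, much like a failed STREX forces a retry. */
    } while (!atomic_compare_exchange_weak(counter, &old, desired));

    return old;
}
```

The weak variant is allowed to fail spuriously, which is why it is always used inside a loop; that matches the behavior of an exclusive store, which can fail even when no other context changed the value.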
For our increment example, we will use the atomic_fetch_add function:
```c
#include <stdint.h>
#include <stdatomic.h>

/* The pointed-to object is declared _Atomic, as the C11 atomic
 * functions require. */
void atomic_increment(volatile _Atomic uint32_t *address)
{
    atomic_fetch_add(address, 1U);
}
```
If we compile the previous atomic_increment function for an ARM Cortex-M4 using the following command:
``` bash
arm-none-eabi-gcc -mcpu=cortex-m4 -mthumb -O2 -c atomic_increment.c -o atomic_increment.o