Same Name, Different Data: How Thread-Local Storage Works

1. Introduction

When writing multithreaded software, one of the first challenges you

face is shared data. A global variable might seem harmless---until

two threads try to update it at the same time. Suddenly, you've got race

conditions, corrupted values, and a debugging headache that keeps you up

at night.

The usual advice is to protect shared variables with **locks or

mutexes**, but that comes at a cost: extra complexity, performance hits,

and the constant risk of deadlocks. But what if some variables don't

actually need to be shared at all? What if every thread could have its

own private copy?

That's exactly what Thread-Local Storage (TLS) provides. Instead of

forcing threads to fight over a single global variable, TLS gives each

thread its own instance. Same name, same code access, but under the hood

each thread works with its own memory slot.

2. What is Thread-Local Storage?

Thread-Local Storage is a mechanism that allows each thread to maintain

a unique copy of a variable. Conceptually, it feels like a global

variable---you can access it from anywhere in your code---but in

reality, the value you see belongs only to the current thread.

This is different from stack variables, which are temporary and vanish

after a function returns. TLS variables persist for the lifetime of the

thread, making them perfect for storing per-thread state like counters,

buffers, or error codes.

![](https://firebasestorage.googleapis.com/v0/b/wadix-458217.firebasestorage.app/o/blog_posts%2F8f8b55cf-4340-403f-bb0e-2bd6f42e7544%2Fattachments%2Fglobal_vs_tls.png?alt=media&token=da7f57b5-2d04-4d26-9a2d-f8ae82e5d267)

Figure 1 – Global vs Thread-Local Variable in Two Threads

In short: **globals are shared, stacks are temporary, TLS is private and

persistent.** It's a simple concept, but one that solves a big problem

in concurrent programming.

3. Why Do We Need TLS?

The need for TLS shows up the moment threads start stepping on each

other's toes. Imagine a logging system where multiple threads write to

the same global buffer. Without careful locking, the messages get

scrambled together like two people talking over each other on the same

phone line. With TLS, each thread can write to its own buffer, and

later those logs can be merged cleanly.

A classic real-world example is the errno variable in C. When a

system call or library function fails, it sets errno to indicate the error. But in a

multithreaded program, each thread might be dealing with different

errors at the same time. If errno were global, one thread could

overwrite another's value. TLS solves this by making errno thread-local:

each thread gets its own error code, safely isolated from the rest.

TLS is the quiet workhorse that keeps multithreaded software sane: no

lock overhead, no messy synchronization, and no accidental overwrites.

Just simple, per-thread data isolation.

4. How TLS Works in C11

C11 introduced the \_Thread_local storage-class specifier, giving C

programmers a standard way to declare variables that are local to each

thread. This means every thread has its own independent copy of the

variable, even though the name is the same across the program.

For example:

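    #include <stdio.h>
    #include <pthread.h>

    /* Each thread gets its own instance, initialized to 0. */
    _Thread_local int counter = 0;

    void *worker(void *arg)
    {
        int id = *(int *)arg;

        counter++;  /* touches only the calling thread's copy */
        printf("Thread %d counter: %d \n", id, counter);
        return NULL;
    }

    int main(void)
    {
        pthread_t t1, t2;
        int a = 1, b = 2;

        /* Run the threads one after the other so the output order is fixed. */
        pthread_create(&t1, NULL, worker, &a);
        pthread_join(t1, NULL);

        pthread_create(&t2, NULL, worker, &b);
        pthread_join(t2, NULL);

        return 0;
    }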

If we compile and run this, the output looks like:

![](https://firebasestorage.googleapis.com/v0/b/wadix-458217.firebasestorage.app/o/blog_posts%2F8f8b55cf-4340-403f-bb0e-2bd6f42e7544%2Fattachments%2Fresult1.png?alt=media&token=7a8f1b43-45ba-45a7-ab99-7ced43dce0cf)

Each thread starts with its own counter at zero, so when it

increments, it prints 1. The two threads don't interfere with each other

because each has its own private copy of the variable.

Now let's see what happens without \_Thread_local, if we just

declare counter as a normal global variable (int counter = 0;).

Running the same program gives:

![](https://firebasestorage.googleapis.com/v0/b/wadix-458217.firebasestorage.app/o/blog_posts%2F8f8b55cf-4340-403f-bb0e-2bd6f42e7544%2Fattachments%2Fresult2.png?alt=media&token=527942e2-969a-4ccf-b645-66ea74738785)

Here, both threads are incrementing the same global counter. The

first thread sets it to 1, and the second thread sees that new value and

increments it to 2.

To prove that \_Thread_local actually gives **separate per-thread

instances**, let's look at the assembly dump: globals compile to **one

fixed address**, while thread-locals compile to

thread-pointer-relative accesses.

When we declare int counter; as a plain global, the compiler generates

instructions like this:

Global counter assembly

Now compare that with the code generated for _Thread_local int counter:

Thread-local counter assembly

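Instead of a %rip-relative address (typically of the form counter(%rip) on

x86-64 Linux), the compiler uses the special %fs segment register with an

offset (typically of the form %fs:counter@tpoff). %fs is set differently

for each thread by the OS and pthreads runtime, so it points at that

thread's own TLS area. That means the same symbol counter resolves to a

different memory location depending on which thread is running.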

At the C source level, both versions look almost identical, but the

assembly reveals what's really happening under the hood.

5. Conclusion

Thread-local storage in C11 gives each thread its own private "copy" of

a variable, avoiding the conflicts of shared globals. At the C level the

two declarations look almost identical, but the assembly reveals the truth: globals map

to one fixed memory address, while \_Thread_local variables resolve

through the thread pointer, giving every thread its own instance. A tiny

keyword, but a powerful tool for writing safer, cleaner multithreaded

code.