Right off the bat, we'll dive into this subject by defining what concurrency is. Since it is quite easy to confuse "concurrent" with "parallel", we will try to make a clear distinction between the two from the get-go.
Concurrency is about dealing with a lot of things at the same time.
Parallelism is about doing a lot of things at the same time.
We call the concept of progressing multiple tasks at the same time multitasking. There are two ways to multitask. One is by progressing tasks concurrently, but not at the same time. Another is to progress tasks at the exact same time, in parallel.
- Resource: Something we need to be able to progress a task. Our resources are limited. This could be CPU time or memory.
- Task: A set of operations that requires some kind of resource to progress. A task must consist of several sub-operations.
- Parallel: Something happening independently at the exact same time.
- Concurrent: Tasks that are in progress at the same time, but not necessarily progressing simultaneously.
This is an important distinction. If two tasks are running concurrently, but are not running in parallel, they must be able to stop and resume their progress. We say that a task is interruptible if it allows for this kind of concurrency.
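To make the stop-and-resume idea concrete, here is a minimal sketch using Python generators (the task names and the round-robin scheduler are made up for illustration). Each `yield` is a point where a task can be paused and later resumed, which is exactly what concurrency without parallelism requires:

```python
def make_coffee():
    # Each yield is a point where this task can be interrupted.
    yield "ground beans"
    yield "brewed coffee"
    yield "poured cup"

def check_mail():
    yield "opened inbox"
    yield "read message"

def run_concurrently(tasks):
    """A tiny round-robin scheduler: advance each task one step at a time."""
    results = []
    while tasks:
        task = tasks.pop(0)
        try:
            results.append(next(task))
            tasks.append(task)  # not finished: put it back in the queue
        except StopIteration:
            pass                # task is done
    return results

steps = run_concurrently([make_coffee(), check_mail()])
print(steps)
# The steps of the two tasks are interleaved, even though only one of them
# is ever running at any given instant.
```

Only one task runs at any moment, yet both are "in progress at the same time" — that is concurrency without parallelism.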
I firmly believe the main reason we find parallel and concurrent programming hard to reason about stems from how we model events in our everyday life. We tend to define these terms loosely, so our intuition is often wrong.
It doesn't help that "concurrent" is defined in the dictionary as "operating or occurring at the same time," which doesn't really help us much when trying to describe how it differs from "parallel."
For me, this first clicked when I started to understand why we want to make a distinction between parallel and concurrent in the first place!
The why has everything to do with resource utilization and efficiency.
Efficiency is the (often measurable) ability to avoid wasting materials, energy, efforts, money, and time in doing something or in producing a desired result.
Parallelism is about increasing the resources we use to solve a task. It has nothing to do with efficiency.
Concurrency, on the other hand, has everything to do with efficiency and resource utilization. Concurrency can never make one single task go faster. It can only help us utilize our resources better and thereby finish a set of tasks faster.
In businesses that manufacture goods, we often talk about LEAN processes, and it's easy to see the parallel to why programmers care so much about what we can achieve by handling tasks concurrently.
I'll let this 3-minute video explain it for me:
Ok, so it's not the newest video on the subject, but it explains a lot in 3 minutes, most importantly the gains we try to achieve when applying LEAN techniques: eliminating waiting and non-value-adding tasks.
In programming, we could say that we want to avoid polling (in a busy loop).
Now would adding more resources (more workers) help in the video above? Yes, but we use double the resources to produce the same output as one person with an optimal process could do. That's not the best utilization of our resources.
To continue the parallel we started, we could say that we could solve the problem of a freezing UI while waiting for an I/O event to occur by spawning a new thread and polling in a loop or blocking there instead of our main thread. However, that new thread is either consuming resources doing nothing, or worse, using one core to busy loop while checking if an event is ready. Either way, it's not optimal, especially if you run a server you want to utilize fully.
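The two wasteful strategies just described can be sketched like this (the event and the timing are made up for illustration): one spawned thread busy-polls a flag, burning CPU on every check, while another simply blocks until the event fires:

```python
import threading
import time

event = threading.Event()
poll_iterations = 0

def busy_poll():
    # Burns CPU checking over and over whether the event is ready.
    global poll_iterations
    while not event.is_set():
        poll_iterations += 1  # every iteration here is wasted work

def block():
    # Sleeps inside the OS until the event fires: no CPU wasted while waiting,
    # but the thread itself still consumes resources doing nothing.
    event.wait()

t1 = threading.Thread(target=busy_poll)
t2 = threading.Thread(target=block)
t1.start()
t2.start()
time.sleep(0.05)  # simulate the I/O operation taking a while
event.set()       # the "I/O event" occurs
t1.join()
t2.join()

print(poll_iterations > 0)  # the busy loop spun many times for nothing
```

Both threads "solve" the waiting problem, but the first wastes a core and the second wastes a thread — which is why we'd rather have the waiting thread do useful work instead.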
If you consider the coffee machine as some I/O resource, we would like to start that process, then move on to preparing the next job, or do other work that needs to be done instead of waiting.
But doesn't that mean there are things happening in parallel here?
Yes, the coffee machine is doing work while the "worker" is doing maintenance and filling water. But this is the crux: Our reference frame is the worker, not the whole system. The guy making coffee is your code.
It's the same when you make a database query. After you've sent the query to the database server, the CPU on the database server will be working on your request while you wait for a response. In practice, it's a way of parallelizing your work.
Concurrency is about working smarter. Parallelism is a way of throwing more resources at the problem.
As you might understand from what I've written so far, writing async code mostly makes sense when you need to be smart to make optimal use of your resources.
Now, if you write a program that is working hard to solve a problem, concurrency often doesn't help. This is where parallelism comes into play, since it gives you a way to throw more resources at the problem if you can split it into parts that you can work on in parallel.
I can see two major use cases for concurrency:
- When performing I/O and you need to wait for some external event to occur
- When you need to divide your attention and prevent one task from waiting too long
The first is the classic I/O example: you have to wait for a network call, a database query or something else to happen before you can progress a task. However, you have many tasks to do so instead of waiting you continue work elsewhere and either check in regularly to see if the task is ready to progress or make sure you are notified when that task is ready to progress.
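The first case can be sketched with Python's asyncio (the "network calls" here are simulated with `asyncio.sleep`, an assumption for illustration): while one task waits for its I/O, the others make progress, so the waits overlap instead of adding up:

```python
import asyncio
import time

async def fake_request(name, delay):
    await asyncio.sleep(delay)  # stand-in for waiting on a socket
    return name

async def main():
    start = time.monotonic()
    # All three "requests" wait concurrently on a single thread.
    results = await asyncio.gather(
        fake_request("db query", 0.10),
        fake_request("http call", 0.10),
        fake_request("file read", 0.10),
    )
    elapsed = time.monotonic() - start
    return results, elapsed

results, elapsed = asyncio.run(main())
print(results)   # ['db query', 'http call', 'file read']
print(elapsed)   # roughly 0.1 s, far less than the 0.3 s a sequential version needs
```

The total time is roughly the longest single wait, because the runtime is notified when each task is ready to progress rather than waiting for them one by one.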
The second is an example that is often the case when having a UI. Let's pretend you only have one core. How do you prevent the whole UI from becoming unresponsive while performing other CPU intensive tasks?
Well, you can stop whatever task you're doing every 16ms, and run the "update UI" task, and then resume whatever you were doing afterwards. This way, you will have to stop/resume your task 60 times a second, but you will also have a fully responsive UI which has roughly a 60 Hz refresh rate.
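A minimal sketch of that 16 ms time-slicing idea (the "UI update" is just a counter here, and the CPU-bound work is a dummy loop): the heavy task is split into small resumable chunks, and between chunks we check whether a frame deadline has passed:

```python
import time

FRAME = 0.016  # 16 ms, roughly one frame at a 60 Hz refresh rate

def heavy_work_chunks(n):
    """The CPU-bound task, split into small resumable chunks."""
    for i in range(n):
        yield i  # each yield is a point where we can be interrupted

def run_with_ui(task):
    ui_updates = 0
    deadline = time.monotonic() + FRAME
    for _ in task:
        if time.monotonic() >= deadline:
            ui_updates += 1  # stand-in for running the "update UI" task
            deadline = time.monotonic() + FRAME
    return ui_updates

updates = run_with_ui(heavy_work_chunks(1_000_000))
print(updates >= 1)  # the UI got its turns even though only one "core" ran
```

This is cooperative scheduling in miniature: the heavy task volunteers to be interrupted often enough that the UI never starves.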
We'll cover threads a bit more when we talk about strategies for handling I/O, but I'll mention them here as well. One challenge when using OS threads to understand concurrency is that they appear to be mapped to cores. That's not necessarily a correct mental model to use, even though most operating systems will try to map one thread to one core as long as the number of threads doesn't exceed the number of cores.
Once we create more threads than there are cores, the OS will switch between our
threads and progress each of them
concurrently using the scheduler to give each
thread some time to run. And you also have to consider the fact that your program
is not the only one running on the system. Other programs might spawn several threads
as well which means there will be many more threads than there are cores on the CPU.
Therefore, threads can be a means to perform tasks in parallel, but they can also be a means to achieve concurrency.
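This can be seen directly with threads from the standard library: nothing stops us from creating far more threads than we have cores, and the OS scheduler interleaves them for us (the 4x multiplier below is an arbitrary choice for illustration):

```python
import os
import threading

core_count = os.cpu_count() or 1
results = []
lock = threading.Lock()

def work(i):
    # Each thread does a tiny piece of work; the lock keeps the list safe.
    with lock:
        results.append(i)

# Spawn four times as many threads as there are cores.
threads = [threading.Thread(target=work, args=(i,)) for i in range(core_count * 4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(results) == core_count * 4)  # every thread ran, cores notwithstanding
```

Whether those threads actually ran in parallel or were merely scheduled concurrently is the OS's decision, not ours — which is exactly the point.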
This brings me to the last point about concurrency: it needs to be defined within some sort of reference frame.
When you write code that is perfectly synchronous from your perspective, stop for a second and consider how that looks from the operating system perspective.
The operating system might not run your code from start to end at all. It might stop and resume your process many times. The CPU might get interrupted and handle some input while you think it's only focused on your task.
So synchronous execution is only an illusion. But from the perspective of you as a programmer, it's not, and that is the important takeaway:
When we talk about concurrency without providing any other context, we are using you as a programmer and your code (your process) as the reference frame. If you start pondering concurrency without keeping this in the back of your head, it will get confusing very fast.
The reason I spend so much time on this is that once you realize it, you'll start to see that some of the things you hear and learn that might seem contradictory really aren't. You'll just have to consider the reference frame first.
If this still sounds complicated, I understand. Just sitting and reflecting on concurrency is difficult, but if we try to keep these thoughts in the back of our heads when we work with async code, I promise it will get less and less confusing.