How Linux Executes a Program Internally and How Scheduling Works in Linux

Linux is an excellent subject for studying operating systems, because its source code is open to every programmer, computer science student, systems programmer, and researcher.

Studying Linux helps you understand not only operating systems, but also how a large system is built and how multithreading and memory allocation work in practice.

Understanding process scheduling and memory management helps programmers build large, complex systems.

In this article, we are going to see how Linux executes a program internally.

We will see the end-to-end flow of execution of a program, from launching to CPU execution.

Linux itself does not execute your code; the CPU does. The Linux kernel's job is to load the executable code into memory and to decide when the CPU runs it.

Linux represents every running program as a process in memory and then schedules that process onto a CPU for execution.

How Linux executes a Program from the Shell or the GUI

In Linux, each running program is represented by a process, and only an existing process can create a new process.

For detailed information about what a process is in Linux, read it here.

Let’s begin the journey of a program, from launch to execution, and see what happens inside Linux.

Graphical User Interface (GUI):

The most common way to run a program is from a Linux desktop. When a user double-clicks a program icon, the desktop environment (such as GNOME or KDE) acts as the parent process.

The Shell (CLI):

When the user types ./the_program and presses Enter, the shell (Bash or Zsh) acts as the parent process.

So the request to run a program can come from the GUI or from the shell. Either way, the GUI or the shell is the parent process of the new program, because in Linux only an existing process can create a new one. Let’s continue.

When you run a program from the shell, the shell first parses the command and then resolves the path of the executable.

If you run a program from the GUI, the desktop environment resolves the path in the same way.

After these steps, either the GUI or the shell asks the kernel, through system calls, to start the program.

From this point on, the flow is the same whether the program was started from the GUI or from the shell.

The parent first creates a new process for the program the user started, typically with fork(), and that new child process then calls execve() to load the program.
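
The steps above (resolve the path, create a child process, ask the kernel to load the program) can be sketched in a few lines of Python. This is only an illustration of the common fork-then-exec pattern, not the actual code of Bash or GNOME; the function name launch is made up for this example, while os.fork, os.execve, os.waitpid, and shutil.which are real standard-library calls.

```python
import os
import shutil
import sys


def launch(argv):
    """Run argv[0] roughly the way a shell might: resolve, fork, exec, wait."""
    # Step 1: resolve the program name to a full path (PATH lookup).
    path = shutil.which(argv[0])
    if path is None:
        raise FileNotFoundError(argv[0])

    # Step 2: the parent creates a child process.
    pid = os.fork()
    if pid == 0:
        # Child: ask the kernel to replace this process image with the program.
        try:
            os.execve(path, argv, os.environ)
        except OSError:
            # Reached only if execve failed (bad permissions, bad format, ...).
            os._exit(127)

    # Step 3: the parent waits for the child and reads its exit status.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)  # child was killed by a signal


if __name__ == "__main__":
    code = launch([sys.executable, "-c", "raise SystemExit(3)"])
    print("child exited with", code)
```

Note that the child is a brand-new process before execve() is ever called; execve() does not create a process, it only replaces the program running inside one.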

execve():

execve() is the system call through which a process asks the kernel to replace its current program image with a new program. The newly created process calls it to load the program that the user launched.

execve() checks that the file is a valid executable and that the current user has permission to run it.
If the user does not have execute permission, the request is rejected immediately with an error (EACCES).

If the kernel finds that the permissions are in order, execve() checks whether the binary is in a format the kernel can load; for a compiled program, that means checking whether the file follows the ELF format.

ELF stands for Executable and Linkable Format.
An executable in ELF format that was built for the machine's CPU architecture is one the Linux kernel can load directly.
When programmers compile a program on Linux, the toolchain produces an ELF file by default.

Once Linux has verified the ELF format, it starts preparing for the actual execution.
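
To make the two checks concrete, here is a rough user-space approximation in Python. The kernel performs these checks itself inside execve(); the function names can_execute, is_elf, and preflight are invented for this sketch, and it ignores other formats the kernel accepts (for example scripts that start with #!). The only hard fact it relies on is that every ELF file begins with the four bytes 0x7F 'E' 'L' 'F'.

```python
import os

ELF_MAGIC = b"\x7fELF"  # the first four bytes of every ELF file


def can_execute(path):
    """Return True if the current user has execute permission on path."""
    return os.access(path, os.X_OK)


def is_elf(path):
    """Return True if the file starts with the ELF magic number."""
    with open(path, "rb") as f:
        return f.read(4) == ELF_MAGIC


def preflight(path):
    """Mimic the order of checks described above: permission, then format."""
    if not can_execute(path):
        return "rejected: no execute permission"
    if not is_elf(path):
        return "rejected: not an ELF binary"
    return "ok"
```

On a typical Linux system, calling preflight on a compiled binary such as the Python interpreter itself should report "ok", while a plain text file without the execute bit is rejected at the first check.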

ELF Loader:

Once the program has passed all of the checks in execve(), the kernel hands control to its ELF loader.

The job of the ELF loader is to prepare the resources needed for execution; it sets up the memory layout of the process.

Linux relies heavily on virtual memory: each process gets its own virtual address space, and because physical memory is limited, pages of the program are brought into physical memory only when they are needed.

The ELF loader creates the virtual memory layout by mapping the program's code and data into that address space.

It also sets up the stack and the starting point of the heap for the process.
Finally, the ELF loader records the entry point of the program so that the CPU can later begin execution at that address.
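
The entry point mentioned above is stored in the ELF header, so we can read it ourselves. The sketch below is not how the kernel's loader is written; it is a small Python illustration with a made-up function name, elf_entry_point. It relies on the documented ELF header layout: byte 4 gives the class (1 for 32-bit, 2 for 64-bit), byte 5 gives the byte order (1 for little-endian), and the e_entry field starts at offset 0x18.

```python
import struct


def elf_entry_point(header):
    """Return the entry-point address stored in an ELF header (bytes)."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is_64bit = header[4] == 2                 # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    endian = "<" if header[5] == 1 else ">"   # EI_DATA: 1 = little-endian
    # e_entry sits at offset 0x18; it is 8 bytes wide in ELF64, 4 in ELF32.
    fmt = endian + ("Q" if is_64bit else "I")
    (entry,) = struct.unpack_from(fmt, header, 0x18)
    return entry


if __name__ == "__main__":
    import sys
    with open(sys.executable, "rb") as f:     # the Python interpreter itself
        head = f.read(64)
    if head[:4] == b"\x7fELF":
        print("entry point:", hex(elf_entry_point(head)))
```

For a dynamically linked program the kernel usually starts the dynamic linker first, which then jumps to this address, so treat the value as where the program's own code begins rather than the very first instruction executed.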

Control passed back to execve():

After the ELF loader finishes successfully, control returns to execve().

The ELF loader has prepared a process image, but execution has not started yet.

The process is marked as runnable, but the program actually runs only when the scheduler picks that process and gives it a CPU.

When the scheduler picks the process, a context switch happens. In a context switch, the CPU's registers, including the instruction pointer, are loaded with the state of the chosen process; for a freshly loaded program the instruction pointer is set to its entry point, and execution starts.

Since Linux is a multitasking system, a running program can be paused at any time. When that happens, the kernel saves the CPU register values, including the address at which the program stopped, so that they can be restored later and the program can resume from exactly that point.
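
The save-and-restore idea behind a context switch can be shown with a toy model. This is purely a simulation for illustration: the Task and ToyCPU classes are invented here, a real switch is done in kernel code with help from the hardware, and a real CPU has many more registers than this sketch tracks.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    ip: int                                    # saved instruction pointer
    regs: dict = field(default_factory=dict)   # other saved registers


class ToyCPU:
    def __init__(self):
        self.ip = 0
        self.regs = {}
        self.current = None

    def switch_to(self, task):
        # Save the state of the outgoing task so it can resume later.
        if self.current is not None:
            self.current.ip = self.ip
            self.current.regs = dict(self.regs)
        # Load the saved state of the incoming task.
        self.ip = task.ip
        self.regs = dict(task.regs)
        self.current = task

    def run(self, steps):
        # Pretend each step executes one instruction.
        self.ip += steps


if __name__ == "__main__":
    cpu = ToyCPU()
    a = Task("a", ip=0x1000)   # a new task starts at its entry point
    b = Task("b", ip=0x2000)
    cpu.switch_to(a)
    cpu.run(5)
    cpu.switch_to(b)           # a is paused; its ip is saved
    cpu.run(2)
    cpu.switch_to(a)           # a resumes where it stopped
    print(hex(cpu.ip))         # prints 0x1005
```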

Linux Process Scheduling

As we discussed, once a process is runnable, the scheduler picks it depending on the availability of a processor.
So it is also necessary to understand how the Linux scheduler works.
Linux is an efficient multitasking operating system, and its scheduling algorithm is highly optimized.
The Linux scheduler is the central decision maker: it decides which process executes next and which one has to wait.
Linux is a preemptive multitasking operating system, which means the kernel can interrupt the currently executing process and run another one.
Linux runs millions of servers and complex systems, and this efficient multitasking scheduler plays a key role there.
For ordinary processes, Linux has long used CFS, the Completely Fair Scheduler; as its name suggests, it brings fairness to process scheduling:
fairness in how CPU time is allocated to each process.

How CFS keeps process scheduling fair

CFS uses process priorities and the historical CPU usage of each process: it tracks how much CPU time each process has already received (its virtual runtime) and runs the process that has received the least.
In CFS, each process has a ‘nice’ value.
The ‘nice’ value ranges from -20, which is the highest priority, to +19, which is the lowest.
Higher-priority processes get more CPU time.
During a switch between processes, the kernel saves the state of the outgoing process (its CPU registers and the address it had reached) so that it can be resumed smoothly later.
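
A small simulation can show how priorities and past usage combine. This is a simplified model, not the kernel's implementation. It assumes a fixed time slice, approximates the weight of a nice value as 1024 / 1.25^nice (the kernel uses a precomputed table with roughly this ratio per step), and uses Python's heapq as a stand-in for the kernel's data structure. The names weight_for and simulate are invented for this sketch.

```python
import heapq

NICE_0_WEIGHT = 1024  # weight of a process with nice value 0


def weight_for(nice):
    """Approximate weight: each nice step changes the weight by about 1.25x."""
    return NICE_0_WEIGHT / (1.25 ** nice)


def simulate(tasks, ticks, slice_ms=1.0):
    """tasks: dict name -> nice value. Returns dict name -> CPU ms received."""
    received = {name: 0.0 for name in tasks}
    # Min-heap ordered by virtual runtime; the name breaks ties.
    queue = [(0.0, name) for name in sorted(tasks)]
    heapq.heapify(queue)
    for _ in range(ticks):
        vruntime, name = heapq.heappop(queue)   # least-served task runs next
        received[name] += slice_ms
        # Virtual time advances more slowly for heavier (higher-priority) tasks.
        vruntime += slice_ms * NICE_0_WEIGHT / weight_for(tasks[name])
        heapq.heappush(queue, (vruntime, name))
    return received


if __name__ == "__main__":
    for name, ms in simulate({"db": -5, "web": 0, "backup": 5}, 3000).items():
        print(f"{name:7s} {ms:7.0f} ms")
```

Because a high-priority task accumulates virtual runtime more slowly, it keeps returning to the front of the queue and ends up with a larger share of real CPU time, while two tasks with the same nice value simply alternate.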

History of CFS

Process scheduling is at the heart of Linux.
Linux process scheduling has gone through a long evolution.
The first versions of Linux relied on a simple round-robin scheduler.
In round robin, processes cycle through a queue, and each process in turn gets a slice of CPU time.
This was sufficient for a simple operating system, but as Linux and its usage grew rapidly, the need for better scheduling emerged.
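
Round robin is simple enough to sketch in a few lines. The following is a generic textbook version in Python, not the code of any early Linux kernel; the function name round_robin and the example workloads are invented for illustration.

```python
from collections import deque


def round_robin(burst_times, quantum):
    """burst_times: dict name -> CPU time needed. Returns the run order."""
    queue = deque(burst_times.items())
    order = []
    while queue:
        name, remaining = queue.popleft()
        ran = min(quantum, remaining)
        order.append((name, ran))
        if remaining > ran:
            # Not finished yet: go to the back of the queue.
            queue.append((name, remaining - ran))
    return order


if __name__ == "__main__":
    print(round_robin({"a": 5, "b": 2, "c": 3}, quantum=2))
```

Every process gets the same slice regardless of how important it is, which is exactly the limitation that later priority-based schedulers set out to address.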

In 2001, Linux used priority-based scheduling for the first time.
In this scheme each process in the queue has a priority, but the algorithm ran into problems such as poor scaling and process starvation.
In 2003, the O(1) scheduler developed by Ingo Molnar arrived; it made scheduling decisions in constant time.
It uses two arrays, active and expired. A process keeps getting CPU time until its time slice expires, and it is then moved to the expired array.
In 2007, Ingo Molnar introduced CFS, which uses a red-black tree, a self-balancing binary search tree.
This scheduler added stability to scheduling, and CFS remained the default Linux scheduler until kernel 6.6, when the EEVDF scheduler replaced it.
CFS earned the trust of Linux users and improved scheduling behavior considerably.

Background of Ingo Molnar

Since the mid-1990s, he has been one of the main contributors to the Linux kernel.
Ingo was born in Hungary and currently works at Red Hat as a Linux kernel engineer.
Another notable contribution of his is refining thread management in HTTP and FTP servers.