strace Command in Linux

Introduction

In this tutorial, we look at the Linux strace command, which traces the system calls a process makes and the signals it receives. Because every file, network, and memory operation passes through a system call, strace shows what a program is actually asking the kernel to do, which makes it a valuable tool for debugging and troubleshooting. We start with an introduction to strace, then trace system calls in more detail, and finally use strace to debug a program. By the end of this tutorial, you should be able to use strace in your own Linux system administration and development tasks.

Introduction to strace Command

In this section, we introduce strace, a Linux tool that traces the system calls made by a process, whether strace launches the process itself or attaches to one that is already running. System calls are the interface between a process and the operating system kernel, so understanding them is crucial for efficient debugging and problem resolution for any system administrator.

Let's begin by installing the strace package:

sudo apt-get update
sudo apt-get install -y strace

Example output:

Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following additional packages will be installed:
  libunwind8
Suggested packages:
  fakeroot
The following NEW packages will be installed:
  libunwind8 strace
0 upgraded, 2 newly installed, 0 to remove and 0 not upgraded.
Need to get 292 kB of archives.
After this operation, 1,054 kB of additional disk space will be used.
...

Now, let's use the strace command to trace a basic program. We will use the ls command as an example:

strace ls

Example output:

execve("/usr/bin/ls", ["ls"], 0x7ffee4f7a0f0 /* 23 vars */) = 0
brk(NULL)                               = 0x55b7d6c23000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
...

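The output lists, in order, the system calls made while running ls: execve starts the program, brk(NULL) queries the current end of the heap (the program break), access checks whether /etc/ld.so.preload exists and is readable (it does not, hence ENOENT), and openat opens the dynamic linker cache file. Each line shows the call name, its arguments, and the return value after the = sign.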

Analyzing the strace output provides insights into a program's interaction with the operating system, which is incredibly beneficial for debugging and understanding program execution.
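Since every completed call in the trace follows the same "name(arguments) = result" layout, the lines are easy to take apart in a script. The following is a minimal Python sketch, not part of strace itself: the function name parse_strace_line and the regular expression are our own assumptions, and the pattern only handles simple single-line entries like the ones shown above.

```python
import re

# Hypothetical pattern for a completed call: name(args) = result [extra text].
# It does not handle unfinished/resumed calls or arguments containing ") = ".
LINE_RE = re.compile(
    r'^(?P<name>\w+)\((?P<args>.*)\)\s+=\s+'
    r'(?P<ret>0x[0-9a-f]+|-?\d+|\?)(?:\s+(?P<extra>.*))?$'
)


def parse_strace_line(line):
    """Return a dict with name, args, ret, and extra, or None if the line is not a call."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    return match.groupdict()
```

For example, passing the access line from the output above yields the name "access", the return value "-1", and the error text in the extra field, while status lines such as "+++ exited with 0 +++" return None.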

Tracing System Calls with strace

In this part, we take a closer look at using strace to trace the system calls a program makes on your Linux system.

Let's begin by creating a simple Python script that will serve as our tracing target:

cat > ~/project/example.py << EOF
import time

print("Hello, World!")
time.sleep(5)
EOF

Now, let's trace the execution of this script using strace:

strace python ~/project/example.py

Example output:

execve("/usr/bin/python", ["python", "/home/labex/project/example.py"], 0x7ffee4f7a0f0 /* 23 vars */) = 0
brk(NULL)                               = 0x55b7d6c23000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
...
write(1, "Hello, World!\n", 14)         = 14
time(NULL)                              = 1618304400
nanosleep({5, 0}, NULL)                 = 0
exit_group(0)                           = ?
+++ exited with 0 +++

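The output displays the system calls made while running the Python script. This includes execve, which executes the Python interpreter, write for printing the "Hello, World!" message, time for retrieving the current time, and nanosleep to pause the script's execution for 5 seconds. The exact calls depend on your Python version, C library, and architecture: on a recent system you may see clock_nanosleep or select in place of nanosleep, and time may not appear at all because it is often served from the vDSO without entering the kernel.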

By analyzing this strace output, you can see how your program interacts with the operating system and spot potential problems, such as calls that fail or calls that take unexpectedly long.
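Failed calls are often the quickest clue in a long trace: they return -1 followed by an error name and description. As a rough illustration, the Python sketch below scans trace text for such lines. The function name failed_calls and the SAMPLE text (copied from the example output in this tutorial) are our own choices, and the pattern assumes the simple single-line format shown here.

```python
import re

# A failed call ends with "= -1 ERRNAME (description)".
FAILED_RE = re.compile(r'^(\w+)\(.*\)\s+=\s+-1\s+([A-Z0-9]+)\s+\((.*)\)$')


def failed_calls(trace_text):
    """Return a list of (syscall, error_name, description) for each failed call."""
    results = []
    for line in trace_text.splitlines():
        match = FAILED_RE.match(line.strip())
        if match:
            results.append(match.groups())
    return results


# Sample lines taken from the example output in this tutorial.
SAMPLE = r'''execve("/usr/bin/python", ["python", "/home/labex/project/example.py"], 0x7ffee4f7a0f0 /* 23 vars */) = 0
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
write(1, "Hello, World!\n", 14)         = 14
'''
```

Calling failed_calls(SAMPLE) reports only the access call with ENOENT. Not every failure is a bug: the missing /etc/ld.so.preload file is normal on most systems, so treat the list as a starting point for investigation.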

Consider another example. This time, we will trace the execution of the ls command, enhanced with additional options:

strace -c ls -l ~/project

Example output:

total 4
-rw-r--r-- 1 labex labex 59 Apr 12 13:33 example.py
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 45.45    0.000005           5         1           execve
 27.27    0.000003           3         1           brk
  9.09    0.000001           0         2           access
  9.09    0.000001           1         1           openat
  9.09    0.000001           0         3           close
  0.00    0.000000           0         4           read
  0.00    0.000000           0         2           fstat
  0.00    0.000000           0         1           mmap
  0.00    0.000000           0         1           mprotect
  0.00    0.000000           0         1           munmap
  0.00    0.000000           0         2           ioctl
  0.00    0.000000           0         1           statfs
  0.00    0.000000           0         2           newfstatat
------ ----------- ----------- --------- --------- ----------------
100.00    0.000011                    22           total

In this instance, we used the -c option, which replaces the per-call trace with a summary table printed when ls exits; the first two lines are the normal output of ls -l. For each system call, the table shows the share of the total measured time (% time), the total seconds, the average microseconds per call, the number of calls, and the number of calls that returned an error.

This detailed information proves valuable in pinpointing performance bottlenecks or gaining deeper insights into a program's behavior.
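To make the columns concrete, the arithmetic behind them can be reproduced by hand. The Python sketch below recomputes the % time column from the per-call totals in the table above, expressed in whole microseconds. The function names are our own, and the use of integer division for usecs/call is an assumption about how the figure is rounded.

```python
def percent_time(times_us):
    """Map each syscall to its share of the total time, rounded to two decimals."""
    total = sum(times_us.values())
    if total == 0:
        return {name: 0.0 for name in times_us}
    return {name: round(100.0 * t / total, 2) for name, t in times_us.items()}


def usecs_per_call(total_us, calls):
    """Average microseconds per call (integer division is an assumption)."""
    return total_us // calls if calls else 0


# Non-zero "seconds" entries from the table, converted to microseconds.
# Rows with 0.000000 seconds are omitted because they do not change the total.
times_us = {"execve": 5, "brk": 3, "access": 1, "openat": 1, "close": 1}
shares = percent_time(times_us)
```

With a total of 11 microseconds, execve accounts for 5/11 of the time, which is the 45.45 shown in the first row. Note that with totals this small the percentages say little about real performance; the summary becomes meaningful for longer-running programs.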

Debugging Processes with strace

In this segment, we will explore the application of the strace command for debugging running processes and identifying possible problems within the Linux operating system.

To begin, let's create a simple C program that we can use for debugging:

cat > ~/project/example.c << EOF
#include <stdio.h>
#include <unistd.h>

int main() {
    printf("Hello, World!\n");
    sleep(5);
    return 0;
}
EOF

Now, let's compile the program and run it under strace:

gcc -o ~/project/example ~/project/example.c
strace ~/project/example

Example output:

execve("/home/labex/project/example", ["/home/labex/project/example"], 0x7ffee4f7a0f0 /* 23 vars */) = 0
brk(NULL)                               = 0x55b7d6c23000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
...
write(1, "Hello, World!\n", 14)         = 14
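clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=5, tv_nsec=0}, 0x7ffee4f7a050) = 0
exit_group(0)                           = ?
+++ exited with 0 +++

The output details the sequence of system calls made by the C program. Notable calls include execve for starting the program, write to output "Hello, World!", and clock_nanosleep to pause execution for 5 seconds. Note that sleep() is a C library function rather than a system call, so it never appears in the trace itself; depending on the C library version, it is implemented with clock_nanosleep or nanosleep.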

Suppose we need to troubleshoot an issue with the program. Strace can help. For instance, if the program isn't writing the expected output to a file, we can trace file-related system calls to diagnose the situation:

strace -e trace=file ~/project/example

Example output:

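execve("/home/labex/project/example", ["/home/labex/project/example"], 0x7ffee4f7a0f0 /* 23 vars */) = 0
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
...
+++ exited with 0 +++

With -e trace=file, strace shows only the calls that take a file name as an argument. The only such calls here are the ones made at startup to load the program and its shared libraries: execve, access, and openat. The program itself never opens a file of its own, so the missing file output is not caused by a failed open or write; the program simply never attempts to write to a file. The write call from the full trace is absent because it operates on a file descriptor rather than a file name; use -e trace=write or -e trace=desc to see it.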

By using strace to trace either selected system calls or all system call activity, you can often pinpoint the source of problems in your programs and debug them more efficiently. This is an invaluable skill for any system administrator.
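When you run such traces repeatedly, it can help to assemble the strace command line in a script. The Python sketch below is one possible way to do that, using only the -e trace=, -o (write the trace to a file), and -p (attach to an already running process by PID) options of strace. The helper names build_strace_argv and run_trace are hypothetical, and the sketch assumes strace is on the PATH.

```python
import shutil
import subprocess


def build_strace_argv(target=None, syscalls=None, pid=None, output=None):
    """Build an strace command line as a list of arguments.

    target   -- command to launch under strace, e.g. ["ls", "-l"]
    syscalls -- names or classes for -e trace=, e.g. ["openat", "write"]
    pid      -- attach to this running process instead of launching target
    output   -- write the trace to this file instead of stderr
    """
    argv = ["strace"]
    if syscalls:
        argv += ["-e", "trace=" + ",".join(syscalls)]
    if output:
        argv += ["-o", output]
    if pid is not None:
        argv += ["-p", str(pid)]
    elif target:
        argv += list(target)
    else:
        raise ValueError("either target or pid is required")
    return argv


def run_trace(argv):
    """Run the command if strace is installed; return None otherwise."""
    if shutil.which("strace") is None:
        return None
    # strace writes its trace to stderr unless -o is given.
    return subprocess.run(argv, capture_output=True, text=True)
```

For instance, build_strace_argv(["ls"], syscalls=["openat", "write"]) produces the equivalent of strace -e trace=openat,write ls. Attaching with pid= usually requires that you own the target process or have root privileges.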

Summary

In this tutorial, we explored the Linux strace command, which traces and monitors the system calls made by a process. We began by installing strace and tracing the ls command to see how a simple program interacts with the operating system. Next, we traced a small Python script and used the -c option to summarize the time, call counts, and errors per system call. Finally, we compiled a C program and used -e trace=file to narrow the trace to file-related calls while investigating a problem. By analyzing strace output in these ways, we can better understand program behavior and debug issues that arise, which is a key skill for any system administrator managing Linux systems.
