Tracing file access in OS X and Linux

February 26th, 2012

Here I cover how to get a log of all file access in OS X and Linux. See my article for Windows to see how it’s done there.

Tracing file access in OS X and Linux can be done with the fs_usage command line utility and a little shell scripting. The tool must run as root because it uses low-level kernel tracing facilities that are not accessible to non-root users, so you will need to be able to use either su - or sudo on your host.

Open your favorite terminal and, if you are not using sudo, acquire root access with su -. I’ll illustrate the examples using sudo; if you are gaining root access via su -, leave sudo out of the command line examples.

These examples assume you are in a directory where you can write out a log file (for example, your home directory).

sudo fs_usage -w -f filesys -e Terminal tee | tee fs_usage.log

In this example I am calling fs_usage with the following options:

  • -w -> Tells fs_usage to output the full text in each column of data; text wraps to the next line as needed.
  • -f filesys -> Filters for file system activity only (leaving out network, cache hit, spawn, and other non-file-system activity).
  • -e Terminal tee -> Excludes activity from the Terminal and tee processes (fs_usage activity is always excluded by default). This example was run on OS X; on Linux you may remove Terminal from the exclusion list.

There is usually a flurry of activity from the large number of active processes present on any system, so it would be impractical to read the log stream in real time. Instead we pipe the output into tee, which splits it between the terminal window and the file fs_usage.log (you can give the log file any name you like). Note that every time you re-run the example, the old fs_usage.log is overwritten. If you want to preserve the data, copy it to another file name first. You can also tell tee to append by using the -a option.
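To see the difference between overwriting and appending without needing root, here is a minimal sketch in which printf stands in for the fs_usage pipeline. The temporary file and the variable names are my own placeholders, not anything fs_usage produces.

```shell
# Stand-in demo: printf plays the role of the fs_usage output stream,
# so this runs without root. The temp file name is a placeholder.
demo_log="$(mktemp "${TMPDIR:-/tmp}/tee_demo.XXXXXX")"

printf 'first run\n'  | tee "$demo_log" > /dev/null     # plain tee truncates the file
printf 'second run\n' | tee "$demo_log" > /dev/null     # ...so the first run is lost
overwritten_lines=$(grep -c '' "$demo_log")             # count lines in the file

printf 'third run\n'  | tee -a "$demo_log" > /dev/null  # -a appends instead
appended_lines=$(grep -c '' "$demo_log")

echo "after overwrite: $overwritten_lines line(s); after append: $appended_lines line(s)"
rm -f "$demo_log"

# With the real tool, appending would look like:
#   sudo fs_usage -w -f filesys -e Terminal tee | tee -a fs_usage.log
```

Appending keeps earlier runs, but it also means the log only ever grows, so it is worth deleting or rotating the file by hand between sessions.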

Open another terminal and cd into the directory where fs_usage.log is being written. In this terminal we will search the log for events we are interested in. In this example I look for all file open events, but you can use the same technique to look for specific process names, process IDs, file names, path names, etc.

grep -i open fs_usage.log

In this example output we can see that the process called mdworker is busy opening files in /Applications/Utilities/ . The number following the process name is the thread ID (note: not the process ID). Having the thread ID is particularly useful when debugging an application and trying to pinpoint where file I/O is being triggered from.
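If you find yourself repeating the same search with different terms, a pair of small shell functions can save some typing. This is only a sketch: the function names search_log and count_in_log are made up for illustration, and the log file defaults to the fs_usage.log name used above.

```shell
# Print log lines that mention a given string (process name, path fragment, ...).
# $1 = text to look for, $2 = log file (defaults to fs_usage.log)
# -i ignores case, -F treats the text literally rather than as a regex.
search_log() {
    grep -i -F -- "$1" "${2:-fs_usage.log}"
}

# Same search, but print only the number of matching lines.
count_in_log() {
    grep -i -F -c -- "$1" "${2:-fs_usage.log}"
}

# Usage, from the directory that holds the log:
#   search_log mdworker
#   count_in_log open
```

The -F option is a deliberate choice here: process names and paths often contain dots, which grep would otherwise treat as "any character".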

As a side note, mdworker is part of Spotlight on the Mac, and you will see a constant stream of activity from it on OS X systems. You can add mdworker and mds to the exclusion list if you want to skip seeing Spotlight file system traffic in your log.
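If a log has already been captured with Spotlight traffic in it, the same effect can be approximated after the fact by filtering those lines out. The sketch below assumes the process names appear as whole words on each line; the function name without_spotlight is hypothetical.

```shell
# Drop lines that mention the Spotlight helpers from an existing log.
# -v inverts the match; -w matches whole words only, so a file such as
# "cmds.txt" is kept. Caveat: a line is dropped wherever either name
# appears as a word, including inside a path.
# $1 = log file (defaults to fs_usage.log)
without_spotlight() {
    grep -v -w -e mdworker -e mds "${1:-fs_usage.log}"
}

# Usage:
#   without_spotlight fs_usage.log | grep -i open
#
# Excluding them at capture time instead, as described above:
#   sudo fs_usage -w -f filesys -e Terminal tee mdworker mds | tee fs_usage.log
```

Excluding at capture time keeps the log smaller, while filtering afterwards leaves the Spotlight lines available if you later decide you need them.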

Finding activity by path name is just as easy: change the word “open” to any part of the path you want to search for. Be sure to put quotes around path names that contain spaces. For example:

grep "Saved Application State" fs_usage.log

This will search for all path names that have “Saved Application State” as part of the path.
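The quotes matter because the shell splits unquoted text on spaces before grep ever sees it. Without them, grep would take Saved as the pattern and treat Application and State as file names. A small sketch makes the splitting visible; count_args is a throwaway helper of my own, not a standard command.

```shell
# Helper that reports how many arguments it received.
count_args() { echo "$#"; }

quoted_args=$(count_args "Saved Application State")   # one argument: the whole phrase
unquoted_args=$(count_args Saved Application State)   # three arguments: split on spaces

echo "quoted: $quoted_args, unquoted: $unquoted_args"
```

This prints "quoted: 1, unquoted: 3", which is exactly the difference between grep searching for the full phrase and grep searching for a single word in three places.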

Be sure to issue a CTRL-C in the terminal where fs_usage is running when you are done. The log file can grow quickly and chew up your disk space in a hurry.

Further Reading

grep is a powerful tool and knowing it fully will make parsing all types of files a breeze. I recommend keeping a pocket reference handy.

The manual page for fs_usage is worth a read; there are many more options than I have illustrated here. Simply enter man fs_usage in your terminal.
