Pull useful information out of your Samba log

Andrew Mallett | September 2017

This script reveals who has been using Samba network shares and what files they have been accessing.

Samba allows the sharing of files and other resources between Unix and/or Windows systems. Many Windows users will be familiar with mapping drives to network shares; under Unix/Linux this is known as mounting.

Like most good Unix services Samba has its own log file which can capture events that range from minimal to way too much information, depending on what's needed. The amount of information captured in the Samba log depends on the log level setting in smb.conf..

; Andys SMB Configuration File

   log level = 1
   log file = /var/log/samba/samba.log

Samba's log level 1 doesn't capture much of interest, but cranking this up to 2 will show the filenames of shared files and the users who opened them. Unfortunately Samba also puts out shirtloads of stuff you don't need to know, including multiple entries for the same file.

Even worse, when opening shares from a Windows 7 system, the whole album name sub-folder structure gets chucked into the samba.log file, whereas only filenames actually opened (played) through a Unix/Linux mounted share make it to the log (see Samba logs and Windows browsing on how to fix this).

Samba logs the whole directory when using Windows

My own requirement was simply to have a list of what music I have listened to. Cranking up to log level 2 meant not only was there too much information but also that the log file grows very quickly. To address this I created a separate custom log file, sambaccess, and a script which pulls the relevant stuff out of samba.log and then deletes samba.log.

As with any log file, the tricky bit is pulling the information you want out of it, so watch out - regular expressions ahead..

Grep, sed, awk, ugh!

The first time I saw a regular expression I thought the cat had sat on my keyboard. At first glance the myriad characters can seem daunting but they are an important part of unlocking the power of Unix and so are worth getting to grips with.

Consider that in the above example in line 4 (we'll talk about the rest later) all I have done is strip out unwanted characters. There are a number of ways of achieving this and some uptight purists get their panties in a wad about doing things the right way. I prefer a more relaxed approach in not giving a rat's arse as long as the required information is not lost and the results are consistent; the actual code is just a means to an end.

That said, there is some value in simplicity (read: minimal code structure) so it's worth sitting back and thinking about just what information is required and the easiest way of stripping out the rest.

Each line I'm after contains the phrase "..<username> opened file.." so grep provides an efficient way of selecting these while ignoring the other unwanted lines..

grep opened
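To see the selection in action without touching the real log, here is a small sketch. The "opened file" line is the example from this article (the two leading spaces are an assumption about how the log indents its entries); the other two lines are made-up filler, not real Samba output.

```shell
# Stand-in for samba.log: the "opened file" line is the article's example,
# the other two lines are made-up filler, not real Samba output.
sample='  some other log entry
  andym opened file P/Portishead/Portishead, Pearl/07 This Life.mp3 read=Yes write=No (numopen=10)
  yet another log entry'

# grep prints only the lines containing the word "opened".
matches=$(printf '%s\n' "$sample" | grep opened)
printf '%s\n' "$matches"
```

Only the middle line should come out the other end. Bear in mind that a file whose name happens to contain "opened" would also match, so a longer phrase such as "opened file" may be a safer search term.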

Example samba.log line..

  andym opened file P/Portishead/Portishead, Pearl/07 This Life.mp3 read=Yes write=No (numopen=10) 

Working on the unwanted front parts of each line first, we remove the initial 22 characters from each line by cutting the part we want to keep. This effectively removes the above phrase at the beginning, plus a few other initial characters including a first forward slash, in my own samba.log..

cut -c 23-150

Portishead/Portishead, Pearl/07 This Life.mp3 read=Yes write=No (numopen=10) 

The 150 is an arbitrary end position (cut keeps characters 23 up to and including 150) which can be increased if lines are longer.
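If guessing a maximum line length feels fragile, cut also accepts an open-ended range. The sketch below assumes the log line starts with two spaces, which is what makes the unwanted prefix exactly 22 characters long; check the count against your own log before relying on it.

```shell
# Sample line; the two leading spaces are an assumption about the log's indentation.
line='  andym opened file P/Portishead/Portishead, Pearl/07 This Life.mp3 read=Yes write=No (numopen=10)'

# "23-" with no end position keeps everything from character 23
# to the end of the line, however long the line is.
trimmed=$(printf '%s\n' "$line" | cut -c 23-)
printf '%s\n' "$trimmed"
```

The result is the same as the output shown above, but long file names can no longer be chopped off at character 150.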

The output of each command in line 4 is passed on to the next using the | pipe symbol; think of it as a conveyor belt, carrying each command's output along to the next operation.

At this point we reach the first cat-on-a-keyboard expression..

sed 's/[^/]*//'

/Portishead, Pearl/07 This Life.mp3 read=Yes write=No (numopen=10) 

Here sed is used to delete all characters before the first forward slash / which in my log is the directory name of the artist. The sed s command performs a substitution and in this case substitutes the characters with nothing, which is effectively a deletion.

Next we replace that remaining initial forward slash with a space..

sed 's/\// /'

  Portishead, Pearl/07 This Life.mp3 read=Yes write=No (numopen=10) 

..which provides a nice indent for formatting later.

At this stage the 'conveyor belt of misery' has stripped out all the crap before the required information and we can now focus on the end of each line. Looking at the log I figured that I needed neither the file extension nor any of the information after the file name, so I got sed to delete everything from the last dot onwards..

sed 's/\.[^.]*$//'

  Portishead, Pearl/07 This Life
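For what it's worth, the three sed steps can also be handed to a single sed process using -e, one expression per step. This is just an alternative sketch, not how the original script does it. The dot in the last expression is escaped so that it only matches a literal dot; unescaped, a dot matches any character and a line with no dot in it would be wiped out entirely.

```shell
# Input as it looks after the cut step.
trimmed='Portishead/Portishead, Pearl/07 This Life.mp3 read=Yes write=No (numopen=10)'

# 1st expression: delete everything before the first forward slash
# 2nd expression: replace that first forward slash with a space
# 3rd expression: delete from the last literal dot to the end of the line
cleaned=$(printf '%s\n' "$trimmed" |
  sed -e 's/[^/]*//' \
      -e 's/\// /' \
      -e 's/\.[^.]*$//')
printf '%s\n' "$cleaned"
```

The expressions are applied in the order given, so the result matches running the three sed commands one after another.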

Next I used sort to get duplicate names adjacent to each other, which is needed before summoning uniq to cut out the repeats, and finally appended the results to my custom log file sambaccess.
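Strung together, the steps described so far might look like the sketch below. It is a reconstruction from the descriptions in this article rather than the original script; the function name extract_tracks and the sample input are made up for illustration, and the input lines assume the two-space indent mentioned earlier.

```shell
# extract_tracks: hypothetical helper name; reads log text on standard input.
extract_tracks() {
  grep opened |
    cut -c 23- |
    sed 's/[^/]*//' |
    sed 's/\// /' |
    sed 's/\.[^.]*$//' |
    sort |
    uniq
}

# Two identical "opened file" entries plus one filler line stand in for the real log.
out=$(printf '%s\n' \
  '  andym opened file P/Portishead/Portishead, Pearl/07 This Life.mp3 read=Yes write=No (numopen=10)' \
  '  some other log entry' \
  '  andym opened file P/Portishead/Portishead, Pearl/07 This Life.mp3 read=Yes write=No (numopen=10)' |
  extract_tracks)
printf '%s\n' "$out"
```

The duplicate entry collapses to a single line because sort puts the two copies next to each other before uniq sees them; uniq on its own only removes repeats that are adjacent. In real use the function would be fed from the log and its output appended to the custom log with >>.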

The rest of the script is concerned with adding new records to the top of the custom sambaccess log file (appending puts them at the bottom by default). This is achieved by moving the log file to a temporary name, putting the new data in a new sambaccess log, and then appending the old log to the new one, so the old records end up at the bottom.
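The shuffle described above can be sketched roughly as follows. All file names and the two sample entries are hypothetical, and everything happens in a scratch directory so that no real log is touched.

```shell
# Scratch directory and file names are made up for this demonstration.
workdir=$(mktemp -d)
log="$workdir/sambaccess"
old="$workdir/sambaccess.old"

# Pretend this is the existing custom log.
printf '%s\n' ' older entry' > "$log"

# 1. Move the existing log aside under a temporary name.
mv "$log" "$old"
# 2. Start a fresh log containing only the new records.
printf '%s\n' ' newer entry' > "$log"
# 3. Append the old records underneath, then discard the temporary copy.
cat "$old" >> "$log"
rm "$old"

result=$(cat "$log")
printf '%s\n' "$result"

# Tidy up the scratch directory.
rm -r "$workdir"
```

The new entry should appear on the first line with the older one beneath it.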

Admittedly all of this has been a lot of bollocks to go through, but the payoff is the satisfaction of a tricky job well done which fills me with such a sense of personal fulfillment that I will never need to meditate, take drugs or attend a life coaching workshop. Such is the life of the Unix administrator..