I was digging through my virtual hosts looking at log files and noticed that a few of them had pretty massive access logs. One of the more popular sites I run for some friends, American K-Pop Fans, had an access log of 363 MB, and I’ve only been running the site for a few weeks! That obviously wasn’t going to work, so I started looking up how to manage log files. I noticed that Apache seemed to do an awesome job of keeping its own logs organized in /var/log/apache2/ and figured I should be able to model my log cleanliness after that.
After some Googling, I stumbled onto rotatelogs. After fumbling with it, I discovered that, although it’s pretty cool, it wasn’t quite what I wanted. I did some more Googling and discovered logrotate, a utility that ships with most Linux distributions for managing large amounts of logs. The two almost identical names confused me at first, but the difference became clear pretty fast.
rotatelogs is a really simple program that ships with Apache and splits log files apart as Apache writes entries. Apache pipes each log entry through the program, which then decides whether it needs to create a new log file or can keep using the existing one. You can choose when to create a new log file based on time or log size. logrotate, on the other hand, is a Linux command-line utility that typically runs as a cron job every day. It processes every configuration file in /etc/logrotate.d/. Looking in that directory, there’s an apache2 file which keeps Apache’s log files nice and tidy. There are also a variety of others, depending on what’s installed.
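To make the piping idea concrete, here is a minimal sketch that feeds a couple of fake entries to rotatelogs by hand, the same way Apache would over a pipe. The temp directory, the file name pattern, and the 86400-second (daily) interval are assumptions for illustration, and the block skips the demo if rotatelogs isn't on your PATH.

```shell
#!/bin/sh
# Sketch only: the paths and the entries below are made up for illustration.
set -u

demo=$(mktemp -d)

# rotatelogs reads log entries on stdin and writes them to a file whose name
# is built from the strftime pattern; 86400 asks for a new file every 24 hours.
# In Apache the same thing is wired up with a piped log, for example:
#   CustomLog "|/usr/bin/rotatelogs /srv/www/example.com/logs/access.%Y-%m-%d.log 86400" combined
# (the rotatelogs path and the example.com directory are assumptions).
if command -v rotatelogs >/dev/null 2>&1; then
    printf 'fake entry one\nfake entry two\n' \
        | rotatelogs "$demo/access.%Y-%m-%d.log" 86400
    ls "$demo"
else
    echo "rotatelogs not found; skipping demo"
fi
```

The point is that rotatelogs only acts when an entry arrives on the pipe, whereas logrotate works on files that already exist on disk.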
Both Linode’s logrotate article and Slicehost’s logrotate article helped me set up logrotate for my virtual hosts. Here’s what mine looks like:
    /srv/www/*/logs/*.log {
        rotate 14
        daily
        compress
        delaycompress
        sharedscripts
        postrotate
            /usr/sbin/apache2ctl graceful > /dev/null
        endscript
    }
The idea is pretty simple. Line by line:
- I list all of my virtual host log paths. All of my virtual hosts follow the same directory structure, so I can get away with wildcard usage like this.
- I tell it that I want to keep 14 days of previous log files.
- I tell it that I want the logs rotated daily.
- I tell it that I want old log files compressed to save space.
- I tell it that it should delay compressing the most recent archived log file.
- I tell it that all of the virtual host logs matched on line 1 should be rotated before the following script runs, so the script runs once in total instead of once per log file.
- I tell it to restart Apache gracefully (no open connections will be closed, and old log files won’t be closed immediately). We use delaycompress because we don’t want to compress the most recent log file until we’re sure Apache is done with it.
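Before installing a config like this, it can be worth a dry run. The following is a rough sketch rather than a recipe: it copies the same directives into a throwaway directory and runs logrotate in debug mode (-d), which prints what it would do without touching any files. The temp layout and the file name vhosts.conf are hypothetical, and the postrotate block is left out so the test never touches Apache.

```shell
#!/bin/sh
# Sketch only: everything lives under a temp directory invented for this test.
set -u

workdir=$(mktemp -d)
mkdir -p "$workdir/site/logs"
echo "sample entry" > "$workdir/site/logs/access.log"

# Same directives as the real config, minus postrotate, pointed at the temp tree.
# $workdir is expanded by the shell; the * wildcards are left for logrotate.
cat > "$workdir/vhosts.conf" <<EOF
$workdir/*/logs/*.log {
    rotate 14
    daily
    compress
    delaycompress
}
EOF

# -d: debug mode, reports the planned actions and changes nothing.
# -s: keep the state file in the temp directory instead of the system default.
if command -v logrotate >/dev/null 2>&1; then
    logrotate -d -s "$workdir/state" "$workdir/vhosts.conf"
else
    echo "logrotate not found; skipping dry run"
fi
```

Once the output looks right, the real file can go into /etc/logrotate.d/, and running logrotate with -f against it forces a rotation if you'd rather not wait for the nightly cron run.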
That’s it! This simple script maintains all my log files for me so that I don’t have to worry about them growing out of control.