Using AIDE for file integrity monitoring (FIM) on Ubuntu or Debian

PCI-DSS 3.1 section 10.5.5 has the following requirement:

Use file-integrity monitoring or change-detection software on logs to ensure that existing log data cannot be changed without generating alerts (although new data being added should not cause an alert).

For large solutions, I would suggest using a well-known tool such as Tripwire Enterprise. However, many small to mid-size companies with a small footprint within their cardholder data environment (CDE) may not be able to afford this. So what can companies use to meet this requirement? Implement AIDE (Advanced Intrusion Detection Environment).

Taken from the project's website: "AIDE creates a database from the regular expression rules that it finds from the config file(s). Once this database is initialized it can be used to verify the integrity of the files."

AIDE is a very simple (yet powerful) program that runs from cron, checking your files (typically once a night) and scanning your system for any changes in the directories it monitors. There are a number of different ways to use this program, but I'll outline the one I like to use.

My requirements:
1. I want the reports to run nightly.
2. All change reports are emailed to me so I can archive them for a year offsite.
3. Have the database automatically commit the additions, deletions, and changes to baseline each time it is run.

In the event my system was compromised, I want to ensure that the malicious user is not able to modify or delete my previous reports. Therefore, I choose not to store them on the machine. It's true that once the malicious user gained access to my system, they could change my AIDE config on me, but at least my previous reports will be intact, which should help me determine what malicious changes this user made to my server. Please note that I am making an assumption here that you are already backing up your system nightly, which would include your AIDE database! If you do not currently have a backup strategy in place, get one. Tools such as AIDE help identify what files a malicious user may have changed, but if they completely crippled the system, you will need to restore from backups.

Setting up AIDE is fairly straightforward. It exists in most package repositories out there, including those for most variants of Linux and BSD.

On Ubuntu or Debian-based systems, you can install it by:

[root@web01 ~]# apt-get update
[root@web01 ~]# apt-get install aide

Now, to set up some basic configuration such as email notifications and the update type, modify the AIDE defaults file accordingly:

[root@web01 ~]# vim /etc/default/aide
...
FQDN=web01.domain.com
MAILSUBJ="Daily AIDE report for $FQDN"
[email protected]
QUIETREPORTS=no
COMMAND=update
COPYNEWDB=yes
...

Now that AIDE is installed and the basic preferences are in place, it's time to check out the main configuration files. The defaults shipped by the upstream provider should give you a reasonable starting point. But what if you wanted to add your website's document root so you can keep track of what files are changing on your website? The Debian/Ubuntu way of configuring AIDE is a bit different from the CentOS/RHEL method.

All the configuration files reside in /etc/aide/aide.conf.d/. The number prefixing each file is used by the AIDE wrapper to decide the order in which to process these files. The AIDE documentation seems to indicate that the most general rules should be processed last, so I'll default to creating my server's profile as 50_aide_CUSTOM-RULES.

So let's say I want to monitor my document root; here is how this would be set up:

[root@web01 ~]# vim /etc/aide/aide.conf.d/50_aide_CUSTOM-RULES
...
/var/www/vhosts/domain.com Full
...

Now AIDE will be keeping track of our website. But adding your site may lead to very noisy reports, because most websites implement caching. So this becomes a balancing act: exclude directories that change often, yet retain enough of your site's critical content. We could just leave the entire directory in AIDE, but I personally don't want to read a change report that contains 1,000 changes every day. So in the case of this WordPress site, I exclude the cache directory by appending the following to my custom configuration:

[root@web01 ~]# vim /etc/aide/aide.conf.d/50_aide_CUSTOM-RULES
...
/var/www/vhosts/domain.com Full
!/var/www/vhosts/domain.com/web/wp-content/cache
...

The “!” means NOT to monitor that specific directory. You will need to run AIDE a few times and fine-tune the configuration before you get a report that is useful for your specific needs.

Anytime a change is made to your AIDE configuration, you need to rebuild the AIDE runtime configuration and initialize the database. You do that by:

[root@web01 ~]# update-aide.conf
[root@web01 ~]# aideinit -y -f

Now, try making a basic change to /etc/hosts, then run a check on AIDE to see if it detects the change and emails out the report:

[root@web01 ~]# /etc/cron.daily/aide

If you just want to quickly test that AIDE picks up your changes, without committing them to baseline, you can perform a one-time scan by:

[root@web01 ~]# aide.wrapper

To receive nightly AIDE reports, no further configuration is needed, since Ubuntu/Debian already set up a cron job in /etc/cron.daily/aide that runs AIDE automatically. It runs whenever your system normally runs the cron.daily jobs, which is defined in /etc/crontab.
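
For reference, the time the cron.daily jobs kick off is defined in /etc/crontab; on a stock Ubuntu/Debian system the entry looks something like this (the exact minute and hour vary by release):

[root@web01 ~]# grep cron.daily /etc/crontab
25 6    * * *   root    test -x /usr/sbin/anacron || ( cd / && run-parts --report /etc/cron.daily )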

Posted below is an example report that AIDE would send me via email daily:

This is an automated report generated by the Advanced Intrusion Detection 
Environment on web01.domain.com started at 2016-03-07 13:16:35.

AIDE returned with exit code 7. Added, removed and changed files detected!
AIDE post run information
output database /var/lib/aide/aide.db.new was copied to /var/lib/aide/aide.db as requested by cron job configuration
End of AIDE post run information

AIDE produced no errors.

Output of the daily AIDE run (83 lines):
AIDE 0.15.1 found differences between database and filesystem!!
Start timestamp: 2016-03-07 13:16:35

Summary:
  Total number of files:	77937
  Added files:			2
  Removed files:		3
  Changed files:		7


---------------------------------------------------
Added files:
---------------------------------------------------

f++++++++++++++++: /var/log/aide/aide.log.0
d++++++++++++++++: /var/www/vhosts/domain.com/new

---------------------------------------------------
Removed files:
---------------------------------------------------

f----------------: /var/www/vhosts/domain.com/blah
f----------------: /var/www/vhosts/domain.com/test
d----------------: /var/www/vhosts/domain.com/test1

---------------------------------------------------
Changed files:
---------------------------------------------------

f   p.g    . A. .: /var/log/aide/aide.log
d =.... mc.. .. .: /var/spool/postfix/active
d =.... mc.. .. .: /var/spool/postfix/incoming
d =.... mc.. .. .: /var/spool/postfix/maildrop
F =.... mc.. ..  : /var/spool/postfix/public/pickup
F =.... mc.. ..  : /var/spool/postfix/public/qmgr
d =.... mc.. .. .: /var/www/vhosts/domain.com

---------------------------------------------------
Detailed information about changes:
---------------------------------------------------


File: /var/log/aide/aide.log
 Perm     : -rw-------                       , -rw-r-----
 Gid      : 0                                , 4
 ACL      : old = A:
----
user::rw-
group::---
other::---
----
                  D: 
            new = A:
----
user::rw-
group::r--
other::---
----
                  D: 

Directory: /var/spool/postfix/active
 Mtime    : 2016-03-07 13:10:36              , 2016-03-07 13:13:23
 Ctime    : 2016-03-07 13:10:36              , 2016-03-07 13:13:23

Directory: /var/spool/postfix/incoming
 Mtime    : 2016-03-07 13:10:36              , 2016-03-07 13:13:23
 Ctime    : 2016-03-07 13:10:36              , 2016-03-07 13:13:23

Directory: /var/spool/postfix/maildrop
 Mtime    : 2016-03-07 13:10:36              , 2016-03-07 13:13:23
 Ctime    : 2016-03-07 13:10:36              , 2016-03-07 13:13:23

FIFO: /var/spool/postfix/public/pickup
 Mtime    : 2016-03-07 13:12:37              , 2016-03-07 13:17:37
 Ctime    : 2016-03-07 13:12:37              , 2016-03-07 13:17:37

FIFO: /var/spool/postfix/public/qmgr
 Mtime    : 2016-03-07 13:10:36              , 2016-03-07 13:13:36
 Ctime    : 2016-03-07 13:10:36              , 2016-03-07 13:13:36

Directory: /var/www/vhosts/domain.com
 Mtime    : 2016-03-07 13:03:25              , 2016-03-07 13:16:17
 Ctime    : 2016-03-07 13:03:25              , 2016-03-07 13:16:17

End of AIDE output.

The check was done against /var/lib/aide/aide.db with the following characteristics:
 Size     : 13041865
 Bcount   : 25480
 Mtime    : 2016-03-07 13:13:23
 Ctime    : 2016-03-07 13:13:23
 Inode    : 273628
 RMD160   : bIthG3Q5FiJmj4CIYdASjJx5Ygc=
 TIGER    : omto0nb3/oIqIiKHEjnbhjvXeGdfycbV
 SHA256   : VJPGKy61GxGfcSrjJFbrP879y/skJaiQ
 SHA512   : 7pz3FdYh8TvoNOqjxWBToZQNG6oxmrrp
 CRC32    : 1dYwqA==
 HAVAL    : LBFzyApqoYn7ogzoROG5FpneBO1s7R3p
 GOST     : iJ1tWPLtYaxxoFDHZEW8gxCS3/pVlS1G

The AIDE run created a new database /var/lib/aide/aide.db.new with the following characteristics:
 Size     : 13041834
 Bcount   : 25480
 Inode    : 273627
 RMD160   : 4TKRFSc0nt/VGDVvPEY8U6YNzaw=
 TIGER    : o4RzDHHWBlH+Zt3P7vI8GHHgGV1OecrC
 SHA256   : Gher/aINaU8r73/lQEWLQQSsKqP7sGjO
 SHA512   : D0/w3S6NOLZHw7D7dt1QxYBXe6miP5hF
 CRC32    : 5SRdpg==
 HAVAL    : pe7+ai57TPpW34NjJgTQxs+cQsFJ9zq0
 GOST     : RrIiyspbpKEb5wEGSG2HTYM7N6NUtKSv

End of AIDE daily cron job at 2016-03-07 13:18, run time 102 seconds

So this report tells me that a log file for AIDE was rotated out, a new folder called new was created in my DocumentRoot, and the files/folders blah, test, and test1 were removed from my DocumentRoot.

Please remember that utilizing a tool to provide file integrity monitoring is only one part of a defense-in-depth strategy. There is no silver bullet for system security, but every layer you add will strengthen your security posture, helping you take a proactive approach to security.

Chroot SFTP-only users

In an environment where you have multiple developers working on different sites, or perhaps you have multiple clients hosting their websites on your solution, restricting access for those users can become important.

Security becomes a concern in a normal FTP environment, as it doesn't take much for a user to simply 'cd ..' and see what other users are on your server. We want a way to lock those users into their home directories so they cannot 'break out'.

It is important to consider how you are going to give those chrooted SFTP users access to their directories. Inside a chroot, simply using a symlink will not work, as the filesystem has no knowledge of the data outside that chroot. Therefore you would have to consider either:

1. Chrooting the user to their website's home directory
2. Chrooting the user to their home directory, then creating a bind mount to their website.

Both have their pros and cons; however, it could be argued that chrooting users to their home directory and using bind mounts is more secure, since it offers an added layer of protection: you are not relying solely on permissions, but also on the chroot itself. For the purposes of this article, we are going to default to chrooting users to their home directory, then creating a bind mount to their website.

To get started, first create the restricted SFTP-only group:

[root@web01 ~]# groupadd sftponly

Next, edit the sshd config to set up the internal-sftp subsystem. You will need to comment out the first entry as shown below:

[root@web01 ~]# vim /etc/ssh/sshd_config
...
# Subsystem       sftp    /usr/libexec/openssh/sftp-server
Subsystem     sftp   internal-sftp
...

Now, at the very bottom of /etc/ssh/sshd_config, set up the following block. It is important that this is created at the very end of the file:

[root@web01 ~]# vim /etc/ssh/sshd_config
...
Match Group sftponly
     ChrootDirectory %h
     X11Forwarding no
     AllowTcpForwarding no
     ForceCommand internal-sftp
...
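
Before restarting, it's wise to validate the configuration so a typo doesn't lock you out of SSH; sshd -t prints nothing when the file parses cleanly:

[root@web01 ~]# sshd -t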

Then restart SSHD by:

[root@web01 ~]# service sshd restart

Now that the foundation is complete, we can add chrooted SFTP-only users. You will notice I set the home directory to /home/chroot/bob. This is optional, but I prefer it so you can quickly tell the difference between regular users and SFTP-only users. To create a new user called bob with the proper group assignments and permissions:

[root@web01 ~]# mkdir -p /home/chroot
[root@web01 ~]# useradd -d /home/chroot/bob -s /bin/false -G sftponly bob
[root@web01 ~]# passwd bob
[root@web01 ~]# chmod 755 /home/chroot/bob
[root@web01 ~]# chown root:root /home/chroot/bob

Users will not be able to write any data within their home directory, since the home directory MUST be owned by root. If they need to be able to write files, create them a writable directory by:

[root@web01 ~]# mkdir /home/chroot/bob/files
[root@web01 ~]# chown bob:bob /home/chroot/bob/files

Now to allow them to access their content in /var/www/vhosts/domain.com, you need to create a bind mount:

[root@web01 ~]# vim /etc/fstab
...
/var/www/vhosts/domain.com   /home/chroot/bob/domain.com        none    bind    0 0
...

Finally, create the placeholder folder, and mount the bind mount:

[root@web01 ~]# mkdir /home/chroot/bob/domain.com
[root@web01 ~]# mount -a

Confirm user bob is set up in the right group, and that the bind-mounted directory /home/chroot/bob/domain.com has group-writable permissions. In my specific example, as the directory has the ownership apache:apache, I had to do the following:

[root@web01 ~]# usermod -a -G apache bob
[root@web01 ~]# chmod 775 /var/www/vhosts/domain.com
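
To verify everything works, a quick sanity check can be run from the server itself (assuming bob's password is set). An SFTP login should land bob at / inside his chroot, while a regular SSH login will be refused, since his shell is /bin/false and sshd forces internal-sftp:

[root@web01 ~]# sftp bob@localhost
sftp> pwd
Remote working directory: /
sftp> ls
domain.com  files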

Benchmark MySQL with Sysbench

Tuning your MySQL configuration day in and day out, without any idea of what the server's hardware can actually do in a perfect world, can be a bit frustrating. This is where a tool like sysbench comes into play. Sysbench allows you to get an idea of how MySQL will perform on your chosen server under load, using a basic set of tests.

It is important to note that this guide will not show you how to benchmark your existing MySQL dataset, but instead, it shows how your overall server will react to a generic MySQL dataset under heavy load.

Situations where this becomes useful are when you want to swap those SAS drives with SSDs, or perhaps when performing a comparison between running MySQL on a server vs using something like Amazon RDS or Rackspace Cloud Databases. It allows you to get a feel for where the bottlenecks may potentially come into play, perhaps from IO, network saturation, CPU, etc.

Getting started with sysbench is pretty straightforward. I'll outline how to create the test dataset, then perform a few benchmarks off that dataset. For the purposes of this article, I am most concerned about how many transactions per second MySQL can handle on my server in a perfect world.

First, log into your database server and create a new test database. Do not attempt to use an existing database with content, as sysbench will be populating it with its own tables. I posted two GRANT statements on purpose. Set the access, username, and password as needed for your environment:

[root@db01 ~]# mysql
mysql> create database sbtest;
mysql> grant all on sbtest.* to 'sysbench'@'%' identified by 'your_uber_secure_password';
mysql> grant all on sbtest.* to 'sysbench'@'localhost' identified by 'your_uber_secure_password';
mysql> flush privileges;

Next, log into your server running sysbench, and install it:

# CentOS 6
[root@sysbench01 ~]#  rpm -ivh http://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
[root@sysbench01 ~]#  yum install sysbench

# CentOS 7
[root@sysbench01 ~]#  rpm -ivh http://dl.fedoraproject.org/pub/epel/7/x86_64/e/epel-release-7-5.noarch.rpm
[root@sysbench01 ~]#  yum install sysbench

# Ubuntu 12.04 / Ubuntu 14.04
[root@sysbench01 ~]#  apt-get update
[root@sysbench01 ~]#  apt-get install sysbench

On the sysbench server, run sysbench with the prepare statement so it can generate a table with data to be used during the benchmark. This command will populate a table in the sbtest database with 1,000,000 rows of data, and force InnoDB:

[root@sysbench01 ~]# sysbench --test=oltp --oltp-table-size=1000000 --mysql-host=192.168.1.1 --mysql-db=sbtest --mysql-user=sysbench --mysql-password=your_uber_secure_password --db-driver=mysql --mysql-table-engine=innodb prepare

You can verify the table was written properly on your database server by:

[root@db01 ~]# mysql
mysql> use sbtest;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> show tables;
+------------------+
| Tables_in_sbtest |
+------------------+
| sbtest           |
+------------------+
1 row in set (0.00 sec)

mysql> select count(*) from sbtest;
+----------+
| count(*) |
+----------+
|  1000000 |
+----------+
1 row in set (0.13 sec)

Back on the server you are running sysbench from, we are going to run a benchmark using a read/write test (--oltp-read-only=off) for a max time of 60 seconds, using 64 threads, with the test mode set to complex (range queries, range SUM, range ORDER BY, inserts and updates on indexed as well as non-indexed columns, and row deletes).

[root@sysbench01 ~]# sysbench --test=oltp --oltp-table-size=1000000 --mysql-host=192.168.1.1 --mysql-db=sbtest --mysql-user=sysbench --mysql-password=your_uber_secure_password --max-time=60 --oltp-test-mode=complex --oltp-read-only=off --max-requests=0 --num-threads=64 --db-driver=mysql run

sysbench 0.4.12:  multi-threaded system evaluation benchmark

Running the test with following options:
Number of threads: 64

Doing OLTP test.
Running mixed OLTP test
Using Special distribution (12 iterations,  1 pct of values are returned in 75 pct cases)
Using "BEGIN" for starting transactions
Using auto_inc on the id column
Threads started!
Time limit exceeded, exiting...
(last message repeated 63 times)
Done.

OLTP test statistics:
    queries performed:
        read:                            1932084
        write:                           690030
        other:                           276012
        total:                           2898126
    transactions:                        138006 (2299.32 per sec.)
    deadlocks:                           0      (0.00 per sec.)
    read/write requests:                 2622114 (43687.09 per sec.)
    other operations:                    276012 (4598.64 per sec.)

Test execution summary:
    total time:                          60.0203s
    total number of events:              138006
    total time taken by event execution: 3839.0815
    per-request statistics:
         min:                                  8.76ms
         avg:                                 27.82ms
         max:                                313.65ms
         approx.  95 percentile:              50.64ms

Threads fairness:
    events (avg/stddev):           2156.3438/34.49
    execution time (avg/stddev):   59.9856/0.01

Let's say you want to run the same test, but perform it using read-only queries:

[root@sysbench01 ~]# sysbench --test=oltp --oltp-table-size=1000000 --mysql-host=192.168.1.1 --mysql-db=sbtest --mysql-user=sysbench --mysql-password=your_uber_secure_password --max-time=60 --oltp-test-mode=complex --oltp-read-only=on --max-requests=0 --num-threads=64 --db-driver=mysql run

Here is an example of running the test in read/write mode, and disconnecting and reconnecting after each query:

[root@sysbench01 ~]# sysbench --test=oltp --oltp-table-size=1000000 --mysql-host=192.168.1.1 --mysql-db=sbtest --mysql-user=sysbench --mysql-password=your_uber_secure_password --max-time=60 --oltp-test-mode=complex --oltp-read-only=off --max-requests=0 --num-threads=64 --db-driver=mysql --oltp-reconnect-mode=query run

Once you are done with your testing, you can clean up the database by:

[root@db01 ~]# mysql
mysql> drop database sbtest;
mysql> DROP USER 'sysbench'@'localhost';
mysql> DROP USER 'sysbench'@'%';
mysql> flush privileges;
mysql> quit

Load testing with Siege

Taken directly from the author's site at https://www.joedog.org/siege-home: Siege is an http load testing and benchmarking utility. It was designed to let web developers measure their code under duress, to see how it will stand up to load on the internet. Siege supports basic authentication, cookies, HTTP, HTTPS and FTP protocols. It lets its user hit a server with a configurable number of simulated clients. Those clients place the server “under siege.”

This tool becomes extremely useful when you need to get a feel for how a solution will hold up under normal or high traffic events. During these simulated traffic events, it may help expose inefficient database queries, CPU-intensive code, and opportunities for caching, or simply demonstrate the need to add additional web servers to increase overall site capacity.

Unlike many other load testers out there, Siege allows you to populate a file with a list of your URLs to help generate a more realistic simulation. While Siege can support a number of different tests, I generally keep it simple and basic. I'll outline how I utilize it below.

First, install siege:

# CentOS 6 / RedHat 6
[root@loadtest01 ~]# rpm -ivh http://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
[root@loadtest01 ~]# yum install siege

# CentOS 7 / RedHat 7
[root@loadtest01 ~]# rpm -ivh http://dl.fedoraproject.org/pub/epel/7/x86_64/e/epel-release-7-5.noarch.rpm
[root@loadtest01 ~]# yum install siege

# Ubuntu 12.04 / Ubuntu 14.04
[root@loadtest01 ~]# apt-get update
[root@loadtest01 ~]# apt-get install siege

Now click around your site, recording 10 or more URLs. Make sure the list includes various sections of your site that people would likely be visiting. For example, if running an e-commerce site, be sure to include the base domain several times, since that will usually be accessed most often. But also include several landing pages, a couple of products, and maybe the shopping cart. For example:

[root@loadtest01 ~]# vim /root/site.txt
http://example-store.com
http://example-store.com
http://example-store.com/cameras
http://example-store.com/electronics
http://example-store.com/home-decor/decorative-accents
http://example-store.com/home-decor/decorative-accents/herald-glass-vase
http://example-store.com/apparel/shirts
http://example-store.com/home-decor/books-music
http://example-store.com/home-decor/books-music/a-tale-of-two-cities.html
http://example-store.com/sale.html
http://example-store.com
http://example-store.com

You should now be able to run your load test:

[root@loadtest01 ~]# siege -c50 -d3 -t30M -i -f /root/site.txt

This load test will send 50 concurrent connections, with a random delay between 1 and 3 seconds before each request, lasting for 30 minutes, against the URLs listed in /root/site.txt.

A couple quick notes about the flags:

-c, --concurrent=NUM      CONCURRENT users, default is 10
-d, --delay=NUM           Time DELAY, random delay before each request
                            between 1 and NUM. (NOT COUNTED IN STATS)
-t, --time=NUMm           TIMED testing where "m" is modifier S, M, or H
                            ex: --time=1H, one hour test.
-i, --internet            INTERNET user simulation, hits URLs randomly.
-f, --file=FILE           FILE, select a specific URLS FILE.

While the simulated traffic test is running, things you will want to watch for on your solution include:
- Observing the CPU, memory, and IO usage of the servers.
- Checking the database to see if there are any intensive queries consistently running, perhaps indicating the need for Redis or Memcached.
- Checking the MySQL slow query log to see if there are queries that may need a table index, or otherwise need to be optimized.
- Checking that any caching software you have installed is returning a good hit rate.
- Ensuring the site remains online during the tests.
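
As a simple starting point, something like the following running in a couple of terminals will surface most of these at a glance (vmstat for CPU, memory, and IO; the processlist for queries currently executing):

[root@web01 ~]# vmstat 5
[root@db01 ~]# mysql -e "show full processlist;"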

Sometimes you want to load test a website that sits behind a username and password prompt provided by an htaccess file. To allow siege to authenticate, do the following:

[root@loadtest01 ~]# auth=`echo -n 'username:password' | openssl base64`
[root@loadtest01 ~]# siege -c50 -d3 -t30M --header="Authorization:Basic $auth" -i -f /root/site.txt

How to resize ext3 or ext4 filesystems

You have a 50G Amazon Elastic Block Storage device, or maybe a 50G Rackspace SSD Cloud Block Storage device that is running low on space. So you clone it and create a 100G drive. But when you mount it on your system, it still only shows 50G. What the #$%&!

This is because the partition and filesystem need to be expanded before they know about the extra space!

Assuming that you have the larger drive already attached to your system, first verify the drive size vs the partition size:

[root@web01 ~]# lsblk
NAME    MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
xvda    202:0    0   20G  0 disk 
└─xvda1 202:1    0   20G  0 part /
xvdb    202:16   0  100G  0 disk 
└─xvdb1 202:17   0   50G  0 part

Checking the output above, you will see that xvdb has 100G available, but the partition xvdb1 has only 50G. I'll outline how to expand the filesystem to use all the new space on this ext4 volume.

Unmount the disk if it's currently mounted:

[root@web01 ~]# umount /dev/xvdb1

Confirm the drive does not have any filesystem errors:

[root@web01 ~]# e2fsck /dev/xvdb1

Now comes the nail-biting part. We need to remove the partition, then recreate it so it will see the entire disk. This guide assumes there is only one partition on this drive. This will not remove the data; however, never trust notes that you didn't verify yourself. Make sure you have backups before proceeding! The keystrokes below are annotated for clarity:

[root@web01 ~]# fdisk /dev/xvdb
d        # delete the existing partition
n        # create a new partition
p        # make it a primary partition
1        # partition number 1
enter    # accept the default first sector
enter    # accept the default last sector (end of disk)
t        # set the partition type
83       # 83 = Linux
w        # write the new partition table and exit

Now, once the partition is recreated, you need to run another filesystem check:

[root@web01 ~]# e2fsck -f /dev/xvdb1
e2fsck 1.41.12 (17-May-2010)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
/dev/xvdb1: 13/3276800 files (0.0% non-contiguous), 251700/13107024 blocks

Then resize the filesystem:

[root@web01 ~]# resize2fs /dev/xvdb1
resize2fs 1.41.12 (17-May-2010)
Resizing the filesystem on /dev/xvdb1 to 26214055 (4k) blocks.
The filesystem on /dev/xvdb1 is now 26214055 blocks long.

Finally, mount the volume and confirm it shows the correct space:

[root@web01 ~]# mount -a
[root@web01 ~]# lsblk
NAME    MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
xvda    202:0    0   20G  0 disk 
└─xvda1 202:1    0   20G  0 part /
xvdb    202:16   0  100G  0 disk 
└─xvdb1 202:17   0  100G  0 part /mnt

[root@web01 ~]# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/xvda1       20G  1.2G   18G   7% /
tmpfs           496M     0  496M   0% /dev/shm
/dev/xvdb1       99G   60M   94G   1% /mnt

Logrotate examples

Logrotate is a useful application for automatically rotating your log files. If you choose to store certain logs in directories that logrotate doesn't know about, you need to create a definition for them.

I have posted a few articles about this for various scenarios, but I wanted to include one that just contains examples for reference.

Typically, entries for logrotate should be stored inside /etc/logrotate.d/

To rotate out the MySQL slow query log after it reaches 125M in size, keeping 2 rotated copies, use the following:

[root@db01 ~]# vim /etc/logrotate.d/mysqllogs
/var/lib/mysql/slow-log {
        missingok
        rotate 2
        size 125M
        create 640 mysql mysql
}

To rotate a custom log file for your application daily, keep 7 days' worth of logs, compress them, and ensure the ownership stays with apache:

[root@web01 ~]# vim /etc/logrotate.d/applicationname
/var/www/vhosts/example.com/application/logs/your_app.log {
        missingok
        daily
        rotate 7
        compress
        create 644 apache apache
}

If you would like to rotate your Holland backup logs weekly, keeping one month's worth of logs, compressing them, and ensuring the ownership stays with root:

[root@db01 ~]# vim /etc/logrotate.d/holland
/var/log/holland/holland.log {
    rotate 4
    weekly
    compress
    missingok
    create root adm
}

If you would like to rotate two logs using one definition, simply list both paths before the opening brace, as shown below:

[root@db01 ~]# vim /etc/logrotate.d/holland
/var/log/holland.log
/var/log/holland/holland.log {
    rotate 4
    weekly
    compress
    missingok
    create root adm
}
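
After creating or editing a definition, you can dry-run it to confirm logrotate parses the file and does what you expect; the -d flag prints what would happen without rotating anything:

[root@db01 ~]# logrotate -d /etc/logrotate.d/holland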

Postfix – Flush mail queue

I have seen servers where the Postfix mail queue is jammed up with mail, perhaps from a programming error in the application, or maybe spam if the site was hacked. During these times, you may just want to purge the queue since you don't want the messages going out. Below are some very simple methods of doing this:
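
Before purging anything, it's worth printing the queue first to see what is actually sitting in it:

[root@web01 ~]# postqueue -p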

How to remove all mail in the Postfix queue:

[root@web01 ~]# postsuper -d ALL

How to remove all email in the deferred queue:

[root@web01 ~]# postsuper -d ALL deferred

Remove all email in the queue for a specific domain:

[root@web01 ~]# postqueue -p | tail -n +2 | awk 'BEGIN { RS = "" } /@example\.com/ { print $1 }' | tr -d '*!' | postsuper -d -

Remove all email in the queue from a specific email address:

[root@web01 ~]# postqueue -p | tail -n +2 | awk 'BEGIN { RS = "" } /user@example\.com/ { print $1 }' | tr -d '*!' | postsuper -d -

Force fsck on next reboot

Ever needed to run an fsck on the root filesystem? I was happy to find that an old method of forcing a filesystem check on reboot still exists on most modern distributions. I tested this successfully on the following operating systems:

CentOS 6
Ubuntu 12.04
Ubuntu 14.04

Before proceeding, it is a really good idea to have a monitor and keyboard hooked up to the server, just in case the fsck requires manual intervention due to bad blocks or something, preventing a normal boot!

First, check to see when the server last had a filesystem check run. This command should be run against the device for /, such as /dev/sda1, or if using LVM, something like /dev/mapper/ubuntu--vg-root:

[root@web01 ~]# tune2fs -l /dev/mapper/ubuntu--vg-root |grep -iE "last|state"
Last mounted on:          /
Filesystem state:         clean
Last mount time:          Thu Sep 10 23:48:18 2015
Last write time:          Sun Mar  1 16:02:06 2015
Last checked:             Sun Mar  1 16:02:04 2015

As shown in the output above, an fsck has not been run in quite some time! So to force an fsck at next boot, simply type:

[root@web01 ~]# touch /forcefsck

Now reboot your server:

[root@web01 ~]# shutdown -r now

Once the server comes back online, confirm the fsck ran by:

[root@web01 ~]# tune2fs -l /dev/mapper/ubuntu--vg-root |grep -iE "last|state"
Last mounted on:          /
Filesystem state:         clean
Last mount time:          Thu Feb 18 18:40:34 2016
Last write time:          Thu Feb 18 18:40:32 2016

Finally, check to confirm the system removed the file /forcefsck:

[root@web01 ~]# ls -al /forcefsck
ls: cannot access /forcefsck: No such file or directory
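
If you would rather the filesystem be checked on a schedule instead of as a one-off, tune2fs can also enforce a periodic fsck after a number of mounts (-c) or a time interval (-i). For example, to check every 30 mounts or every 90 days, whichever comes first:

[root@web01 ~]# tune2fs -c 30 -i 90d /dev/mapper/ubuntu--vg-root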

Varnish 3 – Installation and configuration

Varnish is an HTTP reverse proxy that can be installed in front of the web server to provide caching. If the VCLs are properly configured for your site, Varnish can reduce the backend server load many times over.

In this guide, it is assumed you already have a running LAMP stack, and that Apache's vhost configurations are stored in /etc/httpd/vhost.d/. At the end of this guide, if all goes well, you will be running Varnish 3, with Varnish listening for inbound connections on port 80 and passing any backend connections that cannot be served from cache to Apache on port 8080.

CentOS 6 – Installation and initial configuration

Install the varnish-release package repository, then install:

[root@web01 ~]# rpm --nosignature -ivh https://repo.varnish-cache.org/redhat/varnish-3.0.el6.rpm
[root@web01 ~]# yum -y install varnish

Now update your Apache ports and vhosts to 8080 since Varnish will be listening on port 80:

[root@web01 ~]# sed -i "s/Listen 80\$/Listen 8080/g" /etc/httpd/ports.conf
[root@web01 ~]# sed -i "s/NameVirtualHost \*:80\$/NameVirtualHost \*:8080/g" /etc/httpd/ports.conf
[root@web01 ~]# sed -i "s/:80>/:8080>/g" /etc/httpd/vhost.d/*

Configure Varnish to pass connections back to Apache on port 8080:

[root@web01 ~]# sed -i 's/port = "80"/port = "8080"/g' /etc/varnish/default.vcl

Then update Varnish so it listens on port 80:

[root@web01 ~]# sed -i 's/VARNISH_LISTEN_PORT=6081$/VARNISH_LISTEN_PORT=80/g' /etc/sysconfig/varnish

Finally, restart Apache and Varnish:

[root@web01 ~]# service httpd restart
[root@web01 ~]# service varnish start
[root@web01 ~]# chkconfig varnish on

Ubuntu 12.04 / 14.04 – Installation and initial configuration

First, setup the Varnish repos:

# Ubuntu 12.04
[root@web01 ~]# curl -sL http://repo.varnish-cache.org/debian/GPG-key.txt | apt-key add -
[root@web01 ~]# echo "deb http://repo.varnish-cache.org/ubuntu/ precise varnish-3.0" > /etc/apt/sources.list.d/varnish.list

# Ubuntu 14.04
[root@web01 ~]# curl -sL http://repo.varnish-cache.org/debian/GPG-key.txt | apt-key add -
[root@web01 ~]# echo "deb http://repo.varnish-cache.org/ubuntu/ trusty varnish-3.0" > /etc/apt/sources.list.d/varnish.list

Now install Varnish:

[root@web01 ~]# apt-get update
[root@web01 ~]# apt-get install varnish

Next update your Apache ports and vhosts to 8080 since Varnish will be listening on port 80:

[root@web01 ~]# sed -i "s/Listen 80\$/Listen 8080/g" /etc/apache2/ports.conf
[root@web01 ~]# sed -i "s/NameVirtualHost \*:80\$/NameVirtualHost \*:8080/g" /etc/apache2/ports.conf
[root@web01 ~]# sed -i "s/:80>/:8080>/g" /etc/apache2/sites-available/*

Configure Varnish to pass connections back to Apache on port 8080:

[root@web01 ~]# sed -i 's/port = "80"/port = "8080"/g' /etc/varnish/default.vcl

Then update Varnish so it listens on port 80:

[root@web01 ~]# sed -i 's/^DAEMON_OPTS="-a :6081/DAEMON_OPTS="-a :80/g' /etc/default/varnish
[root@web01 ~]# sed -i 's/START=no/START=yes/' /etc/default/varnish
[root@web01 ~]# service apache2 restart
[root@web01 ~]# service varnish restart

Varnish VCL configuration examples

All of the tuning takes place within the VCLs. For the purpose of this guide, we are going to just use the default Varnish configuration file, /etc/varnish/default.vcl, for our examples.
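
Whenever you edit the VCL, it's worth syntax-checking it before restarting; varnishd -C compiles the file and prints the generated configuration if it's valid, or an error if it's not:

[root@web01 ~]# varnishd -C -f /etc/varnish/default.vcl
[root@web01 ~]# service varnish restart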

How to enable basic caching of static resources:

sub vcl_recv {
...
    if (req.url ~ "\.(html|gif|jpg|jpeg|png|js|css)$") {
        unset req.http.cookie;
        return(lookup);
    }
    return(pass);
}
...

If the request is coming in from CloudFlare or a load balancer, here is how to set the real IP of the client:

sub vcl_recv {
...
     if (req.restarts == 0) {
        if (req.http.x-forwarded-for) {
            set req.http.X-Forwarded-For =
                req.http.X-Forwarded-For + ", " + client.ip;
        } else {
            set req.http.X-Forwarded-For = client.ip;
        }
     }
...

Here is an example of how to exclude things like phpmyadmin, apc.php, and server-status from being cached:

sub vcl_recv {
...
if (req.url ~ "(?i)/(phpmyadmin|apc.php|server-status)") {
      return(pass);
    }
...

Here is how you can exclude a specific URL from being cached:

sub vcl_recv {
...
     if (req.url ~ "^/example") {
     return (pass);
     }
...

Perhaps you have 30 domains on the server, and you need one of them to be excluded from the cache. Or maybe you're actively working on the site. Here is how you can prevent a domain from being served through Varnish:

sub vcl_recv {
...
    if (req.http.host ~ "^(www.)?domain.com") {
    return (pass);
    }
...

If you find a script you run via your browser, and suspect it is timing out due to Varnish, you can adjust the timeout for that specific script. Note that in Varnish 3, bereq.first_byte_timeout is only settable from a backend-request subroutine such as vcl_pass or vcl_miss, not vcl_recv:

sub vcl_pass {
...
    if (req.url ~ "^/bigscript.php") {
        set bereq.first_byte_timeout = 10m;
    }
...

Here is an example of how to never cache PUT, POST, and DELETE requests for a domain:

sub vcl_recv {
...
if ( req.http.host == "subdomain.domain.com" ) {
    if (req.request == "PUT" || req.request == "POST" || req.request == "DELETE")
    {
        return(pass);
    }
}
...

Varnish Troubleshooting

One of the most common errors I see on sites utilizing Varnish is an error message like:

Error 503 Service Unavailable
Guru Meditation:
XID: 1234567

Typically Varnish is not the problem; instead, something else, such as Apache or PHP-FPM (aka the backend), is not available. If you can replicate the error in your browser, run the following command to see if you can catch the issue in the logs as it happens:

[root@web01 ~]# varnishlog -d -c -m TxStatus:503

This will return a bunch of output. You are most interested in the lines surrounding 'FetchError', as shown below:

   11 SessionOpen  c 192.168.1.56 60015 :80
   11 ReqStart     c 192.168.1.56 60015 311889525
   11 RxRequest    c GET
   11 RxURL        c /
   11 RxProtocol   c HTTP/1.1
   11 RxHeader     c Host: example.com
   11 RxHeader     c Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
   11 RxHeader     c Connection: keep-alive
   11 RxHeader     c Cookie: wordpress_test_cookie=WP+Cookie+check; wp-settings-time-1=1455921695
   11 RxHeader     c User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_3) AppleWebKit/601.4.4 (KHTML, like Gecko) Version/9.0.3 Safari/601.4.4
   11 RxHeader     c Accept-Language: en-us
   11 RxHeader     c DNT: 1
   11 RxHeader     c Accept-Encoding: gzip, deflate
   11 VCL_call     c recv pass
   11 VCL_call     c hash
   11 Hash         c /
   11 Hash         c example.com
   11 VCL_return   c hash
   11 VCL_call     c pass pass
   11 FetchError   c no backend connection
   11 VCL_call     c error deliver
   11 VCL_call     c deliver deliver
   11 TxProtocol   c HTTP/1.1
   11 TxStatus     c 503
   11 TxResponse   c Service Unavailable
   11 TxHeader     c Server: Varnish
   11 TxHeader     c Content-Type: text/html; charset=utf-8
   11 TxHeader     c Retry-After: 5
   11 TxHeader     c Content-Length: 418
   11 TxHeader     c Accept-Ranges: bytes
   11 TxHeader     c Date: Fri, 19 Feb 2016 23:04:03 GMT
   11 TxHeader     c X-Varnish: 311889525
   11 TxHeader     c Age: 0
   11 TxHeader     c Via: 1.1 varnish
   11 TxHeader     c Connection: close
   11 Length       c 418
   11 ReqEnd       c 311889525 1455923043.127803802 1455923043.128304243 0.000658751 0.000423908 0.000076532

In the example above, the 'FetchError' line shows the request returned a 503 because Apache was not running, which is why it says "no backend connection".
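
If you have a health probe defined on your backend, Varnish 3 can also report the backend health state directly through the management interface:

[root@web01 ~]# varnishadm debug.health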

MySQL Slow Query Log

Ever wonder why MySQL is spiking the CPU on your database server during the worst possible times? Then when you log into MySQL and run a ‘show processlist;’, you can never seem to catch exactly what query was running? The MySQL slow query log can help!

In short, the slow query log is a recording of all queries that took longer than a specified period of time to run. This information becomes valuable as it helps identify which queries are very intensive, which may lead you to create a table index, or perhaps reveal a plugin for your site that is performing queries in a very inefficient manner.

On most systems, the slow query log is disabled by default. However, enabling it is very simple and can be applied without restarting MySQL.

To get started, first create the log file and set the proper permissions:

[root@db01 ~]# touch /var/lib/mysql/slow-log
[root@db01 ~]# chown mysql:mysql /var/lib/mysql/slow-log
[root@db01 ~]# chmod 640 /var/lib/mysql/slow-log

Now enable the slow query log without restarting MySQL. The commands below will instruct MySQL to log any query that takes longer than 2 seconds:

[root@db01 ~]# mysql
mysql> SET GLOBAL slow_query_log=1;
mysql> SET GLOBAL slow_query_log_file="/var/lib/mysql/slow-log";
mysql> SET GLOBAL long_query_time=2;
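
You can confirm the settings took effect right away by checking the server variables:

mysql> SHOW VARIABLES LIKE 'slow_query_log%';
mysql> SHOW VARIABLES LIKE 'long_query_time';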

Now, update the system's my.cnf so the changes will persist if MySQL is restarted in the future:

[root@db01 ~]# vim /etc/my.cnf
[mysqld]
...
#log-output = FILE
slow-query-log = 1
slow-query-log-file = /var/lib/mysql/slow-log
long-query-time = 2
#log-queries-not-using-indexes = 0
...

Finally, don't forget to set up log rotation for the slow query log, as this can grow very large:

[root@db01 ~]# vim /etc/logrotate.d/mysqllogs
/var/lib/mysql/slow-log {
        missingok
        rotate 2
        size 125M
        create 640 mysql mysql
}

Now, with the slow query log enabled, simply check the log file to see what queries took longer than 2 seconds to run. I'll post an example below:

[root@db01 ~]# cat /var/lib/mysql/slow-log
# Time: 160210 22:45:25
# User@Host: wpadmin[wordpressdb] @ localhost []
# Query_time: 14.609104  Lock_time: 0.000054 Rows_sent: 4  Rows_examined: 83532
SET timestamp=1301957125;
SELECT * FROM wp_table WHERE `key`='5544dDSDFjjghhd2544xGFDE' AND `carrier`='13';

If the above query shows up repeatedly in the log file, perhaps running several times a minute, it could be a possible culprit for our CPU issues with MySQL. The key fields to pay attention to are "Rows_examined:" and the actual query itself.

If the "Rows_examined:" field is reporting more than a few hundred rows, that could mean you may need to create a table index. The query itself is also important, because if a table index is not possible for one reason or another, your developer will be able to review the query and possibly optimize or rewrite it so it returns fewer rows to MySQL, making the query more CPU-friendly.
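
Once the slow query log grows large, mysqldumpslow (shipped with the MySQL server packages) can summarize it so repeat offenders stand out. For example, to show the top 10 queries sorted by how often they appear:

[root@db01 ~]# mysqldumpslow -s c -t 10 /var/lib/mysql/slow-log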