February 25

BackupPC – rebuilding your /etc/BackupPC/hosts file

BackupPC makes it awkward to remove old hosts whose backups are no longer needed, and over time stale servers pile up in the BackupPC “Host” drop-down list. It can be easier to delete the unwanted hosts' .pl config files from your pc/ directory and then regenerate the hosts file from what remains. Use the following steps to accomplish just that:

[root@backupserver pc]# for i in $(ls /etc/BackupPC/pc | sed 's/\.pl$//'); do grep "^$i[[:space:]]" /etc/BackupPC/hosts; done > /etc/BackupPC/hosts-NEW
[root@backupserver pc]# cd /etc/BackupPC
[root@backupserver BackupPC]# mv hosts hosts.BAK
[root@backupserver BackupPC]# mv hosts-NEW hosts
[root@backupserver BackupPC]# chown apache:apache hosts
[root@backupserver BackupPC]# /etc/rc.d/init.d/backuppc restart
Shutting down BackupPC: [ OK ]
Starting BackupPC: [ OK ]
[root@backupserver BackupPC]#
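
The one-liner above keeps only lines that match a remaining host name, so comment lines and the "host dhcp user moreUsers" header line in the hosts file are likely to be dropped. A minimal sketch of a more careful rebuild follows. It assumes the usual layout of one HOST.pl file per host in the pc/ directory and the host name in the first column of the hosts file; rebuild_hosts is a hypothetical helper name, not part of BackupPC.

```shell
#!/bin/sh
# Sketch: rebuild a BackupPC hosts file from the per-host configs that remain.
# rebuild_hosts is a hypothetical helper, not something BackupPC ships.
# Usage: rebuild_hosts PC_DIR HOSTS_FILE > NEW_HOSTS_FILE
rebuild_hosts() {
    pc_dir=$1
    hosts_file=$2
    # Host names that still have a config file, one per line.
    keep=$(for f in "$pc_dir"/*.pl; do
        if [ -e "$f" ]; then
            basename "$f" .pl
        fi
    done)
    # Pass through blank lines, comments and the "host ..." header line;
    # keep a host entry only when its first field is an exact match.
    printf '%s\n' "$keep" | awk '
        NR == FNR { if ($0 != "") keep[$0] = 1; next }
        /^[ \t]*(#|$)/ || $1 == "host" || ($1 in keep) { print }
    ' - "$hosts_file"
}
```

Something like rebuild_hosts /etc/BackupPC/pc /etc/BackupPC/hosts > /etc/BackupPC/hosts-NEW would then stand in for the for loop. Because the first field is compared exactly, a remaining web1.pl does not also keep a stale web10 entry.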

Category: Linux
February 22

apache_conf_distiller User data set has no ‘main_domain’ key.

After a server was hacked recently, the attackers replaced every file named index, default, or main with their typical 0wned-by message and JavaScript. Among the “main” files affected were the /var/cpanel/userdata/USER/main files. These files contain important cPanel domain information that apache_conf_distiller requires to build a new httpd.conf. The following steps should regenerate working Apache userdata and fix subdomains. Thanks to Josh for finding userdata_update!

/etc/init.d/httpd stop
mv /usr/local/apache/conf/httpd.conf /usr/local/apache/conf/httpd.conf-notworking
cp -a OLDHTTPD.CONF /usr/local/apache/conf/
mv /var/cpanel/userdata /var/cpanel/userdata-BAK
cp -a /var/cpanel/userdata /usr/local/apache/conf
/etc/init.d/httpd start

Run /usr/local/cpanel/bin/apache_conf_distiller --update to ensure the main_domain key errors are gone.
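
To see which accounts are still affected before or after the rebuild, a quick check along these lines may help. It is only a sketch: it assumes each /var/cpanel/userdata/USER/main file is a YAML-style file with a top-level main_domain: line, and list_missing_main_domain is a hypothetical helper name, not a cPanel tool.

```shell
#!/bin/sh
# Sketch: list users whose userdata "main" file has no main_domain key.
# Assumes a top-level "main_domain: example.com" line in each main file.
# list_missing_main_domain is a hypothetical helper, not a cPanel script.
# Usage: list_missing_main_domain USERDATA_DIR
list_missing_main_domain() {
    userdata_dir=$1
    for main in "$userdata_dir"/*/main; do
        # Print the user (directory) name when the key is absent.
        if [ -f "$main" ] && ! grep -q '^main_domain:' "$main"; then
            basename "$(dirname "$main")"
        fi
    done
}
```

Running list_missing_main_domain /var/cpanel/userdata should print nothing once every main file has been regenerated.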

February 8

Finding problem CRON jobs

Find cronjobs that were modified recently:

[root@SERVER cron]# find /var/spool/cron -type f -mtime -3 -exec ls -al {} +

After commenting out suspect lines in the listed users’ crontabs, you can dump the process list to a file every 5 seconds with:

[root@SERVER ~]# touch /root/ps-list.txt
[root@SERVER ~]# watch -n 5 "ps aux >> /root/ps-list.txt"

If the server crashes, you can then review the last few lines of /root/ps-list.txt to see which processes appear to be overwhelming the server.
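
The watch approach needs a terminal to stay open and writes no timestamps, so it can be hard to tell which snapshot was the last one before a crash. A minimal sketch of a timestamped alternative is below. log_snapshots is a hypothetical helper name, and the optional trailing command is an addition for illustration that defaults to ps aux.

```shell
#!/bin/sh
# Sketch: append timestamped snapshots of a command's output to a file.
# log_snapshots is a hypothetical helper; the command defaults to "ps aux".
# Usage: log_snapshots OUTFILE INTERVAL_SECONDS COUNT [COMMAND...]
log_snapshots() {
    outfile=$1
    interval=$2
    count=$3
    shift 3
    # Default to the process list when no command is given.
    [ $# -gt 0 ] || set -- ps aux
    i=0
    while [ "$i" -lt "$count" ]; do
        {
            # Header line so each snapshot can be located by time.
            printf '=== %s ===\n' "$(date '+%Y-%m-%d %H:%M:%S')"
            "$@"
        } >> "$outfile" 2>&1
        i=$((i + 1))
        sleep "$interval"
    done
}
```

For example, nohup sh -c '. ./log_snapshots.sh; log_snapshots /root/ps-list.txt 5 17280' & would cover roughly a day at 5-second intervals, assuming the function was saved to a file named log_snapshots.sh.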

Category: Linux
February 1

Rebuilding raid with mdadm

Look at mdstat to see if a partition has been dropped from the array:

root@SERVER [~]# cat /proc/mdstat 
Personalities : [raid1] 
md0 : active raid1 sdb1[1] sda1[0]
      521984 blocks [2/2] [UU]
md1 : active raid1 sda3[0]
      483668864 blocks [2/1] [U_]

The [U_] shows that sdb3 is out of the array md1. To add /dev/sdb3 back into the array, we do the following:

root@SERVER [~]# mdadm /dev/md1 -a /dev/sdb3 
mdadm: re-added /dev/sdb3
root@SERVER [~]# cat /proc/mdstat 
Personalities : [raid1] 
md0 : active raid1 sdb1[1] sda1[0]
      521984 blocks [2/2] [UU]

md1 : active raid1 sdb3[1] sda3[0]
      483668864 blocks [2/1] [U_]
      [>....................]  recovery =  0.0% (2432/483668864) finish=6448.8min speed=1216K/sec

Raising the minimum rebuild speed (the value is in KB/s) with echo 100000 > /proc/sys/dev/raid/speed_limit_min speeds up the software RAID rebuild:

root@SERVER [~]# cat /proc/mdstat 
Personalities : [raid1] 
md0 : active raid1 sdb1[1] sda1[0]
      521984 blocks [2/2] [UU]
md1 : active raid1 sdb3[2] sda3[0]
      483668864 blocks [2/1] [U_]
      [==>..................]  recovery = 11.0% (53583104/483668864) finish=1030.4min speed=6954K/sec
unused devices: <none>
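
Spotting the [U_] by eye works for one server. For a cron job or a quick check across several machines, the same test can be scripted. The sketch below reads mdstat-format text on standard input and prints the name of every array whose status field contains an underscore. degraded_arrays is a hypothetical helper name, and it assumes the status field appears on the line after the array name, as in the output above.

```shell
#!/bin/sh
# Sketch: print the md arrays that are missing a member.
# degraded_arrays is a hypothetical helper; it reads mdstat text on stdin.
# Usage: degraded_arrays < /proc/mdstat
degraded_arrays() {
    awk '
        # Remember the array name from lines such as "md1 : active raid1 ...".
        /^md[0-9]+ :/ { array = $1; next }
        # The status field looks like [UU] or [U_]; "_" marks a missing member.
        array != "" && /\[[U_]+\]/ {
            if (match($0, /\[[U_]+\]/) && index(substr($0, RSTART, RLENGTH), "_"))
                print array
            array = ""
        }
    '
}
```

An empty result means every array reports all members present.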
Category: Linux
January 28

Environment Monitoring

Set up an APC AP9319 environmental monitoring unit today. It has temperature and humidity probes for basic environmental monitoring of the area surrounding a rack. I’ve got it rigged up to a light beacon to flash an orange strobe and send an email if the alert criteria are met. See AP9319

According to APC, the AP9319 has since been replaced by their uber expensive NetBotz Rack Monitor 200. It seems like it offers more than you would actually need for individual zone monitoring.

January 27

Exim problem

While attempting to send outgoing mail through our exim relays, our Qmail mailbox was saying:

@400000004b60848425285634 delivery 12799: deferral: Connected_to_216.120.xxx.xxx_but_connection_died._Possible_duplicate!_(#4.4.2)/

On the relay:

2010-01-27 13:25:31 1NaCa3-0000zJ-AT demime acl condition: error while creating mbox spool file
2010-01-27 13:25:34 1NaCa3-0000zJ-AT H=smtp-2.******.net [216.120.xxx.xxx] F=<sherry@domain.com> temporarily rejected after DATA

This was resolved by clearing out the 30,000+ directories in /var/spool/exim/scan on the exim relay server left over from clamd scanning.

/etc/rc.d/init.d/exim stop; cd /var/spool/exim/scan && find . -mindepth 1 -maxdepth 1 -type d -exec rm -rf {} +; /etc/rc.d/init.d/exim start
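
Since this involves rm -rf, a slightly more defensive version may be worth keeping around. The sketch below refuses to run unless it is given an existing directory, removes only the subdirectories directly inside it, and reports how many it removed. clean_scan_dir is a hypothetical helper name, and -mindepth/-maxdepth assume GNU or BSD find.

```shell
#!/bin/sh
# Sketch: remove the leftover scan subdirectories inside one directory.
# clean_scan_dir is a hypothetical helper; stop exim before running it.
# Usage: clean_scan_dir /var/spool/exim/scan
clean_scan_dir() {
    scan_dir=$1
    # Refuse to run on an empty or missing path.
    if [ -z "$scan_dir" ] || [ ! -d "$scan_dir" ]; then
        echo "clean_scan_dir: not a directory: $scan_dir" >&2
        return 1
    fi
    # Count first so the result can be reported afterwards.
    before=$(find "$scan_dir" -mindepth 1 -maxdepth 1 -type d | wc -l)
    # Delete only the immediate subdirectories; regular files are left alone.
    find "$scan_dir" -mindepth 1 -maxdepth 1 -type d -exec rm -rf {} +
    echo "removed $((before)) directories from $scan_dir"
}
```

Using -exec ... {} + batches many directories into each rm call, which matters when there are 30,000 of them.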
Category: Email, Linux