Setting up a UPS link with an Ubuntu server
I've bought a new server (built from parts) to replace the old one, which had leaking capacitors and died.
With it, I bought a new UPS (Uninterruptible Power Supply) to replace the old one, which no longer works either...
I searched the internet for how to set up this UPS with my Linux server (Ubuntu Server 8.04.1), and here is the result of my search and work:
When a power outage is detected, the server waits for a while (the delay is configured according to the capacity of the UPS and the power consumption of the server), then sends a mail with the UPS state just before the shutdown, and then shuts down. If the power comes back during the wait, the shutdown is cancelled.
The UPS state is useful because it shows the health of your UPS when it goes on battery.
- If the charge is really low when the server shuts down, you might want to reduce the time it waits on battery before shutting down, or replace the battery (or the UPS).
- If the charge is really close to 100, you might want to increase the time on battery.
The scripts that send the mail are located in my home directory, /home/thomas.
My UPS is an MGE Ellipse 750VA connected through USB.
(A serial cable is also available.)
NUT install & configure
NUT is the piece of software that communicates with the UPS and shuts your server down properly.
I've read that MGE is involved in NUT development, which is worth mentioning, as it's not that often that a company does this.
NUT documentation is for nuts ;) (easy joke). Well, it's not that easy to get into.
Instead I read a French tutorial written by Olivier Van Hoof, and this one also helped: http://wiki.monserveurperso.com/wakka.php?wiki=NutInstall (also in French)
Several values (the UPS name, the user name and password, the script path) are shared between the configuration files and script excerpts below. If you change one of them to match your setup, change every occurrence of it.
Let's install NUT
a piece of cake :
sudo apt-get install nut
Edit configuration files
Note you can download a copy of my configuration files here : http://mansonthomas.com/blogger/download/nut.tar.bz2
Configuration files are in /etc/nut
No configuration files are created by default. If you want examples, you can find some in /usr/share/doc/nut/examples/
First we need to tell the system to start NUT's daemons at boot:
sudo vi /etc/default/nut
Change the first two 'no' to 'yes':
# start upsd
START_UPSD=yes
# start upsmon
START_UPSMON=yes
Now let's declare our UPS. The name within the brackets is your UPS name (no spaces allowed):
sudo vi /etc/nut/ups.conf
[MGE-Ellipse750]
driver = usbhid-ups
port = auto
desc = "MGE UPS Systems UPS"
The most important thing is to find the driver that handles your UPS.
You can find it here : http://eu1.networkupstools.org/compat/stable.html
(beware, this list is for the latest stable version of NUT, which might not be the version installed on your server)
For a USB connection, the port is 'auto'.
If you use a serial cable instead, you should do this:
sudo chmod 0600 /dev/ttyS0
sudo chown nut:nut /dev/ttyS0
and put /dev/ttyS0 instead of auto.
sudo vi /etc/nut/upsd.conf
ACL all 0.0.0.0/0
ACL localhost 127.0.0.1/32
ACCEPT localhost
REJECT all
This file defines who is allowed to connect. In my case, only localhost.
I guess this file matters most when you have several computers connected to one big UPS.
The next file defines a user 'thomas' with the password 'test', who is allowed to connect from localhost and to control the UPS (upsmon master):
sudo vi /etc/nut/upsd.users
[thomas]
password = test
allowfrom = localhost
upsmon master
sudo vi /etc/nut/upsmon.conf
# define the ups to monitor and the permissions
MONITOR MGE-Ellipse750@localhost 1 thomas test master
# define the shutdown command
SHUTDOWNCMD "/sbin/shutdown -h now"
# launch the upssched program to execute script with delay
NOTIFYCMD /sbin/upssched
NOTIFYFLAG ONBATT SYSLOG+WALL+EXEC
NOTIFYFLAG ONLINE SYSLOG+WALL+EXEC
This file basically tells NUT what to do on a power outage.
The first instruction (MONITOR) defines which UPS is monitored, on which machine, the user and password used to connect, and the type: this is the master server.
Every part of this line comes from the configuration files we created earlier, so if you change them, change this line too.
SHUTDOWNCMD is the command executed to shut the server down.
NOTIFYCMD is called when the power status changes (OB: on battery, OL: on line).
Here we call /sbin/upssched, which lets us say: OK, if the power is not back in X seconds, do something.
The events that trigger NOTIFYCMD are set by the last two lines (ONBATT and ONLINE).
sudo vi /etc/nut/upssched.conf
# the script to be executed
CMDSCRIPT /home/thomas/scripts/alertAndShutdown.php
# mandatory fields that must be set before AT commands
PIPEFN /var/run/nut/upssched.pipe
LOCKFN /var/run/nut/upssched.lock
# the timers, here 30 sec after the ONBATT (ups on battery) event
AT ONBATT * START-TIMER onbatt 30
# cancel the countdown if power is back
AT ONLINE * CANCEL-TIMER onbatt
In this file we define the script called when the timer expires (we'll look at this script later).
PIPEFN and LOCKFN define two technical resources (a pipe and a lock file) that upssched needs.
The last two lines define how NUT reacts to the events, which in plain English gives:
On an ONBATT event, start a timer called onbatt for 30 seconds; when those 30 seconds have elapsed, CMDSCRIPT is called, unless an ONLINE event is received first.
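To make the timer rule concrete, here is a tiny shell sketch of the decision. It is not NUT code: the function name timer_outcome and its arguments are made up for illustration, and what happens when power returns at exactly the 30th second is an arbitrary choice here.

```shell
#!/bin/sh
# Illustration only: mimics the START-TIMER / CANCEL-TIMER decision.
# $1 = timer length in seconds (30 in the upssched.conf above)
# $2 = seconds between the ONBATT and ONLINE events,
#      or "never" if the power did not come back
timer_outcome() {
    delay=$1
    back_after=$2
    if [ "$back_after" = "never" ] || [ "$back_after" -gt "$delay" ]; then
        echo "timer expired: CMDSCRIPT is called with argument onbatt"
    else
        echo "timer cancelled: nothing is executed"
    fi
}
```

For example, timer_outcome 30 19 reports a cancelled timer, while timer_outcome 30 never reports that the script is called.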
Now that we've defined our configuration files, we need to secure them :
sudo chown root:nut /etc/nut/*
sudo chmod 640 /etc/nut/*
At first I created a dedicated directory in /var/run named upsshed, but the directory is deleted when the server restarts... so instead I used the existing /var/run/nut directory.
If the directory is not there, you get this error:
Oct 8 19:29:27 home upssched[5324]: Failed to connect to parent and failed to create parent: No such file or directory
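A quick way to catch this before restarting NUT is to check that the parent directory of PIPEFN and LOCKFN exists. This is only a minimal sketch; the helper name check_parent_dir is mine, not part of NUT.

```shell
#!/bin/sh
# Hypothetical helper: report whether the directory that should hold a
# upssched pipe or lock file exists.
check_parent_dir() {
    dir=$(dirname "$1")
    if [ -d "$dir" ]; then
        echo "ok: $dir exists"
    else
        echo "missing: $dir (upssched cannot create $1)"
        return 1
    fi
}
```

With the configuration above you would run check_parent_dir /var/run/nut/upssched.pipe and check_parent_dir /var/run/nut/upssched.lock.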
Restart the NUT daemon
I recommend opening two consoles. In the first, run:
sudo tail -f /var/log/daemon.log
which displays whatever is appended to the daemon.log file.
It's useful for understanding what's going wrong.
Then, in the second, restart NUT:
sudo service nut stop; sudo service nut start
which gave for me :
the stop :
Oct 8 21:13:48 home upsd[5171]: Signal 15: exiting
Oct 8 21:13:48 home upsmon[5174]: Signal 15: exiting
Oct 8 21:13:48 home upsmon[5173]: upsmon parent: read
Oct 8 21:13:48 home usbhid-ups[5169]: Signal 15: exiting
the start :
Oct 8 21:14:26 home usbhid-ups[5498]: Startup successful
Oct 8 21:14:26 home upsd[5499]: listening on 0.0.0.0 port 3493
Oct 8 21:14:26 home upsd[5499]: Connected to UPS [MGE-Ellipse750]: usbhid-ups-MGE-Ellipse750
Oct 8 21:14:26 home upsd[5500]: Startup successful
Oct 8 21:14:26 home upsmon[5502]: Startup successful
Oct 8 21:14:26 home upsd[5500]: Connection from 127.0.0.1
Oct 8 21:14:26 home upsd[5500]: Client thomas@127.0.0.1 logged into UPS [MGE-Ellipse750]
As I was doing many other things while setting up my UPS, the following may not be exact:
I had to stop and start NUT several times before my changes took effect (but as I said, maybe I simply forgot to save the configuration files, many times).
Test the connection with the UPS
Now that NUT has started successfully, we can check that the UPS is reachable:
thomas@home:/etc/nut$ upsc MGE-Ellipse750@localhost
battery.charge: 100
battery.charge.low: 30
battery.runtime: 2818
battery.type: PbAc
driver.name: usbhid-ups
driver.parameter.pollfreq: 30
driver.parameter.pollinterval: 2
driver.parameter.port: auto
driver.version: 2.2.1-
driver.version.data: MGE HID 1.01
driver.version.internal: 0.32
input.transfer.high: 264
input.transfer.low: 184
outlet.0.desc: Main Outlet
outlet.0.id: 1
outlet.0.switchable: no
outlet.1.desc: PowerShare Outlet 1
outlet.1.id: 2
outlet.1.status: on
outlet.1.switchable: no
outlet.2.desc: PowerShare Outlet 2
output.frequency.nominal: 50
output.voltage: 230.0
output.voltage.nominal: 230
ups.beeper.status: enabled
ups.delay.shutdown: -1
ups.delay.start: -10
ups.load: 0
ups.mfr: MGE OPS SYSTEMS
ups.model: Ellipse 750
ups.power.nominal: 750
ups.productid: ffff
ups.serial: BDCJ2303A
ups.status: OL CHRG
ups.vendorid: 0463
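upsc also accepts a variable name after the UPS name (for example upsc MGE-Ellipse750@localhost battery.charge), which prints just that value. When you only have the full listing at hand (in a log or in a mail), a small filter does the same job. This is a sketch; the function name ups_value is only for illustration.

```shell
#!/bin/sh
# Print the value of one variable from "name: value" lines read on stdin,
# which is the format upsc prints. The match is literal, so asking for
# battery.charge does not also match battery.charge.low.
ups_value() {
    awk -v key="$1" 'index($0, key ": ") == 1 { print substr($0, length(key) + 3) }'
}
```

Typical use would be upsc MGE-Ellipse750@localhost | ups_value ups.status.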
If you want to test the SHUTDOWNCMD instruction, this triggers the event that runs the shutdown command:
(beware, this will shut down your server if SHUTDOWNCMD is properly set)
/sbin/upsmon -c fsd
This should stop the computer.
Now, about the /home/thomas/scripts/alertAndShutdown.php script.
You can find the set of scripts used here:
http://mansonthomas.com/blogger/download/UPS-scripts.tar.bz2
Download the file and have a look inside to check there is nothing bad in it (you should always check ;)
cd
tar jxf UPS-scripts.tar.bz2
This creates a scripts directory in your home directory.
First: I use PHP as my Linux scripting engine because it's simple yet powerful, I know the language very well, and it was already installed on my Linux box for other needs.
To get it:
sudo apt-get install php5 php5-cli
Before we proceed, we need to add a group.
I run several scripts as several users (root, thomas, nut, etc.), and some of them contain private data such as passwords (Linux account, MySQL, etc.), so we need all the users that run these scripts, and only them, to be able to read these files.
To do so, we'll create a group, add all those users to it, and change the file ownership and permissions so that the files belong to this group.
The scripts would be better placed in /usr/local or similar, but I personally prefer to keep them in my home directory on my small personal server (well, maybe I'll move them into /usr/local and create a link to them, which would be much cleaner ;)
sudo groupadd scriptExecutor
sudo chmod 770 /home/thomas
sudo chown thomas:scriptExecutor /home/thomas
sudo chown -R thomas:scriptExecutor /home/thomas/scripts
sudo chmod 640 /home/thomas/scripts/config/config.php
sudo chmod 770 /home/thomas/scripts/alertAndShutdown.php
sudo chmod 640 /home/thomas/scripts/mail.php
sudo chmod 640 /home/thomas/scripts/lib/*
# -a appends to the user's existing groups. Without it, -G replaces the whole list, and thomas (a sudoer) would lose the admin group and could no longer sudo.
sudo usermod -a -G scriptExecutor thomas
sudo usermod -a -G scriptExecutor root
sudo usermod -a -G scriptExecutor nut
Note: if something is wrong with the script permissions (check permissions and ownership on each directory from / down to the script), you'll get this kind of warning:
Oct 8 01:12:48 home upssched[5355]: exec_cmd(/home/thomas/scripts/alertAndShutdown.php onbatt) returned 126
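Exit status 126 is what the shell reports when a command is found but cannot be executed, which usually points to a permission problem. To check each directory on the way to the script, something like this lists them one by one. It is a sketch, and path_chain is a made-up helper name.

```shell
#!/bin/sh
# Print every ancestor directory of a path, from / down to the path itself,
# one per line.
path_chain() {
    p=$1
    chain=$p
    while [ "$p" != "/" ] && [ "$p" != "." ]; do
        p=$(dirname "$p")
        chain="$p
$chain"
    done
    printf '%s\n' "$chain"
}
```

Piping the result into ls, as in path_chain /home/thomas/scripts/alertAndShutdown.php | xargs ls -ld, shows the owner, group and mode of every level, so the one that blocks the nut user stands out.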
I use PHPMailer to send mail:
http://phpmailer.codeworxtech.com/
My mail function is based on the Gmail example in the PHPMailer archive.
What's worth noticing in these scripts:
- The leading #!/usr/bin/php in alertAndShutdown.php, along with chmod 770, lets you run the script directly through the PHP interpreter instead of typing php alertAndShutdown.php.
- The core of the script: build an HTML message from the output of the upsc command.
- Write to /var/log/ups.log the fact that a shutdown has been processed by NUT (you can find this information in /var/log/daemon.log, but it contains other logs too...).
- And the last command, which is absolutely required, otherwise your server won't stop: "/sbin/upsmon -c fsd", which tells NUT to run the shutdown command.
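For readers who do not use PHP, the same steps can be sketched in shell. This is not the script from the archive: the mail step is left out, and the on_timer function and the DRY_RUN and LOG variables are assumptions added so the logic can be tried without shutting anything down. Only upsc, the /var/log/ups.log path and upsmon -c fsd come from the setup above.

```shell
#!/bin/sh
# Sketch of a CMDSCRIPT handler. upssched passes the timer name as the
# first argument (onbatt in the configuration above).
UPS="MGE-Ellipse750@localhost"
LOG="${LOG:-/var/log/ups.log}"

on_timer() {
    case "$1" in
        onbatt)
            # Capture the UPS state (the PHP script mails it; here it is only logged).
            state=$(upsc "$UPS" 2>&1)
            {
                echo "$(date) shutdown requested by NUT timer '$1'"
                echo "$state"
            } >> "$LOG"
            if [ "${DRY_RUN:-0}" = "1" ]; then
                # Assumed safety switch, not part of NUT.
                echo "dry run: would call /sbin/upsmon -c fsd"
            else
                # Without this call the server never shuts down.
                /sbin/upsmon -c fsd
            fi
            ;;
        *)
            echo "unknown timer: $1"
            return 1
            ;;
    esac
}
```

A real script would end with on_timer "$1". Running it with DRY_RUN=1 and LOG pointing to a scratch file lets you check the log entry without triggering the shutdown.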
Test your configuration carefully
To test the script, you have no choice but to unplug your UPS (between the UPS and your house power outlet, not between the server and the UPS as I did the first time; lack of sleep is terrible). While you do so, keep a console open with "tail -f /var/log/daemon.log" running in it.
Run the following tests:
- Unplug the UPS, wait until the logs show that NUT has noticed, plug the UPS back in, and check that the timer is cancelled; your server shouldn't stop.
- Unplug the UPS, wait more than 30 seconds (the time we set in upssched.conf), and check that you received a mail and that your computer shut down.
- Check that your UPS is fully charged (battery.charge). Comment out the last line of the alertAndShutdown.php script, unplug the UPS, and see how long your server takes to drain the UPS. Then take 80% of that time and set it, in seconds, in upssched.conf.
- Uncomment the last line of alertAndShutdown.php and rerun test 3 (unplug the UPS) to check the new delay.
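The 80% rule in the third test is plain integer arithmetic. Here is a sketch with a made-up helper name, timer_seconds, using a measured runtime of 14 minutes as the example:

```shell
#!/bin/sh
# Convert a measured on-battery runtime in minutes into a upssched timer
# value in seconds, keeping 80% of the measured time.
timer_seconds() {
    echo $(( $1 * 60 * 80 / 100 ))
}

timer_seconds 14   # prints 672
```

With that result, the timer line in upssched.conf would become AT ONBATT * START-TIMER onbatt 672.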
For me it gave :
First test :
Broadcast Message from nut@my.server.com
(somewhere) at 22:57 ...
UPS MGE-Ellipse750@localhost on battery
Oct 8 22:57:37 home upsmon[6221]: UPS MGE-Ellipse750@localhost on battery
Oct 8 22:57:37 home upssched[6302]: Timer daemon started
Oct 8 22:57:38 home upssched[6302]: New timer: onbatt (30 seconds)
Broadcast Message from nut@my.server.com
(somewhere) at 22:57 ...
UPS MGE-Ellipse750@localhost on line power
Oct 8 22:57:57 home upsmon[6221]: UPS MGE-Ellipse750@localhost on line power
Oct 8 22:57:57 home upssched[6302]: Cancelling timer: onbatt
Second test :
Broadcast Message from nut@my.server.com
(somewhere) at 23:01 ...
UPS MGE-Ellipse750@localhost on battery
Oct 8 23:01:57 home upsmon[6221]: UPS MGE-Ellipse750@localhost on battery
Oct 8 23:01:57 home upssched[6315]: Timer daemon started
Oct 8 23:01:58 home upssched[6315]: New timer: onbatt (30 seconds)
Oct 8 23:02:28 home upssched[6315]: Event: onbatt
Oct 8 23:02:29 home upsd[6218]: Connection from 127.0.0.1
Oct 8 23:02:29 home upsd[6218]: Client on 127.0.0.1 logged out
Broadcast Message from nut@my.server.com
(somewhere) at 23:02 ...
Executing automatic power-fail shutdown
Broadcast Message from nut@my.server.com
(somewhere) at 23:02 ...
Auto logout and shutdown proceeding
Oct 8 23:02:31 home upsmon[6221]: Signal 10: User requested FSD
Oct 8 23:02:31 home upsd[6218]: Client thomas@127.0.0.1 set FSD on UPS [MGE-Ellipse750]
Oct 8 23:02:31 home upsmon[6221]: Executing automatic power-fail shutdown
Oct 8 23:02:31 home upsmon[6221]: Auto logout and shutdown proceeding
Broadcast message from root@my.server.com
(unknown) at 23:02 ...
The system is going down for halt NOW!
Oct 8 23:02:36 home init: tty4 main process (4928) killed by TERM signal
Oct 8 23:02:36 home init: tty5 main process (4929) killed by TERM signal
Oct 8 23:02:36 home init: tty2 main process (4931) killed by TERM signal
Oct 8 23:02:36 home init: tty3 main process (4934) killed by TERM signal
Oct 8 23:02:36 home init: tty6 main process (4935) killed by TERM signal
Oct 8 23:02:36 home init: tty1 main process (5240) killed by TERM signal
and the mail :
Third test :
With an initial battery state of 92% of full charge, my UPS kept my server running for 14 minutes before reaching the 30% low limit (at which NUT executes SHUTDOWNCMD anyway).
(so about 15 minutes with a full charge)
My server parts were chosen for low power consumption, so the result is not that great.
Details about the server parts can be found here if you're interested: http://spreadsheets.google.com/pub?key=pw1pftHbgb8QakIjz1z_7Ew
I'll do some tweaking, such as changing the processor speed (if possible), spinning down the hard drives when unused, and mounting /var/log on a USB key (so that the hard drives can stop).
Final thoughts
- Recheck your UPS from time to time, say every 6 months. UPS lifetime can be short.
- Of course, all the above information is to be used at your own risk and can certainly be improved in many ways. If you improve it, share ;)
Comments
thanks again, alex
what kind of issue do you have?
All the scripts are there, you just have to modify the config.php file with your gmail account info.
Thomas.
To test it, make a copy of the alertAndShutdown.php script and comment out (with a leading //) the last exec line (so that it does not shut down your machine).
If your NUT daemon is correctly configured, you should receive a mail.
(Or copy lines 29 to 33 into a new PHP file, update the $message values, replace $addresses and $subject with appropriate values, and run it.)
Note: I'm flying to Thailand tomorrow, so don't expect any answer until the beginning of the next month ;)
Lucky you Thailand!!!
alex
P.s. There is one possible typo though:
sudo vi /etc/nut/upsshed.conf
.. should be instead:
sudo vi /etc/nut/upssched.conf
(i.e. I think you may be missing a "c" - at least based on the ".sample" file name in Ubuntu 8.10)
typo corrected.