EMC XtremIO

Well, it’s been a while since I posted, but I’ve got some exciting new stuff to talk about!

We’ve purchased our first EMC XtremIO array! The system is 2x 20TB X-Bricks, which gives us around 30TB of raw capacity. Compression and dedupe bring us to around 210TB of usable storage; this is based on a 7:1 compression/dedupe factor, which is proving to be the standard for the current software.
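As a quick sanity check on that math, here is a minimal shell sketch. The function name effective_tb is made up for illustration; it only multiplies a capacity by a reduction ratio, and the ratio you actually get depends entirely on your data.

```bash
# effective_tb CAPACITY_TB RATIO: logical capacity at a given data reduction ratio
# (hypothetical helper, not part of any EMC tooling)
effective_tb() {
    awk -v tb="$1" -v ratio="$2" 'BEGIN { printf "%g\n", tb * ratio }'
}

effective_tb 30 7     # the 7:1 planning figure, prints 210
effective_tb 30 2.5   # the same capacity at a 2.5:1 ratio, prints 75
```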

 

The system comes in up to an 8 brick configuration:

emc-XtremIO-f2

 

What each X-Brick looks like racked:

Each X-Brick contains:

  • A high quality Eaton UPS unit.
  • 2 Storage Controllers.
  • A DAE enclosure, SAS connected to the Storage Controllers.
  • Once you go to a system with 2 or more bricks, you will also have 2 48G InfiniBand switches connecting the storage controllers together.

 

Update 12/1/14

 

So we’ve had our array in production for about 8 weeks now. I have nothing but good things to say: performance has been absolutely incredible, and stability and reliability have been everything promised.

The storage controller servers are Intel chassis with dual power supplies, dual InfiniBand controllers and dual SAS HBAs. They appear to be running some sort of E5 CPU and boast 256GB of RAM each.

Inside XtremIO Storage Controller

 

The system is pretty busy once cabled up; the architecture is very cluster oriented, so there is a lot of redundancy in the cabling.

Xtremio Xbrick Cabling

Some pictures of the front of the array and the Eaton UPS.

EMC EATON XtremIO UPS

 


We are currently running around 550 production virtual machines that service 7,000 customer servers. We are averaging 350-400MB/s of reads/writes at 20k IOPS, day in, day out. We’ve seen well over 20GB/s transfers and over 200k IOPS.

Storage vMotion and VAAI actions are extremely fast and complete almost instantly. At the current time our data reduction/dedupe ratio is around 2.5:1, but I believe this number is inaccurate, as the total amount of data in our datastores is much larger than what it shows as stored. 🙂

Some UI Screenshots of our environment:

 

XtremIO UI Bandwidth

XtremIO UI IOPS


The lights on the disks in the UI blink based on activity, pretty cool eye candy.

Installing OpenVSwitch 2.3.1 LTS on CentOS 6

yum install kernel-headers kernel-devel gcc make python-devel openssl-devel graphviz kernel-debug-devel automake rpm-build redhat-rpm-config libtool git

cd /root/

wget http://ftp.gnu.org/gnu/autoconf/autoconf-2.64.tar.gz

tar xvf autoconf-2.64.tar.gz

cd autoconf-2.64/

./configure

make

make install

 

cd /root/

wget http://openvswitch.org/releases/openvswitch-2.3.1.tar.gz -O /root/openvswitch-2.3.1.tar.gz

 

mkdir -p /root/rpmbuild/SOURCES

cp /root/openvswitch-2.3.1.tar.gz /root/rpmbuild/SOURCES/

tar xvf openvswitch-2.3.1.tar.gz

cd openvswitch-2.3.1/

rpmbuild -bb rhel/openvswitch.spec
rpmbuild -bb rhel/openvswitch-kmod-rhel6.spec

rpm -ivh /root/rpmbuild/RPMS/*/*.rpm

 

You can also use our public CloudStack repo here:

http://mirror.beyondhosting.net/Cloudstack/

 

Recommendations I make to save critical data

First off, your data is the most valuable part of any server. There are many, many hours of work that is very hard, if not impossible, to replace involved in setting up even a fairly basic web site. This doesn’t even include things like client information, orders etc. that directly cost you money if you lose them.

Not all backup methods are for everyone. The reason is that there are widely variable needs for data security as well as a wide variety of budgets. Someone with a page that is doing e-commerce transactions will likely need a lot more in regards to backups than someone with a bi-weekly blog for instance.

There are two different modes of failure one will encounter as a sysadmin. The first is a “hard” failure. This includes drives or RAID arrays (yes, it does happen) going bad. I love RAID, and I think it’s a great measure for ensuring data protection, but it’s not foolproof by any means and is no substitute for backups.

The second type of failure is the “soft” failure. With this failure mode, for whatever reason, data on the system is gone. This can be anything from a user deleting their public_html directory to data corruption because the drive is heavily overrun. Commonly this is someone running a filesystem check on a machine and having it dump a few thousand files into lost+found. I have seen my fair share of machines come up after this and run fine, and have seen plenty that didn’t. This can also be the result of hackers etc. messing around on your system. One thing I will warn of: if you use a secondary drive in the same server for backups, it can be deleted by hackers as well. If you leave the drive mounted after backups are done and they run rm -rf /* it will be erased. Be sure to unmount your backup drive if you use this method. In general I do not advise relying on it for this reason; however, it makes for a great way to have backups on a system without waiting for them to transfer.

The first rule I have is that, no matter what, you should have a minimum of three copies of your data, at least one of which is totally off site and not within the same company as your server/colocation/shared host etc. This gives you options if something happens, and you’re not relying on one group of people to ensure your data is intact. This can be as simple as having your system upload the files to a home or office computer via DynDNS and port forwarding, then burning the images onto a CD weekly. On a higher level it can be storage from a company offering cloud storage, such as Amazon.

How often you should back your data up, and how long you should retain it, is another fairly common question. This is largely subjective, and is a compromise between how much data you can afford to lose and how much space you can afford. If you’re running a streaming video site, this can get quite pricey very quickly, even to the point where it may be best to get a low end server and put big drives in it to back up to. After all, if you pay $0.50/GB and need 1TB of backup space, $500 buys a good bit of server!
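To put numbers on that trade-off, a rough sketch follows. The helper name backup_cost and the 50GB example are assumptions for illustration only; plug in your own dump size, retention count and price.

```bash
# backup_cost SIZE_GB COPIES PRICE_PER_GB: cost of keeping COPIES backups
# of SIZE_GB each (hypothetical helper)
backup_cost() {
    awk -v gb="$1" -v n="$2" -v price="$3" 'BEGIN { printf "%.2f\n", gb * n * price }'
}

backup_cost 1000 1 0.50   # the 1TB at $0.50/GB case above, prints 500.00
backup_cost 50 30 0.50    # thirty retained copies of a 50GB backup, prints 750.00
```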

What to back up is another good question. If you’re running a forum or something like that, where there aren’t really all that many changes made to the underlying software, doing a single full backup and then backing up the user upload directories (e.g. images) and the database may be enough. If the site is undergoing constant development, full backups would be a great deal more prudent.

The last thing to consider is how these backups are going to be made. I have done backups before with shell scripts, and have used both Plesk’s and cPanel’s backup mechanisms. When using a shell script for backups, you gain a ton of versatility in how and what you back up, at the price of it being a lot more tedious to configure. This sort of backup is really nice if you want your system to back up only certain things at varying intervals. The panel based backups are so easy to configure that there is little to no reason you shouldn’t set them up. You just specify how often you want backups, where they will be stored and what will be backed up. The caveat with a panel based backup system is that, even with CPU level tweaks in the config files, it can heavily load a system, so my advice is to run backups off hours.
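For the shell script route, a minimal sketch of the "back up only certain things" idea might look like the following. The function name backup_dirs and the example paths are assumptions, not anything from a real setup; it just tars whatever directories you hand it into a dated archive.

```bash
# backup_dirs DEST DIR...: archive the given directories into DEST as a
# dated tarball and print the archive path (hypothetical helper)
backup_dirs() (
    dest=$1; shift
    stamp=$(date +%Y%m%d)
    mkdir -p "$dest" || exit 1
    tar -czf "$dest/backup-$stamp.tar.gz" "$@" || exit 1
    echo "$dest/backup-$stamp.tar.gz"
)

# Example (paths are placeholders): nightly uploads only, from cron
# backup_dirs /backups/uploads /home/user/public_html/uploads
```

Different directories can then go on different cron schedules, which is where the versatility comes from.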

Extending LVM across multiple disks

Had a situation arise yesterday where a coworker was wanting to extend an LVM Volume Group across two disks. It’s actually really simple to do.

The first thing we do is use vgdisplay to show original info for the Volume Group. Notice how when you look at this, the Free PE Size is 0MB.

[root@nfsen01 ~]# vgdisplay
--- Volume group ---
VG Name               VolGroup00
System ID
Format                lvm2
Metadata Areas        1
Metadata Sequence No  3
VG Access             read/write
VG Status             resizable
MAX LV                0
Cur LV                2
Open LV               2
Max PV                0
Cur PV                1
Act PV                1
VG Size               2.88 GB
PE Size               32.00 MB
Total PE              92
Alloc PE / Size       92 / 2.88 GB
Free  PE / Size       0 / 0
VG UUID              XXXXXXXXXXXXXXXXXXXXXXXXXX
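The sizes in that output are just extent counts multiplied by the extent size. A small sketch of the arithmetic, with pe_to_gb as a made-up helper name:

```bash
# pe_to_gb EXTENTS PE_SIZE_MB: convert a physical extent count to GB
# (hypothetical helper, only does the multiplication vgdisplay does for you)
pe_to_gb() {
    awk -v pe="$1" -v mb="$2" 'BEGIN { printf "%.3f\n", pe * mb / 1024 }'
}

pe_to_gb 92 32   # prints 2.875, which vgdisplay rounds to the 2.88 GB VG Size
```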

To create the LVM PV on your new disk, follow these steps.

fdisk /dev/sdb
n
p
1
enter
enter
t
1
8e
w

Now we will probe for the new linux partition without rebooting:

partx -v -a /dev/sdb
pvcreate /dev/sdb1

Assuming you are using sdb1 as your drive, extending the Volume Group is as simple as:

vgextend VolGroup00 /dev/sdb1

And this will extend the volume group across the entire new disk. You should be able to run vgdisplay again and see that your free PE count went up.

What you have to do next is extend the Logical Volume. This is optional depending on your objectives; if you wanted a common VG and wanted to create new volumes, you can do that at your convenience now.

lvextend -L +931.51G /dev/mapper/VolGroup00-LogVol00

Assuming you’re running EXT3 you would use this command. For other file systems on top of LVM your mileage may vary; consult your documentation.

resize2fs -p /dev/mapper/VolGroup00-LogVol00

After this is done you should be able to use df -h on the drive and see that your partition has been enlarged. This can even be done while the system is active; there’s no need for any boot CDs or the like.
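If you would rather not work out the exact size to add, lvextend also accepts -l +100%FREE to hand every free extent in the VG to the volume. A hedged sketch of the last two steps follows; grow_lv and the RUN variable are made up for illustration, and by default it only prints the commands so you can check them first.

```bash
# grow_lv LV_PATH: give all free extents in the VG to LV_PATH, then grow
# the ext3 filesystem on it. RUN defaults to "echo" (dry run); call it
# with RUN= (empty) to actually execute the commands.
grow_lv() {
    run=${RUN-echo}
    $run lvextend -l +100%FREE "$1" &&
    $run resize2fs -p "$1"
}

# Dry run:  grow_lv /dev/mapper/VolGroup00-LogVol00
# For real: RUN= grow_lv /dev/mapper/VolGroup00-LogVol00
```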

Some Perl for entering IPs into a database

This code is a proof of concept; if you want to use it in a production environment I suggest you go over it heavily. For a person fairly new to Perl there is a lot going on here that you may find useful. The overall idea is to convert IPs from dotted quad decimal numbers into binary, then store them in a database. Because an IP can’t be duplicated across machines without causing a conflict, it is in general going to be a good value to use as a primary key. Feel free to use and adapt this code as you see fit. The end result should be something like:

 

mysql> select * from IPs;
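+----------------------------------+----------------------------------+--------------------------+
| ip_address                       | netmask                          | computer_name            |
+----------------------------------+----------------------------------+--------------------------+
| 11000000101010000000001000000101 | 11111111111111111111111100000000 | control.frontandback.net |
+----------------------------------+----------------------------------+--------------------------+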
1 row in set (0.00 sec)

#!/usr/bin/perl

#IP2DB 0.1.0 (C) February 2011 Howard A Underwood II
#Free for use and modification under the Creative Commons 1.0 License. If you want to give me a shout out try aunderwoodii#at#gmail.com
#The purpose of this code is to convert an IP address and netmask pair into binary to make it easily stored in the database in a processable manner. This is only for IPv4 atm and is just a proof of concept, I'd love to see your adaptations to real world applications. Feel free to give me your feedback at the above address.

use strict;
use warnings;

#This requires DBI and DBD::mysql. Use CPAN or your package manager of choice to get them.
use DBI;
use DBD::mysql;

#info to connect to the DB server. This assumes that your table is pre-created. If you need to create a database do the following:
#create database ips;
#CREATE TABLE IPs (ip_address BINARY(32) PRIMARY KEY, netmask BINARY(32), computer_name char(200));

my $hostname = "localhost";
my $db = "ips";
my $port = "3306";
my $user = "dbuser";
my $password = "wouldn'tyouliketoknow";

#info to put into the DB. There's the IP here, netmask and the computer name. These variables and the ones above are going to be what you need to use to adapt the script to your needs.
my $ip = "192.168.2.5";
my $netmask = "255.255.255.0";
my $compname = "control.frontandback.net";

#Getting down to business. This first line takes the netmask and breaks it into 4 octets.
my @netmask = split(/\./, $netmask);
#Now that we have 4 octets, we process each one into binary. Future modifications include cleaning this code up so that it's a loop rather than 4 instances.
my $ocetnm0 = unpack("B*", pack("C", $netmask[0]));
my $ocetnm1 = unpack("B*", pack("C", $netmask[1]));
my $ocetnm2 = unpack("B*", pack("C", $netmask[2]));
my $ocetnm3 = unpack("B*", pack("C", $netmask[3]));
#We recombine everything into 1 binary number after this.
my $totalnm = $ocetnm0.$ocetnm1.$ocetnm2.$ocetnm3;
#Just printing the post process # on the TTY for human verification
print "$totalnm\n";

#Now we repeat the process for the IP itself. This will probably get condensed into one instance along with the above code eventually. Once again, not the most efficient way to do it but rather straightforward.
my @ip = split(/\./, $ip);
my $ocet0 = unpack("B*", pack("C", $ip[0]));
my $ocet1 = unpack("B*", pack("C", $ip[1]));
my $ocet2 = unpack("B*", pack("C", $ip[2]));
my $ocet3 = unpack("B*", pack("C", $ip[3]));
my $total = $ocet0.$ocet1.$ocet2.$ocet3;
print "$total\n";

#Basic DBI connection code. We are using the DBI module to connect to the database.
my $dsn = "DBI:mysql:database=$db;host=$hostname;port=$port";
#If we don't like what we see bail out because we can't connect.
my $DBIconnect = DBI->connect($dsn, $user, $password)
    or die "Connection denied to database $db\n";
#Make DBI die on errors so the eval below can actually catch a failed insert.
$DBIconnect->{RaiseError} = 1;
#Add the entry to the table, using placeholders so the values are quoted for us. Please note that if you use the above table it will probably not let you run this more than once for any given IP.
eval { $DBIconnect->do("INSERT INTO IPs (ip_address,netmask,computer_name) VALUES (?,?,?)", undef, $total, $totalnm, $compname) };
print "Data not added to the database: $@\n" if $@;
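If you want to double check what the script should print without touching the database, the same dotted-quad-to-binary conversion can be sketched in plain shell. The function name ip_to_bin is made up for illustration, and like the Perl above it does no input validation.

```bash
# ip_to_bin DOTTED_QUAD: print the 32-bit binary string for an IPv4 address
# (hypothetical helper; assumes four octets in the 0-255 range)
ip_to_bin() (
    rest=$1.
    out=
    while [ -n "$rest" ]; do
        octet=${rest%%.*}     # take the next octet
        rest=${rest#*.}       # and drop it from the remainder
        i=7
        while [ "$i" -ge 0 ]; do
            out=$out$(( (octet >> i) & 1 ))   # emit bits, highest first
            i=$((i - 1))
        done
    done
    echo "$out"
)

ip_to_bin 192.168.2.5     # prints 11000000101010000000001000000101
ip_to_bin 255.255.255.0   # prints 11111111111111111111111100000000
```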

The Sword of SEO part II

Well, it’s been a long time since I posted the first article on this. My time, or lack thereof, got the best of me. Countering this attack is actually very, very easy. The first thing you do is find out who the referrer is. This is simply done by tailing the logs. If you have a single domain, this can be fairly easy. Otherwise my preferred method involves using “watch ls -l” and seeing which log grows the fastest. This tends to be the one getting hit, or at least a likely suspect. I will probably write a Perl script later to check this and tell me which log grows the most in, say, 10 seconds. After this, you can use tail in the manner of:

tail -f /etc/httpd/domlogs/domain.log

When you do this, you will see which IPs are querying the page and the source they are being referred from. Look for anything that doesn’t look like a search engine. To actually block them once they are identified, you block the attack based on the referrer in the .htaccess. See the convenient rewrite code I lifted from another web site (about the same thing I did when I actually saw the attack).

RewriteEngine on
# Options +FollowSymlinks
RewriteCond %{HTTP_REFERER} attacker\.com [NC]
RewriteRule .* - [F]

So, why does this work, you may ask? In the scenario I saw, the person was attacking a “high value” target. This means a page that hits the database and has dynamically generated content with no caching. Server side configuration CAN make this sort of attack a lot harder to perpetrate as well. Anything that you can do to increase the robustness of a server will help with a DoS. When you add a rule like this, which denies access based on the referrer, basically what happens is that you serve static content instead. Static content uses virtually no resources compared to something PHP based and backed by a database. It’s a good idea to know about this sort of attack, as I could see it becoming bigger in the future. Black hat SEO is very common these days, and if you have the SEO part down, the resources needed to do the rest of this attack are virtually nothing compared to the damage it does. It is also plausible that we will see this attack combined with “conventional, network level” DoSing to increase its effectiveness.
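The two detective steps, finding the log that grows fastest and then tallying referrers in it, can be sketched in shell rather than Perl. Both function names are made up for illustration, and top_referrers assumes the logs are in Apache’s combined format, where the referrer is the second quoted field.

```bash
# fastest_log DIR [SECONDS]: print "BYTES_GROWN PATH" for the file in DIR
# that grew the most over SECONDS (default 10). Hypothetical helper.
fastest_log() (
    dir=$1; secs=${2:-10}
    before=$(mktemp) || exit 1
    for f in "$dir"/*; do
        [ -f "$f" ] && printf '%s %s\n' "$(wc -c < "$f")" "$f"
    done > "$before"
    sleep "$secs"
    while read -r size f; do
        now=$(wc -c < "$f")
        echo "$((now - size)) $f"
    done < "$before" | sort -rn | head -n 1
    rm -f "$before"
)

# top_referrers LOGFILE: request count per referrer, busiest first.
# Assumes combined log format. Hypothetical helper.
top_referrers() {
    awk -F'"' '{ print $4 }' "$1" | sort | uniq -c | sort -rn | head -n 10
}

# Example: fastest_log /etc/httpd/domlogs 10
#          top_referrers /etc/httpd/domlogs/domain.log
```

Anything near the top of the referrer list that is not a search engine or your own site is a candidate for the RewriteCond line.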

Another basic shell script

The great thing about shell scripts is that they are a great way to solve problems that would cost you a lot of time to do manually. To this end, I had a client that needed some videos re-encoded on his server because they hadn’t encoded properly. For an experienced script writer this would take about 5 minutes to write. It also makes it so that if the client wants to use it, they can. The configuration was nice because the input and output file names were the same; just the extension was different. This is not very polished; if it were, I would:

A) Run it as the same user

B) Put it in the user’s homedir

C) Make it password protected and executable via a PHP script, so the user wouldn’t require any bash experience at all but could upload a list via FTP and just run it.

#!/bin/bash

for video in $(cat /root/list.txt) #We will run a loop where each line in list.txt is run as a variable $video.
do
mv /home/user/public_html/media/videos/flv/$video.flv /home/user/public_html/media/videos/flv/$video.flv.old #back up old files
ffmpeg -y -b 1500 -r 25 -i /home/user/public_html/media/videos/vid/$video.* -f flv -s 640x480 -deinterlace -ac 1 -ar 44100 /home/user/public_html/media/videos/flv/$video.flv #encode new file, 640x480 out, FLV format deinterlaced.
chown user:user /home/user/public_html/media/videos/flv/$video.flv #chown to the right user. Not required if running as the right user.
done

A quickie MySQL backup script

I’ve seen my fair share of clients that need basic MySQL backups but have no control panel, or don’t want to bother with control panel based backups. This is a really simple setup that lets you do DB backups and put them in a local directory on the server. It could easily be modified to rsync to another server as well if you wanted to. There are a ton of options that could be added to this; your imagination (and shell scripting capacity) is the only limitation. Some suggestions I have would be:

-Mail on success or failure and on old file deletion

-Connect to a remote DB

-Monitor the overall size

Well enough with the abstract, on to the shell!

#!/bin/bash
date=`date +%Y%m%d`
mysqldump --all-databases > /mysqlbackups/mysql-$date.sql
find /mysqlbackups/ -mtime +30 -delete

If you notice, this takes up all of 4 lines. The first one is the shebang, the second establishes the date stamp, the third dumps the databases and the last one purges any old backups. The only real variable you have to change here is the “+30”, so that it is the number of days you want to retain the backups for, minus one.
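One refinement worth considering is to only purge old dumps when the new one actually succeeded, so a broken mysqldump can never leave you with zero backups. A rough sketch under that assumption follows; purge_old and run_backup are made-up names, and the purge matches on modification time (-mtime) so that merely reading an old dump does not keep it alive.

```bash
# purge_old DIR DAYS: delete mysql-*.sql dumps in DIR last modified more
# than DAYS days ago (hypothetical helper)
purge_old() {
    find "$1" -type f -name 'mysql-*.sql' -mtime +"$2" -delete
}

# run_backup DIR: dump all databases into DIR; purge old dumps only if
# the new dump succeeded (hypothetical helper)
run_backup() (
    out="$1/mysql-$(date +%Y%m%d).sql"
    if mysqldump --all-databases > "$out.tmp"; then
        mv "$out.tmp" "$out" && purge_old "$1" 30
    else
        rm -f "$out.tmp"
        echo "mysqldump failed, old backups left untouched" >&2
        exit 1
    fi
)

# Example, from cron: run_backup /mysqlbackups
```

The failure branch is also the natural place to hook in the mail-on-failure suggestion from the list above.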

Did you know tw_cli has performance monitoring??

Yep, the title says it all: you can actually monitor individual disk performance with tw_cli.

First we need to enable performance monitoring:

tw_cli /c0 set dpmstat=on

Now we will show the information it’s providing.

tw_cli /c0 show dpmstat type=ra
Drive Performance Monitor Configuration for /c0 ...
Performance Monitor: ON
Version: 1
Max commands for averaging: 100
Max latency commands to save: 10
Requested data: Running Average Drive Statistics

                               Queue           Xfer         Resp
Port   Status           Unit   Depth   IOPs    Rate(MB/s)   Time(ms)
------------------------------------------------------------------------
p0     OK               u0     22      23      0.479        11
p1     OK               u0     24      93      1.344        12
p2     OK               u0     25      82      0.720        14
p3     OK               u0     24      83      1.108        16

BE SURE TO TURN OFF PERFORMANCE MONITORING WHEN YOU ARE DONE!

tw_cli /c0 set dpmstat=off
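If you only care about drives that are lagging, the running average table is easy to filter. This is a sketch that assumes the column layout shown above, with the response time in the last column; slow_ports is a made-up name.

```bash
# slow_ports THRESHOLD_MS: read "show dpmstat type=ra" output on stdin and
# print the ports whose response time is above THRESHOLD_MS.
# Hypothetical helper; assumes Resp Time(ms) is the last column.
slow_ports() {
    awk -v limit="$1" '$1 ~ /^p[0-9]+$/ && $NF + 0 > limit { print $1, $NF " ms" }'
}

# Example: tw_cli /c0 show dpmstat type=ra | slow_ports 12
```

Against the sample output above, a threshold of 12 would list p2 and p3.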

Different performance results:

This command only applies to 9000 series SX/SE/SA controllers, except for
type=ext, which applies only to SE/SA models.

This command allows you to request drive statistics of the specified type for
the specified port. These statistics can be helpful when troubleshooting
performance problems.

type= specifies which statistics should be displayed. The options are: inst for
Instantaneous, ra for Running Average, lct for Long Command Times,
histdata for Histogram Data, and ext for Extended Drive Statistics.

inst (Instantaneous). This measurement provides a short duration average.
ra (Running Average). Running average is a measure of long-term averages
that smooth out the data, and results in older results fading from the average
over time.

ext (Extended Drive Statistics). The extended drive statistics refer to
statistics of a drive’s read commands, write commands, write commands with
FUA (Force Unit Access), flush commands, and a drive sectors’ read, write,
and write commands with FUA.

lct (Long Command Times). This is a collection of the commands with the
longest read/write response time.

histdata (Histogram Data). The histogram categorizes the read/write
execution times and groups them together based on time frames.

Adding lots of IPs to a debian box

At work I had a client with a Debian system that needed a bunch of IPs added to it. Since it doesn’t really support ranges (at least that I can find) I came up with the following script.

#!/bin/bash
j=42
for i in {186..190}
do
j=$((j + 1))
echo auto eth0:$j >> interfaces; echo iface eth0:$j inet static >> interfaces; echo address 192.168.41.$i >> interfaces; echo netmask 255.255.255.248 >> interfaces;
done

How it works is that j is the number of the last alias (eth0:42 here) currently set in the interfaces file. The address prefix is defined in the echo line, and the range of final octets is defined in the for i section. Just change the numbers to match what you want, put this into /etc/network, run it and restart networking. This is only for five IPs, but you could do hundreds or thousands this way if that was the desired effect. Or you can use a distro that supports ranges :>
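If you end up doing this more than once, the same loop can be turned into a function that prints the stanzas instead of appending them, so you can eyeball the result before it goes anywhere near the interfaces file. The name alias_stanzas and its argument order are made up for illustration.

```bash
# alias_stanzas IFACE FIRST_ALIAS PREFIX FIRST_OCTET LAST_OCTET NETMASK
# Prints one interfaces stanza per address (hypothetical helper).
alias_stanzas() (
    iface=$1 n=$2 prefix=$3 i=$4 last=$5 mask=$6
    while [ "$i" -le "$last" ]; do
        printf 'auto %s:%s\niface %s:%s inet static\n' "$iface" "$n" "$iface" "$n"
        printf '    address %s.%s\n    netmask %s\n\n' "$prefix" "$i" "$mask"
        n=$((n + 1)); i=$((i + 1))
    done
)

# Same five addresses as the script above, starting at alias eth0:43
alias_stanzas eth0 43 192.168.41 186 190 255.255.255.248
```

Once the output looks right, redirect it with >> /etc/network/interfaces.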