Dell has several methods for updating their BIOS.
Out of all the different approaches, this is what I came up with that successfully let me install the latest BIOS without taking a trip to the data center.
./PER610_BIOS_LX_2.1.15.BIN --extract bios_files
This extracts the files to a directory named bios_files.
dellBiosUpdate -u R610-020115C.hdr
A GUI that's a part of the Lifecycle Controller. I think you get to it by pressing F10 for System Services during boot.
Dell's Drivers and Downloads page has BIN files you can download and run from within Linux to update the BIOS. Each one is a combined shell script and archive. In theory, you just run it and the new BIOS is installed.
Dell officially maintains and supports a yum repository with rpms for their firmware and some tools for installing it.
Matt Domsch from Dell maintains an unsupported yum repository with just BIOS updates. To use it, you need to install some other tools from Dell's community repository (or maybe smbios-utils and firmware-addon-dell from EPEL would be enough; I haven't tested this yet).
Update: vwaelchli on reddit points out that you can also update the firmware by running wsman commands from the Lifecycle Controller.
When new software packages are released, you don't want to blindly apply them to all your systems. You should have a test environment, or at least a set of less-critical systems, where you test things first. If you normally stick to a consistent update schedule, such as once a week, you can use the date a package was created to determine whether it should be updated in your production environment. I couldn't find a way to do this directly through yum, so I created a shell script named update_production to do it. As written, it applies all updates except for packages released to CentOS or EPEL in the last week. It's designed so that the repositories you delay updates from, and the length of the delay, are easy to tweak. Ideally this functionality would be available as a yum plugin, but that looked much more complicated to develop.
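The selection rule described above can be sketched in a few lines of Python. This is only an illustration of the filtering logic, not the update_production script itself: the function name, the (name, repo, build_time) tuple format, and the repository ids are all assumptions, and collecting that information from yum is left out.

```python
from datetime import datetime, timedelta


def packages_to_update(candidates, now,
                       delayed_repos=("base", "updates", "epel"),
                       delay=timedelta(days=7)):
    """Return the names of pending updates that may be applied now.

    candidates is an iterable of (name, repo, build_time) tuples; this
    input format and the default repository ids are hypothetical.
    """
    cutoff = now - delay
    selected = []
    for name, repo, build_time in candidates:
        # Hold back packages from a delayed repository that are newer
        # than the cutoff; everything else is applied immediately.
        if repo in delayed_repos and build_time > cutoff:
            continue
        selected.append(name)
    return selected


# Example call (the package data is made up):
# packages_to_update([("bash", "updates", datetime(2011, 1, 1))],
#                    now=datetime(2011, 1, 15))
```

Passing the current time in as an argument, rather than calling datetime.now() inside the function, makes the rule easy to check against known dates.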
If you accidentally set up a server with the wrong software RAID type, as long as you used LVM you can use mirroring to temporarily move the filesystem off the RAID device, recreate the RAID properly, then move the filesystem back onto it. Needless to say, if things go wrong during this process you'll be reinstalling. Below is an example where md1 on sda2 and sdb2 is being converted from RAID0 to RAID1, and an iSCSI volume serves as the temporary storage. In it, VG is the volume group name, LV is the only logical volume being moved, and DEVICE is the device Linux creates for the iSCSI target. If doing this yourself, you'll want to frequently check the sanity of what you're doing with commands like 'pvs' and 'lvs -a -o +devices'. If feasible, you can save a lot of time by turning off swap, removing its logical volume during the conversion, then recreating it afterwards.
pvcreate $DEVICE
vgextend $VG $DEVICE
lvconvert -m 1 --mirrorlog core ${VG}/$LV $DEVICE
# repeat these two steps for every LV on the VG
lvconvert -m 0 ${VG}/$LV /dev/md1
vgreduce $VG /dev/md1
mdadm --stop /dev/md1
mdadm --remove /dev/md1
mdadm --create /dev/md1 --level=raid1 --raid-devices=2 /dev/sda2 /dev/sdb2
mdadm --detail --scan >> /etc/mdadm.conf
pvcreate /dev/md1
vgextend $VG /dev/md1
lvconvert -m 1 --mirrorlog core ${VG}/$LV /dev/md1
# repeat these two steps for every LV on the VG
lvconvert -m 0 ${VG}/$LV $DEVICE
vgreduce $VG $DEVICE
mkinitrd -v --with=raid1 /boot/newraid $(uname -r)
# make the first entry use your new initrd
vim /boot/grub/grub.conf
I recently had a problem on CentOS 5 where the best solution I could think of required running a command when a network interface is brought up. I'm used to having this functionality as a part of the networking scripts on Debian-derived distributions, but I discovered that Red Hat-derived distributions lack it. I've written a small daemon called nethook to provide it. You can find the code and documentation for nethook on GitHub.
Update: Moved project from Google Code to GitHub.
If you are using virt-install to create KVM virtual machines on RHEL or CentOS 5, be sure to specify the '--accelerate' option. If you don't use accelerate, virt-install starts '/usr/bin/qemu-system-x86_64' rather than '/usr/libexec/qemu-kvm'. This isn't what you want, and it will fail with the error message “internal error Domain $YOUR_VM didn't show up”. What happened in the background is that libvirt set the machine type to rhel5.4.0, which lets qemu-kvm know it can use virtio, but this machine type is not understood by qemu-system-x86_64. If you check '/var/log/libvirt/qemu/$YOUR_VM.log', you'll see a detailed error like:
LC_ALL=C PATH=/sbin:/usr/sbin:/bin:/usr/bin HOME=/ /usr/bin/qemu-system-x86_64 -S -M rhel5.4.0 -no-kqemu -m 512 -smp 1 -name $YOUR_VM -uuid a3e38d9d-d958-0da1-4b3a-57179ee04f29 -monitor pty -pidfile /var/run/libvirt/qemu/$YOUR_VM.pid -no-reboot -boot d -drive file=$YOUR_INSTALLER.iso,if=ide,media=cdrom,index=2 -net nic,macaddr=54:52:00:76:5f:f4,vlan=0 -net tap,fd=36,script=,vlan=0,ifname=vnet10 -serial pty -parallel none -usb -vnc 127.0.0.1:10 -k en-us
Supported machines are:
pc Standard PC (default)
isapc ISA-only PC
If you receive the following messages when starting collectd
network plugin: Option `SecurityLevel' is not allowed here.
network plugin: Option `AuthFile' is not allowed here.
It is because you configured collectd to sign or encrypt its communication, but collectd was not compiled with libgcrypt support. This is true of the version in EPEL.
$ yum info collectd | grep Version
Version : 4.10.2
$ ldd /usr/lib64/collectd/network.so
libpthread.so.0 => /lib64/libpthread.so.0 (0x00002b620d29d000)
libdl.so.2 => /lib64/libdl.so.2 (0x00002b620d4b8000)
libc.so.6 => /lib64/libc.so.6 (0x00002b620d6bc000)
/lib64/ld-linux-x86-64.so.2 (0x0000003223400000)
To solve the problem, either disable signing and encryption or rebuild collectd.
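The ldd check can also be scripted, which is handy if you want to test many hosts before enabling signing or encryption. The following is a minimal sketch, assuming that a build with libgcrypt support shows a libgcrypt.so entry in the ldd output of the network plugin; the function names are made up for this example, and the default path is the one from the listing above.

```python
import subprocess


def links_libgcrypt(ldd_output):
    """Return True if any library listed in ldd output is libgcrypt."""
    for line in ldd_output.splitlines():
        fields = line.split()
        # The first field of each ldd line is the library name.
        if fields and fields[0].startswith("libgcrypt.so"):
            return True
    return False


def plugin_has_gcrypt(path="/usr/lib64/collectd/network.so"):
    """Run ldd on the collectd network plugin and inspect the result."""
    output = subprocess.check_output(["ldd", path]).decode()
    return links_libgcrypt(output)
```

If plugin_has_gcrypt() returns False, the SecurityLevel and AuthFile options will be rejected as shown in the messages above.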
When configuring services with Puppet, you sometimes need to know the IP address of a server. For example, I export Nagios host definitions from many of my servers and use them to configure my Nagios service. Facter ships with two sets of network-related facts that can help.
The first is the ipaddress fact. This fact contains the IP address of the first network interface reported by ifconfig, which outputs them in alphabetical order. The second is a fact for each network interface that has an IP address; unsurprisingly, it contains the interface's IP address.
For example, here are the facts from a server with three network interfaces, eth0, eth1, and eth2. eth1 and eth2 are bonded as the device bond0.
ipaddress => 172.16.32.48
ipaddress_bond0 => 172.16.32.48
ipaddress_eth0 => 192.0.32.10
In my environment, the device-specific facts are too narrow. Although I'm consistent about which interfaces are plugged into which network, some servers use bonding or bridging, so I can't assume that the device with the physical link is the one the IP address is assigned to. The private IP might be on eth1 or bond0, for example. At the same time, the ipaddress fact is too broad. It will be a public IP address on some servers and a private IP address on others, depending on how the interfaces are named.
To improve on this, I've created two custom facts, ipaddress_public and ipaddress_private. Instead of containing the first IP address they find, they contain the first public or private IP address. I also have two facts, on_public and on_private, that report whether a server has any public or private IP address. Here is how my previous example looks with these new facts.
ipaddress => 172.16.32.48
ipaddress_bond0 => 172.16.32.48
ipaddress_eth0 => 192.0.32.10
ipaddress_private => 172.16.32.48
ipaddress_public => 192.0.32.10
on_private => true
on_public => true
The code for the facts is below.
require 'facter/util/ip'

def has_address(interface)
  ip = Facter::Util::IP.get_interface_value(interface, 'ipaddress')
  if ip.nil?
    false
  else
    true
  end
end

def is_private(interface)
  rfc1918 = Regexp.new('^10\.|^172\.(?:1[6-9]|2[0-9]|3[0-1])\.|^192\.168\.')
  ip = Facter::Util::IP.get_interface_value(interface, 'ipaddress')
  if rfc1918.match(ip)
    true
  else
    false
  end
end

def find_networks
  found_public = found_private = false
  Facter::Util::IP.get_interfaces.each do |interface|
    if has_address(interface)
      if is_private(interface)
        found_private = true
      else
        found_public = true
      end
    end
  end
  [found_public, found_private]
end

# these facts check if any interface is on a public or private network
# they return the string true or false
# this fact will always be present
Facter.add(:on_public) do
  confine :kernel => Facter::Util::IP.supported_platforms
  setcode do
    found_public, found_private = find_networks
    found_public
  end
end

Facter.add(:on_private) do
  confine :kernel => Facter::Util::IP.supported_platforms
  setcode do
    found_public, found_private = find_networks
    found_private
  end
end

# these facts return the first public or private ip address found
# when iterating over the interfaces in alphabetical order
# if no matching address is found the fact won't be present
Facter.add(:ipaddress_public) do
  confine :kernel => Facter::Util::IP.supported_platforms
  setcode do
    ip = ""
    Facter::Util::IP.get_interfaces.each do |interface|
      if has_address(interface)
        if not is_private(interface)
          ip = Facter::Util::IP.get_interface_value(interface, 'ipaddress')
          break
        end
      end
    end
    ip
  end
end

Facter.add(:ipaddress_private) do
  confine :kernel => Facter::Util::IP.supported_platforms
  setcode do
    ip = ""
    Facter::Util::IP.get_interfaces.each do |interface|
      if has_address(interface)
        if is_private(interface)
          ip = Facter::Util::IP.get_interface_value(interface, 'ipaddress')
          break
        end
      end
    end
    ip
  end
end
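The selection logic is easy to try outside of Facter. Below is a small Python sketch that reuses the same RFC 1918 regular expression and picks the first public or private address in alphabetical interface order. The function names and the plain dictionary of interface names to addresses are assumptions made for illustration; they are not part of Facter.

```python
import re

# Same RFC 1918 pattern as the Ruby facts: 10/8, 172.16/12, 192.168/16.
RFC1918 = re.compile(r'^10\.|^172\.(?:1[6-9]|2[0-9]|3[0-1])\.|^192\.168\.')


def is_private(ip):
    """Return True if the dotted-quad string is an RFC 1918 address."""
    return RFC1918.match(ip) is not None


def first_address(addresses, private):
    """Return the first private (or public) address, or None.

    addresses maps interface name -> IP address (a hypothetical stand-in
    for what Facter reports); interfaces are visited alphabetically.
    """
    for interface in sorted(addresses):
        ip = addresses[interface]
        if is_private(ip) == private:
            return ip
    return None


# The server from the example above:
# first_address({"bond0": "172.16.32.48", "eth0": "192.0.32.10"}, private=True)
```

Checking the edges of the 172.16.0.0/12 range (172.15.x.x and 172.32.x.x must not match) is a quick way to convince yourself the pattern is right.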
A student in the lab associated with my employer asked me for advice on how to extract records from a FASTA file. The hitch was that he wanted a large number of records, on the order of thousands, and the FASTA file was even larger, containing tens of millions of records. The first approach that came to my mind was splitting the file into chunks that were small enough to fit into memory on the nodes in our cluster. This would allow multiple CPUs to search for the records of interest while eliminating the greatest potential performance killer, lots of disk seeks. I'd never used Biopython before, so this request seemed like a good excuse to try it. It turned out to be remarkably easy to learn enough to accomplish what I wanted; going from idea to tested code took about an hour.
To split the data
import sys
from Bio import SeqIO

filename = sys.argv[1]
# maximum number of records to write to each output file
maximum_length = int(sys.argv[2])

input_handle = open(filename, "r")
current_length = 0
current_file = 0
output = open("%s.%d" % (filename, current_file), "w")
for record in SeqIO.parse(input_handle, "fasta"):
    # start a new output file once the current one is full
    if current_length == maximum_length:
        output.close()
        current_length = 0
        current_file = current_file + 1
        output = open("%s.%d" % (filename, current_file), "w")
    # SeqIO.write requires a list, so turn our record into one
    SeqIO.write([record], output, "fasta")
    current_length = current_length + 1
input_handle.close()
output.close()
And then to find the records you care about
import sys
from Bio import SeqIO

fasta_filename = sys.argv[1]
id_filename = sys.argv[2]
output_filename = sys.argv[3]

id_handle = open(id_filename, "r")
# read entire file into list, stripping newlines
ids = [line.strip() for line in id_handle.readlines()]
id_handle.close()

# load every record in this chunk into a dictionary keyed by record id
fasta_handle = open(fasta_filename, "r")
records = SeqIO.to_dict(SeqIO.parse(fasta_handle, "fasta"))
fasta_handle.close()

output_handle = open(output_filename, "w")
for record_id in ids:
    if record_id in records:
        SeqIO.write([records[record_id]], output_handle, "fasta")
output_handle.close()
So assuming you have 40 million records in mydata.fasta, that file is 4 times too large to fit into memory, and the identifiers for the records you care about are one per line in findthese.txt, you could run
split_data.py mydata.fasta 10000000
And then submit 4 jobs to the cluster
find_records.py mydata.fasta.0 findthese.txt results.fasta.0
find_records.py mydata.fasta.1 findthese.txt results.fasta.1
find_records.py mydata.fasta.2 findthese.txt results.fasta.2
find_records.py mydata.fasta.3 findthese.txt results.fasta.3
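With more than a handful of chunks, typing these lines by hand gets tedious. Here is a small sketch that generates them, assuming a non-empty FASTA file, the chunk naming used by the split script above (a numeric suffix starting at 0), and a results.fasta.N output name as in the example; the function name is made up for illustration, and how each command is submitted depends on your cluster's scheduler.

```python
def find_record_commands(fasta, id_file, total_records, chunk_size):
    """Build one find_records.py command line per chunk of the FASTA file."""
    # Ceiling division: a final partial chunk still needs its own job.
    chunks = -(-total_records // chunk_size)
    return [
        "find_records.py %s.%d %s results.fasta.%d" % (fasta, i, id_file, i)
        for i in range(chunks)
    ]


# Example: 40 million records split into chunks of 10 million.
# for command in find_record_commands("mydata.fasta", "findthese.txt",
#                                     40000000, 10000000):
#     print(command)
```

Once the jobs finish, the per-chunk result files can simply be concatenated, since each one is a valid FASTA file.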
When benchmarking storage configurations, it's important that your benchmark's set of test data be larger than the amount of memory in the computer where the benchmark is running. Otherwise, the results will be more reflective of your operating system's caching behaviour than your storage system. For example, on a server with 8GB of RAM I would set iozone's -g argument to 16GB. Of course, tests that write this much data can take a long time. This can be a problem when you're trying to test many small tweaks to your storage. A way to speed this up is to use the Linux kernel's mem parameter. For example, in my GRUB configuration I could set kernel /vmlinuz ro root=/dev/example/root mem=1G. When I boot with this parameter set, the system will only have 1GB of memory available, so I can reduce my test data to 2GB. This makes iterating through many storage configurations much quicker.
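After rebooting with mem set, it is worth confirming how much memory the kernel actually sees before choosing a test size. The sketch below reads the MemTotal line from the contents of /proc/meminfo and doubles it, mirroring the 8GB to 16GB example above. The function names and the factor of 2 as a default are assumptions for illustration, not a rule from iozone.

```python
def mem_total_kb(meminfo_text):
    """Extract the MemTotal value (in kB) from /proc/meminfo contents."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            # The line looks like "MemTotal:  8167848 kB".
            return int(line.split()[1])
    raise ValueError("MemTotal not found")


def benchmark_size_kb(meminfo_text, factor=2):
    """Suggest a test data size that is a multiple of available memory."""
    return factor * mem_total_kb(meminfo_text)


# Typical use on Linux:
# with open("/proc/meminfo") as handle:
#     print(benchmark_size_kb(handle.read()))
```

Note that MemTotal is usually a little lower than the value given to mem, because the kernel reserves some memory for itself.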
On one server I administer, users run workflows that occasionally leave zombie processes behind. A zombie is a process that has finished but has not had its exit status read by its parent process. I keep this shell script around to generate a report on zombie children and their negligent parents that I can send to the workflow developers.
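For readers who want to experiment with the idea, here is a rough Python sketch of such a report. It is not the shell script mentioned above; it assumes the output format of `ps -eo pid,ppid,stat,comm` on Linux, where a state beginning with Z marks a zombie, and the function name is made up for this example.

```python
import subprocess


def find_zombies(ps_output):
    """Return (pid, command, parent pid, parent command) for each zombie.

    ps_output is the text printed by `ps -eo pid,ppid,stat,comm`.
    """
    processes = {}
    for line in ps_output.splitlines():
        fields = line.strip().split(None, 3)
        # Skip the header and any line that does not start with a PID.
        if len(fields) < 4 or not fields[0].isdigit():
            continue
        pid, ppid, stat, comm = fields
        processes[int(pid)] = (int(ppid), stat, comm)

    report = []
    for pid, (ppid, stat, comm) in sorted(processes.items()):
        if stat.startswith("Z"):
            parent = processes.get(ppid)
            parent_comm = parent[2] if parent else "unknown"
            report.append((pid, comm, ppid, parent_comm))
    return report


# Typical use on Linux:
# output = subprocess.check_output(["ps", "-eo", "pid,ppid,stat,comm"]).decode()
# for pid, comm, ppid, parent in find_zombies(output):
#     print("zombie %d (%s) was left by parent %d (%s)" % (pid, comm, ppid, parent))
```

Grouping the results by parent command is a natural next step if the same workflow keeps showing up.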