Updating Dell Firmware

Dell has several methods for updating their BIOS.

Working Approach

Out of all the different approaches, this is what I came up with that successfully let me install the latest BIOS without taking a trip to the data center.

  • Install smbios-utils from EPEL
  • Download the BIOS package from the Dell website, here is an example
  • Rather than running what you downloaded, just extract the files, e.g.
    ./PER610_BIOS_LX_2.1.15.BIN --extract bios_files

    would extract the files to a directory named bios_files

  • The actual BIOS image is in the subdirectory named payload. Install it using the dellBiosUpdate program, e.g.
    dellBiosUpdate -u R610-020115C.hdr
  • Reboot
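
The extract-and-flash steps above can be scripted. What follows is a minimal Python sketch, not a tested tool. It assumes the extracted package holds exactly one .hdr image in its payload subdirectory, and the helper names find_bios_image and build_update_command are hypothetical. It only builds the command line; actually running it (as root) is left to you.

```python
from pathlib import Path


def find_bios_image(extract_dir):
    """Return the single .hdr file under <extract_dir>/payload.

    Assumption: the package extracts one BIOS image into a 'payload'
    subdirectory. Raises RuntimeError if zero or several are found.
    """
    images = sorted(Path(extract_dir, "payload").glob("*.hdr"))
    if len(images) != 1:
        raise RuntimeError("expected exactly one .hdr file, found %d" % len(images))
    return images[0]


def build_update_command(extract_dir):
    """Build the dellBiosUpdate command line for the extracted image."""
    return ["dellBiosUpdate", "-u", str(find_bios_image(extract_dir))]
```

For the example above, build_update_command("bios_files") would return a list that can be handed to subprocess.check_call.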

Unified Server Configurator Approach

A GUI that's a part of the Lifecycle Controller. I think you get to it by pressing F10 for System Services during boot.

Pros

  • Able to update all firmware on the system
  • Fetches the latest version
  • No dependencies on installed OS

Cons

  • Must go to data center since console redirection does not work with it
  • Extensive downtime as updates are downloaded

Download Approach

Dell's Drivers and Downloads page has BIN files you can download and run from within Linux to update the BIOS. These are a combined shell script and archive. In theory, you just run them and the new BIOS is installed.

Pros

  • Can always get latest version

Cons

  • Must manually browse site and identify proper download
  • Only updates the BIOS, not other firmware
  • In my experience, this does not work. A report of the kind of error I ran into and some troubleshooting steps are here.

Dell "Hardware" Repository Approach

Dell officially maintains and supports a yum repository with rpms for their firmware and some tools for installing it.

Pros

  • Able to update all firmware on the system

Cons

  • Has software that is also packaged in EPEL, meaning you must assign a higher priority to this repository in your yum config
  • The repository is not kept up to date with the latest firmware. It seems that it's updated in batches every few months.
  • In my experience, this does not work. See these error reports over a period of months where the R610 BIOS update fails to install.

Dell "Firmware" Repository Approach

Matt Domsch from Dell maintains an unsupported yum repository just with BIOS updates. You need to install some other tools from Dell's community repository (or maybe smbios-utils and firmware-addon-dell from EPEL would be enough, haven't tested this yet) to use it.

Pros

  • Should always be latest version

Cons

  • The script that generates the repository is fragile and frequently breaks
  • Doesn't seem to have all the BIOS updates I need, e.g. it didn't find any updates for my R610s

Update: vwaelchli on reddit points out that you can also update the firmware by running wsman commands from the lifecycle controller.

2011/09/10 12:43 · Brian Pitts · 0 Comments

Applying Tested Updates via YUM

When new software packages are released, you don't want to blindly apply them to all your systems. You should have a test environment, or at least a set of less-critical systems, where you test things first. If you normally stick to a consistent update schedule, such as once a week, you can use the date a package was created to determine whether it should be updated in your production environment. I couldn't find a way to do this directly through yum, so I created a shell script named update_production to do it. As written, it applies all updates except for packages released to CentOS or EPEL in the last week. It's designed to make it easy to tweak which repositories you delay updates from and how long the delay is. Ideally this functionality would be available as a yum plugin, but that looked much more complicated to develop.
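
The selection rule is simple enough to sketch on its own. The Python below is not the update_production script, only an illustration of the filter it describes. The input format (tuples of name, repository, and build time in seconds since the epoch) and the function name packages_to_update are assumptions made for the example.

```python
import time

WEEK_SECONDS = 7 * 24 * 60 * 60


def packages_to_update(pending, delayed_repos, now=None, delay=WEEK_SECONDS):
    """Return the names of pending updates that are old enough to apply.

    pending: iterable of (name, repo, build_time) tuples, where build_time
             is seconds since the epoch (hypothetical input format).
    delayed_repos: repositories whose packages must age before install.
    """
    if now is None:
        now = time.time()
    selected = []
    for name, repo, build_time in pending:
        # Hold back only packages that are both from a delayed repo and new.
        too_new = repo in delayed_repos and (now - build_time) < delay
        if not too_new:
            selected.append(name)
    return selected
```

Changing the delay or the set of delayed repositories is then a matter of passing different arguments.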

2011/09/07 23:42 · Brian Pitts · 0 Comments

Converting Software RAID Levels

If you accidentally set up a server with the wrong software RAID level, then as long as you used LVM you can use mirroring to temporarily move the filesystem off the RAID device, recreate the RAID properly, and move the filesystem back onto it. Needless to say, if things go wrong during this process you'll be reinstalling. Below is an example where md1 on sda2 and sdb2 is being converted from RAID0 to RAID1, and an iSCSI volume serves as the temporary storage. In it, VG is the volume group name, LV is the only logical volume being moved, and DEVICE is the device Linux creates for the iSCSI target. If doing this yourself, you'll want to frequently check the sanity of what you're doing with commands like 'pvs' and 'lvs -a -o +devices'. If feasible, you can save a lot of time by turning off swap, removing its logical volume during the conversion, then recreating it afterwards.

pvcreate $DEVICE
vgextend $VG $DEVICE
lvconvert -m 1 --mirrorlog core ${VG}/$LV $DEVICE
 
# repeat these two steps for every LV on the VG
lvconvert -m 0 ${VG}/$LV /dev/md1
vgreduce $VG /dev/md1
 
mdadm --stop /dev/md1
mdadm --remove /dev/md1
mdadm --create /dev/md1 --level=raid1 --raid-devices=2 /dev/sda2 /dev/sdb2
mdadm --detail --scan >> /etc/mdadm.conf
pvcreate /dev/md1
vgextend $VG /dev/md1
lvconvert -m 1 --mirrorlog core ${VG}/$LV /dev/md1
 
# repeat these two steps for every LV on the VG
lvconvert -m 0 ${VG}/$LV $DEVICE
vgreduce $VG $DEVICE
 
mkinitrd -v --with=raid1 /boot/newraid $(uname -r)
# make the first entry use your new initrd
vim /boot/grub/grub.conf

2011/07/29 00:58 · Brian Pitts · 0 Comments

Nethook

I recently had a problem on CentOS 5 where the best solution I could think of required running a command when a network interface is brought up. I'm used to having this functionality as a part of the networking scripts on Debian-derived distributions, but I discovered that Red Hat-derived distributions lack it. I've written a small daemon called nethook to provide it. You can find the code and documentation for nethook on github.

Update: Moved project from google code to github.

2011/04/15 02:30 · Brian Pitts · 0 Comments

virt-install with kvm

If you are using virt-install to create KVM virtual machines on RHEL or CentOS 5, be sure to specify the '--accelerate' option. If you don't use --accelerate, virt-install starts '/usr/bin/qemu-system-x86_64' rather than '/usr/libexec/qemu-kvm'. This isn't what you want, and it will fail with the error message “internal error Domain $YOUR_VM didn't show up”. What happened in the background is that libvirt set the machine type to rhel5.4.0, which lets qemu-kvm know it can use virtio, but this machine type is not understood by qemu-system-x86_64. If you check '/var/log/libvirt/qemu/$YOUR_VM.log', you'll see a detailed error like

LC_ALL=C PATH=/sbin:/usr/sbin:/bin:/usr/bin HOME=/ /usr/bin/qemu-system-x86_64 -S -M rhel5.4.0 -no-kqemu -m 512 -smp 1 -name $YOUR_VM -uuid a3e38d9d-d958-0da1-4b3a-57179ee04f29 -monitor pty -pidfile /var/run/libvirt/qemu/$YOUR_VM.pid -no-reboot -boot d -drive file=$YOUR_INSTALLER.iso,if=ide,media=cdrom,index=2 -net nic,macaddr=54:52:00:76:5f:f4,vlan=0 -net tap,fd=36,script=,vlan=0,ifname=vnet10 -serial pty -parallel none -usb -vnc 127.0.0.1:10 -k en-us

Supported machines are:
pc Standard PC (default)
isapc ISA-only PC

2011/03/24 13:54 · Brian Pitts · 0 Comments

Collectd Encryption Error

If you receive the following messages when starting collectd

network plugin: Option `SecurityLevel' is not allowed here.
network plugin: Option `AuthFile' is not allowed here.

it is because you configured collectd to sign or encrypt its communication, but collectd was not compiled with libgcrypt support. This is true of the version in EPEL, whose network plugin does not link against libgcrypt:

$ yum info collectd | grep Version
Version : 4.10.2
$ ldd /usr/lib64/collectd/network.so
libpthread.so.0 => /lib64/libpthread.so.0 (0x00002b620d29d000)
libdl.so.2 => /lib64/libdl.so.2 (0x00002b620d4b8000)
libc.so.6 => /lib64/libc.so.6 (0x00002b620d6bc000)
/lib64/ld-linux-x86-64.so.2 (0x0000003223400000)

To solve the problem, either disable signing and encryption or rebuild collectd.
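
If you manage many hosts, the ldd check can be automated. Here is a small Python sketch that scans ldd output for a library name. The function name links_library is made up for this example, and it assumes the usual "name => path (address)" layout of ldd output.

```python
def links_library(ldd_output, library="libgcrypt"):
    """Return True if any line of ldd output names the given library."""
    for line in ldd_output.splitlines():
        fields = line.split()
        # The first field is the library name, e.g. "libc.so.6".
        if fields and fields[0].startswith(library + ".so"):
            return True
    return False
```

Feed it the text printed by ldd /usr/lib64/collectd/network.so; a result of False means signing and encryption will not be available in that build.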

2011/03/24 13:48 · Brian Pitts · 0 Comments

Getting IP Addresses with Facter

When configuring services with Puppet, you sometimes need to know the IP address of a server. For example, I export nagios host definitions from many of my servers and use them to configure my nagios service. Facter ships with two sets of network-related facts that can help.

The first is the ipaddress fact. This fact contains the IP address of the first network interface reported by ifconfig, which outputs them in alphabetical order. The second is a fact for each network interface that has an IP address; unsurprisingly, it contains the interface's IP address.

For example, here are the facts from a server with three network interfaces, eth0, eth1, and eth2. eth1 and eth2 are bonded as the device bond0.

ipaddress => 172.16.32.48
ipaddress_bond0 => 172.16.32.48
ipaddress_eth0 => 192.0.32.10

In my environment, the device specific facts are too narrow. Although I'm consistent on which interfaces are plugged into which network, some servers use bonding or bridging so I can't assume that the device that has the physical link is what the ip address is assigned to. The private ip might be on eth1 or bond0, for example. Of course, this means the ipaddress fact is too broad. It will be a public IP address on some servers and a private IP address on others, depending on how the interfaces are named.

To improve on this, I've created two custom facts, ipaddress_public and ipaddress_private. Instead of containing the first IP address they find, they contain the first public or private IP address. I also have two facts, on_public and on_private, that report whether a server has any public or private IP address. Here is how my previous example looks with these new facts.

ipaddress => 172.16.32.48
ipaddress_bond0 => 172.16.32.48
ipaddress_eth0 => 192.0.32.10
ipaddress_private => 172.16.32.48
ipaddress_public => 192.0.32.10
on_private => true
on_public => true

The code for the facts is below.

require 'facter/util/ip'
 
def has_address(interface)
  ip = Facter::Util::IP.get_interface_value(interface, 'ipaddress')
  if ip.nil?
    false
  else
    true
  end
end
 
def is_private(interface)
  rfc1918 = Regexp.new('^10\.|^172\.(?:1[6-9]|2[0-9]|3[0-1])\.|^192\.168\.')
  ip = Facter::Util::IP.get_interface_value(interface, 'ipaddress')
  if rfc1918.match(ip)
    true
  else
    false
  end
end
 
def find_networks
  found_public = found_private = false
  Facter::Util::IP.get_interfaces.each do |interface|
    if has_address(interface)
      if is_private(interface)
        found_private = true
      else
        found_public = true
      end
    end
  end
  [found_public, found_private]
end
 
# these facts check if any interface is on a public or private network
# they return the string true or false
# these facts will always be present
 
Facter.add(:on_public) do
  confine :kernel => Facter::Util::IP.supported_platforms
  setcode do
    found_public, found_private = find_networks
    found_public
  end
end
 
Facter.add(:on_private) do
  confine :kernel => Facter::Util::IP.supported_platforms
  setcode do
    found_public, found_private = find_networks
    found_private
  end
end
 
# these facts return the first public or private ip address found
# when iterating over the interfaces in alphabetical order
# if no matching address is found the fact won't be present
 
Facter.add(:ipaddress_public) do
  confine :kernel => Facter::Util::IP.supported_platforms
  setcode do
    ip=""
    Facter::Util::IP.get_interfaces.each do |interface|
      if has_address(interface)
        if not is_private(interface)
          ip = Facter::Util::IP.get_interface_value(interface, 'ipaddress')
          break
        end
      end
    end
    ip
  end
end
 
Facter.add(:ipaddress_private) do
  confine :kernel => Facter::Util::IP.supported_platforms
  setcode do
    ip=""
    Facter::Util::IP.get_interfaces.each do |interface|
      if has_address(interface)
        if is_private(interface)
          ip = Facter::Util::IP.get_interface_value(interface, 'ipaddress')
          break
        end
      end
    end
    ip
  end
end
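
To sanity-check the classification outside of Facter, the same rule can be written in a few lines of Python. This is only a cross-check sketch, not part of the facts. The dict of interface names to addresses is a hypothetical stand-in for what Facter reports, and the three networks are the RFC 1918 ranges that the regular expression above matches.

```python
import ipaddress

# The three RFC 1918 private ranges, matching the regular expression above.
RFC1918 = [ipaddress.ip_network(n) for n in
           ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]


def is_private(address):
    """Return True if the dotted-quad address is in an RFC 1918 range."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in RFC1918)


def first_addresses(interfaces):
    """Return (first_public, first_private), scanning interfaces by name.

    interfaces: dict mapping interface name to IP address string
    (a hypothetical stand-in for what Facter reports).
    """
    first_public = first_private = None
    for name in sorted(interfaces):
        address = interfaces[name]
        if is_private(address):
            first_private = first_private or address
        else:
            first_public = first_public or address
    return first_public, first_private
```

Called with {"bond0": "172.16.32.48", "eth0": "192.0.32.10"}, it picks 192.0.32.10 as the public address and 172.16.32.48 as the private one, matching the facts shown earlier. Note that under this rule anything outside RFC 1918, including a loopback address, counts as public, so what you pass in matters.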

2011/02/08 15:13 · Brian Pitts · 0 Comments

Splitting FASTA with Biopython

A student in the lab associated with my employer asked me for advice on how to extract records from a FASTA file. The hitch was that he wanted a large number of records, on the order of thousands, and the FASTA file was even larger, containing tens of millions of records. The first approach that came to my mind was splitting the file into chunks that were small enough to fit into memory on the nodes in our cluster. This would allow multiple CPUs to search for the records of interest while eliminating the greatest potential performance killer, lots of disk seeks. I'd never used Biopython before, so this request seemed like a good excuse to try it. It turned out to be remarkably easy to learn enough to accomplish what I wanted; going from idea to tested code took about an hour.

To split the data

import sys
from Bio import SeqIO

filename = sys.argv[1]
maximum_length = int(sys.argv[2])  # maximum number of records per chunk
# plain "r" mode; the old "rU" mode was removed in Python 3.11
input_handle = open(filename, "r")
current_length = 0
current_file = 0
output = open("%s.%d" % (filename, current_file), "w")
for record in SeqIO.parse(input_handle, "fasta"):
  # start a new chunk file once the current one is full
  if current_length == maximum_length:
    output.close()
    current_length = 0
    current_file = current_file + 1
    output = open("%s.%d" % (filename, current_file), "w")
  # pass the record to SeqIO.write as a one-item list
  SeqIO.write([record], output, "fasta")
  current_length = current_length + 1
input_handle.close()
output.close()

And then to find the records you care about

import sys
from Bio import SeqIO

fasta_filename = sys.argv[1]
id_filename = sys.argv[2]
output_filename = sys.argv[3]
# read every identifier into a list, stripping newlines and skipping blank lines
with open(id_filename, "r") as id_handle:
  ids = [line.strip() for line in id_handle if line.strip()]
# load the whole chunk into memory as a dict keyed on record id
with open(fasta_filename, "r") as fasta_handle:
  records = SeqIO.to_dict(SeqIO.parse(fasta_handle, "fasta"))
# write out each requested record that is present in this chunk
with open(output_filename, "w") as output_handle:
  for record_id in ids:
    if record_id in records:
      SeqIO.write([records[record_id]], output_handle, "fasta")

So assuming you have 40 million records in mydata.fasta, that file is 4 times too large to fit into memory, and the identifiers for the records you care about are one per line in findthese.txt, you could run

split_data.py mydata.fasta 10000000

And then submit 4 jobs to the cluster

find_records.py mydata.fasta.0 findthese.txt results.fasta.0

find_records.py mydata.fasta.1 findthese.txt results.fasta.1

find_records.py mydata.fasta.2 findthese.txt results.fasta.2

find_records.py mydata.fasta.3 findthese.txt results.fasta.3

Shortening Disk Benchmark Time

When benchmarking storage configurations, it's important that your benchmark's set of test data be larger than the amount of memory in the computer where the benchmark is running. Otherwise, the results will be more reflective of your operating system's caching behaviour than your storage system. For example, on a server with 8GB of RAM I would set iozone's -g argument to 16GB. Of course, tests that write this much data can take a long time. This can be a problem when you're trying to test many small tweaks to your storage. A way to speed this up is to use the Linux kernel's mem parameter. For example, in my GRUB configuration I could set kernel /vmlinuz ro root=/dev/example/root mem=1G. When I boot with this parameter set, the system will only have 1GB of memory available, so I can reduce my test data to 2GB. This makes iterating through many storage configurations much quicker.
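
The "twice the RAM" rule of thumb is easy to compute from whatever memory the kernel actually sees after booting with mem=. The Python sketch below parses the MemTotal line of /proc/meminfo. The function name benchmark_size_kb and the default factor of 2 are my own choices for the example, and you should check iozone's documentation for the unit its -g argument expects before passing the result along.

```python
def benchmark_size_kb(meminfo_text, factor=2):
    """Return a test-data size in kB that is `factor` times MemTotal.

    meminfo_text: the contents of /proc/meminfo as a string.
    """
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            # Line format: "MemTotal:        8167848 kB"
            return int(line.split()[1]) * factor
    raise ValueError("MemTotal not found")
```

On a running system you would call it as benchmark_size_kb(open("/proc/meminfo").read()).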

2011/02/05 20:42 · Brian Pitts · 0 Comments

Identifying Zombies

On one server I administer, users run workflows that occasionally leave zombie processes behind. A zombie is a process that has finished but has not had its exit status read by its parent process. I keep this shell script around to generate a report on zombie children and their negligent parents that I can send to the workflow developers.
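
For readers who want to see the idea in code, here is a rough Python sketch of the same kind of report. It is not the shell script mentioned above. It assumes a Linux /proc filesystem, and the function names parse_stat and find_zombies are invented for the example.

```python
import os


def parse_stat(stat_line):
    """Parse a /proc/<pid>/stat line into (pid, comm, state, ppid).

    The command name is wrapped in parentheses and may itself contain
    spaces or parentheses, so split around the last ')'.
    """
    left, right = stat_line.rsplit(")", 1)
    pid, comm = left.split("(", 1)
    fields = right.split()
    return int(pid), comm, fields[0], int(fields[1])


def find_zombies(proc="/proc"):
    """Yield (pid, comm, ppid) for every process in state Z."""
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue
        try:
            path = os.path.join(proc, entry, "stat")
            with open(path, errors="replace") as handle:
                pid, comm, state, ppid = parse_stat(handle.read())
        except (IOError, OSError):
            continue  # process exited while we were scanning
        if state == "Z":
            yield pid, comm, ppid
```

Looping over find_zombies() and looking up each ppid gives the list of negligent parents.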

2011/02/03 01:21 · Brian Pitts · 0 Comments


blog.txt · Last modified: 2007/09/27 01:01 by brian