Failed iLO management port

Last week, I got my hands on a set of three Intel S3700 datacenter SSDs, and today was going to be the day I installed them.

The disk installation was straightforward enough. Legos and duct tape were involved. More on that some other time, perhaps.

The issue

So, I get the disks in, close up, and plug everything back in. And somehow I manage to kill the iLO management port along the way.

Well, the server still works. But now the iLO can’t be accessed over the network, which means my fan control scripts won’t work.

This greatly reduces the Everyone-In-The-House Acceptance Factor (EITHAF?), which is a problem, since I’ve actually, finally, started using the server for thesis word counting.

The solution

HPE has a utility for their servers called hponcfg, which allows you to set iLO parameters directly from the host.

I goofed around with rpms and missing drivers in unRAID trying to get a sign of life from the iLO, but to no avail. It was dead.

Well, the network port was. No link light on either the switch or the iLO port, across multiple cables. A monitor let me see what was going on, which indicated that the iLO hardware was otherwise fine.

So: I needed to set the iLO to run on the onboard NIC of my 380e.

grml

grml is the grumpy sound sysadmins make when they can’t automate every task in front of them. It is also a very capable live CD, based on Debian.

Download grml, and get it on a bootable USB somehow. I used Etcher.

Boot off the grml USB. I added the boot parameters grml ssh=secret at the boot prompt to be able to ssh in.

hponcfg

hponcfg is available from HPE as part of the Management Component Pack.

To install it, first add the repository:

echo "deb http://downloads.linux.hpe.com/SDR/repo/mcp buster/current non-free" >> /etc/apt/sources.list.d/hp-mpc.list

and the GPG key:

curl http://downloads.linux.hpe.com/SDR/hpePublicKey2048_key1.pub | apt-key add -

Then pull a fresh list of packages, and install hponcfg:

apt-get update && apt-get install hponcfg
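A live environment forgets everything on reboot, so the repository step tends to get repeated. Here is a minimal sketch of a repeatable version, assuming a POSIX shell. The function name add_repo_line is made up for illustration and is not part of any HPE tooling. It appends the repository line only when the target file does not already contain it:

```shell
# Hypothetical helper (not from HPE): add the MCP repository line to a
# sources file exactly once, no matter how often it is called.
add_repo_line() {
    list_file="$1"
    repo_line="deb http://downloads.linux.hpe.com/SDR/repo/mcp buster/current non-free"

    # -q: quiet, -x: match the whole line, -F: treat the pattern as a fixed string.
    # A missing file makes grep fail, which also leads to the append below.
    if ! grep -qxF "$repo_line" "$list_file" 2>/dev/null; then
        printf '%s\n' "$repo_line" >> "$list_file"
    fi
}

# Example call (as root):
#   add_repo_line /etc/apt/sources.list.d/hp-mpc.list
```

The whole-line match means a commented-out copy of the line does not count as already present.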

The XML file

I really don’t like XML as a representation of data. Too verbose. Nevertheless, hponcfg consumes XML.

The following snippet tells the iLO to use the onboard NIC instead of the management port:

<!-- HPONCFG VERSION = "5.5.0" -->
<!-- Generated 5/12/2020 23:15:55 -->
<RIBCL VERSION="2.1">
 <LOGIN USER_LOGIN="Administrator" PASSWORD="password">
  <RIB_INFO MODE="write">
  <MOD_NETWORK_SETTINGS>
    <SHARED_NETWORK_PORT VALUE="Y"/>
  </MOD_NETWORK_SETTINGS>
  </RIB_INFO>
 </LOGIN>
</RIBCL>

Save it in your grml live environment as, say, lom.xml, and apply the configuration with `hponcfg -f lom.xml`. After a little while, the iLO will reset and come back up running off the onboard NIC.
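If you would rather not paste XML into an editor over ssh, the file can be written from the shell. This is a rough sketch, assuming a POSIX shell. The function name write_lom_xml and its arguments are my own invention, and the credentials are the same placeholders as in the snippet above:

```shell
# Hypothetical helper: write the shared-network-port RIBCL snippet to a file.
# Arguments: output file, iLO user name, iLO password.
# Caveat: values are inserted verbatim, so a password containing XML special
# characters (&, <, ") would produce a broken file.
write_lom_xml() {
    out_file="$1"
    ilo_user="$2"
    ilo_pass="$3"

    cat > "$out_file" <<EOF
<RIBCL VERSION="2.1">
 <LOGIN USER_LOGIN="${ilo_user}" PASSWORD="${ilo_pass}">
  <RIB_INFO MODE="write">
   <MOD_NETWORK_SETTINGS>
    <SHARED_NETWORK_PORT VALUE="Y"/>
   </MOD_NETWORK_SETTINGS>
  </RIB_INFO>
 </LOGIN>
</RIBCL>
EOF
}

# Example call, followed by the apply step from the text:
#   write_lom_xml lom.xml Administrator password && hponcfg -f lom.xml
```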

Notes

I really want these tools in base unRAID, or at least the drivers for them.

After an iLO reset, it had forgotten to run off the shared network interface.

Running the iLO off the LOM means that the server itself cannot ssh in to the iLO.
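For the note about the iLO forgetting the shared-port setting after a reset, one way to notice is to dump the current configuration with hponcfg -w and look for the setting. The sketch below assumes that the dump contains a SHARED_NETWORK_PORT element, which I have not verified for every iLO generation. The function name shared_port_enabled is hypothetical:

```shell
# Hypothetical helper: succeed if the given RIBCL file enables the shared
# network port. Tolerates optional spaces around "=", since dumped files
# may be formatted differently from hand-written ones.
shared_port_enabled() {
    grep -Eq '<SHARED_NETWORK_PORT[[:space:]]+VALUE[[:space:]]*=[[:space:]]*"Y"' "$1"
}

# Example use from the host (as root), re-applying lom.xml when needed:
#   hponcfg -w current.xml
#   shared_port_enabled current.xml || hponcfg -f lom.xml
```

This only inspects the file. Whether the iLO actually came up on the onboard NIC is still best confirmed by reaching it over the network.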