Add a GPU to the DL380e

Because the CPU is for, uh, my thesis!

After observing Plex streams churning up some CPU time, I decided to get a GPU to offload transcoding. I went with a GTX 1050 Ti 4GB. This is more or less the same chip as the Quadro P2000, the current ~budget~, uh, low-powered darling. With a sneaky workaround, the artificial two-transcode limit is easily circumvented. At a fifth of the price of a P2000, I’ll take that deal!

GPU installation and patching

To pass your GPU through to your Plex docker, a few preparatory steps are needed.

0. Install Community Applications

1. Install unRAID-nVidia

From Community Applications, install unRAID-nVidia. Go to Settings → unRAID-nVidia, select the nVidia build matching your version of unRAID, and install.

2. Prepare patch

su
cd /boot
# Fetch the keylase nvidia-patch script, which lifts the transcode session limit
wget https://raw.githubusercontent.com/keylase/nvidia-patch/master/patch.sh
chmod +x patch.sh
mv patch.sh nvidia-patch.sh
# Append it to the go file (/boot/config/go), so the patch re-applies on every boot
cat /boot/nvidia-patch.sh >> config/go

3. Power down, and install GPU

Installing the card is straightforward: remove the PCIe riser, insert the card in the 16x-wide slot, reinsert the riser, and boot.

The GPU fits!

Do note that the 16x slot is only 8x electrical, but this does not matter for our purposes. My particular card did not need an extra GPU power cable. If yours does, you need the 10-pin to GPU power adapter from HP, or get this one from moddiy.com.

4. Reboot, and configure the Plex docker

…but first, go to Settings → unRAID-nVidia, and copy your GPU GUID somewhere convenient.
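If you’d rather grab it from a terminal, the driver can report the same GUID; a quick sketch, assuming the unRAID-nVidia build is loaded:

# Prints e.g. "GeForce GTX 1050 Ti, GPU-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
nvidia-smi --query-gpu=name,uuid --format=csv,noheader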

In the unRAID web UI, navigate to Docker and reconfigure the Plex docker. Switch to the advanced view, and under “Extra Parameters” add --runtime=nvidia. Under NVIDIA_VISIBLE_DEVICES, add that GUID. Save, which restarts the Plex docker.
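For reference, those two settings boil down to Docker arguments roughly like these (the GUID is a placeholder, and the image name is whatever your template uses; plexinc/pms-docker shown for illustration):

# Sketch of the resulting container invocation; unRAID adds many more options
docker run --runtime=nvidia \
  -e NVIDIA_VISIBLE_DEVICES=GPU-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx \
  plexinc/pms-docker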

5. Enable hardware transcoding in Plex

In the Plex web UI, go to the settings for your server. Under “Transcoder”, select “Use hardware acceleration when available”.
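To confirm the GPU is actually picking up the work, start a stream that forces a transcode and peek at the card from an unRAID shell; the Plex dashboard should also tag the stream with “(hw)”:

# A "Plex Transcoder" process should appear in the Processes table
nvidia-smi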

And there you go!

Pipe down, you!

Remember the fan control rain dance from the last entry in this series? And HPE’s aggressive stance towards fan control? Well, HPE’s not gonna let you forget. After installing the GPU, my fans were running at an… excessive > 60 %. And my previous fan hack - just setting every fan baseline to 1 - didn’t work anymore. This Reddit post pointed me in the right direction, though.

This process is also a little involved, so buckle up.

1. Reset iLO.

There’s a bug in the fan-control-hacked firmware: after a reset, only the first SSH session displays command output. And that output is important for the next step.
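A reset can be done from the web UI, or over SSH; a sketch, assuming the standard iLO 4 CLI and using my user and hostname from further down:

# Reset the management processor (iLO itself, not the host)
ssh -o KexAlgorithms=+diffie-hellman-group1-sha1 martin@nas-ilo "reset /map1"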

2. Figure out which sensor is making iLO freak out

iLO is, as my son so eloquently put it, crying “stranger danger” on account of not recognizing the GPU. This can be illustrated by SSHing into iLO, and running the command fan info g.

A nice table like the following should be presented:

GROUPINGS
0: FASTEST Output:  63  [02*07 ...
1: FASTEST Output:  63  [02*07 ...
2: FASTEST Output:  35  [01 02*...
3: FASTEST Output:  36  [01 02 ...
4: FASTEST Output:  60  [01 03 ...
5: FASTEST Output:  60  [01 05 ...

(Example borrowed from the linked Reddit post, since I forgot to save my actual output)

Note that some numbers are marked with an *. This marks the sensor iLO is reading as the hottest - in my case, sensor 52.

I said, be QUIET!

To quiet down just that sensor, run fan pid 52 hi 300 (or some other low number), and enjoy immediate relief as your fans settle down somewhere around 10-15 %.

Results

Quick testing yielded two 4K → 1080p transcodes, at ~1500 megabytes of GPU RAM each, alongside one 1080p → 720p transcode. Realistically, I won’t have much more than one 4K transcode going at any given moment, if any at all. Very nearly 0 % CPU usage though, which was nice.

Feels good when a plan comes together.

unRAID Tuning

To get predictable performance across the many tasks performed by the NAS, it is possible to do CPU pinning. This effectively reserves part of the CPU’s processing power for certain tasks, meaning Plex can do its thing without being bothered by SABnzbd unpacking, for instance.

But to do this most efficiently, we need to know our CPU thread pairings. With Hyper-Threading, each physical core presents two logical threads; it is not one core plus a second independent core. A subtle distinction, maybe.

To find the CPU thread pairings, go to Tools → System Devices in the unRAID web interface.
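If you’d rather check from a shell, the same pairings can be read out of sysfs; a minimal sketch (plain Linux, nothing unRAID-specific):

# Each logical CPU lists its hyperthread sibling pair, e.g. "0,16";
# sort -u collapses the duplicates (each pair appears once per sibling)
for c in /sys/devices/system/cpu/cpu[0-9]*/topology/thread_siblings_list; do
  cat "$c"
done | sort -u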

For my dual-octocore 2450Ls, I get the following:

CPU Thread Pairings

Pair  1: cpu  0 / cpu 16
Pair  2: cpu  1 / cpu 17
Pair  3: cpu  2 / cpu 18
Pair  4: cpu  3 / cpu 19
Pair  5: cpu  4 / cpu 20
Pair  6: cpu  5 / cpu 21
Pair  7: cpu  6 / cpu 22
Pair  8: cpu  7 / cpu 23
Pair  9: cpu  8 / cpu 24
Pair 10: cpu  9 / cpu 25
Pair 11: cpu 10 / cpu 26
Pair 12: cpu 11 / cpu 27
Pair 13: cpu 12 / cpu 28
Pair 14: cpu 13 / cpu 29
Pair 15: cpu 14 / cpu 30
Pair 16: cpu 15 / cpu 31

With pairs 1-8 being the first CPU, and pairs 9-16 the second.

This means that to isolate, say, Plex to cores 3, 4, and 5, I’d need to pin virtual CPUs 3,19; 4,20; and 5,21.
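unRAID exposes pinning as checkboxes in the container settings, but under the hood it amounts to Docker’s --cpuset-cpus flag; a sketch of the equivalent “Extra Parameters” entry, using my pairs from above:

# Pin the Plex container to cores 3-5 plus their hyperthread siblings 19-21
--cpuset-cpus=3,4,5,19,20,21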

Planning it out

Context matters. This server is:

  • A basic NAS/media server
  • A digital PVR
  • A Plex server
  • A Minecraft server
  • My heavy word-crunching rig

A recent acquisition

My new NAS, remote workstation and so much more

Pursuant to my master’s in educational sociology, I’ve been coding a fair bit of R recently. I’ve quickly run into resource bottlenecks, though - the intersection of a fairly large dataset and a mere X1C6 turns out to… not be great. So… what better excuse to buy retired enterprise hardware? It’s for my degree!

The hardware

| Base     | HP DL380e gen 8         |
| Chassis  | 12 LFF bays, 1x750W PSU |
| CPU      | Dual Xeon E5-2450L      |
| RAM      | 96 GB (4x8 + 4x16)      |
| HBA(ish) | P420, B120i             |
| Data     | 5x 6TB HGST NL-SAS      |
| Cache    | 500GB WD Red NAS SSD    |

Went whole hog on the RAM, as that’s where my programming efforts are stymied. I’ve got room to grow - HPE reports I can increase up to 192 GB, while unRAID reports 384 GB max capacity.

The disks were used, and a steal at ~15 USD per terabyte. All report A-OK.

Hacks

Extra 12v/5v power

To use the B120i for cache SSDs, I needed to find extra power somewhere. My 380e came without the rear drive cage option, but did come with the rear drive cage cable. Measurements yielded this pinout:

    |-|
+---------+  1: 8v ground    4: 12/8v (yellow)
| 1  2  3 |  2: empty        5: 12v/5v ground
| 4  5  6 |  3: 1v ground    6: 5v/1v (red)
+---------+

Also, the cable fits a female 6-pin PCI Express connector perfectly. So, a massacre of a 6-to-8-pin adapter as well as a Molex extender later, we have power!

The yellow and red leads on the Molex adapter go to pins 4 and 6, and the two black leads meet and go to pin 5.

The first of four planned cache drives

Fan Control

HP servers are notorious for having an… aggressive approach to fan profiles. This means they can be hard to share a small home with. But never fear! Nerds to the rescue — turns out, there’s a hack for that.

Do note: THIS IS A HACK. IT MAY NOT WORK. IT MAY BRICK YOUR SERVER.

I hate noise more than I have sense, and I was fine in the end. YMMV.

0. Actively cool the P420

The P420 is a hot chip, and largely responsible for the baseline fan levels. iLO does all it can to keep it at or below 85 degrees C, which means running the fans hard. I zip-tied a Noctua 40mm fan (the NF-A4x20 FLX) to the heatsink, with great results. Powered from the rear drive bay power cable (another adapter in the chain), this keeps the RAID card at a comfortable 65-67 degrees C, and iLO can stop worrying.

So. Many. Adapters

Except that, in HPE’s infinite wisdom, any detected PCIe card means fans 3, 4, and 5 will run at a minimum of 35-40 %. More steps must be taken!

1. Install exploitable iLO firmware

# Fetch the last exploit-friendly iLO 4 firmware (2.50) and unpack it
wget https://downloads.hpe.com/pub/softlib2/software1/sc-linux-fw-ilo/p192122427/v112485/CP027911.scexe
chmod +x CP027911.scexe
./CP027911.scexe --unpack=ilo

Install the 2.50 firmware however you like. I used the web interface.

2. Install hacked firmware

# Clone the Airbus SecLab toolbox and install its dependencies
git clone https://github.com/airbus-seclab/ilo4_toolbox.git
yay -S keystone hexdump
cd ilo4_toolbox/scripts/iLO4/exploits
# Fetch the patched firmware blob (ilo4_healthcommands.bin); the session-bound
# Dropbox URL below has likely expired by now, so source a fresh link if needed
wget "https://uc2e993615a24a6915b40d722b8c.dl.dropboxusercontent.com/cd/0/get/A1CIhVjQEhr9ukukz8Qw_dHKizKB0RGgnFjfrp6z1rUtvBFclCvn4t6LErPcGVl0At3NQKzgezKAb8eV9-W5eg1P_0lRnZ47R-d5u0r4VvTpbmRBuItsv5RL2b2aKbyY7_M/file?_download_id=16760008867236312560412850928972566356913390752513665509633372074&_notify_domain=www.dropbox.com&dl=1"
python2 exploit_write_flash.py 250 ilo4_healthcommands.bin

I had the exploit stall the first time I ran it. Tried again, the planets were aligned.

3. Control fan speed

This will reduce the base speed of the fans to a more bearable level across the board, while allowing the firmware to respond as designed to high temperatures. It really quiets down around sensor 32, which, incidentally, is the P420. Adding disks can cause the fans to spin up again, which requires a re-run of the command. Change the user and iLO hostname to suit your environment. If private keys are not set up, insert sshpass -p <password> right after do.

# Lower the "lo" PID threshold for all 65 sensors in one pass
for I in $(seq 1 65); do ssh -o KexAlgorithms=+diffie-hellman-group1-sha1 martin@nas-ilo "fan pid $I lo 125"; done

I decided to run this command periodically, in case the box gets confused and ramps up the fan profiles again.
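A minimal sketch of that, assuming sshpass is available and wrapping the loop in a hypothetical script on the flash drive:

#!/bin/bash
# Hypothetical /boot/fan-floor.sh: the same loop as above, with sshpass supplying the password
for I in $(seq 1 65); do
  sshpass -p 'SECRET' ssh -o KexAlgorithms=+diffie-hellman-group1-sha1 martin@nas-ilo "fan pid $I lo 125"
done

# Then re-apply hourly, e.g. from cron or the User Scripts plugin:
# 0 * * * * /boot/fan-floor.sh >/dev/null 2>&1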