Ok, it's been a year since I wrote down a recommendation for building a gigantic computer, or any computer for that matter. I haven't been building many of these now that a basic laptop like the MacBook Pro 15" is so powerful. I also haven't been doing much gaming, and most of the machine learning work happens in the cloud.
But there is still room for a custom machine for a dedicated researcher doing machine learning work. If you have free grad student labor, it makes more sense to buy a $20K machine than to pay $5/hour for cloud machine learning if you are going to use it day and night for more than two years (that is, $20K buys 4,000 hours, or about 167 days of continuous processing).
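The buy-versus-rent arithmetic can be sketched in a few lines of Python. This is only a rough sketch: the function names are made up for illustration, and it assumes the only costs are the $20K purchase price and the $5/hour cloud rate, ignoring electricity, depreciation, and the labor to build and run the machine.

```python
def breakeven_hours(machine_cost_usd: float, cloud_rate_usd_per_hour: float) -> float:
    """Hours of cloud time that cost the same as buying the machine outright."""
    if cloud_rate_usd_per_hour <= 0:
        raise ValueError("cloud rate must be positive")
    return machine_cost_usd / cloud_rate_usd_per_hour


def hours_to_days(hours: float) -> float:
    """Convert hours of continuous (24 hours a day) use into days."""
    return hours / 24.0


# Figures from the text: a $20K machine versus $5/hour in the cloud.
hours = breakeven_hours(20_000, 5.0)
days = hours_to_days(hours)
print(f"{hours:.0f} hours, about {days:.0f} days of continuous use")
```

If the machine is only busy part of the day, the break-even point moves out proportionally, so it is worth plugging in your own utilization.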
Tim Dettmers has some great insights and recommendations having built seven inference machines. Slav Ivanov has a good set of specific recommendations.

# Graphics Card: Titan RTX or 2080 RTX

The main thing to ensure is that you have selected a card which works with your workloads. If you use 16-bit models, then you should use the RTX (Turing) models as they are tuned for 16-bit performance. The now older GTX 1080 Ti and the like can be used just for 32-bit since they don't have 16-bit optimizations.
But the most important thing is how big a model you are running. You need:

• 11GB+ if you are chasing state-of-the-art scores in training
• 8GB if you are looking to run research models

So let's look at the performance of the all-important GPU on machine learning models. The latest development is the new architectures that NVIDIA has produced.
AMD Ryzen 7 2700X. This is the budget pick at $400 with 8 cores and 16 threads running at 3.7 GHz, and it is the minimum needed to serve 4 GPUs.

## Intel CPU Recommendations

In the last year, Intel has rejiggered their line. At the end of 2017, they basically had the old Broadwell-E Xeon on 2011 v3 sockets and the new Skylake for Core i7 on 1151. In those days, if you went to 2011, you got ECC as the main benefit.

Now they have Core i9, Xeon W and Xeon Scalable, which are roughly prosumer, workstation and server products.

The Xeon W is the processor for workstations. It uses the new 2066 pinout, requires the C422 chipset, and is built on the Skylake-W processor variant. Intel is always changing pinouts, which means that motherboards don't just work, so you spend a lot of time understanding new lines. As before, support for large memory and ECC is the reason to buy these. They have 48 lanes. Note that there are special Mac versions that go into the iMac Pro. Right now it is nearly impossible to build your own workstation with these because they are such limited parts. A comparable part might be the Xeon W-2145 with 8 cores/16 threads for $1600. For motherboards, you can get the Supermicro X11 SRA (Amazon), but it only has three full-length slots (16/16/8), or the ASUS WS C422 Pro SE (Amazon), which has three slots with true 16/16/16. So not a bad choice.
So moving to the LGA 1151 consumer line, you can get, without ECC:
Intel Core i7-9700K at $400 is the other choice. It has 8 cores and 8 threads. It is an 1151 pin system, so it has only 16 available lanes, but there is safety in picking Intel, so this is not a bad choice.

## CPU Fan

Although water cooling sounds neat, the truth is that in a big case with a well-designed quiet fan, there is really no difference, and an air cooler is much simpler. The Noctua NH-U14S has long been a favorite of mine.

# RAM: 16GBx4 ECC DDR4-2666

The main change here is the use of "pinned" memory. This means your information is in a place where the GPU can find it and transfer it to its VRAM without any CPU involvement. In this case, memory speed improvements don't matter. Even with CPU-mediated transfers, overclocked RAM results in a 3% speed improvement at most.

In terms of capacity, you need enough memory to have it all fit, so having 64GB makes sense. As an aside, you probably really want ECC RAM at this level if you are running loads for a long time. Unless you need over 384GB, you can use unbuffered RAM.

A year ago, you could only get DDR4-2133 RAM, but now the new Ryzen allows up to DDR4-2666 RAM, which helps performance marginally. You can get 16GB per stick from Crucial, or 32GB per stick for high density configurations (only if a specialized motherboard supports it). The big tradeoff is that if you go to 8 slots worth of memory with dual ranked RAM, then you can only run at DDR4-1866, so don't overbuy unless you really need lots of RAM.

# Motherboard

## Motherboard Requirements: PCI Lanes: 8 per GPU

The big thing is not to focus too much on the PCIe lanes in your CPU. These lanes are used to take data from DRAM to the VRAM of your GPU. The conventional wisdom is that you want the full 16 lanes available for each GPU to transfer from the CPU. But in fact, this transfer time isn't that important. The main point here is that the CPU-to-GPU transfer over 16 PCIe lanes takes about 2ms on a 216ms ImageNet pass, so most of the time this doesn't matter much.
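To put the 2ms and 216ms figures in proportion, here is a small Python sketch. The function names are hypothetical, and the half-lane estimate assumes transfer time scales inversely with lane count and is not overlapped with compute. Both are idealizations, so treat the result as a rough upper bound rather than a measurement.

```python
def transfer_fraction(transfer_ms: float, pass_ms: float) -> float:
    """Fraction of one training pass spent on the CPU-to-GPU transfer."""
    if pass_ms <= 0:
        raise ValueError("pass time must be positive")
    return transfer_ms / pass_ms


def scaled_transfer_ms(transfer_ms_at_16: float, lanes: int) -> float:
    """Idealized transfer time at another lane count.

    Assumption: time scales inversely with the number of lanes.
    """
    if lanes <= 0:
        raise ValueError("lanes must be positive")
    return transfer_ms_at_16 * 16 / lanes


# Figures from the text: 2ms transfer on a 216ms ImageNet pass at 16 lanes.
at_16 = transfer_fraction(2.0, 216.0)
at_8 = transfer_fraction(scaled_transfer_ms(2.0, 8), 216.0)
print(f"16 lanes: {at_16:.1%} of the pass, 8 lanes (idealized): {at_8:.1%}")
```

Even under the pessimistic assumption of no overlap, halving the lanes costs on the order of one percent of the pass.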
This means that with most Intel systems, you can have a dual GPU system where each GPU gets 8 lanes and you should see about the same performance. With multi-GPU training, you really only need 8 PCIe lanes per GPU, so 32 lanes total through the motherboard. Make sure this is what is available in a 4 GPU system. This normally requires a PCIe switch in the motherboard, which only specialized units have.

On the Intel side, each generation requires a different motherboard, but with AMD, the same X399 motherboards will work for Ryzen 1 or 2. For Intel, the latest overclockable line is the Z390 series that pairs with the overclockable K processors. But for machine learning, you don't need to overclock. You want reliability.

To get four GPUs into a single motherboard, you will need ATX at a minimum. eATX will give you an additional slot if you need it. We've used this in the past for NVMe cards when we ran out of slots.

## AMD Motherboard Recommendations

As an example, we have a new set of boards that are designed for the Ryzen 2 from Anandtech and Tom's Hardware:

ASUS X399 ROG Zenith Extreme. This is their flagship gamer board. It has room for 4 full-length GPUs running 16/8/16/8 lanes, which should work OK for machine learning, but obviously all 16 would be better. It also has three M.2 slots. The big issue is that it is hard to fit an air cooler on the CPU as the PCIe slot is very close to it. That is because they wanted a sixth slot for a 4 lane PCI Express card to run 10GbE Ethernet.

Gigabyte Aorus X399 Extreme (Tom's Hardware). This is an eATX board, so you can put four graphics cards in with 16/8/16/8 lanes available plus a single lane card as well. It supports 128GB in 8 slots with ECC available. It has room for three M.2 NVMe cards in 2280 and a single 22110 form factor, and it has onboard 802.11ac WiFi. It's an expensive $400 board though, and Gigabyte has been OK for me but doesn't have the same reputation as ASUS.
MSI MEG X399 Creation. Like the ASUS, this is a monster system with 16 PCI Express lanes to each of the GPUs, and there is a riser that attaches up to three M.2 cards. Also, if you don't completely populate the graphics card slots, it includes a special riser card that lets you attach four additional NVMe cards on an x16 slot. The main issue is the lack of 10Gb Ethernet, which is becoming important for machines like this.
Then there are less premium boards. They don't have the bells and whistles but are better deals:
Gigabyte X399 Aorus Gaming 7. This is one step down from the Extreme. It has 4 slots at 16/8/16/8 and there is a 4 lane slot but this prevents one GPU from being installed. And it has two M.2 22110 and one 2280 on the motherboard.
ASRock X399 Taichi. (Anandtech) This is an ATX board that is also 16/8/16/8 and is a good budget choice at $280. It doesn't have any of the fancy features like 10GbE or 802.11ad, but for a server, it isn't clear that you need all this given that most of the time it is grinding away.

## Intel Motherboard Recommendations

In the past, the ASUS Workstation (WS) line has been my go-to for Intel based systems. Now for the new 9th generation, you can get the WS Z390 Pro, for instance. I've also had good luck with the WS X99 for the older LGA 2011 v3, which works with the old Broadwell Xeons and allowed a 16/8/8/8 configuration with the 40 PCIe lanes from that chip.

# Mass Storage SSDs and HDs

These days it only makes sense to get an NVMe SSD for the boot drive and for swapping, since these aren't expensive anymore. Most modern motherboards let you put three M.2 NVMe cards in, so you can have a configuration that is all M.2. These take a total of 4 PCI Express lanes each, so it is nice to have Ryzen's 64 lanes. I would allocate them as system, boot and data drives.

Then you need a place to put all the data that you are running. Most datasets these days are pretty small, so a RAID-1 10TB hard drive is not a bad choice.

The market has slowed down some for this category. Tom's Hardware recommends that you stick with 1TB SSDs as the most economical choice, and Anandtech also has some recommendations, but the high-end SSD market has plateaued in performance:

1. AData XPG Gammix S11. These are super fast, but they are no longer available and have been replaced by the AData SX8200 Pro at $180.
2. Mushkin Pilot. At $190, this is another low cost choice.
3. Samsung 970 Evo. This has slightly lower performance, but at $250 it is much cheaper.
4. Samsung 970 Pro. You can't go too wrong with using these, but the AData is apparently faster. At $350 each, prices have really fallen, but they are still super expensive.
5. HP EX920. This is last year's model (the EX950 is the new one), but you can get a great value on closeouts at $180 for 1TB!
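Since GPUs and M.2 drives draw on the same pool of PCIe lanes, it can help to total them up before buying. The sketch below uses the figures quoted earlier (8 lanes per GPU as the practical minimum, 4 lanes per NVMe drive, 64 lanes on the AMD platform, 16 on LGA 1151). The function names are made up for illustration, and the tally ignores lanes consumed by the chipset, Ethernet, or other devices, as well as how a given motherboard actually wires its slots.

```python
def lanes_needed(num_gpus: int, num_nvme: int,
                 lanes_per_gpu: int = 8, lanes_per_nvme: int = 4) -> int:
    """Total PCIe lanes requested by the GPUs and NVMe drives."""
    return num_gpus * lanes_per_gpu + num_nvme * lanes_per_nvme


def fits(available_lanes: int, needed: int) -> bool:
    """True if the platform exposes at least as many lanes as requested."""
    return needed <= available_lanes


# Four GPUs at 8 lanes each plus three M.2 NVMe drives.
minimum = lanes_needed(num_gpus=4, num_nvme=3)
# The same build with the full 16 lanes per GPU.
full = lanes_needed(num_gpus=4, num_nvme=3, lanes_per_gpu=16)

for label, needed in (("8 lanes per GPU", minimum), ("16 lanes per GPU", full)):
    print(f"{label}: {needed} lanes, "
          f"fits in 64: {fits(64, needed)}, fits in 16: {fits(16, needed)}")
```

Under these assumptions the 8-lane build fits comfortably in 64 lanes while the full 16-lane build does not, which is the tradeoff behind the 16/8/16/8 boards listed above.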

## Power Supplies

You want a big power supply, since they decrease in efficiency by 20% over their life. So, for instance, a 4 GPU system at 250 watts TDP per GPU plus the CPU can easily require 1250 watts, and you need 20% over that, so a 1.6kW supply isn't unusual.
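The sizing rule above can be written out as a short Python sketch. The names are hypothetical, the 250 watts for the CPU and the rest of the system is simply what is left of the 1250 watt figure after four 250 watt GPUs, and the list of supply sizes is illustrative rather than a catalog of what is actually on sale.

```python
import math

# Illustrative supply sizes in watts (an assumption, not an exhaustive list).
COMMON_PSU_WATTS = (750, 850, 1000, 1200, 1600)


def required_watts(num_gpus: int, gpu_tdp_watts: int, other_watts: int,
                   headroom_percent: int = 20) -> int:
    """Peak load plus headroom, rounded up to a whole watt."""
    load = num_gpus * gpu_tdp_watts + other_watts
    return math.ceil(load * (100 + headroom_percent) / 100)


def pick_psu(needed_watts: int, sizes=COMMON_PSU_WATTS) -> int:
    """Smallest listed supply that covers the requirement."""
    for size in sorted(sizes):
        if size >= needed_watts:
            return size
    raise ValueError("no listed supply is large enough")


# Four 250 W GPUs plus an assumed 250 W for the CPU and everything else.
needed = required_watts(num_gpus=4, gpu_tdp_watts=250, other_watts=250)
print(f"need {needed} W, pick a {pick_psu(needed)} W supply")
```

Swap in the TDP figures for your own GPUs and CPU. The headroom percentage is the knob that reflects how much aging you want to allow for.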

## Bitcoin Mining Rigs

These rigs are mostly not appropriate because mining doesn't need many PCIe lanes, but otherwise there are no motherboards designed to hold more than 4 graphics cards.