Distributed Compiling - The AnandTech Linux XBOX PC Experiment

Publish date: 2024-05-25

Distributed Compiling

Distributed compiling was one of the original goals of this project. We do not need to throw a lot of computing power at gcc to compile most things, but for large builds (like GCC itself!), we can really benefit from running many jobs in parallel, provided hard drive and network IO do not slow us down. There happens to be an excellent program called distcc that acts as a front end to GCC and distributes compile jobs over several machines at the same time. Since we installed distcc on our original cluster master before cloning it, we only need to jump start the daemon using our cluster_command.sh script.

Network IO could really hurt us here. Each slave machine can use its entire 100Mbps link, but our master might be uploading to more than one machine at a time. If the master has to upload files to the seven other machines on the cluster at once, each transfer would be limited to about 1.8 megabytes (14.3 megabits) per second. It may make a lot more sense for us to run a separate dedicated PC with a gigabit Ethernet card as the master instead, at least for distcc. We will test both cases here to see how network IO affects our build.
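The per-machine figure above is simple division, and it can be checked with a few lines of shell. This is only a rough sketch: the function name per_host_share and its three arguments (master link speed, number of receiving machines, receiver link speed) are illustrative and not part of the original setup, and the calculation ignores protocol overhead.

```shell
#!/bin/sh
# Rough per-machine bandwidth when the master uploads to several slaves at once.
# All three arguments are illustrative inputs in Mbps or counts, not measurements.
per_host_share() {
    master_mbps=$1   # speed of the master's network link
    receivers=$2     # number of machines receiving at the same time
    slave_mbps=$3    # speed of each receiver's own link (an upper bound)
    awk -v m="$master_mbps" -v n="$receivers" -v c="$slave_mbps" 'BEGIN {
        mbps = m / n
        if (mbps > c) mbps = c          # a slave cannot exceed its own link
        printf "%.1f Mbps (%.1f MB/s)\n", mbps, mbps / 8
    }'
}

per_host_share 100 7 100     # XBOX master with a 100Mbps card
per_host_share 1000 7 100    # dedicated master with a gigabit card
```

With a gigabit master, the bottleneck moves from the master's uplink to each slave's own 100Mbps card, which is why the sketch caps the result at the receiver's link speed.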

Since there are eight machines on the cluster, we want to make sure that there are enough make jobs running to keep every processor in the cluster busy. Deciding the number of jobs under distcc is not an exact science, but fortunately, all of our machines are the same speed, so that will alleviate some headaches. We tried compiling GCC 3.4.2 on our cluster using distcc under 9 and 17 jobs.

# ./cluster_command.sh distccd --daemon
# export DISTCC_HOSTS='master slave1 slave2 slave3 slave4 slave5 slave6 slave7'
# make -j9 CC=distcc
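The article does not say how the 9 and 17 job counts were chosen. One reading, offered here as an assumption, is that they correspond to one job per host plus one, and two jobs per host plus one. The sketch below derives both values from the same host list used in DISTCC_HOSTS; the helper name count_hosts is hypothetical.

```shell
#!/bin/sh
# Derive candidate make -j values from a distcc-style host list.
# The N+1 and 2N+1 rules are an assumption, not something the article states.
hosts='master slave1 slave2 slave3 slave4 slave5 slave6 slave7'

count_hosts() {
    # Word-split the space-separated list and count the entries.
    set -- $1
    echo $#
}

n=$(count_hosts "$hosts")
low_jobs=$((n + 1))          # one job per host, plus one spare
high_jobs=$((2 * n + 1))     # two jobs per host, plus one spare

echo "hosts: $n"
echo "candidate job counts: -j$low_jobs and -j$high_jobs"
```

The spare job in each case is meant to keep a compile queued while another is waiting on disk or network; whether the higher count helps depends on how much of the build can actually run in parallel.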

Kernel 2.6.4 Compile (GCC 3.3.3)
We didn't really see the performance numbers that we were looking for here. We had anticipated poor performance on the XBOX cluster due to its small amount of RAM. In theory, if we could get our cluster to scale to 16 nodes without a huge performance hit, we would see very impressive compile scores. However, a single-user build rarely keeps 16 jobs busy at once, even with a large compile like GCC, since many things need to be built in order. A multi-user environment running hundreds of compiles at once would benefit from so many nodes; perhaps a community cross-compiling station.

In case the master XBOX was performing poorly due to its meager 100Mbps network card, we reconfigured the cluster to obey a different host with a gigabit Ethernet card and ran the same commands as above.

Kernel 2.6.4 Compile (GCC 3.3.3) (no master)
It looks like our cluster was not greatly affected by network IO after all. If we anticipate running more XBOXes, however, running the cluster from a dedicated master with at least one gigabit Ethernet card would be absolutely necessary.
