It seems that, once enough cores are present, interference from the scheduler is enough to slow down the system. Sparing one core just to schedule tasks onto the other cores (and to handle interrupts) makes the system as a whole faster.
These are the times for three experiments: making a kernel, copying a file tree from RAM to RAM, and making a kernel with its source tree in RAM. The first two rows are taken from a previous experiment also reported in this blog. We re-ran that experiment to compare with AMP because the result depends on the state of the network, which may differ.
make kernel   copy tree (RAM to RAM)   make kernel (source in RAM)   configuration
14.0344       1.921                    10.458                        single sched, 32 cores
10.789        0.608                    5.073                         single sched, 4 cores
12.8          2.20                     9.23                          rerun of single sched, 32 cores
10.17         0.995                    5.775                         AMP sched, 32 cores
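
To make the comparison concrete, here is a small sketch that computes how much faster the AMP run is than the rerun of the single scheduler on 32 cores. It assumes the three numeric columns appear in the order the experiments are listed in the text (make kernel, copy tree, make kernel in RAM); the names BENCHMARKS, single_32_rerun, amp_32 and speedups are made up for this illustration.

```python
# Timings copied from the rows above, in the units reported in the post.
# Assumption: the columns follow the order given in the text:
# (make kernel, copy tree RAM to RAM, make kernel with source in RAM).
BENCHMARKS = ("make kernel", "copy tree RAM to RAM", "make kernel in RAM")

single_32_rerun = (12.8, 2.20, 9.23)   # rerun of single sched, 32 cores
amp_32 = (10.17, 0.995, 5.775)         # AMP sched, 32 cores


def speedups(baseline, candidate):
    """Return baseline/candidate per benchmark; a value above 1 means candidate is faster."""
    return [b / c for b, c in zip(baseline, candidate)]


if __name__ == "__main__":
    for name, ratio in zip(BENCHMARKS, speedups(single_32_rerun, amp_32)):
        print(f"{name}: {ratio:.2f}x")
```

Under that column assumption, the ratios come out at roughly 1.26x, 2.21x and 1.60x in favor of AMP on 32 cores.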