When acquiring a spinlock, yieldThread() every 1000 spins (#3553, #3758)
author Simon Marlow <marlowsd@gmail.com>
Fri, 22 Jan 2010 16:49:11 +0000 (16:49 +0000)
committer Simon Marlow <marlowsd@gmail.com>
Fri, 22 Jan 2010 16:49:11 +0000 (16:49 +0000)
commit 65ac2f4cefcea7ca78a65ca22889b51b5a27d1f0
tree a841ea73d574bbda83eb387c8db530c342eb84e9
parent f1d99ae043da2a4825d88756275477d82d92c967

This helps when the thread holding the lock has been descheduled,
which is the main cause of the "last-core slowdown" problem.  With
this patch, I get much better results with -N8 on an 8-core box,
although some benchmarks are still worse than with 7 cores.
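
For illustration only, the spin-then-yield idea might look roughly like the
sketch below. It is not the code from this patch: SpinLockSketch, SPIN_LIMIT,
spin_acquire() and spin_release() are hypothetical names, C11 atomics stand in
for the RTS's own primitives, and POSIX sched_yield() stands in for
yieldThread(). The yield count is returned only so the behaviour can be
observed.

```c
#include <assert.h>
#include <sched.h>      /* sched_yield (POSIX); stand-in for yieldThread() */
#include <stdatomic.h>

/* Hypothetical constant: spins between yields. */
#define SPIN_LIMIT 1000

typedef struct {
    atomic_int lock;    /* 1 = free, 0 = held (convention assumed here) */
} SpinLockSketch;

static void spin_init(SpinLockSketch *p)
{
    atomic_init(&p->lock, 1);
}

/* Acquire the lock; returns how many times we yielded while waiting. */
static unsigned long spin_acquire(SpinLockSketch *p)
{
    unsigned long yields = 0;
    for (;;) {
        for (int i = 0; i < SPIN_LIMIT; i++) {
            int expected = 1;
            /* Try to flip free (1) to held (0). */
            if (atomic_compare_exchange_strong_explicit(
                    &p->lock, &expected, 0,
                    memory_order_acquire, memory_order_relaxed)) {
                return yields;
            }
        }
        /* The holder may have been descheduled: give up the CPU so it
         * gets a chance to run and release the lock. */
        sched_yield();
        yields++;
    }
}

static void spin_release(SpinLockSketch *p)
{
    /* Releasing a lock that is not held is a caller bug. */
    assert(atomic_load_explicit(&p->lock, memory_order_relaxed) == 0);
    atomic_store_explicit(&p->lock, 1, memory_order_release);
}
```

With no contention the first compare-and-swap succeeds and no yield happens,
so the fast path is unchanged; the yield only matters once a waiter has
already burned SPIN_LIMIT attempts.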

I also added a yieldThread() into the any_work() loop of the parallel
GC when it has no work to do. Oddly, this seems to improve performance
on the parallel GC benchmarks even when all the cores are busy.
Perhaps it is due to reducing contention on the memory bus.
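
A rough sketch of yielding from an idle work-polling loop, under stated
assumptions: WorkPoolSketch, any_work_sketch() and wait_for_work() are
hypothetical and do not mirror the structure of rts/sm/GC.c, and
sched_yield() again stands in for yieldThread().

```c
#include <assert.h>
#include <sched.h>      /* sched_yield (POSIX); stand-in for yieldThread() */
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    atomic_int  pending;  /* units of work currently available */
    atomic_bool done;     /* set once this GC phase has finished */
} WorkPoolSketch;

/* Hypothetical check: is there anything for this worker to do? */
static bool any_work_sketch(WorkPoolSketch *w)
{
    return atomic_load_explicit(&w->pending, memory_order_acquire) > 0;
}

/* Poll until work appears (returns true) or the phase ends (returns
 * false). Each fruitless poll yields the CPU instead of re-reading
 * shared memory in a tight loop. */
static bool wait_for_work(WorkPoolSketch *w, unsigned long *yields)
{
    assert(w != NULL);
    while (!any_work_sketch(w)) {
        if (atomic_load_explicit(&w->done, memory_order_acquire)) {
            return false;
        }
        sched_yield();
        if (yields != NULL) {
            (*yields)++;
        }
    }
    return true;
}
```
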
includes/rts/Constants.h
includes/rts/SpinLock.h
includes/rts/storage/SMPClosureOps.h
rts/sm/GC.c