Kicked the tires on SciPy. At first I got memory errors, but I was able to work around them. When I iterated over a 10,000-point feature class against a 50,000-point feature class, instead of comparing all 500,000,000 combinations at once, the SciPy method was ~15% faster than my original straight NumPy approach. Assuming the memory errors are manageable, SciPy does offer a performance improvement in this case.
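For anyone who wants to try the same idea, here is a rough sketch of the iterate-in-pieces approach. It is not my exact code: it assumes the SciPy call is scipy.spatial.distance.cdist, that the goal is the nearest point in the second feature class for each point in the first, and that the coordinates have already been read into NumPy arrays. The function name nearest_in_chunks and the chunk_size parameter are made up for illustration, and the demo arrays are random and scaled down.

```python
import numpy as np
from scipy.spatial.distance import cdist


def nearest_in_chunks(src, dst, chunk_size=256):
    """For each point in src, find the index of and distance to the nearest dst point.

    Only a (chunk_size x len(dst)) distance matrix exists at any one time,
    so the full len(src) x len(dst) matrix is never held in memory.
    """
    n = len(src)
    nearest_idx = np.empty(n, dtype=np.intp)
    nearest_dist = np.empty(n, dtype=float)
    for start in range(0, n, chunk_size):
        stop = min(start + chunk_size, n)
        # Euclidean distances from this block of src points to every dst point
        d = cdist(src[start:stop], dst)
        idx = d.argmin(axis=1)
        nearest_idx[start:stop] = idx
        nearest_dist[start:stop] = d[np.arange(stop - start), idx]
    return nearest_idx, nearest_dist


# Hypothetical stand-in data: random XY pairs, smaller than the real feature classes
rng = np.random.default_rng(0)
src = rng.random((2_000, 2))
dst = rng.random((5_000, 2))
idx, dist = nearest_in_chunks(src, dst)
```

The full 10,000 x 50,000 matrix of float64 distances would be about 4 GB, which is presumably where memory errors come from. A smaller chunk_size trades a bit of speed for a lower memory peak.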
It is good that Esri will be packaging and automatically installing SciPy with future ArcGIS releases.