All right, let me take you through this again using your new case. The Generate Near Table tool offers more of the options you seem to need than the Point Distance tool, so I will use Generate Near Table in the description below.
Your data:
For a particular state you have 223 sewerage points. Let's call them sewerPts.
And you have millions of urban centroid points across many states. Let's call them urbanPts.
Your goal:
"I want to know which ones are closest to that sewerage point within a 10 mile radius, outside of 10 mi I don't consider it".
Normally you should be able to use the Generate Near Table tool, enter 10 miles as the Search Distance, and uncheck the box "Find only closest matches" (unless you want only the single closest near point for each input point). The result should contain the distance from each sewer point to every urban point within 10 miles of it. If an urban point is more than 10 miles from a sewer point, you won't get a record for that pair.
If the above process gives you the 99999 error, the workaround I proposed earlier should give you an equivalent result. So, here is the updated workaround:
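To make the expected output concrete, here is a minimal pure-Python sketch of what a near table with a search distance and "all matches" contains. It is not how the tool is implemented; the function name near_table, the sample coordinates, and the assumption of planar coordinates measured in miles are all mine, for illustration only.

```python
import math

def near_table(in_pts, near_pts, search_radius):
    """Return (in_id, near_id, distance) rows for every pair within search_radius.

    Points are dicts of id -> (x, y) in a planar coordinate system whose
    unit matches search_radius (miles here, purely for illustration).
    """
    rows = []
    for in_id, (x1, y1) in in_pts.items():
        for near_id, (x2, y2) in near_pts.items():
            dist = math.hypot(x2 - x1, y2 - y1)
            # Pairs beyond the search radius produce no record at all.
            if dist <= search_radius:
                rows.append((in_id, near_id, dist))
    return rows

# Hypothetical sample data: two sewer points, three urban centroids.
sewer_pts = {1: (0.0, 0.0), 2: (30.0, 0.0)}
urban_pts = {10: (3.0, 4.0), 11: (0.0, 9.0), 12: (100.0, 100.0)}

table = near_table(sewer_pts, urban_pts, 10.0)
for row in table:
    print(row)
```

In this sample, sewer point 2 has no urban point within 10 miles, so it contributes no rows, and urban point 12 never appears.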
1. Run the Buffer tool on sewerPts with a distance of 10 miles and the ALL option for Dissolve Type. You should get a single multipart polygon with the overlapping boundaries dissolved. Let's call it buffer10miles.
2. Run the Clip tool to clip urbanPts by buffer10miles. Let's call the result urbanPts_in10miles. This result is equivalent to what would be found by a Search Distance of 10 miles from points in sewerPts to points in urbanPts. So now you have a subset of the original urbanPts, which I hope gives the next step a better chance of running to completion.
3. Run the Generate Near Table tool using sewerPts as the input features and urbanPts_in10miles as the near features. Enter 10 miles as the Search Distance and uncheck "Find only closest matches". This should give you the desired result. Let's call it sewerPts_urbanPts_nearTable.
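If you prefer to script the three steps above, they could be sketched roughly as follows. This is only a sketch under assumptions: the function name near_table_via_buffer and the workspace argument are hypothetical, the gp argument is expected to be the arcpy module (passed in so the function can be read and dry-run without ArcGIS installed), and the output names simply reuse the ones from the steps.

```python
import os

def near_table_via_buffer(gp, sewer_pts, urban_pts, workspace, distance="10 Miles"):
    """Buffer -> Clip -> Generate Near Table. Pass the arcpy module as gp."""
    buffer_fc = os.path.join(workspace, "buffer10miles")
    clipped_fc = os.path.join(workspace, "urbanPts_in10miles")
    out_table = os.path.join(workspace, "sewerPts_urbanPts_nearTable")

    # Step 1: buffer all sewer points and dissolve into one multipart polygon.
    gp.Buffer_analysis(sewer_pts, buffer_fc, distance, dissolve_option="ALL")

    # Step 2: keep only the urban points that fall inside the buffer.
    gp.Clip_analysis(urban_pts, buffer_fc, clipped_fc)

    # Step 3: near table from sewer points to the reduced urban point set.
    # closest="ALL" corresponds to unchecking "Find only closest matches".
    gp.GenerateNearTable_analysis(sewer_pts, clipped_fc, out_table,
                                  search_radius=distance, closest="ALL")
    return out_table

# Example call (requires ArcGIS; the geodatabase path is a placeholder):
# import arcpy
# near_table_via_buffer(arcpy, "sewerPts", "urbanPts", r"C:\data\work.gdb")
```

Passing the module in rather than importing it at the top is just a convenience for testing; with ArcGIS available you can equally import arcpy directly and call the three tools in sequence.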
You said "In addition, on overlapping sewerage sources, I would like to distinguish which urban centroid is closer". I am not sure whether you meant to reverse the analysis, that is, to generate a near table from urbanPts to sewerPts within the distance. If an urban point finds multiple sewer points, you can use the Summary Statistics tool to get the minimum NEAR_DIST value and the corresponding sewer point FID.
I am not very clear about your future prediction part. But unless you want to change 10 miles to a different value, you should run the same process even if you include more urban centroid points.
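As an illustration of that "minimum NEAR_DIST per point" idea, here is a small pure-Python sketch that works on near-table rows. The field names IN_FID, NEAR_FID and NEAR_DIST are the ones the near table uses; the function name closest_per_group and the sample rows are made up, and I am assuming the table was generated with sewerPts as input, so NEAR_FID identifies the urban point.

```python
def closest_per_group(rows, group_field, dist_field="NEAR_DIST"):
    """For each value of group_field, keep the row with the smallest distance."""
    best = {}
    for row in rows:
        key = row[group_field]
        if key not in best or row[dist_field] < best[key][dist_field]:
            best[key] = row
    return best

# Hypothetical near-table rows: IN_FID = sewer point, NEAR_FID = urban point.
rows = [
    {"IN_FID": 1, "NEAR_FID": 10, "NEAR_DIST": 5.0},
    {"IN_FID": 2, "NEAR_FID": 10, "NEAR_DIST": 3.5},
    {"IN_FID": 2, "NEAR_FID": 11, "NEAR_DIST": 8.0},
]

# Group by urban point to find which sewer point is nearest to each one.
closest = closest_per_group(rows, "NEAR_FID")
for urban_id, row in closest.items():
    print(urban_id, row["IN_FID"], row["NEAR_DIST"])
```

If you run the analysis in the reverse direction (urbanPts as input), group by IN_FID instead, and NEAR_FID of the kept row is then the nearest sewer point.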
Regarding the "Maximum number of closest matches" option, do you want to find only a certain max number (say 100 or 500) of urban centroid points from each sewer point? If not, don't enter anything here.
I would suggest that you test the process on small sample datasets first: select 5 sewer points, use a smaller distance so that only a small number of urban centroid points are found, then review the table and verify that the result is what you expect. Then apply the process to the full dataset.
Hope that helps.