Finding nearest centroids in a descending manner

Zeke · 12-01-2020

Hello,

My task is to combine census tracts for analysis. Start with the tract whose household count is below n but closest to n, where n is some arbitrary number. Combine it with the tract that has the nearest centroid, and keep combining until the total reaches n or more. Then proceed with the next uncombined tract below n, and so on. Number of households is a field in the data.

What I've done so far is create centroids for all tracts. I can run the Near tool or the Generate Near Table tool to get each tract's nearest centroid, but going through the tracts manually (2800 of them) is not something I want to do.

I'm pretty sure this will require a Python script (could be wrong!), which is fine, I can do that, but I'm stuck on how the workflow should go. Any suggestions? Thanks.

JoshuaBixby · 12-01-2020

How do you determine which census tracts you start with? Do you have a bunch of scattered census tracts that are your seed tracts?

Zeke · 12-01-2020

No. The department that's requested this has divided the state into 4 regions. The nearest centroid has to be in the same region. This is fine, definition queries can take care of that.
Then, say n = 2100. For all tracts where the number of households is less than 2100, start with the tract that's closest to 2100, e.g. 1997. That tract should be combined with the tract whose centroid is nearest. If the combination has 2100 or more households, good, that study area is done. If not, calculate the centroid of the new combined area, find its nearest centroid, check against n again, and repeat.
Basically, they want to make sure each study area has at least n households, based on the household counts of its tracts.
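
To make the workflow concrete, here is a rough sketch of that loop in plain Python (no arcpy), just to show the order of operations. Several things in it are assumptions rather than part of the request: the dictionary keys (`id`, `region`, `households`, `x`, `y`) are made-up names, the nearest neighbour can be any other tract or combined area in the same region regardless of its household count, distances are planar (so projected coordinates are assumed), and the centroid of a combined area is approximated as the mean of its member tract centroids. In a real run the combined polygons would probably be dissolved and the centroid recalculated from the geometry.

```python
from math import hypot


def combine_tracts(tracts, n):
    """Greedily merge tracts until each study area has at least n households.

    tracts: list of dicts with hypothetical keys
            'id', 'region', 'households', 'x', 'y' (centroid coordinates).
    Returns a list of study areas, each a dict with
            'members', 'region', 'households', 'x', 'y'.
    """
    # Every tract starts out as its own one-member study area.
    groups = [
        {"members": [t["id"]], "region": t["region"],
         "households": t["households"], "x": t["x"], "y": t["y"]}
        for t in tracts
    ]
    stranded = []  # areas still below n with nothing left in their region

    while True:
        small = [g for g in groups if g["households"] < n]
        if not small:
            break
        # The under-n area closest to n is simply the largest one below n.
        current = max(small, key=lambda g: g["households"])

        while current["households"] < n:
            # Candidates: every other area in the same region (assumption).
            others = [g for g in groups
                      if g is not current and g["region"] == current["region"]]
            if not others:
                break
            nearest = min(
                others,
                key=lambda g: hypot(g["x"] - current["x"], g["y"] - current["y"]),
            )
            # Approximate the new centroid as the mean of member centroids.
            a, b = len(current["members"]), len(nearest["members"])
            current["x"] = (current["x"] * a + nearest["x"] * b) / (a + b)
            current["y"] = (current["y"] * a + nearest["y"] * b) / (a + b)
            current["members"] = current["members"] + nearest["members"]
            current["households"] += nearest["households"]
            groups = [g for g in groups if g is not nearest]

        if current["households"] < n:
            # Nothing left to merge with; set it aside so the loop can end.
            groups = [g for g in groups if g is not current]
            stranded.append(current)

    return groups + stranded
```

The region check inside the loop plays the same role as the definition queries. Any area that ends up in `stranded` is one that could not reach n because its region ran out of tracts, and would need a manual decision.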