I have census block data with the total number of children in each block. Instead of using the census blocks themselves as my unit of analysis, I constructed circular neighborhoods (a sample of equal-sized circles, each one mile in diameter) and overlaid them on the census blocks.
I would like to aggregate the number of children from the census block level to the circular neighborhood level.
Small census blocks may fall entirely inside a circular neighborhood, while others are cut by the circle boundary so that only part of the block lies inside. For a block that is cut, if I could find the share of its area that lies inside the circle and multiply that share by the block's total number of children (assuming, of course, that children are evenly distributed within the block), I would get the number of children in the part of the block inside the circle.
I still cannot think of a way to program this in Python. How can I find the area of the part of each block that lies inside a circle?
Hope my description of the problem is clear!
Why do you want to use circles? Census data are aggregated at a variety of scales. Is the next aggregation level up not appropriate, and if not, why not?
I want to use circles because decennial census boundaries shift over the years, which gives me inconsistent measures for variables that I need to compare across years. Also, the reasons for the boundary shifts might be correlated with unobservables that affect my outcome variable, which would bias my estimation results.
You are in a bit of a catch-22 then, since the circles are also going to cause issues because of the assumption that people are uniformly distributed within the census units. I guess the country of residence dictates whether existing boundaries have been aggregated and/or divided over time. It does make things very difficult when boundaries are thrown out and redrawn at one or several points in time.
I find certain GIS analyses are easier when converted from vector to raster. In this case, since you are already making some pretty large assumptions about distribution of children within the census blocks, I don't think rasterizing the data should be viewed as introducing error or uncertainty.
If you rasterize the census blocks and assign each raster cell an equal fraction of the children in its census block, you can overlay your circles on top of the raster layer and sum up the cell values inside each circle to get new totals.
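A minimal sketch of that raster idea, using only NumPy. It assumes the blocks have already been rasterized into an array of block ids; the function name, the argument names, and the tiny 4 x 4 grid are made up for illustration, and a real analysis would use far smaller cells.

```python
import numpy as np

def children_in_circle(block_ids, children_by_block, cell_size, center, radius):
    """Sum per-cell child counts over cells whose centers fall inside a circle.

    block_ids: 2D integer array holding one census-block id per raster cell.
    children_by_block: dict mapping block id -> total children in that block.
    cell_size, center, radius: all in the same map units (miles here).
    """
    # Spread each block's children evenly over the cells it covers.
    per_cell = np.zeros(block_ids.shape, dtype=float)
    for block_id, children in children_by_block.items():
        mask = block_ids == block_id
        n_cells = mask.sum()
        if n_cells:
            per_cell[mask] = children / n_cells

    # Cell-center coordinates, with the origin at the corner of cell [0, 0].
    rows, cols = np.indices(block_ids.shape)
    x = (cols + 0.5) * cell_size
    y = (rows + 0.5) * cell_size

    # A cell counts as inside when its center lies within the circle.
    inside = (x - center[0]) ** 2 + (y - center[1]) ** 2 <= radius ** 2
    return float(per_cell[inside].sum())

# Hypothetical example: a 1 x 1 mile area split into a left block (id 0)
# and a right block (id 1), rasterized at 0.25-mile cells.
block_ids = np.array([[0, 0, 1, 1]] * 4)
children_by_block = {0: 80, 1: 40}
total = children_in_circle(block_ids, children_by_block,
                           cell_size=0.25, center=(0.5, 0.5), radius=0.5)
print(total)
```

With cells this coarse the result is only a rough approximation, since each cell is counted as wholly in or wholly out; shrinking the cell size brings the sum closer to the exact area-weighted value.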
As long as the areas where people can't live can be removed first, i.e., the nonstationarity note.
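For comparison, the area-share calculation from the question can also be done directly on the vector geometries. The sketch below assumes the shapely library is available and that all coordinates are in a projected system with consistent units; the function name, the `uninhabitable` argument, and the example blocks are hypothetical. Note that `buffer` returns a polygon approximation of the circle, so areas are very slightly underestimated.

```python
from shapely.geometry import Point, box

def children_in_circle(blocks, center, radius, uninhabitable=None):
    """Area-weighted count of children inside one circular neighborhood.

    blocks: list of (polygon, children) pairs, one per census block.
    uninhabitable: optional geometry (e.g. water, parks) to remove from
        every block before computing area shares.
    """
    circle = Point(center).buffer(radius)  # polygon approximating the circle
    total = 0.0
    for polygon, children in blocks:
        # Optionally drop the area where nobody can live.
        livable = polygon if uninhabitable is None else polygon.difference(uninhabitable)
        if livable.area == 0:
            continue
        # Share of the block's livable area that falls inside the circle.
        share = livable.intersection(circle).area / livable.area
        total += share * children
    return total

# Hypothetical example: two adjacent 1 x 1 mile blocks and a circle of
# one-mile diameter centered on their shared edge.
blocks = [(box(0, 0, 1, 1), 80), (box(1, 0, 2, 1), 40)]
total = children_in_circle(blocks, center=(1.0, 0.5), radius=0.5)
print(round(total, 1))
```

Blocks that lie entirely inside the circle get a share of 1 and blocks that do not touch it get a share of 0, so the same loop handles whole and cut blocks alike. With many blocks and circles, a spatial index or a geopandas overlay would likely be faster than looping over every pair.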