Can I "see" the "Spatial Index"?

09-29-2011 01:06 PM
JoanneMcGraw
Frequent Contributor
I'd like to know if it is possible to "display" the "grid" generated by a spatial indexing process. I'd like to be able to evaluate how many features are in each individual "cell".

There's not a lot of documentation about the Spatial Index, and what little I have found mostly repeats the same things without providing additional information. For example, in this page http://www.spatialvision.com.au/index.php/technical-tips/266.html, about halfway down, it says:

"In this example, the feature class has a spatial index of 2407 (the default created when the data has been loaded from ArcCatalog). This default index ... In this example, we will increase the spatial index to 10,000 and re-evaluate the statistics."

What are the numbers 2407 and 10,000 referring to? Does a spatial index of 10,000 mean a 100x100 grid cell size? A 10,000x10,000 grid cell size? Nothing like that at all?

And, if there are, say, 100000 spatial index "cells" and the current extent is in the southwest of the layer, does the order of the cell records impact how fast it can find feature records that fall within that extent? Does it have to search the spatial index "cells" to find out which ones are even related to the current extent before looking at the feature records within those "cells" to see which individual records actually fall within the extent? If so, the order of the spatial index records is going to be relevant...I would think.

Does anyone know where I might be able to learn this kind of information?

Cheers
jtm
5 Replies
VinceAngelo
Esri Esteemed Contributor
First off, you need to define "spatial index". There are several different possible spatial types,
and each may have a different indexing algorithm. If you restrict yourself just to Esri spatial
index types, then you just need a tool to create the one, two, or [hopefully not] three level
indexing scheme. Once you work out how the second and third level index indicators are
bit-masked into the gx and gy values, grid population is a trivial exercise using either the
SDELOB/SDEBINARY Sx table or ST_GEOMETRY's Oracle Sn_IDX$ IOT.

While the documentation details the logic used to assemble the theoretical grid (the cell size
is in ground units), no such grid is ever created (just the pointers to the x,y pair values).
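
To make the "cell size is in ground units" point concrete, here is a minimal Python sketch of how a feature's envelope could be mapped to first-level grid cells. It assumes a plain floor-division scheme from a hypothetical origin at (0, 0); the function name `cells_for_envelope` is made up for illustration, and this is not the actual ArcSDE implementation (which also handles the second and third level bit-masking mentioned above).

```python
import math

def cells_for_envelope(minx, miny, maxx, maxy, grid_size, origin=(0.0, 0.0)):
    """Return the (gx, gy) cells an envelope overlaps, for a single grid level.

    Assumption: cells are squares of `grid_size` ground units, numbered by
    floor division from `origin`. Illustrative only.
    """
    ox, oy = origin
    gx_min = math.floor((minx - ox) / grid_size)
    gx_max = math.floor((maxx - ox) / grid_size)
    gy_min = math.floor((miny - oy) / grid_size)
    gy_max = math.floor((maxy - oy) / grid_size)
    return [(gx, gy)
            for gx in range(gx_min, gx_max + 1)
            for gy in range(gy_min, gy_max + 1)]

# A 500 x 500 unit envelope that sits inside one 10000-unit cell
small = cells_for_envelope(12000, 23000, 12500, 23500, 10000)
# A 200 x 200 unit envelope that straddles a cell corner
corner = cells_for_envelope(19900, 29900, 20100, 30100, 10000)
print(len(small), len(corner))  # 1 4
```

Under these assumptions, no grid geometry is ever stored: each feature only contributes one (gx, gy) record per cell its envelope touches, which is why a small feature on a cell corner produces four records.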

I did some research into generating a shapefile from my se_layergrid object in se_toolkit,
but never needed to implement the 'visit' operator which would have been necessary to
populate a data density map. Keep in mind that such a table could be quite large (a true
16-bit raster might be a better output product). In the end, I wasn't ever able to convince
myself that the benefit of adding such a feature would be worth the effort (vice completing
other tasks). As a rule, the 'si_stats' report (by 'sdelayer' or se_toolkit's 'sdestats') provides
enough density information to get a feel for index suitability without plotting it.

Yes, spatial fragmentation usually plays a much larger role in performance than the grid size
(except when the grid size is *really* wrong).

- V
JoanneMcGraw
Frequent Contributor
Vangelo,

Thank you for your response.

By "spatial index", I mean the one that is automatically generated when I import a shapefile into SDE through ArcCatalog. In ArcCatalog, it tells me (on the Properties' Index tab) that the Feature Class has a spatial index of 10000 for Grid 1.

When I run sdelayer on the feature class, I get the following response:
Layer 8822 Spatial Index Statistics:
Level 1,   Grid Size 10000
|-------------------------------------------------------------------
| Grid Records: 969804
| Feature Records: 920958
| Grids/Feature Ratio:  1.05
| Avg. Features per Grid: 105.32
| Max. Features per Grid: 3318
| % of Features Wholly Inside 1 Grid: 96.11
|-------------------------------------------------------------------
|               Spatial Index Record Count By Group
| Grids:      <=4    >4    >10    >25    >50    >100   >250   >500
|---------- ------ ------ ------ ------ ------ ------ ------ ------
| Features: 920519    439    182     77     34     15      3      1
| % Total:     100%     0%     0%     0%     0%     0%     0%     0%
|-------------------------------------------------------------------


From what I understand in the documentation, all the values reported here fall within the tolerances they suggest.
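
As a sanity check on how the figures in the report relate to each other, the arithmetic can be sketched in Python. The ratio and the histogram check use only numbers from the report above. The estimate of occupied cells rests on an assumption that is not confirmed anywhere in this thread: that "Avg. Features per Grid" is the number of grid records divided by the number of distinct cells that contain at least one feature.

```python
# Figures copied from the sdelayer report above
grid_records = 969804
feature_records = 920958
avg_features_per_grid = 105.32

# Grids/Feature Ratio: average number of index records per feature
ratio = grid_records / feature_records
print(round(ratio, 2))  # 1.05

# The first two histogram columns (<=4 and >4) add up to the feature total,
# which suggests each ">N" column also includes the columns to its right
le_4, gt_4 = 920519, 439
print(le_4 + gt_4 == feature_records)  # True

# ASSUMPTION: avg = grid records / occupied cells, so invert it
occupied_cells = grid_records / avg_features_per_grid
print(round(occupied_cells))
```

If that assumption holds, the roughly one million polygons are spread over only a few thousand occupied cells, with the busiest cell (3318 features) holding about thirty times the average.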

I'm not exactly having a problem with this, I am merely trying to understand how it works. The feature class has approximately 1 million polygon records in it. When I am querying it (using "Select by Location" in ArcMap), it takes longer to return the results of some queries than others and I'm trying to understand why. It's not just the number of features returned...if I have an extent that returns 100 features in one area and 100 features in another area, they don't necessarily take the same amount of time to respond. I'm assuming this has to do with the manner in which the spatial indexing is taken into account and then the records within each cell are evaluated.

So, for example, I'm hypothesizing that if the extent of the rectangle I am searching with spans four different spatial index "cells", it will take longer than if the rectangle I search falls entirely within a single "cell". Additionally, if a given spatial index cell has 3318 features in it (as in the sdelayer report) and another has only 105, if the rectangles searched fall within each one of those individually, I would expect that the rectangle that falls within the grid cell with only 105 features would respond more quickly than the rectangle that falls within the grid cell with 3318 features.
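
The hypothesis above can be tried out on a toy model. The following Python sketch builds a single-level grid index over synthetic envelopes and counts, for a few search rectangles, how many cells are touched and how many candidate features would then need an exact test. Everything here is hypothetical (the data, the function names, and the floor-division cell scheme from a (0, 0) origin), so it illustrates the reasoning and says nothing about how ArcSDE actually executes the query.

```python
import math
from collections import defaultdict

def cells(minx, miny, maxx, maxy, size):
    """Yield (gx, gy) cells overlapped by an envelope (assumed scheme)."""
    for gx in range(math.floor(minx / size), math.floor(maxx / size) + 1):
        for gy in range(math.floor(miny / size), math.floor(maxy / size) + 1):
            yield gx, gy

def build_index(envelopes, size):
    """Map each cell to the ids of features whose envelope touches it."""
    index = defaultdict(set)
    for fid, env in enumerate(envelopes):
        for cell in cells(*env, size):
            index[cell].add(fid)
    return index

def candidates(index, query, size):
    """Return (cells touched, candidate feature ids) for a search rectangle."""
    found = set()
    touched = 0
    for cell in cells(*query, size):
        touched += 1
        found |= index.get(cell, set())
    return touched, found

SIZE = 10000
# Synthetic data: 300 small features in cell (0, 0), 5 in cell (1, 0)
dense = [(i * 30, i * 30, i * 30 + 20, i * 30 + 20) for i in range(300)]
sparse = [(10000 + i * 2000, 500, 10020 + i * 2000, 520) for i in range(5)]
index = build_index(dense + sparse, SIZE)

queries = {
    "dense": (100, 100, 200, 200),        # inside the crowded cell
    "sparse": (10100, 100, 10200, 200),   # inside the sparse cell
    "span": (9900, 100, 10100, 200),      # straddles both cells
}
for name, q in queries.items():
    touched, found = candidates(index, q, SIZE)
    print(name, touched, len(found))
```

In this model the three rectangles are similar in size, yet they yield 300, 5, and 305 candidates respectively, which matches the intuition that both the number of cells touched and the population of each cell affect the work done.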

I'd like to be able to visually check to see if my assumptions are correct by overlaying the spatial index grid cells on the feature class itself as well as the rectangle polygons I am searching with.
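
One low-tech way to get such an overlay, independent of any Esri tooling, is to generate the cell rectangles yourself and load them as polygons. The Python sketch below emits one WKT polygon per cell covering a given extent. It assumes the grid is aligned to multiples of the grid size from a (0, 0) origin, which may not match the real layer (the true origin depends on the layer's coordinate reference), and `grid_cells_wkt` is a hypothetical helper name.

```python
import math

def grid_cells_wkt(minx, miny, maxx, maxy, grid_size):
    """Yield (gx, gy, wkt) for every cell that covers the given extent.

    Assumption: cells are aligned to multiples of grid_size from (0, 0).
    """
    for gx in range(math.floor(minx / grid_size), math.floor(maxx / grid_size) + 1):
        for gy in range(math.floor(miny / grid_size), math.floor(maxy / grid_size) + 1):
            x0, y0 = gx * grid_size, gy * grid_size
            x1, y1 = x0 + grid_size, y0 + grid_size
            wkt = (f"POLYGON (({x0} {y0}, {x1} {y0}, {x1} {y1}, "
                   f"{x0} {y1}, {x0} {y0}))")
            yield gx, gy, wkt

# Cells covering a 20000 x 10000 unit extent with a 10000-unit grid
rows = list(grid_cells_wkt(5000, 5000, 25000, 15000, 10000))
for gx, gy, wkt in rows:
    print(gx, gy, wkt)
```

The resulting WKT could be loaded into any tool that reads it and drawn over the feature class and the search rectangles; a per-cell feature count would still have to come from the index table or from a spatial join.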

I'll look into your comment re: "generating a shapefile from my se_layergrid object in se_toolkit" further. But, if you are already aware this will not provide what I am looking for, I hope you'll let me know and save me from wasting a bunch of time.

If you know that what I am trying to do is not possible without a great deal of ESRI developer expertise, I hope you'll also let me know.

Cheers,
jtm
VinceAngelo
Esri Esteemed Contributor
I wasn't recommending that you modify my 'C' source (I haven't found a customer who
wanted to pay to add that functionality), but if you can read it, you might better understand
what's happening, and thereafter code your own polygon population module (I don't
recommend an Esri novice get anywhere near the ArcSDE 'C' API).

You seem to have captured many of the issues surrounding variability in performance;
there is no one solution that will address these divergent constraints.  If you haven't already
done so, you should search on "spatial fragmentation", since that is something that you
may be able to control.  You can also look at the size histogram generated by 'sdestats
-o size -F 0,5000,20 -f -'.

Note that any number of different spatial storage algorithms might be used, depending
on how you have your instance configured, so "whatever the default is" might be different
from instance to instance (or might change suddenly within one instance, if the DBTUNE
environment was altered).

- V
jagadeeshrao
Deactivated User
Hi,

You can see the spatial index information for each layer using the GDBT tool.

Download the GDBT tool and install it in your local ArcGIS Desktop. You can then see the complete spatial index information in ArcCatalog (if you are using an ArcSDE database).

Thanks
JoanneMcGraw
Frequent Contributor
Ah, jagadeesh88, thank you very much! That's exactly the type of thing I was looking for. It would be better if I could display it in ArcMap as well (9.3 doesn't provide it in the Extensions list) so I could also display it on top of other things but beggars can't be choosers.

Vangelo, thank you for the additional info. I will certainly have a look into "spatial fragmentation" and see what I can learn from sdestats. I had already had a look at your se_toolkit and determined modifying it was beyond my current skillset. Interesting to look around in though.

I'll report back if I learn anything that I think might interest someone else in future.

Cheers,
jtm