Which is faster: fewer larger features or more smaller features?

12-04-2013 09:31 AM
StormwaterWater_Resources
New Contributor III
I've made a county-wide LIDAR-based 1 ft contour feature class in a local fGDB.  There are ~2.6 million features across about 550 contour intervals (possibly on the order of billions of vertices, at least hundreds of millions).  The extent is about 15 miles by 30 miles.  Currently many of the lines are single features over 20 miles long with perhaps thousands of vertices (I'm speculating).  It takes about 6 minutes to draw at full extent, which is pretty much unusable.  If I diced this up with, say, a 1 mile x 1 mile grid to make more, smaller features, would the performance be better or worse?

Alternatively, would a local "personal" SDE GDB on SQL Server Express speed this up?


10.1 SP1, Win 7 64-bit, Core i7 870 (2.93 GHz), 10 GB RAM.
11 Replies
michaelcollins1
Occasional Contributor III
Whatever you end up using, dissolving the contour lines by height might help further. I use FGDB for such things.
MichaelVolz
Esteemed Contributor
Why don't you apply a scale dependency to the data so only a small number of contours need to be drawn at any given time?

You could create major contours (e.g., 10 or 20 ft intervals) and show those at smaller scales, showing the 1 ft contours only at the largest scales.
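One way to implement that is to flag the major contours with a field and drive both the symbology and the scale ranges from it. A minimal arcpy sketch, assuming an illustrative feature class path and a CONTOUR elevation field (neither is from the original post):

```python
import arcpy

contours = r"C:\data\contours.gdb\contours_1ft"   # illustrative path

# Add a short-integer flag and mark every 10 ft contour as "major".
arcpy.AddField_management(contours, "IS_MAJOR", "SHORT")
arcpy.CalculateField_management(
    contours, "IS_MAJOR",
    "1 if !CONTOUR! % 10 == 0 else 0",
    "PYTHON_9.3")

# A layer with the definition query "IS_MAJOR = 1" can then be shown at
# smaller scales, with the full 1 ft layer limited to the largest scales.
```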
VinceAngelo
Esri Esteemed Contributor
(Accepted solution)
Fewer rows are faster, except when there's data that can't be rendered.

Best practice would be to intersect the contour lines against a regular grid of at least 5x5 over the study area, then dissolve by grid cell and elevation (unioning nearby shapes with the same attribute).  If you then set a scale dependency so that no more than 4-9 grid cells are rendered at one time, the draw request should fly (at which point it won't matter if it's got the added database overhead of SQL Server Express, which would generally slow access versus a local FGDB).

- V
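A minimal arcpy sketch of the intersect-and-dissolve workflow described above. The paths, the pre-built 5x5 grid, and the GRID_ID/CONTOUR field names are assumptions, not from the original reply; the grid itself could come from the Create Fishnet or Grid Index Features tool:

```python
import arcpy

contours = r"C:\data\contours.gdb\contours_1ft"   # illustrative paths
grid     = r"C:\data\contours.gdb\grid_5x5"       # pre-built regular grid
split_fc = r"C:\data\contours.gdb\contours_split"
diced_fc = r"C:\data\contours.gdb\contours_diced"

# Split every contour line at the grid-cell boundaries.  The output keeps
# attributes from both inputs, so each piece carries its elevation and the
# id of the grid cell it falls in.
arcpy.Intersect_analysis([contours, grid], split_fc, "ALL")

# Re-assemble the pieces into one multipart feature per (grid cell, elevation),
# collapsing millions of rows into at most (cells x contour levels) rows.
arcpy.Dissolve_management(split_fc, diced_fc, ["GRID_ID", "CONTOUR"],
                          "", "MULTI_PART")

# A scale range on the resulting layer then keeps roughly 4-9 cells on
# screen at a time, as suggested above.
```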
MarcoBoeringa
MVP Regular Contributor
Two remarks on these topics:

I've made a county-wide LIDAR based 1ft contour feature class in a local fGDB.  There are ~2.6 million features for about 550 contour intervals (possibly order of billions of vertices, at least hundreds of millions).


First of all, contours generated from extremely detailed raster or LIDAR data almost always need generalization to be of any real use. There is little point in maintaining all that detail: the GIS analysis options for something like contour lines are very limited (compared with rasters, for example), and mostly contours are simply used for display.

See the "An overview of the Generalization toolset" Help topic, and especially the Simplify Line tool, for more information about generalization.
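For reference, a minimal Simplify Line call; the path and the 2 ft tolerance are only illustrative and should be tuned to the data:

```python
import arcpy

contours   = r"C:\data\contours.gdb\contours_1ft"              # illustrative paths
simplified = r"C:\data\contours.gdb\contours_1ft_simplified"

# Simplify Line (Cartography > Generalization).  POINT_REMOVE with a small
# tolerance drops redundant vertices from LIDAR-derived lines.
arcpy.SimplifyLine_cartography(contours, simplified, "POINT_REMOVE", "2 Feet")
```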

There are ~2.6 million features for about 550 contour intervals (possibly order of billions of vertices, at least hundreds of millions).
...
It takes about 6 minutes to draw at full extent which is pretty much unusable.


What would you expect when attempting to display 2.6 million features with hundreds of millions of vertices? Even "Google Earth"-type mega spatial databases are only feasible because of very smart indexing schemes that send the viewer only the minimum data needed to fill the display at a particular scale and a well-chosen level of detail.

I therefore have no idea what you are trying to achieve by drawing all features at full extent. In terms of display, there is no use for a completely cluttered screen swamped by 2.6M features.

Realistically, any display of GIS data should probably be limited to <100,000 features per screen to be of any real use in distinguishing individual features (a 1920x1080 full HD display has only about 2M pixels anyway).

The main solution to your problem is therefore simple: set minimum and maximum display scales to realistic values, so that all features are never drawn at full extent...
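A small arcpy.mapping sketch of setting those display scale limits on the layer; the layer name and scale thresholds are illustrative assumptions:

```python
import arcpy

# Restrict the 1 ft contour layer to large scales only (thresholds are
# examples; pick values that keep the feature count per screen manageable).
mxd = arcpy.mapping.MapDocument("CURRENT")
df = arcpy.mapping.ListDataFrames(mxd)[0]
lyr = arcpy.mapping.ListLayers(mxd, "contours_1ft", df)[0]
lyr.minScale = 10000   # layer draws only when zoomed in beyond 1:10,000
lyr.maxScale = 0       # no limit on how far in you can zoom
arcpy.RefreshActiveView()
```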
WilliamCraft
MVP Regular Contributor
As an additional suggestion (something fairly easy and quick to try), it might be good to review the grid sizes of the contour feature class's spatial index.  In general, I have found that setting grid 1 to 1000 is a good start, and setting grid 2 to 3x the size of grid 1 is also a good start for a line feature class like this (in this case grid 2 would be 3000, since grid 1 is 1000).  Grid 3 can be left at 0 for the time being.  Give it a try and see what happens when rendering the feature class.
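If you want to script that rather than use the catalog UI, a rough sketch using the Remove/Add Spatial Index tools with the grid sizes suggested above (the path is illustrative, and whether a rebuild helps depends on the data, as the next reply notes):

```python
import arcpy

contours = r"C:\data\contours.gdb\contours_1ft"   # illustrative path

# Drop the existing index and rebuild it with the suggested grid sizes:
# grid 1 = 1000, grid 2 = 3 x grid 1 = 3000, grid 3 unused.
arcpy.RemoveSpatialIndex_management(contours)
arcpy.AddSpatialIndex_management(contours, 1000, 3000, 0)
```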
VinceAngelo
Esri Esteemed Contributor
Multiple levels of spatial index gridding have a performance cost.  They should only be used when needed (e.g., when there are effectively two sets of data with different intrinsic sizes within the table).

If you group the features to an overlay fishnet, there is no reason to have more than one grid size. The index grid size should be at least twice the size of the fishnet distance (if it's a large net [e.g., 4x4 - 5x5 over the area], then you can go as small as 133% of the fishnet, but never down to the fishnet size itself, since that would generate 4-9 index rows for each feature -- far too many false positives in the index).

- V
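As a worked example of that sizing rule (not from the original reply), applied to the 1 square mile tiles mentioned elsewhere in this thread:

```python
# Rule of thumb from the reply above, applied to a 1 mile (5280 ft) fishnet cell.
fishnet_cell = 5280.0                  # ft

recommended = 2.0 * fishnet_cell       # at least 2x the fishnet -> 10560 ft
floor_large_net = 1.33 * fishnet_cell  # ~133% floor for large nets -> ~7022 ft
# Never use 5280 itself: each feature would then land in 4-9 index cells.
```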
StormwaterWater_Resources
New Contributor III
To address some of these posts: I think the most promising approach is the dissolve.  I'm going to experiment with that some more, but on subsets of the data, dissolving on elevation into multi-part lines made a big difference in drawing speed.  I plan to more fully test vangelo's suggestion in message #4 below.

mboeringa2010's comments were not invalid, but they did not really answer the question.  Certainly, when I'm done, I expect to employ many of these data-restriction techniques such as scale constraints and DQs, though I'm not interested in simplifying/generalizing.

I did split my contours up into 1 sq mile pieces.  Overall it does not look like that makes a big difference in performance, but I'm writing a Python button plug-in to load only those chunks necessary to cover the current display extent. 

Just out of curiosity I plan to dissolve the whole massive dataset to see what happens.
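A rough sketch of what that extent-driven loader could look like with arcpy.mapping. The tile geodatabase, the "contours_&lt;PageName&gt;" naming scheme, and the index-grid field are all assumptions; in a Python add-in, this function would be wired to the button's onClick handler:

```python
import arcpy

# Illustrative names: a GDB of per-cell contour chunks named like
# "contours_<PageName>", plus the index grid that was used to dice them.
TILE_GDB = r"C:\data\contour_tiles.gdb"
INDEX_GRID = TILE_GDB + "\\index_grid"

def _overlaps(e1, e2):
    """True if two arcpy Extent objects overlap."""
    return not (e1.XMax < e2.XMin or e1.XMin > e2.XMax or
                e1.YMax < e2.YMin or e1.YMin > e2.YMax)

def load_visible_tiles():
    """Add only the contour chunks that intersect the current display extent."""
    mxd = arcpy.mapping.MapDocument("CURRENT")
    df = arcpy.mapping.ListDataFrames(mxd)[0]
    loaded = set(lyr.name for lyr in arcpy.mapping.ListLayers(mxd, "*", df))

    with arcpy.da.SearchCursor(INDEX_GRID, ["PageName", "SHAPE@"]) as rows:
        for tile_name, shape in rows:
            layer_name = "contours_{0}".format(tile_name)
            if layer_name in loaded or not _overlaps(shape.extent, df.extent):
                continue
            arcpy.MakeFeatureLayer_management(TILE_GDB + "\\" + layer_name,
                                              layer_name)
            arcpy.mapping.AddLayer(df, arcpy.mapping.Layer(layer_name))

    arcpy.RefreshActiveView()
```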
StormwaterWater_Resources
New Contributor III
Fewer rows are faster, except when there's data that can't be rendered.


There's the quick and dirty answer to the original question from Vangelo (#4 below).

Best practice would be to intersect the contour lines against a regular grid
of at least 5x5 over the study area, then dissolve by grid cell and elevation
(unioning nearby shapes with the same attribute).


Before I tried this I reassembled clipped subsets; I'll describe that in a minute.  However, I finally tried the exact method from Vangelo.  Again, my original data set (*un*dissolved) had ~2.6 million features, many with thousands of vertices.  I created a 15x7 grid and then attempted to intersect the whole thing.  After four days the process finally crashed (out of memory).  I tried the exact same steps on a far smaller subset, and it worked perfectly.  The result was that my contours were split up by my grid and dissolved, but otherwise identical to the original contours.  Note: it's the dissolve that really makes the difference in performance.

The way I actually worked with the entire data set was by clipping and then merging.  As far as I can tell this accomplishes the exact same thing: the entire set of contours was diced into pieces and dissolved, but otherwise remained identical to the original.  Here are the steps I took:

- Create an index grid of squares (Toolbox -> Cartography -> Data Driven Pages -> Grid Index Features)
- Create a model (actually I scripted it, but the model iterator works great too):
    - Use a Feature Selection iterator over the index grid
    - Clip the contours with the output of the iterator
    - Output the new clipped feature classes into a GDB
- Dissolve the individual clipped feature classes from above (using another model with a Feature Classes iterator over the GDB)
- Merge the dissolved clips, again using a Feature Classes iterator.

I suppose this could all be grouped into one model but that's not the way I did it.  I am of course using a scale restriction.  The final product is slower than our original contour layer, but it's certainly acceptable performance.
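For reference, roughly the same clip/dissolve/merge sequence expressed as a standalone arcpy script rather than models. The paths, the PageName field written by Grid Index Features, and the CONTOUR elevation field are illustrative assumptions:

```python
import arcpy
import os

CONTOURS = r"C:\data\contours.gdb\contours_1ft"   # illustrative paths
GRID     = r"C:\data\contours.gdb\index_grid"     # from Grid Index Features
WORK_GDB = r"C:\data\contour_tiles.gdb"
FINAL    = r"C:\data\contours.gdb\contours_diced"

arcpy.env.overwriteOutput = True
dissolved = []

# One pass per grid cell: clip the contours to the cell, then dissolve the
# clip on elevation into multipart lines.  Assumes PageName values are valid
# as feature class name suffixes.
with arcpy.da.SearchCursor(GRID, ["PageName", "SHAPE@"]) as cells:
    for name, cell in cells:
        clip_fc = os.path.join(WORK_GDB, "clip_{0}".format(name))
        diss_fc = os.path.join(WORK_GDB, "diss_{0}".format(name))
        arcpy.Clip_analysis(CONTOURS, cell, clip_fc)
        arcpy.Dissolve_management(clip_fc, diss_fc, "CONTOUR", "", "MULTI_PART")
        dissolved.append(diss_fc)

# Stitch the dissolved tiles back into a single feature class.
arcpy.Merge_management(dissolved, FINAL)
```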
VinceAngelo
Esri Esteemed Contributor
I don't understand that last bit:
The final product is slower than our original contour layer, but it's certainly acceptable performance.


How many features are in each layer?
How many features are drawn?
How long does it take each to draw at full extent, and how long to draw at an extent 10% larger than one of the grid cells?

If the sheer size of the contour lines is impacting performance, then Marco's suggestion to generalize your features is your best fallback option.

- V