Non-versioned data slowing way down

MattLane · ‎04-21-2014

I have two unversioned featureclasses in a dataset that our organization adds and deletes up to 10s of thousands of records daily. One is over 3million points and the other is over 500k lines. Each have an organizational ID field with an index and a GlobalID field with an index. We will be versioning and replicating these in the near future. They are getting slower and slower over time, to the point where just deleting a couple records could take 8 minutes and getting a table feature count using the toolbox tool takes 3 minutes. The processing that used to take 1 to 2 hours now takes over 12.

Details:

SQL Server 2008 R2

SDE Release 101010

Desktop 10.1 sp1

Each spatial index is built by ESRI (16 cells per object, Medium on all levels)

Data is not in SDE schema, but a separate one

There isn't a lot of other data in this database, although there is a single versioned featureclass in a different dataset that has topology rules applied to it.

What I've tried:

I rebuild and reorganize all indexes in this database as needed according to Microsoft documentation

Checked the Bounding Boxes on the spatial indexes

Rebuilt statistics (esri tools)

Compressed

Shrunk database (our DBA does this periodically)

Restarted server (memory doesn't seem to be the issue)

George_Thompson · ‎04-30-2014

Matt,

I wonder if the geoprocessing history for those two feature classes is getting very large. Every time that you perform a GP process it adds to the metadata. The metadata is stored in a geodatabase system table.

You may want to look into this add-in: Delete Geoprocessing History from Metadata

Got the link from this post: http://forums.arcgis.com/threads/43399-metadata-geoprocessing-history

I would only do this after you have a verified back of the geodatabase.

Try it on the two feature classes in question

-George

--- George T.

View solution in original post

MattLane · ‎04-21-2014

This has happened in the past and the only way I could find to fix it was to export and recreate the two layers completely. That worked well for about 4 months. This option won't be desirable once we start replicating daily to other agencies.

WilliamCraft · ‎04-21-2014

How are those features being added and deleted? Is this being done inside an edit session or outside an edit session? If you're deleting and adding tens of thousands of features on a daily basis, you might want to consider re-building the spatial indexes on those feature classes accordingly each time the number of records changes drastically. The need for doing this is rare, but it can happen depending on how the features are being added/deleted. I also believe this is more likely to happen when the changes are made in an edit session. As an example, a spatial index calculated for 50,000 lines might look much different than a spatial index for 140,000 lines in terms of grid size. Check out these articles to understand a bit more about spatial indexes:

http://resources.arcgis.com/en/help/main/10.1/index.html#//003n0000001r000000
http://resources.arcgis.com/en/help/main/10.1/index.html#//003n0000001s000000

So, your workflow might need to change to something like this at a high level:

Start edit session.

Delete 50,000 lines.

Recalculate the spatial index.

After several hours, add 80,000 lines.

Recalculate the spatial index.

MattLane · ‎04-21-2014

Thanks for the response.

The records are being deleted by the delete tool and added by the append tool outside of an edit session (python scripted process).

Like I mentioned, I'm monitoring the indexes and when a spatial index's fragmentation is > 30% I'm rebuilding. If the fragmentation is > 5% and < 30% I'm reorganizing.

I've tried using ArcCatalog to Recalculate as well as delete and recreate. It seems to do what the SQL Server functions do and neither improve access and editing times.

VinceAngelo · ‎04-21-2014

A couple of things to consider:

GUID columns, by their very nature, are always randomly distributed;

If the feature data has been gathered over time it is likely badly spatially

It's unlikely that the default spatial index tuning parameters are best for

- V

MattLane · ‎04-21-2014

If the feature data has been gathered over time it is likely badly spatially

fragmented. One of the hallmarks of spatial fragmentation is poor spatial
query performance, not matter how artfully the spatial index has been tuned.
There are many possible ways to defragment spatially, but the easiest is
to do an ORDER BY query on some attribute that imposes spatial order
(county code, etc). More exotic queries are available if you can use the
ST functions to group features into bands in X (or Y), then order by Y (or X).
Note that spatial defragmentation will hurt join performance on other
query constraints (like time), so that may need to be mitigated.

Thanks Vince, I'd like to look into how I can reorder the data without reloading it. How do I go about using an ORDER BY like you mentioned to defragment spatially? Are you referring to enforcing order during spatial index creation or just on queries to the db?

VinceAngelo · ‎04-21-2014

You can't defragment without reloading -- it's the load order which makes it fragmented.

One possible methodology is to create a backup table with

CREATE TABLE mytab_backup AS
SELECT * FROM mytab_backup

Then TRUNCATE the original, and execute something like

INSERT INTO mytab
SELECT * FROM mytab_backup 
ORDER BY county_fips

- V

MattLane · ‎04-29-2014

I'm still unable to get the two layers back to where they were a few weeks ago. They continue to get worse by the week 😞

I've followed your suggestion, ordering the geometries by the districts they are in, proved by using the TOP keyword in SQL Server. After rebuilding all the indexes the issue persists.

The display of the data in ArcMap is actually really fast. But adding it to ArcMap takes minutes. Just selecting it in ArcCatalog freezes the application for minutes. Other layers in the database, even fairly large ones, are instantaneous. This all makes me think there is some cruft in the SDE tables, or at least something wrong in them.

I've tried copy/pasting them into a new layer or importing into a new layer (even without the GlobalID field) and the new layers are just as slow.

VinceAngelo · ‎04-29-2014

Are you adding these layers as geodatabase feature classes or as Query Layers?

How many total features exist in the table?

- V

MattLane · ‎04-29-2014

I have two unversioned featureclasses in a dataset that our organization adds and deletes up to 10s of thousands of records daily. One is over 3million points and the other is over 500k lines

These were created in ESRI software and are edited with python tools outside of an edit session. They are not versioned (yet) but do have GlobalIDs. They will be replicated. They exist within a dataset, but exporting them out of the dataset into the main directory does not help. They were not like this 3 or 4 weeks ago, and there haven't been significant total record increases. Thanks for following up with me.