
Spatial View so slow

07-05-2013 01:33 AM
HaniuHokkaido
Deactivated User
Dear experts,

I have a spatial view with 25 columns in the database server that looks like this:

SELECT a.OBJECTID, a.Shape, a.KEYID, a.ROTATION, a.LAND_COVER, a.LAND_USE, a.COMPANY,
       b.Species, b.SpeciesGroup, c.Tipe
FROM   sde.LANDUSE AS a
       LEFT OUTER JOIN gdb.dbo.StandReg AS b
           ON a.KEYID = b.KEYID
       LEFT OUTER JOIN (SELECT cKeyId, Tipe, Dist, created_by, updated_by
                        FROM   gdb.dbo.Petak_PSDS
                        WHERE  Dist = 'District 1') AS c
           ON a.KEYID = c.cKeyId


(the query has been truncated, not all columns are mentioned)

This view is composed of 3 tables and contains about 7,600 rows. When I open it on my PC in ArcMap or ArcCatalog, it takes around 4 minutes to load. I have run analyze and update_dbms_stats and it's still just as slow. There are also no spatial errors.
How can I make it load faster?
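
One way to check whether the time is spent in SQL Server or in the client is to run the view's SELECT directly in Management Studio with timing enabled. A minimal sketch, using sde.LANDUSE_VW as a placeholder for the view name:

-- Measure server-side cost of the view, independent of ArcMap drawing:
SET STATISTICS TIME ON;
SET STATISTICS IO ON;

SELECT COUNT(*) FROM sde.LANDUSE_VW;   -- join cost only, no geometry transfer
SELECT * FROM sde.LANDUSE_VW;          -- full fetch, roughly what ArcMap issues

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;

If the raw query comes back in a few seconds, the join itself is fine and the minutes are being spent fetching and drawing on the client.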

BTW, is a view replicable using one-way replication?

Thanks

[ArcSDE 10.0, SQL Server 2005]
16 Replies
VinceAngelo
Esri Esteemed Contributor
So if these are only simple feature classes, why not just run

sdeexport -o create -f - ... | sdeimport -o create -f - -k GEOMETRY ...

for each table?

The <10 features/sec load performance is still off by an order of magnitude or two, so you need to address that too.

- V
MarcoBoeringa
MVP Alum
Perhaps what I was doing was exposing my naivete and relative inexperience when dealing with spatial data in "hard core" fashion.  Here's how I got my numbers:

I queried SDE_LAYERS to get the layer_id for the feature class I wanted to examine, then I simply queried the corresponding f table to take the average of the numofpts column. That is the number I reported as the average number of vertices for a feature in that class. Did I misunderstand the content of the f table?


No, you didn't. You interpreted the data right. You undoubtedly know this already, but be aware that GEOMETRY and GEOGRAPHY in SQL Server don't use an f table; features are stored directly in the base table.
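
For reference, a minimal sketch of both counting approaches; the feature class name, the layer_id (123), and the GEOMETRY table below are placeholders:

-- SDE_BINARY storage: vertex counts live in the layer's f<layer_id> table.
SELECT layer_id FROM sde.SDE_layers
WHERE  table_name = 'COUNTIES';          -- placeholder feature class name

SELECT AVG(numofpts) AS avg_pts, MIN(numofpts) AS min_pts, MAX(numofpts) AS max_pts
FROM   dbo.f123;                         -- substitute the layer_id found above

-- GEOMETRY storage: no f table; count vertices on the base table directly.
SELECT AVG(Shape.STNumPoints()) AS avg_pts
FROM   dbo.COUNTIES;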

I have the following stats for the layer in question:

366k features
numofpts: avg 2127, min 13, max 144,969 (!!!), standard deviation about 5086, median 1023.

The feature with 145k vertices is Monroe County, FL.


Still a huge number of vertices per feature by any measure.
MarcoBoeringa
MVP Alum
I have made one last attempt to get closer to Keith's data and working process by densifying the features one more time, inserting a vertex every 0.1 meter. I used a selection of about 10% of the original dataset, so there are only 1806 polygons left here. This resulted in a dataset with the following characteristics.

Results for densified layer

Polygons: 1806

Vertex count
Total: 4883810
Average: 2704
Min: 23
Max: 34119
Std: 3505

As you can see, the resulting dataset now has on average even more vertices per feature than Keith's dataset (2704 versus 2127).

I then copied this data to an SDE_BINARY dataset using the Copy Features tool, setting the proper configuration keyword, and then back from there to GEOMETRY. This last step is very close to what Keith is doing, as my initial tests were mainly GEOMETRY to GEOMETRY, not specifically SDE_BINARY to GEOMETRY. Since this step involves a conversion of data types, it might slow things down.

I also copied from GEOMETRY to GEOMETRY again for comparison:

GEOMETRY TO GEOMETRY results:

1806 polygons copied in 59 sec

1806 / 59 = 30.6 features written per second
4883810 / 59 = 82776 vertices written per second.

GEOMETRY TO SDE_BINARY results:

1806 polygons copied in 37 sec

1806 / 37 = 48.8 features written per second
4883810 / 37 = 132K vertices written per second.

SDE_BINARY TO GEOMETRY results:

1806 polygons copied in 43 sec

1806 / 43 = 42 features written per second
4883810 / 43 = 114K vertices written per second.

CONCLUSIONS:
Well, "surprise"... or maybe not. Writing between GEOMETRY and SDE_BINARY, in both directions, is actually faster than writing from GEOMETRY to GEOMETRY, despite the required conversion.

Anyway, looking at the figures ranging from 30.6 to 48.8 features/second at best, I do have to contest Vince's "two orders of magnitude too low" assessment of the 7 to 9 features/s Keith reported. These figures suggest that a maximum gain of about 5-6x is more realistic for real-world data (albeit densified). This assumes ArcGIS and the geoprocessing framework are used, not se_toolkit or the ArcSDE command-line tools, of course, and the test hardware/software configuration described in my earlier post in this thread.

Lastly, Keith's dataset has more attribute fields than mine, which may contribute to lower insertion figures too.

The databases are on virtual servers.  The destination server is configured as two Intel E5-2260 @ 2.2 GHz CPUs, Windows Server 2008 Standard Edition, 64 bit, 16 GB RAM.


I guess this should have been Intel E5-2660 @ 2.2 GHz. According to this benchmark page, they should be about twice as fast as my Core i5 2320. That would realistically put them one order of magnitude above the current lousy 7-9 features/s score, so maybe 80-100 "2000+ vertex" features/s should be achievable.
VinceAngelo
Esri Esteemed Contributor
While virtual servers are great for many purposes, database serving (and ArcGIS Server) is not one of them. None of my enterprise clients run databases on virtual servers, and I've worked on benchmark teams that clearly demonstrated the benefits of running ArcGIS on physical (vice virtual) servers. We also saw a 2x performance difference between fibre-attached disks and network shares.

Obviously, this is not to assert that all physical systems are better under all conditions, but when I/O is at a premium (as it is in databases and mapping), it makes sense to optimize the system for I/O performance (which includes giving the machines a lot of RAM -- 16 GB is good, but 32-64 GB is better).

- V
KeithAdams
Deactivated User
No, you didn't. You interpreted the data right. You undoubtedly know this already, but be aware that GEOMETRY and GEOGRAPHY in SQL Server don't use a f table, features are stored straight in the base table.



Still a huge number of vertices per feature by any measure.


Glad to know I had the details right. We are looking forward to the removal of the f, s, and i tables from the architecture, and also to being able to use straight SQL (plus some follow-up spatial index management) to copy features from "master" sets to subsidiary subsets, without having to go through the pain of reprocessing the data for each variation.
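
With GEOMETRY storage, that kind of copy could look something like the sketch below; the table and column names are made up for illustration, and a geodatabase-registered target would also need its ObjectID bookkeeping handled:

-- Copy a subset of features with plain SQL (hypothetical names):
INSERT INTO dbo.FL_Counties (KEYID, Shape, NAME)
SELECT KEYID, Shape, NAME
FROM   dbo.US_Counties
WHERE  STATE_FIPS = '12';

-- Follow-up spatial index management on the target table:
ALTER INDEX ALL ON dbo.FL_Counties REBUILD;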

Since we are using county, census tract, and county subdivision boundary data obtained from outside sources (Esri and the Census), we're not predisposed to mess with them. Any idea why there might be over 145K vertices defining a single county boundary, even if that's an extreme outlier?

-= Keith =-
LeoDonahue
Deactivated User
... beyond making sure all the indexes are present, to improve that query.


Haniu,

Did you index your join fields?  I didn't see your reply to that.
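
For example, something along these lines, using the join columns from the view definition above (the index names are arbitrary):

-- Index the join key on each joined table:
CREATE NONCLUSTERED INDEX ix_standreg_keyid
    ON gdb.dbo.StandReg (KEYID);

-- Cover both the Dist filter and the join key used in the subquery:
CREATE NONCLUSTERED INDEX ix_petak_psds_dist_ckeyid
    ON gdb.dbo.Petak_PSDS (Dist, cKeyId);

Whether sde.LANDUSE already has an index on KEYID can be checked in sys.indexes.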
MelitaKennedy
Esri Notable Contributor
A couple of reasons for large vertex counts:

1. Part of the boundary follows a water feature
2. Even long, "straight" lines shouldn't be just two-point lines if you want to reproject the features. ArcGIS doesn't densify data as it reprojects; straight lines may become curves in the new coordinate system.


Melita