�??Best method�?� for weekly imports into SDE 9.3.1

880
7
06-10-2010 11:56 PM
NicholasO_Connor
New Contributor II
I realise there are many ways to achieve this goal but I hope by posting this question we can begin a discussion on what is a common job for the SDE Administrator.

So my situation.

PROD: Oracle 11.1.0.6.0 / SDE 9.3.1 / ST_GEOMETRY default / Solaris 10
EDIT: Oracle 11.1.0.6.0 / SDE 9.3.1 / ST_GEOMETRY default / Solaris 10

On a weekly basis, we would like to update completely 40 simple (no topology, no feature data sets) feature classes (line, poly and point) containing rows from 1000 up to 1.1 million.

The 40 feature classes are accessed by �??normal�?� ArcGIS Desktop Users (7am �?? 7pm), ArcIMS (24*7) and ArcGIS Server applications (24*7). We can shutdown the ArcIMS+ ArcGIS Server anytime after 7pm. Some of our ArcGIS Server apps uses Direct Connect the rest not.

Currently, I make the following steps:

1. sdeexport on all 40 feature classes from EDIT creating sdx files
2. sde command kill all users
3. shutdown the ArcIMS+ ArcGIS Server
4. sdeimport -o init all sdx files into PROD
5. start the ArcIMS+ ArcGIS Server

This process takes approximately 1 hour to run.

So my questions would be:

1) Is there another approach that would allow me not to shutdown ArcIMS+ ArcGIS Server? I had thought about using SQL statements to delete all rows from PROD and then sdeimport -o append? But I�??m very concerned about the Delete performance from SDE.

2) When I use the sdeimport -o init command, as it TRUNCATES data, this produces a Table lock. So if one of the ArcGIS Desktop Users were to login and access the feature class during load time �?? the truncate statement fails. I had thought about LOCKING users accounts on the �??Oracle side�?� to handle this and UNLOCKING them after import has completed �?? good idea?

3) Finally, is there a better way to achieve this goal? I have read from the forums about Users using �??loading�?� feature classes and then making a rename. Or (see early post) about using SQL over a Database Link to move data between EDIT and PROD.

Any thoughts on this matter greatly appreciated.
0 Kudos
7 Replies
RandyKreuziger
Occasional Contributor III
Check out this weeks thread over in the GeoProcessing forum. http://forums.arcgis.com/threads/3047-Feature-classes-locked-by-ArcGIS-server-services

You can use python delete features and append commands.  With this method you won't lose any of your metadata and you won't have to worry about tables locks.  The downside is that between the delete features and append your feature class is in a state of flux.  Someone query the feature class could get incomplete results.  I do most of our updates early Sunday morning under the assumption no one is on at that time.
0 Kudos
PaulDziemiela
New Contributor
Hi dhuhkosi,

I am not exactly clear about what the problem is with your current procedure.  Is it the time expended to do the update or is it exclusion of the users and shutdown of the middleware servers?  The way you are doing things seems very "safe" to my mind.  I assume you also shut down all the editing users during the sdeexport step.  An hour seems like nothing to me.  I often times measure loads or processing steps in terms of days.

As kreuzrsk mentioned, if you don't mind the possibility that AGS and IMS users may not find their data or just part of the their data during the time of the load, you can load the data while the servers are active.  The python route will work or easier still is to do a TRUNCATE on the table via the database and then sdeimport -o append into the empty table (don't delete the rows with DML - slow).  Nothing in ArcSDE can stop or hinder the truncation no matter who is logged in or how.  But your connected users could end up looking at either no data, partial data or simply crash.  The best and nicest thing to do is to keep everyone out during this time of data instability. 

It makes sense to me to try the switcheroo idea.  Have an exact copy of the table.  You need to make 100% sure the copy is exactly the same coordinate system as the master and that the ST_SRIDs match.  Load the copy and then rename as the master via the database "behind the back" of ArcSDE.  But your objectids will mostly likely change and you will hose anyone currently attached to that table.  If time is the essence you could load all the copies at your leisure and then force everyone out over lunch and swap the table names at that time?  Again not sure what part of things you want to improve.

Cheers,

Paul
0 Kudos
NicholasO_Connor
New Contributor II
Firstly, thank you very much for your answers.

Kreuzrsk: With the delete and append commands I�??m not really concerned with the data being in a state of flux �?? however, after some testing, the delete process is taking way too long.


Pdziemiela:

1) The python route will work or easier still is to do a TRUNCATE on the table via the database and then sdeimport -o append into the empty table (don't delete the rows with DML - slow). 

�??to do a TRUNCATE on the table via the database�?� �?? do you have to TRUNCATE both the �??business�?� table and the �??spatial index table�?� for this to work?

2) The best and nicest thing to do is to keep everyone out during this time of data instability.

Yeah, this is what I believe too. Hence my thought about generating a Oracle sql script that LOCKS all Users expect the one which is loading the data.

3) Again not sure what part of things you want to improve.

Sorry if I was not too clear. I think it is the idea of stopping and starting ArcIMS and ArcGIS Server. However, after reading both your posts and from some Testing here, I don�??t really want to start very slow running DELETE statements on the data.

Finally, one point, if you could also clarify if you received similar results to me:

If I try and delete 50,000 records from a Feature class using the geoprocessing objects, it takes approximately 10+ minutes to run.

If I try and delete 50,000 records from a Feature class using a simple sql statement from Oracle SQL Developer, it takes less than a minutes to run.
0 Kudos
PaulDziemiela
New Contributor
Hi dhuhkosi,

The SDE.ST_SPATIAL_INDEX index type is a database domain index just like MDSYS.SPATIAL_INDEX or other types of domain indexes like Oracle Text.  So Oracle does the backend work of noting you've truncated the table and also truncating the spatial index to match.  So nothing special that you need to do (well, do bear in mind sometimes you get orphans - http://resources.arcgis.com/content/kbase?fa=articleShow&d=34324 - but that's some kind of hiccup in ArcSDE where the index id value increments without getting rid of the indexes for the old value). 

One thing you might well WANT to do is drop the spatial index entirely for the duration of the load and recreate it after your load is complete.  This will speed up your load for your larger tables.  You can either drop the index via SQL or just set the layer to load_only_io via sdelayer or python.  Then when the load is complete set the layer to normal_io.  I don't believe that ArcSDE tracks the status of the domain index apart from just checking it realtime when it needs to.  So you can drop and recreate the spatial index either via SQL or via ESRI without any problems in my experience. 

The locking users out is a good idea.  How does that reflect through ArcMap?  Does the user get a nice informative message as to why they cannot connect?  And how do the middleware servers handle the locked condition?  For example just recently we found AGS 9.3.1 creating dozens and dozens of inactive sessions to the point where eventually the database hit the upper limit of sessions.  Turns out the db account being used had gone into that Oracle "password warning mode" saying that the password was about to expire.  Once the warning was removed, it all went back to normal.  The middleware login was just not liking that warning message and connecting over and over.  And what about users that are already connected at the time of the lockout?  You are okay with the effort to manually track down and evict users at the start of the load process?  You can walk over to these folks or phone them and nicely tell them to get out?  But you just don't want anyone new to come on board, right?  I'd very much like to hear how this goes for you.

Please send us the results of your testing. 

Cheers,

paul


Pdziemiela:

1) The python route will work or easier still is to do a TRUNCATE on the table via the database and then sdeimport -o append into the empty table (don't delete the rows with DML - slow). 

�??to do a TRUNCATE on the table via the database�?� �?? do you have to TRUNCATE both the �??business�?� table and the �??spatial index table�?� for this to work?

2) The best and nicest thing to do is to keep everyone out during this time of data instability.

Yeah, this is what I believe too. Hence my thought about generating a Oracle sql script that LOCKS all Users expect the one which is loading the data.

3) Again not sure what part of things you want to improve.

Sorry if I was not too clear. I think it is the idea of stopping and starting ArcIMS and ArcGIS Server. However, after reading both your posts and from some Testing here, I don�??t really want to start very slow running DELETE statements on the data.

Finally, one point, if you could also clarify if you received similar results to me:

If I try and delete 50,000 records from a Feature class using the geoprocessing objects, it takes approximately 10+ minutes to run.

If I try and delete 50,000 records from a Feature class using a simple sql statement from Oracle SQL Developer, it takes less than a minutes to run.
0 Kudos
NickHarrison
New Contributor III
Since your data is simple feature you should go with the truncate and append method. If you put the layers into load_io mode after the truncate then it will greatly increase the speed of the load. I would not use sdelayer -o init. use sdetable truncate (database triggers will truncate the other retated SDE tables), then put layer into LOAD_ONLY_IO, then use sdelayer -o append. Then put layer back into normal mode. Depending on how much data you have to load this could cut the load time immensely.

I haven't used ArcIMS for a while but I do recall that a MXD based service will lock the table while an AXL service will not.  So depending on which service you are using you may not need to kill all the connections and stop/start ArcIMS.
0 Kudos
NicholasO_Connor
New Contributor II
Hello Nick and Paul

Sorry one point I�??m not too sure about.

Paul: easier still is to do a TRUNCATE on the table via the database and then sdeimport -o append
Nick: use sdetable truncate (database triggers will truncate the other retated SDE tables

My experience using Oracle SQL Developer to run the SQL commands:

Select count(*) from GEO.FOREST;
25,123

Select count(*) from GEO.S1640_IDX$;
7856

Truncate Table GEO.FOREST;

Select count(*) from GEO.FOREST;
0

Select count(*) from GEO.S1640_IDX$;
7856

Reload with new data using sdeimport �??o append into GEO.FOREST;

Select count(*) from GEO.FOREST;
26,123

Select count(*) from GEO.S1640_IDX$;
7856

So...

Paul �?? are you sure the Spatial Index is getting updated correctly when using a TRUNCATE?
Nick �?? when you run sdetable truncate cmd are you sure that a TRUNCATE statement is actually run or just a DELETE statement?

From my experience the Triggers on a Table will only be used when a DELETE statement is run not a TRUNCATE.
0 Kudos
PaulDziemiela
New Contributor
Paul �?? are you sure the Spatial Index is getting updated correctly when using a TRUNCATE?

Hi dhuhkosi,

Your question is most interesting but I admit I have always just trusted Oracle and ESRI to do the truncation correctly.  Basically every domain index has "events" that watch for changes to the host table.
http://download.oracle.com/docs/cd/B14117_01/appdev.101/b10800/dcidmnidx.htm
The one we are talking about here is ODCIIndexTruncate().

Now the ESRI folks store the procedure in the SDE schema in the body of the SDE.ST_DOMAIN_METHODS type.  So we can look right at the PL/SQL code that fires and its just 47 lines of code.  So to summarize it checks first to see if two specific conditions are true in which case it does nothing, else it then determines the name of the domain index table and truncates it.  I just tried your experiment.  I loaded 443,154 points into a layer.  I checked that the domain index had 443,154 records.  Then I truncated the host table and found the domain table now had zero records.  So I am not seeing what you are seeing.  However, as I said there are two conditions in the code whereby the truncation does not take place.  Perhaps your situation is encountering those conditions?  I'd say your next step is ESRI support unless someone else can succinctly explain those conditions.

Please tell us what you find out as now I am curious.

Cheers,

Paul
0 Kudos