Help with moving a large dataset

07-26-2012 03:01 AM
New Contributor III
Hi all,

I was wondering if anyone could help me with a strategy for moving a large sde database.

Our setup consists of two identical ArcSDE 10.0 (no sp) PostgreSQL 8.3 installs, the first machine is a setup to only do Geoprocessing the other on a ArcGIS server. Both machines are 64bit and have a minimum of 8GB or ram each.

My plan is to process monthly updates on the Geoprocessing machine then transfer it over to my ArcGIS Server. The intention is to transfer the whole database every couple of months and replace the old version 100%.

The database occupies about 350 GB of HDD space in PostgreSQL however is it around 40GB when I exported using SDEExport or pg_dump.

I have already tried running SDEExport, transferring the data and running SDEImport however I have found that after an hour or so the import failed with the following message.


One thing I did notice was that the import failed when PostgreSQL got up to using 2GB of RAM, does anyone else have a problem with PostgreSQL being allocated this amount of RAM?

The alternative method I have used was run the pg_dump command which I have never used before but with a little help from the internet I was able to run. Getting the data out was relatively simple however the pg_restore has been running now for 7 days and I am unsure that this method will be successful as I can see a number of error messages.

Does anyone have experience using either of the above methods and could offer me some advice?

The only other option I know of it to setup DB replication between the two servers however I have not done this before. Can anyone tell me if replicating a DB of this size is possible and maybe offer my any advice in setting up the process?

If you have any questions about the above or can offer advice your comments will be very welcome.

Kind Regards

0 Kudos
7 Replies
Esri Esteemed Contributor
Please post error messages in text (which is searchable) rather than as an image (which is not).

I can't help but wonder if one of the five service packs released for ArcSDE might have helped
to prevent this error.  Keeping up with service packs can be a pain, but they're not issued so
frequently that it's impossible.

Exporting the entire table first probably wouldn't be a bad idea (though the 90% compression
ratio makes me wonder if the file-only export is failing, too, but unnoticed).

I suspect that your primary database is corrupted.  The best way to detect this is to build an
index on a column that will get you between 50 & 1500 distinct files (countries, states, counties)
and use WHERE (-w) clauses to export each of them (it's even better if this serves as a spatial
partitioning key, since the imported database will perform better). When you import, use a
"where 1=0" export to create the table, place it in load_only_io mode, then append all the
remaining files, and then place the table back in normal_io mode. Then if one key fails, you're
closer to knowing where the corruption is.  If the smaller partitions run to completion, then
you know it's a PG bug.

- V
0 Kudos
New Contributor III

The error reads:

Importing SDEX from mmarea.000 ...
Importing spatial column "shape"
SDE code (-51) = underlying DBMS error
Extended DBMS error code: -51
Error: ESRI: error making shape from arguments [20602]:
Error in stream loading...

Error in importing SDE feature/attribute data record/
last record processed was rec#47822673
Error importing SDE export file version 10.0.0_1
sdeimport error
47822673 records read.
0 records stored.

I should probably add that I have tried importing both a single export file and a sdeexport -X where the maximum limit per file was set at 1GB.

I need to read up on the service packs just in case updating them will have an negative effects to our other services.


0 Kudos
Esri Esteemed Contributor
You can scan the file for validity with 'sdexinfo -o stats'.  I expect the contents is corrupt.

If you have the patience to wait through 47822600 rows of processing, using 'sdexinfo
-o list -a | tail -200' (assuming you have 'tail') will give more information about the
failing row (or at least the one before it).

- V
0 Kudos
New Contributor III
Hi Vince,

Here is the information from the sdexinfo:

C:\mm>sdexinfo -o stats -f mmarea
SDEX File:             mmarea.000
Version:               SDEX 10.0.0_1
Type:                  Final
Volume Type:           Multi

Layer ID:              8
Database Name:         sde
Table Owner:           sde
Table Name:            mmarea
Spatial Column Name:   shape
Minimum Shape ID:      1
Spatial Index:
 Parameter:      SPIDX_RTREE
 Array Form:     -2,0,0
Entity Type:           nac+
I/O Mode:              NORMAL
Layer Config. Keyword: DEFAULTS
Layer Precision:       High
User Privileges:       SELECT, UPDATE, INSERT, DELETE
Creation Date:         04/07/12 22:08:09
Layer Envelope:
 minx:     5513.00000, miny:     5099.38000
 maxx:   655653.85000, maxy:  1220310.00000

XY False Origin:       -5220400.0, -15524400.0
XY System Units:       10000.0
XY Half SysUnit:       0.00005
XY Round:              0.0001
XY Cluster Tolerance:  0.001
Z  Offset:             0.0
Z  Units:              1.0
Z  Half SysUnit:       0.5
Z  Round:              1.0
Z  Cluster Tolerance:  0.001
Measure Offset:        0.0
Measure Units:         1.0
Measure Half SysUnit:  0.5
Measure Round:         1.0
Measure Cluster Tol:   0.001
Spatial Referencd ID:  2
Coordinate System:     PROJCS["British_National_Grid",GEOGCS["GCS_OSGB_1936",DAT

SDE Attribute Columns: 16
Name                                 Type    Width numDecimal  NULL?   RowID
-------------------------------- ----------- ----- ---------- -------- -----
toid                             SE_STRING      20      0
version                          SE_INT32       10      0
verdate                          SE_DATE         0      0
featcode                         SE_INT32       10      0
theme                            SE_STRING      80      0
calcarea                         SE_FLOAT64     38      8
change                           SE_STRING      80      0
descgroup                        SE_STRING     150      0
descterm                         SE_STRING     150      0
make                             SE_STRING      20      0
physlevel                        SE_INT32       10      0
physpres                         SE_STRING      20      0
broken                           SE_INT16        5      0
loaddate                         SE_DATE         0      0
objectid                         SE_INT32       10      0     NOT NULL  SDE
shape                            SE_SHAPE        0      0

Spatial Column Statistics:
Total Area Features:             106809759
Total Features:                  106809759

Minimum Feature Number:                  0
Maximum Feature Number:                  0
Largest Feature:                     41907 Points
Smallest Feature:                        4 Points
Average Feature:                     27.33 Points
Minimum Polygon Area:                           0.00020 sq.
Maximum Polygon Area:                   174317702.72110 sq.
Average Polygon Area:                        2218.34133 sq.
Minimum Polygon Perimeter:                      0.06472
Maximum Polygon Perimeter:                 342097.43901
Average Polygon Perimeter:                    138.52868
Layer Envelope:
                minx:      5513.00000,  miny:      5099.38000
                maxx:    655653.85000,  maxy:   1220310.00000

Is the above what I should be expecting? I actually didnt know about SDEXinfo so thanks for teaching me something new.

I will try the other command next.


0 Kudos
New Contributor III
using 'sdexinfo -o list -a | tail -200' (assuming you have 'tail') will give more information about the
failing row (or at least the one before it).
- V

I tried the above and got the following message

C:\mm>sdexinfo -o list -f mmarea -a | tail -200
'tail' is not recognized as an internal or external command,
operable program or batch file.

Does this mean i don't have a tail? Our of interest what is a tail?


0 Kudos
Esri Esteemed Contributor
Hmmm.  No, I was expecting the file to choke.  Instead, it seems to be a load issue.
Probably best to get Tech Support on board about now.  And using SP4 or SP5 for
further testing wouldn't hurt, and try to capture the just the features in the 2000+/-
around where it's failing with a new export file, to see if that fails.

Good luck.

- V

BTW: You *really* shouldn't be using the SDE user to own spatial data.  Best practice
is to have an ownership account, and often a different read-only user for any web
access use.

PS: 'tail' is a Unix utility that displays the last n rows.
0 Kudos
New Contributor III
Couple of questions:
Can you give an example of the sdeexport command?
I am not aware of Postgres having issues on a restore if it runs out of RAM, but 2 gigs is the limit of memory that a window process can use, if Postgres was affected by lack of memory it would tell you that in the error logs.
When you are restoring, you are restoring public schema first and than any other schema you might have right? Some amount of errors are expected because we have to brake up the restore.
Postgres 8.3.8 does not have any replication features as part of the core product, that only came out recently with Postgres 9.1, so if you wanted to use db replication you would have to use one of the Postgres modules like Slony, for example.

Most Sincerely,
0 Kudos