How to load millions of rows from FGDB into a PostgreSQL geodatabase

03-10-2020 07:33 PM by yockee, Occasional Contributor II

Hi,

I am having trouble loading 27 million rows of data from a file geodatabase into a PostgreSQL geodatabase.

The load has been running for several days; it fails on the fourth day, after reaching 4 million rows.

I have tried different methods: copy/paste (fails after 1 million rows) and export to XML (fails after 4 million).

Is there any PostgreSQL setting I should turn off?

Please give me advice.

I am using ArcGIS Server 10.6.

3 Replies
JoshuaBixby
MVP Esteemed Contributor

Trying to copy/import 27 million records in a single transaction, which is effectively what you are doing by copying and pasting, is bound to cause issues regardless of ArcGIS. Have you read About loading data into existing feature classes and tables—ArcGIS Help | Documentation? There are ArcGIS tools and workflows designed for bulk loading data into enterprise geodatabases.
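To make the single-transaction point concrete, here is a minimal, generic Python sketch of the chunking idea behind those bulk-load workflows: split the source rows into fixed-size batches so each commit stays small, instead of holding all 27M rows in one transaction. The `batched` helper is hypothetical (not an ArcGIS tool); in practice you would wrap whatever row iterator your load uses.

```python
from itertools import islice

def batched(rows, batch_size):
    """Yield successive lists of at most batch_size rows from any iterable.

    Generic chunking sketch: committing once per batch keeps each
    transaction small, so a failure loses one batch, not the whole load.
    """
    it = iter(rows)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch

# Example: 10 rows in batches of 4 -> lengths 4, 4, 2
for batch in batched(range(10), 4):
    print(len(batch))
```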

George_Thompson
Esri Frequent Contributor

Are they points/lines/polys?

Is the file geodatabase local or on a network drive?

Where is the Postgres DB stored (in the same building, data center, cloud, etc.)?

Are you also using ArcMap 10.6? Have you tried ArcGIS Pro yet?

--- George T.
VinceAngelo
Esri Esteemed Contributor

I regularly load 2M, 3M, 5M, 8M, and 12M row tables into PostgreSQL, and occasionally go as high as 60M rows. I use either FeatureClassToFeatureClass / TableToTable (as appropriate) or a nested arcpy.da.SearchCursor / arcpy.da.InsertCursor pair. Performance is usually about the same, taking 20-240 minutes depending on the size of the data and the target database (effectively the same rate for each DB).
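The nested-cursor pattern can be sketched as a small helper that streams rows from a source iterable into an insert callback. In a real ArcGIS session, `source_rows` would be an `arcpy.da.SearchCursor` over the FGDB feature class and `insert_row` the bound `insertRow` method of an `arcpy.da.InsertCursor` over the same field list; both names here are placeholders so the sketch stays self-contained.

```python
def copy_rows(source_rows, insert_row, report_every=100000):
    """Copy rows from a source iterable into a target via a callback.

    source_rows: any iterable of row tuples (e.g. a SearchCursor).
    insert_row:  a callable that writes one row (e.g. InsertCursor.insertRow).
    Returns the number of rows copied, printing progress periodically.
    """
    count = 0
    for row in source_rows:
        insert_row(row)
        count += 1
        if count % report_every == 0:
            print(f"{count} rows copied")
    return count

# Stand-in usage with plain Python objects instead of arcpy cursors:
dest = []
n = copy_rows([(1, "a"), (2, "b"), (3, "c")], dest.append)
```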

I have seen multi-day loads, but only when trying something sub-optimal, like loading into an AWS RDS across three routers shared by hundreds of users. My best performance has occurred on a high-IOPS EC2 instance in the same mission as the high-IOPS RDS, using ArcPy from the Linux EC2 AGS host, where a cascade of FC2FC commands (some 40 tables) populated 35-40M rows in ~2 hours. I have cheated on occasion by parallelizing my load scripts, so one PuTTY session was loading the odd tables while the other was loading the even ones (with load order sequenced by size, so they finished at about the same time).
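The "sequence by size so the sessions finish together" trick is essentially greedy load balancing. A hedged sketch, assuming you know each table's approximate row count: sort tables by descending size and always hand the next one to the least-loaded session. The `balance_loads` name and the two-session default are illustrative, not part of any ArcGIS API.

```python
def balance_loads(table_sizes, workers=2):
    """Greedily assign tables to parallel load sessions.

    table_sizes: dict mapping table name -> approximate row count.
    workers:     number of parallel load sessions (e.g. PuTTY windows).
    Returns one list of table names per session, roughly balanced by
    total rows, with the biggest tables placed first.
    """
    bins = [[] for _ in range(workers)]
    totals = [0] * workers
    for name, size in sorted(table_sizes.items(), key=lambda kv: -kv[1]):
        i = totals.index(min(totals))  # least-loaded session so far
        bins[i].append(name)
        totals[i] += size
    return bins

# Example: five tables split across two sessions by row count.
plan = balance_loads({"a": 12, "b": 8, "c": 5, "d": 3, "e": 2})
```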

You'd need to provide a great deal more information about the network architecture, the capabilities of the loading and database host, and location of the source FGDB before specific advice could be offered.

- V