Problems with EXPORTING a large shapefile containing a join

6099
14
Jump to solution
10-17-2013 12:56 PM
ElenaLopez
New Contributor III
Hi,

I have joined an excel table to a shapefile with multiple registers (more than 200,000) and I???m trying to export it as a new shapefile but it takes more than 4 hours to do it. Does anybody know another way to do it?

I have to repeat the process more that 16 times (using Model Builder) and I need to find a faster way to do it.

Any help will be much appreciated !!

Thank you.
Tags (2)
0 Kudos
1 Solution

Accepted Solutions
RichardFairhurst
MVP Honored Contributor
Hello,

I run the model using ADD JOIN instead of JOIN FIELD tool but the result is the same (i.e., it is slower than when I worked with shapefiles). The problem I think is the copy features tool (at the end of the process). It takes four hours to process this step. Any recomendation?

I send you a better version of the Model Builder:

[ATTACH=CONFIG]28510[/ATTACH]

Thank you very much.


You are using a Personal Geodatabase (.mdb) in your model, not a File Geodatabase (.gdb).  Personal Geodatabases do not perform well.  File Geodatabases are almost exclusively what I use.  Create a File Geodatabase and replace the Personal Geodatabase in the model with that.  Try the Copy Feature with that geodatabase type as the output.  It should take much less time.  Also create the File Geodatabase on a local drive, not a network drive if you want better performance.

What type is the field that you are joining on?  Some field types like long text field values can cause problems in joins.  Also, how many fields are in the feature class and table when combined?  Part of what is taking so much time is the time the join cursor is searching through the joined data to find a matching record.  You could try the Sort tool on the Feature Class and the Join Table and sort both of them on the join field before indexing and creating the join, which should improve performance.

If you only intend to keep the final shapefile and do not actually intend to use the File Geodatabase feature class, I think you can remove the Copy Feature step and go straight to the Feature Class to Feature Class step that outputs to a shapefile directly.

View solution in original post

14 Replies
RichardFairhurst
MVP Honored Contributor
Hi,

I have joined an excel table to a shapefile with multiple registers (more than 200,000) and I�??m trying to export it as a new shapefile but it takes more than 4 hours to do it. Does anybody know another way to do it?

I have to repeat the process more that 16 times (using Model Builder) and I need to find a faster way to do it.

Any help will be much appreciated !!

Thank you.


I personally have no use for either shapefiles or excel tables when it comes to geoprocessing.  I would convert them both into a file geodatabase, perform the join, export to a geodatabase feature class in the same geodatabase and then export to a shapefile.  If the geoprocessing occurs on a local workspace file geodatabase and not one installed on your network you will get additional performance gains.  I am sure it would cut down the time involved by hours.  There is no way to optimize Excel for a join and shapefiles are also poor performers compared to file geodatabase feature classes and tables.
0 Kudos
ElenaLopez
New Contributor III
Thanks for your answer. It seems very promising but ... I haven't worked with geodatabase files. Could you please explain me more in detail these steps ?? 

Thank you very much!!
0 Kudos
RichardFairhurst
MVP Honored Contributor
Thanks for your answer. It seems very promising but ... I haven't worked with geodatabase files. Could you please explain me more in detail these steps ?? 

Thank you very much!!


You can create a geodatabase for your project in ArcCatalog or Desktop's ArcCatalog window in to a local workspace.

In Model Builder use the Feature Class to Feature Class tool in the Conversion toolbox and the To Geodatabase toolset to convert the shapefile into the geodatabase by choosing it as your output workspace.

For Excel use the Table to Table tool to copy it into your new geodatabase, also in the Conversion toolbox and the Geodatabase toolset. 

Use the Add Attribute Index tool under the Data Management toolbox in the Indexes toolset on the Join fields in both the new feature class and the table to optimize the join performance. 

Use the Make Feature Layer tool on the converted shapefile feature class.  Use the Join tool with that layer as the input layer and the geodatabase table as the join source.

Use the Copy Features tool under the Data Management toolbox in the Features toolset to export the joined features to a new feature class. 

Then use the Feature Class to Feature Class tool in the Conversion toolbox and the To Geodatabase toolset to convert the geodatabase feature class to a shape file by choosing the ulitimate shapefile workspace as your output workspace and naming the file with .shp.
0 Kudos
ElenaLopez
New Contributor III
Hello,

Thanks for the explanation. I was able to run the model, but ... unfortunately is not running faster than before.

I send you the Model I have generated and I would really appreciate your feedback if you find any error.

[ATTACH=CONFIG]28443[/ATTACH].

Some specific questions I was not sured when I generate the model:
1. I use the tool Join Field and not Add field in the model. Is that correct ?
2. When I used the Copy Feature tool I saved the feature within the geodatabase folder. Is that correct ?
3. In the last step (Feature Class to Feature Class tool), I was not sured how I had to do it, so I saved the final shapefile in a folder, NOT in the geogatabase folder. Is that correct ???

Thank you very very much.
0 Kudos
RichardFairhurst
MVP Honored Contributor
Hello,

Thanks for the explanation. I was able to run the model, but ... unfortunately is not running faster than before.

I send you the Model I have generated and I would really appreciate your feedback if you find any error.

[ATTACH=CONFIG]28443[/ATTACH].

Some specific questions I was not sured when I generate the model:
1. I use the tool Join Field and not Add field in the model. Is that correct ?
2. When I used the Copy Feature tool I saved the feature within the geodatabase folder. Is that correct ?
3. In the last step (Feature Class to Feature Class tool), I was not sured how I had to do it, so I saved the final shapefile in a folder, NOT in the geogatabase folder. Is that correct ???

Thank you very very much.


Here are my answers to your questions.

1.  I have never used Join Field after trying it once and getting horrible performance.  I always use the Add Join and Remove Join tools pair in the Data Management Tools Toolbox in the Joins Toolset.  Add Join works in memory and is fast.  Join Field directly alters your actual input feature class itself and writes the changes to disk, which performs horribly.

2.  Correct.  Copy Feature after using Add Join (not Join Field).

3.  Correct.  The last step is to output to shapefile.  That step is only necessary if you intend to keep using shapefiles.  Personally, if I could use a file geodatabase only I would never go back to a shapefile.

I can't read your model.  It is too small and low resolution.  Please confirm that you have converted the shapefile and excel spreadsheet to a file geodatabase.  Also confirm that before you do the Add Join that you have added an Attribute Index to the join attribute in both the source and the join tables.  I could not tell if you did that in your model.
0 Kudos
ElenaLopez
New Contributor III
Hello,

I run the model using ADD JOIN instead of JOIN FIELD tool but the result is the same (i.e., it is slower than when I worked with shapefiles). The problem I think is the copy features tool (at the end of the process). It takes four hours to process this step. Any recomendation?

I send you a better version of the Model Builder:

[ATTACH=CONFIG]28510[/ATTACH]

Thank you very much.
0 Kudos
RichardFairhurst
MVP Honored Contributor
Hello,

I run the model using ADD JOIN instead of JOIN FIELD tool but the result is the same (i.e., it is slower than when I worked with shapefiles). The problem I think is the copy features tool (at the end of the process). It takes four hours to process this step. Any recomendation?

I send you a better version of the Model Builder:

[ATTACH=CONFIG]28510[/ATTACH]

Thank you very much.


You are using a Personal Geodatabase (.mdb) in your model, not a File Geodatabase (.gdb).  Personal Geodatabases do not perform well.  File Geodatabases are almost exclusively what I use.  Create a File Geodatabase and replace the Personal Geodatabase in the model with that.  Try the Copy Feature with that geodatabase type as the output.  It should take much less time.  Also create the File Geodatabase on a local drive, not a network drive if you want better performance.

What type is the field that you are joining on?  Some field types like long text field values can cause problems in joins.  Also, how many fields are in the feature class and table when combined?  Part of what is taking so much time is the time the join cursor is searching through the joined data to find a matching record.  You could try the Sort tool on the Feature Class and the Join Table and sort both of them on the join field before indexing and creating the join, which should improve performance.

If you only intend to keep the final shapefile and do not actually intend to use the File Geodatabase feature class, I think you can remove the Copy Feature step and go straight to the Feature Class to Feature Class step that outputs to a shapefile directly.
ElenaLopez
New Contributor III
Yes, that was it. I created a File Geodatabase and now it only takes 5 minutes to conduct the join and create the shapefile. That is great !!

Thank you very much for all your help !
0 Kudos
RichardFairhurst
MVP Honored Contributor
Yes, that was it. I created a File Geodatabase and now it only takes 5 minutes to conduct the join and create the shapefile. That is great !!

Thank you very much for all your help !


That is a lot more like it.  So now you see why I use File Geodatabases almost exclusively.
0 Kudos