I am going to try not to fill this post with expletives.
Preliminaries: ArcGIS 10.3.1 running on the latest (2015) MacBook Pro using Parallels, Windows 7, 8 GB memory, no other user-level processes running.
Using: Arc Toolbox => Analysis Tools => Overlay => Spatial Join to join points (715 features) to polygons (469 features).
The two datasets have the same coordinate system, although the point dataset has a join, and the polygon dataset
has a huge number of fields because it was originally a Business Analyst dataset, albeit pared down to
the geographic region of interest.
The tool fires up. I enter the name of the target feature dataset (polygons), then the name of the join features... 205 seconds later the tool acknowledges the name of the dataset (no work yet, mind you; just responding that I entered the name of a dataset).
Then I wish to join 1:1 and to merge fields of the point dataset, so I delete all the fields which cannot meaningfully be joined.
Each deletion takes about 10 seconds to register, for a net total of 270 seconds.
Then the remaining 4 fields I tell it to aggregate as a mean. Another 20-odd seconds.
I enter the name of the output dataset. Incredible: it seems to respond almost immediately; what can be wrong?
I have now spent 495 seconds just to enter the data.
I now hit the run button. It executes, I get the usual progress dialogue at the bottom of the screen, and within a few seconds the tool finishes,
claiming to have run without error. Only there is no output dataset created.
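For readers who have not used the tool: a 1:1 spatial join with a "mean" merge rule gives each target polygon the average of a point field over the points that fall inside it. Here is a minimal pure-Python sketch of that computation, using axis-aligned rectangles as stand-in polygons; all names and values are invented for illustration, and this is not the arcpy API.

```python
# Illustration of a 1:1 spatial join with a "mean" merge rule:
# each polygon receives the average of a value over the points it
# contains. Rectangles stand in for real polygon geometries.

def point_in_rect(pt, rect):
    """rect = (xmin, ymin, xmax, ymax); pt = (x, y)."""
    x, y = pt
    xmin, ymin, xmax, ymax = rect
    return xmin <= x <= xmax and ymin <= y <= ymax

def spatial_join_mean(polygons, points):
    """polygons: {poly_id: rect}; points: [(pt, value), ...].
    Returns {poly_id: mean of values of contained points, or None}."""
    out = {}
    for pid, rect in polygons.items():
        vals = [v for pt, v in points if point_in_rect(pt, rect)]
        out[pid] = sum(vals) / len(vals) if vals else None
    return out

polygons = {"A": (0, 0, 10, 10), "B": (10, 0, 20, 10)}
points = [((2, 2), 4.0), ((3, 3), 6.0), ((15, 5), 10.0)]
print(spatial_join_mean(polygons, points))  # {'A': 5.0, 'B': 10.0}
```

The real tool does the same thing with true polygon containment tests; the point is simply that the computation itself is small for 715 points and 469 polygons, which makes the observed sluggishness hard to excuse.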
There is a word for software like this, which decorum dictates that I had best not use.
I have had the misfortune of being required to use ArcGIS for years;
I have lost count of the number of bugs I have personally reported; I have grown
old looking at a spinning blue wheel while the product seemingly does nothing.
That the tool did not work is bad enough, but what kind of software
engineering underlies a user interface that takes 10 minutes to specify
a handful of data items?
The behavior of this tool is replicable.
There, I got to the end without using any 4 letter words.
Now I can go out and relieve my feelings by giving the cat a good kick.
Haha... love it ... some tips that I use
Thanks for venting
If you are willing to share the data, I would like to try it and see if I get the same performance that you are experiencing. Although the outcome may be the same, perhaps "a trouble shared is a trouble halved" could save the cat...
Chris
The data is not by any means the crown jewels, so I am certainly willing
to share it.
But I am not sure how much you need.
Do you need the BA2015 datasets, or do you have those?
I am not very familiar with packaging up
a subset of data from a file geodatabase.
I would certainly be interested in what you find.
I will try to figure out how to package this stuff up.
Rob
PS. Cat OK : did not even expend one of his 9 lives.
As Dan mentioned before, that is indeed a large set of attributes. One might expect this not to have such a big impact on the spatial operations; since it does, please try what Dan Patterson suggested and see if that speeds up the process. Do generate attribute indexes on the fields used for the join when you join the attributes back.
If you want to verify whether the join with all the attributes takes as long on my system as it did on yours, you could simply select the feature classes (in the Catalog window) and paste them into a new file geodatabase. If one of the datasets covers a much larger area than the other, you could do a select by location (draw a rectangle with the selection tool) and export the feature class to the new file geodatabase. Zip the file geodatabase (including the .gdb folder name) and see if you can attach it to the thread.
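The zipping step can also be scripted, since a file geodatabase is just a folder ending in `.gdb`. A minimal sketch with Python's standard library, making sure the archive contains the `.gdb` folder name itself as requested above (all paths here are placeholders, not real data):

```python
# Zip a file geodatabase folder so the archive keeps the .gdb
# folder name at its top level. Paths are placeholders.
import os
import shutil
import tempfile

def zip_gdb(gdb_path, archive_basename):
    """Create <archive_basename>.zip containing the .gdb folder itself."""
    parent = os.path.dirname(os.path.abspath(gdb_path))
    folder = os.path.basename(gdb_path)
    return shutil.make_archive(archive_basename, "zip",
                               root_dir=parent, base_dir=folder)

# Example with a throwaway directory standing in for a real geodatabase:
tmp = tempfile.mkdtemp()
gdb = os.path.join(tmp, "demo.gdb")
os.makedirs(gdb)
open(os.path.join(gdb, "gdb"), "w").close()  # dummy content file
archive = zip_gdb(gdb, os.path.join(tmp, "demo_gdb"))
print(archive)  # prints the full path to the created demo_gdb.zip
```

Passing `base_dir=folder` is what keeps the `demo.gdb/` prefix inside the archive, so it unzips back to a valid geodatabase folder rather than loose files.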
BTW: glad the cat is OK...
Dan and Xander
Thanks for your replies. Many of the things Dan had recommended were already in place:
same projection; local machine; polygons pertinent to geography.
Now, what is true is that these Business Analyst datasets have many fields (~2000 in BA basic).
But surely, when I run one of these Python tools and simply enter the name of a dataset,
it should not take 3 minutes simply to display a list of fields, should it? I mean, there is
no real computation happening yet. There is some structure containing the field data and, paf!,
displaying it in a dialogue box ought to be instantaneous, shouldn't it?
OK, speed aside, how is it that the tool appears to run, claims to have completed successfully,
and yet does not output the new feature class I specified? If the tool is overwhelmed by computational
complexity, should it not just report that and quit with an error?
I think possibly the way out of my dilemma is along the lines Dan suggests:
1. Use the BA layers (tracts, block groups, zip codes..) to create just polygons with no data.
Store those polygons in some file geodatabase.
2. Join the data I am interested in with those polygons using a suitable unique identifier
(tract ID, block group ID, zip code, ..)
3. Then create my own custom BDS layer as outlined in this document:
https://www.esri.com/library/whitepapers/pdfs/importing-and-using-your-own-data.pdf
Another advantage of that is that one can use the BA reporting capabilities.
I am guessing that one needs these BDS layers to be able to use all the data efficiently.
Very likely normal feature classes were never intended to have so many fields.
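Step 2 of the plan above is just an attribute join on a shared key. Conceptually it looks like the following sketch, illustrated with plain Python dicts; the field names (`GEOID`, `median_income`) and values are invented for the example:

```python
# Sketch of joining a pared-down attribute table onto bare polygon
# records via a shared unique identifier (a tract/block-group ID).
# All field names and values are invented for illustration.

def attribute_join(geometries, attributes, key="GEOID"):
    """geometries / attributes: lists of dicts sharing a key field.
    Returns a left join of attributes onto geometries."""
    lookup = {row[key]: row for row in attributes}
    joined = []
    for geom in geometries:
        row = dict(geom)                       # keep geometry fields
        row.update(lookup.get(geom[key], {}))  # add matching attributes
        joined.append(row)
    return joined

geometries = [{"GEOID": "06075", "shape": "polygon-1"},
              {"GEOID": "06081", "shape": "polygon-2"}]
attributes = [{"GEOID": "06075", "median_income": 112000},
              {"GEOID": "06081", "median_income": 118000}]
print(attribute_join(geometries, attributes))
```

The appeal of this shape of workflow is exactly what the plan says: the wide attribute table only ever touches the geometry through a small key lookup, instead of the geometry dragging ~2000 fields through every spatial operation.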
That is definitely a bloat of data. I would suggest you deal with the geometry in separate files when needed, then join the attributes over if they are used for further work. Keep us posted on whether your workflow improves.