World Geocoding Service Needs Improvement

12-27-2019 06:55 AM
RobertStevens
Occasional Contributor III

I use this service as, I hope, the most up-to-date way of geocoding street names and addresses. It could use some improvement.

1. Documentation
    The only documentation I can find that describes all the fields returned by this service is at:

Service output—ArcGIS REST API: World Geocoding Service | ArcGIS for Developers 

     This description does not correspond perfectly to the fields I see returned: I see some that are not documented (e.g. arc_*), and some that are documented but not returned (e.g. resultid). Oh, and BTW, the documentation has a comment to the effect that the user cannot rely on the number and names. Really? You have a service and you are just going to change it at will and allow me, the user, to figure it out when my programs stop working?

2. Why are there so many fields? Here is the list (names have been truncated to 10 characters because I used the Table to Table tool to download the data as a .csv so I could extract the field names, and that tool only allows 10 characters for a field name):

1: oid
2: loc_name
3: status
4: score
5: match_type
6: match_addr
7: longlabel
8: shortlabel
9: addr_type
10: type
11: placename
12: place_addr
13: phone
14: url
15: rank
16: addbldg
17: addnum
18: addnumfrom
19: addnumto
20: addrange
21: side
22: stpredir
23: stpretype
24: stname
25: sttype
26: stdir
27: bldgtype
28: bldgname
29: leveltype
30: levelname
31: unittype
32: unitname
33: subaddr
34: staddr
35: block
36: sector
37: nbrhd
38: district
39: city
40: metroarea
41: subregion
42: region
43: regionabbr
44: territory
45: zone
46: postal
47: postalext
48: country
49: langcode
50: distance
51: x
52: y
53: displayx
54: displayy
55: xmin
56: xmax
57: ymin
58: ymax
59: exinfo
60: arc_addres
61: arc_addr_1
62: arc_addr_2
63: arc_neighb
64: arc_city
65: arc_subreg
66: arc_region
67: arc_postal
68: arc_post_1
69: arc_countr

Yes, 69 fields! Most are empty in any data I receive, and some seem to be duplicates of others. For instance, what is the difference between arc_city and city?

3. Many of the fields have a width which seems almost to be a random number. And they are loooooong. E.g. match_addr has a width of 500! The consequence of this is that I have to spend endless amounts of time scrolling to the right to see my own data, which are the last handful of fields appended to the list above.

4. The input I submit has, not surprisingly, a field named city. When I get the results the *#!@! service has renamed my field to be city_1. Why? Don't rename my fields. Rename yours.

5. If you use a feature class created by the WGS in joins, you will get errors to the effect that a field name is a reserved word. So ESRI have selected a field name which is one of their own reserved words (I have made a post on this previously).

So: why is the documentation not correct? Why do you rename my fields? Why are there so many? Why can I not select only those fields I wish to see and save just those? Why are the field lengths so long?

9 Replies
BradNiemand
Esri Regular Contributor

Robert,

Let me answer some of your questions.

1. The output fields for the ArcGIS World Geocoding Service are documented at the service page you mentioned.  The ARC_ fields contain the values that were mapped when you originally geocoded the table, and the ArcMap documentation for that can be found here:

About geocoding a table of addresses—Help | ArcGIS Desktop 

ResultID is a field that is returned using the REST API but not when geocoding in Desktop.  It is used to "stitch" the original table onto the results from geocoding.
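As a rough illustration of the "stitching" idea, the sketch below joins results back to input rows by matching ResultID to the input OBJECTID. The function name `stitch_results` and the sample rows are hypothetical, and the response shape (a list of locations, each with an `attributes` dictionary) is assumed from the geocodeAddresses documentation.

```python
def stitch_results(input_rows, locations):
    """Attach each geocoding result to the input row whose OBJECTID equals its ResultID."""
    by_id = {row["OBJECTID"]: row for row in input_rows}
    stitched = []
    for loc in locations:
        attrs = loc.get("attributes", {})
        source = by_id.get(attrs.get("ResultID"))
        if source is None:
            continue  # no input row corresponds to this result
        # Copy the original row and add two result values next to it.
        stitched.append({**source,
                         "match_addr": attrs.get("Match_addr"),
                         "score": attrs.get("Score")})
    return stitched


# Hypothetical sample data; results may come back in a different order than the input.
input_rows = [{"OBJECTID": 1, "city": "Redlands"}, {"OBJECTID": 2, "city": "Boston"}]
locations = [
    {"attributes": {"ResultID": 2, "Match_addr": "Boston, Massachusetts", "Score": 100}},
    {"attributes": {"ResultID": 1, "Match_addr": "Redlands, California", "Score": 100}},
]
stitched = stitch_results(input_rows, locations)
```

Because the join is keyed on ResultID, the original `city` value keeps its own name and sits beside the result values without any renaming.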

As for your following statement

   "Oh, and BTW, the documentation has a comment to the effect that the user cannot rely on the number and names."

This is referring to the values in the fields that can change.  Data changes all the time so you can't base application logic on it.  The field names won't change but the values might.

2. We return a consistent set of all fields when geocoding so even empty fields will be returned.  When using the REST API directly, you can choose which fields will be returned by specifying the outFields parameter but we return all of them when geocoding using Desktop.  Information about the arc_ fields can be found in 1 above.
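A minimal sketch of what specifying outFields might look like when building the request parameters for a REST call. The helper name `build_geocode_params`, the sample address, and the placeholder token are illustrative assumptions; the output field names used here (Match_addr, Score, Addr_type) should be checked against the service output documentation.

```python
import json


def build_geocode_params(recordset, out_fields, token):
    """Build POST parameters that ask the service for only the listed output fields."""
    return {
        "addresses": json.dumps(recordset, ensure_ascii=False),
        "outFields": ",".join(out_fields),  # comma-separated list of field names
        "f": "json",
        "token": token,
    }


# Hypothetical input: one single-line address and a placeholder token.
recordset = {"records": [
    {"attributes": {"OBJECTID": 1, "SingleLine": "380 New York St, Redlands, CA"}}
]}
params = build_geocode_params(recordset, ["Match_addr", "Score", "Addr_type"], "<your token>")
```

The resulting dictionary can be passed as the form data of a POST request to the service's geocodeAddresses endpoint.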

3. The lengths of the fields that are returned are based on the data we have.  In some cases, in some countries, the values can be quite long so we need to take that into account.  ArcGIS Pro sets the default width to something smaller than the max so I would encourage you to try it out.

4. The fields in the output are a copy from the original table that are appended to the geocoding result, which is why their names are changed.  If you geocode in ArcGIS Pro, they will be named USER_<fieldName>, with a field alias that has the same field name as in the original table.

5.  What field was it that caused this issue?

RobertStevens
Occasional Contributor III

Brad

Thank you for this information. It is helpful.

To answer your item #5: the two fields that seem to cause a conflict are Type and Zone.

Please see my posting to Geonet at this URL:

https://community.esri.com/thread/225988-field-names-in-world-geocoder-match-reserved-words 

Quite some time has elapsed since I posted that question. I have not checked lately whether or not it continues to be a problem.

I do quite a bit of geocoding. Possibly I should do it using standalone programs and choose the fields I wish to be returned, as you suggest is possible in your posting. Do you have a code sample for that (using Python)? Using the World Geocoder in ArcMap becomes quite time consuming and is most inconvenient. The data returned has all these fields that are uninteresting (to me, and I daresay to most), and they occupy much space. To see one's original data one has to scroll many pages to the right. One can go into the layers and make the fields invisible (cumbersome, since that has to be done field by field), or use the "Delete Field" tool. This all takes so much time.

BradNiemand
Esri Regular Contributor

Robert,

You would need to make pure REST calls via Python if you want to go that route.  Victor Bhattacharyya could potentially help you with that because he has done this quite a bit.

Brad

VictorBhattacharyya
Esri Contributor

Hi Robert,

You'll want to make REST calls against the geocodeAddresses REST endpoint for the geocoding service. Here's some documentation about using the geocodeAddresses REST endpoint: geocodeAddresses—ArcGIS REST API: World Geocoding Service | ArcGIS for Developers 

If you plan to do your geocoding against the world service, you'll need to get a token first, and pass that token in with your request. The user credentials associated with this token will be billed for the results of batch geocoding. If you are using your own geocoding service that you have stood up, you won't be charged for the results, but if the service is secure, you'll still need to pass a token.
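One way to get a token is the generateToken operation of the ArcGIS Online sharing API. The sketch below is an assumption-laden outline, not a tested client: the helper names are made up for illustration, the referer and expiration values are arbitrary, and it uses the standard library's urllib rather than requests so that it has no third-party dependency. Check the generateToken documentation for the parameters your organization requires.

```python
import json
import urllib.parse
import urllib.request

# Assumed ArcGIS Online token endpoint; a Portal or Enterprise deployment uses its own URL.
TOKEN_URL = "https://www.arcgis.com/sharing/rest/generateToken"


def build_token_payload(username, password,
                        referer="https://www.arcgis.com", expiration_minutes=60):
    """Form parameters for a referer-based token request."""
    return {
        "username": username,
        "password": password,
        "referer": referer,
        "expiration": expiration_minutes,
        "f": "json",
    }


def extract_token(response_json):
    """Return the token, or raise if the service reported an error instead."""
    if "error" in response_json:
        raise RuntimeError("Token request failed: {}".format(response_json["error"]))
    return response_json["token"]


def request_token(username, password):
    """POST the credentials and return the token string (makes a network call)."""
    data = urllib.parse.urlencode(build_token_payload(username, password)).encode("utf-8")
    with urllib.request.urlopen(TOKEN_URL, data=data) as resp:
        return extract_token(json.load(resp))
```

Calling `request_token("my_user", "my_password")` with real credentials would return the string to pass as the `token` parameter of the geocoding request.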

By default, the geocodeAddresses operation returns ALL of the output fields that the service has turned on. It will be your job in post-processing to keep the ones you want, and to forget about the rest.

You'll want to copy the format for passing records that's described in the geocodeAddresses REST documentation above. Once you have the correct dictionary format for your input records in a recordset variable, then, you'll want to make the actual request.
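For reference, a small sketch of turning table rows into that record format. The helper `build_recordset`, the sample row, and the column names in `field_map` are hypothetical; the service-side names (Address, City, Region, Postal) are assumed from the geocodeAddresses documentation and should be verified there.

```python
import json


def build_recordset(rows, field_map):
    """Convert rows (dicts) into {"records": [{"attributes": {...}}, ...]}.

    field_map maps a service input field name to the column name in your own table.
    """
    records = []
    for object_id, row in enumerate(rows, start=1):
        attributes = {"OBJECTID": object_id}  # one unique ID per record
        for service_field, source_field in field_map.items():
            attributes[service_field] = row.get(source_field, "")
        records.append({"attributes": attributes})
    return {"records": records}


# Hypothetical input table with its own column names.
rows = [{"street": "380 New York St", "town": "Redlands", "state": "CA", "zip": "92373"}]
field_map = {"Address": "street", "City": "town", "Region": "state", "Postal": "zip"}

recordset = build_recordset(rows, field_map)
addresses_param = json.dumps(recordset, ensure_ascii=False)
```

The `recordset` variable here plays the same role as the one passed to `json.dumps` in the request.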

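import json
import requests

# recordset, token and geocoding_service_url are assumed to be defined already.
values = {
      "addresses": json.dumps(recordset, ensure_ascii=False),
      "f": "json",
      "token": token
}

response = requests.post(geocoding_service_url, data=values)
json_response = response.json()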

Once you get the output dictionary/JSON of the GeocodeAddresses call, you can filter and only add certain values to your output resultant feature class in Python. So, after you get your REST response, you'll want to do something like:

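import arcpy

# some_gdb, name_of_feature_class, sr, path_to_fc and list_of_fields are placeholders for your own values.
arcpy.management.CreateFeatureclass(some_gdb, name_of_feature_class, "POINT", spatial_reference=sr)
arcpy.management.AddFields(path_to_fc, list_of_fields)

# AddFields takes [name, type, ...] entries; the cursor needs just the names.
field_names = [field[0] for field in list_of_fields]
with arcpy.da.InsertCursor(path_to_fc, field_names + ['SHAPE@XY']) as cursor:
    for result in json_response["locations"]:
        # Keep only the attributes named in field_names (assumed to match the service's names); skip the rest.
        list_of_attributes = [result["attributes"].get(name) for name in field_names]
        cursor.insertRow(list_of_attributes + [(result["location"]["x"], result["location"]["y"])])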
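The filtering step can also be tried without arcpy. This is a minimal sketch under the assumption that the response has a `locations` list whose items carry `attributes` and, for matched records, a `location` with `x` and `y`. The function name `rows_from_response`, the `WANTED` list, and the sample response are made up for illustration.

```python
WANTED = ["Match_addr", "Score", "Addr_type"]  # the only attributes to keep


def rows_from_response(json_response, wanted_fields):
    """Return one row per matched location: the wanted attribute values plus an (x, y) tuple."""
    rows = []
    for loc in json_response.get("locations", []):
        point = loc.get("location")
        if not point:
            continue  # no coordinates for this record, so there is nothing to insert
        attrs = loc.get("attributes", {})
        values = [attrs.get(name) for name in wanted_fields]
        rows.append(values + [(point["x"], point["y"])])
    return rows


# Hypothetical response fragment: one matched record and one without a location.
sample = {"locations": [
    {"location": {"x": -117.19, "y": 34.05},
     "attributes": {"ResultID": 1, "Match_addr": "380 New York St", "Score": 100,
                    "Addr_type": "PointAddress", "Phone": ""}},
    {"attributes": {"ResultID": 2, "Score": 0}},
]}
rows = rows_from_response(sample, WANTED)
```

Each row is in the order an insert cursor opened on `WANTED + ['SHAPE@XY']` would expect.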

JaredPilbeam2
MVP Regular Contributor

Hi @VictorBhattacharyya,

Are you aware of any tutorials on this?

VictorBhattacharyya
Esri Contributor

Hi Jared,

There are no tutorials, but the code sample I shared above is a good start.

Also, something that might be of use to everyone in this thread, including @RobertStevens: starting at Pro 2.7, we added the ability in the GeocodeAddresses tool to output only certain fields. You have the ALL, MINIMAL and LOCATION_ONLY options, which provide more control over which fields are in the output of your geocoding job.

BruceHarold
Esri Regular Contributor

Another option is to implement this approach, which separates your business tables from related tables that hold the system-managed fields:

https://pm.maps.arcgis.com/home/item.html?id=50e74e318a5e4e17b1cc2a06258daba1 

BruceHarold
Esri Regular Contributor

Hi, this raises the question of whether Geocode Addresses should return a feature class and a separate related table with the match details keyed on ResultID.  Then the output features would not be so overloaded.  If you agree, then please create an Ideas item and we can monitor votes on it.
