Using NENA Data To Create Geocoding Locators - No ETL Required!

1763
4
10-03-2019 01:44 PM
BruceHarold
Esri Regular Contributor
2 4 1,763

The National Emergency Number Association promulgates GIS standards for datasets that support public safety operations in the USA.  A principal example is Civic Location Data Exchange Format (CLDXF).  Digging in further we can find a well defined data model for address points. The problem we're tackling in this blog is how to directly use data maintained in this schema to create ArcGIS geocoding locators without anyone having to construct complex ETL processes and copy data around repetitively.

The workflow requires your NENA data be maintained in an Enterprise Geodatabase, and there is a disclaimer - the full granularity of subaddress elements in the NENA schema is not supported.  At time of writing (Pro 2.4.1 release) only one pair of subaddress type & identifier values is supported, but the sample demonstrates how three pairs of type & identifier values can be handled, as at the Pro 2.5 release locators will support this many subaddress fields.  My test data (the counties of Kings, Queens, Nassau and Suffolk in New York, thanks to NYS GIS Clearing House) has units (apartments etc.), levels (floors, basements etc.) and building units (rooms, annexes etc.).  Building name is usable too, and seat in the room and additional location data is retained and may be output by a locator but not used for searching.

Before we go further, why doesn't Esri just design the Create Locator tool to accept all the NENA fields?  The short answer is we have to have internationally applicable parameters so it would overload the tool.

I said 'no ETL required'.  Well hopefully that is true for you, and for my test data it would be if I had access to the database, but what I often see in the wild is things like empty strings and blank values in character fields, so I like to enforce proper null values and fix invalid date values with a bit of processing with Data Interoperability extension.  In the screen captures below (click on images to enlarge) I'm making sure empty data is null as I import my test data to my EGDB.

The only other thing I did with my ETL was rename fields to lower case (what PostgreSQL likes, my EGDB platform) and make a couple of fields wider (pretype, posttype) in case my concatenations overflow those fields.  Make sure domains don't bite you too, you'll be adding new values to pretype and posttype fields.  Having said that though, I see in the data view of my layer that the character fields have arbitrary widths of 255 characters, so I'm not sure if the input field definitions are honored, or that views have any concept of domains, this is something that might be platform dependent.  Anyway, that gets me to what should be your starting point.  I have NENA-schema address points in my EGDB and I want to make a locator.

The secret sauce here is creating a view in my DBMS that performs all the manipulations necessary to rename, cast, substring and concatenate data into a schema directly usable in ArcGIS Pro as a feature layer input to the Create Locator geoprocessing tool, using the Point Address data role.

I seldom descend into SQL to this depth so to develop my view I built it up in pgAdmin (you'll need whatever SQL authoring tool comes with your DBMS), going field by field and inspecting the result in Pro as I went.  Tip:  you can recreate your view in pgAdmin and leave it in Pro's table of contents and just reset the layer source each time you want to view it - it will refresh in the map.

The blog download has the pgAdmin SQL source - esri_view.sql - and you can inspect the comments to understand the logic.  Basically the fields specific to NENA that cannot be mapped to Point Address role inputs have their values passed into other fields.  Fields combining type & identifier values are parsed into separate fields for each.  The SQL will need to be ported to your environment, but its pretty standard stuff.

If you are a SQL wizard and can go straight to a SELECT statement then you could use the Create Database View tool and input the view definition.  The edited source (no comments in it) is the file test_view.sql in the download.  No prizes for user interface design but it works:

Having created the view, add it to your map and specify the ObjectID field as the unique identifier:

Let it index and you have your (dynamic) view of NENA data in your map as a feature layer:

You can see why I had to widen the type fields, check out '1375 Sunrise Hwy Westbound Service Road, Islip, NY, 11706'

Anyway, run Create Locator (hard to make an exciting graphic but hopefully useful):

arcpy.geocoding.CreateLocator("USA", "nena.sde.esri_view PointAddress", @"""PointAddress.ADDRESS_JOIN_ID 'nena.sde.esri_view'.address_id"";""PointAddress.HOUSE_NUMBER 'nena.sde.esri_view'.house_number"";""PointAddress.BUILDING_NAME 'nena.sde.esri_view'.building_name"";""PointAddress.STREET_NAME_JOIN_ID 'nena.sde.esri_view'.street_id"";""PointAddress.STREET_PREFIX_DIR 'nena.sde.esri_view'.prefix_direction"";""PointAddress.STREET_PREFIX_TYPE 'nena.sde.esri_view'.prefix_type"";""PointAddress.STREET_NAME 'nena.sde.esri_view'.street_name"";""PointAddress.STREET_SUFFIX_TYPE 'nena.sde.esri_view'.suffix_type"";""PointAddress.STREET_SUFFIX_DIR 'nena.sde.esri_view'.suffix_direction"";""PointAddress.SUB_ADDRESS_UNIT 'nena.sde.esri_view'.unit"";""PointAddress.SUB_ADDRESS_UNIT_TYPE 'nena.sde.esri_view'.unit_type"";""PointAddress.NEIGHBORHOOD 'nena.sde.esri_view'.neighborhood"";""PointAddress.CITY 'nena.sde.esri_view'.city"";""PointAddress.METRO_AREA 'nena.sde.esri_view'.metro_area"";""PointAddress.SUBREGION 'nena.sde.esri_view'.county"";""PointAddress.REGION 'nena.sde.esri_view'.state"";""PointAddress.POSTAL 'nena.sde.esri_view'.zipcode"";""PointAddress.COUNTRY 'nena.sde.esri_view'.country""", r"C:\Work\Product_Management\Address_Management\Nena", "ENG", None, None, None)

Then geocode!

Units work:

285 Asharoken Ave, #1, Huntington, NY, 11768

Fancy house numbers work:

5 1/2 Locust Ave, Brookhaven, NY, 11790

Building names work:

Building 22A, John F Kennedy Airport, New York, NY, 11430

So there you have it, maintain your data in NENA compliance and use it to geocode.

But wait, there's more!  In response to the blog commentary around handling aliasing the download has been updated to add the SQL source esri_views.sql that creates an alternate city name table, used as below in Create Locator - see the Alternate Name Tables section:

Ignore the warning chip in the dialog capture, that just appears after locator creation to indicate you'll overwrite the output if you re-run the tool.

The wisdom of harvesting alternate city names from as many fields as i did can be debated, but hopefully you get the idea, the various NENA fields for zone values can be viewed suitably for use as alternate name roles.  In production, it would be more efficient to create an alternate city name table from centerline data and join to it on street_id.

Here is the view used as the alternate city name table:

The address with address_id = 'KIN0000001' is '463 Maspeth Ave, New York, NY, 11211'  Using the city alias 'Brooklyn' works with score = 100:

Additionally, I took a question off-line about maintaining all parts of addresses defined in the FGDC standard such as prefix and suffix address number parts, street name separator elements, pre-modifiers and post-modifiers.  If you want to output these elements when geocoding then define them as custom output fields for your locators.  This functionality is available in the tool as the last parameter, but you'll also need to supply source fields in the field map for each output.

I output seat and additional_location in my locator, which would let me work on candidates if that's what I needed.

4 Comments
TomNeer
New Contributor III

Thank you for the writeup.

So effectively, concatenate the fields down into something Esri's Locator can consume? Will this concatenation affect the quality of the scoring of the geocoder (e.g. "Main Street North" vs "Main Street North Extension")?

How do we resolve the issue of Postal City vs Incorporated City vs Unincorporated Community best? One of the issues the NENA standard resolves is the Postal Address vs Situs Address confusion. In some cases, their Situs Address falls under a different Postal Code but they use another Post Office. Set up a composite locator duplicate locators for both the Situs Address (using Incorporate City or Unincorporated Community and State) and Postal Address (using Postal Community and Postal Code) for both the RoadCenterlines and Site/StructureAddressPoints?

What is the best scenario for road centerlines along international borders? Do we hard code this value?

BruceHarold
Esri Regular Contributor

Hi Tom, thanks for the feedback!

Scoring is a matter of agreement between the input address and the reference data, in itself moving values around input fields will not affect scoring.

In our new geocoding engine, all input fields can have alternate value tables, so allowing alternate city or postcode values for addresses will work provided you create the relevant lookup table and build it into your locator.  There isn't any need to build composite locators for this sort of logic any more, or to include multiple geometry types, new locators ingest everything at once.

Looking at the data, say for alternate city names from the incorporated/unincorporated municipalities, it would be simple to generate an alternate name table from the distinct pairs of names, but if you did then you would allow alternates for cities that for the given address are actually clear across the other side of the city.  You could build an alternate city table keyed on individual address ID, it would be a bigger table but give better results.

For neighborhood, district, postcode and city values, a proximity-based 'alias' is also automatically applied from a dissolve of the input points or street segments.

I haven't tackled centerline data yet, I just wanted to show a pattern for how to consume NENA data.  I expect a small industry to spring up around this and there will be plenty of collaborators on details.

TomNeer
New Contributor III

Bruce, Thanks for the quick response. Yes, we are just starting to work with migrating clients to NG9-1-1 schema. We are looking at ways to best integrate the standard with other Esri schema such as LGIM and with the Locator. Based on your post, we are going back to refactor our schema again so both ETL and SQL views do not need to be utilized. Lots going on here. Lots of unknowns. Honestly, haven't mucked around under the hood of ArcGIS Locators since 9.x. I remember it not being much fun...

BruceHarold
Esri Regular Contributor

Tom, I updated the download to include the SQL source for an additional view that can be used in an alternate city name role.  I probably went overboard on pulling names from available fields but you'll get the idea.  Also, it would probably be a better design to create a city name alias table from NENA street centerlines and join to it from the points via street_id and not like I did from address_id as it blows out the locator size unnecessarily.