Using NENA Data To Create Geocoding Locators - No ETL Required!

BruceHarold · ‎10-03-2019

The National Emergency Number Association promulgates GIS standards for datasets that support public safety operations in the USA. A principal example is Civic Location Data Exchange Format (CLDXF). Digging in further we can find a well defined data model for address points. The problem we're tackling in this blog is how to directly use data maintained in this schema to create ArcGIS geocoding locators without anyone having to construct complex ETL processes and copy data around repetitively.

The workflow requires your NENA data be maintained in an Enterprise Geodatabase, and there is a disclaimer - the full granularity of subaddress elements in the NENA schema is not supported. At time of writing (Pro 2.4.1 release) only one pair of subaddress type & identifier values is supported, but the sample demonstrates how three pairs of type & identifier values can be handled, as at the Pro 2.5 release locators will support this many subaddress fields. My test data (the counties of Kings, Queens, Nassau and Suffolk in New York, thanks to NYS GIS Clearing House) has units (apartments etc.), levels (floors, basements etc.) and building units (rooms, annexes etc.). Building name is usable too, and seat in the room and additional location data is retained and may be output by a locator but not used for searching.

Before we go further, why doesn't Esri just design the Create Locator tool to accept all the NENA fields? The short answer is we have to have internationally applicable parameters so it would overload the tool.

I said 'no ETL required'. Well hopefully that is true for you, and for my test data it would be if I had access to the database, but what I often see in the wild is things like empty strings and blank values in character fields, so I like to enforce proper null values and fix invalid date values with a bit of processing with Data Interoperability extension. In the screen captures below (click on images to enlarge) I'm making sure empty data is null as I import my test data to my EGDB.

The only other thing I did with my ETL was rename fields to lower case (what PostgreSQL likes, my EGDB platform) and make a couple of fields wider (pretype, posttype) in case my concatenations overflow those fields. Make sure domains don't bite you too, you'll be adding new values to pretype and posttype fields. Having said that though, I see in the data view of my layer that the character fields have arbitrary widths of 255 characters, so I'm not sure if the input field definitions are honored, or that views have any concept of domains, this is something that might be platform dependent. Anyway, that gets me to what should be your starting point. I have NENA-schema address points in my EGDB and I want to make a locator.

The secret sauce here is creating a view in my DBMS that performs all the manipulations necessary to rename, cast, substring and concatenate data into a schema directly usable in ArcGIS Pro as a feature layer input to the Create Locator geoprocessing tool, using the Point Address data role.

I seldom descend into SQL to this depth so to develop my view I built it up in pgAdmin (you'll need whatever SQL authoring tool comes with your DBMS), going field by field and inspecting the result in Pro as I went. Tip: you can recreate your view in pgAdmin and leave it in Pro's table of contents and just reset the layer source each time you want to view it - it will refresh in the map.

The blog download has the pgAdmin SQL source - esri_view.sql - and you can inspect the comments to understand the logic. Basically the fields specific to NENA that cannot be mapped to Point Address role inputs have their values passed into other fields. Fields combining type & identifier values are parsed into separate fields for each. The SQL will need to be ported to your environment, but its pretty standard stuff.

If you are a SQL wizard and can go straight to a SELECT statement then you could use the Create Database View tool and input the view definition. The edited source (no comments in it) is the file test_view.sql in the download. No prizes for user interface design but it works:

Having created the view, add it to your map and specify the ObjectID field as the unique identifier:

Let it index and you have your (dynamic) view of NENA data in your map as a feature layer:

You can see why I had to widen the type fields, check out '1375 Sunrise Hwy Westbound Service Road, Islip, NY, 11706'

Anyway, run Create Locator (hard to make an exciting graphic but hopefully useful):

arcpy.geocoding.CreateLocator("USA", "nena.sde.esri_view PointAddress", @"""PointAddress.ADDRESS_JOIN_ID 'nena.sde.esri_view'.address_id"";""PointAddress.HOUSE_NUMBER 'nena.sde.esri_view'.house_number"";""PointAddress.BUILDING_NAME 'nena.sde.esri_view'.building_name"";""PointAddress.STREET_NAME_JOIN_ID 'nena.sde.esri_view'.street_id"";""PointAddress.STREET_PREFIX_DIR 'nena.sde.esri_view'.prefix_direction"";""PointAddress.STREET_PREFIX_TYPE 'nena.sde.esri_view'.prefix_type"";""PointAddress.STREET_NAME 'nena.sde.esri_view'.street_name"";""PointAddress.STREET_SUFFIX_TYPE 'nena.sde.esri_view'.suffix_type"";""PointAddress.STREET_SUFFIX_DIR 'nena.sde.esri_view'.suffix_direction"";""PointAddress.SUB_ADDRESS_UNIT 'nena.sde.esri_view'.unit"";""PointAddress.SUB_ADDRESS_UNIT_TYPE 'nena.sde.esri_view'.unit_type"";""PointAddress.NEIGHBORHOOD 'nena.sde.esri_view'.neighborhood"";""PointAddress.CITY 'nena.sde.esri_view'.city"";""PointAddress.METRO_AREA 'nena.sde.esri_view'.metro_area"";""PointAddress.SUBREGION 'nena.sde.esri_view'.county"";""PointAddress.REGION 'nena.sde.esri_view'.state"";""PointAddress.POSTAL 'nena.sde.esri_view'.zipcode"";""PointAddress.COUNTRY 'nena.sde.esri_view'.country""", r"C:\Work\Product_Management\Address_Management\Nena", "ENG", None, None, None)

Then geocode!

Units work:

285 Asharoken Ave, #1, Huntington, NY, 11768

Fancy house numbers work:

5 1/2 Locust Ave, Brookhaven, NY, 11790

Building names work:

Building 22A, John F Kennedy Airport, New York, NY, 11430

So there you have it, maintain your data in NENA compliance and use it to geocode.

But wait, there's more! In response to the blog commentary around handling aliasing the download has been updated to add the SQL source esri_views.sql that creates an alternate city name table, used as below in Create Locator - see the Alternate Name Tables section:

Ignore the warning chip in the dialog capture, that just appears after locator creation to indicate you'll overwrite the output if you re-run the tool.

The wisdom of harvesting alternate city names from as many fields as i did can be debated, but hopefully you get the idea, the various NENA fields for zone values can be viewed suitably for use as alternate name roles. In production, it would be more efficient to create an alternate city name table from centerline data and join to it on street_id.

Here is the view used as the alternate city name table:

The address with address_id = 'KIN0000001' is '463 Maspeth Ave, New York, NY, 11211' Using the city alias 'Brooklyn' works with score = 100:

Additionally, I took a question off-line about maintaining all parts of addresses defined in the FGDC standard such as prefix and suffix address number parts, street name separator elements, pre-modifiers and post-modifiers. If you want to output these elements when geocoding then define them as custom output fields for your locators. This functionality is available in the tool as the last parameter, but you'll also need to supply source fields in the field map for each output.

I output seat and additional_location in my locator, which would let me work on candidates if that's what I needed.