
An Overture Encore! Addresses Theme Mapping And Geocoding (Alpha)

4 weeks ago
BruceHarold
Esri Frequent Contributor

Building on earlier work using an ArcGIS Pro notebook to ETL Overture Maps Foundation data into ArcGIS, this post takes a look at the Overture Maps Foundation's Addresses theme, as both a map-ready layer and a subaddress-capable geocoding locator. At the time of writing the data is in alpha release, but you can take a look with the ArcGIS Pro notebook (in the post download), which is configured to extract the address points in California, USA - a little over 14 million features. Here they are, with metadata also created by the notebook:

14 million address points in California

There are many uses for the address points (map layer, near features, join features, geometry source for editing...), but one I want to highlight, and which is mentioned by Overture as a target workflow, is geocoding - turning address details into map locations, like you see here for the address "1000 Pine Ave Unit 109 Redlands CA 92373".

Geocoding with Overture Addresses

It always seems like magic that you can convert a local address dialect into a coordinate. While there are some country "gaps" in the current Addresses theme, where a country's data is populated it is complete and maintained monthly. Let's see how to access the data!

Note: Esri's geocoding products, such as the ArcGIS World Geocoding Service, may contain the same reference data as Overture's Addresses theme.

I'm using an ArcGIS Pro 3.5 notebook with only the default runtime modules, including DuckDB. Overture offers the data as GeoParquet files in S3 hive storage, meaning the single logical addresses dataset exists as an arbitrary number of individual parquet files that you specify with a glob (wildcard) path, as in this SQL sent to DuckDB:

select * from read_parquet('s3://overturemaps-us-west-2/release/2025-09-24.0/theme=addresses/type=address/*.parquet',filename=false, hive_partitioning=1)

The notebook figures out the path to the latest data automatically.
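How the notebook discovers the latest release is not shown in this post, so the following is only a minimal sketch of one way it could be done, assuming you already have a list of release labels in the `YYYY-MM-DD.N` form seen in the path above. The function names `latest_release` and `addresses_glob` are hypothetical, as are the example labels in the usage note.

```python
import re

# Bucket prefix taken from the SQL shown in the post.
BUCKET_PREFIX = "s3://overturemaps-us-west-2/release"

# Release labels are assumed to look like '2025-09-24.0'.
_RELEASE_PATTERN = re.compile(r"^(\d{4})-(\d{2})-(\d{2})\.(\d+)$")


def latest_release(releases):
    """Return the newest label from an iterable of release labels.

    Labels that do not match the assumed pattern are ignored.
    """
    parsed = []
    for name in releases:
        match = _RELEASE_PATTERN.match(name)
        if match:
            # Compare numerically on (year, month, day, revision).
            key = tuple(int(group) for group in match.groups())
            parsed.append((key, name))
    if not parsed:
        raise ValueError("no release labels recognised")
    return max(parsed)[1]


def addresses_glob(release):
    """Build the glob path for the address parquet files of one release."""
    return f"{BUCKET_PREFIX}/{release}/theme=addresses/type=address/*.parquet"
```

For example, `addresses_glob(latest_release(["2025-08-20.0", "2025-09-24.0"]))` yields the same path used in the SQL above. Obtaining the list of labels (for instance by listing the bucket) is left out, since that step depends on how you choose to access S3.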

As we're interested in only a subset of the available addresses data, we supply a "where" clause in the notebook that filters which records to read. Here is mine to get only Californian data:

where country = 'US' and address_levels[1].value = 'CA' and number is not null and street is not null

You'll notice one term that queries a struct column, address_levels, which is a 1-based array of up to three zone values in descending size, like province, city, neighborhood.  This will vary per country and is for you to figure out.  Few countries use all three levels.  Indeed, some columns may not be populated at all, either because the data isn't available from the contributor or is unused in the country, for example postal_city or postcode.
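Putting the source path and the filter together, here is a rough sketch of how the full statement might be assembled and run outside the notebook. The helper names `build_query` and `count_matches` are hypothetical, and the `httpfs` and `s3_region` setup is an assumption about what a plain DuckDB session needs to read the public bucket; the notebook itself may configure this differently.

```python
# Path and filter copied from the post.
SOURCE = (
    "s3://overturemaps-us-west-2/release/2025-09-24.0/"
    "theme=addresses/type=address/*.parquet"
)
WHERE = (
    "country = 'US' and address_levels[1].value = 'CA' "
    "and number is not null and street is not null"
)


def build_query(source=SOURCE, where=WHERE, columns="*"):
    """Compose the read_parquet select statement with a where clause."""
    return (
        f"select {columns} from read_parquet('{source}', "
        f"filename=false, hive_partitioning=1) where {where}"
    )


def count_matches(query):
    """Count the rows a query returns. Needs network access and duckdb."""
    import duckdb  # imported here so build_query works without duckdb

    con = duckdb.connect()
    con.execute("INSTALL httpfs")  # extension for reading from S3
    con.execute("LOAD httpfs")
    con.execute("SET s3_region='us-west-2'")  # assumed bucket region
    return con.execute(f"select count(*) from ({query})").fetchone()[0]
```

Swapping the `where` argument is all it should take to target another state or country, for example changing `'CA'` to another top-level zone value used in that country's data.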

In the notebook, the address_levels array is re-ordered from grandparent-parent-child in the source to child-parent-grandparent in the output data, as that is how addresses are typically given, from small areas to larger ones, for example "380 New York St Redlands CA 92373".
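The re-ordering can be illustrated in plain Python. This is a sketch, not the notebook's code: it assumes each level arrives as a dictionary with a `value` key (which is how a struct typically looks once fetched into Python), and the function names are hypothetical.

```python
def reorder_levels(address_levels):
    """Return zone names child-first, skipping empty levels.

    The source order is largest zone first (e.g. state, then city);
    the output order is smallest first (city, then state).
    """
    values = [level.get("value") for level in address_levels]
    return [value for value in reversed(values) if value]


def format_address(number, street, address_levels, postcode=None, unit=None):
    """Join address parts in the order they are typically written."""
    parts = [number, street]
    if unit:
        parts.append(unit)
    parts.extend(reorder_levels(address_levels))
    if postcode:
        parts.append(postcode)
    return " ".join(str(part) for part in parts)


levels = [{"value": "CA"}, {"value": "Redlands"}]
print(format_address("380", "New York St", levels, postcode="92373"))
# prints: 380 New York St Redlands CA 92373
```

Skipping empty levels matters because, as noted above, few countries populate all three.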

I'll let you surf the notebook for other details, for example a bounding box around each address point is calculated to provide a more usable zoom experience when locating an address.  The notebook will create a locator in your project home folder or rebuild it with new data if it already exists.  To use the rebuild option, first remove the existing locator from your project to drop file locks on it.  It takes about 45 minutes to process the Californian data, including one ~5-minute step where street, unit and zone fields are consistently cased - for appearance' sake:

Consistent casing of text fields
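To make the bounding box and casing steps concrete, here is a small sketch of both ideas. It is not taken from the notebook: the function names, the `half_size` default of 0.001 and the assumption that coordinates are in decimal degrees are all illustrative choices.

```python
def point_extent(x, y, half_size=0.001):
    """Return (xmin, ymin, xmax, ymax) for a square centred on a point.

    half_size is in the same units as x and y (assumed decimal degrees).
    """
    return (x - half_size, y - half_size, x + half_size, y + half_size)


def consistent_case(text):
    """Capitalise the first letter of each word and lower-case the rest."""
    if text is None:
        return None
    return " ".join(word.capitalize() for word in text.split())


print(point_extent(-117.0, 34.0, half_size=0.5))
# prints: (-117.5, 33.5, -116.5, 34.5)
print(consistent_case("NEW YORK ST"))
# prints: New York St
```

Simple word capitalisation like this flattens names with internal capitals (for example "McDonald" becomes "Mcdonald"), so a production workflow may want exceptions for such cases.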

When you get to the cell that creates or rebuilds the locator, unless you're in the US you will want to replace the code with something intended for your country. The cell code was built by copying the Python command from a manual run of Create Locator into the cell; you can do the same.

Please comment on this post with any observations or questions.
