BLOG
The team tell me DuckDB was broken in Pro 3.6 Beta 1; it's fixed in later builds.
a week ago

BLOG
Thanks, you might have hit a bug - I'll get this to the right people. I'll remove the Spoiler in the post re. the Overture registry issue when it's sorted.
a week ago

BLOG
Hi Marc, do you mean you get this error when importing duckdb, or is this when adding DuckDB to a pre-3.5 Pro Python conda environment? The spatial extension is already available in Pro 3.5+. BTW, Overture is broken at the moment; people are looking into an issue in the registry, which the notebook uses to determine the latest release.
a week ago

BLOG
Building on earlier work using an ArcGIS Pro notebook to ETL Overture Maps Foundation data into ArcGIS, this post takes a look at Overture's Addresses theme, as both a map-ready layer and a subaddress-capable geocoding locator. At the time of writing the data is in alpha release, but you can take a look with the ArcGIS Pro notebook (in the post download), which is configured to extract the address points in California, USA - a little over 14 million features. Here they are, with metadata also created by the notebook:

[Image: 14 million address points in California]

There are many uses for the address points (map layer, near features, join features, geometry source for editing...), but one I want to highlight, and which is mentioned by Overture as a target workflow, is geocoding - turning address details into map locations, like you see here for the address "1000 Pine Ave Unit 109 Redlands CA 92373".

[Image: Geocoding with Overture Addresses]

It always seems like magic that you can convert a local address dialect into a coordinate, and while there are some country "gaps" in the current Addresses theme, where a country's data is populated it is complete, and maintained monthly. Let's see how to access the data!

Note: Esri's geocoding products, such as the ArcGIS World Geocoding Service, may contain the same reference data as Overture's Addresses theme.

I'm using an ArcGIS Pro 3.5 notebook, using only the default runtime modules, including DuckDB. Overture offer the data as GeoParquet files in S3 hive storage (meaning the single logical addresses dataset exists as an arbitrary number of individual parquet files), which you specify as a glob (wildcard) path, like this SQL sent to DuckDB:

select * from read_parquet('s3://overturemaps-us-west-2/release/2025-09-24.0/theme=addresses/type=address/*.parquet', filename=false, hive_partitioning=1)

The notebook figures out the path to the latest data automatically. As we're interested in only a subset of the available addresses data, we supply a "where" clause in the notebook which filters which records to read; here is mine to get only Californian data:

where country = 'US' and address_levels[1].value = 'CA' and number is not null and street is not null

You'll notice one term that queries a struct column, address_levels, which is a 1-based array of up to three zone values in descending size, like province, city, neighborhood. This will vary per country and is for you to figure out. Few countries use all three levels. Indeed, some columns may not be populated at all, either because the data isn't available from the contributor or is unused in the country, for example postal_city or postcode. In the notebook, the address_levels array is re-ordered from grandparent-parent-child in the source to child-parent-grandparent in the output data, as that is how addresses are typically given, from small areas to larger ones, for example "380 New York St Redlands CA 92373". I'll let you surf the notebook for other details; for example, a bounding box around each address point is calculated to provide a more usable zoom experience when locating an address. The notebook will create a locator in your project home folder, or rebuild it with new data if it already exists. To use the rebuild option, first remove the existing locator from your project to drop file locks on it.
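Incidentally, if you want to poke at that read outside the notebook, here is a minimal sketch of the same query through the DuckDB Python API. Treat it as an illustration only: the release path is the one current at writing (the notebook resolves the latest one for you), and the extension installs may already be satisfied in the Pro 3.5 runtime.

    import duckdb

    con = duckdb.connect()
    con.execute("INSTALL httpfs; LOAD httpfs;")    # S3 access; may already be present in the Pro runtime
    con.execute("INSTALL spatial; LOAD spatial;")  # geometry support; already shipped with Pro 3.5+
    con.execute("SET s3_region = 'us-west-2';")    # Overture's public bucket region

    sql = """
    select id, number, street, unit, postcode, address_levels, geometry
    from read_parquet('s3://overturemaps-us-west-2/release/2025-09-24.0/theme=addresses/type=address/*.parquet',
                      filename=false, hive_partitioning=1)
    where country = 'US'
      and address_levels[1].value = 'CA'
      and number is not null and street is not null
    """
    ca_addresses = con.sql(sql).arrow()   # ~14M rows; Arrow keeps the transfer reasonably light
    print(ca_addresses.num_rows)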
It takes about 45 minutes to process the Californian data, including one ~5-minute step where street, unit and zone fields are consistently cased - for appearance's sake:

[Image: Consistent casing of text fields]

When you get to the cell that creates or rebuilds the locator, unless you're in the US you will want to replace the code with something intended for your country. The cell code was built by copying the Python command from a manual run of Create Locator into the cell; you can do the same (a rough sketch of the shape of that command appears at the end of this post). Please comment on this post with any observations or questions.
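P.S. For orientation, here's a hedged sketch of the shape of that locator-building cell for US-style point addresses. The role and field mappings, dataset name and output path below are placeholders of mine, not the notebook's actual values - the reliable route is the one described above: run Create Locator manually once against your data and copy the resulting Python command.

    import arcpy

    # Sketch only: placeholder dataset name, field mappings and output path
    arcpy.geocoding.CreateLocator(
        "USA",                                        # country code
        "CA_Addresses PointAddress",                  # reference data and its locator role (full path may be needed)
        "'PointAddress.HOUSE_NUMBER CA_Addresses.number';"
        "'PointAddress.STREET_NAME CA_Addresses.street';"
        "'PointAddress.POSTAL CA_Addresses.postcode'",
        r"C:\Projects\Overture\OvertureAddresses",    # output locator in the project home folder
        "ENG")                                        # language code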
a week ago

BLOG
In this post I'm revisiting the ingest of Overture Maps Foundation data via ArcGIS Pro notebook, for a few reasons:

- To show a subscription pattern for the latest data in an area of interest
- To show the value of unnesting complex columns in parquet files
- To show geocoding locator creation in addition to geodatabase object creation
- To show metadata creation simultaneously with data creation
- To show that sharing notebooks via OneDrive provides Overture processing capability to your colleagues

You're not here for the cartography, and to prove it here's a map of my subject matter data:

[Image: Division Areas and Places in greater London, GB]

Overlaying a Colored Pencil basemap, there are global-extent Division Area polygons and Places point features within an area of interest (greater London, UK) defined by selected division area features. What you can't see, but we'll get to, is that separate POI-role locators made from the division area and places features are active ArcGIS Pro project locators in the map. The division area polygon layer (transparent with green outlines) is global, with over 1M features. Division areas go from country down to microhood size, so to help with display I sort them by area (descending). They look quite busy in the map, but you have flexibility in how you define your area of interest. For the places points, there are over 417K features (dark blue dots) in the area of interest, which at the map scale shown (1:500,000) makes the symbol density very high.

Together, the division area features, places points, relationships to alternate value tables and associated locators for each layer are my information products, which will be maintained by one-click automation on demand in two notebooks shared on OneDrive, one each for division areas and places. These products will all be made in the ArcGIS Pro project default geodatabase or home folder. Let's work through each of the aspects I want to show in this post.

Subscribing to the latest Overture data in an area of interest

The subscription concept I'm going for here is based on an area of interest for which you want to receive data refreshes on demand. The area is defined by any number of division area features selected by a SQL where clause (you figure out the where clause after first translating the division area features and doing some map-based exploration). Here is mine:

"country = 'GB' and id in ('e8e3f6e2-2c45-4708-805c-41d08ab38de1','89c092f8-4287-4401-b72f-4a5a067eee22','2d0e78fb-f7fa-4c8f-be95-69a60527fc97')"

Tip: Including the country value 'GB' in the where clause isn't technically necessary, but it reduces the workload back in S3, allowing the system to skip reading parquet files that contain no British data.

The SQL query reads hive-partitioned GeoParquet files in a public S3 bucket. Overture data is published monthly; the notebooks automatically determine the latest data to read.

The value of unnesting complex columns in parquet files

Overture data is available in GeoParquet format. You may have seen discussion of GeoParquet being a candidate to be the "new shapefile" - the de facto format for sharing georelational data. While GeoParquet isn't editable like shapefiles are (except by replacement), it has many attractive features, one being support for complex column types. See the schema for the places theme:

[Image: Places theme schema as seen by DuckDB]

Note that several columns are struct type, with the structs containing complex properties.
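If you want to reproduce that schema inspection yourself, here is a minimal sketch using the DuckDB Python API. The <latest-release> placeholder stands in for the current release path, which the notebooks resolve automatically, and the extension install may already be present in the Pro runtime.

    import duckdb

    con = duckdb.connect()
    con.execute("INSTALL httpfs; LOAD httpfs;")   # may already be available in the Pro runtime
    con.execute("SET s3_region = 'us-west-2';")

    # DESCRIBE over a read_parquet() query lists every column and its type,
    # which is where the struct columns show up.
    print(con.sql("""
        DESCRIBE
        SELECT *
        FROM read_parquet('s3://overturemaps-us-west-2/release/<latest-release>/theme=places/type=place/*.parquet',
                          filename=false, hive_partitioning=1)
    """))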
The data isn't flat, and to get the most value from the data (division area translated names, place point alternate categories, address components, etc.) you must unnest the structs. You'll see in the notebooks I don't unnest them all (because some were empty), but no useful data was left behind. So when working with parquet, be prepared for a higher information density than shapefiles. You'll see in the blog post notebooks that a couple of structs are unnested and relationship classes created to the resultant tables (a small standalone unnesting sketch appears at the end of this post), like here where searching for ice cream shops in the places data lets you see what other categories the outlets identify as...

[Image: Ice cream shops also offering...]

... or identifying translated names for a division area feature...

[Image: Division Area name translations]

... and translated names are a good segue into my next topic!

Geocoding locators made from the data

Many division area names have common names in other languages; see below where we identify the Japanese name for Wellington, New Zealand (ウェリントン) and use it in the division areas locator.

[Image: Geocoding from the Japanese name for Wellington, New Zealand]

One of the higher value themes from Overture is places data, and the locator built from that data provides a compelling map navigation experience - using place category as a hint to refine a geocode. For example, hinting that I want to find a train station:

[Image: Train Stations]

Or a restaurant in my map extent:

[Image: Restaurants]

Before this I had no idea custom categories from your data could be used this way. There will be many more use cases for this rich places data. This brings me to...

Creating metadata simultaneously with data

Since we're using a notebook approach it is straightforward to use the metadata class to automate writing metadata to output objects - it is important to record the release and processing timestamps at minimum, and you might like to record feature counts and other observations, so they travel with the data. I won't clutter the post with the cell code that does the job; you can surf the notebooks for that. Which brings me to...

Sharing notebooks to OneDrive

The notebooks are suitable for external consumption. They automatically detect the changing input path at run time, so they are good candidates for sharing to OneDrive, from where your colleagues may run them on demand - the "one click" experience. OK, well, maybe two clicks: open the notebook, then run it 😉.

To share the notebooks, create a folder in OneDrive, copy the notebooks into it, then share the folder and notebook source files to anyone, with edit permission. People invited to the folder can use OneDrive's browser-based controls to add the folder to their local files, then from Windows Explorer drag the folder connection into an ArcGIS Pro project folder connection.

[Image: Notebooks on OneDrive]

Ready-to-use notebooks are in the post attachment. You'll need ArcGIS Pro 3.3+ with a Standard or Advanced license. ImportCurrentDivisionAreas takes about 30 minutes to run, ImportPlacesByDivisionArea about 15 minutes for the area of interest shown; this will vary with the area you use. Do not run both notebooks simultaneously - they have variable names that will collide. Do comment on the post with any experiences you want to share, or questions you have; for example, who would like the Addresses theme supported with a locator output? I'm guessing many people... Have fun with your Overture subscriptions!
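Postscript - the unnesting sketch promised above. This is an illustration only, not the notebooks' actual code: the struct field names depend on the release you read (check them with the DESCRIBE query shown earlier), and the local extract file is a hypothetical stand-in for the S3 read.

    import duckdb

    con = duckdb.connect()
    # unnest() on a struct column spreads its fields into ordinary columns,
    # which can then be written out as a table and related back to the feature class by id.
    flattened = con.sql("""
        SELECT id, unnest(categories)
        FROM read_parquet('places_extract.parquet')   -- hypothetical local extract
    """)
    print(flattened.limit(5))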
2 weeks ago

POST
Hi, this will be something simple in how you're writing the data. If you're using the FME product, please open a support call with Safe Software; if you're using Data Interoperability, open one with Esri and request a screen share so our product engineer can join the call (PDT time zone).
4 weeks ago

BLOG
Hello again. Esri staff can't create Ideas posts, so see here instead, where it will get more visibility in the FME world - an idea to make this work easier: https://community.safe.com/ideas/arcgisfeatureserviceviewmanager-transformer-in-the-esri-arcgis-connector-package-39204
4 weeks ago

POST
Another thing: try reducing the batch size being written from the default 1000 to 500 (and lower if necessary); if your data is very point-rich, the batch size might be the problem.
a month ago

POST
Hello again, I see you're reading in State Plane and writing out Web Mercator. In this situation the data is supposed to be automatically reprojected, but it's possible something is not working at this step, so you could add an EsriReprojector (or plain Reprojector) in your workspace to see if that helps. I also see you're using OneDrive as a shared file location; we have not tested this and it may be an issue. If problems remain after importing the writer definition and reprojecting your data, please open a support call.
a month ago

POST
Hi June, when adding a writer you have the option to import its definition from an existing dataset (outlined in red in the attached screenshot). I'll look at your log shortly.
a month ago

POST
Another possibility is that the wrong geometry type is in the data; for example, it is OK if the data has null geometry, but not if it has line or polygon geometry when you're trying to write to a point feature layer. Also, it is always a good idea when reading and writing between different storage platforms to define your writer by importing the schema definition from the source dataset. Both sides of your equation are Esri technology, so this will be reliable.
a month ago

BLOG
Hello everyone. If you downloaded the blog attachment with workspace source files before 5:45 am PDT on Friday 3rd October, please do so again; I simplified the LAViewSourceSwap.fmw workspace to remove unnecessary parameters that were artifacts of development testing.
a month ago

BLOG
The answer is both for me, but you be the judge for your situation! Read on for the decision criteria...

If you want to maintain a large hosted feature service from external data, it is best practice to avoid complete overwrites at each refresh, for two reasons:

- Large write transactions can be fragile
- Large write transactions can have significant service downtime

To avoid both issues it is preferable to implement a CDC (change data capture) approach and write only deltas to the target feature service. This blog will describe two ways to do this:

- Writing deltas directly to the target feature service
- Maintaining a hosted feature layer view pointed alternately at two services: write delta transactions to the service that is not currently the view source, then swap it in to be the view source

In the usual situation where a period's delta is a small fraction of the data, a direct delta write might take several seconds, while for a view source swap the downtime can be milliseconds, at twice the storage cost. We'll do a worked example so you can choose between the approaches, but either way you're the winner using CDC!

Here is my subject matter data, about a million street address points in Los Angeles, California, maintained daily:

[Image: Los Angeles Address Points]

The job is to calculate and apply the daily delta transaction (typically the low hundreds of features) with low downtime. While our candidate write modes (direct, view source swap) insulate the job's service downtime from the calculation time of the delta, it's always good to build in any optimizations you can. The city's open data site supports CSV download, and CSV is a performant format in spatial ETL tools, so that is half the delta calculation step. The other half is reading the current state of the feature service/view. Here is my optimization for feature service reading, in LAChangeDetection.fmw (in the blog download):

[Image: Direct Write After Change Detection]

While the Esri ArcGIS Connector package supplies a feature service reader, in the quest for speed I implemented reading the target service using multiple concurrent Query calls with HTTP (a rough Python sketch of the idea follows below). I found that the default maximum record count per call (2000) in 4 concurrent requests gave optimal performance, roughly double the packaged reader's rate. The ChangeDetector transformer calculates the delta in seconds once it has the data, then writing the delta takes 3-4 seconds for a typical daily changeset (if you inspect the workspace you'll see I instrumented it with Emailer transformers to call home with some timestamp information).

For people not satisfied with a few seconds of service downtime, implementing view source swap is only slightly more challenging, see LAViewSourceSwap.fmw in the blog download:

[Image: View Source Swapping]

You'll see logic in the workspace to toggle between "A" and "B" services for reading, writing and source swapping. For this reason changes are detected a little differently: the same public URL accessing the address data as CSV is read, but the delta is calculated versus the hosted feature layer that is not the current source for the hosted feature layer view, and the delta is applied to that feature layer. Then the updated feature layer must be swapped into being the source for the feature layer view. How?
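Before getting to the swap, here is a rough Python sketch of that concurrent paged-Query reading pattern. The layer URL and field list are placeholders, and the workspace itself does this with HTTP calls inside FME rather than Python, so treat it as an illustration of the idea only.

    import requests
    from concurrent.futures import ThreadPoolExecutor

    LAYER_URL = "https://services.arcgis.com/<org>/arcgis/rest/services/LosAngelesAddresses/FeatureServer/0"
    PAGE_SIZE = 2000   # the layer's default maximum record count per call
    WORKERS = 4        # concurrent requests

    def total_count():
        r = requests.get(f"{LAYER_URL}/query",
                         params={"where": "1=1", "returnCountOnly": "true", "f": "json"})
        return r.json()["count"]

    def fetch_page(offset):
        r = requests.get(f"{LAYER_URL}/query",
                         params={"where": "1=1", "outFields": "*", "f": "json",
                                 "resultOffset": offset, "resultRecordCount": PAGE_SIZE})
        return r.json().get("features", [])

    offsets = range(0, total_count(), PAGE_SIZE)
    with ThreadPoolExecutor(max_workers=WORKERS) as pool:
        features = [f for page in pool.map(fetch_page, offsets) for f in page]
    print(f"{len(features)} features read")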
Answering that requires some detective work, inspecting how ArcGIS natively handles view source swap in item settings:

[Image: View Source Swap]

What you're looking at above is me manually doing a source swap, but with the browser developer tools active, filtered to record POST transactions in big request row view. As I clicked through the view source swap I could see the system uses two calls, deleteFromDefinition and addToDefinition. Even better, if I inspect any POST call I can see the JSON payload used in it - which is lucky, because the REST API documentation is a bit challenging for a no-code person like me 😉. The deleteFromDefinition payload is trivial, but the addToDefinition JSON payload is huge. However, as I made my services with default settings I'm not looking to change, I cut the JSON down to the objects I thought worth keeping, and of course the required pointer to the desired source. Here is the JSON:

{
"layers": [
{
"currentVersion": 11.5,
"id": 0,
"name": "LosAngelesAddresses",
"type": "Feature Layer",
"cacheMaxAge": 30,
"displayField": "Street_Name",
"description": "",
"copyrightText": "",
"defaultVisibility": true,
"adminLayerInfo": {
"viewLayerDefinition": {
"sourceServiceName": "@Value(_nextSourceName)",
"sourceLayerId": 0,
"sourceLayerFields": "*"
}
},
"geometryType": "esriGeometryPoint",
"objectIdField": "OBJECTID",
"uniqueIdField": {
"name": "OBJECTID",
"isSystemMaintained": true
},
"useStandardizedQueries": true,
"minScale": 0,
"maxScale": 0,
"extent": {
"xmin": -13210040.1828,
"ymin": 3989386.3054,
"xmax": -13153020.1132,
"ymax": 4073637.6182,
"spatialReference": {
"wkid": 102100,
"latestWkid": 3857
}
},
"spatialReference": {
"wkid": 102100,
"latestWkid": 3857
},
"globalIdField": "",
"maxRecordCount": 2000,
"standardMaxRecordCount": 32000,
"standardMaxRecordCountNoGeometry": 32000,
"tileMaxRecordCount": 8000,
"maxRecordCountFactor": 1,
"capabilities": "Query"
}
]
}

In production I could edit the JSON to tweak things if desired, like extent or display field, but it's probably a better investment to get your layer design right before the fact. One key thing I learned about the payload is at line 15, where I inject a feature attribute into the JSON at runtime: sourceServiceName is the property that keys the service being swapped in; there is no reference to its item ID or its service URL. In my case the source service name toggles between "LosAngelesAddressesA" and "LosAngelesAddressesB" in consecutive runs. If any delta transaction contains no edits then no service swap occurs.

So now we have squeezed as much downtime out of a feature service update as we can; it's your call whether the average period's delta transaction is big enough (many thousands of features?) to justify the extra storage cost of view source swap and its guaranteed minimum downtime. While I'm focusing here on downtime minimization, not run time for the whole job, if anyone is curious it's taking 3-5 minutes to refresh the million points I'm dealing with. I'm guessing the variability comes from server load conditions where the data is coming from and going to.

Acknowledgements: I was inspired to write this post by my Esri colleague @sashal, who first explored this workflow and to whom I'm grateful, and by some prior art in a related workflow where file geodatabases are republished - see the first presentation here.
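For anyone wanting to script the swap outside FME, here is a hedged Python sketch of the two admin calls. The admin URL pattern, the token handling, the deleteFromDefinition payload and the idea of loading the trimmed layer JSON from a file are my assumptions, not the workspace's implementation - verify against your own browser trace as described above.

    import json
    import requests

    # Hosted feature layer view's admin endpoint (ArcGIS Online pattern; <org> is a placeholder)
    VIEW_ADMIN_URL = ("https://services.arcgis.com/<org>/arcgis/rest/admin/services/"
                      "LosAngelesAddressesView/FeatureServer")
    TOKEN = "<portal token>"   # generate however you normally do

    def post_definition(operation, payload):
        # Both admin operations take the JSON payload in a form parameter named after the operation
        r = requests.post(f"{VIEW_ADMIN_URL}/{operation}",
                          data={operation: json.dumps(payload), "f": "json", "token": TOKEN})
        r.raise_for_status()
        return r.json()

    # 1. Remove the current source layer from the view definition (assumed minimal payload)
    post_definition("deleteFromDefinition", {"layers": [{"id": 0}]})

    # 2. Add the incoming hosted layer as the view's source, using the trimmed layer JSON
    #    shown above with sourceServiceName set to the service being swapped in
    with open("view_layer_definition.json") as f:   # hypothetical file holding that JSON
        post_definition("addToDefinition", json.load(f))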
a month ago

IDEA
Hi Brad, support is partial, depending on what you want to connect to: https://community.safe.com/ideas/sql-database-in-microsoft-fabric-entra-authentication-37811 We recommend you post to that thread, to which I'm subscribed. We can also help with direct communication with Safe Software.
10-01-2025 05:34 AM