POST
Hello Ashley, it seems you have struck an issue we haven't found a cause for: https://pro.arcgis.com/en/pro-app/latest/tool-reference/tool-errors-and-warnings/160001-170000/tool-errors-and-warnings-160226-160250-160236.htm If you open a support call, the analyst can look at your workflow. Re subscribing: if you subscribe to the whole Data Interoperability board you'll get everything, and thank you for asking.
12-06-2023 11:28 AM | 0 | 0 | 644

BLOG
One of the hidden gems in ArcGIS Pro is the ability to reach out to external geographic data sources using simple scripted workflows, which you can optionally automate. Here is today's subject matter, an OGC API Features layer mirrored to a hosted feature layer in ArcGIS Online, and the notebook that does the job: Electricity Transmission Lines

The secret sauce in my cooking is GDAL, the Geospatial Data Abstraction Library. Esri ships GDAL in the standard conda environment in ArcGIS Pro; here is the package description: GDAL is a translator library for raster and vector geospatial data formats that is released under an X/MIT style Open Source license by the Open Source Geospatial Foundation. As a library, it presents a single raster abstract data model and vector abstract data model to the calling application for all supported formats.

While you can go deep with GDAL, the way I'm using it here is "copy-paste" ETL with a small T; I didn't do any feature-level transformation. All you have to know is where your source data is and what format it is, then plug those into the notebook. I count 84 vector formats in the supported vector drivers, so it isn't ArcGIS Data Interoperability, but it's a useful subset. Datasets can be local files or at HTTP, FTP, or even S3 URLs.

My example uses OGC API Features, but that isn't important: GDAL abstracts the concept of a dataset, so you can use the same code for any format. On the topic of OGC API Features, though, the format supports the concepts of a landing page, collections, and items, and you can supply a URL to any of these endpoints after navigating to them in a browser. I imported the electricity transmission lines collection from this landing page.

I'll let you walk through the notebook (in the blog download), but in summary it converts the external data to a file geodatabase, then overwrites a feature service with that data. To create the feature service in the first place, pause at the cell that creates a zip file, upload and publish it, then use the resulting item ID to set your target. Do not create the feature service from map layers; the system requires a file-based layer definition. Run the notebook on any frequency that suits you, or turn it into a Python script tool and schedule it. Make sure you inject the right format short name ahead of your data path.

There you have it, simple maintenance of a hosted feature service sourced from external data! It isn't true data virtualization, as the data is moved, but it is an easy way to make data available to all.
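To give a flavor of the approach, here is a minimal sketch (not the notebook from the download) using GDAL's Python bindings; the endpoint URL and output path are placeholders, and writing a file geodatabase assumes your GDAL build includes OpenFileGDB output support:

from osgeo import gdal

gdal.UseExceptions()

# "OAPIF:" is the GDAL format short name injected ahead of the data path;
# any other supported vector driver prefix works the same way.
source = "OAPIF:https://example.com/ogcapi/collections/transmission-lines"  # placeholder endpoint
target = r"C:\Temp\staging.gdb"

# VectorTranslate is the programmatic equivalent of the ogr2ogr utility;
# it copies every layer it finds in the source into the target geodatabase.
gdal.VectorTranslate(target, source, format="OpenFileGDB")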
11-28-2023 11:29 AM | 3 | 0 | 1015

IDEA
Hi everyone, we have a blog coming on this thread, but in the meantime the team has a community post up to gather feedback, so please visit https://community.esri.com/t5/data-management-questions/generate-schema-report-tool-feedback/m-p/1350612#M44784
11-17-2023 11:15 AM | 0 | 0 | 1195

IDEA
This Idea has been implemented in ArcGIS Pro 3.2. Please see the What's New documentation for more new features in Pro 3.2.
11-15-2023 09:39 AM | 0 | 0 | 1217

IDEA
This Idea has been implemented in ArcGIS Pro 3.2. Please see the What's New documentation for more new features in Pro 3.2.
11-15-2023 09:37 AM | 0 | 0 | 1314

IDEA
The multithreaded nature of Pro, and the fact that Data Interoperability runs in a separate process, made it hard to make connections work. Memory workspace persistence between sessions is coming, but I'm not sure of the time frame.
11-15-2023 06:09 AM | 0 | 0 | 631

IDEA
Hi Cordula, the nature of Pro makes it hard for us to replicate the interoperability connection experience you are referring to. What you can do is create a toolbox with a model that does a Quick Import and then copies the data into memory and into your map, like this:
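As a rough Python sketch of those model steps (the format, paths, and dataset names are hypothetical, and the Data Interoperability extension is required for Quick Import):

import arcpy

arcpy.CheckOutExtension("DataInteroperability")

# Quick Import reads the external dataset into a staging file geodatabase.
staging_gdb = "C:/Temp/quick_import.gdb"
arcpy.interop.QuickImport("GML,C:/Temp/source.gml", staging_gdb)

# Copy each imported feature class into the memory workspace so it behaves
# like in-session data you can add to your map.
arcpy.env.workspace = staging_gdb
for fc in arcpy.ListFeatureClasses():
    arcpy.management.CopyFeatures(fc, "memory\\" + fc)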
11-14-2023 06:54 AM | 0 | 0 | 706

BLOG
This post shows how to use an ArcGIS Online hosted notebook (runtime version 10.0) as a data pump between any data sources on the internet; my specific example pairs the Snowflake cloud data warehouse with a hosted feature service in ArcGIS Online. The systems of record you integrate are limited only by your imagination. When scheduled as a notebook task, this delivers continuous integration, in either or both directions. As a bonus, you can also perform analysis on your data during the movement.
Some things I'm sharing that I learned from this exercise are:
How to add notebook dataset connectivity on the fly from Conda Forge or PyPI
How to use a notebook's built in file system as a staging and analysis workspace
How to leverage a data warehouse's processing power to do work remotely
How to use SQL to unpack JSON and make relational data
How to use feature service relationships in queries in Pro
How to use a couple of new field types, BigInteger* and TimeOnly*
How to continuously maintain a hosted feature service with the ArcGIS Python API
* Requires notebook runtime version 9+ and ArcGIS Pro 3.2+
My example uses Esri business partner SafeGraph's Places data, with their kind permission, for which I am grateful. If you follow the link you'll see that Places model point-of-interest data. Places have geometry (polygon or point) and attributes. Some of the attribution is best modeled as related data, and you can see from the map popup that my data model has four lookup tables for places, exposing the brands, place categories, web domains, and open hours known for a place.
San Francisco Places

The main topic of this post is using an Online notebook to maintain a hosted feature layer with source data from an internet connection, namely a cloud warehouse, without compromising data preparation or data model flexibility; expanding JSON columns and creating relationship classes are both shown.
This post does not have a dependency on ArcGIS Data Interoperability, but it does include an ETL tool as an option for creating the initial file geodatabase used to publish the target hosted feature service; otherwise, use the supplied XML workspace document to do this. If you do have Data Interoperability, the supplied ETL tool looks like this:
BlogImportPOI.fmw (Import Places)

I'm accessing SafeGraph's data from their Snowflake Marketplace entry; access to this is provisioned by SafeGraph. When you connect to a Marketplace entry it becomes an available database in your Snowsight console and through any connecting technology Snowflake offers, like an ODBC connection or the REST API. I could have generated my initial geodatabase by adding a Snowflake database connection in ArcGIS Pro, creating query layers, and using core geoprocessing tools; I just found it easier to make a Spatial ETL tool.
Something we're all going to see more of, even in georelational warehouses like Snowflake, is semi-structured data stored as JSON documents in variant columns. That is the case for SafeGraph Places, where the brand, category, domain, and open hours values are inside JSON objects in a 1:M cardinality with the owning place. Retaining the JSON in fields would make it very hard to query in ArcGIS.
Here's an example value for brand data, an array of dictionaries:
[
{
"safegraph_brand_id": "SG_BRAND_1e8cb2c9bf1caabd",
"safegraph_brand_name": "Bentley Motors"
},
{
"safegraph_brand_id": "SG_BRAND_89ede26d6a5e8208ad28607464dfa1f3",
"safegraph_brand_name": "Land Rover"
},
{
"safegraph_brand_id": "SG_BRAND_2fa3cf424791add1",
"safegraph_brand_name": "Aston Martin"
},
{
"safegraph_brand_id": "SG_BRAND_b94705c817b09f287e31606604e526ba",
"safegraph_brand_name": "Jaguar"
}
]
Here is another for open hours, a dictionary keyed on day name with arrays of open/close times:
{
"Fri": [
[
"10:00",
"18:00"
]
],
"Mon": [
[
"10:00",
"18:00"
]
],
"Sat": [
[
"10:00",
"17:00"
]
],
"Sun": [
[
"12:00",
"16:00"
]
],
"Thu": [
[
"10:00",
"18:00"
]
],
"Tue": [
[
"10:00",
"18:00"
]
],
"Wed": [
[
"10:00",
"18:00"
]
]
}
It makes sense to unpack this JSON in Snowflake before it gets to your client. The job in each case is to make new rows for each extracted value while retaining the key field in each row. Snowflake provides the FLATTEN table function for the purpose. This is a proprietary SQL extension but analogs exist in other cloud warehouses. SafeGraph places have a primary key named PLACEKEY, so we retain that in all cases.
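As an illustration of the pattern (the notebook's actual SQL is in the blog download, and the PLACES table and BRANDS variant column names here are hypothetical), a query like the following returns one row per brand while keeping PLACEKEY, run through the Snowflake connection created in the notebook cell shown further down:

brands_sql = """
SELECT p.PLACEKEY,
       b.value:safegraph_brand_id::STRING   AS SAFEGRAPH_BRAND_ID,
       b.value:safegraph_brand_name::STRING AS SAFEGRAPH_BRAND_NAME
FROM   PLACES p,
       LATERAL FLATTEN(input => p.BRANDS) b
"""
cursor = conn.cursor()                 # conn is the snowflake.connector connection
brand_rows = cursor.execute(brands_sql).fetchall()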
I'll let you inspect the SQL I used to unpack each JSON document type when you look at the notebook code in the blog download, but the tabular results for brands and open hours look like this:
Brands: [result table image in the blog post]
Open Hours: [result table image in the blog post]
I took this detour through some of the processing to make the point that with a back end as powerful as Snowflake, you should have it do as much of the processing as possible before your integration. This is particularly relevant if your integration is against data summaries coming from big data.
With the data model I'm using here, reading from Snowflake with one cursor and writing to the file geodatabase in my notebook with another, I get throughput of about 10K point features, 7.5K polygon features, and 50K table rows per second, which is plenty fast enough for the use case.
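For reference, the read-with-one-cursor, write-with-another pattern looks roughly like this (the feature class, field names, and query are hypothetical simplifications of the real notebook, and conn is again the Snowflake connection from the cell shown below):

import arcpy

target_fc = "/arcgis/home/staging.gdb/Places"            # staged file geodatabase
fields = ["PLACEKEY", "LOCATION_NAME", "SHAPE@WKT"]

# Read from Snowflake with one cursor...
sf_cursor = conn.cursor()
sf_cursor.execute("SELECT PLACEKEY, LOCATION_NAME, ST_ASWKT(GEOMETRY) FROM PLACES")

# ...and write to the file geodatabase with another.
with arcpy.da.InsertCursor(target_fc, fields) as insert_cursor:
    for placekey, name, wkt in sf_cursor:
        insert_cursor.insertRow((placekey, name, wkt))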
Now let's get to the core of the matter, how to implement a continuous integration.
With continuous integration into feature services, you do not want to break any item properties, like the item ID or item description metadata. The way to avoid this is the overwrite method. Overwrite requires that you published the service from a file item, not from map content in Pro, and that you supply a new file item with an identical schema to effect the overwrite. I'm using a file geodatabase. In fact I used two editions of the geodatabase: first one with all place data for California, so I could get my data model correct (field types, widths and so on), then one with just Redlands, CA place data, so the portal item kept for overwrite schema checking stays small. If you download the blog attachment and have the Data Interoperability extension you will see my ETL tool has table handling set to Truncate Existing when writing to my geodatabase.
When I was happy with my geodatabase schema and had made some layer files from the full version of the data to capture some symbology and popup behavior, I zipped the small version of the geodatabase and published my initial feature service.
To start constructing my notebook I had to learn a fundamental concept: notebooks support a local file system, and this is where I built my overwrite payload (a zipped file geodatabase); I used the default /arcgis/home folder. You can think of this file system as a staging area for your data, where you can optionally do some analysis. In my case I'm not doing any analysis, but I am assembling a data model and populating it. You might not need any geoprocessing at all, say if you're reading from one cloud warehouse and writing directly to another, but if you know ArcPy the possibilities are endless.
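A minimal sketch of setting up that staging area (the geodatabase name is hypothetical):

import os
import arcpy

home = "/arcgis/home"                        # the notebook's default folder
staging_gdb = os.path.join(home, "staging.gdb")

# Create the staging file geodatabase in the notebook workspace if it isn't there yet.
if not arcpy.Exists(staging_gdb):
    arcpy.management.CreateFileGDB(home, "staging.gdb")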
Now the secret sauce. If you read up on Online notebooks you'll see the available Python libraries, and that Snowflake is not amongst them. So install it! Here is the cell in my notebook that adds the connectivity I want:
import arcgis
import arcpy
from datetime import datetime
import os
import zipfile

def getNow():
    return str(datetime.utcnow().replace(microsecond=0))

print('Processing starting at {}'.format(getNow()))

# Sign in to ArcGIS Online with both the ArcGIS API for Python and ArcPy
gis = arcgis.gis.GIS("https://www.arcgis.com", gisUser, gisPass)
arcpy.SignInToPortal("https://www.arcgis.com", gisUser, gisPass)

os.environ['CRYPTOGRAPHY_OPENSSL_NO_LEGACY'] = '1' # Snowflake dependency
!pip install snowflake-connector-python # pip is faster than conda, or we would use !conda install -y -c conda-forge snowflake-connector-python
import snowflake.connector
print('Snowflake installed at {}'.format(getNow()))

# Connect to Snowflake using the injected credential parameters
conn = snowflake.connector.connect(
    account=snowAccount,
    user=snowUser,
    password=snowPass,
    warehouse=snowWarehouse,
    database=snowDatabase,
    schema=snowSchema)
print('Imports complete at {}'.format(getNow()))
The lines that provide the integration magic are:
os.environ...
!pip install...
import snowflake.connector
I'm making the bold claim that you can integrate with any internet data source because the Python ecosystem is so well endowed and maintained. While Online notebooks come with some basic libraries for Amazon, Azure Storage, and Google, there are many more you can reach out for in Conda Forge and PyPI using this technique. Do your research; it may take some trial and error. Online notebooks are Linux containers, so check that a package supports the platform. For example, the pyodbc module isn't readily usable in the notebook container (it depends on system ODBC drivers), so if you're integrating with Azure SQL you'll need the pymssql module. Before committing to an integration, try a stub notebook that tests the import. My notebook uses a Snowflake cursor, but the package also supports working with pandas dataframes if that's how you prefer to work.
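A stub cell for that test can be as simple as the following (pymssql is just the example package here):

!pip install pymssql
try:
    import pymssql
    print("pymssql imported OK")
except ImportError as err:
    print("Import failed, research another connectivity package:", err)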
If your integration is with ArcGIS Enterprise you would authenticate to that GIS.
Remember you can write to the web as well as read from it. A cursor might not be the best method either; for example, in this blog post I upload a Parquet file to Snowflake.
The blog download contains my notebook source, the Spatial ETL workspace to read Snowflake and write file geodatabase, plus a schema-only geodatabase XML document you can import.
After connecting to Snowflake the data movement is simple; I'll let you inspect the notebook source. It took some trial and error to get the SQL right for Snowflake, but when you're done you have a powerful tool. You can share your notebook and keep the credentials private using this approach. Snowflake is read, a file geodatabase is staged, and the geodatabase is then used to overwrite an existing service.
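A sketch of that final stage, assuming the staging geodatabase from earlier, the gis connection from the notebook cell, and a placeholder targetItemId variable holding your feature service item ID:

import os
import zipfile
from arcgis.features import FeatureLayerCollection

# Zip the staged file geodatabase, skipping any lock files left by open connections.
zip_path = "/arcgis/home/staging.zip"
with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
    for root, dirs, files in os.walk(staging_gdb):
        for f in files:
            if f.endswith(".lock"):
                continue
            full_path = os.path.join(root, f)
            zf.write(full_path, os.path.relpath(full_path, os.path.dirname(staging_gdb)))

# Overwrite keeps the item ID and item details intact while replacing the data.
item = gis.content.get(targetItemId)
flc = FeatureLayerCollection.fromitem(item)
flc.manager.overwrite(zip_path)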
Note that in addition to the overwrite method used in this notebook, there are also append and upsert methods, which may suit your data lifecycle (or size) better. Overwrite is an expensive operation for big data, so if your data can be maintained by append and upsert it is worth considering. Say your data size is in the tens of millions of records: you might have one notebook (or cell) that overwrites a service with a small file and another that appends to it with a much larger one.
To go into production with this sample, the technical steps with core ArcGIS are:
Obtain access to SafeGraph's Snowflake Marketplace
Create a new file geodatabase in ArcGIS Pro 3.3+
Import the geodatabase XML document to the file geodatabase (see the sketch after this list)
Zip the file geodatabase and add it as an item to Online
Share a feature service from the file geodatabase item
Create a hosted notebook from the ipynb file supplied
Edit the injected parameters cell and target item ID
Create a scheduled task for the notebook
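As a rough sketch of steps 2 through 5 (paths, names, and credentials below are placeholders; the actual schema XML document is in the blog download):

import arcpy
import shutil
from arcgis.gis import GIS

# Create a file geodatabase and load the schema from the XML workspace document.
gdb = arcpy.management.CreateFileGDB("C:/Temp", "places.gdb").getOutput(0)
arcpy.management.ImportXMLWorkspaceDocument(gdb, "C:/Temp/places_schema.xml", "SCHEMA_ONLY")

# Zip the geodatabase, add it as an Online item, then publish the hosted feature layer.
zip_path = shutil.make_archive("C:/Temp/places", "zip", "C:/Temp", "places.gdb")
gis = GIS("https://www.arcgis.com", "username", "password")
fgdb_item = gis.content.add({"type": "File Geodatabase", "title": "Places"}, data=zip_path)
service_item = fgdb_item.publish()
print("Target item ID for the notebook:", service_item.id)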
Having put it all together you can use the service across ArcGIS. You'll see in the supplied notebook that my service has all places for San Francisco, California; to find where I can go for breakfast on Saturdays between 8 AM and 10 AM, I would query it like this:
TOP_CATEGORY = 'Restaurants and Other Eating Places' AND PLACEKEY IN (SELECT DISTINCT PLACEKEY FROM OPENHOURS WHERE OPENHOURS.DAY = 'Sat' AND OPENHOURS.OPENS <= TIME '08:00:00' And OPENHOURS.CLOSES >= TIME '10:00:00')
There you have it, a continuous integration powered by an ArcGIS Online hosted notebook!
Please comment in this blog space with any questions or suggestions.
I hope you all make lots of data pumps, let us know!
11-03-2023 12:51 PM | 5 | 1 | 1791

DOC
Thanks for visiting the ArcGIS Data Interoperability documents board! For news about our 3.2 release, please check out the blog here - Part 2!
10-31-2023 11:33 AM | 1 | 0 | 382

DOC
Thanks for visiting the ArcGIS Data Interoperability documents board! For news about our 3.2 release, please check out the blog here - Part 1!
10-31-2023 11:32 AM | 2 | 0 | 497

POST
If you take the introductory training you'll see there is a transformer gallery organized by functional category, and all transformers are named in a self-describing way. It is best to just learn by doing.
09-28-2023 01:13 PM | 1 | 1 | 1373

POST
I added a link but saw Marcelo already included it! Thanks again Marcelo.
09-27-2023 01:40 PM | 0 | 0 | 1433

BLOG
Hello Kavi, officially you should contact Safe Software support for an FME Server issue, but to be helpful we'll connect with Safe at this end to put together a response. Please stand by for the next few days.
09-19-2023 02:18 PM | 1 | 0 | 2682

IDEA
@RobertAkroyd @RobB It's coming! Thanks for the feedback.
09-13-2023 06:34 AM | 0 | 0 | 1405

IDEA
Thanks Terrafirma, API access is under consideration; if you have specific examples of endpoints you can share, please do.
09-06-2023 06:33 AM | 0 | 0 | 570