With the release of ArcGIS Pro 3.5, the stars align a little more when it comes to the use of GeoParquet. Not only can you now work with local GeoParquet files for your mapping and analysis needs, it is also much easier to ingest big GeoParquet data from an S3-API-compliant object store!
This post is about how simple it is to bring remote GeoParquet data into your project.
The enabling technology is DuckDB, now included in the default Python environment in ArcGIS Pro 3.5 - no more package management just for this spectacularly useful client technology.
Here is an example: the entire Overture Maps Foundation divisions dataset, accessed from their AWS S3 object store and written to my project home geodatabase.
Overture Divisions
Automation is key to GIS happiness, so to access this data I created a simple notebook, which you can find in the post download. You'll need ArcGIS Pro 3.5 to run it, or an earlier release with your Python environment extended with DuckDB 1.1+.
It takes me about 6 minutes to download the 1 million+ features to my project home geodatabase, but a big chunk of that is taken up by a couple of best-practice steps, namely sorting the features on area (descending) and repairing any geometry issues. The sort step ensures small features display on top of large features. The geometry repair is commonly needed for vertex-rich data that "tiles the plane" like these divisions do.
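For readers who want to see what those two best-practice steps might look like outside the notebook, here is a minimal sketch using the standard Sort and Repair Geometry geoprocessing tools. It is not the notebook's code. The function name and its arguments are hypothetical, and it assumes the downloaded divisions are a polygon feature class in a file geodatabase, where the area field is named Shape_Area.

```python
# Sort specification: largest polygons first, so that smaller divisions
# are drawn last and end up on top of larger ones.
# Assumption: a file geodatabase polygon feature class, whose area
# field is called "Shape_Area".
AREA_SORT = [["Shape_Area", "DESCENDING"]]


def sort_and_repair(in_fc, out_fc):
    """Write in_fc to out_fc sorted by descending area, then repair geometry.

    in_fc and out_fc are hypothetical feature class paths.
    """
    # Imported here so the module loads even where arcpy is not installed.
    import arcpy

    # Sort writes a new feature class in the requested order.
    arcpy.management.Sort(in_fc, out_fc, AREA_SORT)

    # Repair Geometry edits out_fc in place; DELETE_NULL drops features
    # whose geometry is empty after repair.
    arcpy.management.RepairGeometry(out_fc, "DELETE_NULL")
    return out_fc
```

Called as sort_and_repair("divisions_raw", "divisions") with the workspace set to the project home geodatabase, this would leave the sorted, repaired features in a new feature class. Both names are placeholders.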
The lift and shift itself is fast.
I'll let you inspect the notebook for yourselves, but note the option to apply an attribute or spatial filter to the features you download, for example a bounding box in lat/long or the name of a country. Instead of manually downloading a set of very large Parquet files from S3, you now have a simple tool to go get what you want, any time you like!
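To give a flavor of how such a filter can be pushed down to the object store, here is a rough sketch of a DuckDB query against the Overture divisions data. It is an illustration under stated assumptions, not the notebook's implementation. The function names are hypothetical, the release string must be supplied by you, and the column names (bbox, country, names.primary, subtype) are taken from the published Overture schema, where country holds a two-letter ISO code rather than a full name.

```python
def divisions_query(release, bbox=None, country=None):
    """Build DuckDB SQL for Overture division areas, optionally filtered.

    release : an Overture release tag (you supply it; none is assumed here)
    bbox    : optional (xmin, ymin, xmax, ymax) in lat/long degrees
    country : optional two-letter ISO country code, e.g. "ES"
    """
    source = (
        f"s3://overturemaps-us-west-2/release/{release}"
        "/theme=divisions/type=division_area/*"
    )
    where = []
    if bbox is not None:
        xmin, ymin, xmax, ymax = (float(v) for v in bbox)
        # Keep features whose bounding box overlaps the requested window.
        where.append(
            f"bbox.xmin <= {xmax} AND bbox.xmax >= {xmin} "
            f"AND bbox.ymin <= {ymax} AND bbox.ymax >= {ymin}"
        )
    if country is not None:
        if len(country) != 2 or not country.isalpha():
            raise ValueError("country must be a two-letter ISO code")
        where.append(f"country = '{country.upper()}'")

    sql = (
        "SELECT id, names.primary AS name, subtype, country, geometry "
        f"FROM read_parquet('{source}', hive_partitioning=1)"
    )
    if where:
        sql += " WHERE " + " AND ".join(where)
    return sql


def fetch_divisions(release, bbox=None, country=None):
    """Run the query with DuckDB and return the result relation."""
    import duckdb  # imported lazily; the post notes it ships with Pro 3.5

    con = duckdb.connect()
    for ext in ("httpfs", "spatial"):  # S3 access and geometry support
        con.install_extension(ext)
        con.load_extension(ext)
    con.execute("SET s3_region = 'us-west-2'")
    return con.sql(divisions_query(release, bbox, country))
```

Because the filter is part of the SQL, DuckDB only needs to pull the matching rows over the network. Writing the result into a geodatabase is a separate step that this sketch does not cover.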