Customers frequently ask about approaches to integrating Snowflake with the ArcGIS platform. At Snowflake, we have seen steady growth in customer adoption of Snowflake for GIS-based use cases. The diagram below lays out the different access patterns between the two platforms.
You can learn more about these integration points here: ArcGIS and Snowflake.
Even so, customers have consistently asked us to develop integrations and solutions that avoid data movement, especially for large datasets. One concern is the cost of transferring data in and out; another is data governance. Until now, customers have built custom external Python solutions that export data from Snowflake, perform geospatial analysis with the ArcGIS SDK, ingest the results back into Snowflake, and/or upload the datasets as feature layers into ArcGIS.
We have also been asked about bringing data directly from ArcGIS into Snowflake, so that customers can natively implement non-interactive geospatial processing pipelines in Snowflake.
Earlier this year, I explained how this could be done using the ArcGIS REST API in the article Interfacing with ArcGIS Location Services in Snowflake. While that approach works, it is tedious to adopt, especially when developing code for advanced use cases like routing.
More generally, Snowflake customers have asked for the ability to bring bespoke Python libraries from PyPI into their data processing pipelines. Snowflake answered that call, as explained in this article, by allowing access to PyPI packages in Snowpark via UDFs and stored procedures. Initially, however, only wheel (.whl) packages were supported. Because of this limitation, the ArcGIS Python SDK could not be used within Snowflake, as some of its dependencies are not distributed as wheels. Recently, Snowflake addressed this gap and now allows sdist packages to be used in Snowpark user-defined functions (UDFs) and stored procedures (sprocs). This means the ArcGIS Python SDK can now be used directly within Snowflake.
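To make this concrete, here is a minimal sketch of registering a stored procedure that pulls the ArcGIS SDK from PyPI via a Snowpark session. The DDL follows the PyPI-packages feature described above; the procedure name, handler body, and `portalName` check are hypothetical placeholders, so treat this as a template rather than a definitive recipe, and verify the clause names against the Snowflake documentation for your account edition.

```python
# Sketch: a stored procedure whose Python packages are resolved from PyPI
# through Snowflake's artifact repository feature. The procedure name and
# handler body are illustrative placeholders.
ARCGIS_SPROC_DDL = """
CREATE OR REPLACE PROCEDURE demo_arcgis_sproc()
  RETURNS VARCHAR
  LANGUAGE PYTHON
  RUNTIME_VERSION = '3.11'
  ARTIFACT_REPOSITORY = snowflake.snowpark.pypi_shared_repository
  PACKAGES = ('snowflake-snowpark-python', 'arcgis')
  HANDLER = 'run'
AS
$$
def run(session):
    from arcgis.gis import GIS  # resolved from PyPI, sdist dependencies included
    gis = GIS()                 # anonymous connection to arcgis.com
    return str(gis.properties.portalName)
$$
"""

def register_arcgis_sproc(session):
    """Execute the DDL through an existing Snowpark session."""
    session.sql(ARCGIS_SPROC_DDL).collect()
```

Once created, the procedure can be invoked like any other sproc (`CALL demo_arcgis_sproc()`), with the ArcGIS SDK resolved at creation time rather than packaged by hand.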
The diagram below explains how customers can now use the ArcGIS SDK in Snowflake.
Documentation: Using third-party packages
The benefit of this approach is that the data never has to leave Snowflake: no export/import pipelines to maintain, no data transfer costs, and your data stays within Snowflake's governance boundary.
Ok, less talkie talkie, more walkie walkie!!!
Github Gist: sfc-gh-vsekar/DEMO_BATCH_GEOCODING_SPROC.ipynb
Here are some examples that demonstrate how to do this:
Snowflake Notebook: DEMO_BATCH_GEOCODING_SPROC: demonstrates batch geocoding by reading a table from Snowflake, invoking the ArcGIS geocoding service, and populating the table with the resulting coordinates.
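As a rough sketch of what that notebook does: `batch_geocode` from the ArcGIS SDK returns one result dict per address, with a `"location"` entry carrying `"x"` (longitude) and `"y"` (latitude). The column names (`ADDRESS`, `LON`, `LAT`) and table handling below are assumptions for illustration, not the notebook's exact schema.

```python
def merge_geocode_results(rows, geocode_results, lon_col="LON", lat_col="LAT"):
    """Attach coordinates returned by ArcGIS batch_geocode to the input rows.

    Each geocode result is a dict whose "location" entry carries
    "x" (longitude) and "y" (latitude).
    """
    merged = []
    for row, result in zip(rows, geocode_results):
        loc = result.get("location", {})
        merged.append({**row, lon_col: loc.get("x"), lat_col: loc.get("y")})
    return merged


def geocode_table(session, table_name, address_col="ADDRESS"):
    """Read a table, geocode its addresses via ArcGIS, and write it back.

    Assumes the sproc runs with the ArcGIS SDK available and an active
    GIS connection already established (e.g. arcgis.gis.GIS(...)).
    """
    from arcgis.geocoding import batch_geocode
    rows = [r.as_dict() for r in session.table(table_name).collect()]
    results = batch_geocode([r[address_col] for r in rows])
    merged = merge_geocode_results(rows, results)
    session.create_dataframe(merged).write.save_as_table(table_name, mode="overwrite")
```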
Snowflake Notebook: DEMO_FEATURE_LAYER_SPROC: demonstrates uploading data as a feature layer and downloading a feature layer into a table, based on this doc: Working with Feature Layers and Features.
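The key translation in that flow is between Snowflake rows and the ArcGIS point-feature JSON shape (`{"attributes": {...}, "geometry": {"x": ..., "y": ...}}`). Below is a minimal sketch assuming WGS 84 longitude/latitude columns named `LON`/`LAT`; the upload and download helpers use the SDK's `FeatureLayer.edit_features` and `query` calls, but the table and column names are illustrative.

```python
def rows_to_features(rows, x_col="LON", y_col="LAT"):
    """Convert plain dict rows into the ArcGIS point-feature JSON shape."""
    features = []
    for row in rows:
        attributes = {k: v for k, v in row.items() if k not in (x_col, y_col)}
        features.append({
            "attributes": attributes,
            "geometry": {
                "x": row[x_col],
                "y": row[y_col],
                "spatialReference": {"wkid": 4326},  # WGS 84 lon/lat
            },
        })
    return features


def upload_table_as_features(session, table_name, layer):
    """Append a Snowflake table's rows to an existing ArcGIS FeatureLayer."""
    rows = [r.as_dict() for r in session.table(table_name).collect()]
    layer.edit_features(adds=rows_to_features(rows))


def download_layer_to_table(session, layer, table_name):
    """Pull a feature layer's attribute table into a Snowflake table."""
    attrs = [f.attributes for f in layer.query(where="1=1").features]
    session.create_dataframe(attrs).write.save_as_table(table_name, mode="overwrite")
```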
Snowflake Notebook: DEMO_FIND_CLOSEST_FACILITY_SPROC: demonstrates advanced routing/network functionality, finding the closest facilities. This service can be adopted in dispatch scenarios, for example, routing the nearest available vehicle to an incident.
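For intuition, the sketch below pairs a locally testable stand-in (straight-line distance, purely illustrative; the hosted service solves over the street network instead) with the shape of the hosted call via `arcgis.network.analysis.find_closest_facilities`. Verify the argument and result names against the docs for your SDK version.

```python
import math

def closest_facility(incident, facilities, distance_fn=None):
    """Pick the nearest facility to an incident.

    The hosted ArcGIS service uses street-network travel cost; this local
    stand-in defaults to straight-line distance purely for illustration.
    Points are (x, y) tuples.
    """
    if distance_fn is None:
        distance_fn = lambda a, b: math.hypot(a[0] - b[0], a[1] - b[1])
    return min(facilities, key=lambda f: distance_fn(incident, f))


def find_routes_via_arcgis(incidents_fs, facilities_fs):
    """Sketch of invoking the hosted closest-facility solver.

    Assumes an active GIS connection; incidents_fs and facilities_fs are
    FeatureSets of the incident and facility points.
    """
    from arcgis.network.analysis import find_closest_facilities
    result = find_closest_facilities(
        incidents=incidents_fs,
        facilities=facilities_fs,
        number_of_facilities_to_find=1,
    )
    return result.output_routes
```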
The image below is a map based on the various feature layers published by this demo:
While these demos are simple, your needs may be more complex. We would love to understand them and collaborate on solutions. Reach out to your Snowflake or Esri representative so that we can help.
Till next time, play on!!!