Notebooks and SQL Servers

1579
3
05-27-2021 09:36 PM
Labels (1)
ToryChristensen
New Contributor III

Hi all,

I'm currently using FME Desktop to extract data from an SQL server and then amend a AGOL feature service, to ensure the feature service is updated. While this works, its a manual process and that limits how often the feature service can be updated.

I was hoping that there was a way that I can use Notebooks instead of FME Desktop, and also take advantage of the scheduling functionality to update the feature more often.

I'm slowly learning Python, but wanted to check whether Notebooks would be able query an SQL database.

Thanks in advance

0 Kudos
3 Replies
wwnde
by
Occasional Contributor

Hi Tory

 

I have had this question for a long time.

We are an organization purely on AGOL. There hasn't been ways to schedule notebooks to run.

 

We adopted Azure as a cloud computing system and things changed. Within Azure, I can create Functions.

Functions are triggered by occurrence of time or new data. I integrated ArcGIS python API python scripts, scipy spatial and QGIS and I can now be able to directly read and update feature services on AGOL using SQL servers on Premise and Azure SQL dbs in cloud. Importantly, I can use databricks which is scalable and can handle million of rows of data

 

I know you may not have these facilities, however it may not possible to do this within ArcGIS.   I believe other cloud computing platforms other than Azure are capable of doing the same.

I am finding it quite limited to rely on ArcPro to integrate GIS datasets with non GIS data and beginning to ask myself if as GIS specialist I should look beyond the conventional GIS software. 

0 Kudos
jcarlson
MVP Esteemed Contributor

The answer: it depends!

Is your SQL server accessible via the web, or do you need to be in-network? If the former, you can absolutely use a Notebook. If the latter, you'll need a local Python file, but you can still use a Notebook environment to develop the script.

In either case, you'll want to take a look at pandas, specifically pandas.read_sql(), which can read either a whole table or a specific query into a dataframe. For those not familiar, pandas (and the underlying numpy) is a fantastic tool for analyzing and manipulating tabular data.

Connecting to an SQL database via pandas utilizes sqlalchemy, so that would be useful to look at as well, though the pandas examples show you everything you need to know.

Side note: if you're working in the AGOL Notebook environment, you will be missing some of the submodules sqlalchemy needs to connect to various database types. These can be handled by including a cell which says:

pip install pymssql (or other required submodule)

If you're working locally, you can have this installed already, and skip that step.

Once you've queried out the data and made whatever changes you need, you can then use the ArcGIS Python API to append to / update / overwrite hosted feature services.

from arcgis import GIS
import pandas as pd
from sqlalchemy import create_engine

# DB connection
engine = create_engine('mssql+pymssql://*server_name*/*database_name*?trusted_connection=yes')

# Query string
query = """SELECT t.somecolumn, t.othercolumn
    FROM dbo.TableName t
    ORDER BY t.somecolumn DESC"""

# Query DB to dataframe
df = pd.ready_sql(query, engine)

## Do some things to the dataframe here

# Connect to your AGOL / Enterprise
gis = GIS('org-url', 'username', 'password')

# Get the target layer; swap the '0' for whatever layer index you need
# Alternate: use .tables[0] if target is non-spatial
target_layer = gis.content.get('layer itemID').layers[0]

# Clear existing records
target_layer.manager.truncate()

# Append new records from dataframe
target_layer.edit_features(adds=df.spatial.to_featureset())

 

I should say, there are LOTS of ways to manage that end part. Maybe instead of truncate / append, you choose to identify updated rows based on a unique identifier column. Maybe you're just appending new data and you don't need to truncate anything. That part will really depend on your particular situation.

Now, if you're working locally, you can use a Notebook to develop the script, as I mentioned. Once you're sure it works, start to finish, with no manual intervention, copy all the code blocks into a single layer_update.py script, or whatever you want to call it. Then you can schedule that script to run using your computer's task scheduler.

If you're working in an AGOL Notebook, you can schedule it to run from there.

- Josh Carlson
Kendall County GIS
ToryChristensen
New Contributor III

Thank you for such a detailed response. I'm starting to work on it now!!!

I'll let you know how I go 🙂

0 Kudos