Data Pipelines Blog

(8 Posts)
DuncanMackey
Esri Contributor

We’re excited to announce a new experimental Data Pipelines module included in the 2.3.0 release of ArcGIS API for Python. This has been a frequently requested feature, and you can now use the API to integrate Data Pipelines into your own systems or orchestration frameworks. You can run data pipelines and get the results in ArcGIS Notebooks, on other platforms such as Azure Functions and AWS Lambda, or anywhere else you can install the ArcGIS API for Python. This allows for more advanced orchestration workflows, such as using Azure Functions to run a data pipeline when new files are added to an Azure Blob Storage container.

Note that this is an experimental feature, and function calls and behavior may change in future ArcGIS API for Python releases. If you have feedback on the API, please post in our community forum!

To give a quick tour of this new capability, I will walk through how to call an existing data pipeline from ArcGIS Notebooks and view the results.

You will need either ArcGIS Notebooks or access to the ArcGIS API for Python version 2.3.0 in your environment. If you are just getting started with Data Pipelines, visit the documentation. If you're interested in learning more about the ArcGIS API for Python, you can do so here.

Let’s get to it!

Step 1: Find or create a data pipeline to run. 

Find or create the data pipeline you would like to run using Python. Note the title of the data pipeline item. If you do not have an existing data pipeline, follow the steps here to create your first data pipeline.

In this example I will be running a data pipeline I titled "My Data Pipeline for ArcGIS API for Python".

Step 2: Create a new ArcGIS Notebook.

Next, let's create the notebook that we will use to run our data pipeline. This step can be skipped if you are using your own Python environment.

  1. Navigate to ArcGIS Notebooks. If this is not visible in the top bar of ArcGIS Online, your account may not have the role required to access ArcGIS Notebooks.
  2. Click New notebook > Standard to create a new standard notebook.

Now we can get started running our data pipeline from our new notebook.

Step 3: Import and call the new datapipelines module.

In a new cell, add the following code and run all cells in the notebook:

 

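# connect to your organization; ArcGIS Notebooks already does this in its default first cell
from arcgis.gis import GIS
gis = GIS("home")
# import the datapipelines module
from arcgis import datapipelines
# search for the data pipeline item and take the first match
item = gis.content.search(query="My Data Pipeline for ArcGIS API for Python")[0]
# run the data pipeline
run = datapipelines.run_data_pipeline(item)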

Once this cell is run, the run variable will contain a PipelineRun object that has a number of properties and methods allowing you to inspect the status and results of the run.
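
Taking the first search result with [0] works when the title is unique, but a search can return several similarly named items. Here is a minimal sketch of a stricter lookup, assuming only that each search result exposes a title attribute. The helper name pick_by_title and the stand-in objects are hypothetical and are not part of the ArcGIS API for Python.

```python
from types import SimpleNamespace


def pick_by_title(items, title):
    """Return the single item whose title matches exactly, or raise ValueError."""
    matches = [it for it in items if getattr(it, "title", None) == title]
    if not matches:
        raise ValueError(f"No item titled {title!r} in the search results")
    if len(matches) > 1:
        raise ValueError(f"{len(matches)} items share the title {title!r}")
    return matches[0]


# Hypothetical stand-in objects so the sketch runs without a GIS connection.
fake_results = [
    SimpleNamespace(title="My Data Pipeline for ArcGIS API for Python (copy)", id="a1"),
    SimpleNamespace(title="My Data Pipeline for ArcGIS API for Python", id="b2"),
]
chosen = pick_by_title(fake_results, "My Data Pipeline for ArcGIS API for Python")
print(chosen.id)
```

In the notebook, you would pass the list returned by gis.content.search(...) to this helper in place of the stand-in list.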

Step 4: Get the results of the run.

In a new cell, add the following code and run the cell:

 

run.result()

You will see the status of the run logged in the cell until it completes. Once the run is complete, the result is printed, showing the outcome of the run.

Step 5: Explore the PipelineRun object.

Here are the properties and methods that you can access on the PipelineRun object:

Properties:

 

run.status  # contains the current status of the run
run.properties  # contains general properties of the run including start time

Methods:

 

run.result()  # waits for the run to finish and returns the results
run.cancel()  # cancels the run
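
Because result() waits for the run to finish, an orchestrator with a time limit may want to stop waiting and cancel the run. The following is a rough sketch of that idea using only the result() and cancel() methods listed above. The helper result_with_timeout and the FakeRun class are hypothetical stand-ins for illustration, and the status strings in FakeRun are placeholders, not documented values. Whether a blocked result() call returns promptly after cancel() in the real module is an assumption to verify.

```python
import threading
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout


def result_with_timeout(run, timeout_s):
    """Wait up to timeout_s seconds for run.result(); cancel the run on timeout.

    Returns a (finished, value) tuple. value is None when the run was cancelled.
    """
    executor = ThreadPoolExecutor(max_workers=1)
    future = executor.submit(run.result)
    try:
        return True, future.result(timeout=timeout_s)
    except FutureTimeout:
        run.cancel()
        return False, None
    finally:
        # Do not block on the worker thread; it ends when result() returns.
        executor.shutdown(wait=False)


class FakeRun:
    """Hypothetical stand-in for a PipelineRun, for local testing only."""

    def __init__(self, seconds):
        self._seconds = seconds
        self._cancelled = threading.Event()
        self.status = "running"  # placeholder, not a documented status value

    def result(self):
        # Block until the simulated work finishes or cancel() is called.
        if self._cancelled.wait(self._seconds):
            return None
        self.status = "done"
        return {"outcome": "ok"}

    def cancel(self):
        self.status = "cancelled"
        self._cancelled.set()


fast = result_with_timeout(FakeRun(0.01), timeout_s=2)
slow_run = FakeRun(30)
slow = result_with_timeout(slow_run, timeout_s=0.1)
print(fast, slow, slow_run.status)
```

With a real run, you would pass the object returned by datapipelines.run_data_pipeline(item) instead of FakeRun.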

 

To recap, we searched for our data pipeline item, ran the data pipeline, and showed the results of the run. This could fit into any number of broader data prep workflows, whether on ArcGIS Online or another platform that has access to the ArcGIS API for Python, and allows for more flexibility in how data pipelines are run.
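
The recap can be condensed into one reusable function for an orchestration script. This is only a sketch: run_pipeline_by_title is a hypothetical helper, and the search and run functions are passed in as arguments so the example can run here against stand-ins rather than a live organization.

```python
from types import SimpleNamespace


def run_pipeline_by_title(search, start_run, title):
    """Search for a data pipeline item by title, start it, and return its result.

    search    -- callable accepting query=... and returning a list of items
    start_run -- callable accepting an item and returning an object with result()
    """
    items = search(query=title)
    if not items:
        raise LookupError(f"No item found for query {title!r}")
    run = start_run(items[0])
    return run.result()


# Hypothetical stand-ins that mimic the shape of the real calls.
def fake_search(query):
    return [SimpleNamespace(title=query)]


def fake_start_run(item):
    return SimpleNamespace(result=lambda: f"finished: {item.title}")


outcome = run_pipeline_by_title(
    fake_search, fake_start_run, "My Data Pipeline for ArcGIS API for Python"
)
print(outcome)
```

In a notebook, the equivalent call would pass gis.content.search and datapipelines.run_data_pipeline as the first two arguments.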

Thanks for following along, and feel free to leave any questions in the comments below!

 

Sarah_Hanson
Esri Contributor

This blog provides answers to questions asked during the May 2024 webinar on ArcGIS Data Pipelines.

BethanyScott
Esri Contributor

This blog outlines the breaking changes introduced in the February 2024 update of ArcGIS Data Pipelines.

 

BethanyScott
Esri Contributor

ArcGIS Data Pipelines is no longer in beta and is now available for general use.

BethanyScott
Esri Contributor

This blog outlines the breaking changes introduced in the October 2023 update of Data Pipelines (beta).

DuncanMackey
Esri Contributor

Learn more about Data Pipelines and what it can do by following along with this workflow. We'll explore transforming, previewing, spatially joining two datasets, and more!

BethanyScott
Esri Contributor

Read the introductory Data Pipelines blog and watch a quick video to get started.

BruceHarold
Esri Regular Contributor

Security, security, security...
