Select to view content in your preferred language

Data Loading Workspace - Automatically update the Source and Target Path

3241
9
Jump to solution
10-25-2022 05:18 AM
Luca_Schikowski
Regular Contributor

Hi there,

I am currently working with the Data Loading Workspace. I want to load data into my Utility Network. While manually creating the workspace table and executing the data load works fine, I got some problems with trying to automise this process with python using the openpyxl library.

I got a Data Loading Workspace, (DataReference.xlsx, Folders for DataMapping, Domains and Scripts). The Worksheet "SourceTargetMapping" in my DataReference.xlsx is already prepared for loading Data into my Utility Network. I got a lot of source definition queries, target delete queries and the field mapping inside of the mapping workbooks is also ready for data loading.

What I want to do now is to automatically update the Paths in the Source and Target columns. My first approach was to update the paths in the "DataPath"-sheet as shown here:

workbook_source = reference_workbook+"\DataReference.xlsx" ('reference_workbook' contains the path of the DataLoadingWorkspace folder)
workbook = openpyxl.load_workbook(workbook_source)
worksheet = workbook.get_sheet_by_name('DataPath')

fd_string = str(featuredataset) (featuredataset contains my UtilityNetwork featuredataset)
worksheet['A2'] = source
worksheet['A3'] = fd_string
workbook.save(workbook_source)

When I run this code, my data paths are updated. This is how the DataPath sheet looks like after running my code:

Luca_Schikowski_0-1666699024890.png

It looks good. When I manually create the DataLoadingWorkspace with this source and target paths, it looks excactly the same. But when I try to execute the data load, nothing happens. In fact, i get the following messages:

Luca_Schikowski_1-1666699989413.png

Which required values are missing here? Sadly I could not find some help online.

Thank you in advance for your help, i have been stuck on this problem for a while now..

Greetings from Germany

Luca

0 Kudos
2 Solutions

Accepted Solutions
TedHoward2
Esri Contributor

A couple things may be happening here but I believe it has to do with a limitation with openpyxl and excel workbooks that contain formulas. Excel caches formula results so you need to open your DataReference workbook and hit save. This enables openpyxl to properly read the rows in the DataReference workbook during execution of the Execute Data Load tool.

Just out of curiosity, what's the reasoning behind programmatically updating the data paths? 

View solution in original post

SumitMishra_016
Frequent Contributor

Hi @TedHoward2 ,

I also want to automatically update source path before running "Loading Data Loading Using Workspace". What can I use instead of openpyxl?

View solution in original post

9 Replies
TedHoward2
Esri Contributor

A couple things may be happening here but I believe it has to do with a limitation with openpyxl and excel workbooks that contain formulas. Excel caches formula results so you need to open your DataReference workbook and hit save. This enables openpyxl to properly read the rows in the DataReference workbook during execution of the Execute Data Load tool.

Just out of curiosity, what's the reasoning behind programmatically updating the data paths? 

Luca_Schikowski
Regular Contributor

Hi, thank you for responding! 🙂

Could you explain it a bit more detailled for me? Sadly I dont understand what you were trying to say.. sorry.

 

 

But what we want to do is start using the Utility Network for our telecommunication management. Therefore, we have to load all the needed data into the utility network. 

Whenever we start working on a new project, the employee has to create a new ArcGIS Pro project, create an utility network and migrate all the data into it. This will be done regularly (every 2 weeks approximately). All the data mapping workbooks and sql queries do not change, so the only thing to update are the data paths.

Well, creating the utility network and all its configurations works automatically already. Now we want to make the data import as simple as possible for the employee.. and what could be more simple than doing it automatically aswell with python code.. atleast thats what I thought.

Of course we could manually open the excel file, copy the data paths and paste them into the "DataPath" worksheet. But is this really the best way to work with here?

I hope you got a better understanding now.

0 Kudos
TedHoward2
Esri Contributor

Ok that makes sense. I was just trying to understand your workflow. So specifically in the DataReference workbook, there are formulas that pull the data path values into the Source and Target columns in the SourceTargetMapping sheet. You can see these by opening the DataReference workbook in Excel, selecting the Formulas pane and clicking the Show Formulas button.

TedHoward2_0-1666716291638.png

The issue with openpyxl is that it cannot read cells that contain formulas if the Excel workbook was saved by openpyxl. It can only read these cells if the workbook was actually opened in Excel and saved (this is what I meant by Excel caching). In your case you are editing the DataReference workbook with openpyxl and saving it with openpyxl. When Execute Data Load tool runs, openpyxl is used to read the DataReference workbook and it fails trying to read the cells that contain formulas. The only solution is to open the DataReference workbook in Excel and click save. After this, Execute Data Load tool should work fine. Let me know if this does not resolve it.

 

0 Kudos
Luca_Schikowski
Regular Contributor

Oh you are right! I opened the excel file, just saved it and it worked. I am not sure how I want to solve this issue, but atleast I know why it is not working.

Thank you 🙂

0 Kudos
SumitMishra_016
Frequent Contributor

Hi @Luca_Schikowski ,

Were you able to find any solution?

0 Kudos
LucaSchikowski
Emerging Contributor

Hi, sadly no.

All I got was this:

# Import subprocess
import subprocess
# Open Excel-File with Popen-method
excel_workbook = subprocess.Popen(workbook_source,shell=True)
#workbook_source contains the local data path of the data loading workspace table

 

This automatically opens the excel file while running the code. But I didnt manage to find a way to close it afterwards. So by now, when i want to create an Utility Network, everything works automatically, except closing this Excel File. This has to be done manually. Hope that helps.

SumitMishra_016
Frequent Contributor

Hi @TedHoward2 ,

I also want to automatically update source path before running "Loading Data Loading Using Workspace". What can I use instead of openpyxl?

TedHoward2
Esri Contributor

If you are using Load Data Using Workspace then I assume you are on ArcGIS Pro 3.2+. With this release we did away with the DataPath sheet entirely so formulas are not a problem anymore. You can use openpyxl to update the Source and Target columns on the SourceTargetMapping sheet without problem.

If you are by chance using 3.1 or earlier you can still get around it by updating the Source, Target columns directly on SourceTargetMapping sheet using openpyxl. Don't use the DataPath sheet though, that won't work still.

0 Kudos
TedHoward2
Esri Contributor

something like this...

dr = r"C:/temp/DataReference.xlsx"
workbook = openpyxl.load_workbook(dr)
worksheet = workbook.get_sheet_by_name("SourceTargetMapping")
for source, def_query, target in worksheet.iter_rows(min_row=2, max_col=3):
    source.value = "" # new
    target.value = "" # new

workbook.save(dr)
0 Kudos