<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Issue calling a subprocess from ArcGIS Pro Toolbox in ArcGIS Pro Questions</title>
    <link>https://community.esri.com/t5/arcgis-pro-questions/issue-calling-a-subprocess-from-arcgis-pro-toolbox/m-p/1203736#M58748</link>
    <description>&lt;P&gt;Hi all,&lt;/P&gt;&lt;P&gt;I am developing a toolbox that, at some point, has to process a large amount of data and apply a function to each row. For efficiency reasons I already use a pandas DataFrame.&lt;/P&gt;&lt;P&gt;I am familiar with pandas and big DataFrames, which is why I tried using &lt;STRONG&gt;pandas.Series.apply&lt;/STRONG&gt; to apply the function to each row efficiently.&lt;/P&gt;&lt;P&gt;But in my benchmarks, processing a DataFrame with &lt;STRONG&gt;100k&lt;/STRONG&gt; rows takes about 100 s, and I don't really want to be stuck with a calculation on &lt;STRONG&gt;5 million&lt;/STRONG&gt; rows taking 50 times longer when it could be sped up by almost a factor equal to my number of CPU cores. It takes even longer when run from &lt;EM&gt;inside&lt;/EM&gt; ArcGIS Pro, where I had reached about 25% completion after 1 hour of computing time.&lt;/P&gt;&lt;P&gt;So I looked into pooling: splitting my DataFrame into chunks to use my full CPU power and applying the function to each chunk in its own worker process. I ran into some issues with ArcGIS Pro opening new instances of itself when naively trying to use &lt;STRONG&gt;concurrent.futures&lt;/STRONG&gt;, but fixed this by using a subprocess like the &lt;A href="https://github.com/Esri/large-network-analysis-tools" target="_self"&gt;large-network-analysis-tools&lt;/A&gt; repository does.&lt;/P&gt;&lt;P&gt;But that led to my current problem:&lt;/P&gt;&lt;PRE&gt;&lt;SPAN class=""&gt;File "C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\Lib\subprocess.py", line 951, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\Lib\subprocess.py", line 1420, in _execute_child
    hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
PermissionError: [WinError 5] &lt;/SPAN&gt;Access Denied&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have already tried granting myself permissions as described here: &lt;A href="https://community.esri.com/t5/arcgis-pro-questions/arcpy-permission-denied-errors-in-arcgispro-2-3/td-p/332962/page/2" target="_self"&gt;Link&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Running the same arguments from the command line works perfectly fine, but the subprocess call seems to run into a permission issue, even when I start ArcGIS Pro as admin.&lt;/P&gt;&lt;P&gt;I'd appreciate any hints or help with this issue. I am happy to share code snippets and to rewrite a lot of code if there is a better way of handling this amount of data.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;EDIT: Here is a snippet of the subprocess call (I think those double quotes need to be there because of the space in "Program Files" in the directory path):&lt;/P&gt;&lt;PRE&gt;create_no_window = &lt;SPAN class=""&gt;0x08000000&lt;/SPAN&gt;
cwd = os.path.dirname(os.path.abspath(__file__))
python_path = os.path.join(sys.exec_prefix, &lt;SPAN class=""&gt;"python.exe"&lt;/SPAN&gt;)
script_path = os.path.join(cwd, &lt;SPAN class=""&gt;"solveTable.py"&lt;/SPAN&gt;)
inputs = [
  &lt;SPAN class=""&gt;'"{}"'&lt;/SPAN&gt;.&lt;SPAN class=""&gt;format&lt;/SPAN&gt;(python_path),
  &lt;SPAN class=""&gt;'"{}"'&lt;/SPAN&gt;.&lt;SPAN class=""&gt;format&lt;/SPAN&gt;(script_path),
  &lt;SPAN class=""&gt;"--in_df"&lt;/SPAN&gt;, &lt;SPAN class=""&gt;'"{}"'&lt;/SPAN&gt;.&lt;SPAN class=""&gt;format&lt;/SPAN&gt;(in_df_csv),
  &lt;SPAN class=""&gt;# some more kwargs&lt;/SPAN&gt;
  ]
&lt;SPAN class=""&gt;with&lt;/SPAN&gt; subprocess.Popen(
  inputs,
  stdout=subprocess.PIPE, stderr=subprocess.PIPE,
  creationflags=create_no_window) &lt;SPAN class=""&gt;as&lt;/SPAN&gt; process:
  &lt;SPAN class=""&gt;# some code for logging and error handling&lt;/SPAN&gt;
&lt;/PRE&gt;&lt;P&gt;solveTable.py parses the kwargs from the input, splits in_df into n chunks, calculates the values for each chunk, and saves each result to a .csv file for me to read when everything is done. As stated above, this works perfectly when run from the command line.&lt;/P&gt;</description>
    <pubDate>Thu, 18 Aug 2022 09:09:19 GMT</pubDate>
    <dc:creator>JonasNeubürger</dc:creator>
    <dc:date>2022-08-18T09:09:19Z</dc:date>
    <item>
      <title>Issue calling a subprocess from ArcGIS Pro Toolbox</title>
      <link>https://community.esri.com/t5/arcgis-pro-questions/issue-calling-a-subprocess-from-arcgis-pro-toolbox/m-p/1203736#M58748</link>
      <description>&lt;P&gt;Hi all,&lt;/P&gt;&lt;P&gt;I am developing a toolbox that, at some point, has to process a large amount of data and apply a function to each row. For efficiency reasons I already use a pandas DataFrame.&lt;/P&gt;&lt;P&gt;I am familiar with pandas and big DataFrames, which is why I tried using &lt;STRONG&gt;pandas.Series.apply&lt;/STRONG&gt; to apply the function to each row efficiently.&lt;/P&gt;&lt;P&gt;But in my benchmarks, processing a DataFrame with &lt;STRONG&gt;100k&lt;/STRONG&gt; rows takes about 100 s, and I don't really want to be stuck with a calculation on &lt;STRONG&gt;5 million&lt;/STRONG&gt; rows taking 50 times longer when it could be sped up by almost a factor equal to my number of CPU cores. It takes even longer when run from &lt;EM&gt;inside&lt;/EM&gt; ArcGIS Pro, where I had reached about 25% completion after 1 hour of computing time.&lt;/P&gt;&lt;P&gt;So I looked into pooling: splitting my DataFrame into chunks to use my full CPU power and applying the function to each chunk in its own worker process. I ran into some issues with ArcGIS Pro opening new instances of itself when naively trying to use &lt;STRONG&gt;concurrent.futures&lt;/STRONG&gt;, but fixed this by using a subprocess like the &lt;A href="https://github.com/Esri/large-network-analysis-tools" target="_self"&gt;large-network-analysis-tools&lt;/A&gt; repository does.&lt;/P&gt;&lt;P&gt;But that led to my current problem:&lt;/P&gt;&lt;PRE&gt;&lt;SPAN class=""&gt;File "C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\Lib\subprocess.py", line 951, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\Lib\subprocess.py", line 1420, in _execute_child
    hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
PermissionError: [WinError 5] &lt;/SPAN&gt;Access Denied&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have already tried granting myself permissions as described here: &lt;A href="https://community.esri.com/t5/arcgis-pro-questions/arcpy-permission-denied-errors-in-arcgispro-2-3/td-p/332962/page/2" target="_self"&gt;Link&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Running the same arguments from the command line works perfectly fine, but the subprocess call seems to run into a permission issue, even when I start ArcGIS Pro as admin.&lt;/P&gt;&lt;P&gt;I'd appreciate any hints or help with this issue. I am happy to share code snippets and to rewrite a lot of code if there is a better way of handling this amount of data.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;EDIT: Here is a snippet of the subprocess call (I think those double quotes need to be there because of the space in "Program Files" in the directory path):&lt;/P&gt;&lt;PRE&gt;create_no_window = &lt;SPAN class=""&gt;0x08000000&lt;/SPAN&gt;
cwd = os.path.dirname(os.path.abspath(__file__))
python_path = os.path.join(sys.exec_prefix, &lt;SPAN class=""&gt;"python.exe"&lt;/SPAN&gt;)
script_path = os.path.join(cwd, &lt;SPAN class=""&gt;"solveTable.py"&lt;/SPAN&gt;)
inputs = [
  &lt;SPAN class=""&gt;'"{}"'&lt;/SPAN&gt;.&lt;SPAN class=""&gt;format&lt;/SPAN&gt;(python_path),
  &lt;SPAN class=""&gt;'"{}"'&lt;/SPAN&gt;.&lt;SPAN class=""&gt;format&lt;/SPAN&gt;(script_path),
  &lt;SPAN class=""&gt;"--in_df"&lt;/SPAN&gt;, &lt;SPAN class=""&gt;'"{}"'&lt;/SPAN&gt;.&lt;SPAN class=""&gt;format&lt;/SPAN&gt;(in_df_csv),
  &lt;SPAN class=""&gt;# some more kwargs&lt;/SPAN&gt;
  ]
&lt;SPAN class=""&gt;with&lt;/SPAN&gt; subprocess.Popen(
  inputs,
  stdout=subprocess.PIPE, stderr=subprocess.PIPE,
  creationflags=create_no_window) &lt;SPAN class=""&gt;as&lt;/SPAN&gt; process:
  &lt;SPAN class=""&gt;# some code for logging and error handling&lt;/SPAN&gt;
&lt;/PRE&gt;&lt;P&gt;solveTable.py parses the kwargs from the input, splits in_df into n chunks, calculates the values for each chunk, and saves each result to a .csv file for me to read when everything is done. As stated above, this works perfectly when run from the command line.&lt;/P&gt;</description>
      <pubDate>Thu, 18 Aug 2022 09:09:19 GMT</pubDate>
      <guid>https://community.esri.com/t5/arcgis-pro-questions/issue-calling-a-subprocess-from-arcgis-pro-toolbox/m-p/1203736#M58748</guid>
      <dc:creator>JonasNeubürger</dc:creator>
      <dc:date>2022-08-18T09:09:19Z</dc:date>
    </item>
    <item>
      <title>Re: Issue calling a subprocess from ArcGIS Pro Toolbox</title>
      <link>https://community.esri.com/t5/arcgis-pro-questions/issue-calling-a-subprocess-from-arcgis-pro-toolbox/m-p/1211266#M59704</link>
      <description>&lt;P&gt;My first guess is some issue with the pathing or quoting in the `inputs` list; subprocess can be persnickety about this, particularly on Windows. Perhaps start with a simpler `inputs` list to isolate what's causing the issue, or disable the no-window flag and see whether the process call shows any details. Just try launching python_path with a script that returns a value. I would also check that the path being sent as `in_df_csv` is properly normalized.&lt;/P&gt;</description>
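A minimal sketch of the "simpler inputs" test suggested above, assuming the quoting is the culprit (which the thread does not confirm). The child script and the example path are hypothetical stand-ins, not the original solveTable.py or in_df_csv. It passes each argument as its own list element without hand-added quotes, then uses list2cmdline, the helper subprocess uses on Windows to turn a list into a command line, to compare a plain path with a hand-quoted one.

```python
import subprocess
import sys

# Hypothetical stand-in for solveTable.py: echoes the first argument it receives.
child_code = "import sys; print(sys.argv[1])"

# Hypothetical path containing a space, like "Program Files" in the question.
path_with_space = "C:/Program Files/example/in_df.csv"

# Each argument is its own list element, with no hand-added quotes.
# sys.executable is used here for a standalone interpreter; inside ArcGIS Pro
# the question's python_path would take its place.
args = [sys.executable, "-c", child_code, path_with_space]
result = subprocess.run(args, capture_output=True, text=True, check=True)
received = result.stdout.strip()
print(received == path_with_space)

# list2cmdline is the helper subprocess uses on Windows to build the command
# line from a list (not a documented public API, shown only for inspection).
plain = subprocess.list2cmdline(["C:\\Program Files\\python.exe"])
hand_quoted = subprocess.list2cmdline(['"C:\\Program Files\\python.exe"'])
print(plain)
print(hand_quoted)
```

In the hand-quoted case the added quote characters are escaped and become part of the argument itself, so the program name handed to the operating system may no longer match a real file. Whether that is what produces WinError 5 here is an assumption to verify, for example by removing the '"{}"'.format(...) wrappers from `inputs` and retrying.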
      <pubDate>Fri, 09 Sep 2022 19:54:24 GMT</pubDate>
      <guid>https://community.esri.com/t5/arcgis-pro-questions/issue-calling-a-subprocess-from-arcgis-pro-toolbox/m-p/1211266#M59704</guid>
      <dc:creator>ShaunWalbridge</dc:creator>
      <dc:date>2022-09-09T19:54:24Z</dc:date>
    </item>
  </channel>
</rss>

