I'm using the Run Python Script tool in GeoAnalytics Server, via the ArcGIS API for Python, where I load a CSV file from a big data store, with about 80k records.
The goal is to use PySpark to run a few joins between some Spark DataFrames and store the result in a pre-existing hosted table in Portal.
As there is currently no way to update an existing hosted table using PySpark, I use the ArcGIS API for Python to do the update. For that I call dataframe.rdd.collect() and loop over the rows to insert them into the hosted table. The dataframe.rdd.collect() call fails with the following error: "Job aborted due to stage failure: Task 56 in stage 15.0 failed 4 times, most recent failure: Lost task 56.3 in stage 15.0 (TID 1386, 10.221.254.134, executor 0): TaskResultLost (result lost from block manager)".
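For context, the update step looks roughly like the sketch below. The layer object, field names, and batch size are hypothetical placeholders; the rows here are plain dicts so the batching logic itself runs standalone, with comments marking where the real collect() and edit_features() calls would go:

```python
# Minimal sketch of the collect-and-loop update step (names hypothetical).
# In the real script, `rows` comes from dataframe.rdd.collect() and each
# batch is sent via table_layer.edit_features(adds=...).

def chunked(rows, size):
    """Yield successive batches of at most `size` rows."""
    for i in range(0, len(rows), size):
        yield rows[i:i + size]

# In the real script: rows = dataframe.rdd.collect()
rows = [{"id": i, "value": i * 10} for i in range(10)]

batches = list(chunked(rows, 4))
for batch in batches:
    # In the real script, one edit_features() call per batch instead of
    # one insert per row, to cut round trips to the portal.
    pass

print([len(b) for b in batches])  # → [4, 4, 2]
```

Batching the inserts rather than writing row by row is just how I keep the number of requests to the portal manageable; the failure happens earlier, at the collect() call itself.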
Thank you for your help,