How can I improve performance of this Arcade data expression that joins attributes?

11-24-2023 08:59 AM
Vinzafy
Occasional Contributor

Hey all,

Posting this here to see if anyone can help improve the performance of an Arcade script I'm using. The script is essentially a join between a spatial layer and a related table and is used as a data expression in a dashboard.

Though this is currently achieved via a joined view layer, that restricts schema edits on the underlying hosted feature layer, and I'm hoping to get away from it. Additionally, another use case of mine joins three related tables, which you can't do with a joined view layer.

In the simpler case of joining one related table, the base of the script came from this GitHub page (pasted below). The only minor change I made was to specify fields for the feature sets as opposed to returning all fields.

If I use that data expression as a source for a dashboard item (e.g., an indicator), load times are around 75 seconds vs. 3-5 seconds through a joined view layer.

Is there any way to improve performance of this script? It would help immensely. My other use case, which joins three related tables, currently takes around 2.5 minutes to load. Though it works as expected, the load times are brutal.

Any help is greatly appreciated! Thanks all.
var portal = Portal("https://www.arcgis.com/");
var polyfs = FeatureSetByPortalItem(
    portal,
    "4dbbad3d6f694e0ebc7c3b4132ea34df",
    0,
    ["*"],
    false
);

var tablefs = FeatureSetByPortalItem(
    portal,
    "4dbbad3d6f694e0ebc7c3b4132ea34df",
    6,
    ["*"],
    false
);

// Create empty features array and feat object
var features = [];
var feat;

// Populate Feature Array
for (var t in tablefs) {
    var tableID = t["FeatureID"]
    for (var p in Filter(polyfs, "HydroID = " + tableID)) {
        feat = {
            attributes: {
                FeatureID: tableID,
                Name: p["DPS_Region"],
                ModelID: t["ModelID"],
                AddressCount: t["AddressCount"],
                MAX_TSTime: t["MAX_TSTime"],
            }
        }
        Push(features, feat)
    }
}

var joinedDict = {
    fields: [
        { name: "FeatureID", type: "esriFieldTypeString" },
        { name: "Name", type: "esriFieldTypeString" },
        { name: "ModelID", type: "esriFieldTypeInteger" },
        { name: "AddressCount", type: "esriFieldTypeInteger" },
        { name: "MAX_TSTime", type: "esriFieldTypeString" },
    ],
    geometryType: '',
    features: features
};

// Return dictionary cast as a feature set 
return FeatureSet(joinedDict);
Accepted Solution
JohannesLindner
MVP Frequent Contributor

Arcade is built to optimize queries to the server where the actual data is stored. So when you call FeatureSetByPortalItem(), the FeatureSet isn't actually loaded into your local RAM; the call just creates a connection to the FeatureSet on the server. This is done so that Filter(), Intersects(), First(), and so on can be optimized to send as few query requests to the server as possible.

This approach doesn't work when you call these functions in a loop. You're basically bombarding the server with hundreds to thousands of Filter() requests (one for each feature in the FeatureSet), which slows the script down to a crawl. The developers are aware of this, but I wouldn't hold my breath waiting for a fix; it's basically a known limitation at this point. The intended approach is to use a joined layer.
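
To see where the time goes in your script: the Filter() call in your inner loop runs once per row of tablefs, and every one of those calls is its own round trip to the server:

for (var t in tablefs) {
    var tableID = t["FeatureID"]
    // Each pass through this line sends a separate query to the server:
    for (var p in Filter(polyfs, "HydroID = " + tableID)) {
        // ... build and Push() the joined feature ...
    }
}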

But using a joined layer isn't always possible or sensible. For these cases, @jcarlson wrote a nifty function that actually loads the FeatureSet into your local RAM, where repeated Filter() calls are much faster: Improving Expression Performance: A Custom Function
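
The gist of that approach, for anyone who doesn't follow the link, is something like this (a minimal sketch of the idea, not the exact Memorize() implementation from the blog post):

// Copy a FeatureSet into an array of plain dictionaries once, so later
// loops and lookups run against local RAM instead of the server.
function toLocalArray(fs, fieldNames) {
    var rows = [];
    for (var f in fs) {
        var row = {};
        for (var i in fieldNames) {
            row[fieldNames[i]] = f[fieldNames[i]];
        }
        Push(rows, row);
    }
    return rows;
}

Once the rows are in a local array, you can loop over them as often as you like without touching the server again.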

In your case, we don't actually need that function, but the principle still applies: copy the data from the server and do your filtering locally. Here, a simple dictionary is enough (see where dps_regions is built and then used for the Name lookup in the script below):

// only load the fields you need
var portal = Portal("https://www.arcgis.com/");
var polyfs = FeatureSetByPortalItem(
    portal,
    "4dbbad3d6f694e0ebc7c3b4132ea34df",
    0,
    ["HydroID", "DPS_Region"],
    false
);
var tablefs = FeatureSetByPortalItem(
    portal,
    "4dbbad3d6f694e0ebc7c3b4132ea34df",
    6,
    ["FeatureID", "ModelID", "AddressCount", "MAX_TSTime"],
    false
);

// Repeatedly sending Filter() requests to the server slows the script down to a crawl for larger FeatureSets.
// It's best to create a copy in local RAM. In this case, we can just use a dictionary {HydroID: DPS_Region}.
// If you need Filter(), use the Memorize() function defined in the blog post above.
var dps_regions = Dictionary()
for (var p in polyfs) { dps_regions[Text(p.HydroID)] = p.DPS_Region }

// Create empty features array and feat object
var features = [];
var feat;

// Populate Feature Array
for (var t in tablefs) {
    var tableID = Text(t["FeatureID"])
    feat = {
        attributes: {
            FeatureID: tableID,
            Name: dps_regions[tableID],
            ModelID: t["ModelID"],
            AddressCount: t["AddressCount"],
            MAX_TSTime: t["MAX_TSTime"],
        }
    }
    Push(features, feat)
}

var joinedDict = {
    fields: [
        { name: "FeatureID", type: "esriFieldTypeString" },
        { name: "Name", type: "esriFieldTypeString" },
        { name: "ModelID", type: "esriFieldTypeInteger" },
        { name: "AddressCount", type: "esriFieldTypeInteger" },
        { name: "MAX_TSTime", type: "esriFieldTypeString" },
    ],
    geometryType: '',
    features: features
};
// Return dictionary cast as a feature set 
return FeatureSet(joinedDict);

This script executes more than twice as fast as your original. I expect the difference will be even more noticeable for large FeatureSets.
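
For your three-table use case, the same pattern should extend naturally: one FeatureSetByPortalItem() call per table, one lookup dictionary per related table, and a single pass over the main table with pure in-memory lookups. A rough sketch (the item ID, layer indices, and field names below are placeholders, not your actual schema):

// Hypothetical sketch: join two related tables to a main table.
var portal = Portal("https://www.arcgis.com/");
var tableA = FeatureSetByPortalItem(portal, "your_item_id", 1, ["KeyField", "FieldA"], false);
var tableB = FeatureSetByPortalItem(portal, "your_item_id", 2, ["KeyField", "FieldB"], false);
var mainfs = FeatureSetByPortalItem(portal, "your_item_id", 3, ["KeyField", "FieldC"], false);

// One server round trip per lookup table: copy each into a local dictionary.
var lookupA = Dictionary();
for (var a in tableA) { lookupA[Text(a.KeyField)] = a.FieldA }
var lookupB = Dictionary();
for (var b in tableB) { lookupB[Text(b.KeyField)] = b.FieldB }

// One pass over the main table: no Filter() calls, just dictionary lookups.
var features = [];
for (var m in mainfs) {
    var key = Text(m.KeyField);
    Push(features, {
        attributes: {
            KeyField: key,
            FieldA: lookupA[key],
            FieldB: lookupB[key],
            FieldC: m.FieldC
        }
    });
}

return FeatureSet({
    fields: [
        { name: "KeyField", type: "esriFieldTypeString" },
        { name: "FieldA", type: "esriFieldTypeString" },
        { name: "FieldB", type: "esriFieldTypeString" },
        { name: "FieldC", type: "esriFieldTypeString" }
    ],
    geometryType: '',
    features: features
});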
Take a look at these related Ideas and maybe lend your support. Hopefully, they will be implemented someday...

Arcade FeatureSet Functions should have the option to output to RAM

Arcade: Add a function that quickly converts a Featureset into an Array

Arcade: Add Function to Join Featuresets

Arcade: Add a function to merge multiple Featuresets

Have a great day!
Johannes

Vinzafy
Occasional Contributor

THANK YOU for such a detailed answer and for providing two different workarounds! This is fantastic. I figured the performance issues had something to do with overloading the server with requests, and the workaround of storing the feature set in local RAM and then carrying out the necessary functions is brilliant!

I was able to test this yesterday in the simpler use case, where I tried both the method you outlined above and the Memorize() function by @jcarlson. The difference between the previous method and this new method of storing in local RAM was massive. This method loaded the data expression ~47 times faster than the previous one...awesome.

Really excited to implement this in the more complex use case with three table joins. The inefficiency of that script has been haunting me, and this is the exact cure!

I will definitely be lending my support to those related ideas. Thank you again 🙌
