Arcade: Optimizing layer load time

04-14-2023 05:13 AM
JohannesLindner
MVP Frequent Contributor

While trying to optimize a data expression for Dashboard with @ChrisSpadi , I realized that the first time you access a feature of a featureset, it takes a very long time. A quick example:

 

var times_load = []
var times_read_first = []
var times_read_all = []
var start

for(var i = 0; i < 10; i++) {

  // load the layer
  start = Now()
  var fs = FeaturesetByPortalItem(Portal("https://arcgis.com"), "78ea2a77985743abb7e83b021da2f728", 0, ["OBJECTID"], false)
  Push(times_load, DateDiff(Now(), start))
  
  // access the first feature
  start = Now()
  var oid = First(fs).OBJECTID
  Push(times_read_first, DateDiff(Now(), start))

  // access all features
  start = Now()
  for(var f in fs) {
    var oid = f.OBJECTID
  }
  Push(times_read_all, DateDiff(Now(), start))
}

Console(`Loading the layer: ${Round(Average(times_load), 0)} milliseconds. ${times_load}`)
Console(`Reading the first element: ${Round(Average(times_read_first), 0)} milliseconds. ${times_read_first}`)
Console(`Reading all elements: ${Round(Average(times_read_all), 0)} milliseconds. ${times_read_all}`)

 

 

Loading the layer: 274 milliseconds. [485,255,254,250,246,254,247,247,247,253]

Reading the first element: 588 milliseconds. [1202,479,678,446,546,350,1126,349,340,361]

Reading all elements: 0 milliseconds. [1,0,0,0,0,0,0,0,0,0]

 

This layer has only 8 features. As you can see, getting the layer reference with FeaturesetByPortalItem takes ~0.25 seconds. Accessing the first feature takes ~0.6 seconds (with widely varying times). After that first time, accessing features is instantaneous.

 

Until now, I assumed that FeaturesetBy* loads the layer into RAM. But these timings make it seem as if FeaturesetBy* only establishes a connection to the layer, and the first data access is what reads the whole layer into RAM.

Is this interpretation correct? Is there anything we can do to optimize these times (besides specifying fields and not loading geometries, both of which I did here)?

@jcarlson , @XanderBakker 


Have a great day!
Johannes
4 Replies
jcarlson
MVP Esteemed Contributor

I made a Data Expression once to test this sort of thing! I decided to open it up and make some adjustments.

I'm pinging all features of the World Countries (Generalized) layer from the Living Atlas. I had it do mostly the same steps as you, timing how long to initially load the FeatureSet, then how long to load the first feature and to load all the features. I should note, I'm not timing individual feature loads in the last step, just how long it takes to do a for loop on all features.

I set it to repeat the process 200 times and push the results to a new FeatureSet so I could look at it.

jcarlson_2-1681478553332.png

There are some spikes, but the pattern is pretty clear. Here are the median values:

Step                  Time (ms)
Load FeatureSet       412
Load First Feature    173
Load All Features     46

 

So it does seem like once that first feature is loaded, doing anything else with additional features is nearly instantaneous.

Watching the Network tab of the browser, there's only traffic on the initial ping to the FeatureSet. Iterations 2-200 still take time, but as the query is the same, the dashboard uses the cached response from the first request. The individual features come in all at once, and I would guess that you are right, the features are just being held in the browser's memory and being accessed individually there.

Now, as far as how to optimize an expression, that really depends on what is required in the output. I've seen that when multiple functions are chained together, the Arcade compiler actually combines them into a single request. These functions, for example, would all be lumped into a single query:

var fs = FeatureSetByPortalItem(...)

var filt_fs = Filter(fs, 'some expression')

var grouped = GroupBy(filt_fs, 'some_field', {name:'count', expression:1, statistic:'COUNT'})

var ordered = OrderBy(grouped, 'some_field desc')

All that is to say, the more you can offload onto the server, the better. There are some pretty creative ways you can use things like GroupBy and Distinct to reshape a layer on the server side. It's not always an option, of course.

If you have to extensively use a single FeatureSet, you can load the whole thing into memory by looping through and pushing each feature into a new FeatureSet. That sort of "breaks the link" between the original FeatureSet and its service. It takes a bit to load your features in initially, but then accessing features from the resulting in-memory FeatureSet is very fast.

jcarlson_3-1681480113807.png
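For readers who can't see the screenshot, here is a minimal sketch of what such a copy-into-memory step might look like. The helper name `Memorize` is only illustrative, and the code assumes the attributes are all you need (geometry is dropped) and that date values should be converted to numbers before being serialized.

```arcade
// Hypothetical helper (the name is an assumption): copies every feature of a
// service-backed FeatureSet into a new, in-memory FeatureSet.
function Memorize(fs) {
  var temp_dict = {
    fields: Schema(fs)['fields'], // reuse the field definitions of the source
    geometryType: '',             // assumption: attributes only, no geometry
    features: []
  }

  for (var f in fs) {
    var attrs = {}
    // iterating over a feature yields its attribute names
    for (var attr in f) {
      // dates are stored as epoch numbers in the FeatureSet JSON
      attrs[attr] = IIf(TypeOf(f[attr]) == 'Date', Number(f[attr]), f[attr])
    }
    Push(temp_dict['features'], { attributes: attrs })
  }

  return FeatureSet(Text(temp_dict))
}

// Usage: the first loop inside Memorize pays the loading cost once;
// later reads of fs_mem don't go back to the service.
var fs = FeatureSetByPortalItem(Portal("https://arcgis.com"), "78ea2a77985743abb7e83b021da2f728", 0, ["OBJECTID"], false)
var fs_mem = Memorize(fs)
return fs_mem
```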

Why would you want to do this? Suppose you had a layer with 100 features, and for each feature, you needed to query 3 other layers to pull in some attributes. This would translate into 300 requests to the server. Even if these are cached in the browser, it still takes a long time to read in the features from those queries.

Now suppose you queried all 4 of your layers and loaded them into in-memory FeatureSets, then did your 100 feature for-loop. No additional queries are being sent after the first 4, and the features are already loaded and readily available, and the attributes from the other 3 layers can be pulled in really fast.
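As a rough illustration of that pattern, the sketch below reads a lookup layer once and then joins its values onto the main layer without any further queries. It is a variation that uses a plain dictionary instead of an in-memory FeatureSet, and the item IDs and the field names `PARCEL_ID` and `OWNER` are placeholders I made up, not anything from a real service.

```arcade
// Assumption: placeholder item IDs and field names; replace them with your own.
var p = Portal("https://arcgis.com")
var main_fs   = FeatureSetByPortalItem(p, "<main item id>", 0, ["PARCEL_ID"], false)
var lookup_fs = FeatureSetByPortalItem(p, "<lookup item id>", 0, ["PARCEL_ID", "OWNER"], false)

// One pass over the lookup layer: one query, then everything is in memory.
var owner_by_parcel = {}
for (var l in lookup_fs) {
  owner_by_parcel[Text(l.PARCEL_ID)] = l.OWNER
}

// The main loop reads from the dictionary instead of filtering per feature.
var rows = []
for (var m in main_fs) {
  var key = Text(m.PARCEL_ID)
  var owner = null
  if (HasKey(owner_by_parcel, key)) {
    owner = owner_by_parcel[key]
  }
  Push(rows, { attributes: { PARCEL_ID: key, OWNER: owner } })
}

var out = {
  fields: [
    { name: 'PARCEL_ID', type: 'esriFieldTypeString' },
    { name: 'OWNER', type: 'esriFieldTypeString' }
  ],
  geometryType: '',
  features: rows
}
return FeatureSet(Text(out))
```

With three lookup layers you would build three dictionaries up front, which keeps the total at four queries no matter how many features the main layer has.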

What Arcade really needs is a function to quickly convert a FeatureSet into an Array...

 

- Josh Carlson
Kendall County GIS
jcarlson
MVP Esteemed Contributor

Just an update: I did some more testing on the in-memory vs "regular" iterations, and over 200 iterations, the in-memory approach is about 10x faster in total time. But I do think that really only helps for very specific situations.

- Josh Carlson
Kendall County GIS
JohannesLindner
MVP Frequent Contributor

Hey Josh,

thanks, that's good stuff!

 

I should note, I'm not timing individual feature loads in the last step, just how long it takes to do a for loop on all features.

Yeah, I did that too, which makes the time discrepancy even starker. After you spend a lot of time getting the first feature, it's nearly instantaneous to access not just one more feature, but all of them.

 

 

when multiple functions are chained together, the Arcade compiler actually combines them into a single request.

That's good to know!

 

 If you have to extensively use a single FeatureSet, you can load the whole thing into memory by looping through and pushing each feature into a new FeatureSet. [...] What Arcade really needs is a function to quickly convert a FeatureSet into an Array...

Yeah, that's the way I went in this case. I had to merge three featuresets and then compare each feature to each other feature. Pushing the features into the array is where this question cropped up, because it took a long time.
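A minimal sketch of that approach, in case it helps someone else. The helper name `AppendFeatures`, the item IDs, and the layer indexes are placeholders, and the comparison itself is left as a comment because it depends on the data.

```arcade
// Hypothetical helper: appends every feature of a FeatureSet to an array.
// The first iteration over each FeatureSet is where the loading time is spent.
function AppendFeatures(arr, fs) {
  for (var f in fs) {
    Push(arr, f)
  }
  return arr
}

// Assumption: placeholder item IDs; replace them with your own layers.
var p = Portal("https://arcgis.com")
var fs_a = FeatureSetByPortalItem(p, "<item id a>", 0, ["OBJECTID"], false)
var fs_b = FeatureSetByPortalItem(p, "<item id b>", 0, ["OBJECTID"], false)
var fs_c = FeatureSetByPortalItem(p, "<item id c>", 0, ["OBJECTID"], false)

// Merge the three featuresets into one array.
var all_features = []
all_features = AppendFeatures(all_features, fs_a)
all_features = AppendFeatures(all_features, fs_b)
all_features = AppendFeatures(all_features, fs_c)

// Compare each feature to each other feature exactly once.
var n = Count(all_features)
var pairs = 0
for (var i = 0; i < n; i++) {
  for (var j = i + 1; j < n; j++) {
    // compare all_features[i] with all_features[j] here
    pairs++
  }
}
Console(`Compared ${pairs} pairs of ${n} features`)
```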

I'm gonna open ideas for an Array conversion and Merge.

 

 

Pinging the Living Atlas is a good idea and raises two other interesting points:

 

If you load a layer from the Living Atlas in the Arcade Playground, the FeaturesetBy* function returns instantly.

 

Loading the First() feature takes much less time for a layer from the Living Atlas compared to a service from my AGOL:

JohannesLindner_0-1681558091847.png

 

 

The time to load the layer is approximately the same. But the expression loads the First() feature from the Living Atlas in 10 milliseconds, and from my AGOL in 460 milliseconds. That's a somewhat extreme difference.


Have a great day!
Johannes