
How to make data expression that finds intersecting features faster?

01-25-2023 04:40 AM
NygrenOhto
Occasional Contributor

How can I make the following data expression faster? I have one feature layer with 5500 rows and another with 3175, and the script takes 5 minutes 30 seconds to run. That's way too long.

 

I am trying to create an indicator that compares data from two different years in certain areas in a city by checking intersecting population data squares with regions of a city.

 

 

var portal = Portal("https://arcgisportal.xxxx.fi/arcgis/");
var areas = FeatureSetByPortalItem(portal, 'xxxx719ebb25xxxx968694axxxxxxxx', 3, ['*'], true);
var sql = "area_name = 'xxxxxxx'";
var filtAreas = Filter(areas, sql);

var stat2021 = FeatureSetByPortalItem(portal, 'xxx8abda0xxxx483xxxxxxxx', 0, ['he_vakiy'], true);
var sql2 = "he_vakiy > 0";
var stat = Filter(stat2021, sql2);

//var stat = Intersection(stat, filtAreas)

var features = [];
var feat;

for (var poly in stat) {
    //Console(poly)
    var pts = Count(Intersects(poly, filtAreas));
    Console(pts);
    feat = {
        'attributes': {
            'he_vakiy': poly['he_vakiy']

        }
    };

    // Push feature into array
    if (pts > 0) {
        Push(features, feat);
    }
}
// Create dict for output FeatureSet
var out_dict = {
    'fields': [
        {'name': 'he_vakiy', 'alias': 'population', 'type': 'esriFieldTypeInteger'}
    ],
    'geometryType': '',
    'features': features
};

// Convert dictionary to feature set.
return FeatureSet(Text(out_dict));

 

 

11 Replies
jcarlson
MVP Esteemed Contributor

One way to speed things up is to trim how much data is coming and going from the feature services. Unless you're using every field in a service, don't use ["*"] in your function. Only request the fields you absolutely need for your output.

And I do mean output. You can still filter on the field area_name without bringing that field into your expression. The Filter function merely supplies a where parameter to the FeatureService query endpoint, so you can include any field in the SQL statement as long as the server has access to a field by that name.

Secondly, even though you're doing a spatial intersection between your layers, you don't need to include the geometry of the layer being intersected. When I write the function

Intersects(some_feature, other_featureset)

what happens on the server side is a request to

portalurl.../other_featureset/FeatureServer/0/query?geometry=<geometry from some_feature>&geometryType=<some_feature's geometry type>&spatialRel=esriSpatialRelIntersects

To put it plainly: you only need the geometry of the single feature, not the other layer. As long as you supply the feature to populate the intersecting geometry, the server will handle the spatial intersection for you, since it has access to the shape column of the other layer.

All that said, try re-writing line 2 as follows.

 

var areas = FeatureSetByPortalItem(
    portal,
    'xxxx719ebb25xxxx968694axxxxxxxx',
    3,
    [],
    false
);

 

 Does that still work? And if so, does it perform better?

- Josh Carlson
Kendall County GIS
NygrenOhto
Occasional Contributor

Hi and thank you for answering.

The script still works but it's not any faster. 

-Ohto Nygren

jcarlson
MVP Esteemed Contributor

In that case, the expression's execution time has less to do with the amount of data and more to do with the number of features being evaluated. Depending on what you're doing, sometimes there's just no easy way to cut that down.

Try arbitrarily limiting the number of input features and timing it, or else add some Console calls to write out how long each iteration takes. That won't speed up the full expression, but it could at least narrow the issue down to "you have a lot of features".
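
A rough sketch of what that timing test might look like, reusing the placeholder item IDs and field names from the question. The sample size of 100 and the variable names sample, started, n and t0 are arbitrary choices for illustration, not anything from the original script.

```arcade
// Timing sketch only: item IDs and the area name are the placeholders from the question.
var portal = Portal("https://arcgisportal.xxxx.fi/arcgis/");
var areas = FeatureSetByPortalItem(portal, 'xxxx719ebb25xxxx968694axxxxxxxx', 3, [], false);
var filtAreas = Filter(areas, "area_name = 'xxxxxxx'");

var stat2021 = FeatureSetByPortalItem(portal, 'xxx8abda0xxxx483xxxxxxxx', 0, ['he_vakiy'], true);
var stat = Filter(stat2021, "he_vakiy > 0");

// Only look at the first 100 features (arbitrary sample size)
var sample = Top(stat, 100);
var started = Now();
var n = 0;

for (var poly in sample) {
    var t0 = Now();
    var pts = Count(Intersects(poly, filtAreas));
    n += 1;
    // Write the duration of this single intersection query to the console
    Console(n + ': ' + DateDiff(Now(), t0, 'milliseconds') + ' ms, hits: ' + pts);
}

Console('Total for ' + n + ' features: ' + DateDiff(Now(), started, 'seconds') + ' s');

// A data expression still has to return a FeatureSet
return sample;
```

If the per-iteration times are roughly constant, the total should scale about linearly with the number of features, which would point at the request count rather than at any single slow query.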

- Josh Carlson
Kendall County GIS
NygrenOhto
Occasional Contributor

There are 5500 rows that the script needs to check for intersections against the one area polygon that passes the filter. I guess that is a lot for a data expression. I was just hoping it would be fairly fast, since checking for intersections in a pop-up is much faster with the same data. I am using the following in the pop-ups:

var m1 = Intersects(FeatureSetById($map, /* statistics_2021 */ "xxxxxxx3542-layer-8"), Buffer($feature, -1, 'meters'));
var m2 = sum(m1, "he_vakiy")

 

I'll have to keep trying different approaches to see if it's possible to make it fairly fast. Thank you for all your help so far!

-Ohto

jcarlson
MVP Esteemed Contributor

Popups evaluate per-feature. This expression needs to send separate intersection queries to the server for each feature, so making that many requests, even if they are quite small, will take some time.
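
One possible way to cut the request count, sketched here and not tested in this thread, is to run the query in the same direction as the pop-up does: take the single filtered area polygon and ask the population layer which of its features intersect it. This assumes the filter on area_name returns exactly one area, and it reuses the placeholder item IDs from the question. The variable names area and hits are made up for the example.

```arcade
// Sketch only: assumes the filter returns a single area polygon.
var portal = Portal("https://arcgisportal.xxxx.fi/arcgis/");

// Geometry is needed here, because this polygon becomes the query geometry
var areas = FeatureSetByPortalItem(portal, 'xxxx719ebb25xxxx968694axxxxxxxx', 3, ['area_name'], true);
var area = First(Filter(areas, "area_name = 'xxxxxxx'"));

// Geometry is not requested for the layer being queried
var stat2021 = FeatureSetByPortalItem(portal, 'xxx8abda0xxxx483xxxxxxxx', 0, ['he_vakiy'], false);
var stat = Filter(stat2021, "he_vakiy > 0");

// No matching area: return an empty FeatureSet with the same schema
if (area == null) {
    return Filter(stat, "1 = 0");
}

// One spatial query against the population layer instead of one per square
var hits = Intersects(stat, area);
return hits;
```

The indicator could then sum he_vakiy over the returned FeatureSet. Squares that only touch the area boundary may be included. The pop-up expression avoids that with Buffer($feature, -1, 'meters'), and the same trick should apply to area here.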

Taking another look at your code: if I'm understanding this right, your output is just a single field from one layer, filtered to the features that intersect at least one feature from the other?

Is the data dynamic, or will it stay the same? Do you have control over the schema at all? Would it be permissible to create a new layer?

Data Expressions are nice for cases where you either don't control the data sources, or the data might change in the future. But from the sound of things, these are population counts, so those would probably not be changing. And if that's the case, it would certainly be easier to create a new field.

I understand not wanting to add too many fields to a layer, but suppose you used the Count(Intersects(poly, filtAreas)) number to populate an integer field. In your Dashboard, you could just reference the layer directly and filter that field for > 0. For static data, I'd prefer an extra integer field over 5-minute expressions every time the dashboard loads.

- Josh Carlson
Kendall County GIS
NygrenOhto
Occasional Contributor

You are correct, I'm trying to calculate the population of different city zones from a single field in one layer. I would like to display that in a few separate indicators so the user can quickly see the population in each of the 7 zones in the city and how much the population has changed in one year. I have another identical expression calculating the reference number from the previous year.

The data changes once a year so not that often, but I don't control the data sources and will not control the dashboard after it has been built. It would be much easier to just add an extra field, you're right, but I was hoping this could be possible fairly fast with a data expression so the end user doesn't have to make changes to it in the future. And you're also right, a 5-minute wait to start a dashboard is not user friendly at all. I'll have to try and think of an alternative solution.

-Ohto

JohannesLindner
MVP Frequent Contributor

Count() uses a simple for loop internally, so depending on the featureset, it can be really slow. You don't actually need the count here, you just need to check whether the featureset is empty or not. Another way to do that is to call First() on the fs. If the fs is empty, First() will return null.

So, in your code above, delete lines 17 & 18, and replace line 27 with this:

 

if(First(Intersects(poly, filtAreas)) != null) {

 

 

Keep Josh's changes, too. They might not bring a huge speed difference (my change probably won't, either), but they are considered best practice.


Have a great day!
Johannes
jcarlson
MVP Esteemed Contributor

Using Count with a featureset is not a for loop, it sends a single query with returnCountOnly set to true.

- Josh Carlson
Kendall County GIS
JohannesLindner
MVP Frequent Contributor

Oops, you're right. I think I misremembered my tests with Count(). Thanks for the correction!


Have a great day!
Johannes