
How to deal with showing massive data built client-side

12-02-2024 10:40 AM
AndrewMurdoch1
Frequent Contributor

Good Day

I know this has been brought up before, but I'm in a new bind.  We have clients who want to show hundreds of thousands of features on our maps.

We use multiple endpoints to build the map features, mainly two: one with all the attribute data, and one with the geometry data. Those are stored separately because the attribute data is fairly dynamic and can change quickly.

As far as I'm aware, we can't use a hosted feature layer because there is no way to quickly and dynamically upload or update an attribute on the fly and have it tie back into the rendering. That leaves us building our maps on the client side and rendering them in place.

The issue is that building and rendering features on the client side takes memory, and when we get into the range of 100 000+ features, a lot of memory. I've had requests to show upwards of 500 000, and no matter how much I push back and say it's a dumb idea, people don't care.

Is there any methodology to render massive feature counts and build the feature client side?  Is there any way to quickly destroy and build features, if such a thing would work?

Thanks

9 Replies
Sage_Wall
Esri Contributor

I'm personally not sure how to improve performance, but the API's client-side feature layers usually handle massive amounts of data pretty well.

As another option, consider publishing a map service with the geometry related or joined to the dynamic data tables. I used to use this workflow when working at a local county with over 500,000 parcels and hundreds of attributes per feature that could update daily. We published a map service with the geometry (the parcel boundaries) related to our dynamic SQL Server table views based on a unique id. That way the attributes would update automatically (live) in the GIS as the database tables (modified in other non-spatial applications) were modified. This required ArcGIS Enterprise and some relational databases, but it worked and performed really well.

UndralBatsukh
Esri Regular Contributor

Hi there, 

I recommend using StreamLayer, especially since you mentioned that your data is updated frequently. StreamLayer is specifically designed for handling high-frequency data updates. You can integrate it with a custom websocket, or use the client-side StreamLayer for seamless integration depending on your needs. You can find an example of the custom websocket implementation here, and a sample for the client-side StreamLayer here.
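In case it helps to picture it, here is a rough sketch of a client-side StreamLayer being created and fed from code (no url and no websocket). This is not the sample's exact code; the field names, coordinates, and the existing map object are placeholders:

import StreamLayer from "@arcgis/core/layers/StreamLayer.js";

// Client-side StreamLayer: no url and no websocket, features are pushed from code.
// "map" is assumed to be an existing Map instance; field names are illustrative.
const streamLayer = new StreamLayer({
  objectIdField: "OBJECTID",
  fields: [
    { name: "OBJECTID", alias: "ObjectId", type: "oid" },
    { name: "TRACKID", alias: "TrackId", type: "long" },
    { name: "STATUS", alias: "Status", type: "string" }
  ],
  timeInfo: { trackIdField: "TRACKID" },
  geometryType: "point"
});
map.add(streamLayer);

// Push features generated in the browser straight to the layer's layerview.
streamLayer.sendMessageToClient({
  type: "features",
  features: [
    // coordinates are placeholders, assumed to be in the view's spatial reference
    { attributes: { TRACKID: 1, STATUS: "ok" }, geometry: { x: -117.19, y: 34.05 } },
    { attributes: { TRACKID: 2, STATUS: "warn" }, geometry: { x: -116.54, y: 33.83 } }
  ]
});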

This sample shows how to add 100,000 features to a client-side FeatureLayer without blocking the UI. However, please note that client-side FeatureLayers are not optimized for handling data that updates at a high frequency.
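Not the sample's code, but a rough sketch of the same idea: build an in-memory FeatureLayer and add the features in chunks with applyEdits, so each chunk yields back to the browser. The field names, the records array, and the chunk size are assumptions:

import FeatureLayer from "@arcgis/core/layers/FeatureLayer.js";
import Graphic from "@arcgis/core/Graphic.js";

// Empty in-memory FeatureLayer; field names are illustrative.
const featureLayer = new FeatureLayer({
  source: [], // no url, so the layer keeps its data client-side
  objectIdField: "OBJECTID",
  fields: [
    { name: "OBJECTID", type: "oid" },
    { name: "STATUS", type: "string" }
  ],
  geometryType: "point",
  spatialReference: { wkid: 4326 }
});

// Add features in chunks; awaiting each applyEdits call lets the browser
// breathe between chunks instead of blocking on one huge add.
async function addInChunks(records, chunkSize = 10000) {
  for (let i = 0; i < records.length; i += chunkSize) {
    const graphics = records.slice(i, i + chunkSize).map((r) => new Graphic({
      geometry: { type: "point", x: r.x, y: r.y },
      attributes: { STATUS: r.status }
    }));
    await featureLayer.applyEdits({ addFeatures: graphics });
  }
}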

Additionally, I recommend reviewing the Visualization Guide Docs, which offer specific guidance on how to visualize high-density data effectively.
AndrewMurdoch1
Frequent Contributor

Awesome, I'll take a look, thanks for the suggestion!

AndrewMurdoch1
Frequent Contributor

Hey

I've managed to get a Stream Layer stood up with a web socket, and it's handling 400 000 features at ~1 GB of memory. What I don't understand is this: when I made 200 000 features using a “normal” Feature Layer, it used ~4 GB of memory, so what's fundamentally different? I wrote the web socket server in Go and it's very basic, but once the features are streamed, shouldn't the memory even out? I can't grasp how the Stream Layer is using a quarter of the memory for twice the number of features.

Since I'm going to be asked why the memory is so drastically different, is it possible to get a clear explanation as to why?

YannCabon
Esri Contributor

Hi Andrew,

I can give you some answers on that. In general, we have a map and its layers on one side, and a view and its layerviews on the other. A layerview for a feature layer or stream layer queries or receives data from the layer it is rendering.

For a feature layer backed by a feature service, the data is stored remotely on the service. The layerview queries the service and stores the results in memory. For a stream layer, the layerview receives data from the websocket and stores that data in memory.

Now for the in-memory feature layer case, there is no service backing the data; the FeatureLayer itself stores the data in memory. So the data is in memory multiple times (in web workers): the whole dataset held by the feature layer, and the data visible on screen held by the layerview.

It is not the same data, either. The layerview keeps only the data intersecting the view, optimized for rendering (geometries can be generalized and/or quantized) and projected to the view's spatial reference, with only the fields required for display. Plus when the layerview queries the feature layer, the layer may have to reproject the geometries if their spatial reference doesn't match the view's.

The stream layer works differently since the layer doesn't hold any features in memory. Only what is received via the websocket is kept by the layerview, which can also purge it.

We keep optimizing the in-memory feature layer, so we should see some improvements, but there will always be a difference with the stream layer since they fundamentally work differently.
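A hedged illustration of that distinction, assuming a client-side FeatureLayer (layer) already added to a MapView (view): querying the layer returns the whole in-memory dataset, while querying its layerview returns only what it currently holds for drawing:

// "view" is a MapView and "layer" a client-side FeatureLayer already in the map.
const layerView = await view.whenLayerView(layer);

// Queries the full dataset the layer holds in memory.
const all = await layer.queryFeatures();

// Queries only the features the layerview currently holds for drawing
// (limited to the view, with geometries possibly generalized/quantized).
const drawn = await layerView.queryFeatures();

console.log(all.features.length, "features in the layer vs", drawn.features.length, "in the layerview");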

AndrewMurdoch1
Frequent Contributor

Brilliant, that's what I figured was going on, that makes sense 🙂 — Thanks for the explanation.

YannCabon
Esri Contributor

You may see some change in 4.31. We added an optimization related to this:


Plus when the layerview queries the feature layer, the layer may have to reproject the geometries if their spatial reference doesn't match the view's.

It's always better to have the Feature Layer's spatial reference matching the view's, to avoid reprojection. But if the data is reprojected, the layer keeps the reprojected geometries in a memory cache for later reuse, in case multiple queries need the same geometries in the same output spatial reference. In 4.31, that cache is purged every 5 seconds, so some memory should be reclaimed more frequently.
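A minimal sketch of that advice (the wkid, container id, and map variable are placeholders): construct the view in the same spatial reference the client-side geometries are built in, so no reprojection or reprojection cache is needed:

import MapView from "@arcgis/core/views/MapView.js";

// The wkid, container id, and "map" are placeholders: the point is simply to build
// the view in the same spatial reference the client-side geometries use.
const view = new MapView({
  container: "viewDiv",
  map: map,
  spatialReference: { wkid: 3857 } // match the spatial reference the features are built in
});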

AndrewMurdoch1
Frequent Contributor

Good Day

I'm having a new issue: when I change the extent of the view, such as zooming in or out, I first want to clear all the existing features from the layers, then grab the new features.

Assuming I have a polygon of the extent, if I call:

// Clear the existing features from each StreamLayer
layers.forEach((layerObject) => {
    const layer = layerObject.layer;
    layer.sendMessageToClient({
        type: "clear"
    });
});

Before doing:

// Ask the websocket server for the features in the new extent
layers.forEach((layerObject) => {
    if (layerObject.layer) {
        layerObject.layer.sendMessageToSocket({
            asset: layerObject.assetTypeId,
            boundingPolygon
        });
    }
});

The question is: how do I wait until the clear is complete before trying to grab the new features? If I can do that, the load on the map will be greatly reduced.

Thanks

AndrewMurdoch1
Frequent Contributor

Figured out the issue: apparently the "TRACKID" has to be unique between feature additions, which I was unaware of. I thought it worked like an ObjectID, but it doesn't. After randomizing the Track ID, it works fine now.
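For anyone who hits the same thing, a rough sketch of that kind of fix (the counter, field names, newRecords, and the streamLayer variable are all illustrative, not my exact code):

// A possible approach: hand out a fresh TRACKID for every feature in every batch,
// so a new addition never reuses the track id of an earlier one.
let nextTrackId = 0;

function toStreamFeatures(records) {
  return records.map((record) => ({
    attributes: { ...record.attributes, TRACKID: nextTrackId++ },
    geometry: record.geometry
  }));
}

streamLayer.sendMessageToClient({
  type: "features",
  features: toStreamFeatures(newRecords)
});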

That should be added to the documentation 🙂
