Immediately dropping records in a real-time analytic once they have been dropped from the source

01-03-2024 11:30 AM
ChrisSchrader
New Contributor II

I have a real-time analytic that is being used in emergency response apps. I have an HTTP Poller making a call every 5 minutes. We are having an issue where old records aren't getting dropped quickly enough. Is there a way to configure the analytic to immediately drop any records that are no longer coming in through the API call?

There is an end date field in the table that I was using to drop records via a Python script every 5 minutes, to ensure that there weren't any records in the apps past their end date. The problem with that is if they drop a record on their end before its end date, the record remains in our apps.
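For illustration, a simplified sketch of that kind of scheduled purge using the ArcGIS API for Python (the layer URL, credentials, and the end_date field name are placeholders, not our actual values):

from datetime import datetime, timezone

from arcgis.gis import GIS
from arcgis.features import FeatureLayer

gis = GIS("https://www.arcgis.com", "username", "password")
layer = FeatureLayer(
    "https://services.arcgis.com/<org>/arcgis/rest/services/<svc>/FeatureServer/0",
    gis,
)

# Delete every record whose end date has already passed.
now_utc = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")
result = layer.delete_features(where=f"end_date < TIMESTAMP '{now_utc}'")
print(f"Deleted {len(result['deleteResults'])} expired records")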

I tried using a Stream layer instead of a Feature layer, but encountered symbology limitations.

I tried switching to a big data analytic, but when it runs, the points sometimes disappear from the Dashboard for at least 20 seconds, and the user sometimes has to pan or zoom the map before the points redraw at all. This seems to work better in WAB than in Dashboard.

2 Replies
brudo
New Contributor III

An interesting question. Presumably you don't have control over the third-party API, and there is no way to query for recently dropped records.

What you could do in this case is have a scheduled Big Data Analytic with two sources: one being the output layer that your Python script is looking at, and the other being an HTTP Poller for your third-party API. Then you can use Join Features within that analytic to join the results from the third-party API feed to the output layer, with "Retain all features regardless of join results" enabled.

Then you can detect which features from the output layer have no matching records in the third-party API (e.g., a join count of 0) and flag those for deletion: by setting the end date to the past, by setting another field that your Python script can use to detect which features need to be purged from the output layer, or by writing them to another layer that maintains a list of pending deletions.
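If the flag ends up as a numeric join-count field on the output layer, the downstream purge step could be as simple as the following sketch (ArcGIS API for Python; "join_count", the URL, and the credentials are hypothetical placeholders, so use whatever your analytic actually writes):

from arcgis.gis import GIS
from arcgis.features import FeatureLayer

gis = GIS("https://www.arcgis.com", "username", "password")
output = FeatureLayer(
    "https://services.arcgis.com/<org>/arcgis/rest/services/<svc>/FeatureServer/0",
    gis,
)

# A join count of 0 means the record no longer appears in the API feed,
# so it can be deleted from the output layer.
result = output.delete_features(where="join_count = 0")
print(f"Purged {len(result['deleteResults'])} dropped records")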

It should also be possible to do something similar within a Real Time Analytic, using an HTTP Poller feed that looks at your existing output layer, rather than a scheduled big data analytic. That might be more efficient with resources: a Feed plus a Real Time Analytic will generally consume fewer resources than a Big Data Analytic, it avoids duplicate calls to the third-party API since that Feed can be reused, and it may not even require a second Real Time Analytic, as a single analytic can contain multiple data flows.

However, if using a Real Time Analytic for this, there is the question of whether the two feeds stay in step with one another, and how quickly records should expire. You would have to set the Join Time Window appropriately: more than the polling interval of the third-party API feed, but probably less than two polling intervals. For example, with a 5-minute polling interval, a Join Time Window of roughly 6 to 9 minutes tolerates a slightly late poll without letting a feature survive a full extra cycle after its record disappears.

I hope that makes sense. I think the general idea I am suggesting will work but may need some polishing to achieve just what you want, and I may not have explained it very well. Please feel free to reach out if it needs further clarification.

JeffEismanGIS
Esri Contributor

@ChrisSchrader That is a good question. I have a few questions about your configuration and how you would like to use the data, if you don't mind.

In the feed, do you have a TrackID set? When you talk about dropping records, is that for visualization purposes or data storage purposes? Is the desired outcome to only see the active records on a map/dashboard? Feel free to contact me directly and we can review your configuration.
