Using data expression to return duplicate values in a dashboard list

1004
20
07-17-2024 06:20 AM
sophered
Regular Contributor

Hello! I am trying to isolate and display duplicate values in a field in a dashboard list element. I was thinking of using the distinct function to do this but have not had any luck. Here is what I have so far! 

var portal = Portal('portal')
var fs = FeatureSetByPortalItem(portal, "itemid", 0, ['*'],true)
var uniquevalues = Distinct(fs, "fieldname")
return Filter(fs, "fieldname NOT IN @uniquevalues")

 

0 Kudos
20 Replies
sophered
Regular Contributor

sophered_1-1721750466407.png

Unfortunately nothing shows up in the Results when I try to return sql

0 Kudos
jcarlson
MVP Esteemed Contributor

Interesting… At this point, we just need some Console statements. Run this, what does the "Messages" tab say?

var portal = Portal('portal')
var fs = FeatureSetByPortalItem(portal, "itemid", 0, ['*'],true)
var uniquevalues = Distinct(fs, "fieldname")
Console('Unique Values', uniquevalues)

var val_arr = []
for(var v in uniquevalues) {
    Console(v, v['fieldname'])
    Push(val_arr, `'${v['fieldname']}'`)
    Console(val_arr)
}

var sql = `fieldname NOT IN (${Concatenate([val_arr, ', ')})`
return Filter(fs, sql)
- Josh Carlson
Kendall County GIS
0 Kudos
sophered
Regular Contributor

sophered_0-1721752166471.png

A few minutes after pressing test, I am met with this lovely screen : (

0 Kudos
jcarlson
MVP Esteemed Contributor

Yikes! How many distinct values are there, anyway?

- Josh Carlson
Kendall County GIS
0 Kudos
sophered
Regular Contributor

Great question! Here's another yikes! When I try to do a count on the unique values, nothing shows up in the results. I can do a count on the fs (62000 records), but not unique values. I actually did open my layer up in pro and there is around 200 duplicate records when considering nulls as duplicates. If not considering nulls, there are about 3 duplicates. Unfortunately I have to display it in a list on dashboard, because the end user will not be using pro.

0 Kudos
KenBuja
MVP Esteemed Contributor

The code I posted initially used a layer with 68K records, so that size of data shouldn't be a problem. Another way to cut down on the request is not to return the geometry of the features.

I'm curious on what result you got when doing a test run vs. getting the error in the Dashboard

0 Kudos
sophered
Regular Contributor

so this is what i tried 

var portal = Portal('portal')
var fs = FeatureSetByPortalItem(portal, "itemid", 0, ['*'],false)
var gb = Groupby(fs, 'EXTERNALSTREETKEY', {name: 'Count', expression: '1', statistic: 'COUNT'});
var filterfs = Filter(gb, "Count > 1")
return filterfs

sophered_0-1721760798603.png

and this is what i got! more than nothing in the results, although, my pro tells me otherwise that there should be duplicates. and i am unable to press done in the bottom left corner. i think thats the "unable to execute arcade script" kind of error thing. i did make some test data and it works perfectly (although it is a MUCH smaller dataset) but when i change the itemid to the actual dataset, it does not work 

0 Kudos
jcarlson
MVP Esteemed Contributor

Oh, gosh. Well, that would explain it, we're creating a SQL string that's going to be many many thousands of characters long. It's probably the wrong approach to look for the "good" values and build a filter based on the "bad" ones.

Let's take a totally different approach. I like @KenBuja 's use of GroupBy, but the output of that won't include any of the feature attributes that we need. But it could give us one of the objectIDs with some tweaking, which we could pass into a Filter on the original featureset.

Do you know if there are any cases of more than two duplicates? This expression could probably get rid of duplicates when there's 1 extra copy.

var fs = FeatureSetByPortalItem(…)

var dups = GroupBy(
  fs,
  'some_field',
  [
    {name: 'dup_id', expression: 'objectid', statistic: 'MAX'},
    {name; 'feat_count', expression: '1', statistic: 'SUM'}
  ]
)

var dups = Filter(dups, 'feat_count > 1')

var dup_ids = []

for (var d in dups) {
  Push(dup_ids, d['dup_id'])
}

return Filter(
  fs,
  `objectid not in (${Concatenate(dup_ids, ',')})`
)

 

- Josh Carlson
Kendall County GIS
0 Kudos
sophered
Regular Contributor

sophered_0-1721758593140.png

so i tried that expression and i once again encountered the issue where it does not give any results and does not let me press "done" to exit the expression. if including the nulls, then there are plenty of cases with more than 1 duplicate (although, id rather not include the nulls). there probably will be cases in the future where there is more than 1 duplicate and id like for it to display those as well. 

0 Kudos
sophered
Regular Contributor

after some tinkering, I found out that other fields on the same layer work perfectly fine with using the Groupby Method. The particular field that i am trying to use for this was a mixed case kinda field with special characters (@, .) I am unsure if this has a factor in it or not but it is interesting that it works perfectly for other fields as well, but not this one! I was wondering if i could try to use a Left function to pull only the first 10 characters, those are the only ones that need to be checked for unique-ness. and groupby those?

0 Kudos