Select to view content in your preferred language

How to sort on two fields to calculate percentile in Arcade?

1691
8
Jump to solution
08-21-2023 10:51 AM
RobynSnookCCB
Regular Contributor

I have a traffic counter dataset (with over 30,000 records) in a related table which has the fields speed and date. I need to calculate the 85th percentile speed for every day of the studies. I've been struggling to this and my code as become a mishmash. 

In order to do this I need to sort the data in ascending order by Date AND by Speed and then give the rows unique numbers as sorted. Every day needs to have the unique numbers reset so I can calculate the percentile afterwards. All intensities that I have tried do not number the unique field correctly and the order by function isn't working with 'and' for me. 

 

Somethings I've tried is following these cases: https://community.esri.com/t5/arcgis-pro-questions/arcade-expression-to-calculate-field-based-on-a/t...

 

Open to doing this in python as well 

0 Kudos
1 Solution

Accepted Solutions
KenBuja
MVP Esteemed Contributor

This is an example of what I was thinking of. I'm using the recent earthquakes data to summarize how many earthquakes happened each day and the largest quake of the day. I get the first and last days of the dataset and set the beginning date without the time. Using a While loop, I cycle through the days, adding one day to the existing day, filtering the data by that time span, ordering that filtered data, and reporting the number of quakes and the largest quake for the day. Then I add a day and go through the loop again. The loop breaks when I've gone past the last day of the data.

The one thing I haven't quite worked out is the time zone issue, where it's starting the day at 8 pm local time.

 

// Fetches features from a public portal item
var fs = FeatureSetByPortalItem(Portal("https://www.arcgis.com"),"9e2f2b544c954fda9cd13b7f3e6eebce", 0, ["*"], false);
var sortedFS = OrderBy(fs, 'eventTime DESC');
var endDate = First(sortedFS)['eventTime'];
var reverseSortedFS = OrderBy(fs, 'eventTime ASC');
var beginDate = First(reverseSortedFS)['eventTime'];
var date1 = Date(Year(beginDate), Month(beginDate), Day(beginDate));
var output;
while (DateDiff(endDate, date1, 'day') > 0) {
  var date2 = Text(DateAdd(date1, 1, 'day'), "MM/DD/YYYY")
  var filteredFS = Filter(fs, `eventTime BETWEEN '${Text(date1, "MM/DD/YYYY")}' and '${date2}'`);
  var sortedFilteredFS = OrderBy(filteredFS, 'mag DESC');
  if (Count(filteredFS) > 0) output += (`${Text(date1, "MM/DD/YYYY")}: ${Count(filteredFS)} quakes. The biggest quake was ${Round(First(sortedFilteredFS)['mag'],1)}
  `);
  date1 = DateAdd(date1, 1, 'day')
}
return output;

 

 

quakes.png

View solution in original post

0 Kudos
8 Replies
KenBuja
MVP Esteemed Contributor

Instead of attempting to sort the records by date and speed, can you loop through each date of the study, filtering the records to get the records for that day, then sorting those filtered records by speed to assign the unique numbers?

0 Kudos
RobynSnookCCB
Regular Contributor

That is an option as well but I'm not sure how to proceed as the date field has date and time in it. I need the dates separated and ordered by speed in order to apply unique numbers and then repeat until the we are out of dates. Unique numbers needs to reset for every day.

This is going to be an every growing database and I need it to function dynamically. I believe my main issue is ordering the for loops correctly an applying the filters correctly to avoid continuous numbering for all dates.   

0 Kudos
RobynSnookCCB
Regular Contributor

Here is my code and the results I'm getting. The numbering (rowID) still isn't numbering correctly nor is it sorted by speed.

//related table
var TrafficTable = FeatureSetByName($map, "Traffic Cameras - TrafficCounts", ["DATE", 'GUID', 'LENGTH', 'OBJECTID', 'OUTGOING', 'SPEED'])

//Fields in feature to filter the related table by
var GID = $feature.GlobalID
var ST = DateAdd($feature.StartDateTime, +2.5, 'hours')
var ET = DateADD($feature.EndDateTime, +2.5, 'hours')

//Filter the related table on the Global IDs & start and end times
var fil = Filter(TrafficTable, 'GUID = @GID')
var FilTime = Filter(fil, 'DATE >= @ST and DATE<= @ET')

// Convert datetime to date for table and creates the main feature set to display in pop-up
var new_fs = {
    geometryType: "",
    fields: [
        {name: "DateShort", type: "esriFieldTypeDate"},
        {name: "Speed", type: "esriFieldTypeInteger"}, 
        {name: "LENGTH", type: "esriFieldTypeInteger"},
        {name: "OBJECTID", type: "esriFieldTypeObjectID"},
        {name: "OUTGOING", type: "esriFieldTypeString"},
        {name: "RowID", type: "esriFieldTypeInteger"},
        ],
    features: []
}

//used for unique numbers in RowID
var rowCount = 0
//for loop to add to the new feature set (new_fs)
for(var f in FilTime) {
    var d = f.DATE
    var date_short = Number(Date(Year(d), Month(d), Day(d)))
    var U = Filter(FilTime, "DATE < @d")
    var O = OrderBy(U, "SPEED ASC")
    for (var T in O){
        rowCount += 1
   }
    Push(new_fs.features, {attributes: {DateShort: date_short, Length: f.LENGTH, OBJECTID: f.OBJECTID, OUTGOING:f.OUTGOING, Speed: f.SPEED, RowID: rowCount}})
}

//Data as a feature set
var fs_all = FeatureSet(Text(new_fs))

return OrderBy(fs_all, "RowID ASC")

RobynSnookCCB_0-1692729843073.png  

0 Kudos
KenBuja
MVP Esteemed Contributor

This is an example of what I was thinking of. I'm using the recent earthquakes data to summarize how many earthquakes happened each day and the largest quake of the day. I get the first and last days of the dataset and set the beginning date without the time. Using a While loop, I cycle through the days, adding one day to the existing day, filtering the data by that time span, ordering that filtered data, and reporting the number of quakes and the largest quake for the day. Then I add a day and go through the loop again. The loop breaks when I've gone past the last day of the data.

The one thing I haven't quite worked out is the time zone issue, where it's starting the day at 8 pm local time.

 

// Fetches features from a public portal item
var fs = FeatureSetByPortalItem(Portal("https://www.arcgis.com"),"9e2f2b544c954fda9cd13b7f3e6eebce", 0, ["*"], false);
var sortedFS = OrderBy(fs, 'eventTime DESC');
var endDate = First(sortedFS)['eventTime'];
var reverseSortedFS = OrderBy(fs, 'eventTime ASC');
var beginDate = First(reverseSortedFS)['eventTime'];
var date1 = Date(Year(beginDate), Month(beginDate), Day(beginDate));
var output;
while (DateDiff(endDate, date1, 'day') > 0) {
  var date2 = Text(DateAdd(date1, 1, 'day'), "MM/DD/YYYY")
  var filteredFS = Filter(fs, `eventTime BETWEEN '${Text(date1, "MM/DD/YYYY")}' and '${date2}'`);
  var sortedFilteredFS = OrderBy(filteredFS, 'mag DESC');
  if (Count(filteredFS) > 0) output += (`${Text(date1, "MM/DD/YYYY")}: ${Count(filteredFS)} quakes. The biggest quake was ${Round(First(sortedFilteredFS)['mag'],1)}
  `);
  date1 = DateAdd(date1, 1, 'day')
}
return output;

 

 

quakes.png

0 Kudos
RobynSnookCCB
Regular Contributor

Thank you this was very helpful. I'm after getting more done on my code. However, when I attempt to calculate the 85th percentile I need to isolate a specific value which is why I needed to give the data unique numbers by date. I'm stumped on a filter solution and the if statement doesn't seem to be working correctly as it keeps giving back a random row from the data instead of the row related to the ID I assigned. 

// Fetches related table and fields
var fs = FeatureSetByName($map, "Traffic Cameras - TrafficCounts", ["*"], false);
var Class = $feature.RdClass

//Fields in feature to filter the related table by
var ST = DateAdd($feature.StartDateTime, +2.5, 'hours')
var ET = DateADD($feature.EndDateTime, +2.5, 'hours')

//Filter the related table on start and end times
var FilTime = OrderBy(Filter(fs, 'DATE >= @ST and DATE<= @ET'), 'DATE DESC')

//var sortedFS = OrderBy(FilTime, 'DATE DESC');
var endDate = First(FilTime)['DATE']
var reverseSortedFS = OrderBy(FilTime, 'DATE ASC');
var beginDate = First(reverseSortedFS)['DATE'];
var date1 = Date(Year(beginDate), Month(beginDate), Day(beginDate));

// variables to return in the popup

var rowCount = 0
var output;

//while loop to run through and filter out the data by date
while (DateDiff(endDate, date1, 'day') > 0) {
  var date2 = Text(DateAdd(date1, 1, 'day'), "MM/DD/YYYY")
  var sortedFilteredFS = OrderBy(Filter(FilTime, `DATE BETWEEN '${Text(date1, "MM/DD/YYYY")}' and '${date2}'`), 'SPEED ASC');

  //loops through the data that is separated (filtered) by  date and sorted by speed
    for (var f in sortedFilteredFS){
      var S = f.SPEED
      var G = f.GlobalID
      var L = f.LENGTH
      //applies unique numbering by date and sorted by speed
      if(S > 0){
        rowCount += 1
        }
      //Total count of vehciles by day
      var TCount = Count(sortedfilteredFS)
      //the 85th ID number per day
      var F85_ID = Round((TCount*0.85)+0.5,0)   
      if (rowCount == F85_ID){
        var v85 = f.SPEED
      }
    }
  
   //text info
  if (TCount > 0) 
  output += Text(date1, 'ddd. MMMM DD, Y') + TextFormatting.NewLine +
  'Total Vehicals: ' + TCount + TextFormatting.NewLine +
  'F85: ' + F85_ID + TextFormatting.NewLine + 
  'V85: ' + v85 + ' km/h' + TextFormatting.NewLine +
  'GID: ' + G + TextFormatting.NewLine +
  'length ' + L + TextFormatting.NewLine +
  'Exceed 1000 VPD (Local): ' + IIf(Class == 0 && TCount >= 1000, 'Yes', IIF(Class == 1, 'N/A','No')) + TextFormatting.NewLine +
  'Exceed 8000 VPD (Collector): ' + IIf(TCount >= 8000 && Class == 1, 'Yes', IIF(Class == 0, 'N/A', 'No')) + TextFormatting.NewLine + TextFormatting.NewLine
   
  date1 = DateAdd(date1, 1, 'day')
  
}

return output
0 Kudos
KenBuja
MVP Esteemed Contributor

A couple of notes.

When you find the record with the 85th percentile, you should break out of the for each loop. What's happening is that you keep on calculating G and L for each record after that, so the reported values will be for the final record in that sorted FeatureSet. You should also move those calculation to within the if at line 41 so they're only calculated for the 85th percentile record.

You don't need to calculate TCount and F85_ID  for every record when looping through the sorted filtered FeatureSet. Those calculations should be moved to line 27.

You can simply your output using template literals

  output += `${Text(date1, 'ddd. MMMM DD, Y')}
Total Vehicles: ${TCount}
F85: ${F85_ID}
V85: ${v85} km/h
GID: ${G}
length ${L}
Exceed 1000 VPD (Local): ${IIf(Class == 0 && TCount >= 1000, 'Yes', IIF(Class == 1, 'N/A','No'))}
Exceed 8000 VPD (Collector): ${IIf(TCount >= 8000 && Class == 1, 'Yes', IIF(Class == 0, 'N/A', 'No'))}

`

 

0 Kudos
RobynSnookCCB
Regular Contributor

Thank you, I've made your corrections but they code isn't updating the variables placed in the if statement in line 20. It keeps giving the same ones from the first date. Any suggestions? I assume its broken somewhere in the for or while loop, not positive how to correct it. 

Thank you.

//while loop to run through and filter out the data by date & order by speed
while (DateDiff(endDate, date1, 'days') > 0) {
  var date2 = DateAdd(date1, 1, 'days')
  var sortedFilteredFS = OrderBy(Filter(FilTime, `DATE BETWEEN '${date1}' AND '${date2}'`), 'SPEED ASC')
  
  //Total count of vehciles by day
  var TCount = Count(sortedfilteredFS)
  //the 85th ID number per day
  var F85_ID = Round((TCount*0.85)+0.5,0) 
  
  //loops through the data that is separated (filtered) by  date and sorted by speed
  for (var f in sortedFilteredFS){
    var S = f.SPEED
    //applies unique numbering by date and sorted by speed
    if(S > 0){
      rowCount += 1
    }
    
       //KEEPS GIVING THE SAME VALUE TO ALL DATES
    if(rowCount == F85_ID){
      var v85 = f.SPEED;
      var L = f.LENGTH;
      var OID = f.objectID;
      var RID = rowCount; 
      break
      }
  }
  
   //text info
  if (TCount > 0) 
  output += `${Text(date1, 'ddd. MMMM DD, Y')}
  Total Vehicles: ${TCount}
  OID: ${OID}
  F85: ${F85_ID}
  RID: ${RID}
  V85: ${v85} km/h
  length ${L}
  Exceed 1000 VPD (Local): ${IIf(Class == 0 && TCount >= 1000, 'Yes', IIF(Class == 1, 'N/A','No'))}
  Exceed 8000 VPD (Collector): ${IIf(TCount >= 8000 && Class == 1, 'Yes', IIF(Class == 0, 'N/A', 'No'))}` + TextFormatting.NewLine + TextFormatting.NewLine
    
  date1 = DateAdd(date1, 1, 'day')
  
}

return output

 

RobynSnookCCB_0-1692972556592.png

 

0 Kudos
KenBuja
MVP Esteemed Contributor

You're calculating F85_ID for the filtered records for each day, but you never reset rowCount to zero for each day.