Expression to categorize values, to make a histogram

2 weeks ago
NinaBergström
New Contributor

Hello from Sweden
I want to make a histogram of the values in a table, which range from null to 500.

I previously wrote an Arcade expression to sort by category, and I have tried to adapt it like this:

var fs = FeatureSetByPortalItem(Portal('https://xxxx/portal'),'xxxx',0,["*"], false);  
var features = [];
var feat;

for (var f in fs){
    var d = Decode(
        f['VAR27'],
        9, '1-10',
        29, '20-30',
        54, '50-60',
        79, '70-80',
        ''
    )

    feat = {
        attributes: {
            formelltNamn2023: f['formelltNamn2023'],
            histogram: d,
            AR: f['AR'],
            VAR27: f['VAR27']
        }
    }

    Push(features, feat)
}

var out_dict = {
    fields: [
        {name: 'formelltNamn2023', type: 'esriFieldTypeString'},
        {name: 'histogram', type: 'esriFieldTypeString'},
        {name: 'VAR27', type: 'esriFieldTypeInteger'},
        {name: 'AR', type: 'esriFieldTypeInteger'}

    ],
    geometryType: '',
    features: features
}

return FeatureSet(Text(out_dict))

 

It works, but I want each category to cover a range of values instead of one specific value. Please help me improve this expression, thank you.

formelltNamn2023   histogram   VAR27   AR     FID

"xxkommun, x"      "50-60"     54      2019   0
"xxkommun, x"      "70-80"     79      2020   1
"xkommun, x"       "20-30"     29      2021   2
"xxkommun, x"      "1-10"      9       2022   3
"xxkommun, x"      ""          12      2023   4
"xxstad, x"        ""          3       2019   5
"xxstad, x"        ""          7       2020   6
"xxstad, x"        ""          3       2021   7

 

jcarlson
MVP Esteemed Contributor

I know I've made a histogram in Arcade in the past, but I can't find my original code. So, we want to group our features into bins, and have these bins show up in a bar chart.

Your expression is pretty close already, but we run into trouble: without a numeric bin number, the bins have to sort by their name, which isn't always right.

jcarlson_0-1714228787851.png
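
For anyone following along, here is a rough sketch of that starting point: the exact-match Decode swapped for a computed range label. The helper name RangeLabel and the bin width of 10 are my own assumptions for illustration, not part of the original expression, and the labels only read cleanly for whole-number values.

```arcade
// Hypothetical helper: returns the label of the fixed-width range containing v.
// Null values get an empty label, matching the default of the original Decode.
function RangeLabel(v, width) {
    if (IsEmpty(v)) {
        return ''
    }
    // lower edge of the range, e.g. 54 with width 10 gives 50
    var lower = Floor(v / width) * width
    return `${lower}-${lower + width - 1}`
}

// Inside the loop over fs, in place of the Decode call:
// var d = RangeLabel(f['VAR27'], 10)
```

Labels built this way are still text, so a chart grouped on them sorts '100-109' ahead of '20-29', which is the sorting trouble described above.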

I figured out how to include a "bin number" attribute, but when the chart groups by that value, we lose the ability to show the "bin name", the range of values, as the label.

jcarlson_1-1714228987050.png

So, the best way to do this is to group the values in the expression. I tried it with the function GroupBy. That way each bin is its own feature, and we can set our chart to be based on features, rather than grouped values. But that brought its own trouble. Suppose one of your bins were empty? The chart omits it!

Here's the same histogram, but with values removed from one of the bins. Rather than showing 55 - 66 as an empty bin, it's removed from the chart entirely, which could lead to very misleading histograms.

jcarlson_2-1714229292611.png

So, the solution. We will do the grouping in our expression, but with a loop. We initialize the FeatureSet with a feature per bin, set to a value of 0, then add to that value as we go.

I thought that this would actually be a useful tool to bring into other dashboards, so I wrote a custom function, Histogram.

function Histogram(fset, field, bins){
    /*
    Generates a histogram from a FeatureSet
    
    Parameters:
        - fset: a FeatureSet
        - field: the field name with the numeric values to be binned
        - bins: the desired number of bins
    
    Returns:
        FeatureSet with one row per bin. Rows include a bin label, bin number, and count.
    */

    // get the upper and lower bounds of the data    
    var hmin = Min(fset, field)
    var hmax = Max(fset, field)
    
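    // determine the size of each bin; keep it at 1 or more so the
    // bin-building loop always advances and we never divide by zero
    var step = Max([1, Floor((hmax - hmin) / (bins - 1))])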
    
    // histogram featureset dictionary to hold values
    var hist_fs = {
        fields: [
            {name: 'bin_num', type: 'esriFieldTypeInteger'},
            {name: 'bin_name', type: 'esriFieldTypeString'},
            {name: 'the_count', type: 'esriFieldTypeInteger'}
        ],
        geometryType: '',
        features: [{attributes:{
            bin_num: 0,
            bin_name: `< ${hmin + step}`,
            the_count: 0
        }}]
    }
    
    // populate dictionary with empty bins
    var b_idx = 1
    
    for (var i = hmin + step; i < (hmax - step); i += step) {
        Push(
            hist_fs['features'],
            { attributes: {
                bin_num: b_idx,
                bin_name: `${i} — ${i + (step - 1)}`,
                the_count: 0
            }}
        )
        
        b_idx += 1
    }
    
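    // last bin is open-ended and catches everything from i upward
    Push(
        hist_fs['features'],
        { attributes: {
            bin_num: b_idx,
            bin_name: `>= ${i}`,
            the_count: 0
        }})
    
    var last_idx = Count(hist_fs['features']) - 1
    
    // loop through input featureset and grab values
    for (var feat in fset) {
        // read the field that was passed in, not a hard-coded name
        var val = feat[field]
        
        // skip nulls so they don't get counted in a bin
        if (IsEmpty(val)) {
            continue
        }
        
        // figure out which bin it falls into, capped at the last bin
        var bnum = Min([Floor((val - hmin) / step), last_idx])
        
        // increment count on corresponding bin
        hist_fs['features'][bnum]['attributes']['the_count'] += 1
    }
    
    return FeatureSet(Text(hist_fs))
}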

It isn't perfect: depending on the bin size and the min/max values, the bin count might be off. But the resulting FeatureSet is good:

jcarlson_3-1714232239539.png

To use this, just drop that function into your script and after you define your featureset, call Histogram(fs, 'VAR27', 10).
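
Put together, the end of the script might look roughly like this. The portal URL and item ID are the placeholders from the question, and I'm assuming the Histogram function is defined above this point in the same expression.

```arcade
// Assumes the Histogram function is defined earlier in this expression.
// Portal URL and item ID are placeholders copied from the question.
var fs = FeatureSetByPortalItem(
    Portal('https://xxxx/portal'),
    'xxxx',
    0,
    ['VAR27'],   // only the field being binned is needed
    false        // no geometry required for a chart
)

return Histogram(fs, 'VAR27', 10)
```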

You can adjust the number of bins as you like. I suppose you could hard-code min/max values into the expression, too, but I wanted something that could adjust to the input data.
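
If you did want fixed bounds, a minimal sketch could look like the following. This is not the function I used for the screenshots; the name FixedHistogram and the lo/hi parameters are assumptions for illustration. It splits lo..hi into equal-width bins and skips nulls and anything outside the bounds.

```arcade
// Hypothetical variant with hard-coded bounds: `bins` equal-width bins from lo to hi.
function FixedHistogram(fset, field, lo, hi, bins) {
    var width = (hi - lo) / bins
    var feats = []

    // one zero-count feature per bin, so empty bins still show up in the chart
    for (var b = 0; b < bins; b++) {
        var lower = lo + b * width
        Push(feats, { attributes: {
            bin_num: b,
            // upper edge is exclusive, except in the last bin, which includes hi
            bin_name: `${lower} - ${lower + width}`,
            the_count: 0
        }})
    }

    for (var feat in fset) {
        var v = feat[field]

        // ignore nulls and values outside the fixed bounds
        if (IsEmpty(v) || v < lo || v > hi) {
            continue
        }

        // a value equal to hi would index one past the end, so cap it
        var idx = Min([Floor((v - lo) / width), bins - 1])
        feats[idx]['attributes']['the_count'] += 1
    }

    return FeatureSet(Text({
        fields: [
            {name: 'bin_num', type: 'esriFieldTypeInteger'},
            {name: 'bin_name', type: 'esriFieldTypeString'},
            {name: 'the_count', type: 'esriFieldTypeInteger'}
        ],
        geometryType: '',
        features: feats
    }))
}
```

For the data in the question, FixedHistogram(fs, 'VAR27', 0, 500, 10) would give ten bins that are each 50 wide.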

Once you have it, set your chart to "Features", with "bin_name" as the category, "the_count" as the series, and sort by "bin_num" just to ensure that the bars show in the proper order.

jcarlson_4-1714232730406.png

jcarlson_5-1714232795204.png

You may wish to set the chart's min value to 0, otherwise the histogram could be misleading. Here's the function with 15 bins and a different set of numeric data.

jcarlson_6-1714233283319.png

I hope this helps!

 

- Josh Carlson
Kendall County GIS