Serial Chart Guides based on Percentage

566
5
05-26-2021 06:20 AM
JasonCyphers
Occasional Contributor III

I created a serial chart, with categories from fields.  While the chart shows record count, I'd like to include percentages of overall records, either as a label or using guides.  Is there any way to do this?

If using guides, I'd like something like this:

JasonCyphers_0-1622035090464.png

 

 

5 Replies
jcarlson
MVP Notable Contributor

A "dynamic" guide like that would be nice, but I don't know that it's currently possible in a serial chart using the off-the-shelf options. A serial chart is focused on how features/categories compare to one another based on some metric. Of course, if that metric is "count", that's essentially what you're getting.

An 'advanced' way of doing this would be to use a data expression. You could include an expression which returns the total number of rows, then use GroupBy to return a featureset, and use the total number of rows to add a calculated field which indicates "percent of total".

All that aside, you might just consider including a separate pie chart, as "amount relative to total" is precisely what a pie chart is made to display.

- Josh Carlson
Kendall County GIS
JasonCyphers
Occasional Contributor III

Thanks for the reply.  A little more detail...

I have a Survey123 form, with a select_multiple field.  I also created individual numeric fields for each of the selections that is auto-populated with a "1" if that value is selected in the select_multiple field.  The serial chart is comprised of these individual fields, and sums the value of these fields to return the total # of times an option was selected (this was the only way I could get a count of each individual selection).

This is all being done in an enterprise portal, and it looks like that option to do a data expression is only in AGOL.  Looked at a pie chart; but, with over 80 items I'm trying to report, it's a little busy.

In additional to the above, I'd really like a way to sort by record count, as I've been asked several times to view the "top 10 most selected items", but creating a serial chart from individual fields won't allow that.

jcarlson
MVP Notable Contributor

I see. Yes, those of us in the Enterprise world have to wait a bit longer for Data Expressions, sadly. They're incredibly useful for out-of-the box situations like this.

So you've successfully gotten around the select_multiple field with the numeric fields you added, and that means you're already part of the way there. A bit of python could do the rest of it, if you're comfortable doing a bit of coding.

You could do this with the existing fields, but since they're auto-calculating to 1, we'll use additional fields that follow the naming convention "X_percent", where "X" is the count column.

In short: replace the "1" with a decimal value representing "percent of total". The chart can show the sum of those fields, which ought to display the same bars in your chart, but their labels will represent the percent of the total.

Here's a sample script. We'll just pretend that the contents of my layer are the following:

|   | choices | A | B | C |
|--:|--------:|--:|--:|--:|
| 0 | AB      | 1 | 1 | 0 |
| 1 | A       | 1 | 0 | 0 |
| 2 | C       | 0 | 0 | 1 |
| 3 | BC      | 0 | 1 | 1 |
| 4 | ABC     | 1 | 1 | 1 |
| 5 | C       | 0 | 0 | 1 |
| 6 | AB      | 1 | 1 | 0 |

 

from arcgis import GIS

# Connect to portal
gis = GIS('your portal url', 'username', 'password')

# Get feature service, query layer to dataframe
fl = gis.content.get('your layer itemID')
sdf = fl.layers[0].query(as_df=True, out_fields = ['list', 'of', 'count', 'column', 'names')

# Grab the length of the dataframe
rowcount = len(sdf)

# Identify 'count' columns
count_cols = df.loc[:, 'A':'C']

# Assign 1 / rowcount to the "count of choice X" columns
count_cols = count_cols / rowcount * 100

# Rename count columns to match percent column naming scheme
sdf.rename(columns={'A':'A_pct', 'B':'B_pct', 'C':'C_pct'}, inplace=True)

# Apply calculated values to layer
fl.layers[0].edit_features(updates=sdf.spatial.to_featureset())

 

The resulting dataframe looks like this:

|   | choices |     A_pct |     B_pct |     C_pct |
|--:|--------:|----------:|----------:|----------:|
| 0 | AB      | 14.285714 | 14.285714 | 0.000000  |
| 1 | A       | 14.285714 | 0.000000  | 0.000000  |
| 2 | C       | 0.000000  | 0.000000  | 14.285714 |
| 3 | BC      | 0.000000  | 14.285714 | 14.285714 |
| 4 | ABC     | 14.285714 | 14.285714 | 14.285714 |
| 5 | C       | 0.000000  | 0.000000  | 14.285714 |
| 6 | AB      | 14.285714 | 14.285714 | 0.000000  |

 

So for instance, a graph of the "C_pct" column could show a "count" metric in the chart itself, but have the data callout be the sum of its values, that being ~57%.

Once you've got this working, schedule the script to run at regular intervals, and you're all set! If you don't have much experience scheduling python scripts, you can also do this via an AGOL Notebook, which can still connect to and work with Enterprise layers.

- Josh Carlson
Kendall County GIS
JasonCyphers
Occasional Contributor III

Thanks for the info!  I'll play around with it!

0 Kudos
jcarlson
MVP Notable Contributor

And come to think of it, you could probably just use calculate to update those fields and not even bother with dataframes at all, if you use separate fields.

fs.layers[0].calculate(
    where = '1=1',
    calc_expression = [
        {"field" : "A_pct", "value" : f"A / {rowcount} * 100"},
        {"field" : "B_pct", "value" : f"B / {rowcount} * 100"},
        {"field" : "C_pct", "value" : f"C / {rowcount} * 100"}
    ]
- Josh Carlson
Kendall County GIS