
Python and/or Arcade built-in to handle reserved label characters

02-24-2025 04:04 PM
Status: Open
MErikReedAugusta
MVP Regular Contributor

NOTE: I'm deliberately posting this in ArcGIS Pro Ideas because it's fundamentally an issue with the label & text engines, but there's a strong overlap with both the Python and Arcade boards, as well.  I'd be happy to move it if the moderators think it would be better in one/both of those.

 

Background:

As many likely know, the ampersand (&) and left angle bracket (<) are both reserved characters for labels and similar text outputs in ArcGIS Pro.  Let's say I have a feature class that has a comments field that I'd like displayed in either a label or a text box on a map series.

  • Feature 1 Comment: "Reference both DEM and Contours"
  • Feature 2 Comment: "Contains < 1000 sq ft"

Let's also say, for the sake of argument, that I want to print the Asset ID in bold, while the comments are in regular text.  Below are abbreviated Python and Arcade label expressions that add the formatting tags needed for the bold text.

 

Python:

def FindLabel ([ID], [Comments]):
    id = [ID]
    commo = [Comments]
    return f'<BOL>[{id}]</BOL> "{commo}"'

Arcade:

var id = $feature.ID
var commo = $feature.Comments
return `<BOL>[${id}]</BOL> "${commo}"`

 

Only, we have a problem, here.  One of my comment strings has a less-than sign, which is a reserved character.  The first feature would display correctly, but the second one breaks:

  • [Asset 1] "Reference both DEM and Contours"
  • <BOL>[Asset 2]</BOL> "Contains < 1000 sq ft"

Okay, so you add some code to handle the character replacement, and all is well:

 

Python:

def FindLabel ([ID], [Comments]):
    id = [ID]
    commo = [Comments]
    commo = commo.replace('&','&amp;').replace('<','&lt;')
    return f'<BOL>[{id}]</BOL> "{commo}"'

Arcade:

var id = $feature.ID
var commo = Replace(Replace($feature.Comments,'&','&amp;'),'<','&lt;')
return `<BOL>[${id}]</BOL> "${commo}"`

 

Results:

  • [Asset 1] "Reference both DEM and Contours"
  • [Asset 2] "Contains < 1000 sq ft"

 

Problem:

But what if I wanted to emphasize the word "both" in Asset 1?  Maybe it's not normal for the person looking at this text to reference both DEM and contours at the same time, but for this case I need them to do so.  In normal cases, I could just embed the necessary tags in the underlying comment:

  • Feature 1 Raw Comment: "Reference <BOL><UND>both</UND></BOL> DEM and Contours"
  • Feature 1 Intended Result:
    • [Asset 1] "Reference both DEM and Contours"
  • Feature 1 Actual Result:
    • [Asset 1] "Reference &lt;BOL>&lt;UND>both&lt;/UND>&lt;/BOL> DEM and Contours"

As you can see, because my code escapes every reserved character, the tags embedded in the comment get escaped too, so they're printed literally instead of being applied.  It's an either/or decision, unless you want to manually go through every Comment and make sure you don't have any unescaped reserved characters.  No bueno.

 

Idea:

I'd love to see a built-in command for arcpy and/or Arcade that would handle the replacement of reserved characters on the fly without replacing those characters when they're a part of a valid tag or escape string.

 

Python:

def FindLabel ([ID], [Comments]):
    id = [ID]
    commo = arcpy.EscapeChar([Comments])
    return f'<BOL>[{id}]</BOL> "{commo}"'

Arcade:

var id = $feature.ID
var commo = EscapeChar($feature.Comments)
return `<BOL>[${id}]</BOL> "${commo}"`
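
To be clear, neither EscapeChar function exists today; they're the proposal.  As a rough illustration of the behavior I have in mind, here is a minimal Python sketch, not a definitive implementation.  The names escape_label_text and TAG_NAMES are hypothetical, the tag list is only an illustrative subset of the formatting tags, and I'm assuming that only &amp;, &lt;, and &gt; count as already-escaped text.

```python
import re

# Assumption: an illustrative subset of formatting tag names, not a complete list.
TAG_NAMES = ("BOL", "ITA", "UND", "CLR", "FNT", "SUP", "SUB")

# One pass over the text. At each position, try in order:
#   keep: a recognized opening/closing tag, or an existing escape sequence
#   amp:  a bare ampersand
#   lt:   a bare left angle bracket
_TOKEN = re.compile(
    r"(?P<keep></?(?:%s)(?:\s[^<>]*)?>|&(?:amp|lt|gt);)|(?P<amp>&)|(?P<lt><)"
    % "|".join(TAG_NAMES)
)


def escape_label_text(text):
    """Escape bare '&' and '<' while leaving known tags and escapes untouched."""

    def _fix(match):
        if match.group("keep"):
            # Recognized tag or existing escape: pass through unchanged.
            return match.group("keep")
        return "&amp;" if match.group("amp") else "&lt;"

    return _TOKEN.sub(_fix, text)


print(escape_label_text("Contains < 1000 sq ft"))
print(escape_label_text("Reference <BOL><UND>both</UND></BOL> DEM & Contours"))
```

Because everything happens in a single pass, the order of replacements stops mattering.  The obvious limitation is that the sketch can't tell a deliberate tag from a comment that just happens to contain the literal text "<BOL>", and it doesn't check that tags are balanced.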

 

Added bonus:

Currently, if you're going to escape both ampersand and left-angle-bracket, you have to do it in that order, or you'll break your own character escapes (because you'll end up changing that left-angle-bracket to "&amp;lt;").  And if someone happened to put the correct escape text in the underlying text, you'd break that, too, for the same reasons.
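
A small Python sketch of both failure modes, using plain str.replace as in the label expression earlier in this post (the two function names are just for illustration):

```python
def escape_amp_first(text):
    # Ampersand first, so the '&' written by '&lt;' is never touched again.
    return text.replace('&', '&amp;').replace('<', '&lt;')


def escape_lt_first(text):
    # Wrong order: the '&' in the freshly written '&lt;' gets escaped a second time.
    return text.replace('<', '&lt;').replace('&', '&amp;')


sample = 'Contains < 1000 sq ft & counting'
print(escape_amp_first(sample))  # Contains &lt; 1000 sq ft &amp; counting
print(escape_lt_first(sample))   # Contains &amp;lt; 1000 sq ft &amp; counting

# Even the correct order double-escapes text that was already escaped by hand.
print(escape_amp_first('DEM &amp; Contours'))  # DEM &amp;amp; Contours
```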

If instead, you could just call something like "EscapeChar($feature.comments)" in Arcade or "arcpy.EscapeChar(commo)" in arcpy, all these edge cases could be handled in the back end—probably much more efficiently than any ad-hoc function I bake into my label & text scripts.

 

2 Comments
JohannesLindner

When you use Formatting Tags in your label expression, the expression breaks for features that have special characters like "<", ">", or "&" in their label fields. See this question for examples.

This is because these characters are reserved in HTML: they make up the syntax. To solve this, you have to replace those characters with their HTML entities, e.g.:

 

var txt = $feature.TextField
txt = Replace(txt, "&", "&amp;")
txt = Replace(txt, "<", "&lt;")
txt = Replace(txt, ">", "&gt;")
return "<CLR red='255'>" + txt + "</CLR>"

 

 

This is a lot of work, you can really easily forget to do this, you have to remember or look up the HTML entities, and users inexperienced with HTML (and probably many experienced ones, too) don't have a clue why their expression breaks for some features.

Solution: Just do those replacements behind the scenes.

MErikReedAugusta

Weirdly, right-angle-bracket (>) doesn't seem to break the text, in my experience.  The only two I've ever encountered are the two mentioned in my original post.

Beyond that, absolutely agreed: This is poor UX to begin with, even in simple cases.  In the more complex cases at the top, it's even worse.

Solving it "automagically" in the back end without user input might be difficult to do without being overzealous, though.