Add Python Regular Expression (re) Helpers to Calculate Field Dialog

01-12-2023 08:12 AM
Status: Open
JoshuaBixby
MVP Esteemed Contributor

I have been working with QGIS more, and one aspect I find really handy with its Field Calculator is the support for a few common regular expression string functions (regexp_matches, regexp_replace, regexp_substr) natively within the dialog. I think having something similar in ArcGIS Pro's Calculate Field would be useful, but I also realize the implementation in Pro could not take the same form as in QGIS. After thinking about how various functions are implemented in Calculate Field currently, I think the simplest way to make regular expression use easier would be to add a few "re" Helpers in the Calculate Field dialog, similar to how "math" Helpers are added.

I realize someone can import re into a code block, define a function there, and use regular expressions that way, but I also think having re.findall, re.split, and re.sub usable in the Expression block via a Helper would make regular expressions more accessible to more users.
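For readers less familiar with the three functions named above, here is a minimal sketch of what each one does on a plain Python string. The sample value and the patterns are made up for illustration; this is not a proposal for how the Helpers themselves would look in the dialog.

```python
import re

# Hypothetical field value, used only for illustration.
value = "123 Main St, Apt 4B; Springfield"

# re.findall: return every non-overlapping match as a list (here, runs of digits).
numbers = re.findall(r"\d+", value)

# re.split: split on a pattern instead of a fixed separator
# (here, a comma or semicolon followed by optional whitespace).
parts = re.split(r"[,;]\s*", value)

# re.sub: replace every match of a pattern (here, "Apt" plus its unit label).
cleaned = re.sub(r"\bApt\s+\w+", "Unit", value)

print(numbers)  # ['123', '4']
print(parts)    # ['123 Main St', 'Apt 4B', 'Springfield']
print(cleaned)  # 123 Main St, Unit; Springfield
```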

If this were to be considered, I know the likely outcome would be using "re" since it is a native/core library, but it does look like Pro already comes bundled with "regex", and I have to say I find "regex" much better at portable Unicode syntax than the "re" library.

3 Comments
JohannesLindner

Oof... Some incoherent points off the top of my head:

The field calculator is clunky enough in its present state. We get questions here about its syntax pretty often, especially from users who call the calculator from scripts rather than the GUI and have to supply the expression as a string. If you introduce regex into there, you will probably get a lot of questions about that.
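To make the "expression as a string" point concrete, here is a rough sketch of what a scripted call with a regular expression looks like today. The table name "parcels", the field names ADDRESS and HOUSE_NUM, and the helper first_number are all hypothetical. The arcpy call is left commented out so the snippet runs without ArcGIS Pro, and the code block is exercised with exec() purely to check that it behaves as intended.

```python
# Expression and code block are both passed to Calculate Field as strings.
# ADDRESS and first_number are hypothetical names used only for this sketch.
expression = "first_number(!ADDRESS!)"

# A raw string keeps the backslash in \d intact inside the code block.
code_block = r'''
import re

def first_number(value):
    """Return the first run of digits in value, or None if there is none."""
    if value is None:
        return None
    match = re.search(r"\d+", value)
    return match.group(0) if match else None
'''

# In ArcGIS Pro the call would look roughly like this (assumed table and field names):
# arcpy.management.CalculateField("parcels", "HOUSE_NUM", expression, "PYTHON3", code_block)

# Outside ArcGIS, run the code block in its own namespace to sanity-check it.
namespace = {}
exec(code_block, namespace)
result = namespace["first_number"]("123 Main St")
print(result)  # 123
```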

Regex is... let's say "not the most intuitive". I'm by no means a power user, but I have used it in some instances. I still don't have a clue what I'm doing. Every regex I've ever used was heavily inspired by online searches and long sessions on regex101. If I look at some of those expressions now, I only know what they're supposed to find because I documented it. I feel like the complicated syntax would stop many users in their tracks, leading to frustration and to them just using string.replace() anyway.

Who's your audience here? I think most users who both know of regex and consider using it know enough about Python to either import re in the calculator or skip the calculator altogether and go straight for the UpdateCursor. On the other side, there's a big chance that users who are not as experienced in Python (or programming in general) have never even heard of regex. Swamping those users in regex syntax doesn't seem like a good idea to me.

So, personally, I think it would lead to much confusion for little benefit. Having said that, I don't know how QGIS does this. Maybe I'm making up problems and there are good ways of avoiding that confusion...

jcarlson

I want regex everywhere. Quick access to those functions would be incredibly helpful. And the tools currently available are already full of things that some users may not find "intuitive", but I don't think that's a good reason not to offer better and more efficient methods for advanced users to do their work.

JoshuaBixby

@JohannesLindner, I don't see people stumbling into the Helper functions; a user has to actively select one or type it in. If people are using Helper functions they know nothing about, that isn't a regular expression problem, that is a user problem. If potential audience size is a strong determining factor, how did ".as_integer_ratio" get put in the list? Also, Esri has introduced some hand-rolled Helpers like Sequential Number, which can have unexpected outcomes, so I can't see what issues there would be with adding a tried-and-true Python module.