Remove ordinals in an address

07-28-2023 09:13 AM
Michael_Kleinman
New Contributor II

Hi, I'm using Pro 3.1.2. I have a database of street names, and I need to remove all the ordinal suffixes from them. So "SE 1ST ST" would become "SE 1 ST", "SE 44TH ST" would become "SE 44 ST", etc. I need to remove the "ST" when it signifies first (1ST) but not when it signifies Street (also ST), so only when it comes right after a number. The same goes for the "TH" in "44TH", but not the one in "NORTH". I'm having trouble writing something that will catch and remove the ordinals only after a digit 0-9.

I've tried this without any luck:

(?<=[0-9])(?:st|nd|rd|th)

Does anyone have any suggestions? Thanks in advance.

2 Replies
by Anonymous User
Not applicable

Try https://regex101.com/. Make sure you select Python as the flavor on the left, paste some of your strings into the test string box, and build the pattern at the top.

(\d+ST)|(\d+ND)|(\d+RD)|(\d+TH)

matches the whole token, digits included: ##ST, ##ND, ##RD, ##TH
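
If you go with this first pattern, keep in mind that the match includes the digits, so replacing it with an empty string would delete the number too. A minimal sketch of one way around that, assuming you capture the digits in a group and put them back with a backreference (the name `drop_suffix` and the trailing `\b` are my own additions for illustration):

```python
import re

# Capture the digits, match the ordinal suffix right after them,
# and require a word boundary so longer words are left alone.
ORDINAL = re.compile(r"(\d+)(?:ST|ND|RD|TH)\b")

def drop_suffix(street):
    # \1 puts the captured digits back; only the suffix is dropped.
    return ORDINAL.sub(r"\1", street)

print(drop_suffix("SE 1ST ST"))
print(drop_suffix("SE 44TH ST"))
```

The street-type "ST" at the end is untouched because it is not preceded by a digit.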

 

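(?<=\d)(?:ST|TH|ND|RD)

matches just the ST, TH, ND, or RD that directly follows a digit. The suffixes need to be grouped: written as (?<=\d)ST|TH|ND|RD, the lookbehind only applies to ST, and TH, ND, and RD would match anywhere (for example, the TH in NORTH).

Positive Lookbehind (?<=\d)
Assert that the regex below matches immediately before the current position
\d matches a digit (equivalent to [0-9])
Non-capturing group (?:ST|TH|ND|RD)
1st Alternative: ST matches the characters ST literally (case sensitive)
2nd Alternative: TH matches the characters TH literally (case sensitive)
3rd Alternative: ND matches the characters ND literally (case sensitive)
4th Alternative: RD matches the characters RD literally (case sensitive)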
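
Note that `|` binds more loosely than the lookbehind, so whether the suffixes are wrapped in a group matters. A small comparison, using a sample street name I made up to show the difference:

```python
import re

# Without a group, the lookbehind guards only the first alternative (ST).
ungrouped = re.compile(r"(?<=\d)ST|TH|ND|RD")
# With a non-capturing group, the lookbehind guards every suffix.
grouped = re.compile(r"(?<=\d)(?:ST|TH|ND|RD)")

sample = "NORTH 44TH ST"

loose = ungrouped.sub("", sample)
tight = grouped.sub("", sample)

print(loose)  # the TH in NORTH is removed as well
print(tight)  # only the TH after 44 is removed
```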
by Anonymous User
Not applicable

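Here it is in action:

import re

addrs = ['SE 1ST ST', 'SE 2ND ST', 'SE 03RD ST', 'SE 15TH ST']

# Group the suffixes so the lookbehind applies to all of them, not just ST
comp = re.compile(r"(?<=\d)(?:ST|TH|ND|RD)")

for a in addrs:
    match = comp.search(a)   # first ordinal suffix found in the address
    f = comp.sub("", a)      # address with the suffix removed
    print(f'Changed {match.group(0)} : {a} : {f}')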
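
Since the original question is about a field in Pro, the same substitution could probably be wrapped in a function for the Calculate Field code block. This is only a sketch: the function name `strip_ordinals` and the field name `STREET_NAME` are placeholders, and the `re.IGNORECASE` flag, the trailing `\b`, and the null check are my assumptions about the data rather than anything stated in the question.

```python
import re

# Ordinal suffix directly after a digit, any letter case,
# ending at a word boundary.
ORDINAL = re.compile(r"(?<=\d)(?:ST|ND|RD|TH)\b", re.IGNORECASE)

def strip_ordinals(value):
    # Null field values arrive as None; pass them through unchanged.
    if value is None:
        return None
    return ORDINAL.sub("", value)

# In Calculate Field (Python 3), the expression would be something like:
#   strip_ordinals(!STREET_NAME!)   # STREET_NAME is a hypothetical field
print(strip_ordinals("SE 1st St"))
```

Try it on a copy of the data first, since the function overwrites whatever field you calculate.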