<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Soundalike, rhyming and similar words in Python Questions</title>
    <link>https://community.esri.com/t5/python-questions/soundalike-rhyming-and-similar-words/m-p/1382390#M69880</link>
    <description>&lt;P&gt;This falls squarely in the realm of Natural Language Processing (NLP). Do you really want to roll your own rather than use existing libraries?&lt;/P&gt;</description>
    <pubDate>Wed, 14 Feb 2024 18:40:39 GMT</pubDate>
    <dc:creator>JoshuaBixby</dc:creator>
    <dc:date>2024-02-14T18:40:39Z</dc:date>
    <item>
      <title>Soundalike, rhyming and similar words</title>
      <link>https://community.esri.com/t5/python-questions/soundalike-rhyming-and-similar-words/m-p/1381961#M69865</link>
      <description>&lt;P&gt;I'm developing a script tool that lets users enter text and search it against a table and a layer. Currently, it returns very similar words, but I need to enhance it to include soundalike, rhyming, and similar words for a given input. For instance, if the input is "Saddle," the tool should return words like "Battle" and "Cattle" as they are soundalike, rhyming, or similar matches.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The messages returned in this case are the following:&lt;/P&gt;&lt;PRE&gt;'saddle' is similar, sound-alike, or rhymes with 'a' in stname field&lt;/PRE&gt;&lt;PRE&gt;'saddle' is similar, sound-alike, or rhymes with 'adele' in stname field&lt;/PRE&gt;&lt;PRE&gt;'saddle' is similar, sound-alike, or rhymes with 'crusader' in stname field&lt;/PRE&gt;&lt;PRE&gt;'saddle' is similar, sound-alike, or rhymes with 'd' in stname field&lt;/PRE&gt;&lt;PRE&gt;'saddle' is similar, sound-alike, or rhymes with 'e' in stname field&lt;/PRE&gt;&lt;PRE&gt;'saddle' is similar, sound-alike, or rhymes with 'leo' in stname field&lt;/PRE&gt;&lt;PRE&gt;'saddle' is similar, sound-alike, or rhymes with 'saddle' in stname field&lt;/PRE&gt;&lt;PRE&gt;'saddle' is similar, sound-alike, or rhymes with 'saddle horn' in stname field&lt;/PRE&gt;&lt;PRE&gt;'saddle' is similar, sound-alike, or rhymes with 'saddle mountain' in stname field&lt;/PRE&gt;&lt;PRE&gt;'saddle' is similar, sound-alike, or rhymes with 'saddle peak' in stname field&lt;/PRE&gt;&lt;PRE&gt;'saddle' is similar, sound-alike, or rhymes with 'saddle up' in stname field&lt;/PRE&gt;&lt;PRE&gt;'saddle' is similar, sound-alike, or rhymes with 'saddlehorn' in stname field&lt;/PRE&gt;&lt;PRE&gt;'saddle' is similar, sound-alike, or rhymes with 'saddleman ranch' in stname field&lt;/PRE&gt;&lt;PRE&gt;'saddle' is similar, sound-alike, or rhymes with 'sand' in stname field&lt;/PRE&gt;&lt;PRE&gt;'saddle' is similar, sound-alike, or rhymes with 'sawtelle' in stname field&lt;/PRE&gt;&lt;PRE&gt;'saddle' is similar, sound-alike, or rhymes with 'southdale' in stname field&lt;/PRE&gt;&lt;PRE&gt;'saddle' is similar, sound-alike, or rhymes with 'southwell' in stname field&lt;/PRE&gt;&lt;PRE&gt;'saddle' is similar, sound-alike, or rhymes with 'steele' in stname field&lt;/PRE&gt;&lt;PRE&gt;'saddle' is similar, sound-alike, or rhymes with 'stella' in stname field&lt;/PRE&gt;&lt;PRE&gt;'saddle' is similar, sound-alike, or rhymes with 'stoll' in stname field&lt;/PRE&gt;&lt;PRE&gt;'saddle' is similar, sound-alike, or rhymes with 'n saddlebrook way' in fullstname field&lt;/PRE&gt;&lt;PRE&gt;'saddle' is similar, sound-alike, or rhymes with 'saddle ave' in fullstname field&lt;/PRE&gt;&lt;PRE&gt;'saddle' is similar, sound-alike, or rhymes with 'saddle horn ln' in fullstname field&lt;/PRE&gt;&lt;PRE&gt;'saddle' is similar, sound-alike, or rhymes with 'saddle mountain ave' in fullstname field&lt;/PRE&gt;&lt;PRE&gt;'saddle' is similar, sound-alike, or rhymes with 'saddle mountain way' in fullstname field&lt;/PRE&gt;&lt;PRE&gt;'saddle' is similar, sound-alike, or rhymes with 'saddle peak ave' in fullstname field&lt;/PRE&gt;&lt;PRE&gt;'saddle' is similar, sound-alike, or rhymes with 'saddle up ln' in fullstname field&lt;/PRE&gt;&lt;PRE&gt;'saddle' is similar, sound-alike, or rhymes with 'saddleback ln' in fullstname field&lt;/PRE&gt;&lt;PRE&gt;'saddle' is similar, sound-alike, or rhymes with 'saddlehorn way' in fullstname field&lt;/PRE&gt;&lt;PRE&gt;'saddle' is similar, sound-alike, or rhymes with 'saddleman ranch ct' in fullstname field&lt;/PRE&gt;&lt;PRE&gt;Similar words count: 21&lt;/PRE&gt;&lt;PRE&gt;Rhyming words count: 7&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Code:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="python"&gt;import arcpy
from fuzzywuzzy import fuzz
from SoundsLike.SoundsLike import Search

class Soundex:
    def __init__(self):
        self.soundex_dict = self._build_soundex_dict()

    def _build_soundex_dict(self):
        soundex_dict = {}
        for char in 'bfpv':
            soundex_dict[char] = '1'
        for char in 'cgjkqsxz':
            soundex_dict[char] = '2'
        for char in 'dt':
            soundex_dict[char] = '3'
        for char in 'l':
            soundex_dict[char] = '4'
        for char in 'mn':
            soundex_dict[char] = '5'
        for char in 'r':
            soundex_dict[char] = '6'
        return soundex_dict

    def get_soundex(self, word):
        if not word:
            return None
        word = word.lower()
        soundex_code = word[0]
        for char in word[1:]:
            soundex_char = self.soundex_dict.get(char)
            if soundex_char and soundex_char != soundex_code[-1]:
                soundex_code += soundex_char
        soundex_code = soundex_code.ljust(4, '0')[:4]
        return soundex_code

    def are_rhyming(self, word1, word2):
        soundex1 = self.get_soundex(word1)
        soundex2 = self.get_soundex(word2)
        return soundex1 == soundex2

def soundex(name, length=4):
    """ soundex module conforming to Odell-Russell algorithm """

    # digits holds the soundex values for the alphabet
    soundex_digits = '01230120022455012623010202'
    sndx = ''
    fc = ''

    # Translate letters in name to soundex digits
    for c in name.upper():
        if c.isalpha():
            if not fc: fc = c   # Remember first letter
            d = soundex_digits[ord(c)-ord('A')]
            # Duplicate consecutive soundex digits are skipped
            if not sndx or (d != sndx[-1]):
                sndx += d

    # Replace first digit with first letter
    sndx = fc + sndx[1:]

    # Remove all 0s from the soundex code
    sndx = sndx.replace('0', '')

    # Return soundex code truncated or 0-padded to length characters
    return (sndx + (length * '0'))[:length]

# Initialize Soundex
soundex_instance = Soundex()

# Get the search text as input parameter
search_text = arcpy.GetParameterAsText(0).lower()

# Set the workspace
arcpy.env.workspace = "C:/GIS Folder/Addressing.gdb"

# Input feature class and table
feature_class = "C:/GIS Folder/Addressing.gdb/Roads"
table = "C:/GIS Folder/Addressing.gdb/Road_Names_Table"

# Create a list to store the comparison results
comparison_results = []

# Create a dictionary to store the counts of similar, sound-alike, and rhyming words
word_counts = {'similar': 0, 'soundalike': 0, 'rhyming': 0}

# Get a sorted list of unique street names from the feature class, filtering out None values
unique_stnames = sorted(set(row[0].lower() for row in arcpy.da.SearchCursor(feature_class, "FENAME") if row[0]))

# Get a list of unique full street names from the table and sort them, filtering out None values
unique_fullstnames = sorted(set(row[0].lower() for row in arcpy.da.SearchCursor(table, "FULLSTNAME") if row[0]))

# Find perfect homophones for the search text
homophones = Search.perfectHomophones(search_text)

# Compare the search text with stname values
for stname in unique_stnames:
    if soundex_instance.are_rhyming(search_text, stname) or fuzz.partial_ratio(search_text, stname) &amp;gt;= 70 or stname in homophones:
        comparison_results.append(f"'{search_text}' is similar, sound-alike, or rhymes with '{stname}' in stname field")
        if soundex_instance.are_rhyming(search_text, stname):
            word_counts['rhyming'] += 1
        elif fuzz.partial_ratio(search_text, stname) &amp;gt;= 80:
            word_counts['similar'] += 1
        elif stname in homophones:
            word_counts['soundalike'] += 1

# Compare the search text with fullstname values
for fullstname in unique_fullstnames:
    if soundex_instance.are_rhyming(search_text, fullstname) or fuzz.partial_ratio(search_text, fullstname) &amp;gt;= 70 or fullstname in homophones:
        comparison_results.append(f"'{search_text}' is similar, sound-alike, or rhymes with '{fullstname}' in fullstname field")
        if soundex_instance.are_rhyming(search_text, fullstname):
            word_counts['rhyming'] += 1
        elif fuzz.partial_ratio(search_text, fullstname) &amp;gt;= 80:
            word_counts['similar'] += 1
        elif fullstname in homophones:
            word_counts['soundalike'] += 1

# Print the comparison results using arcpy.AddMessage()
for result in comparison_results:
    arcpy.AddMessage(result)

# Print if similar, soundalike, or rhyming words were found
if word_counts['similar'] &amp;gt; 0:
    arcpy.AddMessage(f"Similar words count: {word_counts['similar']}")
if word_counts['soundalike'] &amp;gt; 0:
    arcpy.AddMessage(f"Sound-alike words count: {word_counts['soundalike']}")
if word_counts['rhyming'] &amp;gt; 0:
    arcpy.AddMessage(f"Rhyming words count: {word_counts['rhyming']}")&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 13 Feb 2024 23:34:58 GMT</pubDate>
      <guid>https://community.esri.com/t5/python-questions/soundalike-rhyming-and-similar-words/m-p/1381961#M69865</guid>
      <dc:creator>TonyAlmeida</dc:creator>
      <dc:date>2024-02-13T23:34:58Z</dc:date>
    </item>
    <item>
      <title>Re: Soundalike, rhyming and similar words</title>
      <link>https://community.esri.com/t5/python-questions/soundalike-rhyming-and-similar-words/m-p/1382237#M69879</link>
      <description>&lt;P&gt;I just realized that my question was truncated. I'm having trouble getting words that rhyme or sound similar returned by the tool. For instance, if I input "Saddle," the tool should return words like "Battle" and "Cattle" since they are soundalike, rhyming, or similar matches. However, it's not doing that. I know that in the table there are words like "Battle" and "Cattle." How can I get results returned in this manner?&lt;/P&gt;</description>
      <pubDate>Wed, 14 Feb 2024 15:25:41 GMT</pubDate>
      <guid>https://community.esri.com/t5/python-questions/soundalike-rhyming-and-similar-words/m-p/1382237#M69879</guid>
      <dc:creator>TonyAlmeida</dc:creator>
      <dc:date>2024-02-14T15:25:41Z</dc:date>
    </item>
    <item>
      <title>Re: Soundalike, rhyming and similar words</title>
      <link>https://community.esri.com/t5/python-questions/soundalike-rhyming-and-similar-words/m-p/1382390#M69880</link>
      <description>&lt;P&gt;This falls squarely in the realm of Natural Language Processing (NLP). Do you really want to roll your own rather than use existing libraries?&lt;/P&gt;</description>
      <pubDate>Wed, 14 Feb 2024 18:40:39 GMT</pubDate>
      <guid>https://community.esri.com/t5/python-questions/soundalike-rhyming-and-similar-words/m-p/1382390#M69880</guid>
      <dc:creator>JoshuaBixby</dc:creator>
      <dc:date>2024-02-14T18:40:39Z</dc:date>
    </item>
    <item>
      <title>Re: Soundalike, rhyming and similar words</title>
      <link>https://community.esri.com/t5/python-questions/soundalike-rhyming-and-similar-words/m-p/1382442#M69881</link>
      <description>&lt;P&gt;I would prefer using existing libraries.&lt;/P&gt;</description>
      <pubDate>Wed, 14 Feb 2024 19:35:03 GMT</pubDate>
      <guid>https://community.esri.com/t5/python-questions/soundalike-rhyming-and-similar-words/m-p/1382442#M69881</guid>
      <dc:creator>TonyAlmeida</dc:creator>
      <dc:date>2024-02-14T19:35:03Z</dc:date>
    </item>
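    <item>
      <title>Re: Soundalike, rhyming and similar words</title>
      <description>A possible reason "Battle" and "Cattle" never come back: the are_rhyming check in the original post compares Soundex codes, and a Soundex code keeps the first letter of the word, so "saddle" and "battle" get different codes even though their endings match. The sketch below is a rough, standard-library-only approximation, not a true rhyme test. It skips the leading consonants and compares Soundex-style digits for the rest of each word. The names rhyme_key and rough_rhymes are hypothetical helpers, and treating "y" as a vowel is an assumption.

```python
# Soundex digit for each letter a..z (same table as the soundex() function in the original post).
SOUNDEX_DIGITS = "01230120022455012623010202"
# Assumption: "y" is treated as a vowel so words like "rhyme" still have an ending to compare.
VOWELS = "aeiouy"


def rhyme_key(word):
    """Return Soundex-style digits for the part of a word after its leading consonants."""
    letters = [c for c in word.lower() if c.isascii() and c.isalpha()]
    # Skip the leading consonant cluster ("s" in "saddle", "st" in "stella").
    start = 0
    while start != len(letters) and letters[start] not in VOWELS:
        start += 1
    key = ""
    previous = ""
    for c in letters[start:]:
        digit = SOUNDEX_DIGITS[ord(c) - ord("a")]
        # Collapse runs of the same digit and drop vowel-like letters (digit 0).
        if digit != previous and digit != "0":
            key += digit
        previous = digit
    return key


def rough_rhymes(word, candidates):
    """Return the candidates whose ending code matches the ending code of word."""
    target = rhyme_key(word)
    if not target:
        # No usable ending (for example a single consonant), so nothing can match.
        return []
    return [
        c for c in candidates
        if c.lower() != word.lower() and rhyme_key(c) == target
    ]


if __name__ == "__main__":
    print(rough_rhymes("saddle", ["battle", "cattle", "sand", "stella"]))
```

With the sample list above, rough_rhymes keeps "battle" and "cattle" and drops "sand" and "stella". Because it works on spelling rather than pronunciation, it also accepts near matches such as "bottle", and it is meant for single words rather than full street names. Strict rhymes would need a pronunciation dictionary.</description>
    </item>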
  </channel>
</rss>

