Field calculator Python function using Fuzzy string

03-23-2017 09:10 AM
AhmedAbdelnasser1
New Contributor III

I simply want to compare the values from three columns using the fuzzywuzzy Python library and return the match percentage.

I want to do something like this, but of course this function does not work. How can I make it work?

from fuzzywuzzy import fuzz
from fuzzywuzzy import process
def function(query,v1,v2,v3)
choices = [v1,v2,v3 ]
result= process.extractOne(query, choices)
return result

I can make fuzzywuzzy work when comparing only two columns, like this:

Pre-logic script code:
from fuzzywuzzy import fuzz
from fuzzywuzzy import process

--------------------------------------------

fuzz.ratio(!column1!, !column2!)

This link gives an idea of how fuzzy string matching works: Fuzzy String Matching in Python – Marco Bonzanini

I would really appreciate your help.

Thanks,

6 Replies
JoshuaBixby
MVP Esteemed Contributor

I don't have time to look at the fuzzy string part, but I see some other issues. First, you don't have a colon after your function definition: def functionname: . Second, you're not indenting the body of your function definition. Third, you are passing 4 columns to the function, but the function only has 3 parameters.

JoshuaBixby
MVP Esteemed Contributor

In your code, you are using extractOne, which returns a tuple with the highest-ranking string and its score.  What do you want to store in the FUZZY_STRING field?  A string, a number, both?

AhmedAbdelnasser1
New Contributor III

Thanks, Joshua 

I updated the code to this, and it worked for me:

from fuzzywuzzy import fuzz
from fuzzywuzzy import process
def SetMatchPercenage(query,v1,v2,v3,v4):
       choices = [v1,v2,v3,v4]
       result=str(process.extractOne(query, choices))
       return result
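
For anyone who wants to store only the percentage in a numeric field rather than the string form of the (match, score) tuple, here is a minimal sketch. It is not the code from the post above: it swaps process.extractOne for a plain fuzz.ratio comparison against each value and keeps the highest score. The function name best_match_score is made up for illustration, and the difflib fallback is only an assumption so the snippet still runs where fuzzywuzzy is not installed; its scores can differ slightly from fuzzywuzzy's.

```python
from difflib import SequenceMatcher

try:
    # Preferred: the scorer already used earlier in this thread.
    from fuzzywuzzy import fuzz
    ratio = fuzz.ratio
except ImportError:
    # Assumed fallback: a difflib-based 0-100 similarity score.
    def ratio(a, b):
        return int(round(100 * SequenceMatcher(None, a, b).ratio()))


def best_match_score(query, *values):
    """Return the highest 0-100 similarity between query and any value."""
    # Null or empty field values cannot be compared, so skip them.
    choices = [v for v in values if v]
    if not query or not choices:
        return 0
    return max(ratio(query, c) for c in choices)
```

In Calculate Field, the function would go in the pre-logic script code, and the expression would be something like best_match_score(!query!, !column1!, !column2!, !column3!), where the field names are placeholders for your own.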

DanPatterson_Retired
MVP Emeritus

I notice that some of your columns vary in case, i.e. !Column1! and then !column2!. Did you type those in or select them from the list of available columns?

curtvprice
MVP Esteemed Contributor

Dan -- field names are absolutely case-insensitive in Calculate Field expressions.

DanPatterson_Retired
MVP Emeritus

ahhh.. too much python... and that must be jUsT a sTUpid vb legacy thing