Hello, I'm currently working with one feature class table containing 400 000 objects and one database table containing 700 objects. I'm currently working in ArcMap 10.4 and using a Geodatabase with Python 2.7. What I'm trying to do is match the database table containing 700 objects against the FC. What I want is to match them, treating all 700 objects as wildcars %700 ojects%, and match those towards the FC that contains text from the table. I have 2 tables with 1 column in each that im trying to match. I have not yet found a good way to treat all rows in a column as wildcard to search against another column.
This is my code so far. This only will print out what the database table values. Next i wanted to get the FC values and compare them. I've read about maybe using a dictionary to make it easier to extract values but I've yet to figure out how.
import arcpy, unicodedata, datetime, openpyxl
arcpy.env.workspace =r"C:\Users\kaaper\Documents\DataDrivenPages\DataDrivenPages.gdb"
def rows_as_dicts(cursor):
colnames = cursor.fields
for row in cursor:
yield dict(zip(colnames, row)
with arcpy.da.SearchCursor(r"C:\Users\kaaper\Documents\DataDrivenPages\DataDrivenPages.gdb\Ordbok", '*') as sc:
for row in rows_as_dicts(sc):
print row['field']
Any input will be appreciated.
Solved! Go to Solution.
Hi Per Olav Kaasa ,
You can try the code below. It will take some time to run. Be sure to run this on the uncompressed version of the data (it will not be able to run on compressed data).Change the following:
# -*- coding: utf-8 -*-
def main():
import arcpy
# settings (first uncompress geodatabase data!)
fc_anno = r'C:\GeoNet\SearchWildcard\gdb\TestBase.gdb\Navn_org'
fld_anno = 'SNAVN' # field to match with dictionary
fld_out = 'OrdbokMatch' # output field name
tbl = r'C:\GeoNet\SearchWildcard\gdb\TestBase.gdb\Ordbok'
fld_tbl = 'field' # dictionary data
# create dictionary list
lst_dct = [stripText2list(r[0]) for r in arcpy.da.SearchCursor(tbl, (fld_tbl))]
# Add output field to annotation
AddField(fc_anno, fld_out, "TEXT", 100)
# loop through annotation and update output field
cnt = 0
# dct_hits = {}
with arcpy.da.UpdateCursor(fc_anno, (fld_anno, fld_out)) as curs:
for row in curs:
cnt += 1
if cnt % 5000 == 0:
print "Processing row:", cnt
snavn = row[0]
lst_items = stripText2list(snavn)
result = FindMatches(lst_items, lst_dct)
row[1] = result
curs.updateRow(row)
def FindMatches(lst_items, lst_dct):
result = None
data = []
for lst_validate in lst_dct:
for validate in lst_validate:
for item in lst_items:
if validate.upper() in item.upper():
data.append(validate)
if len(data) > 0:
data = list(set(data))
result = data[0]
for data_item in data[1:]:
result += ", " + data_item
return result
def AddField(fc, fld_name, fld_type, fld_length=None):
if len(arcpy.ListFields(fc, fld_name)) == 0:
arcpy.AddField_management(fc, fld_name, fld_type, None, None, fld_length)
def stripText2list(text):
text_ori = text
lst_ok = []
if text is None:
return lst_ok
else:
s = text.find('(')
if s != -1:
e = text.find(')')
if e == -1:
e = len(text)-1
text = text[0:s] + text[e:len(text)-1]
text = text.replace('"','')
text = text.replace(';','')
text = text.replace('-','')
text = text.replace(',',' ')
text = text.replace(' ','')
lst = text.split(' ')
for a in lst:
a = a.strip()
if a != '':
if a[0] != '(':
if a[-1:] != ')':
lst_ok.append(a)
return lst_ok
if __name__ == '__main__':
main()
Per, I had to edit your code formatting so it was readable, check that I got the indentation right.
For the future Code Formatting... the basics++
I must admit, I don't quite understand the desired outcome. If you could provide an example with some screenshots that would be most helpful.
Sorry if I explained it poorly. What i want is to match column1 from table1(700 objects) towards column2 (400 000 objects) from table2. Each row in column1 might have several matches against column2. Column1 only contain parts of the text in column2, like keywords, meaning for example if column1 has "Mountain", column2 might have "Mountainrange", or "Westmountainrange". So each side of the text in column2 can have random values, meaning wildcards I want to match against column1. I cannot perform a join or relate, since they only partly match against each other.
Reading you request, it sounds like you are trying to match one set1 (700) things against another set2 (400,00) based upon one field.... Have you tried using a SQL query?
Select a.matchingsetfield, a.fieldstodisplay, b.fieldstodisplay
From set1 as a
Join set2 as b
On a.matchinsetfield = b.matchingsetfield;
So in further reading, what you are trying to do is compare to lists against each other. The python solution discussed is a good one and very flexible. However, an alternative, there are SQL solutions to this as well. Depending on the Database you are using .... I use the following SQL model for comparison..
DECLARE @v1 VARCHAR(MAX) = 'Lime, Tequila, Salt, Sugar',
@v2 VARCHAR(MAX) = 'Party, Fun, TGIF';
SELECT CASE WHEN EXISTS
(
SELECT 1
FROM dbo.Split(@v1) AS a
INNER JOIN dbo.Split(@v2) AS b
ON a.Item = b.Item
)
THEN 1 ELSE 0 END;
If your database does not support this type of operation then create two temporary tables fill in the temp tables built from the field lists from both layers then do a simple join.
Again the Python solution is excellent, but I wanted to give an alternate path to the same problem.
Can you use a relate?
Relate will only give me partial matches, like i explained in my answer above, column2 will have random keywords on both sides. A relate will only match those that have a 100% match towards each other.
Once you find the matched, what do you intent to do with it? Will you write something in table1, table2, both or create some new output with it? You can create two dictionaries or lists from the two fields, but since there is only a partial match you will probably need to loop through it. Depending on what you want to do, this will require additional information and define if it is better to compare table1 with table2 or the other way around.
I will create a new output of the matches in table1. Those that match will be used as a dictionary for the names in table2 in a map layout. How will i create a dictionary from the two fields? I have tried looking for that solution for a while.