Attribute Table Spell Check

02-28-2018 11:26 AM
MikeEdwards
Occasional Contributor

I know this might be a stretch, but has anyone had any success writing a spell check function that would check an attribute table? I know there has been a lot of discussion about the need to spell check map element text, but it would also be helpful to have something that could check spelling within the attributes.

Right now the only real option I can see is to export the table to a CSV or comparable format, import it into Excel, and use its spell check, which is not ideal.

I'm currently trying to use PyEnchant, but thought I'd check to see if anyone had any success with this previously.
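A minimal sketch of what such a check might look like, independent of any particular spelling library. The function name `find_misspellings`, the `(row_id, text)` row shape, and the `ignore` parameter are assumptions made for illustration, not an existing API. The actual spelling test is passed in as a callable, so a PyEnchant dictionary's `check` method could be plugged in.

```python
import re

# Pull out runs of letters and apostrophes; digits and punctuation are skipped.
WORD_RE = re.compile(r"[A-Za-z']+")


def find_misspellings(rows, is_correct, ignore=()):
    """Yield (row_id, word) for every word that is_correct() rejects.

    rows       -- iterable of (row_id, text) pairs; text may be None
    is_correct -- callable taking a word and returning True if spelled correctly
    ignore     -- words to skip, e.g. recurring proper nouns (hypothetical helper)
    """
    ignored = {w.lower() for w in ignore}
    for row_id, text in rows:
        if not text:
            continue  # skip null or empty attribute values
        for word in WORD_RE.findall(text):
            word = word.strip("'")
            if word and word.lower() not in ignored and not is_correct(word):
                yield row_id, word
```

With PyEnchant installed, `is_correct` could be `enchant.Dict("en_US").check`, and the rows could come from something like `arcpy.da.SearchCursor(table, ["OID@", "NAME"])`, where `table` and the `NAME` field are placeholders for your own data.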

Thanks.

3 Replies
JamesMacKay3
Occasional Contributor

Although I didn't build it for attribute tables, I've had some success with spellchecking using Microsoft Word through Python.  The gist of my process was that I looped through a bunch of metadata files, grabbed a specific subset of metadata elements, wrote each element's text value to a new line in a text file, spellchecked the file in Word, then exported the errors out to a list.

Upsides:

  • Easy to set up

Downsides:

  • Not very fast when batch-processing a bunch of different metadata documents
  • Occasionally Word would visibly pop up and stall the process
  • Obvious baked-in dependency

I did this as a one-off though, so I didn't really worry about the downsides - if I were to use it frequently I'd really want to nail down the second downside in particular.

Code looked something like this:

import io

import win32com.client

# Start (or attach to) Word through COM
app = win32com.client.gencache.EnsureDispatch("Word.Application")
for metadataFile in metadataFiles:
    # Extracted specific metadata elements here, put into list.
    # ...

    # Dump to text file, one element per block separated by a blank line
    with io.open(tempDocPath, "w") as tempFile:
        for elementText in elementTextBlocks:
            # u"...".format() yields text on both Python 2 and Python 3
            tempFile.write(u"{}\n\n".format(elementText))

    # Open the doc
    doc = app.Documents.Open(tempDocPath)
    for error in doc.SpellingErrors:
        # Do stuff (error.Text holds the flagged word)
        # ...
        pass
    doc.Close()
    del doc
app.Quit()

Of course, the Close()/del/Quit() calls were handled in finally blocks. Because I was getting a lot of recurring proper nouns flagged as errors, I also ended up building an exclusion step into the error loop, where the "Do stuff" placeholder is. In the end I basically dumped everything to a CSV with the filename and misspelled word.
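The exclusion and CSV steps could be sketched roughly like this. This is not the original code: the function name `write_spelling_report`, the column headers, and the case-insensitive matching are all assumptions for illustration, and it is written for Python 3.

```python
import csv


def write_spelling_report(flagged, exclusions, out_file):
    """Write (filename, word) rows to out_file, skipping excluded words.

    flagged    -- iterable of (filename, misspelled_word) pairs
    exclusions -- words to ignore, e.g. recurring proper nouns
    out_file   -- an open, writable text file object
    Returns the number of data rows written.
    """
    excluded = {w.lower() for w in exclusions}  # compare case-insensitively
    writer = csv.writer(out_file)
    writer.writerow(["filename", "misspelled_word"])
    count = 0
    for filename, word in flagged:
        if word.lower() in excluded:
            continue
        writer.writerow([filename, word])
        count += 1
    return count
```

When writing to a real file, open it with `newline=""` as the `csv` module documentation recommends, for example `open("spelling_errors.csv", "w", newline="")`, where the file name is just a placeholder.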

MikeEdwards
Occasional Contributor

Thanks. That's similar to the process I was looking into, except with Excel in place of Word.

RobertBorchert
Frequent Contributor III

What I did was run the spell check in MS Access. Of course, that only works on personal geodatabases.

I then created a custom dictionary.
