I want to let our users search for text inside attached documents.
Example scenario: a user needs to find all features whose attached documents contain the word "bicycle".
Does anybody have any ideas how to do this?
As a general approach, I'd use Python, with a read cursor to iterate through each attachment, download it to a temp file, and search using a relevant Python library that reads your attachment file types. If you get a match, return the attachment's GlobalID and use that to find and return parent record data.
Downloading could be slow, but I don't know any options for reading attachments in situ.
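A rough sketch of that loop, for illustration only. It assumes each row is a `(parent_id, file_name, data)` tuple, for example as read by `arcpy.da.SearchCursor` over the `REL_GLOBALID`, `ATT_NAME` and `DATA` fields of an attachment table; those field names and the helper names `extract_text` and `find_matches` are assumptions, not something confirmed in this thread. To stay standard-library only, it reads just `.txt` and `.docx` files; PDFs would need a third-party reader.

```python
import os
import re
import tempfile
import zipfile
from xml.etree import ElementTree

# WordprocessingML namespace used inside word/document.xml
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def extract_text(path):
    """Return plain text for the file types this sketch understands."""
    ext = os.path.splitext(path)[1].lower()
    if ext == ".txt":
        with open(path, encoding="utf-8", errors="ignore") as f:
            return f.read()
    if ext == ".docx":
        # A .docx file is a zip archive; the body text lives in word/document.xml.
        with zipfile.ZipFile(path) as z:
            root = ElementTree.fromstring(z.read("word/document.xml"))
        # Join the runs of each paragraph without spaces so that a word
        # split across runs is still found, then one paragraph per line.
        return "\n".join("".join(p.itertext()) for p in root.iter(W_NS + "p"))
    return ""  # unsupported type in this sketch (e.g. PDF)


def find_matches(rows, term):
    """Return the parent ids of attachments whose text contains term.

    rows is any iterable of (parent_id, file_name, data) tuples
    (hypothetical shape; with arcpy, data would be the BLOB field).
    """
    pattern = re.compile(re.escape(term), re.IGNORECASE)
    matches = []
    with tempfile.TemporaryDirectory() as tmp:
        for parent_id, file_name, data in rows:
            # Download the attachment to a temp file, then search it.
            path = os.path.join(tmp, os.path.basename(file_name))
            with open(path, "wb") as f:
                f.write(bytes(data))
            if pattern.search(extract_text(path)):
                matches.append(parent_id)
    return matches
```

The returned ids can then be used to query the parent feature class for the matching records.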
FME can access attachments and should be able to read/search PDFs, but AFAIK it can't read Word documents, so if FME is what you have, you're back to using Python inside FME.
Thanks Mic, much appreciated. Having read your response I think I'll adjust my approach to immediately loading the text contents when the document is attached and then search through the attachments table:
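For anyone following along, here is one way the "extract on attach, then search the table" idea could look. This is only a sketch under assumptions: it uses an in-memory SQLite table as a stand-in for a text column on the attachments table, and the table name `attachment_text` and the functions `create_index`, `index_attachment` and `search_parents` are hypothetical names made up for the example.

```python
import sqlite3


def create_index(conn):
    """Create the stand-in table holding extracted attachment text."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS attachment_text ("
        "attachment_id TEXT PRIMARY KEY, parent_id TEXT, content TEXT)"
    )


def index_attachment(conn, attachment_id, parent_id, text):
    """Store the extracted text once, at the moment the file is attached."""
    conn.execute(
        "INSERT OR REPLACE INTO attachment_text VALUES (?, ?, ?)",
        (attachment_id, parent_id, text),
    )


def search_parents(conn, term):
    """Return distinct parent ids whose stored attachment text contains term."""
    # Escape LIKE wildcards so the term is matched literally.
    escaped = term.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    cur = conn.execute(
        "SELECT DISTINCT parent_id FROM attachment_text "
        "WHERE content LIKE ? ESCAPE '\\' ORDER BY parent_id",
        (f"%{escaped}%",),
    )
    return [row[0] for row in cur]
```

Searching then becomes a single query with no downloads, at the cost of storing the text and keeping it in sync when attachments are replaced or deleted.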