Scenario: I have a wide variety of PDFs, and I am using a Notebook script I wrote in Python to extract specific information from them. The information I need is printed to the console, and PDFs that do not contain the values I am looking for are labeled as "Unprocessed".
I essentially want all the "Processed" files and their data to be uploaded to a feature layer under the data tab.
Is this possible? Or are there any workarounds, such as creating an XLSX file from the notebook with the extracted information?
Thanks.
You would need a Python module to process PDF files, e.g. PyPDF2.
It isn't installed by default in Notebook, so Excel files might be the better method.
You can read Excel files using the following code:
# Import libraries
from arcgis.gis import GIS
from openpyxl import load_workbook

gis = GIS("home")

# File location
directory = r'/arcgis/home/'
myexcelfile = directory + r'MyFile.xlsx'

# Read excel file
wb = load_workbook(myexcelfile)
ws = wb['Sheet_Name']

# Iterate over the columns and rows to read the cell values
for col in range(1, ws.max_column + 1):
    for row in range(1, ws.max_row + 1):
        print(ws.cell(column=col, row=row).value)

If you are using an ArcGIS Online notebook, you need to add your Excel files to the notebook directory. It's kind of weird, but this post explains how to add files to your directory: https://www.esri.com/arcgis-blog/products/arcgis-online/announcements/new-user-workspace-and-file-ma...
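For the other direction (creating an XLSX file from the notebook with the extracted information), something like the following might work. It is only a rough sketch: the `results` list, its keys, the `write_processed` helper, and the output file name are all made up for illustration, and it assumes your script already tags each PDF as "Processed" or "Unprocessed".

```python
import os
import tempfile

from openpyxl import Workbook, load_workbook

# Hypothetical extraction results; in practice these would come from
# your own PDF parsing step.
results = [
    {"file": "a.pdf", "status": "Processed", "value": "123"},
    {"file": "b.pdf", "status": "Unprocessed", "value": None},
    {"file": "c.pdf", "status": "Processed", "value": "456"},
]


def write_processed(records, path):
    """Write only the 'Processed' records to an XLSX file; return the row count."""
    wb = Workbook()
    ws = wb.active
    ws.title = "Processed"
    ws.append(["file", "value"])  # header row
    count = 0
    for rec in records:
        if rec["status"] == "Processed":
            ws.append([rec["file"], rec["value"]])
            count += 1
    wb.save(path)
    return count


# A temp folder is used here so the sketch runs anywhere; in a hosted
# notebook you would probably point this at r'/arcgis/home/' instead.
out_path = os.path.join(tempfile.gettempdir(), "processed_pdfs.xlsx")
written = write_processed(results, out_path)
print(f"Wrote {written} processed rows to {out_path}")
```

The resulting file can then be read back with `load_workbook` as shown above, or downloaded from the notebook's file manager.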