Hi all,
We frequently use the tax parcel layer published by the State of Hawaii to help us plan projects.
In the last few years, the State stopped publishing the layer with ownership information, opting instead to include, as an attribute, a link to a webpage showing ownership, taxes, etc.
Example here, with Hawai'i Volcanoes National Park: qPublic.net - Hawai'i County, HI - Report: 980010010000 (schneidercorp.com)
I'd like to be able to populate a copy of the layer (filtered to be relevant to us) with attributes from the webpage, mostly (especially) the ownership information.
Does anyone have any tips as to how this might be done? It doesn't need to update dynamically; a one-time pull is fine.
Thanks!
Since the URL contains the TMK # of the parcel, you could use that with the requests library: retrieve the page, parse the response, then repeat for each record of interest.
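A rough sketch of that loop, in case it helps. The URL template is a placeholder, not the real report address: copy the link from the layer's attribute and swap the TMK for {tmk}. The function name and the delay value are my own choices, and `session` is assumed to be a requests.Session (or anything with a compatible .get method).

```python
import time


def fetch_parcel_pages(tmks, url_template, session, delay_seconds=2.0):
    """Fetch one report page per TMK and return {tmk: html_text}.

    url_template is assumed to contain a "{tmk}" placeholder, e.g. the
    report link from the parcel layer with the TMK swapped out.
    """
    pages = {}
    for index, tmk in enumerate(tmks):
        if index > 0:
            # Pause between requests so the server is not hammered.
            time.sleep(delay_seconds)
        response = session.get(url_template.format(tmk=tmk), timeout=30)
        response.raise_for_status()  # stop on HTTP errors instead of saving error pages
        pages[tmk] = response.text
    return pages


# Usage (needs `pip install requests`; the URL below is a placeholder):
#   import requests
#   template = "https://example.com/report?KeyValue={tmk}"
#   pages = fetch_parcel_pages(["980010010000"], template, requests.Session())
```

Keeping the raw HTML in a dict (or writing it to disk) means you only have to hit the site once per parcel, even if you change how you parse it later.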
The site's robots.txt file disallows all user agents (web crawlers/automated scraping) for /Application.aprx/, so be respectful and careful about how you go about your data extraction.
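If you want to check a URL against those rules in code, the standard library's urllib.robotparser can do it. This is only a sketch: the sample rules below just mirror what is described above, and the function name is made up. In practice you would download the site's actual robots.txt and pass its text in.

```python
from urllib.robotparser import RobotFileParser

# Illustrative rules only, mirroring the restriction described above.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /Application.aprx/
"""


def is_allowed(robots_txt, url, user_agent="*"):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Example with a placeholder host:
#   is_allowed(SAMPLE_ROBOTS_TXT, "https://example.com/Application.aprx/report")
```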
You can use the Python package BeautifulSoup to extract items/text from web pages; there are plenty of tutorials online showing how it can be done.
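As a starting point, here is a minimal BeautifulSoup sketch. I have not inspected the report page's markup, so the two-cell label/value table rows and the sample HTML (including the owner value) are assumptions for illustration; open the real page in your browser's inspector and adjust the selectors to match.

```python
from bs4 import BeautifulSoup  # pip install beautifulsoup4

# Made-up markup standing in for a report page; the real layout may differ.
SAMPLE_HTML = """
<table>
  <tr><th>Parcel Number:</th><td>980010010000</td></tr>
  <tr><th>Owner Name:</th><td>EXAMPLE OWNER</td></tr>
</table>
"""


def extract_fields(html):
    """Collect {label: value} from table rows that hold exactly two cells."""
    soup = BeautifulSoup(html, "html.parser")
    fields = {}
    for row in soup.find_all("tr"):
        cells = row.find_all(["th", "td"])
        if len(cells) == 2:
            label = cells[0].get_text(" ", strip=True).rstrip(":")
            fields[label] = cells[1].get_text(" ", strip=True)
    return fields


# Example: extract_fields(SAMPLE_HTML).get("Owner Name")
```

The resulting dict per TMK can then be written to a CSV or table and joined to your filtered copy of the parcel layer on the TMK field.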