Data extraction from OpenStreetMap (OSM) can be performed in various ways. The OSM project runs a website where data can be downloaded in the .pbf format, which was developed for OSM. External platforms provide data from the website that has already been preprocessed and verified. The Overpass API allows users to download small extents of data (approximately at town or suburb level) via a web request. This API acts like a database server and is queried with a dedicated query language, Overpass QL.
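To give an idea of what such a web request looks like, here is a minimal sketch that builds an Overpass QL query for one OSM key, a list of tag values and a bounding box. The function names `build_overpass_query` and `run_query` are made up for this illustration and are not part of OSM2ArcGIS; the endpoint URL is the public Overpass instance and is an assumption about which server you want to use.

```python
import json
import urllib.parse
import urllib.request

# Public Overpass endpoint (assumption: any other Overpass instance works the same way).
OVERPASS_URL = "https://overpass-api.de/api/interpreter"


def build_overpass_query(key, values, bbox, element="node", timeout=60):
    """Build an Overpass QL query for one OSM key and several tag values.

    bbox is (min_lat, min_lon, max_lat, max_lon), which is the
    south, west, north, east order that Overpass QL expects.
    """
    south, west, north, east = bbox
    value_filter = "|".join(values)
    return (
        f"[out:json][timeout:{timeout}];"
        f'{element}["{key}"~"^({value_filter})$"]'
        f"({south},{west},{north},{east});"
        "out body;"
    )


def run_query(query, url=OVERPASS_URL):
    """Send the query to the Overpass server and return the parsed JSON answer."""
    payload = urllib.parse.urlencode({"data": query}).encode("utf-8")
    with urllib.request.urlopen(url, data=payload) as response:
        return json.load(response)


if __name__ == "__main__":
    query = build_overpass_query(
        "public_transport",
        ["station", "platform"],
        (48.0503, 11.2723, 48.2597, 11.8113),
    )
    print(query)
    # Uncomment to actually contact the server:
    # print(len(run_query(query)["elements"]))
```

The sketch only requests nodes; ways and relations need additional output statements to return their geometry, which is one of the things the osm_runner module takes care of.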
The free OSM2ArcGIS Python script, available on GitHub, requests data via Overpass QL using an adapted version of Jeffrey Scarmazzi’s osm_runner Python module. The requested data is processed and converted to the ArcGIS SpatialDataFrame format. After data processing in ArcGIS Online / ArcGIS Enterprise, a new feature service is created inside the user’s portal.
Within the feature service, the data is organized in at least three layers, one for each of the three available geometry types: point, polyline and polygon. More configurations result in more layers, as explained later. The additional features of OSM2ArcGIS are explained in detail in the following sections of this blog entry.
The following Python version and packages must be installed. Installing Anaconda helps to keep existing Python installations working by creating independent Python environments. Users also benefit from Anaconda’s integrated package installation and administration interface.
How to set up the Python Environment in Anaconda:
Download and install the 3.x Version of Anaconda. Then open the Anaconda Navigator and create a new Python Environment. The Python version must be 3.5 or later. Then install these packages in the environment:
conda install -c esri arcgis=1.5.0
The ArcGIS API for Python must be version 1.5.0 or lower to work correctly with OSM2ArcGIS. For further information see this issue on GitHub: API for Python 1.5.1 & 1.5.2: arcgis.geometry.Polygon.is_valid() method for polygons does not work ·...
pip install requests
pip install progressbar
You also need an ArcGIS Online account with user type “Creator” (new user types) or an ArcGIS Enterprise account with user type “Level 2” (named user, for version 10.6 and below) and at least a “Publisher” role. An ArcGIS Online Developer account also meets these requirements.
set path=C:\your_anaconda_install_path\Anaconda3\Library\bin\;%PATH%
call conda activate OSM2ArcGIS
call "C:\Users\your_username\AppData\Local\conda\conda\envs\your_python_environment"\python.exe "C:\your_project_extraction_path"\MainModule.py
The geodata contributed to the OSM project is not as standardized as data provided by governmental organizations or companies. There are guidelines inside the OSM wiki, but users operate various clients on various devices and are not forced to generate geometries in a specific way or provide detailed attribute values in exactly defined attribute fields.
This may lead to problems when parsing the data and relying on specific data types. For example, using OpenStreetMap for a road map with exact speeds on road sections is not always a good idea, because the unit is not specified exactly in all road sections. For this reason we decided not to use integer or float data types for attribute data: all attribute data except the timestamp of each OSM object is stored as strings.
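The following sketch illustrates the idea with a few speed values as they might appear in OSM data. The function `tags_to_string_attributes` is a hypothetical helper written for this blog entry, not the code used inside OSM2ArcGIS, and the timestamp format is an assumption based on the ISO style that the Overpass API returns.

```python
from datetime import datetime, timezone


def tags_to_string_attributes(tags, timestamp):
    """Store every OSM tag value as a string and only the timestamp as a date."""
    attributes = {key: str(value) for key, value in tags.items()}
    # Assumption: timestamps look like "2019-03-01T12:34:56Z" (UTC).
    parsed = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%SZ")
    attributes["timestamp"] = parsed.replace(tzinfo=timezone.utc)
    return attributes


if __name__ == "__main__":
    # Three road sections with differently written speed values.
    for speed in ("50", "30 mph", "walk"):
        row = tags_to_string_attributes(
            {"highway": "residential", "maxspeed": speed},
            "2019-03-01T12:34:56Z",
        )
        # int(speed) would fail for two of the three values,
        # while keeping the string never loses information.
        print(repr(row["maxspeed"]), row["timestamp"].isoformat())
```

Converting such values to numbers is better done later, in a step where you can decide how to treat units and non-numeric entries for your own use case.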
Another challenge of using OSM is related to its geometric data structure, which consists of way, node and relation objects. Nodes can easily be transferred to ArcGIS point objects, and way objects that do not have the same start and end can just as easily be transferred to ArcGIS polyline objects. Polygons are more complicated output geometries, as they can be formed either by way objects with the same start and end or by relations that connect multiple way objects to form a “multipolygon”.
Relations can also connect multiple point objects, or point and way objects. To take into account only those relation objects that form “multipolygons” by connecting way elements, OSM2ArcGIS makes use of the fact that the polygon-connecting relations feature the attribute “multipolygon” with the value “yes” (see figure below). Relations that do not form “multipolygon” features are not considered by OSM2ArcGIS.
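The mapping described above can be sketched as a small decision function. This is a simplified illustration and not the code of the osm_runner module: the function name `target_geometry` is made up, the element layout follows the JSON output of the Overpass API, and the multipolygon check accepts both the attribute described above and the `type=multipolygon` tag that such relations commonly carry in raw OSM data.

```python
def target_geometry(element):
    """Return "point", "polyline", "polygon" or None for one OSM element."""
    kind = element.get("type")
    tags = element.get("tags", {})

    if kind == "node":
        return "point"

    if kind == "way":
        nodes = element.get("nodes", [])
        # A closed ring needs at least four node references (a, b, c, a).
        is_closed = len(nodes) >= 4 and nodes[0] == nodes[-1]
        return "polygon" if is_closed else "polyline"

    if kind == "relation":
        is_multipolygon = (
            tags.get("multipolygon") == "yes"
            or tags.get("type") == "multipolygon"
        )
        # Relations that do not form a multipolygon are skipped.
        return "polygon" if is_multipolygon else None

    return None


if __name__ == "__main__":
    samples = [
        {"type": "node", "id": 1},
        {"type": "way", "id": 2, "nodes": [10, 11, 12]},
        {"type": "way", "id": 3, "nodes": [10, 11, 12, 10]},
        {"type": "relation", "id": 4, "tags": {"type": "multipolygon"}},
        {"type": "relation", "id": 5, "tags": {"type": "route"}},
    ]
    for sample in samples:
        print(sample["id"], target_geometry(sample))
```

Treating every closed way as a polygon is a simplification that matches the description in this section; whether a closed way really represents an area can depend on its tags.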
The configuration of OSM2ArcGIS is defined by the user in two JSON-based configuration files, agolconfig.json and osmconfig.json. The first one enables the tool to establish connections to the ArcGIS Enterprise / Online portal. The latter is used to set up the requests to the OpenStreetMap server. Explanations of all JSON attributes and example configurations are listed in the table below.
Parameter | Usage | Example |
---|---|---|
"categories" | Controls the export of elements from OpenStreetMap. For every new configuration with another geometry or OSM key, a new category must be created with the following 5 properties: - "categoryName": the desired OSM key. Multiple values are not allowed here. - "categoryValues": the desired OSM tags. Multiple values are given in square brackets. - "attributeFieldsToExclude": the fields to exclude from the service on ArcGIS Online. Multiple values are given in square brackets. - "geometryType": the geometry type; valid types are "line", "point" or "polygon". Multiple values are not allowed here. - "isEnabled": set to "yes" to activate or to "no" to deactivate a configuration. This way, currently unneeded configurations can be kept in the configuration file. | "categories" : [ { "categoryName" : "public_transport", "categoryValues" : ["station", "platform"], "attributeFieldsToExclude" : ["bus", "tram"], "geometryType" : "polygon", "isEnabled" : "yes" }, { "categoryName" : "public_transport", "categoryValues" : ["station", "platform"], "attributeFieldsToExclude" : ["bus", "tram"], "geometryType" : "point", "isEnabled" : "yes" }, { "categoryName" : "public_transport", "categoryValues" : ["platform", "network"], "attributeFieldsToExclude" : ["bus", "tram"], "geometryType" : "line", "isEnabled" : "yes" } ] |
"boundingBox" | Bounding box for the data to be loaded. Multiple bounding boxes are not allowed here. | "boundingBox" : { "minLatInit" : "48.0503", "minLonInit" : "11.2723", "maxLatInit" : "48.2597", "maxLonInit" : "11.8113" } |
Valid OSM keys are listed here: https://taginfo.openstreetmap.org/api/4/projects/keys
Valid OSM tags are listed here: https://taginfo.openstreetmap.org/api/4/projects/tags
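To show how the two attributes from the table fit together, here is a small sketch that reads a configuration in the osmconfig.json layout, keeps only the enabled categories and converts the bounding box strings to numbers. The helper names `enabled_categories` and `bounding_box` are hypothetical and only serve this example; the configuration is embedded as a string so that the sketch runs without a file.

```python
import json

# Shortened example configuration in the layout shown in the table above.
CONFIG_TEXT = """
{
  "categories": [
    {"categoryName": "public_transport",
     "categoryValues": ["station", "platform"],
     "attributeFieldsToExclude": ["bus", "tram"],
     "geometryType": "polygon",
     "isEnabled": "yes"},
    {"categoryName": "public_transport",
     "categoryValues": ["platform", "network"],
     "attributeFieldsToExclude": ["bus", "tram"],
     "geometryType": "line",
     "isEnabled": "no"}
  ],
  "boundingBox": {"minLatInit": "48.0503", "minLonInit": "11.2723",
                  "maxLatInit": "48.2597", "maxLonInit": "11.8113"}
}
"""


def enabled_categories(config):
    """Return only the categories whose "isEnabled" property is "yes"."""
    return [c for c in config.get("categories", []) if c.get("isEnabled") == "yes"]


def bounding_box(config):
    """Return the bounding box as (min_lat, min_lon, max_lat, max_lon) floats."""
    box = config["boundingBox"]
    keys = ("minLatInit", "minLonInit", "maxLatInit", "maxLonInit")
    return tuple(float(box[key]) for key in keys)


if __name__ == "__main__":
    config = json.loads(CONFIG_TEXT)
    for category in enabled_categories(config):
        print(category["categoryName"], category["geometryType"], category["categoryValues"])
    print(bounding_box(config))
```

To read your own file instead, replace `json.loads(CONFIG_TEXT)` with `json.load` on the opened osmconfig.json.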
OSM2ArcGIS was originally developed by Simon Geigenberger. The tool consists of seven modules, which are loaded when they are needed at the current position in the program. The MainModule is loaded as the core part of OSM2ArcGIS and invokes the modules AGOLConfigHelper, OSMConfigHelper, OSMHelper and AGOLHelper.
AGOLConfigHelper validates the config file agolconfig.json. The module first checks the availability of the ArcGIS API for Python and then continues to check the config file. A quick connection check with the credentials given in the config file is also performed within this module. The module terminates the whole program if invalid values were specified within the configuration file.
OSMConfigHelper validates the config file osmconfig.json. The module first checks the availability of the ArcGIS API for Python and then continues to check the config file. Categories and tags available in OSM are downloaded from an OSM database, and the values set in the config file are compared with the values in the database. The module terminates the whole program if the values or tags set in the config file are not available in the OSM database of valid tags/values, or if an invalid bounding box or other invalid values are set within the config file.
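The local part of such a validation can be sketched as follows. This is not the code of OSMConfigHelper: the function `validate_osm_config` and its error messages are invented for illustration, and the comparison with the online list of valid keys and tags is left out so that the sketch runs without a network connection.

```python
VALID_GEOMETRY_TYPES = {"point", "line", "polygon"}
BOX_KEYS = ("minLatInit", "minLonInit", "maxLatInit", "maxLonInit")


def validate_osm_config(config):
    """Return a list of problems found in an osmconfig-style dictionary."""
    errors = []

    box = config.get("boundingBox")
    try:
        min_lat, min_lon, max_lat, max_lon = (float(box[key]) for key in BOX_KEYS)
    except (KeyError, TypeError, ValueError):
        errors.append("boundingBox must contain four numeric values")
    else:
        if not -90 <= min_lat < max_lat <= 90:
            errors.append("latitudes must satisfy -90 <= min < max <= 90")
        if not -180 <= min_lon < max_lon <= 180:
            errors.append("longitudes must satisfy -180 <= min < max <= 180")

    for index, category in enumerate(config.get("categories", [])):
        if category.get("geometryType") not in VALID_GEOMETRY_TYPES:
            errors.append(f"category {index}: invalid geometryType")
        if category.get("isEnabled") not in ("yes", "no"):
            errors.append(f"category {index}: isEnabled must be yes or no")
        if not isinstance(category.get("categoryValues"), list):
            errors.append(f"category {index}: categoryValues must be a list")

    return errors


if __name__ == "__main__":
    broken = {
        "boundingBox": {"minLatInit": "48.2597", "minLonInit": "11.2723",
                        "maxLatInit": "48.0503", "maxLonInit": "11.8113"},
        "categories": [{"categoryName": "public_transport",
                        "categoryValues": ["station"],
                        "geometryType": "area", "isEnabled": "yes"}],
    }
    for problem in validate_osm_config(broken):
        print(problem)
```

A caller that wants the behavior described above could stop the program, for example with `sys.exit`, as soon as the returned list is not empty.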
The OSMHelper module is designed to set up and control the requests sent to the OpenStreetMap server using the extended osm_runner module. To optimize processing speed, the module creates threads that simultaneously download data from the OSM server for the configurations in the config file.
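The threading idea can be sketched with Python’s standard thread pool. This is an illustration under assumptions and not the OSMHelper code: `download_all` is a made-up name, and `fetch` stands for whatever function actually sends the request for one category.

```python
from concurrent.futures import ThreadPoolExecutor


def download_all(categories, fetch, max_workers=4):
    """Run fetch(category) for every category in parallel threads.

    Returns a dictionary keyed by (categoryName, geometryType).
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns the results in the order of the input categories.
        results = list(pool.map(fetch, categories))
    keys = [(c["categoryName"], c["geometryType"]) for c in categories]
    return dict(zip(keys, results))


if __name__ == "__main__":
    def fake_fetch(category):
        # Placeholder for a real server request.
        return f"data for {category['geometryType']}"

    configured = [
        {"categoryName": "public_transport", "geometryType": "point"},
        {"categoryName": "public_transport", "geometryType": "line"},
        {"categoryName": "public_transport", "geometryType": "polygon"},
    ]
    for key, value in download_all(configured, fake_fetch).items():
        print(key, value)
```

Threads suit this task because the program spends most of its time waiting for the server’s answer. Keep the number of workers small so that the public OSM servers are not overloaded.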
The osm_runner module was originally developed by Jeffrey Scarmazzi and has been extended to process the relation entities with the attribute “multipolygon” and the value “yes”.
The osm_runner_utils module provides configuration settings used by the osm_runner module. By modifying this module, details of the OSM server request format, the output format of the server’s answer and additional filters can be changed.
AGOLHelper expects an Esri SpatialDataFrame as input object. It creates a feature service inside the user’s portal in ArcGIS Enterprise / Online and uploads the data for the defined configurations to at least one layer for every geometry type set in osmconfig.json. Uploads to the portal are performed in multiple chunks of data to prevent server errors on large datasets.
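The chunking itself is simple and can be sketched without any ArcGIS dependency. The names `iter_chunks` and `upload_in_chunks` as well as the chunk size of 500 are assumptions made for this example; the `upload` argument stands for the function that sends one chunk to the portal, for instance a wrapper around the `edit_features` method of a feature layer in the ArcGIS API for Python.

```python
def iter_chunks(features, chunk_size):
    """Yield consecutive slices of at most chunk_size features."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    for start in range(0, len(features), chunk_size):
        yield features[start:start + chunk_size]


def upload_in_chunks(features, upload, chunk_size=500):
    """Call upload(chunk) for every chunk and return the number of features sent."""
    uploaded = 0
    for chunk in iter_chunks(features, chunk_size):
        upload(chunk)
        uploaded += len(chunk)
    return uploaded


if __name__ == "__main__":
    features = [{"id": number} for number in range(1234)]
    sizes = []
    # The lambda only records chunk sizes; a real upload would contact the portal.
    total = upload_in_chunks(features, lambda chunk: sizes.append(len(chunk)))
    print(total, sizes)
```

Smaller chunks mean more requests but less data per request, so the best size depends on the dataset and the server.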
OSM2ArcGIS makes it easy to transfer data from OSM to an ArcGIS Enterprise / Online portal. All you need to do is set up your Python installation, configure the tool via the JSON-based config files and make sure that you were granted at least a publisher role by your portal administrator.
If you have any questions or comments please feel free to contact me or leave a comment in the comment section below.
The script is available for download from GitHub
Happy data processing, Lukas
(E-Mail: lukas.bug@aol.de)