Missing JSON data, Lack of overlap

Question asked by ga47qef on Jun 21, 2018

Hello everybody,

 

I recently wrote a small script that extracts all buildings from a given shapefile or feature class, fetches the building level data from the Overpass API, and merges the query result with the attribute table of the extracted buildings' shapefile via the JoinField_management() method.

 

During the API call, the bounding box is adapted to the feature class's extent, so we're dealing with the same coordinates here!

 

I use the file's OSMIDs and the ids returned by the API query to join the building level data with the rest of the information in the original shapefile.

 

Unfortunately, a mere 10% of the data overlap!

 

Here is my GET request sent to the API as well as a bit of work with the JSON data:

    # Parameters for the GET request to the Overpass API:
    # all ways (open and closed) tagged with building:levels
    # within the coordinates of the specified bounding box
    osmrequest = {'data': '[out:json][timeout:25];'
                          '('
                          'way["building:levels"]'
                          '(%s, %s, %s, %s);'
                          '(._;>;);'
                          ');'
                          'out;'
                          % (bbox_west, bbox_south, bbox_east, bbox_north)}

    # URL for GET Request
    osmurl = 'http://overpass-api.de/api/interpreter'

    # GET Request to overpass API
    osm = requests.get(osmurl, params=osmrequest)
    #osm_all = requests.get(osmurl, params=osmrequest_all)

    # Extract data from JSON dictionary
    osmdata = osm.json()
    osmdata = osmdata['elements']

    # rearrange osmdata so only relevant attributes can be extracted as dataframe
    for i in osmdata:
        if 'tags' in i:
            for k, v in i['tags'].iteritems():
                i[k] = v
            del i['tags']
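One thing that may be worth double-checking in the request above: Overpass QL expects a bounding box in the order (south, west, north, east), whereas the format arguments above are passed as west, south, east, north. The following is only a minimal sketch of building the query with that order made explicit. `build_levels_query` and its parameter names are hypothetical and not part of my script, and it uses `out tags;` on the assumption that only ids and tags are needed for the join.

```python
def build_levels_query(south, west, north, east, timeout=25):
    """Return an Overpass QL query for ways tagged building:levels.

    Hypothetical helper for illustration. Overpass expects the
    bounding box as (south, west, north, east).
    """
    bbox = "%f,%f,%f,%f" % (south, west, north, east)
    # "out tags;" returns ids and tags only, without node lists,
    # which keeps the response small when only the join key matters.
    return (
        "[out:json][timeout:%d];"
        'way["building:levels"](%s);'
        "out tags;" % (timeout, bbox)
    )


# Example with made-up coordinates
query = build_levels_query(48.1, 11.5, 48.2, 11.6)
```

The resulting string could be passed as `{'data': query}` in the same way as `osmrequest` above.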

What follows is a conversion to a dataframe, removal of NaNs, conversion to an ordered NumPy array, and, via the NumPyArrayToFeatureClass() method, the creation of a table that links the ids to the building levels.
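To make that step easier to reason about, here is a rough pure-Python stand-in for it (not the dataframe code I actually run). It assumes the elements still carry their original `tags` dictionary, and `levels_rows` is a hypothetical name. It keeps only ways with a numeric `building:levels` value and returns sorted `(id, levels)` pairs.

```python
def levels_rows(elements):
    """Return sorted (way_id, levels) tuples from Overpass elements.

    Hypothetical helper: skips nodes, ways without building:levels,
    and values that are not plain numbers (for example "2;3").
    """
    rows = []
    for el in elements:
        if el.get("type") != "way":
            continue
        raw = el.get("tags", {}).get("building:levels")
        if raw is None:
            continue
        try:
            levels = float(raw)
        except ValueError:
            continue  # non-numeric tag value, dropped like a NaN
        rows.append((int(el["id"]), levels))
    return sorted(rows)


sample = [
    {"type": "node", "id": 1},
    {"type": "way", "id": 366943300, "tags": {"building:levels": "4"}},
    {"type": "way", "id": 5, "tags": {"building": "yes"}},
    {"type": "way", "id": 3, "tags": {"building:levels": "2;3"}},
]
rows = levels_rows(sample)
```

Comparing `len(rows)` with the number of rows in the final table would show whether anything is lost between the API response and the table.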

 

The table is then joined via:

    arcpy.JoinField_management("buildings_feature_class.shp", "OSMID", r'C:\Users\...\out_table', "id", ["building_levels"])

with my previously extracted feature class.
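Before blaming the join itself, it might help to compare the two key columns directly. The sketch below is only an illustration under the assumption that the ids may be stored with different types in the two tables (text, float, or integer); `normalize_id` and `compare_ids` are hypothetical helpers, and the sample values are made up apart from the way ids quoted in this post.

```python
def normalize_id(value):
    """Coerce an id stored as int, float or text to int, or None on failure."""
    try:
        return int(float(str(value).strip()))
    except ValueError:
        return None


def compare_ids(shapefile_ids, overpass_ids):
    """Report which ids match and which appear on one side only."""
    left = {normalize_id(v) for v in shapefile_ids} - {None}
    right = {normalize_id(v) for v in overpass_ids} - {None}
    return {
        "matched": sorted(left & right),
        "only_in_shapefile": sorted(left - right),
        "only_in_overpass": sorted(right - left),
    }


report = compare_ids(
    ["366943300", 366943303.0, " 42 "],   # e.g. values read from OSMID
    [366943300, 366943303, 367172928],    # e.g. ids from the API response
)
```

The two input lists could be read from the tables with `arcpy.da.SearchCursor`. If the ids only match after normalization, the field types are a likely suspect; if they still don't match, the two sources probably contain different objects.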

 

Each table contains several thousand rows, yet only a few hundred are actually joined.

A few of the more memorable buildings aren't being joined. When I search for them manually on overpass-turbo.eu using the same query, I can find them, but their IDs are nowhere to be found in my table of extracted data.

 

Upon closer inspection of the received JSON data, it seems as if a chunk at the beginning has been cut off:

    9271L, 1330619804, 356935956, 2440470857L, 356935958, 3709199266L, 3709199268L], u'type': u'way', u'id': 366943300, u'tags': {u'building': u'yes', u'building:levels': u'4', u'roof:colour': u'red', u'roof:shape': u'gabled'}},
    {u'nodes': [3709199267L, 441880307, 356935959, 356935960, 3709199270L, 3709199267L], u'type': u'way', u'id': 366943303, u'tags': {u'building': u'yes', u'building:levels': u'4', u'roof:colour': u'red', u'roof:shape': u'gabled'}},
    {u'nodes': [1646304052, 1646304062, 2870959340L, 2870959338L, 2870959333L, 3711233225L, 1646304052], u'type': u'way', u'id': 367172928, u'tags': {u'building': u'yes', u'building:levels': u'2', u'roof:shape': u'gabled'}},
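Since a printed list of this size can easily be cut off by the console rather than by the API, counting the elements is probably a more reliable check than reading the output. The sketch below assumes it is given the full parsed response (the result of `osm.json()` before `['elements']` is taken). It also looks for a top-level `remark` entry, which, as far as I understand, Overpass adds when a query hits a runtime error such as a timeout; treat that as an assumption to verify. `summarize_overpass` is a hypothetical name.

```python
from collections import Counter


def summarize_overpass(payload):
    """Count elements per type and surface any server-side remark."""
    elements = payload.get("elements", [])
    return {
        "total": len(elements),
        "by_type": dict(Counter(el.get("type") for el in elements)),
        # Assumed to be present only when the server reports a problem
        "remark": payload.get("remark"),
    }


# Made-up payload shaped like an Overpass JSON response
payload = {
    "elements": [
        {"type": "way", "id": 366943300},
        {"type": "way", "id": 366943303},
        {"type": "node", "id": 356935956},
    ],
}
summary = summarize_overpass(payload)
```

If the number of ways reported here is close to what overpass-turbo.eu returns for the same query, the response itself is probably complete and the loss happens in a later step.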

 

Could anybody help me with the tables' lack of overlap and the possible cause(s)?
