Convert JSON into FeatureCollection

JamesCrandall · ‎03-30-2017

I've been provided a JSON output in the sample below and tasked with getting this into a point feature class somehow.

[
 {
  "Name": "w1",
  "Long": "-60.44",
  "Lat": "21.60",
  "table": [
   [
    "G310-P:",
    "84 cfs; Mar 2, 17"
   ],
   [
    "G251-P:",
    "0 cfs; Mar 2, 17"
   ]
  ],
  "Heading": "315"
 },
 {
  "Name": "w2",
  "Long": "-60.32",
  "Lat": "21.63",
  "table": [
   [
    "S362-P:",
    "136 cfs; Mar 2, 17"
   ]
  ],
  "Heading": "0"
 },
 {
  "Name": "w2",
  "Long": "-60.35",
  "Lat": "21.37",
  "table": [
   [
    "S10A-C:",
    "0 cfs; Mar 2, 17"
   ],
   [
    "S10C-C:",
    "0 cfs; Mar 2, 17"
   ],
   [
    "S10D-C:",
    "0 cfs; Mar 2, 17"
   ]
  ],
  "Heading": "20"
 }
]‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍

I have some existing processes that get close but hoped to run it by the community to see if someone spots the obvious solution before I dillydally down the wrong path for too long. In the JSON sample above, I have Lat/Lon attributes that I would need to get into an array in order to use this def() to convert it to something closer to what I need. My first thought is to create a new attribute called 'rings' so that it would just work with my geo_convert() def:

def geo_convert(ring_string):  
    from json import loads, dumps  
     
    rings = loads(ring_string)  
    feat_coll = {'type': 'FeatureCollection',  
                 'features':[]}  
     
    for ring in rings:
        feat_coll['features'].append(  
            {'type': 'Feature',  
             'geometry': {  
                'type': 'Point',  
                'coordinates': ring['rings'][0]
            }})  
    
    return dumps(feat_coll)  ‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍

Any ideas on what I could do?

RandyBurton · ‎03-31-2017

My thoughts on your code is to replace lines 14 and 15

jData = json.dumps(output)    
json.dump(output, jsonfile, indent=2, sort_keys=True)‍‍

with this:

# create a dictionary of features
jData = { "features" : None }

# output is a list of dictionaries, in this case features
# insert the list into the jData dictionary
jData["features"] = output

# save the jData dictionary to a json file
json.dump(jData, jsonfile, indent=2, sort_keys=True)‍‍‍‍‍‍‍‍‍‍‍

You could combine the scripts, basically starting with the first 12 lines of your script (with the includes, etc.), then the six lines above, and finally including my script starting with line 12. You would end up with something like this:

import csv
import json

csvfile = open('MyFile.csv', 'r')
reader = csv.DictReader(csvfile)
fieldnames = ('Name', 'Flow', 'Arrow Size', 'Date', 'Lat', 'Long', 'Heading', 'Color', 'Desciption', 'table')

output = []

for each in reader:
  row = {}
  for field in fieldnames:
    if not field == 'Coords':
        row[field] = each[field]    
  output.append(row)

jData = { "features" : None }
jData["features"] = output

# start the geojson file
geo = {
    "type" : "FeatureCollection",
    "crs" : { "type" : "name",
              "properties" : {
                  "name" : "EPSG:4326"
                  }
              },
    "features" : None
    }

feature_list = []

# loop through json
fCount = 0
for feature in jData['features']:
    fCount +=1
    features = {
        "type" : "Feature",
        "id" : fCount,
        "geometry" : { "type" : "Point",
                       "coordinates" : [
                           float(feature["Lat"]), float(feature["Long"]) ]
                       },
        "properties" : { "OBJECTID" : fCount,
                         "Name" : feature["Name"],
                         "Lat" : float(feature["Lat"]),
                         "Long" : float(feature["Long"]),
                         "Heading" : int(feature["Heading"])
                         }
        }
    
    feature_list.append(features)

    # this part would be exported to tab/csv file
    # for k, v in feature['table']:
    #   print str(fCount)+ '\t' + k + '\t' + v
        

geo["features"] = feature_list

# to write to a file
with open('result2.json', 'w') as fp:
     json.dump(geo, fp)
‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍

View solution in original post

JoshuaBixby · ‎03-30-2017

How do you plan on handling the "tables?" Using Esri speak, the provided JSON is embedding a record set within each element of a feature set. Unless you plan on storing the record set portion as a string, I see this is two separate tables, one representing the basic properties (including location) of whatever, and the second representing events that are happening at whatever.

JamesCrandall · ‎03-30-2017

That "tables" originally comes in as a string (really it's an html table stored in that column of the csv file) and I had converted it to a list like that. If it's a problem then I can revert it back to its original string value and deal with it later (as you suggested, make it it's own table).

JamesCrandall · ‎03-31-2017

What do you think would be the best plan of attack on this one?

1. Attempt to churn the supplied JSON into a well-formatted esri-JSON object, minus that table column.

2. Just parse out the table column into new columns, effectively just making them new attribute fields.

I'm thinking for now I should just go with the #2 option to manipulate the csv and just convert that to a feature class. Open to suggestions.

JoshuaBixby · ‎03-31-2017

I agree with rvburton‌. I think trying to parse out the "table" and make them new attributes will get messy. Logically, the data set appears to be two tables, one with locations and one with events.

RandyBurton · ‎03-31-2017

I think you can loop through the json and create a well-formed geojson file. While doing the loop, you could pull out the table information and send it to a tab/csv file so it could be imported into your project as a related table.

RandyBurton · ‎03-31-2017

Here is some very rough code to give you an idea.

import json

filename = r'input.json'

with open(filename, 'r') as fp:
    # read json and add "features"
    jData = json.loads('{ "features" : ' + fp.read().decode("utf-8-sig").encode("utf-8") + '}') 

# print len(jData['features'])

# start the geojson file
print '{'
print '  "type" : "FeatureCollection",'
print '  "crs" :'
print '  {'
print '    "type" : "name",'
print '    "properties" :'
print '    {'
print '      "name" : "EPSG:4326"'
print '    }'
print '   },'
print '   "features" : ['

# loop through json
fCount = 0
for feature in jData['features']:
    fCount +=1
    print '   {'
    print '     "type" : "Feature",'
    print '     "id" : '+str(fCount)+','
    print '     "geometry" :'
    print '     {'
    print '       "type" : "Point",'
    print '       "coodinates" : ['
    print '         '+feature["Lat"]+', '+feature["Long"]
    print '          ]'
    print '     },'
    print '     "properties" : {'
    print '       "OBJECTID" : '+str(fCount)+','
    print '       "Name" : "'+feature["Name"]+'",'
    print '       "Lat" : "'+feature["Lat"]+'",'
    print '       "Long" : "'+feature["Long"]+'",'
    print '       "Heading" : "'+feature["Heading"]+'"'
    print '     }'
    if fCount != len(jData['features']):
        print '   },'
    else:
        print '   }'

    # this part would be exported to tab/csv file
    for k, v in feature['table']:
        print str(fCount)+ '\t' + k + '\t' + v
        

# close geojson
print '  ]'
print '}'
‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍

JamesCrandall · ‎03-31-2017

I appreciate it. However, I don't see where you are starting your input.json file from. In my OP above, that's what I'm starting with and I need to get it into the geoJSON format that you show!

Interestingly, I cannot seem to access individual attributes like you are showing either. I get this error when I try to access the "Name" attribute: TypeError: string indices must be integers, not str

But it validates in a JSON formatter tool. This is what I have so far that will convert a csv file into the JSON output I posted in my OP:

csvfile = open('H:\MyFile.csv', 'r')
reader = csv.DictReader(csvfile)
fieldnames = ('Name', 'Flow', 'Arrow Size', 'Date', 'Lat', 'Long', 'Heading', 'Color', 'Desciption', 'table')

output = []

for each in reader:
  row = {}
  for field in fieldnames:
    if not field == 'Coords':
        row[field] = each[field]    
  output.append(row)

jData = json.dumps(output)    
json.dump(output, jsonfile, indent=2, sort_keys=True)‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍

RandyBurton · ‎03-31-2017

I saved the json in your first message as a text file. In my code on line 7, I use json.loads to read the file into the jData variable. But as I am reading it, I appended '{ "features" : ' to the beginning and a closing '}' (with your json in the middle). This makes it a listing of features that one can easily loop through using "for feature in features".

I'll explore the code in your last message when I have some additional time.

I did make a modification on my code that uses dictionaries. It was a chance to do some exploring.

import json

filename = r'input.json'

with open(filename, 'r') as fp:
    # read json and add "features"
    jData = json.loads('{ "features" : ' + fp.read().decode("utf-8-sig").encode("utf-8") + '}') 

# dump the json data
# print json.dumps(jData, indent=4, sort_keys=False)

# start the geojson file
geo = {
    "type" : "FeatureCollection",
    "crs" : { "type" : "name",
              "properties" : {
                  "name" : "EPSG:4326"
                  }
              },
    "features" : None
    }

feature_list = []

# loop through json
fCount = 0
for feature in jData['features']:
    fCount +=1
    features = {
        "type" : "Feature",
        "id" : fCount,
        "geometry" : { "type" : "Point",
                       "coordinates" : [
                           float(feature["Lat"]), float(feature["Long"]) ]
                       },
        "properties" : { "OBJECTID" : fCount,
                         "Name" : feature["Name"],
                         "Lat" : float(feature["Lat"]),
                         "Long" : float(feature["Long"]),
                         "Heading" : int(feature["Heading"])
                         }
        }
    
    feature_list.append(features)

    # this part would be exported to tab/csv file
    # for k, v in feature['table']:
    #   print str(fCount)+ '\t' + k + '\t' + v
        

geo["features"] = feature_list

print json.dumps(geo)

# to write to a file
# with open('result.json', 'w') as fp:
#     json.dump(geo, fp)‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍

JamesCrandall · ‎04-03-2017

I get an error running your .json sample ValueError: Expecting : delimiter: line 1 column 14 (char 13)

filename = r'H:\input.json'
with open(filename, 'r') as fp:
    jData = json.loads('{ "features" " ' + fp.read().decode("utf-8-sig").encode("utf-8") + '}')
print json.dumps(jData, indent=4, sort_keys=False)‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍

This happens with the input.json sample you provided as well as the contents of the .json file I am working with:

[
  {
    "Arrow Size": "0.60", 
    "Color": "icon20", 
    "Date": "March 3, 2017", 
    "Desciption": "ff00ffff", 
    "Flow": "80", 
    "Heading": "315", 
    "Lat": "46.60", 
    "Long": "-50.44", 
    "Name": "Some Name1", 
    "table": "<TABLE>\n<tr><td>G310-P:</td><td style=\"text-align: right\">84 cfs; Mar 2, 17</td></tr>\n<tr><td>G251-P:</td><td style=\"text-align: right\">0 cfs; Mar 2, 17</td></tr>\n</TABLE><a href=\"http://my.sfwmd.gov/dbhydroGraph/servlet/DbhydroGraphServlet.do?v_report_type=format6&v_period=1week...\">Hydrograph</a>"
  },
  {
      "Arrow Size": "0.60", 
      "Color": "icon20", 
      "Date": "March 3, 2017", 
      "Desciption": "ff00ffff", 
      "Flow": "80", 
      "Heading": "315", 
      "Lat": "42.30", 
      "Long": "-53.42", 
      "Name": "Some Name2", 
      "table": "<TABLE>\n<tr><td>G310-P:</td><td style=\"text-align: right\">84 cfs; Mar 2, 17</td></tr>\n<tr><td>G251-P:</td><td style=\"text-align: right\">0 cfs; Mar 2, 17</td></tr>\n</TABLE><a href=\"http://my.sfwmd.gov/dbhydroGraph/servlet/DbhydroGraphServlet.do?v_report_type=format6&v_period=1week...\">Hydrograph</a>"
  },
  {
      "Arrow Size": "0.60", 
      "Color": "icon20", 
      "Date": "March 3, 2017", 
      "Desciption": "ff00ffff", 
      "Flow": "80", 
      "Heading": "315", 
      "Lat": "45.50", 
      "Long": "-55.44", 
      "Name": "Some Name3", 
      "table": "<TABLE>\n<tr><td>G310-P:</td><td style=\"text-align: right\">84 cfs; Mar 2, 17</td></tr>\n<tr><td>G251-P:</td><td style=\"text-align: right\">0 cfs; Mar 2, 17</td></tr>\n</TABLE><a href=\"http://my.sfwmd.gov/dbhydroGraph/servlet/DbhydroGraphServlet.do?v_report_type=format6&v_period=1week...\">Hydrograph</a>"
  }
]‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍