POST
Is it possible that the numpy.nonzero step creates an array with two columns/rows, such as an identifier column/row and then the actual value column/row? If so, that would make sense of my percentile breaks: my overall data set has 11226 total entries, which may explain why my percentiles look the way they do, as it would be percentile-ing the identifier column/row. If that is the case, is there a way to specify which column/row of the array to perform the percentile on? I am going to put a larger snippet of my code here to give a better sense of what it is I am trying to do (warning, lots to look at lol):

# Adds and calculates area for our new feature layer Reclass_NLCD_HUC12 for percent area calculations. After that we sum up the area with the case fields of HUC12, NLCD_Land, and gridcode.
# That output is a table of attributes that each unique HUC 12 watershed has a area for all 4 reclassified values, gridcode values 1 thru 4. We then sum that table again this time with the only case field being HUC12
# and that is so we can have an overall total landuse area per HUC12. Next is to join the total landuse area back into the NLCD_HUC12_Gridcode_Area table and rename some fields to better understand our data and clean it up.
# arcpy.management.AddGeometryAttributes("Reclass_NLCD_HUC12", "AREA_GEODESIC", None, "SQUARE_KILOMETERS", None)
arcpy.management.AddGeometryAttributes(Reclass_NLCD_HUC12, "AREA_GEODESIC", None, "SQUARE_KILOMETERS",
"GEOGCS['GCS_North_American_1983',DATUM['D_North_American_1983',SPHEROID['GRS_1980',6378137.0,298.257222101]],PRIMEM['Greenwich',0.0],UNIT['Degree',0.0174532925199433]]")
arcpy.analysis.Statistics(Reclass_NLCD_HUC12,
r"E:\2019 Grant Files\DatabaseProject_DataFolder\DatabaseFinal.gdb\NLCD_HUC12_Gridcode_Area", # Path Change #
"AREA_GEO SUM", "HUC_12;NLCD_Land_Cover_Class;gridcode")
NLCD_HUC12_Gridcode_Area = r"E:\2019 Grant Files\DatabaseProject_DataFolder\DatabaseFinal.gdb\NLCD_HUC12_Area" # Path Change #
arcpy.analysis.Statistics("NLCD_HUC12_Gridcode_Area",
r"E:\2019 Grant Files\DatabaseProject_DataFolder\DatabaseFinal.gdb\NLCD_HUC12_Area_Total", # Path Change #
"SUM_AREA_GEO SUM", "HUC_12")
NLCD_HUC12_Area_Total = r"E:\2019 Grant Files\DatabaseProject_DataFolder\DatabaseFinal.gdb\NLCD_HUC12_Area_Total" # Path Change #
arcpy.management.JoinField("NLCD_HUC12_Gridcode_Area", "HUC_12", "NLCD_HUC12_Area_Total", "HUC_12", "SUM_SUM_AREA_GEO")
arcpy.management.AlterField("NLCD_HUC12_Gridcode_Area", "SUM_AREA_GEO", "Landuse_Area", "Landuse_Area", "DOUBLE", 8,
"NULLABLE", False)
arcpy.management.AlterField("NLCD_HUC12_Gridcode_Area", "SUM_SUM_AREA_GEO", "Landuse_Area_Total", "Landuse_Area_Total",
"DOUBLE", 8, "NULLABLE", False)
# Gets a percent area of each NLCD reclassified value
arcpy.AddField_management("NLCD_HUC12_Gridcode_Area", "Percent_Area", "Double", 9,
field_alias="Percent_Area", field_is_nullable="NULLABLE")
arcpy.management.CalculateField("NLCD_HUC12_Gridcode_Area", "Percent_Area", "(!Landuse_Area!/!Landuse_Area_Total!)*100",
"PYTHON3", None)
# Creates an easy to edit and use HUC12 file copy so we do not damage the original file
arcpy.management.CopyFeatures('Huc12_layer',
r"E:\2019 Grant Files\DatabaseProject_DataFolder\DatabaseFinal.gdb\HUC12Rank", None, None, # Path Change #
None, None)
HUC12Rank = r"E:\2019 Grant Files\DatabaseProject_DataFolder\DatabaseFinal.gdb\HUC12Rank" # Path Change #
arcpy.MakeFeatureLayer_management(HUC12Rank, 'HUC12edit')
# Makes new tables based on a selection of the NLCD_HUC12_Gridcode_Area data table. These selections are going to be for Urban land use and then repeated for Agriculture landuse.
# These new tables will then be joined to our original HUC12 shapefile as new attributes for each HUC12 subwatershed.
arcpy.management.JoinField(HUC12Rank, "HUC_12", "NLCD_HUC12_Area_Total", "HUC_12", "Landuse_Area_Total")
arcpy.analysis.TableSelect("NLCD_HUC12_Gridcode_Area",
r"E:\2019 Grant Files\DatabaseProject_DataFolder\DatabaseFinal.gdb\NLCD_HUC12_Gridcode_Area_Urban", # Path Change #
"gridcode = 2 And Percent_Area IS NOT NULL")
arcpy.analysis.TableSelect("NLCD_HUC12_Gridcode_Area",
r"E:\2019 Grant Files\DatabaseProject_DataFolder\DatabaseFinal.gdb\NLCD_HUC12_Gridcode_Area_Agriculture", # Path Change #
"gridcode = 4 And Percent_Area IS NOT NULL")
arcpy.management.AlterField("NLCD_HUC12_Gridcode_Area_Urban", "Percent_Area", "Landuse_Area_Percent_Urban",
"Landuse_Area_Percent_Urban", "DOUBLE", 8, "NULLABLE", False)
arcpy.management.AlterField("NLCD_HUC12_Gridcode_Area_Agriculture", "Percent_Area", "Landuse_Area_Percent_Ag",
"Landuse_Area_Percent_Ag", "DOUBLE", 8, "NULLABLE", False)
arcpy.management.JoinField(HUC12Rank, "HUC_12", "NLCD_HUC12_Gridcode_Area_Urban", "HUC_12",
"Landuse_Area_Percent_Urban")
arcpy.management.JoinField(HUC12Rank, "HUC_12", "NLCD_HUC12_Gridcode_Area_Agriculture", "HUC_12",
"Landuse_Area_Percent_Ag")
# The now joined data from the Agriculture and Urban land use can now be given a rank with a similar process as the Flowlines earlier in the script
# Urban Ranking
# If a HUC 12 did not have any urban land use, e.g. middle of the Rockies, the join field process in the last section would result in a NULL
# in the attribute field. This gives us trouble when trying to use the SearchCursor and UpdateCursor, so first we need to change the NULLs into
# zeros.
arcpy.management.CalculateField(HUC12Rank, "Landuse_Area_Percent_Urban", "updatevalue(!Landuse_Area_Percent_Urban!)",
                                "PYTHON3",
                                """def updatevalue(value):
    if value is None:
        return 0
    else:
        return value""")
# Creates a new field for the Rank to be calculated into
arcpy.AddField_management(HUC12Rank, "Percent_Urban_Rank", "Double", 9,
field_alias="Percent_Urban_Rank", field_is_nullable="NULLABLE")
maximum_urbanlanduse = max(row[0] for row in arcpy.da.SearchCursor(HUC12Rank, ['Landuse_Area_Percent_Urban']))
print(maximum_urbanlanduse)
minimum_urbanlanduse = min(row[0] for row in arcpy.da.SearchCursor(HUC12Rank, ['Landuse_Area_Percent_Urban']))
print(minimum_urbanlanduse)
# This portion creates an array of the Landuse_Area_Percent_Urban attribute. The array is then searched for zeros, which are removed to get proper percentile breaks.
# The numpy percentile function is run for the 25th, 50th, and 75th percentiles, and those values are saved as variables to be called on once we start ranking. We use UpdateCursor
# again here to populate the Percent_Urban_Rank field. We give all 0 fields of percent land use a rank of 0, then compare the Landuse_Area_Percent_Urban to the
# percentile values we stored as p1 p2 p3.
array_urban = arcpy.da.FeatureClassToNumPyArray(HUC12Rank, 'Landuse_Area_Percent_Urban')
array_urbanview = array_urban.view(dtype=numpy.double)
array_urban_nonzero = numpy.nonzero(array_urbanview)
print(array_urban_nonzero)
p1urban = numpy.percentile(array_urban_nonzero, 25)
print(p1urban)
p2urban = numpy.percentile(array_urban_nonzero, 50)
print(p2urban)
p3urban = numpy.percentile(array_urban_nonzero, 75)
print(p3urban)
with arcpy.da.UpdateCursor(HUC12Rank, ['Percent_Urban_Rank', 'Landuse_Area_Percent_Urban']) as cursor:
    for row in cursor:
        if row[1] == 0:
            row[0] = 0
        elif minimum_urbanlanduse < row[1] <= p1urban:
            row[0] = 1
        elif p1urban < row[1] <= p2urban:
            row[0] = 2
        elif p2urban < row[1] <= p3urban:
            row[0] = 3
        elif row[1] > p3urban:
            row[0] = 4
        else:
            row[0] = 0
        cursor.updateRow(row)
print('Done w Urban Percentile Rank')

The result of the array print at line 80 is as follows: (array([ 0, 1, 2, ..., 11223, 11224, 11225], dtype=int64),). The array seems to not be holding the percent area for each subwatershed but rather just an identifier value. And as to your previous suggestion, numpy.nan, what line would I insert that function into? Full print out for this section of code:

97.859185942202
0.0
(array([ 0, 1, 2, ..., 11223, 11224, 11225], dtype=int64),)
3435.5
6312.0
8775.5
Done w Urban Percentile Rank

Sorry for the trouble lol
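For what it's worth, the printout above is exactly what numpy.nonzero is documented to return: a tuple of index arrays, not the values themselves. A tiny stand-alone sketch (with made-up values, not the actual field) shows the difference and one way to take the percentile of only the nonzero values:

```python
import numpy as np

# Hypothetical stand-in for the Landuse_Area_Percent_Urban column.
vals = np.array([0.0, 12.5, 0.0, 40.0, 97.9])

# numpy.nonzero returns WHERE the nonzero entries are (row indices),
# not the entries themselves -- hence the 0..11225 identifier-like array.
idx = np.nonzero(vals)
print(idx)  # (array([1, 3, 4]),)

# Index back into the array (or use a boolean mask) to get the values:
nonzero_vals = vals[vals != 0]
print(np.percentile(nonzero_vals, 50))  # 40.0
```

Taking numpy.percentile of the index tuple explains why the breaks look like record counts rather than percents.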
10-15-2019 12:20 PM

POST
Here is my updated code and the new issue I am getting. All of my variables are properly assigned at the beginning of the script, so it's not an issue with that, I don't think.

# Creates a new field for the Rank to be calculated into
arcpy.AddField_management(HUC12Rank, "Percent_Urban_Rank", "Double", 9,
field_alias="Percent_Urban_Rank", field_is_nullable="NULLABLE")
maximum_urbanlanduse = max(row[0] for row in arcpy.da.SearchCursor(HUC12Rank, ['Landuse_Area_Percent_Urban']))
print(maximum_urbanlanduse)
minimum_urbanlanduse = min(row[0] for row in arcpy.da.SearchCursor(HUC12Rank, ['Landuse_Area_Percent_Urban']))
print(minimum_urbanlanduse)
# This portion creates an array of the Landuse_Area_Percent_Urban attribute. The array is then searched for zeros, which are removed to get proper percentile breaks.
# The numpy percentile function is run for the 25th, 50th, and 75th percentiles, and those values are saved as variables to be called on once we start ranking. We use UpdateCursor
# again here to populate the Percent_Urban_Rank field. We give all 0 fields of percent land use a rank of 0, then compare the Landuse_Area_Percent_Urban to the
# percentile values we stored as p1 p2 p3.
array_urban = arcpy.da.FeatureClassToNumPyArray(HUC12Rank, 'Landuse_Area_Percent_Urban')
array_urbanview = array_urban.view(dtype=numpy.double)
array_urban_nonzero = numpy.nonzero(array_urbanview)
p1urban = numpy.percentile(array_urban_nonzero, 25)
print(p1urban)
p2urban = numpy.percentile(array_urban_nonzero, 50)
print(p2urban)
p3urban = numpy.percentile(array_urban_nonzero, 75)
print(p3urban)
with arcpy.da.UpdateCursor(HUC12Rank, ['Percent_Urban_Rank', 'Landuse_Area_Percent_Urban']) as cursor:
    for row in cursor:
        if row[1] == 0:
            row[0] = 0
        elif minimum_urbanlanduse < row[1] <= p1urban:
            row[0] = 1
        elif p1urban < row[1] <= p2urban:
            row[0] = 2
        elif p2urban < row[1] <= p3urban:
            row[0] = 3
        elif row[1] > p3urban:
            row[0] = 4
        else:
            row[0] = 0
        cursor.updateRow(row)
print('Done w Urban Percentile Rank')

My results of the print outs are:

97.859185942202
0.0
3435.5
6312.0
8775.5
Done w Urban Percentile Rank

There shouldn't be any value over 100 since these are percents (max urban landuse is 97.859185942202, minimum 0), but the percentiles are all off. However, before adding the non-zero portion of the script, I was getting much nicer percentile breaks.
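A minimal sketch of why the breaks came out in the thousands: the percentile ends up being taken over numpy.nonzero's index tuple. Pulling the field out of the structured array and masking the zeros keeps the breaks in percent units (the field name and values below are illustrative, not the real data):

```python
import numpy as np

# Illustrative stand-in for what FeatureClassToNumPyArray returns:
# a structured array with one float64 field.
array_urban = np.array([(0.0,), (3.1,), (12.5,), (40.0,), (97.9,)],
                       dtype=[('Landuse_Area_Percent_Urban', 'f8')])

vals = array_urban['Landuse_Area_Percent_Urban']  # plain float array
vals = vals[vals > 0]                             # drop zeros before the breaks

p1urban, p2urban, p3urban = np.percentile(vals, [25, 50, 75])
print(p1urban, p2urban, p3urban)  # all stay within 0..100
```

With real data the three breaks should then sit between the printed minimum and maximum instead of in the thousands.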
10-11-2019 08:43 AM

POST
Also, one more thing: is there a way, while doing the percentile, to ignore 0 values? Currently one of the attributes I am ranking by percentile skips rank 1 altogether. I guess the p1 percentile cutoff is a value extremely close to 0 but still less than the next highest value of the attribute I am trying to rank. So all my 0 values get a rank of 0, and then the first non-zero value, 1.32783428022391E-08, falls within the 2nd percentile group p2 and gets a rank of 2.
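One common approach, sketched below with made-up numbers, is to compute the breaks from the nonzero values only; the tiny-but-nonzero entries then fall inside the first quartile and rank 1 is no longer skipped:

```python
import numpy as np

# Made-up values; note the tiny nonzero entry that was skipping rank 1.
vals = np.array([0.0, 0.0, 1.32e-08, 0.5, 2.0, 7.5, 30.0])

nonzero = vals[vals > 0]                   # breaks from nonzero values only
p1, p2, p3 = np.percentile(nonzero, [25, 50, 75])

def rank(v):
    if v == 0:
        return 0
    elif v <= p1:
        return 1   # the tiny value now lands here
    elif v <= p2:
        return 2
    elif v <= p3:
        return 3
    else:
        return 4

print([rank(v) for v in vals])  # [0, 0, 1, 1, 2, 3, 4]
```

The same rank() logic maps directly onto the UpdateCursor if/elif ladder already in the script.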
10-10-2019 01:15 PM

POST
Awesome, this worked! Just as an aside (I fully understand why it's not as exciting haha), why is this "unsafe"? Thank you again for your help Dan and Joe.
10-10-2019 07:21 AM

POST
Sorry about the formatting; see the above reply for the proper use of the syntax highlighter.

1. I do not have any NoData; every entry has a value or a 0.

2. When I try to from numpy.lib.recfunctions import structured_to_unstructured as stu, PyCharm says it cannot find reference 'structured_to_unstructured' in 'recfunctions.py'.
10-09-2019 02:30 PM

POST
Sorry, I had not used the syntax highlighter before; here is what that shows (the indents are fine in PyCharm). The error I'm receiving is on line 6.

maximum = max(row[0] for row in arcpy.da.SearchCursor(NHDFlowline_HUC12, ['Normalized_Linear']))
print(maximum)
minimum = min(row[0] for row in arcpy.da.SearchCursor(NHDFlowline_HUC12, ['Normalized_Linear']))
print(minimum)
arr = arcpy.da.FeatureClassToNumPyArray(NHDFlowline_HUC12, ('Normalized_Linear'))
p1 = np.percentile(arr, 25)
p2 = np.percentile(arr, 50)
p3 = np.percentile(arr, 75)
p4 = np.percentile(arr, 100)
with arcpy.da.UpdateCursor(NHDFlowline_HUC12, ['Linear_Rank', 'Normalized_Linear']) as cursor:
    for row in cursor:
        if minimum <= row[1] <= p1:
            row[0] = 1
        elif p1 < row[1] <= p2:
            row[0] = 2
        elif p2 < row[1] <= p3:
            row[0] = 3
        elif row[1] > p3:
            row[0] = 4
        cursor.updateRow(row)
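In case it helps with debugging: the "invalid type promotion" here typically comes from FeatureClassToNumPyArray returning a structured array (an array with named fields), which np.percentile cannot reduce directly. A minimal sketch of the fix, with an illustrative stand-in array:

```python
import numpy as np

# Structured array standing in for the FeatureClassToNumPyArray result.
arr = np.array([(0.2,), (0.4,), (0.6,), (0.8,)],
               dtype=[('Normalized_Linear', 'f8')])

# np.percentile(arr, 25) on the structured dtype fails; extracting the
# named field first yields a plain 1-D float64 array that percentile accepts.
vals = arr['Normalized_Linear']
p1 = np.percentile(vals, 25)
print(p1)  # 0.35
```

The same `arr['Normalized_Linear']` extraction would slot in right before the p1..p4 lines in the posted script.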
10-09-2019 01:51 PM

POST
Hi all, I'm running into an error and I'm not sure why when trying to rank an attribute field of a shapefile. I have a polyline shapefile of some streams with an attribute field of normalized data of type 'Double', and I am trying to rank these values by quartile and store their rank in another attribute field. I know that you can use Symbology > Graduated Colors > Method: Quantile with 4 classes, and it displays my data correctly. However, I need an attribute field with a rank to be able to use the data down the line. I have been using Python for my processes so far, but I am currently running into an error and can't really find an answer online anywhere else. Here is a sample of my code:

maximum = max(row[0] for row in arcpy.da.SearchCursor(NHDFlowline_HUC12, ['Normalized_Linear']))
print(maximum)
minimum = min(row[0] for row in arcpy.da.SearchCursor(NHDFlowline_HUC12, ['Normalized_Linear']))
print(minimum)
arr = arcpy.da.FeatureClassToNumPyArray(NHDFlowline_HUC12, ('Normalized_Linear'))
p1 = np.percentile(arr, 25)
p2 = np.percentile(arr, 50)
p3 = np.percentile(arr, 75)
p4 = np.percentile(arr, 100)
with arcpy.da.UpdateCursor(NHDFlowline_HUC12, ['Linear_Rank', 'Normalized_Linear']) as cursor:
    for row in cursor:
        if minimum <= row[1] <= p1:
            row[0] = 1
        elif p1 < row[1] <= p2:
            row[0] = 2
        elif p2 < row[1] <= p3:
            row[0] = 3
        elif row[1] > p3:
            row[0] = 4
        cursor.updateRow(row)

The first step should store a max and min value for the normalized data attribute and then create an array containing the values of my shapefile's attribute field 'Normalized_Linear'. The next steps assign values to p1 thru p4 as the quartile breaks and then use UpdateCursor to store the rank. The resulting error is:

Traceback (most recent call last):
  File "script path", line 143, in <module>
    p1 = np.percentile(arr, 25, axis=None, out=None, overwrite_input=False, interpolation='linear', keepdims=False)
  File "C:\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\lib\site-packages\numpy\lib\function_base.py", line 4269, in percentile
    interpolation=interpolation)
  File "C:\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\lib\site-packages\numpy\lib\function_base.py", line 4011, in _ureduce
    r = func(a, **kwargs)
  File "C:\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\lib\site-packages\numpy\lib\function_base.py", line 4386, in _percentile
    x1 = take(ap, indices_below, axis=axis) * weights_below
TypeError: invalid type promotion

I am unsure of how to go about fixing this TypeError: invalid type promotion. I feel like it may have something to do with the data type being Double, and if so, I would like to know how to work around this. Any help would be much appreciated.
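As a side note, once the breaks are computed, the whole if/elif ladder can also be expressed in NumPy alone. A sketch with toy values (np.digitize against the three interior breaks reproduces ranks 1 through 4):

```python
import numpy as np

vals = np.array([0.1, 0.3, 0.5, 0.7, 0.9])  # toy normalized values
p1, p2, p3 = np.percentile(vals, [25, 50, 75])

# right=True makes the bins (.., p1], (p1, p2], (p2, p3], (p3, ..),
# mirroring the <= comparisons in the cursor loop; +1 shifts to ranks 1-4.
ranks = np.digitize(vals, [p1, p2, p3], right=True) + 1
print(ranks.tolist())  # [1, 1, 2, 3, 4]
```

The cursor loop is still needed to write the ranks back to the feature class; this only replaces the per-row comparison logic.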
10-09-2019 12:47 PM

POST
Hi all, Background information on my project: I calculated a slope raster from a DEM mosaic. I also have a cross section polyline that has segments with different attributes (i.e. Channel, Marsh, Tidal Channel, etc.), and I want to be able to see what the slope is at these different geomorphic features to then see how these slopes change as we go upstream. Anyway, I have converted the polyline to a raster to get it to the same dataset type as my slope raster. My next step logically is to 'lift' the values from my slope raster and get them onto my cross section line raster cells, so the cross section line then has a feature type value and a slope value. I have also thought about turning the slope raster into a polygon and running an intersect with the polyline to get a value associated with the features that way, but when I use the Raster to Polygon tool, my slope raster isn't available in the drop-down list. Any help is appreciated. Cheers, gmpalmer_geo92
10-03-2019 09:10 AM

POST
So I am trying to complete a set of geoprocessing tasks on some large scale data for several states. Right now I am stuck on the last little stretch before my final rankings of this data. I have a column in an attribute table of a shapefile of normalized data called "Normalized_Linear". I want to take the min and max values for the entire attribute field "Normalized_Linear" and apply those numbers to two entire temporary attribute fields (Normalized_MIN and Normalized_MAX) for later calculations, but I am not sure how to go about this. Part of me wants to just use the Field Calculator with a simple quick little code block, but I am not aware of how to pull this off: the min function in Calculate Field takes the minimum value of multiple fields (columns) for one row, not the overall min or max of the attribute column. My next thought is to use Statistics to find the min and max values for my Normalized_Linear attribute field, but this creates a standalone table of one row with columns of max and min. Is there a way I can populate the temporary Normalized_MIN and Normalized_MAX automatically based off the value of another table's cell?
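The usual pattern is to compute the overall min and max once in Python and then write each into its new field as a constant (e.g. passing the literal value to a field calculation). A library-free sketch of that logic, with illustrative numbers standing in for the column:

```python
# Values standing in for the Normalized_Linear column.
normalized = [0.2, 0.9, 0.4, 0.7]

norm_min = min(normalized)
norm_max = max(normalized)

# Every row of the temporary fields gets the same overall statistic,
# which is exactly what calculating a field to a constant achieves.
normalized_min_col = [norm_min] * len(normalized)
normalized_max_col = [norm_max] * len(normalized)
print(normalized_min_col)  # [0.2, 0.2, 0.2, 0.2]
print(normalized_max_col)  # [0.9, 0.9, 0.9, 0.9]
```

In ArcPy terms the min/max could come from a SearchCursor pass over the field, with the resulting constants then supplied to CalculateField, so no table-to-table cell lookup is needed.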
09-24-2019 11:31 AM

POST
Hi all, In ArcGIS Pro I am working with a large high resolution raster dataset, specifically the NLCD 2016 dataset for about 5 states. I am also working with a polygon shapefile of subwatersheds (~11,200 unique) for the same states as the raster dataset. My goal is to find the area of my reclassified NLCD data as a percent of the total area per subwatershed. My subwatershed polygon shapefile already has an area attribute calculated; I have that part down. So my next step is to calculate the area of Agriculture (4), Forest (3), Urban (2), and Other (1) cells in the raster. I decided to find the count of each raster value for each unique subwatershed using Zonal Histogram. However, this process made the table a bit wonky: the ~11,200 subwatershed ID values are the columns and the raster value counts are the rows, so I have a table that is 4 rows by ~11,200 columns instead of 4 columns by ~11,200 rows, which is how the polygon shapefile originally has the subwatershed IDs. When I try to use Transpose Fields to swap the rows and columns, the program just stalls out. It does not crash or say failed; it just sits there trying to call the Zonal Histogram table. I have also tried to export as an Excel table, but the program limits the number of columns you can export, which I have far exceeded, because it will only allow me to export in .xls. I have also tried Table to dBASE and the same thing happens: it just stalls out. During these stalls I cannot do anything and have to close the program via Task Manager. I would really rather not break up the data sets at all if I absolutely do not have to. Any help or direction would be excellent and much appreciated.
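If exporting the table to plain text (e.g. as CSV) is an option, the transpose itself is trivial outside ArcGIS. A sketch with a tiny made-up Zonal Histogram layout, just to show the flip:

```python
# Wide layout: one column per subwatershed, one row per raster class.
# IDs and counts are made up for illustration.
wide = [
    ["HUC_A", "HUC_B", "HUC_C"],  # subwatershed IDs (header row)
    [10, 20, 30],                 # class 1 counts
    [1, 2, 3],                    # class 2 counts
    [5, 6, 7],                    # class 3 counts
]

# zip(*rows) flips it to one row per subwatershed, one column per class.
tall = [list(row) for row in zip(*wide)]
print(tall)  # [['HUC_A', 10, 1, 5], ['HUC_B', 20, 2, 6], ['HUC_C', 30, 3, 7]]
```

The same zip(*rows) idiom works on the ~11,200-column table read with the csv module, and the result can be written back out and joined to the polygon layer.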
09-04-2019 12:39 PM

POST
Excellent, I was able to do what I needed with your help so far with this linear dataset and the polygon subwatershed. Now I must perform a similar analysis using the same HUC 12 polygon file, but this time also using a raster dataset, specifically the NLCD (National Land Cover Database) 2016. I have already masked it down to the extent of the HUC12 polygon shapefile and reclassified the values to the ones required of me. I'm not sure this will fall under the same Python stuff I was thinking of doing before, but maybe you can still help me out. So, there are 3 classes being displayed in the raster data set, and I must find the % area of 'Forest', 'Agriculture', 'Urban', and 'Other' within each unique subwatershed (polygon dataset). Each cell has a 30m by 30m size, so if I could just find the count of each different value in the polygon then it could be done.
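Since every NLCD cell is 30 m by 30 m, the per-class cell counts convert directly to percent area; the cell size even cancels out of the percentage. A sketch with made-up counts for one subwatershed:

```python
# Made-up per-subwatershed cell counts, e.g. from a zonal count of the
# reclassified raster. Class names mirror the four reclassified values.
counts = {"Forest": 700, "Agriculture": 200, "Urban": 80, "Other": 20}

cell_area_m2 = 30 * 30  # 900 m^2 per cell; cancels out in the percentage
total_cells = sum(counts.values())
percent_area = {cls: 100.0 * n / total_cells for cls, n in counts.items()}
print(percent_area["Forest"])  # 70.0
```

Multiplying a count by cell_area_m2 gives absolute area in m^2 if that is needed instead of percents.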
09-03-2019 12:56 PM