Latest Contributions by ColineByczek

‎03-10-2016

Hello, Thanks you both for your help. I did not delete intermediary results within the model but their names are not indexed through the loop. The geoprocessing environment is set to overwrite, and I can see the files been overwritten within Windows explorer and the .lock files pop in and out. I do not use layer views or table views within the model, however I did not delete the feature class layer which 'Iterate Feature Selection' directly outputs. Could that be the origin of the problem? I'll try deleting that layer after each run. There was no memory release in the task manager between interations, however between crashes there was. I am under arcgis 10.2 so 64-bit background processing is not available. I'll consider upgrading to 10.3. But I do not run the models in background processing because I want to be able to follow the progress. My computer is quite powerful: 14 CPUs and 48Go of memory. Makes up for my lack of efficiency... I know that the task is huge but it was unavoidable. I am calculating scenic beauty within viewshed from each open point within a 5,000km² zone, at a 200m resolution. I could have decreased the resolution but I'd rather learn how to solve computation problems. Each run is 7s long when starting the loop, which seemed fine. In the end, I could perform all calculations within 24 hours of run through 10 parallel sessions, and only one crash. But I'd rather not make the same mistakes twice... If you have other recommendations, thanks again. Coline

‎03-09-2016

Hi, I am currently trying to perform a model from model builder over 600,000 feature classes in arcgis 10.2 (perhaps 10.2.1 or 10.2.2) In this model I use these tools (several times): viewshed analysis, raster calculator, extract by attributes, extract by mask, buffer, and erase. To speed up the calculation, I'd like to use parallel processing. But because of urgent deadlines I cannot afford to learn about the correct way to perform it. I've seen that not all tools can be used in the parallel processing environment and honestly, I don't have the time to check all the tools I use or find alternatives and of course learn to debug the final model with parallelisation in it. Instead, I was told to split the work and run it in several arcgis windows opened at a same time. This works actually pretty well. But after a time, if I check the Windows Task Manager, the memory use will increase a lot and at the same time the calculation speed will decrease. Eventually the model crash and I have to start over where it ended. My question is: does anyone no where precisely this comes from? Or more importantly, how to solve the problem. I run the models from arctoolbox, not from model builder, so intermediate data should be erased at each run of the loop. Thanks in advance for the help! Coline

‎02-24-2015

Hello everyone, Time for a little feedback I think. First of all, thanks to both of you for the incredible help. And more, thank you for the invaluable good practice tips, I had no idea of where to find such advice, still not but I'm keeping all you said in mind. As for answering your questions quickly: 1) I am using arcgis 10.2.2 and had no idea there was a new cursor: I'll look at the correct version web help now. 2) The variable 'contour' is the shapefile "perimeter.shp" descibing the study area form, which I forgot to rename in the process of trying to make the code a bit clearer for posting. 5) "VALUE > 180002": this was the value at which the precedent run had crashed, so I extracted the points that remained to be processed before launching a new run. Dirty, but in 5-10 times I had them all. Still, I hope I'll learn enough to avoid that problem another time. 3), 4) and 6): thanks for the advice As I was really running out of time and courage, I have implemented the last part of the code in ModelBuilder, and I don't think I'll be able of doing it in Arcpy right now. So I'm posting the result in order for others to look at if they need ideas. It aims at extracting the pixel distribution histogram for a great number of zones. In the picture below you can see how it works: In the background the land cover map, resolution 15m x 15m: I wanted to extract the land cover in a 150m radius around each pixel. For this I computed discs covering that area, and to go faster I made fishnets made of disks instead of single discs As each disks covers an area of 21 x 21 = 441 pixels, and Zonal Histogram won't handle overlapping features, I had to make as many disk-fishnets to be able to cover the whole number of pixels. (two fishnets are shown in the picture) All this runs in the arcpy code provided some patience and re-starting the code after crashes. Then I could extract the land cover distribution for each disc-fishnet, in model builder using a double loop (i.e. two nested models). The 441 fishnet files are stored in the file zones.gdb. They have 3 relevant fields: 1) "orig_id" gives a unique value for each disk in order to delimit the zones for the "Zonal histogram" tool. 2) "Id2" = Int(FID/1000) +1 allows to group disks in 46 lots of 1,000. 3) A third field of 0 and 1 allowed to filter out disks which have their centers outside the land cover raster The model has basically 3 steps: 4) Main model: for each feature class (fishnet), I convert the unique-value-per-disk field, "orig_id", to raster. 5) Nested model: another loop extracts a fragment of this raster by groups of 1,000 using the ID2 field. For each value of Id2, I make a feature layer of only the features with Id2 = for-loop- increment*, and extract the fragment from the whole fishnet fishnet with "extract by attributes". 6) Nested model: each fragment of orig_id.img is used as a zone field for zonal histogram. Main loop: Nested loop: Here are the main problems I encountered and the work arounds I found: 2) Zonal histogram won't run over more than 1,000 features at a time, so I needed to split the job. the "+1" is because in the nested model, which is a for loop, iteration over the value 0 always fails, (Error message: "the precondition is false" and then in the downloop steps "the inputs are not current") so I looped over 0:47 instead of 0:46 and the loop bug in 0 does not count anymore. 3) if you leave them the Zonal Histogram tool crashes completely. 4) Back when I was trying arcpy and using feature classes as zone input for Zonal Histogram, it would automatically convert them to raster and automatically name them, with no workaround. They piled up in the workspace and made the loop eventually crash. By converting the fragment to raster this is avoided. 5) "Zonal histogram" will ignore an output of "select layer by attributes", so "extract" is compulsory. Also, the SQL query within "make feature layer" does not work so I had to use both this and "select layer by attributes". All in all, the loop takes 1 hour to loop over 1 fishnet. Limiting steps are 'extract by mask' (15s) and 'zonal histogram' (1mn10s). The expected calculation time for the 441 fishnets will be 18 days, but with parallelisation in 10 simultaneous runs this can be brought down to 2 days. And this is better than the 2 years with which I started I just hope the loop will be stable over 40 feature classes.

‎02-17-2015

Hello and thank you for the suggestion! Yes I saw that tool. But weirdly this isn't really a transpose tool, as it does not switch rows and columns but only converts everything in a 3-column matrix. Using that tool would allow me to convert the Zonal Histogram output table (10rows x 1000columns) into a ~5x1,000,000 table, but not to perform Zonal Histogram on more than 1,000 features at the time, so I did not retain that option. Newsfeed: I launched the Zonal Histogram script last night and it ran over 5 feature classes only before collapsing (45 iterations per feature class). But ArcGIS did not collapse itslef so I got a clear error message: Error 999998 Not enough memory. I think I'm going nowhere without solving that memory problem. So these are the options I am going to explore right now... any help will be greatly appreciated! Try to handle the scratch.workspace thing, or the current.workspace. I never used it or understood how it works. In ModelBuilder the intermediate data is automatically deleted but I couldn't find a tutorial of how to obtain that behaviour in arcpy. Also, I read that one can run arcpy scripts directly in python without opening arcgis. This looks thrilling but I'll have to find out how to. Thanks for reading and helping! Coline

‎02-16-2015

Hello again! So this is what I managed to write after 2 weeks of trial-and-error. Basically, I learned Python by writing this script, so it must be really dirty but I tried my best, tried to clarify the process and pointed out the unsolved problems in the comments. Now, I am currently running the last loop in order to retrieve 441 x 45 histogram tables... Unfortunately Zonal Histogram will output a table with the 12 classes as rows and the 1,000 zones as columns, and I haven't the slightest idea of how to transpose and process these tables in python. I might do it in R. Anyway, thanks for the help and counseling! Coline # set environment import arcpy, os from arcpy import env from arcpy.sa import * arcpy.CheckOutExtension("Spatial") arcpy.env.snapRaster = "C:\\Users\\admin\\Desktop\\ArcGIS\\Inputs\\grid.img" ws = "C:\\Users\\admin\\Desktop\\ArcGIS\\biodiversity\\window" env.outputCoordinateSystem = arcpy.SpatialReference(2154) ####################################################################################################################################### # Build an Index raster # Here we build a raster of 15m-wide pixel covering the entire area and each pixel with unique value # I found the trick either on this site or GISstackoverflow.. can't find it again. # Something bugs at the Raster Calculator step so the script has to be run in two times, see line 28 perimeter = ws + "\\perimeter.shp" base = arcpy.MinimumBoundingGeometry_management(contour, ws + "\\extent.shp", "ENVELOPE", "ALL", "Id") base_raster = arcpy.PolygonToRaster_conversion(base, "FID", ws + "\\base.img", "CELL_CENTER", "NONE", 15) base1 = Raster(base_raster)*0 + 1 flow1 = Int(FlowAccumulation(base1)) base64 = Raster(base_raster) * 0 + 64 flow64 = Int(FlowAccumulation(base64)) arcpy.BuildRasterAttributeTable_management(flow1, "Overwrite") arcpy.BuildRasterAttributeTable_management(flow64, "Overwrite") # The following lines cannot be copy-pasted in arcpy window directly after the others or a bug happens. Just do it in two steps rastersum = Raster("flow1") * 10000 + Raster("flow64") rasterid = arcpy.CopyRaster_management(rastersum, ws + "\\rasterID.img") ####################################################################################################################################### # Build Fishnets # Instead of coding a single moving window which would run 20 million times (> 2 years calculation time) # I created fishnet-shapefiles of non-overlapping disks of radius 150m (~45,000 disks in one disk-fishnet) # In order to get disks masks for all 15m pixels, and as a single disks covers a square of 21x21 pixel, # 21x21 = 441 "disk-fishnets" area created, using pixels in the "origin.img" raster, # In this raster you can find the 441 X origins and 441 corresponding pixels giving the Y-axis direction #(used in the arcpy.CreateFishnet_management function) # The loops correctly over ~3,000 iterations and then ArcGIS just shuts down. # Is this a memory problem? I couldn't find a way to set proper names to temp files in the ExtractByMask functions, # I think they pile up and ArcGIS runs out of memory. # This is why I had to start the loop again at the point (line 64) where it had died. A pain. # Also, I tried to parallelise the calculation by running ArcGIS in separate windows (works fine usually in ModelBuilder) # But here it does not work, I think it comes from the ExtractByMask temp files which cannot be named # and then conflict at some point, killing the run. # I tried to set different arcpy.env.workspaces for each running ArcGIS window but failed at it. origin = ws + "\\origin.csv" fishnet = arcpy.TableToTable_conversion(origin, ws, "fishnet.dbf") desc = arcpy.Describe(contour) fishnet1 = arcpy.CreateFishnet_management("fishnet1.shp", str(desc.extent.lowerLeft), str(desc.extent.upperLeft), 315, 315, 2, 1, "# ", "NO_LABELS", "# ", "POLYGON") origin_pg = arcpy.Select_analysis(fishnet1, ws + "\\origin_pg.shp","\"FID\" = 0") yaxis_pg = arcpy.Select_analysis(fishnet1, ws + "\\yaxis_pg.shp","\"FID\" = 1") extract = ExtractByMask(rasterID,origin_pg) extract.save(ws + "\\origins.img") extract2 = ExtractByMask(rasterID,yaxis_pg) extract2.save(ws + "\\yaxis.img") # Here is where the loop has to be re-started after ArcgiS goes down. wst = "C:\\Users\\admin\\Desktop\\ArcGIS\\biodiversity\\table" origins = ws + "\\origins.img" extract4 = ExtractByAttributes(origins, "VALUE > 180002") extract4.save (ws + "\\origins4.img") origins4 = ws + "\\origins4.img" contour = ws + "\\contour.shp" arcpy.env.addOutputsToMap = False cursor = arcpy.SearchCursor(origins4) for row in cursor: print(str(row.getValue("Value"))) X = ExtractByAttributes(ws + "\\origins.img", "Value = " + str(row.getValue("Value"))) ptX = arcpy.RasterToPoint_conversion(X,"ptX.shp","Value") rX = arcpy.PointToRaster_conversion(ptX, "POINTID", ws + "\\temp","COUNT","POINTID",15) descX = arcpy.Describe(rX) Y = ExtractByAttributes(ws + "\\yaxis.img", "Value = 21 + " + str(row.getValue("Value"))) ptY = arcpy.RasterToPoint_conversion(Y,"ptY.shp","Value") rY = arcpy.PointToRaster_conversion(ptY, "POINTID", ws + "\\temp","COUNT","POINTID",15) descY = arcpy.Describe(rY) fishnet = arcpy.CreateFishnet_management(ws + "\\fishnet1.shp", str(descX.extent.lowerLeft), str(descY.extent.lowerLeft), 315, 315, 300, 330, "# ", "NO_LABELS", "# ", "POLYGON") fishpoint = arcpy.FeatureToPoint_management(fishnet, ws + "\\fishpoint.shp") fishbuff = arcpy.Buffer_analysis(fishpoint, "fishbuff.shp", 150, "FULL", "ROUND") contour_lyr = arcpy.MakeFeatureLayer_management(contour, "contour_lyr") fishbuff_lyr = arcpy.MakeFeatureLayer_management(fishbuff, "fishbuff_lyr") arcpy.SelectLayerByLocation_management(fishbuff_lyr, "# ",contour_lyr) arcpy.FeatureClassToFeatureClass_conversion(fishbuff_lyr, wst, "zones" + str(row.getValue("Value")) + ".dbf") ####################################################################################################################################### # Zonal histograms # Here I try to extract the land cover distribution within each disk-mask # Zonal histogram does not work for disks which have their centroids outside the value map # (not to mention that they are useless) so I had to unselect them before. # The tool won't run for more than 1,000 features at a time, to the disk-fishnets have to be split again # It appears that it was a bug in arcgis 10.0, as stated in the following links, but normally they should have been solved in 10.3 - which I'm using. # While reading the forum discussions about bugs in this tool, I learned that the tool used to write nonsense in the output tables. # http://gis.stackexchange.com/questions/33502/zonal-histogram-fails-to-create-output-table-when-looping # https://community.esri.com/thread/50764 # Does anyone know about this and could give some counsel about it? # Some people bypassed the issue by doing Extract By mask instead of Zonal Histogram, # But this would lead me back to a single moving window/disk, which is undoable. # Or perhaps I could use a shapefile landcover and "Intersect" + "Calculate field area", but I doubt it would work well. wsf = "C:\\Users\\admin\\Desktop\\ArcGIS\\biodiversity\\table2" wsh = "C:\\Users\\admin\\Desktop\\ArcGIS\\biodiversity\\hist" landcover = ws + "\\landcover.img" contour = ws + "\\contour.shp" env.workspace = wsf list = arcpy.ListFeatureClasses("zone*") for i in list: print(i) s = str(i) namei = s.split(".")[0] arcpy.CalculateField_management(i, "Id", '!FID!/1000', "PYTHON") arcpy.MakeFeatureLayer_management(i, "zone") arcpy.MakeFeatureLayer_management(contour, "contour") arcpy.SelectLayerByLocation_management("zone", "HAVE_THEIR_CENTER_IN", "contour") arcpy.CalculateField_management("zone", "BUFF_DIST", '0', "PYTHON") arcpy.SelectLayerByAttribute_management("zone", "CLEAR_SELECTION") for j in (0,1,2): print(j) mask = arcpy.Select_analysis(i, wsf + "\\mask.shp", "\"Id\" = " + str(j) + "AND \"BUFF_DIST\" = 0") # arcpy.MakeFeatureLayer_management(i, "masklyr", "\"Id\" = " + str(j) + "AND \"BUFF_DIST\" = 0") ZonalHistogram(mask, "FID", landcover, "hist_" + str(namei) + "_" + str(j) + ".dbf") Rlist = arcpy.ListRasters("t*") for k in Rlist: arcpy.Delete_management(k)

‎02-03-2015

Hello, Thank you for your quick answer and advice. Iterating by land use class is a brilliant idea, thank you so much for this! And I did not know about Focal Statistics, so this is a double win. I don't know how long it will take to implement all this so I wanted to say thank you for now; if my final code is clean enough in the end I'll post it for further help to other users.

‎02-03-2015

Hello to everyone, I am trying to create a habitat map based on a land cover map, and as it seems a big fish for me I prefer to ask for advice beforehand, and I thank very mcuh anyone who'd be willing to help ! I have a land use map at 15x15m with ~ 10 land uses, and a habitat classification based on land cover distribution around a point. ex: Habitat "3" = Mosaic cropland (50-70%) / vegetation (grassland/shrubland/forest) (20-50%) I would like to perform the following steps, for each pixel in the land use map: 1) get a pixel count of each of the land uses within a 150m radius 2) confront the land use distribution with my habitat classification directly in arcpy 3) attribute the habitat code value to the point, and later convert to raster. And here are the points about which I'm wondering: 1) should I work in raster format and count pixels? or in shapefile format and calculate areas? I think raster. 2) should I implement a moving window, or convert each point in my shapefile to a 150m-wide circle? I don't even know whether zonal histogram would give me the correct result with that many overlapping features. (I've also thought of building several shapefiles of touching but not overlapping circles, but that's for later I think) 3) should I try and use multi-processing? I'm fairly new to Arcpy, so this might be tricky. 4) is it possible to confront tables in arcpy like in R or Matlab or should I export the distribution table to another software to run these calculations? 5) in a more general context, how to determine what is a reasonable amount of calculation? I have a sense 14 million iteration sounds stupidly high, but I couldn't get more precise... Thanks in advance for any help! Coline

Online Status	Offline
Date Last Visited	‎11-11-2020 02:24 AM

My Ideas

Latest Contributions by ColineByczek

Re: how to prevent memory use increase in task manager during 'bad' parallel processing?

how to prevent memory use increase in task manager during 'bad' parallel processing?

Re: calculate zonal histogram for each pixel in raster

Re: calculate zonal histogram for each pixel in raster

Re: calculate zonal histogram for each pixel in raster

Re: calculate zonal histogram for each pixel in raster

calculate zonal histogram for each pixel in raster

Re: how to prevent memory use increase in task man...