
How to fill NoData holes in rasters

11-12-2024 10:50 PM
JuanLaguna
Esri Contributor

Rasters can have holes (also called voids, gaps, or NoData) in them. These areas can be large and very visible, or they can be individual cells or small groups of cells scattered throughout and not easily seen. Suppose you want to eliminate the holes. How can you replace them with meaningful values while preserving the existing values?

There is a variety of ways to do that. Here we will give some background, a few things to consider, and then show four common solutions.

What is NoData?

 

A raster is a data structure that records information about a phenomenon, such as category, magnitude, height, or image reflectance, organized into a regular matrix of equally sized cells arranged into rows and columns. However, sometimes there is not enough information available to give a value to a cell, either from the data source or the output from an analytical operation. The concept of NoData is used to represent these cells.

When displaying rasters, the renderers allow you to set the NoData cells to appear either with a color of your choosing, or to not display at all (transparent).

For analysis, how input NoData cells are handled can vary based on the tool being used. For some tools, those locations will remain NoData in the output. Other tools calculate an output value based on other available values. The reference documentation commonly addresses the NoData behavior for a particular tool. However, the behavior may not be suitable for your analysis. You may want to identify these NoData cells so you can replace them with appropriate cell values. How can you do that?

 

Identify NoData cells

 

There is a tool that explicitly identifies NoData cells, the Is Null tool. It checks each cell and returns a value of 1 if a cell is NoData, and a value of 0 (zero) if a cell is any other value. If the resulting raster has values of both 0 and 1, then the input raster has NoData cells present. In the attribute table of the raster, the Count field tells you how many cells there are of each value.

In the following example, the input raster is on the left with the NoData cells rendered in white. This raster has 9 rows of 10 columns, and thus has a total of 90 cells. There are four unique values (3, 9, 15, and 22). Summing up the cell counts of each value (11, 22, 25, and 10, respectively), gives a total of 68 cells. Taking 90 and subtracting 68 from it tells you there are 22 NoData cells in the input. The output from Is Null and its attribute table is on the right. The cell value of 1 represents NoData, and it matches the expected count of 22 cells.

The Is Null tool identifies NoData cells in an input raster.
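The arithmetic in this example can be checked with a short plain-Python sketch. It does not use ArcGIS. The small grid, the use of None to stand in for NoData, and the is_null helper are assumptions made for illustration only, and the grid is not the raster from the figure.

```python
from collections import Counter

# Value counts quoted in the example above (cell value -> number of cells).
value_counts = {3: 11, 9: 22, 15: 25, 22: 10}
total_cells = 9 * 10  # 9 rows of 10 columns
nodata_cells = total_cells - sum(value_counts.values())
print(nodata_cells)  # 22


def is_null(grid):
    """Return a grid holding 1 where a cell is NoData (None) and 0 elsewhere."""
    return [[1 if cell is None else 0 for cell in row] for row in grid]


# Hypothetical 3 x 4 grid, with None standing in for NoData.
grid = [
    [3, 3, None, 9],
    [3, None, None, 9],
    [15, 15, 22, None],
]
mask = is_null(grid)

# Equivalent of reading the Count field in the output attribute table.
counts = Counter(cell for row in mask for cell in row)
print(counts[1], counts[0])  # 4 8
```

A result that contains both 0 and 1 means the input has NoData cells, and the count for value 1 is the number of cells to fill.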

 

 

Factors to consider in the analysis

 

Before starting any analysis, first look closely at the problem you are trying to solve. To make sure you follow the right analytical path to get the appropriate answer, carefully consider the types of analysis you will perform.

When it comes to picking the right solution for filling NoData areas, some factors to consider are:

  • Where are the replacement values coming from?
  • Is the raster discrete or continuous? Examples of discrete raster data include those recording land use classes or ranks. Examples of continuous rasters include elevation and concentration.
  • What is the nature of the NoData area to fill? Is it only a few cells scattered throughout, or large blobs that are many cells across? Some solutions will work better on small areas a few cells in size, whereas others can handle larger areas.

 

What do you want to replace NoData cells with?

 

To choose the proper solution, first determine where the replacement values are coming from. Common sources of replacement values include:

  1. A specific numerical value
  2. Cell values from another raster
  3. The nearest cell
  4. A statistic of the surrounding values, such as the average or the largest

Once you decide the source of the replacement values, then you just need to follow a specific workflow to achieve the result.

There are of course many types of raster analysis that can be done, and different ways to go about doing it. This article does not cover all the scenarios, but will focus on some of the typical ones.

 

Workflows for replacing NoData

 

The following graphic illustrates these four workflows at a general level. The column on the left lists the scenarios according to where the replacement values will come from. The center column shows the basic workflow to use for each scenario. The rightmost column provides some information on the general applicability of the scenarios, considering the size of the NoData area and the type of data.

 

An outline of some common workflows for replacing NoData in a raster.

 

A: Replace NoData cells with a specific value

 

The easiest way to fill NoData is to replace those cells with a specific value.

By using the same value for all instances, you can apply it to all the NoData cells without having to consider their size or distribution. This method is most suited for discrete data.

As shown earlier, you can use the Is Null tool to create a raster that uses a value of 1 to indicate where a cell in an input raster is NoData, and a value of 0 for locations of any other value in the input raster.

The Con tool evaluates each cell of an input raster based on a logical condition. You could take the output from the Is Null tool and use it in the Con tool as the input that identifies which of the input cells are NoData and thus will be replaced with the value specified in the true parameter. However, the Expression in the Con tool also has the capability to do an is null operation, as shown here:

 

An example of using the is null option on the Con tool dialog.

 

In the Con tool, do the following:

  1. Set the raster with NoData as the Input conditional raster.
  2. In the Expression, set the Where clause to Value and select the is null option from the list.
    This will use the Is Null tool internally to identify which input cells are NoData and which are not.
  3. Set the Input true raster or constant value to the replacement value you want to replace NoData cells with.
  4. Set the Input false raster or constant value to the original input raster, to preserve those values in the output.
  5. Set the Output raster location and name.
  6. Run the tool.


The following illustration shows the NoData locations replaced with the new value of 2, while the other locations retained their original value.

 

An example of replacing NoData cells with a constant value of 2.
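The cell-by-cell logic of this workflow can be sketched in plain Python. This is only an illustration of the condition that the Con tool evaluates, not how the tool is implemented. The function name, the sample grid, and the use of None for NoData are assumptions.

```python
def fill_nodata_with_constant(grid, fill_value):
    """Mimic the Con workflow: NoData (None) cells take fill_value,
    and every other cell keeps its original value."""
    return [
        [fill_value if cell is None else cell for cell in row]
        for row in grid
    ]


# Hypothetical 2 x 3 grid, with None standing in for NoData.
grid = [
    [3, None, 9],
    [None, 15, 22],
]
filled = fill_nodata_with_constant(grid, 2)
print(filled)  # [[3, 2, 9], [2, 15, 22]]
```

The true branch supplies the replacement value and the false branch passes the original cell through, which is why the original raster appears twice in the tool parameters.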

 

 

B: Replace NoData cells with values from a different raster

 

Instead of using a constant value, another raster can provide the values to replace NoData cells with.

If it makes logical sense to replace them all with cell values from another raster, there is no need to consider the size and distribution of the NoData areas. You can apply this method to both discrete and continuous data, but it is best to match the type of the replacement raster to the type of the raster you are updating.

In the Con tool, do the following:

  1. Set the raster with NoData as the Input conditional raster.
  2. In the Expression, set the Where clause to Value and select the is null option from the list.
  3. Set the Input true raster or constant value to the raster that the replacement values will come from.
  4. Set the Input false raster or constant value to the original input raster, to preserve those values in the output.
  5. Set the Output raster location and name.
  6. Run the tool.

 

Note:
If the raster providing the replacement values has different properties than the raster containing the NoData cells, such as extent, cell size, cell alignment, or coordinate system, remember to account for these differences. By default, the Con tool will use the union of the extents of the two input rasters, and the maximum cell size. To preserve the existing cell values that are not NoData and avoid them being resampled, be sure to set the Extent, Cell Size, and Snap raster environments to your original input raster.

 

The following illustration shows how values from the other raster replaced the NoData locations, while the other locations retained their original value.

 

An example of replacing NoData cells with values from another raster.
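For scripted use, this workflow and the environment settings from the note above might look like the following sketch. It assumes ArcGIS Pro with a Spatial Analyst license. The function name and the raster names in the usage comment are hypothetical, and arcpy is imported inside the function so that the definition can be loaded on a machine without ArcGIS.

```python
def fill_from_other_raster(in_raster, fill_raster, out_path):
    """Fill NoData cells of in_raster with the co-located cells of fill_raster."""
    # Imported lazily; these modules are only available in an ArcGIS Python environment.
    import arcpy
    from arcpy.sa import Con, IsNull

    arcpy.CheckOutExtension("Spatial")

    # Pin the extent, cell size, and snap raster to the original input so that
    # its existing cell values are not resampled or shifted.
    with arcpy.EnvManager(extent=in_raster, cellSize=in_raster, snapRaster=in_raster):
        filled = Con(IsNull(in_raster), fill_raster, in_raster)
        filled.save(out_path)
    return out_path


# Hypothetical usage:
# fill_from_other_raster("inRas.tif", "fillRas.tif", r"C:\project1\outFilled.tif")
```

Running and saving inside the EnvManager block keeps the environment settings scoped to this one operation, so they do not leak into later tools in the same session.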

 

C: Replace NoData with the value of the nearest spatial neighbor

 

You may want to replace NoData cells with the value of the nearest (closest) cell. There is a tool that can do this: Nibble. It will replace the input values for a defined area with the nearest value that is outside that area.

While the method can fill in large areas, it may be less logical for the replacement values to come from further away. Since the replacement values come from the same set of values as the input, this method is most suited to discrete data.

The Nibble tool has two required inputs. The first input is the raster for which the values at selected locations will be replaced with the nearest value. The second input is a mask that identifies what those locations are. For this input, NoData cells represent locations that are within the mask, and cells with any other value are outside the mask. There are two additional parameters that give specific control over how NoData cells are handled. By setting them a particular way, NoData cells in the input raster can also define the mask area. This means that you can use the same raster for both required inputs.

In the Nibble tool, do the following:

  1. Set the raster containing NoData as the Input raster.
  2. Set the same raster as the Input raster mask.
  3. Set the Output raster location and name.
  4. Uncheck the Use NoData values if they are the nearest neighbor parameter.
    The objective here is to only consider cells with valid values to replace NoData cells.
  5. Check the Nibble NoData cells parameter.
    This will make the tool replace the NoData cells inside the masked area with the value of the nearest neighbor outside the masked area, instead of remaining as NoData.
  6. Leave the Input zone raster parameter blank.
  7. Run the tool.

The following illustration shows how the value of the nearest input cell replaces the NoData locations. To make comparison easier, a dark red outline on the output raster identifies the NoData locations of the input raster.

In the case of ties, where there are two or more input cells that are nearest to a NoData cell, the output will be the lowest of the tied values. In the part of the figure below the dashed line, the numbers show the distance (in cell units) from each NoData cell to the nearest cell outside the mask. For a portion of the NoData cells, small arrows identify the specific input cell that provides the replacement value.

 

An example of replacing NoData cells with values from the nearest neighbor.
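The same Nibble settings might be scripted roughly as follows. This is a sketch that assumes ArcGIS Pro with a Spatial Analyst license. The function name is hypothetical, and the keyword strings "DATA_ONLY" and "PROCESS_NODATA" are my reading of how the two check boxes map to the tool's Python parameters, so verify them against the Nibble reference documentation for your version.

```python
def nibble_fill_nodata(in_raster, out_path):
    """Replace NoData cells of in_raster with the value of the nearest valid cell."""
    # Imported lazily; these modules are only available in an ArcGIS Python environment.
    import arcpy
    from arcpy.sa import Nibble

    arcpy.CheckOutExtension("Spatial")

    filled = Nibble(
        in_raster,         # raster whose NoData cells will be replaced
        in_raster,         # same raster as the mask: its NoData cells define the mask
        "DATA_ONLY",       # assumed keyword: do not let NoData act as a nearest neighbor
        "PROCESS_NODATA",  # assumed keyword: nibble NoData cells instead of preserving them
    )
    # The zone raster parameter is left unset, matching step 6 above.
    filled.save(out_path)
    return out_path


# Hypothetical usage:
# nibble_fill_nodata("inRas.tif", r"C:\project1\outNibble.tif")
```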

 

D: Replace NoData with a statistic calculated from the surrounding cells

 

A statistic calculated from the surrounding cells can replace a NoData cell.

This can be done by incorporating the neighborhood tool Focal Statistics into the analysis. For each input cell location, this tool calculates a statistic of the values within a specified neighborhood around it. You can specify a variety of neighborhood shapes, such as a rectangle, a circle, or a pie-shaped wedge, in whatever size you need. There are a variety of statistics to calculate, such as the average or minimum value.

 

The size of the NoData areas is a consideration for this tool. For individual or small groups of NoData cells, the default 3 by 3 cell neighborhood is large enough to calculate the replacement value. To fill larger areas of NoData, either expand the size of the neighborhood, or run the process several times. For discrete data, the statistics that are most appropriate to use with this method are the maximum, minimum, most common, and least common. For continuous data, the mean statistic is typically the best one to use.

 

If run by itself, the Focal Statistics tool will calculate a statistic value for every cell in the input raster. To perform the calculations only on the NoData cells, we will apply the technique of using the Is Null tool within the Con tool to identify those locations. Then we will use the Focal Statistics tool to calculate a new value for those locations only.

 

To embed this tool in Con, it is necessary to create a complex expression in map algebra. In ArcGIS Pro, this can be done in the Raster Calculator tool or in the Python Window.

 

Apply the following workflow to run the Focal Statistics tool in the Python window:

  1. Open the Python window and import the necessary modules.
  2. Set the workspace to where your data is located.
  3. Begin to enter the map algebra expression to construct the statement for the Con tool.
  4. For the Input conditional raster parameter, enter "IsNull()" and specify the name of your input raster.
  5. For the Input true raster or constant value parameter, specify the necessary syntax for the Focal Statistics tool to calculate the output for the neighborhood and statistic of choice.
    1. Enter the name FocalStatistics, without a space.
    2. Set the Input raster to the raster you are processing.
    3. Set the Neighborhood parameter to the shape and size of the neighborhood around the NoData cells you want to calculate the statistic for.
    4. Set the Statistics type parameter to the one you want to calculate.

  6. For the Input false raster or constant value parameter, specify the original input raster again.
  7. Run the expression.
  8. As needed, set up and run the expression again to fill in large NoData areas.
  9. Since the result of the map algebra expression is a temporary raster object, use the Raster save method to persist the final output raster.

 

The following graphic shows an example of the syntax used to create a rectangular 3 by 3 cell focal neighborhood:

 

An example of a map algebra expression in Python that incorporates the Focal Statistics tool.

 

The following illustration shows how the NoData locations were replaced with the maximum value in a 3 by 3 cell neighborhood around them. In this case, the NoData area is larger than the focal processing window. As a result, some of the NoData locations remain NoData in the output from the first focal operation, since there were no input values within their neighborhood to calculate from. To resolve this, you can either run the operation again to replace the remaining NoData values, or use a larger neighborhood. This example ran the focal operation two times, with the output from the first pass used as input to the second.

 

An example of replacing NoData cells with the maximum value of the nearest cells. For a 3 by 3 cell focal neighborhood, the size of the NoData area required two passes of the tool.

 

The following is the syntax used in the Python window to create the final output raster for this example:

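import arcpy
from arcpy import env
from arcpy.sa import Con, FocalStatistics, IsNull, NbrRectangle

arcpy.CheckOutExtension("Spatial")  # these map algebra tools need a Spatial Analyst license
env.workspace = r"C:\project1"  # raw string so the backslash is kept literally

# First pass: fill NoData cells with the maximum of their 3 by 3 cell neighborhood
OutFS1 = Con(IsNull('inRas.tif'), FocalStatistics('inRas.tif', NbrRectangle(3, 3, 'CELL'), 'MAXIMUM'), 'inRas.tif')
# Second pass: pass the first result (the Raster object, not a quoted name) back in to fill what remains
OutFS2 = Con(IsNull(OutFS1), FocalStatistics(OutFS1, NbrRectangle(3, 3, 'CELL'), 'MAXIMUM'), OutFS1)
OutFS2.save(r"C:\Project1\outFmax.tif")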
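To see why a NoData area wider than the neighborhood needs more than one pass, here is a plain-Python sketch of the same fill logic. It does not use ArcGIS and is not how Focal Statistics is implemented. The helper names, the sample grid, and the use of None for NoData are assumptions for illustration.

```python
def focal_max_fill(grid):
    """One pass: give each NoData (None) cell the maximum of the valued
    cells in its 3 by 3 window. Cells that already have a value are kept."""
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] is not None:
                continue
            window = [
                grid[rr][cc]
                for rr in range(max(r - 1, 0), min(r + 2, rows))
                for cc in range(max(c - 1, 0), min(c + 2, cols))
                if grid[rr][cc] is not None
            ]
            if window:  # the cell stays NoData when its whole window is NoData
                out[r][c] = max(window)
    return out


def count_nodata(grid):
    """Count the cells that are still NoData."""
    return sum(cell is None for row in grid for cell in row)


# Hypothetical 5 x 5 grid with a 3 x 3 NoData hole in the center.
grid = [
    [1, 2, 3, 4, 5],
    [1, None, None, None, 5],
    [1, None, None, None, 5],
    [1, None, None, None, 5],
    [1, 2, 3, 4, 5],
]
pass1 = focal_max_fill(grid)
pass2 = focal_max_fill(pass1)
print(count_nodata(grid), count_nodata(pass1), count_nodata(pass2))  # 9 1 0
```

After the first pass only the center cell is still NoData, because every cell in its window was NoData. The second pass fills it from the values written by the first pass, which mirrors feeding OutFS1 into the second expression.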

 

Summary

 

In this article, we touched on what NoData is, how to process it away, and some things to consider about the nature of your input. We then went over several solutions for replacing NoData cells in a raster using the functionality available in ArcGIS Spatial Analyst.

There are other scenarios that can have a similar objective. One example is to use an interpolation tool to replace NoData cells in an elevation raster, with the goal of maintaining the local trends in the surrounding surface cells. Another is to first classify the nature of the NoData areas and then apply different workflows to each category sequentially to get a final product.

You can take the logic behind the workflows shown here and apply them to many applications in your own work.

 

Additional reading

 To learn more about this aspect of raster analysis and specific tools used, start out by looking at the following help topics:

  


The original blog was first published in the ArcGIS Blog, and can be found here:
https://www.esri.com/arcgis-blog/products/analytics/analytics/fill-nodata-holes-in-raster-data/

About the Author
Product Engineer on the Spatial Analyst team.