Mosaics and Image Server services….source vs derived vs reference…Advice requested

12-12-2017 06:36 PM
RebeccaStrauch__GISP
MVP Emeritus

Note: Sorry, this is long…but greatly appreciate anyone who reads and can give advice. I know there are many approaches, but looking for suggestions so I can get my head wrapped around it, so I can see what can work for us. A logical approach just hasn’t quite clicked for me yet. 

I'm putting this as a question to get more views and maybe more responses....but realize there may not be any one answer.

I am looking for some advice on setting up our mosaics and Image Server services (10.5.1).  I know the basics of how to create the source mosaics and create image services from them (I have done this in 10.2.2 and initial test in 10.5.1).   I have read the various help and documents on what can be done, and the various workflows:

And many others, but my head spins whenever I try to figure out a logical way to set this all up.

Source Data: I have 5 folders with about 4500 tiffs (plus .tfw, xml, etc.), one folder for each of 5 “categories” (RGB, PAN, CIR, DEMEllipsoid, DEMOrtho). After several crashes, corrupt mosaics, and blank or checkerboard issues when zooming in (see notes at bottom for how we fixed them), I have successfully created the fgdb with the 5 “source” mosaics with the default reference system, used the “-1” option for overview creation, and run the default “analyze”, etc. I also created an initial service for the RGB (which was how we found and debugged the zoom issues). All our source data is referenced with the UNC path to the folder (so it should be portable to our production machine, when ready).

Source (initial) mosaics:  

Based on some of the suggestions, I created a file GDB with the  “s_”  designation to indicate source mosaics.  For lack of better name to start, I attached the wkid, i.e.    s_wkdi102247.gdb

      

Goals (in no particular order, but numbered for discussion purposes):

  1. Have mosaics and/or services in different projections (3338 and maybe WMA)
  2. Create a cache in 3338 for distribution to area offices/offline and/or map services
  3. Create hillshade, slope, aspect, shaded relief, and maybe some other functions still to be determined…..in 102247 and/or 3338 (i.e. re-projected)

Questions (finally…again, in no particular order):

  1. Am I better off creating “reference” mosaics with functions OR Image Server service with functions for:
    1. The re-projection to 3338?
    2. The hillshades, etc. in 3338?
      1. And should these be function strings or a new reference mosaic off of a. above?
  2. If my data doesn’t overlap (i.e. time), isn’t from different mosaics, etc., that is, if I’m not combining other mosaics, is there a reason for “derived” mosaics? Or am I better off with reference mosaics?
  3. I have all of the source mosaics in one fgdb. I know that one corrupt mosaic can corrupt the entire fgdb (from yesterday’s experience).  Any suggestions for a better strategy or organization of the source and all the reference fgdb/mosaics that will make it efficient and logical?  Anyone have a good sample of how they did or would do it?

 

Thanks for any and all suggestions.

 

Note: for those wondering how I fixed the crashing and zooming in errors (in case it helps others):  

  • accessing the source on a more “local” read-only server helped with the crashing on creating the mosaic.  Also sped up the creation process easily 4 fold or more.
  • The zoom issues were mainly because the OS service for “ArcGIS Server” log-in-as had a different account running it from previous installs. This local account did not have permissions required to access the source images. We did 2 things:
  • used a different service account for the log-in-as, and made sure it was in the group that had read-access to the source folder
  • created a new “data store” connection in the ArcGIS Image Server (although, the old one may have worked after the above)
  • btw – we first thought the issue was that the “default” number of overview levels created was 4. Using the “-1” option for optimal levels created 8, which improved the zoom issues, but didn’t fix them.  That is when tech support suggested the permission issue.  That hadn’t dawned on me since it worked on other machines (10.2.2) and Desktop could access the source for the creation of the mosaic.  However, it was the Image Server that didn’t have the correct access once it “ran out” of overviews (i.e. when it had to read the source rasters dynamically).

Tagging, since not sure best location:

https://community.esri.com/community/gis/managing-data?sr=search&searchId=3052d0cf-f030-4222-8d73-3a...
https://community.esri.com/community/gis/imagery-and-remote-sensing?sr=search&searchId=2f717d35-ae8d...
https://community.esri.com/groups/elevation-data?sr=search&searchId=868f085b-27b2-42ed-8a73-78d9e95d...
https://community.esri.com/community/gis/enterprise-gis/arcgis-image-server?sr=search&searchId=48361...
https://community.esri.com/community/gis/enterprise-gis?sr=search&searchId=e02178e7-6ad8-48b3-a379-8...
https://community.esri.com/community/gis/enterprise-gis/arcgis-enterprise?sr=search&searchId=e02178e...

7 Replies
ScottMoore__Olympia_
Esri Contributor

Hi Becky!

I am going to share with you a number of things I have learned about this when working with other people trying to do similar things.  Much of this information has come from other people within Esri.

Regarding your question B above, I think having one service/mosaic that has all of the data in it can be very useful, instead of having many.  This would be why I might want to use the derived mosaic.  We often recommend making source mosaic datasets for homogeneous datasets, and then a derived one (using the table raster method) to combine them into a single source.

You may find this useful if you haven't seen it:

Derived mosaic datasets—Standard Workflow_Creating Mosaic Datasets | ArcGIS 

1. Pyramids & statistics for Source Rasters

 

It is important to differentiate between the use of the mosaic datasets for dynamic & cached services.  In the case of cached services, there is relatively little benefit to pre-building the pyramids (except perhaps for a potentially tiny improvement in the actual caching process) since the cache itself acts as the equivalent of the pyramids. 

 

However for dynamic services, building pyramids can improve response when zoomed in to the display scales that are larger than the display scales for displaying the mosaic dataset overviews.  This display range often extends from the source data resolution to about 10-20x the source resolution.  If performance and access to underlying radiometry is the goal then it may be prudent to build a few levels of pyramids.  Do recall however, that resampling of the digital numbers results in averaged values of the pixels when viewing and interrogating the pyramid layers.
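To put rough numbers on that 10-20x range, here is a small sketch of the arithmetic. It assumes each pyramid level is twice as coarse as the one below it (the usual factor-of-2 pyramid); the function names and the 0.5 m source cell size are hypothetical and used only for illustration.

```python
def pyramid_levels_to_cover(factor, step=2):
    """Count the pyramid levels needed so the coarsest level is at least
    `factor` times coarser than the source (each level is `step`x coarser)."""
    levels = 0
    coarseness = 1
    while coarseness < factor:
        coarseness *= step
        levels += 1
    return levels


def level_cell_sizes(source_cell_size, levels, step=2):
    """Cell size of each pyramid level, finest first."""
    return [source_cell_size * step ** i for i in range(1, levels + 1)]


if __name__ == "__main__":
    for factor in (10, 20):
        n = pyramid_levels_to_cover(factor)
        # 0.5 is a made-up source cell size in metres
        print(factor, n, level_cell_sizes(0.5, n))
```

Under that assumption, spanning about 10x the source resolution takes four levels and about 20x takes five, which is one way to read "a few levels of pyramids".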

 

For the purpose of ensuring availability of the pyramids (and to avoid accidental loss of access to pyramids that are stored external to the source rasters), I often recommend building the pyramids as internal to the source file.  In that way, the pyramids have no risk of being separated (pathwise) from the source rasters.  The penalty is slightly larger source raster files.

 

In terms of the statistics, it is always recommended that statistics be computed on the source rasters (again during ingest) and stored in the TIFF tags.  It is important to use a reasonable skip factor for rows and columns to make the statistics computation fast.  That is, it is not necessary to use each pixel in a file to achieve a representative mean and standard deviation for each file.  A sampling of 5-10% of the set of pixels is usually sufficient.  Thus, the skip factor can be computed by dividing the number of rows and columns in the file by 20 and using that as the skip factors.  For pre-stretched, pre-rectified, color-balanced 3x8-bit RGB content (like NAIP), having statistics will permit smoother rendering (via the selection of one of the stretch options) of the imagery, especially where multiple years are in the collection.  For elevation data (32-bit float), it is important to build the statistics to ensure that the rendering of the 32-bit data into an 8-bit stream properly reflects the range of values in the elevation data.  If they are not built, wide (or narrow) ranging data can appear all black.  Computing the statistics for elevation data also establishes the max/min range for the data.
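The skip-factor arithmetic can be sketched in a few lines of plain Python. This assumes the usual skip-factor meaning (1 uses every pixel, 2 uses every second pixel, so the sampled share is roughly 1 / (x_skip * y_skip)); the function names and the 10,000 x 10,000 tile size are hypothetical.

```python
import math


def skip_factor_by_division(rows, cols, divisor=20):
    """Rule of thumb from the text: rows and columns divided by 20.
    Returns (x_skip, y_skip), never less than 1."""
    return max(1, cols // divisor), max(1, rows // divisor)


def sampled_fraction(x_skip, y_skip):
    """Approximate share of pixels read for the given skip factors."""
    return 1.0 / (x_skip * y_skip)


def skip_factor_for_fraction(fraction):
    """Largest equal x/y skip factor that still samples at least `fraction`."""
    return max(1, math.floor(math.sqrt(1.0 / fraction)))


if __name__ == "__main__":
    x_skip, y_skip = skip_factor_by_division(10000, 10000)  # made-up tile size
    print(x_skip, y_skip, sampled_fraction(x_skip, y_skip))
    print(skip_factor_for_fraction(0.05), skip_factor_for_fraction(0.10))
```

The two rules of thumb give quite different sampling densities: for this hypothetical tile, dividing by 20 yields skip factors of 500 (on the order of 400 pixels sampled), while a 5-10% sample corresponds to skip factors of about 3 to 4. It is worth deciding which one you intend before running statistics on 4500 files.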

 

All of these steps can be completed during the ingest workflow.

2. There are two approaches to handling multiple spatial references for mosaic datasets. If the source data is in multiple spatial reference systems but a single spatial reference is the output of the service (i.e. Web Mercator), then it is preferable to allow ArcGIS to define the required geodata transform that transforms the source SRS into the mosaic dataset SRS on ingest.  The derived mosaic dataset then does not need to impose another geodata transform on rendering.  It can still, however, support definition queries or other functions.

 

If there are to be multiple services from a mosaic dataset in multiple SRS (i.e. WM, State Plane, UTM, …) then, as defined above, allow ArcGIS to define the appropriate geodata transform for each raster during the ingest into a source mosaic dataset (perhaps defined with an SRS of Web Mercator).  Then define multiple derived mosaic datasets (one for each required output SRS) and use these to publish the services.  ArcGIS will define the output geodata transform from the source to the derived mosaics.  If caching these services, make sure to define a custom cache tiling scheme file which represents the desired spatial reference.

RebeccaStrauch__GISP
MVP Emeritus

Hi Scott Moore ..thanks so much for your response.  Things are starting to make more sense. But …

These are a few follow-up comments/questions, based on what I think I am reading here, plus some additional testing…     btw.. We DID have some success testing changing the spatial reference with derived datasets, so I think I understand that now (i.e. using derived vs. function)

First, the steps used to create the source gdb and 5 mosaics (snippets from my script):

  • CreateFileGDB_management(targetFolder, newFGDBname)
  • set the variables for the AddRastersToMosaicDataset command. I either took most of these suggestions from the command default, reading other posts, or talking with tech support (this process of testing has been going for a while…just now putting into practice on 10.5.1)   ...from your response, assuming I should change some of these?
    • rastype = "Raster Dataset"
    • updatecs = "UPDATE_CELL_SIZES"
    • updatebnd = "UPDATE_BOUNDARY"
    • updateovr = "NO_OVERVIEWS"
    • maxlevel , maxcs, maxdim, and spatialref  all set to "#"  (i.e. default)
    • inputdatafilter = "*.tif"
    • subfolder = "NO_SUBFOLDERS"
    • duplicate = "OVERWRITE_DUPLICATES"
    • buildpy = "NO_PYRAMIDS"
    • calcstats = "NO_STATISTICS"
    • buildthumb = "BUILD_THUMBNAILS"
    • forcesr = "#"
  • then looped thru each “type”, with the same fgdb as the target
  • CreateMosaicDataset_management(newFGDB, mdName, sr)
  • AddRastersToMosaicDataset_management( theMD, rastype, srcFullpath, updatecs, updatebnd, updateovr,  maxlevel, maxcs, maxdim, spatialref, inputdatafilter,  subfolder, duplicate, buildpy, calcstats, buildthumb, comments, forcesr)
  • BuildBoundary_management(theMD, "#", "OVERWRITE", "NONE")
  • DefineOverviews_management(theMD, "#", theMD, "#", "#", "-1")
  • BuildOverviews_management(theMD, "#", "NO_DEFINE_MISSING_TILES", "GENERATE_OVERVIEWS", "GENERATE_MISSING_IMAGES", 'IGNORE_STALE_IMAGES')
  • ImportMetadata_conversion(Source_Metadata, "FROM_ARCGIS", theMD, "ENABLED")
  • AnalyzeMosaicDataset_management(theMD)
  • the last two took care of the analyze warnings when creating the service from the source mosaic. It did not mention I needed to run statistics or build pyramids
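The variables and the add-rasters call above can be gathered into one helper, which makes it easy to flip the pyramid and statistics switches discussed in this thread. This is only a sketch: the helper names and the default `comments` value are hypothetical, the argument order simply follows the AddRastersToMosaicDataset_management call shown above, and the arcpy import is guarded so the argument list can be inspected on a machine without ArcGIS.

```python
try:
    import arcpy  # only present where ArcGIS is installed
except ImportError:
    arcpy = None


def add_rasters_args(the_md, src_fullpath, pyramids=False, statistics=False,
                     comments="#"):
    """Positional arguments in the same order as the call in the list above."""
    return [
        the_md,
        "Raster Dataset",        # rastype
        src_fullpath,
        "UPDATE_CELL_SIZES",     # updatecs
        "UPDATE_BOUNDARY",       # updatebnd
        "NO_OVERVIEWS",          # updateovr
        "#", "#", "#", "#",      # maxlevel, maxcs, maxdim, spatialref
        "*.tif",                 # inputdatafilter
        "NO_SUBFOLDERS",         # subfolder
        "OVERWRITE_DUPLICATES",  # duplicate
        "BUILD_PYRAMIDS" if pyramids else "NO_PYRAMIDS",
        "CALCULATE_STATISTICS" if statistics else "NO_STATISTICS",
        "BUILD_THUMBNAILS",      # buildthumb
        comments,                # assumed default; not set in the list above
        "#",                     # forcesr
    ]


def add_rasters(the_md, src_fullpath, **switches):
    """Run the tool with the assembled arguments (requires arcpy)."""
    if arcpy is None:
        raise RuntimeError("arcpy is not available on this machine")
    arcpy.AddRastersToMosaicDataset_management(
        *add_rasters_args(the_md, src_fullpath, **switches))
```

For example, `add_rasters_args(theMD, srcFullpath, statistics=True)` reproduces the settings above except that statistics are calculated while the rasters are added.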

Follow up questions on “source mosaic” design/creation

“Regarding your question B above, I think having one service/mosaic that has all of the data in it can be very useful, instead of having many.  This would be why I might want to use the derived mosaic.  We often recommend making source mosaic datasets for homogeneous datasets, and then a derived one (using the table raster method) to combine them into a single source.”  (emphasis added)

I have our 5-different source mosaics in one FGDB (RGB, CIR, PAN, DEMEllipsoid, DEMOrtho), each with about 4500 tiffs. The only relation to the other mosaics in the fgdb is the extent and spatial reference.   I’m creating the source mosaics with the source spatial reference because I figured that would be the cleanest and fastest for the source mosaic, (and I didn’t have much luck changing the SR in 10.2.2 when adding raster files in the past)

Q1: Is that too much for one fgdb with multiple derived/reference/functions/services to handle? Am I better off breaking them into 5 source fgdb?  (thinking of corrupt mosaics, and/or locked files here too)

Q2:  “(using the table raster method)”     do you mean “Dataset” ??

Q3: Do you consider extent and spatial reference enough to consider them homogeneous?

Q4: If I want to create services from the source mosaics (i.e. without changing anything), should I still use a derived mosaic? (…again, thinking of locked files.)

Q5: How about if I decide to include optional function “list”?

Follow up questions “Pyramids” …and overviews:

“However for dynamic services, building pyramids can improve response when zoomed in to the display scales that are larger than the display scales for displaying the mosaic dataset overviews.”

Pyramids vs. overviews… I’m a bit fuzzy on this.  I see our Image services as being dynamic.  If a cache is created, it will be used as a Map service with scale factors matching what we need for web maps/apps.  I will use the Image Server service just to create the cache (since it is so much faster).

So for argument sake, think dynamic Image services:

Q6:  Where is the line between creating pyramids vs building overviews?

Q7: Are both needed, and if so, what options do I want to use for them to work well together?

As shown above in the variables I used when adding the rasters, I did not create pyramids or statistics on input.

Q8: Is this a problem, and should I recreate the mosaics with these turned on, or can I “run/create” these after the fact?
Q9:  I ran “define overviews” with -1 and it created 8 levels.  Should I delete these and recreate them after building the pyramids??

“For the purpose of ensuring availability of the pyramids (and to avoid accidental loss of access to pyramids that are stored external to the source rasters), I often recommend building the pyramids as internal to the source file.  In that way, the pyramids have no risk of being separated (pathwise) from the source rasters.”

Q10:  Can you explain what you mean by “I often recommend building the pyramids as internal to the source file.”     Or point me to a doc that explains what you mean?  

Q11:  Is this done “outside” of the add raster process, and do you mean in the folder/server with the tiffs??

Follow up questions on Statistics:

“In terms of the statistics, it is always recommend that statistics be computed on the source rasters (again during ingest) and stored in the TIFF tags”

Q12: by “ingest”, do you mean during the add raster process?

Q13: by “the source rasters”, do you mean the .tiff files in the source folders, or the rasters in the source mosaic?  If “ingest” means the add rasters step, then I’m really not sure.

I really need to check out the info you mentioned about the statistics variables, but that may not happen until after the holidays.

New subject and questions on Functions…. Mosaic vs Image Server services:

From what I think I understand, we can have multiple function-strings that the user can choose, built into one service.  (??)

Q14: Do these same functions also need to be in the derived mosaic dataset?

Q15: Where do we find the function-string templates for the Image service? …or is it just what we create and export from the mosaic functions?

Just  not sure how those all work together…..whether in the mosaic vs in the service is better, if both are needed or not.

----

Seems like I’ve been trying to grasp all these concepts forever. With the 10.2.2 version of the Image extension, I was able to do the very basic things and that was enough for that time, but with 10.5.x, it was time to branch out and maybe do it right (or at least better). So again, all help is much appreciated.

Obviously, many of the questions are related and may have just one answer…I just separated and numbered for convenience if you need to refer to one.

BTW - I am hoping to talk to some staff in another agency later today (that are working with their own copies of the same source data), and that will also help.  I’ll be away from this process for a few weeks but reading up when I can, so having answers here on the forum is great (and hopefully will help others).

CodyBenkelman
Esri Regular Contributor

Rebecca

We are working on a reply, but before this thread gets any longer, can you clarify if there is any relationship between the 3 imagery folders, and again between the 2 DEM folders?  

Also, is everything in one single folder truly one single collection, or (for example) does the RGB folder include images captured at two different times, and/or two different resolutions?

I thought you had 4500 files total, and all 5 folders were distinct and unrelated, but your last reply says "about 4500 each" so unless that number is just a coincidence, I'm wondering if these files are all one big project.  If the RGB, CIR, and PAN folders are three versions of imagery from the same project, and your Ellipsoidal and Orthometric DEMs are two versions of the same data, I would likely recommend you take a step back.  We may not be at the proper starting point.

Cody B.

RebeccaStrauch__GISP
MVP Emeritus

Hi Cody Benkelman 

I will be semi-offline today but I'll try to clarify.  First I should state that we are not the primary agency delivering this data (that is being done thru our statewide GIS data council); we use it more for our internal analysis, basemap, and web use.

There are about 4500 tiffs for each type.  Since we are not the primary provider, I am only using the "latest" version of each tiff extent.  The data is being collected thru an initiative over several years (due to size and various logistic and cloud cover issues for Alaska).  I am working with pre-processed data, not the raw data.

Each type (RGB, PAN, etc.) has its own source folder with all its related tiffs, tfw, xml, etc. files.  Again, about 4500 tiffs for each type, which covers most of the state.

I have made progress on this, and I am actually starting to understand the lowPS, highPS, etc. of the footprints, and the templates, a little more after talking with another employee in a sister department. (Although I would love to see a write-up on this, if you have any links.) He has been working with LiDAR and much of the same data, but in a little bit more of the raw format, and with multiple timeframes, etc.  His agency is more involved with providing the services to the public.  I'm hoping he will add a response to this thread with some of the things we discussed on the phone, both for my reference and to clarify some of the items.  We covered a lot and I know I did not retain it all.

Based on that discussion and some additional testing, I'm still leaning towards not creating pyramids, and instead sticking with the overviews, mainly because the source files are on a different server and I am trying to avoid network traffic.  This is what we were thinking would be best.  But I would like to hear your thoughts on this and some of the other questions.  I do feel like I am making progress; however, at this point I am in the learning phase and will not be able to test much more until after the holidays.

Thanks for your review and help on this.

RebeccaStrauch__GISP
MVP Emeritus

Hi CBenkelman-esristaff ,

I'm working on our upgrade to 10.5.1 now, and I'm looking to just copy and paste the mosaic geodatabases I created on our test/dev server to our production server.  I'm not sure if you ever finished a final reply to the above or not, but assuming I have most things set up ok, I want to make sure that I am grabbing the source data from the new remote server (storing the source) and not the local server.

So these may be easy questions: how can I find out the source path of each of the images I loaded?  Or, once overviews are created, does it no longer matter?

I've tried to Identify, look at the attributes and look at the mosaic metadata.  The copied mosaic geodatabases seem to view just fine, but I can't find the path to make sure they are pulling from the correct source server.

CodyBenkelman
Esri Regular Contributor

Rebecca

see Export Mosaic Dataset Paths—Data Management toolbox | ArcGIS Desktop  and note you can run this from Catalog by right clicking on the MD
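Once the paths are exported to a table, a quick tally by server makes it easy to confirm which machine the mosaic is actually reading from. A small sketch in plain Python, assuming the exported paths have already been read into a list of strings (for example, from the path column of the output table); the function names and the sample paths are made up for illustration.

```python
from collections import Counter


def unc_host(path):
    """Return the server name of a UNC path, or None for a local path."""
    normalized = path.replace("/", "\\")
    if not normalized.startswith("\\\\"):
        return None
    parts = normalized[2:].split("\\")
    return parts[0].lower() if parts and parts[0] else None


def count_by_host(paths):
    """Tally paths per UNC server; local paths are grouped as '(local)'."""
    return Counter(unc_host(p) or "(local)" for p in paths)


if __name__ == "__main__":
    sample = [
        r"\\imgsrv01\ortho\RGB\tile_001.tif",  # hypothetical remote source
        r"\\imgsrv01\ortho\RGB\tile_002.tif",
        r"D:\staging\RGB\tile_003.tif",        # hypothetical local copy
    ]
    for host, n in count_by_host(sample).items():
        print(host, n)
```

Any host other than the intended source server (or any "(local)" entries) points to items whose paths still need to be repaired.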

I haven't re-read the whole thread above but if there are still outstanding issues from the original list of questions, can you summarize and let us know?
Thx

Cody

RebeccaStrauch__GISP
MVP Emeritus

Excellent.  That was what I was looking for, and I didn't think about looking for it in the Toolbox.  I still have a lot to learn about what all mosaics can do.

re: the rest of the thread, I don't remember exactly where I left it either, so I'll read thru it again and follow up with a summarized list of questions, if still needed.  I remember there was a lot discussed above, and then I was out for a while...so my brain flushed it all.
