MRF - S3, Mosaics, Caches and Optimization

04-10-2018 01:55 PM
by Anonymous User

Hi - we regularly use the OptimizeRasters tools to push a lot of raster data (orthos and DEMs from UAV and LiDAR collection) as MRF files to AWS S3 buckets. The MRFs are added to mosaic datasets hosted in SDE on an EC2 server, which are then published as image services to a number of our web maps.

Everything is scripted with Python and runs (mostly) smoothly. We have a watcher script on the EC2 server that monitors new uploads to the S3 bucket and then downloads a local copy of the MRF proxy files. I have recently read about other options that may let us get away from this and load the MRF directly into the mosaic as a different raster type.

MRF load to mosaic as table type

@Peter Becker, in one of your user conference presentations :-

http://proceedings.esri.com/library/userconf/proc17/tech-workshops/tw_630-625.pdf 

you mention in the slide "Embedding Raster Proxies into Mosaic Datasets" that there is a script that will convert the raster proxies (I assume this means the .mrf files) into a table so they can be added to the mosaic as a raster data type. Is this a publicly available script, and if so, where might I find it?

You also mention the ability to embed the proxy into the mosaic. I've not found any other references to this, so I'm wondering whether this is a supported procedure now, and whether you have any further details?

I've also been checking out the GitHub repository on raster types in Python to see if I can use a Python raster type to load the MRF. I've checked this page but can't find much more info as to whether this is possible.

GitHub - Esri/raster-types: A set of Raster Types for ArcGIS developed using the Python language. 

Cacheless MRF with S3?

Also - is it possible to get away from having to download a cache of the MRF locally to the EC2 instance running Image Server, and have the app/map load directly from the S3 bucket? (Still using a mosaic, not a tiled cache.)

Cache locks

Fairly regularly, some of the cache tiles get locked by the OS and the image service then returns either blank or black tiles. We then need to remove all the locks on the cache files (by stopping the services and a python.exe that we think is our syncing script), remove all the cache tiles, and restart the services.

I know this can happen when accessing the MRF mosaic via Desktop and the proxy, but is there anything else that could cause this?

MRF optimization

Finally, any general tips for image service optimisation using MRFs and dynamic mosaics (NOT tile caching) would be greatly appreciated, as we need to render, change mosaic orders, filter, apply functions, etc. on all our services. Anything from mosaic creation, MRF tile and index sizes, DB tuning, etc. would help.

Thanks,

Dave

PeterBecker
Esri Regular Contributor

Good to hear you are using MRF and OptimizeRasters.

I hope you are using MDCS (see http://www.arcgis.com/home/item.html?id=0ab1ce3117d24eca9dd268171456591f) for automation of creating mosaic datasets.

Concerning embedding MRFs, the simplest way is to use the table raster type and define as input a CSV file that has “Raster” as its heading and contains the escaped XML of the raster proxies. There is an update coming to OptimizeRasters very soon that will provide an option to create such a table (instead of writing the raster proxies as many separate files). This works well for raster datasets. If you use different raster products (e.g. satellite imagery), you currently still need to write the raster proxy files so that all the associated metadata is accessible. The program to embed the raster proxies into the mosaic dataset is not yet publicly available. (We are working to get this out.)
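Until that OptimizeRasters option is available, a table like this could be sketched by hand. The following is only a minimal illustration, not the Esri script: the function name `proxies_to_table` and the `*.mrf` pattern are hypothetical, and it assumes that "escaped" means ordinary CSV quoting (Python's `csv` module doubles any embedded quotes) and that each proxy should sit on a single line. If the table raster type expects XML entity escaping instead, the cell content would need to change.

```python
import csv
from pathlib import Path


def proxies_to_table(proxy_dir, csv_path, pattern="*.mrf"):
    """Write one CSV row per raster proxy file under a single 'Raster' heading.

    Returns the number of proxies written. All names here are illustrative.
    """
    rows = []
    for proxy in sorted(Path(proxy_dir).rglob(pattern)):
        xml = proxy.read_text(encoding="utf-8")
        # Assumption: collapse runs of whitespace so each proxy fits one cell.
        rows.append([" ".join(xml.split())])

    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)  # quotes cells that contain '"' or ','
        writer.writerow(["Raster"])
        writer.writerows(rows)
    return len(rows)
```

Calling `proxies_to_table("proxies", "proxies.csv")` (both paths hypothetical) would produce a CSV that could then be tried as input to the table raster type.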

Re: ‘cacheless MRF with S3’. Note that instead of using a raster proxy, you can list the rasters in a table and put \\VSICurl\Http:\\... or \\VSIS3\Http:\\ at the start of each file name, and it will work. The raster can be an MRF, but can also be a TIF etc. Take care, though, as some flavors of TIF are not suitable. Accessing the files directly is not as performant as using a raster proxy, because GDAL will make unnecessary requests to the S3 bucket, and the caching does speed up subsequent access. Note that if you want to turn caching off, there is an environment variable that can be set. (Check the OptimizeRasters documentation.)
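A table of direct-access paths could be generated along these lines. This is a rough sketch under stated assumptions: the function name, bucket and keys are hypothetical, the "Raster" heading is borrowed from the table raster type description above, and the default prefix uses GDAL's own documented spelling (`/vsis3/bucket/key`; the HTTP equivalent is `/vsicurl/https://...`). The prefix is a parameter so that the spelling given in this reply can be substituted if that is what ArcGIS expects.

```python
import csv


def direct_access_table(bucket, keys, csv_path, prefix="/vsis3/"):
    """Write a one-column table of GDAL virtual file system paths.

    bucket, keys and the default prefix are illustrative assumptions.
    Returns the list of paths that were written.
    """
    paths = [f"{prefix}{bucket}/{key.lstrip('/')}" for key in keys]
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["Raster"])
        writer.writerows([path] for path in paths)
    return paths
```

As noted above, this route skips the local cache, so each request goes to S3 and is likely to be slower than going through a raster proxy.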

Re: Cache Locks. There should not be any locks on the raster proxies or on the caches that are created. Note that it is not advisable to share a cache between multiple machines; it is best to keep a separate cache on each machine. The locking you are referring to could be related to a conflict between Desktop (you as a user) and Server (the ArcGIS system account as a user) accessing the same files. Check the OptimizeRasters documentation on how to set rights appropriately.

Re: MRF Optimization. Some general points. Optimum performance is gained by using MRF as the file format and using raster proxies that reference the files and cache them (also as MRF) on a local ephemeral drive. We used M3 EC2 instances as they had very fast ephemeral drives. The newer M5 instances do not have ephemeral drives, so to use those you need to set up EBS, and should use IOPS-optimized volumes. We should be documenting optimum settings soon. For compression, use LERC if you want lossless or controlled lossy. If you are OK with some data loss and are using fewer than 12 bits, use JPEG. For storing the mosaic datasets, you are using SDE, which is good, but consider using RDS. If using a file geodatabase, it's better to copy the FGDB to a local (non-shared) directory before publishing. File geodatabase is chatty, and running it on a shared drive can cause bottlenecks.
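The compression rule of thumb can be written down as a small helper. This is only a sketch of the guidance above, with hypothetical function names. It builds a plain `gdal_translate` command line using the GDAL MRF driver's `COMPRESS` creation option, which is an assumption about tooling; OptimizeRasters sets compression through its own configuration files, which are not shown here.

```python
def choose_mrf_compression(bits_per_sample, allow_data_loss):
    """Return 'JPEG' for lossy data under 12 bits, otherwise 'LERC'."""
    if allow_data_loss and bits_per_sample < 12:
        return "JPEG"
    # LERC covers lossless and controlled-lossy storage, e.g. for DEMs.
    return "LERC"


def gdal_translate_args(src, dst, bits_per_sample, allow_data_loss):
    """Build (but do not run) a gdal_translate command that writes an MRF."""
    compression = choose_mrf_compression(bits_per_sample, allow_data_loss)
    return [
        "gdal_translate",
        "-of", "MRF",
        "-co", f"COMPRESS={compression}",
        src,
        dst,
    ]
```

The returned list can be passed to `subprocess.run` on a machine where GDAL is installed; an 8-bit ortho where some loss is acceptable would get JPEG, while a 32-bit float DEM would get LERC.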