Reality mapping projects are often large and require extensive processing. ArcGIS Reality Studio is designed to scale with your needs.
By understanding how processing jobs (such as reconstructions) are structured and distributed, and when to scale your production, you can optimize both time and hardware, delivering your results with confidence.
Splitting Jobs into Stages and Tasks
Reality mapping projects typically involve substantial workloads and require significant resources. Processing a large project on a single machine may take days or weeks. To manage these demands, ArcGIS Reality Studio divides processing jobs into sequential stages – each consisting of multiple tasks that follow a common pattern:
- An initial task analyses the data and generates processing tasks based on project area size and input data resolution.
- One or more tasks then carry out the necessary processing.
- A finalization task consolidates the results before proceeding to the next stage.
This method enables the efficient processing of even the largest projects.
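To make the stage pattern concrete, here is a minimal Python sketch of how one stage could be expanded into an analysis task, several processing tasks, and a finalization task. It is an illustration only, not Reality Studio's actual implementation: the `Stage` class, the `plan_stage` function, the task names, and the fixed tile size are all hypothetical, and the real task count also depends on input data resolution.

```python
from dataclasses import dataclass, field
from math import ceil


@dataclass
class Stage:
    """One sequential stage of a job (hypothetical model, not a Reality Studio type)."""
    name: str
    tasks: list[str] = field(default_factory=list)


def plan_stage(name: str, area_km2: float, tile_km2: float) -> Stage:
    """Expand a stage into: one analysis task, N processing tasks, one finalization task.

    Assumption: the number of processing tasks grows with project area.
    The fixed tile size stands in for whatever rule the real analysis task applies.
    """
    n_tiles = max(1, ceil(area_km2 / tile_km2))
    tasks = [f"{name}/analyze"]
    tasks += [f"{name}/process-{i}" for i in range(n_tiles)]
    tasks.append(f"{name}/finalize")
    return Stage(name, tasks)


stage = plan_stage("reconstruction", area_km2=10.0, tile_km2=4.0)
print(stage.tasks)
```

Only the processing tasks in the middle of the list are candidates for parallel work; the analysis and finalization tasks bracket them within each stage.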
Distribution of Tasks
Whenever a job is created, it is submitted to a centralized workspace. Reality Studio uses this workspace to store the tasks associated with the job, including the necessary processing information, the status of each task, and the machine handling it (often referred to as a processing node).
Processing nodes may access this workspace either by contributing to the processing work, or by monitoring the information available within the workspace.
When actively contributing to a workspace, processing nodes automatically pick up a task to work on. Using a single workstation, the machine will sequentially progress through the tasks of a reconstruction until final results have been generated.
The true power comes when connecting additional nodes to the same workspace. As long as there are tasks available in the workspace, all these processing nodes will get their own task to work on, drastically reducing the overall time required to finish a job.
In addition, Reality Studio monitors the progress of all tasks in process. If a task fails on a node due to issues such as a crash, reboot, or network outage, Reality Studio assigns a new task to that node. The failed task is then retried on another node to improve the likelihood of completion. This approach allows processing within a workspace to continue and supports a reliable processing environment with minimal interruption.
The result? A robust, fault-tolerant system that maximizes hardware utilization and scales with your needs.
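The pull-and-retry behavior described above can be sketched as a small simulation. This is a simplified model under stated assumptions, not Reality Studio's scheduler: the functions `pick` and `run_workspace` are hypothetical, all tasks are assumed to take one "round" each, and a task named in `fails_once` fails on its first attempt only.

```python
from collections import deque


def pick(queue, node, failed_on):
    """Take the first queued task that has not already failed on this node."""
    for task in queue:
        if failed_on.get(task) != node:
            queue.remove(task)
            return task
    # Only tasks that failed on this node remain (e.g. a single-node workspace).
    return queue.popleft()


def run_workspace(tasks, nodes, fails_once=()):
    """Simulate nodes pulling tasks from a shared workspace queue, round by round."""
    queue = deque(tasks)
    pending_failures = set(fails_once)
    failed_on = {}  # task -> node on which it failed
    done_by = {}    # task -> node that completed it
    rounds = 0
    while queue:
        rounds += 1
        for node in nodes:
            if not queue:
                break
            task = pick(queue, node, failed_on)
            if task in pending_failures:
                # The task failed: return it to the workspace for a later retry.
                pending_failures.discard(task)
                failed_on[task] = node
                queue.append(task)
            else:
                done_by[task] = node
    return done_by, rounds


tasks = [f"t{i}" for i in range(6)]
done, rounds = run_workspace(tasks, ["A", "B", "C"], fails_once={"t1"})
print(rounds, done["t1"])  # 3 rounds; t1 failed on B and was completed by A

_, single_rounds = run_workspace(tasks, ["A"])
print(single_rounds)       # 6 rounds on a single node
```

In this toy run, three nodes finish six tasks in three rounds despite one failure, while a single node needs six rounds without any failure. Real speedups depend on task durations and on stages that cannot be distributed.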
When to Scale Up Processing: Use Cases and Benefits
Even when processing on a single machine, you benefit from this centralized workspace:
- Pause processing at any time: When you interrupt processing, the workspace keeps all completed tasks, and you can resume from this state at any time.
- Automatic recovery: Failed tasks within your project are automatically retried.
- Continuous operation: Submit multiple jobs to the workspace, and your node will automatically check for available tasks in the order of the job queue. As long as there are tasks available in any of the projects, your node will keep processing.
The true power of Reality Studio processing in a workspace comes when multiple nodes are contributing.
Process large amounts of data in less time
By distributing the processing load across multiple processing nodes, tasks can be completed in parallel, reducing the time required to complete reconstruction jobs of varying sizes.
Examples where distributed processing may be beneficial include:
- Large-scale projects: Processing extensive data sets for country-wide or city-wide meshes can be accomplished more quickly by increasing the number of processing nodes in the workspace.
- Time-sensitive mapping: Utilizing several nodes can minimize processing duration for rapid response scenarios, such as disaster mapping.
- Acquisition delays: If data acquisition is delayed due to factors like weather, permitting issues, or equipment malfunctions, adding more processing nodes can help reduce processing time to meet project deadlines.
Manage multiple projects with ease
Although each project can be processed on a dedicated machine, utilizing one or more shared workspaces offers enhanced flexibility for managing your processing pipeline.
Key advantages of a shared workspace include:
- Maximizing Hardware Utilization: Reality Studio seamlessly retrieves tasks from queued projects whenever active projects have no pending tasks remaining. This enables consistent utilization of your processing hardware and optimizes overall workspace throughput.
- Bottleneck Reduction: Certain parts of the process cannot, by their nature, be distributed. Adding more projects to the queue provides new tasks for the workspace, helping to mitigate these potential bottlenecks.
- Adjustable Throughput: The workspace efficiently tracks outstanding tasks, while processing speed is determined by the number of participating nodes. You can reallocate processing resources between workspaces by stopping a node's contribution to the current workspace and having it contribute to another.
Whether you are managing projects for a survey company or a national mapping agency, adopting a shared workspace streamlines operations as your project volume grows.
Scale Your Workspace to Your Needs
Adding more nodes to your workspace can increase the number of projects you can process within a given time. However, it is important to note that performance improvements aren't always linear.
A reliable metric to estimate processing performance and plan how many nodes may be needed to complete a project within a given timeframe is the workspace throughput.
Workspace throughput is a metric that measures how many gigapixels (GPix) are processed per day in a given workspace. It’s particularly useful for estimating timelines and allocating resources. Your workspace throughput is influenced by several factors including hardware specifications (like disk speed and memory), network bandwidth, and project characteristics such as image resolution and overlap.
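As a rough illustration of how this metric supports planning, the following Python sketch computes throughput from a finished project and uses it to estimate timelines and node counts. All numbers and function names are hypothetical. The `efficiency` factor is an assumption added here to reflect that scaling is not always linear; measure its value on your own workspace rather than taking it from this example.

```python
from math import ceil


def workspace_throughput(gpix_processed: float, days: float) -> float:
    """Throughput in gigapixels per day, measured on a completed project."""
    return gpix_processed / days


def estimate_days(project_gpix: float, throughput_gpix_per_day: float) -> float:
    """Estimated processing time for a project at a known workspace throughput."""
    return project_gpix / throughput_gpix_per_day


def nodes_needed(project_gpix: float, deadline_days: float,
                 per_node_gpix_per_day: float, efficiency: float = 1.0) -> int:
    """Nodes required to meet a deadline.

    efficiency < 1.0 models non-linear scaling (assumed value, not a product figure).
    """
    effective_per_node = per_node_gpix_per_day * efficiency
    return ceil(project_gpix / (deadline_days * effective_per_node))


# Hypothetical reference project: 600 GPix processed in 4 days.
throughput = workspace_throughput(600, 4)
print(throughput)                       # 150.0 GPix/day
print(estimate_days(1200, throughput))  # 8.0 days for a 1200 GPix project

# Hypothetical planning case: 1200 GPix in 5 days, 50 GPix/day per node, 75% efficiency.
print(nodes_needed(1200, 5, 50, efficiency=0.75))  # 7
```

Because throughput depends on hardware, network, and project characteristics, an estimate like this is most reliable when the reference project resembles the one being planned.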
This guide will help you measure workspace throughput based on project volume and computation time:
Conclusion
Ultimately, every organisation seeks to determine the most effective workspace configuration to meet its project processing deadlines. Since jobs are divided into manageable tasks, the time required to process projects depends largely on the number of nodes assigned to the workspace. The flexibility to scale your workspace dynamically allows you to adjust resources appropriately based on the project workload.
For further details on creating a workspace, contributing resources, or managing jobs, please consult the ArcGIS Reality Studio documentation.