I am not an expert in image processing using ArcGIS, but I suspect that if all one wants to do is view the imagery as a backdrop, the hardware you describe is likely OK. However, if any kind of bulk processing is needed, then one has to look at the amount of information that needs to be handled in RAM at one time.
You mention that the data takes up a bit over 200 GB on disk. That figure could be misleading if the files are compressed. For instance, if each image is a 512 by 512 tile with 3 bands and 8-bit pixel depth, the uncompressed space needed would be somewhat more than 10 TB. Again, I don't really know in detail how ArcGIS handles imagery, but this would be my main concern. Hence, Michael Volz's recommendation to try what you want to do on a small sample first is the wiser path to follow.
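To make that arithmetic concrete, here is a rough sketch of the estimate. The function names are my own, I am taking 1 TB as 10^12 bytes, and I am ignoring headers, pyramids and overviews. The number of images is not something I know, so the sketch works backwards from the 10 TB figure instead of assuming a count.

```python
def uncompressed_bytes(n_images, width, height, bands, bits_per_sample):
    """Raw size of n_images rasters, ignoring headers, pyramids and overviews."""
    bits = n_images * width * height * bands * bits_per_sample
    return bits // 8


def images_to_reach(target_bytes, width, height, bands, bits_per_sample):
    """Smallest number of images whose raw size is at least target_bytes."""
    per_image = uncompressed_bytes(1, width, height, bands, bits_per_sample)
    return -(-target_bytes // per_image)  # ceiling division


# One 512 x 512 tile, 3 bands, 8 bits per band.
per_tile = uncompressed_bytes(1, 512, 512, 3, 8)
print(f"one tile: {per_tile} bytes")

# How many such tiles add up to 10 TB (assumed here to mean 10 * 10**12 bytes)?
tiles_for_10_tb = images_to_reach(10 * 10**12, 512, 512, 3, 8)
print(f"tiles needed for 10 TB: {tiles_for_10_tb}")
```

A single tile of that kind is 786,432 bytes uncompressed, so it takes roughly 12.7 million of them to reach 10 TB. Plug in your own image count and dimensions to see where you actually stand.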
I do have experience in simply displaying thousands of rasters that I imported into a mosaic dataset (about 8 TB in total). But whenever I tried to run any geoprocessing on the whole dataset at full resolution (say, raster to vector), my system froze, even though I have 32 GB of RAM and 8 cores. I have had to split the data into manageable subsets. Of course, I could have been doing something wrong, but my work got done.
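For what it's worth, the splitting itself can be scripted. This is only a minimal sketch in plain Python, not what I actually ran: the `batches` helper and the file names are made up for illustration, and the `print` is a placeholder for whatever geoprocessing step you would run on each subset.

```python
from itertools import islice


def batches(items, size):
    """Yield consecutive lists of at most `size` items from `items`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


# Hypothetical raster file names, standing in for a real directory listing.
raster_paths = [f"tile_{i:05d}.tif" for i in range(10)]

for number, subset in enumerate(batches(raster_paths, 4), start=1):
    # Placeholder: run the geoprocessing step on `subset` here.
    print(f"batch {number}: {len(subset)} rasters")
```

The batch size is something to find by trial on a small sample, since it depends on the tool and on how much of each raster it needs in RAM at once.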