I am going to attempt to create an ongoing discussion of how to handle large datasets. I am using a number of tools in ArcGIS (clip, intersect, selections, etc.), and I have found some of the information out there very useful; however, some of it targets specific issues and some of it is a bit old. I believe there may be newer processes that add more flexibility. So far, this is what I have found:
This thread at GIS Stack Exchange, the most comprehensive discussion I have found, provides some very good solutions for processing large datasets and also includes a link to the ArcGIS help page Performance Tips for Geoprocessing Services.
There is the ESRI Help topic Tiled Processing of Large Datasets, which covers the performance and scalability of feature overlay tools (Intersect, Union, etc.).
There is another thread, about 3 years old, from GIS Stack Exchange discussing ArcScripting, with a good explanation from a former ESRI employee of his thoughts on processing large datasets.
Another ESRI PowerPoint reinforcing some of the techniques for processing large datasets.
Another ESRI list of ways to successfully overlay large, complex datasets in geoprocessing.
For what it's worth, a month or so ago I wrote a script that attempted to avoid a crash produced when running a dataset locally from a laptop hard drive. The idea was to intersect two very large datasets (both covering roughly the state of Oregon). Memory allocation couldn't handle the size of the datasets, so I took one dataset, converted it to a layer, and incrementally changed the definition query to pull in groups of a thousand or so records at a time. This allowed me to intersect the two datasets without crashing. Mind you, the script took about 8-9 hours to run, but it was successful!
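The batching idea above can be sketched roughly as follows. This is a minimal sketch, not my actual script: the helper names (`oid_batches`, `where_clauses`) are made up for illustration, the batch size and field name are assumptions, and the arcpy calls shown in the comments (`MakeFeatureLayer_management` with a where clause, `Intersect_analysis`) are the standard tools but your workflow may differ.

```python
def oid_batches(min_oid, max_oid, batch_size):
    """Yield (start, end) OBJECTID ranges that cover [min_oid, max_oid]."""
    start = min_oid
    while start <= max_oid:
        end = min(start + batch_size - 1, max_oid)
        yield start, end
        start = end + 1

def where_clauses(min_oid, max_oid, batch_size, oid_field="OBJECTID"):
    """Build one definition-query string per batch of records."""
    return [
        "{f} >= {s} AND {f} <= {e}".format(f=oid_field, s=s, e=e)
        for s, e in oid_batches(min_oid, max_oid, batch_size)
    ]

# In the actual geoprocessing loop you would do something like
# (hypothetical names; assumes arcpy is available):
#
# for i, clause in enumerate(where_clauses(1, 250000, 1000)):
#     arcpy.MakeFeatureLayer_management(big_fc, "chunk_lyr", clause)
#     arcpy.Intersect_analysis(["chunk_lyr", other_fc],
#                              "out_chunk_{0}".format(i))
#
# and then merge/append the per-chunk outputs at the end.
```

Each chunk stays small enough to fit in memory, which is what kept my run from crashing; the trade-off is many tool invocations, hence the long total runtime.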
Please post any additional thoughts, process flows, and performance-improvement solutions you have come across. I am particularly interested in speed.