<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Performance Degradation &amp; Timeout during Large-Scale Data Extraction (8.8M records) via Hosted Feature Service REST API in ArcGIS REST APIs and Services Questions</title>
    <link>https://community.esri.com/t5/arcgis-rest-apis-and-services-questions/performance-degradation-amp-timeout-during-large/m-p/1674897#M5073</link>
    <description>&lt;P&gt;Yes, what you’re seeing is common at this scale. There is no official “2M hard limit,” but deep sequential resultOffset paging gets progressively slower and eventually fails (timeouts, “invalid query” errors) because the service must scan and skip an ever-growing number of rows before it can return each page, and ArcGIS Online also applies fair-use throttling to sustained high-volume pulls.&lt;/P&gt;&lt;P&gt;More stable patterns:&lt;/P&gt;&lt;P&gt;1. Stop using deep offsets; page by ObjectID instead. First fetch the OID range with returnIdsOnly=true (or returnCountOnly=true with orderByFields=OBJECTID), then query in chunks such as where=OBJECTID &amp;gt; x AND OBJECTID &amp;lt;= y with orderByFields=OBJECTID and resultRecordCount=2000. This avoids the “skip N rows” penalty that kills performance after the first couple of million records.&lt;/P&gt;&lt;P&gt;2. Use an asynchronous extract instead of raw queries where you can. Extract Data or Create Replica (the sync/replica workflow) is designed for bulk data movement and tends to be far more resilient than thousands of /query calls.&lt;/P&gt;&lt;P&gt;3. Parallelize carefully, and only if needed. Split the work by OID ranges (or time slices) and keep concurrency modest (e.g., 3–8 workers); too much parallelism just trips throttling sooner.&lt;/P&gt;&lt;P&gt;4. Tune for reliability. Use POST rather than GET for long parameter lists, set timeouts and retries with exponential backoff, request only the fields you need via outFields, and set returnGeometry=false unless geometry is required.&lt;/P&gt;</description>
    <pubDate>Thu, 25 Dec 2025 19:15:49 GMT</pubDate>
    <dc:creator>VenkataKondepati</dc:creator>
    <dc:date>2025-12-25T19:15:49Z</dc:date>
    <item>
      <title>Performance Degradation &amp; Timeout during Large-Scale Data Extraction (8.8M records) via Hosted Feature Service REST API</title>
      <link>https://community.esri.com/t5/arcgis-rest-apis-and-services-questions/performance-degradation-amp-timeout-during-large/m-p/1674834#M5072</link>
      <description>&lt;P&gt;Hello Community,&lt;BR /&gt;We are encountering a significant performance bottleneck when performing large-scale data extraction from a Hosted Feature Service (approx. 8.8 million records) via the ArcGIS Online REST API.&lt;BR /&gt;The Issue:&lt;BR /&gt;We are using standard pagination (resultOffset and resultRecordCount) with a batch size of 2,000.&lt;BR /&gt;0–2M records: requests are stable and performant.&lt;BR /&gt;~2M+ records: response times increase significantly.&lt;BR /&gt;3M–3.5M records: the API begins returning timeouts and "invalid query" errors, despite the query parameters being consistent with previous batches.&lt;BR /&gt;Question:&lt;BR /&gt;Is there a known platform-level "cumulative query constraint" or throttling mechanism for very high-volume extractions in a single session?&lt;BR /&gt;Are there recommended patterns at this scale? (e.g., would parallelizing queries across different ObjectID ranges, or using the Extract Data tool via geoprocessing, be more stable than raw REST pagination?)&lt;BR /&gt;Technical Details:&lt;BR /&gt;Service Type: Hosted Feature Service (ArcGIS Online)&lt;BR /&gt;Auth: OAuth 2.0 / User Token&lt;BR /&gt;Method: GET / POST via /query endpoint&lt;/P&gt;</description>
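The offset paging described in the question can be sketched as follows (a minimal illustration of the pattern in use, not the poster's actual client; the parameter names match the /query endpoint, everything else is hypothetical). It makes visible why the workload degrades: every later page carries a larger offset that the service must skip past.

```python
def offset_pages(total, batch=2000):
    """Yield /query parameter dicts for classic resultOffset paging.

    Each request asks the service to skip 'resultOffset' rows first,
    so the cost of producing a page grows with the offset.
    """
    for offset in range(0, total, batch):
        yield {
            "where": "1=1",                # match all records
            "resultOffset": offset,        # rows to skip -- grows every page
            "resultRecordCount": batch,    # page size
            "f": "json",
        }
```

For 8.8M records at a batch size of 2,000 this produces 4,400 requests, the last of which asks the service to skip 8,798,000 rows before returning anything.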
      <pubDate>Wed, 24 Dec 2025 12:16:44 GMT</pubDate>
      <guid>https://community.esri.com/t5/arcgis-rest-apis-and-services-questions/performance-degradation-amp-timeout-during-large/m-p/1674834#M5072</guid>
      <dc:creator>ÁdhavanBalakrishnan</dc:creator>
      <dc:date>2025-12-24T12:16:44Z</dc:date>
    </item>
    <item>
      <title>Re: Performance Degradation &amp; Timeout during Large-Scale Data Extraction (8.8M records) via Hosted Feature Service REST API</title>
      <link>https://community.esri.com/t5/arcgis-rest-apis-and-services-questions/performance-degradation-amp-timeout-during-large/m-p/1674897#M5073</link>
      <description>&lt;P&gt;Yes, what you’re seeing is common at this scale. There is no official “2M hard limit,” but deep sequential resultOffset paging gets progressively slower and eventually fails (timeouts, “invalid query” errors) because the service must scan and skip an ever-growing number of rows before it can return each page, and ArcGIS Online also applies fair-use throttling to sustained high-volume pulls.&lt;/P&gt;&lt;P&gt;More stable patterns:&lt;/P&gt;&lt;P&gt;1. Stop using deep offsets; page by ObjectID instead. First fetch the OID range with returnIdsOnly=true (or returnCountOnly=true with orderByFields=OBJECTID), then query in chunks such as where=OBJECTID &amp;gt; x AND OBJECTID &amp;lt;= y with orderByFields=OBJECTID and resultRecordCount=2000. This avoids the “skip N rows” penalty that kills performance after the first couple of million records.&lt;/P&gt;&lt;P&gt;2. Use an asynchronous extract instead of raw queries where you can. Extract Data or Create Replica (the sync/replica workflow) is designed for bulk data movement and tends to be far more resilient than thousands of /query calls.&lt;/P&gt;&lt;P&gt;3. Parallelize carefully, and only if needed. Split the work by OID ranges (or time slices) and keep concurrency modest (e.g., 3–8 workers); too much parallelism just trips throttling sooner.&lt;/P&gt;&lt;P&gt;4. Tune for reliability. Use POST rather than GET for long parameter lists, set timeouts and retries with exponential backoff, request only the fields you need via outFields, and set returnGeometry=false unless geometry is required.&lt;/P&gt;</description>
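The ObjectID-range paging the reply recommends can be sketched as a generator of WHERE clauses (a minimal sketch; in practice the min and max OBJECTID would come from a returnIdsOnly or outStatistics query first, and BETWEEN is used here as the inclusive-range equivalent of the half-open OBJECTID filter described above):

```python
def oid_ranges(min_oid, max_oid, chunk=2000):
    """Yield WHERE clauses that page by ObjectID instead of resultOffset.

    Each clause covers an inclusive chunk of OBJECTIDs, so every request
    is an indexed range scan and no request has to skip earlier rows.
    """
    lo = min_oid - 1
    while max_oid > lo:
        hi = min(lo + chunk, max_oid)
        yield f"OBJECTID BETWEEN {lo + 1} AND {hi}"
        lo = hi
```

Each yielded clause goes into the where parameter of a normal /query request, together with orderByFields=OBJECTID. Note that if OBJECTIDs are sparse (e.g., after heavy deletes), some chunks will return fewer than 2,000 rows, which is harmless.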
      <pubDate>Thu, 25 Dec 2025 19:15:49 GMT</pubDate>
      <guid>https://community.esri.com/t5/arcgis-rest-apis-and-services-questions/performance-degradation-amp-timeout-during-large/m-p/1674897#M5073</guid>
      <dc:creator>VenkataKondepati</dc:creator>
      <dc:date>2025-12-25T19:15:49Z</dc:date>
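The reliability tuning in the reply (POST, timeouts, retries with backoff, lean field lists) might look like this stdlib-only sketch; the URL, field names, and retry policy are assumptions to adapt, not part of the original post:

```python
import time
import urllib.parse
import urllib.request

def query_post(url, params, max_retries=5, base_delay=1.0):
    """POST form-encoded params to a /query endpoint, retrying with
    exponential backoff (1s, 2s, 4s, ...) on any failure."""
    body = urllib.parse.urlencode(params).encode()
    for attempt in range(max_retries):
        try:
            with urllib.request.urlopen(url, data=body, timeout=60) as resp:
                return resp.read()
        except Exception:
            if attempt == max_retries - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))

def lean_params(where):
    """Build a request that asks only for what is needed.

    'STATUS' is a hypothetical attribute field; substitute your own.
    """
    return {
        "where": where,
        "outFields": "OBJECTID,STATUS",  # only the fields you need
        "returnGeometry": "false",       # skip geometry unless required
        "orderByFields": "OBJECTID",     # stable ordering for range paging
        "f": "json",
    }
```

Using POST keeps long WHERE clauses out of the URL, and trimming outFields and geometry shrinks each response, which matters over thousands of requests.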
    </item>
  </channel>
</rss>

