Query Feature Layer Feature and Attachment Storage

06-15-2023 02:14 PM
mpboyle
Frequent Contributor

I'm wondering how I can use the Python API to query a feature layer and retrieve the feature and attachment storage separately.

The feature layer property 'size' returns the combined feature and attachment size. I'd like to report feature storage and attachment storage as separate figures, to better align with the reports generated by AGO.

mpboyle_1-1686863371899.png

1 Solution

Accepted Solutions
mpboyle
Frequent Contributor

The hosted feature layers I've been testing have not been edited since April 2022, so I don't believe that's an issue here.

I did some digging with a few of our hosted feature layers that have a large number of attachments. It seems the "search" method on the Attachment Manager returns at most the max record count for the service. It doesn't seem to paginate automatically when the max record count is hit.

Using the feature layer above as an example, I should expect around 3 GB of attachments to be returned.

If I use the code below, only the first 2,000 attachments are returned, which is the max record count for the hosted feature layer service. Even if I pass a value of 9,999 for the "max_records" parameter of the "search" method, only the first 2,000 attachments are returned.

 

import arcgis

itemId = '...'

# connect to portal
p = arcgis.GIS('...')

# get the item and its layers
item = p.content.get(itemId)
layers = item.layers
# running totals across all layers
attachmentSize = 0
attachmentCount = 0
# iterate over item layers
for layer in layers:
    # skip layers that do not support attachments
    if not layer.properties.hasAttachments:
        continue
    # single search call: this is the call that stops at the max record count
    attachments = layer.attachments.search(where='1=1')
    # iterate over attachments
    for a in attachments:
        attachmentCount += 1
        # SIZE is reported in bytes; treat a missing value as 0
        attachmentSize += a.get('SIZE') or 0

# convert attachment size from bytes to MB
attachmentSizeMB = attachmentSize / 1024.0 / 1024.0

print(f'Attachment Count: {attachmentCount:,}')
print(f'Attachment Size: {attachmentSize:,}')
print(f'Attachment Size (MB): {attachmentSizeMB:,.2f}')

 

mpboyle_0-1687465077363.png
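
A quick way to notice this situation is to compare the number of rows a single "search" call returned against the service's max record count. The helper below is only a sketch of that check: the function name is made up for illustration, and reading the cap from layer.properties.maxRecordCount is an assumption about where the value lives for a given layer.

```python
def hit_record_cap(returned_count: int, max_record_count: int) -> bool:
    """Return True when one search call returned at least as many rows as
    the service's max record count, which suggests the result was truncated.

    Hypothetical helper for illustration; not part of the arcgis package.
    """
    # A cap of 0 or less is treated as "unknown", so no truncation is reported.
    return max_record_count > 0 and returned_count >= max_record_count


# Example with the numbers from this thread: 2,000 rows back, cap of 2,000.
print(hit_record_cap(2000, 2000))
# A small layer that stays under the cap.
print(hit_record_cap(150, 2000))
```

In the script above this could be called as hit_record_cap(len(attachments), layer.properties.maxRecordCount), assuming that property is present on the layer. A True result does not prove rows are missing (the layer may hold exactly that many attachments), only that paging is worth trying.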

 

It seems that in order to retrieve ALL attachments you need to handle pagination yourself when using the Attachment Manager. With the script below, which uses the "max_records" and "offset" parameters, I was finally able to return all attachments and get a number similar to what the item page displays.

 

import arcgis

itemId = '...'

# connect to portal
p = arcgis.GIS('...')

# get the item and its layers
item = p.content.get(itemId)
layers = item.layers
# running totals across all layers
attachmentLoops = 0
attachmentSize = 0
attachmentCount = 0
# iterate over item layers
for layer in layers:
    # skip layers that do not support attachments
    if not layer.properties.hasAttachments:
        continue
    # page size and starting offset for this layer
    attachmentMax = 1000
    attachmentOffset = 0
    # request pages until an empty page comes back
    while True:
        attachments = layer.attachments.search(where='1=1', max_records=attachmentMax, offset=attachmentOffset)
        if not attachments:
            break
        # count only the requests that returned data, so the total stays
        # correct when the item has more than one layer
        attachmentLoops += 1
        # move the offset to the start of the next page
        attachmentOffset += attachmentMax
        # iterate over attachments
        for a in attachments:
            attachmentCount += 1
            # SIZE is reported in bytes; treat a missing value as 0
            attachmentSize += a.get('SIZE') or 0

# convert attachment size from bytes to MB
attachmentSizeMB = attachmentSize / 1024.0 / 1024.0

print(f'Number of Loops: {attachmentLoops:,}')
print(f'Attachment Count: {attachmentCount:,}')
print(f'Attachment Size: {attachmentSize:,}')
print(f'Attachment Size (MB): {attachmentSizeMB:,.2f}')

 

mpboyle_1-1687465570032.png
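
The offset loop can also be pulled out into a small reusable generator. This is a minimal sketch of the same idea, not the Attachment Manager's own behavior: iter_attachments, summarize, and fetch_page are names invented here, and the fake data source only stands in for a real layer so the example runs without a portal connection.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Tuple


def iter_attachments(fetch_page: Callable[[int, int], List[Dict]],
                     page_size: int = 1000) -> Iterator[Dict]:
    """Yield attachment records page by page until an empty page is returned.

    fetch_page(offset, page_size) is a caller-supplied function (hypothetical
    here) that returns one page of attachment dictionaries.
    """
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        if not page:
            break
        yield from page
        offset += page_size


def summarize(attachments: Iterable[Dict]) -> Tuple[int, int]:
    """Return (attachment count, total size in bytes)."""
    count = 0
    total = 0
    for a in attachments:
        count += 1
        total += a.get('SIZE') or 0  # treat a missing SIZE as 0
    return count, total


# Stand-in data source: 2,500 fake attachments of 100 bytes each.
_FAKE_ROWS = [{'SIZE': 100} for _ in range(2500)]


def fake_fetch(offset: int, page_size: int) -> List[Dict]:
    return _FAKE_ROWS[offset:offset + page_size]


count, total = summarize(iter_attachments(fake_fetch, page_size=1000))
print(f'Attachment Count: {count:,}')
print(f'Attachment Size: {total:,}')
```

Against a real layer, fetch_page could wrap the same call used in the script above, for example lambda offset, n: layer.attachments.search(where='1=1', max_records=n, offset=offset). Keeping the paging separate from the summing makes it easier to reuse the loop for other per-attachment checks.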

 


4 Replies
David_McRitchie
Esri Contributor

Hey, I saw your other comment on this post, which discusses using the Attachment Manager to find attachment sizes. I would recommend using this, but from your comment it seems the API does not return a very reliable figure.

 

I gave this a quick test with an uploaded layer whose attachment size is reported as 1.316 MB, while my notebook returns a value of 1381120 (bytes), which looks about right. Can I just confirm whether you did your testing a few hours after uploading any data? I have seen a few cases where the file size reporting in ArcGIS Online takes a few hours to display its true value after edits are made.
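
For anyone checking the arithmetic, here is a small sketch of the byte-to-megabyte conversion. The function names are made up for illustration, and whether ArcGIS Online reports sizes using 1024-based or 1000-based megabytes is an assumption to verify rather than something confirmed in this thread.

```python
def bytes_to_mib(size_bytes: int) -> float:
    """Convert bytes to megabytes using 1024 * 1024 bytes per MB."""
    return size_bytes / (1024 * 1024)


def bytes_to_mb(size_bytes: int) -> float:
    """Convert bytes to megabytes using 1,000,000 bytes per MB."""
    return size_bytes / 1_000_000


size = 1381120  # value returned by the notebook test above
print(f'{bytes_to_mib(size):.3f} MB (1024-based)')
print(f'{bytes_to_mb(size):.3f} MB (1000-based)')
```

The 1024-based result comes out at roughly 1.317, which is close to the 1.316 MB shown on the item page, so the figures in this test appear consistent with a 1024-based conversion.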

 

David

Esri UK - Technical Support Analyst

David_McRitchie
Esri Contributor

Ah, that explains it! In my test layers the attachment count was below 2,000 records.

 

Thank you for pasting your code. That should be really useful for anyone else encountering this.

 

David

Esri UK - Technical Support Analyst
MasaakiKurokawa
Occasional Contributor

I created a script to calculate the file size of hosted feature layers excluding attachments for credit management, based on the code from this post.

https://community.esri.com/t5/arcgis-online-ideas/to-manage-agol-credits-we-will-investigate-which/i...
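
The subtraction behind "size excluding attachments" can be sketched as follows. This is not the linked script, only an illustration of the idea: the function name is invented, and it assumes the combined 'size' value described in the original question and the summed attachment SIZE values are both in bytes.

```python
def feature_storage_bytes(combined_size: int, attachment_size: int) -> int:
    """Estimate feature-only storage as combined size minus attachment size.

    Hypothetical helper; both inputs are assumed to be byte counts.
    """
    # Clamp at 0 in case the two figures were captured at different times
    # and the attachment total briefly exceeds the reported combined size.
    return max(combined_size - attachment_size, 0)


# Made-up numbers for illustration only.
print(feature_storage_bytes(5_000_000, 3_000_000))
```

The result is only as accurate as the attachment total, so it should be computed from a fully paginated attachment query rather than a single "search" call.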
