<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: describe the size of a dataset in Python Questions</title>
    <link>https://community.esri.com/t5/python-questions/describe-the-size-of-a-dataset/m-p/419046#M32898</link>
    <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hi Joshua,&lt;/P&gt;&lt;P&gt;Thanks for your comprehensive and insightful answer!&lt;/P&gt;&lt;P&gt;I have to digest your code &lt;IMG src="https://community.esri.com/legacyfs/online/emoticons/wink.png" /&gt;, but I think it contains all the answers I'm looking for.&lt;/P&gt;&lt;P&gt;Lothar&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
    <pubDate>Tue, 07 Jul 2015 06:53:32 GMT</pubDate>
    <dc:creator>LotharUlferts</dc:creator>
    <dc:date>2015-07-07T06:53:32Z</dc:date>
    <item>
      <title>describe the size of a dataset</title>
      <link>https://community.esri.com/t5/python-questions/describe-the-size-of-a-dataset/m-p/419042#M32894</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;ArcGIS shows the size of each dataset, and I want to use these sizes in Python.&lt;/P&gt;&lt;P&gt;First, I'm looking for a function that describes the size of a feature class inside my file geodatabase. ArcCatalog shows it in the Contents tab.&lt;IMG alt="sizeInside_GDB_50proz.gif" class="image-1 jive-image" src="https://community.esri.com/legacyfs/online/113316_sizeInside_GDB_50proz.gif" style="height: auto; float: right;" /&gt;&lt;/P&gt;&lt;P&gt;I've searched the Describe objects but didn't find anything. Did I overlook something?&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Second, I'm looking for a function that describes which files ArcGIS considers part of a geodataset, so that I can sum the sizes of the parts.&lt;/P&gt;&lt;P&gt;For example, the&lt;EM&gt; land.shp&lt;/EM&gt; dataset consists of, e.g., SHP, SHX, and DBF files. So initially&lt;EM&gt;&lt;STRONG&gt; glob.glob('land.*')&lt;/STRONG&gt;&lt;/EM&gt; seems to be the solution. But I also have an archive&lt;EM&gt; land.zip&lt;/EM&gt; or a table land.csv inside the directory...&lt;/P&gt;&lt;P&gt;How can I treat geodata the way ArcGIS does?&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thanks, Lothar&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Thu, 25 Jun 2015 09:21:50 GMT</pubDate>
      <guid>https://community.esri.com/t5/python-questions/describe-the-size-of-a-dataset/m-p/419042#M32894</guid>
      <dc:creator>LotharUlferts</dc:creator>
      <dc:date>2015-06-25T09:21:50Z</dc:date>
    </item>
    <item>
      <title>Re: describe the size of a dataset</title>
      <link>https://community.esri.com/t5/python-questions/describe-the-size-of-a-dataset/m-p/419043#M32895</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;To your first point, file size most likely comes from the OS you're running (Windows, Linux, etc.). You can get this from the os module in Python (e.g. see &lt;A href="http://stackoverflow.com/questions/2104080/how-to-check-file-size-in-python"&gt;here&lt;/A&gt;).&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;For your second point, there are probably several ways, but for your example an easy way would be to limit your search to just the possible extensions for files making up a shapefile. This would probably not work for a feature class in a file geodatabase, service, etc., as those are more complicated structures.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Fri, 26 Jun 2015 16:33:48 GMT</pubDate>
      <guid>https://community.esri.com/t5/python-questions/describe-the-size-of-a-dataset/m-p/419043#M32895</guid>
      <dc:creator>Zeke</dc:creator>
      <dc:date>2015-06-26T16:33:48Z</dc:date>
    </item>
    <item>
      <title>Re: describe the size of a dataset</title>
      <link>https://community.esri.com/t5/python-questions/describe-the-size-of-a-dataset/m-p/419044#M32896</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;I haven't come across any direct way to get the size of a feature class within a FGDB using Python. The best &lt;A _jive_internal="true" href="https://community.esri.com/thread/74409"&gt;workaround I found&lt;/A&gt; was to create a new FGDB, copy over each feature class one by one, and monitor how the size of the entire FGDB changes.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Fri, 26 Jun 2015 16:40:55 GMT</pubDate>
      <guid>https://community.esri.com/t5/python-questions/describe-the-size-of-a-dataset/m-p/419044#M32896</guid>
      <dc:creator>DarrenWiens2</dc:creator>
      <dc:date>2015-06-26T16:40:55Z</dc:date>
    </item>
    <item>
      <title>Re: describe the size of a dataset</title>
      <link>https://community.esri.com/t5/python-questions/describe-the-size-of-a-dataset/m-p/419045#M32897</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;If you are willing to install/import comtypes, the following code can be used to give dataset sizes and timestamps for shapefiles in a folder or datasets in a file geodatabase:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;PRE class="lia-code-sample line-numbers language-none"&gt;def GetDatasetFileStatsFromWorkspace(in_workspace):
&amp;nbsp;&amp;nbsp;&amp;nbsp; from comtypes.client import CreateObject, GetModule
&amp;nbsp;&amp;nbsp;&amp;nbsp; import os
&amp;nbsp;&amp;nbsp;&amp;nbsp; import arcpy
&amp;nbsp;&amp;nbsp;&amp;nbsp; 
&amp;nbsp;&amp;nbsp;&amp;nbsp; def _GetDatasetFileStats(pDataset):
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; from datetime import datetime, timedelta
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; from dateutil import tz
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; DFS = {}
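&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; # Unix epoch in UTC; the StatTime values below are added to it as seconds
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; d = datetime(1970, 1, 1, tzinfo=tz.tzutc())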
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; pDFS = pDataset.QueryInterface(esriGeoDatabase.IDatasetFileStat2)
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; DFS['StatSize'] = pDFS.StatSize
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; DFS['StatTime'] = {
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 'LastAccess': d + timedelta(0, pDFS.StatTime(0)),
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 'Creation': d + timedelta(0, pDFS.StatTime(1)),
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 'LastModification': d + timedelta(0, pDFS.StatTime(2))
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; }
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; return DFS
&amp;nbsp;&amp;nbsp;&amp;nbsp; 
&amp;nbsp;&amp;nbsp;&amp;nbsp; assert os.path.isdir(in_workspace), "Workspace is not folder or file geodatabase"
&amp;nbsp;&amp;nbsp;&amp;nbsp; 
&amp;nbsp;&amp;nbsp;&amp;nbsp; comDirectory = os.path.join(arcpy.GetInstallInfo()['InstallDir'], 'com')
&amp;nbsp;&amp;nbsp;&amp;nbsp; esriDataSourcesGDB = GetModule(os.path.join(comDirectory, 'esriDataSourcesGDB.olb'))
&amp;nbsp;&amp;nbsp;&amp;nbsp; esriDataSourcesFile = GetModule(os.path.join(comDirectory, 'esriDataSourcesFile.olb'))
&amp;nbsp;&amp;nbsp;&amp;nbsp; esriGeoDatabase = GetModule(os.path.join(comDirectory, 'esriGeodatabase.olb'))
&amp;nbsp;&amp;nbsp;&amp;nbsp; 
&amp;nbsp;&amp;nbsp;&amp;nbsp; if in_workspace.endswith('.gdb'):
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; pWSF = CreateObject(esriDataSourcesGDB.FileGDBWorkspaceFactory,
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; interface=esriGeoDatabase.IWorkspaceFactory)
&amp;nbsp;&amp;nbsp;&amp;nbsp; else:
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; pWSF = CreateObject(esriDataSourcesFile.ShapefileWorkspaceFactory,
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; interface=esriGeoDatabase.IWorkspaceFactory)
&amp;nbsp;&amp;nbsp;&amp;nbsp; pWS = pWSF.OpenFromFile(in_workspace, 0)
&amp;nbsp;&amp;nbsp;&amp;nbsp; 
&amp;nbsp;&amp;nbsp;&amp;nbsp; pEnumDS = pWS.Datasets(1)
&amp;nbsp;&amp;nbsp;&amp;nbsp; pDS = pEnumDS.Next()
&amp;nbsp;&amp;nbsp;&amp;nbsp; DS = {}
&amp;nbsp;&amp;nbsp;&amp;nbsp; while pDS:
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; if pDS.Type == esriGeoDatabase.esriDTFeatureDataset:
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; pEnumSS = pDS.Subsets
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; pSS = pEnumSS.Next()
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; while pSS:
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; Name = os.path.join(pDS.Name, pSS.Name)
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; DS[Name] = _GetDatasetFileStats(pSS)
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; pSS = pEnumSS.Next()
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; else:
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; DS[pDS.Name] = _GetDatasetFileStats(pDS)
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; pDS = pEnumDS.Next()
&amp;nbsp;&amp;nbsp;&amp;nbsp; return DS if DS else None&lt;/PRE&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;A few comments:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;The dataset size and timestamps come from the &lt;A href="http://resources.arcgis.com/en/help/arcobjects-cpp/componenthelp/index.html#//000s0000015z000000" rel="nofollow noopener noreferrer" target="_blank"&gt;IDatasetFileStat2&lt;/A&gt; interface of the &lt;A href="http://resources.arcgis.com/en/help/arcobjects-cpp/componenthelp/index.html#/ESRI_ArcGIS_GeoDatabase/000s000000nr000000/" rel="nofollow noopener noreferrer" target="_blank"&gt;Geodatabase&lt;/A&gt; library.&lt;UL&gt;&lt;LI&gt;Dataset size is in bytes (original format).&lt;/LI&gt;&lt;LI&gt;Dataset timestamps are Python datetime objects in UTC (converted from the original values to a Python type).&lt;/LI&gt;&lt;/UL&gt;&lt;/LI&gt;&lt;LI&gt;The function returns a dictionary of properties for all shapefiles in a folder or datasets in a file geodatabase, or None if no datasets are found.&lt;UL&gt;&lt;LI&gt;The dictionary keys are dataset names.&lt;UL&gt;&lt;LI&gt;Feature datasets are expanded, and the feature dataset name is prefixed to the name of each dataset it contains.&lt;/LI&gt;&lt;/UL&gt;&lt;/LI&gt;&lt;LI&gt;The timestamps are stored in a nested dictionary keyed by the type of timestamp:&amp;nbsp; Creation, LastModification, LastAccess.&lt;/LI&gt;&lt;/UL&gt;&lt;/LI&gt;&lt;LI&gt;Error catching is limited.&amp;nbsp; The code is a demonstration, not production code.&lt;UL&gt;&lt;LI&gt;One assertion statement is included to catch the most likely error, an invalid workspace type being passed, since COM errors can be cryptic, at least in this case.&lt;/LI&gt;&lt;/UL&gt;&lt;/LI&gt;&lt;LI&gt;If you do install comtypes and haven't worked with it before, see the following StackExchange post about a configuration change that is necessary to make it work with ArcGIS:&amp;nbsp; &lt;A href="http://gis.stackexchange.com/questions/37672/arcobjects-comtypes-at-10-1-and-newer" rel="nofollow noopener noreferrer" 
target="_blank"&gt;ArcObjects + comtypes at 10.1 and newer&lt;/A&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Sat, 11 Dec 2021 18:57:28 GMT</pubDate>
      <guid>https://community.esri.com/t5/python-questions/describe-the-size-of-a-dataset/m-p/419045#M32897</guid>
      <dc:creator>JoshuaBixby</dc:creator>
      <dc:date>2021-12-11T18:57:28Z</dc:date>
    </item>
    <item>
      <title>Re: describe the size of a dataset</title>
      <link>https://community.esri.com/t5/python-questions/describe-the-size-of-a-dataset/m-p/419046#M32898</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hi Joshua,&lt;/P&gt;&lt;P&gt;Thanks for your comprehensive and insightful answer!&lt;/P&gt;&lt;P&gt;I have to digest your code &lt;IMG src="https://community.esri.com/legacyfs/online/emoticons/wink.png" /&gt;, but I think it contains all the answers I'm looking for.&lt;/P&gt;&lt;P&gt;Lothar&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 07 Jul 2015 06:53:32 GMT</pubDate>
      <guid>https://community.esri.com/t5/python-questions/describe-the-size-of-a-dataset/m-p/419046#M32898</guid>
      <dc:creator>LotharUlferts</dc:creator>
      <dc:date>2015-07-07T06:53:32Z</dc:date>
    </item>
  </channel>
</rss>

