<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How use all sample data for just training and not validation? in ArcGIS Image Analyst Questions</title>
    <link>https://community.esri.com/t5/arcgis-image-analyst-questions/how-use-all-sample-data-for-just-training-and-not/m-p/1046223#M302</link>
    <description>&lt;P&gt;Is it the prepare_data step that gives the error? It may be an issue with the training data itself.&lt;/P&gt;&lt;P&gt;I have run a SingleShotDetector with a 0% validation split and it works OK too. Note: the show_results and average_precision functions won't work and will give index errors. When you call the save function, you will have to pass compute_metrics=False to save the model, or it will also give an index error.&lt;/P&gt;</description>
    <pubDate>Tue, 13 Apr 2021 04:01:46 GMT</pubDate>
    <dc:creator>Tim_McGinnes</dc:creator>
    <dc:date>2021-04-13T04:01:46Z</dc:date>
    <item>
      <title>How use all sample data for just training and not validation?</title>
      <link>https://community.esri.com/t5/arcgis-image-analyst-questions/how-use-all-sample-data-for-just-training-and-not/m-p/1046172#M299</link>
      <description>&lt;P&gt;Hello, is there a way to use all of the sample data for training and none for validation? The val_split_pct parameter, which sets the percentage of training data to keep for validation, doesn't accept a value of 0.&lt;/P&gt;</description>
      <pubDate>Wed, 27 Jul 2022 18:35:30 GMT</pubDate>
      <guid>https://community.esri.com/t5/arcgis-image-analyst-questions/how-use-all-sample-data-for-just-training-and-not/m-p/1046172#M299</guid>
      <dc:creator>MaryamBarzegar</dc:creator>
      <dc:date>2022-07-27T18:35:30Z</dc:date>
    </item>
    <item>
      <title>Re: How use all sample data for just training and not validation?</title>
      <link>https://community.esri.com/t5/arcgis-image-analyst-questions/how-use-all-sample-data-for-just-training-and-not/m-p/1046212#M300</link>
      <description>&lt;P&gt;It looks like this works when using a Jupyter notebook. Setting val_split_pct to 0.0 doesn't give any errors and trains the model OK. As expected, the validation loss cannot be calculated.&lt;/P&gt;&lt;P&gt;However, when using the Train Deep Learning Model tool in Pro, it seems to use some of the training data for validation despite a split of zero. I think it may fall back to 10%, but I am not sure.&lt;/P&gt;&lt;P&gt;The above is true for my MaskRCNN model, but may be different for other models. What model are you using, and is it giving you any errors when you try a zero percent split?&lt;/P&gt;</description>
      <pubDate>Tue, 13 Apr 2021 03:03:22 GMT</pubDate>
      <guid>https://community.esri.com/t5/arcgis-image-analyst-questions/how-use-all-sample-data-for-just-training-and-not/m-p/1046212#M300</guid>
      <dc:creator>Tim_McGinnes</dc:creator>
      <dc:date>2021-04-13T03:03:22Z</dc:date>
    </item>
    <item>
      <title>Re: How use all sample data for just training and not validation?</title>
      <link>https://community.esri.com/t5/arcgis-image-analyst-questions/how-use-all-sample-data-for-just-training-and-not/m-p/1046219#M301</link>
      <description>&lt;P&gt;Hi Tim, I'm using the ChangeDetector model and it gives me the error below:&lt;/P&gt;&lt;P&gt;"ename": "IndexError",&lt;BR /&gt;"evalue": "index 0 is out of bounds for axis 0 with size 0",&lt;/P&gt;&lt;P&gt;data = prepare_data(output_path,&lt;BR /&gt;chip_size=256,&lt;BR /&gt;val_split_pct=0.0,&lt;BR /&gt;dataset_type='ChangeDetection',&lt;BR /&gt;batch_size=4&lt;BR /&gt;)&lt;/P&gt;</description>
      <pubDate>Tue, 13 Apr 2021 03:22:31 GMT</pubDate>
      <guid>https://community.esri.com/t5/arcgis-image-analyst-questions/how-use-all-sample-data-for-just-training-and-not/m-p/1046219#M301</guid>
      <dc:creator>MaryamBarzegar</dc:creator>
      <dc:date>2021-04-13T03:22:31Z</dc:date>
    </item>
    <item>
      <title>Re: How use all sample data for just training and not validation?</title>
      <link>https://community.esri.com/t5/arcgis-image-analyst-questions/how-use-all-sample-data-for-just-training-and-not/m-p/1046223#M302</link>
      <description>&lt;P&gt;Is it the prepare_data step that gives the error? It may be an issue with the training data itself.&lt;/P&gt;&lt;P&gt;I have run a SingleShotDetector with a 0% validation split and it works OK too. Note: the show_results and average_precision functions won't work and will give index errors. When you call the save function, you will have to pass compute_metrics=False to save the model, or it will also give an index error.&lt;/P&gt;</description>
      <pubDate>Tue, 13 Apr 2021 04:01:46 GMT</pubDate>
      <guid>https://community.esri.com/t5/arcgis-image-analyst-questions/how-use-all-sample-data-for-just-training-and-not/m-p/1046223#M302</guid>
      <dc:creator>Tim_McGinnes</dc:creator>
      <dc:date>2021-04-13T04:01:46Z</dc:date>
    </item>
    <item>
      <title>Re: How use all sample data for just training and not validation?</title>
      <link>https://community.esri.com/t5/arcgis-image-analyst-questions/how-use-all-sample-data-for-just-training-and-not/m-p/1046230#M303</link>
      <description>&lt;P&gt;Yes, prepare_data gives the error. This model is a bit different from the others: its training dataset has three folders (images_before, images_after, labels), whereas for the MultiTaskRoadExtractor model, for instance, I had two folders (images and labels). I don't think my training dataset has a problem, since it works with other values of the val_split_pct parameter. The whole error message:&lt;/P&gt;&lt;PRE&gt;&lt;SPAN class="ansi-red-fg"&gt;IndexError&lt;/SPAN&gt;                                Traceback (most recent call last)
In  &lt;SPAN class="ansi-blue-fg"&gt;[1]&lt;/SPAN&gt;:
Line &lt;SPAN class="ansi-blue-fg"&gt;7&lt;/SPAN&gt;:     batch_size=&lt;SPAN class="ansi-blue-fg"&gt;4&lt;/SPAN&gt;

File &lt;SPAN class="ansi-blue-fg"&gt;C:\Users\barzegarm\AppData\Local\ESRI\conda\envs\arcgispro-py3-clone1\lib\site-packages\arcgis\learn\_data.py&lt;/SPAN&gt;, in &lt;SPAN class="ansi-green-fg"&gt;prepare_data&lt;/SPAN&gt;:
Line &lt;SPAN class="ansi-blue-fg"&gt;1368&lt;/SPAN&gt;:  **kwargs)

File &lt;SPAN class="ansi-blue-fg"&gt;C:\Users\barzegarm\AppData\Local\ESRI\conda\envs\arcgispro-py3-clone1\lib\site-packages\arcgis\learn\_utils\change_detection_data.py&lt;/SPAN&gt;, in &lt;SPAN class="ansi-green-fg"&gt;prepare_change_detection_data&lt;/SPAN&gt;:
Line &lt;SPAN class="ansi-blue-fg"&gt;695&lt;/SPAN&gt;:   imagery_type=imagery_type

File &lt;SPAN class="ansi-blue-fg"&gt;C:\Users\barzegarm\AppData\Local\ESRI\conda\envs\arcgispro-py3-clone1\lib\site-packages\arcgis\learn\_utils\change_detection_data.py&lt;/SPAN&gt;, in &lt;SPAN class="ansi-green-fg"&gt;create_train_val_sets&lt;/SPAN&gt;:
Line &lt;SPAN class="ansi-blue-fg"&gt;610&lt;/SPAN&gt;:   imagery_type=imagery_type

File &lt;SPAN class="ansi-blue-fg"&gt;C:\Users\barzegarm\AppData\Local\ESRI\conda\envs\arcgispro-py3-clone1\lib\site-packages\arcgis\learn\_utils\change_detection_data.py&lt;/SPAN&gt;, in &lt;SPAN class="ansi-green-fg"&gt;__init__&lt;/SPAN&gt;:
Line &lt;SPAN class="ansi-blue-fg"&gt;322&lt;/SPAN&gt;:   &lt;SPAN class="ansi-cyan-fg"&gt;self&lt;/SPAN&gt;.n_c = &lt;SPAN class="ansi-cyan-fg"&gt;self&lt;/SPAN&gt;.x[&lt;SPAN class="ansi-blue-fg"&gt;0&lt;/SPAN&gt;].data.shape[&lt;SPAN class="ansi-blue-fg"&gt;0&lt;/SPAN&gt;]

File &lt;SPAN class="ansi-blue-fg"&gt;C:\Users\barzegarm\AppData\Local\ESRI\conda\envs\arcgispro-py3-clone1\lib\site-packages\fastai\data_block.py&lt;/SPAN&gt;, in &lt;SPAN class="ansi-green-fg"&gt;__getitem__&lt;/SPAN&gt;:
Line &lt;SPAN class="ansi-blue-fg"&gt;120&lt;/SPAN&gt;:   &lt;SPAN class="ansi-blue-fg"&gt;if&lt;/SPAN&gt; &lt;SPAN class="ansi-cyan-fg"&gt;isinstance&lt;/SPAN&gt;(idxs, Integral): &lt;SPAN class="ansi-blue-fg"&gt;return&lt;/SPAN&gt; &lt;SPAN class="ansi-cyan-fg"&gt;self&lt;/SPAN&gt;.get(idxs)

File &lt;SPAN class="ansi-blue-fg"&gt;C:\Users\barzegarm\AppData\Local\ESRI\conda\envs\arcgispro-py3-clone1\lib\site-packages\fastai\vision\data.py&lt;/SPAN&gt;, in &lt;SPAN class="ansi-green-fg"&gt;get&lt;/SPAN&gt;:
Line &lt;SPAN class="ansi-blue-fg"&gt;270&lt;/SPAN&gt;:   fn = &lt;SPAN class="ansi-cyan-fg"&gt;super&lt;/SPAN&gt;().get(i)

File &lt;SPAN class="ansi-blue-fg"&gt;C:\Users\barzegarm\AppData\Local\ESRI\conda\envs\arcgispro-py3-clone1\lib\site-packages\fastai\data_block.py&lt;/SPAN&gt;, in &lt;SPAN class="ansi-green-fg"&gt;get&lt;/SPAN&gt;:
Line &lt;SPAN class="ansi-blue-fg"&gt;75&lt;/SPAN&gt;:    &lt;SPAN class="ansi-blue-fg"&gt;return&lt;/SPAN&gt; &lt;SPAN class="ansi-cyan-fg"&gt;self&lt;/SPAN&gt;.items[i]

&lt;SPAN class="ansi-red-fg"&gt;IndexError&lt;/SPAN&gt;: index 0 is out of bounds for axis 0 with size 0
&lt;SPAN class="ansi-red-fg"&gt;---------------------------------------------------------------------------&lt;/SPAN&gt;
&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 13 Apr 2021 04:25:44 GMT</pubDate>
      <guid>https://community.esri.com/t5/arcgis-image-analyst-questions/how-use-all-sample-data-for-just-training-and-not/m-p/1046230#M303</guid>
      <dc:creator>MaryamBarzegar</dc:creator>
      <dc:date>2021-04-13T04:25:44Z</dc:date>
    </item>
    <item>
      <title>Re: How use all sample data for just training and not validation?</title>
      <link>https://community.esri.com/t5/arcgis-image-analyst-questions/how-use-all-sample-data-for-just-training-and-not/m-p/1046231#M304</link>
      <description>&lt;P&gt;I tried the MultiTaskRoadExtractor model and it doesn't give any error, so the problem is specific to the ChangeDetector model.&lt;/P&gt;</description>
      <pubDate>Tue, 13 Apr 2021 04:40:12 GMT</pubDate>
      <guid>https://community.esri.com/t5/arcgis-image-analyst-questions/how-use-all-sample-data-for-just-training-and-not/m-p/1046231#M304</guid>
      <dc:creator>MaryamBarzegar</dc:creator>
      <dc:date>2021-04-13T04:40:12Z</dc:date>
    </item>
    <item>
      <title>Re: How use all sample data for just training and not validation?</title>
      <link>https://community.esri.com/t5/arcgis-image-analyst-questions/how-use-all-sample-data-for-just-training-and-not/m-p/1046236#M305</link>
      <description>&lt;P&gt;Apologies - somehow I read ObjectDetection instead of ChangeDetection. Yes, I think there is something in the underlying code which is breaking when trying this.&lt;/P&gt;&lt;P&gt;I can't find it documented anywhere, but there appears to be a split_type parameter to choose if your training\validation split is random or defined by folder.&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;val_split_pct (float): percentage of data to split in validation if the split_type is "random"

split_type (str, optional): If split_type='manual' will use train val folders. Defaults to 'random'.&lt;/LI-CODE&gt;&lt;P&gt;And the code shows how it should be structured:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;if split_type == 'folder':
    if (path / 'train').exists() and (path / 'val').exists():
        folder_check(path / 'train')
        folder_check(path / 'val')
        train_images_before = get_files(path / 'train' / 'images_before', extensions=image_extensions)
        train_images_after = get_files(path / 'train' / 'images_after', extensions=image_extensions)
        train_labels = get_files(path / 'train' / 'labels', extensions=image_extensions)
        val_images_before = get_files(path / 'val' / 'images_before', extensions=image_extensions)
        val_images_after = get_files(path / 'val' / 'images_after', extensions=image_extensions)
        val_labels = get_files(path / 'val' / 'labels', extensions=image_extensions)&lt;/LI-CODE&gt;&lt;P&gt;You could maybe try putting your existing image\label folders under a 'train' folder and creating some empty image\label folders under the 'val' folder. In the prepare_data function add a parameter split_type='manual' and see if that makes any difference (it could also be split='manual', I'm not sure)?&lt;/P&gt;</description>
      <pubDate>Tue, 13 Apr 2021 05:03:11 GMT</pubDate>
      <guid>https://community.esri.com/t5/arcgis-image-analyst-questions/how-use-all-sample-data-for-just-training-and-not/m-p/1046236#M305</guid>
      <dc:creator>Tim_McGinnes</dc:creator>
      <dc:date>2021-04-13T05:03:11Z</dc:date>
    </item>
    <item>
      <title>Re: How use all sample data for just training and not validation?</title>
      <link>https://community.esri.com/t5/arcgis-image-analyst-questions/how-use-all-sample-data-for-just-training-and-not/m-p/1046723#M306</link>
      <description>&lt;P&gt;I think the split_type parameter is only defined in the source code of the ChangeDetector model and can't be set through prepare_data. I created two folders, train and val, and in each one I created three folders: images_before, images_after and labels. Then I tried split_type='manual' and split_type='folder', and neither worked:&lt;/P&gt;&lt;PRE&gt;&lt;SPAN class="ansi-red-fg"&gt;Exception&lt;/SPAN&gt;                                 Traceback (most recent call last)
In  &lt;SPAN class="ansi-blue-fg"&gt;[1]&lt;/SPAN&gt;:
Line &lt;SPAN class="ansi-blue-fg"&gt;7&lt;/SPAN&gt;:     batch_size=&lt;SPAN class="ansi-blue-fg"&gt;4&lt;/SPAN&gt;

File &lt;SPAN class="ansi-blue-fg"&gt;C:\Users\barzegarm\AppData\Local\ESRI\conda\envs\arcgispro-py3-clone1\lib\site-packages\arcgis\learn\_data.py&lt;/SPAN&gt;, in &lt;SPAN class="ansi-green-fg"&gt;prepare_data&lt;/SPAN&gt;:
Line &lt;SPAN class="ansi-blue-fg"&gt;902&lt;/SPAN&gt;:   folder_check(path)

File &lt;SPAN class="ansi-blue-fg"&gt;C:\Users\barzegarm\AppData\Local\ESRI\conda\envs\arcgispro-py3-clone1\lib\site-packages\arcgis\learn\_utils\change_detection_data.py&lt;/SPAN&gt;, in &lt;SPAN class="ansi-green-fg"&gt;folder_check&lt;/SPAN&gt;:
Line &lt;SPAN class="ansi-blue-fg"&gt;472&lt;/SPAN&gt;:   &lt;SPAN class="ansi-blue-fg"&gt;raise&lt;/SPAN&gt; &lt;SPAN class="ansi-cyan-fg"&gt;Exception&lt;/SPAN&gt;(&lt;SPAN class="ansi-yellow-fg"&gt;f&lt;/SPAN&gt;&lt;SPAN class="ansi-yellow-fg"&gt;"&lt;/SPAN&gt;&lt;SPAN class="ansi-yellow-fg"&gt;Three folders must be present in the &lt;/SPAN&gt;&lt;SPAN class="ansi-yellow-fg"&gt;{&lt;/SPAN&gt;path.name&lt;SPAN class="ansi-yellow-fg"&gt;}&lt;/SPAN&gt;&lt;SPAN class="ansi-yellow-fg"&gt;"&lt;/SPAN&gt;

&lt;SPAN class="ansi-red-fg"&gt;Exception&lt;/SPAN&gt;: Three folders must be present in the Training13_novalidation directory namely 'images_before', 'images_after' and 'labels'.&lt;/PRE&gt;&lt;P&gt;prepare_data has no split_type parameter:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="MaryamBarzegar_0-1618364415294.png" style="width: 400px;"&gt;&lt;img src="https://community.esri.com/t5/image/serverpage/image-id/10790i5E4B348A4B958A84/image-size/medium?v=v2&amp;amp;px=400" role="button" title="MaryamBarzegar_0-1618364415294.png" alt="MaryamBarzegar_0-1618364415294.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="MaryamBarzegar_2-1618364636558.png" style="width: 400px;"&gt;&lt;img src="https://community.esri.com/t5/image/serverpage/image-id/10792i247A3B7BE8E371C5/image-size/medium?v=v2&amp;amp;px=400" role="button" title="MaryamBarzegar_2-1618364636558.png" alt="MaryamBarzegar_2-1618364636558.png" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 14 Apr 2021 01:45:53 GMT</pubDate>
      <guid>https://community.esri.com/t5/arcgis-image-analyst-questions/how-use-all-sample-data-for-just-training-and-not/m-p/1046723#M306</guid>
      <dc:creator>MaryamBarzegar</dc:creator>
      <dc:date>2021-04-14T01:45:53Z</dc:date>
    </item>
    <item>
      <title>Re: How use all sample data for just training and not validation?</title>
      <link>https://community.esri.com/t5/arcgis-image-analyst-questions/how-use-all-sample-data-for-just-training-and-not/m-p/1046751#M307</link>
      <description>&lt;P&gt;Yes, I think we are at a dead end here - from reviewing the code it looks like the only method that works is the single set of folders (probably why the split_type parameter is not documented anywhere).&lt;/P&gt;&lt;P&gt;For ChangeDetection I think you will just have to set the val_split_pct parameter to the lowest value you can without it giving an error. For example, using the change detection sample notebook, the supplied data has 215 images. I set val_split_pct=0.005 (which gives 1 validation image) and the training process worked ok.&lt;/P&gt;</description>
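      <content:encoded><![CDATA[
A minimal sketch of picking the lowest usable val_split_pct for a given dataset size. It assumes the number of validation images is int(val_split_pct * n_images), which is consistent with 0.005 of 215 images giving 1 validation image but has not been confirmed against the arcgis.learn source. min_val_split_pct is a hypothetical helper, not part of arcgis.learn.

```python
import math


def min_val_split_pct(n_images: int, n_val: int = 1, decimals: int = 4) -> float:
    """Return the smallest fraction, to `decimals` places, that keeps n_val images.

    Assumption: the validation count is int(val_split_pct * n_images).
    """
    if n_val < 1 or n_images <= n_val:
        raise ValueError("need 1 <= n_val < n_images")
    scale = 10 ** decimals
    # Round the exact ratio up to the chosen number of decimal places.
    k = math.ceil(n_val / n_images * scale)
    # Guard against floating-point products that land just below n_val.
    while int(k / scale * n_images) < n_val:
        k += 1
    return k / scale


if __name__ == "__main__":
    # 215 is the image count of the sample dataset mentioned in this post.
    pct = min_val_split_pct(215)
    print(pct, int(pct * 215))
```

If the assumption holds, the returned value could be passed as val_split_pct to prepare_data so that only a single image is held out. The image is still chosen at random.
]]></content:encoded>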
      <pubDate>Wed, 14 Apr 2021 06:44:32 GMT</pubDate>
      <guid>https://community.esri.com/t5/arcgis-image-analyst-questions/how-use-all-sample-data-for-just-training-and-not/m-p/1046751#M307</guid>
      <dc:creator>Tim_McGinnes</dc:creator>
      <dc:date>2021-04-14T06:44:32Z</dc:date>
    </item>
    <item>
      <title>Re: How use all sample data for just training and not validation?</title>
      <link>https://community.esri.com/t5/arcgis-image-analyst-questions/how-use-all-sample-data-for-just-training-and-not/m-p/1046756#M308</link>
      <description>&lt;P&gt;Thank you, Tim. Yes, I did the same and set val_split_pct=0.01, but it would be helpful if we could control this so that the validation images weren't selected at random.&lt;/P&gt;</description>
      <pubDate>Wed, 14 Apr 2021 07:03:42 GMT</pubDate>
      <guid>https://community.esri.com/t5/arcgis-image-analyst-questions/how-use-all-sample-data-for-just-training-and-not/m-p/1046756#M308</guid>
      <dc:creator>MaryamBarzegar</dc:creator>
      <dc:date>2021-04-14T07:03:42Z</dc:date>
    </item>
  </channel>
</rss>

