<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Python Random.Sample does not seem that random in Python Questions</title>
    <link>https://community.esri.com/t5/python-questions/python-random-sample-does-not-seem-that-random/m-p/175999#M13531</link>
    <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;I could add two loosely related things:&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;1. If you have parallel processes, each using the random module, you can scramble their generator "states" so that they are all out of sequence with one another (which is a good thing) by using the .jumpahead() method. For example:&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;PRE class="lia-code-sample line-numbers language-none"&gt;import random
import time
timeInt = int(str(int(time.time() * 10000))[-5:])  # last five digits of the current time in 1/10000 s
random.jumpahead(timeInt)  # Python 2 only; jumpahead() was removed in Python 3&lt;/PRE&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;2. With FGDBs, you can send absurdly long SQL strings, such as "OBJECTID in (1,2,3,4,5,.....)". I sent one that was more than a million characters long and it actually worked! I'm curious what the character limit is... There must be one, right? I have been frustrated by Oracle SDE having a SQL statement limit of ~1400 characters, which is pretty lame.&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
    <pubDate>Sat, 11 Dec 2021 09:04:23 GMT</pubDate>
    <dc:creator>ChrisSnyder</dc:creator>
    <dc:date>2021-12-11T09:04:23Z</dc:date>
    <item>
      <title>Python Random.Sample does not seem that random</title>
      <link>https://community.esri.com/t5/python-questions/python-random-sample-does-not-seem-that-random/m-p/175996#M13528</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;Hi, please see the attached GIF image.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;I am trying to make a little tool that selects features at random from a feature class by randomly selecting FIDs. What it does is:&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;1. Creates a Python list of feature FIDs.&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;2. Uses random.sample to draw from that list as many FIDs as the user requests.&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;3. Converts the drawn FIDs to a SQL statement, so that a selection can be made and stored in a new feature class.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;My code runs without trouble, but the output does not seem to be that random, as the attached GIF shows.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;I have tried random.shuffle and other variations but get similar output.&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;Does anyone know how to get a better result, perhaps with a different method?&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;Right now I am working on some test data; my population is about 300 with a sample of 30.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;Here is the part of my code that lists the FIDs and makes a selection:&lt;/SPAN&gt;&lt;BR /&gt;&lt;PRE class="plain" name="code"&gt;# collect the FID of every feature in the input
fidList = []
rows = arcpy.SearchCursor(InputToSample)
for row in rows:
    fidVal = row.getValue(FieldName)
    fidList.append(fidVal)
# make a random selection
rndList = random.sample(fidList, numSample)&lt;/PRE&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;Thanks&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;David&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;[ATTACH=CONFIG]11785[/ATTACH]&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 08 Feb 2012 02:29:37 GMT</pubDate>
      <guid>https://community.esri.com/t5/python-questions/python-random-sample-does-not-seem-that-random/m-p/175996#M13528</guid>
      <dc:creator>DavidBirkigt</dc:creator>
      <dc:date>2012-02-08T02:29:37Z</dc:date>
    </item>
    <item>
      <title>Re: Python Random.Sample does not seem that random</title>
      <link>https://community.esri.com/t5/python-questions/python-random-sample-does-not-seem-that-random/m-p/175997#M13529</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;Alright,&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;I have found out why my random sampling method does not work: random.sample considers the order. That is, if you have a population of 1-10 and select 3 elements, for example 4, 5, 6, that selection is considered distinct from 6, 5, 4 because the elements were drawn in a different order. I will post corrected code when I have a better sampling method.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;David&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 08 Feb 2012 15:27:49 GMT</pubDate>
      <guid>https://community.esri.com/t5/python-questions/python-random-sample-does-not-seem-that-random/m-p/175997#M13529</guid>
      <dc:creator>DavidBirkigt</dc:creator>
      <dc:date>2012-02-08T15:27:49Z</dc:date>
    </item>
    <item>
      <title>Re: Python Random.Sample does not seem that random</title>
      <link>https://community.esri.com/t5/python-questions/python-random-sample-does-not-seem-that-random/m-p/175998#M13530</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;Somewhat off-topic, but I thought I might mention: I wrote a very similar tool (extract a random sample of features) a little while ago, and found that it broke when the sample size was greater than about 10,000. I was constructing the SQL statement string to pass to the Select tool along the lines of &lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;PRE __default_attr="plain" __jive_macro_name="code" class="jive_macro_code jive_text_macro"&gt;" OR ".join(["{0} = {1}".format(oid_fname, x) for x in random_fids])&lt;/PRE&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;This string of OR statements was falling over. I had to change it to use the IN statement for it to work, i.e.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;PRE __default_attr="plain" __jive_macro_name="code" class="jive_macro_code jive_text_macro"&gt;"{0} IN ({1})".format(oid_fname, ",".join(str(x) for x in random_fids))&lt;/PRE&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;(Note I haven't double-checked that syntax, but hopefully you get the idea).&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;In other words: chaining multiple OR statements made the Select tool fail, while the SQL IN statement worked on more than 50,000 records.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;Just to save you some pain &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
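      <!--
A minimal sketch of how the random draw and the IN-style where clause from the post above might fit together. The helper name build_in_clause, the field name OBJECTID, the fixed seed, and the stand-in ID list are assumptions for illustration only; in the actual tool the IDs would come from a cursor and the resulting clause would be handed to the Select tool.

```python
import random


def build_in_clause(field_name, ids):
    """Return a where clause such as 'OBJECTID IN (1,2,3)'."""
    # str.join only accepts strings, so convert each ID first.
    id_text = ",".join(str(i) for i in ids)
    return "{0} IN ({1})".format(field_name, id_text)


# Hypothetical stand-in for the FIDs collected from a feature class.
fid_list = list(range(1, 301))

# A fixed seed is used here only so the sketch is repeatable.
rng = random.Random(42)

# random.sample draws without replacement, so no ID appears twice.
sample_ids = sorted(rng.sample(fid_list, 30))

where_clause = build_in_clause("OBJECTID", sample_ids)
print(where_clause)
```

Converting each ID with str() matters because the IDs read from a feature class are usually integers, and joining integers directly raises a TypeError.
      -->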
      <pubDate>Wed, 08 Feb 2012 20:23:13 GMT</pubDate>
      <guid>https://community.esri.com/t5/python-questions/python-random-sample-does-not-seem-that-random/m-p/175998#M13530</guid>
      <dc:creator>ThomMackey</dc:creator>
      <dc:date>2012-02-08T20:23:13Z</dc:date>
    </item>
    <item>
      <title>Re: Python Random.Sample does not seem that random</title>
      <link>https://community.esri.com/t5/python-questions/python-random-sample-does-not-seem-that-random/m-p/175999#M13531</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;I could add two loosely related things:&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;1. If you have parallel processes, each using the random module, you can scramble their generator "states" so that they are all out of sequence with one another (which is a good thing) by using the .jumpahead() method. For example:&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;PRE class="lia-code-sample line-numbers language-none"&gt;import random
import time
timeInt = int(str(int(time.time() * 10000))[-5:])  # last five digits of the current time in 1/10000 s
random.jumpahead(timeInt)  # Python 2 only; jumpahead() was removed in Python 3&lt;/PRE&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;2. With FGDBs, you can send absurdly long SQL strings, such as "OBJECTID in (1,2,3,4,5,.....)". I sent one that was more than a million characters long and it actually worked! I'm curious what the character limit is... There must be one, right? I have been frustrated by Oracle SDE having a SQL statement limit of ~1400 characters, which is pretty lame.&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Sat, 11 Dec 2021 09:04:23 GMT</pubDate>
      <guid>https://community.esri.com/t5/python-questions/python-random-sample-does-not-seem-that-random/m-p/175999#M13531</guid>
      <dc:creator>ChrisSnyder</dc:creator>
      <dc:date>2021-12-11T09:04:23Z</dc:date>
    </item>
    <item>
      <title>Re: Python Random.Sample does not seem that random</title>
      <link>https://community.esri.com/t5/python-questions/python-random-sample-does-not-seem-that-random/m-p/176000#M13532</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;Another thing to note is that the Oracle "IN" operator is limited to 1000 elements by default; thus you would need to split the list into several IN statements joined by OR operators, keeping each one to at most 1000 elements.&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
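      <!--
The splitting described in the post above could be sketched roughly as follows. The function name build_chunked_in_clause and its chunk_size parameter are hypothetical, and the 1000-element default simply mirrors the limit mentioned in the post; this is an illustration under those assumptions, not a tested recipe for any particular database.

```python
def build_chunked_in_clause(field_name, ids, chunk_size=1000):
    """Join several IN lists with OR, each holding at most chunk_size IDs."""
    ids = list(ids)
    parts = []
    for start in range(0, len(ids), chunk_size):
        chunk = ids[start:start + chunk_size]
        id_text = ",".join(str(i) for i in chunk)
        parts.append("{0} IN ({1})".format(field_name, id_text))
    # An empty ID list yields an empty string; callers may want to treat
    # that case separately.
    return " OR ".join(parts)


# Small chunk size chosen only to keep the example output short.
print(build_chunked_in_clause("OBJECTID", range(1, 6), chunk_size=2))
```

With 2,500 IDs and the default chunk size, the result contains three IN lists joined by two OR operators.
      -->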
      <pubDate>Thu, 09 Feb 2012 17:17:01 GMT</pubDate>
      <guid>https://community.esri.com/t5/python-questions/python-random-sample-does-not-seem-that-random/m-p/176000#M13532</guid>
      <dc:creator>LoganPugh</dc:creator>
      <dc:date>2012-02-09T17:17:01Z</dc:date>
    </item>
  </channel>
</rss>

