Latest Contributions by JoshuaBixby

‎02-15-2015

A completely different approach to relying on application logic to select random records would be to rely on the database management system to select them. If someone is using an enterprise DBMS (Oracle, SQL Server, PostgreSQL, etc.), the database will support some way of returning randomly selected records.

    def main():
        import arcpy
        import os

        fc_in = arcpy.GetParameterAsText(0)   # input featureclass
        fl_out = arcpy.GetParameterAsText(1)  # output layerfile
        cnt = arcpy.GetParameter(2)           # number of features to select
        dbms = 'sqlserver'                    # placeholder; use 'oracle' or 'sqlserver'

        oid_fld = arcpy.Describe(fc_in).OIDFieldName

        # Oracle: shuffle with dbms_random.value, then keep the first cnt rows
        oracle_where_clause = (
            "{} IN (SELECT {} FROM "
            "(SELECT {} FROM {} ORDER BY dbms_random.value) "
            "WHERE rownum <= {})".format(
                oid_fld, oid_fld, oid_fld, os.path.split(fc_in)[1], cnt
            )
        )
        # SQL Server: order by a fresh GUID per row, then take the TOP cnt rows
        sqlserver_where_clause = (
            "{} IN (SELECT TOP {} {} FROM {} ORDER BY NEWID())".format(
                oid_fld, cnt, oid_fld, os.path.split(fc_in)[1]
            )
        )

        if dbms == 'oracle':
            where_clause = oracle_where_clause
        elif dbms == 'sqlserver':
            where_clause = sqlserver_where_clause
        else:
            raise ValueError("dbms must be 'oracle' or 'sqlserver'")

        # select on a temporary layer, then copy the selection to a new layer
        arcpy.MakeFeatureLayer_management(fc_in, "tmpLayer")
        arcpy.SelectLayerByAttribute_management("tmpLayer", "NEW_SELECTION", where_clause)
        arcpy.MakeFeatureLayer_management("tmpLayer", "selection")
        arcpy.SaveToLayerFile_management("selection", fl_out)

    if __name__ == '__main__':
        main()

In many ways, this would not be a very good "general" approach, for a couple of reasons. First, relying on the DBMS makes the code more involved or less portable because every DBMS seems to have a different approach to selecting random records. Second, passing SQL through ArcGIS tools always seems to have a sketchiness about it. It does work, but I have definitely run into issues as well.

Looking at the code above, one might wonder why it makes a temporary layer and then calls SelectLayerByAttribute on it, i.e., why not just pass the SQL directly to the MakeFeatureLayer tool. When I first ginned up this code, I tried doing just that, but I found an interesting/odd behavior. The SQL to select random records actually became embedded in the definition of the feature layer, so every time ArcMap was refreshed, the records kept changing. ArcMap didn't like that; there would be issues with displaying polygons at times. It would be neat, though, with attribute-only data, to load a dynamically, randomly changing table into ArcMap for testing at times.
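
For what it's worth, the per-DBMS differences can be kept in one small lookup so the rest of a script stays the same. The sketch below is only an illustration: the function name random_where_clause, the TEMPLATES table, and the "parcels" table name are made up for this example, and the PostgreSQL entry (ORDER BY random() with LIMIT) is an assumption that has not been tried through ArcGIS tools here.

```python
# Hypothetical helper: build a "pick cnt random rows" where clause per DBMS.
# The Oracle and SQL Server templates follow the approach described above;
# the PostgreSQL template is an untested assumption for illustration.
TEMPLATES = {
    "oracle": (
        "{oid} IN (SELECT {oid} FROM "
        "(SELECT {oid} FROM {table} ORDER BY dbms_random.value) "
        "WHERE rownum <= {cnt})"
    ),
    "sqlserver": "{oid} IN (SELECT TOP {cnt} {oid} FROM {table} ORDER BY NEWID())",
    "postgresql": "{oid} IN (SELECT {oid} FROM {table} ORDER BY random() LIMIT {cnt})",
}


def random_where_clause(dbms, oid_fld, table, cnt):
    """Return a where clause that selects cnt random rows for the given DBMS."""
    try:
        template = TEMPLATES[dbms.lower()]
    except KeyError:
        raise ValueError("unsupported DBMS: {!r}".format(dbms))
    # int() keeps anything but a whole number out of the SQL text
    return template.format(oid=oid_fld, table=table, cnt=int(cnt))


print(random_where_clause("sqlserver", "OBJECTID", "parcels", 10))
```

Adding another DBMS then means adding one template rather than another branch in the script.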

‎02-15-2015

If we are already importing random, then we can rely on random.sample to do the heavy lifting for us, assuming we have already gone to the effort of building an OID list and want sampling without replacement.

    def main():
        import arcpy
        from random import sample

        fc_in = arcpy.GetParameterAsText(0)   # input featureclass
        fl_out = arcpy.GetParameterAsText(1)  # output layerfile
        cnt = arcpy.GetParameter(2)           # number of features to select

        fld_oid = arcpy.Describe(fc_in).OIDFieldName
        # read every OID into memory, then sample without replacement
        lst_oids = [oid for oid, in arcpy.da.SearchCursor(fc_in, [fld_oid])]
        oids = ", ".join(map(str, sample(lst_oids, cnt)))
        where = "{0} IN ({1})".format(arcpy.AddFieldDelimiters(fc_in, fld_oid), oids)

        arcpy.MakeFeatureLayer_management(fc_in, "selection", where)
        arcpy.SaveToLayerFile_management("selection", fl_out)

    if __name__ == '__main__':
        main()

If building an OID list in-memory becomes an issue, one could resort to using a reservoir sampling approach.

    def stream_sample(iterator, k):
        from random import randint

        # fill the reservoir with the first k items
        result = [next(iterator) for _ in range(k)]
        n = k
        for item in iterator:
            n += 1
            # item n replaces a random reservoir slot with probability k/n
            s = randint(0, n - 1)
            if s < k:
                result[s] = item
        return result

    def main():
        import arcpy

        fc_in = arcpy.GetParameterAsText(0)   # input featureclass
        fl_out = arcpy.GetParameterAsText(1)  # output layerfile
        cnt = arcpy.GetParameter(2)           # number of features to select

        fld_oid = arcpy.Describe(fc_in).OIDFieldName
        # stream OIDs through the reservoir instead of holding them all in memory
        sample_oids = [oid for oid, in stream_sample(arcpy.da.SearchCursor(fc_in, "OID@"), cnt)]
        oids = ", ".join(map(str, sample_oids))
        where = "{0} IN ({1})".format(arcpy.AddFieldDelimiters(fc_in, fld_oid), oids)

        arcpy.MakeFeatureLayer_management(fc_in, "selection", where)
        arcpy.SaveToLayerFile_management("selection", fl_out)

    if __name__ == '__main__':
        main()

A plus of using reservoir sampling is that the memory footprint can be quite modest to trivial when working with very large datasets. A minus of using reservoir sampling is that calling random so many times can add noticeable overhead; that said, I can still sample from a million records in a couple of seconds.

The stream_sample function is adapted from JesseBuesking's answer on the StackExchange thread "pick N items at random". The code from JesseBuesking is basically just implementing Don Knuth's algorithm for picking random elements from a set whose cardinality is unknown.
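
A quick way to convince yourself that a reservoir sampler picks every record with equal probability is to run it many times over a small range and count how often each value comes back. The sketch below is a standalone illustration, not part of the script above: reservoir_sample is a stand-in name, and the trial count and seed are arbitrary choices.

```python
from collections import Counter
from random import randint, seed


def reservoir_sample(iterable, k):
    """Keep k items chosen uniformly from a stream of unknown length."""
    reservoir = []
    for n, item in enumerate(iterable, start=1):
        if n <= k:
            reservoir.append(item)      # fill the reservoir first
        else:
            s = randint(0, n - 1)       # uniform over the n items seen so far
            if s < k:
                reservoir[s] = item     # keep item n with probability k/n
    return reservoir


seed(0)  # fixed seed so repeated runs are comparable
trials = 20000
counts = Counter()
for _ in range(trials):
    counts.update(reservoir_sample(range(10), 3))

# each of the 10 values should be chosen in roughly 3/10 of the trials
for value in range(10):
    print(value, round(counts[value] / float(trials), 3))
```

If the printed frequencies drift away from 0.3 for the later values, the index range passed to randint is usually the culprit.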

‎02-12-2015

OK, now we are getting down to business. Hopefully Vince Angelo can find some time to chime in; he always has good information to share on these types of questions. I will have to take some time to think them over.

‎02-12-2015

You can see database views in ArcGIS Desktop just by connecting to a database. What functionality are you hoping to gain by registering a database view? Or is the concern performance rather than functionality?

‎02-12-2015

Providing a bit more information would be helpful. You mention ArcSDE 10.1; have you applied any service packs or patches? What edition of ArcSDE (Personal, Workgroup, Enterprise)? What version and edition of SQL Server are you using? What version of MS Access are you using? What driver(s) and version(s) have you tried?

‎02-12-2015

There are a couple of things going on here. First, arcpy.GetInstallInfo has no parameters. It will accept an argument without throwing an error, but the argument has no effect on what is returned. The arcpy.GetInstallInfo function gets the installation information for the currently loaded ArcPy site package, which in your case is ArcGIS Engine.

If you install ArcGIS Desktop and ArcGIS Engine on the same machine using standard installation instructions, the two will share a single Python interpreter, usually C:\Python27\ArcGIS10.x (10.2 in this case). Another way to look at it is that a single Python interpreter has 2 ArcPy site packages registered/installed. Since the site packages have the same name (arcpy), they can't both be loaded into the interpreter at the same time. When the interpreter encounters an import arcpy statement, it will find and import whichever site package is found first in the search path for modules, i.e., sys.path.

Before importing ArcPy, you can quickly determine which ArcPy site package will be loaded by running:

    import imp
    imp.find_module('arcpy')

In this case, the result will come back with ...\Engine10.2\... since it comes first in sys.path. Reversing the order of sys.path before importing ArcPy will likely find and import the Desktop site package for your situation.

    import imp
    import sys
    sys.path.reverse()
    imp.find_module('arcpy')

The issue with reversing sys.path out of hand is that it will import the Engine-based site package if ArcGIS Engine was installed first. There are a couple of more thoughtful ways to get around this problem. First, setting the PYTHONPATH Windows environment variable to the Desktop-based site package will ensure that it is loaded first regardless of the sys.path order. However, in this case, it seems you don't already know that location and are trying to figure it out. A second approach would be to remove all of the Engine-related entries in sys.path before import arcpy.

    import sys
    # iterate over a copy so entries can be removed from sys.path safely
    for p in sys.path[:]:
        if 'Engine' in p:
            sys.path.remove(p)

If the goal is to determine whether ArcGIS Desktop is installed and where, maybe querying a WMI service for installed applications and information is a better approach. Adapted from the Microsoft Script Center List Installed Software Python script:

    import win32com.client

    objWMIService = win32com.client.Dispatch("WbemScripting.SWbemLocator")
    objSWbemServices = objWMIService.ConnectServer(".", r"root\cimv2")
    colItems = objSWbemServices.ExecQuery("Select * from Win32_Product "
                                          "where Name Like 'ArcGIS % for Desktop'")
    for objItem in colItems:
        print("Name: {}".format(objItem.Name))
        print("Install Date: {}".format(objItem.InstallDate))
        print("Install Location: {}".format(objItem.InstallLocation))
        print("")
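
One caveat for anyone reading this on a newer interpreter: the imp module is a Python 2.7-era tool that is deprecated in Python 3 and removed in 3.12. A rough Python 3 equivalent of the "which copy would be imported" check can be sketched with importlib.util.find_spec. The helper name locate_module is made up for this example, and 'json' is only a stand-in for 'arcpy' so the snippet runs without ArcGIS installed.

```python
import importlib.util


def locate_module(name):
    """Return the file that `import name` would load, or None if not found.

    For a top-level name, find_spec searches sys.path without importing
    the module itself.
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


# 'json' stands in for 'arcpy' so the example runs without ArcGIS installed
print(locate_module("json"))
print(locate_module("module_that_does_not_exist"))
```

As with imp.find_module, the answer depends on the order of sys.path at the moment the function is called.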

‎02-10-2015

With CalculateField_management, try removing "PYTHON_9.3". You aren't actually passing an expression, just a value.

‎02-10-2015

Review the documentation on ArcPy Data Access cursors. The arcpy.da update cursor doesn't have a setValue method; that belongs to the older-style update cursor. You are effectively mixing up the syntax for the two types of cursors.
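
For comparison, here is a minimal sketch of how the two cursor styles differ, assuming the goal is to write each feature class's name into a text field. The function name write_fc_name and its arguments are placeholders rather than the original poster's script, and actually calling it requires an ArcGIS installation.

```python
import os


def write_fc_name(fc, field):
    """Write the feature class name into `field` for every row of `fc`."""
    import arcpy  # imported here so this file loads without ArcGIS present

    fc_name = os.path.basename(fc)
    # arcpy.da cursors hand back each row as a list; values are set by
    # position and saved with updateRow (there is no row.setValue).
    with arcpy.da.UpdateCursor(fc, [field]) as cursor:
        for row in cursor:
            row[0] = fc_name
            cursor.updateRow(row)
    return fc_name


# Older-style cursor, for contrast (this is where setValue belongs):
#     rows = arcpy.UpdateCursor(fc)
#     for row in rows:
#         row.setValue(field, fc_name)
#         rows.updateRow(row)
```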

‎02-10-2015

I am not understanding your predicament. Could you post a screenshot or some specific examples of what you would like to see and what you are actually seeing? Regarding your code, you don't have to loop over the dataframes if you aren't going to pass them to ListLayers. Calling ListLayers without a dataframe object will list the layers in all of the dataframes.

‎02-10-2015

You can't pass ListLayers only a dataframe object; it will fail. At a minimum, a map document or layer needs to be passed. That said, the code still has an issue.

‎02-10-2015

Called a bug.

‎02-09-2015

Have you tried putting in a sleep call to pause the script in between calls to serviceStartStop()? I wonder if the server sometimes gets bogged down and file locks aren't released right away. Have you tried re-starting or re-stopping the service that just failed? Does it work the second time, right after it failed the first time?
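
To make the pause-and-retry idea concrete, here is a small sketch. The helper call_with_retry, its default attempt count and pause, and the flaky_stop stand-in are all invented for illustration; the real serviceStartStop() call from your script would be passed in instead, for example wrapped in a lambda with whatever arguments it takes.

```python
import time


def call_with_retry(action, attempts=3, pause=5.0):
    """Call action() until it succeeds, sleeping `pause` seconds between tries.

    Returns action()'s result; re-raises the last exception if every
    attempt fails.
    """
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception as exc:
            if attempt == attempts:
                raise
            print("attempt {} failed ({}); retrying in {}s".format(attempt, exc, pause))
            time.sleep(pause)


# Stand-in for a service call that fails once and then succeeds.
state = {"calls": 0}


def flaky_stop():
    state["calls"] += 1
    if state["calls"] < 2:
        raise RuntimeError("Could not undeploy services")
    return "stopped"


print(call_with_retry(flaky_stop, attempts=3, pause=0.1))
```

If the second attempt routinely succeeds, that points toward a timing or file-lock issue on the server rather than a problem in the script itself.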

‎02-09-2015

Let's jump over to the other thread to continue the discussion.

‎02-09-2015

Taking your question at face value, i.e., does SelectLayerByAttribute or similar tools have internal checking for SQL injection, I think the answer is pretty clearly no. I just did a quick check using SelectLayerByAttribute, and I was able to drop a table in SQL Server by injecting extra SQL into the where_clause of the tool. Since MS Access doesn't support multiple SQL statements, it didn't work on personal geodatabases. It also didn't work on file geodatabases, which I am guessing is for the same reason. Of course, I could use less invasive SQL injection with all three to return more records than intended.

Although I didn't check all DBMSes and all forms of SQL injection, the fact that I could successfully use some SQL injection with some DBMSes gives a strong indication the tools themselves are not doing any internal checks for SQL injection. I think Esri would say these tools are simply passing SQL along, and that hardening against SQL injection should be taking place elsewhere. Not only would programming internal checks get complicated quickly, it would likely involve putting big constraints on how SQL is used with those tools. Always a trade-off.

The tools you reference might not be hardened against SQL injection, but that doesn't mean the floodgates are open. There are still multiple layers in the application stack between these tools and the interface of ArcGIS Server that users will be interacting with. One thing Esri introduced, I can't remember when exactly, is standardized queries for ArcGIS Server. In terms of publishing GP tools, there may be extra precautions in place; I don't know.

I am a firm believer in seeing is believing, especially with ArcGIS. Regardless of what the documentation does or doesn't say, I say test it and see for yourself.
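
As one example of hardening that can happen in the script itself, before a where clause ever reaches a geoprocessing tool, user-supplied values can be forced through a strict type check. The sketch below assumes the input is meant to be a list of object IDs; the function name oid_where_clause and the sample inputs are invented for illustration, and the field name is assumed to come from a trusted source such as Describe rather than from the user.

```python
def oid_where_clause(oid_field, values):
    """Build an IN clause from values that must all be whole numbers."""
    oids = []
    for value in values:
        try:
            # int() rejects anything carrying SQL syntax (quotes,
            # semicolons, keywords); only the parsed number is kept
            number = int(str(value).strip())
        except ValueError:
            raise ValueError("not a valid object ID: {!r}".format(value))
        oids.append(str(number))
    if not oids:
        raise ValueError("no object IDs supplied")
    return "{} IN ({})".format(oid_field, ", ".join(oids))


print(oid_where_clause("OBJECTID", [3, "17", 42]))

try:
    oid_where_clause("OBJECTID", ["1); DROP TABLE parcels; --"])
except ValueError as exc:
    print("rejected:", exc)
```

This only covers numeric inputs; free-text values would need a different strategy, which is part of why a general-purpose check inside the tools would get complicated.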

‎02-09-2015

In this case, there is no url to close. The url variable in the script is just a string. The script never directly interacts with the file-like object returned by urllib2.urlopen. Instead, the result of urllib2.urlopen is handed straight to json.load, which reads through the returned file-like object. Once json.load returns, the temporary objects involved go out of scope, including the object returned by urllib2.urlopen.
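
If you would rather close the response explicitly than rely on it going out of scope, contextlib.closing is one way to do it. The sketch below uses the Python 3 module name (urllib.request in place of urllib2); the load_json helper is a made-up name, and a temporary file:// URL stands in for the real service so the example runs offline.

```python
import json
import os
import tempfile
from contextlib import closing
from pathlib import Path
from urllib.request import urlopen  # urllib2.urlopen in Python 2


def load_json(url):
    """Fetch url and parse the body as JSON, closing the response explicitly."""
    with closing(urlopen(url)) as response:
        return json.load(response)


# A local file:// URL stands in for a web service so the example runs offline.
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as tmp:
    json.dump({"status": "success"}, tmp)

try:
    print(load_json(Path(tmp.name).as_uri()))
finally:
    os.remove(tmp.name)
```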


Latest Contributions by JoshuaBixby

Re: Help altering some python code to randomly select one feature

Re: Help altering some python code to randomly select one feature

Re: Best practices for using nonspatial tables (SQL Views) as event layers?

Re: Best practices for using nonspatial tables (SQL Views) as event layers?

Re: MS SQL Server and Missing SDE 10.1 Base Table

Re: Why does GetInstallInfo('desktop') return info for Engine?

Re: how to update field with name of FC for list of FCs ?

Re: how to update field with name of FC for list of FCs ?

Re: Replace parts of a data source string in Layer files?

Re: Replace parts of a data source string in Layer files?

Re: Inserting a feature into a polygon featureclass with z-values

Re: "Could not undeploy services" error when using Python

Re: Arcpy TableToTable_Conversion writing extra fields?

Re: Do the SelectLayerByAttribute or MakeFeatureLayer tools have any sort of internal checking for SQL Injection?

Re: Arcpy TableToTable_Conversion writing extra fields?
