Long SQL Select Statements and Python

01-29-2020 02:48 PM
MVP Esteemed Contributor

Earlier this week I posted Quoting Quotes  where I was dealing with a sql select statement.  As a follow up, here is something I learned today when dealing long variables.  I suspect a few of you guys (and you know who you are) already know this trick, but it's new to me so perhaps someone can benefit from it some time.

Dealing with (My) SQL Select statements in python became a challenge for me as I have several tables that have a ridiculous number of fields.  On top of that, some of them are free text 'comments' fields and those can be problematic because you never know what the contents are and what weird characters you will encounter that bomb things out at no charge.

A typical approach when using python to scrape data from a SQL database is to open a cursor, and pass it a query.  

cn = mysql.connect(user = 'userName', password = 'password', host = 'yourHost', database = 'yourDatabase')
cursor = cn.cursor()

query = "SELECT some fields FROM yourDatabase.yourTable WHERE some condition is met;"


As mentioned I have a ton of fields to get and if I use:

query = "SELECT * FROM dataBase.myTable"

which selects ALL the fields with the * operator, I got myself in trouble when I encountered the weird characters in some of the fields.  What I ended up doing was entering groups of field names to my query variable until I found the one(s) that had bad characters in them. But, the select statement was so wide it is hardly readable in my Spyder ide code window.  Imagine listing 50 or 60 field names in one line.  Here is the solution:

query = """SELECT idProject, idController, siteAddress, siteApn,
            idRealProperty, projectName, projectType,
            projectValuation, numStories, totalSquareFootage, hasSepticSystem,
            hasIncludedDocs, isOverAcre, dwellingUnits, numWindows, roofingSquareFeet, 
            exteriorFinishType, finishingSquareFeet, newMeters, upgradedMeters, 
            tempMeters, additionalCircuits, numFurnaces, numACUnits, numHumidifiers, 
            numAirCleaners, numWaterHeaters, numWaterSofteners, numBoilers, 
            feetFencing, feetRetWall, kilowatts, footprintChange, septicAndBedrooms, 
            multipleKitchens, occupancyChange, roofSheath, licensedContractor, 
            ownerBuilder, affirmApplicant, certifyApplication, flags_0, flags_64,
            selected, activeSubRevCycles, gotoPhaseId, 
            paidFees, wasRejected, completedIssuances, dateCreated, applicationSubmitted, 
            applicationAccepted, lastSubmitted, submittalApproved, datePermitIssued
            FROM master_db.saltlakecountybuildingpermits;"""  

This is actually a trimmed down list of field names I needed but you get the idea. Notice how all the field names are comma and space separated and the last field name is not covered by a comma.  The three quotes at the beginning and end tells the interpreter to ignore the new lines and subsequently treats the variable value as one long string of characters.  I've never had to use this approach with a 'where' statement in an arcpy cursor, but I suspect it works there as well.

That should just about do it....
Tags (1)
0 Kudos
1 Reply
MVP Regular Contributor

Multiline strings with the triple quotes is a common method for this type of thing. Typically, triple double quotes """content""" is used for docstrings that describe the use of functions so I use triple single quotes '''contnent'''

Alternatively, you could use join and format string methods with a list of field names.

fields = ["one", "two", "three", "four"]
sql = "select {} from table_name".format(",".join(fields))‍‍

Going one step further, you can dynamically get the list of fields with arcpy.ListFields()

fields = [f.name for f in arcpy.ListFields(r"path\to\my\data")]
sql = "select {} from table_name".format(",".join(fields))‍‍‍‍
0 Kudos