Using a SQL expression to check data in two databases

3292
3
Jump to solution
11-26-2015 02:28 AM
BenjaminSimpson
New Contributor III

Hi,

I have a toolbox script which appends data from a personal geodatabase to an Oracle database. I want to include a check to see if the data being imported from the personal gdb is not already in the Oracle database.I have a field (MODELNAME) which is in both the personal gdb table which the data is being appended from and the Oracle table where the data is being appended to. In the gdb the there is only one model name in that field. I want to check if the model name from the gdb already exists in the MODELNAME field in the Oracle database. Here is where I get stuck. I know what I want to do, I just dont know how to do it. I am assuming that I need to use a SQL statement from this. Something like "SELECT * FROM SAGIS_REACH_NETWORK AS r WHERE r.MODELNAME = SimReaches.MODELNAME".

I was thinking of doing a count of the number of rows where the model name in the Oracle table match the gdb table.

This is what I have so far:

model_list = [arcpy.GetParameterAsText(0)]
if model_list != None:
    # Only run script if atleast one model is selected i.e. model_list is not empty
    try:
        if arcpy.ProductInfo() == "ArcInfo" or "ArcEditor":
    
            # Parameters
            # Personal Geodatabase we are importing into
            outmodel = "Database Connections\\simcatmaster@G01L.sde" 
            # List tuples with feature classes and tables that we want to import, and their output name in Oracle
            # e.g[(in_name,out_name)]
            tables = []
            tables.append(('\\simcat\\SimReaches','\\simcatmaster.SAGIS_REACH_NETWORK'))
            today = datetime.today()
            
            # connect to oracle
            oracle_conn = pyodbc.connect("DSN=ORACLE_SIMCAT_x86")
                  
            oracle_cursor = oracle_conn.cursor()
            
            # Find the total count of models (geodatabases) selected
            modelListLen = len(model_list)
            
            # Declaring the first model position within the list of models selected (model_list)
            modelPos = 0
            for model in model_list:
                
                # Check if the model selected has already been imported to SAGIS_REACH_NETWORK
                SQL = "SELECT * FROM SAGIS_REACH_NETWORK AS r WHERE r.MODELNAME = {0}.MODELNAME;".format(tables[0])
                
                oracle_cursor.execute(SQL).fetchall()
                SQLCount = len(oracle_cursor)
                print SQLCount

Does anyone have any advice as to how I could do this?

Thanks in advance for any help,

Ben.

0 Kudos
1 Solution

Accepted Solutions
David_JAnderson
New Contributor III

Without knowing a little more about the nuts and bolts of how you have the databases set up it is difficult to give specific advice.  Do you have it set up so that the MS Access database can talk to the Oracle database?  Or a setup where the Oracle database can talk to the Access database?  If either is the case then you can do things in one query.  Otherwise it is going to take two queries.

The query you have won't work for the one query route.  First all tables have to be listed in the from clause.

like this:

"SELECT * FROM SAGIS_REACH_NETWORK AS r, {0}  WHERE r.MODELNAME = {0}.MODELNAME;".format(tables[0])

Next I wouldn't pull all the rows across the network.  All you really need is the count.

"SELECT count(*) FROM SAGIS_REACH_NETWORK AS r, {0}  WHERE r.MODELNAME = {0}.MODELNAME;".format(tables[0])

The fetch then becomes:

num_matching_modes = oracle_cursor.execute(SQL).fetchone()

You might need to experiment interactively to see how the fetch is coming across.  Sometimes it is a single value, sometimes as list.

If the two databases are not talking to each other there will need to be two queries.  One to get the model name in the personal gdb and one to check if it is in Oracle.

distinct_model_sql = 'select distinct(ModelName) from ImportTable'

model_name  = cursor.execute(distinct_model_sql).fetchone()

Then

matching_sql = select count(*) from OracleTable where model_name=?"

match_count = cursor.execute(matching_sql,(model_name)).fetchone()

Use the ? construct to fill in the variable so as to avoid a SQL injection attack.  Plus it is just good form and it lets the database connector do the proper string formatting vis a vis quotation marks.

View solution in original post

0 Kudos
3 Replies
DarrenWiens2
MVP Honored Contributor

The thing that jumps out is that you collect user input into 'model_list', but that value never gets compared to anything in your Oracle DB.

0 Kudos
David_JAnderson
New Contributor III

Without knowing a little more about the nuts and bolts of how you have the databases set up it is difficult to give specific advice.  Do you have it set up so that the MS Access database can talk to the Oracle database?  Or a setup where the Oracle database can talk to the Access database?  If either is the case then you can do things in one query.  Otherwise it is going to take two queries.

The query you have won't work for the one query route.  First all tables have to be listed in the from clause.

like this:

"SELECT * FROM SAGIS_REACH_NETWORK AS r, {0}  WHERE r.MODELNAME = {0}.MODELNAME;".format(tables[0])

Next I wouldn't pull all the rows across the network.  All you really need is the count.

"SELECT count(*) FROM SAGIS_REACH_NETWORK AS r, {0}  WHERE r.MODELNAME = {0}.MODELNAME;".format(tables[0])

The fetch then becomes:

num_matching_modes = oracle_cursor.execute(SQL).fetchone()

You might need to experiment interactively to see how the fetch is coming across.  Sometimes it is a single value, sometimes as list.

If the two databases are not talking to each other there will need to be two queries.  One to get the model name in the personal gdb and one to check if it is in Oracle.

distinct_model_sql = 'select distinct(ModelName) from ImportTable'

model_name  = cursor.execute(distinct_model_sql).fetchone()

Then

matching_sql = select count(*) from OracleTable where model_name=?"

match_count = cursor.execute(matching_sql,(model_name)).fetchone()

Use the ? construct to fill in the variable so as to avoid a SQL injection attack.  Plus it is just good form and it lets the database connector do the proper string formatting vis a vis quotation marks.

0 Kudos
BruceHarold
Esri Regular Contributor

Hi

Please see if my tool (link below) does what you need.  You can make a NEW_SELECTION on one input based on shared values from the other object, any unselected records will be your diff.

I haven't tried PGDB inputs but File GDB and SDE workspaces work, although SDE will take longer to make the table scan - but it sure is a lot easier than coding or using SQL.

http://www.arcgis.com/home/item.html?id=e638afe0695a4ad38388cb8d9b350446

Regards

0 Kudos