How or when should you use a related table over having the data in the geodatabase?
For example, in my office we are collecting data, using collector, each year on species (Turkey) counts per location. Currently, we have a point shapefile created with an attribute table similar to the one below.
Site | 2015Count | 2015Density | 2016Count | 2016Density | 2017Count | 2017Density | 2018Count | 2018Density | Acres |
1 | 54 | 0.027 | 19 | 0.010 | 15 | 0.008 | 10 | 0.005 | 1984 |
2 | 80 | 0.006 | 82 | 0.006 | 90 | 0.007 | 92 | 0.007 | 13815 |
3 | 22 | 0.016 | 30 | 0.022 | 33 | 0.024 | 36 | 0.026 | 1381 |
4 | 80 | 0.015 | 84 | 0.015 | 84 | 0.015 | 81 | 0.015 | 5456 |
5 | 71 | 0.001 | 69 | 0.001 | 63 | 0.001 | 60 | 0.001 | 79651 |
Our question is, should we have the data in a related table or add 2 new columns each year? We see advantages and disadvantages of both. If we switch to a related table symbolizing by certain years is not easily done, we do this often, but we do not have to continually add a column each year. However, we do need to add a new year to the domain list each year if using the related table. If we keep the data in the geodatabase it will allow us to symbolize by each year without exporting and making copies of the layer with queries to view each year.
I guess the question is, is there a standard or best practice saying that data should be in a related table when x occurs. Or is it more of a preference thing at this point since GIS can handle larger datasets now?
Also, if this should be posted in another forum I would be happy to post it there as well.
Thank you.
I would say keep it in a geodatabase table since you are only adding 2 columns a year. It would be a while before the table becomes a pain in relation to the # of fields.
If we keep it in a geodatabase and keep the workflow simple, how do we convince or encourage our database designers to overlook "best practices" to allow for a better and more efficient workflow?
When you say geodatabase do you mean enterprise like SQL Server or file geodatabase? If SQL Server, you could have a single table with year, site and density with a view that makes it look like you have it above.
The relational database would be created in SQL server. We would be doing what you suggested if we went with a relational database. However, I am questioning the efficiency of doing so? What is the benefit in a SQL server database? It seems like an extra step to get the final output needed. Why "waste" the time of making a related table and schema and then creating a view that looks like the final output, when you could simply have it in the shapefile itself? That is mainly my question. Why would you use a related table when it seems like it is adding steps and adding to workflow?