How to create a new variable in GIS

03-21-2017 06:16 PM
ABDALLAMOHAMED
Occasional Contributor

Hello everybody,

I want to create new variables out of existing ones, the same idea as creating a dummy variable. My data is in a shapefile with the following fields:

metro_cities_names, metro_pop, typology_type. I want to create new variables from them as follows:

From metro_pop I want to create a variable called metro_size with classes such as less than 500, 500-1000000, 1000000-2000000, etc.

From typology_type I want to create a variable called typogy_type, which will have four levels: 1 = Black in low poverty, 2 = Black in high poverty, 3 = Latinos in low poverty, 4 = Latinos in high poverty.

From metro_cities_names I want to create a variable called Regional_type, so that I can categorize these 100 metro cities into regions, e.g. Northeast, Midwest, South and West.

I know how to do this in SPSS but I hate it, plus it would be easier for me to run my OLS in ArcMap than in SPSS.

Thanks a lot for your input

14 Replies
RebeccaStrauch__GISP
MVP Emeritus

Hi ABDALLA MOHAMED,

I wanted to create a new variable out of existing ones, the same idea as creating dummy variable. I have my data in shapefile as follows:

By "variable", are you saying you want new fields with the new value representing the various values in the field currently?  OR are these temporary values for calculations?

Have you thought about maybe moving your shapefiles into a file GDB? If you are keeping them as shapefiles, you may want to review Working with fields in shapefiles by adding a field in ArcCatalog—Help | ArcGIS Desktop 

You may want to look at using attribute domains: A quick tour of attribute domains—Help | ArcGIS Desktop

"Attribute domains are rules that describe the legal values of a field type, providing a method for enforcing data integrity"

Or maybe subtypes: A quick tour of subtypes—Help | ArcGIS Desktop

"Subtypes are a subset of features in a feature class, or objects in a table, that share the same attributes. They are used as a method to categorize your data."

If you are planning on creating new fields and calculating new values, are you planning on doing this manually with the field calculator (basically a one-time thing), or do you need to script this? If scripting, what do you want to use for development? ModelBuilder or something like Python (my recommendation).

Answers to these questions will help the community get you started in the right direction. 

(And sorry, I don't know what OLS is. For those who do know, it may answer some of the questions above.)

DanPatterson_Retired
MVP Emeritus
ABDALLAMOHAMED
Occasional Contributor

Hi Dan. Thanks a lot for your input. The link you provided doesn't talk about creating new variables, it just talks about OLS.

RebeccaStrauch__GISP
MVP Emeritus

Ha....boy I've been out of school too long....I knew that at one time.  Ugh!

ABDALLAMOHAMED
Occasional Contributor

Hi Rebecca. Thanks a lot for your great input, much appreciated. I will go over the links you provided. OLS stands for Ordinary Least Squares; it is a linear regression method that fits the regression line by minimizing the sum of squared residuals. I'm planning to use the field calculator rather than scripting because I'm not good at scripting. I don't know if the field calculator would be helpful for me.

ABDALLAMOHAMED
Occasional Contributor

Sorry, I forgot to mention: I want to create new fields with new values representing the various values in the fields I currently have.

DanPatterson_Retired
MVP Emeritus

Have you considered breaking down the process into different fields, even manually with queries and using the field calculator to provide the reclassed values for the various components?  Once you have your base fields created you could concatenate them into a final classification.  This would avoid scripts altogether and unless you plan to do this numerous times, the manual process of reclassification and concatenation would ultimately be faster.
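
As a rough sketch of the reclassification step, functions like the ones below could go in the field calculator's code block with the Python parser selected. The class breaks are taken from the original question; the function names, the label strings, and the city-to-region entries are illustrative assumptions, not values from the actual data.

```python
# Sketch only: names, labels and lookup entries are assumptions for illustration.

def classify_metro_size(pop):
    """Return a size-class label for a metro population value."""
    if pop is None:
        return "unknown"
    if pop < 500:
        return "less than 500"
    elif pop < 1000000:
        return "500-1000000"
    elif pop < 2000000:
        return "1000000-2000000"
    return "2000000 or more"


# Hypothetical lookup; extend it with the remaining metro cities.
REGION_BY_CITY = {
    "New York": "Northeast",
    "Chicago": "Midwest",
    "Houston": "South",
    "Los Angeles": "West",
}


def classify_region(city_name):
    """Return the region for a city name, or 'unknown' if it is not listed."""
    return REGION_BY_CITY.get((city_name or "").strip(), "unknown")
```

In the field calculator, the expression for the new metro_size field would then be classify_metro_size(!metro_pop!), and for Regional_type it would be classify_region(!metro_cities_names!).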

ABDALLAMOHAMED
Occasional Contributor

Dan,

Would you please explain a bit more? I have never used concatenation in GIS, though I have used it in Excel.

DanPatterson_Retired
MVP Emeritus

Add a new text field of appropriate width... let's assume that you have 3 fields (aka variables), initially 2 string and 1 number:

'Age', 'Locale', 'Education'; assume 30, New York and College as the values.

To produce new values for your new field, simply open the field calculator, select the Python parser and set the 'type' to string/text:

"{0} {1} {2}".format(!Locale!, !Age!, !Education!)   # note there is a space between the sets of curly braces.

This is equivalent to

!Locale! + " " + str(!Age!) + " " + !Education!   # where you have to concatenate the spaces directly, and wrap the numeric Age field in str()
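
To check the equivalence outside ArcMap, here is a small plain-Python sketch that uses the sample answers from above as ordinary variables (the variable names are just stand-ins for the field values). It also shows that a numeric value has to be converted with str() before it can be joined with +, whereas format() handles the conversion itself.

```python
# Plain-Python stand-ins for the three field values (sample answers from above).
locale = "New York"
age = 30              # numeric field
education = "College"

# Positional formatting converts the number to text automatically.
formatted = "{0} {1} {2}".format(locale, age, education)

# Concatenation with + needs an explicit str() around the number.
joined = locale + " " + str(age) + " " + education

print(formatted)
print(joined)
```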

With the first format you can swap the index numbers around to suit your preferred order, i.e.

"{2} {0} {1}".format( etcetera...  would put Education, then locale then age...

Hope you catch the drift. I have a number of blog posts on the newer-style Python formatting options. These are largely based on Python versions 3.3 to 3.5; Python 3.6 is introducing a somewhat simpler form, but it is a year or so away from becoming a fixture in Arc*.
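
For reference, the simpler Python 3.6 form is presumably the formatted string literal (f-string). A minimal plain-Python sketch, with variable names as stand-ins for the field values; whether it can be used in the field calculator depends on the Python version your Arc* product ships with.

```python
# Requires Python 3.6 or later; the variables stand in for the field values.
locale = "New York"
age = 30
education = "College"

# Same result as "{0} {1} {2}".format(locale, age, education)
label = f"{locale} {age} {education}"
print(label)
```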

The only things you have to remember are to set your parser to Python and to enclose field names in exclamation marks.

PS

I might add you can play around with the base format

"{0}, {1}, {2}".format(!Locale!, !Age!, !Education!)   # is now comma-separated

"{0} - {1} - {2}".format(!Locale!, !Age!, !Education!)   # or a dash or two