For as long as I can remember (which anymore isn't that far back), the rule of thumb for creating geocoding locators was to use data that had all the address components in separate fields. Is that still the case?
For example, if you are matching against centerlines, you want the pre_dir, name, suf_dir, and street type all in separate fields. The same would be true for address points: house_number, street components, and unit designator all split out into their own fields.
Now I see, not only in the old-style address locators but also in the new-style locators, mappings to 'FULL STREET NAME' or 'FULL ADDRESS', which indicates to me that S MAIN ST in a single street name field or 1234 E ELM AVE #2 in a single address field isn't such a bad thing.
I'm going to go out on a limb and put the suspected answer in front of my question, which is "It depends". My question though is this: With either of the locator approaches (old school and new) is it still preferred to keep fields separate in the matching data or is it okay to have them merged into one?
There are, I think, two questions here.
1. What format should the reference data be in to build the locator?
2. What format should the input address come in to the locator as? Address in separate components like Address, City, State, Zip or all in a single field?
For question 1, it is still preferred to have your street data split up into its components, like PreDir, PreType, StreetName, StreetType, SufDir. We don't break unit values up into really fine components, but we do suggest that the unit type and unit number be in separate fields. That being said, we still do a very good job of geocoding even if the data is all in a single field. In that case you would not map it to FullStreetName but just to StreetName, or to both (unit information would still be better if it were separated out). FullStreetName is used for display purposes only. It is beneficial for countries where the street type can be concatenated to the street name, like in Germany.
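To make the split-versus-merged distinction concrete, here is a minimal Python sketch, not anything the locator tooling itself runs. It assumes a reference record is a plain dictionary keyed by the component names above; the `full_street_name` helper is hypothetical and only shows how a display-style full street name could be derived from separate fields.

```python
# Component field names taken from the answer above; the dict layout is an assumption.
COMPONENT_FIELDS = ("PreDir", "PreType", "StreetName", "StreetType", "SufDir")


def full_street_name(record):
    """Join the non-empty street components, in order, into one display string.

    Hypothetical helper for illustration; missing or blank components are skipped.
    """
    parts = (str(record.get(field) or "").strip() for field in COMPONENT_FIELDS)
    return " ".join(part for part in parts if part)


# Split reference data keeps each component in its own field.
centerline = {"PreDir": "S", "StreetName": "MAIN", "StreetType": "ST"}
print(full_street_name(centerline))
```

Going from separate components to one string is a simple join, while going the other way means parsing, which is one reason split reference data is the safer starting point.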
For question 2, you will get slightly better performance with multi-field input than with single-field input, but for interactive geocoding the difference is negligible, so we generally say to stick with single field. For batch geocoding you will get some benefit from your data being in multiple fields. Match quality is essentially the same between the two, though, so it shouldn't be considered a factor.
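As a rough illustration of the two input shapes, the sketch below turns a multi-field record (Address, City, State, Zip) into a single-line string. The `to_single_line` helper, the dictionary layout, and the sample city, state, and ZIP values are all assumptions made up for the example; it does not call any geocoding API.

```python
def to_single_line(fields):
    """Collapse a multi-field address into one 'Address, City, State Zip' string.

    Hypothetical helper for illustration; blank or missing fields are skipped.
    """
    def clean(key):
        return str(fields.get(key) or "").strip()

    state_zip = " ".join(p for p in (clean("State"), clean("Zip")) if p)
    return ", ".join(p for p in (clean("Address"), clean("City"), state_zip) if p)


# Multi-field input, as you might keep it in a table for batch geocoding.
# City, State, and Zip values here are invented sample data.
row = {"Address": "1234 E ELM AVE #2", "City": "Springfield", "State": "IL", "Zip": "62701"}

# Single-field input, as typically typed into an interactive search box.
print(to_single_line(row))
```

If a table already has the fields separated, it can be left that way for batch work and collapsed only when a single search string is needed.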
The coolest thing about geocoding for me is that I have access to very good data in all aspects: points, lines, and polygons. Years ago I was able to twist the right arms and get some good data standards in place, so now, years later, the pain of data development is long gone and we are all reaping the benefits of those early days.