I have some questions about writing/modifying locator styles. The first thing I want to say is that I have read the document "Customising Locators in ArcGIS 10" closely, and the answers I seek are not in there. To my knowledge, that is the only official documentation on ArcGIS 10 locator .lot.xml files (excluding the documentation in the schema files). I've also scoured the web for answers, to no avail.
So here are my questions.
- The <std_elt> element is documented with "Not yet supported. Reserved for future use". What does this tag do? The fact that it is found in the <default_input> tags in the standard .lot.xml files makes me think it really does do something, and it's just undocumented.
TL;DR: What is the <std_elt> element for?
- The search_context attribute has "TODO: Documentation" where the documentation should be. The document "Customising Locators in ArcGIS 10" has this to say:
"The content in braces is a hint that a particular search context applies for the element. The engine manages sets of tests for elements within search contexts; these are discussed later in this document."
Later in the document:
"The source style used the "ZIPSearch" search context for Postal, but we will use "PostalSearch." There is also a search context called "CitySearch." These search contexts are defined by the engine and manage a set of tests for the element."
This doesn't really explain what search contexts are available, or how they work. I'm also unclear what the relationship is between the search_context attribute and the <result> element.
TL;DR: What is the search_context attribute for, and how does it relate to the <result> element?
- Which parts of the regular expression syntax are actually supported? "Customising Locators in ArcGIS 10" p.56 (see below) states that the expression syntax is limited to six items, a small subset of standard regex syntax.
However, I've seen several examples that use functionality not present in this list. For example, page 60 of the same document shows the code snippet: <alt>`[0-9]{4}`</alt>, which shouldn't work if the information on p.52 is correct.
TL;DR: Is there a more accurate description of supported regex functionality for locator styles?
- If I have a <ref_data_style> containing a <multiline_grammar>, does this fall back to the top-level <multiline_grammar> if no candidates/matches are found? Does a similar thing happen for <inputs><default_input>? I've had some odd results in my tests: even when I deliberately break my locator, it still seems to find exact textual matches in my data, e.g. "AB C" matches "AB C" but not "ABC".
TL;DR: What other work is carried out when geocoding that isn't specified in the <inputs> and <multiline_grammar> elements of the <ref_data_style>?
- Are the <mapping_schema><standardization> tags only used when standardising addresses? "Customising Locators in ArcGIS 10" p.42 states that in order to be valid, a locator style must ensure that "Mapping Schema standardization includes all schema fields." I haven't done this for some of my input, and things seem to work OK.
TL;DR: If I don't plan on standardising address data, do I need to create the <standardization> element in my mapping schema?
- Finally, I'm using Python to generate my locators and run tests (specifically using the GeocodeAddresses_geocoding() function). However, I can't see how to make a multiline query from Python, and I believe that this is what I actually want to test. According to this Stack Exchange post, this isn't possible. Is this correct, or am I misunderstanding how multiline grammar works?
TL;DR: Can I perform a multiline query in Python?
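For context on that last question, here is a minimal sketch of the kind of test call I mean. The table, locator, field and output names are hypothetical. I'm also assuming that the in_address_fields argument is a semicolon-separated list of "locator input name, table field name" pairs. Each table row supplies one value per input field, which is why I can't see where a multiline query would go.

```python
def build_address_fields(field_map):
    """Join (locator input name, table field name) pairs into the
    semicolon-delimited string passed as in_address_fields.
    Assumption: each pair is written as 'input_name field_name'."""
    return ";".join("{0} {1}".format(name, column) for name, column in field_map)


def geocode_table(table, locator, field_map, out_fc):
    """Run the geocoding tool against a table (requires ArcGIS)."""
    # arcpy is imported here so the helper above works without ArcGIS installed.
    import arcpy
    return arcpy.GeocodeAddresses_geocoding(
        table, locator, build_address_fields(field_map), out_fc)


# Hypothetical mapping: a single locator input fed from a single table column.
FIELD_MAP = [("Address", "FULL_ADDRESS")]
print(build_address_fields(FIELD_MAP))

# Hypothetical usage, only meaningful on a machine with ArcGIS:
# geocode_table("test_addresses", "MyLocator", FIELD_MAP, "geocode_result")
```
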
Thanks in advance. Please let me know if I need to clarify any of my questions.
-David