Help finding similar sounding street names

8132
9
11-13-2014 09:35 AM
JamesGibson
New Contributor III

Hi all,

 

A quick summary of myself/what i'm trying to accomplish: 

I work for the City of Belton, TX. I've been doing city work for about a year and a half total, so I am still learning a lot in this position. 

Among many things that I do for the City, I have the privilege of assigning 9-1-1 addresses and approving new street names in subdivision plats that come to my table.  (I've been doing 9-1-1 addressing for about four years now, thanks to my previous position before working at the city)

 

After a bombardment of new plats coming to my office these last few months,  I've decided to try and find a way to help speed the process of finding existing duplicate/similar sounding street names - as it takes so long to do this manually.

 

I've tried a bunch of searches on this topic and have found a few solutions - I just don't have the experience to implement them.

 

What I found were different ways of doing 'fuzzy' searches:  soundex, metaphone, double metaphone, and a few others.  I even found a website that pretty much does exactly what I need. http://www.johnstonnc.com/mainpage.cfm?category_level_id=771&content_id=3543 which uses an SQL soundex search. 

I sent an email to the webmaster to see how they did that, and was given this answer. 

"For the search you mention, it is a simple SQL “soundex” search.  The query is below:

 

SELECT DISTINCT ROAD_NAME, SUFFIX, COMMUNITY

FROM TABLENAME

WHERE SOUNDEX(ROAD_NAME) = SOUNDEX('#Form.STREETSOUND#')

ORDER BY ROAD_NAME, SUFFIX, COMMUNITY"

 

My issues:

1. I hear that soundex searches will work, but there are better ways to do this, such as the double metaphone way.

2. It would be great to implement this on our website,  but I don't have that type of access;  I need to be able to do this locally.

3. I am not familiar with creating a MYSQL database.

4. I need to easily update the streets list.

 

I need to have a form where I can enter a potential street name to make sure it doesn't exist already or have a similar name in the database. (Beach vs Beech)

How would I go about implementing this?  Is there a way to do this through Python? Do I need my streets list to be in a MYSQL database? Right now, they are in my FileGDB, and an excel file.

 

Thanks for any help,

 

Anthony.

9 Replies
PaulCrickard1
Occasional Contributor II

If you have arcServer you could run a query against the service with existing names and test it using JavaScript (soundex function from http://rosettacode.org/wiki/Soundex#JavaScript )

<html>

<head><title>Soundex</title>

</head>

<body>

<script>

var soundex = function (s) {

     var a = s.toLowerCase().split('')

         f = a.shift(),

         r = '',

         codes = {

             a: '', e: '', i: '', o: '', u: '',

             b: 1, f: 1, p: 1, v: 1,

             c: 2, g: 2, j: 2, k: 2, q: 2, s: 2, x: 2, z: 2,

             d: 3, t: 3,

             l: 4,

             m: 5, n: 5,

             r: 6

         };

     r = f +

         a

         .map(function (v, i, a) { return codes })

         .filter(function (v, i, a) { return ((i === 0) ? v !== codes : v !== a[i - 1]); })

         .join('');

     return (r + '000').slice(0, 4).toUpperCase();

};

//TEST ONE: HARD vs HEART

wordOne= soundex("hard");

wordTwo=soundex("heart");

if(wordOne==wordTwo){

alert("close");

}

else{alert("nope");}

//TEST TWO: PAUL vs BALL

//use substring because first sound p and b is different but the next three numbers match

one=soundex("paul");

two=soundex("ball");

if(one.substring(1)==two.substring(1)){

alert("close");

}

else{alert("nope");}

//TEST THREE: PAUL vs CAR

//NOT CLOSE

a=soundex("paul");

b=soundex("car");

if(a.substring(1)==b.substring(1)){

alert("close");

}

else{alert("NOPE");}

</script>

</body>

</html>

0 Kudos
PaulCrickard1
Occasional Contributor II

Here is a sample of a javascript webpage that queries arcServer, grabs the name of a specific feature and compared it to the word dead (the feature name is Lead).

<html>

<head><title>Blank</title>

</head>

<body>

<script src="http://myserver/JavaScript/ThirdParty/qwest/qwest-0.6.0.min.js"></script>

<script>

//the soundex function

var soundex = function (s) {

     var a = s.toLowerCase().split('')

         f = a.shift(),

         r = '',

         codes = {

             a: '', e: '', i: '', o: '', u: '',

             b: 1, f: 1, p: 1, v: 1,

             c: 2, g: 2, j: 2, k: 2, q: 2, s: 2, x: 2, z: 2,

             d: 3, t: 3,

             l: 4,

             m: 5, n: 5,

             r: 6

         };

     r = f +

         a

         .map(function (v, i, a) { return codes })

         .filter(function (v, i, a) { return ((i === 0) ? v !== codes : v !== a[i - 1]); })

         .join('');

     return (r + '000').slice(0, 4).toUpperCase();

};

//an AJAX library to connect to arcserver, grab one feature name and compare

qwest.post('http://dmdview/ArcGIS/rest/services/BikeWays_Trails/MapServer/0/query',{where:'objectid = 76',f:'json',outFields:'ParentPathName'})

     .success(function(response){

            compare=soundex(response.features[0].attributes.ParentPathName)

            word="dead";

            sound=soundex(word);

            if(sound == compare){

                 alert(word+ " is similar to "+response.features[0].attributes.ParentPathName);

             }else if(sound.substring(1) == compare.substring(1)){

                 alert(word+ " is similar to "+response.features[0].attributes.ParentPathName);

            }else{

               alert("nope");}

  })

     .error(function(message){

        alert(message);

     });

</script>

</body>

</html>

0 Kudos
PaulCrickard1
Occasional Contributor II

Lastly, another soundex library.

<html>

<head><title>soundex</title>

</head>

<body>

<script>

/*

  * v 1.0e  NEEDS TESTING

  * -----------------------

  *

  * The following SoundEx function is:

  *

  *    (C) Copyright 2002 - 2013, Creativyst, Inc.

  *               ALL RIGHTS RESERVED

  *

  * For more information go to:

  *           http://www.Creativyst.com

  * or email:

  *           Support@Creativyst.com

  *

  * Redistribution and use in source and binary

  * forms, with or without modification, are

  * permitted provided that the following conditions

  * are met:

  *

  *   1. Redistributions of source code must

  *      retain the above copyright notice, this

  *      list of conditions and the following

  *      disclaimer.

  *

  *   2. Redistributions in binary form must

  *      reproduce the above copyright notice,

  *      this list of conditions and the

  *      following disclaimer in the

  *      documentation and/or other materials

  *      provided with the distribution.

  *

  *   3. All advertising materials mentioning

  *      features or use of this software must

  *      display the following acknowledgement:

  *      This product includes software developed

  *      by Creativyst, Inc.

  *

  *   4. The name of Creativyst, Inc. may not be

  *      used to endorse or promote products

  *      derived from this software without

  *      specific prior written permission.

  *

  * THIS SOFTWARE IS PROVIDED BY CREATIVYST CORPORATION

  * ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES,

  * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED

  * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A

  * PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL

  * THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,

  * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL

  * DAMAGES (INCLUDING, BUT NOT LIMITED TO,

  * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS

  * OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)

  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,

  * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT

  * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY

  * WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF

  * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

  *

*/

  function SoundEx(WordString, LengthOption, CensusOption)

  {

      var TmpStr;

      var WordStr = "";

      var CurChar;

      var LastChar;

      var SoundExLen = 10;

      var WSLen;

      var FirstLetter;

      if(CensusOption) {

          LengthOption = 4;

      }

      if(LengthOption != undefined) {

          SoundExLen = LengthOption;

      }

      if(SoundExLen > 10) {

          SoundExLen = 10;

      }

      if(SoundExLen < 4) {

          SoundExLen = 4;

      }

      if(!WordString) {

          return("");

      }

      WordString = WordString.toUpperCase();

      /* Clean and tidy

      */

      WordStr = WordString;

      WordStr = WordStr.replace(/[^A-Z]/gi, " "); // rpl non-chars w space

      WordStr = WordStr.replace(/^\s*/g, "");     // remove leading space

      WordStr = WordStr.replace(/\s*$/g, "");     // remove trailing space

      /* Some of our own improvements

      */

      if(!CensusOption) {

          /* v1.0e: GH at begining of word has G-sound (e.g., ghost)

          */

          WordStr = WordStr.replace(/^GH/g, "G");   // Chng leadng GH to G

          WordStr = WordStr.replace(/DG/g, "G");     // Change DG to G

          WordStr = WordStr.replace(/GH/g, "H");     // Change GH to H

          WordStr = WordStr.replace(/GN/g, "N");     // Change GN to N

          WordStr = WordStr.replace(/KN/g, "N");     // Change KN to N

          WordStr = WordStr.replace(/PH/g, "F");     // Change PH to F

          WordStr =

              WordStr.replace(/MP([STZ])/g, "M$1"); // MP if fllwd by ST|Z

          WordStr = WordStr.replace(/^PS/g, "S");   // Chng leadng PS to S

          WordStr = WordStr.replace(/^PF/g, "F");   // Chng leadng PF to F

          WordStr = WordStr.replace(/MB/g, "M");    // Chng MB to M

          WordStr = WordStr.replace(/TCH/g, "CH");  // Chng TCH to CH

      }

      /* The above improvements may

       * have changed this first letter

      */

      FirstLetter = WordStr.substr(0,1);

      /* in case 1st letter is

       * an H or W and we're in

       * CensusOption = 1

      */

      if(FirstLetter == "H" || FirstLetter == "W") {

          TmpStr = WordStr.substr(1);

          WordStr = "-";

          WordStr += TmpStr;

      }

      /* In properly done census

       * SoundEx the H and W will

       * be squezed out before

       * performing the test

       * for adjacent digits

       * (this differs from how

       * 'real' vowels are handled)

      */

      if(CensusOption == 1) {

          WordStr = WordStr.replace(/[HW]/g, ".");

      }

      /* Begin Classic SoundEx

      */

      WordStr =  WordStr.replace(/[AEIOUYHW]/g, "0");

      WordStr = WordStr.replace(/[BPFV]/g, "1");

      WordStr = WordStr.replace(/[CSGJKQXZ]/g, "2");

      WordStr = WordStr.replace(/[DT]/g, "3");

      WordStr = WordStr.replace(//g, "4");

      WordStr = WordStr.replace(/[MN]/g, "5");

      WordStr = WordStr.replace(//g, "6");

      /* Properly done census:

       * squeze H and W out

       * before doing adjacent

       * digit removal.

      */

      if(CensusOption == 1) {

          WordStr = WordStr.replace(/\./g, "");

      }

      /* Remove extra equal adjacent digits

      */

      WSLen = WordStr.length;

      LastChar = "";

      TmpStr = "";

      // removed v10c djr:  TmpStr = "-";  /* rplcng skipped first char */

      for(i = 0; i < WSLen; i++) {

          CurChar = WordStr.charAt(i);

          if(CurChar == LastChar) {

              TmpStr += " ";

          }

          else {

              TmpStr += CurChar;

              LastChar = CurChar;

          }

      }

      WordStr = TmpStr;

      WordStr = WordStr.substr(1);          /* Drop first letter code   */

      WordStr = WordStr.replace(/\s/g, ""); /* remove spaces            */

      WordStr = WordStr.replace(/0/g, "");  /* remove zeros             */

      WordStr += "0000000000";              /* pad with zeros on right  */

      WordStr = FirstLetter + WordStr;      /* Add first letter of word */

      WordStr = WordStr.substr(0,SoundExLen); /* size to taste     */

      return(WordStr);

  }

alert(SoundEx("paul")+" "+SoundEx("ball"));

alert(SoundEx("paul").substring(1)+" "+SoundEx("ball").substring(1))

alert(SoundEx("car")+" "+SoundEx("truck"));

alert(SoundEx("car").substring(1)+" "+SoundEx("truck").substring(1))

</script>

</body>

</html>

0 Kudos
JamesGibson
New Contributor III

Thanks Paul for your help!

However, we do not have ArcServer. 

And I kinda understand in theory how these scripts and things work,  but i'm not understanding how a user can input a name, like a form with a submit button to search for similar names in a streets list/database.

Also, i'd like to see how to do this using the double metaphone method if anyone knows how to do that.

0 Kudos
JamesGibson
New Contributor III

and just to clarify,  I need to be able to input a name,  and have a list of all similar names in the database listed on screen.

0 Kudos
PaulCrickard1
Occasional Contributor II

I have a REST - ish service running that dumps a database. It looks like this:

[{"division":"ADMIN","objectid":"18186","name":"Central Avenue Pedestrian and Lighting Improvements Project Phase II","pid":"DESIGN"},{"division":"CIP","objectid":"18149","name":"4th Street Roadway Improvements (Central to Tijeras)","pid":"CONSTRUCTION"},{"division":"CIP","objectid":"18152","name":"Alligator Courtyard Exhibit","pid":"CONSTRUCTION"},{"division":"CIP","objectid":"18574","name":"Convention Center Remodel Phase 2","pid":"CONSTRUCTION"},{"division":"CIP","objectid":"19020","name":"Edith Transfer Station","pid":"PROJECT_DEVELOPMENT"}]

I then created a webpage that connects to this data and compares the user input. It prints out any name that sounds the same. I used resign so I could match the word DESIGN. You would have more words that match so it would print any that did.

Capture.JPG

then here is the code. The architecture is like REST. You write a server side script to query your DB and put out the info in JSON. then the website uses AJAX to get the data and do the work client side. In my first example I used ArcServer because then you dont have to write the server side. the HTML is below and below that I have attached a sample server side script that would generate the JSON.

<html>

<head><title>Blank</title>

<script src="http://myserver/JavaScript/ThirdParty/qwest/qwest-0.6.0.min.js"></script>

</head>

<body>

Name: <input type="text" id="myText" >

<button onclick='compareWords()'>Copy Text</button>

<p id='data'></p>

<script>

var e;

var compare;

var sound;

var response;

var soundex = function (s) {

     var a = s.toLowerCase().split('')

         f = a.shift(),

         r = '',

         codes = {

             a: '', e: '', i: '', o: '', u: '',

             b: 1, f: 1, p: 1, v: 1,

             c: 2, g: 2, j: 2, k: 2, q: 2, s: 2, x: 2, z: 2,

             d: 3, t: 3,

             l: 4,

             m: 5, n: 5,

             r: 6

         };

     r = f +

         a

         .map(function (v, i, a) { return codes })

         .filter(function (v, i, a) { return ((i === 0) ? v !== codes : v !== a[i - 1]); })

         .join('');

     return (r + '000').slice(0, 4).toUpperCase();

};

function compareWords(){

w=document.getElementById("myText").value;

console.log(w);

sound=soundex(document.getElementById("myText").value);

qwest.get('http://myserver/myJSON.asp')

     .success(function(response){

    

  for(i=0;i<response.length-1;i++){

  e=response.pid;

  compare=soundex(response.pid);

  console.log(response.pid);

  console.log(compare);

  if(sound.substring(1) == compare.substring(1)){

  console.log(e +" sounds like " +w );

  var data = document.getElementById("data");

  data.innerHTML+=e +"<br/>";

  }else{console.log("nope");}

  }

  })

     .error(function(message){

        alert(message);

     });

}

</script>

</body>

</html>

Server side : need to populate your connection string wiht DB values for CAP variables

<%

set myconn_ = server.createobject("adodb.connection")

connection_ = "Provider=sqloledb; Data Source=NAME; Initial Catalog=DBNAME;User Id=USER; Password=PASSWORD;Network Library=DBMSSOCN"

myconn_.open (connection_)

set result_ = server.createobject("adodb.recordset")

sql = "Select distinct something from some table"

set result_ = myconn_.execute(sql)

Response.ContentType = "application/json"

        response.write("[")

while not result_.EOF

  response.write("{""division"":""" & result_("Division")&""",""objectid"":"""& result_("objectid") &""",""name"":"""& result_("ProjectName") &""",""pid"":"""& result_("Project_Status") &"""},")

  result_.movenext()

wend

response.write(" {""division"":"" " & 9 &""",""objectid"":"""& 9 &" "",""name"":"""& 9 &" "",""status"":"""& 9 &"""}]")

result_.close

myconn_.close

set result_ = Nothing

set myconn_ = Nothing

response.end

%>

0 Kudos
JamesGibson
New Contributor III


Agh,  I feel like this is too complicated for me to understand lol.

0 Kudos
JamesGibson
New Contributor III

An update -

I never figured out how to do this via a web interface,  but I did find some source code available in C# that works and produces pretty much exactly what I need.

Phonetic String Comparison with Soundex

I converted it to VB using a C# to VB tool,  and fixed any conversion errors manually to make the program run correctly.  (I needed to edit the code to personalize the program more, and I know VB more than C#)

I successfully converted the provided SoundEx code to VB, but couldn't figure out how to convert the supplied Metaphone code to VB.  So I searched the internet and found a different Metaphone implementation and got it working that way. (Honestly, I cannot find where I found this since I lost all of my bookmarks.)

Here is what I came up with.

1.jpg2.jpg3.jpg

I added to the code a way to load a text file from a directory, and save this location to an .ini file so that I don't need to reload the street names list every time.

The only thing to do now is export my Street Centerlines to an excel when I add any new street names to the list, format it to show only my 'rd_name' column, and save it as a text file.

0 Kudos
BruceHarold
Esri Regular Contributor

Hi

You might consider leveraging the World Geocode Service for his function.  Here is an example that filters on street names in Belton TX that are 'Davis Streets':

http://geocode.arcgis.com/arcgis/rest/services/World/GeocodeServer/find?text=Davis%20St%20Belton%20T...

The example only finds a single candidate, but there is spelling handling, like this search for Jackson St with a typo:

http://geocode.arcgis.com/arcgis/rest/services/World/GeocodeServer/find?text=Jacksin%20St%20Belton%2...

Regards

0 Kudos