Address Parsing

Objectives

  • In this section you will learn about the ADDPARSE.TRN file, which lets you add or modify address components that may be unique to your database.

What the file does

The Address Parsing file is used to standardize the many ways in which street addresses may be written. The structure of the file is such that you can easily customize it to suit your own particular needs. This can be useful in situations where the addresses that you are geocoding contain a non-standard abbreviation or component order.

If, for example, you find that the address number often appears after the street name instead of before it (e.g., Main St. 123 instead of 123 Main St.), you can instruct the Atlas Geocoder to assume a different order for the address components.

Here are the specific items that you can control with the Address Parsing file:

  • The order of the address components (street name, street number, etc.)
  • The use of plurals when street intersections are used (1st & Main Streets)
  • The convention for numeric street names (2nd St. versus Second Street versus 2d Street)
  • Possible Intersection Conjunctions (1st & Main versus 1st and Main, etc.)
  • Items which should be stripped from the address (Apt. #, Building, etc.)

Where it is located

The address parsing file is named ADDPARSE.TRN and is located in the \AtlasGIS\Geocode folder on your hard drive.

How to edit the translation file

The ADDPARSE.TRN file can be edited using any text editor or word processor. To avoid inadvertently inserting formatting characters into the file, a plain text editor such as Windows Notepad is generally preferred.

After opening the file, find the section containing the setting you would like to change. Each section includes instructions for that section in comment lines, which are preceded by a slash ("/") character.
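As a rough illustration of the comment convention, the sketch below drops comment lines and blank lines from a list of lines. It assumes one value per line and that a comment is any line whose first non-blank character is a slash; the exact layout of ADDPARSE.TRN is not shown here, and `strip_comments` is a hypothetical helper, not part of the Atlas Geocoder.

```python
def strip_comments(lines):
    """Keep only value lines: skip blanks and lines starting with '/'.

    Assumption: a comment is a line whose first non-blank character is '/'.
    A quoted slash such as '"/"' starts with a quote, so it is kept.
    """
    values = []
    for line in lines:
        text = line.strip()
        if not text or text.startswith("/"):
            continue
        values.append(text)
    return values
```

Under this assumption, a quoted slash value survives because its first character is a quotation mark rather than a slash.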

The ADDPARSE.TRN file contains six sections, each of which is used to control a different address parameter. The following are the major sections of the file:

Section 1 - Address component ordering

Section 1 is used to set the order that the address components most often appear in. There are 7 predefined options, as follows:

Value  Format  Example
0      #PNTS   779 East Evelyn Ave South (U.S. Domestic)
1      NT#     Middlefield Road 287
2      #NT     287 Middlefield Road
3      N+T#    Middlefieldroad 287 or Middlefield road 287
4      #N+T    287 Middlefieldroad or 287 Middlefield road
5      TN#     Road Middlefield 287
6      #TN     287 Road Middlefield

Where the Format codes are defined as:

# -> House Number
P -> Prefix Direction
N -> Name
T -> Type
S -> Suffix Direction
+ -> May be concatenated into a single word

To modify the address order setting, replace the value in this section of the file with a value from "0" through "6", depending on the format of the addresses in your database.
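To make the effect of the order value concrete, here is a minimal sketch that uses only the position of the "#" code in each format to decide whether the house number is the first or the last word of the address. The format strings are copied from the table above; `ORDER_FORMATS` and `split_house_number` are hypothetical names, and the geocoder's actual parsing logic is not documented here.

```python
# Format strings from the Section 1 table; the parsing logic is illustrative only.
ORDER_FORMATS = {
    0: "#PNTS",
    1: "NT#",
    2: "#NT",
    3: "N+T#",
    4: "#N+T",
    5: "TN#",
    6: "#TN",
}


def split_house_number(address, order_value):
    """Return (house_number, rest_of_address) for the given order value."""
    fmt = ORDER_FORMATS[order_value]
    tokens = address.split()
    if not tokens:
        return "", ""
    if fmt.startswith("#"):
        # House number comes first, e.g. "287 Middlefield Road".
        return tokens[0], " ".join(tokens[1:])
    # House number comes last, e.g. "Middlefield Road 287".
    return tokens[-1], " ".join(tokens[:-1])
```

For example, `split_house_number("Middlefield Road 287", 1)` and `split_house_number("287 Middlefield Road", 2)` both separate "287" from "Middlefield Road".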

Section 2 - Plural character for the street type in street intersections

This section allows you to define which character at the end of the street type indicates a plural. The default value is "s". It is most commonly needed when a street intersection is written with a single plural street type, as in the following example: 1st and Main Streets.
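The plural setting can be pictured with a small sketch that removes a trailing plural character from a street type, so that "Streets" is treated as "Street". The function name `singular_street_type` is hypothetical, and the sketch does not check the result against a list of known street types, so a type that naturally ends in "s" would be shortened too.

```python
def singular_street_type(street_type, plural_char="s"):
    """Drop a trailing plural character from a street type, if present."""
    if (
        plural_char
        and len(street_type) > len(plural_char)
        and street_type.lower().endswith(plural_char.lower())
    ):
        return street_type[: -len(plural_char)]
    return street_type
```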

Section 3 - Numeric street names

This section allows you to control when the geocoder interprets a number in the address as a street name rather than a house number. Intuitively, we know that if a "3" is followed by the characters "rd", the reference is probably to 3rd St. rather than to house number 3. The values listed in this section are the suffixes that, when they immediately follow a number, mark it as a numeric street name. The default values include:

st (example: 1st)

rd (example: 3rd)

nd (example: 2nd)

th (example: 5th)

To modify this section, add the appropriate characters on blank lines following the last value. This is most likely to be needed when a non-standard suffix is used consistently or systematically in a database (e.g., 3d instead of 3rd).
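The suffix test described in this section can be sketched as follows. This is only an illustration under the assumption that a numeric street name is a run of digits immediately followed by one of the listed suffixes; `DEFAULT_SUFFIXES` and `is_numeric_street_name` are hypothetical names.

```python
import re

# Default suffixes listed in Section 3.
DEFAULT_SUFFIXES = ["st", "rd", "nd", "th"]


def is_numeric_street_name(token, suffixes=DEFAULT_SUFFIXES):
    """True if token is digits followed directly by a listed suffix."""
    match = re.fullmatch(r"(\d+)([A-Za-z]+)", token)
    if match is None:
        return False
    return match.group(2).lower() in [s.lower() for s in suffixes]
```

With the defaults, "3rd" is recognized and "3d" is not; passing `DEFAULT_SUFFIXES + ["d"]` mirrors adding "d" on a new line in the file.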

Section 4 - Intersection conjunctions

This section is used to specify how addresses consisting of street intersections are interpreted. The possible values are the different characters or words that can separate the two street names making up the intersection (e.g., 1st and Main, 1st & Main, 1st/Main, etc.).

To modify this section, add the appropriate characters on blank lines following the last value. This is most likely to be needed when a non-standard separator is used consistently or systematically in a database.

The existing default values in the file include:

@
&
at
and
"/"
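As a sketch of how such a list of conjunctions might be applied, the function below tries each one in turn and splits the address into two street names at the first match. It assumes that word conjunctions ("at", "and") must be surrounded by spaces while symbols may touch the street names; `split_intersection` is a hypothetical helper, not the geocoder's own routine.

```python
import re

# Default conjunctions listed in Section 4 (the slash is shown unquoted here).
DEFAULT_CONJUNCTIONS = ["@", "&", "at", "and", "/"]


def split_intersection(address, conjunctions=DEFAULT_CONJUNCTIONS):
    """Return (street_a, street_b), or None if no conjunction is found."""
    for conj in conjunctions:
        if conj.isalpha():
            # Words need surrounding whitespace so "and" does not match "Sandy".
            pattern = r"\s+" + re.escape(conj) + r"\s+"
        else:
            pattern = r"\s*" + re.escape(conj) + r"\s*"
        parts = re.split(pattern, address, maxsplit=1, flags=re.IGNORECASE)
        if len(parts) == 2 and all(part.strip() for part in parts):
            return parts[0].strip(), parts[1].strip()
    return None
```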

Section 5 - Pre-strip tokens

Pre-strip tokens are character strings that are removed from the address together with anything that follows them. For example, if the string Apt. is specified as a pre-strip token, the characters Apt. together with anything following (such as #115) will be removed.

This section should be modified if your addresses often contain strings not listed below. Remember that pre-strip tokens are removed together with whatever follows them. If you want a specific string removed but not the characters that follow it, list that string in the Post-Strip Tokens section instead.

Default values consist of the following:

Apartment #
Apartment#
Apartment
Apt #
Apt#
Apt
Suite #
Suite#
Suite
Ste #
Ste#
Ste
Room #
Room#
Room
Rm #
Rm#
Rm
Post Office Box #
Post Office Box
P O Box #
P O Box
PO Box #
PO Box#
PO Box
POBox #
POBox#
POBox
P O #
PO#
P O B
POB
Box #
Box#
/ Box -- Don't use "Box" by itself since it interferes with streets like "Box Canyon Rd"
#
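The pre-strip behavior can be sketched as cutting the address at the earliest matching token. The example below uses a short subset of the defaults above and assumes case-insensitive, whole-word matching; the geocoder's actual matching rules are not documented here, and `PRE_STRIP_TOKENS` and `pre_strip` are hypothetical names.

```python
import re

# A short subset of the default pre-strip tokens, for illustration only.
PRE_STRIP_TOKENS = ["Apartment", "Apt", "Suite", "Ste", "PO Box", "#"]


def _token_pattern(token):
    # Require word boundaries around alphanumeric ends so "Ste" does not
    # match inside "Steiner" (an assumption about the matching rules).
    before = r"(?<!\w)" if token[0].isalnum() else ""
    after = r"(?!\w)" if token[-1].isalnum() else ""
    return re.compile(before + re.escape(token) + after, re.IGNORECASE)


def pre_strip(address, tokens=PRE_STRIP_TOKENS):
    """Remove the earliest matching token and everything after it."""
    cut = len(address)
    for token in tokens:
        match = _token_pattern(token).search(address)
        if match:
            cut = min(cut, match.start())
    return address[:cut].rstrip(" ,")
```

Because "Box" alone is not in the list, an address such as "500 Box Canyon Rd" passes through unchanged, which is the reason given in the comment line above.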

Section 6 - Post-strip tokens

Post-strip tokens are character strings that are removed by themselves; whatever follows them is left in the address. These could include items such as "floor", "department", etc. The default values included in the file are:

Floor
Fl

This section should be modified if your addresses often contain strings not listed above.
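In contrast to pre-strip tokens, a post-strip token is removed on its own. A minimal sketch, assuming case-insensitive whole-word matching (the real matching rules may differ) and using the hypothetical names `POST_STRIP_TOKENS` and `post_strip`:

```python
import re

# Default post-strip tokens listed in Section 6.
POST_STRIP_TOKENS = ["Floor", "Fl"]


def post_strip(address, tokens=POST_STRIP_TOKENS):
    """Remove each token but keep the text that follows it."""
    for token in tokens:
        pattern = r"(?<!\w)" + re.escape(token) + r"(?!\w)"
        address = re.sub(pattern, " ", address, flags=re.IGNORECASE)
    # Collapse the extra spaces left behind by the removals.
    return " ".join(address.split())
```

Here only the word "Floor" is dropped from "123 Main St Floor 4"; the "4" that follows stays in the address.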

