Technical Details
External Libraries and Build Tools
Parts of Frank Warmerdam’s Shapefile C Library are included directly in PAGC’s code base, so it is not an external dependency. The only truly external library that PAGC uses is Oracle’s Berkeley DB, which is used for PAGC’s memory pool. Most Linux systems are likely to come with the Berkeley DB libraries installed, although a basic installation of Ubuntu may be an exception. Mac OS X and Windows users who want to build from source must first install Berkeley DB on their systems. On these two systems, additional tools (specifically, the GCC compilers and build system tools) must also be installed in order to build from source.
Mac OS X
Apple’s Xcode Tools. Version 2.0 of Xcode Tools is included on the software CDs that are bundled with Tiger, but it is not part of Tiger’s base install. We have built PAGC using both Xcode Tools 2.0 and 2.2 under Tiger. We have not built PAGC using Xcode Tools 1.5 under Panther, but we see no reason why this would not work.
Windows
A recent release of both MinGW and MSYS.
Address Matching Algorithms
The matching of addresses against the reference address-ranged street network shapefile begins with a rule-based, Aho-Corasick-driven standardization of both data sources. The reference data is indexed using Berkeley DB B-trees for exact key lookups and Soundex lookups, and a pointerless trie indexing scheme, adapted from ideas of Shang and Merrett, for edit distance lookups. The matching of data records uses the standard Fellegi-Sunter method, with modifications to permit similarity measures.
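To make the record-matching step more concrete, the following is a minimal Python sketch of Fellegi-Sunter scoring extended with a similarity measure. It is an illustration only and not PAGC's implementation: the field names, the m/u probabilities, the use of difflib's SequenceMatcher as the similarity measure, and the linear interpolation between the disagreement and agreement weights are all assumptions chosen for the example.

```python
import math
from difflib import SequenceMatcher

# Hypothetical per-field probabilities (assumed values, not PAGC's):
#   m = P(field agrees | records are a true match)
#   u = P(field agrees | records are not a match)
FIELD_PROBS = {
    "house_number": (0.95, 0.05),
    "street_name": (0.90, 0.01),
    "street_type": (0.85, 0.20),
}


def agreement_weight(m, u):
    """Standard Fellegi-Sunter weight when a field agrees."""
    return math.log2(m / u)


def disagreement_weight(m, u):
    """Standard Fellegi-Sunter weight when a field disagrees."""
    return math.log2((1.0 - m) / (1.0 - u))


def similarity(a, b):
    """String similarity in [0, 1]; 1.0 means the strings are identical."""
    return SequenceMatcher(None, a, b).ratio()


def field_weight(a, b, m, u):
    """Interpolate between the disagreement and agreement weights.

    A similarity of 1.0 yields the full agreement weight and 0.0 yields
    the full disagreement weight. This interpolation is one common way
    to admit partial agreement; it is an assumption of this sketch.
    """
    s = similarity(a, b)
    low = disagreement_weight(m, u)
    high = agreement_weight(m, u)
    return low + s * (high - low)


def match_score(candidate, reference):
    """Sum the per-field weights for a candidate/reference record pair."""
    return sum(
        field_weight(candidate[name], reference[name], m, u)
        for name, (m, u) in FIELD_PROBS.items()
    )


reference = {"house_number": "123", "street_name": "MAIN", "street_type": "ST"}
exact = {"house_number": "123", "street_name": "MAIN", "street_type": "ST"}
near = {"house_number": "123", "street_name": "MAINE", "street_type": "ST"}
different = {"house_number": "456", "street_name": "OAK", "street_type": "AVE"}

for label, record in (("exact", exact), ("near", near), ("different", different)):
    print(label, round(match_score(record, reference), 2))
```

In this sketch an exact match receives the highest score, a near match such as MAINE against MAIN scores somewhat lower, and an unrelated record receives a negative score. In practice the score would be compared against upper and lower thresholds to classify a pair as a match, a possible match, or a non-match.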
PAGC Address Layer Display Issues
If a suitable match cannot be found for an address in an address database, PAGC assigns this address NULL coordinates. Unfortunately, some GIS data viewers have difficulty displaying layers that contain features with NULL coordinates, and such layers may cause the viewing software to crash. We know that this is the case for QGIS releases 0.7.2 to 0.7.4. However, QGIS release 0.7.0 has no problem displaying layers created by PAGC that contain addresses with NULL coordinates, and neither does either of the preview versions of QGIS 0.8.0. We have not tested QGIS 0.7.1.