Kiernan Roche

pyusps, a tool for USPS ZIP lookups

by Kiernan Roche on March 30 2018


I work for the Framingham State Alumni Office. Much of my job is database management and maintenance.

Today I was given a spreadsheet of 4,000 donor addresses and asked to look up the full postal (ZIP) code for each one and update the donor database with that information. Most of the entries already contained a five-digit ZIP, but my supervisor wanted the full ZIP+4 code (the five-digit ZIP plus a four-digit suffix; see here for more information). Since looking up 4,000 ZIP codes by hand is inefficient, I decided to automate it. A Python script would do.

After some Googling, I realized that there were few data sources for ZIP+4 codes. Google's Maps API only exposes the first five digits, and Yahoo's relevant APIs were deprecated years ago. Since USPS invented and maintains the ZIP code system and the databases containing ZIP codes, I wondered if USPS itself had an API. If so, the script would always be able to fetch accurate data.

Turns out, USPS has a ZIP lookup tool. You fill out a simple form and get the full nine-digit ZIP code for any US address.

However, the tool doesn't respond to direct requests, such as those made by a script. Clicking the "submit" button makes an AJAX request to an API endpoint located here. I opened up Firefox devtools and examined the requests my browser was making so I could replicate them in Python.

The API accepts POST requests with the following fields:

{
    "address1":     "",
    "address2":     "",
    "city":         "",
    "companyName":  "",
    "state":        ""
}

address1, city, and state are required parameters, while address2 and companyName can be omitted. state is a 2-letter state code (such as MA or HI).
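
To make those rules concrete, here is a minimal sketch of a helper that builds the request fields and rejects incomplete input. The function name build_payload and the validation are my own illustration, not taken from the script, and the API's actual error behavior is not described above.

```python
REQUIRED_FIELDS = ("address1", "city", "state")


def build_payload(address1, city, state, address2="", company_name=""):
    """Return the five request fields as a dict, checking the required ones.

    Hypothetical helper: the field names come from the format shown above,
    everything else is an assumption made for illustration.
    """
    payload = {
        "address1": address1.strip(),
        "address2": address2.strip(),
        "city": city.strip(),
        "companyName": company_name.strip(),
        "state": state.strip().upper(),
    }
    # address1, city, and state must be present; the other two may stay empty.
    missing = [name for name in REQUIRED_FIELDS if not payload[name]]
    if missing:
        raise ValueError("missing required field(s): " + ", ".join(missing))
    # state is a 2-letter code such as MA or HI.
    if len(payload["state"]) != 2 or not payload["state"].isalpha():
        raise ValueError("state must be a 2-letter code such as MA or HI")
    return payload
```

Calling build_payload("123 Main Street", "Somewhere", "ma") returns a dict with state normalized to "MA" and the two optional fields left as empty strings.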

After analyzing the API, I was ready to make my own API calls with a Python script. The script itself is very short and is purpose-built to scrape ZIP codes for an input list of addresses.

The most time-consuming part of writing the script was wrangling the API. If a request doesn't contain a User-Agent header, the API won't respond. It took some trial and error to figure out which header was causing the issue, but once I got it working, the rest of the script was simple to write.
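
As a rough illustration of the header fix, the sketch below builds a POST request with an explicit User-Agent using only the standard library. The endpoint URL is a placeholder, the User-Agent string is an arbitrary browser-like value, and sending the fields form-encoded is an assumption. None of these are confirmed details of the real API or of the script.

```python
import urllib.parse
import urllib.request

# Placeholder only: the real endpoint URL is not reproduced here.
ENDPOINT = "https://example.invalid/zip-lookup"

# Arbitrary browser-like value; the exact string the API expects is an assumption.
USER_AGENT = "Mozilla/5.0 (X11; Linux x86_64; rv:59.0) Gecko/20100101 Firefox/59.0"


def build_request(payload, endpoint=ENDPOINT, user_agent=USER_AGENT):
    """Build (but do not send) a POST request carrying a User-Agent header."""
    # Assumption: fields are sent form-encoded, as a browser form submit would.
    body = urllib.parse.urlencode(payload).encode("utf-8")
    request = urllib.request.Request(endpoint, data=body, method="POST")
    # Set the header explicitly rather than relying on the library default.
    request.add_header("User-Agent", user_agent)
    return request
```

The returned object can be passed to urllib.request.urlopen once ENDPOINT points at the real API. Building the request separately from sending it makes it easy to inspect the headers while debugging.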

The script reads a file, input.txt, containing a newline-delimited list of addresses in the form:

123 Main Street, Somewhere, MA
321 Some Street, Elsewhere, HI
...

For each address, the script looks up the ZIP code, appends it to the address, and writes the result to a file, output.txt.
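
The input and output handling can be sketched roughly as follows. The names parse_line, annotate, and fake_lookup are hypothetical, the lookup is a stand-in that returns a made-up value instead of calling the API, and separating the address from the ZIP with a space is an assumption about the output format.

```python
def parse_line(line):
    """Split '123 Main Street, Somewhere, MA' into (address1, city, state)."""
    parts = [part.strip() for part in line.split(",")]
    if len(parts) != 3 or not all(parts):
        raise ValueError("expected 'street, city, state': " + repr(line))
    return tuple(parts)


def annotate(lines, lookup_zip):
    """Yield each non-blank address line with the looked-up ZIP appended."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the input
        address1, city, state = parse_line(line)
        # Assumption: the ZIP is appended after a single space.
        yield f"{line} {lookup_zip(address1, city, state)}"


def fake_lookup(address1, city, state):
    """Stand-in for the real API call; always returns a made-up ZIP+4."""
    return "00000-0000"
```

To process real files, open input.txt for reading and output.txt for writing, then write each string yielded by annotate followed by a newline, with fake_lookup swapped for a function that actually queries the API.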

The script is on GitHub. As always, I will accept pull requests containing fixes or new features.

Category: Software