Smarter Insights Into PD&R Data: Introducing the Neighborhood Change Web Map API

Keywords: Office of Policy Development and Research, Data, Neighborhood Change Web Map, Application Programming Interface, Python, Census Tracts

 
Research & Data Spotlight

Introduction

Since 2005, HUD's Office of Policy Development and Research (PD&R) has collected and published quarterly HUD Aggregated United States Postal Service (USPS) Address Data. These data, now standardized to 2020 census tracts, originally were intended to measure residential vacancy rates, but they also reveal which neighborhoods are producing housing. Because HUD updates these data frequently and makes them available nationally, they offer policymakers and researchers a powerful way to measure neighborhood change in near-real time. 

In 2023, PD&R launched the Neighborhood Change Web Map (NCWM), which allows planners, analysts, policymakers, and others interested in neighborhood change to query HUD Aggregated USPS Address Data through a web-based map instead of assembling multiple data tables manually. To make these data more useful, PD&R has developed an Application Programming Interface (API) that lets users interact with the NCWM data programmatically and build the data into their own workflows and designs.

This article demonstrates how to use the NCWM API with Python to retrieve USPS address data and analyze recent housing growth, using a census tract near Lubbock, Texas, as an example.

What is an Application Programming Interface, and how is it different from other data retrieval methods?

An API is an interface between two computers that facilitates the transfer of data between a client that requests the data and a server that sends them. Rather than manually downloading tables with a web browser, clients using the NCWM API can send structured requests directly to the server and receive data in return. Using an API offers several advantages: the script documents the data retrieval process, makes the retrieval replicable in the future, reduces intermediate data storage, and is considerably faster than manually retrieving and compiling data.

Using the NCWM API to query census tract address data near Lubbock, Texas

The following example demonstrates how to use the NCWM API to query HUD Aggregated USPS Address Data for census tract 48303010402, a fast-growing census tract west of Lubbock, Texas (figure 1, shown in cyan). This tract has experienced rapid housing growth in recent years. Since the 2020 decennial census, the number of USPS active residential addresses in the tract has nearly tripled, increasing from 950 in the third quarter of 2020 to 2,799 in the third quarter of 2025. Plainly stated, census tract 48303010402 now has nearly three times as many active residential addresses as it did at the time of the 2020 decennial census.
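
The "nearly tripled" figure can be checked with a few lines of arithmetic. The sketch below uses only the two counts quoted above; the variable names are illustrative.

```python
# Active residential address counts quoted in the article for tract 48303010402
active_2020_q3 = 950
active_2025_q3 = 2_799

# Ratio of the later count to the earlier count
growth_ratio = active_2025_q3 / active_2020_q3

# Percentage change relative to the earlier count
percent_change = (active_2025_q3 - active_2020_q3) / active_2020_q3 * 100

print(f"Growth ratio: {growth_ratio:.2f}x")      # Growth ratio: 2.95x
print(f"Percent change: {percent_change:.1f}%")  # Percent change: 194.6%
```

A ratio of about 2.95 is consistent with the description of the count as nearly tripling.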

Figure 1. NCWM Display of Census Tract in Lubbock, Texas




To access the NCWM API, data users must first register to use the HUD Aggregated USPS Administrative Data on Address Vacancies. Access to the data and the API is restricted to users affiliated with government agencies, educational institutions, or nonprofit organizations. After registering, the next step is to create an API token through your HUD User API account. The token is a unique key that tells the server that the data user's access to the API is valid; without it, the server will not return any data.
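
One common way to keep a token out of a script is to read it from an environment variable. The following is a minimal sketch of that approach, not a HUD requirement. The variable name HUD_API_TOKEN and the helper build_auth_headers are assumptions made for illustration; the header layout follows the script shown later in figure 2.

```python
import os


def build_auth_headers(env_var="HUD_API_TOKEN"):
    """Build request headers from a token stored in an environment variable.

    The environment variable name is hypothetical; use whatever name suits
    your own setup.
    """
    token = os.environ.get(env_var)
    if not token:
        raise RuntimeError(f"Set the {env_var} environment variable to your API token")
    return {
        "Authorization": f"Bearer {token}",  # same header layout as figure 2
        "Accept": "application/json",
    }
```

With the variable set in the shell, calling build_auth_headers() returns a dictionary that can be passed as the headers argument of a request, so the token never appears in the script itself.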

After retrieving the API token, the user must build the Python script to query the data. The NCWM API requires that the user specify several parameters to select records:

  • type: This argument indicates which NCWM table should be queried to return data standardized to 2020 census tracts. The choices are the following:
    • 1: USPS address data
      • Contain address counts, including vacancy status, using administrative data from USPS.
    • 2: HUD-assisted household data
      • Contain counts of households participating in public housing, tenant-based voucher, project-based voucher, project-based section 8, multifamily, and other HUD programs.
    • 3: Decennial census data
      • Contain counts of households and population from the 2010 and 2020 decennial censuses, which serve as references for benchmarking the USPS address data and are helpful for answering questions such as, "Are the USPS address data in a particular census tract reliable?"
    • 4: American Community Survey (ACS) data
      • Contain select variables, such as the neighborhood poverty rate. Data are from non-overlapping 5-year ACS periods: 2008 to 2012, 2013 to 2017, and 2018 to 2022.
    • 5: Cumulative Low-Income Housing Tax Credit (LIHTC) placed-in-service data
      • Contain the total number of LIHTC units placed into service in a census tract.
  • year
    • 2008 to 2025 (with updates)
  • month
    • Quarterly periods: 3, 6, 9, and 12
  • tractid or stateid
    • This argument represents the census tract or set of census tracts per state that should be queried. This example uses tractid: "48303010402", but an alternative would be to use stateid: "48" to query all tracts in Texas. Currently, the API supports querying only by a specific census tract or by state.
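
The parameter rules listed above can be checked before any request is sent. The sketch below is a hypothetical helper, build_ncwm_params, that is not part of the NCWM API. It encodes only the constraints described in this list, and the API itself may apply different or additional rules for some tables.

```python
VALID_TYPES = {1, 2, 3, 4, 5}   # table types listed above
VALID_MONTHS = {3, 6, 9, 12}    # quarterly periods listed above


def build_ncwm_params(table_type, year, month, tractid=None, stateid=None):
    """Return a query-parameter dictionary after basic validation.

    Hypothetical helper: the checks mirror the parameter list in the
    article, not an official specification.
    """
    if table_type not in VALID_TYPES:
        raise ValueError(f"type must be one of {sorted(VALID_TYPES)}")
    if month not in VALID_MONTHS:
        raise ValueError(f"month must be one of {sorted(VALID_MONTHS)}")
    # Exactly one geographic selector may be supplied
    if (tractid is None) == (stateid is None):
        raise ValueError("Provide exactly one of tractid or stateid")

    params = {"type": table_type, "year": year, "month": month}
    if tractid is not None:
        params["tractid"] = str(tractid)  # keep as text to preserve leading zeros
    else:
        params["stateid"] = str(stateid)
    return params


print(build_ncwm_params(1, 2025, 9, tractid="48303010402"))
```

Validating locally surfaces a mistyped month or a missing tract identifier immediately rather than after a failed request.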

Figure 2 presents an example of Python code that calls the NCWM API to retrieve USPS address data for a single census tract across all available years and then formats that information into a pandas dataframe.

The first section imports two libraries into the script, pandas and requests, which manage the data as a dataframe and call the API, respectively. 

The second section, Variables, sets the variables used in the script, including the API token, the API endpoint (url), the years and months for which to request data, the request headers that carry the token, and an empty list that temporarily holds the dataframes containing each quarter's data returned from the API call.

The third section in the code block (Call the API to request the data) is a nested loop that calls the API for each year and quarter and appends the returned data to the holding list. The outer loop goes year by year, whereas the inner loop cycles through March, June, September, and December. Within the nested loop, the params dictionary specifies the query parameters: the table type (in this case, USPS address data), the year, the month, and the single census tract. The script creates an object called response that uses the requests library to send the parameters and the API token (in the headers) to the NCWM API. If response has a successful HTTP status code and the returned 'results' list is not empty, data['results'] is transformed into a pandas dataframe and held in the list. The script adds a YEAR_MONTH column to the dataframe because that field does not occur within the 'results' list in the object returned by the API call. The final section (Concatenate the dataframes into a single dataframe) then combines the dataframes in list_of_dfs into a single dataframe.

The Python script in figure 3 illustrates how to inspect the first of the 71 rows in the dataframe, and figure 4 shows the first 13 fields of the resulting output. Analysts can use this code to review the received data and write further code for data processing or analysis. All the returned fields have the object data type, also commonly referred to as string or text, so users must convert count fields to whole numbers (integers) or to data types that support fractional numbers (for example, doubles and floats) before analysis.

Figure 2. Using Python to call the API


import pandas as pd
import requests

# Variables
api_token   = "YOUR_API_TOKEN"  # replace with the token from your HUD User API account
url         = "https://www.huduser.gov/hudapi/public/uspsncwm"
years       = range(2008, 2026)
months      = range(3, 13, 3)
list_of_dfs = []
headers     = {
    "Authorization": f"Bearer {api_token}",
    "Accept": "application/json"
}

# Call the API to request the data
for year in years:
    # print(f"Processing year: {year}")
    for month in months:
        params = {
            "type": 1,
            "year": year,
            "month": month,
            "tractid": "48303010402"
        }

        try:
            # A timeout keeps the loop from hanging if the server does not respond
            response = requests.get(url, headers=headers, params=params, timeout=30)
            response.raise_for_status()
            data = response.json()

            if 'results' in data and data['results']:
                df = pd.DataFrame(data['results'])
                df["YEAR_MONTH"] = f"{year}-{str(month).zfill(2)}"
                list_of_dfs.append(df)
                print(f"Year-Month: {year}-{month}: {len(df):,} records fetched")

        except requests.exceptions.RequestException as e:
            print(f"Request failed for {year}-{month}: {e}")
        except ValueError as e:
            print(f"JSON decode error for {year}-{month}: {e}")

# Concatenate the dataframes into a single dataframe
if len(list_of_dfs) > 0:
    df2 = pd.concat(list_of_dfs, ignore_index=True)
else:
    df2 = pd.DataFrame()
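
The nested loop above issues one request per year and quarter, which comes to 18 years times 4 quarters, or 72 requests. A quarter that returns no results adds no rows to the final dataframe, so the row count can be smaller than 72. The sketch below rebuilds the same year-month labels with itertools.product and compares them with the labels that were fetched. The fetched_labels list is a hypothetical stand-in for illustration; in practice it could come from the YEAR_MONTH column.

```python
from itertools import product

years = range(2008, 2026)   # 2008 through 2025, as in figure 2
months = range(3, 13, 3)    # 3, 6, 9, 12

# One label per request, in the same YYYY-MM form used for YEAR_MONTH
periods = [f"{year}-{month:02d}" for year, month in product(years, months)]
print(len(periods))              # 72
print(periods[0], periods[-1])   # 2008-03 2025-12

# Hypothetical example: pretend every period except the last returned data
fetched_labels = periods[:-1]
missing = sorted(set(periods) - set(fetched_labels))
print(missing)                   # ['2025-12']
```

Listing the missing periods makes it clear whether a gap reflects data that have not been published yet or a request that failed.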

Figure 3. Python code to inspect the first row of the dataframe


# Walk the first row of the concatenated dataframe to inspect the values
print(f"\nTotal records in dataframe: {len(df2):,}\n")

if not df2.empty:
    max_len = max(len(col) for col in df2.columns)
    print(f"{'Idx':<3} | {'Dtype':<10} | {'Column Name':<{max_len}} | First Value")
    print('-' * (6 + 13 + max_len + 16))

    for idx, col in enumerate(df2.columns):
        d_type = str(df2[col].dtype)
        first_val = df2[col].iloc[0]
        print(f"{idx:<3} | {d_type:<10} | {col:<{max_len}} | {first_val}")
else:
    print("No data was fetched")

Figure 4. First row of the complete dataframe


Total records in dataframe: 71

Idx | Dtype      | Column Name                   | First Value
----------------------------------------------------------------
0   | object     | TRACT ID                      | 48303010402
1   | object     | STATE_GEOID                   | 48
2   | object     | COUNTY_GEOID                  | 303
3   | object     | STCNTY                        | 48303
4   | object     | STATE_NAME                    | Texas
5   | object     | COUNTY_NAME                   | Lubbock County
6   | object     | CBSA_CODE                     | 31180
7   | object     | CBSA_TITLE                    | Lubbock, TX
8   | object     | METRO_MICRO_AREA              | Metropolitan Statistical Area
9   | object     | METRO_DIVISION_CODE           | None
10  | object     | METRO_DIVISION_TITLE          | None
11  | object     | CENTRAL_OUTLYING_COUNTY       | Central
12  | object     | R_SUM_TOT_RES                 | 1114.0
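
Because every field in the output above arrives as text, count fields need converting before any arithmetic. The following is a minimal sketch using pandas.to_numeric on a small stand-in dataframe whose values are copied from figure 4. Treating R_SUM_TOT_RES as a whole-number count is an assumption based on its displayed value.

```python
import pandas as pd

# Stand-in for a few columns of the API result; values copied from figure 4
sample = pd.DataFrame({
    "STATE_NAME": ["Texas"],
    "CBSA_CODE": ["31180"],
    "R_SUM_TOT_RES": ["1114.0"],
})

# Convert only the count columns; identifier codes stay as text so that
# any leading zeros are preserved
count_cols = ["R_SUM_TOT_RES"]
for col in count_cols:
    # errors="coerce" turns unparseable text into missing values;
    # "Int64" is pandas' nullable integer type, which can hold those gaps
    sample[col] = pd.to_numeric(sample[col], errors="coerce").astype("Int64")

print(sample.dtypes)
```

Fields that hold fractional values can skip the final astype step and remain floats.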
  

Using matplotlib to graph data called from the NCWM API

After the NCWM API returns the USPS address data and the script loads them into a dataframe, users can apply matplotlib to understand housing growth in this census tract by graphing the change in the number of active, vacant, and no-stat residential addresses over time. The code block in figure 5 converts the YEAR_MONTH column to dates, converts the residential address counts from text strings to numeric values, and then plots the data as a stacked bar chart with customizations to improve readability.

Figure 6 displays the resulting graph, which shows a steady increase in active residential addresses (green sections of the bars) between 2010 and 2020, followed by a period of sharp growth.

Figure 5. Python code to plot a stacked bar chart for residential address data in census tract 48303010402


import matplotlib.pyplot as plt

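# Format and sort YEAR_MONTH, then reset the index so row labels match bar positions
df2["YEAR_MONTH"] = pd.to_datetime(df2["YEAR_MONTH"], format="%Y-%m")
df2 = df2.sort_values(by="YEAR_MONTH", ascending=True).reset_index(drop=True)

# Convert the address counts from text to integers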
cols = [
    "TOTAL_RESIDENTIAL_ADDRESSES",
    "Active_Residential_Addresses",
    "STV_RESIDENTIAL_ADDRESSES",
    "LTV_RESIDENTIAL_ADDRESSES",
    "NO_STAT_RESIDENTIAL_ADDRESSES"
]
df2[cols] = df2[cols].fillna(0).astype(float).astype(int)

# Plot
plt.figure(figsize=(12, 6))
x = range(len(df2))
plt.bar(x, df2["Active_Residential_Addresses"], label="Active", color="#4CAF50")
plt.bar(
    x,
    df2["STV_RESIDENTIAL_ADDRESSES"],
    bottom=df2["Active_Residential_Addresses"],
    label="Short-Term Vacant",
    color="#FFFF00"
)
plt.bar(
    x,
    df2["LTV_RESIDENTIAL_ADDRESSES"],
    bottom=df2["Active_Residential_Addresses"] + df2["STV_RESIDENTIAL_ADDRESSES"],
    label="Long-Term Vacant",
    color="#FFA500"
)
plt.bar(
    x,
    df2["NO_STAT_RESIDENTIAL_ADDRESSES"],
    bottom=df2["Active_Residential_Addresses"]
    + df2["STV_RESIDENTIAL_ADDRESSES"]
    + df2["LTV_RESIDENTIAL_ADDRESSES"],
    label="No-Stat",
    color="#FF0000"
)

# Customize x-ticks on graph: label only the September (third-quarter) bars
ticks_to_show = [i for i, date in enumerate(df2["YEAR_MONTH"]) if date.month == 9]
labels_to_show = [df2["YEAR_MONTH"].iloc[i].strftime("%Y-%m") for i in ticks_to_show]

# Format graph
plt.xticks(ticks=ticks_to_show, labels=labels_to_show, rotation=45, ha="right")
plt.xlabel("Year-Month")
plt.ylabel("Residential Address Count by Mail Delivery Status")
plt.title("Residential Addresses Over Time")
plt.legend(title="Address Type")
plt.tight_layout()
plt.grid(axis="y", linestyle="--", alpha=0.4)
plt.show()
  

Figure 6. Residential addresses by mail delivery status for census tract 48303010402



Conclusion

This article demonstrated how to use Python to call the NCWM API and query HUD Aggregated USPS Address Data to understand housing growth near Lubbock, Texas. Other NCWM datasets are also available with small changes to the code. Although the example in this article used Python to query the API endpoint, users can structure queries using various programming languages. The example showed that the number of USPS active residential addresses in the tract near Lubbock grew slightly but remained relatively steady before 2020. Beginning in 2020, the number of active residential addresses grew rapidly, nearly tripling by the third quarter of 2025, which suggests a high rate of housing production. Local planners, analysts, and policymakers can use similar methods to track neighborhood change in census tracts nationwide.

Note

Data users should remain mindful of the sublicense agreement for the USPS address data, which does not permit selling, licensing, or distributing the data.

Pandas is a data manipulation library in Python that holds data as a table, similar to a data table in Microsoft Excel or a SQL database, in what is called a dataframe.

A dataframe is a data structure organized into a two-dimensional table of rows and columns.

 
Published Date: 22 January 2026


The contents of this article are the views of the author(s) and do not necessarily reflect the views or policies of the U.S. Department of Housing and Urban Development or the U.S. Government.