A Picture of Subsidized Households: General Description of the Data, Technical Comments, and Bibliography

General Description of the Data

This data file sketches a picture of subsidized households across the United States. Each line is identified by key numbers and letters. This file is very similar to the 1996 file on the same subject; differences are noted on the main data page.


Data on households were sent by local housing agencies to the U.S. Department of Housing and Urban Development (HUD), and were summarized by HUD. This report uses the latest report on each household, provided it was received in the last 30 months. The average form is 10 months old. Earlier data are available for 1977, 1993, and 1996 (see bibliography). The completeness or incompleteness of the data is shown on every line and is discussed below.

Data on the size and location of each program and project are from HUD's own administrative records, plus a database on Tax Credit projects collected by a contractor. (This is the only source used for Tax Credits, since data on individual households are not collected in that program.)

Program Overview

Rents are subsidized in all programs covered in this report. Households generally pay rent equal to 30% of their incomes, after deductions, and the Federal government pays the rest. To enter the programs, people must generally be below an income limit, which varies by household size and location. They apply, and wait until their name rises to the top of the waiting list for the limited number of subsidized units available. Tax Credits are a program of the Internal Revenue Service, where landlords obtain tax benefits for renting to low income households. Some projects have a mix of subsidized and unsubsidized units; just the subsidized units are counted here.

Report Overview

The data file shows summary records for: the whole United States, each state, each housing agency, neighborhood (Census tract) and individual housing project. Only the summary records are included; no records for individual households. A Census tract is a compact area averaging about 1,500 households. We lack addresses for about 14% of subsidized housing units, so they cannot be placed in tracts, and thus the tract totals are underestimates. Each summary record shows the number of housing units being summarized, completeness of the household data, average household income, percent minority, elderly, female headed, Zip code, county, metropolitan area, etc. In the reports, the information is spread across two facing pages. Column headings in the reports are printed vertically, and are explained more fully on pages 12-16.

Summary records are shown even when data are incomplete. To protect reliability we do not show household information where fewer than 40% of households are reported. We still show project number, project size, location, etc. To protect privacy we also do not show household information where 10 or fewer households are reported. All these households are still included in larger summaries, such as for the agency, state and United States.

Order of the File

The records are grouped by state. Within each state they are sorted by project number, so individual project numbers are easy to find. You can sort the files in any way you need. The numbers for different programs are sorted with Public & Indian numbers first in each state, then Section 8 project numbers, then Federal Housing Administration (FHA) project numbers, then Tax Credit projects, and finally Census tract summaries that total across all programs.

Suggestions for Using the Data File

You can use various software tools to explore the database. If you read the file on the Internet, your browser probably has a 'look for' button, so you can move quickly to any city name, project name, project number, etc. If you download the data to your own computer, you can use a spreadsheet or database package. Such packages let you:

  • Jump to any part of the file (often using the f5 key, or a command to 'look for' a specific name)

  • Sort the records to see the highest incomes, or most concentrated tracts, or most integrated large projects, etc. (for example sort by record type and income)

  • Print selected items (in a spreadsheet, set the width of the other items to zero; in a database, list the items you want in a report)

  • Map the data (by selecting the records you want to map, then asking for an X-Y graph, using longitude for the X axis, and latitude for the Y axis, with a small symbol at each point, and no lines to connect the points; some databases do not give such graphs).

  • Print a referral list of subsidized housing in an area, with size, turnover, % elderly, neighborhood characteristics, etc.

The national data file is about 50 megabytes. State files are smaller. File names include 'hud3' to distinguish them from the earlier files, which use 'hud2' for 1996 data and 'hud1' for 1993 data. Three extra files come with the data:

hud3.wk1      Empty spreadsheet file, ready to fill: labels and widths of variables are already defined
hud3.dbf      Empty data base file, ready to fill: names and widths of variables are already defined
Readme3.txt   Documentation

Database Software

After downloading you can open hud3.dbf, which has the names and widths of the variables. Then import the file as delimited ASCII. In dBase:
  .Use C:\hud3.dbf
  .Append from C:\hud3AK1.txt Delimited
  .Go top

Spreadsheet Software

After downloading you can open hud3.wk1, which has the names and widths of the variables. Then import the file as an ASCII file with numbers separated by commas (text is enclosed in quotes). In Lotus:
  /File Retrieve C:\hud3.wk1
  /File Import Numbers C:\hud3AK1.txt
If your spreadsheet cannot read large files, use another package to split the file into pieces.

Fixed Format Software

Names and widths of the variables are in the layout printed on pages 12-16. The files are delimited ASCII, with double quotes around text, and commas separating all fields, so they can be read directly by most programs. At the same time the files are in fixed format, so they can be read by programs that need fixed formats; such programs can skip over the commas and quotes. We have kept the record length under 240 characters, which is a limit in Lotus, since we expect some users will use Lotus.
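
As a rough illustration of reading the comma-delimited form, the sketch below uses Python's standard csv module. The three-field sample, its values, and the function name are invented for the example; they are not the real record layout, which is the one printed on pages 12-16. Padding numbers with leading blanks is also an assumption about how the fixed widths are kept.

```python
import csv
import io

# Hypothetical three-field sample (project number, name, one numeric item).
# The real records have many more fields; these values are invented.
SAMPLE = (
    '"AK001001","EXAMPLE PROJECT",  120\n'
    '"AK001002","OTHER PROJECT",   -1\n'
)

def read_records(handle):
    """Yield each record as a list of strings with padding blanks removed."""
    # Text is in double quotes and commas separate all fields; blanks that
    # follow a comma (assumed fixed-width padding) are skipped.
    reader = csv.reader(handle, quotechar='"', skipinitialspace=True)
    for row in reader:
        yield [field.strip() for field in row]

records = list(read_records(io.StringIO(SAMPLE)))
for record in records:
    print(record)
```

To read a downloaded file instead of the sample, pass an open file handle (opened with newline='') to read_records.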

Scope of the Data File

The following table shows record types that are present in the computer file. For example there is one national summary record in each program, and 30-55 state summaries, depending on the program. Project records are present in some programs, tract records in others, and Moderate Rehabilitation has neither, but just agency summaries.

Number of Records by Type

                                            Total  National  Size Class    State  Housing Agency  Project    Tract
                                                    Records     Records  Records         Records  Records  Records
All Records                               196,209         9          43      460           6,626   48,649  140,422
Records that are Totals across Programs    61,328         1           9       55

Indian Housing                              2,903         1           7       31             183    2,681
Public Housing                             17,014         1           9       54           3,195   13,755
S.8 Certificates+Vouchers                  81,818         1           9       55           2,594
Moderate Rehab.                               717         1           9       53             654

S.8 New+Substant.Rehab.                    15,233         1
S.236                                       4,277         1
Other Subsidy                               3,420         1
Housing Tax Credit                          9,499         1

The 1996 file made more of an attempt to give one summary for each agency with multiple numeric codes, such as some state agencies. These are now reported as separate agencies, because some of their projects are being re-assigned to local agencies, and processing the current totals was too complex. Where we do still summarize several codes into one agency (as in Puerto Rico and the Navajo agency), we no longer also have partial summaries for each code. Any project whose agency is not shown in the beginning of the project number has a note in the name (e.g. RQ001001 is noted as being @RQ005).

This data file covers about four and a half million HUD-subsidized housing units, and a third of a million housing units assisted by Low Income Housing Tax Credits, for a total of nearly five million subsidized housing units. About a quarter are in Public Housing projects. Another quarter have Section 8 Certificates or Vouchers, which let participants choose their own rental units in the private market. About a fifth of the subsidized units are in the Section 8 New Construction and Substantial Rehabilitation programs. The other units are divided among various other programs, primarily Section 236 and the Low Income Housing Tax Credit.

This report does not cover other types of subsidy, such as Rehabilitation grants, Homesteading, the Farmers Home Administration, etc.

Only a few topics are mentioned here. More discussion was given in the 1996 report, and the data here update those figures, and provide somewhat more complete geographic information. No changes are made in Tax Credits, since we have no newer data than last year. Detailed definitions of all the items are on pages 12-16.

Basic Counts

                                   Subsidized                      Average
                                      Housing                       Months   Average                                   Income as
                                        Units         %         %    since    People       Rent   Spending    Income  % of Local
                                    Available  Occupied  Reported   Report  per Unit  per Month  per Month  per Year      Median
U.S. Total                          4,901,000                  80       10       2.3        202        402    $9,200
Indian Housing                         70,000                  48       14       3.7        173        262    16,000
Public Housing                      1,322,000        90        80       11       2.4        192        345     8,900          25
S.8 Certificates+Vouchers           1,433,000                  84        9       2.8        204        432     9,100          24
S.8 Moderate Rehab.                   110,000                  65       10       2.3        159        440     7,400          21
S.8 New+Substantial Rehabilitation    895,000                  83        8       1.6        190        494     8,900
S.236                                 448,000                  76       10       2.1        255       *302    10,000
Other Subsidy                         292,000                  71        9       2.4        199       *391    10,000
Housing Tax Credit                    332,000

Note: Blank means data not available

Note:*only includes Section 8 spending (Loan Management), averaged over units receiving it, not spending from Section 236 and other programs.

Percent Occupied

Occupancy rates are only collected for Public Housing. We assume plausible levels of occupancy in other programs for purposes of calculating the completeness of reporting (see page 13).

Percent Reported

Overall reporting is 80% of occupied units (so 20% of the data are missing). Public Housing, Certificates+Vouchers, Section 8 New, and Section 236 have the highest reporting, from 76% to 84%. The smaller programs range from 48% to 71% complete. In Tax Credits no reporting is required. When households take certificates or vouchers to a new area (portability), they are reported by the local agency there, but they are still billed to the original agency that issued the certificate or voucher, and in this file they are counted against that original agency. In previous files they were wrongly counted against the agency that reported them.

Some of the households counted above in Section 236 projects have incomes high enough to pay full rent, without subsidy. These households are not reported to HUD and their characteristics are not included in this report.

Average Months since Report

Households should be reported annually, on a flow basis, so for good recent data the average should be about 6 months. Larger averages mean recent data have not been received, or errors have been made so that some of the recent records have not yet been accepted into the system.

Average People per Unit

Household size varies from 1.6 people per unit in Section 8 New Construction and Substantial Rehabilitation to 3.7 in Indian Housing. You can also multiply these figures by the number of housing units to find the Total Number of People Served by the programs: about 11 million.

Rent per Month

Average rents paid by households range very narrowly from $159 per month in Moderate Rehabilitation projects to $255 per month in Section 236. These rents include estimates by each housing agency of tenant-paid utilities. Rents in Indian Housing do not mirror their relatively higher incomes, since most Indian Housing is Mutual Ownership, with a distinct calculation for household cost.

Spending per Month

This is a new item on this file. This spending does not reflect what it would cost to expand any particular program. For example Public Housing costs appear low, since HUD paid off the construction costs several years ago. New Public Housing would cost much more than the average current spending, since construction costs would have to be paid. Also low spending can reflect high incomes of occupants, as in Indian housing, not necessarily management efficiency.

For Section 8, total spending is calculated for each household (from the rent calculation form) and averaged, so averages do vary in each tract and agency. In Certificates+Vouchers and Moderate Rehabilitation our estimate also includes the 8% administrative fee. We are not able to estimate or include the spending on vacant units.

For Public & Indian housing, we only include 1996 operating and modernization spending, since the construction costs have already been paid (the long term bonds were paid off a few years ago), and opportunity cost is not available. Drug Elimination Grants and other smaller grant programs are not covered.

Operating subsidy to all agencies, and modernization funding to large agencies (generally over 250 units), is distributed every year by formula so we use the actual funding obligations.

For smaller agencies modernization funding is competitive, and they do not receive it every year, so the actual funding for any agency in one year would not reflect average long term spending. As an approximation we found the average modernization funding per unit, and multiplied this average by the units in each agency to estimate the long term expected value of modernization spending. A higher fraction of Indian agencies receive modernization than of other agencies, so we calculated the average separately for Indian and non-Indian agencies.

Then in each agency, for this summary, total spending in Public and Indian Housing was averaged per occupied unit per month, without differentiating among projects. In fact some receive more or less than average each year.

In other programs spending is not yet available.

Income per Year

Incomes range from $7,400 per year in Moderate Rehabilitation to $16,000 per year in Indian Housing. This is total income, without subtracting the adjustments that are used in calculating rent. However some types of income are not counted at all by HUD and are not included here, such as scholarships and earnings of minors.

Income as Percent of Local Median

The median income of families in each metropolitan area and each non-metropolitan county is estimated regularly by HUD, because it is used in setting income limits for housing subsidies. (The most common income limit is 50% of local median income, adjusted for household size(1), and with some other adjustments which restrict the variation from area to area). In this file we compare each tenant household's income to the adjusted median income (i.e. twice the 50% limit), so for example if the 50% income limit for a particular household is $15,000, then adjusted local median income is $30,000 and a subsidized household with $10,000 income has 33% of adjusted local median income.

When we say 'adjusted for household size' we mean the following: In households with 4 people we make no adjustment; we use the median as the base, simply taking the tenant household's income as a percent of median. We reduce the base 10% for each person under 4 in a household (3 people: income as a percent of 90% of the median; 2 people: 80%; 1 person: 70%). We increase the base 8% for each person over 4 in the household (5 people: income as a percent of 108% of the median; 6 people: 116%; etc.). Thus Certificates and Vouchers have higher dollar levels of income than Public Housing, but similar percents of median income, since their household sizes are larger, and the base for each large household is adjusted upward.
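
The household-size adjustment can be sketched in a few lines of Python. This is only an illustration of the percentages stated above: the function names are made up for the example, and the sketch ignores the other adjustments that restrict variation from area to area.

```python
def size_base_percent(people):
    """Percent of the local median used as the base for a household of this size."""
    if people < 1:
        raise ValueError("a household needs at least one person")
    if people <= 4:
        # Reduce the base 10% for each person under 4 (3 people: 90, 2: 80, 1: 70).
        return 100 - 10 * (4 - people)
    # Increase the base 8% for each person over 4 (5 people: 108, 6: 116, ...).
    return 100 + 8 * (people - 4)

def percent_of_adjusted_median(income, local_median, people):
    """Household income as a percent of the size-adjusted local median."""
    base = local_median * size_base_percent(people) / 100
    return 100 * income / base

# A 4-person household with $10,000 income where the median is $30,000.
print(round(percent_of_adjusted_median(10_000, 30_000, 4)))
# The same income in a 2-person household is measured against 80% of the median.
print(round(percent_of_adjusted_median(10_000, 30_000, 2)))
```

The same dollar income gives a higher percent in a small household and a lower percent in a large one, which is why programs with larger households can show higher dollar incomes but similar percents of median.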

The figures here are average ratios for each project, census tract, agency, state, etc. One could also have sorted the households in each project, etc., and found the median ratio, which would have been somewhat smaller.

Technical Comments

Some agencies reported zero in some items, rather than their actual data. We have removed most of the bias, generally by treating zero as missing data. Virtually all zero rents and zero incomes are indeed missing data, since the only households with truly zero gross income and zero gross rent are formerly homeless households who for some reason do not expect any income assistance in the coming year. Zero bedrooms are meaningful (efficiency apartments), and we try to minimize bias by treating units with no bedrooms reported and more than 2 people as missing data. We did not process households with invalid project numbers, or with no people reported, since we would not have been able to categorize them properly.

Number of Subsidized Units by Missing Data

                                  Total  % Lack Geography  % Lack Tenant Data
Housing Agency Records
Total                         2,934,000                 7                  15
Indian Housing                   70,000                79                  49
Public Housing                1,322,000                 8                  18
S.8 Certificates+Vouchers     1,433,000                 2                  11
Moderate Rehab.                 110,000                11                  23

Project Records
Total                         3,366,000                14                  24
Indian Housing                   70,000                91                  53
Public Housing               *1,329,000                19                  21
S.8 New+Substant.Rehab.         895,000                 4                  16
S.236                           448,000                 2                  18
Other Subsidy                   292,000                10                  26
Housing Tax Credit              332,000                24                  56

Tract Records
S.8 Certificates+Vouchers     1,206,000               **0                  15

Note:* For a few agencies, HUD erroneously shows more units in projects than the total for the agency. We reflect that fact here, since we do not know where the error is.

Note:** No tract record lacks geography, but 16% of Certificate+Voucher units are not shown on tract records, and thus these 16% lack geography, even after weighting. The Percent Reported in a previous table shows the missing data overall, before weighting.

This table shows the number of subsidized units where agency or project records have no tenant data or geographic data (because of non-reporting of either, or suppression of tenant data to protect confidentiality). Agencies have less suppression of tenant data than projects, since they are larger. Agencies also have less missing geographic data, since we can have geographic codes for an agency if any of its projects are geographically coded. Specifically this table is based on the absence of data on bedrooms and latitude from the records. Other items may be more or less complete.

Incomplete Data are shown by the column headed "% Reported," the column in the middle of the even-numbered pages. The summaries are weighted to adjust for partial reporting in any project: households reported are assumed representative of other occupied units in the same project (or program and agency for Certificates+Vouchers and Moderate Rehabilitation). This assumption is not always correct. Furthermore, if far too many or too few households were reported in a project or program, the weight was limited to the range from .5 to 5, to avoid great discrepancies in weights, and resulting variance in the final data. The range was limited to .5 to 1.49 on Certificate+Voucher tract summaries, to avoid implying that two or more households moved to the same remote tract, when we only have one report from it. There is no weighting to represent the places that did not report at all.
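
The limits on weights can be illustrated with a short Python sketch. It assumes the unclamped weight is simply occupied units divided by households reported, which the text implies but does not state; the function name and arguments are hypothetical.

```python
def reporting_weight(occupied_units, households_reported, cv_tract_summary=False):
    """Weight applied to each reported household, limited to the stated range.

    Assumption: the raw weight is occupied units / households reported.
    """
    if households_reported <= 0:
        raise ValueError("no households reported, so no weight can be formed")
    raw = occupied_units / households_reported
    # Certificate+Voucher tract summaries use the narrower range .5 to 1.49.
    low, high = (0.5, 1.49) if cv_tract_summary else (0.5, 5.0)
    return min(max(raw, low), high)

print(reporting_weight(100, 80))       # 80 of 100 occupied units reported
print(reporting_weight(100, 10))       # very low reporting: limited to 5
print(reporting_weight(2, 1, True))    # tract summary: limited to 1.49
```

With the 1.49 limit, a single reported household in a tract never rounds up to two households.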

Processing Steps

The report merges data from about 50 different computer tapes. The steps of processing are described in the 1996 report. Additional steps involve summarizing operating and modernization subsidies for each agency, and adding two additional files of addresses of Public Housing Projects.


Zero means the number is less than a half and rounds to zero. Blank or -1 is shown when data are unknown, or when 'few' households were reported. 'Few' means (a) less than 40% of households were reported, too few for reliable data, or (b) the absolute number of households reported was 10 or fewer, so their confidentiality could be at risk. For reporting rates below about 80%, the data may be unrepresentative, but we show data down to 40% to give some indication of patterns. In the computer file, the symbol -1 is used for low reporting. 8888, 888, and 88 have special meaning: more than one MSA, county, etc. is included in the area, so a more specific code cannot be given. Data are limited to the space available. For example, to fit 100% into 2 digits, it has been coded as 99.
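
A minimal Python sketch of these conventions follows, for users who process the file themselves. The function names are hypothetical, and the values are treated as text exactly as they would be read from the delimited file.

```python
MULTI_AREA_CODES = {"88", "888", "8888"}

def is_few(percent_reported, households_reported):
    """True when household data would be withheld under rule (a) or (b)."""
    return percent_reported < 40 or households_reported <= 10

def decode_value(raw):
    """Return None for blank or -1 (unknown or 'few'), else the number."""
    text = raw.strip()
    if text in ("", "-1"):
        return None
    # A true 0 stays 0: it means the figure rounds to zero.
    # In two-digit percent fields, 99 may stand for 100%.
    return float(text)

def is_multi_area(code):
    """True when a geographic code means more than one MSA, county, etc."""
    return code.strip() in MULTI_AREA_CODES

print(is_few(39, 500), is_few(95, 10), is_few(40, 11))
print(decode_value(" -1"), decode_value("0"))
print(is_multi_area("8888"))
```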

(1) The terms family and household here have specific meanings. The area median income is based on both subsidized and unsubsidized families. 'Families' include only people living together who are related by blood, marriage or adoption; people living alone are not used for establishing median incomes. They all are included in the broader term 'household' and all are of course eligible for subsidy, so when writing about people living in subsidized units, we use the broad term households. There is also another sense of family, as opposed to elderly households, which was once a common distinction in HUD programs. We avoid that usage, since we now have many non-elderly single people in HUD programs. When we want to indicate households where the head and spouse are under age 62, we use the term 'non-elderly'.