Using NEO CANDO -- General Information and Precautions

Most of the data in NEO CANDO are updated annually. The Census data are the exception, because the Census of the Population occurs every ten years. Many of the data are available from 2010 to the most current year, although this varies by data source. Only census, vital statistics and HMDA data are available for the 16 counties outside Cuyahoga County. The sections below give the user some general information and caveats regarding the data in NEO CANDO. However, we encourage the user to read the data documentation for each of the data sources we provide to understand how we calculate the indicators, what precautions and caveats apply when using the data, and how to appropriately cite the data from NEO CANDO.

Infrequent event problem
Some events occur infrequently (infant deaths, homicides), and the user should be cautious in using these numbers to calculate annual rates at the neighborhood or census tract level. A fluctuation from year to year could be an anomaly rather than a trend. In these cases, we suggest using a three-year average, or even a five-year average, to calculate the rate and "smooth" any fluctuations that are not meaningful. The infant mortality rate is one example where the annual rate may be unreliable because of the small number of infant deaths. To illustrate calculating a three-year average:

Year    Infant deaths    Live births
1994    15               200
1995    6                205
1996    4                199

Three-year average = (15 + 6 + 4) / (200 + 205 + 199) * 1,000 = 25 / 604 * 1,000 = 41.4, or about 41 infant deaths per 1,000 live births (infant mortality rate)

In order to compare the data over time, choose the three years prior to 1994 and calculate a three-year average for this time period also.
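The three-year calculation above can be sketched in Python. This is a minimal illustration, not NEO CANDO's own code: the function name `pooled_rate` and its parameters are hypothetical, and the figures are the ones from the table.

```python
def pooled_rate(events, denominators, multiplier=1000):
    """Multi-year average rate: total events over total denominators.

    `events` and `denominators` are same-length sequences, one value per year.
    (Hypothetical helper for illustration only.)
    """
    if len(events) != len(denominators):
        raise ValueError("events and denominators must cover the same years")
    total_events = sum(events)
    total_denominator = sum(denominators)
    if total_denominator == 0:
        raise ValueError("total denominator is zero; rate is undefined")
    return total_events / total_denominator * multiplier


infant_deaths = [15, 6, 4]      # 1994, 1995, 1996
live_births = [200, 205, 199]   # 1994, 1995, 1996

three_year_rate = pooled_rate(infant_deaths, live_births)
print(round(three_year_rate))  # 41 infant deaths per 1,000 live births
```

Summing the events and the denominators before dividing weights each year by its number of births, which is what the worked example does.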

Rate versus Count
Most data in NEO CANDO are reported in counts and rates. To determine which is more appropriate for your use, consider the following:

A count would be used when you want to determine the magnitude of a problem/issue that needs to be addressed. For example, it would be important to know the actual number of births to first-time mothers or teen mothers to determine the need for a nurse visitation program geared toward these new mothers.

A rate would be used when comparing across geographic areas or over time periods. A rate measures the probability of an event occurring in a particular area during a particular time period. Using the above example, a rate might be used to identify the geographic areas with higher rates of births to teens or first-time mothers in order to target the nurse visitation program to areas of greater need.

A percent is a rate per 100. Poverty rates are per 100 population, so the poverty rate is the same as the percent poor. Rates per 1,000 are commonly used when reporting indicators related to vital statistics. Rates per 100,000 are commonly used when reporting crime-related statistics. The "per" number is called the multiplier. When calculating a rate you need three pieces of information: the numerator, the denominator and the multiplier. For example, to calculate the poverty rate we need to know the number of persons living below poverty (the numerator), the total number of persons for whom poverty status is determined, that is, the population living below poverty plus the population living above poverty (the denominator), and the multiplier of 100.
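As a small sketch of the numerator, denominator and multiplier described above, the following Python computes a poverty rate. The function name `rate` and the population figures are made up for illustration and do not come from NEO CANDO.

```python
def rate(numerator, denominator, multiplier):
    """Rate = numerator / denominator * multiplier (hypothetical helper)."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return numerator / denominator * multiplier


# Illustrative figures only, not real data.
below_poverty = 250
above_poverty = 750

# The denominator is everyone for whom poverty status is determined.
poverty_rate = rate(below_poverty, below_poverty + above_poverty, 100)
print(poverty_rate)  # 25.0, i.e. 25 percent poor
```

Swapping the multiplier for 1,000 or 100,000 gives the forms commonly used for vital statistics and crime indicators.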

It will be clear in NEO CANDO which indicators are rates: either the multiplier (1,000 or 100,000) or the word "rate" is included in the indicator name.

In NEO CANDO, we do have one ratio: the adult/child ratio. The adult/child ratio simply divides the number of adults by the number of children to determine how many adults there are per child. If there are 100 adults and 50 children, the ratio would be 2 adults for every child (100 divided by 50 = 2).

Medians are used to report income, rent and housing value in NEO CANDO. The median represents the middle value in a distribution. The median divides the total frequency into two equal parts. Half of the distribution has a value above the median and half of the distribution has a value below the median.
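To make the definition of the median concrete, here is a short Python sketch using the standard library's `statistics.median`. The income values are invented for illustration.

```python
import statistics

# Illustrative household incomes only, not real data.
incomes = [18000, 24000, 31000, 42000, 250000]

median_income = statistics.median(incomes)
print(median_income)  # 31000: two values fall below it and two above
```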

When comparing indicators across geographic areas or time periods, it is important to consider structural, geographic and economic variations that may contribute to the differences in the rate.

Population counts used in denominators
Denominators in rates are usually, but not always, the residential population. Rates are also computed using other denominators, such as teenage female population when computing the teen birth rates. Census tracts, in general, have a low population, and we caution the user in interpreting rates at this geographic level. Three-year averages or aggregating tracts could be used to avoid the low population issue. This is particularly important when calculating rates based on specific population groups rather than on the entire population. We consider tracts with a population less than 100 to be non-residential, and therefore consider rates calculated using these tracts to be unreliable.

The neighborhoods of Downtown and Industrial Valley are non-residential areas and rates calculated based on these areas should be viewed cautiously.
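One way to apply the population threshold described above when computing tract-level rates is sketched below. Returning `None` for low-population tracts is an assumption made for this example, as are the function name, tract labels and figures. NEO CANDO may handle such tracts differently.

```python
MIN_RESIDENTIAL_POPULATION = 100  # threshold stated in the text above


def tract_rate(events, population, multiplier=1000):
    """Return the rate, or None when the tract is treated as non-residential.

    Suppressing the rate with None is an assumption of this sketch.
    """
    if population < MIN_RESIDENTIAL_POPULATION:
        return None
    return events / population * multiplier


# Hypothetical tracts: (event count, population)
tracts = {"tract_a": (12, 2400), "tract_b": (3, 60)}

rates = {name: tract_rate(events, pop) for name, (events, pop) in tracts.items()}
print(rates["tract_b"])  # None, because the population is below 100
```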

Population in non-census years
The population of the census tract or neighborhood is used to calculate many of the rates within NEO CANDO. The Census of the Population occurs every ten years, in years ending in zero. In order to estimate the population between census years, we use linear interpolation and extrapolation techniques. We are unable to release the tract level population estimates. However, the annual population estimates for the counties and places as a whole are available from the Census Bureau.
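Linear interpolation and extrapolation between two census counts can be sketched as follows. This is a generic illustration of the technique, not the Center's actual procedure. The function name and the population figures are hypothetical.

```python
def estimate_population(year, year0, pop0, year1, pop1):
    """Estimate population in `year` from a straight line through two census counts.

    Years between year0 and year1 are interpolated; years outside that
    range are extrapolated along the same line. (Hypothetical helper.)
    """
    if year1 == year0:
        raise ValueError("the two census years must differ")
    annual_change = (pop1 - pop0) / (year1 - year0)
    return pop0 + annual_change * (year - year0)


# Illustrative tract counts only: 3,000 people in 2000 and 2,500 in 2010.
print(estimate_population(2005, 2000, 3000, 2010, 2500))  # 2750.0 (interpolated)
print(estimate_population(2012, 2000, 3000, 2010, 2500))  # 2400.0 (extrapolated)
```

Extrapolated values assume the change between the two censuses continues at the same annual pace, so they become less reliable the further the year is from the last census.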

Much of the data the Center receives is at the address level. In order to determine what census tract the address is located in, the Center geocodes the data. Geocoding assigns latitude and longitude, 2010 census tracts and other geographic identifiers, such as zip code and municipality, to valid addresses. Not all addresses can be geocoded. The main reason an address fails to geocode is that the address range does not exist. In these cases, the address cannot be assigned to a specific tract or neighborhood; therefore, we put it in the UNKNOWN category.

In order to assign the Statistical Planning Area (SPA), or neighborhood, to the census tracts in Cleveland, we have a correspondence file that contains all of the census tracts and the neighborhood associated with each census tract. Each neighborhood in the City of Cleveland and each suburban municipality consists of a number of census tracts. The number of census tracts in each neighborhood/municipality varies depending on population size. For a list of the census tracts in each neighborhood, click here.

The 2000 data in NEO CANDO have been put into the 2010 census tract boundaries to allow for comparisons between the 2000 and 2010 Censuses. Some census tract boundaries and numbers change from one census year to the next. In order to accurately compare the data over time, the data must be in the same geographic boundaries. For those census tracts that changed between the two census years, we used GIS techniques to determine the proportion of the 2000 tract population that lived in the corresponding 2010 tract.
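Once the proportions are known, 2000 counts can be moved into 2010 boundaries by proportional allocation. The sketch below shows the general idea under simple assumptions. The tract labels, proportions and the `reallocate` function are hypothetical, and the Center's GIS procedure may differ in its details.

```python
from collections import defaultdict


def reallocate(counts_2000, proportions):
    """Allocate 2000-tract counts to 2010 tracts.

    `proportions` maps (tract_2000, tract_2010) to the share of the 2000
    tract's population living in that 2010 tract. (Hypothetical helper.)
    """
    counts_2010 = defaultdict(float)
    for (old_tract, new_tract), share in proportions.items():
        counts_2010[new_tract] += counts_2000.get(old_tract, 0) * share
    return dict(counts_2010)


# Illustrative values only: tract A was split evenly between X and Y,
# and tract B was absorbed entirely into Y.
counts_2000 = {"A": 1000, "B": 400}
proportions = {("A", "X"): 0.5, ("A", "Y"): 0.5, ("B", "Y"): 1.0}

print(reallocate(counts_2000, proportions))  # {'X': 500.0, 'Y': 900.0}
```

The shares for each 2000 tract should sum to 1 so that the total count is preserved after reallocation.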

Geographic reference maps
The Center has created geographic reference maps showing the cities, townships and villages within each of the 17 counties included in NEO CANDO. Within Cuyahoga County we also have maps for each of the 36 statistical planning areas in the City of Cleveland. The maps also include the census tracts that are within or cross the neighborhoods, cities, townships or villages. To view these geographic reference maps, click here.