% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/tuberculosis_1910.R
\docType{data}
\name{tuberculosis_1910}
\alias{tuberculosis_1910}
\title{Tuberculosis 1910 death rates in New York and in Richmond}
\format{
A 2 x 2 x 2 table of counts involving three binary variables:
\describe{
 \item{Group}{One of two racial groups: "White" or "Colored"}
 \item{City}{One of two U.S. cities: "New York" or "Richmond"}
 \item{Total}{Either total deaths from tuberculosis or total population in 1910: "Deaths" or "Population"}
}
}
\source{
"An Introduction to Logic and Scientific Method" by Morris R. Cohen and Ernest Nagel (1934), Harcourt, Brace and Company, New York.
}
\description{
Counts of deaths from tuberculosis and of total population in 1910, by racial group, for New York and Richmond.
}
\details{
These are historical data taken from page 449 of Cohen and Nagel's 1934 "An Introduction to Logic and Scientific Method".
For this reason, the original names of the racial groups have been retained.

The data are of special historical interest in Statistics because they are one of the earliest
recorded instances of a real Simpson's paradox (Simpson 1951) occurring in practice
(see Blyth 1972). To preserve this historical context, the questions posed by Cohen and Nagel (1934) are also
recorded here in their own words. The data and questions appear at the back of their book
as exercises on "Chapter XVI: Statistical Methods".

In their table, Cohen and Nagel (1934, p. 449) include the "death rates from tuberculosis
in Richmond, Virginia, and in New York City in 1910". These rates (deaths per 100,000 population)
can be calculated directly from the counts and so have been excluded from the table given here.

In their words, Cohen and Nagel (1934, p. 449) pose the following two questions as exercises
(\emph{emphasis} is theirs):

 "a. Does it follow that tuberculosis caused a greater mortality in Richmond than in New York?

  b. Notice that the death rate for whites and that for Negroes were \emph{lower} in Richmond
  than in New York, although the \emph{total} death rate was \emph{higher}.
  Are the two populations compared really \emph{comparable}, that is, homogeneous?"
}
\references{
Blyth, Colin R. 1972.
On Simpson's Paradox and the Sure-Thing Principle.
Journal of the American Statistical Association, 67, pp. 364-366.

Cohen, Morris R.; Nagel, Ernest. 1934.
An Introduction to Logic and Scientific Method.
Harcourt, Brace and Company.
New York.

Simpson, E.H. 1951.
The Interpretation of Interaction in Contingency Tables.
Journal of the Royal Statistical Society, Series B, 13, pp. 238-241.
}
\author{
R.W. Oldford.
}
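\examples{
# A minimal sketch of the rate calculation mentioned in the details.
# Assumption: the dimensions are ordered Group x City x Total, as listed
# in the format section; adjust the indexing if the stored order differs.
deaths <- tuberculosis_1910[, , "Deaths"]
population <- tuberculosis_1910[, , "Population"]

# Death rates per 100,000 within each racial group and city
round(1e5 * deaths / population)

# Total death rates per 100,000 for each city, pooling the two groups
# (colSums sums over Group under the assumed dimension order)
round(1e5 * colSums(deaths) / colSums(population))
}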
\concept{Simpson's paradox}
\concept{contingency tables}
\concept{history of statistics}
\keyword{datasets}
