A friend got into a discussion of the death penalty with some liberal economists, who were all aghast to hear him point out that state-to-state variations in the murder rate are driven largely by ethnic demographics. He writes:
Predictably, I got involved and ended up running a bunch of regressions using a spreadsheet I built. The spreadsheet is attached. I didn't try to reproduce any of the complex models that economists have used to prove or disprove DP deterrence. That is far beyond my knowledge of statistics and my access to data. However, I did suggest that demographics might explain state-level variations in homicide rates. As you might expect, this idea was not well received. Any number of alternative explanations were offered. Predictably, they were almost entirely junk. I did find one marginal exception (poverty), described below. My major results were:
1. Demographics do a rather good job of explaining state variations in homicide. The basic demographic regression gives an R2 of 0.687 and an adjusted R2 of 0.6734. The P-values are superb.
An R-squared of 67% (or r = 0.82) is extremely high in the social sciences.
2. Contrary to what Jared Taylor says, the white / non-white percentage in any state is not a good predictor of homicide. The R2 is only 0.303 for homicide versus white percentage, although the result is statistically significant. It turns out that a major reason for this poor result is Hawaii. It is 29.38% white and is quite safe, with one of the very lowest homicide rates in the US. Conversely, adding the black and Hispanic populations together into a single variable gives a better result, with an R2 of 0.567. However, as stated above, a multiple regression that uses black and Hispanic as separate independent variables gives the R2 of 0.687.
It may well be that Jared Taylor suggested simply adding the black and Hispanic populations as shorthand for a more complex analysis. However, the gap in R2 between combining these populations (0.567) and handling them independently (0.687) is huge. Note that the coefficients are 0.212 for black and 0.089 for Hispanic. These are quite in line with expected relative homicide rates for these groups. Given my limited knowledge of statistics, it isn't clear to me whether this is an accidentally correct result or a genuine result of the regression.
4. As stated above, Hawaii is very safe and not very white. At some level this could be interpreted as a positive omen for our nation's future. However, the non-white population of Hawaii is mostly Asian, not black or Hispanic. Ironically, Hawaii is the only place where I personally have encountered street violence and drugs...
5. I didn't include Washington, DC or Puerto Rico in my regressions. DC is truly an outlier in any number of respects (homicide, demographics, etc.). Since my goal was a state-level analysis, DC didn't belong. Not enough data was available for Puerto Rico to include it and it's not a state either.
6. I treated all of the states as equal. In other words, each state was one data point. I am not sure if this makes sense or not. Perhaps a weighted least squares regression might have been better. This is quite unclear to me and somewhat beyond my knowledge of statistics.
The folks over at EV kept proposing alternative explanations for state-level variations in homicide. By themselves, most did have some limited predictive power. However, once you added demographics back in, they mostly fell apart. They were:
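For readers wondering what the choice in point 6 amounts to, here is a minimal sketch of an unweighted versus a population-weighted fit with two predictors. Everything in it is an assumption for illustration: the data are synthetic random numbers, not the spreadsheet's figures, and the names `x1`, `x2`, `population`, and `fit_linear` are hypothetical. It shows only the mechanics, not any result from the regressions described above.

```python
import numpy as np

# Synthetic stand-in data (NOT the spreadsheet's data): 50 rows, two
# percentage-style predictors, a population weight, and a noisy outcome.
rng = np.random.default_rng(0)
n_rows = 50
x1 = rng.uniform(0, 40, n_rows)           # hypothetical predictor 1
x2 = rng.uniform(0, 40, n_rows)           # hypothetical predictor 2
population = rng.uniform(0.5, 40, n_rows) # hypothetical weights
y = 1.0 + 0.2 * x1 + 0.1 * x2 + rng.normal(0, 1.5, n_rows)

# Design matrix with an intercept column.
X = np.column_stack([np.ones(n_rows), x1, x2])


def fit_linear(X, y, weights=None):
    """Least-squares fit; with weights, minimizes sum(w * residual**2)."""
    if weights is None:
        weights = np.ones(len(y))
    # Scaling rows by sqrt(w) turns weighted least squares into ordinary
    # least squares on the rescaled problem.
    sqrt_w = np.sqrt(weights)
    beta, *_ = np.linalg.lstsq(X * sqrt_w[:, None], y * sqrt_w, rcond=None)
    return beta


beta_equal = fit_linear(X, y)                         # each row counts once
beta_weighted = fit_linear(X, y, weights=population)  # big rows count more

print("equal weights:     ", np.round(beta_equal, 3))
print("population weights:", np.round(beta_weighted, 3))
```

With equal weights each state is one data point, as in the write-up; with population weights the fit is pulled toward the largest states. Which is more appropriate depends on whether the question is about states or about people.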
1. Death Penalty - Some folks alleged that the death penalty is actually associated with higher homicide rates (rather than a consequence of them). The worksheet "Regression B H Death Penalty" shows the results of adding the DP (I coded 1 for any state that has the DP, 0 otherwise). The coefficient is positive, but not statistically significant.
2. Southern State - The worksheet "Regression B H Southern State" shows the results of adding Southern State (I coded 1 for any state in the south, 0 otherwise). The coefficient is positive, but not statistically significant. The lack of a better result for Southern State surprised me. Southern whites are known to be more murderous than their Northern counterparts. However, it didn't show up in the regression analysis.
The highest white imprisonment rates (as of 1997) were found not in old Confederate states but in old cowboy states like Oklahoma, Texas, Nevada, Arizona, and Alaska (okay, not many cows in Alaska, but you get the picture -- a heritage of frontiers, saloons, that kind of thing).
Meanwhile, blacks had relatively low imprisonment rates in the old South, suggesting that conservative policies tend to be good for the moral health of African-Americans.
The states with the highest rates of black imprisonment were Iowa and Wisconsin, with Minnesota being pretty bad too. In other words, blacks tended to be at their worst in the Progressive old Northwest, where whites are nicest. A reader once told me that an article in the black press had advised that Iowa had the easiest welfare requirements in the country, so Iowa had attracted some of the worst blacks in the country.
3. Urbanization - The worksheet "Regression B H Percent Urban" shows the results of adding urbanization (from the Census). Interestingly enough, the coefficient is negative (higher urbanization is associated with less homicide), but not significant. I found this very surprising.
4. Inequality - The worksheets "Regression B H Family Gini" and "Regression B H Household Gini" show the results of adding inequality. Interestingly enough, the coefficients are negative (higher inequality is associated with less murder), but not significant.
5. Population - Many people think that bigger states have more murder. The worksheet "Regression B H State Population" shows the results of adding population. The coefficient is positive but not statistically significant.
6. Poverty - This was the only unexpected result. The worksheet "Regression B H Poverty" shows the results of adding poverty. The coefficient is positive and marginally significant (a P-value of 0.056). Note that this was the only regression that produced an adjusted R2 better than demographics alone. It wasn't much better, but it was better.
The adjusted r-squared for a two-factor multiple regression with % black and % Hispanic was .673. Making a three-factor multiple regression by adding poverty raised the adjusted r-squared to .692. That doesn't sound like too much, but it's not a bad little increase.
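The arithmetic connecting these figures can be sketched in a few lines. This is only a rough check under stated assumptions: that the regressions used 50 observations (the states, with DC and Puerto Rico excluded), and that the .673 and .692 figures are adjusted values (the .673 matches the adjusted R2 of 0.6734 reported for the demographics-only regression). The function names are hypothetical.

```python
import math


def adjusted_r2(r2, n_obs, n_predictors):
    """Standard adjustment: penalize R2 for the number of predictors."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


def raw_r2_from_adjusted(adj_r2, n_obs, n_predictors):
    """Invert the adjustment to recover the unadjusted R2."""
    return 1.0 - (1.0 - adj_r2) * (n_obs - n_predictors - 1) / (n_obs - 1)


N_STATES = 50  # assumption: one data point per state, DC and PR excluded

# R2 of 0.687 for the two-predictor demographic regression.
multiple_r = math.sqrt(0.687)                  # the "r" behind an R2
adj_two = adjusted_r2(0.687, N_STATES, 2)      # compare with reported 0.6734

# Adjusted R2 of .692 after adding poverty as a third predictor.
raw_three = raw_r2_from_adjusted(0.692, N_STATES, 3)

print(f"multiple r              = {multiple_r:.3f}")
print(f"adjusted R2, 2 factors  = {adj_two:.3f}")
print(f"implied raw R2, 3 factors = {raw_three:.3f}")
```

Under those assumptions the adjusted value for two factors comes out close to the reported 0.6734 (the small gap is what you would expect from rounding in the reported R2), and the unadjusted R2 implied for the three-factor model is a few points higher than 0.687.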
Robert's Rationale, Audacious Epigone, Antero Kalva, and La Griffe du Lion have taken looks at the problem too.
I think this is one of those situations that are so common in American sociology where race is so dominant a factor that it makes sense to analyze differences by state for one race at a time. That's the only way to find subtle differences in the effectiveness of public policy, because, as with school achievement test scores, the racial composition of a state just overwhelms everything else. It's like doing astronomy near the sun -- you have to have a solar eclipse to see anything besides the sun.