# 33. Statistical manipulation - False causality
## 33.1. Definition
The first thing to acknowledge is that "false causality" is not always manipulative. It can result from human error, from the poor design of an experiment or from the poor delivery of data. However, this is such a well-known problem in scientific research that the excuse of human error doesn't really stand up to scrutiny unless an experiment has been designed and conducted in a totally unprofessional and unscientific way.
False causality (often just called the correlation fallacy or regression fallacy) seeks to make a victim conclude that there is a causal relationship between two events when in fact none exists. It takes advantage of the misconception that a correlation is the same as a causal relationship, which it most certainly is not.
A victim reaches this invalid conclusion based on the way the evidence is presented to him. The victim reaches the "self-evident" conclusion about an "obvious correlation" although any causal relationship between the data is ambiguous.
In political circles, this is an especially important technique. It allows political parties or individuals to demonstrate (disingenuously) that they are capable of making certain things happen. For instance, the technique is used to prove a politician's ability to bring about an increase in employment, GDP, personal income, whatever is on their political agenda. The technique is used to create an artificial perception which is beneficial to the manipulator's chances of (re-)election or acceptance.
In corporate circles, the technique is similarly employed to demonstrate the potency or safety of a product or service to a customer. It is also often used to demonstrate to shareholders, directors or employees that a corporate strategy is working, by "proving" correlations between improved business and a particular management strategy.
Similarly, this technique is used by government, corporate interests and individuals to demonstrate the safety or value of a behaviour based on historical precedent, backed by copious amounts of apparently correlating historical data.
## 33.2. Persistence
Short, Medium or Long.
## 33.3. Accessibility
Low. This is not generally for the layman, since it requires both a good understanding of statistics and the subject matter. It also requires access to some smart media presentation methods to deliver the manipulative message clearly and to a wide audience. Generally, this manipulative method is reserved for use by large corporations, political parties and governments.
## 33.4. Conditions/Opportunity/Effectiveness
### 33.4.1. How does it work?
False causality as a manipulative technique works because most of us don't fully understand the relationships between cause and effect. There are several statistical techniques (like regression analysis) which can easily be used to demonstrate that a strong correlation exists between two or more sets of data.
For instance, when the temperature drops there is an increased incidence of the common cold. However, the correlation thus established does not mean that we can jump to the conclusion that there is a causal relationship between the two data sets. It's not that simple.
### 33.4.2. Examples
The classic and absurd case used is a comparison between the number of people buying ice cream at the beach and the number of people who drown at the beach. A simple regression analysis will show that there is a strong correlation between these data. However, no-one would claim that ice cream causes drowning because it's obvious that this isn't so.
In this case, the drowning and the buying of ice cream are obviously related by a third factor, for instance the number of people at the beach, or the time of year etc.
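The ice cream case can be sketched as a small simulation. This is a minimal illustration on invented data, not a real study: the function names, the crowd sizes and the coefficients are all assumptions chosen for the example. Ice cream sales and drownings are both generated from the number of people at the beach and never from each other, yet they correlate strongly. Once the crowd size is held constant (via a partial correlation), the apparent link largely disappears.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def partial_correlation(xs, ys, zs):
    """Correlation of xs and ys after removing the linear effect of zs."""
    r_xy = pearson(xs, ys)
    r_xz = pearson(xs, zs)
    r_yz = pearson(ys, zs)
    return (r_xy - r_xz * r_yz) / ((1 - r_xz ** 2) * (1 - r_yz ** 2)) ** 0.5


def simulate_beach_days(n_days=365, seed=0):
    """Invented data: both series depend only on the crowd (the third factor)."""
    rng = random.Random(seed)
    crowd, ice_creams, drownings = [], [], []
    for _ in range(n_days):
        people = rng.randint(0, 5000)  # hypothetical daily beach attendance
        crowd.append(people)
        # Neither series uses the other; each is crowd size plus its own noise.
        ice_creams.append(0.3 * people + rng.gauss(0, 100))
        # A noisy drowning index, not a literal count of incidents.
        drownings.append(0.001 * people + rng.gauss(0, 0.5))
    return crowd, ice_creams, drownings


if __name__ == "__main__":
    crowd, ice_creams, drownings = simulate_beach_days()
    print("ice cream vs drowning:", round(pearson(ice_creams, drownings), 3))
    print("same, crowd held fixed:",
          round(partial_correlation(ice_creams, drownings, crowd), 3))
```

A manipulator would present only the first number. The second is the one that shows the relationship is carried by the third factor.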
A similar case is the strong statistical correlation between the sales of blankets in London and the winter temperature in Toronto. Clearly the connecting variable is the season: both cities lie at northern latitudes, so average night temperatures fall in both at the same time of year.
The list of cases of such "correlations without cause" is endless.
### 33.4.3. Correlation and Causality
So what are the real possibilities for cause and effect between data which appear to be very strongly correlated?
When a statistical test (like a regression analysis) shows a strong correlation between variables A and B, there are six possible scenarios in terms of causality:
- A causes B.
- B causes A.
- A and B both partly cause each other.
- A and B are both caused by a third factor, C.
- B is caused by C which is correlated to A.
- The observed correlation was due to chance alone.
This final possibility (just chance) can be assessed and quantified by various statistical tests that calculate the probability that a correlation at least as large as the one observed would arise by chance alone if, in fact, there were no relationship between the variables. So there is a way of effectively ruling out a purely random correlation.
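One such test is a permutation test, sketched below under stated assumptions: the function names and the sample data are invented for illustration, and this is only one of several tests that could be used. The idea is to shuffle one variable many times, which destroys any real relationship, and count how often the shuffled data produce a correlation at least as strong as the observed one.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def permutation_p_value(xs, ys, n_permutations=2000, seed=0):
    """Estimate how often chance alone gives |r| >= the observed |r|."""
    rng = random.Random(seed)
    observed = abs(pearson(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # break any real pairing between xs and ys
        if abs(pearson(xs, shuffled)) >= observed:
            hits += 1
    # Add one to both counts so the estimate is never exactly zero.
    return (hits + 1) / (n_permutations + 1)


if __name__ == "__main__":
    rng = random.Random(1)
    xs = list(range(30))
    related = [2 * x + rng.gauss(0, 3) for x in xs]  # invented, truly related
    unrelated = [rng.random() for _ in xs]           # invented, pure noise
    print("related p-value:", permutation_p_value(xs, related))
    print("unrelated p-value:", permutation_p_value(xs, unrelated))
```

A small p-value only addresses the last scenario in the list above. It says nothing about which of the other five scenarios explains the correlation.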
### 33.4.4. Manipulative applications
The false causality fallacy is used for all kinds of manipulative purposes.
Take, for example, a manipulator who claims to prove that exposure to a particular chemical causes cancer. Using the simple ice-cream example above, we could make a similar and highly reasonable assertion for the incidence of cancer in a particular population and its exposure to a particular environmentally available chemical. In such a situation, there may be a statistical correlation even if there is no real causal effect. The existence of the chemical is actually just one of many potential causes, but by no means the only one.
For instance, in an industrial area, it may well be that there are higher cancer rates than in the more leafy suburbs or country areas far away from industrial sites. However, there is also a tendency for poorer people to live in or closer to industrial areas, where real estate prices are lower. There is also a tendency for these communities to include more migrant workers of different ethnic origins, who have different medical profiles and propensities. These less fortunate populations tend to have poorer diets, worse housing, higher levels of overcrowding etc. than those in more fortunate areas.
In other words, the high correlation between the environmental chemical concentration and the cancer incidence cannot, by itself, be taken to demonstrate causality.
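The industrial area example can be sketched as follows. Everything here is invented for illustration: the function names, the probabilities and, above all, the assumption that the chemical has no effect at all. In this toy population, poverty makes people both more likely to live near the site and more likely to develop cancer. The crude comparison suggests the chemical is harmful, while the comparison within each income group shows almost no difference.

```python
import random


def simulate_population(n_people=20000, seed=1):
    """Invented population: poverty drives both exposure and cancer risk."""
    rng = random.Random(seed)
    people = []
    for _ in range(n_people):
        poor = rng.random() < 0.5
        # Poorer people are more likely to live near the industrial site.
        exposed = rng.random() < (0.8 if poor else 0.2)
        # Assumption for this sketch: risk depends on poverty only,
        # the chemical exposure itself contributes nothing.
        cancer = rng.random() < (0.10 if poor else 0.04)
        people.append((poor, exposed, cancer))
    return people


def cancer_rate(people, exposed, poor=None):
    """Cancer rate among exposed or unexposed people, optionally by income."""
    group = [
        has_cancer
        for is_poor, is_exposed, has_cancer in people
        if is_exposed == exposed and (poor is None or is_poor == poor)
    ]
    return sum(group) / len(group)


if __name__ == "__main__":
    people = simulate_population()
    print("crude, exposed:   ", round(cancer_rate(people, True), 3))
    print("crude, unexposed: ", round(cancer_rate(people, False), 3))
    for poor in (True, False):
        label = "poor" if poor else "not poor"
        print(label, "exposed:  ", round(cancer_rate(people, True, poor), 3))
        print(label, "unexposed:", round(cancer_rate(people, False, poor), 3))
```

Splitting the data by a suspected third factor in this way is only possible when that factor has been identified and measured, which is one reason observational correlations call for further investigation rather than conclusions.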
However, such a correlation also demonstrates the possibility that there really is a causal relationship and that further investigation is required.
The discovery of nitro-glycerine as an important blood-pressure-reducing drug happened precisely because a number of workers in a large explosives factory who suffered from angina (including the boss) felt much better at work, where they ingested small amounts of nitro-glycerine, than they did at home. The correlation was noticed and investigated, resulting in the use, to this day, of nitro-glycerine as a life-saving drug.
## 33.5. Methodology/Refinements/Sub-species
*See Child Pages:*
- [[Regression Fallacy]]
- [[Clustering]]
## 33.6. Avoidance and Counteraction
False Causality can be avoided by designing experiments to use so-called control groups, which are randomly selected to act as a "standard" against which to measure the results on an "affected group" or "treatment group".
But let us return to our example of the environmental chemical exposure and the incidence of cancer. In a statistically honest study, the effect of false causality can be eliminated by conducting tests using mammals similar to human beings. The researcher assigns some of the population to a treatment group and the rest to a so-called "control group". The assignment to each group is completely random. Then, in controlled tests, the treatment group receives exposure to the suspected chemical agent and the control group receives no exposure. If the treatment group has significantly higher cancer rates than the control group, the researcher knows that no "third factor" is at play, because subjects were assigned to the exposed and non-exposed groups completely at random. The causal relationship between the chemical exposure and the incidence of cancer is thus demonstrated.
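The logic of random assignment can be sketched with a simulated trial. The function name, the hidden "frailty" variable and all risk figures are assumptions made up for this example. Each subject has a hidden frailty that raises cancer risk, but because subjects are shuffled before being split into groups, frailty is balanced on average between them. The difference in cancer rates then tracks the true effect of exposure, whether that effect is real or zero.

```python
import random


def run_randomised_trial(n_subjects=10000, true_effect=0.05, seed=2):
    """Return the cancer rate in the treatment group minus the control group."""
    rng = random.Random(seed)
    # Hidden frailty per subject: a "third factor" the researcher cannot see.
    subjects = [rng.random() for _ in range(n_subjects)]
    rng.shuffle(subjects)  # completely random assignment to the two groups
    half = n_subjects // 2
    treatment, control = subjects[:half], subjects[half:]

    def rate(group, exposed):
        cases = 0
        for frailty in group:
            # Invented risk model: baseline, plus frailty, plus exposure effect.
            risk = 0.02 + 0.10 * frailty + (true_effect if exposed else 0.0)
            if rng.random() < risk:
                cases += 1
        return cases / len(group)

    return rate(treatment, True) - rate(control, False)


if __name__ == "__main__":
    print("harmful chemical: ", round(run_randomised_trial(true_effect=0.05), 3))
    print("harmless chemical:", round(run_randomised_trial(true_effect=0.0), 3))
```

In a real study the observed difference would still need a significance test, since random assignment removes systematic third factors but not chance variation.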
If a proposed causal relationship cannot be demonstrated by tests of this kind, we can assume that the relationship is just being manipulated.