Genetics and education policy

Philip Ball has an article in the December issue of Prospect (ungated on his blog) arguing that consideration of the genetic basis to social problems is a distraction from socioeconomic causes. The strawman punchline for the Prospect article is “It’s delusional to believe that everything can be explained by genetics”.

The article has drawn a response from one of the people named in the article, Dominic Cummings. Ball suggests that Cummings presents “genetics as a fait accompli – if you don’t have the right genes, nothing much will help”, although this statement suggests Ball had not invested much effort getting across Cummings’s actual position (as contained in this now infamous essay). Ball responded in turn, with Cummings firing back (in an update at the bottom of the page), and Ball responding again.

Beyond the tit for tat – read their respective posts for that – there are some interesting points about whether genetics tells us anything about education policy.

As a start, Ball claims that “Social class remains the strongest predictor of educational achievement in the UK”, referencing this article. However, the authors of that article don’t consider the role of genetics or other potential predictors. The references that article gives for the claim are similarly devoid of relevant comparisons, which is unsurprising as they largely comprise policy positioning documents from various organisations. It’s hard to credibly claim something is a superior predictor when it is not assessed against the alternatives.

So, what is the evidence on this point? For one, we have twin and adoption studies. As a sample, Bruce Sacerdote studied Korean adoptees into the United States (admittedly, not the UK as per the quote) and found that shared environment (which would include socioeconomic status) explained 16 per cent of the variation in educational attainment. Genetic factors explained 44 per cent. This is a consistent finding in adoption studies, with children more closely resembling their biological parents than their adopted parents. For twin studies, an Australian analysis found a 57 per cent genetic and 24 per cent shared environment contribution to variation in education. A meta-analysis of heritability estimates of educational attainment found that, in the majority of samples, genetic variation explained more of the variation in educational attainment than shared environment.
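To make the twin figures concrete, here is a minimal sketch of the classical ACE decomposition, in which identical twins share all additive genetic effects and fraternal twins share half. It is a textbook simplification and not necessarily the estimation method used in the Australian study; the function names are my own, and the only inputs taken from the text are the 57 per cent and 24 per cent figures.

```python
def ace_twin_correlations(a2, c2):
    """Twin correlations implied by the classical ACE model.

    a2: share of variance from additive genetic effects
    c2: share of variance from shared environment
    Returns (r_mz, r_dz, e2), where e2 is the unshared remainder.
    """
    e2 = 1.0 - a2 - c2
    r_mz = a2 + c2        # identical twins share all of A and all of C
    r_dz = 0.5 * a2 + c2  # fraternal twins share half of A and all of C
    return r_mz, r_dz, e2


def falconer(r_mz, r_dz):
    """Falconer's formulas: recover (a2, c2, e2) from twin correlations."""
    a2 = 2.0 * (r_mz - r_dz)
    c2 = 2.0 * r_dz - r_mz
    e2 = 1.0 - r_mz
    return a2, c2, e2


# Figures quoted above for the Australian twin analysis.
r_mz, r_dz, e2 = ace_twin_correlations(0.57, 0.24)
a2_back, c2_back, e2_back = falconer(r_mz, r_dz)
print(r_mz, r_dz, e2)
print(a2_back, c2_back, e2_back)
```

Under these assumptions, a 57/24 split implies identical twins correlating at about 0.81 on education and fraternal twins at about 0.525, with the remaining 19 per cent of variation unshared.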

Of course, we don’t have the genetic data or understanding at hand just yet, but there are other factors such as IQ that are better predictors of education than social class. This territory is also complicated - there are genetic effects on both IQ and social class – but IQ tends to outperform. This meta-analysis shows that IQ is a better predictor of education, income and occupation than socioeconomic status – not overwhelmingly so, but superior nonetheless.

Then there is the link between genetic factors and socioeconomic status, with a long line of studies finding a relationship. One of the more recent was by Daniel Benjamin and friends (ungated pdf). They found heritability of permanent income (20-year average) of 0.58 for men and 0.46 for women. Part of the predictive power of socioeconomic status comes from its genetic basis. Gregory Clark’s hypothesis of low social mobility being a result of genetic factors reflects this body of work.

Turning next to Ball’s pessimism about the future of genetics, he states:

In September an international consortium led by Daniel Benjamin of Cornell University in New York reported on a search for genes linked to cognitive ability using a new statistical method that overcomes the weaknesses of traditional surveys. The method cross-checks such putative associations against a “proxy phenotype” – a trait that can ‘stand in’ for the one being probed. In this case the proxy for cognitive performance was the number of years that the tens of thousands of test subjects spent in education.

From several intelligence-linked genes claimed in previous work, only three survived this scrutiny. More to the point, those three were able to account for only a tiny fraction of the inheritable differences in IQ. Someone blessed with two copies of all three of the “favourable” gene variants could expect a boost of just 1.8 IQ points relative to someone with none of these variants. As the authors themselves admitted, the three gene variants are “not useful for predicting any particular individual’s performance because the effect sizes are far too small”.

This, however, is only part of the picture. If we look at another study in which Benjamin was involved, three SNPs (single nucleotide polymorphisms – single base changes in the DNA code) were found to affect educational attainment. In total, they explained 0.02 per cent of the variation in educational attainment – practically nothing. But combine all the SNPs in the 100,000 person sample, and you edge up to 2.5 per cent. Even more interesting, the authors calculated that with a large enough sample they could explain over 20 per cent of the variation. Co-author Philipp Koellinger explains this in a video I recently linked. Although this study found variants with low explanatory power, it also points to the potential to explain much more with larger samples.

For more on the background to the feasibility of identifying the causal genetic variants for traits such as IQ, it’s worth looking at this paper by Steve Hsu. Possibly the most important point is that the causal variants for traits such as cognitive ability and height are additive in their effect. In his final response, Ball states: “And that might be because we are thinking the wrong way – too linearly – about how many if not most genes actually operate.” But the evidence shows that this is largely how they work. Although a few years old now, this paper’s theoretical and empirical argument that genetic effects are largely additive has generally been affirmed in later research. This considerably simplifies the task of predicting outcomes based on someone’s genome. In fact, this is one reason selective breeding has been so successful and genetic data is already being used successfully in cattle breeding (there’s an example of the gap between entrepreneurship and policy development – while some of us are arguing whether this stuff is possible, someone else is already doing it).
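As a rough illustration of why additivity simplifies prediction, here is a minimal sketch in which a predicted trait difference is just a sum of per-allele effects multiplied by the number of copies carried. The function name is hypothetical, and the per-copy effect of 0.3 IQ points is an assumption chosen so that the total matches the 1.8 points quoted by Ball; it is not taken from the study itself.

```python
def additive_score(allele_counts, effects):
    """Sum of (copies carried x per-copy effect) across variants.

    allele_counts: 0, 1 or 2 copies of the favourable variant per SNP
    effects: assumed per-copy effect for each SNP, in trait units
    """
    if len(allele_counts) != len(effects):
        raise ValueError("need one effect size per variant")
    return sum(count * beta for count, beta in zip(allele_counts, effects))


# ASSUMPTION: 0.3 IQ points per copy for each of three variants.
effects = [0.3, 0.3, 0.3]

all_copies = additive_score([2, 2, 2], effects)  # two copies of all three
no_copies = additive_score([0, 0, 0], effects)   # none of the variants
gap = all_copies - no_copies
print(gap)
```

With no interaction terms, adding a newly discovered variant only appends one more term to the sum, which is why larger samples translate fairly directly into better predictors.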

Now, supposing you have this genetic data, how might this change education? Returning to the article I linked above (ungated pdf), Benjamin and friends suggested this genetic information could be used to better target interventions. They propose early identification of dyslexia as an example.

They also suggest using genetic data as controls. This could provide more precision in studies of whether interventions to target socioeconomically disadvantaged children are effective. The genetic controls allow you to home in on what you are interested in. In the question and answer session of a video of a talk by Jason Fletcher I recently linked, Benjamin pointed to the famous Perry PreSchool Project and noted that additional precision through the use of genetic data would have been of great value.

Ball also indirectly alludes to another reason to learn about genetic factors. In his last response, he writes:

Personally, I find a little chilling the idea that we might try to predict children’s attainments by reading their genome, and gear their education accordingly – not least because we know (and Plomin would of course acknowledge this) that genes operate in conjunction with their environment, and so whatever genetic hand you have been dealt, its outcomes are contingent on experience.

This argument runs both ways. Supposing there are large gene-environment interactions, how can you understand the effects of changing the environment without looking at the way that environment affects people via their genome? As an example of this, Jason Fletcher examined how variation in a gene changed the response to tobacco taxation policy (he talks about this in a video I recently linked). Those with a certain allele responded to taxation and reduced smoking. Others didn’t. To be honest, I’m not sold on the results of this particular study, but it illustrates that genetic factors need to be considered if these gene-environment interactions are as large as people such as Ball believe.

[I should admit at this point that G is for Genes: The Impact of Genetics on Education and Achievement is sitting unread in my reading pile….]

Putting it together, Ball is off track in his suggestion that learning about and targeting genetic factors distracts from dealing with socioeconomic issues. Understanding genetic factors and understanding socioeconomic factors are complements, and by disentangling their effects, we could better tailor education to address each.

That is not to say that the genetic enterprise is guaranteed to be successful. But there is plenty of evidence that our genes are relevant and, on that basis, should be considered.

Further, there are changes we can make today. Ball asks what genetics can add beyond recognition that some children are more talented than others. The thing is, much schooling is still structured as though we are blank slates. Maybe it is an understanding of genetics that will finally get us to a point where education is better designed for people with different capacities, improving the experience across the full range of abilities and backgrounds.

The beauty of self interest

In my review of E.O. Wilson’s The Social Conquest of Earth, I quoted this passage which captures Wilson’s conception of the origin of cooperation in humans.

Selection at the individual level tends to create competitiveness and selfish behaviour among group members – in status, mating, and the securing of resources. In opposition, selection between groups tends to create selfless behavior, expressed in greater generosity and altruism, which in turn promote stronger cohesion and strength of the group as a whole.

This passage from Matt Ridley strikes at the heart of Wilson’s dichotomy between selfishness and generosity:

“Group selection” has always been portrayed as a more politically correct idea, implying that there is an evolutionary tendency to general altruism in people. Gene selection has generally seemed to be more of a right-wing idea, in which individuals are at the mercy of the harsh calculus of the genes.

Actually, this folk understanding is about as misleading as it can be. Society is not built on one-sided altruism but on mutually beneficial co-operation.

Nearly all the kind things people do in the world are done in the name of enlightened self-interest. Think of the people who sold you coffee, drove your train, even wrote your newspaper today. They were paid to do so but they did things for you (and you for them). Likewise, gene selection clearly drives the evolution of a co-operative instinct in the human breast, and not just towards close kin.

You can read the full article here.

A week of links

Links this week:

  1. W. Brian Arthur on economic complexity.
  2. A great article on humans as imitators.
  3. Higher latitudes have colder weather which leads to larger people which causes lower population and higher investment in children which triggers economic growth.
  4. An epidemic of over-diagnosis.
  5. Financial price data are converted into music, the music is played to a rat, then the rat guesses whether the price will fall or rise.
  6. Is being good at science a matter of nature?
  7. Women earn less even when they set the pay.
  8. Social and cognitive skills are complements.

Ignorance feels so much like expertise

In the Pacific Standard, David Dunning of the Dunning-Kruger effect writes:

A whole battery of studies conducted by myself and others have confirmed that people who don’t know much about a given set of cognitive, technical, or social skills tend to grossly overestimate their prowess and performance, whether it’s grammar, emotional intelligence, logical reasoning, firearm care and safety, debating, or financial knowledge. College students who hand in exams that will earn them Ds and Fs tend to think their efforts will be worthy of far higher grades; low-performing chess players, bridge players, and medical students, and elderly people applying for a renewed driver’s license, similarly overestimate their competence by a long shot.

But education is not always the answer:

While educating people about evolution can indeed lead them from being uninformed to being well informed, in some stubborn instances it also moves them into the confidently misinformed category. In 2014, Tony Yates and Edmund Marek published a study that tracked the effect of high school biology classes on 536 Oklahoma high school students’ understanding of evolutionary theory. The students were rigorously quizzed on their knowledge of evolution before taking introductory biology, and then again just afterward. Not surprisingly, the students’ confidence in their knowledge of evolutionary theory shot up after instruction, and they endorsed a greater number of accurate statements. So far, so good.

The trouble is that the number of misconceptions the group endorsed also shot up. For example, instruction caused the percentage of students strongly agreeing with the true statement “Evolution cannot cause an organism’s traits to change during its lifetime” to rise from 17 to 20 percent—but it also caused those strongly disagreeing to rise from 16 to 19 percent. In response to the likewise true statement “Variation among individuals is important for evolution to occur,” exposure to instruction produced an increase in strong agreement from 11 to 22 percent, but strong disagreement also rose from nine to 12 percent. Tellingly, the only response that uniformly went down after instruction was “I don’t know.”

The way we traditionally conceive of ignorance—as an absence of knowledge—leads us to think of education as its natural antidote. But education, even when done skillfully, can produce illusory confidence. Here’s a particularly frightful example: Driver’s education courses, particularly those aimed at handling emergency maneuvers, tend to increase, rather than decrease, accident rates. They do so because training people to handle, say, snow and ice leaves them with the lasting impression that they’re permanent experts on the subject. In fact, their skills usually erode rapidly after they leave the course. And so, months or even decades later, they have confidence but little leftover competence when their wheels begin to spin.

In cases like this, the most enlightened approach, as proposed by Swedish researcher Nils Petter Gregersen, may be to avoid teaching such skills at all. Instead of training drivers how to negotiate icy conditions, Gregersen suggests, perhaps classes should just convey their inherent danger—they should scare inexperienced students away from driving in winter conditions in the first place, and leave it at that.

The full article is worth reading.

A week of links

Links this week:

  1. The freedom to pursue informed self-harm has a long and noble tradition.
  2. What happens when behavioural economics is used to explain rational behaviour.
  3. A great summary of some of Gordon Tullock’s work. HT: Garett Jones
  4. Another study on the limited effect of parenting on IQ. HT: Billare via Stuart Ritchie
  5. What Hayek might say to Republicans.
  6. The long shadow of history on the distribution of human capital in Europe. HT: Ben Southwood
  7. Opposition to urban development by “environmentalists” is among my bigger gripes. Left-leaning cities are less affordable.
  8. I have only just come across Dominic Cummings. Some interesting thoughts. Check out his blog.
  9. Affirmative action to overcome liberal bias.
  10. How your brain decides without you.

Genome Wide Association Studies and socioeconomic outcomes

A few months back, I posted about a Conference on Genetics and Behaviour held by the Human Capital and Economic Opportunity Global Working Group at the University of Chicago. In that post, I linked to a series of videos from the first session on the effect of genes on socioeconomic aggregates.

Over the last couple of days, I watched the videos from the session on Genome Wide Association Studies (GWAS). As with the first set of videos, they are technical (as you might expect for a bunch of academics) – particularly the questions – but cover some important points.

In early studies linking genetic factors to behaviour and socioeconomic outcomes, candidate gene studies were the dominant method. In a candidate gene study, a gene is hypothesised to have an effect, and that hypothesis is tested directly. However, there are some major problems with candidate gene studies, with the literature littered with claims of the “gene for X” that simply can’t be replicated.

David Cesarini opened the session by pointing to this low level of replication of candidate gene studies. He suggests three problems might be causing this failure to replicate. These are multiple hypothesis testing coupled with publication bias, population stratification, and the low power of the small samples typically used.

Multiple hypothesis testing in candidate gene studies arises because more than one gene tends to be tested. In that case, the significance level of the tests should be adjusted to account for the multiple tests. But the reality is that the many negative tests never see the light of day, with the successful ones presented as meeting a threshold appropriate for a single test. Publication bias exacerbates that problem as negative results tend not to be published and you don’t know how many tests have been conducted.

In contrast, GWAS is a hypothesis free approach. All SNPs in a sample (single nucleotide polymorphisms – DNA sequence variations in which a single nucleotide varies in the population) are tested for association with a trait. As there are as many hypotheses being tested as there are SNPs, very high significance thresholds are applied to avoid false positives. But as the number of SNPs in an array is known from the start, there is no doubt about the appropriate threshold.
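The arithmetic behind this contrast can be sketched with a simple Bonferroni correction, which divides the target significance level by the number of tests. The counts below are my own illustrative assumptions: 20 unreported tests for a candidate gene study and one million SNPs for a GWAS array. GWAS thresholds in practice are set with more care than this, so treat it as a sketch of the logic only.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the family-wise rate near alpha."""
    return alpha / n_tests


def chance_of_any_false_positive(alpha, n_tests):
    """Probability of at least one false positive across independent tests
    when no true effects exist and each test uses the unadjusted alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests


# ASSUMPTION: a candidate gene study quietly runs 20 tests at the usual 0.05.
candidate_risk = chance_of_any_false_positive(0.05, 20)

# ASSUMPTION: a GWAS array with one million SNPs, all tested and all counted.
gwas_threshold = bonferroni_threshold(0.05, 1_000_000)

print(candidate_risk)
print(gwas_threshold)
```

With 20 unadjusted tests, the chance of at least one spurious "hit" is roughly 64 per cent, while the GWAS threshold under these assumptions sits at around 5 in 100 million.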

Cesarini’s talk focused on the second problem, population stratification. This occurs where allele frequencies (alleles being variants of a gene) correlate with confounding variables. A classic example is analysing a mixed population of Asians and Caucasians and discovering the chopsticks gene. This can be overcome in GWAS by a technique called principal components analysis, which can be used to model the ancestry of the population and correct for stratification before conducting the analysis.
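The chopsticks example is easy to reproduce in a toy simulation. In the sketch below, two groups differ in both the frequency of an allele and the rate of chopstick use, but within each group the two are generated independently. All frequencies are made-up assumptions, and I simply condition on a known group label as a stand-in for what principal components analysis infers from the genotype data itself.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_group(rng, n, allele_freq, trait_rate):
    """Draw allele carriage and trait independently of each other."""
    carriers = [1 if rng.random() < allele_freq else 0 for _ in range(n)]
    trait = [1 if rng.random() < trait_rate else 0 for _ in range(n)]
    return carriers, trait


rng = random.Random(0)  # fixed seed so the run is repeatable

# ASSUMPTION: illustrative frequencies only, not real population data.
g_a, t_a = simulate_group(rng, 20000, allele_freq=0.8, trait_rate=0.9)
g_b, t_b = simulate_group(rng, 20000, allele_freq=0.2, trait_rate=0.1)

pooled_r = pearson(g_a + g_b, t_a + t_b)  # ignores group membership
within_a = pearson(g_a, t_a)              # group A only
within_b = pearson(g_b, t_b)              # group B only
print(pooled_r, within_a, within_b)
```

The pooled correlation is large even though the allele does nothing, while the within-group correlations hover around zero. Controlling for ancestry is what removes the spurious signal.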

The next speaker, Daniel Benjamin, spoke on the third problem – the low power of candidate gene studies. Power is the ability to statistically demonstrate an association when that association exists. A test with low power will miss the associations most of the time.

The low power of candidate gene studies is partly due to their typically low sample size, usually between 50 and 3,000 people. Benjamin points out that there may not be any genes in social science with effects large enough to be detected in samples of this size.

The low power of a study has an important implication beyond the inability to find any effects that exist. If real results are rare, they will be swamped by the false positives, which would occur for 1 in 20 tests using the typical significance level. Benjamin runs through some numerical examples and shows that given the expected effect sizes of genes on social science outcomes, you simply shouldn’t trust most candidate gene study results. False positives will drown the real findings. This contrasts with GWAS. Once you get to decent sample sizes in the order of 100,000, you can be relatively confident that what you do find (even though you miss a lot) will be true.
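The logic here can be sketched with a short calculation of the share of "significant" results that reflect real associations. The numbers below are my own illustrative assumptions, not the figures Benjamin used in his talk: the share of tested variants with real effects, the power and the thresholds are all hypothetical.

```python
def share_of_findings_true(prior, power, alpha):
    """Fraction of significant results that are real associations.

    prior: share of tested variants with a real effect
    power: chance a real effect reaches significance
    alpha: chance a null variant reaches significance
    """
    true_positives = prior * power
    false_positives = (1.0 - prior) * alpha
    return true_positives / (true_positives + false_positives)


# ASSUMPTION: candidate gene study, 1 in 100 tested genes is real,
# 10 per cent power, conventional 0.05 threshold.
candidate = share_of_findings_true(prior=0.01, power=0.10, alpha=0.05)

# ASSUMPTION: GWAS, 1 in 10,000 SNPs is real, 50 per cent power,
# genome-wide threshold of 5e-8.
gwas = share_of_findings_true(prior=1e-4, power=0.50, alpha=5e-8)

print(candidate, gwas)
```

Under these assumptions only about 2 per cent of significant candidate gene results are real, against more than 99 per cent of GWAS hits, even though the GWAS still misses half of the true associations.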

Benjamin also talks about the Social Science Genetic Association Consortium (SSGAC), which is an attempt to build datasets large enough to apply GWAS to social outcomes such as IQ and risk aversion. The proof of concept was on educational attainment, which the next speaker covers in more detail.

Philipp Koellinger opens by asking why there are so many null results in the search for genetic influences. Is it because the effects are small? Because they are non-linear? Or because there are gene-environment interactions? Maybe the results of twin studies showing most social outcomes are heritable are wrong?

Part of the answer was given by a study of educational attainment in which Koellinger and the previous two speakers were involved. They used a GWAS to search for SNPs that affected educational attainment in an initial sample of 100,000 people. They then replicated the result in another sample of 25,000 people. All three SNPs found in the discovery stage were replicated.

Importantly, the effect sizes were smaller than expected, with those three SNPs explaining 0.02% of the variation in educational attainment. If you added up the effects of all the SNPs in their sample, you could explain around 2 to 2.5% of the variation.

While this sounds low, it provides a basis for hope. Based on projections for larger sample size, it should be possible to explain 20% of the variation in education attainment through genetic factors.

Jason Fletcher was next, and he asked two main questions. First, how much should we believe GWAS results given how differently GWAS is done compared to normal science procedure? Second, what use are GWAS results? He spends more time on the second question and points out the usual possibilities, such as providing measures for latent variables. For example, if you don’t know the IQ of your sample but have their genomes and know how this affects intelligence, the genetic information could be used to attempt to determine the effect of IQ on a certain outcome.

Fletcher also points to the potential for exploration of gene-environment effects. He gives the example of people responding differently to tobacco taxation based on having different alleles. His paper on this topic is here.

Within his talk, Fletcher asks an interesting question about whether the SSGAC will become a natural monopoly in GWAS. Do we need a second SSGAC to enable people to check the results, and is it feasible for one to emerge? Others may be more viable as genetic testing becomes cheaper, but the tendency for one to dominate may still remain.

In the questions to Fletcher’s presentation, Benjamin makes the important point that the use of GWAS results as control variables could give much more precision to the estimates of the effect that a social science experiment is designed to measure. He gives the example of the Perry pre-school project – expensive educational interventions with a small sample, in which any added precision as to their effects would be of great value.
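To give a feel for the size of the gain, here is a back-of-the-envelope sketch. In a randomised experiment, adding a control variable that explains a share R² of the outcome's variance (and is unrelated to treatment assignment) shrinks the standard error of the estimated treatment effect by a factor of roughly the square root of 1 − R². The function names are my own, the formula ignores degrees-of-freedom adjustments, and the 20 per cent figure is the projection mentioned earlier rather than anything achieved to date.

```python
def se_ratio_with_control(r_squared):
    """Approximate ratio of the treatment-effect standard error with the
    control to the standard error without it."""
    return (1.0 - r_squared) ** 0.5


def equivalent_sample_multiplier(r_squared):
    """How much larger the sample would need to be to get the same
    precision without the control."""
    return 1.0 / (1.0 - r_squared)


# ASSUMPTION: a genetic score explaining 20 per cent of outcome variance.
se_ratio = se_ratio_with_control(0.20)
sample_multiplier = equivalent_sample_multiplier(0.20)
print(se_ratio, sample_multiplier)
```

On these assumptions the standard error falls by about 10 per cent, equivalent to running the experiment with a sample 25 per cent larger, which matters a great deal when each participant is as expensive as in the Perry project.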

The last speaker, Dalton Conley, returned to the population stratification problem. His argument is that it may not be as easy to solve as it seems. Conley refers mainly to a technique called Genomic-relatedness-matrix restricted maximum likelihood (GREML) or Genome-wide complex trait analysis (GCTA) (which I have posted about before). This technique seeks to determine the contribution of all the sampled SNPs combined to variation in a trait. The output is a lower bound estimate of heritability. This technique relies, however, on an assumption that people who are less related than second cousins (higher degrees of relatedness are removed) share alleles in a way that is uncorrelated with any similarity in environment.

Conley argues that this assumption is false, and shows that using GREML, he can obtain a finding that birth in an urban or rural environment is heritable, in direct violation of the assumption. This result does not disappear after controlling for population stratification.

To deal with this problem, consideration should be given to testing for variation within families – any differences in genes between siblings will truly be random. The problem with this is that most massive datasets for which GWAS is performed don’t have pedigree data of that nature. The good news, however, is that the violation of the assumption does not seem to puncture the GWAS results. It is violated but the consequences are trivial. A paper by Conley and friends on this topic can be found here.

A week of links

Links this week:

  1. A good Jared Diamond interview.
  2. The 10,000 hours rule - the best you can do is find the peak of your own ability.
  3. Tinder works because a picture is “worth that fabled thousand words, but your actual words are worth… almost nothing”. (HT: Razib)
  4. Dumb incentives, although economists would be the first to point out a lot of the unintended consequences.
  5. No evidence for the benefits of expertise for fund managers.
  6. Drunks are more utilitarian. And maybe you should do that drinking on an empty stomach.
  7. Are social psychologists biased against Republicans?

Improving behavioural economics

A neat new paper has appeared on SSRN from Owen Jones - Why Behavioral Economics Isn’t Better, and How it Could Be (HT: Emanuel Derman via Dennis Dittrich). My favourite part is below. As I have said many times before, giving a bias a name is not theory.

[S]aying that the endowment effect is caused by Loss Aversion, as a function of Prospect Theory, is like saying that human sexual behavior is caused by Abstinence Aversion, as a function of Lust Theory. The latter provides no intellectual or analytic purchase, none, on why sexual behavior exists. Similarly, Prospect Theory and Loss Aversion – as valuable as they may be in describing the endowment effect phenomena and their interrelationship to one another – provide no intellectual or analytic purchase, none at all, on why the endowment effect exists. …

[Y]ou can’t provide a satisfying causal explanation for a behavior by merely positing that it is caused by some psychological force that operates to cause it. That’s like saying that the orbits of planets around the sun are caused by the “orbit-causing force.” …

[L]oss aversion rests on no theoretical foundation. Nothing in it explains why, when people behave irrationally with respect to exchanges, they would deviate in a pattern, rather than randomly. Nor does it explain why, if any pattern emerges, it should have been loss aversion rather than gain aversion. Were those two outcomes equally likely? If not, why not?

Part of the solution provided by Jones, as reflected in much of his past work, rests in evolutionary theory.

An updated economics and evolutionary biology reading list and a collection of book reviews

I have updated my economics and evolutionary biology reading list, with a few new additions including John Coates’s The Hour Between Dog and Wolf, Gregory Clark’s new book on social mobility and Jonathan Haidt’s The Righteous Mind. As before, I have been selective, adding only the best books (or articles) in the area. That said, I am always open to suggestions or comment.

When updating the list, I realised I have written a lot of book reviews over the last few years. I have collected most of them together on one page, which you can find here. It includes a lot of good books that aren’t on the reading list as they are not on topic. It also contains a few books that are on topic but haven’t made the cut.

A week of links

Links this week:

  1. Cooperation in humans versus apes.
  2. In praise of pilots.
  3. Are women better decision makers? You can ask about some sex differences.
  4. Amazon is doing us a favour. Goodbye book publishers.
  5. The logic of failure.
  6. The Behavioural Insights Team has lunch with Walter Mischel. Mischel’s work is fantastic and his new book is on my reading list, but the mention of brain plasticity and epigenetics (in the same sentence!) has reduced my expectations.
  7. Charles Murray on Ayn Rand. HT: Alex Tabarrok