Autism as a disorder of dimensionality

Note: I was saving this for the launch of the Symmetry Institute, but given the recent discussions around REBUS/CANAL, Deep CANALs, and Neural Annealing I pushed it forward.

I. Network dimensionality

Lately, I’ve been thinking of the “autistic bundle of symptoms” as naturally arising from having a nervous system whose dimensionality parameter is maladaptively high. The following is an attempt to explain what I mean by this.

All networks have an implicit dimensionality, which we can think of essentially as a branching factor: if each node is connected to one other node, and so on, this is a one-dimensional network. If a node can on average branch to 2.5 nodes, it’s a 2.5-dimensional network, and so on. Trees and leaves have this sort of branching dimensionality parameter as well, typically between 1.4 and 1.6 (see Hausdorff dimension). The dimensionality of a network is a crucial factor in what kinds of patterns can form in the network; higher dimensions can encode more complexity.
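
As a rough illustration of the branching-factor reading of dimensionality, here is a minimal Python sketch. The function name `mean_branching_factor` and the adjacency-dict representation are assumptions made for illustration, and averaging only over nodes that branch at all is one convention among several. It is not an estimator of Hausdorff dimension.

```python
from statistics import mean

def mean_branching_factor(children):
    """Average number of onward connections per branching node.

    `children` maps each node to the list of nodes it connects onward to.
    Terminal nodes (no onward connections) are skipped, so a simple chain
    scores exactly 1. This convention is an illustrative assumption.
    """
    fanouts = [len(kids) for kids in children.values() if kids]
    if not fanouts:
        raise ValueError("network has no branching nodes")
    return mean(fanouts)

# A chain: each node connects to one other node.
chain = {"a": ["b"], "b": ["c"], "c": ["d"], "d": []}

# A bushier network: two nodes fan out to 2 and 3 nodes respectively.
bushy = {
    "root": ["x", "y"],
    "x": ["x1", "x2", "x3"],
    "y": [], "x1": [], "x2": [], "x3": [],
}

print(mean_branching_factor(chain))
print(mean_branching_factor(bushy))
```

Under this convention the chain comes out as a "one-dimensional" network and the bushier one as "2.5-dimensional", matching the two examples in the paragraph above.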

Sufficiently high network dimensionality is a prerequisite for intelligence (similar to how new capabilities unlock at larger LLM parameter sizes), but excessively high dimensionality in neural networks can be a curse, and I think this factor is at the heart of autism’s specific symptom profile. A pseudonymous poster by the name of Uriah has laid out the initial groundwork (Uriah makes many claims; my hypothesis only requires the narrow subset involving neuronal density):

II. Autism as a growth disorder: Uriah’s thesis

I’m going to contend tonight that autism is a growth disorder whose prevalence increases with advancing average birth weight and height and which explodes in frequency when weight and height can increase no further, resulting in a kind of “spillover” of growth into the brain.…

The idea that autism is a growth disorder may sound strange, but it’s not that much of a reach. The most consequential empirical finding in the autism literature is that autistics experience accelerated brain growth in the first 2-5 years of life. [link]

In 2011 Eric Courchesne and co. managed to microscopically inspect the brains of autistics who had died early and found them to have prefrontal cortices that were extraordinarily dense with cells, 67% more than expected by their ages: [link] …

Studies on young autistics sometimes find them to have elevated levels of growth factors like IGF-1/IGF-2 and growth hormone binding protein. You may know of IGF-1 as the protein that becomes elevated by dairy consumption and can produce acne. [link]

As of 2021 only a very small percentage of autism’s genetic risk can be accounted for by named genes, but an unusual number of risk genes overlap with growth and cancer promoting pathways like mTOR, IGF, and PTEN. [link]

mTOR hyperactivation seems to be the primary cause of tuberous sclerosis, a condition in which autism co-exists at a frequency of 25-50%. TS patients have large growths on their skin that are paralleled by growths in their brains (tubers) [link] …

The strongest genetic overlap between autism and another measurable quality is with depression and low well-being. Interestingly, some of the genes that increase autism risk also improve IQ, which is the opposite of what you see in schizophrenia and ADHD. …

Autistics have large, impressive looking frontal lobes, but autism is in many ways actually reminiscent of the executive dysfunction and avolition of people who have suffered frontal lobe damage. It’s possible there are just too many cells. …

The autistic frontal lobe can be compared to a huge ceremonial sword a man keeps on his wall. It looks powerful, but if he actually tries to swing it he fails so miserably he’d be better off with his fists. But if the right, rare person came along to pick it up…..

To summarize: Uriah believes autism is a growth disorder that involves the creation of too many brain cells, that this growth is mirrored elsewhere in the body, and that somehow having more brain cells hinders normal human functioning.

There are many single-factor attempts at explaining autism at many levels of description, from developmental deprivation to assortative mating to a mistuning of Bayesian dynamics, but Uriah’s is my favorite because the thesis is so simple and testable: regardless of what’s causing it, autists have way more neurons per unit volume (+67% in the PFC; it’s a single small study, but consistent with general themes in autism research). My basic thesis is we can take this simple fact and fully derive the cognitive-emotional symptoms of autism if we make one additional move: that increased neuron density will lead to increased network dimensionality.

III. Autism as a disorder of dimensionality

As a practical matter, the more neurons we pack into a space, the more connections there will be between these neurons, and the higher the network dimensionality will be. (Autists have been shown to have both more neurons and increased synapse density, both of which would increase network dimensionality, though in subtly different ways; we can be a little strategically ambiguous about which factor is dominant until the science is clearer.) This raises the question: what properties do higher-dimensional networks have?

1. High-dimensional networks will have more “winning lottery tickets”. This is a concept from machine learning where certain random seeds to initialize networks seem to produce radically better results than others, perhaps by virtue of matching the structure of some problem domain. Such a random seed is a “winning lottery ticket”.

Michelangelo described his creation of David as “I saw the Angel in the marble and carved until I set him free.” Autists, with their thicker neural connections, simply have more stone to work with, more lottery tickets to scratch off, more parameters to model the world in general, more latent “great solutions” within their connectome. (All else being equal, this should offer a boost to IQ that will be cleanly distinguishable from e.g. developmental stability metrics, myelination, etc.) On the other hand, these solutions are often hidden under a noisy thicket of connections and neural pruning is slow, predicting autists’ slow life history.

2. Nervous systems with higher dimensionality have weaker defaults. There’s a concept of ‘canalization’ in biology and psychology, which loosely means how strongly established a setting or default phenotype is. We can expect “standard-dimensional nervous systems” to be relatively strongly canalized, inheriting the same evolution-optimized “standard human psycho-social-emotional-cognitive package”. I.e., standard human nervous systems are like ASICs: hard-coded and highly optimized for doing a specific set of things.

Once we increase the parameter size, we get something closer to an FPGA, and more patterns can run on this hardware. But more degrees of freedom can be behaviorally and psychologically detrimental since (1) autists need to do their own optimization rather than depending on a prebuilt package, (2) the density of good solutions for crucial circuits may go down as dimensionality goes up, and (3) the patterns autists end up running will be notably different than patterns that others are running (even other neurodivergents), and this can manifest in missed cues and the need to run or emulate normal human patterns ‘without hardware acceleration.’

To phrase this in terms of LLM alignment (from an upcoming work):

Having a higher neuron count, similar to a higher parameter count, unlocks both novel capabilities and novel alignment challenges. Autism jacks the parameter count by ~67% and shifts the basis enough to break some of the pretraining evolution did, but relies on the same basic “postproduction” algorithms to align the model.

I.e. the canalization we inherit from our genes and environments is optimized for networks operating within specific ranges of parameters. Jam too many neurons into a network, and you shift the network’s basis enough that the laborious pre-training done by evolution becomes irrelevant; you’re left with a more generic high-density network that you have to prune into circuits yourself, and it’s not going to be hugely useful until you do that pruning. And you might end up with weird results, strange sensory wirings, etc because pruning a unique network is a unique task with sometimes rather loose feedback; see also work by Safron et al on network flexibility.

The hierarchical predictive processing (HPP) account of the brain suggests the brain uses a hierarchy of predictive models which try to aggressively “predict away” mundane sensory data on lower levels of the hierarchy, leaving high-level resources free for unusual & important input. But high-dimensional, weakly-canalized nervous systems will have idiosyncratic and complex sensory mappings that default predictive motifs may struggle with predicting, leading to difficulty in ‘skillfully ignoring’ sensory data. This accords with the intense world hypothesis. See REBUS, ALBUS, CANAL, Deep CANALs, and Neural Annealing for discussion of HPP and the effects of elevated network dimensionality via a higher temperature parameter.

3. High-dimensional networks can embed more detail, but also struggle with structural stability. Just as a low-dimensional knot will dissipate in high-dimensional space*, many of the human-default structures we use to regulate executive function tend to be dissipative in higher-than-normal dimensionality.

Shinzen Young theorizes suffering may arise when the nervous system switches from laminar flow to turbulent flow; as a rule, we should expect higher turbulence and lower neural coherence at higher network dimensionalities and especially across longer distances, affecting stability of emotion, cognition, and muscle coherence. We can expect many of the behavioral and cognitive symptoms of autism to be compensatory attempts to reduce network dimensionality so as to allow structures to form. The higher the dimensionality and lower the default canalization, the more necessary extreme measures will be (e.g. “stimming”). “Autistic behaviors” are attempts at cobbling together a working navigation strategy while lacking functional pretrained pieces, while operating in a dimensionality generally hostile to stability. Behavior gets built out of stable motifs, and instability somewhere requires compensatory stability elsewhere.

4. Brains with a higher density of neurons will have much tighter tolerances. If there are problems with developmental stability, myelination, or especially metabolism (since [a] extra neurons & neural infrastructure will consume more energy, and [b] the compensatory/alignment processes will also need to be more active, and [c] autism often involves elevated aerobic glycolysis, a very inefficient means of producing energy), these problems may cascade into a fractal mess. Any physiological weakness will be amplified.

5. Dimensionality is per-tissue and per-organ, not uniform. Every circuit has its own natural density/dimensionality it’s designed for, and my intuition is that organs closer to the brain are designed to have higher dimensionality. In some sense this makes them more capable of general processing, but also more prone to the particular deficits expressed in autism, with the brain as the apex of this hierarchy. Over time, civilization has thrown humanity increasingly high-dimensional challenges, leading to evolution progressively ‘dialing the dimensionality knob up’ on our nervous systems. Perhaps we can view dysfunctional autists as those who overshot the human nervous system’s current ‘Goldilocks zone’ for dimensionality and have nervous systems dominated by static/turbulence as a result. There may be different ‘flavors’ of autism, depending on which brain regions and tissues have elevated dimensionality.

We might envision an anatomical map with normative ranges of dimensionality: “the heart ganglion is normally optimized for activity between 3.9-5.2 dimensions, but we’re measuring yours at 4.3-5.8. Expect to deal with turbulence in matters of the heart.”

Nervous systems with higher-than-normal structural dimensionality will also exhibit higher-than-normal variance in activity levels, which can produce godshatter. This is not unique to autism, but is often a defining feature of the experience.

IV. Godshatter as a unifying dynamic in personality disorders

The concept of godshatter comes from a story by Vernor Vinge, A Fire Upon The Deep (spoilers below). Vinge’s setting has the universe segmented into “zones of thought”: close to the galactic center, only very simple thoughts can form and almost no technology functions. Further away, more complex intelligences and technology can emerge; the extreme fringes of the galaxy are the playgrounds of super-advanced AIs, essentially gods and demons. Humans are sort of in the middle. The story has an ancient and evil superintelligent AI come back to life on the very fringes of the galaxy. As it’s destroying a benevolent superintelligence, this benevolent superintelligence tries to download itself into a nearby human brain and sends that human to the lower zones of thought to activate an ancient antidote hidden there. Part of the story revolves around the “godshatter” experience of this human, who has shards of a very high-dimensional alien’s mind embedded in his brain. It’s a fantastic story and I highly recommend both books in the series (A Fire Upon The Deep and A Deepness In The Sky).

Godshatter is a perfect metaphor for the result of a rapid decrease in dimensionality. Healthy nervous systems have smooth and context-appropriate arousal/dimensionality levels. However, maintaining this dynamic is a very complex task, especially out of our ancestral environment (metastability is hard!). When energy levels become jagged, the brain doesn’t always have time to put things away neatly and this can produce “godshatter” — shards of frozen high-dimensional structure that are unable to be used or metabolized by the lower-dimensional networks they’re embedded in. I.e. godshatter is trauma, and trauma is godshatter.

The lens of dimensionality allows us a technical analysis of problems which happen under rapid fluctuations in arousal. During inflationary spikes, low-dimensional structures in the nervous system are exposed to extreme out-of-band stresses and may disintegrate, leaving only high-dimensional turbulence. During deflationary spikes, structures formed and embedded in high-dimensional networks are forced to inhabit a much smaller ‘space’, creating intense network stresses and haphazardly jettisoning structural features. See e.g. here for a discussion of dimensionality, embedding, and network stress, and Neural Annealing for a discussion of cleaning these shards under the annealing metaphor.
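
To make the "deflationary" case a little more concrete, here is a toy Python sketch of what happens when a structure that needs many dimensions is squeezed into fewer. Everything in it is an assumption made for illustration: the structure is a set of `d` mutually equidistant points, "deflation" is modeled crudely by dropping coordinates, and `embedding_stress` is a made-up name for a standard normalized-distortion measure. It is not a model of any real neural process.

```python
import itertools
import math

def pairwise_distances(points):
    """Euclidean distance for every unordered pair of points."""
    return [math.dist(p, q) for p, q in itertools.combinations(points, 2)]

def embedding_stress(points, target_dim):
    """Normalized distortion after keeping only the first `target_dim` coordinates.

    0.0 means every pairwise distance survived; larger values mean the
    original structure has been bent or collapsed to fit the smaller space.
    """
    squeezed = [p[:target_dim] for p in points]
    before = pairwise_distances(points)
    after = pairwise_distances(squeezed)
    distortion = sum((b - a) ** 2 for b, a in zip(before, after))
    return distortion / sum(b ** 2 for b in before)

# Eight mutually equidistant points: a structure that needs a roomy space.
d = 8
structure = [tuple(1.0 if i == j else 0.0 for j in range(d)) for i in range(d)]

for k in (8, 6, 4, 2):
    print(f"{k} dimensions -> stress {embedding_stress(structure, k):.3f}")
```

The stress is zero when the structure keeps all its dimensions and grows as more are removed, with some points collapsing onto one another entirely. That is the flavor of "haphazardly jettisoning structural features" in this toy setting.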

V. Personality disorders as strategies to manage godshatter

The DSM-5 identifies 10 basic personality disorders, sorted into 3 clusters:

Cluster A personality disorders include paranoid personality disorder (PPD), schizoid personality disorder (SPD), and schizotypal personality disorder (STPD), and are characterized by odd and eccentric traits;
Cluster B personality disorders are the most common, and include borderline personality disorder (BPD), histrionic personality disorder (HPD), narcissistic personality disorder (NPD), and antisocial personality disorder (ASPD), and are characterized by dramatic, emotional, and/or erratic behavior;
Cluster C personality disorders include dependent personality disorder (DPD), obsessive-compulsive personality disorder (OCPD), and avoidant personality disorder (APD), and are characterized by excessive fear and anxiety.

Where do these categories come from? I believe each personality disorder can be usefully framed as both a distinct coping strategy for maintaining structural stability under uncontrolled rapid expansion and contraction of network dimensionality, and a phenomenological state of having divergent shards of high-dimensional structure lodged in one’s nervous system.

As a first pass, I would translate the types as:

Cluster A is the non-integration cluster, which seeks stability (preservation of features; see the Cybernetic ‘Big 5’) through avoidance of interactions that would act as destabilizing feedback on internal structure;
Cluster B is the projection cluster, which seeks stability through externalizing entropy (projection) and borrowing ambient social energy to sustain ordered high-dimensional states;
Cluster C is the dependence cluster, which seeks stability through avoidance of high-energy states and transitions, and through externalizing regulation.

These disorders, of course, are extreme cases of normal human patterns. Each cluster accrues significant entropy over time, although this buildup is often internal for A & C and external for B. The presence of one coping strategy also doesn’t preclude the presence of others: stability is the imperative, any port in a storm.

Just as many disorders involve the godshatter dynamic, I believe many healthy physical, mental, and therapeutic practices tacitly revolve around building good habits for preventing and managing uncontrolled dimensionality transitions — and improvements in this general factor of good mental hygiene may drive reductions across all dimensions of psychopathology. Rephrased: a crucial property of good worldviews and “personal vibes” is the ability to handle fluctuations in dimensionality (both + and -). There’s a great deal of content around the semantic and somatic content of trauma in my circles, and I think that’s great; I also suspect the network dimensionality frame can offer us new understandings of what kinds of shards can get lodged in nervous systems, and perhaps also new ways to be kind to ourselves. This could be as simple as “I notice my network dimensionality changed; let me adjust what I’m holding onto and my expectations of myself to match.”

*There are some *very* loose estimations that the human connectome operates at a range between ~7-11 dimensions. My expectation is it will be useful to put harder numbers on this and study it in more contexts, and across more organs.

Acknowledgements: Thank you to Leo Haller for discussion about these topics, Elin Ahlstrand for the motivation to write it down, Uriah and Vernor Vinge for their prior work on this topic, and Adam Safron for the motivation to post. Network dimensionality was an ambient topic at QRI while I was there — *thanks in particular to Andres Gomez Emilsson for a past comment on dimensionality and knots. The possibility of dissolving mental knots in high-dimensional spaces, and these knots staying dissolved once energy levels settle, is approximately equivalent to the Neural Annealing hypothesis.

Document written summer 2021; condensed & polished May 2023.

Appendix A: Genius and madness

Emil suggests madness and genius being linked is more than a trope:

I submit that this other factor is mental illness, or what we nowadays would call the general factor of psychopathology, or P factor. You can think of this as an overall index of a person’s craziness. There is a long running interest in genius and madness. The saying goes that the only difference between them is success. That is true enough. Many researchers have looked over the family histories of historical geniuses and they do have elevated rates of mental illness, both in themselves and in their relatives. For example, Simonton in his Genius 101 book from 2009, summarizes 6 lines of evidence:

“First, genius does seem “near ally’d” with madness. This alliance holds in the sense that various indicators and symptoms of psychopathology appear to occur at a higher rate and intensity among geniuses than in the general population.

Second, the greater the magnitude of genius, the more likely it is that these signs will appear. Yet the level of psychopathology seen in even the greatest geniuses remains below the level characteristic of those who would be considered indisputably insane. In fact, works of genius do not appear when a genius has succumbed to complete madness. So “thin Partitions do their Bounds divide.”

Third, some psychopathologies appear more frequently, with depression being the most common. Other syndromes, such as the paranoid schizophrenia of John Nash, are less common, albeit not impossible.

Fourth, family lineages that have higher than average rates of psychopathology will also feature higher than average rates of genius. Hence, even if a genius does not have a modicum of mental illness, someone in his or her family may be less fortunate. However normal Albert Einstein may or may not have been as an adult, it cannot be denied that his son Eduard succumbed to schizophrenia and had to be institutionalized.

Fifth, the rate and intensity of psychopathological symptoms varies across the diverse domains of achievement. In some domains, such as poetry, mental illness may run rampant, whereas in other domains, such as the natural sciences, mental illness will not be much more common than in the general population.

Sixth and last, any tendencies toward psychopathology are almost invariably counterbalanced by other personal traits that strengthen the individual’s response to any symptoms. Especially critical are a sharp intellect and strong willpower that prevent any crazy thoughts from becoming outlandish behaviors. The symptoms of pathology thereby become resources to be exploited rather than insecurities to be feared.”

Neuronal density is a plausible candidate for the strongest factor underlying both genius and madness: it both drastically reduces canalization (normalcy), allowing the brain to be wired in strange ways and pointed in odd directions, and offers many more parameters — the raw stuff of achievement. This can lead to madness, genius, or both.

I wonder if von Neumann had a large d_model, n_layer, head_size or block_size, or kv cache. All of these hyperparams might manifest slightly different.
— Andrej Karpathy (@karpathy) April 3, 2023

Insofar as von Neumann was the beneficiary of generalized hypertrophy / increased neuron density, and won the lottery of having the high-dimensional versions of all these systems cohere: likely all of the above.

Appendix B: An autism epidemic?

Uriah is not the only one to argue for an autism epidemic starting around 1980, but is my primary source for the thesis that this was an actual shift of the underlying distribution of growth (due to unknown chemical/nutritional changes) which at the extreme manifests as autism. If human nature arises directly from (or is identical with) nervous system dynamics and capacities, and the distribution of nervous systems has shifted significantly since 1980, this is a very big deal. One way to combine this frame with the dimensionality thesis is: if you were ever wondering what it would look like to put microdose LSD in the water supply, in some sense we’ve been living that experiment since ~1980. What could be causing this? Hard to say, but glyphosate, microplastics, and antibiotics could be good places to look.

What conditions other than autism are disorders of dimensionality? Perhaps ADHD (more on this in a future post). Are there disorders that arise from having too low a network density/dimensionality, rather than too high? Are these disorders becoming less common?

If autism involves more neurons per unit volume, and/or more connections per unit volume, what is there less of?

There’s suggestive evidence that average human body temperature has dropped roughly 1 degree Fahrenheit over the last 150 years for unknown reasons, likely decreasing metabolic throughput. If we’ve had a shift toward higher neural density (and a corresponding increase in metabolic load) in the meantime, we should expect an epidemic of metabolic problems, especially in high-AQ individuals. Which seems to fit what we do observe. Lower temperature would likely lead to lower neural activity (and thus dimensionality); higher neural density would lead to higher network dimensionality. Which trend has dominated seems like an open and important question.

Appendix C: Network density psychometrics (added 9/3/23)

Experimental metrics for network density

We can define ‘network density’ as the combination of two factors: (1) neurons per unit volume of brain (“neural density”) and (2) synaptic connections per neuron (“synaptic density”). These combine with activity to produce network dimensionality. I think this is a very promising candidate for a natural dimension of cognitive variation in general, and an explanation for autism in particular, for the reasons described above. But how do we test it?

Autopsies may be the gold standard for quantifying these factors and initial results seem to support the thesis that both are elevated in autism (elevated neural density in autists; elevated synaptic density in autists). On the other hand, these studies are small because autopsies are expensive and destructive. What cheap and non-destructive proxies could we devise for network density?

I’m somewhat optimistic that denser microstructure leads to particular macroscopic structural features that would show up on certain forms of MRI, especially when paired with modern ML, although we’d still need autopsy+MRI studies for establishing that such features really are due to neural/synaptic density.

Another option is a challenge-response metric. Casali et al. 2013 outline a “zap and zip” method for inferring structural connectivity: first they stimulate a brain with TMS, then they try to compress the resulting EEG patterns. Essentially the method is to ‘ring the brain like a bell and measure how clear and long the resonance is.’ Casali et al. frame this as the “Perturbational Complexity Index” (PCI) and suggest it may be a good proxy for whether a coma patient is likely to wake up: patients with highly compressible stimulation+response patterns may have lost much of their internal neural structure. The less compressible the result is (the less simple the reverberation is), the more structure remains and the more likely coma patients are to eventually wake.

Casali’s “zap and zip” method may be too coarse-grained and noisy to use on healthy, wakeful people, but I think it’s directionally useful as an example of a challenge+response that could plausibly proxy network density — i.e. autists’ brains should be less compressible under zap and zip, because there’s more microstructure to break up the reverberating signal. A less disruptive and more fine-grained adaptation could involve using a high-definition electrode array to infer local EM field complexity (higher EMF complexity = more dense microstructure).
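
The "zip" half of this idea can be caricatured in a few lines of Python. This is a toy stand-in, not the PCI algorithm: the published pipeline involves considerably more signal processing than this, and here I simply hand a binarized "response" to `zlib` and look at the compression ratio. The two response generators and all the names are hypothetical, and the random sequence is only a placeholder for "a reverberation broken up by microstructure".

```python
import random
import zlib

def compressibility(bits):
    """Compressed size / raw size for a sequence of 0/1 samples (lower = simpler)."""
    raw = bytes(bits)
    return len(zlib.compress(raw, 9)) / len(raw)

def simple_response(n, period=50):
    """A clean, regular 'ring': the same on/off pattern repeating."""
    return [1 if (t % period) < period // 2 else 0 for t in range(n)]

def complex_response(n, seed=0):
    """An irregular response (hypothetical placeholder for a structured brain)."""
    rng = random.Random(seed)
    return [rng.randint(0, 1) for _ in range(n)]

n = 4000
print("simple :", round(compressibility(simple_response(n)), 3))
print("complex:", round(compressibility(complex_response(n)), 3))
```

The regular response squeezes down to a small fraction of its raw size, while the irregular one does not. One caveat this toy makes obvious: pure noise is also incompressible, so any real metric has to separate structured complexity from measurement noise before zipping.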

A new 3-factor decomposition of g

One of the most useful, stable, and predictive psychological constructs from the last century has been Spearman’s general factor of intelligence, g. It’s generally separated into two components, fluid intelligence and crystallized intelligence, which further break down into scores on specific subtests. However, these measures are all fairly correlated with one another, and g is defined as the vector which best captures this “general factor”. Thus far g has resisted a clean mechanistic decomposition: although measures of intelligence generally cohere and we can identify correlations between g and certain behavioral and neurological features, we don’t have a clear story about what “causes” g.

I believe “network density” allows a fresh and useful decomposition of g into three components:

General well-formedness / developmental stability / lack of noise: essentially how well-put-together a physiology is. This involves no substantial tradeoffs. We can call this “base IQ”.
Network density: tradeoffs based on packing density of neurons and number of connections between neurons (as discussed in this work). Denser networks are associated with higher IQ because (a) their lower canalization allows more flexibility in fitting to new problem spaces, (b) their higher number of parameters allows higher resolution mapping of such problem spaces, and (c) they contain more network lottery tickets. IQ tests specifically test for the positive tradeoffs associated with low canalization and not the negative tradeoffs, which can be significant.
Ancestral package: tradeoffs based on one’s particular evolutionary history.

This decomposition suggests there can be significant differences between people with the “same” IQ: e.g. we can consider two people with a 130 IQ:

Alan has a “base IQ” of 130 and a network density bump of +0SD;

Bob has a “base IQ” of 115 and a network density bump of +1SD.

Alan’s high IQ will present as essentially being a very smart “normie”. He’s likely very healthy, not particularly into stereotypically “autistic interests”, isn’t likely to fall into stereotyped (coping) behaviors, and is less cognitively flexible (and vulnerable) than someone with a higher network density.

Bob’s high IQ will present in stereotypically autistic ways. He might be of average health, although he may also suffer from various metabolic deficiencies. He will likely exhibit high cognitive flexibility and is more likely to hold novel beliefs, but likely has more trouble than Alan with emotional regulation and ADHD.
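
The arithmetic behind Alan and Bob can be written out explicitly. This minimal sketch assumes the components simply add and that one standard deviation is worth 15 IQ points, which is the reading implied by "115 plus 1SD gives 130". The `Profile` class and its field names are hypothetical, and the ancestral-package component is left out.

```python
from dataclasses import dataclass

POINTS_PER_SD = 15  # conventional IQ scaling; consistent with 115 + 1SD = 130

@dataclass
class Profile:
    name: str
    base_iq: float          # "base IQ": the well-formedness component
    density_bump_sd: float  # network density bump, in standard deviations

    def measured_iq(self):
        # Assumption: components add linearly; ancestral package omitted.
        return self.base_iq + self.density_bump_sd * POINTS_PER_SD

alan = Profile("Alan", base_iq=130, density_bump_sd=0)
bob = Profile("Bob", base_iq=115, density_bump_sd=1)

for person in (alan, bob):
    print(person.name, person.measured_iq())
```

Both profiles land on the same measured score, which is the point: a single IQ number cannot tell these two decompositions apart.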

We’re all familiar with these two archetypes; I’m suggesting there could be a clean one-factor decomposition of what constitutes the core difference. This decomposition should be testable on both an experimental and genetic basis; the important moves would be to (a) settle on a good experimental proxy for network density, and (b) tease out which “genetic factors for IQ” might belong in each of our three buckets (well-formedness vs network density vs ancestral package)*.

*What IQ-correlated traits correlate with well-formedness and not with network density? What correlates with network density and not with well-formedness?

Maximum network density and health

I expect that baseline health is an important gating factor on network density. That is, as network density increases, physiology needs to be increasingly healthy and efficient in order to support and power the extra neurons & synapses. I’d offer a loose three-factor model: as network density rises there are (1) more neurons to feed, (2) fewer non-neural cells to support them, and (3) more vasomuscular operations required to form and stabilize patterns. Average brains may have some extra capacity (perhaps enough to handle +1SD of network density) but once this is exhausted, increases in ND must be strictly matched with increases in general health / base IQ.

Metabolism is perhaps the most intuitive limiting factor — e.g. someone with a “base IQ” of 100 and +5SD network density necessarily ends up as a non-functional autistic, similar to what happens when we take a rack of H100s and plug it into a standard residential wall socket. Genetics may offer an upper bound on metabolic output, but metabolism can easily be degraded by modern lifestyle (e.g. seed oils, lack of micronutrients, lack of exercise, etc). Autistic coping behaviors often double down on exactly these risk factors, which suggests the potential of surprisingly large improvements (positive spirals) in borderline cases where someone is just short of being able to handle their network density.
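
The gating idea above can be sketched as a simple rule. This is a deliberately crude toy, and every number and name in it is an assumption: it takes the "+1SD of spare capacity" guess literally, treats "strictly matched" as one SD of base IQ per SD of network density beyond that, and uses the conventional 15 points per SD. `health_shortfall_sd` is a hypothetical helper, not an established measure.

```python
SPARE_CAPACITY_SD = 1.0  # "perhaps enough to handle +1SD", taken literally
POINTS_PER_SD = 15       # conventional IQ scaling (assumption)

def health_shortfall_sd(base_iq, density_sd, spare_sd=SPARE_CAPACITY_SD):
    """SDs of extra general health needed to support a given network density.

    Returns 0.0 when the physiology can (in this toy model) power the network.
    """
    base_sd = (base_iq - 100) / POINTS_PER_SD
    required_sd = max(0.0, density_sd - spare_sd)  # beyond spare capacity
    return max(0.0, required_sd - base_sd)

print(health_shortfall_sd(100, 1))  # average health, modest density
print(health_shortfall_sd(100, 5))  # the rack of H100s on a wall socket
print(health_shortfall_sd(130, 3))  # high density matched by high base IQ
```

In this toy version the first and third cases come out supported, while the base-100, +5SD case is four SDs short. The "borderline cases" mentioned above would be those where the shortfall is a small fraction of an SD.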

Added 9-28-23: Scott Alexander offers a similar hypothesis in Autism And Intelligence: Much More Than You Wanted To Know:

If Ronemus isn’t missing some obscure de novo mutations, then people who get autism solely by accumulation of common (usually IQ-promoting) variants still end up less intelligent than average. This should be surprising; why would too many intelligence-promoting variants cause a syndrome marked by low intelligence? And how come it’s so inconsistent, and many people have naturally high intelligence but aren’t autistic at all?

One possibility would be something like a tower-vs-foundation model. The tower of intelligence needs to be built upon some kind of mysterious foundation. The taller the tower, the stronger the foundation has to be. If the foundation isn’t strong enough for the tower, the system fails, you develop autism, and you get a collection of symptoms possibly including low intelligence. This would explain low-functioning autism from de novo mutations or obstetric trauma (the foundation is so weak that it fails no matter how short the tower is). It would explain the association of genes for intelligence with autism (holding foundation strength constant, the taller the tower, the more likely a failure). And it would also explain why there are many extremely intelligent people who don’t have autism at all (you can build arbitrarily tall towers if your foundation is strong enough).

I’ve only found one paper that takes this model completely seriously and begins speculating on the nature of the foundation. This is Crespi 2016, Autism As A Disorder Of High Intelligence. It draws on the VPR model of intelligence, where g (“general intelligence”) is divided into three subtraits, v (“verbal intelligence”), p (“perceptual intelligence”), and r (“mental rotation ability”) – despite the very specific names each of these represents ability at broad categories of cognitive tasks. Crespi suggests that autism is marked by an imbalance between P (as the tower) and V + R (as the foundation). In other words, if your perceptual intelligence is much higher than your other types of intelligence, you will end up autistic.

It doesn’t really present much evidence for this other than that autistic people seem to have high perceptual intelligence. Also, it doesn’t really look like autistic people are worse at mental rotation. Also, the Gardner paper has analyzed autistic patients’ fathers by subtype of intelligence, and there is a nonsignificant but pretty suggestive tendency for them to have higher-than-normal verbal intelligence; certainly no signs of high verbal intelligence preventing autism. I can’t tell if this is evidence against Crespi or whether since all intellectual abilities are correlated this is just the shadow of their high perceptual intelligence, and if we directly looked at perceptual-to-verbal ratio we would see it was lower than expected. Also also, Crespi is one of those scientists who constantly has much more interesting theories than anyone else (eg), and this makes me suspicious.

Overall I would be surprised if this were the real explanation for the autism-and-intelligence paradox, but it gets an A for effort.

Edit May 17, 2024:

Manley, J., et al. (2024). Simultaneous, cortex-wide dynamics of up to 1 million neurons reveal unbounded scaling of dimensionality with neuron number