By Emma Young
Picture yourself sitting in a Zen garden, surrounded by low, rounded bushes and gravel raked into rippling swirls. Now imagine standing in front of a brutalist building, all straight lines and sharp edges. If you think you’d feel more relaxed in the Zen garden, there could be a low-level perceptual reason — one that could explain everything from why you’re far more likely to find a jagged script on the cover of a death metal album than on a romance novel, to why clouds and lullabies seem to go together.
The research, published in Proceedings of the Royal Society B, suggests that we automatically associate variations in one particular property of images or sounds with variations in levels of emotional arousal. This gives us an instinctive understanding, just from the tone of someone’s voice or watching their movements, of whether they are angry or sad, excited or calm. But it seems that these associations between perception and emotion are so automatic and fundamental that we apply them to inanimate objects as well.
Beau Sievers at Harvard University and colleagues ran a series of five studies designed to explore this property, known as the “spectral centroid”. Images and sounds can be decomposed into a spectrum that contains many components of different frequencies. Those components will differ depending on the shape or sound: a smooth curve, for example, is made up of lower frequencies than a shape with lots of straight lines and corners. The spectral centroid (SC) is essentially the average of this frequency spectrum. Sievers’ team wanted to know whether this property was related to levels of emotional arousal, with anger and excitement, for example, being characterised by high levels of arousal and feeling sad or peaceful by low levels.
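To make the idea of an "average of the frequency spectrum" concrete, here is a minimal sketch of how a spectral centroid could be computed for a one-dimensional signal such as a sound. It uses the common magnitude-weighted mean frequency definition and is not necessarily the exact procedure the researchers used for shapes or movements. The function name `spectral_centroid` and the two test signals are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def spectral_centroid(signal, sample_rate):
    """Return the magnitude-weighted mean frequency of a 1-D signal, in Hz.

    Hypothetical helper for illustration; it follows the usual signal-processing
    definition: sum(frequency * magnitude) / sum(magnitude).
    """
    signal = np.asarray(signal, dtype=float)
    # Magnitude of each frequency component (real-input FFT).
    magnitudes = np.abs(np.fft.rfft(signal))
    # Frequency in Hz that corresponds to each FFT bin.
    freqs = np.fft.rfftfreq(signal.size, d=1.0 / sample_rate)
    total = magnitudes.sum()
    if total == 0:
        return 0.0  # a silent signal has no meaningful centroid
    return float((freqs * magnitudes).sum() / total)


# Two one-second test signals with the same 50 Hz pitch (assumed values).
sample_rate = 1000
t = np.arange(sample_rate) / sample_rate
smooth = np.sin(2 * np.pi * 50 * t)  # a pure, "rounded" tone
sharp = np.sign(smooth)              # a square wave with abrupt edges

print(spectral_centroid(smooth, sample_rate))
print(spectral_centroid(sharp, sample_rate))
```

The square wave's abrupt edges add higher-frequency components, so its centroid comes out higher than that of the pure tone even though the two share the same pitch. This loosely mirrors the contrast the article draws between smooth curves and shapes with many corners.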
In one study, the researchers asked participants to match hundreds of randomly generated shapes and sounds to feeling angry, excited, sad or peaceful. Shapes with lots of corners and sharp angles had high SCs and were also far more likely to be matched to anger and excitement. The same was true for high SC sounds characterised by brief bursts of white noise. In contrast, blobby, swirly shapes and more ‘rounded’ sounds had low SCs, and were predominantly paired with being peaceful or sad.
Another group of participants were asked to draw shapes that were angry, sad, excited or peaceful. The angry and excited shapes had an average of 24 and 17 corners respectively (a high SC) whereas the sad and peaceful shapes had an average of 9 and 7 corners (a low SC). The SC of a shape could be used to predict its emotional label with close to 80 per cent accuracy, the researchers report.
Sievers and colleagues then analysed recordings of people moving around and also speaking emotionally-neutral sentences while expressing several different emotions. They found that angry speech and movements had a consistently higher SC than sad speech and movements.
The researchers think that the SC of someone’s voice or movements provides a clear signal of how strongly emotionally aroused they are. Being able to perceive this property automatically would clearly be useful for survival. If the person coming towards you is really angry, say, you’d want to pick up on that right away. But these associations seem to be so fundamental that we link any visual input or sound with a high SC (including spiky shapes, death metal and a brutalist building design) to the high-arousal emotions of anger or excitement, and anything with a low SC (such as clouds, the sound of lullabies and the swirling gravel of Zen gardens) to feeling calm (or possibly sad — context will clearly play a role).
This process could also account for the classic “cross-modal” finding, first reported in 1929 and replicated around the world many times since, that people tend to pair a blobby shape with a word-sound like “bouba” and a spiky shape with a word like “kiki”.
When it comes to reading emotion in vocalisations, other animals may use a similar approach. In 2017, another team reported that English-, German- and Mandarin-speakers were able to identify levels of emotional arousal in the vocalisations of a range of animals, including amphibians, reptiles and mammals. The spectral centroid may relate to emotional arousal not just across cultures but across species, Sievers and colleagues write.
“By understanding how the brain extracts low-level cross-modal features to determine meaning, we can build a deeper understanding of how communication can transcend immense geographic, cultural, and genetic variation,” the team concludes.