Psychologists Are Mining Social Media Posts For Mental Health Research — But Many Users Have Concerns

By Emily Reynolds

This article contains discussion of suicide and self-harm

In 2014, the Samaritans launched what seemed like an innovative new project: Radar. Designed to provide what the charity described as an “online safety net”, Radar let users sign up to receive updates on the content of other people’s tweets, with emails sent out based on a list of key phrases meant to detect whether someone was feeling distressed.

In principle, this meant people could keep an eye on friends who were vulnerable: if they missed a tweet where somebody said they felt suicidal or wanted to self-harm, for example, Radar would send it on, in theory increasing the likelihood that someone might get help or support.

In practice, however, things weren’t so simple. Some pointed out that the app could be used for stalking or harassment, allowing abuse to be targeted at someone’s lowest point. There were false positives, too — “I want to kill myself”, for example, is often used as hyperbole by people who aren’t actually distressed at all. And others felt it was an invasion of privacy: their tweets might be on a public platform, they argued, but they were personal expression. They hadn’t consented to being used as part of a programme like Radar, no matter how well-meaning it was.

Samaritans shut down Radar just a week after launch. But since then, the use of social media data in mental health research — including tweets, Facebook and Instagram posts, and blogs — has only increased. Researchers hope that the volume of data social media offers will bring important insights into mental health. But many users worry about how their data is being used.

Targeted tools

Social media’s role in research continues to grow. In February of this year, a team from King’s College London made headlines with a paper in Scientific Reports that found days with particularly high volumes of depression and schizophrenia-related tweets also saw higher numbers of crisis episodes at mental health service providers in London.

The benefits of using such data are clear, says Anna Kolliakou, lead author of the paper. The volume of data would have been otherwise impossible to obtain, for one, and there was potential to gain insight into the “opinions and experiences of communities that transcend location”.

The KCL team hopes that monitoring tweets at a population level could predict mental health service activity and help manage strained services. But others have looked at whether social media data could also tell us something about individuals’ mental health. A 2017 study looked at the course and onset of PTSD using Twitter data, for instance, suggesting that the language of people’s tweets could provide early hints they would develop the condition.

And in 2018 researchers found that the use of “language predictive of depression” on Facebook (sadness, loneliness, hostility and more) could predict whether a user was depressed, in some cases a full three months before they received a formal diagnosis. The paper’s authors suggested that such predictive data could be used to screen (consenting) adults for depression.

When it comes to studies looking to help at-risk individuals, Sarah Knowles, an NIHR Senior Research Fellow in Knowledge Mobilisation at the University of York, is “very sceptical of accuracy and validity, and what they’re intended to achieve”.

“I haven’t ever seen a convincing example of how targeting an individual on social media helps them personally, and it’s perhaps more likely to make them feel isolated or even attacked,” she says. “Some people use social media as an outlet for difficult feelings, and worrying about being ‘tagged’ as at-risk might shut this down, so you’re potentially removing a coping strategy.”

Knowles also points out that a high number of people who seek help find it difficult to access (figures vary, but the Mental Health Foundation suggests that 75% of people with mental health problems in England may not get access to the treatment they need). Framing mental health as a problem of “detection” is therefore missing the point, she argues. And programmes like Radar ultimately “comfort onlookers who are worried about someone, rather than help the person expressing a problem”.

Even if algorithms could help identify those who are at risk, there are still questions about whether such tools are wanted. A survey published in late 2019 found that although people may see the value of using algorithms to detect mental illness from social media data, they don’t actually trust social media platforms with their personal information.

These aren’t just academic arguments — Facebook is already using an algorithm in some countries to detect suicidal ideation in posts, with the aim of providing help to those who need it.

The company says that machine learning is only one part of its efforts to help people who are struggling: in a blog post, its head of product development writes that “technology can’t replace people in the process, but it can be an aid to connect more people in need with compassionate help.” For those with concerns about the ethics of machine learning in this context, however, that reassurance is not likely to go far enough.

Public or private?

In the studies that used Facebook and Twitter data to predict PTSD and depression, participants had consented: they knew what their data was being used for. But one of the appeals of social media data for many researchers is the ability to scrape information from so-called “public” datasets.

However, this might not be as straightforward as it seems, at least ethically speaking. A study in Social Media and Society found that many users don’t believe that researchers should use their tweets for research without permission. Another study’s findings were more mixed, with some users feeling positive about their data being used for mental health monitoring, at least when anonymised. But others were still not convinced.

Despite this tension, research using data scraped from Twitter without consent continues. One study used tweets from a hashtag, “#MyDepressionLooksLike”, to identify communities tweeting about depression; another looked at the language used by people who had tweeted that they were affected by PTSD, depression, bipolar disorder and seasonal affective disorder.

Many of those who share online are sceptical. “I know that an argument in favour of this would be that once people speak on public platforms, they inherently give permission for that content to be used,” says Sarah-Louise Kelly, a content writer who lives in Glasgow. “But I disagree.” Kelly wrote a blog for many years, starting in her teens; many of her posts included accounts of depression, OCD, anxiety and trauma. She’s recently limited her tweets and blogs about mental health, but says she still tries to be open about “the peaks and troughs” of her mental health.

At 30, Kelly notes that she’s “from a generation that used the internet as a diary”: she’s “never grown out of the mindset” that sharing online can bring community, catharsis and solidarity. But she still feels uncomfortable about her data being used in research.

“If tweets or blogs of mine have been used in research, I really resent that I’ve never been contacted,” she says. “Where’s the gain for me and my community? I have to pay for my own therapy, I have to work out my own coping mechanisms, I learn nothing. I understand the importance of these studies and don’t want to undermine them. But social media users deserve to be treated as more than case studies when they’re discussing such intimate problems.”

Martin*, in his 40s, initially joined Twitter for news updates, but he soon found himself part of a community of people who shared his experiences with poor mental health. “It was really important to me because not everybody in my life knew I was unwell — I had a job at the time I was terrified of losing,” he says. Martin, in a new role and a new relationship, is now able to be more open, and he uses Twitter less to talk about his own experiences with mental illness. But, like Kelly, he’s unhappy with the idea that personal posts could be used as research data.

“There were things I posted online that, at that time, I wasn’t saying to anybody else,” he says. “I was talking about bad experiences with mental health services, stigma, knowing that I was speaking to an ‘audience’, if you will, that would understand what I’d gone through. The idea of that being scraped from Twitter and used in research doesn’t really sit right with me.”

It’s not that he doesn’t want to contribute to what we learn about mental health — he points to several Covid-19 related surveys on mental health that he’s completed over the last few months alone. But consent is important. “I read a lot about mental health, I’m interested in the area in general, so this isn’t an anti-research stance. But any [research] I’ve been involved with — that I know of, anyway! — has been something I’ve willingly opted into. That should be the case with everything.”

Martin’s experience with healthcare services has also been mixed: he’s been “invalidated” by psychiatrists and other healthcare workers, and mentions that many of those from his Twitter community have experienced the same thing.

“Obviously researchers are not psychiatrists and they’re not responsible for my treatment or care,” Martin says. “But when you’ve been in a situation where you feel that you’ve not been treated with due respect or you’ve had your voice taken away from you, it’s even more important that anything that is related to mental health involves giving full consent. That’s something I know a lot of people feel strongly about.”

Reading the fine print

While some object to the potential use of their posts, others aren’t aware that it’s happening at all. The Social Media and Society study found that many users were unaware public tweets could be used by researchers, and as co-author Casey Fiesler noted, Twitter or Facebook privacy policies don’t seem to increase awareness, even when they are clear about how users’ data could be used. “As we know,” she wrote, “most people don’t read privacy policies.” Plus, she said, many people don’t have a good understanding of how far their social media posts really reach.

Kelly also raises this point: “Lots of people treat social media like an extension of community; they think they’re speaking just to their followers.” When posts are used as part of research, however, that community inadvertently gets a lot bigger.

Knowles feels strongly that social media data should not be automatically available to use at all. “It’s [people’s private] information, and they should be able to have reasonable expectations about being informed if someone else wants to reuse it.” Think about a coffee shop, where you can sit and have a conversation with a friend about your health. “That’s public,” she says. “But we all know you shouldn’t sit down, eavesdrop, and assume you have the right to take away and use what was said.”

None of this is to say that researchers are abandoning ethical data collection when working with social media: Knowles says that many researchers are wary of using social media data at all, and that approvals from ethics committees for such work can be demanding too. She’s more worried about commercial organisations — like Facebook — using universities’ procedures as a benchmark for conducting social media research. If universities haven’t yet got a clear set of ethical standards, following in their footsteps may not be the best idea.

Kolliakou agrees that ethical consideration is paramount: “researchers have an obligation to handle this data like they would any other research project — privacy and confidentiality should be the utmost priorities.”

But she’s also hopeful that use of social media data could have a big impact on the prevention of illness or distress. “There have been great advances in developing systems that could automatically identify individuals experiencing a crisis,” she says. “The question is whether this is at all acceptable. This is a complex research issue still in its infancy, and until we have some concrete findings that it actually works, uninvited contact with users should not be made. Intervening with individuals unaware their information has been used is invasive and not appropriate.”

*Some names have been changed

Emily Reynolds is a staff writer at BPS Research Digest