by Carey Finn (@carey_finn)

Ellie Frymire, the data-visualisation specialist who explored what people were really saying about the #MeToo movement through a cluster analysis of 1.4m tweets, shares insights into her research, as well as tips for life in the age of data. Frymire was a speaker at this year’s Design Indaba in Cape Town.

Q5: What was the biggest learning from the #MeToo analysis for you?
Ellie Frymire:
The biggest learning from the cluster analysis [was] likely just understanding the capabilities of machine learning (although it took my computer a few hours to complete, which felt reminiscent of trying to get online in the early ’90s). Machines are our friends! Within the corpus, though, I learned how much of what we post online is… inconsequential. So much of our social media activity doesn’t hold significant meaning (mine included — my last tweet as of writing this was a meme, but it was a funny meme!). We can use those powerful machines and other dimensions, like engagement or content, to find the meaningful messages.

Q5: Are you planning any similar studies in the future? What criteria would you use to choose a new topic/project?
I always have projects in the back of my mind I hope to start. Since meeting so many amazing people at Design Indaba, I’ve been inspired to use machine learning to create art. The power of machine learning is amazing, but algorithms still require a lot of hand-holding, which affects the analysis. Art, however, allows for a certain freedom unconstrained by the critical eye of statistics. I’d love to see art transformed through live data: in essence, real-time analysis and visualisation. An abstract dashboard, if you will. If I were to repeat this project of collection, analysis, and reporting, I’d love to apply this process to a more worldly topic. I learned so much about the water and land issues in South Africa, and I think I’d like to know what people have to say about that (not to mention, who they are).

Q5: How can we, in this age of ever-increasing data, guard against losing depth (or voices) in all of it?
EF: One of the biggest lessons my parents shared with me while growing up in Silicon Valley in the early 2000s was the thought that “everything you put online will last forever”. Although it’s somewhat true, we’re finding a lot of the clutter of our old websites is either lost (like MySpace, losing all music and pictures from that time) or forgotten (into the white-noise abyss of data online today). The process of archiving this data is its own beast (good luck to the Library of Congress on that one), not to mention the intentionality and voice behind the data. So often in our reporting (not only in publications but also internal organisational reporting), we focus too much on aggregation and less so on evaluation. We need to not only ask “how much”, “how often”, “where” and “when”, but also “why”. They all work together to paint a picture. It’s important to find the sweet spot between quantitative and qualitative as we discuss current data and movements such as this one.

Q5: How can we know which data, and which interpretation(s) of data to trust?
This is a difficult question that I struggle with often. We should always try to understand the details that went into data analysis but, with so much of our current life dependent on data, it would equate to a full-time job just to parse through the datasets and their interpretations that we interact with daily. I have been particularly struck by the stories I read in Weapons of Math Destruction, written by Dr. Cathy O’Neil. If our algorithm was trained on a racist data set, it would give us racist results. Even raw data needs to be questioned — how was this data collected, what was asked, and what could have been missed? My rule of thumb is to question the motivations of the creators. Who is the analysis by, what owners have a stake in the research, and what was the purpose of the work?

Q5: In a way, your work feels like an ultra-modern type of content/discourse analysis — do you think that discourse analysis/content analysis is something that will ever be entirely outsourced to machines?
I suppose, in a Black Mirror sort of way, this is possible. But so much of my analysis relied on decisions I made with regard to data cleansing, algorithmic parameters, and qualitative analysis, just to name a few (something I’ve referenced often in these answers, which I suppose speaks to how important those decisions are when we consider the results of machine learning). Human decisions are not entirely removed just yet. However, I think it’s entirely possible to see something created by a human but applied to a new corpus; for example, it wouldn’t be too difficult to run my analysis on a larger data set of #MeToo tweets, possibly from an entire year, or #blacklivesmatter, etc. Machines certainly do the work, and can likely do more analysis, faster, than a year ago. But it’s up to us to understand, interpret, and communicate the results. Although, once machines can give nervous speeches at Design Indaba like me, then I might be singing a different tune.


Carey Finn (@carey_finn) is a writer and editor with a decade and a half of industry experience, having covered everything from ethical sushi in Japan to the technicalities of roofing, agriculture, medical stuff and more. She’s also taught English and journalism, and dabbled in various other communications ventures along the way, including risk reporting. As a contributing writer to, her new regular column “Q5” aims to home in on strategic insights, analysis and data through punchy interviews with experts in media, marketing and design.
