EP13: Fighting Misinformation with Machine Learning

Media Jungle

Host Alex Ragir is joined by Mark Little, CEO of Kinzen and founder of Storyful, to talk about combatting misinformation with machine learning, what Elon Musk should do with Twitter, and more...

Alex Ragir

May 25, 2022

Subscribe on Apple Podcasts, Spotify, Google Podcasts, Stitcher, and everywhere you listen to podcasts (if you like the show, leave us a review!). Not much time? Watch short clips on TikTok, Instagram, and YouTube.

Welcome back to Media Jungle! Welcome to new subscribers from Twitter, The Defiant, LA Times, Washington Square Films, and more. Simply respond to this email with any feedback…

We have a great episode this week with:

Mark Little (@Marklittlenews): Co-founder and CEO of Kinzen, and founder of Storyful. He’s also worked as a VP at Twitter and a foreign correspondent for RTÉ News.

Podcasts grow from referrals, so if you like Media Jungle, please consider sharing with someone interested in learning about how the media industry works.

Read the transcript of the full interview below. Or watch the full episode on our YouTube channel here:

Missed last week’s show? Here are two short clips from the chat with Camila Russo, founder of The Defiant:

Here’s a transcript of this week’s episode with Mark Little, recorded May 11th, 2022:

Alex: Welcome to the thirteenth episode of the Media Jungle video podcast. I'm your host, Alex Ragir, coming to you every week to break down the business behind the news industry, technology, and the creator economy.

Misinformation is destroying the truth and it's a terrible thing. Unless you're a company that gets paid to stop misinformation, then it can be good for business. Mark Little is here. The founder of misinformation-fighting platform Kinzen. He's also the founder of a little-known company named Storyful, sold to News Corp for $25 million about 10 years ago (not to be confused with another company called Storyhunter, my company). I'm really excited to talk to you again.

The Misinformation Boom

Alex: Mark, you seem to pick some of the toughest enemies.

Mark: Yeah, I suppose I'm one of those people in media who sort of runs toward the fire. Wherever I think there's going to be something like a clash, a seismic shift between technology and journalism and media, that's where I've always just been obsessed. Wherever the problem is, right? If you're a good entrepreneur you're supposed to start with the problem, not the solution, and wherever I am is trouble. So that's kind of how I marked my career.

Alex: Ten years ago we saw the boom of social media disrupting breaking news, and you were at the forefront with Storyful. Now you're at the forefront of this sort-of misinformation boom.

Mark: Yeah. So with Storyful I think there were kind of two waves of the social web. I think the first wave was everybody thought this was a liberation. I remember with Storyful we got founded because in the Arab uprisings, we were finding these amazing videos, but then we were also realizing there was misinformation, and there were governments trying to pull the wool over our eyes on social media. And so we thought back at Storyful, we were fixing a bug in the system, like pollution, right? “There's a fabulous new way of communicating in the world. It's got a couple of problems. We'll fix them with Storyful.” That wasn't the case. I went to work for Twitter, and started to realize that it wasn't just a bug in the system, but misinformation and hate speech were kind of part of the system. They were being promoted by the algorithms, and that second problem is what I got obsessed with. [That] this was far deeper than just a bit of a side-effect. There was something really deep in the roots of the business model and the algorithms that run the internet these days. And we had to go and see if we could fix that. That was the small, tiny problem I set out with Kinzen to solve.

Alex: Maybe you can help break down exactly how Kinzen works and what you do for people.

Mark: I got obsessed with the impact of machine learning, right? So if you're sitting on Spotify, if you're on Netflix or you're writing your emails, there’s machine learning in the background recommending what happens next. So I got obsessed with all of that and I realized, as a journalist, if I could match good old-fashioned editorial analysis with these machine learning algorithms, we could do a way better job at finding the really terrible stuff that's on the internet. Because at the moment you've got these content moderators who are overwhelmed, making decisions every day on millions of pieces of content. So we wanted to bring in trained analysts, matched with machine learning, to give early warning of where we could see the most serious misinformation, the most damaging hate speech. Give the technology companies a heads up and say “Hey, this is where you should be looking right now in the world, because there’s some really bad stuff happening with real-world harm.”

Alex: So are your clients the platforms, governments, or NGOs? Who do you work with?

Mark: Yeah, we generally don't work with governments. We do work with public health agencies, but primarily we work with big technology companies. These are people that have got a lot of audio or video on their platforms. We also work with the content moderators. Many of these big platforms outsource the content moderation to other workforces, and we've also worked with fact-checkers, with a wide range of people. We’re interested in brand safety, so hopefully helping advertisers make better decisions about whether to put their ads on the internet. But right now, it's the tech platforms that use us as a kind of radar or early-warning system of where in the world they need to be worried. For example, we just had big elections in the Philippines, and we were watching there because we saw the dominant candidate, the guy who won the election, was using a lot of misinformation and manipulation in the background to drum up support online. So that kind of threat is one of the things we were watching out for.

Alex: So maybe you could take us in there. What types of things are you looking for? How do you detect it? How does it all kind of work?

Mark: So we are in about 17 different languages right now. We've got these well-trained analysts in these different places around the world, essentially watching out for the trends that are coming from different platforms and that are coming from different networks as well, also in different languages. So we've got machine learning helping us translate and transcribe audio and video for us, spotting patterns, and then telling the machine or the human being “Hey, keep an eye on this.” So it's what we call a human-in-the-loop machine learning system, where the human being is playing a role in correcting and priming the pump of the machine learning all the time and it just gets better and better.
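A minimal sketch of the human-in-the-loop idea Mark describes, written as a toy and not as Kinzen's actual system: a simple word-weight scorer decides the clear cases itself, hands the uncertain ones to a human analyst, and learns from each answer. The class name `HumanInTheLoopFlagger`, the thresholds, and the `analyst` callback are all hypothetical names chosen for illustration.

```python
class HumanInTheLoopFlagger:
    """Toy word-weight scorer that defers uncertain items to a human analyst."""

    def __init__(self, low=-1.0, high=1.0):
        self.weights = {}   # word -> learned weight (positive means harmful)
        self.low = low      # at or below this score: ignore automatically
        self.high = high    # at or above this score: flag automatically

    def score(self, text):
        # Sum the learned weights of the words in the text (unknown words count 0).
        return sum(self.weights.get(w, 0.0) for w in text.lower().split())

    def triage(self, text):
        s = self.score(text)
        if s >= self.high:
            return "flag"
        if s <= self.low:
            return "ignore"
        return "ask_human"  # the machine is unsure, so a person decides

    def learn(self, text, is_harmful):
        # The analyst's label nudges the weight of every distinct word seen.
        delta = 1.0 if is_harmful else -1.0
        for w in set(text.lower().split()):
            self.weights[w] = self.weights.get(w, 0.0) + delta


def review_stream(flagger, items, analyst):
    """Triage each item; ask the analyst only when the model is unsure."""
    decisions = []
    for text in items:
        decision = flagger.triage(text)
        if decision == "ask_human":
            is_harmful = analyst(text)        # human judgment
            flagger.learn(text, is_harmful)   # feed the correction back in
            decision = "flag" if is_harmful else "ignore"
        decisions.append((text, decision))
    return decisions
```

In this sketch the analyst is consulted less often as the weights accumulate, which is the "priming the pump" effect described above. A production system would use a trained language model in place of word counts.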

Alex: So what types of things can it come up with and can you do this in any language?

Mark: We're benefiting from major advances in machine learning. And I always think with media people, we think the biggest change over the last ten years was subscriptions and paywalls and the creator economy. I would argue that for media and communications in general, it was the birth of what they call these big language models, where the big companies like Google, Microsoft, and Facebook have developed these massive natural language processing models. A good example would be what's called GPT-3. This is powering things like translation, transcription, and auto-complete, but they've now open-sourced them. So a tiny company like ours in Ireland can actually use these open source models, a bit like the way that in the birth of home computing, twenty, thirty years ago, we all got access to this massive computational power. So we're using that and we sit on top and we build our new models on top of that. It's all down to this rapid advance in the last couple of years that allows us, with a human eye, to exponentially scale the skill of a human editor. And that's what's so exciting about this. It's kind of old-fashioned editorial skill matched with, in some cases, just cutting-edge technology.

Alex: And then you notify the platform or the client that this looks like it's developing into something that might go viral, into a conspiracy theory, or into something that might be damaging?

Mark: Yeah. We're also helping them sort out between people expressing a good opinion or a bad opinion. Just to explain: someone who might be against the conventional wisdom, versus someone who's actually trying to create real-world harm. So for example, someone online might say “I don't like this religion,” but we want to make sure we distinguish that from someone who says “I'm going to go out and harm people of another religion.” So we're helping these big platforms make very fine, calibrated decisions. And in that way we're hoping that we're helping make a contribution to freedom of speech, to better expression, to distinguish between what's a real threat and what's just someone saying something wrong on the internet or expressing an unpopular viewpoint. So making that difference between harm and opinion is one of the things we're watching out for. To give an example, we'd be watching elections in Brazil coming up or in Kenya, where there might be violence because of one side going on the internet and manipulating debate and trying to incense people and make them go out and commit violence. It's that kind of early warning that we're looking for. Can we see a trend that might result in someone getting hurt in real life, as they did in Myanmar a few years ago, where Facebook just didn't have eyes on the local community and there were literally genocidal acts committed because people on Facebook were using that platform to create real-world harm. So we're very informed by that experience and so are the technology platforms.

Kinzen’s beginnings

Alex: Was this the same direction that you always had with Kinzen?

Mark: No. We started out thinking about how we could push the good stuff, right? So we started out thinking to ourselves that the reason why we're so screwed up, particularly in social media, is that the real quality media, the kind people really feel some sort of intentional need for, the quality political news, or even just news about your profession, has to compete with all of the draws. It has to compete with the algorithms constantly feeding you things that feed your worst instincts. So what we wanted to do initially was use this machine learning to find out, for example, what Alex wants. He lives in a certain place. He works at a certain job.

Alex: Miami.

Mark: There you go. You're in Miami [laughs]. You want to know about the Dolphins, you want to know about your local community, but you also want to know about your media business. You want to have control over your algorithm, and that's what we started doing. So it was a positive idea, like push the good stuff faster to Alex, but we got a phone call from a friend of mine who used to work at Twitter, who said, “Hey, can you also use that machine learning to detect the bad stuff?” And it turned out we can. So it's almost like a reverse recommender. So we started out looking at a consumer-facing aggregator, so it was a bit like smart users. Some of those folks are doing really good work in the aggregator space, but the difference was we wanted to give the ordinary person the control over the algorithm. I want to change over to audio, I want 15 minutes in the morning at this particular time, I want to show you, almost like an app for fitness, what your news looks like at the end of the week. So that was the thing we tried to do initially. We came across a huge problem: publishers don't want to do deals with new tech platforms, right? They've been bitten by the last wave of dependence they've had on platforms, and so we would have had to go out and do deals with every single publisher to maintain a high enough quality of content for consumers. Meanwhile, as I say, we got a phone call from somebody who said “Listen, we're having a problem. We don't know what's in all of this vast amount of audio and video. Can you use machine learning to detect the bad stuff?” And that's exactly where we went. We decided, “You know what? We can actually do a much better job providing an enterprise software play for the big tech companies to be better at detecting the really bad stuff and making sure they can make the right decision.”

Alex: So then it's like a SaaS model where you charge a subscription?

Mark: Yeah, so it's a pretty big enterprise software play. We're doing very big deals across multiple markets and it's got to the point now where it's very much like a big software play, and I think that is only increasing in size as we go forward, as people become dependent in a good way on the services that we're providing. And more importantly on the technology, because we've got to the point now where we're building stuff that is absolutely cutting-edge. And we've been benefiting a lot from one of the biggest changes in the media space and that's these new, big learning models. We've all heard about GPT-3 and BERT, and what's happening here is it's a bit like the revolution of the home computer back 30 years ago; these big NLP models are helping a small startup out of Dublin, like ours.

Alex: What’s NLP?

Mark: Natural Language Processing. What this means is, think about your auto-complete on your email, right? You're writing your email and it's detecting what is in the words you're writing and it's suggesting words to complete the sentence. Essentially language models, these massive, big computational models that are being built by the big platforms. They've been open-sourced now. So small startups like us can build a new model to transcribe Arabic. And we use this machine learning model and essentially build on top of it. And that's where we've got this cutting-edge technology that can understand that Arabic, for example, is like one written language, but it's actually spoken in about 12 or 13 dialects, and we have to transcribe the audio. And that's where the machine learning starts to get really smart.
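The auto-complete example above can be illustrated with a very small sketch. This is a toy bigram counter and not the kind of large neural language model Mark is referring to; it only shows the basic idea of predicting the next word from the words already written. The function names `train_bigrams` and `suggest` and the sample sentences are made up for illustration.

```python
from collections import Counter, defaultdict


def train_bigrams(corpus):
    """Count, for each word, which words follow it in the training sentences."""
    following = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for current, nxt in zip(words, words[1:]):
            following[current][nxt] += 1
    return following


def suggest(following, prefix, k=1):
    """Return up to k likely next words after the last word of the prefix."""
    words = prefix.lower().split()
    if not words or words[-1] not in following:
        return []  # nothing typed yet, or a word the model has never seen
    return [word for word, _ in following[words[-1]].most_common(k)]


# Hypothetical training data standing in for a user's past emails.
emails = [
    "thank you for your email",
    "thank you for your time",
    "thank you for your email today",
]
model = train_bigrams(emails)
print(suggest(model, "Thank you for your"))
```

Large language models replace the simple counts with learned parameters and look at much more context than one word, but the task is the same: given what has been written so far, rank what is likely to come next.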

Elon Musk and the future of Twitter

Alex: Next segment, Elon Musk bought Twitter. You used to work for Twitter. There's a lot of misinformation I'm sure you're identifying on Twitter and, uh, they're trying to figure out how to control it and balance it with free speech. Elon Musk is a big free speech guy. He’s probably going to let Trump back on. They have to try to control bots. There's a lot of things he could do. What advice would you give him as a way to make Twitter better and allow free speech and fight misinformation?

Mark: You know, I'm a big admirer of Elon Musk. I've met him, I admire what he's done. This guy will be considered to be the greatest entrepreneur in this generation and many generations. Even what he's done with Tesla. And I actually think, in some cases, he's got some good ideas about Twitter. I worked for Twitter and I can tell you, it was a pretty dysfunctional kind of place. It was not managed the way you would like, and yet at the same time, there were a lot of very good people working there. For me, looking at what's happened with Musk over the past couple of weeks, he's got some great ideas potentially about running it as a better business, introducing things like payments and tipping and subscription products, maybe some other elements that you can bring to the table. Better organization, more focused on product, all great ideas.

Here's the problem: He's outside of his swim lane when it comes to the big issues of content moderation. He's got a particular view on Donald Trump being de-platformed, and I think, as a matter of opinion, he's probably right that taking people permanently off Twitter is not a good idea. But when he starts to explain what he's going to do about content moderation, you suddenly see him losing his way. Like last night, for example, he came out and said de-platforming Trump was a really bad idea. Again, nothing wrong with that. Then he goes on to say that if there are tweets that are wrong and bad, we should make them invisible or maybe suspend them with a temporary suspension, but not a permanent ban. Everybody who knew anything about content moderation was tearing their hair out, because for years they've been realizing it's just really difficult. You know, one guy that I like is Alex Stamos [former Security Chief of Facebook], and he said that watching Musk last night trying to make these simplistic solutions is like watching a baby play with a blender from behind a plexiglass wall. He was coming up with these solutions on the hoof, and he hasn't looked under the hood. So at the end of the day, it comes to these high-profile decisions like Trump, content decisions in the United States that form a tiny proportion of the rules and of the moderation decisions being taken every single day. And for the rest of those decisions, there's lots of good reasons for them. So, at the end of the day, good content moderation is never going to be perfect, but good content moderation is the best thing for free speech.

Alex: So maybe Elon seems like he's a bit naive about some of these things now, but you sort of have trust that he will try to get to the bottom of the complexities of it?

Mark: Yeah. Like he met the other day [Thierry Breton, the European Commissioner], one of the people responsible for bringing new rules in here in Europe that will be making these platforms more accountable and more transparent. He said he agreed with everything he'd heard, which is a big surprise to me, because I thought he was on a collision course with Europe. So he's showing signs that he is educating himself really, really quickly. I think the key for him is to be listening to the smart people inside Twitter that have been doing this for quite a while and making really tough decisions in a tough environment. So learning a lot about what happens when you look outside the United States, when you get to a country like Kenya or Myanmar, or you look at a country where the language at the moment is not being protected. Nobody knows how people are fomenting hate. So you've got to really have a deeper understanding. And I think just like he saved SpaceX and he saved Tesla from constant threat in the early days, I would hope he brings the same kind of openness to problem-solving, and doesn't get stuck on the tiny proportion of content moderation cases that get such huge publicity, and thinks about the responsibility to the couple of hundred million people who call Twitter their daily home, or at least use it on a regular basis and get some service out of it. I think to defend free speech, you can't just say everybody can say something. You can’t say we’ll let Trump be on the platform. You have to make sure that people are protected from the bullies and the abusers and the spam armies. So there is a need for good content moderation.

Alex: Yeah. It's almost like we're going to meet Elon Musk like we meet our politicians when they run for office. He's put himself at the center of this important communication platform. Mark, so glad to have you. Hope to have you on soon. You can follow Mark at @Marklittlenews on Twitter, and also check out Kinzen.com. Until next week.
