
Clearview AI and the end of privacy, with author Kashmir Hill

Today, I’m talking to Kashmir Hill, a New York Times reporter whose new book, Your Face Belongs to Us: A Secretive Startup’s Quest to End Privacy as We Know It, chronicles the story of Clearview AI, a company that’s built some of the most sophisticated facial recognition and search technology that’s ever existed. As Kashmir reports, you simply plug a photo of someone into Clearview’s app, and it will find every photo of that person that’s ever been posted on the internet. It’s breathtaking and scary. 

Kashmir is a terrific reporter. At The Verge, we have been jealous of her work across Forbes, Gizmodo, and now, the Times for years. She's long been focused on covering privacy on the internet, which she'd be the first to describe as the dystopia beat because the amount of tracking that occurs all over our networks every day is almost impossible to fully understand or reckon with. But people get it when the systems start tracking faces — when that last bit of anonymity goes away. And it's remarkable that Big Tech companies like Google and Facebook have had the ability to track faces like this for years, but they haven't really done anything with it. It seems like that's a line that's too hard for a lot of people to cross.

But not everyone. Your Face Belongs to Us is the story of Clearview AI, a secretive startup that, until January 2020, was virtually unknown to the public, despite selling its state-of-the-art facial recognition system to cops and corporations. The company's co-founders, Hoan Ton-That and Richard Schwartz, are some of the most interesting and complex characters in tech, with direct connections to right-wing money and politics.

Clearview scraped billions of photos from the public internet, using everything from Venmo transactions to Flickr posts. With that data, it built a comprehensive database of faces and made it searchable. Clearview sees itself as the Google of facial recognition, reorganizing the internet around face search, and its primary customers have become police departments and now the Department of Homeland Security.

Kashmir was the journalist who broke the first story about Clearview’s existence, starting with a bombshell investigation report that blew the doors open on the company’s clandestine operations. Over the past few years, she’s been relentlessly reporting on Clearview’s growth, the privacy implications of facial recognition technology, and all of the cautionary tales that inevitably popped up, from wrongful arrests to billionaires using the technology for personal vendettas. The book is fantastic. If you’re a Decoder listener, you’re going to love it, and I highly recommend it.

Our conversation here hits on a lot of big-picture ideas: whether we as a society are just too nihilistic about privacy to make the difficult but necessary tradeoffs to regulate facial recognition; what kinds of policy and legal ideas we even need to protect our privacy and our faces; and what laws are even on the books right now. There's an Illinois biometric privacy law that comes up quite a bit in this conversation — and at the end, Kashmir tells us why she's actually hopeful that we're not going to live in a dystopian future. It's a great conversation, it's a great book. I loved it, and I think you're really going to like it.

Here is Kashmir Hill, author of Your Face Belongs to Us. Here we go.

Kashmir Hill, you are the author of Your Face Belongs to Us, a book about a startup called Clearview AI, and you’re also a tech reporter at The New York Times. Welcome to Decoder.

I am really excited to talk to you. I have followed your work for years and years. You have been on what some might call the privacy beat, what you call the dystopia beat. There's a deep relationship between those ideas in the context of technology, and it all comes to a head in this book, which is about a startup called Clearview. It was founded by a number of characters. There are a number of links to the alt-right, the whole thing. But fundamentally, what they do is scan faces and do facial recognition at scale, and there are just a lot of themes that collide in this book. It is kind of an adventure story. It's a lot of fun. Let's start at the very beginning. Describe Clearview AI and what they do and why they do it.

Clearview AI basically scraped billions of photos from the public internet. They now have 30 billion faces in their database collected from social media sites like Facebook, Instagram, LinkedIn, Venmo. They say that their app identifies people with something like 98.6 percent accuracy. And at the time I found out about them, they were secretly selling this kind of superpower to police, and no one knew about it.

That first step, we’re going to take a bunch of faces off the public internet… a lot of technology companies start by just taking stuff off the public internet. We are in a time right now that the context of everything is generative AI. There are a million lawsuits about whether you should be able to just freely scrape information off the public internet to train a generative AI system. That theme comes up over and over again, but there’s something in particular about faces and what Clearview AI did with faces that everyone reacts differently to. Why do you think that is?

I just think it’s so personal. Who we are is in our face. And this idea that anyone can snap a photo of us and suddenly know not just who we are and where we live and who our friends are, but dig up all these photos of us on the internet going back years and years. I think there’s just something inherently privacy-invasive about that that just is more resonant for people than cookies or tracking what websites you’ve been to. It’s really controlling your identity.

As you’ve been talking about the book, promoting the book, have you sensed that people respond to it differently when it’s faces? The reason I ask this is because you have done a lot of reporting about cookies, about advertising tracking, about all of these pretty invasive technologies that permeate the internet and, thus, modern life. It always feels pretty abstract. You have to start by explaining a lot of stuff to get to the problem when you’re talking about cookies on a website or advertising or something. When you start with faces, it seems immediately less abstract. Have people responded to the book or the ideas in it differently because it’s faces?

Well, one, just everyone gets the face, right? You don’t need to be a technology expert to understand why it might be invasive for somebody just to know who you are or find your face in places that you don’t want them to find it. I also think that it builds on all that privacy reporting I’ve been doing for years — all that online tracking, all those dossiers that have been created about us online, that we’ve created and that other people have created on us. 

The face is the key to accessing all that in the real world. All this online activity, the dossier, can now just be attached to your face as you’re moving, as you’re walking down the street, when you’re making a sensitive purchase at a pharmacy, when you’re trying to get into Madison Square Garden. All of a sudden, it’s like your Google footprint attached to your face.

Talk about Clearview AI itself, because the big companies have kind of had this capability for a while, and to their credit, they haven’t really done much with it. Google, inside of Google Photos, will do some face matching, but that’s not public as far as we know. Facebook can obviously do it, but they keep that inside of Facebook. Clearview is just like, “We’re doing it. We took a bunch of data, and we’re doing it. Now the cops can look at your face.” Why is this company different? How did it start?

I think this was really surprising to people — it’s something that’s in the book — that Google and Facebook both developed this ability internally and decided not to release it. And these are not companies that are traditionally that conservative when it comes to private information. Google is the company that sent cars all over the world to put pictures of our homes on the internet. 

What was different about Clearview AI is that they were a startup with nothing to lose and everything to gain by doing something radical, doing something that other companies weren’t willing to do. I put them in the same category of being a regulatory entrepreneur as an Uber or an Airbnb — that this was their differentiator. They said, “We’re going to make this database, and we’re going to reorganize the internet by face, and that’s our competitive advantage. And we want to make our database as big as we can before anyone else can catch up to us.”

Were they searching out the market of police departments and right-wing influencers, or did they start with that political bent from the beginning? Because that's a real theme of your book: from the start, there are a number of characters floating around this company who are not necessarily great characters to have behind a company, but the company seems to have welcomed them.

Yeah, so Clearview AI is really a strikingly small company, just a ragtag group of people, I think exemplified by the technical co-founder, Hoan Ton-That. This young guy, he grew up in Australia, obsessed with technology, obsessed with computers. [At] 19 years old, drops out of college and moves to San Francisco, and he’s just trying to make it in the tech gold rush. It was 2007. He becomes a Facebook developer, then he starts doing these silly iPhone games. And he makes an app called Trump Hair where you can put Donald Trump’s hair on people in your photos. Just throwing spaghetti at the wall to see what will stick. And he starts out kind of liberal. He moves to San Francisco, grows his hair long, plays guitar, hangs out with artists. And then he moves —

Yeah. [Laughs] And then he moves to New York and really falls in with this conservative group of individuals, people who had a lot of far-right interests. And [he] was able to build this radical technology because it's open source now; it's very accessible. Anyone with technical savvy and the money to store and collect these images can make something like this. And there was money around them. He met Peter Thiel at the Republican National Convention, and Peter Thiel ended up becoming the first investor in the company that became Clearview AI, giving them $200,000. Though they eventually ended up selling it to police departments, originally it was a product in search of a user, and they had all kinds of wild ideas about who might buy it.

Those ideas are really interesting to me. I can see a lot of ways that a consumer might want to search the internet by face, or retail stores, like you mentioned. You walk into a store, they want to know who you are, what you’ve bought before. There are a lot of markets. And somehow, they’ve ended up with the authorities, which is maybe the last market anybody wants. How did they end up with the cops?

So, they originally were trying to sell it to private businesses: hotels, grocery stores, commercial real estate buildings. They would also give it to investors and to people who owned those grocery stores and buildings. That's one of my favorite anecdotes about one of the first users of Clearview AI: this billionaire in New York, John Catsimatidis, who had the app on his phone and was thinking about putting it in his grocery stores to identify shoplifters, specifically Häagen-Dazs thieves. He ends up in an Italian restaurant in SoHo, and his daughter walks in with a man on her arm he doesn't recognize. So he asks a waiter to go over and take a photo of them, runs the guy's photo through Clearview AI, and figures out who he is: a San Francisco venture capitalist. And he approved.

But yeah, originally, they were just like, "Who will pay for this?" When it was getting vetted at one of these real estate buildings as a tool to use in the lobby to vet people coming in, the security director loved it and said, "You know who would really benefit from this? My old colleagues at the NYPD." And so that's how they got introduced to the New York Police Department. The NYPD loved it, and lots of officers there started secretly using it. It shocked me that police can just essentially get this unvetted tool from some random company, download it to their phones, and start using it in active investigations. But that's what happened. And Clearview gave them free trials. They told their friends at other departments. All of a sudden, the Department of Homeland Security is getting access to it, and officers around the world. And everyone's just really excited to have this new, very powerful tool that searches the whole internet looking for somebody.

There’s a big assumption baked in there. You’ve hit on it. It’s unvetted. You’ve used it, you’ve had it used on you. Does it work?

So I have never had access to Clearview AI myself. I’ve asked many times, “Hey, can I download the tool?” And they say it’s only for police departments, now at least. But Hoan Ton-That has run searches on me several times. I talked to him a lot for the book. For me, it was very powerful. It turned up 160 or so photos of me, from professional headshots that I knew about to photos I didn’t realize were online. A photo of me with a source I’d been talking to for a story. I remember this one photo of somebody, and there’s a person walking by in the background. And when I first looked, I didn’t see me. Then I recognized the coat of the person in profile walking by in the background. It’s a coat I bought in Tokyo, very distinctive. And I was like, “Wow, that’s me.” I couldn’t even recognize myself. I’ve seen searches that law enforcement has done. It really is quite powerful. I think facial recognition technology has advanced in ways that most people don’t realize.

And is it powerful at the average level of facial recognition technology? Is Clearview more powerful than the average piece of technology? Where does it land on that scale?

At the time that I first heard about them — and in the first few years they were working for law enforcement — they hadn't been vetted. No one had tested their algorithm for accuracy in a rigorous way. But there is a federal lab called NIST, the National Institute of Standards and Technology, that runs something called the [Face Recognition Technology Evaluation], where they test all these algorithms. And Clearview, the first time they did the test, came out really high on the scale. They actually do have quite a powerful algorithm, really one of the best in the world. And I think, at the time, it was the top-rated algorithm from an American company. So, they do have a good algorithm.

And you said it’s open source, and it’s a ragtag team. How are they outdoing everyone else? What’s the secret to their success here?

It’s not completely open source. Hoan Ton-That was not a biometric kind of genius. He didn’t have any experience specifically with facial recognition technology. His introduction to it was through academic papers and research that was being shared online. But he did recruit someone who had some more experience with machine learning and neural net technology. And he said they fine-tuned the algorithm. They trained it on lots of faces collected from the internet. So clearly, they’re doing something right there. But it started with… I mean, he started from zero. He went from Trump Hair to this radical app with 30 billion faces. It’s quite a story.

That database of faces is really interesting to me because it doesn’t belong to them. They’ve scraped it from social media sites. They’ve scraped it from the public internet. They’re looking for photos of you; they find them. They clearly have not taken those photos of you. Someone else has taken those photos of you. How is it that they remain in possession of this dataset now that the company is public and everyone knows that they scraped all of this information?

Several years ago, some of the companies whose data they had scraped, whose users' data they had scraped — Facebook, Google, Venmo, LinkedIn — sent Clearview cease-and —

Venmo was actually one of the very first sites they scraped, which was interesting to me because Venmo has gotten a lot of scrutiny from privacy activists who said that it was very bad for users that Venmo makes everybody public by default — that all your transactions are public by default unless you change your privacy settings. Privacy activists have been criticizing them for years and years and years. And Hoan told me, “Yeah, that was great for me because on the Venmo.com website, they actually were showing real-time transactions, public transactions between users, and it would update every few seconds. It had photos of the users and links to their profile page.” And so he developed a scraper that just hit that site every few seconds, and it was like a slot machine where he just pulls it and faces come spilling out. So yeah, Venmo was in there.
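For illustration, a polling scraper of the kind described there takes only a few lines. This sketch is hypothetical: the endpoint URL and JSON field names are invented, since Venmo's real-time public feed is long gone.

```python
# Hypothetical sketch of a feed-polling scraper, as described above.
# The URL and field names are invented for illustration.
import time
import requests

FEED_URL = "https://example.com/public-feed.json"  # assumed endpoint
seen_ids = set()

while True:
    for item in requests.get(FEED_URL, timeout=10).json():
        if item["id"] in seen_ids:
            continue  # already captured on an earlier pull
        seen_ids.add(item["id"])
        # Record the user photo and profile link for later processing.
        print(item["photo_url"], item["profile_url"])
    time.sleep(5)  # pull the lever again a few seconds later
```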

These companies sent Clearview AI cease-and-desist letters. [They] said, “Hey, you’re violating our terms of service. You’re not supposed to do this. We see this as a violation of contractual law, the Computer Fraud and Abuse Act.” Then, that was it. No one sued Clearview AI. No one forced the company to delete the photos. As far as I know, Clearview AI still has them and is still collecting —

Why has no one sued them? This is bonkers to me.

I’ve never really gotten a satisfactory answer to this, honestly. I think part of it is that it’s a bit of a legal gray area, whether it’s illegal to scrape or not. And there are a lot of digital rights groups who want us to have the ability to scrape, to make it easier to collect information that’s on the public internet. There’s at least one federal court ruling in the case between LinkedIn and hiQ, the startup that was scraping information from LinkedIn to basically let employers know if any of their employees were thinking about leaving. The finding in that case was that scraping was legal. And so I think part of it is that these companies don’t think they’d win if they sued. And then, I don’t know. Maybe they just don’t want to bring more attention to the fact that the horse is already out of the barn, that Clearview already has all of their users’ photos.

Or they’re taking advantage of the gray area, too. That’s the thing that just leaps out to me, is Google is training all of its AI systems on the public internet, and so is Amazon, and so is Facebook, and so is OpenAI. And if you go chase down Clearview AI, you might cut yourself off. But then, on the flip side, there’s a bunch of users. They’re our photos. They’re not the platform’s photos. If I upload photos to Facebook, Facebook is very clear like, “These are still your photos. We’ve signed some license with you, or you’ve not read a license and clicked ‘I accept,’ more likely, that says we can use them.” But they’re still my photos. Why haven’t any users gone after Clearview AI?

Clearview has been sued in a few states where there’s a relevant law. There’s a lawsuit in California. The Vermont attorney general sued them. And basically, a whole bunch of litigation got consolidated in Illinois because Illinois is one of the few states that has a really strong law directly applicable to what Clearview AI did, called the Biometric Information Privacy Act, or BIPA. I tell the history of it in the book. It’s a bit of a fluke of history that it was passed, but it’s the rare law that moved faster than the technology. And so, yeah, they’re fighting. They’re trying to say, “Hey, you violated our privacy. You violated this law. Get us out of your databases.” The law moves very slowly, as anybody who’s ever watched a lawsuit happen [knows], and so those kinds of suits have been going on for years now.

The thing that really broke this company into the mainstream and made people pay attention to it is your reporting. The cops are using it, people were using it, these characters on the right wing were using it. But the company sought no publicity. It didn’t want to be known. And you started reporting on it. They still tried to hide. And then something happened, and Hoan Ton-That started talking to you and honestly started being proud of his company in a very different way, publicly proud of what they were doing. What was the change there? What happened?

Yeah, when I first started reporting on Clearview AI, they very much wanted to stay in the shadows. And they actually were not talking to me but tracking me. They put an alert on my face so that when law enforcement officers who I was talking to uploaded a photo of me to show me what the results were like, the company would get an alert, and they would reach out to the officers and tell them, “Stop talking to her.” They even deactivated one of the officers’ accounts. That was a bit creepy for me.

But, at some point, they changed their mind, and they hired a crisis communications consultant, basically an expert that you go to when you’re having a PR disaster. And they went with this woman who… She’s a political person. She was who Eliot Spitzer called when he was having his troubles. And I think she told them, “Look, you can’t stop her. She’s going to do this story. And we need to go on the offensive here. We need to defend what you’ve built and try to make sure that your company stays in existence and can keep doing business.” Because it looked pretty bad when I first started looking into them. Their efforts to hide themselves while exposing so much about millions of people were not a good look.

So when the tone changed and they hired a crisis person, they started engaging with you in the reporting. What was the pitch for why this was a good thing to build? I can come up with hypothetical reasons why some hypothetical facial recognition system is good to build, but here you’ve got a real one. You’ve got actual cops who are using it. You’ve got a bunch of obvious downstream bad things that are happening. What was their pitch for why it was good?

What Hoan Ton-That says, where he has landed on facial recognition technology, is that what the company is selling — this power for police officers to identify criminal suspects, to solve crimes — is the best possible use of facial recognition technology. That they are making the world safer, more secure. It’s being used to rescue children. I remember this line from that first interview I had with him, where he said, “They’re using facial recognition technology to find and arrest pedophiles; it’s not getting used by pedophiles.” And so this is what they really lean into — that this is a technology that is making the world safer, and they’re restricting its use to law enforcement, so this is good, and society should embrace it.

So this runs right into the tradeoffs of all technology that is used by law enforcement. It seems like they are a battering ram of rhetoric when it comes to why law enforcement is using it. Like you say, the pitch is, “We’re catching pedophiles, and thus no more questions should be asked.” Whenever I hear that, the red flags go off for me. You’re trying to prevent me from asking questions about the Fourth and Fifth Amendments. You’re trying to prevent me from asking questions about privacy by making them seem morally wrong to ask.

But there’s a part of me that says, “Look, the technology definitely has an error rate. I don’t know what the cops are doing. I can’t audit their use of it. When they do rely on technology like this, history and statistics suggest that they will have a disproportionate impact on marginalized communities.” Has Clearview addressed any of this, or are they just saying the classic tech company line of, “This is a tool, and tools are neutral and it depends on who uses it and why”?

Clearview definitely pushes that onus onto police departments, saying, “We’re just providing the technology for them to use. They should never arrest somebody based on a Clearview match alone; they need to do more investigating.” I think, for us as a society, there’s just a lot to evaluate here. I’ve talked to a lot of officers who, yeah, they’ve solved crimes with Clearview AI as a starting point. And horrific things — abuse of children. But I think we need to ask ourselves: are we comfortable with this database of probably hundreds of millions of people, probably you and me? Should we all be in the lineup every time the police are trying to solve a crime, whether it’s shoplifting or murder? And if they are going to use facial recognition technology, what are the rules? Do you need to get a warrant to search a database like this? Should every officer just have this on their phone and use it whenever they want? What do you do after you get a match? What kinds of crimes should you use it for?

Even if we just accept that it’s a useful tool, there are still so many conversations we have to have. I know of at least one person who has been misidentified as a criminal suspect because of Clearview AI. He lived in Georgia; the crime was basically a purse theft in Louisiana, and he was the hit. He got arrested the day after Thanksgiving and put in jail for a week, awaiting extradition, and he had to hire lawyers in Louisiana to clear all this up. It can be really damaging when it goes wrong or if the police trust the face match too much — not to mention what happens if it starts getting rolled out more widely. We look at China as an example of that. What if we start having a technology like this running all the time on all the cameras, tracking us everywhere we go? It could be used in chilling ways against protestors or to gather damning information about a political opponent. It’s such a range that I really think we need to think hard about this and not just let it slip in and become ubiquitous or normalized without setting up some guardrails.

So I can already hear the responses from some of our listeners who think you can’t put the genie back in the bottle ever, and your privacy is already gone. Just by holding a smartphone, your privacy is already gone. And what is the difference between having your face out there versus your already gigantic digital fingerprint? Is the genie just out of the bottle? It feels like we might be in a liminal moment where there is a law in Illinois, and maybe there should be a federal law. Or maybe we should just say “stop” in some way. Just scream out the windows, “Please stop.” But there’s a chance that it’s already over, and a generation of young Americans in particular just believes that the cameras are everywhere, everything is on the internet, the cops can look at them, and that’s going to be that.

I am not a privacy nihilist. If I were, I probably wouldn’t be on the beat because what’s the point? 

I do think that we can change course, and I do think that we can restrain technologies through norms and through policies and regulations and laws. We could live in a world where there were speed cameras on every road and jaywalking cameras everywhere, and if you sped or if you jaywalked, you would immediately get a ticket. But I don’t think any of us want to live in that world. And so, even though that’s possible, it doesn’t exist. Jay Stanley at the ACLU gave me this great example of a time that we’ve restricted technology, and that’s last century, when there were all these tiny bugs and recording devices that were starting to get manufactured. If you’ve heard the Nixon White House tapes, then you’ve benefited from that technology. People at the time were freaking out that they were just going to be recorded all the time, that you could no longer have a private conversation, that there were just these bugs everywhere.

And we passed laws to make eavesdropping illegal, to restrain the ability to record conversations. And it’s the reason why all of these surveillance cameras that just are everywhere in public space now are only recording our images and not our conversations. I don’t think we just need to accept that we’re going to live in this dystopian world because technology makes it possible. I think that we can choose the world that we live in. I hope that we won’t just have ubiquitous facial recognition all the time. Because I think it would be so chilling to not be able to gossip at dinner without the worry that a person next to you is going to identify you with an app on their phone and blast out what you’re talking about on Twitter, or X, as we call it these days.

Put that into practice for me. I’ve read a lot of your reporting. A lot of your reporting is about how the Big Tech companies build these ubiquitous surveillance networks, mostly to put advertising in front of us. At the end of it all, they’re just trying to sell us some paper towels, and faster than ever before. And there are billions of dollars in between me and the paper towels. But that’s what it’s for. It’s very targeted advertising. And there’s some debate about whether it’s even effective, which I think is very funny, but that’s what it’s largely for. And I go out, I see my family, I listen to our readers, and they’re like, “Facebook is listening to us on our iPhones.” And they won’t believe me that it’s probably not happening, that there’s this other very complicated multibillion-dollar mechanism that just makes it seem like Facebook is listening.

It would be very illegal.

But they’ve just given up, right?

It’d be very illegal if they were.

It would be illegal, and also it would be harder. It feels like it would be much harder to light up your microphone all the time and listen to you than just assemble the digital fingerprint that they’ve managed to assemble and show you the ads for a vacation that your friend was talking about. You can explain it, but then people just fall back on, “Well, Facebook is just listening to me on my phone, and I still have a phone and it’s fine.” And that’s the nihilism, right? That’s where the nihilism comes into play, where even when people assume that one of the most invasive things that can happen is happening, they’re like, “But my phone’s so useful. I definitely need to keep letting Facebook listen to me.”

Yeah, I’m still going to take it with me to the bathroom.

Right. You ask somebody if they would put a camera in the bathroom, and they’re like, “No.” And you’re like, “Well, you carry seven of them in there all the time.” But of course, you have to have your phone in your bathroom. 

Do you see that changing at the policy level? Okay, now here’s a set of technologies that is even more invasive or can do this tracking that we don’t think we should do, or could get a politician into trouble like it did with Nixon, or X, Y, and Z bad thing could happen, we should probably restrict it before it gets widespread. Or is the nihilism, the cultural nihilism around privacy, still the dominant mode?

I feel like we’re at the tipping point right now of deciding: are we going to continue having anonymity in our everyday life, in our public spaces, or not? I hope we go the way of yes. And with lawmakers, oftentimes it doesn’t become real until it’s personal for them, until they see how it could get used against them. I think about that crazy recording from the Beetlejuice show: you’re fondling your boyfriend and getting fondled, and you kind of think you’re anonymous.

I wasn’t sure where that was going to go. I thought you were going to talk about the actual movie Beetlejuice and not Lauren Boebert, but yeah, I’m glad we got there.

I think that’s the first time anyone said fondle on Decoder, I want to be clear.

You think you’re in a crowd and you’re anonymous, and you don’t realize they have these night vision cameras at the show staring down at you capturing everything that’s happening. I think if we have more moments like that that affect lawmakers where, yeah, they thought they were in this private space. They didn’t think that it was being taped, that it would be tied back to them. I just think, all of us, even if you think, “Oh, I’m fine, I’d be fine with people knowing who I am,” there are moments in your day where you’re doing things that you just wouldn’t want easily known by strangers around you, or a company, or government. I just think that that is true.

And we have seen this get restricted in other places. Europe investigated. All the privacy regulators in Europe and Canada and Australia looked at what Clearview did, and they said, “This breaks our privacy laws. You’re not allowed to collect people’s sensitive information, their biometric face print, this way and do what you’re doing.” And they kicked Clearview AI out of their countries.

Is Clearview still collecting the data? Are they still scraping the internet every single day, or is the database fixed?

So, when I first wrote about them in January 2020, they had 3 billion faces. And today, they probably have more, but last I heard, they had 30 billion faces. So they are continuing to grow their database.

Do we know what the sources are of that growth? Is it still the public internet, or are they signing deals? How’s that working?

Unfortunately, they’re not a government actor. They’re a private company, so I can’t send them a public records request and find out what all their sources are. So, I mostly see it when I see an example of a search, whether they run it on me or I see it show up in a police investigation. But yeah, the net seems pretty wide — news sites, employer sites. They seem to be pretty good at targeting places that are likely to have faces. In one of my last meetings with Hoan Ton-That, before I was done with the book, they had just crawled Flickr. He himself was finding all these photos of himself when he was a kid, like a baby coder in Australia. He said, “It’s a time machine. We invented it.” And he did a search on me, and it showed photos I didn’t know were on Flickr that one of my sister’s friends took. It was me at a point in my life when I was depressed and heavier than I am now. I don’t put photos from that time on the internet, but there they were. Clearview had them.

We have a joke on The Verge staff that the only functional regulation on the internet is copyright law. If you want something to come down off the internet, your fastest way to do it is to file a DMCA request. I’m shocked that a bunch of Flickr users haven’t done this with Clearview. I’m shocked that someone else has not realized, “Okay, this company boosted my photos.” And Getty Images, whose CEO we just had on Decoder: I’m shocked that they haven’t done this. Is it just that the company is still in the shadows, or have they actually developed a defense? At this point, given the nature of copyright lawsuits on the internet, it’s out of the norm that there isn’t one.

Yeah, I’m not a lawyer. I just played one when I was a baby blogger at Above the Law.

What Clearview often argues is that they are very comparable to Google, and they say, “These are not our images. We’re not claiming ownership over these images; we are just making them searchable in the same way that Google makes things searchable.” And when you do a search in Clearview AI, all it shows you is the little face, and you have to click a link to see where the full image is on the internet and where it came from. I have talked to officers who have found deleted photos with Clearview AI, so it makes me think that they are in fact storing the images. But yeah, I haven’t seen somebody make that argument against them yet.

So it’s interesting. Someone did once upon a time make that argument against Google, and there is that case. We’re already in the Boebert zone, so I’ll say it was Perfect 10 v. Google. Perfect 10 was a soft-core porn magazine, I think, and Google was doing Google Images, and they were taking the thumbnails. A lot of the law of the internet is like this. It’s the way it is.

There is Google Images, there is reverse-image search on Google. Do you see a difference in those two things? I’m confident that I could put my face in the Google Image reverse search, and it would spit out some answers that look like me or are me. Is there a meaningful difference here?

Clearview AI is, in so many ways, building on the technology that came before it. They ended up hiring Floyd Abrams as their lawyer, the preeminent First Amendment lawyer who worked for The New York Times to defend the newspaper’s right to publish the Pentagon Papers. And he was specifically citing precedent from Google cases to support what Clearview AI was doing: that they’re a search engine, and instead of searching for names, they’re searching for faces. That hasn’t been completely successful for them in the courts. Judges have said, “Okay, fine. You have the right to search images and look at what’s out on the internet, but you don’t have the right to create this biometric identifier for people. That is a step too far.”

But in so many ways, they’re building on what came before — from all these technology companies encouraging us to put our photos online, put our faces online next to our names, to the actual technologies and algorithms that engineers at universities and at these companies developed and then made available to them. So yeah, they’re building on what came before. I don’t think that necessarily means that we do have to keep letting them do what they’re doing. But so far, we have in the US, in much of the US.

So you mentioned the courts. There was a case in Illinois, the ACLU sued Clearview for violating the Illinois biometrics law that you mentioned. They settled, and part of that settlement was Clearview agreeing to only sell the product to law enforcement and no one else. That seems like an awfully gigantic concession: we will have no customers except the cops. How did they get there? How did that affect their business?

It was funny because both sides presented the settlement as a win. The ACLU said, “We filed the suit because we wanted to prove that this Illinois law, BIPA, works,” and Clearview AI did try to say that it’s unconstitutional, that it was a violation of their First Amendment right to search the internet and access public information. That didn’t work. They had to settle. 

So the ACLU said, “Hey, we proved that BIPA works. Other states need BIPA. We need BIPA at the federal level.” Meanwhile, Clearview agreed in the settlement to restrict the sale of its database to the government and law enforcement only. And so the ACLU said, “Hey, we won, because now this huge database of billions of faces won’t be sold to companies, won’t be sold to individuals.” But Clearview said, “Hey, this is a win for us. We’re going to continue doing what we’re doing: selling our tool to the police.”

And they do still have lots of contracts with police departments. They have a contract with the Department of Homeland Security, the FBI, widely used by the government. But it was important in that, yeah, it means they can’t sell it to private companies. So that cuts off one line of business for them.

Does that limit the size of their business? Are their investors happy about this? Are they sad about this? Is Peter Thiel mad that the company isn’t going to go public or whatever?

So one of the investors that I’ve talked to a few times is David Scalzo. He was a venture capitalist out here on the East Coast. He was so excited about investing in Clearview AI because he told me they weren’t just going to sell this to police — they were going to sell this to companies; they were going to sell this to individuals. He said, “Everyone in America is going to have the Clearview AI app on their phone. The moms of America are going to use this to protect their children.” And he thought he was going to make a ton of money off of Clearview AI. He said, “It’s going to be the new Google. The way you talk about Googling someone, you’re going to talk about Clearviewing their face.” And so he has been frustrated by the company agreeing to tie its hands, just selling it to police, because he says, “I didn’t want to invest in a government contractor.” And yeah, there is a question about the future of Clearview.

When I think of not lucrative businesses, I think of government contractors.

No government contractor has ever made a killing.

So yeah, he’s not happy about it. And Clearview sells its technology for very cheap compared to other government contractors.

Yeah. When I first started looking into them, I was getting these government contracts showing up in public records requests. In some cases, they were charging police like $2,000 a year for access to the tool — one subscription for $2,000. Now, their most recent contract, signed with the Department of Homeland Security, is close to $800,000 for the year. So, either they’ve got a lot of users —

It still seems cheap, right? But either they have a lot of users —

Take DHS for all they’re worth.

Either they have a lot of users, or DHS is like, “We’re going to pay you a lot because we want to make sure that you stay in business.”

Yeah, that’s the part that I’m really curious about. Is there competition here? Is Raytheon trying to build a system like this? If you see a market, particularly a lucrative government contracting market, it seems like the big companies should be racing in to build competitive products or more expensive products or better products. Is that happening, or are they in a market of one?

There are copycats. There’s this public face search engine that anyone can use called PimEyes. It doesn’t have as large a database, and it doesn’t turn up as many photos, but it is out there. I haven’t heard about anyone else doing exactly what Clearview is doing and selling it to police. Most other companies just sell a facial recognition algorithm, and the customer has to supply the database of faces. So that does set Clearview apart.

I wonder how it’s going to affect other businesses, just the reaction to Clearview. It has been such a controversial company. It has run into so many headwinds, and it’s unclear at this point how expensive this is going to be. They’ve had fines levied by European privacy regulators that they have so far not paid, and this Illinois law is very expensive to break. It’s $5,000 per person whose face print you use. It cost Facebook $650 million to settle a lawsuit over BIPA for automatically recognizing faces to tag friends in photos. It could break the company. Clearview has only raised something like $30 million. So yeah, I keep waiting to see what’s going to happen financially for them.

It would be incredible if the Department of Homeland Security is funding a bunch of fines to the Illinois government to keep this company afloat. But that’s the cycle we’re in. The revenue is going to come from law enforcement agencies to pay fines to a state government instead of there being any sort of federal law or cohesive regulatory system. Is any change there on the horizon that there might be a federal facial recognition law or more states might look at, quite frankly, the revenue that Illinois is going to gain from this and pass their own laws? Or is it still status quo?

It’s strange to me because I hear so often from lawmakers that privacy is a bipartisan issue, that everyone’s on board, that no one likes the idea of —

I’m not doing anything.

Yeah, they don’t do anything. And I chart it in the book: strange political bedfellows coming together again and again to talk about facial recognition technology and its harms to civil liberties. Most recently, there was a hearing led by John Lewis — the civil rights leader, who has since passed away, was leading the impeachment investigation into Trump — and he partnered with Jim Jordan and Mark Meadows, huge Trump supporters in Congress. They had this hearing about facial recognition technology, and they said it. They said, “There’s not much we agree on here, but this is a topic that unites us. We all believe we need to protect citizens from invasions of their privacy.” And then nothing happens.

It’s just so gridlocked at the national level that I don’t have a lot of hope for something coming from there. But we have seen a lot of activity on this at the local level and at the state level from BIPA — and maybe other states would pass something like that — to just state privacy laws that give you the right to access information that a company holds on you and delete it. So if you live in California or Connecticut or Virginia or Colorado, you can go to Clearview and say, “Hey, I want to see my results.” And if you don’t like being in their database, you can say, “Delete me from your database.”

Do you think enough people know that they can do that? If I lived in one of those states, I would be doing that every week and just being like, “Who knows about me? Delete it.” There should be a secondary economy of companies just offering that service to people. There already are, in some cases. There is DeleteMe that just deletes you from various things. Is that the solution here, that there’s just a market for privacy, and you can be on one side of it or the other?

California, actually as part of its law, has this requirement that a company has to disclose how many times people use this right against them. And so I was looking at Clearview’s privacy page to find out. California has millions and millions and millions of people, and Clearview, last year I think, got something like 451 requests for deletion there, which seems quite tiny. I would think it would be higher than that.

Yeah. That’s just tech reporters. That’s just people seeing if they can do it.

Yeah, mostly it’s probably tech reporters and privacy academics and students who are doing it as their homework for some class.

Legislative aides making sure the law is being complied with.

Is it just that people don’t know, and there needs to be a bunch of education? Or is it that, eventually, people will realize, “This is happening, and I should go and proactively try to stop it”? What keeps people from wanting to protect their privacy?

I just think people don’t anticipate the harms. That’s what’s so hard about privacy: you don’t realize what information that’s out there is going to hurt you until it happens. Until you do get wrongfully arrested for a crime because a police officer made a mistake and identified you with Clearview. It’s hard to see it coming. You don’t realize until after it’s happened.

There’s the flip side of this. It’s where we started. The big companies have had the opportunity to do it for a long time. This is a very processor-intensive task. They’re running these high-end machine learning algorithms. You need all this stuff. Amazon could do it, Google can do it, Facebook can do it. Apple could do it if they wanted to. But they don’t. They have stopped themselves, and they haven’t even stopped themselves in the way they usually stop themselves. They’re not saying, “Hey, you should pass a law, or we’re definitely going to do this,” which is what they’re effectively doing with AI right now. They’re just not doing it. 

I can’t recall another time when all of those companies have just not done something, and they’ve allowed one startup to go take all the heat. Is there a reason for that? Is there just an ineffable morality inside of all these companies that’s keeping them from doing it? Or is there a reason?

I think facial recognition technology is more controversial. There’s just something that is specifically toxic about it. I do think there’s worry. I think there’s worry about legality. Illinois has this law around use of face prints. So does Texas.

Is it really just Illinois that’s keeping everyone from doing it?

I remember a few years ago when Google had that Art Selfie app. Do you remember that? You could take your photo, and it would tell you what masterpiece you look like. And it didn’t work in Illinois, and it didn’t work in Texas. They geofenced them off because it is a really expensive law to break. So I think that’s part of it. 

They have introduced this technology in some ways. Like, when I go on my iPhone, I can search all my photos by face and see them all. That’s a convenient tool, and I think their users like it. Maybe it’s just that we, as a society, aren’t asking for the ability to recognize everybody at a cocktail party. Andrew Bosworth at Meta talked a few years ago about how he would love to give us facial recognition capabilities in glasses, and how it would be great at a cocktail party to put a name to a face, or how blind users or face-blind people could use it. But he’s worried: maybe society doesn’t want this. Maybe it’s illegal.

No, so I think this is the killer app for these glasses. I would wear the headset all day. You could put me in one of their silly VR headsets all day long if I could do faces and names. I’m horrible at faces and names. I would probably be history’s greatest politician if I could just remember people’s names. I believe this about myself because it’s how I excuse the fact that I’m really bad at faces and names. That’s the killer app. You wear the glasses, they’re expensive, whatever, but it can just tell you who other people are. I know that people would buy that product without a second’s hesitation. The societal cost of that product seems like it’s too high. I don’t know how to build that product in a privacy-sensitive way. And no one I’ve ever interviewed on this show has ever offered me a solution.

But the market wants that product, right?

The version of this that I imagine could be possible would be like in the way that we set the privacy of our Facebook profiles or Instagram pages, we say, “This is public,” or, “This is visible only to friends,” or, “Friends of friends can see the content.” I could imagine a version of Meta’s augmented reality glasses where you could set the privacy of your face and say, “Okay, I’m willing to opt in to facial recognition technology, and I want my face to be public. I want anybody who’s wearing these glasses to know who I am.” Or, “You know my social graph. I want to be recognizable by people I’m connected to on Facebook or Instagram or Threads.” Or, “I want to be recognizable to friends of friends.” 

I could imagine that world in which we have the ability to say how recognizable we want our friends to be because the technology is offered by a company that knows our social graph. I just wonder, if that happens, how many people opt in to that? And then, do you get completely stigmatized if you’re a person who says, “I want to be private all the time”?
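As a thought experiment, the opt-in model sketched here is easy to state in code. What follows is a hypothetical illustration of the policy check; none of it corresponds to any real Meta product or API, and the names and types are invented.

```python
# Hypothetical sketch of the opt-in recognizability policy described above.
# None of this corresponds to a real product or API.
from dataclasses import dataclass
from enum import Enum

class FaceVisibility(Enum):
    PRIVATE = 0             # never recognizable (the default)
    FRIENDS = 1             # recognizable to direct connections
    FRIENDS_OF_FRIENDS = 2
    PUBLIC = 3              # recognizable to anyone wearing the glasses

@dataclass(frozen=True)
class User:
    name: str
    face_visibility: FaceVisibility = FaceVisibility.PRIVATE

def may_identify(viewer: User, subject: User, friends_of: dict) -> bool:
    """Return True if `viewer` may resolve `subject`'s face, given the
    social graph `friends_of` (a map from user to set of friends)."""
    setting = subject.face_visibility
    if setting is FaceVisibility.PUBLIC:
        return True
    if setting is FaceVisibility.FRIENDS:
        return viewer in friends_of.get(subject, set())
    if setting is FaceVisibility.FRIENDS_OF_FRIENDS:
        friends = friends_of.get(subject, set())
        return viewer in friends or any(
            viewer in friends_of.get(f, set()) for f in friends
        )
    return False  # PRIVATE: the face is never matched

# Example: Alice opts in for friends only; a stranger gets nothing.
alice = User("alice", FaceVisibility.FRIENDS)
bob, eve = User("bob"), User("eve")
graph = {alice: {bob}}
print(may_identify(bob, alice, graph))   # True: they're connected
print(may_identify(eve, alice, graph))   # False: stranger
```

The interesting design question is the one raised next: even if the check itself is trivial, what defaults the platform picks, and whether opting out carries a stigma, matters far more than the code.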

It feels like eating too much sugar or something. There’s something happening here where, of course, I want everybody at the party to know who I am and what my interests are so they can come talk to me. But 10 years down the line, I’m sitting in a jail for a week waiting for my lawyer to tell the cops, “That wasn’t me.” Those are so disconnected in time and harm that I’m just not sure how to communicate that to people.

Right. Or you set your face to public because you’re like, “This is great for advertising my business.” But then you’re out at a bar with your sidepiece and you forget that your face is public, and now you are in trouble. [Laughs] It’s just hard to anticipate the harms. Sometimes the benefits are more obvious and sometimes the harms are more obvious. Maybe with facial recognition technology, these companies haven’t released it because they do see the harms more clearly than the benefits.

That is one of the first times anyone has ever claimed that tech companies see the harms more clearly than the benefits.

Yeah, I’m not certain about that.

That I can recall on the show, actually. Even the executives from the tech companies.

So let’s talk about where this goes. We’ve established that Clearview is a pretty singular company. They’ve built a technology that other people could have built, but for various reasons — most notably the governments of Europe and Illinois, two governments that you often think of together — other people aren’t in this market. But the cops really like this technology. Dads looking at their daughters on dates in restaurants appear to really like this technology. There’s a market for it; there’s a demand for it. The harms are pretty hard to explain to people. Is this going to keep happening? Are there going to be more state-level laws or European Union laws? Is everyone just waiting to see what happens with Clearview? What does Clearview think is going to happen?

I think Clearview wants to keep selling this to law enforcement, and they are. I think that the question we need to ask ourselves right now is: how widely deployed do we want this to be? And it’s a question at the government level. Do we want police only using this to solve crimes that have already happened? Or do we want to roll out facial recognition technology on cameras around the country so that you can get real-time alerts when there is a fugitive on the loose? I was thinking about this when that guy escaped in Pennsylvania, and it just felt like we were looking for him forever. And I can imagine, in a case like that, them saying, “If we just put facial recognition on all the cameras, then we could find him in an instant.” So yeah, that question of: do we deploy it more widely? Do we all have an app like this on our phones? Or do we set more rules, where we control whether we’re in these databases, where we control when this is used for our benefit versus on us?

And there are so many questions there because, if we do roll it out more widely, it’s just going to be used against some people more than others. We’re already seeing it in the police use. We know of a handful of wrongful arrests where people have been arrested, put in jail for the crime of looking like someone else. And in every case, it’s involved a person who is Black. So already, we’re seeing when it goes wrong. It’s going wrong for people who are Black. Facial recognition technology is being used more on them. We need to make some decisions right now of what we want the world to look like and whether we want our faces tracked all the time. I hope the answer is no. I hope that doesn’t happen because I do think we need zones of privacy. I don’t want to live in a panopticon.

We’re already seeing a bunch of private uses of this, maybe not the panopticon version, but the “Hey, the sports stadium has facial recognition technology to track the person on their way out the door.” Madison Square Garden famously is tracking lawyers from law firms that are suing the Dolan family. That’s happening. Is that going to keep happening? Do some of these laws affect that, too? Or are we going to have little zones of privacy and little zones of not privacy?

Yeah, so Madison Square Garden installed facial recognition, as many shops now have done. Like grocery stores use this to keep out shoplifters, and Madison Square Garden was saying, “We want to keep out stalkers during concerts. We want to keep out people who’ve been violent in the stadium before.” And then, in the last year, they started using it to ban lawyers who worked at firms that had sued Madison Square Garden because the owner, James Dolan, didn’t like them and how much money they cost him. But Madison Square Garden has done this for all their properties in New York — Beacon Theatre, Radio City Music Hall — but they have a theater in Chicago, and they can’t do that there because Illinois has this law. You can’t use lawyers’ face prints without their permission.

So again, laws work, and we could pass more of them if we want to. But yeah, companies are definitely rolling out facial recognition technology on us to deter crime, and then as a service. I do see this in a lot of arenas now: to go through the concession line faster, you just pay with your face for your Coke. That’s part of the normalization of the technology, and I think that’s fine. If you’re comfortable with that and it makes your life easier, that’s great. But I think we should have limits on it so that they can’t just start building some crazy face database and using it for something else. I really think we need to put limits on the technology to protect us.

Well, if I’ve learned anything, it’s that I need to move back home to Chicago.

That’s my takeaway from this episode of Decoder. I left there a long time ago, but maybe it’s time to go back. Kash, I am such a huge fan of your work. I love the book. I think it’s out now. People should go read it. Tell them where they can buy it.

They can buy it anywhere. Amazon, if you’re into the tech giants. You can get it at Barnes & Noble, at Bookshop.

I just like making people say they can buy it at Amazon. That’s just a troll I do at the end of every episode. This has been great. I really recommend the book.

I like Bookshop.org because it supports your independent bookstore, which is great.

Thank you so much for being on Decoder, Kash. This was great.

Thank you so much. It was great to be here.
