AI and the Search for Truth: Kylan Rutherford on Data Voids, Distrust, and the Pathways to Better Information

Kylan Rutherford is a Siegel Research Fellow at New York University’s Center for Social Media and Politics, where he studies the complex relationship between social media content creators, platform consumers, and the platforms themselves. We sat down with Kylan to learn more about his research on how AI is affecting search, misinformation, and data voids.

What are you working on as a Siegel Research Postdoctoral Fellow at NYU’s Center for Social Media and Politics?

Our research projects look at how tech and social media interact with our lives—especially how we perceive and how we participate in politics and political discussions. One of the projects we’re working on tries to understand how politics shows up on TikTok. We’re also looking at podcasts that aren’t inherently political, but that still get very political around an election or around a big issue. We’re looking at how those spaces break along the partisan divide. Finally, we’re thinking about how individuals are increasingly using AI and how that affects politics. How does AI help them understand and evaluate information in general, and political content in particular?

How did you come to this work? What in your background made you interested in these questions?

My doctoral work is in political science. I was initially focused on immigration and climate change. That changed during the COVID-19 pandemic. My siblings and parents and I were all squished back into a house together. I was watching how we were all using tech during that period. In some ways it was crucial to keeping us sane, and in other ways technology was driving us insane. That experience drove me to want to understand how we could improve technology. I started researching recommendation algorithms on social media, which has led to some really cool collaborations and field experiments.

Over the past few years, our lab has been homing in on AI as a crucial medium to research. That’s because the use of AI is skyrocketing, and also because it’s early enough in its adoption that we can still shape how it’s used and regulated. AI will shape our society, and as researchers, we can play a role in that.

You and your colleagues recently finished a paper, “From Search Engines to Answer Engines: Testing the Effects of Traditional and LLM-Based Search on Belief in Misinformation.” How did this research come to be? What are the significant findings? 

The paper started as a follow-up to a paper about search engines that the lab did a few years ago. There was a fascinating finding that if Google had reliable links to draw from, then it was a helpful tool when somebody was engaging with misinformation. But if there weren’t a ton of good resources for the search engine to draw from—which some people have called data voids—then the tool could actually increase belief in misinformation. In that case, Google is not a fact checker; instead, it’s an unintentional reinforcement mechanism for misinformation.

We were curious whether the same pattern holds now that AI has been incorporated into a lot of traditional search engines. We found that the earlier finding still holds. When there’s a data void, the AI tool seems to increase belief in misinformation. However, it does so at a much lower rate than the traditional search engine. If you home in on the people who are using AI effectively, the increase in belief in misinformation drops to almost nothing. 

In other words, AI clearly does better than the traditional search engine in cases where there are data voids. This is important because people are increasingly using AI tools for search whether they realize it or not. AI is increasingly being incorporated into traditional search engines, and people are beginning to use tools like TikTok for search.

How do data voids work in practice? How do they affect how misinformation spreads?

Let’s say we’re researching the moon landing. Somebody told us that the moon landing was faked. That’s a situation where there’s not going to be a data void. A lot of people have talked about the moon landing. A lot of people have published things about it on both sides. Google is really good at aggregating the most relevant and the most reliable resources. If you give Google a deep bench of things to draw from, the front page of search results is going to be very reliable. 

Now consider someone who is creating a new conspiracy theory. Maybe they’re claiming that the Mars rover mission was faked or that there is something on the other side of the moon. Nobody’s been talking about that, but Google will still try to fill its front page with at least ten search results. If it doesn’t have reliable sources, it’s got to fill that page from somewhere.

That data void allows people to easily perpetuate misinformation. If you’re able to get that misinformation stated on three different sites, or if you’re able to get a few people to tweet it, then you are giving Google sources to draw on. Then, when people search for that story, Google ends up scraping the bottom of the barrel. Even if I’ve only got this article posted on a niche, sketchy website, Google is going to put it on the front page, because the data void means it doesn’t have anything better to put there. The same is true of recent events. If there’s a breaking news story, there’s going to be a data void because it just happened.
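The mechanics can be pictured with a toy ranking sketch. The Python snippet below is purely illustrative; the URLs, reliability scores, and the simple sort stand in for a real ranking system, which is far more complex.

```python
# Toy illustration of why a data void surfaces unreliable sources.
# All names and scores here are invented for the sketch.

def front_page(candidates, k=10):
    # Rank whatever candidates exist; a real search ranker uses many signals.
    ranked = sorted(candidates, key=lambda d: d["reliability"], reverse=True)
    # The results page gets filled from the top of that ranking,
    # regardless of how good the available candidates actually are.
    return [d["url"] for d in ranked[:k]]

# A well-covered topic: plenty of reliable sources to choose from.
well_covered = [
    {"url": f"major-outlet-{i}.example", "reliability": 0.9} for i in range(40)
]

# A data void: only a handful of low-quality sources exist.
data_void = [
    {"url": "sketchy-blog.example", "reliability": 0.20},
    {"url": "copycat-site.example", "reliability": 0.20},
    {"url": "forum-post.example", "reliability": 0.15},
    {"url": "tweet-thread.example", "reliability": 0.10},
]

print(front_page(well_covered))  # reliable sources crowd out everything else
print(front_page(data_void))     # the page is filled with whatever exists
```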

In the paper, you look at how Google’s traditional search compares with the AI tool Perplexity when it comes to how they influence belief in misinformation. What prompted you to compare those two tools?

For traditional search, Google was the obvious choice since it is so widely used. We chose Perplexity over ChatGPT and some other tools because Perplexity uses a Retrieval Augmented Generation (RAG) model. Most AI models have harvested the Internet only up to a certain cutoff date. After that date, they’re going to be unreliable. If something happened today and that content hasn’t been ingested into the training data, the AI tool is going to know nothing about it.

When we started this experiment, the free version of ChatGPT didn’t have this RAG functionality. A RAG model allows the tool to pull from today’s headlines, collect that information, and analyze it. Because we wanted to look at cases where there was a data void, we needed to look at recent articles, and ChatGPT would have completely failed at this at the time. During our experiment, ChatGPT rolled out its own RAG functionality on the free version. In the future, we may try running the same experiment with ChatGPT as we did with Perplexity.
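For readers unfamiliar with the term, the basic RAG pattern is to retrieve current documents at query time and then have the model answer conditioned on those documents rather than only on its training data. The sketch below is a minimal illustration of that loop; search_web and generate_answer are hypothetical stubs, not the actual systems behind Perplexity or ChatGPT.

```python
# Minimal sketch of a Retrieval Augmented Generation (RAG) loop.
# The retrieval and generation functions are toy placeholders.

def search_web(query: str, k: int = 5) -> list[str]:
    """Placeholder retrieval step: a real system would fetch the text
    of the top-k current web results for the query."""
    return [f"(stub result {i + 1} for: {query})" for i in range(k)]

def generate_answer(prompt: str) -> str:
    """Placeholder generation step: a real system would call a
    language model with the augmented prompt."""
    return f"(stub answer based on a {len(prompt)}-character prompt)"

def rag_answer(question: str) -> str:
    # 1. Retrieve fresh documents at query time, so the answer is not
    #    limited to what the model saw before its training cutoff.
    documents = search_web(question)

    # 2. Augment the prompt with the retrieved context.
    context = "\n\n".join(documents)
    prompt = (
        "Answer the question using only the sources below. "
        "If the sources are thin or unreliable, say so.\n\n"
        f"Sources:\n{context}\n\nQuestion: {question}"
    )

    # 3. Generate an answer grounded in that retrieved material.
    return generate_answer(prompt)

if __name__ == "__main__":
    print(rag_answer("What happened in today's breaking news story?"))
```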

Your study found that Perplexity generally performed better than Google in data voids, but it still sometimes increased belief in false information. What can users do to protect themselves?

Other researchers have shown that the more times you hear a story, the more likely you are to believe it. It’s important to be aware of that pattern, no matter the search tool that you’re using.

It’s also important to be proactive. Don’t be lazy with the tools you use. In our experiment, we found quite a few people who weren’t reading the article we gave them but were instead using the search engine or AI tool as a substitute for engaging with the article. The tool should be a supplement to, not a substitute for, engaging with the material.

Misinformation content generally fails to pass the sniff test. The headline is catchy and easy to believe. But if you actually dive into it, you find that it’s implausible. Using a search engine or AI tool should not replace the effort of asking, “Do I think this sounds reasonable or not?”

Finally, source reliability still matters. We’re seeing a lot of alternative media sources pop up, and some of them are incredibly helpful and incredibly reliable. Some of them are not. We all have different political leanings, and I might diagnose sources from the other side as less reliable than my sources, but they are still much more reliable than that guy I heard on TikTok, who said X, Y, or Z.

This is especially important when you’re considering a very recent event. If the story is breaking news, treat all information with skepticism. Breaking news is messy. Wait a little bit before you make drastic decisions. If it feels like an earth-shattering story, you’re not going to have a data void for very long—the reliable sources will pick the story up. Trust that those sources will deliver; give it an hour or two.

Current AI policies often restrict chatbots from responding in “risky” contexts. Is there a risk of being too cautious with these tools?

It’s tricky. There’s a lot of responsibility on whoever’s providing the end product. For example, doctors have always made mistakes. We don’t love that, but we are accustomed to it. We’re finding AI can help in a lot of cases. It can diagnose some conditions better than human doctors. It’s not perfect, and it’ll still make some mistakes, but by and large it makes fewer of them. However, if you make that switch now, you’ve put a target very specifically on the AI tool as responsible for all of the mistakes. On a macro level, it might save more lives. But you don’t really get credit for the lives you save in that switch. You just get consequences, and now they’re more easily attributable consequences.

AI search engines are similar. In our study, the users we were interacting with got the answer right more often when they were using the AI tool than when they were using the traditional search engine. But the AI was claiming it had the answer, whereas Google wasn’t saying it had the answer; it was saying, “Here are the links.” A class action lawsuit against Google for the links it provided is a little bit harder to prove than a lawsuit saying, “AI told me that this was true and I acted based on that belief.”

As a result, Google has had a policy of not including Gemini AI summaries in search results when there’s a data void or if it’s a topic related to misinformation. That policy is shifting, but they still exercise caution in these spaces. They don’t want to be responsible for people acting on misinformation. But ironically, that’s the area where these tools could actually be the most helpful in fighting misinformation. 

What policies or structural changes to search tool design would you advocate for based on this research?

Blame attribution is something that scares AI tools away from places where they can help. It’s important that AI resources aren’t the finisher. In other words, the AI output is not the last thing that you do, but it’s an input that informs a human decision. For example, it’s bad if AI is making a final job hiring decision. But it’s good if AI is used in the process.

I also think it’s important that we educate ourselves on how to use these tools. We’re learning new things all the time about what these tools can and can’t do. Plus, they’re changing all the time. We need to be comfortable with AI getting things wrong. A lot of people give up on AI the first time that the tool gives them false information. But we didn’t stop driving the first time our car broke down. We got it fixed and we were more careful. We’ve become comfortable with errors in so many human systems. We need to understand where we’re willing to accept imperfections in AI as well.

I’m not an expert in regulations or policy, but I think it’s important that AI companies are transparent about how their models work. Now is the time to be thinking about our regulations. A year ago would have been better. But we still have time before things get locked in. 

Were there any surprising or counterintuitive findings from your research?

People weren’t as lazy as we thought they’d be. We didn’t force them to spend a ton of time on Google or Perplexity. Some people did one search; they phoned it in. But I was surprised that other people did multiple search queries on Google or asked multiple questions on Perplexity. A lot of people were going above and beyond. As we’ve discussed, there’s a danger spot when people are too lazy with these tools. But I don’t think that that’s going to be the main problem.

Another thing that surprised me was that a lot of people who used Perplexity also double-checked with Google. People are willing to try these tools, but they’re not quite willing to blindly trust them. If they saw that Perplexity and Google were giving them the same answer, they were okay moving forward. They felt that they knew how Google worked and gave it a high level of trust. Initially we were upset because we felt that people were cheating on the study. But it is an interesting pattern. 

What are you reading/watching/listening to right now that you would recommend to readers, and why?

The top book I recommend is System Error. It’s a deep dive into where big tech went wrong. It points out a lot of these things that we’ve been talking about. As AI emerges, we have a chance to change how we approach tech and correct some of the things that failed with social media.

It’s still summer, so I’m also reading some fun stuff. Thursday Murder Club is a great murder mystery series. And I just reread Hamlet.

I know you’re studying podcasts. Are you an avid podcast listener?

Most of the podcasts I’m listening to for our research are not ones that I would recommend. We’re studying the manosphere and other corners of the ecosystem. A lot of the time I listen to these exceptional cases. 

I enjoy listening to Up First from NPR and Armchair Expert. I’ve been gravitating away from some of the more political podcasts that I used to enjoy. I’m slowly replacing them with comedy or other types of content. Some other ones I like are NADDPOD (Not Another D&D Podcast), College Football Power Hour, followHIM, and Pod and Prejudice.