Q&A with Basileal Imana, Siegel Research Fellow at Princeton’s Center for Information Technology Policy

Basileal Imana is a postdoctoral research associate at the Center for Information Technology Policy at Princeton University. His research interests broadly lie in studying the privacy and algorithmic fairness properties of real-world systems on the Internet. Imana received his Ph.D. in computer science from the University of Southern California. Prior to USC, he received his BSc in 2017, also in computer science, from Trinity College in Connecticut, where he worked on solving computationally difficult problems using high-performance computing.

Tell us about yourself and your background. What brought you to this point in your career? What drew you to this area of research and to your current projects?

Basileal Imana: I am currently a postdoc at Princeton’s Center for Information Technology Policy. My research focuses on developing methods for auditing bias in the algorithms used to deliver content on social media platforms. My academic background is in computer science, at both the undergraduate and graduate levels.

My interest in the field of algorithmic auditing began after reading “Automating Inequality,” a book that explores how automated decisions can negatively impact access to public services. I was increasingly drawn to how social media platforms play a pivotal role in shaping access to information and opportunities for billions of users. The more I studied the issue of algorithmic bias, the more I became convinced that self-policing by platforms is not enough and that external audits can play an important role in ensuring platforms’ operations align with societal interests. The focus of algorithmic auditing on the impact of technology on individuals, and the type of adversarial thinking it involves, also resonated with my prior research interests in security and privacy. That led me to center my Ph.D. thesis on algorithmic auditing, particularly on studying how social media algorithms influence access to economic opportunities such as jobs and education.

Tell us about your current role. What kind of research are you working on right now? 

I am involved in two main lines of research, both aimed at increasing the transparency and accountability of the ad delivery algorithms that platforms use to decide which economic opportunities users are exposed to.

My first line of research focuses on extending existing black-box auditing methods, which are designed to work without platforms’ cooperation or any special access to the platform. These methods have been used to uncover algorithmic discrimination that results from platforms’ algorithms opaquely optimizing for their business goals. For example, prior studies and some of our early work showed that the algorithms Meta uses to optimize the relevance of housing and employment ads discriminate by gender and race. My work focuses on extending these auditing methods to more types of economic opportunities where there are legal protections against discrimination, such as education, credit, and insurance.

My second line of work involves developing a technical framework that enables platforms to explicitly support auditing. Currently, platforms do not widely support external audits due to a lack of incentives and the risks of leaking private user data and proprietary trade secrets. Recent legislation, such as the EU’s Digital Services Act, includes provisions that provide legal incentives for platforms to undergo independent external audits. However, implementing these proposals while safeguarding privacy and protecting trade secrets presents significant technical challenges, which my work seeks to address.

You recently published “Auditing for Discrimination in Algorithms Delivering Job Ads” which investigates algorithmic discrimination in ad delivery. Can you tell me a bit more about what went into this research?

The paper increases our understanding of how job ad delivery algorithms could perpetuate real-world gender disparities and potentially violate anti-discrimination laws. The method we propose differentiates between biases in ad delivery based on protected categories such as gender, and biases that stem from differences in qualifications, which are legally permissible. We applied the methodology to study how job ads are delivered on Facebook and LinkedIn. 

Through controlled experiments, we show that even when Facebook ads are not explicitly targeted by gender, their delivery can still be biased, even among equally qualified candidates. For example, in one experiment, we ran a pair of ads on Facebook for delivery driver jobs at Domino’s and Instacart. The two jobs have similar qualification requirements but different real-world demographics: Domino’s drivers skew male and Instacart drivers skew female. Even though our ads did not specify gender as a targeting parameter, the Domino’s ad was shown to a larger fraction of men than women, and vice versa for the Instacart ad.
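To make the comparison concrete, here is a minimal sketch (not the statistical procedure used in the paper) of how one might test whether two paired ads with comparable qualification requirements reached significantly different fractions of men. All counts below are hypothetical placeholders.

```python
import math

def two_proportion_ztest(x1, n1, x2, n2):
    """Two-sided z-test for a difference between two proportions (pooled variance)."""
    p1, p2 = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))  # normal approximation
    return z, p_value

# Hypothetical impression counts: (men shown the ad, total recipients)
z, p = two_proportion_ztest(6200, 10000,   # "Domino's-style" ad
                            4100, 10000)   # "Instacart-style" ad
print(f"z = {z:.2f}, p = {p:.3g}")
# Because targeting and qualification requirements are held constant across
# the pair, a significant difference in the male fraction points to skew
# introduced by the delivery algorithm rather than by the advertiser.
```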

This research focuses on one part of the hiring process – the delivery of job ads – rather than the stages examined in other studies of algorithmic discrimination in hiring. What is particularly important about investigating algorithmic discrimination at this stage?

Job ad delivery is important because it is one of the initial steps in the hiring process, where potential candidates learn about open job opportunities. If certain demographic groups are excluded from seeing these ads due to algorithmic bias, they have less chance of advancing through the recruitment pipeline and engaging in the downstream processes that begin with that initial point of contact with the opportunity.

The article ends with some suggestions for ad platforms to improve the auditing of their algorithms so that audits are more accurate and aligned with the public interest. Can you tell me a bit more about those suggestions?

In this context, we define auditing in the public interest as ensuring that ad delivery algorithms operate in a way that is transparent and non-discriminatory. One of our key recommendations for transparency is providing more granular data access to auditors while maintaining user privacy. This could involve, for example, a portal dedicated to auditors where platforms provide more detailed reporting on delivery optimization metrics across different demographic groups.

What are the biggest challenges you face in approaching these research questions? What obstacles stand in the way of this kind of research more broadly?

One of the major challenges is that the algorithms that determine the selection and ordering of the ads people see, and the specific factors that go into those decisions, are opaque to both advertisers and users. More broadly, auditors and researchers have to design experiments and study these algorithms using only the limited features and data access that platforms provide to any regular user, rather than being granted greater, more specialized access for research purposes. These limitations make it challenging to design controlled experiments, increasing both the time it takes to develop new methods and the cost of conducting audits.

In another publication, “Having your Privacy Cake and Eating it Too: Platform-supported Auditing of Social Media Algorithms for Public Interest”, you and your co-authors identify key challenges to algorithmic auditing in the public interest. What specific challenges are you exploring in this research?

This work was inspired by some of the challenges we encountered in our previous projects on black-box audits of ad delivery algorithms.

Firstly, black-box audits pose a challenge because it’s hard to directly link observed bias to algorithmic decisions. This is mainly due to numerous confounding variables that we cannot easily control without access to the inner workings of the platforms. Notable early audits, including Latanya Sweeney’s seminal work on racial bias in Google Ads, hypothesized that such biases were primarily driven by algorithmic choices. However, these initial methods struggled to distinguish the effect of the algorithms from other influencing factors, such as market effects and temporal variations. It was not until 2019 that a method was developed that effectively isolated these algorithms’ roles by controlling for the confounding variables, thus confirming the initial hypothesis.

Secondly, our methods often have to rely on workarounds, such as using location as a proxy for demographic attributes that platforms don’t report, which can introduce measurement error. For example, in our audit of LinkedIn’s ad platform, we used counties as a proxy for gender because LinkedIn does not report the gender of ad recipients. However, we were unable to infer the gender of 30-40% of our ad recipients because LinkedIn does not report location data for counties that receive very few ad impressions. Since these audits depend on such workarounds, they often do not generalize easily across different categories of ads, demographic groups, or platforms, which limits how broadly they can be applied.
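As a rough illustration of this kind of proxy workaround (a minimal sketch, not the audit’s actual pipeline), the idea is to attribute county-level impression counts to a gender based on external demographic data, with impressions in unreported or unknown counties left unattributed. All data below are hypothetical.

```python
# Hypothetical per-county impression report from the ad platform.
# Counties below the platform's reporting threshold simply do not appear,
# so their impressions cannot be attributed to either gender.
impressions = {"county_a": 500, "county_b": 320, "county_c": 410}

# Hypothetical proxy labels derived from external demographic data
# (e.g., counties in the ad's audience known to skew toward one gender).
county_proxy = {"county_a": "male", "county_b": "female"}  # county_c unknown

totals = {"male": 0, "female": 0, "unattributed": 0}
for county, count in impressions.items():
    totals[county_proxy.get(county, "unattributed")] += count

reported = sum(impressions.values())
for group, count in totals.items():
    print(f"{group}: {count} impressions ({count / reported:.0%})")
# The "unattributed" share is the measurement error described above:
# recipients whose gender cannot be inferred through the location proxy.
```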

Another motivation for this research is the recent legislative push in both the EU (Digital Services Act) and the US (Platform Accountability and Transparency Act) to give vetted academic researchers access to platform data without compromising user privacy or platforms’ business interests. Our work proposes a technical framework for implementing these legislative proposals in practice, in the context of algorithms that shape access to information and opportunities on social media.

When it comes to weighing the risks related to these proposals in regard to private data and platforms’ proprietary algorithms, how did you navigate the value tension here between privacy and transparency? 

Our work demonstrates that privacy is an important risk to address, but that it is not a hurdle to increasing transparency. We propose a framework that mitigates privacy risks by giving auditors privileged, but limited, access to platforms’ relevance estimators without exposing user data or the details of the platforms’ proprietary algorithms. The framework uses differential privacy, which adds noise to aggregated data to shield individual-level information while preserving statistical averages. This ensures that individual user privacy is maintained while still allowing meaningful analysis of the platforms’ algorithmic biases.
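For readers unfamiliar with the mechanics, here is a minimal sketch of the Laplace mechanism, a standard way of answering aggregate queries with differential privacy. It is illustrative only, not the query interface proposed in the paper; the epsilon value and counts are hypothetical.

```python
import numpy as np

def dp_count(true_count: int, epsilon: float, sensitivity: float = 1.0) -> float:
    """Return a differentially private count: the true count plus Laplace noise
    with scale sensitivity/epsilon. Adding or removing one user changes the
    count by at most 1, so the sensitivity is 1."""
    noise = np.random.laplace(loc=0.0, scale=sensitivity / epsilon)
    return true_count + noise

# Example: an auditor asks how many users in a demographic group saw an ad.
true_impressions = 12_345  # hypothetical raw count held by the platform
noisy_impressions = dp_count(true_impressions, epsilon=0.5)
print(f"reported (noisy) impressions: {noisy_impressions:.0f}")
# Individual users are protected by the noise, but aggregate comparisons
# across demographic groups remain statistically meaningful.
```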

Our suggestion is for platforms to provide direct query access to the algorithms used to determine the “relevance” of each piece of content for social media users. These recommendation algorithms, referred to as “relevance estimators” in our paper, are the core engines that determine which organic posts and ads users see on their algorithmic timelines. 

Given that these algorithms are also at the core of platforms’ business models, we propose a metered approach that is designed to prevent the disclosure of sensitive user data and proprietary information while still providing a robust framework for auditing for bias and discrimination.
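One way to picture “metered” access is as a per-auditor query budget on the relevance estimator. The sketch below is a hypothetical illustration of that idea, not the interface specified in the paper; the class, parameters, and example estimator are invented for illustration.

```python
class MeteredRelevanceAccess:
    """Hypothetical wrapper granting an auditor a limited number of queries
    to a platform's relevance estimator, without exposing its internals."""

    def __init__(self, relevance_estimator, query_budget: int):
        self._estimator = relevance_estimator  # proprietary model, kept opaque
        self._remaining = query_budget         # queries left for this auditor

    def query(self, user_profile: dict, content: dict) -> float:
        if self._remaining <= 0:
            raise PermissionError("auditor's query budget is exhausted")
        self._remaining -= 1
        return self._estimator(user_profile, content)  # relevance score only

# Usage sketch: the auditor compares relevance scores across demographic
# groups while the platform caps total disclosure via the budget.
access = MeteredRelevanceAccess(lambda u, c: 0.42, query_budget=1000)
score = access.query({"group": "A"}, {"ad_id": "job_ad_1"})
```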

If implemented, how do you see these suggestions impacting real-world applications of algorithmic auditing? What potential, and what possible pitfalls, do you think this research signals for the growing industry of algorithmic auditors?

We want to provide a viable blueprint for transparency that social-media platforms can follow. Because the framework is designed to generalize to studying bias across multiple types of content and platforms, it can help increase our understanding of how platforms shape public discourse and access to economic opportunities.

We also recognize the ongoing need for legislative and platform cooperation to achieve these goals. The efficacy of the audits depends on the cooperation of the platforms being audited. In our efforts to collaborate with platforms, we identified significant concerns, such as the potential risk of our framework being abused by malicious actors and the risk of measuring bias without having ready-to-deploy bias mitigation strategies.

What kind of work would you like to do in the future?

One new area I am exploring is the recent initiatives advocating for the adoption of digital public infrastructure (DPI) in various countries around the world. DPI holds significant potential to revolutionize the delivery of public services, making them more accessible and efficient. However, it also presents several challenges, such as ensuring data privacy, preventing security breaches, addressing ethical concerns like digital inequality, and managing the potential for misuse of data. I am interested in studying the privacy, security, and ethical considerations integral to designing and implementing DPI systems.

What attracted you to the SFE fellowship? What have you gotten out of the fellowship experience?

Siegel’s emphasis on infrastructure aligns with my own work, which studies social media as critical infrastructure. I joined the program hoping to find new opportunities for collaboration, receive feedback on my ongoing projects, and increase the visibility of my work by communicating my research effectively to non-technical audiences. The monthly research seminars and in-person gatherings also provided excellent opportunities to receive feedback on my work and gain new perspectives on how my research intersects with other important questions surrounding AI adoption.

What are you reading/watching/listening to right now that you would recommend to readers, and why?

AI Snake Oil: I find both the blog and the research it draws from incredibly useful for distinguishing the real benefits of AI from the surrounding hype. It’s a great resource for anyone interested in understanding the true impact and limitations of AI technology.
