Like all major AI companies, Microsoft has given Bing’s Image Creator a content policy. Creating images that encourage sexual assault, suicide, graphic violence, hate speech, bullying, deception, or misinformation is prohibited. Some of the rules are strict even by the usual “trust and safety” standards (hate speech is defined to include speech that “excludes” individuals on the basis of any actual or perceived “characteristic” consistently associated with bias or systemic marginalization). As expected, this rules out a lot of images altogether. But the rules themselves are the least of it. The more interesting and consequential question is how those rules are actually applied.
I now have some insight into AI safety rules in action, and it certainly looks as if Bing is taking very broad rules and training its engine to apply them even more broadly than anyone would expect.
Here’s my experience. I’ve been using Bing Image Creator recently to create Cybertoonz (examples here, here, and here), despite my severe lack of artistic talent. It had the usual technical problems — too many fingers, weird faces — and some quirks that I suspected were designed to avoid “gotcha” claims of bias. For example, if you requested an image of the members of the European Court of Justice, the engine almost always generated more women and identifiable minorities than the European Court of Justice itself will produce in the next fifty years. But when the political correctness of the AI engine detracted from the message of a cartoon, it was easy to demand male judges, and Bing did not treat this as “exclusion” on the basis of gender, as one might have feared.
My more recent experience is a little more troubling. I created this Cybertoonz cartoon to illustrate Silicon Valley’s counterintuitive claim that social media engages in protected speech when it suppresses the speech of many of its users. My image prompt was a variation on “Low angle shot of a male authority figure in a black t-shirt standing and speaking into a megaphone amid a large group of seated people wearing masks or tape over their mouths. Lo-fi digital art.”
As always, Bing’s first attempt was surprisingly good but flawed, and getting a usable image required dozens of tweaks to the prompt. None of the pictures were quite right. I finally settled on the one that worked best, turned it into a Cybertoonz cartoon, and published it. But I hadn’t given up on finding something better, so I went back the next day and ran the prompt again.
This time, Bing refused. It told me that my prompt violated Bing’s safety standards:
After some experimentation, it became clear that what Bing objected to was depicting the audience “wearing masks or tape over their mouths.”
How does this violate Bing’s safety rules? Do masks incite violence? Do they signal “[n]onconsensual intimate activity”? In context, these readings of the rules are absurd. But Bing doesn’t apply the rules in context. It appears to layer on additional code to guarantee there can be no rule violation at all: if there is any possibility that the image it produces could show nonconsensual sex or violence, trust and safety will reject the prompt.
This is almost certainly the future of trust and safety limits on AI. It will start with loose rules written to satisfy Silicon Valley’s left-leaning critics. Those general rules will then be extended by hidden code that blocks many fully compliant prompts just to make sure it also blocks a few noncompliant ones.
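The dynamic described above — context-free code layered on top of broad rules — can be sketched with a toy example. Everything here is hypothetical (Bing’s actual filter is not public); the point is only that a filter tuned to catch every possibly noncompliant prompt inevitably rejects compliant ones too.

```python
# Toy illustration, NOT Bing's actual code: a crude keyword filter that
# rejects any prompt containing a flagged phrase, regardless of context.
# The flag list below is invented for this example.

BLOCKED_TERMS = {"tape over their mouths", "masks"}  # hypothetical flag list

def is_blocked(prompt: str) -> bool:
    """Reject a prompt if it contains any flagged phrase, ignoring context."""
    text = prompt.lower()
    return any(term in text for term in BLOCKED_TERMS)

# A benign cartoon prompt is rejected alongside genuinely harmful ones,
# because the filter cannot tell a political cartoon from a threat:
print(is_blocked("seated people wearing masks or tape over their mouths"))  # True
print(is_blocked("a judge speaking into a megaphone"))                      # False
```

Because the filter errs entirely on the side of blocking, its false-positive rate falls on users, not on the company — which is exactly the incentive the paragraph above describes.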
In the context of Cybertoonz, such limits on AI output are merely an annoyance. But AI won’t always be a toy. It will be used in medicine, employment, and other consequential settings, and the same dynamic will be at work there. AI companies will be under pressure to adopt trust and safety standards and to implement rules that flatly prohibit outputs that might offend the left half of American political discourse. But in applications that affect people’s lives, the code that guarantees those outcomes will produce a host of unanticipated consequences, many of which no one can defend.
Given the stakes, my question is simple: how do we avoid these consequences, and who is working to prevent them?