The Limited Times


Dear AI, generate me a cool three-year-old - Netzwelt newsletter about problems with image generators

2022-11-16T20:05:02.848Z


Text-to-image generators can create spectacular art on command. But the AI-based technology also has serious blind spots.


The other day I tried to generate a birthday present for a friend using artificial intelligence (AI).

I had taken a picture of him and his young daughter at a swimming lake near Berlin in the summer, wearing similar baseball caps, sitting in the sand and sharing a piece of squishy watermelon.

Wouldn't it be nice, I thought, if I could describe this snapshot in such a way that great new AI image generators like Midjourney and Stable Diffusion would render the scene as if Neo Rauch had painted it in oils?

Print it out, frame it, happy birthday!

Three hours later I gave up and brought him a last-resort bottle of Crémant instead.

I simply couldn't convey to the programs what the girl in the photo looks like (she's a pretty cool three-year-old who loves dinosaurs).

No matter how elaborate a caption I typed into Dall-E 2, the "three-year-old girls" in the results were persistently outfitted with Rapunzel hair and dowdy pink sundresses.

The father, meanwhile, was suddenly more muscular than in real life and wore a basketball jersey.

This was probably due in part to my inability to compose a good prompt, as the text inputs for text-to-image generators are called.

But the new services do in fact have a problem with bias.

Research teams are now also working on this.

After all, stereotypes in the media are likely to become even more entrenched if the image generators also depict a world full of clichés.

Around three million Midjourney users alone are already generating thousands of new images every day, which is why my colleagues and I have investigated which industries and fields the systems are already revolutionizing.

It is now clear that all generators reproduce the world in an extremely distorted way.

As a research team led by Stanford University's Federico Bianchi found, it's difficult to portray a wealthy African man in Stable Diffusion without finding features in the image that suggest poverty.

"Attractive person"?

Fair-skinned women.

A "terrorist" looks suspiciously Arab.

If you enter "thug", i.e. a criminal, Stable Diffusion generates Black men.

And "software developers" are almost never portrayed as female.

Of course, none of these biases reflects the real world, only the training data, i.e. a very large selection of images from the Internet that have been cleaned with automatic filters that are anything but perfect.

All of that is a problem.

Dall-E 2, Midjourney and Stable Diffusion already support a number of creative tasks in film and advertising production and are used to illustrate articles and blogs.

Greeting cards and PowerPoint presentations will soon follow, because Microsoft has already announced that it will include a generator plugin in its Office suite.

The generators are on the verge of commercialization, and development is progressing at breakneck speed.

Which world the generators actually depict, and how the distortions can be compensated for, is one of the most urgent questions facing developers and distributors of the new systems, alongside the unresolved copyright problems.

Our current Netzwelt reading tips for SPIEGEL.de

  • "How artificial intelligence is revolutionizing the creative world" (13-minute read)


    The next big thing in software has names like Dall-E, Stable Diffusion or Midjourney.

    These text-to-image generators create amazing images and are disrupting the creative industries.

    We asked people who work with them and generated impressive material ourselves.

  • "A big thing from Microsoft" (6-minute read)


    The "Flusi" is 40 years old.

    For its birthday, Microsoft is giving Flight Simulator helicopters, gliders and the legendary Hughes H-4 Hercules.

    Our Captain Kremp was present at the presentation in Oregon.

  • "Real or not? How to keep track" (5-minute read)


    Tesla rails against Elon Musk?

    George W. Bush jokes with Tony Blair about the Iraq war?

    Twitter has thoroughly botched the overhaul of its verification system.

    Jörg Breithut explains how to recognize fake profiles.

External links: Three tips from other media

  • "Who Said It: Elon Musk or Mr. Burns?" (English, three-minute read)


    "Family, religion, friendship.

    These are the three demons you have to defeat if you want to be successful in business." Who said it: the super-rich Twitter boss or the super-rich, super-nasty nuclear power plant owner from The Simpsons?

    A surprisingly challenging and fun quiz from The New Republic.

  • "The Crypto Story" (English, 40-minute read)


    A text about trust, distrust and utopia that doubles as a surprisingly simple guide: if you read only one text about the boom and bust of cryptocurrencies, make it this lovingly designed overview article.

  • "The Search of Shame" (English, one-minute read)


    The drama surrounding Twitter has escalated every day since Musk's takeover.

    With this tool, you can check which of your followers have already paid Musk the $8 for the useless blue verification tick.


In any case, have a great week!

Best regards

Theresa Locker

Source: spiegel
