The Limited Times


DALL-E 2 and Google Imagen: The text

2022-06-18T14:34:21.249Z


Darth Vader mows the lawn, a salad dressing testifies in court: bizarre images from the text-to-image generators DALL-E 2 and Google's in-house AI "Imagen" are flooding the web. For now, access remains limited, and for good reason.



DALL-E mini after entering the text »ET drives a Tesla«

Photo: DALL-E mini / DER SPIEGEL

Surgeons operate on a toilet bowl.

Jesus gets into a Ferrari.

Batman is waiting for his parents on the soccer field.

Image tiles showing crazy scenes like these are currently flooding the internet.

Most come from the image generator DALL-E mini, which is fed creative text prompts and spits out bizarre imagery.

The software, which is based on what is known as artificial intelligence (AI), can be tried out in the browser free of charge – which many people are currently doing.

All it takes is typing half a sentence into the text field to describe what you want to see.

The results fill entire forums on Reddit and have pushed DALL-E mini into the Twitter trends: nine-image tiles show a puffer fish as a make-up influencer, a salad dressing on the witness stand and Darth Vader mowing the lawn.

The AI creates a photo gallery from even the most obscure ideas, designs Pikachu chairs that are well worth a look and builds a toaster in the shape of a Nintendo Switch.


However, DALL-E mini has some limitations: many results are unimpressive and hard to make out. The heavy rush of users is also causing problems for the operators, and at times the browser tool is overloaded and cannot be used.

Waiting times of three minutes per attempt, by contrast, are normal.

That is because the tile generator on the web is just a slimmed-down version of DALL-E 2, an AI model from OpenAI, a research institution funded by Elon Musk and Microsoft. DALL-E 2 is 27 times larger than the mini version and correspondingly more powerful.

So far, however, only a few selected researchers and artists have been allowed to try out this software.

With the full DALL-E 2, you can create impressive fake photos: spectators in ancient Rome photographing gladiator fights with smartphones, raccoons in space or dancing avocados with sunglasses on their heads.

However, the AI models are increasingly becoming more than mere toys and can now do more than just copy works of art or merge singer Taylor Swift with a Christmas tree.

The software is tuned to create near-realistic images from a description.

"That's an amazing achievement," says Wolfgang Konen from the Technical University of Cologne in an interview with SPIEGEL.

"The tools have become very powerful."

Google is in the race too, and considers its in-house AI, called Imagen, to be the best text-to-image software.

In a direct comparison with the competitors DALL-E, VQGan and LDM, the Google model comes out ahead, according to a research report by the US company.

"Human viewers prefer Imagen to other methods, both in terms of image-to-text relationship and in terms of realism." Indeed, the glossy images of the chocolate eagle on mango pieces, the stone koala DJ and the teddy bear parade on the streets of Tokyo look amazingly good and are razor-sharp.

But Google employees picked out only the best results.

There is no public tool to try the software.

Google has decided against releasing the code and demo.

The reason given was that concerns about the software being misused and reinforcing harmful stereotypes were too great.

Users lead the AI astray

DALL-E mini shows where this can lead.

The development team has excluded pornography and violence.

But that doesn't stop users from sending the AI down rather dark paths.

Some use the software to nail Ronald McDonald to the cross, invent toy guillotines, and depict Adolf Hitler having a drink with Minions.

For Boris Dayma from the DALL-E-mini development team, that's no reason to switch off the demo.

"Art is inherently subjective," Dayma wrote to SPIEGEL.

"Throughout history, there have always been works of art that people perceived as crossing boundaries." He cannot say whether such works of art are dangerous for society.

One thing is clear: the fundamental problems of AI have not been ironed out even in the latest models.

The developers of DALL-E mini themselves admit that the training data already excludes entire ethnic groups.

One of the reasons for this is that, when the training images are collected, all those not labeled in English are ignored.

Western culture is thus becoming the standard for AI.

The AI and the bias problem

This effect comes as little surprise to Wolfgang Konen.

"Like humans, AI also has a cultural bias," says the IT professor.

It is therefore crucial that developers be open about their software and the images used to train it.

"With AI, it's important to know how the results came about and what the AI was trained with," says Konen.

"Otherwise you can't tell if the software is biased."

It is conceivable that artificially generated photos could be used to shape opinion, says Wolfgang Konen.

"Of course, thousands of images that suggest authenticity can be created very easily." This could be dangerous if fake images of a supposedly live event were shared via social media.

However, AI-generated texts are still the bigger problem.

"I don't see image generators as being as dangerous as text generators," says the scientist.

Texts generated automatically by the thousands can be used on social media to exert far more influence over opinion.

Source: spiegel

