The new artificial intelligence technology is designed to pay attention to individual words while creating images from text descriptions. This focused attention produced a roughly threefold improvement in image quality compared with previous text-to-image generation techniques.
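To make the idea of paying attention to particular words concrete, here is a minimal sketch in plain Python. It is not the researchers' implementation: the two-dimensional word vectors, the `region` feature, and the `attend` helper are all hypothetical, chosen only to show how a softmax over similarity scores lets one part of an image weight some words more heavily than others.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(region, word_vectors):
    """Weight each word by its similarity to one image region (hypothetical helper)."""
    # Dot product between the region feature and every word vector.
    scores = [sum(r * w for r, w in zip(region, vec)) for vec in word_vectors]
    weights = softmax(scores)
    # Context vector: the attention-weighted average of the word vectors.
    context = [
        sum(weights[i] * word_vectors[i][d] for i in range(len(word_vectors)))
        for d in range(len(region))
    ]
    return weights, context

# Toy caption and made-up embeddings (assumptions for illustration only).
words = ["a", "red", "bird"]
word_vectors = [[0.1, 0.0], [2.0, 0.0], [0.0, 1.0]]
region = [1.5, 0.2]  # feature of the image region currently being drawn

weights, context = attend(region, word_vectors)
for word, weight in zip(words, weights):
    print(f"{word}: {weight:.3f}")
```

In this toy setup the region feature is most similar to the vector for "red", so that word receives the largest weight when the region is drawn.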
Xiaodong He, a principal researcher in Deep Learning Technology at Microsoft, says that the pictures produced by the bot are created pixel by pixel, from scratch. “These birds may not exist in the real world — they are just an aspect of our computer’s imagination of birds,” he added.
At the core of this artist bot is a technology called a Generative Adversarial Network, or GAN. It consists of two machine learning models: one generates images from text descriptions, and the other uses the text to judge the authenticity of the generated images. Trained against each other, the two models push the generator toward images that the judge cannot tell apart from real ones.
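The interplay between the two models can be sketched with a toy example. This is an illustration of the standard GAN objective under simplifying assumptions, not the bot's actual code: images and text are stand-in lists of numbers, the judge is a single hypothetical logistic unit, and the generator loss shown is the common non-saturating variant.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def discriminator(image, text, weights):
    """Toy judge: probability that `image` is real and matches `text`."""
    features = image + text  # concatenate image and text features
    logit = sum(w * f for w, f in zip(weights, features))
    return sigmoid(logit)

def discriminator_loss(p_real, p_fake):
    """The judge wants p_real near 1 and p_fake near 0."""
    return -(math.log(p_real) + math.log(1.0 - p_fake))

def generator_loss(p_fake):
    """The generator wants the judge to score its image near 1."""
    return -math.log(p_fake)

# Made-up feature vectors and judge weights (assumptions for illustration).
text = [1.0, 0.0]
real_image = [0.9, 0.1]
fake_image = [0.2, 0.8]  # what an untrained generator might produce
judge_weights = [2.0, -2.0, 0.5, 0.0]

p_real = discriminator(real_image, text, judge_weights)
p_fake = discriminator(fake_image, text, judge_weights)
d_loss = discriminator_loss(p_real, p_fake)
g_loss = generator_loss(p_fake)
print(f"p_real={p_real:.3f} p_fake={p_fake:.3f}")
print(f"discriminator loss={d_loss:.3f} generator loss={g_loss:.3f}")
```

During training the two losses are minimized in alternation: the judge gets better at spotting generated images, and the generator is updated to lower its own loss by producing images the judge scores as real.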
According to the researchers, the bot can generate all kinds of images, from the ordinary, such as pastoral scenes with grazing livestock, to the absurd, such as a floating double-decker bus. Notably, each generated image contained details that were absent from the text description, which suggests that the AI has a kind of artificial imagination.
The researchers say this artist AI closes a research circle at the intersection of natural language processing and computer vision. They started with a technology that generates photo captions, followed by one that answers questions humans ask about images.
A researcher on the team said that image generation is a difficult task because the bot must imagine details that are not in the caption. As noted above, the bot was able to do so.
Find the detailed information in this research paper.