Short Bytes: Google has open sourced its Show and Tell system, which is now available as part of the TensorFlow machine learning library. The Show and Tell system can analyze an image and generate a relevant caption describing what is happening in it. The system's code is available on GitHub.
In 2014, the Google Brain team started working on a system that could analyze an image, determine what was happening in it, and write a caption for it. At that time, their image classification model, Inception V1, enabled the system to achieve an accuracy of 89.6%. In 2015, the model was upgraded to Inception V2, raising accuracy to 91.8%.
The current Inception V3 model enables the system to classify images with 93.9% accuracy. The improved system can detect multiple objects in an image along with their characteristics and write a more relevant caption. As Google explains: "An image classification model will tell you that a dog, grass and a frisbee are in the image, but a natural description should also tell you the color of the grass and how the dog relates to the frisbee."
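At a high level, captioning systems of this kind pair a CNN image encoder with a language-model decoder that emits the caption one word at a time. The toy sketch below illustrates only that decoding loop; the function names, the averaging "encoder," and the hard-coded word table are all illustrative stand-ins, not the actual im2txt code or API.

```python
# Toy sketch of the encoder-decoder captioning loop used by
# "Show and Tell"-style systems. Everything here is illustrative:
# a real system runs Inception V3 as the encoder and an LSTM decoder
# trained on a large vocabulary.

import numpy as np

def encode_image(image):
    # Hypothetical encoder: reduce the image to a fixed-length feature
    # vector. A real system would run a CNN; we just average pixels.
    return image.mean(axis=(0, 1))

# Hypothetical decoder "model": maps the previous word to the next one.
# A real decoder conditions on the image features at every step; this
# toy table simply hard-codes a single caption.
NEXT_WORD = {
    "<start>": "a",
    "a": "dog",
    "dog": "catches",
    "catches": "a_frisbee",
    "a_frisbee": "<end>",
}

def generate_caption(image, max_len=10):
    features = encode_image(image)  # conditioning signal (unused in the toy)
    words, word = [], "<start>"
    for _ in range(max_len):
        word = NEXT_WORD.get(word, "<end>")
        if word == "<end>":
            break
        words.append(word)
    return " ".join(words).replace("_", " ")

image = np.zeros((224, 224, 3))  # dummy RGB image
print(generate_caption(image))   # -> "a dog catches a frisbee"
```

The point of the greedy loop is that the caption is built word by word, each choice feeding the next step, which is why the model can describe relations ("how the dog relates to the frisbee") rather than just listing detected objects.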
Google has announced that its Show and Tell AI-based image captioning system is now available as an open source model in TensorFlow. "This release contains significant improvements to the computer vision component of the captioning system, is much faster to train, and produces more detailed and accurate descriptions compared to the original system."
TensorFlow, a successor to DistBelief originally developed by the Google Brain team, is an open source machine learning library. It is used by many teams across Alphabet-owned companies, including Google.
Source: Google Research Blog
If you have something to add, tell us in the comments below.