Oleg Zabluda's blog
Thursday, August 02, 2018
 
"""
by training image recognition networks on large sets of [Instagram] public images [...] which included 3.5 billion images and 17,000 hashtags. The crux of this approach is using existing, public, user-supplied hashtags as labels instead of manually categorizing each picture. [...] By training our computer vision system with a 1 billion-image version of this data set, we achieved a record-high score — 85.4 percent accuracy — on ImageNet [...] this research offers important insight into how to shift from supervised to weakly supervised training [...] We plan to open source the embeddings of these models in the future
[...]
we developed new approaches that are tailored for doing image recognition experiments using hashtag supervision. That included dealing with multiple labels per image [...] sorting through hashtag synonyms, and balancing the influence of frequent hashtags and rare ones.
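The three ingredients named above (multiple labels per image, synonym merging, frequency balancing) can be sketched in a few lines. This is a minimal illustration, not Facebook's actual pipeline: the corpus, the `SYNONYMS` map, and the square-root weighting exponent are all assumptions made for the example.

```python
import math
from collections import Counter

# Hypothetical mini-corpus: each image carries several user-supplied hashtags.
images = [
    {"id": 1, "tags": ["#dog", "#puppy", "#cute"]},
    {"id": 2, "tags": ["#doggo", "#park"]},
    {"id": 3, "tags": ["#sunset", "#beach"]},
]

# Synonym handling: collapse hashtag variants onto one canonical label.
SYNONYMS = {"#puppy": "#dog", "#doggo": "#dog"}
def canonical(tag):
    return SYNONYMS.get(tag, tag)

# Multiple labels per image: spread the target mass uniformly over the
# image's deduplicated hashtags, yielding a soft target for a softmax loss.
def soft_targets(tags):
    canon = sorted({canonical(t) for t in tags})
    return {t: 1.0 / len(canon) for t in canon}

# Frequency balancing: weight each hashtag by an inverse power of its
# corpus frequency (square root here), so rare tags are not drowned out.
freq = Counter(canonical(t) for img in images for t in img["tags"])
def sampling_weight(tag, power=0.5):
    return 1.0 / (freq[tag] ** power)

print(soft_targets(images[0]["tags"]))  # {'#cute': 0.5, '#dog': 0.5}
print(sampling_weight("#dog"))          # smaller than for the rarer '#sunset'
```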
[...]
distribute the task across up to 336 GPUs, shortening the total training time to just a few weeks. The biggest [model] in this research is a ResNeXt 101-32x48d with over 861 million parameters [...] we designed a method for removing duplicates to ensure we don’t accidentally train our models on images that we want to evaluate them on, a problem that plagues similar research in this area.
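The train/eval de-duplication amounts to fingerprinting every image and dropping any training image whose fingerprint appears in the evaluation set. A minimal sketch, with an assumed `fingerprint` helper; a real pipeline would use a perceptual or learned near-duplicate hash rather than an exact cryptographic one, but the set-difference logic is the same:

```python
import hashlib

# Hypothetical helper: fingerprint an image's raw bytes. Exact hashing only
# catches byte-identical duplicates; near-duplicate detection needs a
# perceptual hash, but plugs into the same filtering step below.
def fingerprint(image_bytes):
    return hashlib.sha1(image_bytes).hexdigest()

def remove_eval_duplicates(train_images, eval_images):
    eval_hashes = {fingerprint(b) for b in eval_images}
    return [b for b in train_images if fingerprint(b) not in eval_hashes]

train = [b"img-a", b"img-b", b"img-c"]
evaluation = [b"img-b"]
clean = remove_eval_duplicates(train, evaluation)
# 'img-b' appears in the eval set, so it is dropped from training
```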

Though we had hoped to see performance gains in image recognition, the results were surprising. On the ImageNet image recognition benchmark — one of the most common benchmarks in the field — our best model achieved 85.4 percent accuracy by training on 1 billion images with a vocabulary of 1,500 hashtags. That’s the highest ImageNet benchmark accuracy to date and a 2 percent increase over that of the previous state-of-the-art model.

On another major benchmark, the COCO object-detection challenge, we found that using hashtags for pretraining can boost the average precision of a model by more than 2 percent.
[...]
it may be at least as important to select a set of hashtags that matches the specific recognition task. We achieved better performance by training on 1 billion images with 1,500 hashtags that were matched with the classes in the ImageNet data set than we did by training on the same number of images with all 17,000 hashtags. On the other hand, for tasks with greater visual variety, the performance improvements of models trained with 17,000 hashtags became much more pronounced, indicating that we should increase the number of hashtags in our future training.
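Matching the hashtag vocabulary to the target task can be pictured as filtering the full 17,000-tag vocabulary down to tags whose word appears among the target classes' synonyms (ImageNet synsets list several synonyms per class). The class map and tag list below are invented for illustration:

```python
# Hypothetical target-task classes with synonym sets, standing in for
# ImageNet synsets.
imagenet_synonyms = {
    "dog": {"dog", "puppy", "canine"},
    "cat": {"cat", "kitten"},
}
all_hashtags = ["#dog", "#puppy", "#sunset", "#kitten", "#nofilter"]

def matched_vocabulary(hashtags, class_synonyms):
    # Keep only hashtags whose word matches some class synonym.
    words = {w for syns in class_synonyms.values() for w in syns}
    return [h for h in hashtags if h.lstrip("#") in words]

print(matched_vocabulary(all_hashtags, imagenet_synonyms))
# ['#dog', '#puppy', '#kitten']
```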

Increasing the volume of training data is generally good for image classification. But it can create new problems, including an apparent drop in the ability to localize objects within an image. We also observed that our largest models are still underutilizing the benefits of a 3.5 billion-image set, suggesting that we should train on even bigger models.
"""
https://code.fb.com/ml-applications/advancing-state-of-the-art-image-recognition-with-deep-learning-on-hashtags/
