Image annotation and feature extraction: content summary
Manmatha

Visual terms
Or partition the image using a rectangular grid and cluster the resulting regions. This actually works better.

Grid vs. Segmentation
Segmentation vs. rectangular partition.
Result: the rectangular partition performs better than segmentation.
The model is learned over many images, whereas segmentation is computed over one image.

Feature Extraction & Clustering
Feature extraction: color, texture, shape.
K-means clustering: used to generate a finite set of visual terms. Each cluster's centroid represents a visual term.

Co-Occurrence Models
Mori et al. 1999
Create the co-occurrence table using a training set of annotated images.
Tends to annotate with high-frequency words.
Context is ignored.
Needs joint probability models.

      w1   w2   w3   w4
V1    12    2    0    1
V2    32   40   13   32
V3    13   12    0    0
V4    65   43   12    0

$P(w_1 \mid v_1) = 12 / (12 + 2 + 0 + 1) = 0.8$
$P(v_3 \mid w_2) = 12 / (2 + 40 + 12 + 43) \approx 0.124$

Correspondence: Translation Model (TM)
$\Pr(f \mid e) = \sum_a \Pr(f, a \mid e)$
$\Pr(w \mid v) = \sum_a \Pr(w, a \mid v)$

Translation Models
Duygulu et al. 2002
Use classical IBM machine translation models to translate visterms into words.
The IBM machine translation models need a bilingual corpus for training.
Slide examples: "Mary did not slap the green witch" (text translation); V2 V4 V6 and "Maui People Dance" (visterms and annotation words).
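The two conditional probabilities worked out from the co-occurrence table can be reproduced with a few lines of Python. This is only a minimal sketch under stated assumptions: the counts are copied from the table in this summary, while the function names (`p_word_given_visterm`, `p_visterm_given_word`, `annotate`) and the zero-based indexing are illustrative choices, not part of Mori et al.'s method or any library.

```python
# Co-occurrence counts copied from the table in the summary:
# rows are visual terms v1..v4, columns are words w1..w4.
COUNTS = [
    [12, 2, 0, 1],      # v1
    [32, 40, 13, 32],   # v2
    [13, 12, 0, 0],     # v3
    [65, 43, 12, 0],    # v4
]


def p_word_given_visterm(counts, w, v):
    """P(w | v): the count normalized over the row of visual term v."""
    row_total = sum(counts[v])
    return counts[v][w] / row_total


def p_visterm_given_word(counts, v, w):
    """P(v | w): the count normalized over the column of word w."""
    col_total = sum(row[w] for row in counts)
    return counts[v][w] / col_total


def annotate(counts, v, top_k=2):
    """Hypothetical helper: rank word indices by P(w | v) for one visual term."""
    n_words = len(counts[v])
    scores = [p_word_given_visterm(counts, w, v) for w in range(n_words)]
    ranked = sorted(range(n_words), key=lambda w: scores[w], reverse=True)
    return ranked[:top_k]


if __name__ == "__main__":
    # Indices are zero-based: w=0 is w1, v=0 is v1, and so on.
    print(p_word_given_visterm(COUNTS, w=0, v=0))   # 12 / 15
    print(p_visterm_given_word(COUNTS, v=2, w=1))   # 12 / 97
    print(annotate(COUNTS, v=0))
```

Because each visual term is scored on its own, the ranking uses no information from neighbouring regions, which matches the summary's remark that context is ignored.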