Image Annotation and Feature Extraction: Content Summary

Manmatha

Visual terms
- Or partition the image using a rectangular grid and cluster the regions. This actually works better.

Grid vs. Segmentation
- Segmentation vs. rectangular partition.
- Result: rectangular partition is better than segmentation!
- The model is learned over many images; segmentation is computed over one image.

Feature Extraction & Clustering
- Feature extraction: color, texture, shape.
- K-means clustering: generates a finite set of visual terms. Each cluster's centroid represents a visual term.

Co-occurrence Models
- Mori et al. 1999
- Create the co-occurrence table using a training set of annotated images.
- Tends to annotate with high-frequency words.
- Context is ignored.
- Needs joint probability models.

Co-occurrence counts (rows: visual terms, columns: words):

      w1   w2   w3   w4
v1    12    2    0    1
v2    32   40   13   32
v3    13   12    0    0
v4    65   43   12    0

P(w1 | v1) = 12 / (12 + 2 + 0 + 1) = 0.8
P(v3 | w2) = 12 / (2 + 40 + 12 + 43) ≈ 0.124

Correspondence: Translation Model (TM)
- Machine translation: Pr(f | e) = ∑_a Pr(f, a | e)
- Image annotation: Pr(w | v) = ∑_a Pr(w, a | v)
- In both cases the sum runs over the alignments a.

Translation Models
- Duygulu et al. 2002
- Use classical IBM machine translation models to translate visterms into words.
- IBM machine translation models need a bilingual corpus for training.

[Figure: alignment examples. Text: "Mary did not slap the green witch". Image: visterms V2, V4, V6 paired with the words "Maui", "People", "Dance".]
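The two conditional probabilities worked out from the co-occurrence table can be checked with a short script. This is only a minimal sketch: the counts are copied from the table above, while the function names (`p_word_given_visterm`, `p_visterm_given_word`, `annotate`) are hypothetical helpers introduced for illustration and are not taken from Mori et al.

```python
# Co-occurrence counts copied from the table: rows are visual terms, columns are words.
counts = {
    "v1": {"w1": 12, "w2": 2, "w3": 0, "w4": 1},
    "v2": {"w1": 32, "w2": 40, "w3": 13, "w4": 32},
    "v3": {"w1": 13, "w2": 12, "w3": 0, "w4": 0},
    "v4": {"w1": 65, "w2": 43, "w3": 12, "w4": 0},
}


def p_word_given_visterm(word, visterm):
    """P(word | visterm): normalize the count by its row total."""
    row = counts[visterm]
    return row[word] / sum(row.values())


def p_visterm_given_word(visterm, word):
    """P(visterm | word): normalize the count by its column total."""
    column_total = sum(row[word] for row in counts.values())
    return counts[visterm][word] / column_total


def annotate(visterm):
    """Hypothetical annotation rule: pick the word with the highest count in the row."""
    row = counts[visterm]
    return max(row, key=row.get)


if __name__ == "__main__":
    print(p_word_given_visterm("w1", "v1"))
    print(p_visterm_given_word("v3", "w2"))
    print({v: annotate(v) for v in counts})
```

With this simple pick-the-maximum rule, three of the four visual terms are labeled with w1, the most frequent word in the table, which illustrates the bias toward high-frequency words noted in the bullet list.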