Unsupervised video co-segmentation based on superpixel co-saliency and region merging

Multimedia Tools and Applications

Abstract

Fully unsupervised video object segmentation remains a challenge in computer vision, and segmenting a common object from a set of clips is more difficult still. In this paper, we propose an unsupervised, on-line method that efficiently segments common objects from a set of video clips. Our approach rests on the hypothesis that common or similar objects in multiple video clips are salient and share similar features. First, we find the regions in every clip that are salient and share similar features, using a new co-saliency scheme based on superpixels. Then, the most salient superpixels are chosen as the initial object marker superpixels. Starting from these markers, we merge neighboring, similar regions and segment out the final object parts. Experimental results demonstrate that the proposed method efficiently segments the common objects from a group of video clips, with a generally lower error rate than some state-of-the-art video co-segmentation methods.
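
The three stages named in the abstract (superpixel co-saliency, marker selection, region merging) can be illustrated with a minimal Python sketch. It is a toy under stated assumptions, not the method of the paper: the abstract gives no formulas, so the co-saliency score (intra-clip contrast times inter-clip similarity), the merging threshold, the mean-color features, and all function names below are hypothetical choices made only for illustration.

```python
from math import dist


def co_saliency(features, other_clip_features):
    """Toy co-saliency per superpixel (assumed formula, not the paper's)."""
    scores = []
    for f in features:
        # Intra-clip contrast: mean distance to all superpixels of the same frame.
        contrast = sum(dist(f, g) for g in features) / len(features)
        # Inter-clip similarity: closeness to the nearest superpixel in another clip.
        similarity = 1.0 / (1.0 + min(dist(f, g) for g in other_clip_features))
        scores.append(contrast * similarity)
    return scores


def pick_markers(scores, k):
    """Return the indices of the k highest-scoring superpixels."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(order[:k])


def merge_regions(features, adjacency, markers, threshold):
    """Grow the object region from the markers.

    An adjacent superpixel is absorbed when its feature lies within
    `threshold` of a superpixel that is already part of the region.
    """
    region = set(markers)
    frontier = list(markers)
    while frontier:
        i = frontier.pop()
        for j in adjacency[i]:
            if j not in region and dist(features[i], features[j]) < threshold:
                region.add(j)
                frontier.append(j)
    return region


# Hypothetical data: mean RGB color of five superpixels lying in a chain 0-1-2-3-4.
# Superpixels 2 and 3 are a red object on a gray background.
clip_a = [(0.50, 0.50, 0.50), (0.52, 0.50, 0.50), (0.90, 0.10, 0.10),
          (0.85, 0.12, 0.10), (0.50, 0.50, 0.52)]
adjacency_a = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
# A second clip shows a similar red object on a green background.
clip_b = [(0.88, 0.10, 0.10), (0.10, 0.60, 0.10), (0.12, 0.60, 0.10)]

scores = co_saliency(clip_a, clip_b)
markers = pick_markers(scores, k=1)
region = merge_regions(clip_a, adjacency_a, markers, threshold=0.2)
print(sorted(markers), sorted(region))
```

In this toy data the red superpixels score highest because they both stand out from their own background and have a close match in the other clip, which is the hypothesis the abstract states. A real pipeline would start from an actual superpixel segmentation and richer features than mean color.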

Author information

Corresponding author

Correspondence to Chi-Man Pun.

About this article

Cite this article

Huang, G., Pun, CM. & Lin, C. Unsupervised video co-segmentation based on superpixel co-saliency and region merging. Multimed Tools Appl 76, 12941–12964 (2017). https://doi.org/10.1007/s11042-016-3709-3

  • DOI: https://doi.org/10.1007/s11042-016-3709-3
