Clothes Grasping and Unfolding Based on RGB-D Semantic Segmentation

Xingyu Zhu; Xin Wang; Jonathan Freer; Hyung Jin Chang; Yixing Gao

doi:10.1109/ICRA48891.2023.10160268

Corresponding author: Yixing Gao

Computer Science

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Clothes grasping and unfolding is a core step in robot-assisted dressing. Most existing works use depth images of clothes to train a deep learning model that recognizes suitable grasping points. These methods often rely on physics engines to synthesize depth images, which reduces the cost of collecting real labeled data. However, the domain gap between synthetic and real images often leads to poor performance on real data. Furthermore, these approaches often struggle when grasping points are occluded by the clothing item itself. To address these challenges, we propose a novel Bi-directional Fractal Cross Fusion Network (BiFCNet) for semantic segmentation, which recognizes graspable regions and thereby provides more possibilities for grasping. Instead of using depth images only, we also feed RGB images with rich color features to our network, in which the Fractal Cross Fusion (FCF) module fuses RGB and depth data by considering global complex features based on fractal geometry. To reduce the cost of real data collection, we further propose a data augmentation method based on an adversarial strategy, in which color and geometric transformations process RGB and depth data simultaneously while maintaining the label correspondence. Finally, we present a pipeline for clothes grasping and unfolding from the perspective of semantic segmentation, adding a strategy that selects grasp points from segmentation regions based on clothing flatness measures while taking the grasping direction into account. We evaluate BiFCNet on the public NYUDv2 dataset and obtain performance comparable to current state-of-the-art models. We also deploy our model on a Baxter robot and run extensive grasping and unfolding experiments as part of our ablation studies, achieving an 84% success rate.
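
The augmentation idea in the abstract, transforming RGB and depth together while keeping the labels aligned, can be illustrated with a minimal sketch. This is not the authors' method: the adversarial strategy that chooses the transformations is not shown, and the function name `augment_rgbd`, the 80% crop size, and the jitter ranges are all hypothetical. The sketch also assumes that color jitter applies to the RGB image only, which the abstract does not specify.

```python
import numpy as np


def augment_rgbd(rgb, depth, label, rng):
    """Apply one random geometric and one random color transform to an RGB-D sample.

    Hypothetical helper for illustration only.
    rgb:   (H, W, 3) float array in [0, 1]
    depth: (H, W) float array
    label: (H, W) integer class map
    """
    h, w = label.shape

    # Geometric part: the same flip and crop are applied to all three arrays,
    # so every pixel keeps its depth value and its class label.
    if rng.random() < 0.5:
        rgb, depth, label = rgb[:, ::-1], depth[:, ::-1], label[:, ::-1]
    ch, cw = (4 * h) // 5, (4 * w) // 5  # assumed crop size: 80% per side
    top = int(rng.integers(0, h - ch + 1))
    left = int(rng.integers(0, w - cw + 1))
    window = (slice(top, top + ch), slice(left, left + cw))
    rgb, depth, label = rgb[window], depth[window], label[window]

    # Color part (assumption): brightness and contrast jitter change only
    # the RGB image; depth and labels are left as they are.
    gain = rng.uniform(0.8, 1.2)
    bias = rng.uniform(-0.1, 0.1)
    rgb = np.clip(gain * rgb + bias, 0.0, 1.0)

    return rgb, depth.copy(), label.copy()
```

Calling `augment_rgbd(rgb, depth, label, np.random.default_rng(0))` returns three arrays with the same spatial size. Because the flip and crop use one shared set of indices, a pixel's label and depth stay attached to it after augmentation.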

Original language	English
Title of host publication	2023 IEEE International Conference on Robotics and Automation (ICRA)
Publisher	IEEE
Pages	9471-9477
Number of pages	7
ISBN (Electronic)	9798350323658
ISBN (Print)	9798350323665 (PoD)
DOIs	https://doi.org/10.1109/ICRA48891.2023.10160268
Publication status	Published - 4 Jul 2023
Event	2023 IEEE International Conference on Robotics and Automation: Embracing the future: making robots for humans, ExCel London, London, United Kingdom. Duration: 29 May 2023 → 2 Jun 2023. https://www.icra2023.org/welcome

Publication series

Name	IEEE International Conference on Robotics and Automation
ISSN (Print)	1049-3492
ISSN (Electronic)	2577-087X

Conference

Conference	2023 IEEE International Conference on Robotics and Automation
Abbreviated title	ICRA
Country/Territory	United Kingdom
City	London
Period	29/05/23 → 2/06/23
Internet address	https://www.icra2023.org/welcome

Keywords

Geometry
Costs
Image color analysis
Semantic segmentation
Clothing
Pipelines
Grasping

Access to Document

10.1109/ICRA48891.2023.10160268

ZhuX2023Clothes
X. Zhu, X. Wang, J. Freer, H. J. Chang and Y. Gao, "Clothes Grasping and Unfolding Based on RGB-D Semantic Segmentation," 2023 IEEE International Conference on Robotics and Automation (ICRA), London, United Kingdom, 2023, pp. 9471-9477, doi: 10.1109/ICRA48891.2023.10160268. © 2023 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Accepted author manuscript, 5.07 MB. Licence: Other (please specify with Rights Statement)

Cite this

@inproceedings{ccf178d88dd5444695665991ec20ca58,

title = "Clothes Grasping and Unfolding Based on RGB-D Semantic Segmentation",

keywords = "Geometry, Costs, Image color analysis, Semantic segmentation, Clothing, Pipelines, Grasping",

author = "Xingyu Zhu and Xin Wang and Jonathan Freer and Chang, {Hyung Jin} and Yixing Gao",

year = "2023",

month = jul,

day = "4",

doi = "10.1109/ICRA48891.2023.10160268",

language = "English",

isbn = "9798350323665 (PoD)",

series = "IEEE International Conference on Robotics and Automation",

publisher = "IEEE",

pages = "9471--9477",

booktitle = "2023 IEEE International Conference on Robotics and Automation (ICRA)",

note = "2023 IEEE International Conference on Robotics and Automation : Embracing the future: making robots for humans, ICRA ; Conference date: 29-05-2023 Through 02-06-2023",

url = "https://www.icra2023.org/welcome",

}

Zhu, X, Wang, X, Freer, J, Chang, HJ & Gao, Y 2023, Clothes Grasping and Unfolding Based on RGB-D Semantic Segmentation. in 2023 IEEE International Conference on Robotics and Automation (ICRA)., 10160268, IEEE International Conference on Robotics and Automation, IEEE, pp. 9471-9477, 2023 IEEE International Conference on Robotics and Automation, London, United Kingdom, 29/05/23. https://doi.org/10.1109/ICRA48891.2023.10160268

Clothes Grasping and Unfolding Based on RGB-D Semantic Segmentation. / Zhu, Xingyu; Wang, Xin; Freer, Jonathan et al.
2023 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2023. p. 9471-9477 10160268 (IEEE International Conference on Robotics and Automation).

TY - GEN

T1 - Clothes Grasping and Unfolding Based on RGB-D Semantic Segmentation

AU - Zhu, Xingyu

AU - Wang, Xin

AU - Freer, Jonathan

AU - Chang, Hyung Jin

AU - Gao, Yixing

PY - 2023/7/4

Y1 - 2023/7/4

KW - Geometry

KW - Costs

KW - Image color analysis

KW - Semantic segmentation

KW - Clothing

KW - Pipelines

KW - Grasping

U2 - 10.1109/ICRA48891.2023.10160268

DO - 10.1109/ICRA48891.2023.10160268

M3 - Conference contribution

SN - 9798350323665 (PoD)

T3 - IEEE International Conference on Robotics and Automation

SP - 9471

EP - 9477

BT - 2023 IEEE International Conference on Robotics and Automation (ICRA)

PB - IEEE

T2 - 2023 IEEE International Conference on Robotics and Automation

Y2 - 29 May 2023 through 2 June 2023

ER -
