Integrating YOLOv11 and DualUNet for Precise Bridge Crack Detection and Segmentation

Zhou Wang, Yuan Li, Huailiang Cheng, Jie Zhou, Jun Song

Abstract

Concrete bridge crack detection plays a critical role in intelligent infrastructure inspection and structural safety assessment. However, existing automated approaches still face significant challenges, including a high miss rate for fine cracks, strong interference from complex backgrounds, and insufficiently accurate boundary delineation.
To address these limitations, this study proposes a novel two-stage collaborative detection–segmentation framework that integrates an enhanced YOLOv11 detector with a dual-branch DualUNet architecture. The framework follows a “coarse localization–guided fine segmentation” strategy. In the detection stage, YOLOv11 is improved by incorporating a P2 high-resolution feature layer and a C2PSA attention module, enabling accurate localization of fine cracks while effectively suppressing background noise and generating reliable region-of-interest (RoI) priors.
In the segmentation stage, a dual-branch DualUNet with a shared ResNet34 encoder is developed. The global branch captures contextual semantic information from the full image, whereas the RoI branch refines local crack features by integrating detection priors with a Multi-Scale Squeeze-and-Excitation (MSSE) module. This design enhances the representation of fine structural details and mitigates the limitations of single-stage segmentation methods.
To address the severe class imbalance caused by sparse crack pixels, a hybrid loss function combining weighted binary cross-entropy and Dice loss is adopted. Additionally, a boundary supervision mechanism is introduced to improve contour accuracy.
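As a rough illustration of such a hybrid objective, the sketch below combines a positively weighted binary cross-entropy with a soft Dice term over flattened pixel probabilities. The function names, the `pos_weight` and `alpha` values, and the smoothing constant are illustrative assumptions, not values reported in this study, and the boundary supervision term is omitted.

```python
import math


def weighted_bce(probs, targets, pos_weight=5.0, eps=1e-7):
    """Mean binary cross-entropy with crack (positive) pixels up-weighted.

    pos_weight is a hypothetical value chosen only for illustration.
    """
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(pos_weight * t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(probs)


def dice_loss(probs, targets, smooth=1.0):
    """Soft Dice loss: 1 - (2|P.G| + s) / (|P| + |G| + s)."""
    intersection = sum(p * t for p, t in zip(probs, targets))
    denominator = sum(probs) + sum(targets)
    return 1.0 - (2.0 * intersection + smooth) / (denominator + smooth)


def hybrid_loss(probs, targets, pos_weight=5.0, alpha=0.5):
    """Convex combination of weighted BCE and Dice (alpha is an assumption)."""
    bce = weighted_bce(probs, targets, pos_weight)
    dice = dice_loss(probs, targets)
    return alpha * bce + (1.0 - alpha) * dice


# Example: one crack pixel among four, predicted with moderate confidence.
example = hybrid_loss([0.1, 0.7, 0.2, 0.1], [0, 1, 0, 0])
```

With `pos_weight` greater than 1, a missed crack pixel is penalized more heavily than a false alarm of equal confidence, which is the usual motivation for weighting when crack pixels are sparse. The Dice term depends on overlap rather than on per-pixel counts, so it is less sensitive to the large number of background pixels.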
Experimental results on the Crack500 dataset show that the proposed framework achieves a recall of 0.79 and a precision of 0.75. Compared with baseline models, the proposed method improves Boundary IoU by 26%, indicating substantially better edge delineation. Visual results further confirm that the method suppresses background interference while preserving crack continuity, making it suitable for practical bridge inspection applications.
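For readers unfamiliar with these metrics, the following sketch shows one way pixel-level precision, recall, and a simplified Boundary IoU could be computed on binary masks. It assumes masks are nested lists of 0/1 values, uses a 4-neighbour erosion with background padding, and takes a boundary width `d` of one pixel; these choices and all function names are assumptions for illustration and may differ from the evaluation protocol used in this study.

```python
def precision_recall(pred, gt):
    """Pixel-level precision and recall for two binary masks of equal size."""
    tp = fp = fn = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            if p and g:
                tp += 1
            elif p and not g:
                fp += 1
            elif g and not p:
                fn += 1
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def erode(mask):
    """One step of 4-neighbour erosion; pixels outside the image count as background."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if not mask[y][x]:
                continue
            keep = True
            for dy, dx in ((0, 1), (0, -1), (1, 0), (-1, 0)):
                ny, nx = y + dy, x + dx
                if not (0 <= ny < h and 0 <= nx < w) or not mask[ny][nx]:
                    keep = False
                    break
            out[y][x] = 1 if keep else 0
    return out


def boundary(mask, d=1):
    """Mask pixels that are removed by d erosion steps (the inner boundary band)."""
    eroded = mask
    for _ in range(d):
        eroded = erode(eroded)
    return [[1 if m and not e else 0 for m, e in zip(mr, er)]
            for mr, er in zip(mask, eroded)]


def boundary_iou(pred, gt, d=1):
    """IoU restricted to the boundary bands of the two masks."""
    pb, gb = boundary(pred, d), boundary(gt, d)
    inter = sum(1 for pr, gr in zip(pb, gb) for p, g in zip(pr, gr) if p and g)
    union = sum(1 for pr, gr in zip(pb, gb) for p, g in zip(pr, gr) if p or g)
    return inter / union if union else 1.0


# Example masks: a 3x3 ground-truth blob and a prediction covering only its centre.
gt_mask = [[1 if 1 <= y <= 3 and 1 <= x <= 3 else 0 for x in range(5)] for y in range(5)]
pred_mask = [[1 if (y, x) == (2, 2) else 0 for x in range(5)] for y in range(5)]
```

In this example the prediction overlaps the ground truth but touches none of its boundary band, so its Boundary IoU is zero even though its precision is perfect. This is why a boundary-restricted score is more sensitive to contour errors than region overlap alone.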

 

Keywords: Concrete crack detection; Bridge inspection; YOLOv11; DualUNet; Attention mechanism; Boundary segmentation

 

DOI: https://doi.org/10.55463/issn.1674-2974.53.4.2




