|本期目录/Table of Contents|

[1]潘文丽,徐国庆*,王小甜,等.改进YOLOv11n的复杂水下场景目标检测模型 [J].武汉工程大学学报,2026,48(01):82-89.[doi:10.19843/j.cnki.CN42-1779/TQ.202410020]
 PAN Wenli,XU Guoqing*,WANG Xiaotian,et al. An enhanced aquatic detection-YOLOv11n model for target detection in complex underwater scenes [J].Journal of Wuhan Institute of Technology,2026,48(01):82-89.[doi:10.19843/j.cnki.CN42-1779/TQ.202410020]

改进YOLOv11n的复杂水下场景目标检测模型

《武汉工程大学学报》[ISSN:1674-2869/CN:42-1779/TQ]

Volume:
48
Issue:
No. 01, 2026
Pages:
82-89
Section:
Intelligent Manufacturing
Publication date:
2026-02-28

文章信息/Info

Title:
An enhanced aquatic detection-YOLOv11n model for target detection in complex underwater scenes

Article number:
1674-2869(2026)01-0082-08
作者:
潘文丽,徐国庆*,王小甜,乐文波
武汉工程大学计算机科学与工程学院、人工智能学院,湖北 武汉 430205
Author(s):
PAN Wenli, XU Guoqing*, WANG Xiaotian, YUE Wenbo
School of Computer Science and Engineering, School of Artificial Intelligence, Wuhan Institute of Technology, Wuhan 430205, China

关键词:
水下目标检测;YOLOv11n;特征融合;注意力机制
Keywords:
underwater target detection; YOLOv11n; feature fusion; attention mechanism
CLC number:
TP391.4
DOI:
10.19843/j.cnki.CN42-1779/TQ.202410020
Document code:
A
摘要:
为提升海洋探测开发中水下目标检测的准确率,针对复杂水下环境中目标轮廓模糊、生物聚集遮挡及多尺度目标检测精度不足等问题,提出了一种改进YOLOv11n的复杂水下场景目标检测模型。首先,在基线模型YOLOv11n的骨干网络中引入轻量化混合架构EfficientFormerV2,通过Transformer的多阶段协同设计,有效捕获局部细节与全局语义信息,缓解低分辨率及特征损失对检测性能的影响;其次,设计自适应上下文融合模块,采用局部和全局双分支结构并结合可学习权重机制,实现跨层特征的高效对齐与融合,增强模型在复杂水下场景下的目标分辨能力;最后,引入动态检测头,通过尺度感知、空间聚焦和任务解耦的联合优化,提升对多尺度水下生物目标的检测精度。实验结果表明,在RUOD数据集和DUO数据集上,改进后的模型相较于YOLOv11n,交并比阈值为50%时的平均精度均值分别提升了1.3%和1.8%,模型参数量仅增加3.4×10⁶,整体性能超越现有主流方法。该模型有效克服了水下特征退化及多尺度检测精度不足的问题,提升了复杂水下环境下的目标识别精度,为海洋资源勘探、沉船定位和生态监测提供高效解决方案。
Abstract:
To address challenges such as blurred target contours, occlusion by biological aggregations, and insufficient multi-scale target detection accuracy in complex underwater environments, this paper proposed an enhanced aquatic detection-YOLOv11n (EAD-YOLOv11n) model for target detection in complex underwater scenes. First, a lightweight hybrid architecture, EfficientFormerV2, was integrated into the backbone network of the baseline YOLOv11n model. Leveraging the multi-stage collaborative design of the Transformer, it effectively captured both local details and global semantic information, mitigating the impact of low resolution and feature loss on detection performance. Second, an adaptive context fusion module (ACFM) was designed. Employing a dual-branch structure for local and global features combined with a learnable weight mechanism, it achieved efficient cross-layer feature alignment and fusion, enhancing the model's capability to distinguish targets in complex underwater scenes. Finally, a dynamic head (DyHead) was introduced. Through joint optimization incorporating scale awareness, spatial focusing, and task decoupling, it improved the model's accuracy for detecting multi-scale underwater biological targets. Experimental results demonstrated that, compared to the baseline YOLOv11n, the enhanced model achieved improvements of 1.3% and 1.8% in mean average precision at a 50% intersection over union threshold (mAP@50) on the RUOD and DUO datasets, respectively, with only a 3.4×10⁶ increase in parameters, and outperformed existing mainstream methods. The proposed model effectively overcomes underwater feature degradation and insufficient multi-scale detection accuracy, enhancing target recognition performance in complex underwater environments and providing an efficient solution for marine resource exploration, shipwreck localization, and ecological monitoring.
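The dual-branch fusion described in the abstract can be sketched in a few lines. The following is a minimal NumPy illustration of the general idea only — a local branch and a global branch combined by learnable, softmax-normalized weights — not the paper's actual ACFM; the 3×3 mean filter standing in for a learned convolution and the scalar weight parameterization are assumptions.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def adaptive_context_fusion(feat, w):
    """Fuse local and global context with learnable weights.

    feat: (C, H, W) feature map; w: two learnable fusion logits.
    Local branch: 3x3 mean filter (stand-in for a learned conv).
    Global branch: per-channel global average pooling, broadcast back.
    """
    c, h, wd = feat.shape
    # local branch: 3x3 neighborhood average with edge padding
    p = np.pad(feat, ((0, 0), (1, 1), (1, 1)), mode="edge")
    local = sum(p[:, i:i + h, j:j + wd] for i in range(3) for j in range(3)) / 9.0
    # global branch: channel-wise mean broadcast to the spatial size
    global_ctx = feat.mean(axis=(1, 2), keepdims=True) * np.ones((1, h, wd))
    a = softmax(w)  # normalized fusion weights, learned during training
    return a[0] * local + a[1] * global_ctx

feat = np.random.rand(4, 8, 8).astype(np.float32)
out = adaptive_context_fusion(feat, np.array([0.0, 0.0]))
```

With equal logits the two branches contribute equally; during training the logits would shift to favor whichever context is more informative for the scene.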

参考文献/References:

[ 1 ] 王德兴, 何勇, 袁红春. 基于YOLOv8-BAN模型的水下生物目标检测方法[J]. 江苏农业学报, 2025, 41(1): 101-111.
[ 2 ] 孙艺倩. 基于深度学习的水下垃圾检测方法研究[D]. 吉林:东北电力大学, 2023.
[ 3 ] 张有波. 基于视觉的水下考古机器人实时目标检测与识别[D]. 上海:上海海洋大学, 2021.
[ 4 ] XU S B, ZHANG M H, SONG W, et al. A systematic review and analysis of deep learning-based underwater object detection[J]. Neurocomputing, 2023, 527: 204-232.
[ 5 ] 黄瑜豪, 曾祥进, 冯崧. 面向边缘设备的轻量级OpenPose姿态检测模型研究[J]. 武汉工程大学学报, 2024, 46(4): 424-430.
[ 6 ] REN S Q, HE K M, GIRSHICK R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149.
[ 7 ] HE K M, GKIOXARI G, DOLLÁR P, et al. Mask R-CNN[C]//2017 IEEE International Conference on Computer Vision (ICCV). Piscataway, NJ: IEEE, 2017: 2980-2988.
[ 8 ] REDMON J, DIVVALA S, GIRSHICK R, et al. You only look once: unified, real-time object detection[C]//2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway, NJ: IEEE, 2016: 779-788.
[ 9 ] LIU W, ANGUELOV D, ERHAN D, et al. SSD: single shot multibox detector[C]//Computer Vision-ECCV 2016. Berlin: Springer, 2016: 21-37.
[10] 陈宇梁, 董绍江, 孙世政, 等. 改进YOLOv5s的弱光水下生物目标检测算法[J]. 北京航空航天大学学报, 2024, 50(2): 499-507.
[11] 辛世澳, 葛海波, 袁昊, 等. 改进YOLOv7的轻量化水下目标检测算法[J]. 计算机工程与应用, 2024, 60(3): 88-99.
[12] 李培坤, 李锋, 葛忠显, 等. 基于改进YOLOv8n的水下目标检测算法[J]. 电子测量技术, 2025, 48(3): 172-179.
[13] LI Y Y, HU J, WEN Y, et al. Rethinking vision Transformers for MobileNet size and speed[C]//Proceedings of the IEEE International Conference on Computer Vision. Piscataway, NJ: IEEE, 2023: 16843-16854.
[14] CHEN Z X, HE Z W, LU Z M. DEA-Net: single image dehazing based on detail-enhanced convolution and content-guided attention[J]. IEEE Transactions on Image Processing, 2024, 33: 1002-1015.
[15] HU S, GAO F, ZHOU X W, et al. Hybrid convolutional and attention network for hyperspectral image denoising[J]. IEEE Geoscience and Remote Sensing Letters, 2024, 21: 5504005.
[16] DAI X Y, CHEN Y P, XIAO B, et al. Dynamic head: unifying object detection heads with attentions[C]//Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Piscataway, NJ: IEEE, 2021: 7369-7378.
[17] FU C P, LIU R S, FAN X, et al. Rethinking general underwater object detection: datasets, challenges, and solutions[J]. Neurocomputing, 2023, 517: 243-256.
[18] LIU C W, LI H J, WANG S C, et al. A dataset and benchmark of underwater object detection for robot picking[C]//2021 IEEE International Conference on Multimedia & Expo Workshops (ICMEW). Piscataway, NJ: IEEE, 2021: 1-6.

备注/Memo:
Received: 2024-10-28
Foundation item: Open Fund of the State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University (21E01)
Biography: PAN Wenli, master's degree candidate. Email: 3335676790@qq.com
*Corresponding author: XU Guoqing, Ph.D., associate professor. Email: 124148659@qq.com

更新日期/Last Update: 2026-03-10