
PVT-MA: pyramid vision transformers with multi-attention fusion mechanism for polyp segmentation

Literature Details

Resource type:
WOS category:

Indexed in: SCIE

Institutions: [1]Hebei GEO Univ, Sch Informat Engn, Shijiazhuang 050031, Hebei, Peoples R China [2]Hebei GEO Univ, New Retail Joint Res Inst, Shijiazhuang 050031, Hebei, Peoples R China [3]Hebei Med Univ, Canc Res Inst, Hosp 4, Shijiazhuang 050000, Hebei, Peoples R China [4]Hebei Lanhui Technol Co Ltd, Shijiazhuang 050031, Hebei, Peoples R China
Source:
ISSN:

Keywords: Polyp segmentation; Multi-scale feature fusion; Attention mechanism; Pyramid vision transformer

Abstract:
Early diagnosis and prevention of colorectal cancer rely on colonoscopic polyp examination. Accurate automated polyp segmentation technology can assist clinicians in precisely identifying polyp regions, thereby conserving medical resources. Although deep learning-based image processing methods have shown immense potential in the field of automatic polyp segmentation, current automatic segmentation methods for colorectal polyps are still limited by factors such as the complex and variable intestinal environment and detection-equipment issues such as glare and motion blur. These limitations make it difficult to accurately distinguish polyps from the surrounding mucosal tissue and to reliably identify tiny polyps. To address these challenges, we designed a multi-attention-based model, PVT-MA. Specifically, we developed the Cascading Attention Fusion (CAF) module to accurately identify and locate polyps, reducing false positives caused by environmental factors and glare. Additionally, we introduced the Series Channels Coordinate Attention (SCC) module to maximize the capture of polyp edge information. Furthermore, we incorporated the Receptive Field Block (RFB) module to enhance polyp features and filter image noise. We conducted quantitative and qualitative evaluations using six metrics across four challenging datasets. Our PVT-MA model achieved top performance on three datasets and ranked second on the remaining one. The model has only 26.39M parameters, a computational cost of 10.33 GFLOPs, and runs inference at 47.6 frames per second (FPS).
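The abstract names three attention-style modules (CAF, SCC, RFB) without further detail. As a purely illustrative aid, the sketch below shows a generic coordinate-attention block in PyTorch, in the spirit of the coordinate attention that the SCC module's name alludes to. The class name CoordinateAttention, the reduction ratio, and all layer choices are assumptions for illustration only and do not reproduce the authors' SCC, CAF, or RFB implementations.

```python
# Minimal sketch of a generic coordinate-attention block (after Hou et al., 2021).
# NOT the paper's SCC module; layer sizes and names are illustrative assumptions.
import torch
import torch.nn as nn


class CoordinateAttention(nn.Module):
    def __init__(self, channels: int, reduction: int = 32):
        super().__init__()
        mid = max(8, channels // reduction)
        self.pool_h = nn.AdaptiveAvgPool2d((None, 1))  # pool along width  -> (N, C, H, 1)
        self.pool_w = nn.AdaptiveAvgPool2d((1, None))  # pool along height -> (N, C, 1, W)
        self.conv1 = nn.Conv2d(channels, mid, kernel_size=1)
        self.bn = nn.BatchNorm2d(mid)
        self.act = nn.ReLU(inplace=True)
        self.conv_h = nn.Conv2d(mid, channels, kernel_size=1)
        self.conv_w = nn.Conv2d(mid, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        n, c, h, w = x.shape
        x_h = self.pool_h(x)                       # (N, C, H, 1)
        x_w = self.pool_w(x).permute(0, 1, 3, 2)   # (N, C, W, 1)
        # Shared transform over the concatenated directional descriptors.
        y = self.act(self.bn(self.conv1(torch.cat([x_h, x_w], dim=2))))
        y_h, y_w = torch.split(y, [h, w], dim=2)
        a_h = torch.sigmoid(self.conv_h(y_h))                       # (N, C, H, 1)
        a_w = torch.sigmoid(self.conv_w(y_w.permute(0, 1, 3, 2)))   # (N, C, 1, W)
        # Reweight the input with the two directional attention maps.
        return x * a_h * a_w


if __name__ == "__main__":
    feat = torch.randn(1, 64, 44, 44)            # e.g. a backbone feature map
    print(CoordinateAttention(64)(feat).shape)   # torch.Size([1, 64, 44, 44])
```

Pooling separately along the height and width axes preserves positional information in each direction, which is why attention of this kind is commonly used to sharpen object boundaries, consistent with the abstract's stated goal of capturing polyp edge information.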

Funding:
Language:
Times cited:
WOS:
CAS (Chinese Academy of Sciences) partition:
Year-of-publication [2025] edition:
Major category | Zone 3: Computer Science
Subcategory | Zone 4: Computer Science, Artificial Intelligence
Latest [2025] edition:
Major category | Zone 3: Computer Science
Subcategory | Zone 4: Computer Science, Artificial Intelligence
JCR quartile:
Year-of-publication [2024] edition:
Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Latest [2024] edition:
Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Impact factor: Latest [2024 edition] | Latest five-year average | Year of publication [2025 edition] | Five-year average at publication | Year before publication [2024 edition]

First author:
First author's institution: [1]Hebei GEO Univ, Sch Informat Engn, Shijiazhuang 050031, Hebei, Peoples R China [2]Hebei GEO Univ, New Retail Joint Res Inst, Shijiazhuang 050031, Hebei, Peoples R China
Co-first authors:
Corresponding author:
Corresponding author's institution: [1]Hebei GEO Univ, Sch Informat Engn, Shijiazhuang 050031, Hebei, Peoples R China [2]Hebei GEO Univ, New Retail Joint Res Inst, Shijiazhuang 050031, Hebei, Peoples R China
Recommended citation (GB/T 7714):
APA:
MLA:

