Article Overview - 191
Qwen3-VL

11
VLForgery Face Triad: Detection, Localization and Attribution via Multimodal Large Language Models

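An arXiv paper that uses multimodal large language models (MLLMs) for face forgery detection, localization, and attribution. Its core contributions are a dataset of partially synthesized faces, an MLLM-based model fine-tuned with LoRA, and an MLLM-based forgery chain of thought, EKCot. Assessment: it uses low-level vision models to score each image along several aspects, treats those scores as low-level visual cues, and concatenates them with a carefully designed prompt to fine-tune the MLLM. - Face forgery detection - MLLM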

12
Cross-Image Pixel Contrasting for Semantic Segmentation

13
An Unmanned Aerial Vehicle Swarm System for Tunnel Inspection Problems

14
Multimodal Multitask Collaborative Revision Network for Trusted Road Segmentation

15
Segment all roads: Domain generalized freespace detection by robust surface normal information embedding and edge-aware learning

16
Text-Driven Traffic Anomaly Detection With Temporal High-Frequency Modeling in Driving Videos

17
An Interaction-Scene Collaborative Representation Framework for Detecting Traffic Anomalies in Driving Videos

18
A memory-augmented multi-task collaborative framework for unsupervised traffic anomaly detection in driving videos

19
FlyLoRA

20