Timeline | 喵

All posts - 191

Qwen3-VL

2025-12-11

VLForgery Face Triad: Detection, Localization and Attribution via Multimodal Large Language Models

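An arXiv paper that uses large language models for face forgery detection, localization, and attribution. Its core contributions are a partially synthesized face dataset, an MLLM-based model fine-tuned with LoRA, and an MLLM-based forgery chain of thought, EKCot. Assessment: it uses low-level vision models to score several aspects of the image as low-level visual cues, then concatenates these scores with a carefully designed prompt to fine-tune the MLLM. - face forgery detection - MLLM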

2025-12-09

Cross-Image Pixel Contrasting for Semantic Segmentation

2025-11-25

An Unmanned Aerial Vehicle Swarm System for Tunnel Inspection Problems

2025-11-23

Multimodal Multitask Collaborative Revision Network for Trusted Road Segmentation

2025-11-23

Segment all roads: Domain generalized freespace detection by robust surface normal information embedding and edge-aware learning

2025-11-23

Text-Driven Traffic Anomaly Detection With Temporal High-Frequency Modeling in Driving Videos

2025-11-23

An Interaction-Scene Collaborative Representation Framework for Detecting Traffic Anomalies in Driving Videos

2025-11-22

A memory-augmented multi-task collaborative framework for unsupervised traffic anomaly detection in driving videos

2025-11-22

FlyLoRA

2025-11-21