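Title:
OpenAI o1 Model Reaches 67% Accuracy in Emergency Diagnosis, Outperforming Human Physicians
Summary:
A research team from Harvard Medical School and Beth Israel Deaconess Medical Center published a study in Science evaluating the performance of OpenAI’s o1 and 4o models in emergency diagnosis. Based on 76 real emergency cases, the study compared the diagnostic accuracy of the AI models with that of human attending physicians.
The o1 model reached 67% accuracy at the initial triage stage, higher than the 55% and 50% achieved by two human physicians. The AI performed especially well where information was limited and decisions were urgent. The study did not preprocess the electronic medical record data, keeping test conditions consistent with clinical practice.
The study indicates that large language models have potential advantages in medical diagnosis, particularly in resource-constrained settings or when information is incomplete. In the future, they may help physicians improve the efficiency and accuracy of emergency care.
o1 model’s emergency diagnostic accuracy exceeds that of human physicians
AI performs better in limited-information scenarios
Study based on real, unprocessed electronic medical record data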
OpenAI o1 Model Outperforms ER Doctors with 67% Diagnostic Accuracy at Initial Triage
A Harvard Medical School and Beth Israel Deaconess Medical Center study published in Science evaluated OpenAI’s o1 and 4o models against two attending physicians in diagnosing 76 emergency room patients. The o1 model achieved 67% diagnostic accuracy during initial triage, surpassing one physician at 55% and another at 50%. Diagnoses were assessed blindly by two independent physicians.
The models received unprocessed electronic medical record data identical to what physicians accessed, ensuring a fair comparison. o1 demonstrated superior performance, particularly during early diagnostic stages with limited patient information and high decision urgency. The 4o model performed comparably to physicians but did not exceed them. Researchers emphasized the models’ ability to operate without data preprocessing or human augmentation.
This study highlights the growing potential of large language models in clinical decision support, especially in time-sensitive environments. While not replacing physicians, AI systems like o1 could enhance diagnostic accuracy during critical early assessments. The findings suggest a shift toward integrating validated AI tools into emergency medicine workflows.
Key Takeaways:
OpenAI o1 model achieves 67% diagnostic accuracy in ER triage, exceeding physician performance
AI models evaluated on raw medical data without preprocessing, ensuring real-world validity
o1 outperforms both physicians and 4o model, especially in low-information, high-urgency scenarios
Source: Original Article