09 July 2025 : Clinical Research
Performance of AI Chatbots in Preliminary Diagnosis of Maxillofacial Pathologies
Ridvan Guler
DOI: 10.12659/MSM.949076
Med Sci Monit 2025; 31:e949076
Table 2 True-false distributions of chatbots’ answers to case questions.
| Chatbots | Correct n (%) | Incorrect n (%) | Total n (%) | P | Effect size |
|---|---|---|---|---|---|
| ChatGPT 4.0 | 15 (65.2) | 8 (34.8) | 23 (100) | 0.125ᵃ | 0.250 |
| Grok | 12 (52.1) | 11 (47.9) | 23 (100) | | |
| Blackbox AI | 12 (52.1) | 11 (47.9) | 23 (100) | | |
| Claude AI | 7 (30.4) | 16 (69.6) | 23 (100) | | |

Data are given as n (%). ᵃ Chi-square test.
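
The P value and effect size in Table 2 can be checked from the counts alone. The following is a minimal sketch, assuming a Pearson chi-square test on the 4 × 2 table of chatbots by correct/incorrect answers, and assuming the reported effect size is Cramér's V (the table does not name the measure). The function names are illustrative and are not taken from the article.

```python
import math

# Correct / incorrect counts per chatbot, copied from Table 2.
counts = {
    "ChatGPT 4.0": (15, 8),
    "Grok": (12, 11),
    "Blackbox AI": (12, 11),
    "Claude AI": (7, 16),
}


def chi_square_statistic(table):
    """Pearson chi-square statistic for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for row, row_total in zip(table, row_totals):
        for observed, col_total in zip(row, col_totals):
            expected = row_total * col_total / grand_total
            stat += (observed - expected) ** 2 / expected
    return stat


def chi_square_sf_df3(x):
    """Upper-tail probability of a chi-square variable with exactly 3 degrees
    of freedom, using the closed form available for that case."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


table = list(counts.values())
n = sum(sum(row) for row in table)            # 92 answers in total
chi2 = chi_square_statistic(table)
df = (len(table) - 1) * (len(table[0]) - 1)   # (4 - 1) * (2 - 1) = 3
p_value = chi_square_sf_df3(chi2)             # valid only because df == 3

# Assumption: the "Effect size" column is Cramér's V.
cramers_v = math.sqrt(chi2 / (n * min(len(table) - 1, len(table[0]) - 1)))

print(f"chi2 = {chi2:.3f}, df = {df}, p = {p_value:.3f}, V = {cramers_v:.3f}")
```

Under these assumptions the sketch gives a P value of about 0.125 and a Cramér's V of about 0.250, which agrees with the values reported in the table and suggests that both refer to the overall comparison across the four chatbots rather than to ChatGPT 4.0 alone.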






