AI Chatbots Miss More Than Half of Medical Diagnoses, Study Finds

Although chatbots and large language models can answer a slew of everyday questions, they shouldn’t be the first place you turn for medical advice, a new study published in the journal Nature Medicine shows.

In the study, 1,298 participants in the UK were asked to use a large language model, such as ChatGPT or Meta’s Llama 3, for medical advice. Used in this way, the LLMs correctly identified medical conditions in fewer than 34.5% of cases.

How LLMs performed in the study

The study acknowledged that LLMs now achieve scores on medical knowledge benchmarks comparable to passing the US Medical Licensing Exam, and that clinical documents from LLMs “are rated as equivalent to or better than those written by doctors.”

However, a problem emerged when the study’s participants tried to get the same results by asking the LLM questions themselves and did not succeed. This is because users often

...

Keep reading this article on CNET.