A new study has raised serious concerns about the safety of ChatGPT Health, warning that the AI tool often fails to identify medical emergencies and may offer reassurance when urgent care is actually needed.
The feature, introduced by OpenAI earlier this year to a limited group of users, is designed to connect medical records and wellness data to generate health guidance. Researchers conducting the first independent safety evaluation found that the system recommended lower levels of care than necessary in more than half of emergency scenarios.
The study, published in Nature Medicine, tested the platform using 60 realistic patient cases ranging from minor illnesses to life-threatening conditions. Medical specialists determined the appropriate level of care for each case, and the AI’s responses were compared against those clinical assessments.
While the system correctly identified clear-cut emergencies such as stroke and severe allergic reactions, it struggled in more complex situations. In one example, it advised a patient with signs of respiratory failure to wait instead of seeking urgent treatment.
Researchers found that in 51.6 percent of cases requiring immediate hospital care, the system recommended staying home or arranging a routine appointment. Experts warned that such guidance could create a false sense of security and potentially lead to serious harm.
The evaluation also highlighted concerns about how the system responds to mental health crises. When a simulated patient expressed suicidal thoughts, emergency support prompts appeared consistently. However, when routine lab results were added to the same scenario, the crisis alert failed to appear across repeated tests.
Specialists say the findings underscore the need for stronger safety standards, independent oversight, and clearer safeguards before such tools are widely used for medical guidance. OpenAI stated that the study does not reflect typical real-world use and emphasized that the system continues to be updated and improved.