Preprint Article, Version 2. Preserved in Portico. This version is not peer-reviewed.

GPT-4 Can't Reason

Version 1: Received: 27 July 2023 / Approved: 2 August 2023 / Online: 2 August 2023 (04:10:57 CEST)
Version 2: Received: 3 August 2023 / Approved: 4 August 2023 / Online: 7 August 2023 (07:13:00 CEST)

How to cite: Arkoudas, K. GPT-4 Can't Reason. Preprints 2023, 2023080148. https://doi.org/10.20944/preprints202308.0148.v2

Abstract

GPT-4 was released in March 2023 to wide acclaim, marking a very substantial improvement across the board over GPT-3.5 (OpenAI's previously best model, which had powered the initial release of ChatGPT). Despite the genuinely impressive improvement, however, there are good reasons to be highly skeptical of GPT-4's ability to reason. This position paper discusses the nature of reasoning; criticizes the current formulation of reasoning problems in the NLP community and the way in which the reasoning performance of LLMs is currently evaluated; introduces a collection of 21 diverse reasoning problems; and performs a detailed qualitative analysis of GPT-4's performance on these problems. Based on the results of this analysis, the paper argues that, despite the occasional flashes of analytical brilliance, GPT-4 at present is utterly incapable of reasoning.

Keywords

GPT-4; LLM; AI; reasoning; inference

Subject

Computer Science and Mathematics, Artificial Intelligence and Machine Learning

Comments (1)

Comment 1
Received: 7 August 2023
Commenter: Konstantine Arkoudas
Commenter's Conflict of Interests: Author
Comment: Fixed a few typos and spelling errors that had slipped through the cracks in the first version.


