No, ChatGPT-4 Can’t Get an MIT Degree

Rohail Saleem

This is not investment advice. The author has no position in any of the stocks mentioned. Wccftech.com has a disclosure and ethics policy.

OpenAI’s ChatGPT is a wonderful tool, albeit flawed in several respects. For now, the correct approach is to leverage the Large Language Model’s (LLM) capabilities while keeping its limitations firmly in view.

Recently, a paper made waves by claiming that ChatGPT-4 can score 100 percent on MIT’s Mathematics and EECS curriculum. What followed, however, was a sordid tale of unethical data sourcing and repeated prompting to obtain the desired outcome. Let’s delve deeper.


A few days back, Professor Iddo Drori published a paper titled “Exploring the MIT Mathematics and EECS Curriculum Using Large Language Models.” The paper scrutinized a “comprehensive dataset of 4,550 questions and solutions from problem sets, midterm exams, and final exams across all MIT Mathematics and Electrical Engineering and Computer Science (EECS) courses required for obtaining a degree.” In a striking outcome, the paper concludes:

“Our results demonstrate that GPT-3.5 successfully solves a third of the entire MIT curriculum, while GPT-4, with prompt engineering, achieves a perfect solve rate on a test set excluding questions based on images.”

Given these astonishing claims, the paper went viral on social media, garnering over 500 retweets in a single day.

The paper’s claims were then examined by Raunak Chowdhuri and his colleagues. Contrary to the paper’s assertions, Chowdhuri found glaring problems in the methodology used:

  • The dataset contained 10 unsolvable questions. This meant that either ChatGPT-4 was being fed the solutions within the prompts, or the questions were not being graded properly. Upon deeper examination, Chowdhuri found that ChatGPT was indeed being leaked solutions within prompts via what are known as “few-shot examples” – problems and their solutions that are provided to a model as additional context.
  • Typos and errors in the source code pollute the prompts and lead to a different outcome than what was described in the paper itself.
  • Due to swapped parameters, particularly in the zero-shot function, the model returns confused responses that can’t possibly be graded.
  • The paper claims that ChatGPT’s responses were double-verified manually. However, Chowdhuri found that the program was using the “recorded correct answer to guide its actions” – that is, to decide when to switch between zero-shot learning and few-shot learning.
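The last two failure modes can be sketched in a few lines of Python. This is a hypothetical illustration of the pattern Chowdhuri describes, not the paper’s actual code; all function and variable names here are invented for clarity.

```python
def build_few_shot_prompt(question, examples):
    """Prepend solved examples to the prompt. If an example is nearly
    identical to the target question, its solution effectively leaks
    the answer into the prompt."""
    parts = []
    for ex in examples:
        parts.append(f"Q: {ex['question']}\nA: {ex['solution']}\n")
    parts.append(f"Q: {question}\nA:")
    return "\n".join(parts)


def grade_with_answer_peeking(question, correct_answer, model, examples):
    """Flawed evaluation loop: it consults the recorded correct answer
    to decide when to escalate from zero-shot to few-shot prompting,
    so the reported 'solve rate' is inflated."""
    answer = model(question)                       # zero-shot attempt
    if answer.strip() != correct_answer.strip():   # peeks at ground truth
        prompt = build_few_shot_prompt(question, examples)
        answer = model(prompt)                     # escalate to few-shot
    return answer.strip() == correct_answer.strip()
```

An honest evaluation would fix the prompting strategy in advance and grade each response exactly once, without ever comparing intermediate outputs against the answer key.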

Additionally, a number of MIT professors then issued a statement, disclosing that the paper sourced the MIT dataset without authorization:

“On June 15th, Iddo Drori posted on arXiv a working paper associated with a dataset of exams and assignments from dozens of MIT courses. He did so without the consent of many of his co-authors and despite having been told of problems that should be corrected before publication.”

The statement concludes with the following one-liner:

“And no, GPT-4 cannot get an MIT degree.”

Do you think that ChatGPT’s potential is being damaged by unethical papers? Let us know your thoughts in the comments section below.
