History of Artificial Intelligence: Dartmouth 1956 to Deep Learning
Artificial intelligence became a named field at Dartmouth College in 1956, in a workshop proposed by McCarthy, Minsky, Shannon, and Rochester. Two AI winters followed before AlexNet's 2012 breakthrough set off the deep learning era.
Ten Men, Eight Weeks, and a New Field of Science
In the summer of 1956, ten scientists gathered at Dartmouth College in Hanover, New Hampshire, for an eight-week workshop funded by a Rockefeller Foundation grant of $7,500. The proposal, written by John McCarthy of Dartmouth, Marvin Minsky of Harvard, Claude Shannon of Bell Labs, and Nathaniel Rochester of IBM, contained a sentence that launched a discipline: "We propose that a 2-month, 10-man study of artificial intelligence be carried out during the summer of 1956. The study is to proceed on the basis of the conjecture that every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it." McCarthy coined the term "artificial intelligence" for the proposal. The participants included Allen Newell and Herbert Simon, who arrived having already built the Logic Theorist — a program that could prove mathematical theorems — which they demonstrated during the workshop.
The Golden Age: 1956–1974
The nearly two decades after Dartmouth produced genuine breakthroughs alongside extravagant predictions. McCarthy developed LISP in 1958, giving AI researchers a programming language suited to symbolic manipulation. McCarthy and Minsky founded what became the MIT Artificial Intelligence Laboratory in 1959. Newell and Simon extended Logic Theorist into General Problem Solver (1957), a program they claimed modeled human problem-solving processes.
- Checkers (1952–1962): Arthur Samuel at IBM built a checkers-playing program that improved by playing against itself — an early demonstration of machine learning. By 1962 it could beat some amateur human players.
- ELIZA (1966): Joseph Weizenbaum at MIT created a program that simulated a psychotherapist by pattern-matching user input and reflecting it back as questions. Many users attributed genuine understanding to ELIZA despite knowing it was a program — an effect Weizenbaum found alarming.
- Shakey the Robot (1966–1972): SRI International built the first mobile robot capable of reasoning about its own actions, using early computer vision and planning algorithms.
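The pattern-match-and-reflect idea behind ELIZA can be sketched in a few lines of Python. This is a minimal illustration only: the three rules, the pronoun table, and the function names are hypothetical and are not taken from Weizenbaum's original script.

```python
import re

# Hypothetical rules for illustration; not Weizenbaum's original script.
# Each rule pairs a pattern with a question template for the captured text.
RULES = [
    (re.compile(r"\bi need (.+)", re.IGNORECASE), "Why do you need {0}?"),
    (re.compile(r"\bi am (.+)", re.IGNORECASE), "How long have you been {0}?"),
    (re.compile(r"\bmy (.+)", re.IGNORECASE), "Tell me more about your {0}."),
]

# Swap first- and second-person words so the echoed fragment reads naturally.
REFLECTIONS = {
    "i": "you", "me": "you", "my": "your",
    "am": "are", "you": "I", "your": "my",
}

DEFAULT_REPLY = "Please go on."


def reflect(fragment: str) -> str:
    """Lowercase the fragment and swap pronouns word by word."""
    words = fragment.lower().split()
    return " ".join(REFLECTIONS.get(word, word) for word in words)


def respond(text: str) -> str:
    """Return the reply for the first matching rule, else a stock prompt."""
    cleaned = text.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(cleaned)
        if match:
            return template.format(reflect(match.group(1)))
    return DEFAULT_REPLY


if __name__ == "__main__":
    print(respond("I need a break from my job."))
    print(respond("The weather is nice"))
```

Nothing in the sketch models meaning: the program only rearranges the user's own words, which is why the reactions Weizenbaum observed were so striking.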
The optimism was immense. In 1967, Minsky predicted that "within a generation, the problem of creating artificial intelligence will be substantially solved." Simon forecast that machines would be capable of any work a man could do "within 20 years." Neither prediction survived contact with reality.
First AI Winter: 1974–1980
Three converging failures produced the first AI winter. The first was theoretical: the 1969 book Perceptrons by Minsky and Seymour Papert demonstrated mathematically that single-layer neural networks, the dominant learning model of the 1960s, could not represent XOR or any other function that is not linearly separable. The result was read (incorrectly) as a fatal critique of neural networks in general, and research funding shifted away from connectionist approaches for over a decade.
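The XOR limitation is easy to reproduce. The sketch below assumes the classic perceptron learning rule with a hard threshold and integer weights; the function names, learning rate, and epoch limit are illustrative choices, not details from the book.

```python
from itertools import product


def predict(weights, x1, x2):
    """Single threshold unit: output 1 if the weighted sum is positive."""
    w1, w2, bias = weights
    return 1 if w1 * x1 + w2 * x2 + bias > 0 else 0


def train_perceptron(samples, epochs=100, lr=1):
    """Classic perceptron rule on two binary inputs; returns (w1, w2, bias)."""
    w1 = w2 = bias = 0
    for _ in range(epochs):
        errors = 0
        for (x1, x2), target in samples:
            delta = target - predict((w1, w2, bias), x1, x2)
            if delta != 0:
                # Nudge the decision line toward the misclassified point.
                w1 += lr * delta * x1
                w2 += lr * delta * x2
                bias += lr * delta
                errors += 1
        if errors == 0:  # every sample classified correctly: converged
            break
    return w1, w2, bias


def accuracy(weights, samples):
    """Fraction of samples the trained unit classifies correctly."""
    correct = sum(predict(weights, x1, x2) == t for (x1, x2), t in samples)
    return correct / len(samples)


INPUTS = list(product([0, 1], repeat=2))
AND_DATA = [((p, q), p & q) for p, q in INPUTS]
XOR_DATA = [((p, q), p ^ q) for p, q in INPUTS]

if __name__ == "__main__":
    print("AND accuracy:", accuracy(train_perceptron(AND_DATA), AND_DATA))
    print("XOR accuracy:", accuracy(train_perceptron(XOR_DATA), XOR_DATA))
```

AND is linearly separable, so training converges and all four cases are classified correctly. For XOR no single line separates the two classes, so the unit can never get more than three of the four cases right, however long it trains.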
The 1973 Lighthill Report, commissioned by the British Science Research Council, concluded that AI had failed to deliver on its promises in any of its three primary research areas: robotics, language processing, and machine learning. The report led to near-total elimination of AI funding in the United Kingdom. In the United States, DARPA cut AI research funding after speech recognition systems failed to meet performance targets. The combination of overpromising by researchers and underdelivering by systems produced a funding crisis that lasted through the late 1970s.
Expert Systems and the Second Winter: 1980–1993
AI recovered with a different approach: expert systems that encoded human domain knowledge as if-then rules rather than attempting general intelligence. MYCIN (1976) at Stanford diagnosed bacterial infections and recommended antibiotic treatments at a level competitive with physicians. XCON (1980) at Digital Equipment Corporation configured VAX computers, saving the company an estimated $40 million per year by 1986.
The commercial success of expert systems triggered an investment bubble. The AI industry grew from virtually nothing to $400 million by 1988. Lisp Machines — specialized hardware for running LISP-based AI programs — became a commercial market. The bubble burst in 1987 when cheaper general-purpose workstations rendered dedicated AI hardware obsolete. Expert systems proved brittle and expensive to maintain: they could not generalize beyond their encoded knowledge, and updating them as domains evolved required prohibitive expert consultation time.
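The if-then style of these systems, and the brittleness that came with it, can be illustrated with a small forward-chaining sketch. The rules and fact names below are hypothetical, loosely themed on computer configuration; they are not drawn from MYCIN or XCON.

```python
# Hypothetical rules for illustration only; not taken from MYCIN or XCON.
# Each rule reads: IF all conditions are known THEN add the conclusion.
CONFIG_RULES = [
    ({"order_includes_disk"}, "needs_disk_controller"),
    ({"needs_disk_controller"}, "needs_cabinet_slot"),
    ({"needs_cabinet_slot", "cabinet_full"}, "add_expansion_cabinet"),
]


def forward_chain(facts, rules):
    """Fire rules repeatedly until no rule can derive anything new."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            # A rule fires only when every one of its conditions is known.
            if conditions <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


if __name__ == "__main__":
    print(sorted(forward_chain({"order_includes_disk", "cabinet_full"}, CONFIG_RULES)))
    # A fact that no rule mentions yields no conclusions at all.
    print(sorted(forward_chain({"order_includes_tape"}, CONFIG_RULES)))
```

The second call shows the brittleness in miniature: an input outside the encoded rules produces nothing, and covering it means an expert has to write another rule by hand.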
| Period | Phase | Key Event | Primary Failure/Success Factor |
|---|---|---|---|
| 1956–1974 | Golden Age | Dartmouth workshop; Logic Theorist; ELIZA | Narrow tasks succeeded; general intelligence failed to materialize |
| 1974–1980 | First AI Winter | Lighthill Report; DARPA funding cuts | Overpromising; computational limits; Perceptrons critique |
| 1980–1987 | Expert Systems Boom | XCON saves DEC $40M/year; $400M industry | Rule-based systems scaled commercially |
| 1987–1993 | Second AI Winter | Lisp Machine market collapse | Brittleness of rule-based systems; hardware commoditization |
| 1993–2012 | Connectionism Resurgence | SVMs; Deep Blue (1997); backpropagation revival | Statistical ML and increased compute |
| 2012–present | Deep Learning Era | AlexNet; GPT series; AlphaFold; ChatGPT | Big data + GPU compute + deep neural networks |
AlexNet 2012: The Turning Point
The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) began in 2010, presenting competing systems with 1.2 million training images across 1,000 categories. In 2011 the best system achieved a top-5 error rate of 25.8%. In 2012, Geoffrey Hinton's team at the University of Toronto entered AlexNet, a deep convolutional neural network trained on two NVIDIA GTX 580 GPUs over five to six days, and achieved a top-5 error rate of 15.3%. The second-place system scored 26.2%. A gap of nearly 11 percentage points was unprecedented.
- AlexNet had 60 million parameters across eight layers — five convolutional and three fully connected — a scale only made practical by consumer GPU hardware.
- The paper "ImageNet Classification with Deep Convolutional Neural Networks" by Krizhevsky, Sutskever, and Hinton became one of the most cited computer science papers of the 2010s.
- Google acquired DeepMind in 2014 for a reported £400 million. DeepMind's AlphaGo defeated world Go champion Lee Sedol 4–1 in March 2016, a milestone that many AI researchers had expected to be about a decade away.
- OpenAI released GPT-3 in 2020 with 175 billion parameters. ChatGPT, launched in November 2022 on GPT-3.5, reached 100 million users in two months — the fastest consumer product adoption in history at the time.