History of Artificial Intelligence: Dartmouth 1956 to Deep Learning

AI began at Dartmouth College in 1956 with McCarthy, Minsky, and Shannon. Two AI winters followed before AlexNet's 2012 breakthrough transformed machine learning into today's AI revolution.

The InfoNexus Editorial Team · May 23, 2026 · 9 min read

Ten Men, Eight Weeks, and a New Field of Science

In the summer of 1956, ten scientists gathered at Dartmouth College in Hanover, New Hampshire, for an eight-week workshop funded by a Rockefeller Foundation grant of $7,500. The 1955 proposal, written by John McCarthy of Dartmouth, Marvin Minsky of Harvard, Claude Shannon of Bell Labs, and Nathaniel Rochester of IBM, contained the passage that launched a discipline: "We propose that a 2-month, 10-man study of artificial intelligence be carried out during the summer of 1956. The study is to proceed on the basis of the conjecture that every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it." McCarthy coined the term "artificial intelligence" for the proposal. The participants included Allen Newell and Herbert Simon, who arrived having already built the Logic Theorist, a program that could prove mathematical theorems, which they demonstrated during the workshop.

The Golden Age: 1956–1974

The decade and a half after Dartmouth produced genuine breakthroughs alongside extravagant predictions. Newell and Simon extended Logic Theorist into General Problem Solver (1957), a program they claimed modeled human problem-solving processes. McCarthy developed LISP in 1958, giving AI researchers a programming language suited to symbolic manipulation. In 1959, McCarthy and Minsky founded the AI project at MIT that grew into the MIT Artificial Intelligence Laboratory.

  • Checkers (1952–1962): Arthur Samuel at IBM built a checkers-playing program that improved by playing against itself — an early demonstration of machine learning. By 1962 it could beat some amateur human players.
  • ELIZA (1966): Joseph Weizenbaum at MIT created a program that simulated a psychotherapist by pattern-matching user input and reflecting it back as questions. Many users attributed genuine understanding to ELIZA despite knowing it was a program — an effect Weizenbaum found alarming.
  • Shakey the Robot (1966–1972): SRI International built the first mobile robot capable of reasoning about its own actions, using early computer vision and planning algorithms.
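
The pattern-match-and-reflect technique described for ELIZA above can be illustrated with a minimal sketch. The rules, the reflection table, and the function names below are hypothetical and chosen only for illustration; they are not taken from Weizenbaum's original script.

```python
import re

# Hypothetical first-person to second-person swaps (illustrative only).
REFLECTIONS = {"i": "you", "my": "your", "me": "you", "am": "are"}

# Hypothetical rules: a pattern to look for and a question template to fill in.
RULES = [
    (re.compile(r"\bi feel (.+)", re.IGNORECASE), "Why do you feel {0}?"),
    (re.compile(r"\bi am (.+)", re.IGNORECASE), "How long have you been {0}?"),
    (re.compile(r"\bmy (.+)", re.IGNORECASE), "Tell me more about your {0}."),
]

DEFAULT_REPLY = "Please go on."


def reflect(fragment: str) -> str:
    """Swap first-person words for second-person ones, word by word."""
    words = fragment.lower().split()
    return " ".join(REFLECTIONS.get(word, word) for word in words)


def respond(user_input: str) -> str:
    """Return the first matching rule's question, or a stock reply."""
    text = user_input.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            return template.format(reflect(match.group(1)))
    return DEFAULT_REPLY
```

Calling `respond("I feel sad about my job.")` returns "Why do you feel sad about your job?". The program holds no model of sadness or jobs; it only rearranges the user's own words, which is why the understanding users attributed to it was illusory.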

The optimism was immense. In 1967, Minsky predicted that "within a generation, the problem of creating artificial intelligence will be substantially solved." Simon forecast that machines would be capable of any work a man could do "within 20 years." Neither prediction survived contact with reality.

First AI Winter: 1974–1980

Three converging failures produced the first AI winter. The 1969 book Perceptrons by Minsky and Seymour Papert mathematically demonstrated that single-layer neural networks, the dominant learning model of the 1960s, could not represent XOR or any other function that is not linearly separable. The result was widely read, incorrectly, as a fatal critique of neural networks in general, and research funding shifted away from connectionist approaches for over a decade.
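
The XOR limitation can be reproduced in a few lines. The sketch below is a minimal illustration, not code from the book: it trains a single-layer perceptron with the classic perceptron learning rule on AND (linearly separable) and on XOR (not linearly separable). The function names and the epoch limit are assumptions made for the example.

```python
# Truth tables as ((x1, x2), target) pairs.
AND_DATA = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR_DATA = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]


def predict(weights, bias, x):
    """Step activation over a weighted sum: the whole single-layer model."""
    total = weights[0] * x[0] + weights[1] * x[1] + bias
    return 1 if total > 0 else 0


def train_perceptron(samples, epochs=100):
    """Perceptron learning rule; stops early once an epoch has no errors."""
    weights, bias = [0, 0], 0
    for _ in range(epochs):
        errors = 0
        for x, target in samples:
            delta = target - predict(weights, bias, x)
            if delta != 0:
                weights[0] += delta * x[0]
                weights[1] += delta * x[1]
                bias += delta
                errors += 1
        if errors == 0:
            break
    return weights, bias


def accuracy(samples, weights, bias):
    """Fraction of samples the trained perceptron classifies correctly."""
    hits = sum(predict(weights, bias, x) == target for x, target in samples)
    return hits / len(samples)


and_weights, and_bias = train_perceptron(AND_DATA)
xor_weights, xor_bias = train_perceptron(XOR_DATA)
and_accuracy = accuracy(AND_DATA, and_weights, and_bias)
xor_accuracy = accuracy(XOR_DATA, xor_weights, xor_bias)
```

Training on AND settles on a separating line and classifies all four inputs correctly. On XOR the weights keep cycling and at least one input is always misclassified, because no single straight line separates the two classes. Adding a hidden layer removes the limitation, which is why the critique did not apply to neural networks in general.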

The 1973 Lighthill Report, commissioned by the British Science Research Council, concluded that AI had failed to deliver on its promises in any of the three categories it surveyed: advanced automation, computer-based study of the central nervous system, and the robotics work meant to bridge the two. The report led to the near-total elimination of AI funding in the United Kingdom. In the United States, DARPA cut AI research funding after speech recognition systems failed to meet performance targets. The combination of overpromising by researchers and underdelivering by systems produced a funding crisis that lasted through the late 1970s.

Expert Systems and the Second Winter: 1980–1993

AI recovered with a different approach: expert systems that encoded human domain knowledge as if-then rules rather than attempting general intelligence. MYCIN (1976) at Stanford diagnosed bacterial infections and recommended antibiotic treatments at a level competitive with physicians. XCON (1980) at Digital Equipment Corporation configured VAX computers, saving the company an estimated $40 million per year by 1986.
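
The if-then approach can be illustrated with a minimal forward-chaining sketch. The rules below are a deliberately toy, hypothetical knowledge base invented for this example; they are not drawn from MYCIN or XCON, whose rule sets ran to hundreds or thousands of rules.

```python
# Each rule: (set of facts that must all be known, fact to conclude).
# Hypothetical toy rules for illustration only.
RULES = [
    ({"has_feathers"}, "is_bird"),
    ({"is_bird", "cannot_fly", "swims"}, "is_penguin"),
    ({"is_bird", "can_fly"}, "may_migrate"),
]


def forward_chain(facts, rules):
    """Fire rules repeatedly until no rule adds a new fact."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            # A rule fires when all its conditions are already known.
            if conditions <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


derived = forward_chain({"has_feathers", "cannot_fly", "swims"}, RULES)
unmatched = forward_chain({"has_fur"}, RULES)
```

Here `derived` gains `is_bird` and then `is_penguin`, because the first conclusion satisfies a condition of the second rule. `unmatched` comes back unchanged: the engine can conclude nothing about an input its rules never anticipated, and every new case needs a new hand-written rule.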

The commercial success of expert systems triggered an investment bubble. The AI industry grew from virtually nothing to $400 million by 1988. Lisp Machines — specialized hardware for running LISP-based AI programs — became a commercial market. The bubble burst in 1987 when cheaper general-purpose workstations rendered dedicated AI hardware obsolete. Expert systems proved brittle and expensive to maintain: they could not generalize beyond their encoded knowledge, and updating them as domains evolved required prohibitive expert consultation time.

Period | Phase | Key Events | Primary Failure/Success Factor
1956–1974 | Golden Age | Dartmouth workshop; Logic Theorist; ELIZA | Narrow tasks succeeded; general intelligence failed to materialize
1974–1980 | First AI Winter | Lighthill Report; DARPA funding cuts | Overpromising; computational limits; Perceptrons critique
1980–1987 | Expert Systems Boom | XCON saves DEC $40M/year; $400M industry | Rule-based systems scaled commercially
1987–1993 | Second AI Winter | Lisp Machine market collapse | Brittleness of rule-based systems; hardware commoditization
1993–2012 | Connectionism Resurgence | SVMs; Deep Blue (1997); backpropagation revival | Statistical ML and increased compute
2012–present | Deep Learning Era | AlexNet; GPT series; AlphaFold; ChatGPT | Big data + GPU compute + deep neural networks

AlexNet 2012: The Turning Point

The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) began in 2010, presenting competing systems with 1.2 million training images across 1,000 categories. In 2011 the best system achieved a top-5 error rate of 25.8%. In 2012, Geoffrey Hinton's team at the University of Toronto entered AlexNet, a deep convolutional neural network trained on two NVIDIA GTX 580 GPUs over five to six days, and achieved a top-5 error rate of 15.3%. The second-place system scored 26.2%. A gap of nearly 11 percentage points was unprecedented.
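
Top-5 error counts a prediction as wrong only when the true label is absent from the model's five highest-scoring guesses. The sketch below is a minimal illustration of that metric and of the 2012 gap; the function name and the toy scores are assumptions for the example, not the challenge's official evaluation code.

```python
def top_k_error(scores_per_image, true_labels, k=5):
    """Fraction of images whose true label is not among the k top scores."""
    misses = 0
    for scores, label in zip(scores_per_image, true_labels):
        # Class indices ranked from highest score to lowest.
        ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        if label not in ranked[:k]:
            misses += 1
    return misses / len(true_labels)


# Toy data: two images scored over seven classes (hypothetical values).
toy_scores = [
    [7, 6, 5, 4, 3, 2, 1],  # true class 0 is ranked first: a hit
    [7, 6, 5, 4, 3, 2, 1],  # true class 6 is ranked last: a miss
]
toy_labels = [0, 6]
toy_error = top_k_error(toy_scores, toy_labels, k=5)

# Gap between the 2012 runner-up and AlexNet, in percentage points.
gap_2012 = round(26.2 - 15.3, 1)
```

On the toy data one of the two images is missed, so `toy_error` is 0.5. The 2012 figures give a gap of 10.9 percentage points, larger than the entire improvement a typical year of the challenge had produced.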

  • AlexNet had 60 million parameters across eight layers — five convolutional and three fully connected — a scale only made practical by consumer GPU hardware.
  • The paper "ImageNet Classification with Deep Convolutional Neural Networks" by Krizhevsky, Sutskever, and Hinton became one of the most cited computer science papers of the 2010s.
  • Google acquired DeepMind in 2014 for a reported £400 million. DeepMind's AlphaGo defeated world Go champion Lee Sedol 4–1 in March 2016, a milestone that many AI researchers had expected to be about a decade away.
  • OpenAI released GPT-3 in 2020 with 175 billion parameters. ChatGPT, launched in November 2022 on GPT-3.5, reached an estimated 100 million users in two months, the fastest consumer product adoption in history at the time.