Milestones:Development of the BELLMAC-32 Microprocessors, 1976-1982
- Date Dedicated
- 2025/10/21
- Dedication #
- 285
- Location
- Murray Hill, NJ
- IEEE Regions
- 1
- IEEE sections
- North Jersey
- Achievement date range
- 1976-1982
Title
Development of the BELLMAC-32 Microprocessors, 1976-1982
Citation
Developed between 1976 and 1982, the Bell Laboratories BELLMAC-32 microprocessor series introduced many seminal design concepts, including 32-bit wide internal and external transfers, high-speed domino circuits to reduce complex logic gate delay times, a twin-tub CMOS process for improved power efficiency and performance, interconnect-centric logic design for signal delay reduction, gate-matrix layout which increased density, and instructions which implemented certain UNIX operating system and C programming language operations.
Street address(es) and GPS coordinates of the Milestone Plaque Sites
Nokia Bell Labs, Bldg. 6, 600 Mountain Ave, Murray Hill, NJ 07974 US (40.684042, -74.400856)
Details of the physical location of the plaque
As noted in the letter from the President of Bell Laboratories, the plaque bearing a citation describing the historical importance and impact of the BELLMAC‐32 Microprocessor will be allowed to be installed at the front entrance of Nokia Bell Labs. It is agreed that the plaque will be publicly accessible during at least normal business hours and will also be ADA-accessible.
How the plaque site is protected/secured
The plaque will be allowed to be installed at the front entrance of Nokia Bell Labs. This area is publicly accessible during normal business hours, is ADA-accessible, and is under 24-hour surveillance.
Historical significance of the work
BELLMAC-32 Development: 1976-1982
After inventing the transistor in 1947 and fabricating MOS transistors in 1960, AT&T decided to enter the newly developing microprocessor business in the mid-1970s. The prevailing VLSI technology at that time was based on NMOS transistors because of their high electron mobility. Researchers at Bell Laboratories decided to move to a twin-tub CMOS process, which allowed the NMOS and PMOS transistors to be optimized separately and gave CMOS circuits low power consumption.
In 1976, Bell Labs formed the Microprocessor Advisory Committee, which was charged with its microprocessor effort. This initially led to the creation of the 8-bit BELLMAC-8 microprocessor in 1977, which was the first step in the development of the BELLMAC-32. While the market at that time was focused on 16-bit processors, Bell Laboratories decided to leapfrog from the 8-bit BELLMAC-8 to the 32-bit BELLMAC-32. The first version of the BELLMAC-32, known as the BELLMAC-80 at that time, used a gate-matrix layout for a registered ALU (RALU) data path and standard cells for the control logic, and was fabricated in Murray Hill in 1980 using a 3.5 µm process. A follow-on version, the BELLMAC-32A, was released in 1982 using a 2.5 µm process.
The BELLMAC-32 microprocessor series incorporated at least six critical firsts in the VLSI industry.
1: First “32-bit wide internal and external transfers”
Introduced in 1980, the BELLMAC-32 was the first 32-bit microprocessor. It thus predated 32-bit microprocessors from Motorola, Intel, and other manufacturers. It was designed to move 32 bits in one clock cycle, both internally and externally by way of its 32 I/O package pins. The BELLMAC-32 CPU performs all system address generation, control, memory access, and processing functions required in a 32-bit microprocessor system. The system memory space is addressed over the 32-bit address bus using physical or virtual addresses. Data is read or written over the 32-bit bidirectional data bus in byte (8-bit), half-word (16-bit), word (32-bit), or bit-field (1 to 32 bits in length) widths, as described in Chapter 2, Section 2.2.2 of the book “32-Bit Microprocessors, 2nd Edition.” The movement of 32 bits in one clock cycle is illustrated in the timing diagram in Figure 11, “Block (Double Word) Fetch Timing,” p. 2-130, of the “AT&T WE 32-Bit Microprocessors and Peripherals Data Book.”
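As a rough illustration of the bit-field width mentioned above, the following C sketch extracts a field of 1 to 32 bits from a 32-bit word. The function name extract_field and the convention that the offset counts from the least significant bit are assumptions made here for illustration only; they are not taken from the BELLMAC-32 documentation, which may number bits differently.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: return `width` bits of `word`, starting `offset`
 * bits above the least significant bit (an assumed numbering convention). */
static uint32_t extract_field(uint32_t word, unsigned offset, unsigned width)
{
    /* A field is 1 to 32 bits long and must fit inside the 32-bit word. */
    assert(width >= 1 && width <= 32 && offset + width <= 32);

    /* Shifting a 32-bit value by 32 is undefined in C, so the full-word
     * case is handled separately. */
    uint32_t mask = (width == 32) ? UINT32_MAX
                                  : ((UINT32_C(1) << width) - 1u);
    return (word >> offset) & mask;
}
```

With these assumptions, extract_field(0xDEADBEEF, 8, 16) yields 0xADBE, and a width of 32 at offset 0 returns the word unchanged.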
2: First “high-speed domino circuits which reduced complex logic gate delay times”
Domino circuits were first developed at Bell Laboratories and were used to replace complex logic gates in time-critical signal paths, which overcame some speed bottlenecks in the BELLMAC-32. A conventional multi-input complex CMOS gate consists of an n-channel tree and a p-channel tree connected to a shared drain (output) node, and the p-channel tree causes large signal delays. In a domino circuit, the p-channel tree is replaced by a single pMOS transistor controlled by a precharge clock signal, and an inverter is added at the output so that domino circuits can be cascaded. Precharging was key to the speed gain because it significantly reduced the delay time, so domino circuits ran much faster than conventional complex CMOS logic circuits.
The original paper on domino circuits, “High-Speed Compact Circuits with CMOS”, was published in the June 1982 issue of the <i>IEEE Journal of Solid-State Circuits</i>. This paper described how domino circuits were used for the 32-bit arithmetic logic unit (ALU) in the datapath of the BELLMAC-32. Since its publication, this paper has been frequently cited. In addition, it received the 2000 IEEE Donald O. Pederson Award for Solid-State Circuits, the most prestigious award given by the IEEE Solid-State Circuits Society.
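The precharge-and-evaluate behavior of a domino circuit can be modeled in a few lines of C. This is only a behavioral sketch under simplifying assumptions (ideal switches, no timing, no charge sharing). The names domino_stage, domino_precharge, domino_evaluate, and domino_chain, as well as the example logic functions, are hypothetical and are not taken from the BELLMAC-32 ALU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One domino stage: a precharged dynamic node followed by an inverter. */
typedef struct {
    bool dyn; /* state of the dynamic node */
} domino_stage;

/* Precharge phase: the single pMOS transistor pulls the dynamic node
 * high, so the inverter output of the stage is low. */
static void domino_precharge(domino_stage *s)
{
    assert(s != NULL);
    s->dyn = true;
}

/* Evaluate phase: the n-channel tree can only discharge the node.
 * `pull_down` is the value of the n-tree logic function for the current
 * inputs. Returns the stage output after the inverter. */
static bool domino_evaluate(domino_stage *s, bool pull_down)
{
    assert(s != NULL);
    if (pull_down)
        s->dyn = false; /* the node may fall but never rise while evaluating */
    return !s->dyn;
}

/* Two cascaded stages (example functions chosen for illustration):
 * stage 1 computes (a AND b) OR c, stage 2 computes stage1 AND d. */
static bool domino_chain(bool a, bool b, bool c, bool d)
{
    domino_stage s1, s2;
    domino_precharge(&s1);
    domino_precharge(&s2);
    bool out1 = domino_evaluate(&s1, (a && b) || c);
    return domino_evaluate(&s2, out1 && d);
}
```

In this model every stage output is low after precharge and can only rise during evaluation, which is what lets one stage safely drive the next, like a row of falling dominoes.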
3: First “twin-tub CMOS process for improved power efficiency and performance”
The twin-tub CMOS process was developed at Bell Laboratories and was first disclosed publicly at the 1980 IEDM meeting in the paper “Twin-tub CMOS - A technology for VLSI circuits.” In this paper, L. C. Parrillo, R. S. Payne, et al. described how latch-up was prevented by using a lightly doped epitaxial layer on a heavily doped n+ or p+ substrate. The process used an epitaxial layer, followed by high-purity silicon layers with precise dopant concentrations. A single mask was used to form two independently doped, self-aligned tubs. In each tub, transistors were formed by implanting source and drain regions. This process allowed the NMOS and PMOS transistors to be optimized independently. US Patent 4,435,896, “Method for fabricating complementary field effect transistor devices,” with inventors L. Parrillo and R. Payne, granted on March 13, 1984, discloses an 8-mask twin-tub CMOS 3.5 µm process as used in the BELLMAC-32.
4: First “interconnect-centric logic design for signal delay reduction”
In VLSI design, some signal paths suffer from long delays due to lengthy interconnects. The initial version of the BELLMAC-32 in 1980 had serious delay problems caused by long interconnects in the layout of its random logic control section. This problem is common when standard cells are used to lay out the control logic section of VLSI chips using automatic place-and-route tools such as LTX. To address this issue, a new design method was pioneered and introduced in 1982 with the BELLMAC-32A to adapt the logic design to lengthy interconnects between logic gates. Specifically, logic gates driving lengthy interconnects were redesigned to reduce signal delay times. The paper by S. M. Kang, et al., “Gate Matrix Layout of Random Control Logic in a 32-bit CMOS CPU Adaptable to Evolving Logic Design,” in the <i>IEEE Trans. on CAD</i>, 1983, describes how the gate-matrix layout was used to implement logic design changes that reduced signal propagation delays due to long interconnects.
5: First “gate-matrix layout which increased density”
The gate-matrix layout method was created to minimize RC parasitics among interconnected transistors, especially for dense connection of IGFET “gates,” by using a “matrix” structure of polysilicon columns for transistor gates and rows of transistor diffusions connected by metal lines. This layout style proved highly effective for increasing layout density and reducing circuit delay in the datapath and control logic sections. Because no support for this entirely new layout method was available from the in-house CAD Center or anywhere else, a new software system was developed for it. The method was the first of its kind and was disclosed in US Patent 4,319,396, “Method for fabricating IGFET integrated circuits,” issued on Mar. 16, 1982. After observing the gate-matrix layout process for the BELLMAC-32 during his sabbatical leave, Prof. Omar Wing of Columbia University introduced a graph theory-based gate-matrix layout optimization method, described in the paper “Gate Matrix Layout,” published in the <i>IEEE Trans. on CAD</i>, vol. 4, no. 3, 1985.
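One way to see why the order of the gate columns matters for density is a simplified interval model, sketched below in C. It is an assumption-laden illustration, not the algorithm from the patent or from the “Gate Matrix Layout” paper. Each net is treated as one horizontal segment spanning from the leftmost to the rightmost column it touches, and the number of rows is estimated as the largest number of segments crossing any single column. The function name rows_needed and the bitmask encoding of nets are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Estimate the rows needed for a given left-to-right column ordering.
 * nets[i] is a bitmask of the columns (gates) that net i connects to;
 * order[p] is the column placed at position p. Supports up to 32 columns. */
static int rows_needed(const unsigned *nets, int n_nets,
                       const int *order, int n_cols)
{
    assert(nets != NULL && order != NULL);
    assert(n_cols > 0 && n_cols <= 32);

    int max_density = 0;
    for (int p = 0; p < n_cols; p++) {
        int density = 0;
        for (int i = 0; i < n_nets; i++) {
            /* Find the leftmost and rightmost positions net i touches. */
            int left = -1, right = -1;
            for (int q = 0; q < n_cols; q++) {
                if (nets[i] & (1u << order[q])) {
                    if (left < 0)
                        left = q;
                    right = q;
                }
            }
            /* Count the net if its span crosses position p. */
            if (left >= 0 && left <= p && p <= right)
                density++;
        }
        if (density > max_density)
            max_density = density;
    }
    return max_density;
}
```

For three nets connecting columns {0, 2}, {1, 3}, and {0, 1}, the ordering 0, 1, 2, 3 gives an estimate of 3 rows, while the ordering 2, 0, 1, 3 gives 2. Searching for a column permutation that lowers this count is the kind of problem a layout optimizer has to solve.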
6: First “instructions implementing some UNIX operating system and C programming language operations”
The paper “The Operating System and Language Support Features of the BELLMAC-32 Microprocessor,” published at the 1982 ACM Symposium on Architectural Support for Programming Languages and Operating Systems, observed that the BELLMAC-32 was the first microprocessor architecture to incorporate elements that allowed for more efficient operating system implementation, as well as easier compilation of C language programs. These included multiple data types (e.g., byte, half-word, and word integer), an instruction set that could indicate the type of operands and results, and automatic conversion of data sizes, which saved many explicit size-conversion steps. As a result, programs needed fewer instructions, which was particularly important at a time when memory space was limited.
Beyond logical and arithmetic functions, the BELLMAC-32 also had built-in operations to handle strings, which are defined in C as a sequence of bytes terminated by a null (0) byte. String operations such as move, copy, and merge were each implemented as a single instruction. Similar designs did exist in some mainframe systems, but not in other microprocessors at that time. While the Intel 8086, which came out in 1978, did offer some basic string functions, these did not lend themselves directly to C string operations.
The BELLMAC-32 architecture included several design features to support operating systems, particularly UNIX and Duplex Multiple Environment Real Time (DMERT). Supported features included a process-and-procedure stack that allowed switching between processes, and subroutine invocation with a single instruction. This design simplified interrupt management by allowing an interrupt handler to operate in its own environment regardless of the interrupted process, a capability supported by some mainframe systems but new to the microprocessor world. This greatly simplified UNIX implementations for general-purpose computing, and for real-time functions as needed for the switching systems in which the BELLMAC-32 was often employed.
This operating system-optimized architecture used the concept of register windows to automatically switch the register values as needed, a concept first implemented in the BELLMAC-8. Earlier microprocessors such as the Intel 4004 and Intel 8086 did provide simple operations for subroutine calls, but they lacked process management steps. The description of the Intel 16-bit processor can be found in Intel’s publication on the 8086 architectures. A comprehensive review and description of the AT&T 32-bit microprocessors and their chipsets is in Chapter 2 of the book “32-Bit Microprocessors, 2nd Edition.” This book also reviews the architectures of the 32-bit Intel and Motorola microprocessors that came out after the BELLMAC-32.
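For context, the C sketch below shows the byte-by-byte loop that a C string copy amounts to when no string instruction is available. It is offered only to illustrate the null-terminated convention described above. The function name copy_c_string is hypothetical, and the sketch does not show BELLMAC-32 instructions, whose mnemonics are not given here.

```c
#include <assert.h>
#include <stddef.h>

/* Copy the null-terminated string at `src`, including its terminator,
 * into `dst`. Returns the number of bytes copied, not counting the
 * terminator. The caller must ensure `dst` is large enough. */
static size_t copy_c_string(char *dst, const char *src)
{
    assert(dst != NULL && src != NULL);

    size_t n = 0;
    /* Each iteration loads a byte, stores it, tests it for zero, and
     * branches: several machine instructions per character on a
     * processor without string support. */
    while ((dst[n] = src[n]) != '\0')
        n++;
    return n;
}
```

A processor with a single string-copy instruction can perform this whole loop in hardware, which is the kind of saving in instruction count that the paragraph above describes.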
These design firsts were foundational for subsequent industry VLSI architectures
The six design firsts described above laid the foundation for many subsequent developments and architectures across the industry, and many remain important in the 2020s. The 1998 IBM article “Design issues in mixed static-domino circuit implementations,” published in the <i>Proceedings of the International Conference on Computer Design</i>, discusses such implementations, including how domino circuits were crucial in VLSI technology at that time. Later articles on the subject are referenced in the Wikipedia page on domino logic, showing its subsequent impact across the industry.
The Intel 80386 introduced operating system process management and support for data operations in high-level languages as was first realized for microprocessors in the BELLMAC-32 series. RISC processor architectures were designed to use simpler instructions (for example not automatically converting data sizes) but followed the silicon technologies and features like register arrays that were introduced in the BELLMAC-32 series.
BELLMAC-32 Development in 1984-1987
The BELLMAC-32 was a series of microprocessors that included the six key "firsts" described above. The production version of the BELLMAC-32 was called the WE 32100, and it was mass-produced by Western Electric in Allentown, PA using a 2.5 µm CMOS process. The WE 32100 was used in the AT&T Computer Systems' 3B series of computers (the 3B2, 3B5, and 3B15), which were unveiled commercially at the Spring 1984 Comdex show. In mid-1985, AT&T started to offer the WE 32100 and an associated chipset, along with VMEbus board-level evaluation systems, to other manufacturers. Its successor, the WE 32200, was manufactured in 1987 using a descendant 1.75 µm CMOS process.
Obstacles that needed to be overcome
The BELLMAC-32 development was not fully supported by the CAD tools, test machines, or evolving process technology of the time. Physical design verification had to rely on several tens of CALCOMP plots carefully Scotch-taped together. Each signal tree was manually checked for correctness and continuity using colored pencils.
The most advanced test machine, the Takeda-Riken, was used to test the 4-inch BELLMAC-32 wafers. The measured delay times turned out to be incorrect because of transmission line effects between the probe and the test head in the Takeda-Riken tester. Engineers from Takeda-Riken in Japan came to Murray Hill to address this problem, with help from Bell Laboratories test engineers. Based on this work, Mark Barber presented the paper “Sub-nanosecond Measurements on MOS Devices Using Modern Test Systems,” which received the Best Paper award at the 1983 International Test Conference.
Features that set this work apart from similar achievements
Each of the six design firsts listed in the citation sets this work apart. Note that the 1980 BELLMAC-32 predated these other early 32-bit microprocessors: Intel iAPX 432 (1981), RISC-I prototype from UC Berkeley (1981), Motorola 68020 (1984), and Intel 80386 (1985).
US Patent 4,403,287, “Microprocessor Architecture with Internal Access Means,” shows that the BELLMAC-32 provided a single-chip architecture that permits the registers and control latches of the processor to be easily accessed without using instructions to achieve such access. This Internal Access capability facilitates program development in the processor because it provides an efficient means for observing internal machine states and register contents of the processor. Functional testing of the processor is particularly facilitated because the Internal Access function increases the availability of the internal nodes of the chip for the application of test signals, and for the observation of circuit responses.
Significant references
- BELLMAC-8 Wikipedia webpage
- CPU of the Day: Bell Labs BELLMAC-8, aka the WE212
- Bellmac 32 Microprocessor
- First-Hand Perspective from Dr. Sung Mo (Steve) Kang: The AT&T BELLMAC-32 Microprocessor Development
- BELLMAC-32 Module: From Modeling to Product
- The Operating System and Language Support Features of the BELLMAC-32 Microprocessor
- On The BELLMAC-32, And Perhaps The World’s Largest Plotter Pen Drawing
- Domino Logic Wikipedia webpage
- High-Speed Compact Circuits with CMOS
- Hardware configurations and I/O protocol of the WE32100 microprocessor chip set
- AT&T Computer Systems Wikipedia page
- AT&T 3B Series Computers Wikipedia page
- Bell Labs: A Brief Introduction
- COMDEX Wikipedia webpage
- M6800 Family Assembler User’s Manual