Man, the world has changed a lot since my days learning about silicon fabrication. Back then things were pretty easy: if you had a 10-micron process, that meant you made transistors whose smallest wires were 10 microns wide. Well, hold onto your hats, the world has gotten a lot more complicated, but to understand what is going on, you need to know a little bit about the history of computers and transistors.
What is a Transistor? Holmdel BellWorks
Well, the most basic question is what a transistor is. I actually lived in a town where the water tower was literally shaped like a transistor, and it is still there (and not just as the exterior of Severance). You can see a water tower with three legs (the source, drain and gate as explained below). 61 years later, the building is still there.
As an aside, if you want to see how this award winning building was made and see what in 1982 they thought the future would be
Back to the Plot: What’s a Transistor
The basis of everything is a pretty simple idea. If you have current going from the Source to a Drain, then a Gate allows the current to flow if it is powered, but when there is no power, there is no current. Amazingly, this simple idea powers basically every electronic device that you use. Turns out if you connect about 50 billion of these together, you get a MacBook Pro processor 🙂
As Physics World explains, the fundamental idea is that the gate has a voltage on it and sits on top of a thin insulating layer (the dielectric). When the gate is powered, the electric field through the dielectric opens a channel underneath and lets current flow from source to drain (the greener the more voltage at the gate). So the more voltage, the more current flows.
As an aside, this whole idea is called a Metal-Oxide-Semiconductor Field-Effect Transistor (MOSFET), and the name is actually pretty self-explanatory. The field effect is the idea that when there is an electric field at the gate, current flows. The metal-oxide-semiconductor part makes sense too: it describes the stack of a metal gate, an oxide insulator and the semiconductor underneath.
The reason that there is a Bell Labs tower is that William Shockley and others proposed this kind of system. This got layered on and layered on (no pun intended) until, around 1960, Mohamed Atalla and Dawon Kahng demonstrated the first working MOSFET at Bell Labs in Murray Hill. In these first modern transistors, the smallest “feature” (think of this as a wire) was 20 microns (20,000 nanometers) wide.
Things were great from 20 microns to 20 nanometers, then darn it, Quantum Mechanics!
Well, this kind of basic structure worked great for about 50 years. What happened is pretty simple. At 20 microns, the barrier from source to drain is enormous compared with an atom, because a typical atom used in silicon fabrication is about 0.2nm wide. But at 20 nanometers, there are only about 100 atoms (wow, think about that) between the two sides. At this point, something called tunneling leakage happens, literally because of quantum mechanics: even though you don’t want any electrons to cross, they will quantum tunnel across. The result is that current flows even when you don’t want it to, so you get lots more power consumption and ultimately the transistor can break down.
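To make the atom counting concrete, here is a minimal Python sketch that divides a feature width by the roughly 0.2nm atom width quoted above. The function name `atoms_across` is just an illustrative helper, and the 0.2nm figure is a rough approximation, so treat the results as order-of-magnitude only.

```python
ATOM_WIDTH_NM = 0.2  # approximate atom width quoted in the text


def atoms_across(feature_nm: float, atom_nm: float = ATOM_WIDTH_NM) -> int:
    """Rough count of atoms that fit side by side across a feature."""
    return round(feature_nm / atom_nm)


for label, width_nm in [("20 microns", 20_000), ("20 nanometers", 20)]:
    print(f"{label}: about {atoms_across(width_nm):,} atoms across")
```

So the barrier shrinks from something like a hundred thousand atoms to around a hundred, which is where tunneling starts to matter.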
Enter the FinFET: Thank You Berkeley, from 20nm to 3nm
Well, the solution to this problem is pretty clever: instead of thinking of transistors as flat, go vertical. Literally, put a “fin” up vertically through the gate, so the gate wraps around it on three sides, and suddenly you have more surface area and things are more efficient, as shown at Engineers Garage. This was invented at Berkeley and it has led us all the way down to the current 3nm designs.
As an aside, the whole idea that 3nm refers to the gate length is gone. It doesn’t refer to any physical dimension. In fact, in a 3nm design, the actual minimum wire is a nominal 24nm (?!), but they don’t call it a 24nm process. For marketing reasons they just define 3nm as being better than 5nm, go figure. That makes it hard to compare, say, the Intel 10nm process with the TSMC 3nm process. In the old days, you could say that going from 50nm to 25nm would increase density by 4x, since halving both dimensions quarters the area. Now that isn’t the case at all, so don’t just do that square math.
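For reference, the old-school “square math” looks like this. It is a sketch that assumes the node name really is a linear dimension, which was true in the old days and is exactly what no longer holds; `ideal_density_gain` is a made-up helper name.

```python
def ideal_density_gain(old_nm: float, new_nm: float) -> float:
    """Density gain if every linear dimension scaled with the node name."""
    return (old_nm / new_nm) ** 2


# The classic case from the text: halve the dimensions, quadruple the density.
print(f"50nm -> 25nm: {ideal_density_gain(50, 25):.1f}x")

# Applying the same math to marketing names gives a number, but since the
# names are not physical dimensions anymore, it is not a real prediction.
print(f"5nm -> 3nm (naive): {ideal_density_gain(5, 3):.2f}x")
```

The second number is only what you would get if the names were dimensions, so it should not be read as what a real process delivers.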
The second thing is that it has become so hard to move from one process to another that TSMC and others will have waves of each technology. The so-called N3 node is actually divided into point versions called N3B (base), N3E (enhanced), N3P (performance) and N3X (higher voltage). So for instance, the new M3 MacBooks are probably running on the first technology, the base N3 (now called N3B).
To see how this is a little arbitrary, take N3E vs N5 v1.2 (i.e. the previous process): the Power, Performance and Area (PPA) improvements are relatively modest. The way this is specified is a little strange, so bear with me. Performance is how much faster the chip runs at the same power. Power is how much less power it uses at the same speed. Area is quoted in two ways, because logic circuits can shrink a lot, but certain parts like static RAM (used for caches) do not shrink very much at all, so you typically quote both logic and overall chip improvements (where you assume a certain amount of unshrinkable area). You can see some figures thanks to Anandtech. The improvements are 18% faster as long as the power you use is the same. Alternatively, if you don’t want it to go faster, you can reduce the power by 32%. Then the density of the logic areas improves by 1.6x, which is higher than what would be expected for an average chip (with unshrinkable parts such as memory), where it is only 1.3x.
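Here is a rough Python sketch of why the overall chip number is lower than the logic number. The 1.6x logic gain comes from the figures above; the area mix (50% logic, 30% SRAM, 20% analog) and the SRAM and analog gains are assumptions picked for illustration, not numbers from this post, and `overall_density_gain` is a hypothetical helper.

```python
def overall_density_gain(mix: dict, gains: dict) -> float:
    """Whole-chip density gain from per-block gains.

    mix:   fraction of the old chip's area used by each block type (sums to 1)
    gains: density gain of each block type on the new process
    """
    new_area = sum(mix[block] / gains[block] for block in mix)
    return 1.0 / new_area


# Assumed area mix of a "typical" chip (illustrative, not from the post).
mix = {"logic": 0.5, "sram": 0.3, "analog": 0.2}
# Logic gain is the 1.6x quoted above; SRAM and analog gains are assumptions.
gains = {"logic": 1.6, "sram": 1.0, "analog": 1.1}

print(f"Overall chip density gain: {overall_density_gain(mix, gains):.2f}x")
```

Under these assumptions the whole chip only improves by roughly 1.26x, in the same ballpark as the 1.3x overall figure, because the parts that barely shrink end up dominating the new area.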
Moving to GAAFET and Nanowires and Nanosheets
The next big shrink will be the 2nm nodes, or N2, and this is where even fins do not work anymore, so the structures get more complicated. The easiest way to understand this is that to get more current through a FinFET, you just keep making the fins taller and taller, but there is a natural limit. So instead, you basically cut the fins into wires and have the gate go “all around” the part that current flows through. This is a way to get the maximum surface area between the gate and the channel. The first scheme is called a nanowire, where the channels are little circular wires. Then there are nanosheets, where instead of a narrow wire you pass a flat-looking sheet through the gate. Both are versions of what is called a Gate-All-Around FET, or GAAFET (Anandtech).
The benefit of this is pretty subtle but important. While the diagram above shows a single “fin” in a FinFET, in reality modern designs can have multiple fins. Looking from overhead, the gate is long and you just keep adding more fins, so you can tune how much power you need by the number of fins that you have. Modern designs can have 6-7 fins per transistor.
But with nanowires and nanosheets, you can change the actual shapes (not just the number of fins). This finer control lets you precisely tune how much power you need to turn things on and off just by changing the wire or sheet dimensions. This is the idea if you look directly downwards on a multi-fin FinFET or a nanosheet FET.
So there you have it: what you thought was a simple progression from 5nm to 3nm to 2nm proves to be way more complicated than you think. It’s all about these complex structures needed just to switch a signal on and off 🙂
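One way to see the difference in tuning is to compare the “effective width” of the channel, meaning how much channel surface the gate actually touches. The sketch below uses the usual first-order approximations (the gate covers two sidewalls plus the top of a fin, and the full perimeter of a sheet). All the dimensions are hypothetical values chosen for illustration, not real process numbers.

```python
def finfet_width_nm(n_fins: int, fin_height: float, fin_width: float) -> float:
    """Gate touches both sidewalls and the top of every fin."""
    return n_fins * (2 * fin_height + fin_width)


def nanosheet_width_nm(n_sheets: int, sheet_width: float, sheet_thickness: float) -> float:
    """Gate wraps the full perimeter of every stacked sheet."""
    return n_sheets * 2 * (sheet_width + sheet_thickness)


# Hypothetical dimensions in nanometers, for illustration only.
FIN_HEIGHT, FIN_WIDTH = 50.0, 6.0
N_SHEETS, SHEET_THICKNESS = 3, 5.0

# FinFET: the only knob is the fin count, so the width moves in big steps.
for n_fins in range(1, 5):
    print(f"{n_fins} fin(s): {finfet_width_nm(n_fins, FIN_HEIGHT, FIN_WIDTH):.0f} nm")

# Nanosheet: the sheet width is drawn by the designer, so steps can be small.
for sheet_width in (15, 20, 25, 30, 35):
    width = nanosheet_width_nm(N_SHEETS, sheet_width, SHEET_THICKNESS)
    print(f"sheet width {sheet_width} nm: {width:.0f} nm")
```

With these made-up numbers, each extra fin adds a fixed 106nm chunk, while widening the sheets by 5nm adds only 30nm, which is the finer control described above.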
Of Chiplets and 2.5D Integrated circuits
The final thing to know is that we are reaching some limits of a single piece of silicon. So in concert with the move to the N2 process, which is going to happen in parallel with the rollout of N3E, N3P and N3X, there is a corresponding move to let you stick multiple chips in the same package, something called “chiplet” integration. This has moved from what is called 2D packaging, where you stick the chips side by side on a substrate, to 2.5D IC, which TSMC calls “Chip on Wafer on Substrate” (CoWoS). This is somewhere between 2D and 3D: you put the chips side by side, but underneath them there is an “interposer”, which is an interconnect. You can think of it like a two-level-high system, where the logic lives at the top level and the level below has the connections only.
One advantage of this is that you can have lots of smaller chiplets with higher yields than a single big chip and stitch them all together. The ultimate implementation is a System in Package, where you put all the chips needed for, say, an Apple Watch into a single integrated package. This is also one reason why Apple Silicon is so much faster than traditional computers, where the memory is mounted separately on boards. That is more flexible, but with the memory in the same package, you can be incredibly fast since the distances are so short.
A true 3D system lets you stack multiple layers on top completely flexibly, and that’s coming too.
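The yield point can be sketched with the simple Poisson yield model, where the chance that a die has zero defects falls off exponentially with its area. This is a textbook approximation rather than anything a specific foundry publishes, and the defect density and die sizes below are assumptions for illustration. It also ignores any yield lost when packaging the chiplets together.

```python
import math


def poisson_yield(area_cm2: float, defects_per_cm2: float) -> float:
    """Probability that a die of the given area has zero defects."""
    return math.exp(-area_cm2 * defects_per_cm2)


D0 = 0.1  # assumed defect density in defects per cm^2 (illustrative)

big_die = poisson_yield(8.0, D0)   # one hypothetical 800 mm^2 monolithic die
chiplet = poisson_yield(2.0, D0)   # one hypothetical 200 mm^2 chiplet

print(f"Monolithic 800 mm^2 die: {big_die:.0%} of dies are good")
print(f"200 mm^2 chiplet:        {chiplet:.0%} of chiplets are good")
```

Because bad chiplets can be thrown away one at a time before packaging, far less good silicon is wasted than when one defect kills a whole big die.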
Thank you ASML for the machines that make this all possible
What makes this all possible is ASML Holding NV (the name comes from Advanced Semiconductor Materials Lithography), which is a $300B company in the Netherlands that makes all the extreme ultraviolet (EUV) lithography equipment needed to make features this small. It is too much to go into how things are fabricated, but basically, it requires shining ultraviolet light with a wavelength of about 13.5nm off patterned masks to print these kinds of chips.
To get a sense of the scope of these machines, their Twinscan NXE:3600D costs $200M and needs 40 shipping containers, 20 trucks and three Boeing 747s to deliver. There are only 140 in the world as of 2022 and TSMC has most of them.