Bohr’s 1913 paper, which presented the idea of electrons “jumping” between fixed orbitals, was a huge step forward, although its predictions only worked for single-electron atoms such as hydrogen and did not give the correct wavelengths of spectral lines for more complex atoms.
The world that Bohr grew up in was based on Newton’s mechanics (which explained how particles accelerate due to net forces), the force of gravity, and Maxwell’s electromagnetism, along with statistical explanations of heat. But Bohr could see that those “rules” were wrong in some way – they predicted that the hydrogen electron (being an accelerating charge) would radiate EM waves, thereby losing energy and spiralling into the nucleus. Since this didn’t actually happen, it was clear to Bohr that new rules would be needed. But he didn’t rip up the whole rulebook – after all, the existing rules had done a good job of explaining all sorts of other phenomena. Instead he looked to add a minimal set of new rules or postulates and keep the rest of existing physics “in play”. He chose to retain the Rutherford picture of orbiting electrons, where electrons are like little planets with known mass, velocity and position at all times. To this, he added the new rule that electrons orbited in circles, and that the angular momentum of the electron was only allowed to take on discrete values (whole-number multiples of ħ, the reduced Planck constant).
To stay in a circular orbit at some distance, there’s only one velocity that works (any other velocity gives an elliptical orbit). Since mass is fixed, and the orbital radius and velocity are interrelated, this means that discrete angular momentum values only allow discrete orbits, each with a specific radius and velocity and therefore a specific kinetic and potential energy. Specifically, in the first allowed orbit, the electron is moving at about 1/137th the speed of light, the orbital radius is about 0.053 nm and the energy is -13.6 eV (the zero point is taken to be an electron very far away).
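The figures for the first orbit follow from just two conditions: the Coulomb attraction supplies the centripetal force, and the angular momentum m·v·r equals n·ħ. The sketch below works that through numerically. The function name `bohr_orbit` and the rounded constants are choices made for this illustration, not anything taken from Bohr’s paper.

```python
import math

# Physical constants in SI units (rounded CODATA values, assumed for this sketch)
HBAR = 1.054571817e-34      # reduced Planck constant, J s
M_E = 9.1093837015e-31      # electron mass, kg
E_CHARGE = 1.602176634e-19  # elementary charge, C
EPS0 = 8.8541878128e-12     # vacuum permittivity, F/m
C = 299792458.0             # speed of light, m/s


def bohr_orbit(n):
    """Return (radius in m, speed in m/s, total energy in eV) of the n-th circular orbit."""
    k = E_CHARGE**2 / (4 * math.pi * EPS0)   # strength of the Coulomb attraction, J m
    # Force balance m v^2 / r = k / r^2 combined with m v r = n hbar gives:
    speed = k / (n * HBAR)
    radius = n**2 * HBAR**2 / (M_E * k)
    # Potential energy is -m v^2 and kinetic energy is +m v^2 / 2, so the total is:
    energy_joules = -0.5 * M_E * speed**2
    return radius, speed, energy_joules / E_CHARGE


radius, speed, energy = bohr_orbit(1)
print("radius (nm):", radius * 1e9)
print("c / speed:  ", C / speed)
print("energy (eV):", energy)
```

Because the energy scales as 1/n², calling `bohr_orbit(2)` gives a quarter of the ground-state binding energy and four times the radius.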
How far does this model get us in terms of explaining our experimental data? It describes the hydrogen lines well – the visible Balmer lines are understood to be due to electrons “jumping” to the 2nd lowest orbit from the 3rd/4th/5th/etc orbits. But it doesn’t explain what happens in multi-electron atoms like helium. Nor does it explain why some lines are more intense than others. It doesn’t explain the splitting of lines in the Zeeman effect. And it is not a general explanation of how particles move in the presence of forces: it only describes the special case of a negative charge moving in a central electric field caused by the positive charge of the nucleus. It doesn’t tell you how a free electron would move, nor an electron in a linear electric field. Finally, even the foundations are flawed – the choice to explain the discrete energy levels in terms of discrete angular momentum isn’t right – we now know that the ground state of hydrogen has zero angular momentum, not the ħ that Bohr’s model assigned to it.
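As a rough check on the Balmer claim, the wavelengths of the “jumps” down to the second orbit can be computed from the Bohr energies E_n = -13.6 eV / n² and the photon relation wavelength = hc / energy. This is a minimal sketch: the function names are made up for illustration, and it assumes an infinitely heavy nucleus, so the results land close to the measured visible lines rather than exactly on them.

```python
RYDBERG_EV = 13.605693   # ground-state binding energy of hydrogen in eV (heavy-nucleus value)
HC_EV_NM = 1239.841984   # Planck constant times the speed of light, in eV nm


def level_energy(n):
    """Energy of the n-th Bohr orbit in eV (zero means the electron is far away)."""
    return -RYDBERG_EV / n**2


def jump_wavelength_nm(n_upper, n_lower):
    """Wavelength in nm of the photon emitted when the electron drops from n_upper to n_lower."""
    photon_energy = level_energy(n_upper) - level_energy(n_lower)
    return HC_EV_NM / photon_energy


# The first three Balmer lines: jumps from orbits 3, 4 and 5 down to orbit 2
for n_upper in (3, 4, 5):
    print(n_upper, "-> 2:", round(jump_wavelength_nm(n_upper, 2), 1), "nm")
```

The three wavelengths come out near 656 nm (red), 486 nm (blue-green) and 434 nm (violet), which is the visible hydrogen pattern the model was built to reproduce.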
But still, it was a huge breakthrough – making it clear that the explanation of atom-level phenomena was going to require a fresh set of rules.
Bohr’s choice to focus on circular orbits was curious, since every physicist is familiar with the fact that particles in a central inverse-square force move in elliptical orbits in general. Consequently, Sommerfeld tried to extend Bohr’s reasoning to include elliptical orbits, guided by the requirement that the resulting orbits still needed to have the discrete Bohr energies necessary to cause the hydrogen spectral lines. Sommerfeld realised that the eccentricity (the shape of the ellipse) also had to be quantised to achieve this. But initially, this extra step didn’t seem to yield anything useful except more complexity – it just gave the same ‘jumps’ as Bohr, although there were now many more ways to achieve them. You now need two ‘quantum numbers’ to describe the orbital – Bohr’s original ‘n’ and Sommerfeld’s new ‘l’ – but since the energy of the orbital is determined by ‘n’, what’s the point? Who cares if there are a few different shapes of orbital if they all have the same energy, and it’s the energy we care about?
However, the nice thing about elliptical orbits is that they’re not symmetric – the electron moves more along the long axis of the ellipse than the short – and this creates the possibility of explaining the Stark and Zeeman effects as the interaction of this motion with the direction of electric and magnetic fields. This gives a hint that Sommerfeld might’ve been onto something, but in the early days it was definitely just a “guess with some hope”.
Bohr’s circular orbits imply that there is an ‘orbital plane’ and therefore a special distinguished axis. If you had 100 hydrogen atoms, you might expect them to be randomly aligned. But since a charge moving in a circle creates a magnetic field, you could also argue that they might tend to line up with each other. Or, if you applied a strong external magnetic field, you could cause the axes to all align in a single direction. Or if you fired the atoms through an inhomogeneous magnetic field, the amount they were deflected would tell you about the angle their axis made with the magnetic field direction.
However, Sommerfeld’s work added something surprising. Sommerfeld tried to generalize Bohr’s one-parameter circular orbits to two parameters (to allow for ellipses) and then three (to allow for ellipses oriented in 3d space) whilst retaining the spirit of Bohr’s quantization condition for angular momentum. What he found was, rather confusingly, that in 3d space the quantization condition only allowed elliptical orbit planes in particular orientations. This seems very odd, since it presupposes that there is some ‘preferred’ direction in the universe against which these allowed orientations are measured. (Skipping ahead, we now understand this in terms of measurement along a chosen axis, with the particle state being in general a superposition of the possible basis states, but the idea of superpositions of quantum states was several years in the future.) Weird as it may sound, it’s nonetheless a prediction that you can design an experiment to test. A charge orbiting in a plane acts like a little magnet. If you fire suitable atoms through an inhomogeneous magnetic field, they get deflected by an amount related to the alignment of the “little magnet” with the field. If the electrons really could only live in discrete orbital planes, the atoms ought to get deflected in a few discrete directions. If the electrons could live in any orbital plane, you’d get a continuous spread of deflections.
If you think the idea that orbital planes can only exist in certain orientations relative to an arbitrary choice of axis sounds, well, wrong – then you’re not alone. Even Debye, who had also derived the same idea, said to one of the people proposing to actually measure it “You surely don’t believe that [space quantization] is something that really exists; it is only a computational recipe”. In other words, even to the people who came up with the idea it was little more than a utilitarian heuristic – a mathematical procedure that got the right answers by a wrong route. Even Stern, one of the experimenters, later said he performed the experiment in order to show that the whole idea was incorrect. And his supervisor, Born, told him there was “no sense” in doing the experiment. Furthermore, according to classical physics when you put ‘little magnets’ into an external magnetic field, they precess around the axis of the magnetic field rather than doing any kind of ‘aligning’.
At this point in history, a rather surprising thing happens. We now know that Bohr/Sommerfeld’s prediction of the magnetic moment and angular momentum was wrong – they predicted an angular momentum of ħ whereas we now know it is zero. But Stern and Gerlach, who performed the inhomogeneous magnetic field experiment, didn’t know that. Had orbital motion been the full story, they would’ve found no deflection. But in fact, they found that their beam of atoms did split nicely into two. What they didn’t know about – no one knew at that time – was that electrons have an intrinsic magnetic moment of their own that can take on two values. This electron “spin” was the mechanism that produced their observed result. But, being unaware of spin, they wrongly concluded that they had demonstrated the reality of Sommerfeld’s “space quantization” – in fact, they had demonstrated a different kind of quantization.
(Interestingly, although most descriptions focus on angular momentum as the important concept, Stern’s own Nobel lecture doesn’t mention angular momentum at all. It only talks about the magnetic moment. There’s an implicit assumption that magnetic moments are what you get when you have charge and angular momentum, but since it’s the magnetic moment that determines the deflection in the Stern-Gerlach experiment, I, like Stern, prefer to talk about magnetic moments and leave it for someone else to worry about how that magnetic moment comes about.)
So where does that leave Sommerfeld’s ellipses? They’re still supported both by their ability to explain the Stark and Zeeman effects (partially) and by the fact that Sommerfeld also calculated a relativistic correction for his elliptical orbits which made the predicted spectral line wavelengths match experimental data slightly more accurately (in Bohr’s circular orbits, the electrons travel at c/137, or gamma≈1.00003, and the speed will be higher in ellipses that do “close passes” to the nucleus, so you start to get close to the point where special relativity starts making an impact).
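To get a feel for how small the relativistic effect is, here is a quick calculation of the Lorentz factor gamma = 1 / sqrt(1 - (v/c)²) for an electron moving at c/137. The function and variable names are illustrative only, and the “twice as fast” case is simply an assumed stand-in for the faster part of an elliptical orbit, not a value from Sommerfeld’s calculation.

```python
import math


def lorentz_factor(beta):
    """Lorentz factor for a particle moving at a fraction beta of the speed of light."""
    return 1.0 / math.sqrt(1.0 - beta**2)


BETA_GROUND = 1.0 / 137.0   # speed of the electron in the first Bohr orbit, in units of c

# How far gamma sits above 1 in the circular ground-state orbit
excess_circular = lorentz_factor(BETA_GROUND) - 1.0

# Hypothetical close pass at twice that speed (an assumption for illustration)
excess_close_pass = lorentz_factor(2.0 * BETA_GROUND) - 1.0

print("gamma - 1 at c/137:  ", excess_circular)
print("gamma - 1 at 2c/137: ", excess_close_pass)
```

Gamma exceeds 1 by only a few parts in 100,000 for the circular orbit, and the excess grows roughly with the square of the speed, which is why the correction shows up as fine detail in the spectral lines rather than as a wholesale shift.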
Spin now enters the picture, as a highly “unclassical” concept. The story starts with simple pattern spotting. In 1871, Mendeleev organised the known elements into a table based on their chemical properties. He didn’t know it at the time but he’d stumbled upon the sequence of atoms with increasing number of electrons, and the groups he perceived gained their commonality through having the same number of electrons in their outermost shells. But several steps were required to make this connection. Firstly, the Bohr model gave the idea of discrete orbits each with different energy. Then Sommerfeld’s elliptical orbits gave several alternative shapes for a particular energy of orbit (“degeneracy”). A paper by Stoner in 1924 made a connection between the number of spectral lines of an element (once degenerate states had been split out using the Zeeman effect) and the number of electrons in the subsequent noble gas. (Stoner’s career prior to this point had been rather desperate.) This observation led Pauli to realise that a simple rule of “only one electron is allowed in each quantum state” was possible, but only if an extra two-valued quantum number was used. Initially Pauli didn’t offer up any explanation of what this two-valued thing was. Goudsmit and Uhlenbeck subsequently proposed that it could be caused by the electron spinning around its own axis, a picture which was later shown to be wrong: electrons seem to have no size (every attempt to measure their size finds it smaller than we can measure), so to carry enough angular momentum the tiny spinning top would have to rotate so quickly that its surface would be going faster than the speed of light. But although the picture was wrong, the idea that electrons have their own intrinsic two-valued angular momentum and magnetic moment is correct – as, in fact, the Stern-Gerlach experiment showed.
Like Sommerfeld’s ellipses, the two possible electron spin states don’t have much effect on the energy – it’s still dominated by the original Bohr ‘n’. But spin does make small changes to the energy. A particle with spin is like a small magnet, and a small magnet orbiting a positive nucleus has an electromagnetic interaction – Larmor interaction and Thomas precession. This causes small changes to the orbit energy, resulting in splitting of spectral lines – a process now named “spin-orbit interaction”. Sommerfeld’s ellipses cause a
But how was Pauli to incorporate his new “two valued” quantity into the Bohr-Sommerfeld model? It seems that he didn’t. Pauli published his exclusion principle in January 1925. Heisenberg wrote his matrix mechanics paper in July 1925, and Schrodinger published his wave mechanics in 1926. These approaches were much more general than the Bohr-Sommerfeld approach – a genuine ‘mechanics’ explaining how particles evolve over time due to forces. In 1927, Pauli formulated the “Pauli Equation”, which is an extension of the Schrodinger equation for spin-1/2 particles that takes into account the interaction between spin and external electromagnetic fields.
Although initially the Heisenberg and Schrodinger approaches looked very different, Dirac was able to show that both are just different realisations of a kind of vector space, and that quantum mechanics was a big game of linear algebra which didn’t care whether you thought those vectors were ‘really functions’ or not. Dirac was happy to go somewhat off-piste mathematically, using his “Dirac delta” functions, which are zero everywhere except at a single point yet integrate to one. His work was followed up by von Neumann, whose book took a more formal, rigorous mathematical approach, objecting to Dirac’s use of “mathematical fictions” and “improper functions with self-contradictory properties”. The approach is much the same, but the foundations are made solid.
In the Schrodinger picture, a particle is described by a complex-valued wave function in space. The Schrodinger equation shows how the wave evolves in time, as a function of the curvature of the wave and a term describing the spatial potential. In the case where a particle is constrained within a potential well, such as an electron experiencing the electrostatic attraction of a nucleus, the waves form ‘stationary’ patterns (the wave continues to change phase over time, but the amplitude is not time-dependent). In a hydrogen atom, the stationary states in three dimensions are combinations of radial, polar and azimuthal half-waves which result in amplitudes that vary spatially but not with time. The radial, polar and azimuthal contributions match up with the three quantum numbers from the Bohr-Sommerfeld model (n,l,m), reflecting the fact that the Schrodinger approach is much more general – the Bohr model “falls out” as being the special case of a single particle in a central electrostatic field.
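The link between “curvature” and “stationary pattern” can be seen in the simplest possible well. The sketch below deliberately swaps hydrogen for a one-dimensional box of unit width with zero potential inside, in units where ħ and the particle mass are both 1 (all of these are assumptions made for illustration). It estimates the curvature of a standing wave by finite differences and checks that -½ × curvature / amplitude gives the same energy at every point, which is what makes the pattern stationary.

```python
import math


def psi(n, x):
    """n-th standing wave in a box spanning x = 0 to x = 1 (n half-waves fit inside)."""
    return math.sqrt(2.0) * math.sin(n * math.pi * x)


def local_energy(n, x, h=1e-4):
    """Estimate -(1/2) psi'' / psi at position x using a central finite difference."""
    curvature = (psi(n, x + h) - 2.0 * psi(n, x) + psi(n, x - h)) / h**2
    return -0.5 * curvature / psi(n, x)


# For a stationary state the local energy is the same wherever it is sampled
for n in (1, 2):
    samples = [local_energy(n, x) for x in (0.2, 0.3, 0.7)]
    print("n =", n, "energies:", [round(e, 3) for e in samples])
```

The sampled values agree with each other and with the textbook result n²π²/2 for this box, so each extra half-wave (more curvature) means a higher, discrete energy.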
As is often the case, although the Schrodinger equation is very general, only a few simple symmetric cases (such as the hydrogen atom) result in a nice compact mathematical expression. For more complex cases, one can do numeric simulation (i.e. rather than viewing the Schrodinger equation as stating a criterion for a solution in terms of its time derivative and spatial curvature, you can view it as an algorithm for evolving a function forward in time). Alternatively, one can apply perturbation methods, originally invented when studying planetary motion. Perturbation methods are similar to approximating a function using the first few terms of a power series; you take a state you can solve exactly (hydrogen atom) and assume that a small change (small electric field) can be modelled roughly using a simplified term for the difference. For example, this can be used to show the Stark effect (approximately) – where the lines of hydrogen are split by an electric field.
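The “first few terms of a power series” idea can be tried on a toy system that is small enough to solve exactly. The sketch below is not hydrogen: it assumes a made-up two-level system with unperturbed energies `e1` < `e2` and a small coupling between them, and compares the exact energies with the second-order perturbation estimate. All names and numbers are illustrative.

```python
import math


def exact_levels(e1, e2, coupling):
    """Exact eigenvalues of the 2x2 matrix [[e1, coupling], [coupling, e2]]."""
    mean = 0.5 * (e1 + e2)
    half_gap = 0.5 * (e2 - e1)
    shift = math.hypot(half_gap, coupling)
    return mean - shift, mean + shift


def perturbed_levels(e1, e2, coupling):
    """Second-order perturbation estimate, assuming e1 < e2.

    The first-order correction is zero because the perturbation has no
    diagonal entries, so the leading change is coupling^2 / gap.
    """
    delta = coupling**2 / (e2 - e1)
    return e1 - delta, e2 + delta


for coupling in (0.05, 0.5):
    exact_low, _ = exact_levels(0.0, 1.0, coupling)
    approx_low, _ = perturbed_levels(0.0, 1.0, coupling)
    print("coupling", coupling, "exact", exact_low, "approx", approx_low)
```

For the small coupling the estimate agrees with the exact answer to several decimal places; for the large coupling it is visibly off, which is the usual caveat with perturbation methods: they are only trustworthy when the change really is small compared with the spacing of the unperturbed levels.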
But the new ‘quantum mechanics’ was quite different to the Bohr model. The Bohr model painted a picture of electrons being “in” some orbital then (for reasons unknown) deciding to jump to some other orbital. But in the Schrodinger/Dirac picture there were two very different processes going on. As time passed, the system would evolve according to the wave equation. But if a measurement of position, energy or momentum was made, the wave function would “collapse” into a basis state (eigenvector) of the linear operator associated with that observable quantity. This collapse was evident because an immediately repeated measurement would give the same answer, since the system had not had a chance to evolve away from the eigenstate. However, in general, the state would exist in some weighted linear combination (“superposition”) of any choice of basis states. If you make two different measurements (say position and momentum) whose linear operators do not have the same set of eigenvectors, then the result depends on the order in which you perform the measurements.
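Order dependence is easiest to see in the smallest quantum system rather than with position and momentum. The sketch below assumes a two-state spin, with measurement along z and along x standing in for the two observables; their eigenvectors differ, which is the condition described above. The state names and the `probability` helper are illustrative, and only real amplitudes are used to keep the arithmetic simple.

```python
import math

S = 1.0 / math.sqrt(2.0)

# Eigenvectors of the z measurement
UP_Z = (1.0, 0.0)
DOWN_Z = (0.0, 1.0)

# Eigenvectors of the x measurement, written in the z basis
UP_X = (S, S)
DOWN_X = (S, -S)


def probability(state, outcome):
    """Born rule for real amplitudes: squared overlap of the state with the outcome."""
    overlap = sum(o * s for o, s in zip(outcome, state))
    return overlap**2


# Start in UP_Z and measure z straight away: the answer is certain
p_z_only = probability(UP_Z, UP_Z)

# Measure x first (collapsing to UP_X or DOWN_X), then measure z
p_x_then_z = sum(
    probability(UP_Z, x_state) * probability(x_state, UP_Z)
    for x_state in (UP_X, DOWN_X)
)

print("P(up along z), z measured directly:", p_z_only)
print("P(up along z), x measured first:   ", p_x_then_z)
```

Measuring z directly always gives “up”, but slipping an x measurement in first turns the z result into a coin toss, because the x measurement collapses the state onto an eigenvector that is an equal mix of the two z states.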
Schrodinger did not consider the effect of spin in his original equation (i.e. the spin-orbit coupling, or the interaction of spin with an externally applied field). Thus, it required an extension by Pauli to reflect the fact that an electron’s state wasn’t fully captured by the spatial wave function alone. Including spin in the system state isn’t as simple as recording a “spin up” or “spin down” for a given electron. The particle can be in a linear combination of the two spin basis states. And, much like how multi-particle systems are modelled with tensor products to yield joint probabilities, there can be dependencies between the spin state and the rest of the state.
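A small sketch may help show what a “dependency between the spin state and the rest of the state” looks like. It assumes a toy model in which the spatial part has just two options (an upper and a lower path, loosely in the spirit of a Stern-Gerlach beam) and the spin has two options, so the joint state is a 2x2 table of amplitudes. The function names are hypothetical, and the determinant test is a property of 2x2 tables specifically, not a general recipe.

```python
import math

S = 1.0 / math.sqrt(2.0)


def joint_state(path_state, spin_state):
    """Tensor product of a path state and a spin state, as a table indexed [path][spin]."""
    return [[p * s for s in spin_state] for p in path_state]


def is_product_state(table, tol=1e-12):
    """A 2x2 amplitude table factorises into path x spin exactly when its determinant is zero."""
    det = table[0][0] * table[1][1] - table[0][1] * table[1][0]
    return abs(det) < tol


def spin_up_probability(table):
    """Probability of finding spin up, summed over both paths."""
    return sum(abs(row[0]) ** 2 for row in table)


# Equal mix of both paths, spin definitely up: path and spin are independent
independent = joint_state((S, S), (1.0, 0.0))

# Upper path goes with spin up, lower path with spin down: path and spin are linked
correlated = [[S, 0.0], [0.0, S]]

print("independent is product state:", is_product_state(independent))
print("correlated is product state: ", is_product_state(correlated))
print("P(spin up) in correlated state:", spin_up_probability(correlated))
```

In the independent case the table can be split back into a path part and a spin part; in the correlated case it cannot, so no single “spin up” or “spin down” label describes the electron on its own.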