COVID + Hydroxychloroquine

I’m not a medic, but I’ve worked with data, analysis and experiments for a while. This blog post is a “what’s going on” summary of hydroxychloroquine use in treating COVID-19.

Hydroxychloroquine (HCQ for short) is a medicine that helps treat a few different conditions, like malaria and arthritis. Maybe it’ll help COVID too? No one was sure, so many hospitals around the world started trying it and recording the outcome.

The question we want to answer is: when given to COVID patients, does this medicine 1) help them, 2) harm them, or 3) do nothing at all? If it does have an effect, it might be a big effect or a small one. The effect might be different in different patients, possibly due to age, genetics (including gender, race/ethnicity) and existing health conditions. We always start in a “we don’t know” state, and use data to move to one of the three answers.

What’s happened recently is that we went from “don’t know” to “it harms people”, based on one research group’s analysis of some hospital data. But, it looks like their analysis might not be done to a high enough standard. So we’re shifting back towards “don’t know”. The fact that the evidence that “it harms” has gone away does NOT mean that “it helps”. Absence of evidence is not evidence of absence. It just takes us back to “we don’t know”.

So how do we try to answer the help/harm question? The ideal thing to do would be a “randomised controlled trial”, where we find a group of patients suffering from COVID and then randomly select half of them to receive HCQ. This approach gives you the best ability to measure the true effect that HCQ has. However, this is not what’s happened in this story. Randomised controlled trials are slow to set up – you usually need to find consenting volunteers. COVID is more like war-time – doctors are trying lots of things quickly. They are tracking the outcomes, but they’re basing their decision on whether to give HCQ on their experience and belief that it’s the best course of action, rather than the coin-flip of a randomised controlled trial. Everyone wants the certainty of a randomised controlled trial (and the authors of the controversial paper explicitly call for one). But all we have just now is “observational data” – a record of who got what and when, and what the outcome was.

So can we use the outcome data to answer our question? To get enough data to answer the question, we need access to data from more than one hospital. Hospitals are rightly careful about sharing patient data so this isn’t an easy task. Fortunately, some companies have put in the effort to get contracts signed with several hospitals around the world and so the human race can potentially benefit from insights that are made possible by having this data in one place. One such company is Surgisphere. Surgisphere (and their legal team) have got agreements with 671 hospitals around the world. This gives them access to data about individual patients – their age/gender/etc as well as medical conditions, treatments they’ve received and outcomes.

Surgisphere therefore have a very useful dataset. For now, let’s assume that they’ve managed to pull all this data together without making any systematic mistakes (for example, some countries measure a patient’s height in centimetres whereas others might use inches – would Surgisphere have noticed this?).

Within Surgisphere’s dataset, they had information about 96,032 patients who tested positive for COVID. Of those patients, it so happens that the various hospitals had chosen to give HCQ (or chloroquine) to 14,888 patients. The dataset doesn’t tell us specifically why those 14,888 got given HCQ – presumably the doctors thought it was their best option at the time based on the patient’s condition, age, weight etc.

Naively, you might expect that we could just compare the death rate in patients who got HCQ (those who were given the drug) with the death rate in patients who didn’t receive HCQ and see if it’s different.

Unfortunately, it’s not that simple. I’ll explain why shortly, but one key message here is “statistical data analysis isn’t simple, there’s a bunch of mistakes that are easy to make, even if you do this a lot”. Consequently, it’s important that people “show their working” by sharing their dataset and analysis so that others can check whether they’ve made any mistakes. If other people don’t have access to the same raw data, they can’t check for these easy-to-make mistakes – and lots of papers get published every year which end up being retracted because they made a data analysis mistake. Sharing raw data is hard in a medical setting – Surgisphere’s contracts with hospitals probably don’t allow them to share it. But without the raw data being shared and cross-checked by others, it’s reasonable to expect that any analysis has a good chance of having some flaws.

Why can’t we simply compare death rates? It’s because something like your age is a factor in both your chance of dying and whether you end up receiving HCQ from a doctor. Let’s assume for a moment that COVID is more deadly in elderly people (it is). Let’s also assume that doctors might decide that HCQ was the best treatment option for older people, but that younger people had some other better treatment option. In this scenario, even if HCQ has no effect, you’d expect the HCQ-treated patients to have a higher death rate than non-HCQ patients, simply due to their greater age. This kind of mixup is possible to try and fix though – if we know patient ages, we can make sure we’re comparing (say) the group of 80 year olds who got HCQ against the group of 80 year olds who didn’t get HCQ. We’ll look at some of the difficulties in this approach shortly.
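To make this concrete, here’s a toy simulation (every number here is invented – this has nothing to do with the real dataset). HCQ is given no effect at all, but doctors give it preferentially to older patients, and older patients are more likely to die. The naive comparison makes the drug look harmful; stratifying by age recovers the truth:

```python
import random

random.seed(0)

# Invented scenario: HCQ does NOTHING, but age drives both mortality
# and the doctor's decision to prescribe HCQ.
patients = []
for _ in range(100_000):
    old = random.random() < 0.5
    got_hcq = random.random() < (0.8 if old else 0.2)  # doctors favour HCQ for the old
    p_death = 0.30 if old else 0.05                    # age drives mortality, HCQ doesn't
    died = random.random() < p_death
    patients.append((old, got_hcq, died))

def death_rate(group):
    return sum(d for _, _, d in group) / len(group)

hcq    = [p for p in patients if p[1]]
no_hcq = [p for p in patients if not p[1]]
print(f"naive: HCQ {death_rate(hcq):.3f} vs no-HCQ {death_rate(no_hcq):.3f}")

# Stratify by age: within each age band the rates match, revealing no drug effect.
for old in (False, True):
    h  = [p for p in patients if p[0] == old and p[1]]
    nh = [p for p in patients if p[0] == old and not p[1]]
    print(f"age={'old' if old else 'young'}: HCQ {death_rate(h):.3f} vs no-HCQ {death_rate(nh):.3f}")
```

The naive comparison shows the HCQ group dying at well over twice the rate of the non-HCQ group, despite the drug literally doing nothing in this simulation.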

The same reasoning applies for other patient factors like gender/race/ethnicity, existing health conditions etc. It also applies to other things that might influence patient outcome, such as what dose of HCQ was given, or how badly ill a patient was when they received HCQ. In an ideal world, we’d have data on all of these factors and we’d be able to adjust our analysis to take it all into account. But the more factors we try to take into account, the larger the dataset we need to do our analysis – otherwise we end up with just 1 or 2 patients in each ‘group’.

The whole dataset itself can easily be skewed. The hospitals which gave Surgisphere their data might all be expensive private hospitals with fancy equipment and good connections to whizzy American medical corporations, whereas hospitals in poorer areas might be too busy treating basic needs to worry about signing data sharing contracts. Private hospitals are more likely to be treating affluent people who suffer less from poverty-related illness. We can try to correct for known factors (like existing medical conditions) in our data analysis, but if the selection of hospitals itself was skewed then we’re starting the game with a deck stacked against us.

One worry is always that you can only adjust for factors that are mentioned in your dataset. For example, let’s suppose asthma makes COVID more deadly (I’m making this up as an example) but that our dataset did not provide details of patient asthma. It might be the case that the patients with asthma all ended up in the HCQ group (this could happen if some alternative treatment was available but known to be unsafe if you have asthma). But if our dataset doesn’t tell us about asthma, we just see that, overall, more HCQ patients died. We wouldn’t be able to see that this difference in death rate was actually due to a common underlying factor. We might WRONGLY go on to believe that the increased death rate was CAUSED by HCQ, when actually all that happened was higher-risk patients had systematically ended up in the HCQ group.

Back to the story: our plan is to try to pair up each patient in the HCQ group with a “twin” in the non-HCQ group who has exactly the same age, weight, health conditions etc. Doing so allows us to tease apart the effect of age/weight/etc from the effect of getting given HCQ. But we almost certainly won’t find an “exact twin” for each HCQ patient – ie. someone who matches on ALL characteristics. Instead, we typically try to identify a subset of non-HCQ patients who are similar in age/weight/etc to the group of patients who were given HCQ. (This is called “propensity score matching analysis”).
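Here’s a sketch of the “twin” idea, again with made-up data. (Note this does exact matching on just two characteristics; real propensity score matching fits a model of P(received HCQ | characteristics) and matches on that single score, precisely so it can cope with many characteristics at once.)

```python
import random

random.seed(1)

# Invented patients with a couple of recorded characteristics.
def make_patient():
    return {
        "age_band": random.choice(["<50", "50-70", ">70"]),
        "diabetic": random.random() < 0.2,
        "hcq": random.random() < 0.3,
    }

patients  = [make_patient() for _ in range(10_000)]
treated   = [p for p in patients if p["hcq"]]
untreated = [p for p in patients if not p["hcq"]]

def key(p):
    return (p["age_band"], p["diabetic"])

# Index untreated patients by their characteristics, then pull one
# "twin" per treated patient (without replacement).
pool = {}
for p in untreated:
    pool.setdefault(key(p), []).append(p)

pairs = []
for p in treated:
    twins = pool.get(key(p), [])
    if twins:
        pairs.append((p, twins.pop()))

print(f"matched {len(pairs)} of {len(treated)} treated patients")
# Every pair agrees on the matched characteristics by construction.
assert all(key(a) == key(b) for a, b in pairs)
```

Comparing death rates within the matched pairs removes the influence of age band and diabetes – but only those two. Anything not recorded (smoking, asthma, …) can still silently skew the comparison, which is the whole point of the next paragraph.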

The important word here is “try”. There’s usually not a good way to check whether you’ve done a good job here. I might do a rubbish job – perhaps the subset of non-HCQ patients I pick contains way more smokers than are in the HCQ group. We hope that our dataset contains all the important characteristics that allow us to make a genuinely representative set, but if it doesn’t then any comparisons we make between the HCQ group and our non-HCQ “twins” will not be telling us solely about the effect HCQ has. This is the fundamental problem with observational studies, and the only real solution is to do a randomised trial. (BTW, all of economics is based on observational data and suffers this problem throughout).

That’s enough stats details. The main point is that this kind of analysis is hard, and there’s a number of choices that the researcher has to make along the way which might be good or bad choices. The only way to check those choices is to have other experts look at the data.

This brings us to the objections that were raised against this initial publication. There are four kinds of objections raised:

1. The “we know it’s easy to make mistakes, and sharing data is the best way to catch mistakes” stuff (objection 2). There’s no implication of malicious intent here; Surgisphere need to honour their contracts. But the societal importance of understanding COVID is so high that we need to find ways to meet in the middle.
2. The “despite not releasing your raw data, there’s enough data in your paper that we can already spot data mistakes” stuff (objections 5,6,7,8,9). Things like “the reported average HCQ dose is higher than the US max dose, and 66% of the data came from the US”. Or “your dataset says more people died in Australia from COVID than actually died”. It just doesn’t smell right. If you can spot two mistakes that easily, how many more are lingering in the data?
3. The “you skipped some of the basics” objections – no ethics review, no crediting of hospitals that gave data (objection 3+4)
4. The “you’ve not done the stats right” stuff – (objections 1 and 10)

None of this means that the researchers were definitely wrong; it just means they might be wrong. It also doesn’t mean the researchers were malicious; countless papers are published every year which contain errors that are then picked up by peers. To me that’s a science success – it helps us learn what is true and false in the world. But it does mean that a single scientific paper that hasn’t been reproduced by other groups is “early stages” as far as gaining certainty goes.

The best way to know for sure what HCQ does to COVID patients is to run a controlled trial, and this had already started. But if you believe there’s evidence that HCQ causes harm, then ethically you would stop any trial immediately – and this is what happened (WHO trial and UK trial were both paused). But now the “evidence” of harm is perhaps not so strong, and so perhaps it makes sense to restart the controlled trials and learn more directly what the effect of HCQ on COVID patients actually is.


Hydrogen Atom 2

Bohr’s 1913 paper which presented the idea of electrons “jumping” between fixed orbitals was a huge step forward, although its predictions only worked for single-electron hydrogen atoms and did not predict the correct wavelength of spectral lines for more complex atoms.

The world that Bohr grew up in was based on Newton’s mechanics (which explained how particles accelerate due to net forces), the force of gravity, and Maxwell’s electromagnetism, along with statistical explanations of heat. But Bohr could see that those “rules” were wrong in some way – they predicted that the hydrogen electron (being an accelerating charge) would cause EM waves, thereby losing energy and spiralling into the nucleus. Since this didn’t actually happen, it was clear to Bohr that new rules would be needed. But he didn’t rip up the whole rulebook – after all, the existing rules had done a good job of explaining all sorts of other phenomena. Instead he looked to add a minimal set of new rules or postulates and keep the rest of existing physics “in play”. He chose to retain the Rutherford picture of orbiting electrons, where electrons are like little planets with known mass, velocity and position at all times. To this, he added the new rule that electrons orbited in circles, and the angular momentum of the electron was only allowed to take on discrete values.

To stay in a circular orbit at some distance, there’s only one velocity that works (any other velocity gives an elliptical orbit). Since mass is fixed, and the orbital radius and velocity are interrelated, discrete angular momentum allows only discrete orbits, each with a specific radius and velocity and therefore kinetic and potential energy. Specifically, in the first allowed orbit, the electron is moving at about 1/137th the speed of light, the orbital radius is 0.05nm and the energy is -13.6eV (the zero point is taken to be an electron very far away).
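Those first-orbit numbers can be checked from the standard constants: the speed is αc (α being the fine-structure constant, roughly 1/137), the radius is ħ/(mₑcα), and the energy is −½mₑc²α²:

```python
import math

# Bohr's first-orbit numbers from fundamental constants (SI values).
hbar = 1.054571817e-34   # J*s
m_e  = 9.1093837015e-31  # kg
e    = 1.602176634e-19   # C
eps0 = 8.8541878128e-12  # F/m
c    = 2.99792458e8      # m/s

alpha = e**2 / (4 * math.pi * eps0 * hbar * c)  # fine-structure constant
v1 = alpha * c                                  # first-orbit speed
a0 = hbar / (m_e * c * alpha)                   # Bohr radius
E1 = -0.5 * m_e * c**2 * alpha**2 / e           # ground-state energy, in eV

print(f"v1/c = 1/{1/alpha:.1f}")
print(f"a0   = {a0 * 1e9:.4f} nm")
print(f"E1   = {E1:.2f} eV")
```

This prints the familiar 1/137, ~0.053nm and −13.6eV.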

How far does this model get us in terms of explaining our experimental data? It describes the hydrogen lines well – the visible Balmer lines are understood to be due to electrons “jumping” to the 2nd lowest orbit from the 3rd/4th/5th/etc orbits. But it doesn’t explain what happens in multi-electron atoms like Helium. Nor does it explain why some lines are more intense than others. It doesn’t explain the Zeeman-effect splitting of lines. It is also not a general explanation of how particles move in the presence of forces: it only describes the special case of a negative charge moving in a central electric field caused by the positive charge of the nucleus. It doesn’t tell you how a free electron would move, nor an electron in a linear electric field. Finally, even the foundations are flawed – the choice to explain the discrete energy levels in terms of discrete angular momentum isn’t right – we now know that the ground state of hydrogen has zero angular momentum, not the ℏ amount that Bohr modelled.

But still, it was a huge breakthrough – making it clear that the explanation of atom-level phenomena was going to require a fresh set of rules.

Bohr’s choice to focus on circular orbits was curious, since every physicist is familiar with the fact that particles in a central inverse-square force move in elliptical orbits in general. Consequently, Sommerfeld tried to extend Bohr’s reasoning to include elliptical orbits, guided by the requirement that the resulting orbits still needed to have the discrete Bohr energies necessary to cause the hydrogen spectral lines. Sommerfeld realised that the eccentricity (the shape of the ellipse) had to also be quantised to achieve this. But initially, this extra step didn’t seem to yield anything useful except more complexity – it just gave the same ‘jumps’ as Bohr, although there were now many more ways to achieve them. You now need two ‘quantum numbers’ to describe the orbital – Bohr’s original ‘n’ and Sommerfeld’s new ‘l’ – but since the energy of the orbital is determined by ‘n’, what’s the point? Who cares if there are a few different shapes of orbital if they all have the same energy, when it’s the energy we care about?

However, the nice thing about elliptical orbits is that they’re not symmetric – the electron moves more along the long axis of the ellipse than the short – and this creates the possibility of explaining the Stark and Zeeman effects as the interaction of this motion with the direction of electric and magnetic fields. This gives a hint that Sommerfeld might’ve been onto something, but in the early days it was definitely just a “guess with some hope”.

Bohr’s circular orbits imply that there is an ‘orbital plane’ and therefore a special distinguished axis. If you had 100 hydrogen atoms, you might expect them to be randomly aligned. But since a charge moving in a circle causes a magnetic field, you could also argue that they might tend to line up with each other. Or, if you applied a strong external magnetic field, you could cause the axes to all align in a single direction. Or if you fired the atoms through an inhomogeneous magnetic field, the amount they were deflected would tell you about the angle their axis made with the magnetic field direction.

However, Sommerfeld’s work added something surprising. Sommerfeld tried to generalize Bohr’s one-parameter circle orbits to two parameters (to allow for ellipses) and then three (to allow for ellipses oriented in 3d space) whilst retaining the spirit of Bohr’s quantization condition for angular momentum. What he found was, rather confusingly, that in 3d space the quantization condition only allowed for elliptical orbit planes in particular orientations. This seems very odd, since it presupposes that there is some ‘preferred’ direction in the universe against which these allowed orientations are measured. (Skipping ahead, we now understand this in terms of measurement along a chosen axis, with the particle state being in general a superposition of the possible basis states – but the idea of superpositions of quantum states was several years in the future). Weird as it may sound, it’s nonetheless a prediction that you can design an experiment to test. A charge orbiting in a plane acts like a little magnet. If you fire a suitable atom through an inhomogeneous field, they get deflected by an amount related to the alignment of the “little magnet” with the inhomogeneous field. If the electrons really could only live in discrete orbital planes, the atoms ought to get deflected in a few discrete directions. If the electrons could live in any orbital plane, you’d get a continuous spread of deflections.

If you think the idea that orbital planes can only exist in certain orientations relative to an arbitrary choice of axis sounds, well, wrong – then you’re not alone. Even Debye, who had also derived the same idea, said to one of the people proposing to actually measure it “You surely don’t believe that [space quantization] is something that really exists; it is only a computational recipe”. In other words, even to the people who came up with the idea it was little more than a utilitarian heuristic – a mathematical procedure that got the right answers by a wrong route. Even Stern, one of the experimenters, later said he performed the experiment in order to show that the whole idea was incorrect. And his supervisor, Born, told him there was “no sense” in doing the experiment. Furthermore, according to classical physics when you put ‘little magnets’ into an external magnetic field, they precess around the axis of the magnetic field rather than doing any kind of ‘aligning’.

At this point in history, a rather surprising thing happens. We now know that Bohr/Sommerfeld’s prediction of the magnetic moment and angular momentum was wrong – they predicted it was ℏ whereas we now know it is zero. But Stern and Gerlach, who performed the inhomogeneous magnetic field experiment, didn’t know that. Had that been the full story, they would’ve found no deflection. But in fact, they found that their beam of atoms did split nicely into two. What they didn’t know about – no one knew at that time – was that electrons have an intrinsic magnetic moment of their own that can take on two values. This electron “spin” was the mechanism that produced their observed result. But, being unaware of spin, they wrongly concluded that they had demonstrated the reality of Sommerfeld’s “space quantization” – in fact, they had demonstrated a different kind of quantization.

(Interestingly, although most descriptions focus on angular momentum as the important concept, Stern’s own Nobel lecture doesn’t mention angular momentum at all. It only talks about the magnetic moment. There’s an implicit assumption that magnetic moments are what you get when you have charge and angular momentum, but since it’s the magnetic moment that determines the deflection in the Stern-Gerlach experiment I, like Stern, prefer to talk about magnetic moments and leave it for someone else to worry about how that magnetic moment comes about).

So where does that leave Sommerfeld’s ellipses? They’re still supported both by their ability to explain the Stark and Zeeman effects (partially) and also by the fact that Sommerfeld calculated a relativistic correction for his elliptical orbits which made the prediction of spectral line wavelengths match experimental data slightly more accurately (in Bohr’s circular orbits, the electrons travel at c/137 or gamma=1.00002, and the speed will be higher in ellipses that do “close passes” to the nucleus, so you start to get close to the point where special relativity starts making an impact).
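That gamma factor is a one-liner to check (the value comes out a touch above the rounded figure quoted above):

```python
import math

# Gamma factor for an electron at Bohr's first-orbit speed, v = c/137.
beta = 1 / 137.036
gamma = 1 / math.sqrt(1 - beta**2)
print(f"gamma = {gamma:.7f}")
```

A ~0.003% effect – tiny, but spectroscopy was already accurate enough to see it.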

Spin now enters the picture, as a highly “unclassical” concept. The story starts with simple pattern spotting. In 1871, Mendeleev organised the known elements into a table based on their chemical properties. He didn’t know it at the time but he’d stumbled upon the sequence of atoms with increasing number of electrons, and the groups he perceived gained their commonality through having the same number of electrons in their outermost shells. But several steps were required to make this connection. Firstly, the Bohr model gave the idea of discrete orbits each with different energy. Then Sommerfelds elliptical orbits gave several different alternative shapes for a particular energy of orbit (“degeneracy”). A paper by Stoner in 1924 made a connection between the number of spectral lines of an element (once degenerate states had been split out using the Zeeman effect) and the number of electrons in the subsequent noble gas. (Stoner’s career prior to this point had been rather desperate). This observation lead Pauli to realise that a simple rule of “only one electron is allowed in each quantum state” was possible, but only if an extra two-valued quantum number was used. Initially Pauli didn’t offer up any explanation of what this two-valued thing was. Goudsmit and Uhlenbeck subsequently proposed that it could be caused by the electron spinning around its own axis, something which was later shown to be wrong (electrons seem to have no size, at least every attempt to measure their size finds it smaller than we can measure, and so to create enough angular momentum the tiny tiny spinning top would have to rotate very quickly, such that its surface would be going faster than the speed of light). But although the picture was wrong, the idea that electrons have their own intrinsic two-valued angular momentum and magnetic moment is correct – as, in fact, the Stern Gerlach experiment showed.

Like Sommerfeld’s ellipses, the two possible electron spin states don’t have much effect on the energy – it’s still dominated by the original Bohr ‘n’. But spin does make small changes to the energy. A particle with spin is like a small magnet, and a small magnet orbiting a positive nucleus has an electromagnetic interaction – Larmor interaction and Thomas precession. This causes small changes to the orbit energy, resulting in splitting of spectral lines – a process now named “spin-orbit interaction”. Sommerfeld’s ellipses cause a similar small splitting via his relativistic correction.

But how was Pauli to incorporate his new “two valued” quantity into the Bohr-Sommerfeld model? It seems that he didn’t. Pauli published his exclusion principle in January 1925. Heisenberg wrote his matrix mechanics paper in July 1925, and Schrodinger published his wave mechanics in 1926. These approaches were much more general than the Bohr-Sommerfeld approach – a genuine ‘mechanics’ explaining how particles evolve over time due to forces. In 1927, Pauli formulated the “Pauli Equation” which is an extension of the Schrodinger equation for spin-1/2 particles that takes into account the interaction between spin and external electromagnetic fields.

Although initially the Heisenberg and Schrodinger approaches looked very different, Dirac was able to show that both are just different realisations of a kind of vector space, and that quantum mechanics was a big game of linear algebra which didn’t care whether you thought those vectors were ‘really functions’ or not. Dirac was happy to go somewhat off-piste mathematically, using his “Dirac delta” functions which are zero everywhere except at a single point and yet integrate to one. His work was followed up by von Neumann, whose book took a more formal rigorous mathematical approach, objecting to Dirac’s use of “mathematical fictions” and “improper functions with self-contradictory properties”. The approach is much the same, but the foundations are made solid.

In the Schrodinger picture, a particle is described by a complex-valued wave function in space. The Schrodinger equation shows how the wave evolves in time, as a function of the curvature of the wave and a term describing the spatial potential. In the case where a particle is constrained within a potential well, such as an electron experiencing the electrostatic attraction of a nucleus, the waves form ‘stationary’ patterns (the wave continues to change phase over time, but the amplitude is not time-dependent). In a hydrogen atom, the stationary states in three dimensions are combinations of radial, polar and azimuthal half-waves which result in amplitudes that vary spatially but not with time. The radial, polar and azimuthal contributions match up with the three quantum numbers from the Bohr model (n,l,m), reflecting the fact that the Schrodinger approach is much more general – the Bohr model “falls out” as being the special case of a single particle in a central electrostatic field.

As is often the case, although the Schrodinger equation is very general, only a few simple symmetric cases (such as the hydrogen atom) result in a nice compact mathematical expression. For more complex cases, one can do numeric simulation (ie. rather than viewing the Schrodinger equation as stating a criterion for a solution in terms of its time derivative and spatial curvature, you can view it as an algorithm for evolving a function forward in time). Alternatively, one can apply perturbation methods, originally invented when studying planetary motion. Perturbation methods are similar to approximating a function using the first few terms of a power series; you take a state you can solve exactly (hydrogen atom) and assume that a small change (small electric field) can be modelled roughly using a simplified term for the difference. For example, this can be used to show the Stark effect (approximately) – where the lines of Hydrogen are split by an electric field.
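To give a flavour of the “equation as algorithm” view, here’s a toy example of my own (not hydrogen – something much simpler): a particle in a 1D box of width 1, in units where ħ = 2m = 1, so the time-independent equation is just −ψ″ = Eψ with ψ = 0 at both walls. We integrate across the box step by step and bisect on E until the far boundary condition holds; the exact ground-state energy is π².

```python
import math

# Integrate -psi'' = E*psi from x=0 to x=1, starting at psi(0)=0,
# and report psi(1). Allowed energies are those where psi(1)=0 too.
def psi_end(E, steps=2000):
    h = 1.0 / steps
    psi, dpsi = 0.0, 1.0
    for _ in range(steps):
        psi += h * dpsi          # step the wave function forward
        dpsi += h * (-E * psi)   # step its slope using the equation
    return psi

# Bisect on E: psi(1) changes sign as E crosses an allowed energy.
lo, hi = 1.0, 20.0
for _ in range(60):
    mid = (lo + hi) / 2
    if psi_end(lo) * psi_end(mid) <= 0:
        hi = mid
    else:
        lo = mid

print(f"ground-state E = {lo:.4f}  (exact: pi^2 = {math.pi**2:.4f})")
```

The same “shoot and adjust” idea extends (with much more bookkeeping) to realistic potentials where no compact formula exists.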

But the new ‘quantum mechanics’ was quite different to the Bohr model. The Bohr model painted a picture of electrons being “in” some orbital then (for reasons unknown) deciding to jump to some other orbital. But in the Schrodinger/Dirac picture there were two very different processes going on. As time passed, the system would evolve according to the wave equation. But if a measurement of position, energy or momentum was made, the wave function would “collapse” into a basis state (eigenvector) of the linear operator associated with that observable quantity. This collapse was evident because subsequent measurements would give the same answer, since the system had not had a chance to evolve away from the eigenstate. However, in general, the state would exist in some weighted linear combination (“superposition”) of any choice of basis states. If you made two different measurements (say position and momentum) whose linear operators did not have the same set of eigenvectors, then the result is dependent on the order you perform the measurements.

Schrodinger did not consider the effect of spin in his original equation (ie. the spin-orbit coupling, or the interaction of spin with an externally applied field). Thus, it required an extension by Pauli to reflect the fact that an electron’s state wasn’t just captured in the wave function. To include spin into the system state isn’t just as simple as recording a “spin up” or “spin down” for a given electron. The particle can be in a linear combination of two spin basis states. And, much like how multi-particle systems are modelled with tensor products to yield joint-probabilities, there can be dependencies between the spin state and the rest of the state.
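As a toy illustration of that last point (my own minimal example, with the spatial part crushed down to just two “positions”): combining two degrees of freedom uses the tensor (Kronecker) product, and not every joint state factors back into independent parts.

```python
# Tensor (Kronecker) product of two amplitude vectors.
def kron(a, b):
    return [x * y for x in a for y in b]

spatial = [1.0, 0.0]            # say: definitely "here" rather than "there"
spin    = [2**-0.5, 2**-0.5]    # equal superposition of spin up and down

# Independent spin and position: the joint state is a simple product.
product_state = kron(spatial, spin)
print(product_state)

# By contrast, a joint state like [1/sqrt(2), 0, 0, 1/sqrt(2)] cannot be
# written as kron(spatial, spin) for ANY choice of the two vectors - the
# spin depends on the position, which is the kind of dependency the text
# describes.
```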


The hydrogen atom

A lot of the early development of quantum mechanics focused on the hydrogen atom. Fortunately, the hydrogen atoms that are around today are just as good as the ones from the 1910s, and furthermore we’ve got the benefit of hindsight and improved instruments to help us. So let’s take a look at what raw experimental data we can get from hydrogen and use that to trace the development of ideas in quantum mechanics.

Back in the 1740s, lots of people were messing around with static electricity. For example, the first capacitor (the Leyden jar) was invented in 1745, allowing people to store larger amounts of electrical energy. Anyone playing around with electricity – even just rubbing your shoes across a carpet – is familiar with the fact that electricity can jump across small distances of air. In 1749, Abbe Nollet was experimenting with the “electrical egg” – a glass globe with some of the air pumped out, with two wires poking into it. Pumping the air out allowed longer sparks, apparently giving enough light to read by at night. (Aside: one of these eggs featured in a painting from around 1820 by Paul Lelong). Fortunately, there’s a YouTube video of someone with a hydrogen-filled tube so we don’t all have to actually buy one.

By passing the light through a diffraction grating (first made in 1785, although natural diffraction gratings such as feathers were in use by then) the different wavelengths of light get separated out to different angles. When we do this with the reddish glow of the hydrogen tube, it separates out into three lines – a red line, a cyan line, and a violet line. Although many people were using diffraction gratings to look at light (often sunlight) it was Ångström who took the important step of quantifying the different colours of light in terms of their wavelength (Kirchhoff and Bunsen used a scale specific to their particular instrument). This accurately quantified data, published in 1868 in Ångström’s book, was crucial. Although Ångström’s instrument allowed him to make accurate measurements of lines, he was still just using his eyes and therefore could only measure lines in the visible part of the spectrum (380 to 740nm). The three lines visible in the YouTube video are at 656nm (red), 486nm (cyan), 434nm (violet) and there’s a 4th line at 410nm that doesn’t really show up in the video.

These four numbers are our first clues, little bits of evidence about what’s going on inside hydrogen. But the next breakthrough came apparently from mere pattern matching. In 1885 Balmer (an elderly school teacher) spotted that those numbers have a pattern to them. If you take the series n^2/(n^2-2^2) for n=3,4,5… and multiply it by 364.5nm then the 4 hydrogen lines pop out (eg. for n=3 we have 364.5 * 9/(9-4) = 656nm and for n=6 we have 364.5 * 36/32 = 410nm). Alluringly, that pattern suggests that there might be more than just four lines. For n=7 it predicts 396.9nm which is just into the ultraviolet range. As n gets bigger, the lines bunch up as they approach the “magic constant” 364.5nm.
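Balmer’s recipe is a one-liner to check:

```python
# Balmer's formula: wavelength = 364.5nm * n^2 / (n^2 - 2^2), for n = 3, 4, 5, ...
def balmer(n, b=364.5):
    return b * n**2 / (n**2 - 4)

for n in range(3, 8):
    print(f"n={n}: {balmer(n):.1f} nm")
```

The first four outputs land on the measured 656, 486, 434 and 410nm lines, and n=7 gives the 396.9nm prediction mentioned above.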

We now know those visible lines are caused when the sole electron in a hydrogen atom transitions to the second-lowest energy state. Why second lowest and not lowest? Jumping all the way to the lowest gives off photons with more energy, so they are higher frequency aka shorter wavelengths and are all in the ultraviolet range that we can’t see with our eyes.
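The energy ordering is easy to see via the photon-energy relation E = hc/λ, using the handy value hc ≈ 1239.84 eV·nm:

```python
# Photon energy in eV from wavelength in nm: E = hc / lambda.
hc = 1239.84  # eV * nm

print(f"656 nm   (Balmer red, jump to n=2): {hc / 656:.2f} eV")
print(f"121.6 nm (Lyman, jump to n=1)     : {hc / 121.6:.2f} eV")
```

Jumps to the lowest level carry several times more energy, pushing them well past the violet end of what our eyes can see.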

Balmer produced his formula in 1885, and it was a while until Lyman went looking for more lines in the ultraviolet range in 1906 – finding lines starting at 121nm then bunching down to 91.175nm – and we now know these are jumps down to the lowest energy level. Similarly, Paschen found another group of lines in the infrared range in 1908, then Brackett in 1922, Pfund in 1924, Humphreys in 1953 – as better instruments allowed them to detect those non-visible lines.

Back in 1888, three years after Balmer’s discovery, Rydberg was trying to explain the spectral lines of various different elements and came up with a more general formula, of which Balmer’s was just a special case. Rydberg’s formula predicted the existence (and the wavelengths) of all of the above groups of spectral lines. However, neither Rydberg nor Balmer suggested any physical basis for their formula – they were just noting a pattern.
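Rydberg’s generalisation can be sketched the same way. The Rydberg constant value below is the modern one for hydrogen (it isn’t a number from this post), and jumps down to level 2 reproduce Balmer’s lines:

```python
# Rydberg's formula: 1/wavelength = R * (1/m^2 - 1/n^2), for n > m
R_H = 1.09678e7  # Rydberg constant for hydrogen, per metre

def line_nm(m, n):
    """Wavelength in nm for a jump from level n down to level m."""
    return 1e9 / (R_H * (1 / m**2 - 1 / n**2))

print(line_nm(2, 3))  # Balmer's red line, ~656 nm
print(line_nm(1, 2))  # first Lyman line, ~122 nm (ultraviolet)
print(line_nm(3, 4))  # first Paschen line, ~1875 nm (infrared)
```

Setting m=1 gives the Lyman series, m=2 the Balmer series, m=3 the Paschen series, and so on – one formula covering every group of lines listed above.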

To recap: so far we have collected a dataset consisting of the wavelengths of various spectral lines that are present in the visible, ultraviolet and infrared portions of the spectrum.

In 1887, Michelson and Morley (using the same apparatus they used for their famous ether experiments) were able to establish that the red hydrogen line ‘must actually be a double line’. Nobody had spotted this before, because it needed the super-accurate interference approach used by Michelson and Morley as opposed to just looking at the results of a diffraction grating directly. So now we start to have an additional layer of detail – many of the lines we thought were “lines” turn out to be collections of lines spaced very close together.

In order to learn about how something works, it’s a good idea to prod it and poke it to see if you get a reaction. This was what Zeeman did in 1896 – subjecting a light source (sodium in kitchen salt placed in a bunsen burner flame) to a strong magnetic field. He found that turning on the magnet makes the spectral lines two or three times wider. The next year, having improved his setup, he was able to observe splitting of the lines of cadmium. This indicates that whatever process is involved in generating the spectral lines is influenced by magnetic fields, in a way that splits some lines into two, some into three, and leaves others untouched.

Another kind of atomic prodding happened in 1913 when Stark did an experiment using strong electric fields rather than magnetic fields. This also caused shifting and splitting of spectral lines. We now know that the electric field alters the relative position of the nucleus and electrons, but bear in mind that the Rutherford gold foil experiment which first suggested that atoms consist of a dense nucleus and orbiting electrons was published in 1911, so even the idea of a ‘nucleus’ was very fresh at that time.

Finally, it had been known since 1690 that light exhibits polarization. Faraday had shown that magnets can affect the polarization of light, and ultimately this had been explained by Maxwell in terms of the direction of the electric field. When Zeeman split spectral lines using a magnetic field, he noticed the magnetic field affected polarization too.

So that concludes our collection of raw experimental data that was available to the founders of quantum mechanics. We have accurate measurements of the wavelength of spectral lines for various substances – hydrogen, sodium etc – and the knowledge that some lines are doublets or triplets and those can be shifted by both electric and magnetic fields. Some lines are more intense than others.

It’s interesting to note what isn’t on that list. The lines don’t move around with changes in temperature. They do change if the light source is moving away from you at constant velocity, but this was understood to be the Doppler effect due to the wave nature of light rather than any effect on the light-generating process itself. I don’t know if anyone tried continuously accelerating the light source, eg. in a circle, to see if that changed the lines, or to see if nearby massive objects had any impact.


The Sun Also Rises

I was walking north in London. I was definitely walking north. My phone map showed I was going north. The road signs pointed straight ahead to go to Kings Cross, which was to the north. I knew I was going north. But, in a post-apocalyptic way, I also like trying to navigate by using the sun. It was afternoon. The sun would be setting in the west. I was heading north. Shadows would stretch out to my right. I looked at the shadows on the ground. They were stretching out to the left. Left, not right. Wrong. I stopped dead, now completely unsure which direction I was going and what part of my reasoning chain was broken.

It took me a while, but I figured it out. There were tall buildings to my left blocking the direct sun. On my right were tall glass-fronted offices. The windows were bouncing sunlight from up high down onto the pavement. From the “wrong” direction!

Moral of the story: don’t trust the sun!


NTSB, Coop Bank

My two interests of air crash investigations and financial systems are coinciding today as I read through the Coop Bank annual results. Unlike RBS’s decline in 2008, this isn’t a dramatic story of poorly understood risk lurking behind complex financial instruments, it’s a bit more straightforward. But, since I spent some time picking through the numbers I thought I’d capture it for posterity.

A traditional high-street bank makes money from loans because customers have to pay interest on their mortgages and car loans, hence banks consider loans to be assets. The money which you or I have in our current or instant-access savings accounts, “demand deposits”, are liabilities of the bank. The bank pays out interest to savers. Unsurprisingly, the interest rate on loans is higher than what the bank pays to savers, and the difference (called “net interest income”) is income for the bank which ideally helps increase the bank’s equity (ie. money owned by the bank, which shareholders have claims on).

At first glance, Coop Bank are doing fine here. They have £15.3bn of loans to people (£14.8bn) and businesses (£0.4bn). They have £22.1bn of customer deposits [page 16], spread fairly evenly between current accounts, instant savings accounts, term savings accounts and ISAs, being a mixture of individuals (£19.4bn) and companies (£2.7bn). A quick check of their website shows they pay savers around 0.25%, and mortgage rates around something like 3%, which directly gets you to their “net interest income” of £394m from their high-street (aka “retail”) operations. So that’s a big bunch of money coming in the door, good news!
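As a sanity check on that £394m, here’s the back-of-envelope version (the two interest rates are my rough guesses from their website, not figures from the accounts):

```python
# Rough check of Coop Bank's reported net interest income
loans = 15.3e9         # £, loans to customers
deposits = 22.1e9      # £, customer deposits
loan_rate = 0.03       # ~3%, ballpark mortgage rate
deposit_rate = 0.0025  # ~0.25%, ballpark savings rate

interest_in = loans * loan_rate          # ≈ £459m received from borrowers
interest_out = deposits * deposit_rate   # ≈ £55m paid to savers
net_interest_income = interest_in - interest_out
print(f"£{net_interest_income / 1e6:.0f}m")  # ≈ £404m
```

That lands within a few percent of the reported £394m, so the headline figure is consistent with the rates they advertise.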

(They used to be big into commercial property loans, but by 2014 their £1650m of loans included about £900m which were defaulting, and they sold off the rest and got out of that business)

But every business has day-to-day costs, like rent and staff salaries to pay. Staff costs were £187m which sounds like a lot of money, but a UK-wide bank has a lot of staff – 4266 [page 33] of which 3748 were full-time and 1018 part-time. That’s an average of £43k each, but it’s not spread evenly – the four executive directors got £4172k between them [page 92], and the eleven non-exec directors got £1052k between them [page 95]. In addition, they paid £121m on contractors [page 178]. So, total staff and contractor costs were around £300m. Hmm, now that £394m income isn’t looking so peachy. We’ve only got £94m left – let’s hope there’s nothing else we have to pay for.

Oops, forgot about the creaking IT infrastructure! The old IT setup was pretty bad it seems. The bank themselves warned investors in 2015 that “the Bank does not currently have a proven end-to-end disaster recovery capability, especially in the case of a mainframe failure or a significant data centre outage” [page 75]. The FCA (Financial Conduct Authority), who regulate banks and check that they are at least meeting some basic minimum standards, told the Coop Bank in 2015 that they were in breach of those basic standards. So, they came up with a cunning plan to get off their clunky mainframes and onto a whizzy IBM “managed service platform” which, one would hope, is much shinier and has a working and tested backup solution. All of this “remediation” work wasn’t cheap though, clocking in at £141m for the year. The good news is that the FCA are happy again and it should be a one-off cost, but we’re now looking at a loss overall for the year of £47m.

But we’re not done yet! We also have some “strategic” projects on the go, which managed to burn £134m [page 19]. A while back, Coop decided to “outsource” its retail mortgage business to Capita, and then spent a lot of time bickering with them, before finally making up this year. Nonetheless, a planned “transformation” of IT systems is getting canned, the demise of which is somehow costing the bank £82m! At the more sensible end, £10m went into “digital” projects, which I assume includes their shiny new mobile app [page 12]. But all in all, those “strategic” projects mean we’re now up to a £181m loss.

Only one more big thing to go. Back in 2009, Coop Bank merged/acquired Britannia Building Society, gaining about £4bn of assets in the form of risky commercial property loans, and some liabilities. Those liabilities included IOUs known as Leek Notes which Britannia had issued to get money in the short-term. When Coop acquired Britannia, there was some accountancy sleight of hand done to make the liability look smaller [page 26 in Kelly Review], but nonetheless a £100 IOU still has to ultimately be paid back with £100, and so now Coop Bank is drudging through the reality of paying back (aka “unwinding”, gotta love the euphemisms) a larger-than-expected liability. In 2016, that was to the tune of £180m.

So now we’re up to a £361m loss. Chuck in a few more projects like the EU Payment Directives, plus some “organizational design changes” which cost £20m, and you get to a final overall loss for the year of £477m.
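The running total can be tallied up in one place (all figures in £m from the post; the final line is just the gap between the itemised costs and the reported £477m loss, i.e. the odds and ends I haven’t itemised):

```python
# Running total of Coop Bank's year, figures in £m (from the post)
net_interest_income = 394
staff_and_contractors = -300
it_remediation = -141
strategic_projects = -134
britannia_unwind = -180
org_design = -20

subtotal = (net_interest_income + staff_and_contractors + it_remediation
            + strategic_projects + britannia_unwind + org_design)
print(subtotal)         # -381
print(-477 - subtotal)  # -96: the remaining un-itemised bits and pieces
```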

Now, in the same way that I (as a person) can have money that I own in my pocket, banks can have money that they (as a company) own – which is their equity. In good times, some of that equity gets paid out to shareholders as a dividend, and some is retained within the company to fund future growth. But in bad times, that equity is eroded by losses. Coop Bank started the year with about £1100m of (tier 1) equity, and the £400m loss has chopped that down to £700m. If you’re losing £400m in a year, £700m doesn’t look like a lot of runway and that’s why they’re trying to sell the business, or bits of it, or raise capital by converting bonds to shares or issuing bonds.

Like any business, you’ve got to have more assets than liabilities otherwise your creditors can have you declared insolvent. And Coop Bank certainly has more assets than liabilities. But the loans which make up part of the bank’s assets are fairly illiquid, meaning they can’t be readily turned into cash. Furthermore, they’re somewhat risky since the borrower might run away and default on the loan. So, in order to be able to soak up defaulting loans and have enough money around for people to withdraw their deposits on demand, banks need to have a certain level of equity in proportion to their loans. You can either look at straight equity/assets, aka the leverage ratio, which is 2.6% for Coop Bank (down from 3.8% last year). Or you can do some risk-weighting of assets, and get the Tier 1 capital ratio of 11% (down from 15%). The Bank of England says that “the appropriate Tier 1 equity requirement …be 11% of risk-weighted assets” so Coop Bank is skirting the edges of that.
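To see what those percentages imply, here’s a rough sketch (the £700m equity figure is from above; the implied asset totals are my back-of-envelope inference, not numbers from the accounts):

```python
# Back out the asset base implied by the two capital ratios
tier1_equity = 700e6     # £, equity after the year's losses

leverage_ratio = 0.026   # equity / total assets
tier1_ratio = 0.11       # equity / risk-weighted assets

total_assets = tier1_equity / leverage_ratio        # ≈ £27bn
risk_weighted_assets = tier1_equity / tier1_ratio   # ≈ £6.4bn
print(f"total assets ≈ £{total_assets / 1e9:.0f}bn")
print(f"risk-weighted assets ≈ £{risk_weighted_assets / 1e9:.1f}bn")
```

The gap between the two asset figures shows how much the risk-weighting shrinks the denominator: most of the balance sheet is deemed fairly safe, which is why 11% of risk-weighted assets corresponds to only 2.6% of everything.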

All in all, if interest rates stay unchanged and Coop Bank’s loans and deposits stay where they are, then you could imagine a small profit from net interest income minus staff and related costs. But the burden of bad acquisitions, failed integration projects and massive IT overhauls is overshadowing all of that, and that’s what’s put Coop Bank where it is today.