Why do they call it: Referentially transparent (II)

Post author By Andrew
Post date 11/12/2008
5 Comments on Why do they call it: Referentially transparent (II)

In the previous post, I explored the idea that some sentences change meaning if you switch between different names for the same object. Some other sentences don’t change meaning – these ones are referentially transparent, which in non-posh-speak means “you don’t notice if I switch to a different name”.

I’ve now asked some questions on haskell-cafe, and consulted various gurus at my work so I’m a little less confused that I was yesterday. 🙂

How does this concept apply to programming languages?

In haskell, you could say that the “names” were expressions like “2+2” and “sqrt(9)” and the underlying ‘thing’ which was being named are the values which those expressions evaluate to. So the integer “two” has many names – “1+1” and “sqrt(4)” and many others.

That gives us a nice connection back to the Quine definition of “referentially transparent”. Essentially, when we claim that Haskell is RT, its just a shorthand way of saying “you can replace an expression with any other expression that evaluates to the same result, and it won’t affect the overall meaning of the program”.

However, I feel that I’ve just handwaved over a few interesting points.

It’s reasonable to think of “sqrt(4)” as being a name for “2” in haskell because, well, functions always return the same value in haskell. But in Java or C, what kind of name could “rand()” be? The function evaluates to various values and it does this because it uses some kind of additional runtime context/state. So the basic idea that you can always have “names” and underlying “things” is a bit shaky – and we can’t really use Quine’s definition without them. What does it mean to say that C++ is “refentially opaque” if the very idea of names/references is shaky?

The same problem applies to the english-language examples from yesterday. I claimed that “Scotlands capital” was another name for edinburgh-the-city. That’s certainly true today. But it might not be true in 100 years. And perhaps, someone who lives in Glasgow might even disagree that its true today! So names which are co-referent today might not always be co-referent.

There’s a famous-ish example that’s similar too. In the sentence “The King of France is bald”, what kind of thing is being named by ‘The King of France’, given that France is a republic. That’s the kind of thing which keeps philosophers awake at night.

And if you’re into books, you’ll know that the author “Iain M Banks” writes sci-fi but “Iain Banks” writes mainstream fiction. There’s only one human being, but he writes under two “personas”. This makes me think that you have to be pretty careful about the THINGS that you think you are NAMING. In this example, “Iain M Banks” and “Iain Banks” are coreferent if you are discuss the human being, but they are not coreferent if you are discussing his “writing personas”.

There was one more point that I handwaved over. Why do I believe that Haskell really is referentially transparent? I might not trust Simon Peyton Jones when he claims that it is! But to prove some property of a language, you need a formal semantics for the language. And haskell98 doesn’t have that (unlike Standard ML). So there’s no much hope of doing a bottom-up inductive proof over the structure of the language – not just yet anyway. That’s a pity.

I’ll finish off with a couple of links about what RT means and RT and haskell.

Now I’m going to return to finish watching the SICP video which triggered this whole day-long distraction. 🙂

Tags etymology, haskell, theory

5 replies on “Why do they call it: Referentially transparent (II)”

Oh, right, well, here we go then. “Type” is exactly what you’re getting at with the phrase “kind of name”, I guess, in a roundabout kind of way. sqrt is a function from, say, reals to reals, whereas rand is a function from global states to reals; you can’t just replace “evaluate rand()” with “the result of evaluating rand()” because the result (a real) doesn’t “take a state”, so it can’t even begin to be equal. That’s the important difference between the “names”.

I suppose the whole point of types is to codify this notion of ‘”names” and “underlying things”‘. Non-RT languages kind of blind you to this because they make you treat computations as different kinds of things to values (for the reasons you describe here), but that’s a kludge that’s acting to obscure reality, i.e. that computations are values.

I’ll shut up now.

I agree that especially once one has seen its history, “referentially transparent” is a fairly awful name for the property of programming languages that we are talking about. What is a better name? “Substitutive”?

I understand that the “pure functional” i,e, “side-effect free” nature of Haskell is enough to guarantee most of expression substitutivity – but how does it deal with your “rand()” example? Can I write a non-deterministic program in Haskell, or do I always end up writing “roll the dice, using a seed of explicit value x to prime a deterministic random number generator”?

> Can I write a non-deterministic program in Haskell […]?

Yes, via the wonder of monadic I/O. In Haskell, the equivalent of rand() has the type “IO Int”. One way to look at this value (see http://www.haskell.org/haskellwiki/IO_inside) is as a function (rand :: World -> (World, Int)).

Now your haskell program, “main” has the type (IO ()), or rather (World -> (World, ())). Since the only value of the unit type () is () (leaving aside non-termination), we can handwavingly look at main as a function that takes the state of the world and returns a new state. It is only through the observation of this function by the Haskell runtime that side effects and nondeterminism can happen as it transforms the input World into the output World. main is the same function every time the program is run, but the contents of the World input to that function changes!

In other words, the “seed of explicit value” is the global state – naturally if you can reproduce that as input, the output of rand() will be the same. In this important respect, every possible program in every possible language is deterministic and side-effect free; languages just differ in their ability to let you see this through the type system (and hence equality of values).

What’s interesting to me is that one can never be finished with capturing the idea of “global state” – we may expand the definition from “stack contents” to “heap contents” to “sequences of inputs” etc, but they’re all approximations to something we can’t possibly model in a type system (the entire state of the universe). In the presence of, say, ionizing radiation from outer space, there is no referential transparency in Haskell – a cosmic ray might make the same computation erroneously evaluate to two different values – but we’ve decided that’s rare and uncontrollable and hemce outside of the remit of the type system in exactly the same way that Java’s type system doesn’t care about the contents of memory locations and IO streams.

Thanks for the clarification/explanation.

So far as I can see, then, referential transparency is guaranteed in a “pure” functional programming language (“stateless”, “side-effect free”, however you want to phrase it.)

Comments are closed.