The following is an essay I wrote in 2021 during my time as an undergraduate student in Computer Science and Philosophy at Oxford. Of my philosophical writings from that time, it's the one of which I'm most proud and which I believe remains highly relevant to my work these days. The essay deals with the various problems of *induction* that philosophers from David Hume to Nelson Goodman have posed down through the centuries. Since induction is broadly held to form the basis of empirical science and human learning, it stands to reason that coming to grips with these problems may inform our approach to problems in machine learning, artificial intelligence, etc. In that spirit, I offer these reflections on the nature of and justification for scientific induction.
---
# On the Origin of Theories
by Corinthia Beatrix Aberlé
November 7, 2021
> "Planet Earth is a machine... [and] all the organic life that has ever existed amounts to a greasy film that has survived on the exterior of that machine thanks to furious improvisation."
>
> – Sam Hughes, *Ra*
---
The laws of science are commonly upheld as paradigm cases of our capacity for the production of knowledge – if we are justified in believing anything, we are justified in believing what science tells us about nature. But to what faculty of the intellect do we owe such laws and their justification? The standard empiricist answer is that our scientific hypotheses are arrived at by way of *induction* or *inductive inference* from observation of the natural world. *Induction* is a somewhat slippery term that is perhaps better characterised by what it *isn't* than by what it *is*. Namely, *induction* is not *deduction*, i.e. the process of inferring in such a way as to make one's conclusion a logical consequence of one's premises. Inductive inference is commonly introduced by way of example, such as the classic: *all observed swans have been white, therefore all swans are white*. This example serves to illustrate several distinguishing features of induction – the argument is *not* deductively valid, because it is possible for the premises to be true (there are people, I imagine, who are only aware of the existence of white swans) and yet for the conclusion to be false. Moreover, the conclusion *is* false (there are black swans), and this highlights that although we are inclined to think of inductive inference as *generally* reliable, it is not *totally* reliable. Upon reflection, we might be inclined to ask: if induction is demonstrably not reliable in all instances, what licenses our belief in its *general* reliability?
So arises the scandal of empiricism: the problem of induction. There is seemingly no legitimate way of convincing ourselves of the reliability of induction without first taking induction for granted. This problem was first noted by David Hume (cf. Hume, 1740). In modern terms, Hume argues that it is impossible to demonstrate deductively that inductively-inferred hypotheses are generally true, and that it is circular to try to do so inductively. Hume's argument has largely withstood criticism, and as no third mode of generating knowledge beyond induction and deduction has been forthcoming, the problem of justifying induction has remained a conundrum for the philosophy of science.
One may wonder, however, to what degree the apparent intractability of the problem of induction is due to the vagueness surrounding induction itself. The conception of induction involved in Hume's arguments is something like a process of inferring, on the basis of a perceived regularity in some observed phenomena, that this regularity holds for *all* such phenomena, unobserved as well as observed. But is this process of *generalization* one and the same as that by which scientific hypotheses are generated and evaluated (which, hereafter, I call *scientific induction*)? Perhaps, by getting clearer on how scientific theorizing *in fact* proceeds, we may shed new light on the problem of induction, and whether it is indeed a problem for science. This has been the approach taken by many authors from the time of Hume onwards, but such approaches have each been beset by problems of their own. By a careful consideration of one such flawed but promising attempt at resolving the problem of induction – Hans Reichenbach's *pragmatic* justification of induction – I shall proceed to refine the argument and its attendant notion of induction to something that at once accords better with scientific practice and stands a better chance of success. Along the way, we shall meet Goodman's so-called "new riddle of induction," which stands as a hurdle in the way of any attempt at explicating the process of scientific induction. The concept of scientific induction ultimately arrived at shall be *evolutionary* in conception. That is, scientific induction, as I understand it, is nothing more or less than the mechanism of intellectual adaptation of a community of rational agents to their environment, the *furious improvisation* by which science squares itself with the world as we find it.
## I. The pragmatic argument
In *Experience and Prediction* (1938), Reichenbach begins his consideration of induction with a critique of Hume's arguments concerning induction, wherein he accepts as decisive Hume's objections to the possibility of demonstrating the general truth of inductively-made hypotheses, but finds lacking Hume's own attempt at resolution of the problem he exposed. According to Hume, induction – though not rationally justifiable – is nonetheless part of human nature, arising as the intellectual reflection of our tendency for the formation of *habits*. Reichenbach (rightly, to my mind) deplores Hume's solution as intolerably defeatist. Appeals to *human nature* and *habit* be damned – if induction turned out to be a *bad* habit, and our nature flawed in this respect, then we would have every reason to look for alternatives to induction, and *change* our nature accordingly as much as possible. Of course, Reichenbach does not think induction to be a bad habit, but an explanation of why this should be the case is nowhere to be found in the Humean picture. Reichenbach then sets out to give a precise account of what makes induction preferable to other methods of reasoning about the unobserved.
Reichenbach notes that although Hume's argument forecloses the possibility of justifying induction on the grounds of its likelihood of leading us to truth, this does not rule out the possibility of justifying induction by appeal to some other, more *pragmatic*, criterion of assessment. The idea of Reichenbach's approach, broadly speaking, is to show that the method of induction employed by science satisfies some property that makes it at least as favorable as any comparable method. This leads Reichenbach to an ingenious argument, the specifics of which are couched in and dependent upon the highly technical apparatus of probabilistic reasoning developed by Reichenbach in prior sections of *Experience and Prediction*, but the practical upshot of which is simple to state: if any method of predicting the unobserved on the basis of the observed works, then induction works.
It is regrettable that Reichenbach staked much of his argument on the specifics of his particular, *frequentist* interpretation of probability and its relation to science, which weakens the argument's generality. Central to Reichenbach's argument is his criterion of success for methods of prediction: such a method is successful at predicting some event if the relative frequency of the event converges to a limit and the probabilities assigned by the method to the event converge to the same limit. Induction, for Reichenbach, is then just the method of positing that the limit of an event's relative frequency equals the relative frequency observed so far. A Bayesian, to give but one example, would define successful prediction and perhaps also induction itself differently, and so would not accept this part of Reichenbach's argument. Nonetheless, as Skyrms (2000) notes, the core of the argument – specifically the reasoning Reichenbach gives for the above-stated conclusion – applies somewhat more broadly. In the simplified and generalized form given by Skyrms, the argument runs as follows: if there is a method that yields reliable predictions regarding some phenomenon, then induction, applied to the *success of that method*, would eventually lead us to accept the predictions made by the method. Now the validity of this argument turns a great deal upon what is meant by *reliability* of prediction, and how we are to understand *induction* as proceeding. We shall see, in due course, whether it is possible to cash out *reliability* and *induction* in terms weak enough to be widely acceptable, yet strong enough to make Reichenbach's argument – or something like it – valid. First, however, it will be fruitful to consider the main objection posed by Skyrms to Reichenbach's argument.
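Before turning to that objection, it may help to render the convergence point schematically, under Reichenbach's frequentist assumptions (the notation here is mine, not Reichenbach's). Write $f_n = m_n / n$ for the relative frequency with which the event has occurred in the first $n$ observations, and let the inductive rule posit, at each stage, that the limiting frequency equals $f_n$. If the limit exists at all, the rule's posits are guaranteed to come and stay within any desired margin of it:

$$
p = \lim_{n \to \infty} \frac{m_n}{n} \ \text{exists}
\quad \Longrightarrow \quad
\forall \varepsilon > 0 \ \exists N \ \forall n \geq N : \ |f_n - p| < \varepsilon .
$$

And if no such limit exists, then by Reichenbach's criterion no method of prediction succeeds either. This is the sense in which, if any method works, induction works.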
Skyrms makes a productive distinction between the *levels* (or *orders*, as I shall call them) at which induction – and prediction more generally – occurs. At the first order, we make predictions regarding some phenomenon or class of phenomena – natural phenomena if our science is a natural science, social phenomena if our science is a social science, etc. At the second order, we make predictions regarding first-order methods of prediction. So the claim that one theory of a natural phenomenon will yield better predictions than another is a kind of *second-order* prediction. There are then third-order predictions concerning methods for making second-order predictions, and so on. Skyrms then characterizes Reichenbach's argument as an argument by *mathematical induction* (N.B. despite the similarity in name, mathematical induction is a form of *deduction*, not induction in the empirical sense), attempting to show how the validity of induction at lower orders justifies induction at higher orders as well. Skyrms concludes, on this basis, that Reichenbach has at best demonstrated the inductive step of such an argument, wherein the lower orders justify those above them, but has *not* provided a solid base case justifying first-order induction. Without such a base case, the argument remains, on Skyrms' view, as circular as any of the other myriad ill-fated attempts to justify induction.
However, I think the confusion here lies not in Reichenbach's argument, but in Skyrms' interpretation of it. Reichenbach's argument, to my mind, is decidedly *not* a (mathematically) *inductive* argument, but rather, in the jargon of category theory and type theory, a *[coinductive](https://en.wikipedia.org/wiki/Coinduction)* one.
For those not in the know, coinduction is a type of mathematical construction which is formally dual (in a precise, category-theoretic sense) to mathematical induction. Technically speaking, an inductive type or set is defined as the least fixed point of some transformation, while the corresponding coinductive object is defined as the *greatest* fixed point of such a transformation. In the case of sets, this means that an inductively-defined set is *contained within* any other set that is a fixed point of the transformation with respect to which it was defined, while a *coinductively-defined* set *contains* all such sets.
At this level of generality, the difference between mathematical induction and coinduction may seem rather abstract, so by way of a concrete example: for any set $A$, the set of finite sequences of elements of $A$ can be (and typically is) defined by induction, as the least fixed point of the endofunctor $F(X) = 1 + A \times X$ on the category of sets. By contrast, the set of *potentially infinite* sequences of elements of $A$ may be coinductively defined as the greatest fixed point of this endofunctor. This example is typical of the duality between inductively- and coinductively-defined sets: sets defined by induction tend to consist of finite things, while sets defined by coinduction tend to consist of *potentially infinite* things. Likewise, the way in which one typically works with such structures is dual. In the case of lists, one builds a list through finitely many constructor applications, and then consumes it by structural recursion. Conversely, a potentially infinite sequence is *built corecursively*, by specifying how each successive element is to be produced, but can typically only be *observed* finitely (e.g. by repeatedly getting the next element of the sequence).
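To make this pattern of use concrete, here is a minimal sketch in Haskell. (Haskell's laziness blurs the formal distinction between inductive and coinductive types, but its idioms illustrate the working duality well; the names `Stream`, `unfold`, `naturals`, and `takeS` are my own illustrative choices, not drawn from any particular library.) A finite list is consumed by structural recursion, while the stream is produced corecursively and only ever observed through finite prefixes.

```haskell
-- Finite lists: built from finitely many constructor applications,
-- consumed by structural recursion (here, a hand-rolled fold).
sumList :: [Int] -> Int
sumList []       = 0
sumList (x : xs) = x + sumList xs

-- Potentially infinite streams: there is no "nil" constructor,
-- so every value of this type goes on forever.
data Stream a = Cons a (Stream a)

-- Corecursion: build a stream by specifying how to produce each
-- successive element from a seed state.
unfold :: (s -> (a, s)) -> s -> Stream a
unfold step s = let (x, s') = step s in Cons x (unfold step s')

-- The stream of all natural numbers, defined by corecursion.
naturals :: Stream Integer
naturals = unfold (\n -> (n, n + 1)) 0

-- Observation: an infinite object can only be inspected finitely,
-- e.g. by taking a prefix of a given length.
takeS :: Int -> Stream a -> [a]
takeS n _ | n <= 0    = []
takeS n (Cons x rest) = x : takeS (n - 1) rest

main :: IO ()
main = print (sumList [1, 2, 3], takeS 5 naturals)  -- (6,[0,1,2,3,4])
```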
Coming back to Reichenbach's pragmatic argument, then, we see that it has much more in common with the coinductive strategy than the (mathematically) inductive. Contra Skyrms, Reichenbach is *not* saying that the lower orders of induction justify the higher orders—indeed, this gets the direction of justification in Reichenbach's argument backwards. Reichenbach's claim is rather that induction at higher orders can act as a potentially infinite structure whereby the higher orders recursively act as corrective measures on the lower orders. In this sense, Reichenbach is *coinductively* building the structure of induction by a recursive argument, and attempting to use this to demonstrate its reliability. More formally, we might say that Reichenbach implicitly defines the set of reliable methods for reasoning about empirical data as the greatest fixed point of some transformation, and then attempts to show that induction belongs to this set by showing that it is (contained in) a fixed point of this transformation.
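To spell out the shape of this reading (the formalization is mine, and $F$ below is a schematic stand-in for whatever closure condition on 'reliable methods' one extracts from Reichenbach's text): for a monotone operator $F$ on a powerset, the Knaster-Tarski theorem characterizes the greatest fixed point as the union of all post-fixed points, so that membership in it is established coinductively, by exhibiting a witnessing set:

$$
\nu F \;=\; \bigcup \{\, S \mid S \subseteq F(S) \,\},
\qquad
x \in S \ \wedge\ S \subseteq F(S) \ \Longrightarrow\ x \in \nu F .
$$

Read this way, the proof obligation is not Skyrms' missing base case, but the exhibition of a suitable $S$ that contains induction and is closed under $F$.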
Now, whether this latter interpretation of Reichenbach's argument turns out to be valid depends upon whether and how we interpret the set of reliable prediction methods as coinductively defined. In order to investigate this further, it shall first be prudent for us to ask whether induction of the sort described by Reichenbach, which I shall call *simple induction*, really corresponds to the methods of prediction employed in science, what I have called *scientific induction*. This naturally prompts consideration of a problem for any attempt at elucidating the specifics of such induction: Nelson Goodman's 'new riddle' of induction.
## II. The new riddle
Unlike Reichenbach, Goodman more or less accepts Hume's resolution of the problem of induction, what Goodman calls the 'old riddle', and the conclusion that there is no wholly rational justification of induction (Goodman, 1955, ch. 3-4). What remains, then, on Goodman's account, is not to try to justify our inductive practices, but rather to spell them out precisely. The problem, as Goodman puts it, is to state exactly what relation must hold between some evidence and a hypothesis in order for the evidence to count toward the hypothesis, the so-called 'confirmation' relation. Here, however, Goodman poses a new challenge, what he terms the 'new riddle' of induction.
Goodman considers inductive arguments that take the form 'all observed $P$s have had property $Q$, therefore all $P$s have property $Q$'.