Chapter 2 - Scientific Logic and Method

This chapter reviews ideas on scientific philosophy, method and logic, including Popper's hypothetico-deductive method. Although it is a long essay, its primary aim is very simple: to show that, by all serious views, ignoring a scientific theory is equivalent to rejecting it. Do skip the appendix if this material is not your thing.


2.1 Models, Hypotheses and Logic in Science
2.2 Models
2.3 Classic Scientific Logic
2.4 Probable and Improbable Hypotheses
2.5 Scientific and Non-scientific Hypotheses - Demarcation Criteria
2.6 Three Stages of Scientific Method
2.7 Gatekeepers and the Management of Science
2.8 Closing the Gates?
2.9 Chronological Order Dictates Merit
Appendix to Chapter 2
2.10 Other Views of Scientific Method
2.11 Weakness of the Hypothetico-deductive Method
2.12 Reducing the number of models - Demarcation Criteria
2.13 Popper
2.14 Vacuousness
2.15 Metaphysics
2.16 Metaphysical Logic and Scientific Logic
2.17 Parsimony and Aesthetics
2.18 Toulmin's Applicability Criterion
2.19 Occam's Razor - the Coherence Criterion
2.20 The Evolvability Criterion and Teleology
2.21 Interests and Relativist Reasons
2.22 Singularities
2.23 Summary of Strategies for Model Elimination
2.24 Hypothesis Testing and Probability
2.25 Assessment of Antecedent Probabilities
2.26 The Origins of Uncertainty
2.27 Quality Assessment - Peer Review and Citation Analysis
2.28 Conclusion
Summary

2.1 Models, Hypotheses and Logic in Science

"Hypotheses are .... adventures of the mind" (Sir Peter Medawar)

"Hypotheses .... can and must be tested rigorously" (Phillips and Pugh)

This chapter is, academically, one of the heaviest parts of the work but, unfortunately, sequence dictates that it should appear this early. Chapter 2 reviews basic ideas on scientific logic and philosophy, with the aim of demonstrating that scientists cannot, with logic, ignore alternative theories. It has its own appendix, to encourage those who do not require its weight to go on to chapter 3 after the first few sections, or when they reach the appendix. They can do so because, really, we all already know the conclusions this chapter must reach: "the logic of science," said John Stuart Mill, "is also that of business and life," and science, said T. H. Huxley, is "organised common sense." Indeed, scientific philosophy does produce much the same conclusions as common sense.

Faced with unexplained observations, a scientist is advised to devise a model. In some fields this model can be a physical object but, on many other occasions, the word model is interchangeable with hypothesis. In the philosophy of science the two words have somewhat different meanings but here the distinction is unimportant. (See the glossary.) If a hypothesis successfully predicts the outcome of many critical experiments, then it is proved beyond reasonable doubt and has become a theory.

The term beyond reasonable doubt again brings out the analogies between scientific and legal investigations. Scientific logic is the logic of investigation and decision making everywhere. No theory is ever actually proved; it is only not disproved. Likewise, lawyers use the phrase beyond reasonable doubt to recognise that the guilt of a defendant is never proved with absolute certainty. Guilt is proved only beyond reasonable doubt.

2.2 Models

A model is a set of axioms or postulates which, it is thought, might fairly describe the nature of the phenomenon being studied. Model building is like using an intellectual version of a child's construction kit; scientists gather a set of axioms and concepts (the component parts of a hypothesis), assemble them into a model and compare its behaviour with that of nature. Models are valuable because they can be used to predict the outcome of experiments, and scientists compare these predictions with observation. They may discard a new model immediately if it fails to predict existing results. More usefully, the predictions of a model will guide the experimenter's hand, enabling him to design investigations to differentiate two or more opposing ideas. The model(s) failing to predict the outcome of the test are discarded in favour of those that do.

Philosophers of science identify overarching or general models, called paradigms: ideas that are very wide-ranging and provide the framework for the formation of many more specific models. An example might be Newton's mechanics, a paradigm whose ideas are contained in lots of narrower models from fields as diverse as atomic theory and cosmology.

2.3 Classic Scientific Logic

There is no more to science than its method, and there is no more to its method than Popper has said. Hermann Bondi (Quoted by Magee (1973))

Model building is the classic description of scientific method expounded at length by Karl Popper in his famous books The Logic of Scientific Discovery (1968) and Conjectures and Refutations (1972). His approach, often called the hypothetico-deductive method, is accepted as a major feature of scientific logic. Popper is often thought to have regarded falsification as the centre of scientific logic but this is an error. To him falsification was extremely important and the elaboration of this principle was his own major contribution. However, he also held that all ideas, even his own, could and should be subject to reasoned, rational criticism. This principle of critical rationalism originated in ancient Greece, not with Popper, but to him it, not falsification, was the central scientific principle. Thus, it is necessary to be clear about the meaning of these two words, rationality and criticism.

The philosophy of rationality is the philosophy of the Enlightenment. It originated much earlier but was elaborated in the 17th and 18th centuries by Descartes, Spinoza, Leibniz and others in response to the growing success of science. Rationalism incorporates the principles of logic and certain ideas about the universe. It holds, for example, that there is only one single reality, hence that a person cannot simultaneously hold two contradictory beliefs about the world. It follows that to assert one theory is to simultaneously reject all competing theories. To assert otherwise is, in the strict meaning of the word, irrational. Further, it holds that a rational belief must be based on sufficient reason and that a rational believer should proffer reasons sufficient to justify holding his view. Rationality asserts that, to hold any belief, one must equally accept all the logical deductions that flow from it. The process of testing ideas by experiment depends on this principle: it leads to the conclusion that inconsistent experimental results undermine a theory.

Rationalism does contain different streams of thought, one split being into subjective and objective rationality. The latter is exemplified by Popper and asserts that the external world is real and that science seeks that reality. Objective rationality is the traditional system and remains the foundation of science; it rejects all authorities other than observation and reason but does accept that no certain conclusions can ever be drawn. Subjective rationalists include pragmatists and naturalists, who note that lack of certainty and conclude that ultimate reality must reside in humans themselves - their motives, objectives and beliefs. The subjective/objective distinction was made by Horkheimer, The Eclipse of Reason (1947), who attacked subjective philosophies, noting how they can rationalise any act, for example, "I have to consider my own best interests," or "I was just following orders". Thus subjective rationality can maintain bizarre social practices, such as witchcraft, or become the tool of authoritarian social attitudes. Such social impacts led Horkheimer to reject all subjective rationality, adding that the "denunciation of what is currently called reason is the greatest service reason can render." Both in science and elsewhere, people who use the word rationality normally mean objective rationality.

Coming now to the meaning of criticise - to find fault with. This is a word that does have quite negative overtones but finding fault is exactly what scientists are asked to do with theories - hypothesis testing is a negative logic. However, they are not asked to give just any criticism; it should be rational, reasoned criticism. The three practical characteristics of such criticism were summed up by Bertrand Russell (1935, p66) in his description of reason, "in the first place it relies upon persuasion rather than force; in the second place it seeks to persuade by arguments which the man using them believes to be completely valid; and in the third place it uses observation .... as much as possible and intuition as little as possible." The first of these rules out the use of inquisitorial methods, the second rules out the use of propaganda and the third rules out appeals to the emotions or self-interest of the audience.

The implication of this is that critically rationalist debate requires certain behaviours from participants, generally that they be seriously seeking the truth. Thus, they must present all arguments they believe to be valid and may only present arguments they believe to be valid; both facts and opinions must be reported honestly. To enable criticism, such presentations must be open and available to all. A further facet of critical rationalism is "the principle of sufficient reason": decisions are not made arbitrarily but must be founded on reasons that are stated and adequate to justify the verdict.

Critically rational debate in science involves relevant experiment, and the last idea surviving after a period of such debate becomes knowledge. We can never be sure that a piece of knowledge is true, because a better idea or contrary observation may come along later. Nevertheless such knowledge is the closest we can come to knowing external reality. Because doubt can always be expressed, it is often useful to think of knowledge as a contrast concept to a guess (Harré (1972)). Knowledge is the product of a rationally considered choice between alternative hypotheses, rather than choosing between them by guesswork. Thus, one may not randomly choose two alternatives from three, then conduct a rational debate to decide which of these two is correct. Such a mixing of rationality with irrationality is simply irrational.

These principles of critical rationalism generate the ethical imperatives of science. Popper suggested that they separate random ideas from knowledge, pseudoscience from science; modern scientists agree. It is evident that many human dialogues are not critically rationalist. In many situations the aim of participants in dialogue is to "win," whatever that may mean in their circumstances. Accordingly, in Popper's hands, critical rationalism became more than a scientific principle; he saw it as the alternative to all authoritarianism and it guided his political thinking. To him these principles underlay the freedom of speech and democracy upon which western society prides itself. Science is often held up as a bastion against authoritarianism because of this.

Today Popper's ideas are widely accepted. So much so that they are offered as advice to prospective research students. For example, Phillips & Pugh (1987), begin their advice to students by demolishing an older scientific philosophy, the idea that science starts with the gathering of disparate facts by entirely objective and dispassionate researchers:-

The myth of scientific method is that it is inductive: that the formulation of scientific theory starts with the basic raw evidence of the senses - simple unbiased unprejudiced observation. Out of these sensory data, commonly referred to as "facts" - generalizations will form. The myth is that from a disorderly array of factual information an orderly, relevant theory will somehow emerge. However the starting point of induction is an impossible one.

They point out that even scientists are human and begin with their own prejudices:-

There is no such thing as an unbiased observation. Every act of observation is a function of what we have seen or otherwise experienced in the past. All scientific work of an experimental or exploratory nature starts with some expectation about the outcome. This expectation is an hypothesis. They provide the initiative and incentive for the enquiry and influence the method. It is in the light of an expectation that some observations are held to be relevant and some irrelevant, that one methodology is chosen and others discarded, that some experiments are conducted and others are not. Where is your naive pure and objective researcher now?

Then, crucially, they go on - all scientists start with a hypothesis, a model, but they must never think they have proved it - they must try to disprove it :-

Hypotheses arise by guesswork, or by inspiration, but having been formulated they can and must be tested rigorously, using the appropriate methodology. If the predictions you make as a result of deducing certain consequences from your hypothesis are not shown to be correct then you must discard or modify your hypothesis. If the predictions turn out to be correct then your hypothesis has been supported and may be retained until such time as some further test shows it not to be correct. Once you have arrived at your hypothesis, which is a product of your imagination, you then proceed to a strictly logical and rigorous process, based upon deductive argument - hence the term "hypothetico-deductive". (Italics added.)

Prejudices may govern how a hypothesis is created but it is illegitimate to display the same prejudice when comparing its predictions with data. A scientist should permit criticism of his ideas and accept disproofs, even of his own models, when they are there.

2.4 Probable and Improbable Hypotheses

Not all models are equal. Apart from well thought out concepts, a whole range of improbable or downright silly notions could be created to account for a set of observed results - Heath Robinson could have worked on scientific theories had he so chosen. How one model is chosen for test, and another deemed silly, is for the judgement of scientists but the verdict should not be random. Intuition, guesswork, prejudice, analogy or any other thought process may help conceive a model but, once devised, there is little reason for the judgement of its reasonableness to be personal and absolutely none for the interpretation to be inexplicable or secret. Scientists can articulate the reasons to consider one model, while dismissing another. There are analogous situations.

Compare a computer chess player with a human master of that game. The computer's approach is to examine and analyse every single move. There may be hundreds. If the computer is programmed to analyse several moves deep, these hundreds multiply into a need to assess millions of positions. The computer is very stupid but very fast - for the chess computer examining every single possibility is an approach that works and modern computers are good at chess. Human beings are slow; they cannot play the same way. Faced with the board and the hundred or more possible moves, leading shortly to millions of possible positions, they can consider only a very small number, perhaps four or five, before deciding which to play. However, a good human chess player is intelligent and usually considers the right moves. The principles of chess strategy allow him to reduce the number of moves to ponder in depth but still remain confident he is considering the best.
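
The arithmetic behind this contrast can be sketched in a few lines. This is only an illustration: the figures used (100 moves per position, a search three moves deep, five candidate moves for the human) are assumptions chosen to match the rough magnitudes mentioned above, and the function name is hypothetical.

```python
def positions_to_assess(moves_per_position: int, depth: int) -> int:
    """Number of positions reached when every considered move is
    followed to a fixed depth (moves_per_position ** depth)."""
    return moves_per_position ** depth


# Assumed, illustrative figures; the text gives only rough magnitudes.
exhaustive = positions_to_assess(100, 3)  # computer: every move, three deep
selective = positions_to_assess(5, 3)     # human: about five candidates each time

print(exhaustive, selective)
```

Under these assumptions the exhaustive search must assess a million positions while the selective one assesses a little over a hundred, which is why the quality of the screening matters so much to the human player.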

The judgement the chess master uses to eliminate bad moves consists of the application of chess strategy, a topic explained at great length in many books - central control, space control, piece development, weak points, pawn structure, freedom of movement etc. Great chess players are distinguished by their insight into such principles. If a great player is asked why he chose one move in preference to another, he will probably reply in those terms.

Great scientists may be distinguished by their insight into how to eliminate unworkable models. This is scientific strategy but it is a phase of reasoning almost never recorded. During their training, scientists do not read books explaining the principles used to reduce the number of hypotheses to be considered. Even so, practising scientists must surely use such principles, possibly subconsciously. Analysis of this thinking is quite disparate. Most thought has been due to philosophers of science, with their demarcation criteria, and to sociologists of science, who simply ask the workers concerned. In both cases their studies are little read by practising scientists; some will be reviewed later. It is strange that this stage of reasoning is so little recorded. Not only is it perfectly possible to make a record but, at times, scientists have an evident duty to do so.

2.5 Scientific and Non-scientific Hypotheses - Demarcation Criteria

Merely constructing a hypothesis is not science. Some hypotheses are scientific, others are not. Philosophers of science have long sought to establish a basis to decide which ideas are "scientific". The yardsticks they lay down are called demarcation criteria - tests that can be applied to distinguish scientific from non-scientific hypotheses. They are reviewed in the appendix to this chapter.

2.6 Three Stages of Scientific Method

The hypothetico-deductive method can be seen as requiring three phases in scientific thought. These phases are -

  1. Laying down, or brainstorming, of all possible explanations of an observation. As many hypotheses as possible can be created here as this gives the best chance of the "correct" model being among those considered. The inclusion of incorrect models should be unimportant.
  2. A judgement- or strategy-based screening of the various models to decide between those worthy of being tested and those that can be discarded on some general principle - some demarcation criterion. For this stage to work, it should be regarded as permissible to criticise the ideas put forward in stage 1. The models surviving this stage are likely to be those for which a reasonable a priori (or prima facie) case can be made.
  3. Test of surviving models against empirical observation, either by reference to available data, or by designing new and critical experiments.
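
The three stages above can be read as a simple filter that also keeps a record of why each model was excluded. The following is a minimal sketch, assuming that screening and testing can each be reduced to a yes/no function, which real science does not guarantee. The function name, the labels and the toy hypotheses "A", "B" and "C" are all hypothetical.

```python
from typing import Callable, Dict, Iterable


def three_stage_method(
    candidates: Iterable[str],
    passes_screening: Callable[[str], bool],
    survives_test: Callable[[str], bool],
) -> Dict[str, str]:
    """Return a record of what happened to every proposed hypothesis."""
    record: Dict[str, str] = {}
    # Stage 1: every proposed explanation is laid down for consideration.
    for hypothesis in candidates:
        # Stage 2: screening on a stated general principle.
        if not passes_screening(hypothesis):
            record[hypothesis] = "discarded by screening"
        # Stage 3: comparison with empirical observation.
        elif not survives_test(hypothesis):
            record[hypothesis] = "refuted by experiment"
        else:
            record[hypothesis] = "retained (not proved)"
    return record


# Toy example: "C" fails the screening, "B" fails the experiment.
outcome = three_stage_method(
    ["A", "B", "C"],
    passes_screening=lambda h: h != "C",
    survives_test=lambda h: h == "A",
)
print(outcome)
```

The point of returning a record, rather than only the survivors, is that a model dropped at stage 2 is rejected just as surely as one refuted at stage 3, so the reason for its exclusion is worth keeping.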

The three stages need not be executed consecutively. A new hypothesis may be advanced at any time, even after attempts have been made to test other hypotheses. No theory is ever proved. All theories are open to challenge and criticism may be advanced at any time.

Moreover, there is no reason why a new hypothesis should not be proposed by anybody, including people not deemed to be "expert". Non-experts, people without considerable training, would find it difficult to produce a theoretical novelty that could not be dismissed by reference to established experimental data or a demarcation criterion. Even so, there is no logical barrier to them doing so. The task of criticising theories seems easier than that of devising them and may well be within the capabilities of non-experts but, in practice, the difficulty of the task is not the only fence an amateur would have to jump. Even if his new theory, or his criticisms, met all scientific criteria, the non-scientist may not be listened to by professionals. Even well-established scientists find it difficult to get new theories heard against earlier alternatives.

Of the three stages, generally only the third is found in the scientific literature. The processes going on during the first two stages are rarely recorded. This is unfortunate as the agenda of science, its operational timetable, is laid down during those earlier periods. The exclusion of a concept from that agenda is just as important as the inclusion of another, and more capable of invalidating scientific conclusions. Exclusion, at any stage, is equivalent to saying a theory is wrong. No experiment can ever be done without some form of screening process having been performed but the scientific literature explains these stages only after the event or, more probably, does not explain them at all. When it does, the presentation is a sanitised representation of what may have been a messy process.

To put it another way, and more baldly, it is during those first two stages of a scientific programme that decisions are made as to how research funds will be allocated. In the real world, those decisions largely prejudge the outcome of scientific inquiry, yet there is little study of their formation and only the most opaque of records.

2.7 Gatekeepers and the Management of Science

I am the gate. Whoever enters by me will be saved (John Ch10 v9, NIB)

Whatever system of philosophy is adopted, science poses certain unavoidable management problems. Its fields are highly specialised and proper, effective decisions depend upon access to technical knowhow. Such expertise is normally available only from the scientists themselves. To ensure such knowledge is available during administrative decisions, certain scientists are appointed to decision-making positions involving, for example, deciding what projects should receive research funds, which individuals will be appointed or promoted, or what papers will be published. The scientists chosen for these roles have often distinguished themselves in some way and are the elite of science. These gatekeepers play a key role in scientific management.

Scientific gatekeepers decide what is, or is not, science. Their corporate decisions define science in an administrative and practical way, marking out the area of human endeavour called science. Something in the nature of gatekeeping exists for all subcultures and the role is a key and often very powerful one. Most professional subcultures try to select gatekeepers so as to avoid their having any personal vested interest in the decisions they will take. However, science is different in this regard. Because of its highly technical nature, science selects its gatekeepers solely from the field being gatekept. As a result, virtually every gatekeeping decision in science is taken by an individual with a very definite self-interest in its outcome. Also, there is almost no definition of gatekeeping responsibilities and virtually no public accountability for the way gatekeepers discharge their roles. Scientific gatekeeping decisions are taken anonymously; even those affected are kept ignorant of the identity of the person who made the decision and the rationale he used.

In principle, the role of the scientific gatekeeper is to use his acknowledged expertise to:-

  • Define the scientific subjects by filtering out what is not science.
  • Maintain standards by ensuring that funded projects and published work are of acceptable quality.
  • Ensure the people who work in a field have a level of ability and integrity that enables standards to be maintained.

A gatekeeping scientist takes on a duty to make decisions in the best interests of science and those who fund the research, often the taxpayer. For a gatekeeper involved in the publication of a scientific journal (for example an editor or referee) this would include ensuring papers followed proper scientific methodology, which would involve reporting pre-existing studies appropriately. A gatekeeper involved in funding decisions has a responsibility to check projects for feasibility and whether regard was paid to reasonable alternative approaches. In any event, the gatekeeper should use his knowledge and expertise to represent the wider interests of science and the community.

2.8 Closing the Gates?

However, since scientific decision making takes place behind closed doors, it is normally impossible to establish who has taken a particular decision; it is even difficult to establish whether a decision has been taken. "The Habit of Lies" reports contacts with gatekeepers, and in them it is hard to discern awareness of a duty to represent anything but the gatekeeper's own opinions, interests or convenience. Administratively, a demarcation criterion is being created which has nothing to do with logic or experiment. Scientific ideas are dismissed, with the grounds given being simply, "the (anonymous) gatekeeper says so". This is simply not a rationally valid criterion and raises questions about the motivations and accountability of the gatekeepers themselves.

It is most disturbing. The gatekeepers of a field are its existing experts. They can exclude views, not merely because those views lack sense, but simply because they "disagree" with them, and in this context "disagree" can have a range of meaning running from "disagree," through "can't reply," to "I'm jealous." In "disagreeing", gatekeepers can and do turn their back on reasoned explanations. This administrative state of affairs flies in the face of Popperian logic, the principles of critical rationalism, openness and freedom of speech. In effect, science is subject to authoritarian government by gatekeepers.

2.9 Chronological Order Dictates Merit

It seems that what matters about a theory is not whether it is right or wrong but whether it was proposed first, second or third etc. (Who proposed it also matters; if the innovator is himself already a gatekeeper things are different.) The first theory in a field is advocated by its first workers. Those workers are taken to be experts. New hypotheses are assessed, anonymously and without accountability, by the same men who, now acting in the role of gatekeeper, have a vested interest - an interest in thwarting any ideas that threaten to replace those from which their own influence flows. Those "experts" have complete freedom to reply to the alternative in a rationalist way, simply ignore or patronise the upstart idea or perhaps even steal it. If a good argument is available to rebut an alternative theory, they will no doubt present it in their reply. But even if the newly developed theory is plainly superior, the "expert" gatekeeper is in no way obliged to accept or even consider it. New theories can simply be stifled by gatekeeper indifference.

Gatekeepers may not intend this to be the outcome of their actions but it can easily be so, whatever the intention might have been. It certainly cannot be the intention of those who fund science, yet there are no guidelines to protect inchoate concepts from predatory gatekeepers. In the case of the capping field reviewed here, gatekeepers are largely the advocates of the two theories considered first. Their views are simply imposed with no shred of an attempt at rationalist justification.

Appendix to Chapter 2

Scientific philosophy is a large subject and it deserves a more thorough review than has yet been given. Unfortunately, the material can be rather specialised; while some readers will require it, others may prefer to go on. Accordingly, much material is collected into this appendix. Those preferring to be guided by common sense may like to skip it.

2.10 Other Views of Scientific Method

Hypothesis falsification is the underpinning logic of scientific knowledge and the hypothetico-deductive method is fundamental to scientific thought. If an alternative, viable theory is advanced, scientists need to be able to view their work in that context. Otherwise, they will be unable to devise experiments to decide between received wisdom and the new theory.

However, that does not mean scientists spend most of their time thinking of models to eliminate. For much of the time, as Kuhn (1970) pointed out, they do not need new theories but are merely elaborating on the existing body of knowledge, not challenging it. Kuhn's description of scientific progress was based on sociological and historical observation and has two phases. One phase comprises periods of revolution during which competing ideas battle for the minds of the scientific community, the second being periods of normal science when fundamental ideas are generally accepted. In Kuhn's model the hypothetico-deductive method, if it applies at all, applies in the revolutionary phases, when bitter exchanges between opposing camps often occur.

There are other descriptions of science based on observation of the behaviour of scientists. Examples are due to Polanyi (1958) and Maslow (1966), who approach science from an individual, psychological standpoint, Knorr-Cetina (1981) who sees knowledge as, in effect, a manufactured product, Feyerabend (1975) describing an anarchic struggle, Ziman (1978) who views science as a search for (or battle for) consensus and Toulmin (1972) who takes an evolutionary view of ideas. Popper's social, as distinct from logical, view was also evolutionary (Popper (1979)). Toulmin's and Popper's studies are of particular importance as they arrived at the evolutionary stance largely from a traditional scientific/philosophical standpoint. They presage recent advances made by the application of evolutionary ideas to philosophical areas such as epistemology (the theory of knowledge) and ethics and form a link between those fields and the philosophy of science. (Plotkin (1994) reviews this.) There are still other empirical descriptions of science but further review will not be undertaken here. Like the work of Kuhn, they contain many valid perspectives.

Empirical approaches do not compete with Popper's logic and should not be seen as replacements for it. Popper set out to elaborate the underlying principles of scientific logic, not to present a sociological study. Kuhn, and the other empiricists, have set out to observe scientists at work, or report from scientific history, arguing that the underlying principles of science would emerge from this examination of scientists in action.

Such empirical studies will undoubtedly yield new principles but the standards discerned will be those of scientific sociology not logic, quite a different matter. Very often positions are sociologically quite logical but scientifically illogical. For example, a student may have data showing his eminent Professor is wrong but still take the public line that he agrees with him. Scientifically quite wrong but, remembering the considerable influence the same Professor will have over his future career, the student is being sociologically very logical. Scientists depart from scientific logic regularly. Often this will lead them to error, sometimes it will lead them to malpractice or even fraud. Approaching logic empirically will, in its nature, incorporate such actions as if they were principles of scientific logic. As Toulmin (1972) comments "it was .... always a mistake to identify rationality and logicality". When people choose one action over another they always have a rationale but it will not necessarily be logical and may not be apparent.

Compare the situation with the problems facing a Martian astronomer equipped with a very powerful telescope, who wishes to use his instrument to determine the speed limit on British motorways. The maximum speed, by law (by logic, if you like) is in fact 70 mph but if the Martian sought to determine this principle of driving merely by observing the actual speed of drivers, he would arrive at a figure rather in excess of that! Drivers are fallible people, as are scientists. They are not always law abiding, truthful or logical. Observation of scientific behaviour will certainly teach us much about human nature but rather less about scientific logic.

2.11 Weakness of the Hypothetico-deductive Method

Popper's basic idea, of model (or hypothesis) falsification based on critical rationalism and its concomitant antiauthoritarianism, is the accepted base of scientific logic. It is a testing protocol linking scientific ideas to experimental reality. This link, connecting theory, through experiment, to reality, is the reason for the great success of science as a philosophy but it is not a perfect link - it has weaknesses. The main problem is in the early phases of the process. Firstly, science makes almost no record of how it decides which models or theories it should test. Secondly, and compounding the first problem, in the real world scientific judgement is clouded by the personal subjectivities and deviations of scientists themselves. Thus it is the initial development and selection of models to be tested, a process not necessarily linked to experiment at all, that remains the major logical difficulty inherent in the paradigm of falsification.

Robert K. Merton (discussed in Section 12.4) enunciated principles of scientific ethics which included Universalism, the belief that ideas must be considered without regard for their origins or who proposed them. This is implicit in Popper's logic. However, it cannot mean that all theories must be translated into experiment; that would be impractical. To put it baldly, again, the problem is how to decide which research projects to fund, and especially how this is decided when sociological observation indicates that the advice given by scientists themselves is hampered by personal subjectivities and deviations from logic. It is necessary to have some ground, some demarcation criterion, to decide before experiment which theories are most likely to be correct.

In law, similar problems can arise. On the basis of the law and the evidence before him, a judge must often try a case but be unsure of the right decision. If the case is a criminal case, the benefit of this doubt will go to the defendant. In a civil action a judge may be forced to take some kind of practical line. He does not have the luxury of scratching his head for ever, he must decide on the balance of probability. He will need to find a rationale, even if it is not perfectly logical. This may lead the judge to error but it is unlikely it will lead him to fraud - he must give an open account of his judgement and explain the case and how it relates to the law. If he gets these things wrong his judgement is subject to appeal. What is more, a judge should never try a case in which there was any hint of a personal interest.

In one role, a scientist can scratch his head and vacillate between two theories for ever, or stick to a wrong theory purely to save face. There will always be some argument to put. Set against a great mass of often conflicting experimental data, no opposing scientific theory will ever be completely perfect. But gatekeepers are the judges of science and for scientists in this role things are different: at the end of the day they must decide. When they go home at night, they must have made funding decisions, or job appointment decisions, or publication decisions. They must decide - whether or not they are sure. A rationale must be found even if it is not perfectly logical. However, although he is forming a judgement, the gatekeeper is not in nearly the same position as a judge. He is not subject to the discipline of explaining his decisions or recounting any scientific law or principle. What is more, he would not be deciding the issue at all unless he had a vested interest in its outcome. For the gatekeeper the temptation to follow the easy route of his interests or relativism must be very real.

In these circumstances problems arise, more for everyone else than for the gatekeeper. There are logical approaches, demarcation criteria, for selecting without experiment those theories most likely to be valid and therefore to reward funding. But how can anyone be sure the gatekeeper follows them? The observer is in a predicament. Strictly, the problem should be addressed by the administrators of science but, as "A Habit of Lies" will recount, they are content. That is not surprising - they are the gatekeepers.

2.12 Reducing the number of models - Demarcation Criteria

We will now turn to the question of which hypotheses are scientific: how to choose, from a range of possibilities, those hypotheses that are worthy of attention and deserve to be pursued. Philosophers of science address this problem by laying down demarcation criteria. A new theory should then be tested against the chosen criterion. Those ideas which satisfy the demarcation criteria would be most likely to be productive and most attention would be paid to them. The following sections present a series of demarcation criteria, though the list may not be complete.

2.13 Popper

The main demarcation criterion associated with Popper is falsifiability - in order to be scientific, a hypothesis should be falsifiable - it should make predictions that can be tested by observation or experiment. By tested, Popper meant some of its predictions must be such that, at least in principle, the contrary could be observed. This was his primary demarcation criterion and was seen by him as very important. On this basis, for example, he criticised the various schools of psychiatric thought because each could accommodate all observations. As a result the ideas did not compete with one another and attempts to distinguish them could not be informative. This test separates the hypotheses inherent in an act of faith - religion for example - from a scientific hypothesis. The statement, "God created the heavens and the earth," cannot be contradicted by observation. Therefore, Popper would not see it as a scientific hypothesis, whether or not it is believed true.

The idea is that only models which can, in principle, be falsified are scientific - others need not be considered. It is useful to view this assertion from a different perspective. Popper is saying that, to be meaningful, a scientific theory must deny something. The idea must prohibit some observations from being made; this is extremely important, because Popper's logic is purely negative, it asserts that the actual meaning of a theory lies not in what it asserts about the universe but what it denies. Some philosophers go further, arguing that any statement has meaning only in what it denies. Thus, even a sentence as simple as, "this paper is white," actually means, "this paper is not, not white." I.e. it is not green, not blue etc.

Falsifiability is the first example of a strategy, or general principle, for reducing the number of models. It is probably the most widely discussed demarcation criterion and shows at once that asserting a scientific theory is equivalent to denying alternatives.

Popper listed two other criteria besides falsifiability. Firstly, a good, new theory should, "proceed from some simple, new, and powerful unifying idea," (Conjectures and Refutations). It should, in principle, be able to unify a body of knowledge that would otherwise be a set of disparate facts. Secondly, Popper held that it should pass some tests. A good new theory should make at least one successful prediction not apparent from existing theory. This seems rather restrictive but is not as bad as it sounds. Popper would not have demanded that a theoretical astronomer build a radio telescope before publishing a new theory. Predictions explaining data within existing knowledge do meet this criterion.

2.14 Vacuousness

One of the favourite maxims of my father was the distinction between the two sorts of truth, profound truths recognized by the fact that the opposite is also a profound truth, in contrast to trivialities where opposites are obviously absurd. Niels Bohr

Some hypotheses transcend, rather than compete with, alternatives and ideas with this property are vacuous. Transcending the opposition may seem good but the result is that such a hypothesis is unfalsifiable and thus lacks meaning and information content.

To clarify the problem with vacuousness, consider the hypotheses that might be developed to explain the fact that cars move. One hypothesis could be the idea that engines cause this but the idea is vacuous. It is perhaps informative, telling us that neither the steering wheel nor the fuel tank cause motion, but at bottom it is merely descriptive. Few people would really have to go through a formal process of hypothesising and testing to conclude that the engine provided the motive force for a car. Asking how a car moves is materially equivalent to asking how an engine works. Even less informative, but perfectly true, ideas could be constructed (for example, that a car's motion is caused by molecules) but the point is that all theories should be assessed for their information content.

A rejection of vacuousness is essentially the same as the requirement of falsifiability. Because they deny nothing, vacuous theories have no information content, they are meaningless and minimal attention should be paid to them. These issues are clarified further by section 2.24, while vacuousness in relation to capping and particle movement is discussed in section 6.6.

2.15 Metaphysics

Many non-scientific hypotheses fall into the area of thought known as metaphysics, the study of existence. Metaphysics seeks to learn what entities exist, while science investigates the properties and behaviour of known things. Whether a postulate is science or metaphysics is often clear but at other times this is far from so. Hypotheses can be a mixture of science and metaphysics and many important theories have contained metaphysical elements. An example is Newton's theory of gravitation, which contains the metaphysical idea of action at a distance. There were many years of anguish before the undeniable, positive reasons for accepting Newton's laws made acceptance of action at a distance inevitable.

Popular scientific theories are often, in essence, metaphysical postulates and many serious scientific investigations are, at bottom, metaphysical, seeking evidence to support a postulate of existence. These comments are not intended to be disparaging to those investigations - this author does not share the common view in science today, where calling a theory metaphysical is tantamount to insulting it and its authors. It is better to view science and metaphysics as merely different, mutually dependent, areas of knowledge. For example, seeking life on Mars seems a perfectly legitimate investigation that is predominantly metaphysical. Another example of an almost entirely metaphysical discussion would be the reviews by Francis Crick (1982) and Fred Hoyle (1983) about panspermia, the suggestion that life on earth arose in outer space and crossed interstellar space in spacecraft or as dust.

2.16 Metaphysical Logic and Scientific Logic

It is undesirable to believe a proposition when there is no ground whatever for supposing it true. (Bertrand Russell, Sceptical Essays)

The distinction between science and metaphysics matters because there seems to be a significant difference between the logics of metaphysics and science. Science seeks to disprove a hypothesis and a persistent failure to do so leads to its acceptance. This is the negative logic of falsification. Metaphysics is not quite like this; before the existence of a postulated entity should be accepted, there needs to be positive reason to require the existence in question. For example, the postulate of life on Mars is a postulate of existence. It may be believed or not but well-justified belief would require positive supportive evidence, such as Martian roses.

In laying down theories, scientists do not normally distinguish science from metaphysics. That may be unfortunate, much of the philosophical disputation between confirmation and elimination of theories might be removed if this were done. Metaphysical logic seems to be largely the positive logic of confirmation, while scientific logic seems largely the negative logic of falsification.

Popper's hypothetico-deductive model applies to the scientific parts of theories but not so obviously to their metaphysical elements. It is generally a very difficult, or even universally impossible, task to disprove a metaphysical postulate. Even though it seems very unlikely, it would be difficult to actually prove that there is no life on the moon.

However, it is only when a metaphysical idea has supportive evidence that it becomes important. As an example, consider the atomic theory of matter. As every schoolboy knows, the idea of atoms was originally advanced by the Greeks but in this form the idea was metaphysical speculation unsupported by evidence. The idea of atoms was merely a conjecture, unproven, unlinked to any body of experimental evidence, and irrelevant to any possible course of action. Agnosticism was a rational view of the debate about atoms until Dalton's chemical laws, based as they were on observation, began to require them for chemical interpretations. The observations that positively required atoms also made them relevant, and they began to influence men's actions. In the twentieth century, photographs of atoms have been obtained, and disbelief has become irrational.

In logic, then, you just cannot win. Theories need positive evidence for the entities whose existence they postulate. Then they need negative disproof of competing theories.

2.17 Parsimony and Aesthetics

It is often said that a good scientific model should be parsimonious, it should make the minimum number of postulates necessary to explain the data in question. The principle of parsimony is essentially Occam's razor (Section 2.19).

Einstein is said to have favoured relativity strongly on aesthetic grounds, he found it more beautiful than the alternatives. This may be similar to parsimony as people could find simple hypotheses attractive, beyond that it is unclear to what extent aesthetics could be converted to a generally valid criterion. Even in Einstein's hands it seems to have been imprecise, he rejected the probabilistic aspects of quantum mechanics on much the same grounds - "God does not play dice with the universe".

2.18 Toulmin's Applicability Criterion

Toulmin (1972) argues that what distinguishes a good scientific concept from a less good one is its applicability rather than its acceptability (acceptable in his usage meaning not falsified). Where the erroneous neglect of scientific theories is being discussed, it brings us back to the start. Applicable to what? Applied by whom? The man in the street rarely thinks about scientific theories and when he needs one he goes to a textbook to find it. By the time a theory gets there, everything is done and dusted. Theories are applied mainly by scientists to problems chosen by scientists. Moreover, it is gatekeeping scientists who do the choosing and determine what theories will be applied to what problems.

2.19 Occam's Razor - the Coherence Criterion

A principle stated in correspondence by Dr. John Maddox, as Editor-in-Chief of Nature (his letter is quoted in Section 10.5) is that a hypothesis should be "grounded on previous understanding or observation." To take his example, in the nineteenth century there might have been competing hypotheses about the make up of the moon. One school of thought arguing the moon was made of rock, another school advancing the view that it was green cheese. As he says, even without experiment intelligent scientists would not have considered the green cheese hypothesis, because it was founded upon no present knowledge or observation. There are other, rather trite, reasons to reject the green cheese model. Cheese is a dairy product made by men from milk, in turn produced by lactating mammals. The green cheese hypothesis implies that men and other mammals are at large within the solar system, giving the green cheese hypothesis some very complex, improbable and unsupported implications.

The existence of such complex ramifications is a general reason for rejecting, or at least downgrading, a hypothesis without experiment. All this boils down to Occam's razor - hypotheses involving the least possible departure from the existing body of knowledge are most likely to be correct. Hypotheses that pick up well-established ideas from related areas inherit much of their supportive evidence, much as an organism inherits many characteristics from its evolutionary forebears.

Occam's razor is related to the idea of coherence with existing knowledge. To understand coherence one may think of all knowledge as being cut into a large number of small pieces much like a jigsaw puzzle. To reassemble the picture we must examine a piece to see if the pattern on it fits in with, or coheres with, the pattern on those pieces we have already assembled in that area. For a new piece of knowledge to fit comfortably in place, the shape of knowledge painted onto it should form a continuous pattern with, or cohere with, surrounding pieces.

A new claim to knowledge or a hypothesis which fails to cohere with surrounding knowledge is an extraordinary claim. Its acceptance would demand the revision of knowledge within those surrounding areas and, consequently, its acceptance demands extraordinary evidence.

Coherence, or Occam's razor, is a well known and important principle but two important caveats should be stated. Firstly, the coherence criterion must be used with care and moderation, applied rigidly it produces closed systems of thought. The pieces of the jigsaw already assembled may actually be in the wrong places. Secondly, the existing body of knowledge means exactly what it says and knowledge is well-founded belief (Popper). The existing body of knowledge does not mean the existing body of hypotheses. To be of any real value, a new idea must compete with existing suppositions used to explain the same data set. It is diametrically wrong to demand of a new hypothesis that it be consistent with the ideas it sets out to replace.

2.20 The Evolvability Criterion and Teleology

Biology presents a special case of (or at least a special justification for) Occam's razor. Darwin's theory of evolution by natural selection is the pivot around which biological theory revolves. It is biology's paradigm and states that all living things evolve by the accumulation of those small changes which are favourable to their reproduction. As a result, all individual structures and functions within an organism evolve from other similar structures or functions. No feature develops unless both it, and each individual step on the way to its production, enhance the evolutionary fitness of the organism concerned.

So strong is evolutionary theory in contemporary biology that this argument is comparable in force to assertions in physics based on Newton's laws or chemistry based on the balance of equations. As Williams (quoted in Barkow et al. (1992)) puts it, "A biological explanation should invoke no factors other than the laws of physical sciences, natural selection and the contingencies of history." Evolutionary theory identifies a number of relevant generalisations.

Parallel evolution. When unrelated organisms have similar evolutionary problems to solve, they often solve them in similar ways. Thus modern dolphins are remarkably similar in shape to the ichthyosaurs, the extinct marine reptiles that once occupied the same niche. In each case they had the problem of how to move efficiently through the water - each solved the problem by arriving at a similar, and very streamlined, shape.

Adaptations do not arise from nowhere, structures or functions arise as modifications of pre-existing features. Evolution takes a structure or process that already exists in an organism and does a job well enough, (it "satisfices," as evolutionary theorists say, see glossary) and then optimises this "good enough" adaptation. In evolution, an adaptation does not need to be perfect. A gazelle does not need to be a champion athlete, just run fast enough to dodge lions. It is often the case that an engineer, looking at a living system, can say, "I could design a better way of doing that". Nonetheless, the engineer's ideal could not arise in evolution unless some other adaptation already existed to serve as a starting point, and unless each step to producing it benefitted the evolutionary fitness of the organism in which it arose. For example, the wheel is a common engineering solution which is well-nigh absent from nature for this reason. (The bacterial flagellum is the only natural rotating device known to this author.)

Closely related organisms do the same job in closely related ways.

All this adds evolutionary weight to Occam's razor when used in biology - hypotheses involving the least possible departure from the existing body of knowledge are most likely to be correct. If one organism is known to solve a problem one way, scientists should consider similar solutions to similar problems in that or related organisms. They should also consider the possibility that similar evolutionary problems will have been solved by unrelated organisms in similar ways.

Closely related to the evolvability criterion is the ban on teleological argument. A "teleological" explanation posits a purpose about the way living things behave, for example, a giraffe has a long neck because it "wants" to eat from trees. In Darwinian theory, giraffes have long necks because that is a successful life strategy, long-necked giraffes eat well and so reproduce. When evolutionists talk about the evolution of organisms they are not using the word in the same way as do engineers when they speak of evolving the design of a new car. The engineers have a purpose, to produce a design which is the best compromise between the various design parameters for this new vehicle.

In general, all scientific explanation is mechanistic, and does not address itself to purpose. It is true that Darwinian natural selection can mimic purpose but that is no part of the theory; consequently scientists cannot propose mechanisms implying any kind of aim or advance planning by their subject organisms. To do so would reopen the debate about Darwinian and Lamarckian evolution. Of course, exceptions arise with intelligent organisms but that qualification does not apply to cell biology. The ban on teleology prohibits any blueprints or planning by individual cells.

2.21 Interests and Relativist Reasons

Many theories, now known to be correct, were initially ignored by establishment figures despite the evidence. A carefully studied example was the theory of plate tectonics (the idea that the continents are very slowly drifting across the earth's surface), proposed in the 1920s by Alfred Wegener. He developed the idea in several editions of his privately published book entitled The Origin of Continents and Oceans. His basic premise arose from the obvious "fit" between the continents visible to anyone with a map but he presented much other supportive evidence, including the geological similarities between the sections of land that seemed once to have been in contact and the phylogenetic affinities between the life forms in those regions.

All this evidence was correct and is cited in modern textbooks as support for plate tectonics but Wegener did make some mistakes. He proposed an incorrect mechanism for drift and his estimates for the rate of drift were wrong. His ideas were largely derided by the establishment for the next 40 years. They criticised his estimates of the rate of drift and his proposed mechanism, while largely ignoring the positive evidence underlying his proposal and the disproofs of vertical tectonics, the alternative they themselves advocated. However, in defence of that establishment, there was at least a debate, they did not completely ignore his work until after his death. Even so, at times like this, the number of models for test seems to be reduced by applying criteria of very dubious objective validity. It is when circumstances such as these might be arising that a duty should exist to explain the rationale for the rejection of an alternative.

Sociologists of science have offered two, non-exclusive, perspectives on the reasons why scientists choose one theory for test while ignoring another - the "interests" perspective and the "relativist" perspective. In the "interests" view a scientist may prefer one theoretical approach because it reflects their personal "interests". Interests in this context being either scientific or non-scientific. For example, see Stewart (1983) who summarises the interests perspective thus :-

"Scientific" interests might include desires to use quantitative theories or theories that "solve" particular problems. Less scientific interests are represented by desires to protect the basis of one's previous intellectual contributions or even to use theories compatible with one's social beliefs or class position in the broader society.

In short, in the interests perspective, scientists choose to accept or reject theories on a basis that is not objective but reflects the results they themselves want to achieve with the theory. Some interests are valid reasons for choice, such as quantitative over qualitative theories or the possibility that one hypothesis has some "pragmatic" use while the other would not have. Others are not valid, it is a question of scientific interest or self-interest. Interests have been a major theme in the sociological investigation of science for twenty years (Jardine (1991)).

"Relativist" reasons for choosing a particular approach would include the relative social, reputational or hierarchical status of the scientists concerned. Preexisting family or social relationships and even the personal persuasive qualities of an individual are relativist reasons. In other words relativism is taking a view of the person who advocates an idea, rather than examining the idea itself. (See Collins (1981) for a review of the relativist perspective.) "Sound chap," "old school tie" and "not invented here" are only three of the many phrases associated with relativist reasoning.

Interests and relativism are dark clouds lowering over the logic, philosophy and administration of science. They are the names we give to social rather than rational reasoning. When a gatekeeper shows a preference, the challenge for him is to show that it is a rational, not social, preference. When a scientific administration apply a quality indicator, their challenge is to establish it as a rational not a social measure.

2.22 Singularities

Singularities are events that occur unpredictably and in isolation, leaving little trace behind them. They are difficult or impossible to study scientifically. One example is meteorite impact and for many years scientists dismissed the idea of rocks falling from the sky as a hoax. The disbelief arose because of the difficulty inherent in studying such random events but in 1803, some 2,000 meteorites fell in one night on the French town of l'Aigle. The resulting proliferation of evidence and witnesses made a denial untenable.

Today some paranormal events might be described as singularities. It is very difficult to say whether ghost sightings, extraterrestrial visits and some miracles are real, mistaken or hoax events because, as singularities, they defy study. In many respects it is best to remove such topics from science altogether and leave agnosticism, rather than denial, as the most rational debating posture.

2.23 Summary of Strategies for Model Elimination

Strategies that can be used to reduce the number of models without experiment have now been reviewed. A summary is given here, though the list is not claimed to be original or complete.

  • Popper's principles of falsifiability, unification of knowledge and successful test.
  • Occam's razor, or the coherence criterion. Good theories will involve the least possible departure from existing knowledge.
  • Evolvability. Biological theories proposing mechanisms very different from those found in similar organisms or elsewhere in nature should be viewed as doubtful.
  • Vacuous theories are not scientific.
  • Parsimony and aesthetics.
  • Applicability to agreed problems.
  • Reasons based on the scientific or non-scientific "interests" of the individual scientist.
  • Reasons based on the "relativist" perspective or the relationship and relative status between protagonists of different theories.
  • Theories whose metaphysical components lack positive support may be downgraded.
  • Gatekeeper preference, as discussed in section 2.7. This is also tied up with scientific administration and quality indicators.
  • Evidence derived from singular observations will be treated as weak.

Non-scientific interests or the relativist arguments are not scientifically valid and gatekeeper preference is dubious.

2.24 Hypothesis Testing and Probability

The field at issue in this work has aroused strong passions and this author does feel deeply about the case that will be put; others may as vehemently disagree and that fervour clouds the real issues. A large part of the contribution mathematics makes to science is its tendency to strip personal values and passions from facts or assertions. With this in mind, the principal conclusions drawn from Popperian hypothesis testing will now be given in their mathematical formalism. This has often been reviewed, for example Losee (1993) briefly does so.

To any hypothesis we can attach a probability value, anywhere between 0 and 1, representing the likelihood that it is correct. An assigned value of 0 states that the hypothesis is certainly false while a probability of 1 claims the hypothesis to be definitely true. Intermediate values reflect intermediate degrees of confidence about the truth of the hypothesis.

A hypothesis set is all the hypotheses being considered to account for a given set of data. Such a set is well chosen if one, and only one, of its hypotheses must be true. In these circumstances the total of probabilities assigned to the hypotheses must be 1. A well-chosen set covers the available possibility space. A possibility space is all the possible outcomes of an experiment or all possible explanations of the experimental data. For example, if a dice is thrown, it may fall with any one of the numbers 1 to 6 uppermost. The six hypotheses P(1) to P(6) are all the available possibilities for how the dice will fall and cover this possibility space. One, and only one, of them must be correct. This hypothesis set is therefore well chosen and the sum of the six probabilities must be 1.
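
The idea of a well-chosen hypothesis set can be made concrete with a short sketch. The Python below is an illustration only and forms no part of the original argument. The function names `is_well_chosen` and `probability`, and the choice to represent a hypothesis as the set of outcomes under which it is true, are assumptions introduced here, as is the assumption that all six faces are equally likely. The sketch checks that exactly one hypothesis holds for every possible throw, and totals the probabilities.

```python
from fractions import Fraction

# Possibility space for one throw of a dice: the six faces.
POSSIBILITY_SPACE = frozenset(range(1, 7))


def is_well_chosen(hypotheses, space=POSSIBILITY_SPACE):
    """Return True if exactly one hypothesis is true for every outcome.

    `hypotheses` maps a label to the set of outcomes under which that
    hypothesis is true (a representation assumed for this sketch).
    """
    return all(
        sum(outcome in outcomes for outcomes in hypotheses.values()) == 1
        for outcome in space
    )


def probability(outcomes, space=POSSIBILITY_SPACE):
    """Probability of a hypothesis, assuming all outcomes equally likely."""
    return Fraction(len(outcomes & space), len(space))


# The six hypotheses "the dice shows n", for n = 1 to 6.
faces = {f"shows {n}": frozenset({n}) for n in POSSIBILITY_SPACE}

# A badly chosen set: "odd" and "shows 5" can both be true at once,
# and an even throw satisfies neither of them.
overlapping = {"odd": frozenset({1, 3, 5}), "shows 5": frozenset({5})}

faces_ok = is_well_chosen(faces)
faces_total = sum(probability(h) for h in faces.values())
overlapping_ok = is_well_chosen(overlapping)
```

Under these assumptions the six single-face hypotheses pass the check and their probabilities total 1, while the overlapping pair fails it. In real science, as the text goes on to say, such a check is rarely available because the possibility space itself is not fully known.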

There are problems involved in applying this probability theory to science. Firstly, scientists are usually unable to demonstrate that a hypothesis set is well chosen and does cover the possibility space. Accordingly, they must remain alert to the existence of alternative but unconsidered hypotheses. A lesser problem is that sometimes hypotheses overlap and two may be correct at once. Most importantly, the values of the probabilities assigned are highly subjective, so only very general statements can be made.

If a value of zero is assigned to the truth probability P(A) of hypothesis A then the hypothesis is being rejected. The precise vocabulary used to make the assignment is immaterial. The English language offers many synonyms for the word reject and semantic analysis may attach different shades of meaning to each. However, the mathematical interpretation does not allow those shades. For example, rather than saying it is false or rejected, hypothesis A may be said to be "superseded". The word superseded attaches a probability P(A) = 0 to hypothesis A, just as rejected does. Thus, in this context, superseded means exactly the same as rejected. The same would be true for any other word, expression or process whose impact was to set P(A) = 0. Thus it is a fallacy to think that such a word or process has a meaning different from reject. See section 9.6 for an example of this fallacy.

Consider a hypothesis set comprising three theories, A, B, and C, and attach probabilities P(A), P(B) and P(C) to each of them. Assuming that the hypothesis set is well chosen, then

P(A) + P(B) + P(C) = 1

An ardent advocate of hypotheses B and C might assign probabilities to these ideas such that

P(B) + P(C) = 1

If so, then he is also stating that P(A) = 0 and therefore rejecting hypothesis A. It is impossible to attach all available probability to hypotheses B and C without rejecting A. Science is an uncertain world where it is normally impossible to prove a hypothesis set covers the possibility space and the claim that

P(B) + P(C) = 1

is equivalent to a rejection of all other hypotheses, whether or not explicitly considered. Such certitude is unreasonable and a rejection of it is the central thesis of "A Habit of Lies", which describes a field in which the known hypothesis set comprises three possibilities. In that field, prominent scientific workers seem willing to consider only two, thus assigning a probability of 0 to the third, thus rejecting it and any other possibility.
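
The arithmetic behind this point is simple enough to sketch in a few lines. The Python below is a hedged illustration, not a method taken from the text: the function name `implied_probability_of_rest` and the particular numbers given to B and C are hypothetical, chosen only to show that whatever probability is not assigned to the named hypotheses is all that remains for A and for any unconsidered alternative.

```python
from fractions import Fraction


def implied_probability_of_rest(assigned):
    """Probability left over for every hypothesis not named in `assigned`.

    `assigned` maps hypothesis labels to probabilities. The full hypothesis
    set is assumed to be well chosen, so all probabilities sum to 1.
    """
    total = sum(assigned.values(), Fraction(0))
    if any(p < 0 for p in assigned.values()) or total > 1:
        raise ValueError("probabilities must be non-negative and total at most 1")
    return 1 - total


# An ardent advocate of B and C gives them all the available probability
# (the split between B and C is an arbitrary illustrative choice).
ardent = {"B": Fraction(3, 5), "C": Fraction(2, 5)}
left_by_ardent = implied_probability_of_rest(ardent)

# A more cautious worker holds something back for A and the unconsidered.
cautious = {"B": Fraction(1, 2), "C": Fraction(2, 5)}
left_by_cautious = implied_probability_of_rest(cautious)
```

With these illustrative figures the ardent advocate leaves exactly 0 for hypothesis A, which is the rejection described above, while the cautious worker leaves 1/10. The vocabulary used to describe the assignment makes no difference to the remainder.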

Many years before Popper, Bayes investigated the branch of mathematics applied to formal hypothesis evaluation and now known as Bayesian statistics. A scientific investigation links experimental results with the probability assignments attached to particular hypotheses. Before any experimental test is performed initial probabilities (known as antecedent probabilities) must be assigned to the various hypotheses. As experimental data become available these antecedent probabilities are adjusted up or down depending on whether the observations support or do not support the corresponding hypothesis. The theorem used to adjust the probabilities is known as Bayes' theorem. Some fields can use the procedure quite formally. For example, in medical diagnostics, antecedent probabilities reflect the incidence of a disease in the population. In practical science Bayes' theorem has little formal use because of the general difficulty of giving objective numerical values to the antecedent probabilities. Accordingly, the theorem is neither stated nor used here. Even so, scientists must intuitively use Bayes' theorem, assigning antecedent probabilities by judgement.
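Although the argument of this chapter does not depend on the formal theorem, readers who want to see the adjustment step may find the following minimal sketch helpful. It uses the medical case mentioned above. The function name and every number (a 1% incidence, a test that detects 90% of cases and gives a false positive in 5% of healthy people) are invented for illustration only.

```python
def bayes_update(priors, likelihoods):
    """Adjust antecedent probabilities in the light of one observation.

    priors      -- dict: hypothesis -> antecedent probability (summing to 1)
    likelihoods -- dict: hypothesis -> P(observation | hypothesis)
    Returns a dict of adjusted (posterior) probabilities.
    """
    unnormalised = {h: priors[h] * likelihoods[h] for h in priors}
    evidence = sum(unnormalised.values())
    if evidence == 0:
        raise ValueError("observation is impossible under every hypothesis")
    return {h: value / evidence for h, value in unnormalised.items()}

# Hypothetical figures: antecedent probabilities reflect disease incidence.
priors = {"disease": 0.01, "healthy": 0.99}
# Hypothetical chance of a positive test under each hypothesis.
positive_test = {"disease": 0.90, "healthy": 0.05}

posterior = bayes_update(priors, positive_test)
print(round(posterior["disease"], 3))  # about 0.154
```

Under these assumed figures a single positive result raises the probability of disease from 1% to roughly 15%: the observation supports the hypothesis, but the low antecedent probability still dominates.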

Mathematicians have investigated the fallacies arising in Bayesian statistics, some of which help to clarify points made earlier. A hypothesis is meaningful only if it partitions the possibility space; for example, the hypotheses that a dice will fall as a five or as an odd number are both meaningful in that they can both be wrong - it may fall as a four. On the other hand, the hypothesis that the dice will fall with a number uppermost is not meaningful because all possible outcomes are numbers - the hypothesis cannot be falsified because it fails to partition the possibility space. This failure is what philosophers of science mean when they describe a hypothesis as vacuous.
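The dice example can be made concrete by treating each hypothesis as the set of outcomes under which it would be true. This is only a sketch; the names OUTCOMES, falsifying_outcomes and is_vacuous are hypothetical helpers introduced for the illustration.

```python
# The possibility space for one throw of a six-sided dice.
OUTCOMES = frozenset(range(1, 7))

def falsifying_outcomes(hypothesis):
    """Outcomes that would show the hypothesis to be wrong."""
    return OUTCOMES - set(hypothesis)

def is_vacuous(hypothesis):
    """A hypothesis is vacuous if no possible outcome could falsify it."""
    return not falsifying_outcomes(hypothesis)

falls_as_five = {5}
falls_as_odd = {1, 3, 5}
falls_as_a_number = set(OUTCOMES)

for name, hypothesis in [("five", falls_as_five),
                         ("odd", falls_as_odd),
                         ("a number", falls_as_a_number)]:
    print(name, sorted(falsifying_outcomes(hypothesis)), is_vacuous(hypothesis))
```

The first two hypotheses each leave some outcomes, including a four, that would falsify them. The third leaves none, so it fails to partition the possibility space.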

A hypothesis may be "academic" (in a pejorative sense); whether it be true or not will make no difference to actions or beliefs flowing from the statistical analysis. The distinction is important for doctors making a diagnosis - only if two diseases require different treatment, is the physician concerned to know which his patient suffers from. Returning to the example of the dice, whether it falls as a five or not will affect my actions only if I am playing snakes and ladders or have some other link to this test. For most people, the outcome of throwing dice is academic and uninteresting. In science, this pejorative form of the word academic means that whether a hypothesis is true will have no effect on perceptions of the world or how people act.

Finally, note again that a hypothesis set should be well chosen and, without overlap, cover all possible explanations. It is hard, in science, to prove that a hypothesis set does entirely cover the possibility space. The proper response to this problem is to contemplate the possibility that all the considered hypotheses are wrong. It remains very wrong to use a hypothesis set that is known not to cover the possibility space.

2.25 Assessment of Antecedent Probabilities

Much of the intuitive Bayesian statistics used by practising scientists consists of the assignment of antecedent probabilities to any suggested hypotheses. This is the statistical equivalent of initial hypothesis screening, the criteria for which were summarised in section 2.23, and many of the criteria discussed there could be used to assign high or low antecedent probabilities to a hypothesis. Modern biology would attach great importance to the evolvability criterion, while Kaplan (1964) suggests that high probabilities be assigned to hypotheses that cohere with existing knowledge (section 2.19) or have high pragmatic value (section 2.22). Many philosophers have suggested both coherence and pragmatics as criteria for truth but they seem better used to adjust antecedent probabilities.

An aside is useful here; pseudoscience is an insulting term used by many establishment scientists to dismiss areas such as homeopathy and investigations of the paranormal. Such fields make claims which do not cohere with existing scientific knowledge. When it stops being insulting, the demand mainstream science makes of such fields is that the extraordinary claims being made demand extraordinary evidence in their support. The mathematical justification for that requirement comes from the assessment of antecedent probabilities. If a hypothesis fails to cohere with existing knowledge, it is right to assign it a low antecedent probability. Only very clear evidence supporting it, and contradicting more cohering hypotheses, will bring its probability assignment up to a point where it would be accepted.
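A rough sense of what "extraordinary evidence" means numerically can be had from the odds form of Bayes' theorem, in which each independent observation multiplies the odds on a hypothesis by a likelihood ratio. The sketch below rests on stated assumptions: the function name is hypothetical, the observations are taken to be independent, and the figures (a likelihood ratio of 10 per observation, an acceptance threshold of 0.95) are arbitrary illustrations.

```python
def observations_needed(prior, likelihood_ratio, target):
    """Count the independent observations needed to lift `prior` to `target`.

    Each observation is assumed to favour the hypothesis over its
    rivals by the same `likelihood_ratio` (which must exceed 1).
    """
    if not 0 < prior < 1 or not 0 < target < 1:
        raise ValueError("prior and target must lie strictly between 0 and 1")
    if likelihood_ratio <= 1:
        raise ValueError("evidence must favour the hypothesis")
    odds = prior / (1 - prior)
    target_odds = target / (1 - target)
    count = 0
    while odds < target_odds:
        odds *= likelihood_ratio  # posterior odds = prior odds x ratio
        count += 1
    return count

# A hypothesis that coheres with existing knowledge (assumed prior 0.5).
print(observations_needed(0.5, 10, 0.95))
# A hypothesis that does not cohere (assumed prior one in a million).
print(observations_needed(1e-6, 10, 0.95))
```

With these assumed values the cohering hypothesis needs two such observations and the non-cohering one needs eight. The demand on the second is heavier, but it is finite, provided its antecedent probability is not set to zero.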

Invalid criteria such as relativism and self-interest will intrude on the intuitive assignment of antecedent probabilities. They will lead to the assignment of a low antecedent probability to a correct hypothesis and vice versa. However, unless the correct theory is actually assigned an antecedent probability of zero, this should only slow things down. The objective application of Bayes' theorem would steadily improve the probability assigned to the correct hypothesis as experimental data became available. (In Bayesian statistics, antecedent probabilities can, in principle, be assigned randomly but still ultimately produce good knowledge. This may be how some sciences arose from areas we would today classify as mythology. Alchemy for example led to chemistry and astrology to astronomy.) Only if the correct hypothesis is dismissed entirely will Bayesian statistics fail. If the antecedent probability assigned to a correct hypothesis is zero, Bayes' theorem will keep the probability at zero no matter what the outcome of experiment and the remaining ideas will become a closed system of thought. This seems to be true of the intuitive Bayesian scientist, just as it is of the formal statistical process.
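The difference between a low antecedent probability and a zero one can be shown with a small numerical sketch. Everything in it is an assumption made for illustration: the three hypotheses, the likelihood each gives to a repeated observation (A, taken to be the correct theory, predicts it best), and the number of observations.

```python
def bayes_update(priors, likelihoods):
    """One application of Bayes' theorem over a set of hypotheses."""
    unnormalised = {h: priors[h] * likelihoods[h] for h in priors}
    evidence = sum(unnormalised.values())
    return {h: value / evidence for h, value in unnormalised.items()}

def update_repeatedly(priors, likelihoods, observations):
    """Apply the same update for a number of independent observations."""
    beliefs = dict(priors)
    for _ in range(observations):
        beliefs = bayes_update(beliefs, likelihoods)
    return beliefs

# Hypothetical likelihoods: A predicts each observation best.
likelihoods = {"A": 0.8, "B": 0.4, "C": 0.2}

# Prejudice gives A a low but non-zero antecedent probability.
slowed = update_repeatedly({"A": 0.01, "B": 0.50, "C": 0.49}, likelihoods, 20)
print(round(slowed["A"], 4))

# A is dismissed entirely: its probability can never recover.
closed = update_repeatedly({"A": 0.0, "B": 0.5, "C": 0.5}, likelihoods, 20)
print(closed["A"])
```

In the first run the probability of A climbs to well above 0.99 despite its poor start. In the second it stays at exactly zero, and the available probability is shared between B and C whatever the data show.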

Intuitive Bayesian statistics are applied both by individuals and by the community of scientists. Both levels will assign intuitive antecedent probabilities to hypotheses and both, being human, will err. The discussion of psychology, given in chapter 13, will address the expected directions of those errors. It will be argued that the consensus formed during scientific quality assessment is usually illogically extreme. In general, the scientific community is too willing both to assign a probability of zero to dissenting ideas and to assign a probability of one to its own beliefs.

2.26 The Origins of Uncertainty

It is universally accepted, and implied by use of probability theory in hypothesis testing, that no scientific theory can be known, with total certainty, to be true. Scientific certainty is lost in two general ways - uncertainty in the outcome of experiments and uncertainty in their interpretation. Our certainty in the outcome of experiments is greatly increased by care in their execution and by repetition, either by other groups or on analogous systems. Unfortunately, these hardly improve our confidence in the interpretation of the results.

Clearly, repeat experiments and studies on related systems have a role in ensuring validity of results but there are also structural and social reasons for such studies. If an experiment is cheap, quick and already within the laboratory's range, it is quite easy to perform a series of studies around a theme. Moreover, results that accord with earlier data are theoretically uncontroversial and, if the field already understands a technique, other workers are less likely to obstruct publication by raising queries about the validity of the observations. Thus, a large body of publication can quickly accumulate that hinges round one basic experiment.

For purposes of interpretation it is important to realise that, for all its size, that body of papers only amounts to one experiment. Failing to recognise this is to act like the man Wittgenstein mentions in Philosophical Investigations, who purchases several copies of the morning paper to reassure himself that what he reads there is true. Committing this fallacy is both a common individual fault and also structurally embedded in modern scientific administration. Of course, scientists do not buy many copies of their morning paper, but they do publish many copies of the same, or very similar, experiment; then they point to the "mountain of evidence" supporting their ideas.

Experiments report reality much as newspapers report news. The hypothesis used to explain their outcome is the impression of reality they give. Like a newspaper article, the scientific observations may be clear and accurate, or misleading and inaccurate. Because observations may be inaccurate, they need to be reported in a way that enables other workers to replicate them. Because the observation may be misleading, even though accurate, the generated hypothesis should be confirmed by data which is as unrelated to the original observations as possible. Reverting to Wittgenstein's analogy, his man would have been well advised to read another newspaper, one which employed a different reporter who, himself, employed different sources for the news he reported.

This point has been made by many philosophers of science; for example, in the nineteenth century, Whewell adopted it as a criterion of induction, referring to it as the consilience of inductions. Although we no longer think there is a logic of induction, his point remains valid as a means of increasing our confidence in a theory. On the same lines, Popper asserted that a hypothesis supported by data of two or more distinctly different types should be preferred to an alternative able to explain only a narrow domain of data.

In summary, repetition offers confidence that the published data are accurate but those scientists who believe that repetition of data can support ideas are buying too many copies of the morning paper. No matter how many times an experiment, or one of its close siblings, is repeated - whether one hundred times or in one thousand papers - repetition adds no assurance that any particular interpretation of that result into a hypothesis is valid. If another idea will explain the data from one such experiment, then it will equally apply to any number of repetitions. Assurance of interpretation can come only by comparing the success of competing hypotheses in interpreting data from disparate areas. The more dissimilar are the sources of data used the better, providing only that they do fall within the range of application of the hypotheses in question. Modern scientific administrations fail to recognise this fallacy, a failure linked closely to the procedures they use for quality assessment.
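In Bayesian terms, the point is that an observation which two hypotheses predict equally well has a likelihood ratio of one and so leaves their relative probabilities untouched, however often it is repeated. The sketch below is a hedged illustration for a two-hypothesis set; the function names and all numerical values are assumptions.

```python
def update(prior_h1, p_data_given_h1, p_data_given_h2):
    """Posterior probability of H1 when H1 and H2 are the only candidates."""
    numerator = prior_h1 * p_data_given_h1
    denominator = numerator + (1 - prior_h1) * p_data_given_h2
    return numerator / denominator

def repeat_experiment(prior_h1, p_data_given_h1, p_data_given_h2, times):
    """Feed the same kind of result into the update `times` times."""
    belief = prior_h1
    for _ in range(times):
        belief = update(belief, p_data_given_h1, p_data_given_h2)
    return belief

# Both hypotheses explain the favourite experiment equally well (assumed 0.9).
after_mountain = repeat_experiment(0.5, 0.9, 0.9, times=1000)
print(round(after_mountain, 6))

# One experiment from a disparate area, which H2 explains poorly (assumed 0.1).
after_disparate = update(0.5, 0.9, 0.1)
print(round(after_disparate, 6))
```

A thousand repetitions leave the probability of H1 where it started, at 0.5, while a single discriminating observation from another area moves it to 0.9 under these assumed figures.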

2.27 Quality Assessment - Peer Review and Citation Analysis

Science managers and gatekeepers base many policies and decisions on quality assessments. Consequently, how quality is defined, maintained and assessed is a pivotal issue for modern science - it is also one of the few areas in which scientific practice overlaps with scientific philosophy. In principle, assessment of quality in research programmes should include a rational assignment of the antecedent probability of the underlying ideas. In practice, however, the methods adopted simply abandon rationality and one of them jumps head first into Wittgenstein's fallacy, buying as many copies of the morning paper as leaders in a field might find convenient. Assessments are made at several levels, for example of :-

  • Research projects before they are funded.

  • The value of work before it is published in the scientific literature.

  • The worth of researchers before they are appointed to posts.

These prospective evaluations are usually made by peer review. Referees, anonymous experts in the field, are selected by scientific authorities. Each referee then writes a report, which is taken to be an objective evaluation of the work in question, but that report is unlikely to make any attempt at explanation and it may not be seen by the scientist concerned, who will have little or no opportunity to reply if he does see it. Besides these initial screening steps, post hoc assessments are also made of :-

  • The "success" of published articles in terms of their scientific impact when set against competing articles.

  • The "success" of published scientists in terms of their scientific impact when set against competing workers.

  • The "status" of institutions and journals.

Sometimes such assessments are made by committees of experts but one of the most important tools used for the appraisal is citation analysis, a tool developed over the past twenty to thirty years.

A scientific paper does not stand alone; it builds on what has gone before, using other workers' ideas, techniques and results. To place the work in context, the scientific article ends with a list of relevant publications showing where the ideas it used came from, just as "A Habit of Lies" includes a list of references. These are citations and they interested an American named Eugene Garfield. His Institute for Scientific Information (ISI) notes every scientific paper published and, from their citation lists, constructs a computer database called the Science Citation Index (SCI). Scientists can use the SCI to find all papers citing any earlier article. It has proved to be a very valuable research tool, enabling workers to research a topic forward through the literature, whereas traditional abstracting media permitted only a backwards search.

The SCI is also used in quality assessments. Using it, one can easily determine how often, or whether, a paper is cited by subsequent publications, a process called citation analysis. The argument is that rarely-cited studies cannot have been very important. In making this count, the ISI itself carefully avoids the term "quality", preferring to call the resulting measure the "impact" of a paper, but scientific institutions do take this impact as a measure of quality. Journals and institutions can also be ranked according to the impact of articles published during a given period. Journals even tout their impact rating when advertising to libraries for sales or soliciting the scientific community for new papers.

This way of assessing quality means that the citation practices of authors influence the assessment of the work done by their contemporaries and colleagues. If a scientific theory is not mentioned by establishment figures, and the articles which propose it are not cited by them, the theory is automatically assessed as of low quality, even if no reason for disregarding it has been given. By contrast, if scientists go to great lengths to rebut an incorrect theory, that theory will be assessed as being of high quality, even if most observers regarded the theory as absurd from the outset.
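The mechanics of a citation count, and the two distortions just described, can be shown with a toy example. This is a sketch of the general idea only, not of the ISI's actual procedures or software; the paper names and reference lists are entirely hypothetical.

```python
from collections import defaultdict

def build_citation_index(reference_lists):
    """Invert reference lists into a forward-searchable index.

    reference_lists -- dict: citing paper -> list of papers it cites
    Returns a dict: cited paper -> sorted list of papers that cite it.
    """
    index = defaultdict(set)
    for citing, references in reference_lists.items():
        for cited in references:
            index[cited].add(citing)
    return {cited: sorted(citing) for cited, citing in index.items()}

def citation_count(index, paper):
    """Number of later papers citing `paper` (zero if never cited)."""
    return len(index.get(paper, []))

# Hypothetical literature: three papers rebut an absurd theory,
# while nobody mentions the ignored alternative at all.
reference_lists = {
    "rebuttal_1": ["absurd_theory", "establishment_view"],
    "rebuttal_2": ["absurd_theory", "establishment_view"],
    "rebuttal_3": ["absurd_theory"],
    "ignored_alternative": ["establishment_view"],
}

index = build_citation_index(reference_lists)
print(index["absurd_theory"])                        # forward search
print(citation_count(index, "absurd_theory"))        # counted as high impact
print(citation_count(index, "ignored_alternative"))  # counted as no impact
```

In this invented literature the rebutted theory collects three citations and the ignored one collects none, although no reason for disregarding the latter appears anywhere in the record.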

Whatever its value as a management tool, quality assessment by citation analysis is clearly prone to Wittgenstein's fallacy. Moreover, its practical implications for the assessment of theories are clear. Under that process, ignoring, or not citing, a theory is the same as rejecting it. For their part, scientists are well aware of the quality assessment procedures used and the implications of their actions. When a scientist disregards a theory, he knows the result this will have for its assessment and presumably intends that outcome. In short, a scientist who chooses to ignore a theory is broadcasting a message about that theory - namely that he rejects the theory as of low quality. The message thus broadcast may be implicit but the scientist knows it is sent, he knows who receives it and he knows how they will interpret and act upon it.

Both citation analysis and peer review are highly questionable as methods of quality assessment and amount to little more than statements of establishment opinion, a point to be detailed later. Chapter 13 reviews some psychological observations, including automatic subservience to authority, while chapter 15 briefly reviews sociology, including Pareto's law. Both fields are highly relevant to understanding the failure of modern scientific quality assessment.

2.28 Conclusion

Despite its length this chapter has merely scratched the surface of many quite subtle ideas. Even so, its central purpose has been very simple, to show that ignoring a scientific theory is the same as rejecting it. This is a conclusion that can be inferred from rationality and logic - expressed by Popper's hypothetico-deductive method, from mathematics by hypothesis testing procedures, from psychology using Polanyi's description of scientific knowledge and from the management procedures currently used to assess the quality of scientific work. All analyses of science lead to the conclusion that ignoring a theory is the same as rejecting it and, moreover, that conclusion is just plain common sense.

Alternative scientific theories are often rejected during the early stages of a scientific investigation, during the initial sorting of possible models. This sifting through alternatives is an important stage of scientific logic, because an erroneous elimination invalidates knowledge claims just as surely as conflicting experimental data. This is a stage of thought quite as deserving of record as any other and just as amenable to explanation. The reasoning involved in such sorting should no more be private than other scientific reasoning and, while the reasons used may be eclectic, scientists should not use mere hunches or guesswork, neither should they base their position on non-scientific interests or relativism. The best way to ensure this reasoning is valid, is to expose it to rationalist criticism and that can only happen if the reasoning is explained.

In one field, "A Habit of Lies" will describe such a failure, indeed a refusal, to give that reasoning and chapter 3 will describe the field involved. The last word in this chapter will be given to Michael Polanyi (1958). The father of a Nobel prize winner, he trained as a physician, turned to physical chemistry and finally became one of Britain's most distinguished philosophers of science. Exploring the role of doubt in scientific thinking, he observed, in elegant prose, that :-

A scientist must commit himself in respect to any important claim put forward within his field of knowledge. If he ignores the claim he does in fact imply that he believes it to be unfounded. If he takes notice of it, the time and attention which he diverts to its examination and the extent to which he takes account of it in guiding his own investigations are a measure of the likelihood he ascribes to its validity. Only if a claim lies totally outside his range of responsible interests can the scientist assume an attitude of completely impartial doubt towards it. He can be strictly agnostic only on subjects of which he knows little and cares nothing.

Summary

This Chapter has :-

  • Defined a model as a mental assembly of various axioms.
  • Described how the properties of a model can be examined and compared to a real situation.
  • Shown how the hypothetico-deductive logic of Popper requires three stages in a scientific investigation.
    • Generation of models.
    • Initial screening to remove unrealistic ideas.
    • Experimental tests to decide between the remaining concepts.
  • Listed some demarcation criteria that can be used for screening out unrealistic ideas and observed that such screening out is equivalent to saying that a theory is wrong.
  • Introduced the idea of scientific gatekeepers who do the screening and the occasional abuses of their role.
  • Described Bayesian statistics, the mathematical treatment of hypothesis testing.
  • Mentioned quality assessments and their role in science.
  • Stated that scientists who ignore a competing theory in their field of interest thereby reject it.

 

© Copyright John A Hewitt.
For contact information, see copyright page.
Last Modified 21 October 2005