Today we have naming of parts. Yesterday,
We had daily cleaning. And tomorrow morning,
We shall have what to do after firing. But to-day,
Today we have naming of parts. Japonica
Glistens like coral in all of the neighbouring gardens,
And today we have naming of parts.
(Naming of Parts, Henry Reed, 1942)
It is not difficult to establish by casual reading that there are almost as many ways of using crucial terms like calibration and validation in ABM as there are actual instances of their use. This creates several damaging problems for scientific progress in the field. Firstly, when two different researchers both say they “validated” their ABMs they may mean different specific scientific activities. This makes it hard for readers to evaluate research generally, particularly if researchers assume that it is obvious what their terms mean (rather than explaining explicitly what they did in their analysis). Secondly, based on this, each researcher may feel that the other has not really validated their ABM but has instead done something to which a different name should more properly be given. This compounds the possible confusion in debate. Thirdly, there is a danger that researchers may rhetorically favour (perhaps unconsciously) uses that, for example, make their research sound more robustly empirical than it actually is. For example, validation is sometimes used to mean consistency with stylised facts (rather than, say, correspondence with a specific time series according to some formal measure). But we often have no way of telling what the status of the presented stylised facts is. Are they an effective summary of what is known in a field? Are they the facts on which most researchers agree or for which the available data presents the clearest picture? (Less reputably, can readers be confident that they were not selected for presentation because of their correspondence?) Fourthly, because these terms are used differently by different researchers it is possible that valuable scientific activities that “should” have agreed labels will “slip down the terminological cracks” (either for the individual or for the ABM community generally). Apart from clear labels avoiding confusion for others, they may help to avoid confusion for you too!
But apart from these problems (and there may be others but these are not the main thrust of my argument here) there is also a potential impasse. There simply doesn’t seem to be any value in arguing about what the “correct” meaning of validation (for example) should be. Because these are merely labels there is no objective way to resolve this issue. Further, even if we undertook to agree the terminology collectively, each individual would tend to argue for their own interpretation without solid grounds (because there are none to be had) and any collective decision would probably therefore be unenforceable. If we decide to invent arbitrary new terminology from scratch we not only run the risk of adding to the existing confusion of terms (rather than reducing it) but it is also quite likely that everyone will find the new terms unhelpful.
Unfortunately, however, we probably cannot do without labels for these scientific activities involved in quality controlling ABMs. If we had to describe everything we did without any technical shorthand, presenting research might well become impossibly unwieldy.
My proposed solution is therefore to invent terms from scratch (so we don’t end up arguing about our different customary usages to no purpose) but to do so on the basis of actual scientific practices reported in published research. For example, we might call the comparison of corresponding real and simulated data (which at least has the endorsement of the much used Gilbert and Troitzsch 2005 – see pp. 15-19 – to be referred to as validation) CORAS – Comparison Of Real And Simulated. Similarly, assigning values to parameters given the assumptions of model “structures” might be called PANV – Parameters Assigned Numerical Values.
It is very important to be clear what the intention is here. Naming cannot solve scientific problems or disagreements. (Indeed, failure to grasp this may well be why our terminology is currently so muddled as people try to get their different positions through “on the nod”.) For example, if we do not believe that correspondence with stylised facts and comparison measures on time series have equivalent scientific status then we will have to agree distinct labels for them and have the debate about their respective value separately. Perhaps the former could be called COSF – Comparison Of Stylised Facts. But it seems plainly easier to describe specific scientific activities accurately and then find labels for them than to have to wade through the existing marsh of ambiguous terminology and try to extract the associated science. An example of a practice which does not seem to have even one generally agreed label (and therefore seems to be neglected in ABM as a practice) is JAMS – Justifying A Model Structure. (Why are your agents adaptive rather than habitual or rational? Why do they mix randomly rather than in social networks?)
Obviously, there still needs to be community agreement for such a convention to be useful (and this may need to be backed institutionally for example by reviewing requirements). But the logic of the approach avoids several existing problems. Firstly, while the labels are useful shorthand, they are not arbitrary. Each can be traced back to a clearly definable scientific practice. Secondly, this approach steers a course between the Scylla of fruitless arguments from current muddled usage and the Charybdis of a novel set of terminology that is equally unhelpful to everybody. (Even if people cannot agree on labels, they knew how they built and evaluated their ABMs so they can choose – or create – new labels accordingly.) Thirdly, the proposed logic is extendable. As we clarify our thinking, we can use it to label (or improve the labels of) any current set of scientific practices. We will do not have to worry that we will run out of plausible words in everyday usage.
Below I suggest some more scientific practices and possible terms for them. (You will see that I have also tried to make the terms as pronounceable and distinct as possible.)
|Checking the results of an ABM by building another.||CAMWA (Checking A Model With Another).|
|Checking ABM code behaves as intended (for example by debugging procedures, destructive testing using extreme values and so on).||TAMAD (Testing A Model Against Description).|
|Justifying the structure of the environment in which agents act.||JEM (Justifying the Environment of a Model): This is again a process that may pass unnoticed in ABM typically. For example, by assuming that agents only consider ethnic composition, the Schelling Model (Schelling 1969, 1971) does not “allow” locations to be desirable because, for example, they are near good schools. This contradicts what was known empirically well before (see, for example, Rossi 1955) and it isn’t clear whether simply saying that your interest is in an “abstract” model can justify this level of empirical neglect.|
|Finding out what effect parameter values have on ABM behaviour.||EVOPE (Exploring Value Of Parameter Effects).|
|Exploring the sensitivity of an ABM to structural assumptions not justified empirically (see Chattoe-Brown 2021).||ESOSA (Exploring the Sensitivity Of Structural Assumptions).|
Clearly this list is incomplete but I think it would be more effective if characterising the scientific practices in existing ABM and naming them distinctively was a collective enterprise.
This research is funded by the project “Towards Realistic Computational Models Of Social Influence Dynamics” (ES/S015159/1) funded by ESRC via ORA Round 5 (PI: Professor Bruce Edmonds, Centre for Policy Modelling, Manchester Metropolitan University: https://gtr.ukri.org/projects?ref=ES%2FS015159%2F1).
 It is likely that we will have to invent terms for subcategories of practices which differ in their aims or warranted conclusions. For example, rerunning the code of the original author (CAMWOC – Checking A Model With Original Code), building a new ABM from a formal description like ODD (CAMUS – Checking A Model Using Specification) and building a new ABM from the published description (CAMAP – Checking A Model As Published, see Chattoe-Brown et al. 2021).
Chattoe-Brown, Edmund (2021) ‘Why Questions Like “Do Networks Matter?” Matter to Methodology: How Agent-Based Modelling Makes It Possible to Answer Them’, International Journal of Social Research Methodology, 24(4), pp. 429-442. doi:10.1080/13645579.2020.1801602
Chattoe-Brown, Edmund, Gilbert, Nigel, Robertson, Duncan A. and Watts Christopher (2021) ‘Reproduction as a Means of Evaluating Policy Models: A Case Study of a COVID-19 Simulation’, medRXiv, 23 February. doi:10.1101/2021.01.29.21250743
Gilbert, Nigel and Troitzsch, Klaus G. (2005) Simulation for the Social Scientist, second edition (Maidenhead: Open University Press).
Rossi, Peter H. (1955) Why Families Move: A Study in the Social Psychology of Urban Residential Mobility (Glencoe, IL, Free Press).
Schelling, Thomas C. (1969) ‘Models of Segregation’, American Economic Review, 59(2), May, pp. 488-493. (available at https://www.jstor.org/stable/1823701)
Chattoe-Brown, E. (2022) Today We Have Naming Of Parts: A Possible Way Out Of Some Terminological Problems With ABM. Review of Artificial Societies and Social Simulation, 11th January 2022. https://rofasss.org/2022/01/11/naming-of-parts/