
Introduction

Machine-learning systems, including Large Language Models (LLMs), are algorithms trained on large datasets rather than something categorically different. Consequently, they inherit the standard theoretical and practical limitations that apply to all algorithmic methods. Here we look at the computational limits in terms of “Vibe coding” – where the LLM writes some of the code for an Agent-Based Model (ABM) in response to a descriptive prompt. Firstly, we review a couple of fundamental theorems in terms of producing or checking code relative to its descriptive specification. In general, this is shown to be impossible (Edmonds & Bryson 2004). However, this does not rule out the possibility that this could work with simpler classes of code that fall short of full Turing completeness (being able to mimic any conceivable computation). Thus, I recap a result that shows how simple a class of ABMs can be and still be Turing complete (and thus subject to the previous results) (Edmonds 2004). If you do not like discussion of proofs, I suggest you skip to the concluding discussion. More formal versions of the proofs can be found in the Appendix.

Descriptions of Agent-Based Models

When we describe code, the code should be consistent with that description. That is, the code should be one of the possible codes which are right for the description. In this paper we are crucially concerned with the relationship between code and its description and the difficulty of passing from one to the other.

When we describe an ABM, we can do so in a number of different ways. We can do this with different degrees of formality: from chatty natural language to pseudo code to UML diagrams. The more formal the method of description, the less ambiguity there is concerning its meaning. We can also describe what we want at low or high levels. A low-level description specifies the detail of what each bit of code does at each time – an imperative description. A high-level description specifies what the system should do as a whole or what the results should be like. These tend to be declarative descriptions.

Whilst a compiler takes a formal, low-level description of code, an LLM takes a high-level, informal description – in the former case the description is very prescriptive with the same description always producing the same code, but in the case of LLMs there are usually a great many sets of code that are consistent with the input prompt. In other words, the LLM makes a great many decisions for the user, saving time – decisions that a programmer would be forced to confront if using a compiler (Keles 2026).

Here, we are focusing on the latter case, when we use an LLM to do some or all of our ABM programming for us. We use high-level, informal natural language in order to direct the LLM as to what ABM (or aspect of ABM) we would like. Of course, one can be more precise in one’s use of language, but it will tend to remain at a fairly high level (if we are going to do a complete low-level description then we might as well write the code ourselves).

In the formal results below, we restrict ourselves to formal descriptions as this is necessary to do any proofs. However, what is true below for formal descriptions is also true for the wider class of any description as one can always use natural language in a precise manner. For a bit more detail on what we mean here by a formal language, see the Appendix.

The impossibility of a general “specification compiler”

The dream is that one could write a description of what one would like and an algorithm, T, would produce code that fitted that description. However, to enable proofs to be used we need to formalize the situation so that the description is in some suitably expressive but formal language (e.g. a logic with enumerable expressions). This situation is illustrated in Figure 1.

Figure 1. Automatically generating code from a specification.

Obviously, it is easy to write an impossible formal specification – one for which no code exists – so the question is whether there could be such an algorithm, T, that would give us code that fitted a specification when such code does exist. The proof is taken from (Edmonds & Bryson 2004) and given in more formal detail in the Appendix.

The proof uses a version of Turing’s “halting problem” (Turing 1937). This is the problem of checking some code (which takes a number as an input parameter) to see if it would come to a halt (the program finishes) or go on for ever. The question here is whether there is any effective and systematic way of doing this. In other words, whether an “automatic program checker” is possible – a program, H, which takes two inputs: the program number, x, and a possible input, y, and then works out if the computation P_x(y) ever ends. Whilst in some cases spotting this is easy – e.g. trivial infinite loops – other cases are hard (e.g. testing the even numbers to find one that is not the sum of two prime numbers¹).
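The Goldbach example can be made concrete. The following is a minimal sketch (the function names and the `limit` parameter are illustrative assumptions, not part of the original argument). Called without a limit, the search halts if and only if some even number is not the sum of two primes, so deciding whether it halts is as hard as settling the conjecture.

```python
from typing import Optional


def is_prime(k: int) -> bool:
    """Trial-division primality test (slow but simple)."""
    if k < 2:
        return False
    d = 2
    while d * d <= k:
        if k % d == 0:
            return False
        d += 1
    return True


def is_sum_of_two_primes(n: int) -> bool:
    """True if n = p + q for some primes p and q."""
    return any(is_prime(p) and is_prime(n - p) for p in range(2, n // 2 + 1))


def goldbach_search(limit: Optional[int] = None) -> Optional[int]:
    """Return the first even number >= 4 that is not a sum of two primes.

    With limit=None the loop only ends if a counterexample exists.
    The limit parameter is an assumption added so the sketch can be
    run safely; it is not part of the halting argument itself.
    """
    n = 4
    while limit is None or n <= limit:
        if not is_sum_of_two_primes(n):
            return n
        n += 2
    return None
```

Calling `goldbach_search(limit=1000)` simply reports that no counterexample exists up to 1000; it says nothing about whether the unlimited search would ever stop.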

For our purposes, let us consider a series of easier problems – what I call “limited halting problems”. Each is the problem of checking whether programs, x, applied to inputs, y, ever come to an end, but only for x, y ≤ n, where n is a fixed upper limit. Imagine a big n × n table with the columns being the program numbers and the rows being the inputs. Each element is 0 if the combination never stops and 1 if it does. A series of simpler checking programs, H_n, would just look up the answers in the table as long as they had been filled in correctly. We know that these programs exist, since programs that implement simple lookup tables always exist and one of the possible n × n tables will be the right one for H_n. For each limited halting problem, we can write a formal specification, giving us a series of specifications (one for each n).

Now imagine that we had a general-purpose specification compiling program, T, as described above and illustrated in Figure 1. Then, we could do the following:

a. given any computation P_x(y), work out n = max(x,y);
b. construct the specification for the limited halting problem with index n;
c. use T to construct some code for H_n; and
d. use that code to see if P_x(y) ever halts.

Taken together, these steps (a-d) can be written as a new piece of computer code that would solve the general halting problem. However, we know this is impossible (Turing 1937); therefore a general compiling program like T above cannot exist.
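The shape of this reduction can be sketched in code. The sketch below is illustrative only: a specification SH_n is represented simply by its index n, and `toy_T` and `toy_halts` are invented stand-ins. The stub compiler can only be written here because the toy rule ("halts exactly when x + y is even") is decidable by construction; for real programs the argument above shows that no such T can exist.

```python
from typing import Callable, Dict, Tuple

Spec = int                             # SH_n, represented by its index n
Checker = Callable[[int, int], bool]   # H_n: does P_x(y) halt, for x, y <= n?
SpecCompiler = Callable[[Spec], Checker]


def halts_via_spec_compiler(x: int, y: int, T: SpecCompiler) -> bool:
    """Steps (a)-(d): decide whether P_x(y) halts, given a compiler T."""
    n = max(x, y)      # (a) index of the limited problem covering (x, y)
    spec = n           # (b) the specification SH_n
    h_n = T(spec)      # (c) T produces code for H_n
    return h_n(x, y)   # (d) run that code on the original question


def toy_halts(x: int, y: int) -> bool:
    """Invented rule standing in for real halting behaviour."""
    return (x + y) % 2 == 0


def toy_T(n: Spec) -> Checker:
    """Stub compiler: fills in the n x n lookup table for the toy rule."""
    table: Dict[Tuple[int, int], bool] = {
        (x, y): toy_halts(x, y)
        for x in range(1, n + 1)
        for y in range(1, n + 1)
    }
    return lambda x, y: table[(x, y)]
```

The point of the sketch is that `halts_via_spec_compiler` itself is trivial; all of the difficulty is hidden inside T.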

The impossibility of a general “code checker”

The checking problem is apparently less ambitious than the programming problem – here we are given a program and a specification and have ‘only’ to check whether they correspond. That is, whether the code satisfies the specification.

Figure 2. Algorithmically checking if some code satisfies a descriptive specification.

Again, the answer is in the negative. The proof of this is similar. If there were a checking program C that, given a program and a formal specification, would tell us whether the program met the specification, we could again solve the general halting problem. We would be able to do this as follows:

work out the maximum of x and y (call this m);
construct a sequence of programs implementing all possible finite lookup tables of type: m×m→{0,1};
test these programs one at a time using C to find one that satisfies SH_m (we know there is at least one);
use this program to compute whether P_x(y) halts.

Thus, there is no general specification checking program, like C.
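The search over lookup tables can also be sketched. As before this is only an illustration under stated assumptions: `toy_C` is an invented checker that works because it tests against a decidable toy rule ("halts exactly when x + y is even"), whereas the argument above shows that a checker for the real specifications SH_m cannot exist.

```python
from itertools import product
from typing import Callable, Dict, Iterator, Tuple

Table = Dict[Tuple[int, int], int]           # (x, y) -> 0 or 1
SpecChecker = Callable[[Table, int], bool]   # C(program, index m of SH_m)


def all_tables(m: int) -> Iterator[Table]:
    """Enumerate every lookup table of type m x m -> {0, 1}."""
    cells = [(x, y) for x in range(1, m + 1) for y in range(1, m + 1)]
    for bits in product((0, 1), repeat=len(cells)):
        yield dict(zip(cells, bits))


def halts_via_checker(x: int, y: int, C: SpecChecker) -> bool:
    """Decide whether P_x(y) halts by searching with a checker C."""
    m = max(x, y)
    for table in all_tables(m):        # finitely many candidates: 2**(m*m)
        if C(table, m):                # C picks out a table satisfying SH_m
            return table[(x, y)] == 1
    raise RuntimeError("no table accepted; C is not a correct checker")


def toy_C(table: Table, m: int) -> bool:
    """Invented checker: accepts the table matching the toy rule."""
    return all(bit == int((x + y) % 2 == 0) for (x, y), bit in table.items())
```

The enumeration is finite for every m, so the search always terminates; the impossibility lies entirely in C.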

Thus, we can see that there are some, perfectly well-formed specifications, where we know code exists that would comply with the specification but where there is no such algorithm, however clever, that will always take us from a specification to the code. Since trained neural nets are a kind of clever algorithm, they cannot do this either.

What about simple Agent-Based Models?

To illustrate how simple such systems can be, I defined a particular class of particularly simple multi-agent system, called “GASP” systems (Giving Agent System with Plans). These are defined as follows. There are n agents, labelled: 1, 2, 3, etc., each of which has an integer store which can change and a finite number of simple plans (which do not change). Each time interval the store of each agent is incremented by one. Each plan is composed of: a (possibly empty) sequence of ‘give instructions’ and finishes with a single ‘test instruction’. Each ‘give instruction’, G_a, has the effect of giving 1 unit to agent a (if the store is non-zero). The ‘test instruction’ is of the form JZ_a,p,q, which has the effect of jumping (i.e. designating the plan that will be executed next time period) to plan p if the store of agent a is zero and plan q otherwise. This class is described more in (Edmonds 2004). This is illustrated in Figure 3.

Figure 3. An Illustration of a “GASP” multi-agent system.
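To make the definition concrete, here is a minimal simulator sketch for one time interval of a GASP system. The ordering within an interval (all stores incremented, then all gives in agent order, then all tests) is an assumption of this sketch, as are the names `Plan`, `gasp_step` and `demo`; the definition above does not fix these details.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Plan:
    gives: Tuple[int, ...]      # targets of the give instructions G_a, in order
    test: Tuple[int, int, int]  # the test instruction JZ_a,p,q as (a, p, q)


def gasp_step(stores: Dict[int, int], current: Dict[int, int],
              plans: Dict[int, Dict[int, Plan]]) -> None:
    """Advance the system by one time interval, updating it in place."""
    # 1. Every agent's store is incremented by one.
    for agent in stores:
        stores[agent] += 1
    # 2. Each agent carries out the gives of its current plan (assumed order).
    for agent in sorted(stores):
        for target in plans[agent][current[agent]].gives:
            if stores[agent] > 0:    # a give needs a non-zero store
                stores[agent] -= 1
                stores[target] += 1
    # 3. Each agent's test selects its plan for the next interval.
    next_plan = {}
    for agent in stores:
        a, p, q = plans[agent][current[agent]].test
        next_plan[agent] = p if stores[a] == 0 else q
    current.update(next_plan)


def demo() -> Tuple[Dict[int, int], Dict[int, int]]:
    """Agent 1 drains its store into agent 2, then switches to an idle plan."""
    plans = {
        1: {1: Plan(gives=(2, 2), test=(1, 2, 1)),   # give twice; if empty go to plan 2
            2: Plan(gives=(), test=(1, 2, 2))},      # idle
        2: {1: Plan(gives=(), test=(1, 1, 1))},      # agent 2 never gives
    }
    stores = {1: 3, 2: 0}
    current = {1: 1, 2: 1}
    for _ in range(4):
        gasp_step(stores, current, plans)
    return stores, current
```

In `demo`, agent 1 loses one unit net per interval while on plan 1, reaches zero after three intervals, and then switches to its idle plan.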

Thus ‘all’ that happens in this class of GASP systems is the giving of tokens with value 1 to other agents and the testing of other agents’ store to see if they are zero to determine the next plan. There is no fancy communication, learning or reasoning done by agents. Agents have fixed and very simple plans and only one variable. However, this class of agent can be shown to be Turing Complete. The proof is taken from (Edmonds & Bryson 2004).

The detail of this proof is somewhat tedious, but basically involves showing that any computation (any Turing Machine) can be mapped to a GASP machine using a suitable effective and systematic mapping. This is done in three stages. That is, for any particular Turing Machine:

Create an equivalent “Unlimited Register Machine” (URM), with an indefinitely large (but finite) number of integer variables and four basic kinds of instruction (add one to a variable, set a variable to 0, copy the number in a variable to another, jump to a set instruction if two specified variables are equal). This is known to be possible (Cutland 1980, page 57).
Create an equivalent “AURA” machine for this URM machine (Moss & Edmonds 1994).
Create an equivalent “GASP” ABM for this AURA system.

This is unsurprising – many systems that allow for an indefinite storage, basic arithmetic operations and a kind of “IF” statement are Turing Complete (see any textbook on computability, e.g. Cutland 1980).

This example of a class of GASP agents shows just how simple an ABM can be and still be Turing Complete, and subject to the impossibility of a general compiler (like T above) or checker (like C above), however clever these might be.

Discussion

What the above results show is that:

There is no algorithm that will take any formal specification and give you code that satisfies it. This includes trained LLMs.
There is no algorithm that will take any formal specification and some code and then check whether the code satisfies that specification. This includes trained LLMs.
Even apparently simple classes of agent-based model are capable of doing any computation and so there will be examples where the above two negative results hold.

These general results do not mean that there are not special cases where programs like T or C are possible (e.g. compilers). However, as we see from the example ABMs above, a class of models does not need much in the way of abilities to make this impossible for high-level descriptions. Using informal, rather than formal, language does not escape these results, but merely adds more complication (such as vagueness).

In conclusion, this means that there will be kinds of ABMs for which no algorithm can turn descriptions into the correct, working code². This does not mean that LLMs can’t be very good at producing working code from the prompts given to them. They might (in some cases) be better than the best humans at this but they can never be perfect. There will always be specifications where they either cannot produce the code or produce the wrong code.

The underlying problem is that coding is very hard in general. There are no practical, universal methods that always work – even when it is known to be possible. Suitably-trained LLMs, human ingenuity and various methodologies can help, but none will be a panacea.

Notes

1. Which would disprove “Goldbach’s conjecture”, whose status is still unknown despite centuries of mathematical effort. If there is such a number it is known to be more than 4×10¹⁷.

2. Of course, if humans are limited to effective procedures – ones that could be formally written down as a program (which seems likely) – then humans are similarly limited.

Acknowledgements

Many thanks for the very helpful comments on earlier drafts of this by Peer-Olaf Siebers, Luis Izquierdo and other members of the LLM4ABM SIG. Also, the participants of AAMAS 2004 for their support and discussion on the formal results when they were originally presented.

Appendix

Formal descriptions

The above proofs rely on the fact that the descriptions are “recursively enumerable”, as in the construction of Gödel (1931). That is, one can index the descriptions (1, 2, 3…) in such a way that one can reconstruct the description from the index. Most formal languages, including those that compilers take, computer code and formal logic expressions, are recursively enumerable since they can be constructed from an enumerable set of atoms (e.g. variable names) using a finite number of formal composition rules (e.g. if A and B are allowed expressions, then so are A → B, A & B etc.). Any language that can be specified using syntax diagrams (e.g. using Backus–Naur form) will be recursively enumerable in this sense.
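As an illustration of what indexing descriptions means, the sketch below numbers every string over a small alphabet (1, 2, 3…) so that the string can be recovered from its index and vice versa. This is a simple shortest-first numbering chosen for clarity, not Gödel's own prime-power coding, and the alphabet is an arbitrary assumption. It enumerates all strings, well-formed or not; an enumeration of the valid expressions of a formal language would additionally filter these with a syntax check.

```python
ALPHABET = "AB&>()"  # illustrative symbols only; any finite alphabet works


def index_to_string(i: int, alphabet: str = ALPHABET) -> str:
    """Return the i-th string (i >= 1), shortest strings first."""
    k = len(alphabet)
    chars = []
    while i > 0:
        i, r = divmod(i - 1, k)   # bijective base-k digit extraction
        chars.append(alphabet[r])
    return "".join(reversed(chars))


def string_to_index(s: str, alphabet: str = ALPHABET) -> int:
    """Inverse of index_to_string: recover the index from the string."""
    k = len(alphabet)
    i = 0
    for ch in s:
        i = i * k + alphabet.index(ch) + 1
    return i
```

Over the two-letter alphabet "ab", indices 1 to 6 give a, b, aa, ab, ba, bb.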

Producing code from a specification

The ‘halting problem’ is an undecidable problem (Turing 1937); that is, it is a question for which there does not exist a program that will answer it (say, outputting 1 for yes and 0 for no). This is the problem of whether a given program will eventually come to a halt with a given input. In other words, whether P_x(y), program number x applied to input y, ever finishes with a result or whether it goes on for ever. Turing proved that there is no such program (Turing 1937).

Define a series of problems, LH₁, LH₂, etc., which we call ‘limited halting problems’. LH_n is the problem of ‘whether a program with number ≤ n and an input ≤ n will ever halt’. The crucial fact is that each of these is computable, since each can be implemented as a finite lookup table. Call the programs that implement these lookup tables: PH₁, PH₂, etc. respectively. Now if the specification language can specify each such program, one can form a corresponding enumeration of formal specifications: SH₁, SH₂, etc.

The question now is whether there is any way of computationally finding PH_n from the specification SH_n. But if there were such a way we could solve Turing’s general halting problem in the following manner: first find the maximum of x and y (call this m); then compute PH_m from SH_m; and finally use PH_m to compute whether P_x(y) halts. Since we know the general halting problem is not computable, we also know that there is no effective way of discovering PH_n from SH_n even though for each SH_n we know an appropriate PH_n exists!

Thus, the only question left is whether the specification language is sufficiently expressive to enable SH₁, SH₂, etc. to be formulated. Unfortunately, the construction in Gödel’s famous incompleteness proof (Gödel 1931) guarantees that any formal language that can express even basic arithmetic properties will be able to formulate such specifications.

Checking code meets a specification

To demonstrate this, we can reuse the limited halting problems defined in the last subsection. The counter-example is whether one can computationally check (using C) that a given program P meets the specification SH_n. In this case we will limit ourselves to programs, P, that implement n×n finite lookup tables with entries: {0,1}.

Now we can see that if there were a checking program C that, given a program and a formal specification, would tell us whether the program met the specification, we could again solve the general halting problem. We would be able to do this as follows: first find the maximum of x and y (call this m); then construct a sequence of programs implementing all possible finite lookup tables of type: m×m→{0,1}; then test these programs one at a time using C to find one that satisfies SH_m (we know there is at least one: PH_m); and finally use this program to compute whether P_x(y) halts. Thus, there is no such program, C.

Showing GASP ABMs are Turing Complete

The class of Turing machines is computationally equivalent to that of unlimited register machines (URMs) (Cutland 1980, page 57). That is, the class of programs with 4 types of instructions which refer to registers, R₁, R₂, etc., which hold non-negative integers. The instruction types are: S_n, increment register R_n by one; Z_n, set register R_n to 0; C_n,m, copy the number from R_n to R_m (erasing the previous value); and J_n,m,q, if R_n=R_m jump to instruction number q. This is equivalent to the class of AURA programs which just have two types of instruction: S_n, increment register R_n by one; and DJZ_n,q, decrement R_n if this is non-zero, then if the result is zero jump to instruction step q (Moss & Edmonds 1994). Thus we only need to prove that given any AURA program we can simulate its effect with a suitable GASP system. Given an AURA program of m instructions: i₁, i₂,…, i_m which refers to registers R₁, …, R_n, we construct a GASP system with n+2 agents, each of which has m plans. Agent A_n+1 is basically a dump for discarded tokens and agent A_n+2 remains zero (it has the single plan: (G_n+1, J_a+1,1,1)). Plan s (s ∈ {1,…,m}) in agent number a (a ∈ {1,…,n}) is determined as follows: there are four cases depending on the nature of instruction number s:

1. i_s is S_a: plan s is (J_a,s+1,s+1);

2. i_s is S_b where b ≠ a: plan s is (G_n+1, J_a,s+1,s+1);

3. i_s is DJZ_a,q: plan s is (G_n+1, G_n+1, J_a,q,s+1);

4. i_s is DJZ_b,q where b ≠ a: plan s is (G_n+1, J_b,q,s+1).

Thus, each plan s in each agent mimics the effect of instruction s in the AURA program with respect to the particular register that the agent corresponds to.
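The two AURA instruction types are simple enough to interpret in a few lines, which may help in following the case analysis above. The sketch below rests on some assumptions that the text does not spell out: instructions are numbered from 1, the program halts when control leaves the instruction list, and DJZ on a register that is already zero leaves it at zero and jumps. The names `run_aura` and `max_steps` are illustrative.

```python
from typing import Dict, List, Tuple, Union

# ("S", n) increments R_n; ("DJZ", n, q) decrements R_n if non-zero,
# then jumps to instruction q if R_n is zero.
Instr = Union[Tuple[str, int], Tuple[str, int, int]]


def run_aura(program: List[Instr], registers: Dict[int, int],
             max_steps: int = 10_000) -> Dict[int, int]:
    """Run an AURA program and return the final register contents."""
    regs = dict(registers)
    pc = 1                                   # instructions are numbered from 1
    steps = 0
    while 1 <= pc <= len(program):           # assumed halting convention
        if steps >= max_steps:               # guard added for this sketch only
            raise RuntimeError("step budget exhausted")
        steps += 1
        instr = program[pc - 1]
        if instr[0] == "S":
            n = instr[1]
            regs[n] = regs.get(n, 0) + 1
            pc += 1
        elif instr[0] == "DJZ":
            n, q = instr[1], instr[2]
            if regs.get(n, 0) > 0:
                regs[n] -= 1
            pc = q if regs.get(n, 0) == 0 else pc + 1
        else:
            raise ValueError(f"unknown instruction: {instr!r}")
    return regs


# Example: add R1 into R2 (assumes R1 >= 1); R3 stays zero, so DJZ on R3
# acts as an unconditional jump.
ADD_R1_TO_R2: List[Instr] = [("DJZ", 1, 4), ("S", 2), ("DJZ", 3, 1), ("S", 2)]
```

The always-zero register R3 in the example plays the same role as agent A_n+2 in the GASP construction.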

References

Cutland, N. (1980) Computability: An Introduction to Recursive Function Theory. Oxford University Press.

Edmonds, B. (2004) Using the Experimental Method to Produce Reliable Self-Organised Systems. In Brueckner, S. et al. (eds.) Engineering Self Organising Systems: Methodologies and Applications, Springer, Lecture Notes in Artificial Intelligence, 3464:84-99. http://cfpm.org/cpmrep131.html

Edmonds, B. & Bryson, J. (2004) The Insufficiency of Formal Design Methods – the necessity of an experimental approach for the understanding and control of complex MAS. In Jennings, N. R. et al. (eds.) Proceedings of the 3rd International Joint Conference on Autonomous Agents & Multi Agent Systems (AAMAS’04), July 19-23, 2004, New York. ACM Press, 938-945. http://cfpm.org/cpmrep128.html

Gödel, K. (1931), Über formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme, I, Monatshefte für Mathematik und Physik, 38(1):173–198. http://doi.org/10.1007/BF01700692

Keles, A. (2026) LLMs could be, but shouldn’t be compilers. Online https://alperenkeles.com/posts/llms-could-be-but-shouldnt-be-compilers/ (viewed 11 Feb 2026)

Moss, S. and Edmonds, B. (1994) Economic Methodology and Computability: Some Implications for Economic Modelling, IFAC Conf. on Computational Economics, Amsterdam, 1994. http://cfpm.org/cpmrep01.html

Turing, A.M. (1937), On Computable Numbers, with an Application to the Entscheidungsproblem. Proceedings of the London Mathematical Society, s2-42: 230-265. https://doi.org/10.1112/plms/s2-42.1.230

Edmonds, B. (2026) A Reminder – computability limits on “vibe coding” ABMs (using LLMS to do the programming for us). Review of Artificial Societies and Social Simulation, 12 Feb 2026. https://rofasss.org/2026/02/12/vibe

© The authors under the Creative Commons’ Attribution-NoDerivs (CC BY-ND) Licence (v4.0)


Delusional Generality – how models can give a false impression of their applicability even when they lack any empirical foundation

May 6, 2024

By Bruce Edmonds¹, Dino Carpentras², Nick Roxburgh³, Edmund Chattoe-Brown⁴ and Gary Polhill³

¹ Centre for Policy Modelling, Manchester Metropolitan University
² Computational Social Science, ETH Zurich
³ James Hutton Institute, Aberdeen
⁴ University of Leicester

“Hamlet: Do you see yonder cloud that’s almost in shape of a camel?
Polonius: By the mass, and ‘tis like a camel, indeed.
Hamlet: Methinks it is like a weasel.
Polonius: It is backed like a weasel.
Hamlet: Or like a whale?
Polonius: Very like a whale.”

Models and Generality

The essence of a model is that it represents – if it is not a model of something it is not a model at all (Zeigler 1976, Wartofsky 1979). A random bit of code or set of equations is not a model. The point of a model is that one can use the model to infer or understand some aspects about what it represents. However, models can represent a variety of kinds of things in a variety of ways (Edmonds et al. 2019) – they can represent ideas, correspond to data, or represent aspects of other models, and they can represent each of these in either a vague or precise manner. To completely understand a model – its construction, properties and working – one needs to understand how it does this mapping. This piece focuses attention on this mapping, rather than the internal construction of models.

What a model reliably represents may be a single observed situation, but it might satisfactorily represent more than one such situation. The range of situations that the model satisfactorily represents is called the “scope” of the model (what is “satisfactory” depending on the purpose for which the model is being used). The more extensive the scope, the more “general” we say the model is. A model that only represents one case has no generality at all and may be more in the nature of a description.

There is a hunger for general accounts of social phenomena (let us call these ‘theories’). However, this hunger is often frustrated by the sheer complexity and ‘messiness’ involved in such phenomena. If every situation we observe is essentially different, then no such theory is possible. However, we hope that this is not the case for the social world and, indeed, informal observation suggests that there is at least some commonality between situations – in other words, that some kind of reliable generalisation about social phenomena might be achievable, however modest (Merton 1968). This piece looks at two kinds of applicability – analogical applicability and empirical applicability – and critiques those that conflate them. Although the expertise of the authors is in the agent-based modelling of social phenomena, and so we restrict our discussion to this, we strongly suspect that our arguments are true for many kinds of modelling across a range of domains.

In the next sections we contrast two uses for models: as analogies (ways of thinking about observed systems) and those that intend to represent empirical data in a more precise way. There are, of course, other uses of models, such as that of exploring theory, which have nothing to do with anything observed.

Models used as analogies

Analogical applicability comes from the flexibility of the human mind in interpreting accounts in terms of different situations. When we encounter a new situation, the account is mapped onto it – the account being used as an analogy for understanding this situation. Such accounts are typically in the form of a narrative, but a model can also be used as an analogy (which is the case we are concerned with here). The flexibility with which this mapping can be constructed means that such an account can be related to a wide range of phenomena. Such analogical mapping can lead to an impression that the account has a wide range of applicability. Analogies are a powerful tool for thinking since they may give us some insights into otherwise novel situations. There are arguments that analogical thinking is a fundamental aspect of human thought (Hofstadter 1995) and language (Lakoff 2008). We can construct and use analogical mappings so effortlessly that they seem natural to us. The key thing about analogical thinking is that the mapping from the analogy to the situation to which it is applied is re-invented each time – there is no fixed relationship between the analogy and what it might be applied to. We are so good at doing this that we may not be aware of how different the constructed mapping is each time. However, its flexibility comes at a cost, namely that because there is no well-defined relationship with what it applies to, the mapping tends to be more intuitive than precise. An analogy can give insights but analogical reasoning suggests rather than establishes anything reliably and you cannot empirically test it (since analogical mappings can be adjusted to avoid falsification). Such “ways of thinking” might be helpful, but equally might be misleading [Note 1].

Expressing the content of an analogy formally does not change any of this (Edmonds 2018). In fact, formally expressed analogies might give the impression of being applicable, but often are only related to anything observed via ideas – the model relates to some ideas, and the ideas relate to reality (Edmonds 2000). Using models as analogies is a valid use of models but this is not an empirically reliable one (Edmonds et al. 2019). Arnold (2013) makes a powerful argument that many of the more abstract simulation models are of this variety and simply not relatable to empirically observed cases and data at all – although these give the illusion of wide applicability, that applicability is not empirical. In physics the ways of thinking about atomic or subatomic entities have changed over time whilst the mathematically-expressed, empirically-relevant models have not (Hartman 1997). Although Thompson (2022) concentrates on mathematically formulated models, she also distinguishes between well-validated empirical models and those that just encapsulate the expertise/opinion of the modeller. She gives some detailed examples of where the latter kind had disproportionate influence, beyond that of other expertise, just because it was in the form of a model (e.g. the economic impact of climate change).

An example of an analogical model is described in Axelrod (1984) – a formalised tournament where algorithmically-expressed strategies are pitted against each other, playing the iterated prisoner’s dilemma game. It is shown how the ‘tit for tat’ strategy can survive against many other mixes of strategies (static or evolving). In the book, the purpose of the model is to suggest a new way of thinking about the evolution of cooperation. The book claims the idea ‘explains’ many observed phenomena, but this is in an analogical manner – no precise relationship with any observed measurements is described. There is no validation of the model here or in the more academic paper that described these results (Axelrod & Hamilton 1981).

Of course, researchers do not usually call their models “analogies” or “analogical” explicitly but tend to use other phrasings that imply a greater importance. An exception is Epstein (2008) where it is explicitly listed as one of the 15 modelling purposes, other than prediction, that he discusses. Here he says such models are “…more than beautiful testaments to the unifying power of models: they are headlights in dark unexplored territory.” (ibid.) thus suggesting their use in thinking about phenomena where we do not already have reliable empirical models. Anything that helps us think about such phenomena could be useful, but that does not mean they are at all reliable. As Herbert Simon said: “Metaphor and analogy can be helpful, or they can be misleading.” (Simon 1968, p. 467).

Another purpose listed in Epstein (2008) is to “Illuminate core dynamics”. After raising the old chestnut that “All models are wrong”, he goes on to justify them on the grounds that “…they capture qualitative behaviors of overarching interest”. This is fine if the models are, in fact, known to be useful as more than vague analogies [Note 2] – that they do, in some sense, approximate observed phenomena – but this is not the case with novel models that have not been empirically tested. This phrase is more insidious, because it implies that the dynamics that have been illuminated by the model are “core” – some kind of approximation of what is important about the phenomena, allowing for future elaborations to refine the representation. This implies a process where an initially rough idea is iteratively improved. However, this is premature because we do not know if what has been abstracted away in the abstract model was essential to the dynamics of the target phenomena or not without empirical testing – this is just assumed or asserted based on the intuitions of the modeller.

This idea of the “core dynamics” leads to some paradoxical situations – where a set of competing models are all deemed to be core. Indeed, the literature has shown how the same phenomenon can be modelled in many contrasting ways. For instance, political polarisation has been modelled through models with mechanisms for repulsion, bounded confidence, reinforcement, or even just random fluctuations, to name a few (Flache et al., 2017; Banisch & Olbrich 2019; Carpentras et al. 2022). However, it is likely that only a few of them contribute substantially to the political polarisation we observe in the real world, so the others are not really “core dynamics”; until we have more empirical work, we do not know which are core and which are not.

A related problem with analogical models is that, even when relying on parsimony principles [Note 3], it is not possible to decide which model is better. This aspect, combined with the constant production of new models, can make the relevant literature increasingly difficult to navigate as models proliferate without any empirical selection, especially for researchers new to ABM. Furthermore, most analogical models define their object of study in an imprecise manner so that it is hard to evaluate whether they are even intended to capture elements of any particular observed situation. For example, opinion dynamics models rarely define the type of interaction they represent (e.g. in person vs online) or even what an opinion is. This has led to cases where even knowledge of facts has been studied as “opinions” (e.g. Chacoma & Zanette, 2015).

In summary, analogical models can be a useful tool to start thinking about complex phenomena. However, the danger with them is that they give an impression of progress but result in more confusion than clarity, possibly slowing down scientific progress. Once one has some possible insights, one needs to confront these with empirical data to determine which are worth further investigation.

Models that relate directly to empirical data

An empirical model, in contrast, has a well-defined way of mapping to the phenomena it represents. For example, the variables of the gas laws (volume, temperature and pressure) are measured using standard methods developed over a long period of time; one does not invent a new way of doing this each time the laws are applied. In this case, the ways of measuring these properties have developed alongside the mathematical models of the laws so that these work reliably under broad (and well known) conditions and cannot be adjusted at the whim of a modeller. Empirical generality comes when a model applies reliably to many different situations – in the case of the gas laws, to a wide range of materials in gaseous form to a high degree of accuracy.

Empirical models can be used for different purposes, including: prediction, explanation and description (Edmonds et al. 2019). Each of these uses maps the model to empirical data in a different way, reflecting its purpose. With a descriptive model the mapping is one-way from empirical data to the model to justify the different parts. In a predictive model, the initial model setup is determined from known data and the model is then run to get its results. These results are then mapped back to what we might expect as a prediction, which can be later compared to empirically measured values to check the model’s validity. An explanatory model supports a complex explanation of some known outcomes in terms of a set of processes, structures and parameter values. When it is shown that the outcomes of such a model sufficiently match those from the observed data, the model represents a complex chain of causation that would result in that data in terms of the processes, structures and parameter values it comprises. It thus supports an explanation in terms of the model and its input of what was observed. In each of these three cases the mapping from empirical data to the model happens in a different order and maybe in a different direction; however, they all depend upon the mapping being well defined.

Cartwright (1983), studying how physics works, distinguished between explanatory and phenomenological laws – the former explains but does not necessarily relate exactly to empirical data (such as when we fit a line to data using regression), whilst the latter fits the data but does not necessarily explain (like the gas laws). Thus the jobs of theoretical explanation and empirical prediction are done by different models or theories (often calling the explanatory version “theory” and the empirical versions “models”). However, in physics the relationship between the two is, itself, examined so that the “bridging laws” between them are well understood, especially in formal terms. In this case, we attribute reliable empirical meaning to the explanatory theories to the extent that the connection to the data is precise, even though it is done via the intermediary of a “phenomenological” model, because both mappings (explanatory↔phenomenological and phenomenological↔empirical data) are precise and well established. The point is that the total mapping from model or theory to empirical data is not subject to interpretation or post-hoc adjustment to improve its fit.

ABMs are often quite complicated and require many parameters or other initialising input to be specified before they can be run. If some of these are not empirically determinable (even in principle) then these might be guessed at using a process of “calibration”, that is, searching the space of possible initialisations for values for which some measured outcomes of the results match other empirical data. If the model has been separately shown to be empirically reliable then one could do such a calibration to suggest what these input values might have been. Such a process might establish that the model captures a possible explanation of the fitted outcomes (in terms of the model plus those backward-inferred input values), but this is not a very strong relationship, since many models are very flexible and so could fit a wide range of possible outcomes. The reliability of such a suggested explanation, supported by the model, is only relative to (a) the empirical reliability of any theory or other assumptions the model is built upon, (b) how flexibly the model outcomes can be adjusted to fit the target data and (c) how precise the choice of outcome measures and the fit are. Thus, calibration does not provide strong evidence of the empirical adequacy of an ABM and any explanation supported by such a procedure is only relative to the ‘wiggle room’ afforded by free parameters and unknown input data as well as any assumptions used in the making of the model. However, empirical calibration is better than none and may empirically fix the context in which theoretical exploration occurs – showing that the model is, at least, potentially applicable to the case being considered [Note 4].
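
The ‘wiggle room’ point can be illustrated with a minimal sketch. Everything in it is hypothetical: the function toy_model, its two free parameters (growth and damping) and the target values are invented for illustration and are not taken from any model discussed here. The sketch calibrates the same toy model, by a simple grid search, to three quite different target outcomes.

```python
import itertools

def toy_model(growth, damping, steps=20):
    """Hypothetical toy model: returns one outcome measure after `steps` updates."""
    x = 0.1  # arbitrary initial state
    for _ in range(steps):
        # logistic-style growth minus a linear loss term
        x = x + growth * x * (1.0 - x) - damping * x
    return x

def calibrate(target, grid):
    """Grid search: return (error, growth, damping) minimising |outcome - target|."""
    best = None
    for growth, damping in itertools.product(grid, grid):
        err = abs(toy_model(growth, damping) - target)
        if best is None or err < best[0]:
            best = (err, growth, damping)
    return best

grid = [i / 50 for i in range(51)]  # candidate parameter values 0.0, 0.02, ..., 1.0
targets = [0.2, 0.5, 0.8]           # three different "empirical" outcomes (invented)
fits = {t: calibrate(t, grid) for t in targets}

for t, (err, growth, damping) in fits.items():
    print(f"target={t:.1f}  growth={growth:.2f}  damping={damping:.2f}  error={err:.4f}")
```

Because two free parameters are enough for this toy model to match each of the targets closely, a good fit by itself says little about whether the model's mechanism is the right one.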

An example of a model that is strongly grounded in empirical data is the “538” model of the US electoral college for presidential elections (Silver 2012). This is not an ABM but more like a micro-simulation. It aggregates the uncertainty from polling data to make probabilistic predictions about what this means for the outcomes. The structure of the model comes directly from the rules of the electoral college, the inputs are directly derived from the polling data and it makes predictions about the results that can be independently checked. It does a very specific, but useful job, in translating the uncertainty of the polling data into the uncertainty about the outcome.
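
The general idea of turning polling uncertainty into outcome uncertainty can be sketched as a small Monte Carlo simulation. This is not the actual “538” method: the states, electoral votes, polled margins and standard deviations below are invented, and the polling errors are assumed to be independent and normally distributed, which is a strong simplification.

```python
import random

# Hypothetical inputs per state:
# (electoral votes, polled margin for candidate A in points, polling std. dev.)
STATES = {
    "State 1": (10, 4.0, 3.0),
    "State 2": (20, -1.0, 4.0),
    "State 3": (15, 0.5, 3.5),
    "State 4": (6, -6.0, 3.0),
}
TOTAL_VOTES = sum(votes for votes, _, _ in STATES.values())

def simulate_once(rng):
    """Draw one election: a state's margin is its polled margin plus a normal error."""
    votes_a = 0
    for votes, margin, sd in STATES.values():
        if rng.gauss(margin, sd) > 0:  # winner takes all of the state's votes
            votes_a += votes
    return votes_a

def win_probability(n_runs=10_000, seed=0):
    """Estimate the probability that candidate A wins a majority of electoral votes."""
    rng = random.Random(seed)
    wins = sum(simulate_once(rng) > TOTAL_VOTES / 2 for _ in range(n_runs))
    return wins / n_runs

p = win_probability()
print(f"Estimated probability that candidate A wins: {p:.2f}")
```

The structure (winner-takes-all per state, majority of votes wins) is fixed by the rules being modelled, and the only inputs are the polling figures, so the resulting probability can be checked against what later happens.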

Why this matters

If people did not confuse the analogical and empirical cases, there would not be a problem. However, researchers seem to suffer from a variety of “Kuhnian Spectacles” (Kuhn 1962) – namely that because they view their target systems through an analogical model, they tend to think that this is how that system actually is – i.e. that the model has not just analogical but also empirical applicability. This is understandable: we use many layers of analogy to navigate our world and in many everyday cases it is practical to conflate our models with the reality we deal with (when they are very reliable). However, people who claim to be scientists are under an obligation to be more cautious and precise than this, since others might wish to rely upon our theories and models (this is, after all, why they support us in our privileged position). Yet such caution is not always exercised. There are cases where modellers declare their enterprise a success even after a long period without any empirical backing, making a variety of excuses instead of coming clean about this lack (Arnold 2015).

Another fundamental aspect is that agent-based models can be very interdisciplinary and, because of that, they can also be used by researchers in different fields. However, many fields do not consider models to be simple analogies, especially when they provide precise mathematical relationships among variables. This can easily result in confusion, where the analogical applicability of an ABM is interpreted as empirical in another field.

Of course, we may be hopeful that, sometime in the future, our vague or abstract analogical model may be developed into something with proven empirical abilities, but we should not suggest such empirical abilities until these have been established. Furthermore, we should be particularly careful to ensure that non-modellers understand that this possibility is only a hope, and not imply anything more (e.g. that the model is likely to have empirical validity). However, we suspect that in many cases this confusion goes beyond optimistic anticipation and that some modellers conflate analogical with empirical applicability, assuming that their model is basically right just because it seems that way to them. This is what we call “delusional generality” – that a researcher is under the impression that their model has a wide applicability (or potentially wide applicability) due to the attractiveness of the analogy it presents. In other words, unaware of the unconscious process of re-inventing the mapping to each target system, they imagine (without further justification) that it has some reliable empirical (or potentially empirical) generality at its core [Note 5].

Such confusion can have severe real-world consequences if a model with only analogical validity is assumed to also have some empirical reliability. Thompson (2022) discusses how abstract economic models of the cost of future climate change affected the debate about the need for prevention and mitigation, even though they had no empirical validity. However, agent-based modellers have also made the same mistake, with a slew of completely unvalidated models about COVID affecting public debate about policy (Squazzoni et al. 2020).

Conclusion

All of the above discussion raises the question of how we might achieve reliable models with even a moderate level of empirical generality in the social sciences. This is a tricky question of scientific strategy, which we are not going to answer here [Note 6]. However, we question whether the approach of making “heroic” jumps from phenomena to abstract non-empirical models on the sole basis of its plausibility to its authors will be a productive route when the target is complex phenomena, such as socio-cognitive systems (Dignum, Edmonds and Carpentras 2022). Certainly, that route has not yet been empirically demonstrated.

Whatever the best strategy is, there is a lot of theoretical modelling in the field of social simulation that assumes or implies that it is the precursor for empirical applicability and not a lot of critique about the extent of empirical success achieved. The assumption seems to be that abstract theory is the way to make progress understanding social phenomena but, as we argue here, this is largely wishful thinking – the hope that such models will turn out to have empirical generality being a delusion. Furthermore, this approach has substantive deleterious effects in terms of encouraging an explosion of analogical models without any process of selection (Edmonds 2010). It seems that the ‘famine’ of theory about social phenomena with any significant level of generality is so severe, that many seem to give credence to models they might otherwise reject – constructing their understanding using models built on sand.

Notes

1. There is some debate about the extent to which analogical reasoning works, what kind of insights it results in and under what circumstances (Hofstadter 1995). However, all we need for our purposes is that: (a) it does not reliably produce knowledge, (b) the human mind is exceptionally good at ‘fitting’ analogies to new situations (adjusting the mapping to make it ‘work’ somehow) and (c) due to this ability analogies can be far more convincing than the analogical reasoning warrants.

2. In pattern-oriented modelling (Grimm et al. 2005) models are related to empirical evidence in a qualitative (pattern-based) manner, for example to some properties of a distribution of numeric outcomes. In this kind of modelling, a precise numerical correspondence is replaced by a set of qualitative correspondences in many different dimensions. Here the empirical relevance of a model is established on the basis that it is too hard to simultaneously fit a model to evidence in this way, thus ruling out fitting as a source of its correspondence with that evidence.

3. So-called “parsimony principles” are a very unreliable manner of evaluating competing theories on grounds other than convenience or that of using limited data to justify the values of parameters (Edmonds 2007).

4. In many models a vague argument for its plausibility is often all that is described to show that it is applicable to the cases being discussed. At least calibration demonstrates its empirical applicability, rather than simply assuming it.

5. We are applying the principle of charity here, assuming that such conflations are innocent and not deliberate. However, there is increasing pressure from funding agencies to demonstrate ‘real life relevance’ so some of these apparent confusions might be more like ‘spin’ – trying to give an impression of empirical relevance even when this is merely an aspiration, in order to suggest that their model has more significance than they have reliably established.

6. This has been discussed elsewhere, e.g. (Moss & Edmonds 2005).

Acknowledgements

Thanks to all those we have discussed these issues with, including Scott Moss (who was talking about these kinds of issue more than 30 years ago), Eckhart Arnold (who made many useful comments and whose careful examination of the lack of empirical success of some families of model demonstrates our mostly abstract arguments), Sven Banisch and other members of the ESSA special interest group on “Strongly Empirical Modelling”.

References

Arnold, E. (2013). Simulation models of the evolution of cooperation as proofs of logical possibilities. How useful are they? Ethics & Politics, XV(2), pp. 101-138. https://philpapers.org/archive/ARNSMO.pdf

Arnold, E. (2015) How Models Fail – A Critical Look at the History of Computer Simulations of the Evolution of Cooperation. In Misselhorn, C. (Ed.): Collective Agency and Cooperation in Natural and Artificial Systems. Explanation, Implementation and Simulation, Philosophical Studies Series, Springer, pp. 261-279. https://eckhartarnold.de/papers/2015_How_Models_Fail

Axelrod, R. (1984) The Evolution of Cooperation, Basic Books.

Axelrod, R. & Hamilton, W.D. (1981) The evolution of cooperation. Science, 211, 1390-1396. https://www.science.org/doi/abs/10.1126/science.7466396

Banisch, S., & Olbrich, E. (2019). Opinion polarization by learning from social feedback. The Journal of Mathematical Sociology, 43(2), 76-103. https://doi.org/10.1080/0022250X.2018.1517761

Carpentras, D., Maher, P. J., O’Reilly, C., & Quayle, M. (2022). Deriving An Opinion Dynamics Model From Experimental Data. Journal of Artificial Societies & Social Simulation, 25(4). http://doi.org/10.18564/jasss.4947

Cartwright, N. (1983) How the Laws of Physics Lie. Oxford University Press.

Chacoma, A. & Zanette, D. H. (2015). Opinion formation by social influence: From experiments to modelling. PloS ONE, 10(10), e0140406. https://doi.org/10.1371/journal.pone.0140406

Dignum, F., Edmonds, B. and Carpentras, D. (2022) Socio-Cognitive Systems – A Position Statement. Review of Artificial Societies and Social Simulation, 2nd Apr 2022. https://rofasss.org/2022/04/02/scs

Edmonds, B. (2000). The Use of Models – making MABS actually work. In S. Moss and P. Davidsson. Multi Agent Based Simulation. Berlin, Springer-Verlag. 1979: 15-32. http://doi.org/10.1007/3-540-44561-7_2

Edmonds, B. (2007) Simplicity is Not Truth-Indicative. In Gershenson, C. et al. (eds.) Philosophy and Complexity. World Scientific, pp. 65-80.

Edmonds, B. (2010) Bootstrapping Knowledge About Social Phenomena Using Simulation Models. Journal of Artificial Societies and Social Simulation, 13(1), 8. http://doi.org/10.18564/jasss.1523

Edmonds, B. (2018) The “formalist fallacy”. Review of Artificial Societies and Social Simulation, 11th June 2018. https://rofasss.org/2018/07/20/be/

Edmonds, B., le Page, C., Bithell, M., Chattoe-Brown, E., Grimm, V., Meyer, R., Montañola-Sales, C., Ormerod, P., Root, H. & Squazzoni, F. (2019) Different Modelling Purposes. Journal of Artificial Societies and Social Simulation, 22(3):6. http://doi.org/10.18564/jasss.3993

Epstein, J. M. (2008). Why Model?. Journal of Artificial Societies and Social Simulation, 11(4),12. https://www.jasss.org/11/4/12.html

Flache, A., Mäs, M., Feliciani, T., Chattoe-Brown, E., Deffuant, G., Huet, S. & Lorenz, J. (2017). Models of social influence: Towards the next frontiers. Journal of Artificial Societies and Social Simulation, 20(4), 2. http://doi.org/10.18564/jasss.3521

Grimm, V., Revilla, E., Berger, U., Jeltsch, F., Mooij, W.M., Railsback, S.F., et al. (2005). Pattern-oriented modeling of agent-based complex systems: lessons from ecology. Science, 310 (5750), 987–991. https://www.jstor.org/stable/3842807

Hartman, S. (1997) Modelling and the Aims of Science. 20th International Wittgenstein Symposium, Kirchberg am Wechsel.

Hofstadter, D. (1995) Fluid Concepts and Creative Analogies. Basic Books.

Kuhn, T. S. (1962). The structure of scientific revolutions. University of Chicago Press.

Lakoff, G. (2008). Women, fire, and dangerous things: What categories reveal about the mind. University of Chicago Press.

Merton, R.K. (1968). On the Sociological Theories of the Middle Range. In Classical Sociological Theory, Calhoun, C., Gerteis, J., Moody, J., Pfaff, S. and Virk, I. (Eds), Blackwell, pp. 449–459.

Meyer, R. & Edmonds, B. (2023). The Importance of Dynamic Networks Within a Model of Politics. In: Squazzoni, F. (eds) Advances in Social Simulation. ESSA 2022. Springer Proceedings in Complexity. Springer. (Earlier, open access, version at: https://cfpm.org/discussionpapers/292)

Moss, S. and Edmonds, B. (2005). Towards Good Social Science. Journal of Artificial Societies and Social Simulation, 8(4), 13. https://www.jasss.org/8/4/13.html

Squazzoni, F. et al. (2020) ‘Computational Models That Matter During a Global Pandemic Outbreak: A Call to Action’ Journal of Artificial Societies and Social Simulation 23(2):10. http://doi.org/10.18564/jasss.4298

Silver, N. (2012) The Signal and the Noise: Why So Many Predictions Fail – But Some Don’t. Penguin.

Simon, H. A. (1962). The architecture of complexity. Proceedings of the American Philosophical Society, 106(6), 467-482. https://www.jstor.org/stable/985254

Thompson, E. (2022). Escape from Model Land: How mathematical models can lead us astray and what we can do about it. Basic Books.

Wartofsky, M. W. (1979). The model muddle: Proposals for an immodest realism. In Models (pp. 1-11). Springer, Dordrecht.

Zeigler, B. P. (1976). Theory of Modeling and Simulation. Wiley Interscience, New York.

Edmonds, B., Carpentras, D., Roxburgh, N., Chattoe-Brown, E. and Polhill, G. (2024) Delusional Generality – how models can give a false impression of their applicability even when they lack any empirical foundation. Review of Artificial Societies and Social Simulation, 7 May 2024. https://rofasss.org/2024/05/06/delusional-generality

© The authors under the Creative Commons’ Attribution-NoDerivs (CC BY-ND) Licence (v4.0)

An Institute for Crisis Modelling (ICM) – Towards a resilience center for sustained crisis modeling capability

May 22, 2023

By Fabian Lorig¹*, Bart de Bruin², Melania Borit³, Frank Dignum⁴, Bruce Edmonds⁵, Sinéad M. Madden⁶, Mario Paolucci⁷, Nicolas Payette⁸, Loïs Vanhée⁴

*Corresponding author
1 Internet of Things and People Research Center, Malmö University, Sweden
2 Delft University of Technology, Netherlands
3 CRAFT Lab, Arctic University of Norway, Tromsø, Norway
4 Department of Computing Science, Umeå University, Sweden
5 Centre for Policy Modelling, Manchester Metropolitan University Business School, UK
6 School of Engineering, University of Limerick, Ireland
7 Laboratory of Agent Based Social Simulation, ISTC/CNR, Italy
8 Complex Human-Environmental Systems Simulation Laboratory, University of Oxford, UK

The Need for an ICM

Most crises and disasters occur suddenly and hit society while it is unprepared. This makes it particularly challenging to react quickly to their occurrence, to adapt to the resulting new situation, to minimize the societal impact, and to recover from the disturbance. A recent example was the Covid-19 crisis, which revealed weak points of our crisis preparedness. Governments were trying to put restrictions in place to limit the spread of the virus while ensuring the well-being of the population and at the same time preserving economic stability. It quickly became clear that interventions which worked well in some countries did not seem to have the intended effect in other countries. The reason for this is that the success of interventions depends to a great extent on individual human behavior.

Agent-based Social Simulations (ABSS) explicitly model the behavior of the individuals and their interactions in the population and allow us to better understand social phenomena. Thus, ABSS are perfectly suited for investigating how our society might be affected by different crisis scenarios and how policies might affect the societal impact and consequences of these disturbances. Particularly during the Covid-19 crisis, a great number of ABSS have been developed to inform policy making around the globe (e.g., Dignum et al. 2020, Blakely et al. 2021, Lorig et al. 2021). However, weaknesses in creating useful and explainable simulations in a short time also became apparent and there is still a lack of consistency to be better prepared for the next crisis (Squazzoni et al. 2020). In particular, ABSS development approaches are, at this moment, more geared towards simulating one particular situation and validating the simulation using data from that situation. In order to be prepared for a crisis, instead, one needs to simulate many different scenarios for which data might not yet be available. Such simulations also typically need a more interactive interface where stakeholders can experiment with different settings, policies, etc.

For ABSS to become an established, reliable, and well-esteemed method for supporting crisis management, we need to organize and consolidate the available competences and resources. It is not sufficient to react once a crisis occurs but instead, we need to proactively make sure that we are prepared for future disturbances and disasters. For this purpose, we also need to systematically address more fundamental problems of ABSS as a method of inquiry and particularly consider the specific requirements for the use of ABSS to support policy making, which may differ from the use of ABSS in academic research. We therefore see the need for establishing an Institute for Crisis Modelling (ICM), a resilience center to ensure sustained crisis modeling capability.

The vision of starting an Institute for Crisis Modelling was the result of the discussions and working groups at the Lorentz Center workshop on “Agent Based Simulations for Societal Resilience in Crisis Situations” that took place in Leiden, Netherlands from 27 February to 3 March 2023**.

Vision of the ICM

“To have tools suitable to support policy actors in situations that are of
big uncertainty, large consequences, and dependent on human behavior.”

The ICM consists of a taskforce for quickly and efficiently supporting policy actors (e.g., decision makers, policy makers, policy analysts) in situations that are of big uncertainty, large consequences, and dependent on human behavior. For this purpose, the taskforce consists of a larger (informal) network of associates that contribute with their knowledge, skills, models, tools, and networks. The group of associates is composed of a core group of multidisciplinary modeling experts (ranging from social scientists and formal modelers to programmers) as well as of partners that can contribute to specific focus areas (like epidemiology, water management, etc.). The vision of ICM is to consolidate and institutionalize the use of ABSS as a method for crisis management. Although physically ABSS competences may be distributed over a variety of universities, research centers, and other institutions, the ICM serves as a virtual location that coordinates research developments and provides a basic level of funding and communication channel for ABSS for crisis management. This does not only provide policy actors with a single point of contact, making it easier for them to identify who to reach when simulation expertise is needed and to develop long-term trust relationships. It also enables us to jointly and systematically evolve ABSS to become a valuable and established tool for crisis response. The center combines all necessary resources, competences, and tools to quickly develop new models, to adapt existing models, and to efficiently react to new situations.

To achieve this goal and to evolve and establish ABSS as a valuable tool for policy makers in crisis situations, research is needed in different areas. This includes the collection, development, critical analysis, and review of fundamental principles, theories, methods, and tools used in agent-based modeling. This also includes research on data handling (analysis, sharing, access, protection, visualization), data repositories, ontologies, user-interfaces, methodologies, documentation, and ethical principles. Some of these points are concisely described in (Dignum, 2021, Ch. 14 and 15).

The ICM shall be able to provide a wide portfolio of models, methods, techniques, design patterns, and components required to quickly and effectively facilitate the work of policy actors in crisis situations by providing them with adequate simulation models. For the purpose of being able to provide specialized support, the institute will coordinate the human effort (e.g., the modelers) and have specific focus areas for which expertise and models are available. This might be, for instance, pandemics, natural disasters, or financial crises. For each of these focus areas, the center will develop different use cases, which ensures and facilitates rapid responses due to the availability of models, knowledge, and networks.

Objectives of the ICM

To achieve this vision, there are a series of objectives that a resilience center for sustained crisis modeling capability in crisis situations needs to address:

1) Coordinate and promote research

Providing quick and appropriate support for policy actors in crisis situations requires not only a profound knowledge of existing models, methods, tools, and theories but also the systematic development of new approaches and methodologies. This will advance and evolve ABSS so that we are better prepared for future crises, and will serve as a beacon for organizing ABSS research oriented towards practical applications.

2) Enable trusted connections with policy actors

Sustainable collaborations and interactions with decision-makers and policy analysts as well as other relevant stakeholders are a great challenge in ABSS. Getting in contact with the right actors, “speaking the same language”, and having realistic expectations are only some of the common problems that need to be addressed. Thus, the ICM should not only connect to policy actors in times of crises, but have continuous interactions, provide sample simulations, develop use cases, and train the policy actors wherever possible.

3) Enable sustainability of the institute itself

Classic funding schemes are unfit for crisis response, which requires fast reactions with always-available resources as well as the continuous, long-term build-up of knowledge, skills, networks, and technology. Sustainable funding is needed to enable such continuity, for which the ICM provides a demarcated, unifying frame.

4) Actively maintain the network of associates

Maintaining a network of experts is challenging because it requires different competences and experiences. PhD candidates, for instance, might have great practical experience in using different simulation frameworks; however, after their graduation, some might leave academia and others might move on to positions where they do not have the opportunity to use their simulation expertise. Thus, new experts need to be acquired continuously to form a resilient and balanced network.

5) Inform policy actors

The most advanced and profound models cannot do any good in crisis situations if there is no demand from policy actors. Many modelers perceive a certain hesitation from policy actors regarding the use of ABSS, which might be due to them being unfamiliar with the potential benefits and use cases of ABSS, lacking trust in the method itself, or simply being unaware that ABSS exists. Hence, the center needs to educate policy makers and raise awareness as well as improve trust in ABSS.

6) Train the next generation of experts

To quickly develop suitable ABSS models in critical situations requires a variety of expertise. In addition to objective 4, the acquisition of associates, it is also of great importance to educate and train the next generation of experts. ABSS research is still a niche and not taught as an inherent part of the spectrum of methods of most disciplines. The center shall promote and strengthen ABSS education to ensure the training of the next generation of experts.

7) Engage the general public

Finally, the success of ABSS does not only depend on the trust of policy actors but also on how it is perceived by the general public. When developing interventions during the Covid-19 crisis and giving recommendations, the trust in the method was a crucial success factor. Also, developing realistic models requires the active participation of the general public.

Next steps

For ABSS to become a valuable and established tool for supporting policy actors in crisis situations, we are convinced that our efforts need to be institutionalized. This allows us to consolidate available competences, models, and tools as well as to coordinate research endeavors and the development of new approaches required to ensure a sustained crisis modeling capability.

To further pursue this vision, a Special Interest Group (SIG) on Building ResilienCe with Social Simulations (BRICSS) was established at the European Social Simulation Association (ESSA). Moreover, Special Tracks will be organized at the 2023 Social Simulation Conference (SSC) to bring together interested experts.

However, for this vision to become reality, the next steps towards establishing an Institute for Crisis Modelling consist of bringing together ambitious and competent associates as well as identifying core funding opportunities for the center. If the readers feel motivated to contribute in any way to this topic, they are encouraged to contact Frank Dignum, Umeå University, Sweden or any of the authors of this article.

Acknowledgements

This piece is a result of discussions at the Lorentz workshop on “Agent Based Simulations for Societal Resilience in Crisis Situations” in Leiden, NL earlier this year. We are grateful to the organisers of the workshop and to the Lorentz Center as funders and hosts for such a productive enterprise. The final report of the workshop as well as more information can be found on the webpage of the Lorentz Center: https://www.lorentzcenter.nl/agent-based-simulations-for-societal-resilience-in-crisis-situations.html

References

Blakely, T., Thompson, J., Bablani, L., Andersen, P., Ouakrim, D. A., Carvalho, N., Abraham, P., Boujaoude, M.A., Katar, A., Akpan, E., Wilson, N. & Stevenson, M. (2021). Determining the optimal COVID-19 policy response using agent-based modelling linked to health and cost modelling: Case study for Victoria, Australia. medRxiv, 2021-01. doi: 10.1101/2021.01.11.21249630

Dignum, F., Dignum, V., Davidsson, P., Ghorbani, A., van der Hurk, M., Jensen, M., Kammler C., Lorig, F., Ludescher, L.G., Melchior, A., Mellema, R., Pastrav, C., Vanhee, L. & Verhagen, H. (2020). Analysing the combined health, social and economic impacts of the coronavirus pandemic using agent-based social simulation. Minds and Machines, 30, 177-194. doi: 10.1007/s11023-020-09527-6

Dignum, F. (ed.). (2021) Social Simulation for a Crisis; Results and Lessons from Simulating the COVID-19 Crisis. Springer.

Lorig, Fabian, Johansson, Emil and Davidsson, Paul (2021) ‘Agent-Based Social Simulation of the Covid-19 Pandemic: A Systematic Review’ Journal of Artificial Societies and Social Simulation 24(3), 5. http://jasss.soc.surrey.ac.uk/24/3/5.html. doi: 10.18564/jasss.4601

Squazzoni, F. et al. (2020) ‘Computational Models That Matter During a Global Pandemic Outbreak: A Call to Action’ Journal of Artificial Societies and Social Simulation 23(2), 10. http://jasss.soc.surrey.ac.uk/23/2/10.html. doi: 10.18564/jasss.4298

Lorig, F., de Bruin, B., Borit, M., Dignum, F., Edmonds, B., Madden, S.M., Paolucci, M., Payette, N. and Vanhée, L. (2023) An Institute for Crisis Modelling (ICM) – Towards a resilience center for sustained crisis modeling capability. Review of Artificial Societies and Social Simulation, 22 May 2023. https://rofasss.org/2023/05/22/icm

© The authors under the Creative Commons’ Attribution-NoDerivs (CC BY-ND) Licence (v4.0)

Making Models FAIR: An educational initiative to build good ABM practices

May 11, 2023

By Marco A. Janssen¹, Kelly Claborn¹, Bruce Edmonds², Mohsen Shahbaznezhadfard¹ and Manuela Vanegas-Ferro¹

1 Arizona State University, USA
2 Manchester Metropolitan University, UK

Imagine a world where models are available to build upon. You do not have to build from scratch and painstakingly try to figure out how published papers are getting the published results. To achieve this utopian world, models have to be findable, accessible, interoperable, and reusable (FAIR). With the “Making Models FAIR” initiative, we seek to contribute to moving towards this world.

The initiative – Making Models FAIR – aims to provide capacity building opportunities to improve the skills, practices, and protocols to make computational models findable, accessible, interoperable and reusable (FAIR). You can find detailed information about the project on the website (tobefair.org), but here we will present the motivations behind the initiative and a brief outline of the activities.

There is increasing interest in making data and model code FAIR, and there is quite a lot of discussion on standards (https://www.openmodelingfoundation.org/). What is lacking are opportunities to gain the skills for doing this in practice. We have selected a list of highly cited publications from different domains and developed a protocol for making those models FAIR. The protocol may be adapted over time as we learn what works well.

This list of model publications provides opportunities to learn the skills needed to make models FAIR. The current list is a starting point, and you can suggest alternative model publications as desired. The main goal is to provide the modeling community a place to build capacity in making models FAIR. How do you use Github, code a model in a language or platform of your choice, and write good model documentation? These are necessary skills for collaboration and developing FAIR models. A suggested way of participating is for an instructor to have student groups participate in this activity, selecting a model publication that is of interest to their research.

To make a model FAIR, we focus on five activities:

1. If the code is not available with the publication, find out whether the code is available (contact the authors) or replicate the model based on the model documentation. It might also happen that the code is available in programming language X, but you want to have it available in another language.
2. If the code does not have a license, make sure an appropriate license is selected to make it available.
3. Get a DOI, which is a permanent link to the model code and documentation. You could use comses.net or zenodo.org or similar services.
4. Can you improve the model documentation? There is typically a form of documentation in a publication, in the article or an appendix, but is this detailed enough to understand how and why certain model choices have been made? Could you replicate the model from the information provided in the model documentation?
5. What is the state of the model code? We know that most of us are not professional programmers and might be hesitant to share our code. Good practice is to comment on what the different procedures are doing, to define variables, and not to leave all kinds of wild ideas commented out in the code base.

Most of the models listed do not have code available with the publication, which will require participants to contact the original authors to obtain the code and/or to reproduce the code from the model documentation.

We are eager to learn what challenges people experience in making models FAIR. This could help to improve the protocols we provide. We also hope that those who make a model FAIR publish a contribution in RofASSS or relevant modeling journals. For publishing contributions in journals, it would be interesting to use a FAIR model to explore the robustness of the model results, especially for models that were published many years ago and for which fewer computational resources were available.

The tobefair.org website contains a lot of detailed information and educational opportunities. Below is a diagram from the site that aims to illustrate the road map of making models FAIR, so you can easily find the relevant information. Learn more by navigating to the About page and clicking through the diagram.

Making simulation models findable, accessible, interoperable and reusable is an important part of good scientific practice for simulation research. If important models fail to reach this standard, then this makes it hard for others to reproduce, check and extend them. If you want to be involved – to improve the listed models, or to learn the skills to make models FAIR – we hope you will participate in the project by going to tobefair.org and contributing.

Janssen, M.A., Claborn, K., Edmonds, B., Shahbaznezhadfard, M. and Vanegas-Ferro, M. (2023) Making Models FAIR: An educational initiative to build good ABM practices. Review of Artificial Societies and Social Simulation, 8 May 2023. https://rofasss.org/2023/05/11/fair/

© The authors under the Creative Commons’ Attribution-NoDerivs (CC BY-ND) Licence (v4.0)

Content

The inevitable “layering” of models to extend the reach of our understanding

February 9, 2023 thesubmissionauthor

By Bruce Edmonds

“Just as physical tools and machines extend our physical abilities, models extend our mental abilities, enabling us to understand and control systems beyond our direct intellectual reach” (Calder & al. 2018)

Motivation

There is a modelling norm that one should be able to completely understand one’s own model. Whilst acknowledging there is a trade-off between a model’s representational adequacy and its simplicity of formulation, this tradition assumes there will be a “sweet spot” where the model is just tractable but also good enough to be usefully informative about the target of modelling – in the words attributed to Einstein, “Everything should be made as simple as possible, but no simpler”¹. But what do we do about all the phenomena where to get an adequate model² one has to settle for a complex one (where by “complex” I mean a model that we do not completely understand)? Despite the tradition in Physics to the contrary, it would be an incredibly strong assumption that there are no such phenomena, i.e. that an adequate simple model is always possible (Edmonds 2013).

There are three options in these difficult cases.

1. Do not model the phenomena at all until we can find an adequate model we can fully understand. Given the complexity of much around us, this would mean not modelling these for the foreseeable future, and maybe never.
2. Accept inadequate simpler models and simply hope that these are somehow approximately right³. This option would allow us to get answers but with no idea whether they were at all reliable. There are many cases of overly simplistic models leading policy astray (Aodha & Edmonds 2017; Thompson 2022), so this is dangerous if such models influence decisions with real consequences.
3. Use models that are good for our purpose but that we only partially understand. This is the option examined in this paper.

When the purpose is empirical the last option is equivalent to preferring empirical grounding over model simplicity (Edmonds & Moss 2005).

Partially Understood Models

In practice this argument has already been won – we do not completely understand many computer simulations that we use and rely on. For example, due to the chaotic nature of the dynamics of the weather, forecasting models are run multiple times with slightly randomised inputs and the “ensemble” of forecasts inspected to get an idea of the range of different outcomes that could result (some of which might be qualitatively different from the others)⁴. Working out the outcomes in each case requires the computational tracking of a huge number of entities in a way that is far beyond what the human mind can do⁵. In fact, the whole of “Complexity Science” can be seen as different ways to get some understanding of systems for which there is no analytic solution⁶.
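
The ensemble idea can be illustrated with a very small sketch. This is not a weather model: it assumes the chaotic logistic map as a stand-in for the forecasting model, and the function names, noise level and ensemble size are all hypothetical choices made for illustration only.

```python
import random

def logistic_map(x0, r=3.9, steps=50):
    """Toy chaotic 'model': iterate x -> r*x*(1-x) and return the final value."""
    x = x0
    for _ in range(steps):
        x = r * x * (1.0 - x)
    return x

def run_ensemble(x0, n_members=200, noise=1e-4, seed=0):
    """Run the toy model many times from slightly randomised starting points."""
    rng = random.Random(seed)
    outcomes = []
    for _ in range(n_members):
        # Perturb the input a little and keep it inside the valid range [0, 1].
        perturbed = min(max(x0 + rng.uniform(-noise, noise), 0.0), 1.0)
        outcomes.append(logistic_map(perturbed))
    return outcomes

outcomes = run_ensemble(0.4)
spread = max(outcomes) - min(outcomes)
print(f"min={min(outcomes):.3f} max={max(outcomes):.3f} spread={spread:.3f}")
```

Even though every line of the toy model is understood, the only practical way to see the range of outcomes is to run the ensemble and inspect it; tiny differences in the input lead to widely spread results.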

Of course, this raises the question of what is meant by “understand” a model, for this is not something that is formally defined. This could involve many things, including the following.

1. That the micro-level – the individual calculations or actions done by the model each time step – is understood. This is equivalent to understanding each line of the computer code.
2. That some of the macro-level outcomes that result from the computation of the whole model are understood in terms of partial theories or “rules of thumb”.
3. That all the relevant macro-level outcomes can be determined to a high degree of accuracy without simulating the model (e.g. by a mathematical model).

Clearly, level (1) is necessary for most modelling purposes in order to know the model is behaving as intended. The specification of this micro-level is usually how such models are made, so if this differs from what was intended then this would be a bug. Thus this level would be expected of most models⁷. However, this does not necessarily mean that this is at the finest level of detail possible. For example, we usually do not bother about how random number generators work but simply rely on their operation; in this case, though, we have a very good level (3) understanding of these sub-routines.

At the other extreme, a level (3) understanding is quite rare outside the realm of physics. In a sense, having this level of understanding makes the model redundant, so would probably not be the case for most working models (those used regularly)⁸. As discussed above, there will be many kinds of phenomena for which this level of understanding is not feasible.

Clearly, what many modellers find useful is a combination of levels (1) & (2) – that is, the detailed, micro-level steps that the model takes are well understood and the outcomes understood well enough for the intended task. For example, when using a model to establish a complex explanation⁹ (of some observed pattern in data using certain mechanisms or structures) then one might understand the implementation of the candidate mechanisms and verify that the outcomes fit the target pattern for a range of parameters, but not completely understand the detail of the causation involved. There might well be some understanding, for example how robust this is to minor variations in the initial conditions or the working of the mechanisms involved (e.g. by adding some noise to the processes). A complete understanding might not be accessible but this does not stop an explanation being established (although a better understanding is an obvious goal for future research or avenue for critiques of the explanation).

Of course, any lack of a complete, formal understanding leaves some room for error. The argument here is not deriding the desirability of formal understanding, but is against prioritising that over model adequacy. Also the lack of a formal, level (3), understanding of a model does not mean we cannot take more pragmatic routes to checking it. For example: performing a series of well-designed simulation experiments that intend to potentially refute the stated conclusions, systematically comparing to other models, doing a thorough sensitivity analysis and independently reproducing models can help ensure their reliability. These can be compared with engineering methods – one may not have a proof that a certain bridge design is solid over all possible dynamics, but practical measures and partial modelling can ensure that any risk is so low as to be negligible. If we had to wait until bridge designs were proven beyond doubt, we would simply have to do without them.

Layering Models to Leverage some Understanding

As a modeller, if I do not understand something my instinct is to model it. This instinct does not change if what I do not understand is, itself, a model. The result is a model of the original model – a meta-model. This is, in fact, common practice. I may select certain statistics summarising the outcomes and put these on a graph; I might analyse the networks that have emerged during model runs; I may use maths to approximate or capture some aspect of the dynamics; I might cluster and visualise the outcomes using Machine Learning techniques; I might make a simpler version of the original and compare them. All of these might give me insights into the behaviour of the original model. Many of these are so normal we do not think of this as meta-modelling. Indeed, empirically-based models are already, in a sense, meta-models, since the data that they represent are themselves a kind of descriptive model of reality (gained via measurement processes).

This meta-modelling strategy can be iterated to produce meta-meta-models etc. resulting in “layers” of models, with each layer modelling some aspect of the one “below” until one reaches the data and then what the data measures. Each layer should be able to be compared and checked with the layer “below”, and analysed by the layer “above”.
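
A toy sketch of one such pair of layers is given below. It is not any of the SCID models: it assumes a hypothetical agent-level simulation of a behaviour spreading by imitation (the layer “below”) and a much simpler mean-field recurrence as its meta-model (the layer “above”), and then checks the upper layer against the lower one. All names and parameter values are illustrative assumptions.

```python
import random

def simulate(p, n_agents=200, steps=20, n_start=10, seed=0):
    """Toy agent-level model (the layer 'below').

    Each step, every inactive agent looks at one randomly chosen agent and,
    if that agent was active at the start of the step, becomes active with
    probability p. Returns the final fraction of active agents.
    """
    rng = random.Random(seed)
    active = [i < n_start for i in range(n_agents)]
    for _ in range(steps):
        previous = list(active)  # synchronous update: use last step's state
        for i in range(n_agents):
            if not previous[i]:
                j = rng.randrange(n_agents)
                if previous[j] and rng.random() < p:
                    active[i] = True
    return sum(active) / n_agents

def meta_model(p, steps=20, start_fraction=0.05):
    """Simpler model of the simulation (the layer 'above').

    Mean-field recurrence a <- a + p*a*(1-a), which ignores individual
    agents and randomness altogether.
    """
    a = start_fraction
    for _ in range(steps):
        a += p * a * (1.0 - a)
    return a

def compare(p_values, n_runs=30):
    """Check the meta-model against averaged runs of the layer below."""
    rows = []
    for p in p_values:
        runs = [simulate(p, seed=s) for s in range(n_runs)]
        rows.append((p, sum(runs) / n_runs, meta_model(p)))
    return rows

rows = compare([0.1, 0.2, 0.3])
for p, sim_mean, approx in rows:
    gap = abs(sim_mean - approx)
    print(f"p={p:.1f} simulation={sim_mean:.3f} meta-model={approx:.3f} gap={gap:.3f}")
```

The gap column is the point of the exercise: the simpler layer is only trusted to the extent that it has been compared with the layer it models, over the range of parameters where it will be used.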

An extended example of such layering was built during the SCID (Social Complexity of Immigration and Diversity) project¹⁰ and is illustrated in Figure 1. In this, a complicated simulation (Model 1) was built to incorporate some available data and what was known concerning the social and behavioural processes that lead people to bother to vote (or not). This simulation was used as a counter-example to show how assumptions about the chaining effect of interventions might be misplaced (Fieldhouse et al. 2016). A much simpler simulation was then built by theoretical physicists (Model 2), so that it produced the same selected outcomes over time and a range of parameter values. This allowed us to show that some of the features in the original (such as dynamic networks) were essential to get the observed dynamics in it (Lafuerza et al. 2016a). This simpler model was in turn modelled by an even simpler model (Model 3) that was amenable to an analytic model (Model 4) that allowed us to obtain some results concerning the origin of a region of bistability in the dynamics (Lafuerza et al. 2016b).


Figure 1. The Layering of models that were developed in part of the SCID project

Although there are dangers in such layering – each layer could introduce a new weakness – there are also methodological advantages, including the following. (A) Each model in the chain (except model 4) is compared and checked against both the layer below and that above. Such multiple model comparisons are excellent for revealing hidden assumptions and unanticipated effects. (B) Whilst previously what might have happened was a “heroic” leap of abstraction from evidence and understanding straight to Model 3 or 4, here abstraction happens over a series of more modest steps, each of which is more amenable to checking and analysis. When you stage abstraction the introduced assumptions are more obvious and easier to analyse.

One can imagine such “layering” developing in many directions to leverage useful (but indirect) understanding, for example the following.

Using an AI algorithm to learn patterns in some data (e.g. medical data for disease diagnosis) but then modelling its working to obtain some human-accessible understanding of how it is doing it.
Using a machine learning model to automatically identify the different “phase spaces” in model results where qualitatively different model behaviour is exhibited, so one can then try to simplify the model within each phase.
Automatically identifying the processes and structures that are common to a given set of models to facilitate the construction of a more general, ‘umbrella’ model that approximates all the outcomes that would have resulted from the set, but within a narrower range of conditions.

As the quote at the top implies, we are used to settling for partial control of what machines do because it allows us to extend our physical abilities in useful ways. Each time we make their control more indirect, we need to check that this is safe and adequate for purpose. In the cars we drive there are ever more layers of electronic control between us and the physical reality they drive through, and we adjust to each of these; we are currently adjusting to more self-drive abilities. Of course, the testing and monitoring of these systems is very important but that will not stop the introduction of layers that will make them safer and more pleasant to drive.

The same is true of our modelling, which we will need to apply in ever more layers in order to leverage useful understanding which would not be accessible otherwise. Yes, we will need to use practical methods to test their fitness for purpose and reliability, and this might include the complete verification of some components (where this is feasible), but we cannot constrain ourselves to only models we completely understand.

Concluding Discussion

If the above seems obvious, then why am I bothering to write this? I think for a few reasons. Firstly, to answer the presumption that understanding one’s model must have priority over all other considerations (such as empirical adequacy): sometimes we must accept and use partially understood models. Secondly, to point out that such layering has benefits as well as difficulties – especially if it can stage abstraction into more verifiable steps and thus avoid huge leaps to simple but empirically-isolated models. Thirdly, because such layering will become increasingly common and necessary.

In order to extend our mental reach further, we will need to develop increasingly complicated and layered modelling. To do this we will need to accept that our understanding is leveraged via partially understood models, but also to develop the practical methods to ensure their adequacy for purpose.

Notes

[1] This is a compressed version of his actual words during a 1933 lecture, which were: “It can scarcely be denied that the supreme goal of all theory is to make the irreducible basic elements as simple and as few as possible without having to surrender the adequate representation of a single datum of experience.” (Robinson 2018)
[2] Adequate for whatever our purpose for it is (Edmonds & al. 2019).
[3] The weasel words I once heard from a Mathematician excusing an analytic model he knew to be simplistic were: that, although he knew it was wrong, it was useful for “capturing core dynamics” (though how he knew that they were not completely wrong eludes me).
[4] For an introduction to this approach read the European Centre for Medium-Range Weather Forecasts’ fact sheet on “Ensemble weather forecasting” at: https://www.ecmwf.int/en/about/media-centre/focus/2017/fact-sheet-ensemble-weather-forecasting
[5] In principle, a person could do all the calculations involved in a forecast but only with the aid of exterior tools such as pencil and paper to keep track of it all so it is arguable whether the person doing the individual calculations has an “understanding” of the complete picture. Lewis Fry Richardson, who pioneered the idea of numerical forecasting of weather in the 1920s, did a 1-day forecast by hand to illustrate his method (Lynch 2008), but this does not change the argument.
[6] An analytic solution is when one can obtain a closed-form equation that characterises all the outcomes by manipulating the mathematical symbols in a proof. If one has to numerically calculate outcomes for different initial conditions and parameters this is a computational solution.
[7] For purely predictive models, whose purpose is only to anticipate an unknown value to a useful level of accuracy, this is not strictly necessary. For example, how some AI/Machine learning models work may not be clear at the micro-level, but as long as they work (successfully predict) this does not matter – even if their predictive ability is due to a bug.
[8] Models may still be useful in this case, for example to check the assumptions made in the matching mathematical or other understanding.
[9] For more on this use see (Edmonds et al. 2019).
[10] For more about this project see http://cfpm.org/scid

Acknowledgements

Bruce Edmonds is supported as part of the ESRC-funded, UK part of the “ToRealSim” project, 2019-2023, grant number ES/S015159/1 and was supported as part of the EPSRC-funded “SCID” project 2010-2016, grant number EP/H02171X/1.

References

Calder, M., Craig, C., Culley, D., de Cani, R., Donnelly, C.A., Douglas, R., Edmonds, B., Gascoigne, J., Gilbert, N., Hargrove, C., Hinds, D., Lane, D.C., Mitchell, D., Pavey, G., Robertson, D., Rosewell, B., Sherwin, S., Walport, M. and Wilson, A. (2018) Computational modelling for decision-making: where, why, what, who and how. Royal Society Open Science, DOI:10.1098/rsos.172096.

Edmonds, B. (2013) Complexity and Context-dependency. Foundations of Science, 18(4):745-755. DOI:10.1007/s10699-012-9303-x

Edmonds, B. and Moss, S. (2005) From KISS to KIDS – an ‘anti-simplistic’ modelling approach. In P. Davidsson et al. (Eds.): Multi Agent Based Simulation 2004. Springer, Lecture Notes in Artificial Intelligence, 3415:130–144. DOI:10.1007/978-3-540-32243-6_11

Fieldhouse, E., Lessard-Phillips, L. & Edmonds, B. (2016) Cascade or echo chamber? A complex agent-based simulation of voter turnout. Party Politics. 22(2):241-256. DOI:10.1177/1354068815605671

Lafuerza, LF, Dyson, L, Edmonds, B & McKane, AJ (2016a) Simplification and analysis of a model of social interaction in voting, European Physical Journal B, 89:159. DOI:10.1140/epjb/e2016-70062-2

Lafuerza L.F., Dyson L., Edmonds B., & McKane A.J. (2016b) Staged Models for Interdisciplinary Research. PLoS ONE, 11(6): e0157261. DOI:10.1371/journal.pone.0157261

Lynch, P. (2008). The origins of computer weather prediction and climate modeling. Journal of Computational Physics, 227(7), 3431-3444. DOI:10.1016/j.jcp.2007.02.034

Robinson, A. (2018) Did Einstein really say that? Nature, 557, 30. DOI:10.1038/d41586-018-05004-4

Thompson, E. (2022) Escape from Model Land. Basic Books. ISBN-13: 9781529364873

Edmonds, B. (2023) The inevitable “layering” of models to extend the reach of our understanding. Review of Artificial Societies and Social Simulation, 9 Feb 2023. https://rofasss.org/2023/02/09/layering

Content

Socio-Cognitive Systems – a position statement

April 2, 2022 thesubmissionauthor

By Frank Dignum¹, Bruce Edmonds² and Dino Carpentras³

¹Department of Computing Science, Faculty of Science and Technology, Umeå University, frank.dignum@umu.se
²Centre for Policy Modelling, Manchester Metropolitan University, bruce@edmonds.name
³Department of Psychology, University of Limerick, dino.carpentras@gmail.com

In this position paper we argue for the creation of a new ‘field’: Socio-Cognitive Systems. The point of doing this is to highlight the importance of a multi-levelled approach to understanding those phenomena where the cognitive and the social are inextricably intertwined – understanding them together.

What goes on ‘in the head’ and what goes on ‘in society’ are complex questions. Each of these deserves serious study on its own – motivating whole fields to answer them. However, it is becoming increasingly clear that these two questions are deeply related. Humans are fundamentally social beings, and it is likely that many features of their cognition have evolved because they enable them to live within groups (Herrmann et al. 2007). Whilst some of these social features can be studied separately (e.g. in a laboratory), others only become fully manifest within society at large. On the other hand, it is also clear that how society ‘happens’ is complicated and subtle and that these processes are shaped by the nature of our cognition. In other words, what people ‘think’ matters for understanding how society ‘is’ and vice versa. For many reasons, both of these questions are difficult to answer. As a result of these difficulties, many compromises are necessary in order to make progress on them, but each compromise also implies some limitations. The two main types of compromise consist of limiting the analysis to only one of the two (i.e. either cognition or society)[1]. To take but a few examples of this:

Neuro-scientists study what happens between systems of neurones to understand how the brain does things and this is so complex that even relatively small ensembles of neurones are at the limits of scientific understanding.
Psychologists see what can be understood of cognition from the outside, usually in the laboratory so that some of the many dimensions can be controlled and isolated. However, what can be reproduced in a laboratory is a limited part of behaviour that might be displayed in a natural social context.
Economists limit themselves to the study of the (largely monetary) exchange of services/things that could occur under assumptions of individual rationality, which is a model of thinking not based upon empirical data at the individual level. Indeed it is known to contradict a lot of the data and may only be a good approximation for average behaviour under very special circumstances.
Ethnomethodologists will enter a social context and describe in detail the social and individual experience there, but not generalise beyond that and not delve into the cognition of those they observe.
Other social scientists will take a broader view, look at a variety of social evidence, and theorise about aspects of that part of society. They (almost always) do not take individual cognition into account in these and do not seek to integrate the social and the cognitive levels.

Each of these, in its own way, separates the internal mechanisms of thought from the wider mechanisms of society or limits its focus to a very specific topic. This is understandable; what each is studying is enough to keep them occupied for many lifetimes. However, this means that each of these has developed its own terms, issues, approaches and techniques, which makes relating results between fields difficult (as Kuhn, 1962, pointed out).

Figure 1: Schematic representation of the relationship between the individual and society. Individuals’ cognition is shaped by society, at the same time, society is shaped by individuals’ beliefs and behaviour.

This separation of the cognitive and the social may get in the way of understanding many things that we observe. Some phenomena seem to involve a combination of these aspects in a fundamental way – the individual (and its cognition) being part of society as well as society being part of the individual. Some examples of this are as follows (but please note that this is far from an exhaustive list).

Norms. A social norm is a constraint or obligation upon action imposed by society (or perceived as such). One may well be mistaken about a norm (e.g. whether it is ok to casually talk to others at a bus stop), thus it is also a belief – often not told to one explicitly but something one needs to infer from observation. However, for a social norm to hold it also needs to be an observable convention. Decisions to violate social norms require that the norm is an explicit (referable) object in the cognitive model. But the violation also has social consequences. If people react negatively to violations the norm can be reinforced. But if violations are ignored it might lead to a norm disappearing. How new norms come about, or how old ones fade away, is a complex set of interlocking cognitive and social processes. Thus social norms are a phenomenon that essentially involves both the social and the cognitive (Conte et al. 2013).
Joint construction of social reality. Many of the constraints on our behaviour come from our perception of social reality. However, we also create this social reality and constantly update it. For example, we can invent a new procedure to select a person as head of department or exit a treaty and thus have different ways of behaving after this change. However, these changes are not unconstrained in themselves. Sometimes the time is “ripe for change”, while at other times resistance is too big for any change to take place (even though a majority of the people involved would like to change). Thus what is socially real for us depends on what people individually believe is real, but this depends in complex ways on what other people believe and their status. Probably even more importantly, the “strength” of a social structure depends on the use people make of it. For example, a head of department becomes important if all decisions in the department are deferred to the head, even though this might not be required by the university or by law.
Identity. Our (social) identity determines the way other people perceive us (e.g. a sports person, a nerd, a family man) and therefore creates expectations about our behaviour. We can create our identities ourselves and cultivate them, but at the same time, when we have a social identity, we try to live up to it. Thus, it will partially determine our goals and reactions and even our feeling of self-esteem when we live up to our identity or fail to do so. As individuals we (at least sometimes) have a choice as to our desired identity, but in practice, this can only be realised with the consent of society. As a runner I might feel the need to run at least three times a week in order for other people to recognize me as a runner. At the same time, a person known as a runner might be excused from a meeting if training for an important event, thus reinforcing the importance of the “runner” identity.
Social practices. The concept already indicates that social practices are about the way people habitually interact and through this interaction shape social structures. Practices like shaking hands when greeting do not always have to be efficient, but they are extremely socially important. For example, different groups, countries and cultures will have different practices when greeting, and performing according to the practice shows whether you are part of the in-group or out-group. However, practices can also change based on circumstances and people, as happened, for example, to the practice of shaking hands during the covid-19 pandemic. Thus, they are flexible and adapt to the context. They are used as flexible mechanisms to efficiently fit interactions in groups, connecting persons and group behaviour.

As a result, this division between the cognitive and the social gets in the way not only of theoretical studies, but also of practical applications such as policy making. For example, interventions aimed at encouraging vaccination (such as compulsory vaccination) may reinforce the (social) identity of the vaccine hesitant. However, this risk and its possible consequences for society cannot be properly understood without a clear grasp of the dynamic evolution of social identity.

Computational models and systems provide a way of trying to understand the cognitive and the social together. For computational modellers, there is no particular reason to confine themselves to only the cognitive or only the social because agent-based systems can include both within a single framework. In addition, the computational system is a dynamic model that can represent the interactions of the individuals that connect the cognitive models and the social models. Thus the fact that computational models have a natural way to represent the actions as an integral and defining part of the socio-cognitive system is of prime importance. Given that the actions are an integral part of the model, it is well suited to modelling the dynamics of socio-cognitive systems and tracking changes at both the social and the cognitive level. Therefore, within such systems we can study how cognitive processes may act to produce social phenomena whilst, at the same time, studying how social realities shape the cognitive processes. Carley and Newell (1994) discuss what is necessary at the agent level for sociality, and Hofstede et al. (2021) talk about how to understand sociality using computational models (including theories of individual action) – we want to understand both together. Thus, we can model the social embeddedness that Granovetter (1985) talked about – going beyond over- or under-socialised representations of human behaviour. It is not that computational models are innately suitable for modelling either the cognitive or the social, but that they can be appropriately structured (e.g. sets of interacting parts bridging micro-, meso- and macro-levels) and include arbitrary levels of complexity.

Lots of models that represent the social have entities that stand for the cognitive, but do not explicitly represent much of that detail – similarly much cognitive modelling implies the social in terms of the stimuli and responses of an individual that would be to other social entities, but where these other entities are not explicitly represented or are simplified away.
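
To make the idea of representing both levels in one framework concrete, here is a deliberately minimal sketch. It is an assumption-laden toy, not a model from the literature cited here: each hypothetical agent carries a belief about how strong a norm is (the cognitive part), acts on that belief, and revises it after observing a sample of other agents’ actions (the social part). A real socio-cognitive system would need far more processual detail at both levels.

```python
import random

class Agent:
    """An agent with a small cognitive state: its belief about a norm's strength."""

    def __init__(self, belief):
        self.belief = belief  # value in [0, 1]

    def act(self, rng):
        """Cognitive -> social: comply with probability equal to the belief."""
        return rng.random() < self.belief

    def observe(self, seen_actions, rate=0.2):
        """Social -> cognitive: shift the belief towards observed compliance."""
        observed = sum(seen_actions) / len(seen_actions)
        self.belief += rate * (observed - self.belief)

def belief_spread(agents):
    """Difference between the strongest and weakest belief in the population."""
    beliefs = [a.belief for a in agents]
    return max(beliefs) - min(beliefs)

def run(n_agents=100, steps=50, n_seen=10, seed=1):
    rng = random.Random(seed)
    agents = [Agent(rng.random()) for _ in range(n_agents)]
    initial_spread = belief_spread(agents)
    compliance = []  # society-level record: fraction complying at each step
    for _ in range(steps):
        actions = [a.act(rng) for a in agents]
        compliance.append(sum(actions) / n_agents)
        for a in agents:
            # Each agent sees only a small random sample of what others did.
            a.observe(rng.sample(actions, n_seen))
    return compliance, initial_spread, belief_spread(agents)

compliance, initial_spread, final_spread = run()
print(f"final compliance={compliance[-1]:.2f}")
print(f"belief spread: start={initial_spread:.2f} end={final_spread:.2f}")
```

The actions are the link between the two levels: they are produced by individual beliefs and, once observed, change the beliefs of others, so both the social record (compliance) and the cognitive states (beliefs) can be tracked in the same run.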

Socio-Cognitive Systems (SCS) are: those models and systems where both cognitive and social complexity are represented with a meaningful level of processual detail.

A good example of an application where this proved to be of the greatest importance was in simulations for the covid-19 crisis. The spread of the coronavirus at the macro level could be given by an epidemiological model, but the actual spreading depended crucially on the human behaviour that resulted from individuals’ cognitive model of the situation. In Dignum (2021) it was shown how the socio-cognitive system approach was fundamental to obtaining better insights into the effectiveness of a range of covid-19 restrictions.

Formality here is important. Computational systems are formal in the sense that they can be unambiguously passed around (i.e. unlike language, they are not differently re-interpreted by each individual) and operate according to their own precisely specified and explicit rules. This means that the same system can be examined and experimented on by a wider community of researchers. Sometimes, even when the researchers from different fields find it difficult to talk to one another, they can fruitfully cooperate via a computational model (e.g. Lafuerza et al. 2016). Other kinds of formal systems (e.g. logic, maths) are geared towards models that describe an entire system from a bird’s-eye view. Although there are some exceptions, like fibred logics (Gabbay 1996), these are too abstract to be of good use in modelling practical situations. The lack of modularity has been addressed in context logics (Giunchiglia & Ghidini 1998). However, the contexts used in this setting are not suitable for generating a more general societal model. As a result, most typical mathematical models use a number of agents which is either one, two or infinite (Miller and Page 2007), while important social phenomena happen with a “medium sized” population. What all these formalisms miss is a natural way of specifying the dynamics of the system that is modelled, while having ways to modularly describe individuals and the society resulting from their interactions. Thus, although much of what is represented in Socio-Cognitive Systems is not computational, the lingua franca for talking about them is.

The ‘double complexity’ of combining the cognitive and the social in the same system will bring its own methodological challenges. Such complexity will mean that many socio-cognitive systems will be, themselves, hard to understand or analyse. In the covid-19 simulations, described in (Dignum 2021), a large part of the work consisted of analysing, combining and representing the results in ways that were understandable. As an example, for one scenario 79 pages of graphs were produced showing different relations between potentially relevant variables. New tools and approaches will need to be developed to deal with this. We only have some hints of these, but it seems likely that secondary stages of analysis – understanding the models – will be necessary, resulting in a staged approach to abstraction (Lafuerza et al. 2016). In other words, we will need to model the socio-cognitive systems, maybe in terms of further (but simpler) socio-cognitive systems, but also maybe with a variety of other tools. We do not have a view on this further analysis, but this could include: machine learning, mathematics, logic, network analysis, statistics, and even qualitative approaches such as discourse analysis.

An interesting input for the methodology of designing and analysing socio-cognitive systems is anthropology and specifically ethnographical methods. Again, for the covid-19 simulations the first layer of the simulation was constructed based on “normal day life patterns”. Different types of persons were distinguished that each have their own pattern of living. These patterns interlock and form a fabric of social interactions that overall should satisfy most of the needs of the agents. Thus we calibrate the simulation based on the stories of types of people and their behaviours. Note that doing the same just based on available data of behaviour would not account for the underlying needs and motives of that behaviour and would not be a good basis for simulating changes. The stories that we used looked very similar to the type of reports ethnographers produce about certain communities. Thus further investigating this connection seems worthwhile.

For representing the output of the complex socio-cognitive systems we can also use the analogue of stories. Basically, different stories show the underlying (assumed) causal relations between phenomena that are observed. For example, an increase in people having lunch with friends can be explained by the fact that a curfew prevents people having dinner with their friends, while they still have a need to socialize. Thus the alternative of going for lunch is chosen more often. One can see that the explaining story uses both social and cognitive elements to describe the results. Although in the covid-19 simulations we have created a number of these stories, they were all created by hand after (sometimes weeks of) careful analysis of the results. Thus, for this kind of approach to be viable, new tools are required.

Although human society is the archetypal socio-cognitive system, it is not the only one. Both social animals and some artificial systems also come under this category. These may be very different from the human, and in the case of artificial systems completely different. Thus, Socio-Cognitive Systems is not limited to the discussion of observable phenomena, but can include constructed or evolved computational systems, and artificial societies. Examination of these (either theoretically or experimentally) opens up the possibility of finding either contrasts or commonalities between such systems – beyond what happens to exist in the natural world. However, we expect that ideas and theories that were conceived with human socio-cognitive systems in mind might often be an accessible starting point for understanding these other possibilities.

In a way, Socio-Cognitive Systems bring together two different threads in the work of Herbert Simon. Firstly, as in Simon (1948) it seeks to take seriously the complexity of human social behaviour without reducing this to overly simplistic theories of individual behaviour. Secondly, it adopts the approach of explicitly modelling the cognitive in computational models (Newell & Simon 1972). Simon did not bring these together in his lifetime, perhaps due to the limitations and difficulty of deploying the computational tools to do so. Instead, he tried to develop alternative mathematical models of aspects of thought (Simon 1957). However, those models were limited by being mathematical rather than computational.

To conclude, a field of Socio-Cognitive Systems would consider the cognitive and the social in an integrated fashion – understanding them together. We suggest that computational representation or implementation might be necessary to provide concrete reference between the various disciplines that are needed to understand them. We want to encourage research that considers the cognitive and the social in a truly integrated fashion. If labelling a new field does this, it will have achieved its purpose. However, there is the possibility that completely new classes of theory and complexity may be out there to be discovered – phenomena that are missed if the cognitive and the social are not taken together – a new world of socio-cognitive systems.

Notes

[1] Some economic models claim to bridge between individual behaviour and macro outcomes; however, this is traditionally notional. Many economists admit that their primary cognitive models (varieties of economic rationality) are not valid for individuals but are what people on average do – i.e. this is a macro-level model. In other economic models whole populations are formalised using a single representative agent. Recently, some agent-based economic models have been emerging, but these are often constrained to agree with traditional models.

Acknowledgements

Bruce Edmonds is supported as part of the ESRC-funded, UK part of the “ToRealSim” project, grant number ES/S015159/1.

References

Carley, K., & Newell, A. (1994). The nature of the social agent. Journal of mathematical sociology, 19(4): 221-262. DOI: 10.1080/0022250X.1994.9990145

Conte R., Andrighetto G. and Campennì M. (eds) (2013) Minding Norms – Mechanisms and dynamics of social order in agent societies. Oxford University Press, Oxford.

Dignum, F. (ed.) (2021) Social Simulation for a Crisis; Results and Lessons from Simulating the COVID-19 Crisis. Springer.

Herrmann E., Call J, Hernández-Lloreda MV, Hare B, Tomasello M (2007) Humans have evolved specialized skills of social cognition: The cultural intelligence hypothesis. Science 317(5843): 1360-1366. DOI: 10.1126/science.1146282

Hofstede, G.J, Frantz, C., Hoey, J., Scholz, G. and Schröder, T. (2021) Artificial Sociality Manifesto. Review of Artificial Societies and Social Simulation, 8th Apr 2021. https://rofasss.org/2021/04/08/artsocmanif/

Gabbay, D. M. (1996). Fibred Semantics and the Weaving of Logics Part 1: Modal and Intuitionistic Logics. The Journal of Symbolic Logic, 61(4), 1057–1120.

Ghidini, C., & Giunchiglia, F. (2001). Local models semantics, or contextual reasoning= locality+ compatibility. Artificial intelligence, 127(2), 221-259. DOI: 10.1016/S0004-3702(01)00064-9

Granovetter, M. (1985) Economic action and social structure: The problem of embeddedness. American Journal of Sociology 91(3): 481-510. DOI: 10.1086/228311

Kuhn, T.S. (1962) The structure of scientific revolutions. University of Chicago Press, Chicago

Lafuerza L.F., Dyson L., Edmonds B., McKane A.J. (2016) Staged Models for Interdisciplinary Research. PLoS ONE 11(6): e0157261, DOI: 10.1371/journal.pone.0157261

Miller, J. H., & Page, S. E. (2009). Complex adaptive systems. Princeton university press.

Newell A, Simon H.A. (1972) Human problem solving. Prentice Hall, Englewood Cliffs, NJ

Simon, H.A. (1948) Administrative behaviour: A study of the decision making processes in administrative organisation. Macmillan, New York

Simon, H.A. (1957) Models of Man: Social and rational. John Wiley, New York

Dignum, F., Edmonds, B. and Carpentras, D. (2022) Socio-Cognitive Systems – A Position Statement. Review of Artificial Societies and Social Simulation, 2nd Apr 2022. https://rofasss.org/2022/04/02/scs

© The authors under the Creative Commons’ Attribution-NoDerivs (CC BY-ND) Licence (v4.0)

Content

The Poverty of Suggestivism – the dangers of “suggests that” modelling

February 28, 2022

By Bruce Edmonds

Vagueness and refutation

A model^[1] is basically composed of two parts (Zeigler 1976, Wartofsky 1979):

(1) A set of entities (such as mathematical equations, logical rules, computer code etc.) which can be used to make some inferences as to the consequences of that set (usually in conjunction with some data and parameter values)
(2) A mapping from this set to what it aims to represent – what the bits mean

Whilst a lot of attention has been paid to the internal rigour of the set of entities and the inferences that are made from them (1), the mapping to what that represents (2) has often been left as implicit or incompletely described – sometimes only indicated by the labels given to its parts. The result is a model that vaguely relates to its target, suggesting its properties analogically. There is not a well-defined way that the model is to be applied to anything observed, but a new map is invented each time it is used to think about a particular case. I call this way of modelling “Suggestivism”, because the model “suggests” things about what is being modelled.

This is partly a recapitulation of Popper’s critique of vague theories in his book “The Poverty of Historicism” (1957). He characterised such theories as “irrefutable”, because whatever the facts, these theories could be made to fit them. Irrefutability is an indicator of a lack of precise mapping to reality – such vagueness makes refutation very hard. However, it is only an indicator; there may be other reasons than vagueness for it not being possible to test a theory – it is their disconnection from well-defined empirical reference that is the issue here.

Some might go as far as suggesting that any model or theory that is not refutable is “unscientific”, but this goes too far, implying a very restricted definition of what ‘science’ is. We need analogies to think about what we are doing and to gain insight into what we are studying, e.g. (Hartmann 1997) – for humans they are unavoidable, ‘baked’ into the way language works (Lakoff 1987). A model might make a set of ideas clear and help map out the consequences of a set of assumptions/structures/processes. Many of these suggestivist models relate to a set of ideas and it is the ideas that relate to what is observed (albeit informally) (Edmonds 2001). However, such models do not capture anything reliable about what they refer to, and in that sense are not part of the set of the established statements and theories that is at the core of science (Arnold 2014).

The dangers of suggestivist modelling

As above, there are valid uses of abstract or theoretical modelling where this is explicitly acknowledged and where no conclusions about observed phenomena are made. So what are the dangers of suggestivist modelling – why am I making such a fuss about it?

Firstly, people often seem to confuse a model as an analogy – a way of thinking about stuff – with a model that tells us reliably about what we are studying. Thus they give undue weight to the analyses of abstract models that are, in fact, just thought experiments. Making models is a very intimate way of theorising – one spends an extended period of time interacting with one’s model: developing, checking, analysing etc. The result is a particularly strong version of “Kuhnian Spectacles” (Kuhn 1962) causing us to see the world through our model for weeks after. Under this strong influence it is natural to confuse what we can reliably infer about the world and how we are currently perceiving/thinking about it. Good scientists should then pause and wait for this effect to wear off so that they can effectively critique what they have done, its limitations and what its implications are. However, in the rush to get their work out, modellers often do not do this, resulting in a sloppy set of suggestive interpretations of their modelling.

Secondly, empirical modelling is hard. It is far easier (and, frankly, more fun) to play with non-empirical models. A scientific culture that treats suggestivist modelling as substantial progress and significantly rewards modellers that do it, will effectively divert a lot of modelling effort in this direction. Chattoe-Brown (2018) displayed evidence of this in his survey of opinion dynamics models – abstract, suggestivist modelling got far more reward (in terms of citations) than those that tried to relate their model to empirical data in a direct manner. Abstract modelling has a role in science, but if it is easier and more rewarding then the field will become unbalanced. It may give the impression of progress but not deliver on this impression. In a more mature science, researchers working on measurement methods (steps from observation to models) and collecting good data are as important as the theorists (Moss 1998).

Thirdly, it is hard to judge suggestivist models. Given that their connection to the modelling target is vague, there cannot be any decisive test of their success. Good modellers should declare the exact purpose of their model, e.g. that it is analogical or merely exploring the consequences of theory (Edmonds et al. 2019), but then accept the consequences of this choice – namely, that it excludes making conclusions about the observed world. If it is for a theoretical exploration then the comprehensiveness of the exploration, the scope of the exploration and the applicability of the model can be judged, but if the model is analogical or illustrative then this is harder. Whilst one model may suggest X, another may suggest the opposite. It is quite easy to fix a model to get the outcomes one wants. Clearly, if a model makes startling suggestions – illustrating totally new ideas or making a counter-example to widely held assumptions – then this helps science by widening the pool of theories or hypotheses that are considered. However, most suggestivist modelling does not do this.

Fourthly, their sheer flexibility as to application causes problems – if one works hard enough one can invent mappings to a wide range of cases; the limits are only those of our imagination. In effect, having a vague mapping from model to what it models adds in huge flexibility in a similar way to having a large number of free (non-empirical) parameters. This flexibility gives an impression of generality, and many desire simple and general models for complex phenomena. However, this is illusory because a different mapping is needed for each case, to make it apply. Given the above (1)+(2) definition of a model this means that, in fact, it is a different model for each case – what a model refers to is part of the model. The same flexibility makes such models impossible to refute, since one can just adjust the mapping to save them. The apparent generality and lack of refutation means that such models hang around in the literature, due to their surface attractiveness.
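The point that the mapping is part of the model can be illustrated with a small sketch. This is only one possible way of writing down the (1)+(2) definition; the class `Model`, the rule `threshold_rule` and both example mappings are hypothetical and invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


def threshold_rule(state: Tuple[int, ...]) -> Tuple[int, ...]:
    """Part (1): a formal rule. Every cell becomes 1 if at least half are 1, else 0."""
    adopt = 1 if 2 * sum(state) >= len(state) else 0
    return tuple(adopt for _ in state)


@dataclass(frozen=True)
class Model:
    rules: Callable[[Tuple[int, ...]], Tuple[int, ...]]  # part (1): the entities
    mapping: Tuple[Tuple[str, str], ...]                  # part (2): what the bits mean


# The same rules, applied to two different cases via two invented mappings.
riots = Model(threshold_rule, (("cell", "a person in a crowd"), ("1", "is rioting")))
fashion = Model(threshold_rule, (("cell", "a consumer"), ("1", "wears the new style")))

if __name__ == "__main__":
    print(riots.rules is fashion.rules)  # the formal part is shared
    print(riots == fashion)              # but the models differ
```

Because equality here compares both fields, the two objects share their formal part yet count as different models, which is the sense in which a vague or shifting mapping yields a different model for each case.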

Finally, these kinds of model are hugely influential beyond the community of modellers to the wider public including policy actors. Narratives that start in abstract models make their way out and can be very influential (Vranckx 1999). Despite the lack of rigorous mapping from model to reality, suggestivist models look impressive, look scientific. For example, very abstract models from the Neo-Classical ‘Chicago School’ of economists supported narratives about the optimal efficiency of markets, leading to a reluctance to regulate them (Krugman 2009). A lack of regulation seemed to be one of the factors behind the 2007/8 economic crash (Baily et al 2008). Modellers may understand that other modellers get over-enthusiastic and over-interpret their models, but others may not. It is the duty of modellers to give an accurate impression of the reliability of any modelling results and not to over-hype them.

How to recognise a suggestivist model

It can be hard to disentangle how empirically vague a model is, because many descriptions of modelling work do not focus on making the mapping to what it represents precise. The reasons for this are various, for example: the modeller might be conflating reality and what is in the model in their minds, the researcher is new to modelling and has not really decided what the purpose of their model is, the modeller might be over-keen to establish the importance of their work and so is hyping the motivation and conclusions, they might simply not have got around to thinking enough about the relationship between their model and what it might represent, or they might not have bothered to make the relationship explicit in their description. Whatever the reason, the reader of any description of such work is often left with an archaeological problem: trying to unearth what the relationship might be, based on indirect clues only. The only way to know for certain is to take a case one knows about and try to apply the model to it, but this is a time-consuming process and relies upon having a case with suitable data available. However, there are some indicators, albeit fallible ones, including the following.

A relatively simple model is interpreted as explaining a wide range of observed, complex phenomena
No data from an observed case study is compared to data from the model (often no data is brought in at all, merely abstract observations) – despite this, conclusions about some observed phenomena are made
The purpose of the model is not explicitly declared
The language of the paper seems to conflate talking about the model with what is being modelled
In the paper there are sudden abstraction ‘jumps’ between the motivation and the description of the model and back again to the interpretation of the results in terms of that motivation. The abstraction jumps involved are large and justified by some a priori theory or modelling precedents rather than evidence.

How to avoid suggestivist modelling

How to avoid the dangers of suggestivist modelling should be clear from the above discussion, but I will make them explicit here.

Be clear about the model purpose – that is, what the model aims to achieve, which indicates how it should be judged by others (Edmonds et al 2019)
Do not make any conclusions about the real world if you have not related the model to any data
Do not make any policy conclusions – things that might affect other people’s lives – without at least some independent validation of the model outcomes
Document how a model relates (or should relate) to data, the nature of that data and maybe even the process whereby that data should be obtained (Achter et al 2019)
Be as explicit as possible about what kinds of phenomena the model applies to – the limits of its scope
Keep the language about the model and what is being modelled distinct – for any statement it should be clear whether it is talking about the model or what it models (Edmonds 2020)
Highlight any bold assumptions in the specification of the model or describe what empirical foundation there is for them – be honest about these

Conclusion

Models can serve many different purposes (Epstein 2008). This is fine as long as the purpose of each model is always made clear, and model results are not interpreted further than their established purpose allows. Research which gives the impression that analogical, illustrative or theoretical modelling can tell us anything reliable about observed complex phenomena is not only sloppy science, but can have a deleterious impact – giving an impression of progress whilst diverting attention from empirically reliable work. Like a bad investment: if it looks too good and too easy to be true, it probably isn’t.

Notes

[1] We often use the word “model” in a lazy way to indicate (1) rather than (1)+(2) in this definition, but a set of entities without any meaning or mapping to anything else is not a model, as it does not represent anything. For example, a random set of equations or program instructions does not make a model.

Acknowledgements

Bruce Edmonds is supported as part of the ESRC-funded, UK part of the “ToRealSim” project, grant number ES/S015159/1.

References

Achter, S., Borit, M., Chattoe-Brown, E., Palaretti, C. & Siebers, P.-O. (2019) Cherchez Le RAT: A Proposed Plan for Augmenting Rigour and Transparency of Data Use in ABM. Review of Artificial Societies and Social Simulation, 4th June 2019. https://rofasss.org/2019/06/04/rat/

Arnold, E. (2014). What’s wrong with social simulations?. The Monist, 97(3), 359-377. DOI:10.5840/monist201497323

Baily, M. N., Litan, R. E., & Johnson, M. S. (2008). The origins of the financial crisis. Fixing Finance Series – Paper 3, The Brookings Institution. https://www.brookings.edu/wp-content/uploads/2016/06/11_origins_crisis_baily_litan.pdf

Chattoe-Brown, E. (2018) What is the earliest example of a social science simulation (that is nonetheless arguably an ABM) and shows real and simulated data in the same figure or table? Review of Artificial Societies and Social Simulation, 11th June 2018. https://rofasss.org/2018/06/11/ecb/

Edmonds, B. (2001) The Use of Models – making MABS actually work. In Moss, S. and Davidsson, P. (eds.), Multi Agent Based Simulation, Lecture Notes in Artificial Intelligence, 1979:15-32. http://cfpm.org/cpmrep74.html

Edmonds, B. (2020) Basic Modelling Hygiene – keep descriptions about models and what they model clearly distinct. Review of Artificial Societies and Social Simulation, 22nd May 2020. https://rofasss.org/2020/05/22/modelling-hygiene/

Epstein, J. M. (2008). Why model?. Journal of artificial societies and social simulation, 11(4), 12. https://jasss.soc.surrey.ac.uk/11/4/12.html

Hartmann, S. (1997): Modelling and the Aims of Science. In: Weingartner, P. et al (ed.) : The Role of Pragmatics in Contemporary Philosophy: Contributions of the Austrian Ludwig Wittgenstein Society. Vol. 5. Wien und Kirchberg: Digi-Buch. pp. 380-385. https://epub.ub.uni-muenchen.de/25393/

Krugman, P. (2009) How Did Economists Get It So Wrong? New York Times, Sept. 2nd 2009. https://www.nytimes.com/2009/09/06/magazine/06Economic-t.html

Kuhn, T.S. (1962) The Structure of Scientific Revolutions. Chicago: University of Chicago Press.

Lakoff, G. (1987) Women, fire, and dangerous things. University of Chicago Press, Chicago.

Morgan, M. S., & Morrison, M. (1999). Models as mediators. Cambridge: Cambridge University Press.

Moss, S. (1998) Social Simulation Models and Reality: Three Approaches. Centre for Policy Modelling Discussion Paper: CPM-98-35, http://cfpm.org/cpmrep35.html

Popper, K. (1957). The poverty of historicism. Routledge.

Vranckx, An. (1999) Science, Fiction & the Appeal of Complexity. In Aerts, Diederik, Serge Gutwirth, Sonja Smets, and Luk Van Langehove, (eds.) Science, Technology, and Social Change: The Orange Book of “Einstein Meets Magritte.” Brussels: Vrije Universiteit Brussel; Dordrecht: Kluwer., pp. 283–301.

Wartofsky, M. W. (1979). The model muddle: Proposals for an immodest realism. In Models (pp. 1-11). Springer, Dordrecht.

Zeigler, B. P. (1976). Theory of Modeling and Simulation. Wiley Interscience, New York.

Edmonds, B. (2022) The Poverty of Suggestivism – the dangers of "suggests that" modelling. Review of Artificial Societies and Social Simulation, 28th Feb 2022. https://rofasss.org/2022/02/28/poverty-suggestivism

Content

Where Now For Experiments In Agent-Based Modelling? Report of a Round Table at SSC2021, held on 22 September 2021

November 2, 2021

**By Dino Carpentras¹, Edmund Chattoe-Brown²*, Bruce Edmonds³, Cesar García-Diaz⁴, Christian Kammler⁵, Anna Pagani⁶ and Nanda Wijermans⁷**

*Corresponding author, ¹Centre for Social Issues Research, University of Limerick, ²School of Media, Communication and Sociology, University of Leicester, ³Centre for Policy Modelling, Manchester Metropolitan University, ⁴Department of Business Administration, Pontificia Universidad Javeriana, ⁵Department of Computing Science, Umeå University, ⁶Laboratory on Human-Environment Relations in Urban Systems (HERUS), École Polytechnique Fédérale de Lausanne (EPFL), ⁷Stockholm Resilience Centre, Stockholm University.

Introduction

This round table was convened to advance and improve the use of experimental methods in Agent-Based Modelling, in the hope that both existing and potential users of the method would be able to identify steps towards this aim^[i]. The session began with a presentation by Bruce Edmonds (http://cfpm.org/slides/experiments%20and%20ABM.pptx) whose main argument was that the traditional idea of experimentation (controlling extensively for the environment and manipulating variables) was too simplistic to add much to the understanding of the sort of complex systems modelled by ABMs and that we should therefore aim to enhance experiments (for example using richer experimental settings, richer measures of those settings and richer data – like discussions between participants as well as their behaviour). What follows is a summary of the main ideas discussed organised into themed sections.

What Experiments Are

Defining the field of experiments proved to be challenging on two counts. The first was that there are a number of labels for potentially relevant approaches (experiments themselves – for example, Boero et al. 2010, gaming – for example, Tykhonov et al. 2008, serious games – for example Taillandier et al. 2019, companion/participatory modelling – for example, Ramanath and Gilbert 2004 and web based gaming – for example, Basole et al. 2013) whose actual content overlap is unclear. Is it the case that a gaming approach is generally more in line with the argument proposed by Edmonds? How can we systematically distinguish the experimental content of a serious game approach from a gaming approach? This seems to be a problem in immature fields where the labels are invented first (often on the basis of a few rather divergent instances) and the methodology has to grow into them. It would be ludicrous if we couldn’t be sure whether a piece of research was survey based or interview based (and this would radically devalue the associated labels if it were so.)

The second challenge, which is also more general in Agent-Based Modelling, is that the same labels are used differently by different researchers. It is not productive to argue about which uses are correct, but it is important that the concepts behind the different uses are clear so that a common scheme of labelling might ultimately be agreed. So, for example, experiment can be used (and different round table participants had different perspectives on the uses they expected) to mean laboratory experiments (simplified settings with human subjects – again see, for example, Boero et al. 2010), experiments with ABMs (formal experimentation with a model that doesn’t necessarily have any empirical content – for example, Doran 1998) and natural experiments (choice of cases in the real world to, for example, test a theory – see Dinesen 2013).

One approach that may help with this diversity is to start developing possible dimensions of experimentation. One might be degree of control (all the way from very stripped down behavioural laboratory experiments to natural situations where the only control is to select the cases). Another might be data diversity: From pure analysis of ABMs (which need not involve data at all), through laboratory experiments that record only behaviour to ethnographic collection and analysis of diverse data in rich experiments (like companion modelling exercises.) But it is important for progress that the field develops robust concepts that allow meaningful distinctions and does not get distracted into pointless arguments about labelling. Furthermore, we must consider the possible scientific implications of experimentation carried out at different points in the dimension space: For example, what are the relative strengths and limitations of experiments that are more or less controlled or more or less data diverse? Is there a “sweet spot” where the benefit of experiments is greatest to Agent-Based Modelling? If so, what is it and why?

The Philosophy of Experiment

The second challenge is the different beliefs (often associated with different disciplines) about the philosophical underpinnings of experiment such as what we might mean by a cause. In an economic experiment, for example, the objective may be to confirm a universal theory of decision making through displayed behaviour only. (It is decisions described by this theory which are presumed to cause the pattern of observed behaviour.) This will probably not allow the researcher to discover that their basic theory is wrong (people are adaptive not rational after all) or not universal (agents have diverse strategies), or that some respondents simply didn’t understand the experiment (deviations caused by these phenomena may be labelled noise relative to the theory being tested but in fact they are not.)

By contrast qualitative sociologists believe that subjective accounts (including accounts of participation in the experiment itself) can be made reliable and that they may offer direct accounts of certain kinds of cause: If I say I did something for a certain reason then it is at least possible that I actually did (and that the reason I did it is therefore its cause). It is no more likely that agreement will be reached on these matters in the context of experiments than it has been elsewhere. But Agent-Based Modelling should keep its reputation for open mindedness by seeing what happens when qualitative data is also collected and not just rejecting that approach out of hand as something that is “not done”. There is no need for Agent-Based Modelling blindly to follow the methodology of any one existing discipline in which experiments are conducted (and these disciplines often disagree vigorously on issues like payment and deception with no evidence on either side which should also make us cautious about their self-evident correctness.)

Finally, there is a further complication in understanding experiments using analogies with the physical sciences. In understanding the evolution of a river system, for example, one can control/intervene, one can base theories on testable micro mechanisms (like percolation) and one can observe. But there is no equivalent to asking the river what it intends (whether we can do this effectively in social science or not).^[ii] It is not totally clear how different kinds of data collection like these might relate to each other in the social sciences, for example, data from subjective accounts, behavioural experiments (which may show different things from what respondents claim) and, for example, brain scans (which side step the social altogether.) This relationship between different kinds of data currently seems incompletely explored and conceptualised. (There is a tendency just to look at easy cases like surveys versus interviews.)

The Challenge of Experiments as Practical Research

This is an important area where the actual and potential users of experiments participating in the round table diverged. Potential users wanted clear guidance on the resources, skills and practices involved in doing experimental work (see similar issues in the behavioural strategy literature, for example, Reypens and Levine 2018). At the most basic level, when does a researcher need to do an experiment (rather than a survey, interviews or observation), what are the resource requirements in terms of time, facilities and money (laboratory experiments are unusual in often needing specific funding to pay respondents rather than substituting the researcher working for free), what design decisions need to be made (paying subjects, online or offline, can subjects be deceived?), how should the data be analysed (how should an ABM be validated against experimental data?) and so on.^[iii] (There are also pros and cons to specific bits of potentially supporting technology like Amazon Mechanical Turk, Qualtrics and Prolific, which have not yet been documented and systematically compared for the novice with a background in Agent-Based Modelling.) There is much discussion about these matters in the traditional literatures of social sciences that do experiments (see, for example, Kagel and Roth 1995, Levine and Parkinson 1994 and Zelditch 2014) but this has not been summarised and tuned specifically for the needs of Agent-Based Modellers (or published where they are likely to see it).

However, it should not be forgotten that not all research efforts need this integration within the same project, so thinking about the problems that really need it is critical. Nonetheless, triangulation is indeed necessary within research programmes. For instance, in subfields such as strategic management and organisational design, it is uncommon to see an ABM integrated with an experiment as part of the same project (though there are exceptions, such as Vuculescu 2017). Instead, ABMs are typically used to explore “what if” scenarios, build process theories and illuminate potential empirical studies. In this approach, knowledge is accumulated instead through the triangulation of different methodologies in different projects (see Burton and Obel 2018). Additionally, modelling and experimental efforts are usually led by different specialists – for example, there is a Theoretical Organisational Models Society whose focus is the development of standards for theoretical organisation science.

In a relatively new and small area, all we often have is some examples of good practice (or more contentiously bad practice) of which not everyone is even aware. A preliminary step is thus to see to what extent people know of good practice and are able to agree that it is good (and perhaps why it is good).

Finally, there was a slightly separate discussion about the perspectives of experimental participants themselves. It may be that a general problem with unreal activity is that you know it is unreal (which may lead to problems with ecological validity – Bornstein 1999.) On the other hand, building on the enrichment argument put forward by Edmonds (above), there is at least anecdotal observational evidence that richer and more realistic settings may cause people to get “caught up” and perhaps participate more as they would in reality. Nonetheless, there are practical steps we can take to learn more about these phenomena by augmenting experimental designs. For example we might conduct interviews (or even group discussions) before and after experiments. This could make the initial biases of participants explicit and allow them to self-evaluate retrospectively the extent to which they got engaged (or perhaps even over-engaged) during the game. The first such questionnaire could be available before attending the experiment, whilst another could be administered right after the game (and perhaps even a third a week later). In addition to practical design solutions, there are also relevant existing literatures that experimental researchers should probably draw on in this area, for example that on systemic design and the associated concept of worldviews. But it is fair to say that we do not yet fully understand the issues here but that they clearly matter to the value of experimental data for Agent-Based Modelling.^[iv]

Design of Experiments

Something that came across strongly in the round table discussion as argued by existing users of experimental methods was the desirability of either designing experiments directly based on a specific ABM structure (rather than trying to use a stripped down – purely behavioural – experiment) or mixing real and simulated participants in richer experimental settings. In line with the enrichment argument put forward by Edmonds, nobody seemed to be using stripped down experiments to specify, calibrate or validate ABM elements piecemeal. In the examples provided by round table participants, experiments corresponding closely to the ABM (and mixing real and simulated participants) seemed particularly valuable in tackling subjects that existing theory had not yet really nailed down or where it was clear that very little of the data needed for a particular ABM was available. But there was no sense that there is a clearly defined set of research designs with associated purposes on which the potential user can draw. (The possible role of experiments in supporting policy was also mentioned but no conclusions were drawn.)

Extracting Rich Data from Experiments

Traditional experiments are time consuming to do, so they are frequently optimised to obtain the maximum power and discrimination between factors of interest. In such situations they will often limit their data collection to what is strictly necessary for testing their hypotheses. Furthermore, it seems to be a hangover from behaviourist psychology that one does not use self-reporting on the grounds that it might be biased or simply involve false reconstruction (rationalisation). From the point of view of building or assessing ABMs this approach involves a wasted opportunity. Due to the flexible nature of ABMs there is a need for as many empirical constraints upon modelling as possible. These constraints can come from theory, evidence or abstract principles (such as simplicity); they should not hinder the design of an ABM but rather act as a check on its outcomes. Game-like situations can provide rich data about what is happening, simultaneously capturing decisions on action, the position and state of players, global game outcomes/scores and what players say to each other (see, for example, Janssen et al. 2010, Lindahl et al. 2021). Often, in social science one might have a survey with one set of participants, interviews with others and longitudinal data from yet others – even if these, in fact, involve the same people, the data will usually not indicate this through consistent IDs. When collecting data from a game (and especially from online games) there is a possibility for collecting linked data with consistent IDs – including interviews – that allows for a whole new level of ABM development and checking.
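As a rough illustration of what linked data with consistent IDs makes possible, the following sketch groups records from several sources under one participant ID. The field names (participant_id, harvest, stated_reason) and all the records are hypothetical and are not taken from any of the studies cited above.

```python
from collections import defaultdict

# Hypothetical records from one game session; every value here is invented.
actions = [
    {"participant_id": "P01", "round": 1, "harvest": 5},
    {"participant_id": "P02", "round": 1, "harvest": 9},
    {"participant_id": "P01", "round": 2, "harvest": 4},
]
chat = [
    {"participant_id": "P02", "round": 1, "text": "Let us all take less next round."},
]
interviews = [
    {"participant_id": "P01", "stated_reason": "I wanted the resource to last."},
]


def link_by_participant(**sources):
    """Group the records of several data sources under each participant ID."""
    linked = defaultdict(lambda: {name: [] for name in sources})
    for name, records in sources.items():
        for record in records:
            linked[record["participant_id"]][name].append(record)
    return dict(linked)


linked = link_by_participant(actions=actions, chat=chat, interviews=interviews)

# Observed behaviour and the participant's own account now sit side by side.
for participant_id, data in sorted(linked.items()):
    print(participant_id, len(data["actions"]), "actions,",
          len(data["chat"]), "chat messages,",
          len(data["interviews"]), "interviews")
```

With the data joined in this way, a modeller could compare what a participant did with what they said they were doing, which is not possible when the survey, interview and behavioural data come from unlinked samples.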

Standards and Institutional Bootstrapping

This is also a wider problem in newer methods like Agent-Based Modelling. How can we foster agreement about what we are doing (which has to build on clear concepts) and institutionalise those agreements into standards for a field (particularly when there is academic competition and pressure to publish)?^[v] If certain journals will not publish experiments (or experiments done in certain ways) what can we do about that? JASSS was started because it was so hard to publish ABMs. It has certainly made that easier but is there a cost through less publication in other journals? See, for example, Squazzoni and Casnici (2013). Would it have been better for the rigour and wider acceptance of Agent-Based Modelling if we had met the standards of other fields rather than setting our own? This strategy, harder in the short term, may also have promoted communication and collaboration better in the long term. If reviewing is arbitrary (reviewers do not seem to have a common view of what makes an experiment legitimate) then can that situation be improved (and in particular how do we best go about that with limited resources?) To some extent, normal individualised academic work may achieve progress here (researchers make proposals, dispute and refine them and their resulting quality ensures at least some individualised adoption by other researchers) but there is often an observable gap in performance: Even though most modellers will endorse the value of data for modelling in principle, most models are still non-empirical in practice (Angus and Hassani-Mahmooei 2015, Figure 9). The jury is still out on the best way to improve reviewer consistency, use the power of peer review to impose better standards (and thus resolve a collective action problem under academic competition^[vi]) and so on but recognising and trying to address these issues is clearly important to the health of experimental methods in Agent-Based Modelling.
Since running experiments in association with ABMs is already challenging, adding the problem of arbitrary reviewer standards makes the publication process even harder. This discourages scientists from following this path and therefore retards this kind of research generally. Again, here, useful resources (like the Psychological Science Accelerator, which facilitates greater experimental rigour by various means) were suggested in discussion as raw material for our own improvements to experiments in Agent-Based Modelling.

Another issue with newer methods such as Agent-Based Modelling is the path to legitimation before the wider scientific community. The need to integrate ABMs with experiments does not necessarily imply that the legitimation of the former is achieved by the latter. Experimental economists, for instance, may still argue that (in the investigation of behaviour and its implications for policy issues), experiments and data analysis alone suffice. They may rightly ask: What is the additional usefulness of an ABM? If an ABM always needs to be justified by an experiment and then validated by a statistical model of its output, then the method might not be essential at all. Orthodox economists skip the Agent-Based Modelling part: They build behavioural experiments, gather (rich) data, run econometric models and make predictions, without the need (at least as they see it) to build any computational representation. Of course, the usefulness of models lies in the premise that they may tell us something that experiments alone cannot (see Knudsen et al. 2019). But progress needs to be made in understanding (and perhaps reconciling) these divergent positions. The social simulation community therefore needs to be clearer about exactly what ABMs can contribute beyond the limitations of an experiment, especially when addressing audiences of non-modellers (Ballard et al. 2021). Not only is a model valuable when rigorously validated against data, but also whenever it makes sense of the data in ways that traditional methods cannot.

Where Now?

Researchers usually have more enthusiasm than they have time. In order to make things happen in an academic context it is not enough to have good ideas; people need to sign up and run with them. There are many things that stand a reasonable chance of improving the profile and practice of experiments in Agent-Based Modelling (regular sessions at SSC, systematic reviews, practical guidelines and evaluated case studies, discussion groups, books or journal special issues, training and funding applications that build networks and teams) but to a great extent, what happens will be decided by those who make it happen. The organisers of this round table (Nanda Wijermans and Edmund Chattoe-Brown) are very keen to support and coordinate further activity and this summary of discussions is the first step to promote that. We hope to hear from you.

References

Angus, Simon D. and Hassani-Mahmooei, Behrooz (2015) ‘“Anarchy” Reigns: A Quantitative Analysis of Agent-Based Modelling Publication Practices in JASSS, 2001-2012’, Journal of Artificial Societies and Social Simulation, 18(4), October, article 16, <http://jasss.soc.surrey.ac.uk/18/4/16.html>. doi:10.18564/jasss.2952

Ballard, Timothy, Palada, Hector, Griffin, Mark and Neal, Andrew (2021) ‘An Integrated Approach to Testing Dynamic, Multilevel Theory: Using Computational Models to Connect Theory, Model, and Data’, Organizational Research Methods, 24(2), April, pp. 251-284. doi: 10.1177/1094428119881209

Basole, Rahul C., Bodner, Douglas A. and Rouse, William B. (2013) ‘Healthcare Management Through Organizational Simulation’, Decision Support Systems, 55(2), May, pp. 552-563. doi:10.1016/j.dss.2012.10.012

Boero, Riccardo, Bravo, Giangiacomo, Castellani, Marco and Squazzoni, Flaminio (2010) ‘Why Bother with What Others Tell You? An Experimental Data-Driven Agent-Based Model’, Journal of Artificial Societies and Social Simulation, 13(3), June, article 6, <https://www.jasss.org/13/3/6.html>. doi:10.18564/jasss.1620

Bornstein, Brian H. (1999) ‘The Ecological Validity of Jury Simulations: Is the Jury Still Out?’ Law and Human Behavior, 23(1), February, pp. 75-91. doi:10.1023/A:1022326807441

Burton, Richard M. and Obel, Børge (2018) ‘The Science of Organizational Design: Fit Between Structure and Coordination’, Journal of Organization Design, 7(1), December, article 5. doi:10.1186/s41469-018-0029-2

Derbyshire, James (2020) ‘Answers to Questions on Uncertainty in Geography: Old Lessons and New Scenario Tools’, Environment and Planning A: Economy and Space, 52(4), June, pp. 710-727. doi:10.1177/0308518X19877885

Dinesen, Peter Thisted (2013) ‘Where You Come From or Where You Live? Examining the Cultural and Institutional Explanation of Generalized Trust Using Migration as a Natural Experiment’, European Sociological Review, 29(1), February, pp. 114-128. doi:10.1093/esr/jcr044

Doran, Jim (1998) ‘Simulating Collective Misbelief’, Journal of Artificial Societies and Social Simulation, 1(1), January, article 3, <https://www.jasss.org/1/1/3.html>.

Janssen, Marco A., Holahan, Robert, Lee, Allen and Ostrom, Elinor (2010) ‘Lab Experiments for the Study of Social-Ecological Systems’, Science, 328(5978), 30 April, pp. 613-617. doi:10.1126/science.1183532

Kagel, John H. and Roth, Alvin E. (eds.) (1995) The Handbook of Experimental Economics (Princeton, NJ: Princeton University Press).

Knudsen, Thorbjørn, Levinthal, Daniel A. and Puranam, Phanish (2019) ‘Editorial: A Model is a Model’, Strategy Science, 4(1), March, pp. 1-3. doi:10.1287/stsc.2019.0077

Levine, Gustav and Parkinson, Stanley (1994) Experimental Methods in Psychology (Hillsdale, NJ: Lawrence Erlbaum Associates).

Lindahl, Therese, Janssen, Marco A. and Schill, Caroline (2021) ‘Controlled Behavioural Experiments’, in Biggs, Reinette, de Vos, Alta, Preiser, Rika, Clements, Hayley, Maciejewski, Kristine and Schlüter, Maja (eds.) The Routledge Handbook of Research Methods for Social-Ecological Systems (London: Routledge), pp. 295-306. doi:10.4324/9781003021339-25

Ramanath, Ana Maria and Gilbert, Nigel (2004) ‘The Design of Participatory Agent-Based Social Simulations’, Journal of Artificial Societies and Social Simulation, 7(4), October, article 1, <https://www.jasss.org/7/4/1.html>.

Reypens, Charlotte and Levine, Sheen S. (2018) ‘Behavior in Behavioral Strategy: Capturing, Measuring, Analyzing’, in Behavioral Strategy in Perspective, Advances in Strategic Management Volume 39 (Bingley: Emerald Publishing), pp. 221-246. doi:10.1108/S0742-332220180000039016

Squazzoni, Flaminio and Casnici, Niccolò (2013) ‘Is Social Simulation a Social Science Outstation? A Bibliometric Analysis of the Impact of JASSS’, Journal of Artificial Societies and Social Simulation, 16(1), January, article 10, <http://jasss.soc.surrey.ac.uk/16/1/10.html>. doi:10.18564/jasss.2192

Taillandier, Patrick, Grignard, Arnaud, Marilleau, Nicolas, Philippon, Damien, Huynh, Quang-Nghi, Gaudou, Benoit and Drogoul, Alexis (2019) ‘Participatory Modeling and Simulation with the GAMA Platform’, Journal of Artificial Societies and Social Simulation, 22(2), March, article 3, <https://www.jasss.org/22/2/3.html>. doi:10.18564/jasss.3964

Tykhonov, Dmytro, Jonker, Catholijn, Meijer, Sebastiaan and Verwaart, Tim (2008) ‘Agent-Based Simulation of the Trust and Tracing Game for Supply Chains and Networks’, Journal of Artificial Societies and Social Simulation, 11(3), June, article 1, <https://www.jasss.org/11/3/1.html>.

Vuculescu, Oana (2017) ‘Searching Far Away from the Lamp-Post: An Agent-Based Model’, Strategic Organization, 15(2), May, pp. 242-263. doi:10.1177/1476127016669869

Zelditch, Morris Junior (2007) ‘Laboratory Experiments in Sociology’, in Webster, Murray Junior and Sell, Jane (eds.) Laboratory Experiments in the Social Sciences (New York, NY: Elsevier), pp. 183-197.

Notes

[i] This event was organised (and the resulting article was written) as part of “Towards Realistic Computational Models of Social Influence Dynamics” a project funded through ESRC (ES/S015159/1) by ORA Round 5 and involving Bruce Edmonds (PI) and Edmund Chattoe-Brown (CoI). More about SSC2021 (Social Simulation Conference 2021) can be found at https://ssc2021.uek.krakow.pl

[ii] This issue is actually very challenging for social science more generally. When considering interventions in social systems, knowing and acting might be so deeply intertwined (Derbyshire 2020) that interventions may modify the same behaviours that an experiment is aiming to understand.

[iii] In addition, experiments often require institutional ethics approval (but so do interviews, gaming activities and other sorts of empirical research of course), something with which non-empirical Agent-Based Modellers may have little experience.

[iv] Chattoe-Brown had interesting personal experience of this. He took part in a simple team gaming exercise about running a computer firm. The team quickly worked out that the game assumed an infinite return to advertising (so you could have a computer magazine consisting entirely of adverts) independent of the actual quality of the product. They thus simultaneously performed very well in the game from the perspective of an external observer but remained deeply sceptical that this was a good lesson to impart about running an actual firm. But since the coordinators never asked the team members for their subjective view, they may have assumed that the simulation was also a success in its didactic mission.

[v] We should also not assume it is best to set our own standards from scratch. It may be valuable to attempt integration with existing approaches, like qualitative validity (https://conjointly.com/kb/qualitative-validity/) particularly when these are already attempting to be multidisciplinary and/or to bridge the gap between, for example, qualitative and quantitative data.

[vi] Although journals also face such a collective action problem at a different level. If they are too exacting relative to their status and existing practice, researchers will simply publish elsewhere.

Dino Carpentras, Edmund Chattoe-Brown, Bruce Edmonds, Cesar García-Diaz, Christian Kammler, Anna Pagani and Nanda Wijermans (2021) Where Now For Experiments In Agent-Based Modelling? Report of a Round Table as Part of SSC2021. Review of Artificial Societies and Social Simulation, 2nd November 2021. https://rofasss.org/2021/11/02/round-table-ssc2021-experiments/

The Systematic Comparison of Agent-Based Policy Models – It’s time we got our act together!

May 11, 2021

By Mike Bithell and Bruce Edmonds

Model Intercomparison

The recent Covid crisis has led to a surge of new model development and a renewed interest in the use of models as policy tools. While this is in some senses welcome, the sudden appearance of many new models presents a problem in terms of their assessment, the appropriateness of their application and reconciling any differences in outcome. Even if they appear similar, their underlying assumptions may differ, their initial data might not be the same, policy options may be applied in different ways, stochastic effects explored to a varying extent, and model outputs presented in any number of different forms. As a result, it can be unclear what aspects of variations in output between models are results of mechanistic, parameter or data differences. Any comparison between models is made tricky by differences in experimental design and selection of output measures.

If we wish to do better, we suggest that a more formal approach to making comparisons between models would be helpful. However, it appears that this is not commonly undertaken in most fields in a systematic and persistent way, except for the field of climate change, and closely related fields such as pollution transport or economic impact modelling (although efforts are underway to extend such systematic comparison to ecosystem models – Wei et al., 2014, Tittensor et al., 2018). Examining the way in which this is done for climate models may therefore prove instructive.

Model Intercomparison Projects (MIP) in the Climate Community

Formal intercomparison of atmospheric models goes back at least to 1989 (Gates et al., 1999) with the first atmospheric model inter-comparison project (AMIP), initiated by the World Climate Research Programme. By 1999 this had contributions from all significant atmospheric modelling groups, providing standardised time-series of over 30 model variables for one particular historical decade of simulation, with a standard experimental setup. Comparisons of model mean values with available data helped to reveal overall model strengths and weaknesses: no single model was best at simulation of all aspects of the atmosphere, with accuracy varying greatly between simulations. The model outputs also formed a reference base for further inter-comparison experiments including targets for model improvement and reduction of systematic errors, as well as a starting point for improved experimental design, software and data management standards and protocols for communication and model intercomparison. This led to AMIP II and, subsequently, to a series of Climate model inter-comparison projects (CMIP) beginning with CMIP I in 1996. The latest iteration (CMIP 6) is a collection of 23 separate model intercomparison experiments covering atmosphere, ocean, land surface, geo-engineering, and the paleoclimate. This collection is aimed at the upcoming 2021 IPCC process (AR6). Participating projects go through an endorsement process for inclusion (a process agreed with modelling groups), based on 10 criteria designed to ensure some degree of coherence between the various models; a further 18 MIPs are also listed as currently active (https://www.wcrp-climate.org/wgcm-cmip/wgcm-cmip6). Groups contribute to a central set of common experiments covering the period 1850 to the near-present. An overview of the whole process can be found in Eyring et al. (2016).

The current structure includes a set of three overarching questions covering the dynamics of the earth system, model systematic biases and understanding possible future change under uncertainty. Individual MIPs may build on this to address one or more of a set of 7 “grand science challenges” associated with the climate. Modelling groups agree to provide outputs in a standard form, obtained from a specified set of experiments under the same design, and to provide standardised documentation to go with their models. Originally (up to CMIP 5), outputs were then added to a central public repository for further analysis. However, the output grew so large under CMIP 6 that the data is now dispersed over repositories maintained by separate groups.

Other Examples

Two further more recent examples of collective model development may also be helpful to consider.

Firstly, an informal network collating models across more than 50 research groups has already been generated as a result of the COVID crisis – the Covid Forecast Hub (https://covid19forecasthub.org). This is run by a small number of research groups collaborating with the US Centers for Disease Control and Prevention and is strongly focussed on the epidemiology. Participants are encouraged to submit weekly forecasts, and these are integrated into a data repository and can be visualised on the website – viewers can look at forward projections, along with associated confidence intervals and model evaluation scores, including those for an ensemble of all models. The focus on forecasts in this case arises out of the strong policy drivers for the current crisis, but the main point is that it is possible to immediately view measures of model performance and to compare the different model types: one clear message that rapidly becomes apparent is that many of the forward projections have 95% (and at some times, even 50%) confidence intervals for incident deaths that more than span the full range of the past historic data. The benefit of comparing many different models in this case is apparent, as many of the historic single-model projections diverge strongly from the data (and the models most in error are not consistently the same ones over time), although the ensemble mean tends to be better.
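The kind of evaluation described above (an ensemble mean, an error score per model and a check of how often the stated interval contains the outcome) can be sketched in a few lines. All the numbers below are invented for illustration and were chosen so that the ensemble happens to do well; they are not Covid Forecast Hub data, and the hub's own scoring methods may differ.

```python
from statistics import mean

# Hypothetical point forecasts of weekly incident deaths from three models.
forecasts = {
    "model_a": [90, 130, 150],
    "model_b": [120, 100, 120],
    "model_c": [110, 125, 160],
}
observed = [105, 118, 140]

# Hypothetical 95% interval bounds for one of the models.
lower_95 = [60, 70, 80]
upper_95 = [150, 170, 130]


def ensemble_mean(forecasts):
    """Average the forecasts of all models, week by week."""
    return [mean(week) for week in zip(*forecasts.values())]


def mean_absolute_error(predicted, actual):
    """Average size of the forecast error over all weeks."""
    return mean(abs(p - a) for p, a in zip(predicted, actual))


def interval_coverage(lower, upper, actual):
    """Fraction of weeks in which the observation falls inside the interval."""
    hits = [lo <= a <= hi for lo, hi, a in zip(lower, upper, actual)]
    return sum(hits) / len(hits)


ensemble = ensemble_mean(forecasts)
scores = {name: mean_absolute_error(f, observed) for name, f in forecasts.items()}
scores["ensemble"] = mean_absolute_error(ensemble, observed)
coverage = interval_coverage(lower_95, upper_95, observed)

for name, score in scores.items():
    print(f"{name}: mean absolute error {score:.2f}")
print(f"interval coverage: {coverage:.2f}")
```

In this contrived case the individual models err in different directions in different weeks, so their errors partly cancel in the ensemble mean. Whether that happens with real forecasts is an empirical matter, which is exactly what a shared archive allows one to check.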

As a second example, one could consider the Psychological Science Accelerator (PSA: Moshontz et al. 2018, https://psysciacc.org/). This is a collaborative network set up with the aim of addressing the “replication crisis” in psychology: many previously published results in psychology have proved problematic to replicate as a result of small or non-representative sampling or use of experimental designs that do not generalize well or have not been used consistently either within or across studies. The PSA seeks to ensure accumulation of reliable and generalizable evidence in psychological science, based on principles of inclusion, decentralization, openness, transparency and rigour. The existence of this network has, for example, enabled the reinvestigation of previous experiments but with much larger and less nationally biased samples (e.g. Jones et al. 2021).

The Benefits of the Intercomparison Exercises and Collaborative Model Building

More specifically, long-term intercomparison projects help to do the following.

Build on past effort. Rather than modellers re-inventing the wheel (or building a new framework) with each new model project, libraries of well-tested and documented models, with data archives, including code and experimental design, would allow researchers to work more efficiently on new problems, building on previous coding effort.
Aid replication. Focussed long-term intercomparison projects centred on model results with consistent standardised data formats would allow new versions of code to be quickly tested against historical archives to check whether expected results could be recovered and where differences might arise, particularly if different modelling languages were being used.
Help to formalize. While informal code archives can help to illustrate the methods or theoretical foundations of a model, intercomparison projects help to understand which kinds of formal model might be good for particular applications, and which can be expected to produce helpful results for given desired output measures.
Build credibility. A continuously updated set of model implementations and assessment of their areas of competence and lack thereof (as compared with available datasets) would help to demonstrate the usefulness (or otherwise) of ABM as a way to represent social systems.
Influence Policy (where appropriate). Formal international policy organisations such as the IPCC or the more recently formed IPBES are effective partly through an underpinning of well tested and consistently updated models. As yet it is difficult to see whether such a body would be appropriate or effective for social systems, as we lack the background of demonstrable accumulated and well tested model results.
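The replication point in the list above, testing a new version of the code against an archive of earlier results, could look something like the following minimal sketch. The output measures (infected, recovered) and their values are hypothetical. The check also assumes deterministic runs (for example with a fixed random seed); for stochastic models one would compare distributions over many runs instead.

```python
import math


def compare_to_archive(new_output, archived_output, rel_tol=1e-6):
    """Report the output measures whose new values differ from the archive."""
    differences = {}
    for measure, archived_series in archived_output.items():
        new_series = new_output.get(measure)
        if new_series is None:
            differences[measure] = "missing from new output"
        elif len(new_series) != len(archived_series):
            differences[measure] = "series length differs"
        else:
            # Collect the time steps at which the two versions disagree.
            steps = [
                t
                for t, (new, old) in enumerate(zip(new_series, archived_series))
                if not math.isclose(new, old, rel_tol=rel_tol)
            ]
            if steps:
                differences[measure] = f"differs at time steps {steps}"
    return differences


# Hypothetical archived results and the results of a re-implementation.
archived = {"infected": [1, 3, 9, 20], "recovered": [0, 0, 1, 4]}
reimplemented = {"infected": [1, 3, 9, 21], "recovered": [0, 0, 1, 4]}

report = compare_to_archive(reimplemented, archived)
print(report or "all archived results recovered")
```

A report of this kind shows not only whether the expected results were recovered but also where the two versions start to diverge, which is usually the first thing needed when tracking down a difference between implementations.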

Lessons for ABM?

What might we be able to learn from the above, if we attempted to use a similar process to compare ABM policy models?

In the first place, the projects started small and grew over time: it would not be necessary, for example, to cover all possible ABM applications at the outset. On the other hand, the latest CMIP iterations include a wide range of different types of model covering many different aspects of the earth system, so that the breadth of possible model types need not be seen as a barrier.

Secondly, the climate inter-comparison project has been persistent for some 30 years – over this time many models have come and gone, but the history of inter-comparisons allows for an overview of how well these models have performed over time – data from the original AMIP I models is still available on request, supporting assessments concerning long-term model improvement.

Thirdly, although climate models are complex – implementing a variety of different mechanisms in different ways – they can still be compared by use of standardised outputs, and at least some (although not necessarily all) have been capable of direct comparison with empirical data.

Finally, an agreed experimental design and public archive for documentation and output that is stable over time is needed; this needs to be done via a collective agreement among the modelling groups involved so as to ensure a long-term buy-in from the community as a whole, so that there is a consistent basis for long-term model development, building on past experience.
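To make the idea of standardised outputs concrete, here is a minimal sketch of how a submitted output record might be checked against an agreed format before being archived. The field names and types are purely hypothetical; an actual format would have to be agreed by the modelling groups involved.

```python
# Hypothetical agreed format: each output record must carry these fields.
REQUIRED_FIELDS = {
    "model_name": str,
    "scenario": str,
    "time_step": int,
    "variable": str,
    "value": float,
}


def validate_record(record):
    """Return a list of problems found in one output record (empty if none)."""
    problems = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], expected_type):
            problems.append(f"{field} should be {expected_type.__name__}")
    return problems


good = {"model_name": "abm_x", "scenario": "baseline", "time_step": 10,
        "variable": "infected", "value": 42.0}
bad = {"model_name": "abm_y", "time_step": 10, "variable": "infected", "value": 42}

print(validate_record(good))
print(validate_record(bad))
```

Models with very different internal mechanisms can then be compared, because the comparison only needs the agreed output records and not any knowledge of how each model produced them.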

The need for aligning or reproducing ABMs has long been recognised within the community (Axtell et al. 1996; Edmonds & Hales 2003), but on a one-to-one basis for verifying the specification of models against their implementation, although Hales et al. (2003) discuss a range of possibilities. However, this is far from a situation where many different models of basically the same phenomena are systematically compared – this would be a larger scale collaboration lasting over a longer time span.

The community has already established a standardised form of documentation in the ODD protocol. Sharing of model code is also becoming routine, and can be easily achieved through CoMSES, GitHub or similar. The sharing of data in a long-term archive may require more investigation. As a starting project COVID-19 provides an ideal opportunity for setting up such a model inter-comparison project – multiple groups already have running examples, and a shared set of outputs and experiments should be straightforward to agree on. This would potentially form a basis for forward looking experiments designed to assist with possible future pandemic problems, and a basis on which to build further features into the existing disease-focussed modelling, such as the effects of economic, social and psychological issues.

Additional Challenges for ABMs of Social Phenomena

Nobody supposes that modelling social phenomena is going to have the same set of challenges that climate change models face. Some of the differences include:

The availability of good data. Social science is bedevilled by a paucity of the right kind of data. Although an increasing amount of relevant data is being produced, there are commercial, ethical and data protection barriers to accessing it and the data rarely concerns the same set of actors or events.
The understanding of micro-level behaviour. Whilst the micro-level understanding of our atmosphere is very well established, that of the behaviour of the most important actors (humans) is not. However, better data might partially substitute for a generic behavioural model of decision-making.
Agreement upon the goals of modelling. Although there will always be considerable variation in terms of what is wanted from a model of any particular social phenomenon, a common core of agreed objectives will help focus any comparison and give confidence via ensembles of projections. Although the MIPs and Covid Forecast Hub are focussed on prediction, empirical explanation may be more important in other areas.
The available resources. ABM projects tend to be add-ons to larger endeavours and based around short-term grant funding. The funding for big ABM projects is yet to be established, not having the equivalent of weather forecasting to piggy-back on.
Persistence of modelling teams/projects. ABM tends to be quite short-term with each project developing a new model for a new project. This has made it hard to keep good modelling teams together.
Deep uncertainty. Whilst the set of possible factors and processes involved in a climate change model is well established, which social mechanisms need to be involved in a model of any particular social phenomenon is unknown. For this reason, there is deep disagreement about the assumptions to be made in such models, as well as sharp divergence in outcomes due to changes brought about by a particular mechanism that is not included in a model. Whilst uncertainty in known mechanisms can be quantified, assessing the impact of such deep uncertainty is much harder.
The sensitivity of the political context. Even in the case of Climate Change, where the assumptions made are relatively well understood and done on objective bases, the modelling exercise and its outcomes can be politically contested. In other areas, where the representation of people’s behaviour might be key to model outcomes, this will need even more care (Aodha & Edmonds 2017).

However, some of these problems were solved in the case of Climate Change as a result of the CMIP exercises and the reports they ultimately resulted in. Over time the development of the models also allowed for a broadening and updating of modelling goals, starting from a relatively narrow initial set of experiments. Ensuring the persistence of individual modelling teams is easier in the context of an internationally recognised comparison project, because resources may be easier to obtain, and there is a consistent central focus. The modelling projects became longer-term as individual researchers could establish a career doing just climate change modelling and the importance of the work was increasingly recognised. An ABM modelling comparison project might help solve some of these problems as the importance of its work is established.

Towards an Initial Proposal

The topic chosen for this project should be one where: (a) there is enough public interest to justify the effort, and (b) a number of models with a similar purpose in mind are being developed. At the current stage, this suggests dynamic models of COVID spread, but there are other possibilities, including: transport models (where people go and who they meet) or criminological models (where and when crimes happen).

Whichever ensemble of models is focussed upon, the models should be compared on a common core. Each model should:

Use the same start and end dates (but not necessarily the same temporal granularity)
Cover the same set of regions or cases
Use the same population data (though possibly enhanced with extra data and maybe scaled population sizes)
Start from the same initial conditions in terms of the population
Output a core of agreed measures (but maybe others as well)
Be checked against a core set of cases, using agreed data sets
Be reported on in a standard format (though with a discussion section for further/other observations)
Be well documented, with code that is open access
Be run a minimum number of times with different random seeds
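To make the idea of a shared core more concrete, here is a minimal Python sketch of how such a protocol might be written down and enforced by a common test harness. It is an illustration under stated assumptions, not a proposed standard: the `Protocol` fields, the `run_ensemble` function and the toy epidemic model (including its rates of 0.3 and 0.1) are all hypothetical stand-ins for whatever the participating teams would actually agree on.

```python
import random
import statistics
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Protocol:
    """Hypothetical shared experiment definition agreed by all teams."""
    start_day: int
    end_day: int
    population: int
    initial_infected: int
    core_measures: Tuple[str, ...]
    min_runs: int


Model = Callable[[Protocol, int], Dict[str, float]]


def toy_model(protocol: Protocol, seed: int) -> Dict[str, float]:
    """Stand-in for one team's model: a crude stochastic S-I-R process."""
    rng = random.Random(seed)  # all randomness flows from the given seed
    susceptible = protocol.population - protocol.initial_infected
    infected = protocol.initial_infected
    peak = infected
    for _ in range(protocol.start_day, protocol.end_day):
        # Arbitrary illustration rates: 0.3 transmission, 0.1 recovery.
        p_infection = min(1.0, 0.3 * infected / protocol.population)
        new_cases = sum(rng.random() < p_infection for _ in range(susceptible))
        recoveries = sum(rng.random() < 0.1 for _ in range(infected))
        susceptible -= new_cases
        infected += new_cases - recoveries
        peak = max(peak, infected)
    return {
        "total_infected": float(protocol.population - susceptible),
        "peak_infected": float(peak),
    }


def run_ensemble(
    model: Model, protocol: Protocol, seeds: List[int]
) -> Dict[str, Dict[str, float]]:
    """Run a model once per seed and summarise the agreed core measures."""
    if len(set(seeds)) < protocol.min_runs:
        raise ValueError("protocol requires more runs with distinct seeds")
    runs = [model(protocol, seed) for seed in seeds]
    for run in runs:
        missing = set(protocol.core_measures) - set(run)
        if missing:
            raise ValueError(f"model did not report core measures: {missing}")
    summary = {}
    for measure in protocol.core_measures:
        values = [run[measure] for run in runs]
        summary[measure] = {
            "mean": statistics.mean(values),
            "stdev": statistics.stdev(values),
        }
    return summary


protocol = Protocol(
    start_day=0, end_day=60, population=500, initial_infected=5,
    core_measures=("total_infected", "peak_infected"), min_runs=5,
)
summary = run_ensemble(toy_model, protocol, seeds=list(range(5)))
```

In this sketch each team would supply its own function in place of `toy_model`; the harness only checks that the agreed measures are reported and that enough distinct seeds were used, leaving the internal mechanisms of each model free to differ.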

Any modeller/team (commercial, government or academic) that had a suitable model and was willing to adhere to the rules would be welcome to participate, and these teams would collectively decide the rules and development and write any reports on the comparisons. Other interested stakeholder groups could be involved, including professional/academic associations, NGOs and government departments, but in a consultative role providing wider critique – it is important that the terms and reports from the exercise be independent of any particular interest or authority.

Conclusion

We call upon those who think ABMs have the potential to usefully inform policy decisions to work together, in order that the transparency and rigour of our modelling matches our ambition. Whilst model comparison exercises of the kind described are important for any simulation work, particular care needs to be taken when the outcomes can affect people’s lives.

References

Aodha, L. & Edmonds, B. (2017) Some pitfalls to beware when applying models to issues of policy relevance. In Edmonds, B. & Meyer, R. (eds.) Simulating Social Complexity – a handbook, 2nd edition. Springer, 801-822. (A version is at http://cfpm.org/discussionpapers/236)

Axtell, R., Axelrod, R., Epstein, J. M., & Cohen, M. D. (1996). Aligning simulation models: A case study and results. Computational & Mathematical Organization Theory, 1(2), 123-141. https://link.springer.com/article/10.1007%2FBF01299065

Edmonds, B., & Hales, D. (2003). Replication, replication and replication: Some hard lessons from model alignment. Journal of Artificial Societies and Social Simulation, 6(4), 11. http://jasss.soc.surrey.ac.uk/6/4/11.html

Eyring, V., Bony, S., Meehl, G. A., Senior, C. A., Stevens, B., Stouffer, R. J., & Taylor, K. E. (2016). Overview of the Coupled Model Intercomparison Project Phase 6 (CMIP6) experimental design and organization. Geoscientific Model Development, 9(5), 1937–1958. https://doi.org/10.5194/gmd-9-1937-2016

Gates, W. L., Boyle, J. S., Covey, C., Dease, C. G., Doutriaux, C. M., Drach, R. S., Fiorino, M., Gleckler, P. J., Hnilo, J. J., Marlais, S. M., Phillips, T. J., Potter, G. L., Santer, B. D., Sperber, K. R., Taylor, K. E., & Williams, D. N. (1999). An Overview of the Results of the Atmospheric Model Intercomparison Project (AMIP I). In Bulletin of the American Meteorological Society (Vol. 80, Issue 1, pp. 29–55). American Meteorological Society. https://doi.org/10.1175/1520-0477(1999)080<0029:AOOTRO>2.0.CO;2

Hales, D., Rouchier, J., & Edmonds, B. (2003). Model-to-model analysis. Journal of Artificial Societies and Social Simulation, 6(4), 5. http://jasss.soc.surrey.ac.uk/6/4/5.html

Jones, B. C., DeBruine, L. M., Flake, J. K. et al. (2021). To which world regions does the valence–dominance model of social perception apply? Nature Human Behaviour, 5, 159–169. https://doi.org/10.1038/s41562-020-01007-2

Moshontz, H. + 85 others (2018). The Psychological Science Accelerator: Advancing Psychology Through a Distributed Collaborative Network. Advances in Methods and Practices in Psychological Science, 1(4), 501-515. https://doi.org/10.1177/2515245918797607

Tittensor, D. P., Eddy, T. D., Lotze, H. K., Galbraith, E. D., Cheung, W., Barange, M., Blanchard, J. L., Bopp, L., Bryndum-Buchholz, A., Büchner, M., Bulman, C., Carozza, D. A., Christensen, V., Coll, M., Dunne, J. P., Fernandes, J. A., Fulton, E. A., Hobday, A. J., Huber, V., … Walker, N. D. (2018). A protocol for the intercomparison of marine fishery and ecosystem models: Fish-MIP v1.0. Geoscientific Model Development, 11(4), 1421–1442. https://doi.org/10.5194/gmd-11-1421-2018

Wei, Y., Liu, S., Huntzinger, D. N., Michalak, A. M., Viovy, N., Post, W. M., Schwalm, C. R., Schaefer, K., Jacobson, A. R., Lu, C., Tian, H., Ricciuto, D. M., Cook, R. B., Mao, J., & Shi, X. (2014). The North American Carbon Program Multi-Scale Synthesis and Terrestrial Model Intercomparison Project – Part 2: Environmental driver data. Geoscientific Model Development, 7(6), 2875–2893. https://doi.org/10.5194/gmd-7-2875-2014

Bithell, M. and Edmonds, B. (2021) The Systematic Comparison of Agent-Based Policy Models - It’s time we got our act together! Review of Artificial Societies and Social Simulation, 11th May 2021. https://rofasss.org/2021/05/11/SystComp/

Basic Modelling Hygiene – keep descriptions about models and what they model clearly distinct

May 22, 2020

By Bruce Edmonds

The essence of a model is that it relates to something else – what it models – even if this is only a vague or implicit mapping. Otherwise a model would be indistinguishable from any other computer code, set of equations, etc. (Hesse 1964; Wartofsky 1966). The centrality of this essence makes it unsurprising that many modellers seem to conflate the two.

This is made worse by three factors.

A strong version of Kuhn’s “Spectacles” (Kuhn 1962) where the researcher goes beyond using the model as a way of thinking about the world to projecting their model onto the world, so they see the world only through that “lens”. This effect seems to be much stronger for simulation modelling due to the intimate interaction that occurs over a period of time between modellers and their model.
It is a natural modelling heuristic to make the model more like what it models (Edmonds et al. 2019), introducing more elements of realism. This is especially strong with agent-based modelling, which lends itself to complication and descriptive realism.
It is advantageous to stress the potential connections between a model (however abstract) and possible application areas. It is common to start an academic paper with a description of a real-world issue to motivate the work being reported on; then (even if the work is entirely abstract and unvalidated) to suggest conclusions for what is observed. A lack of substantiated connections between the model and any empirical data can be covered up by slick passing from the world to the model and back again, and by a lack of clarity as to what the research achieves (Edmonds et al. 2019).

Whatever the reasons, the result is similar – the language used to describe entities, processes and outcomes in the model is the same as that used to describe what is intended to be modelled.

Such conflation is common in academic papers (albeit to different degrees). Expert modellers will not usually be confused by such language because they understand the modelling process and know what to look for in a paper. Thus one might ask, what is the harm of a little rhetoric and hype in the reporting of models? After all, we want modellers to be motivated and should thus be tolerant of their enthusiasm. To show the danger I will thus look at an example that talks about modelling aspects of ethnocentrism.

In their paper, entitled “The Evolutionary Dominance of Ethnocentric Cooperation”, Hartshorn, Kaznatcheev & Shultz (2013) further analyse the model described in (Hammond & Axelrod 2006). The authors have reimplemented the original model and extensively analysed it, especially its temporal dynamics. The paper is solely about the original model and its properties; there is no pretence of any validation or calibration with respect to any data. The problem is in the language used, because the language could equally well refer to the model and the real world.

Take the first sentence of its abstract: “Recent agent-based computer simulations suggest that ethnocentrism, often thought to rely on complex social cognition and learning, may have arisen through biological evolution”. This sounds like the simulation suggests something about the world we live in – that, as the title suggests, Ethnocentric cooperation naturally dominates other strategies (e.g. humanitarianism) and so it is natural. The rest of the abstract then goes on in the same sort of language which could equally apply to the model and the real world.

Expert modellers will understand that the authors were talking about the purely abstract properties of the model, but this will not be clear to other readers. In this case there is evidence that this is a real problem. This paper has, in recent years, shot to the top of page requests from the JASSS website (as of 22nd May 2020) at 162,469 requests over a 7-day period, but is nowhere in the top 50 articles in terms of JASSS-JASSS citations. Tracing where these requests come from leads to many alt-right and Russian web sites. It seems that many on the far right see this paper as confirmation of their Nationalist and Racist viewpoints. This is far more attention than a technical paper just about a model would get, so presumably they took it as confirmation about real-world conclusions (or were using it to fool others about the scientific support for their viewpoints) – namely that Ethnocentrism does beat Humanitarianism and this is an evolutionary inevitability [note 1].

This is an extreme example of the confusion that occurs when non-expert modellers read many papers on modelling. Modellers too often imply a degree of real-world relevance when this is not justified by their research. They often imply real-world conclusions before any meaningful validation has been done. As agent-based simulation reaches a less specialised audience, this will become more important.

Some suggestions to avoid this kind of confusion:

After the motivation section, carefully outline what part this research will play in the broader programme – do not leave this implicit or imply a larger role than is justified
Add in the phrase “in the model” frequently in the text, even if this is a bit repetitive [note 2]
Keep discussions about the real world in different sections from those that discuss the model
Have an explicit statement of what the model can reliably say about the real world
Use different terms when referring to parts of the model and parts of the real world (e.g. actors for real-world individuals, agents in the model)
Be clear about the intended purpose of the model – what can be achieved as a result of this research (Edmonds et al. 2019) – for example, do not imply the model will be able to predict future real world properties until this has been demonstrated (de Matos Fernandes & Keijzer 2020)
Be very cautious in what you conclude from your model – make sure this is what has been already achieved rather than a reflection of your aspirations (in fact it might be better to not mention such hopes at all until they are realised)

Notes

1. To see that this kind of conclusion is not necessary see (Hales & Edmonds 2019).
2. This is similar to a campaign to add the words “in mice” in reports about medical “breakthroughs” (https://www.statnews.com/2019/04/15/in-mice-twitter-account-hype-science-reporting)

Acknowledgements

Bruce Edmonds is supported as part of the ESRC-funded, UK part of the “ToRealSim” project, grant number ES/S015159/1.

References

Edmonds, B., et al. (2019) Different Modelling Purposes, Journal of Artificial Societies and Social Simulation 22(3), 6. <http://jasss.soc.surrey.ac.uk/22/3/6.html>. doi:10.18564/jasss.3993

Hammond, R. A. and Axelrod, R. (2006). The Evolution of Ethnocentrism. Journal of Conflict Resolution, 50(6), 926–936. doi:10.1177/0022002706293470

Hartshorn, Max, Kaznatcheev, Artem and Shultz, Thomas (2013) The Evolutionary Dominance of Ethnocentric Cooperation, Journal of Artificial Societies and Social Simulation 16(3), 7. <http://jasss.soc.surrey.ac.uk/16/3/7.html>. doi:10.18564/jasss.2176

Hesse, M. (1964). Analogy and confirmation theory. Philosophy of Science, 31(4), 319-327.

Kuhn, T. S. (1962). The Structure of Scientific Revolutions. Univ. of Chicago Press.

de Matos Fernandes, C. A. and Keijzer, M. A. (2020) No one can predict the future: More than a semantic dispute. Review of Artificial Societies and Social Simulation, 15th April 2020. https://rofasss.org/2020/04/15/no-one-can-predict-the-future/

Wartofsky, M. (1966). The Model Muddle – Proposals for an Immodest Realism. Journal of Philosophy, 63(19), 589-589.

Edmonds, B. (2020) Basic Modelling Hygiene - keep descriptions about models and what they model clearly distinct. Review of Artificial Societies and Social Simulation, 22nd May 2020. https://rofasss.org/2020/05/22/modelling-hygiene/
