
Delusional Generality – how models can give a false impression of their applicability even when they lack any empirical foundation

May 6, 2024 thesubmissionauthor Leave a comment

By Bruce Edmonds¹, Dino Carpentras², Nick Roxburgh³, Edmund Chattoe-Brown⁴ and Gary Polhill³

¹ Centre for Policy Modelling, Manchester Metropolitan University
² Computational Social Science, ETH Zurich
³ James Hutton Institute, Aberdeen
⁴ University of Leicester

“Hamlet: Do you see yonder cloud that’s almost in shape of a camel?
Polonius: By the mass, and ‘tis like a camel, indeed.
Hamlet: Methinks it is like a weasel.
Polonius: It is backed like a weasel.
Hamlet: Or like a whale?
Polonius: Very like a whale.”

Models and Generality

The essence of a model is that it represents – if it is not a model of something it is not a model at all (Zeigler 1976, Wartofsky 1979). A random bit of code or set of equations is not a model. The point of a model is that one can use it to infer or understand some aspects of what it represents. However, models can represent a variety of kinds of things in a variety of ways (Edmonds et al. 2019) – they can represent ideas, correspond to data, or capture aspects of other models, and they can represent each of these in either a vague or precise manner. To completely understand a model – its construction, properties and working – one needs to understand how it does this mapping. This piece focuses attention on this mapping, rather than the internal construction of models.

What a model reliably represents may be a single observed situation, but it might satisfactorily represent more than one such situation. The range of situations that the model satisfactorily represents is called the “scope” of the model (what is “satisfactory” depending on the purpose for which the model is being used). The more extensive the scope, the more “general” we say the model is. A model that only represents one case has no generality at all and may be more in the nature of a description.

There is a hunger for general accounts of social phenomena (let us call these ‘theories’). However, this hunger is often frustrated by the sheer complexity and ‘messiness’ involved in such phenomena. If every situation we observe is essentially different, then no such theory is possible. However, we hope that this is not the case for the social world and, indeed, informal observation suggests that there is at least some commonality between situations – in other words, that some kind of reliable generalisation about social phenomena might be achievable, however modest (Merton 1968). This piece looks at two kinds of applicability – analogical applicability and empirical applicability – and critiques those who conflate them. Although the expertise of the authors is in the agent-based modelling of social phenomena, and so we restrict our discussion to this, we strongly suspect that our arguments are true for many kinds of modelling across a range of domains.

In the next sections we contrast two uses for models: as analogies (ways of thinking about observed systems) and as representations of empirical data in a more precise way. There are, of course, other uses of models, such as exploring theory, which have nothing to do with anything observed.

Models used as analogies

Analogical applicability comes from the flexibility of the human mind in interpreting accounts in terms of different situations. When we encounter a new situation, the account is mapped onto it – the account being used as an analogy for understanding this situation. Such accounts are typically in the form of a narrative, but a model can also be used as an analogy (which is the case we are concerned with here). The flexibility with which this mapping can be constructed means that such an account can be related to a wide range of phenomena. Such analogical mapping can lead to an impression that the account has a wide range of applicability. Analogies are a powerful tool for thinking since they may give us some insights into otherwise novel situations. There are arguments that analogical thinking is a fundamental aspect of human thought (Hofstadter 1995) and language (Lakoff 2008). We can construct and use analogical mappings so effortlessly that they seem natural to us. The key thing about analogical thinking is that the mapping from the analogy to the situation to which it is applied is re-invented each time – there is no fixed relationship between the analogy and what it might be applied to. We are so good at doing this that we may not be aware of how different the constructed mapping is each time. However, its flexibility comes at a cost, namely that because there is no well-defined relationship with what it applies to, the mapping tends to be more intuitive than precise. An analogy can give insights but analogical reasoning suggests rather than establishes anything reliably and you cannot empirically test it (since analogical mappings can be adjusted to avoid falsification). Such “ways of thinking” might be helpful, but equally might be misleading [Note 1].

Just because the content of an analogy might be expressed formally does not change any of this (Edmonds 2018). In fact, formally expressed analogies might give the impression of being applicable, but often are only related to anything observed via ideas – the model relates to some ideas, and the ideas relate to reality (Edmonds 2000). Using models as analogies is a valid use of models but it is not an empirically reliable one (Edmonds et al. 2019). Arnold (2013) makes a powerful argument that many of the more abstract simulation models are of this variety and simply not relatable to empirically observed cases and data at all – although these give the illusion of wide applicability, that applicability is not empirical. In physics the ways of thinking about atomic or subatomic entities have changed over time whilst the mathematically-expressed, empirically-relevant models have not (Hartman 1997). Although Thompson (2022) concentrates on mathematically formulated models, she also distinguishes between well-validated empirical models and those that just encapsulate the expertise/opinion of the modeller. She gives some detailed examples of where the latter kind had disproportionate influence, beyond that of other expertise, just because it was in the form of a model (e.g. the economic impact of climate change).

An example of an analogical model is described in Axelrod (1984) – a formalised tournament where algorithmically-expressed strategies are pitted against each other, playing the iterated prisoner’s dilemma game. It is shown how the ‘tit for tat’ strategy can survive against many other mixes of strategies (static or evolving). In the book, the purpose of the model is to suggest a new way of thinking about the evolution of cooperation. The book claims the idea ‘explains’ many observed phenomena, but this is in an analogical manner – no precise relationship with any observed measurements is described. There is no validation of the model here or in the more academic paper that described these results (Axelrod & Hamilton 1981).

Of course, researchers do not usually call their models “analogies” or “analogical” explicitly but tend to use other phrasings that imply a greater importance. An exception is Epstein (2008) where it is explicitly listed as one of the 15 modelling purposes, other than prediction, that he discusses. Here he says such models are “…more than beautiful testaments to the unifying power of models: they are headlights in dark unexplored territory.” (ibid.) thus suggesting their use in thinking about phenomena where we do not already have reliable empirical models. Anything that helps us think about such phenomena could be useful, but that does not mean they are at all reliable. As Herbert Simon said: “Metaphor and analogy can be helpful, or they can be misleading.” (Simon 1962, p. 467).

Another purpose listed in Epstein (2008) is to “Illuminate core dynamics”. After raising the old chestnut that “All models are wrong”, he goes on to justify them on the grounds that “…they capture qualitative behaviors of overarching interest”. This is fine if the models are, in fact, known to be useful as more than vague analogies [Note 2] – that they do, in some sense, approximate observed phenomena – but this is not the case with novel models that have not been empirically tested. This phrase is more insidious, because it implies that the dynamics that have been illuminated by the model are “core” – some kind of approximation of what is important about the phenomena, allowing for future elaborations to refine the representation. This implies a process where an initially rough idea is iteratively improved. However, this is premature because we do not know if what has been abstracted away in the abstract model was essential to the dynamics of the target phenomena or not without empirical testing – this is just assumed or asserted based on the intuitions of the modeller.

This idea of the “core dynamics” leads to some paradoxical situations – where a set of competing models are all deemed to be core. Indeed, the literature has shown how the same phenomenon can be modelled in many contrasting ways. For instance, political polarisation has been modelled through models with mechanisms for repulsion, bounded confidence, reinforcement, or even just random fluctuations, to name a few (Flache et al., 2017; Banisch & Olbrich 2019; Carpentras et al. 2022). However, it is likely that only a few of them contribute substantially to the political polarisation we observe in the real world, so that the others are not a real “core dynamic”. Until we have more empirical work, we do not know which are core and which are not.

A related problem with analogical models is that, even when relying on parsimony principles [Note 3], it is not possible to decide which model is better. This aspect, combined with the constant production of new models, can make the relevant literature increasingly difficult to navigate as models proliferate without any empirical selection, especially for researchers new to ABM. Furthermore, most analogical models define their object of study in an imprecise manner so that it is hard to evaluate whether they are even intended to capture elements of any particular observed situation. For example, opinion dynamics models rarely define the type of interaction they represent (e.g. in person vs online) or even what an opinion is. This has led to cases where even knowledge of facts has been studied as “opinions” (e.g. Chacoma & Zanette, 2015).

In summary, analogical models can be a useful tool to start thinking about complex phenomena. However, the danger with them is that they give an impression of progress but result in more confusion than clarity, possibly slowing down scientific progress. Once one has some possible insights, one needs to confront these with empirical data to determine which are worth further investigation.

Models that relate directly to empirical data

An empirical model, in contrast, has a well-defined way of mapping to the phenomena it represents. For example, the variables of the gas laws (volume, temperature and pressure) are measured using standard methods developed over a long period of time; one does not invent a new way of doing this each time the laws are applied. In this case, the ways of measuring these properties have developed alongside the mathematical models of the laws so that these work reliably under broad (and well known) conditions and cannot be adjusted at the whim of a modeller. Empirical generality arises when a model applies reliably to many different situations – in the case of the gas laws, to a wide range of materials in gaseous form to a high degree of accuracy.

Empirical models can be used for different purposes, including: prediction, explanation and description (Edmonds et al. 2019). Each of these maps the model to empirical data in a different way, reflecting its purpose. With a descriptive model the mapping is one-way from empirical data to the model to justify the different parts. In a predictive model, the initial model setup is determined from known data and the model is then run to get its results. These results are then mapped back to what we might expect as a prediction, which can be later compared to empirically measured values to check the model’s validity. An explanatory model supports a complex explanation of some known outcomes in terms of a set of processes, structures and parameter values. When it is shown that the outcomes of such a model sufficiently match those from the observed data, the model represents a complex chain of causation that would result in that data in terms of the processes, structures and parameter values it comprises. It thus supports an explanation in terms of the model and its input of what was observed. In each of these three cases the mapping from empirical data to the model happens in a different order and maybe in a different direction; however, they all depend upon the mapping being well defined.

Cartwright (1983), studying how physics works, distinguished between explanatory and phenomenological laws – the former explains but does not necessarily relate exactly to empirical data (such as when we fit a line to data using regression), whilst the latter fits the data but does not necessarily explain (like the gas laws). Thus the jobs of theoretical explanation and empirical prediction are done by different models or theories (often calling the explanatory version “theory” and the empirical versions “models”). However, in physics the relationship between the two is, itself, examined so that the “bridging laws” between them are well understood, especially in formal terms. In this case, we attribute reliable empirical meaning to the explanatory theories to the extent that the connection to the data is precise, even though it is done via the intermediary of a “phenomenological” model, because both mappings (explanatory↔phenomenological and phenomenological↔empirical data) are precise and well established. The point is that the total mapping from model or theory to empirical data is not subject to interpretation or post-hoc adjustment to improve its fit.

ABMs are often quite complicated and require many parameters or other initialising input to be specified before they can be run. If some of these are not empirically determinable (even in principle) then these might be guessed at using a process of “calibration”, that is, searching the space of possible initialisations for values for which some measured outcomes of the results match other empirical data. If the model has been separately shown to be empirically reliable then one could do such a calibration to suggest what these input values might have been. Such a process might establish that the model captures a possible explanation of the fitted outcomes (in terms of the model plus those backward-inferred input values), but this is not a very strong relationship, since many models are very flexible and so could fit a wide range of possible outcomes. The reliability of such a suggested explanation, supported by the model, is only relative to (a) the empirical reliability of any theory or other assumptions the model is built upon, (b) how flexibly the model outcomes can be adjusted to fit the target data and (c) how precise the choice of outcome measures and the fit are. Thus, calibration does not provide strong evidence of the empirical adequacy of an ABM and any explanation supported by such a procedure is only relative to the ‘wiggle room’ afforded by free parameters and unknown input data as well as any assumptions used in the making of the model. However, empirical calibration is better than none and may empirically fix the context in which theoretical exploration occurs – showing that the model is, at least, potentially applicable to the case being considered [Note 4].
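The ‘wiggle room’ point can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and comes from no model discussed here: `toy_model` is a hypothetical two-parameter growth process, `calibrate` is a plain grid search, and the target values are made-up “observations”. It shows only that a model with free inputs can be calibrated to match very different outcomes, so a close fit on its own says little.

```python
import itertools


def toy_model(rate, capacity, steps=200):
    """Hypothetical toy model: discrete logistic growth; returns the final level."""
    x = 1.0
    for _ in range(steps):
        x += rate * x * (1.0 - x / capacity)
    return x


def calibrate(target, rates, capacities):
    """Grid-search the free inputs for the setting whose output is closest to target."""
    best = min(itertools.product(rates, capacities),
               key=lambda p: abs(toy_model(*p) - target))
    return best, abs(toy_model(*best) - target)


def fittable_fraction(targets, rates, capacities, tolerance=1e-3):
    """Share of candidate outcomes that calibration reproduces within tolerance."""
    hits = sum(calibrate(t, rates, capacities)[1] <= tolerance for t in targets)
    return hits / len(targets)


# Assumed search ranges for the two free (not empirically determined) inputs.
RATES = [0.1 * k for k in range(1, 6)]
CAPACITIES = list(range(1, 101))

if __name__ == "__main__":
    # Three very different made-up "observed" outcomes: each can be fitted.
    for target in (17, 42, 88):
        params, error = calibrate(target, RATES, CAPACITIES)
        print(target, params, error)
    # Fraction of a whole range of possible outcomes the model could be fitted to.
    print(fittable_fraction(range(5, 100, 5), RATES, CAPACITIES))
```

The larger the fraction of conceivable outcomes a model can be tuned to reproduce, the less a successful calibration tells us about its empirical adequacy.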

An example of a model that is strongly grounded in empirical data is the “538” model of the US electoral college for presidential elections (Silver 2012). This is not an ABM but more like a micro-simulation. It aggregates the uncertainty from polling data to make probabilistic predictions about what this means for the outcomes. The structure of the model comes directly from the rules of the electoral college, the inputs are directly derived from the polling data and it makes predictions about the results that can be independently checked. It does a very specific, but useful job, in translating the uncertainty of the polling data into the uncertainty about the outcome.
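To make the idea of translating polling uncertainty into outcome uncertainty concrete, here is a minimal Monte Carlo sketch. It is not the “538” model: the state names, electoral votes, polled margins and standard errors are all invented, and the assumption that polling errors are independent and normally distributed is a simplification chosen only to keep the example short.

```python
import random

# Hypothetical inputs per state:
# (electoral votes, polled margin for candidate A in points, standard error)
STATES = {
    "Alpha": (10, 4.0, 3.0),
    "Beta": (20, -1.0, 3.0),
    "Gamma": (15, 0.5, 4.0),
    "Delta": (5, -6.0, 3.0),
}


def simulate_win_probability(states, n_runs=10_000, seed=0):
    """Estimate the probability that candidate A wins a majority of electoral votes."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    total = sum(votes for votes, _, _ in states.values())
    needed = total // 2 + 1    # simple majority rule
    wins = 0
    for _ in range(n_runs):
        votes_for_a = 0
        for votes, margin, std_error in states.values():
            # Draw one plausible "true" margin given the polling uncertainty.
            if rng.gauss(margin, std_error) > 0:
                votes_for_a += votes
        if votes_for_a >= needed:
            wins += 1
    return wins / n_runs


if __name__ == "__main__":
    print(simulate_win_probability(STATES))
```

The structure (winner-takes-all votes per state, a majority threshold) comes from the rules being modelled and the inputs come from data, so the resulting probability can be checked against what later happens.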

Why this matters

If people did not confuse the analogical and empirical cases, there would not be a problem. However, researchers seem to suffer from a variety of “Kuhnian Spectacles” (Kuhn 1962) – namely that because they view their target systems through an analogical model, they tend to think that this is how that system actually is – i.e. that the model has not just analogical but also empirical applicability. This is understandable: we use many layers of analogy to navigate our world and in many everyday cases it is practical to conflate our models with the reality we deal with (when they are very reliable). However, people who claim to be scientists are under an obligation to be more cautious and precise than this, since others might wish to rely upon our theories and models (this is, after all, why they support us in our privileged position). Yet such caution is not always exercised. There are cases where modellers declare their enterprise a success even after a long period without any empirical backing, making a variety of excuses instead of coming clean about this lack (Arnold 2015).

Another fundamental aspect is that agent-based models can be very interdisciplinary and, because of that, they can also be used by researchers in different fields. However, many fields do not consider models as simple analogies, especially when they provide precise mathematical relationships among variables. This can easily result in confusion where the analogical applicability of ABMs is interpreted as empirical in another field.

Of course, we may be hopeful that, sometime in the future, our vague or abstract analogical model may be developed into something with proven empirical abilities, but we should not suggest such empirical abilities until these have been established. Furthermore, we should be particularly careful to ensure that non-modellers understand that this possibility is only a hope and not imply anything otherwise (e.g. imply that it is likely to have empirical validity). However, we suspect that in many cases this confusion goes beyond optimistic anticipation and that some modellers conflate analogical with empirical applicability, assuming that their model is basically right just because it seems that way to them. This is what we call “delusional generality” – that a researcher is under the impression that their model has a wide applicability (or potentially wide applicability) due to the attractiveness of the analogy it presents. In other words, unaware of the unconscious process of re-inventing the mapping to each target system, they imagine (without further justification) that it has some reliable empirical (or potentially empirical) generality at its core [Note 5].

Such confusion can have severe real-world consequences if a model with only analogical validity is assumed to also have some empirical reliability. Thompson (2022) discusses how abstract economic models of the cost of future climate change did affect the debate about the need for prevention and mitigation, even though they had no empirical validity. However, agent-based modellers have also made the same mistake, with a slew of completely unvalidated models about COVID affecting public debate about policy (Squazzoni et al. 2020).

Conclusion

All of the above discussion raises the question of how we might achieve reliable models with even a moderate level of empirical generality in the social sciences. This is a tricky question of scientific strategy, which we are not going to answer here [Note 6]. However, we question whether the approach of making “heroic” jumps from phenomena to abstract non-empirical models on the sole basis of their plausibility to their authors will be a productive route when the target is complex phenomena, such as socio-cognitive systems (Dignum, Edmonds and Carpentras 2022). Certainly, that route has not yet been empirically demonstrated.

Whatever the best strategy is, there is a lot of theoretical modelling in the field of social simulation that assumes or implies that it is the precursor for empirical applicability and not a lot of critique about the extent of empirical success achieved. The assumption seems to be that abstract theory is the way to make progress understanding social phenomena but, as we argue here, this is largely wishful thinking – the hope that such models will turn out to have empirical generality being a delusion. Furthermore, this approach has substantive deleterious effects in terms of encouraging an explosion of analogical models without any process of selection (Edmonds 2010). It seems that the ‘famine’ of theory about social phenomena with any significant level of generality is so severe, that many seem to give credence to models they might otherwise reject – constructing their understanding using models built on sand.

Notes

1. There is some debate about the extent to which analogical reasoning works, what kind of insights it results in and under what circumstances (Hofstadter 1995). However, all we need for our purposes is that: (a) it does not reliably produce knowledge, (b) the human mind is exceptionally good at ‘fitting’ analogies to new situations (adjusting the mapping to make it ‘work’ somehow) and (c) due to this ability analogies can be far more convincing than the analogical reasoning warrants.

2. In pattern-oriented modelling (Grimm et al. 2005) models are related to empirical evidence in a qualitative (pattern-based) manner, for example to some properties of a distribution of numeric outcomes. In this kind of modelling, a precise numerical correspondence is replaced by a set of qualitative correspondences in many different dimensions. Here the empirical relevance of a model is established on the basis that it is too hard to simultaneously fit a model to evidence in this way, thus ruling that out as a source of its correspondence with that evidence.

3. So-called “parsimony principles” are a very unreliable manner of evaluating competing theories on grounds other than convenience or that of using limited data to justify the values of parameters (Edmonds 2007).

4. For many models, a vague argument for their plausibility is often all that is offered to show that they are applicable to the cases being discussed. At least calibration demonstrates a model’s empirical applicability, rather than simply assuming it.

5. We are applying the principle of charity here, assuming that such conflations are innocent and not deliberate. However, there is increasing pressure from funding agencies to demonstrate ‘real life relevance’ so some of these apparent confusions might be more like ‘spin’ – trying to give an impression of empirical relevance even when this is merely an aspiration, in order to suggest that their model has more significance than they have reliably established.

6. This has been discussed elsewhere, e.g. (Moss & Edmonds 2005).

Acknowledgements

Thanks to all those we have discussed these issues with, including Scott Moss (who was talking about these kinds of issue more than 30 years ago), Eckhart Arnold (who made many useful comments and whose careful examination of the lack of empirical success of some families of model demonstrates our mostly abstract arguments), Sven Banisch and other members of the ESSA special interest group on “Strongly Empirical Modelling”.

References

Arnold, E. (2013). Simulation models of the evolution of cooperation as proofs of logical possibilities. How useful are they? Ethics & Politics, XV(2), pp. 101-138. https://philpapers.org/archive/ARNSMO.pdf

Arnold, E. (2015) How Models Fail – A Critical Look at the History of Computer Simulations of the Evolution of Cooperation. In Misselhorn, C. (Ed.): Collective Agency and Cooperation in Natural and Artificial Systems. Explanation, Implementation and Simulation, Philosophical Studies Series, Springer, pp. 261-279. https://eckhartarnold.de/papers/2015_How_Models_Fail

Axelrod, R. (1984) The Evolution of Cooperation, Basic Books.

Axelrod, R. & Hamilton, W.D. (1981) The evolution of cooperation. Science, 211, 1390-1396. https://www.science.org/doi/abs/10.1126/science.7466396

Banisch, S., & Olbrich, E. (2019). Opinion polarization by learning from social feedback. The Journal of Mathematical Sociology, 43(2), 76-103. https://doi.org/10.1080/0022250X.2018.1517761

Carpentras, D., Maher, P. J., O’Reilly, C., & Quayle, M. (2022). Deriving An Opinion Dynamics Model From Experimental Data. Journal of Artificial Societies & Social Simulation, 25(4). http://doi.org/10.18564/jasss.4947

Cartwright, N. (1983) How the Laws of Physics Lie. Oxford University Press.

Chacoma, A. & Zanette, D. H. (2015). Opinion formation by social influence: From experiments to modelling. PloS ONE, 10(10), e0140406. https://doi.org/10.1371/journal.pone.0140406

Dignum, F., Edmonds, B. and Carpentras, D. (2022) Socio-Cognitive Systems – A Position Statement. Review of Artificial Societies and Social Simulation, 2nd Apr 2022. https://rofasss.org/2022/04/02/scs

Edmonds, B. (2000). The Use of Models – making MABS actually work. In S. Moss and P. Davidsson (eds.) Multi Agent Based Simulation. Berlin, Springer-Verlag. 1979: 15-32. http://doi.org/10.1007/3-540-44561-7_2

Edmonds, B. (2007) Simplicity is Not Truth-Indicative. In Gershenson, C. et al. (eds.) Philosophy and Complexity. World Scientific, pp. 65-80.

Edmonds, B. (2010) Bootstrapping Knowledge About Social Phenomena Using Simulation Models. Journal of Artificial Societies and Social Simulation, 13(1), 8. http://doi.org/10.18564/jasss.1523

Edmonds, B. (2018) The “formalist fallacy”. Review of Artificial Societies and Social Simulation, 11th June 2018. https://rofasss.org/2018/07/20/be/

Edmonds, B., le Page, C., Bithell, M., Chattoe-Brown, E., Grimm, V., Meyer, R., Montañola-Sales, C., Ormerod, P., Root, H. & Squazzoni, F. (2019) Different Modelling Purposes. Journal of Artificial Societies and Social Simulation, 22(3):6. http://doi.org/10.18564/jasss.3993

Epstein, J. M. (2008). Why Model?. Journal of Artificial Societies and Social Simulation, 11(4),12. https://www.jasss.org/11/4/12.html

Flache, A., Mäs, M., Feliciani, T., Chattoe-Brown, E., Deffuant, G., Huet, S. & Lorenz, J. (2017). Models of social influence: Towards the next frontiers. Journal of Artificial Societies and Social Simulation, 20(4), 2. http://doi.org/10.18564/jasss.3521

Grimm, V., Revilla, E., Berger, U., Jeltsch, F., Mooij, W.M., Railsback, S.F., et al. (2005). Pattern-oriented modeling of agent-based complex systems: lessons from ecology. Science, 310 (5750), 987–991. https://www.jstor.org/stable/3842807

Hartman, S. (1997) Modelling and the Aims of Science. 20th International Wittgenstein Symposium, Kirchberg am Wechsel.

Hofstadter, D. (1995) Fluid Concepts and Creative Analogies. Basic Books.

Kuhn, T. S. (1962). The structure of scientific revolutions. University of Chicago Press.

Lakoff, G. (2008). Women, fire, and dangerous things: What categories reveal about the mind. University of Chicago Press.

Merton, R.K. (1968). On the Sociological Theories of the Middle Range. In Classical Sociological Theory, Calhoun, C., Gerteis, J., Moody, J., Pfaff, S. and Virk, I. (Eds), Blackwell, pp. 449–459.

Meyer, R. & Edmonds, B. (2023). The Importance of Dynamic Networks Within a Model of Politics. In: Squazzoni, F. (eds) Advances in Social Simulation. ESSA 2022. Springer Proceedings in Complexity. Springer. (Earlier, open access, version at: https://cfpm.org/discussionpapers/292)

Moss, S. and Edmonds, B. (2005). Towards Good Social Science. Journal of Artificial Societies and Social Simulation, 8(4), 13. https://www.jasss.org/8/4/13.html

Squazzoni, F. et al. (2020) ‘Computational Models That Matter During a Global Pandemic Outbreak: A Call to Action’ Journal of Artificial Societies and Social Simulation 23(2):10. http://doi.org/10.18564/jasss.4298

Silver, N. (2012) The Signal and the Noise: Why So Many Predictions Fail – But Some Don’t. Penguin.

Simon, H. A. (1962). The architecture of complexity. Proceedings of the American Philosophical Society, 106(6), 467-482. https://www.jstor.org/stable/985254

Thompson, E. (2022). Escape from Model Land: How mathematical models can lead us astray and what we can do about it. Basic Books.

Wartofsky, M. W. (1979). The model muddle: Proposals for an immodest realism. In Models (pp. 1-11). Springer, Dordrecht.

Zeigler, B. P. (1976). Theory of Modeling and Simulation. Wiley Interscience, New York.

Edmonds, B., Carpentras, D., Roxburgh, N., Chattoe-Brown, E. and Polhill, G. (2024) Delusional Generality – how models can give a false impression of their applicability even when they lack any empirical foundation. Review of Artificial Societies and Social Simulation, 7 May 2024. https://rofasss.org/2024/05/06/delusional-generality

© The authors under the Creative Commons’ Attribution-NoDerivs (CC BY-ND) Licence (v4.0)

Content

Is The Journal of Artificial Societies and Social Simulation Parochial? What Might That Mean? Why Might It Matter?

September 10, 2022 thesubmissionauthor 3 Comments

By Edmund Chattoe-Brown

Introduction

The Journal of Artificial Societies and Social Simulation (hereafter JASSS) retains a distinctive position amongst journals publishing articles on social simulation and Agent-Based Modelling. Many journals have published a few Agent-Based Models, some have published quite a few but it is hard to name any other journal that predominantly does this and has consistently done so over two decades. Using Web of Science on 25.07.22, there are 5540 hits including the search term <“agent-based model”> anywhere in their text. JASSS does indeed have the most of any single journal with 268 hits (5% of the total to the nearest integer). The basic search returns about 200 distinct journals and about half of these have 10 hits or fewer. Since this search is arranged by hit count, this means that the unlisted journals have even fewer hits than those listed, i.e. fewer than 7 per journal. This supports the claim that the great majority of journals have very limited engagement with Agent-Based Modelling. Note that the point here is to evidence tendencies effectively and not to claim that this specific search term tells us the precise relative frequency of articles on the subject of Agent-Based Modelling in different journals.

This being so, it seems reasonable – and desirable for other practical reasons like being entirely open access, online and readily searchable – to use JASSS as a sample – though clearly not necessarily a representative sample – of what may be happening in Agent-Based Modelling more generally. This is the case study approach (Yin 2009) where smaller samples may be practically unavoidable to discuss richer or more complex phenomena like the actual structures of arguments rather than something quantitative like, say, the number of sources cited by each article.

This piece is motivated by the scepticism that some reviewers have displayed about such a case study approach focused on JASSS and conclusions drawn from it. It is actually quite strange to have the editors and reviewers of a journal argue against its ability to tell us anything useful about wider Agent-Based Modelling research even as a starting point (particularly since this approach has been used in articles previously published in the journal, see for example, Meyer et al. 2009 and Hauke et al. 2017). Of course, it is a given that different journals have unique editorial policies, distinct reviewer pools and so on. However, this may mean, for example, that journals only irregularly publishing Agent-Based Models are actually less typical because it is more arbitrary who reviews for them and there may therefore be less reviewing skill and consensus about the value of articles involved. Anecdotally, I have found this to be true in medical journals where excellent articles rub shoulders with much more problematic ones in a small overall pool. The point of my argument is not to claim that JASSS can really stand in for ABM research as a whole – which it plainly cannot – but that, if the case study approach is to be accepted at all, JASSS is one of the few journals that successfully qualifies for it on empirically justifiable grounds. Conversely, given the potentially distinctive character of journals and the wide spread of Agent-Based Modelling, attempts at representative sampling may be very challenging in resource terms.

Method and Results

Again, using Web of Science on 04.07.22, I searched for the most highly cited articles containing the string “opinion dynamics”. I am well aware that this will not capture all articles that actually have opinion dynamics as their subject matter but this is not the intention. The intention is to describe a reproducible and measurable procedure correlated with the importance of articles so my results can be checked, criticised and extended. Comparing results based on other search terms would be part of that process. Then I took the first ten distinct journals that could be identified from this set of articles in order of citation count. The idea here was to see what journals had published the most important articles in the field overall – at least as identified by this particular search term – and then follow up their coverage of opinion dynamics generally. In addition, for each journal, I accessed the top 50 most cited articles and then checked how many articles containing the string “opinion dynamics” featured in that top 50. The idea here was to assess the extent to which opinion dynamics articles were important to the impact of a particular journal. Table 1 shows the results of this analysis.

Journal Title	“opinion dynamics” Articles in the Top 50 Most Cited	Most Highly Cited “opinion dynamics” Article Citations	Number of Articles Containing the String “opinion dynamics”
Reviews of Modern Physics	0	2380	1
JASSS	6	1616	64
International Journal of Modern Physics C	4	376	72
Dynamic Games and Applications	1	338	5
Physical Review Letters	0	325	5
Global Challenges	1	272	1
IEEE Transactions on Automatic Control	0	269	38
SIAM Review	0	258	2
Central European Journal of Operations Research	1	241	1
Physica A: Statistical Mechanics and Its Applications	0	231	143

Table 1. The Coverage, Commitment and Importance of Different Journals in Regard to “opinion dynamics”: Top Ten by Citation Count of Most Influential Article.

This list attempts to provide two somewhat separate assessments of a journal with regard to “opinion dynamics”. The first is whether it has a substantial body of articles on the topic: Coverage. The second is whether, by the citation levels of the journal generally, “opinion dynamics” models are important to it: Commitment. These journals have been selected on a third dimension, their ability to contribute at least one very influential article to the literature as a whole: Importance.
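The three measures can be made concrete with a small sketch. It assumes a hypothetical list of article records (journal, citation count and a flag for whether the article contains the search string). The names `Article` and `journal_profile` are illustrative inventions, not part of Web of Science and not a record of how the searches reported here were actually run.

```python
from dataclasses import dataclass


@dataclass
class Article:
    journal: str
    citations: int
    on_topic: bool  # True if the article contains the search string


def journal_profile(articles, journal, top_n=50):
    """Return Coverage, Commitment and Importance for one journal."""
    own = [a for a in articles if a.journal == journal]
    topical = [a for a in own if a.on_topic]
    # The journal's most cited articles, whatever their topic.
    most_cited = sorted(own, key=lambda a: a.citations, reverse=True)[:top_n]
    return {
        # Coverage: how many articles on the topic the journal has.
        "coverage": len(topical),
        # Commitment: how many of its most cited articles are on the topic.
        "commitment": sum(1 for a in most_cited if a.on_topic),
        # Importance: citations of its most cited article on the topic.
        "importance": max((a.citations for a in topical), default=0),
    }


# Invented toy records, purely for illustration.
toy = [
    Article("Journal A", 900, True),
    Article("Journal A", 120, False),
    Article("Journal A", 40, True),
    Article("Journal B", 300, False),
    Article("Journal B", 15, True),
]
profile_a = journal_profile(toy, "Journal A", top_n=2)
profile_b = journal_profile(toy, "Journal B", top_n=1)
```

In this toy data Journal A scores on all three measures, while Journal B has some coverage but no commitment, which is the kind of contrast the tables are meant to bring out.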

The resulting patterns are interesting in several ways. Firstly, JASSS appears unique in this sample in being a clearly social science journal rather than a physical science journal or one dealing with instrumental problems like operations research or automatic control. An interesting corollary question is how many “opinion dynamics” models in a physics journal will have been reviewed by social scientists, or at least by modellers with a social science orientation. This is part of a wider question about whether, for example, physics journals are mainly interested in these models as formal systems rather than as having likely application to real societies. Secondly, 3 journals out of 10 have only a single “opinion dynamics” article – and a further journal has only 2 – which are, nonetheless, extremely highly cited relative to such articles as a whole. It is unclear whether this “only one but what a one” pattern has any wider significance. It should also be noted that the most highly cited article in JASSS is four times more highly cited than the next most cited. Only 4 of these journals out of 10 could really be said to have a usable sample of such articles for case study analysis. Thirdly, only 2 journals out of 10 have a significant number of articles sufficiently important that they appear in the top 50 most cited and 5 journals have no “opinion dynamics” articles in their top 50 most cited at all. This makes the point that a journal can have good coverage of the topic and contain at least one highly cited article without “opinion dynamics” necessarily being a commitment of the journal.

Thus it seems that to be a journal contributing at least one influential article to the field as a whole, to have several articles that are amongst the most cited by that journal and to have a non-trivial number of articles overall is unusual. Only one other journal in the top 10 meets all three criteria (International Journal of Modern Physics C). This result is corroborated in Table 2 which carries out the same analysis for all additional journals containing at least one highly cited “opinion dynamics” article (with an arbitrary cut off of at least 100 citations for that article). There prove to be fourteen such journals in addition to the ten above.

Journal Title	“opinion dynamics” Articles in the Top 50 Most Cited	Most Highly Cited “opinion dynamics” Article Citations	Number of Articles Containing the String “opinion dynamics”
Mathematics of Operations Research	1	215	2
Information Sciences	0	186	14
Physica D: Nonlinear Phenomena	0	182	4
Journal of Complex Networks	1	177	5
Annual Reviews in Control	2	165	4
Information Fusion	0	154	11
IEEE Transactions on Control of Network Systems	3	151	12
Automatica	0	141	32
Public Opinion Quarterly	0	132	5
Physical Review E	0	129	74
SIAM Journal on Control and Optimization	0	127	13
Europhysics Letters	0	116	3
Knowledge-Based Systems	0	112	5
Scientific Reports	0	111	26

Table 2. The Coverage, Commitment and Importance of Different Journals in Regard to “opinion dynamics”: All Remaining Distinct Journals whose most important “opinion dynamics” article receives at least 100 citations.

Table 2 confirms the dominance of physical science journals and those solving instrumental problems as opposed to those evidently dealing with the social sciences, although a few terms like complex networks are ambiguous in this regard. Further, it confirms the scarcity of journals that simultaneously contribute at least one influential article to the wider field, have a sensibly sized sample of articles on this topic – so that provisional but nonetheless empirical hypotheses might be derived from a case study – and have “opinion dynamics” articles in their top 50 most cited articles as a sign of the importance of the topic to the journal and its readers. To some extent, however, the latter confirmation is an unavoidable artefact of the sampling strategy. As the most cited article becomes less highly cited, the chance it will appear in the top 50 most cited for a particular journal will almost certainly fall unless the journal is very new or generally not highly cited.

As a third independent check, I again used Web of Science to identify all journals which had – somewhat arbitrarily – at least 30 articles on “opinion dynamics”, giving some sense of their contribution. Only two more journals (see Table 3) not already occurring in the two tables above were identified. Generally, this analysis considers only journal articles and not conference proceedings and book chapter serials whose peer review status is less clear/comparable.

Journal Title	“opinion dynamics” Articles in the Top 50 Most Cited	Most Highly Cited “opinion dynamics” Article Citations	Number of Articles Containing the String “opinion dynamics”
Advances in Complex Systems	5	54	42
PLOS ONE	0	53	32

Table 3. The Coverage, Commitment and Importance of Different Journals: All Journals with at Least 30 “opinion dynamics” hits not already listed in Tables 1 and 2.

This cross check shows that while the additional journals do have a sample of articles large enough to form the basis for a case study, they either have not yet contributed a really influential article to the wider field (their most cited articles have less than half the number of citations of those in the journals which qualify for Tables 1 and 2), do not have a high commitment to opinion dynamics (in terms of impact within the journal and among its readers) or both.

Before concluding this analysis, it is worth briefly reflecting on what these three criteria jointly tell us – though other criteria could also be used in further research. By sampling on highly cited articles we focus on journals that have managed to go beyond their core readership and influence the field as a whole. There is a danger that journals that have never done this are merely “talking to themselves” and may therefore form a less effective basis for a case study speaking to the field as a whole. By attending to the number of articles in the top 50 for the journal, we get a sense of whether the topic is central (or only peripheral) to that journal/its readership and, again, journals where the topic is central stand a chance of being better case studies than those where it is peripheral. The criterion of having enough articles is simply a practical one for conducting a meaningful case study. Researchers using different methods may disagree about how many instances you need to draw useful conclusions but there is general agreement that it is more than one!

Analysis and Conclusions

The present article was motivated by an attempt to evaluate the claim that JASSS may be parochial and therefore not constitute a suitable basis for provisional hypotheses generated by case study analysis of its articles. Although the argument presented here is clearly rough and ready – and could be improved on by subsequent researchers – it does not appear to support this claim. JASSS actually seems to be one of very few journals – arguably the only social science one – that simultaneously has made at least one really influential contribution to the wider field of opinion dynamics, has a large enough number of articles on the topic for plausible generalisation and has quite a few such articles in its top 50, which shows the importance of the topic to the journal and its wider readership. Unless one wishes to reject case study analysis altogether, there are – in fact – very few other journals on which it can effectively be done for this topic.

But actually, my main conclusion is a wider reflection on peer reviewing, sampling and scientific progress based on reviewer resistance to the case study approach. There are 1386 articles with the search term “opinion dynamics” in Web of Science as of 25.07.22. It is clearly not realistic for one article – or even one book – to analyse all that content, particularly qualitatively. This being so, we have to consider what is practical and adequate to generate hypotheses suitable for publication and further development of research along these lines. Case studies of single journals are not the only strategy but do have a recognised academic tradition in methodology (Brown 2008). We could sample randomly from the population of articles but I have never yet seen such a qualitative analysis based on sampling and it is not clear whether it would be any better received by potential reviewers. (In particular, with many journals each having only a few examples of Agent-Based Models, realistically low sampling rates would leave many journals unrepresented altogether which would be a problem if they had distinctive approaches.) Most journals – including JASSS – have word limits and this restricts how much you can report. Qualitative analysis is more drawn-out than quantitative analysis which limits this research style further in terms of practical sample sizes. Both reading whole articles for analysis and writing up the resulting conclusions take more resources of time and word count. As long as one does not claim that a qualitative analysis from JASSS can stand for all Agent-Based Modelling – but is merely a properly grounded hypothesis for further investigation – and shows one’s working properly to support that further investigation, it isn’t really clear why that shouldn’t be sufficient for publication. Particularly as I have now shown that JASSS isn’t notably parochial along several potentially relevant dimensions.
If a reviewer merely conjectures that your results won’t generalise, isn’t the burden of proof then on them to do the corresponding analysis and publish it? Otherwise the danger is that we are setting conjecture against actual evidence – however imperfect – and this runs the risk of slowing scientific progress by favouring research compatible with traditionally approved perspectives in publication. It might be useful to revisit the everyday idea of burden of proof in assessing the arguments of reviewers. What does it take in terms of evidence and argument (rather than simply power) for a comment by a reviewer to scientifically require an answer? It is a commonplace that a disproved hypothesis is more valuable to science than a mere conjecture or something that cannot be proven one way or another. One reason for this is that scientific procedure illustrates methodological possibility as well as generating actual results. A sample from JASSS may not stand for all research but it shows how a conclusion might ultimately be reached for all research if the resources were available and the administrative constraints of academic publishing could be overcome.

As I have argued previously (Chattoe-Brown 2022), and has now been pleasingly illustrated (Keijzer 2022), this situation may create an important and distinctive role for RofASSS. It may be valuable to get hypotheses, particularly ones that potentially go against the prevailing wisdom, “out there” so they can subsequently be tested more rigorously rather than having to wait until the framer of the hypothesis can meet what may be a counsel of perfection from peer reviewers. Another issue with reviewing is a tendency to say what will not do rather than what will do. This rather puts the author at the mercy of reviewers during the revision process. RofASSS can also be used to hive off “contextual” analyses – like this one regarding what it might mean for a journal to be parochial – so that they can be developed in outline for the general benefit of the Agent-Based Modelling community – rather than having to add length to specific articles depending on the tastes of particular reviewers.

Finally, as should be obvious, I have only suggested that JASSS is not parochial in regard to articles involving the string “opinion dynamics”. However, I have also illustrated how this kind of analysis could be done systematically for different topics to justify the claim that a particular journal can serve as a reasonable basis for a case study.

Acknowledgements

This analysis was funded by the project “Towards Realistic Computational Models Of Social Influence Dynamics” (ES/S015159/1) funded by ESRC via ORA Round 5.

References

Brown, Patricia Anne (2008) ‘A Review of the Literature on Case Study Research’, Canadian Journal for New Scholars in Education/Revue Canadienne des Jeunes Chercheures et Chercheurs en Éducation, 1(1), July, pp. 1-13, https://journalhosting.ucalgary.ca/index.php/cjnse/article/view/30395.

Chattoe-Brown, E. (2022) ‘If You Want to Be Cited, Don’t Validate Your Agent-Based Model: A Tentative Hypothesis Badly in Need of Refutation’, Review of Artificial Societies and Social Simulation, 1st Feb 2022. https://rofasss.org/2022/02/01/citing-od-models

Hauke, Jonas, Lorscheid, Iris and Meyer, Matthias (2017) ‘Recent Development of Social Simulation as Reflected in JASSS Between 2008 and 2014: A Citation and Co-Citation Analysis’, Journal of Artificial Societies and Social Simulation, 20(1), 5. https://www.jasss.org/20/1/5.html. doi:10.18564/jasss.3238

Keijzer, M. (2022) ‘If You Want to be Cited, Calibrate Your Agent-Based Model: Reply to Chattoe-Brown’, Review of Artificial Societies and Social Simulation, 9th Mar 2022. https://rofasss.org/2022/03/09/Keijzer-reply-to-Chattoe-Brown

Meyer, Matthias, Lorscheid, Iris and Troitzsch, Klaus G. (2009) ‘The Development of Social Simulation as Reflected in the First Ten Years of JASSS: A Citation and Co-Citation Analysis’, Journal of Artificial Societies and Social Simulation, 12(4), 12. https://www.jasss.org/12/4/12.html.

Yin, R. K. (2009) Case Study Research: Design and Methods, fourth edition (Thousand Oaks, CA: Sage).

Chattoe-Brown, E. (2022) Is The Journal of Artificial Societies and Social Simulation Parochial? What Might That Mean? Why Might It Matter? Review of Artificial Societies and Social Simulation, 10th Sept 2022. https://rofasss.org/2022/09/10/is-the-journal-of-artificial-societies-and-social-simulation-parochial-what-might-that-mean-why-might-it-matter/

© The authors under the Creative Commons’ Attribution-NoDerivs (CC BY-ND) Licence (v4.0)

If you want to be cited, calibrate your agent-based model: A Reply to Chattoe-Brown

March 9, 2022 thesubmissionauthor 1 Comment

By Marijn A. Keijzer

This is a reply to a previous comment (Chattoe-Brown 2022).

The social simulation literature has called on its proponents to enhance the quality and realism of their contributions through systematic validation and calibration (Flache et al., 2017). Model validation typically refers to assessments of how well the predictions of agent-based models (ABMs) map onto empirically observed patterns or relationships. Calibration, on the other hand, is the process of enhancing the realism of the model by parametrizing it based on empirical data (Boero & Squazzoni, 2005). We would expect that presenting a validated or calibrated model serves as a signal of model quality, and would thus be a desirable characteristic of a paper describing an ABM.

In a recent contribution to RofASSS, Edmund Chattoe-Brown provocatively argued that model validation does not bear fruit for researchers interested in boosting their citations. In a sample of articles from JASSS published on opinion dynamics he observed that “the sample clearly divides into non-validated research with more citations and validated research with fewer” (Chattoe-Brown, 2022). Well aware of the bias and limitations of the sample at hand, Chattoe-Brown calls for refutation of his hypothesis. An analysis of the corpus of articles in Web of Science, presented here, could serve that goal.

To test whether there exists an effect of model calibration and/or validation on the citation counts of papers, I compare citation counts of a larger number of original research articles on agent-based models published in the literature. I extracted 11,807 entries from Web of Science by searching for items that contained the phrases “agent-based model”, “agent-based simulation” or “agent-based computational model” in their abstracts.^[1] I then labeled all items that mention “validate” in their abstracts as validated ABMs and those that mention “calibrate” as calibrated ABMs. This measure is rather crude, of course, as descriptions containing phrases like “we calibrated our model” or “others should calibrate our model” are both labeled as calibrated models. However, if mentioning that future research should calibrate or validate the model is not related to citation counts (which I would argue it indeed is not), then this inaccuracy does not introduce bias.
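The labelling step can be sketched in a few lines. This is a minimal illustration, not the code used for the analysis (which is linked under "Data, code and supplementary analyses"). Matching on the stems "validat" and "calibrat" is an assumption made here so that forms such as "validation" are also caught, and the function names and example abstracts are invented.

```python
import re

# Assumption: match on word stems so that "validated", "validation",
# "calibrate", "calibration" and so on all count as mentions.
VALIDATION = re.compile(r"validat", re.IGNORECASE)
CALIBRATION = re.compile(r"calibrat", re.IGNORECASE)


def label_abstract(abstract):
    """Flag an abstract as mentioning validation and/or calibration."""
    return {
        "validated": VALIDATION.search(abstract) is not None,
        "calibrated": CALIBRATION.search(abstract) is not None,
    }


def share(labels, key):
    """Fraction of labelled abstracts for which the given flag is True."""
    return sum(1 for lab in labels if lab[key]) / len(labels)


# Invented abstracts, purely for illustration.
abstracts = [
    "We calibrated our agent-based model with survey data.",
    "Others should calibrate and validate this agent-based simulation.",
    "A purely theoretical agent-based model of opinion change.",
    "The agent-based model is validated against observed patterns.",
]
labels = [label_abstract(a) for a in abstracts]
both = sum(1 for lab in labels if lab["validated"] and lab["calibrated"])
```

The second invented abstract shows the crudeness discussed above: it only recommends calibration and validation to others, yet it is labelled as both calibrated and validated.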

The shares of entries that mention calibration or validation are somewhat small. Overall, just 5.62% of entries mention validation, 3.21% report a calibrated model and 0.65% fall in both categories. The large sample size, however, will still enable the execution of proper statistical analysis and hypothesis testing.

How are mentions of calibration and validation in the abstract related to citation counts at face value? Bivariate analyses show only minor differences, as revealed in Figure 1. In fact, the distribution of citations for validated and non-validated ABMs (panel A) is remarkably similar. Wilcoxon tests with continuity correction—the nonparametric version of the simple t test—corroborate their similarity (W = 3,749,512, p = 0.555). The differences in citations between calibrated and non-calibrated models appear, albeit still small, more pronounced. Calibrated ABMs are cited slightly more often (panel B), as also supported by a bivariate test (W = 1,910,772, p < 0.001).
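For readers unfamiliar with the test, the following is a minimal standard-library sketch of a two-sided Wilcoxon rank-sum test using the normal approximation with continuity correction and midranks for ties. It is written from the textbook definition and is not the code behind the figures above. The function name and the toy samples are assumptions for illustration only.

```python
import math


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test (normal approximation).

    Returns (W, p), where W is the rank sum of x minus its minimum
    possible value, and p uses a continuity correction of 0.5.
    """
    pooled = sorted((v, g) for g, sample in enumerate((x, y)) for v in sample)
    n = len(pooled)
    ranks = [0.0] * n
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2  # average of ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = midrank
        t = j - i
        tie_term += t ** 3 - t
        i = j
    n1, n2 = len(x), len(y)
    rank_sum_x = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    w = rank_sum_x - n1 * (n1 + 1) / 2
    diff = w - n1 * n2 / 2
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:  # every observation tied: no evidence of a difference
        return w, 1.0
    correction = 0.5 if diff > 0 else -0.5 if diff < 0 else 0.0
    z = (diff - correction) / math.sqrt(variance)
    p = math.erfc(abs(z) / math.sqrt(2))
    return w, p


# Invented citation counts, purely for illustration.
w_low, p_low = rank_sum_test([1, 2, 3], [4, 5, 6])
w_same, p_same = rank_sum_test([1, 2, 3], [1, 2, 3])
```

With samples this small an exact test would normally be preferred. The approximation is shown because it is the variant that suits large samples such as the one analysed here.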

Figure 1. Distributions of number of citations of all the entries in the dataset for validated (panel A) and calibrated (panel B) ABMs and their averages with standard errors over years (panels C and D)

Age of the paper might be a more important determinant of citation counts, as panels C and D of Figure 1 suggest. Clearly, the age of a paper should be important here, because older papers have had much more opportunity to get cited. In particular, papers younger than 10 years seem to not have matured enough for their citation rates to catch up to older articles. When comparing the citation counts of purely theoretical models with calibrated and validated versions, this covariate should not be missed, because the latter two are typically much younger. In other words, the positive relationship between model calibration/validation and citation counts could be hidden in the bivariate analysis, as model calibration and validation are recent trends in ABM research.

I run a Poisson regression on the number of citations as explained by whether papers are validated, whether they are calibrated, and whether they are both (the interaction of the two). The age of the paper is taken into account, as well as the number of references that the paper uses itself (controlling for reciprocity and literature embeddedness, one might say). Finally, the fields in which the papers have been published, as registered by Web of Science, have been added to account for potential differences between fields that could explain both citation counts and conventions about model calibration and validation.
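To make the model structure concrete, here is a from-scratch sketch of a Poisson regression with a log link, fitted by Newton's method using only the standard library. It is an illustration under stated assumptions rather than the analysis code: it omits the age, reference and field controls, computes no standard errors, and the data and function names are invented.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / m[r][r]
    return x


def fit_poisson(rows, counts, iterations=25):
    """Fit log E[count] = b0 + b1*x1 + ... and return [b0, b1, ...]."""
    design = [[1.0] + [float(v) for v in row] for row in rows]
    p = len(design[0])
    # Start from the intercept-only fit, which keeps Newton's method stable.
    beta = [math.log(sum(counts) / len(counts))] + [0.0] * (p - 1)
    for _ in range(iterations):
        mu = [math.exp(sum(b * x for b, x in zip(beta, xi))) for xi in design]
        score = [sum(xi[j] * (y - m) for xi, y, m in zip(design, counts, mu))
                 for j in range(p)]
        info = [[sum(xi[j] * xi[k] * m for xi, m in zip(design, mu))
                 for k in range(p)] for j in range(p)]
        step = solve_linear(info, score)
        beta = [b + s for b, s in zip(beta, step)]
    return beta


# Invented data: (validated, calibrated) flags and citation counts.
flags = [(0, 0), (0, 0), (1, 0), (1, 0), (0, 1), (0, 1), (1, 1), (1, 1)]
citations = [4, 6, 3, 5, 5, 7, 9, 11]
# Predictors: validated, calibrated, and their interaction.
rows = [[v, c, v * c] for v, c in flags]
beta = fit_poisson(rows, citations)
```

Because this toy model has one parameter per group, the fitted coefficients simply reproduce the logs of the group means. With additional covariates, as in the analysis described above, that shortcut no longer applies and the iterative fit is needed.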

Table 1 presents the results from the three models with just the main effects of validation and calibration (model 1), the interaction of validation and calibration (model 2) and the full model with control variables (model 3).

Table 1. Poisson regression on the number of citations

	# Citations
	(1)	(2)	(3)

Validated	-0.217^***	-0.298^***	-0.094^***
	(0.012)	(0.014)	(0.014)
Calibrated	0.171^***	0.064^***	0.076^***
	(0.014)	(0.016)	(0.016)
Validated x Calibrated		0.575^***	0.244^***
		(0.034)	(0.034)
Age			0.154^***
			(0.0005)
Cited references			0.013^***
			(0.0001)
Field included	No	No	Yes
Constant	2.553^***	2.556^***	0.337^**
	(0.003)	(0.003)	(0.164)

Observations	11,807	11,807	11,807
AIC	451,560	451,291	301,639

Note:	^*p<0.1; ^**p<0.05; ^***p<0.01

The results from the analyses clearly suggest a negative effect of model validation and a positive effect of model calibration on the likelihood of being cited. The hypothesis that was so “badly in need of refutation” (Chattoe-Brown, 2022) will remain unrefuted for now. The effect does turn positive, however, when the abstract makes mention of calibration as well. In both the controlled (model 3) and uncontrolled (model 2) analyses, combining the effects of validation and calibration yields a positive coefficient overall.^[2]
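The arithmetic behind the combined effect can be checked directly from the coefficients in Table 1. The sketch below adds the validation, calibration and interaction coefficients for models 2 and 3 and exponentiates the sum. Reading the result as a multiplicative change in expected citations relative to a paper mentioning neither is the usual interpretation of a Poisson model with a log link, which is assumed to apply here. The variable and function names are illustrative.

```python
import math

# Coefficients copied from Table 1 (the two models with an interaction term).
COEFFICIENTS = {
    "model 2": {"validated": -0.298, "calibrated": 0.064, "both": 0.575},
    "model 3": {"validated": -0.094, "calibrated": 0.076, "both": 0.244},
}


def combined_log_effect(coefs):
    """Log effect for an abstract mentioning validation and calibration."""
    return coefs["validated"] + coefs["calibrated"] + coefs["both"]


def rate_ratio(log_effect):
    """Multiplicative change in expected citations under a log link."""
    return math.exp(log_effect)


combined = {name: combined_log_effect(c) for name, c in COEFFICIENTS.items()}
ratios = {name: rate_ratio(value) for name, value in combined.items()}
validated_only = {name: rate_ratio(c["validated"])
                  for name, c in COEFFICIENTS.items()}
```

The sums come to about 0.341 for model 2 and about 0.226 for model 3, so the combined coefficient is positive in both, whereas validation on its own corresponds to a rate ratio below one.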

The controls in model 3 substantially affect the estimates of the three main factors of interest, while remaining in expected directions themselves. The age of a paper indeed helps its citation count, and so does the number of papers the item cites itself. The fields, furthermore, take away from the main effects somewhat, too, but not to a problematic degree. In an additional analysis, I have looked at the relationship between the fields and whether they are more likely to publish calibrated or validated models and found no substantial relationships. Citation counts will differ between fields, however. The papers in our sample are more often cited in, for example, hematology, emergency medicine and thermodynamics. The ABMs in the sample coming from toxicology, dermatology and religion are on the unlucky side of the equation, receiving fewer citations on average. Finally, I have also looked at papers published in JASSS specifically, due to the interest of Chattoe-Brown and the nature of this outlet. Surprisingly, the same analyses run on the subsample of these papers (N=376) showed a negative relationship between citation counts and model calibration/validation. Does the JASSS readership reveal its taste for artificial societies?

In sum, I find support for the hypothesis of Chattoe-Brown (2022) on the negative relationship between model validation and citation counts for papers presenting ABMs. If you want to be cited, you should not validate your ABM. Calibrated ABMs, on the other hand, are more likely to receive citations. What is more, ABMs that were both calibrated and validated are the most successful papers in the sample. All conclusions were drawn considering (i.e. controlling for) the effects of age of the paper, the number of papers the paper cited itself, and (citation conventions in) the field in which it was published.

While the patterns explored in this and Chattoe-Brown’s recent contribution are interesting, or even puzzling, they should not distract from the goal of moving towards realistic agent-based simulations of social systems. In my opinion, models that combine rigorous theory with strong empirical foundations are instrumental to the creation of meaningful and purposeful agent-based models. Perhaps the results presented here should just be taken as another sign that citation counts are a weak signal of academic merit at best.

Data, code and supplementary analyses

All data and code used for this analysis, as well as the results from the supplementary analyses described in the text, are available here: https://osf.io/x9r7j/

Notes

^[1] Note that the hyphen between “agent” and “based” does not affect the retrieved corpus. Both contributions that mention “agent based” and “agent-based” were retrieved.

^[2] A small caveat to the analysis of the interaction effect is that the marginal improvement of model 2 upon model 1 is rather small (AIC difference of 269). This is likely (partially) due to the small number of papers that mention both calibration and validation (N=77).

Acknowledgements

Marijn Keijzer acknowledges IAST funding from the French National Research Agency (ANR) under the Investments for the Future (Investissements d’Avenir) program, grant ANR-17-EURE-0010.

References

Boero, R., & Squazzoni, F. (2005). Does empirical embeddedness matter? Methodological issues on agent-based models for analytical social science. Journal of Artificial Societies and Social Simulation, 8(4), 1–31. https://www.jasss.org/8/4/6.html

Chattoe-Brown, E. (2022) If You Want To Be Cited, Don’t Validate Your Agent-Based Model: A Tentative Hypothesis Badly In Need of Refutation. Review of Artificial Societies and Social Simulation, 1st Feb 2022. https://rofasss.org/2022/02/01/citing-od-models

Flache, A., Mäs, M., Feliciani, T., Chattoe-Brown, E., Deffuant, G., Huet, S., & Lorenz, J. (2017). Models of social influence: towards the next frontiers. Journal of Artificial Societies and Social Simulation, 20(4). https://doi.org/10.18564/jasss.3521

Keijzer, M. (2022) If you want to be cited, calibrate your agent-based model: Reply to Chattoe-Brown. Review of Artificial Societies and Social Simulation, 9th Mar 2022. https://rofasss.org/2022/03/09/Keijzer-reply-to-Chattoe-Brown

© The authors under the Creative Commons’ Attribution-NoDerivs (CC BY-ND) Licence (v4.0)

The Poverty of Suggestivism – the dangers of “suggests that” modelling

February 28, 2022 thesubmissionauthor 2 Comments

By Bruce Edmonds

Vagueness and refutation

A model^[1] is basically composed of two parts (Zeigler 1976, Wartofsky 1979):

1. A set of entities (such as mathematical equations, logical rules, computer code etc.) which can be used to make some inferences as to the consequences of that set (usually in conjunction with some data and parameter values)
2. A mapping from this set to what it aims to represent – what the bits mean

Whilst a lot of attention has been paid to the internal rigour of the set of entities and the inferences that are made from them (1), the mapping to what that represents (2) has often been left as implicit or incompletely described – sometimes only indicated by the labels given to its parts. The result is a model that vaguely relates to its target, suggesting its properties analogically. There is not a well-defined way that the model is to be applied to anything observed, but a new map is invented each time it is used to think about a particular case. I call this way of modelling “Suggestivism”, because the model “suggests” things about what is being modelled.

This is partly a recapitulation of Popper’s critique of vague theories in his book “The Poverty of Historicism” (1957). He characterised such theories as “irrefutable”, because whatever the facts, these theories could be made to fit them. Irrefutability is an indicator of a lack of precise mapping to reality – such vagueness makes refutation very hard. However, it is only an indicator; there may be other reasons than vagueness for it not being possible to test a theory – it is their disconnection from well-defined empirical reference that is the issue here.

Some might go as far as suggesting that any model or theory that is not refutable is “unscientific”, but this goes too far, implying a very restricted definition of what ‘science’ is. We need analogies to think about what we are doing and to gain insight into what we are studying, e.g. (Hartman 1997) – for humans they are unavoidable, ‘baked’ into the way language works (Lakoff 1987). A model might make a set of ideas clear and help map out the consequences of a set of assumptions/structures/processes. Many of these suggestivist models relate to a set of ideas and it is the ideas that relate to what is observed (albeit informally) (Edmonds 2001). However, such models do not capture anything reliable about what they refer to, and in that sense are not part of the set of the established statements and theories that is at the core of science (Arnold 2014).

The dangers of suggestivist modelling

As above, there are valid uses of abstract or theoretical modelling where this is explicitly acknowledged and where no conclusions about observed phenomena are made. So what are the dangers of suggestivist modelling – why am I making such a fuss about it?

Firstly, people often seem to confuse a model as an analogy – a way of thinking about stuff – with a model that tells us reliably about what we are studying. Thus they give undue weight to the analyses of abstract models that are, in fact, just thought experiments. Making models is a very intimate way of theorising – one spends an extended period of time interacting with one’s model: developing, checking, analysing etc. The result is a particularly strong version of “Kuhnian Spectacles” (Kuhn 1962) causing us to see the world through our model for weeks after. Under this strong influence it is natural to confuse what we can reliably infer about the world and how we are currently perceiving/thinking about it. Good scientists should then pause and wait for this effect to wear off so that they can effectively critique what they have done, its limitations and what its implications are. However, in the rush to get their work out, modellers often do not do this, resulting in a sloppy set of suggestive interpretations of their modelling.

Secondly, empirical modelling is hard. It is far easier (and, frankly, more fun) to play with non-empirical models. A scientific culture that treats suggestivist modelling as substantial progress and significantly rewards modellers that do it, will effectively divert a lot of modelling effort in this direction. Chattoe-Brown (2018) displayed evidence of this in his survey of opinion dynamics models – abstract, suggestivist modelling got far more reward (in terms of citations) than those that tried to relate their model to empirical data in a direct manner. Abstract modelling has a role in science, but if it is easier and more rewarding then the field will become unbalanced. It may give the impression of progress but not deliver on this impression. In a more mature science, researchers working on measurement methods (steps from observation to models) and collecting good data are as important as the theorists (Moss 1998).

Thirdly, it is hard to judge suggestivist models. Given that their connection to the modelling target is vague, there cannot be any decisive test of their success. Good modellers should declare the exact purpose of their model, e.g. that it is analogical or merely exploring the consequences of theory (Edmonds et al. 2019), but then accept the consequences of this choice – namely, that it excludes making conclusions about the observed world. If it is for a theoretical exploration then the comprehensiveness of the exploration, the scope of the exploration and the applicability of the model can be judged, but if the model is analogical or illustrative then this is harder. Whilst one model may suggest X, another may suggest the opposite. It is quite easy to fix a model to get the outcomes one wants. Clearly, if a model makes startling suggestions – illustrating totally new ideas or making a counter-example to widely held assumptions – then this helps science by widening the pool of theories or hypotheses that are considered. However, most suggestivist modelling does not do this.

Fourthly, their sheer flexibility as to application causes problems – if one works hard enough one can invent mappings to a wide range of cases; the limits are only those of our imagination. In effect, having a vague mapping from model to what it models adds in huge flexibility in a similar way to having a large number of free (non-empirical) parameters. This flexibility gives an impression of generality, and many desire simple and general models for complex phenomena. However, this is illusory because a different mapping is needed for each case to make it apply. Given the above (1)+(2) definition of a model this means that, in fact, it is a different model for each case – what a model refers to is part of the model. The same flexibility makes such models impossible to refute, since one can just adjust the mapping to save them. The apparent generality and lack of refutation mean that such models hang around in the literature, due to their surface attractiveness.

Finally, these kinds of model are hugely influential beyond the community of modellers to the wider public including policy actors. Narratives that start in abstract models make their way out and can be very influential (Vranckx 1999). Despite the lack of rigorous mapping from model to reality, suggestivist models look impressive, look scientific. For example, very abstract models from the Neo-Classical ‘Chicago School’ of economists supported narratives about the optimal efficiency of markets, leading to a reluctance to regulate them (Krugman 2009). A lack of regulation seemed to be one of the factors behind the 2007/8 economic crash (Baily et al 2008). Modellers may understand that other modellers get over-enthusiastic and over-interpret their models, but others may not. It is the duty of modellers to give an accurate impression of the reliability of any modelling results and not to over-hype them.

How to recognise a suggestivist model

It can be hard to detangle how empirically vague a model is, because many descriptions of modelling work do not focus on making the mapping to what it represents precise. The reasons for this are various, for example: the modeller might be conflating reality and what is in the model in their minds, the researcher might be new to modelling and not have really decided what the purpose of their model is, the modeller might be over-keen to establish the importance of their work and so be hyping the motivation and conclusions, they might simply not have got around to thinking enough about the relationship between their model and what it might represent, or they might not have bothered to make the relationship explicit in their description. Whatever the reason, the reader of any description of such work is often left with an archaeological problem: trying to unearth what the relationship might be, based on indirect clues only. The only way to know for certain is to take a case one knows about and try to apply the model to it, but this is a time-consuming process and relies upon having a case with suitable data available. However, there are some indicators, albeit fallible ones, including the following.

A relatively simple model is interpreted as explaining a wide range of observed, complex phenomena
No data from an observed case study is compared to data from the model (often no data is brought in at all, merely abstract observations) – despite this, conclusions about some observed phenomena are made
The purpose of the model is not explicitly declared
The language of the paper seems to conflate talking about the model with what is being modelled
In the paper there are sudden abstraction ‘jumps’ between the motivation and the description of the model and back again to the interpretation of the results in terms of that motivation. The abstraction jumps involved are large and justified by some a priori theory or modelling precedents rather than evidence.

How to avoid suggestivist modelling

How to avoid the dangers of suggestivist modelling should be clear from the above discussion, but I will make them explicit here.

Be clear about the model purpose – that is, what the model aims to achieve, which indicates how it should be judged by others (Edmonds et al 2019)
Do not make any conclusions about the real world if you have not related the model to any data
Do not make any policy conclusions – things that might affect other people’s lives – without at least some independent validation of the model outcomes
Document how a model relates (or should relate) to data, the nature of that data and maybe even the process whereby that data should be obtained (Achter et al 2019)
Be as explicit as possible about what kinds of phenomena the model applies to – the limits of its scope
Keep the language about the model and what is being modelled distinct – for any statement it should be clear whether it is talking about the model or what it models (Edmonds 2020)
Highlight any bold assumptions in the specification of the model or describe what empirical foundation there is for them – be honest about these

Conclusion

Models can serve many different purposes (Epstein 2008). This is fine as long as the purpose of a model is always made clear, and model results are not interpreted further than their established purpose allows. Research which gives the impression that analogical, illustrative or theoretical modelling can tell us anything reliable about observed complex phenomena is not only sloppy science, but can have a deleterious impact – giving an impression of progress whilst diverting attention from empirically reliable work. Like a bad investment: if it looks too good and too easy to be true, it probably isn’t.

Notes

[1] We often use the word “model” in a lazy way to indicate (1) rather than (1)+(2) in this definition, but a set of entities without any meaning or mapping to anything else is not a model, as it does not represent anything. For example, a random set of equations or program instructions does not make a model.

Acknowledgements

Bruce Edmonds is supported as part of the ESRC-funded, UK part of the “ToRealSim” project, grant number ES/S015159/1.

References

Achter, S., Borit, M., Chattoe-Brown, E., Palaretti, C. & Siebers, P.-O. (2019) Cherchez Le RAT: A Proposed Plan for Augmenting Rigour and Transparency of Data Use in ABM. Review of Artificial Societies and Social Simulation, 4th June 2019. https://rofasss.org/2019/06/04/rat/

Arnold, E. (2014). What’s wrong with social simulations?. The Monist, 97(3), 359-377. DOI:10.5840/monist201497323

Baily, M. N., Litan, R. E., & Johnson, M. S. (2008). The origins of the financial crisis. Fixing Finance Series – Paper 3, The Brookings Institution. https://www.brookings.edu/wp-content/uploads/2016/06/11_origins_crisis_baily_litan.pdf

Chattoe-Brown, E. (2018) What is the earliest example of a social science simulation (that is nonetheless arguably an ABM) and shows real and simulated data in the same figure or table? Review of Artificial Societies and Social Simulation, 11th June 2018. https://rofasss.org/2018/06/11/ecb/

Edmonds, B. (2001) The Use of Models – making MABS actually work. In. Moss, S. and Davidsson, P. (eds.), Multi Agent Based Simulation, Lecture Notes in Artificial Intelligence, 1979:15-32. http://cfpm.org/cpmrep74.html

Edmonds, B. (2020) Basic Modelling Hygiene – keep descriptions about models and what they model clearly distinct. Review of Artificial Societies and Social Simulation, 22nd May 2020. https://rofasss.org/2020/05/22/modelling-hygiene/

Epstein, J. M. (2008). Why model?. Journal of artificial societies and social simulation, 11(4), 12. https://jasss.soc.surrey.ac.uk/11/4/12.html

Hartmann, S. (1997): Modelling and the Aims of Science. In: Weingartner, P. et al (ed.) : The Role of Pragmatics in Contemporary Philosophy: Contributions of the Austrian Ludwig Wittgenstein Society. Vol. 5. Wien und Kirchberg: Digi-Buch. pp. 380-385. https://epub.ub.uni-muenchen.de/25393/

Krugman, P. (2009) How Did Economists Get It So Wrong? New York Times, Sept. 2nd 2009. https://www.nytimes.com/2009/09/06/magazine/06Economic-t.html

Kuhn, T.S. (1962) The Structure of Scientific Revolutions. Chicago: University of Chicago Press.

Lakoff, G. (1987) Women, fire, and dangerous things. University of Chicago Press, Chicago.

Morgan, M. S., & Morrison, M. (1999). Models as mediators. Cambridge: Cambridge University Press.

Moss, S. (1998) Social Simulation Models and Reality: Three Approaches. Centre for Policy Modelling Discussion Paper: CPM-98-35, http://cfpm.org/cpmrep35.html

Popper, K. (1957). The poverty of historicism. Routledge.

Vranckx, An. (1999) Science, Fiction & the Appeal of Complexity. In Aerts, Diederik, Serge Gutwirth, Sonja Smets, and Luk Van Langehove, (eds.) Science, Technology, and Social Change: The Orange Book of “Einstein Meets Magritte.” Brussels: Vrije Universiteit Brussel; Dordrecht: Kluwer., pp. 283–301.

Wartofsky, M. W. (1979). The model muddle: Proposals for an immodest realism. In Models (pp. 1-11). Springer, Dordrecht.

Zeigler, B. P. (1976). Theory of Modeling and Simulation. Wiley Interscience, New York.

Edmonds, B. (2022) The Poverty of Suggestivism – the dangers of "suggests that" modelling. Review of Artificial Societies and Social Simulation, 28th Feb 2022. https://rofasss.org/2022/02/28/poverty-suggestivism

© The authors under the Creative Commons’ Attribution-NoDerivs (CC BY-ND) Licence (v4.0)

If You Want To Be Cited, Don’t Validate Your Agent-Based Model: A Tentative Hypothesis Badly In Need of Refutation

February 1, 2022

By Edmund Chattoe-Brown

As part of a previous research project, I collected a sample of the Opinion Dynamics (hereafter OD) models published in JASSS that were most highly cited in JASSS. The idea here was to understand what styles of OD research were most influential in the journal. In the top 50 on 19.10.21 there were eight such articles. Five were self-contained modelling exercises (Hegselmann and Krause 2002, 58 citations, Deffuant et al. 2002, 35 citations, Salzarulo 2006, 13 citations, Deffuant 2006, 13 citations and Urbig et al. 2008, 9 citations), two were overviews of OD modelling (Flache et al. 2017, 13 citations and Sobkowicz 2009, 10 citations) and one included an OD example in an article mainly discussing the merits of cellular automata modelling (Hegselmann and Flache 1998, 12 citations). In order to get into the top 50 on that date you had to achieve at least 7 citations.

In parallel, I have been trying to identify Agent-Based Models that are validated (undergo direct comparison of real and equivalent simulated data). Based on an earlier bibliography (Chattoe-Brown 2020) which I extended to the end of 2021 for JASSS and articles which were described as validated in the highly cited articles listed above, I managed to construct a small and unsystematic sample of validated OD models. (Part of the problem with a systematic sample is that validated models are not readily searchable as a distinct category and there are too many OD models overall to make reading them all feasible. Also, I suspect, validated models just remain rare in line with the larger scale findings of Dutton and Starbuck (1971, p. 130, table 1) and discouragingly, much more recently, Angus and Hassani-Mahmooei (2015, section 4.5, figure 9).)

Obviously, since part of the sample was selected by total number of citations, one cannot make a comparison on that basis, so instead I have used the best possible alternative (given the limitations of the sample) and compared articles on citations per year. The problem here is that attempting validated modelling is relatively new while older articles inevitably accumulate citations, however slowly. But what I was trying to discover was whether new validated models could be cited at a much higher annual rate without reaching the top 50 (or whether, conversely, older articles could have high enough total citations to get into the top 50 without having a particularly impressive annual citation rate). One would hope that, ultimately, validated models would tend to receive more citations than those that were not validated (but see the rather disconcerting related findings of Serra-Garcia and Gneezy 2021). Table 1 shows the results sorted by citations per year.

Article	Status	Number of JASSS Citations[1]	Number of Years[2]	Citations Per Year
Bernardes et al. 2002	Validated	1	20	0.05
Bernardes et al. 2001	Validated	2	21	0.095
Fortunato and Castellano 2007	Validated	2	15	0.13
Caruso and Castorina 2005	Validated	4	17	0.24
Chattoe-Brown 2014	Validated	2	8	0.25
Brousmiche et al. 2016	Validated	2	6	0.33
Hegselmann and Flache 1998	Non-Validated	12	24	0.5
Urbig et al. 2008	Non-Validated	9	14	0.64
Sobkowicz 2009	Non-Validated	10	13	0.77
Deffuant 2006	Non-Validated	13	16	0.81
Salzarulo 2006	Non-Validated	13	16	0.81
Duggins 2017	Validated	5	5	1
Deffuant et al. 2002	Non-Validated	35	20	1.75
Flache et al. 2017	Non-Validated	13	5	2.6
Hegselmann and Krause 2002	Non-Validated	58	20	2.9

Table 1. Annual Citation Rates for OD Articles Highly Cited in JASSS (Systematic Sample) and Validated OD Articles in or Cited in JASSS (Unsystematic Sample)
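
The citations-per-year column is simply the citation count divided by the number of years counted. As a minimal sketch (the variable and function names are illustrative assumptions, not part of the original analysis), the figures in Table 1 can be recomputed and the split between validated and non-validated articles checked as follows:

```python
# Rows of Table 1: (article, status, JASSS citations, years counted).
TABLE_1 = [
    ("Bernardes et al. 2002", "Validated", 1, 20),
    ("Bernardes et al. 2001", "Validated", 2, 21),
    ("Fortunato and Castellano 2007", "Validated", 2, 15),
    ("Caruso and Castorina 2005", "Validated", 4, 17),
    ("Chattoe-Brown 2014", "Validated", 2, 8),
    ("Brousmiche et al. 2016", "Validated", 2, 6),
    ("Hegselmann and Flache 1998", "Non-Validated", 12, 24),
    ("Urbig et al. 2008", "Non-Validated", 9, 14),
    ("Sobkowicz 2009", "Non-Validated", 10, 13),
    ("Deffuant 2006", "Non-Validated", 13, 16),
    ("Salzarulo 2006", "Non-Validated", 13, 16),
    ("Duggins 2017", "Validated", 5, 5),
    ("Deffuant et al. 2002", "Non-Validated", 35, 20),
    ("Flache et al. 2017", "Non-Validated", 13, 5),
    ("Hegselmann and Krause 2002", "Non-Validated", 58, 20),
]


def citations_per_year(citations, years):
    """Annual citation rate: total citations divided by years counted."""
    return citations / years


rates = {article: citations_per_year(c, y) for article, _, c, y in TABLE_1}
status = {article: s for article, s, _, _ in TABLE_1}

# Lowest annual rate among the non-validated (highly cited) articles.
lowest_non_validated = min(
    rate for article, rate in rates.items() if status[article] == "Non-Validated"
)

# Validated articles whose annual rate exceeds that lowest non-validated rate.
validated_above = [
    article
    for article, rate in rates.items()
    if status[article] == "Validated" and rate > lowest_non_validated
]

for article, rate in sorted(rates.items(), key=lambda item: item[1]):
    print(f"{article:32s}{status[article]:15s}{rate:.2f}")
print("Validated articles above the lowest non-validated rate:", validated_above)
```

On these figures, only one validated article has an annual rate above that of the least-cited non-validated article in the sample.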

With the notable (and potentially encouraging) exception of Duggins (2017), the most recent validated OD model I have been able to discover in JASSS, the sample clearly divides into non-validated research with more citations and validated research with fewer. The position of Duggins (2017) might suggest greater recent interest in validated OD models. Unfortunately, however, qualitative analysis of the citations suggests that these are not cited as validated models per se (and thus as a potential improvement over non-validated models) but merely as part of general classes of OD model (like those involving social networks or repulsion – moving away from highly discrepant opinions). This tendency to cite validated models without acknowledging that they are validated (and what the implications of that might be) is widespread in the articles I looked at.

Obviously, there is plenty wrong with this analysis. Even looking at citations per annum we are arguably still partially sampling on the dependent variable (articles selected for being widely cited prove to be widely cited!) and the sample of validated OD models is unsystematic (though in fairness the challenges of producing a systematic sample are significant.[3]) But the aim here is to make a distinctive use of RoFASSS as a rapid mode of permanent publication and to think differently about science. If I tried to publish this in a peer reviewed journal, the amount of labour required to satisfy reviewers about the research design would probably be prohibitive (even if it were possible). As a result, the case to answer about this apparent (and perhaps undesirable) pattern in data might never see the light of day.

But by publishing quickly in RoFASSS without the filter of peer review I actively want my hypothesis to be rejected or replaced by research based on a better design (and such research may be motivated precisely by my presenting this interesting pattern with all its imperfections). When it comes to scientific progress, the chance to be clearly wrong now could be more useful than the opportunity to be vaguely right at some unknown point in the future.

Acknowledgements

This analysis was funded by the project “Towards Realistic Computational Models Of Social Influence Dynamics” (ES/S015159/1) funded by ESRC via ORA Round 5 (PI: Professor Bruce Edmonds, Centre for Policy Modelling, Manchester Metropolitan University: https://gtr.ukri.org/projects?ref=ES%2FS015159%2F1).

Notes

[1] Note that the validated OD models had their citations counted manually while the high total citation articles had them counted automatically. This may introduce some comparison error but there is no reason to think that either count will be terribly inaccurate.

[2] Including the year of publication and the current year (2021).

[3] Note, however, that there are some checks and balances on sample quality. Highly successful validated OD models would have shown up independently in the top 50. There is thus an upper bound to the impact of the articles I might have missed in manually constructing my “version 1” bibliography. The unsystematic review of 47 articles by Sobkowicz (2009) also checks independently on the absence of validated OD models in JASSS to that date and confirms the rarity of such articles generally. Only four of the articles that he surveys are significantly empirical.

References

Angus, Simon D. and Hassani-Mahmooei, Behrooz (2015) ‘“Anarchy” Reigns: A Quantitative Analysis of Agent-Based Modelling Publication Practices in JASSS, 2001-2012’, Journal of Artificial Societies and Social Simulation, 18(4), October, article 16, <http://jasss.soc.surrey.ac.uk/18/4/16.html>. doi:10.18564/jasss.2952

Bernardes, A. T., Costa, U. M. S., Araujo, A. D. and Stauffer, D. (2001) ‘Damage Spreading, Coarsening Dynamics and Distribution of Political Votes in Sznajd Model on Square Lattice’, International Journal of Modern Physics C: Computational Physics and Physical Computation, 12(2), February, pp. 159-168. doi:10.1142/S0129183101001584

Bernardes, A. T., Stauffer, D. and Kertész, J. (2002) ‘Election Results and the Sznajd Model on Barabasi Network’, The European Physical Journal B: Condensed Matter and Complex Systems, 25(1), January, pp. 123-127. doi:10.1140/e10051-002-0013-y

Brousmiche, Kei-Leo, Kant, Jean-Daniel, Sabouret, Nicolas and Prenot-Guinard, François (2016) ‘From Beliefs to Attitudes: Polias, A Model of Attitude Dynamics Based on Cognitive Modelling and Field Data’, Journal of Artificial Societies and Social Simulation, 19(4), October, article 2, <https://www.jasss.org/19/4/2.html>. doi:10.18564/jasss.3161

Caruso, Filippo and Castorina, Paolo (2005) ‘Opinion Dynamics and Decision of Vote in Bipolar Political Systems’, arXiv > Physics > Physics and Society, 26 March, version 2. doi:10.1142/S0129183105008059

Chattoe-Brown, Edmund (2014) ‘Using Agent Based Modelling to Integrate Data on Attitude Change’, Sociological Research Online, 19(1), February, article 16, <https://www.socresonline.org.uk/19/1/16.html>. doi:10.5153/sro.3315

Chattoe-Brown, Edmund (2020) ‘A Bibliography of ABM Research Explicitly Comparing Real and Simulated Data for Validation: Version 1’, CPM Report CPM-20-216, 12 June, <http://cfpm.org/discussionpapers/256>

Deffuant, Guillaume (2006) ‘Comparing Extremism Propagation Patterns in Continuous Opinion Models’, Journal of Artificial Societies and Social Simulation, 9(3), June, article 8, <https://www.jasss.org/9/3/8.html>.

Deffuant, Guillaume, Amblard, Frédéric, Weisbuch, Gérard and Faure, Thierry (2002) ‘How Can Extremism Prevail? A Study Based on the Relative Agreement Interaction Model’, Journal of Artificial Societies and Social Simulation, 5(4), October, article 1, <https://www.jasss.org/5/4/1.html>.

Duggins, Peter (2017) ‘A Psychologically-Motivated Model of Opinion Change with Applications to American Politics’, Journal of Artificial Societies and Social Simulation, 20(1), January, article 13, <http://jasss.soc.surrey.ac.uk/20/1/13.html>. doi:10.18564/jasss.3316

Dutton, John M. and Starbuck, William H. (1971) ‘Computer Simulation Models of Human Behavior: A History of an Intellectual Technology’, IEEE Transactions on Systems, Man, and Cybernetics, SMC-1(2), April, pp. 128-171. doi:10.1109/TSMC.1971.4308269

Flache, Andreas, Mäs, Michael, Feliciani, Thomas, Chattoe-Brown, Edmund, Deffuant, Guillaume, Huet, Sylvie and Lorenz, Jan (2017) ‘Models of Social Influence: Towards the Next Frontiers’, Journal of Artificial Societies and Social Simulation, 20(4), October, article 2, <http://jasss.soc.surrey.ac.uk/20/4/2.html>. doi:10.18564/jasss.3521

Fortunato, Santo and Castellano, Claudio (2007) ‘Scaling and Universality in Proportional Elections’, Physical Review Letters, 99(13), 28 September, article 138701. doi:10.1103/PhysRevLett.99.138701

Hegselmann, Rainer and Flache, Andreas (1998) ‘Understanding Complex Social Dynamics: A Plea For Cellular Automata Based Modelling’, Journal of Artificial Societies and Social Simulation, 1(3), June, article 1, <https://www.jasss.org/1/3/1.html>.

Hegselmann, Rainer and Krause, Ulrich (2002) ‘Opinion Dynamics and Bounded Confidence Models, Analysis, and Simulation’, Journal of Artificial Societies and Social Simulation, 5(3), June, article 2, <http://jasss.soc.surrey.ac.uk/5/3/2.html>.

Salzarulo, Laurent (2006) ‘A Continuous Opinion Dynamics Model Based on the Principle of Meta-Contrast’, Journal of Artificial Societies and Social Simulation, 9(1), January, article 13, <http://jasss.soc.surrey.ac.uk/9/1/13.html>.

Serra-Garcia, Marta and Gneezy, Uri (2021) ‘Nonreplicable Publications are Cited More Than Replicable Ones’, Science Advances, 7, 21 May, article eabd1705. doi:10.1126/sciadv.abd1705

Sobkowicz, Pawel (2009) ‘Modelling Opinion Formation with Physics Tools: Call for Closer Link with Reality’, Journal of Artificial Societies and Social Simulation, 12(1), January, article 11, <http://jasss.soc.surrey.ac.uk/12/1/11.html>.

Urbig, Diemo, Lorenz, Jan and Herzberg, Heiko (2008) ‘Opinion Dynamics: The Effect of the Number of Peers Met at Once’, Journal of Artificial Societies and Social Simulation, 11(2), March, article 4, <http://jasss.soc.surrey.ac.uk/11/2/4.html>.

© The authors under the Creative Commons’ Attribution-NoDerivs (CC BY-ND) Licence (v4.0)

Today We Have Naming Of Parts: A Possible Way Out Of Some Terminological Problems With ABM

January 11, 2022

By Edmund Chattoe-Brown

Today we have naming of parts. Yesterday,
We had daily cleaning. And tomorrow morning,
We shall have what to do after firing. But to-day,
Today we have naming of parts. Japonica
Glistens like coral in all of the neighbouring gardens,
And today we have naming of parts.
(Naming of Parts, Henry Reed, 1942)

It is not difficult to establish by casual reading that there are almost as many ways of using crucial terms like calibration and validation in ABM as there are actual instances of their use. This creates several damaging problems for scientific progress in the field. Firstly, when two different researchers both say they “validated” their ABMs they may mean different specific scientific activities. This makes it hard for readers to evaluate research generally, particularly if researchers assume that it is obvious what their terms mean (rather than explaining explicitly what they did in their analysis). Secondly, based on this, each researcher may feel that the other has not really validated their ABM but has instead done something to which a different name should more properly be given. This compounds the possible confusion in debate. Thirdly, there is a danger that researchers may rhetorically favour (perhaps unconsciously) uses that, for example, make their research sound more robustly empirical than it actually is. For example, validation is sometimes used to mean consistency with stylised facts (rather than, say, correspondence with a specific time series according to some formal measure). But we often have no way of telling what the status of the presented stylised facts is. Are they an effective summary of what is known in a field? Are they the facts on which most researchers agree or for which the available data presents the clearest picture? (Less reputably, can readers be confident that they were not selected for presentation because of their correspondence?) Fourthly, because these terms are used differently by different researchers it is possible that valuable scientific activities that “should” have agreed labels will “slip down the terminological cracks” (either for the individual or for the ABM community generally). Apart from clear labels avoiding confusion for others, they may help to avoid confusion for you too!

But apart from these problems (and there may be others but these are not the main thrust of my argument here) there is also a potential impasse. There simply doesn’t seem to be any value in arguing about what the “correct” meaning of validation (for example) should be. Because these are merely labels there is no objective way to resolve this issue. Further, even if we undertook to agree the terminology collectively, each individual would tend to argue for their own interpretation without solid grounds (because there are none to be had) and any collective decision would probably therefore be unenforceable. If we decide to invent arbitrary new terminology from scratch we not only run the risk of adding to the existing confusion of terms (rather than reducing it) but it is also quite likely that everyone will find the new terms unhelpful.

Unfortunately, however, we probably cannot do without labels for these scientific activities involved in quality controlling ABMs. If we had to describe everything we did without any technical shorthand, presenting research might well become impossibly unwieldy.

My proposed solution is therefore to invent terms from scratch (so we don’t end up arguing about our different customary usages to no purpose) but to do so on the basis of actual scientific practices reported in published research. For example, we might call the comparison of corresponding real and simulated data (which at least has the endorsement of the much used Gilbert and Troitzsch 2005 – see pp. 15-19 – to be referred to as validation) CORAS – Comparison Of Real And Simulated. Similarly, assigning values to parameters given the assumptions of model “structures” might be called PANV – Parameters Assigned Numerical Values.
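
To make the CORAS idea concrete, here is a minimal sketch of one possible formal comparison measure, root-mean-square error between a real and a simulated time series. The choice of measure, the helper name `rmse` and the series values are all assumptions for illustration; the numbers are invented placeholders and do not come from any study.

```python
import math


def rmse(real, simulated):
    """Root-mean-square error between two equal-length, non-empty series."""
    if len(real) != len(simulated):
        raise ValueError("series must have the same length")
    if not real:
        raise ValueError("series must not be empty")
    squared_errors = [(r - s) ** 2 for r, s in zip(real, simulated)]
    return math.sqrt(sum(squared_errors) / len(real))


# Placeholder values for illustration only (not from any real study).
real_series = [0.10, 0.20, 0.40, 0.50]
simulated_series = [0.10, 0.30, 0.30, 0.50]

coras_score = rmse(real_series, simulated_series)
print(f"RMSE between real and simulated series: {coras_score:.4f}")
```

Reporting the measure used alongside the label lets a reader see which specific activity the label stands for, whatever they think of its scientific value.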

It is very important to be clear what the intention is here. Naming cannot solve scientific problems or disagreements. (Indeed, failure to grasp this may well be why our terminology is currently so muddled as people try to get their different positions through “on the nod”.) For example, if we do not believe that correspondence with stylised facts and comparison measures on time series have equivalent scientific status then we will have to agree distinct labels for them and have the debate about their respective value separately. Perhaps the former could be called COSF – Comparison Of Stylised Facts. But it seems plainly easier to describe specific scientific activities accurately and then find labels for them than to have to wade through the existing marsh of ambiguous terminology and try to extract the associated science. An example of a practice which does not seem to have even one generally agreed label (and therefore seems to be neglected in ABM as a practice) is JAMS – Justifying A Model Structure. (Why are your agents adaptive rather than habitual or rational? Why do they mix randomly rather than in social networks?)

Obviously, there still needs to be community agreement for such a convention to be useful (and this may need to be backed institutionally, for example by reviewing requirements). But the logic of the approach avoids several existing problems. Firstly, while the labels are useful shorthand, they are not arbitrary. Each can be traced back to a clearly definable scientific practice. Secondly, this approach steers a course between the Scylla of fruitless arguments from current muddled usage and the Charybdis of a novel set of terminology that is equally unhelpful to everybody. (Even if people cannot agree on labels, they know how they built and evaluated their ABMs so they can choose – or create – new labels accordingly.) Thirdly, the proposed logic is extendable. As we clarify our thinking, we can use it to label (or improve the labels of) any current set of scientific practices. We will not have to worry that we will run out of plausible words in everyday usage.

Below I suggest some more scientific practices and possible terms for them. (You will see that I have also tried to make the terms as pronounceable and distinct as possible.)

Practice	Term
Checking the results of an ABM by building another.[1]	CAMWA (Checking A Model With Another).
Checking ABM code behaves as intended (for example by debugging procedures, destructive testing using extreme values and so on).	TAMAD (Testing A Model Against Description).
Justifying the structure of the environment in which agents act.	JEM (Justifying the Environment of a Model): This is again a process that may pass unnoticed in ABM typically. For example, by assuming that agents only consider ethnic composition, the Schelling Model (Schelling 1969, 1971) does not “allow” locations to be desirable because, for example, they are near good schools. This contradicts what was known empirically well before (see, for example, Rossi 1955) and it isn’t clear whether simply saying that your interest is in an “abstract” model can justify this level of empirical neglect.
Finding out what effect parameter values have on ABM behaviour.	EVOPE (Exploring Value Of Parameter Effects).
Exploring the sensitivity of an ABM to structural assumptions not justified empirically (see Chattoe-Brown 2021).	ESOSA (Exploring the Sensitivity Of Structural Assumptions).

Clearly this list is incomplete but I think it would be more effective if characterising the scientific practices in existing ABM and naming them distinctively was a collective enterprise.
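
As an illustration of what the EVOPE practice in the table above might look like in code, the following sketch sweeps one parameter over several values and averages the outcome across seeded runs. Everything here is hypothetical: `run_model` is a deliberately trivial stand-in for an ABM (agents adopt something with a fixed probability per step), and `adoption_prob`, `sweep` and the chosen values are assumptions made for the example.

```python
import random
from statistics import mean


def run_model(adoption_prob, n_agents=100, n_steps=10, seed=0):
    """Toy stand-in for an ABM: each step, every non-adopter adopts
    with a fixed probability. Returns the final fraction of adopters."""
    rng = random.Random(seed)
    adopted = [False] * n_agents
    for _ in range(n_steps):
        adopted = [a or rng.random() < adoption_prob for a in adopted]
    return sum(adopted) / n_agents


def sweep(values, n_runs=20):
    """Mean outcome over several seeded runs for each parameter value."""
    return {
        value: mean(run_model(value, seed=s) for s in range(n_runs))
        for value in values
    }


results = sweep([0.0, 0.05, 0.1, 0.2])
for value, outcome in results.items():
    print(f"adoption_prob={value:.2f} -> mean final adoption {outcome:.2f}")
```

Averaging over several seeds separates the effect of the parameter from run-to-run noise, which is one reason to treat this as a distinct, nameable practice.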

Acknowledgements

This research is funded by the project “Towards Realistic Computational Models Of Social Influence Dynamics” (ES/S015159/1) funded by ESRC via ORA Round 5 (PI: Professor Bruce Edmonds, Centre for Policy Modelling, Manchester Metropolitan University: https://gtr.ukri.org/projects?ref=ES%2FS015159%2F1).

Notes

[1] It is likely that we will have to invent terms for subcategories of practices which differ in their aims or warranted conclusions. For example, rerunning the code of the original author (CAMWOC – Checking A Model With Original Code), building a new ABM from a formal description like ODD (CAMUS – Checking A Model Using Specification) and building a new ABM from the published description (CAMAP – Checking A Model As Published, see Chattoe-Brown et al. 2021).

References

Chattoe-Brown, Edmund (2021) ‘Why Questions Like “Do Networks Matter?” Matter to Methodology: How Agent-Based Modelling Makes It Possible to Answer Them’, International Journal of Social Research Methodology, 24(4), pp. 429-442. doi:10.1080/13645579.2020.1801602

Chattoe-Brown, Edmund, Gilbert, Nigel, Robertson, Duncan A. and Watts, Christopher (2021) ‘Reproduction as a Means of Evaluating Policy Models: A Case Study of a COVID-19 Simulation’, medRxiv, 23 February. doi:10.1101/2021.01.29.21250743

Gilbert, Nigel and Troitzsch, Klaus G. (2005) Simulation for the Social Scientist, second edition (Maidenhead: Open University Press).

Rossi, Peter H. (1955) Why Families Move: A Study in the Social Psychology of Urban Residential Mobility (Glencoe, IL, Free Press).

Schelling, Thomas C. (1969) ‘Models of Segregation’, American Economic Review, 59(2), May, pp. 488-493. (available at https://www.jstor.org/stable/1823701)

Chattoe-Brown, E. (2022) Today We Have Naming Of Parts: A Possible Way Out Of Some Terminological Problems With ABM. Review of Artificial Societies and Social Simulation, 11th January 2022. https://rofasss.org/2022/01/11/naming-of-parts/

© The authors under the Creative Commons’ Attribution-NoDerivs (CC BY-ND) Licence (v4.0)

Content

Challenges and opportunities in expanding ABM to other fields: the example of psychology

December 20, 2021

By Dino Carpentras

Centre for Social Issues Research, Department of Psychology, University of Limerick

The loop of isolation

One of the issues discussed during the last public meeting of the European Social Simulation Association (ESSA) at the Social Simulation Conference 2021 was how to reach communities outside the ABM one. This is a serious problem, as we risk getting trapped in a vicious cycle of isolation.

The cycle can be explained as follows. (a) Many fields are not familiar with ABM methods and standards. As a consequence, (b) both reviewers and editors struggle to understand and evaluate the quality of an ABM paper. In general, this translates into a higher rejection rate and a much longer time before publication. As a result, (c) fewer ABM researchers are willing to send their work to other communities and, in general, fewer ABM works are published in the journals of other communities. Fewer articles using ABM means that (d) fewer people become aware of ABM, understand its methods and standards, or even consider it an established research method.

Another point to consider is that, as time passes, each field evolves and develops new standards and procedures. Unfortunately, if two fields are not sufficiently aware of each other, the new procedures will appear even more alien to members of the other community, reinforcing the cycle discussed above. A schematic of this is offered in Figure 1.


Figure 1: Vicious cycle of isolation

The challenge

Of course, a “brute force” solution would be to keep sending articles to journals in different fields until they get published. However, this would be extremely expensive in terms of time, and most researchers would probably not be happy to follow this path.

A more elaborate solution could be framed as “progressively getting to know each other.” This would consist of modellers getting more familiar with the target community and vice versa. In this way, people from ABM would be able to better understand the jargon, the assumptions and even what is interesting enough to be the main result of a paper in a specific discipline. This would make it easier for members of our community to communicate research results using the language and methods familiar to the other field.

At the same time, researchers in the other field could slowly integrate ABM into their work, showing the potential of ABM and making it appear less alien to their peers. All of this would reverse the vicious cycle discussed above, producing a virtuous one which would bring the two fields closer and closer.

Unfortunately, such a goal cannot be achieved overnight, as it will probably require several events, collaborations and publications, and probably several years (or even decades!). However, as a result, our field would be familiar to and recognized by multiple other fields, enormously increasing the scientific impact of our research as well as the number of people working in ABM.

In this short communication, I would like, firstly, to highlight the importance and the challenges of reaching out to other fields and, secondly, to show a practical example with the field of psychology. I have chosen this field for no particular reason, besides the fact that I am currently working in a department of psychology. This gave me the opportunity to interact with several researchers in this field.

In the next sections, I will summarize the main points of several informal discussions with these researchers. Specifically, I will try to highlight what they reported to be promising or interesting in ABM and also what felt alien or problematic to them.

Let me also stress that this is not meant to be a complete overview, nor should it be thought of as a summary of “what every psychologist thinks about ABM.” Instead, it is simply a summary of the discussions I have had so far. What I hope is that this will be at least a little useful to our community for building better connections with other fields.

The elephant in the room

Before moving to the list of comments on ABM I have collected, I want to address one point which appeared almost every time I discussed ABM with psychologists. Actually, it appeared almost every time I discussed ABM with people outside our field. This is the problem of experiments and validation.

I know there was recently a massive discussion on the SimSoc mailing list on opinion dynamics and validation, and this discussion will probably continue. Therefore, I am not going to discuss if all models should be tested, if a validated model should be considered superior, etc. Indeed, I do not want to discuss at all if validation should be considered important within our community. Instead, I want to discuss how important this is while interacting with other communities.

Indeed, many other fields give empirical data and validation a key role, and have even developed different methods to test the quality of a hypothesis or a model when comparing it to empirical data (e.g. the calculation of p-values, Krishnaiah 1980). Also, I repeatedly experienced disappointment or even mockery when I explained to non-ABM people that the model I was describing to them was not empirically validated (e.g. the Deffuant model of opinion dynamics). In one case, a person even laughed at me for this.

Unfortunately, many people who are not familiar with ABM end up considering it almost like a “nice exercise,” and even “not a real science.” This could be extremely dangerous for our field. Indeed, if many researchers start thinking of ABM as a lesser science, communication with other fields, as well as obtaining funding for research, would get much harder for our community.

Also, please do not “confuse the message with the messenger.” Here, I am not claiming that an unvalidated model should be considered inferior, or anything like that. What I am saying is that many people outside our field think in this way and this may eventually turn into a much bigger problem for us.

I will further discuss this point in the conclusion section. However, I will not claim that we should get rid of “pure models,” or that every model should be validated. What I will claim is that we should promote more empirical works, as they will allow us to interact more easily with other fields.

Further points

In this section, I have collected (in no particular order) different comments and suggestions I have received from psychologists on the topic of ABM. All of them had at least some experience of working side by side with a researcher developing ABMs.

Also in this case, please remember that these are not my claims, but feedback I received. Furthermore, they should not be read as “what ABM is,” but rather as “how ABM may look to people in another field.”

1. Some psychologists showed interest in the possibility of having loops in ABMs, which allow for relationships that go beyond simple cause and effect. Indeed, several models in psychology are structured in the form of “parameter X influences parameter Y” (where Y cannot in turn influence X to form a loop). While this approach is very common in psychology, many researchers are not satisfied with it, making ABMs a very good opportunity for the development of more realistic models.
2. Some psychologists said that, at first glance, ABM looks very interesting. However, the extensive use of equations can confuse or even scare people who are not very used to them.
3. Some praised Schelling’s model (Schelling 1971), especially the approach of developing a hypothesis and then using an ABM to falsify it.
4. Some criticized that it is often not clear what an ABM should be used for or what such a model “is telling us.”
5. Similarly, the use of models with a large number of parameters was criticized as “[these models] can eventually produce any result.”
6. Another confusion that appeared multiple times was that it is often not clear if the model should be analysed and interpreted at the individual level (e.g. agents which start from state A often end up in state B) or at the more global level (e.g. distribution A results in distribution B).
7. Another major complaint was that psychological measures are nominal or ordinal, while many models assume interval-like variables.
8. Another criticism was that agents often all behave in the same way, without personal differences being included.
9. In psychology there is a lot of attention on the sample size and on whether it is big enough to produce significant results. Some stressed that in many ABM works it is not clear if the sample size (i.e. the number of agents) is sufficient to support the analysis.

Conclusion

I would like to stress again that these comments are not supposed to represent the thoughts of every psychologist, nor am I suggesting that all the ABM literature should adapt to them or that they are always correct. For example, in my personal opinion, points 5 and 8 push in opposite directions, one aiming at simpler models and the other pushing towards complexity. Similarly, I do not think we should decrease the number of equations in our works to meet point 2. However, I think we should consider this feedback when planning interactions with the psychology community.

As mentioned before, a crucial role when interacting with other communities is played by experiments and validation. Point 6, and especially points 7 and 9, suggest that members of this community often look for 1-to-1 relationships between agents in simulations and people in the real world.


Figure 2: (left) Empirical ABM acting as a bridge between theoretical ABM and other research fields. (Right) as the relationship between ABM and the other field matures, people become familiar with ABM standards and a direct link to theoretical ABM can be established.

As suggested by someone during the already mentioned discussion on the SimSoc mailing list, this could be solved by introducing a new role (or, equivalently, a new research field) dedicated to empirical work in ABM. Following this solution, theoretical modellers could keep developing models without having to worry about validation. This would be similar to the work carried out by theoretical researchers in physics. At the same time, we would also have a stream of research dedicated to “experimental ABM.” People working on this topic would further explore the connection between models and the empirical world through experiments and validation processes. Of course, the two should not be mutually exclusive, as a researcher (or a piece of research) may still fall into both categories. However, having this distinction may help in giving more space to empirical work.

I believe that the role of experimental ABM could be crucial for developing good interactions between ABM and other communities. Indeed, this type of research could be accepted much more easily by other communities, producing better interactions with ABM. In particular, mentioning experiments and validation could strongly decrease the initial mistrust that many people show when discussing ABM. Furthermore, as ABM develops stronger connections with another field, and our methods and standards become more familiar, we would probably also see more people from the other community start looking into more theoretical ABM approaches and what-if scenarios (see Figure 2).

References

Krishnaiah, P. R. (Ed.). (1980). A Hand Book of Statistics (Vol. 1). Motilal Banarsidass Publishe.

Schelling, T. C. (1971). Dynamic models of segregation. Journal of Mathematical Sociology, 1(2), 143-186.

Edmonds, B. and Moss, S. (2005) From KISS to KIDS – an ‘anti-simplistic’ modelling approach. In P. Davidsson et al. (Eds.): Multi Agent Based Simulation 2004. Springer, Lecture Notes in Artificial Intelligence, 3415:130–144.

Carpentras, D. (2021) Challenges and opportunities in expanding ABM to other fields: the example of psychology. Review of Artificial Societies and Social Simulation, 20th December 2021. https://rofasss.org/2021/12/20/challenges/

Content

Benefits of Open Research in Social Simulation: An Early-Career Researcher’s Perspective

November 23, 2021

By Hyesop Shin

Research Associate at the School of Geographical and Earth Sciences, University of Glasgow, UK

In March 2017, in the first year of my PhD, I attended a talk at the Microsoft Research Lab in Cambridge, UK. It was about the importance of reproducibility and replicability in science. Inspired by the talk, I moved my research beyond my word processor and hard disk to open repositories and social media. In my experience, learning from other people’s work and replicating it in my own project has been challenging at times, but I found it more beneficial to share my problems and solutions with other people who may have encountered the same problems.

I have spoken to many early career researchers (ECRs) about the need for open science, and specifically about whether sharing code is essential. The consensus was that it was not an essential component of their degree. A few answered that they were too embarrassed to share their code online because it was not written well enough. I somewhat empathised with their opinions but, at the same time, I would insist that the benefits of open research outweigh the embarrassment.

I wrote this short piece to openly discuss the benefits of conducting open research and to suggest some points that ECRs should keep in mind. Along the way, I include some screenshots taken from my PhD work (Shin, 2021). I conclude by inviting personal experiences or other thoughts that might give more insights to the audience.

Benefits of Aiming for an Open Project

I argue here that being transparent and honest about your model development strengthens the credibility of the research. To this end, my thesis shared the original data, annotated scripts that are downloadable and executable, and wiki pages summarising the outcomes and interpretations (see Figure 1 for examples). This evidence enables scholars and technicians to visit the repository if they are interested in the source code or outcomes. People can also comment if they identify errors or bugs or if the model does not run on their machine, or they may suggest alternative ways to tackle the same problem. Even during development, many developers share their work via online repositories (e.g. GitHub, GitLab) and social media to ask for advice. Agent-based models are mostly uploaded to CoMSES.net (previously named OpenABM). All of this can improve the quality of research.

Figure 1 A screenshot of a GitHub page showing how open platforms can help other people to understand the outcomes step by step

More practically, people can learn new ideas by helping each other. If there is a technical issue that cannot be solved, the problem should not be kept hidden, but rather opened up and solved together with experts online and offline. Figure 2 is a practical example of posing a question to a wide range of developers on Stack Overflow, an online community where programmers share and build code. Providing my NetLogo code, I asked how to send a group of agents from one location to another. A user with the ID JenB kindly responded with a new piece of code, which helped me structure my code more effectively.

Figure 2 Raising a question about sending agents from one location to another in NetLogo

Another example concerns the errors I encountered whilst running NetLogo with the R package “nlrx” (Salecker et al., 2019). Here, R was used as a front end to submit iterative NetLogo jobs to the HPC (High Performance Computing) cluster to improve the execution speed. However, much to my surprise, I received error messages because failed HPC jobs were terminating early. Not knowing what to do, I posed a question to the developer of the package (see Figure 3) and luckily got a response: the R ecosystem stores all assigned objects in RAM and, even with gigabytes of RAM, it struggles to write 96,822 patches over 8764 ticks to a spreadsheet.

A Stack Overflow answer also kindly pointed out that NetLogo has a memory ceiling of 1GB[i] and keeps each run in memory before it shuts down. Thus, if the model is huge and requires several iterations, the execution speed is likely to decrease after a few iterations. Before learning this, I did not understand why the model took 1 hour 20 minutes to finish the first run but struggled to maintain that speed by the twentieth run. Hence, sharing the technical obstacles that occur in the middle of research can save a lot of time, even for those who are only contemplating similar research.
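The memory problem described above can be illustrated with a minimal Python sketch. It is not how nlrx or NetLogo handle output; it only shows the general idea of writing each tick's patch values to disk as they are produced instead of accumulating every tick in memory. The functions `simulate_tick` and `run_and_stream`, and the random values they produce, are hypothetical stand-ins.

```python
import csv
import random


def simulate_tick(tick, n_patches, rng):
    """Hypothetical stand-in for one model step: one value per patch."""
    return [rng.random() for _ in range(n_patches)]


def run_and_stream(out, n_ticks, n_patches, seed=0):
    """Write one row per (tick, patch) as soon as it is produced.

    Only a single tick of values is held in memory at any time.
    Returns the number of data rows written.
    """
    rng = random.Random(seed)
    writer = csv.writer(out)
    writer.writerow(["tick", "patch", "value"])
    rows = 0
    for tick in range(n_ticks):
        for patch, value in enumerate(simulate_tick(tick, n_patches, rng)):
            writer.writerow([tick, patch, value])
            rows += 1
    return rows
```

In a real run, `out` would be a file opened with `open("patches.csv", "w", newline="")`, so memory use stays roughly constant however many ticks are simulated.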

Figure 3 Comments posted on an online repository regarding the memory issue that NetLogo and R encountered

The Future for Open Research

For future quantitative studies in social simulation, this piece suggests that students and early career researchers should get used to working on open-source platforms in order to conduct sustainable research. Just as clarity, conciseness, and coherence are the important C’s of good writing, good programming should take the following points into consideration.

First is clarity and conciseness (C&C). Here, clarity means that the scripts should be neatly documented. The computer does not know whether the code is messy or neat; it only cares whether it is syntactically correct. Neatness matters, however, when other people attempt to understand the task. If two versions produce the same results, it is always better to write the clearer and simpler one, both for other people and for future upgrades. Thus, researchers should refer to other people’s work and learn how to code effectively. Another way to maintain clarity in coding is to use descriptive and distinctive names for new variables. This might seem to contradict the aim of conciseness, but it is important, as one of the common mistakes users make is to give variables abstract names such as LP1, LP2…LP10, which seem clear and concise to the model builder but are hard for others to follow when reviewing the code. The famous quote attributed to Einstein, “Everything should be made as simple as possible, but not simpler,” is one that model builders should always keep in mind. Hence, instead of LP9, names such as LandPriceIncreaseRate2009 (camel case) or landprice_incrate_2009 (snake case) make it easier for reviewers to understand the model.
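The naming point can be illustrated with a small Python sketch. Both functions below compute the same thing; the function names, the parameters and the compounding formula are hypothetical and only serve to contrast abstract names with descriptive ones.

```python
# Abstract names: compact for the author, opaque for a reviewer.
def f(p, lp9, n):
    return p * (1 + lp9) ** n


# Descriptive names: longer, but the intent is readable without a codebook.
def project_land_price(base_price, land_price_increase_rate_2009, years):
    """Compound a base land price forward by a fixed annual rate."""
    return base_price * (1 + land_price_increase_rate_2009) ** years
```

The two versions return identical results, so the only cost of the second is a few extra characters, while a reviewer no longer has to guess what LP9 stands for.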

Second is reproducibility and replicability (R&R). To be reproducible and replicable, the script should first of all run without errors when others execute it, and any known errors or bugs should be reported. It is also useful to document the libraries and dependencies required. This is quite important, as package installation differs between operating systems (OSs). For instance, the sf package in R can be installed from a binary package on Windows and macOS, while on Linux GDAL (to read and write vector and raster data), PROJ (which deals with projections), and GEOS (which provides geospatial functions) need to be installed separately before the package itself. Finally, it would be very helpful if unit tests were included with the model. While R and Python provide splendid examples in their vignettes, NetLogo offers its library of example models but goes no further than that. Offering unit testing examples can give a better understanding when the whole model is too complicated for others to comprehend. It can also show that the modeller has full control of the model, because without unit tests the verification process becomes error-prone. The good news is that NetLogo has most recently released the Beginner’s Interactive Dictionary, with friendly explanations, videos and code examples[ii].
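As a rough illustration of what a unit test for one piece of a model might look like, here is a minimal Python sketch using the standard `unittest` module. The movement rule `step_towards` is a hypothetical example, not taken from any model discussed here; the point is only that a small, isolated rule can be checked independently of the full simulation.

```python
import unittest


def step_towards(position, target, speed):
    """Move a 1-D agent towards target by at most `speed`, without overshooting."""
    distance = target - position
    if abs(distance) <= speed:
        return target
    return position + speed if distance > 0 else position - speed


class StepTowardsTest(unittest.TestCase):
    def test_moves_by_speed(self):
        self.assertEqual(step_towards(0, 10, 3), 3)

    def test_does_not_overshoot(self):
        self.assertEqual(step_towards(9, 10, 3), 10)

    def test_moves_in_negative_direction(self):
        self.assertEqual(step_towards(5, 0, 2), 3)
```

Saved in a file, the tests can be run with `python -m unittest`; a reader who cannot follow the whole model can still see from the test names what this rule is supposed to do.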

Third is to maintain version control. In terms of sustainability, researchers should be aware of software maintenance. Much software relies on libraries and packages that are built for a particular version. If the software is upgraded and no longer accepts the previous versions, then the package developers need to keep updating their packages to run on the new version. For example, NetLogo 6.0 changed significantly compared to versions 5.X. The biggest change was the replacement of tasks[iii] by anonymous procedures (Wilensky, 1999). This means that tasks are no longer primitives but are written with the arrow syntax. For example, to add up the elements of the list [1 2 3], version 5 would write reduce [ ?1 + ?2 ] [1 2 3], while version 6 does the same job with reduce [ [a b] -> a + b ] [1 2 3]. If a model has not been converted to the new version, it can be viewed as a read-only model but cannot be executed. Geospatial packages in R such as rgdal and sf have also struggled whenever a major update was made to the packages themselves or to R itself, because of their many dependencies. Even ArcGIS, a UI (User Interface) based software package, had issues when it was upgraded from version 9.3 to 10. Projects scripted in VBA under 9.3 broke because the scripts were not recognised in the new version, which is based on Python. This is another example of why backward compatibility and deprecation mechanisms are important.
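One modest way to guard against the version problems described above is to record the exact versions a result was produced with. The following Python sketch uses only the standard library (`platform` and `importlib.metadata`); the function name `describe_environment` and the shape of the returned dictionary are assumptions made for illustration.

```python
import platform
from importlib import metadata


def describe_environment(package_names):
    """Return the Python version and the installed version of each package.

    Packages that are not installed are recorded as None rather than
    raising an error, so the report can still be written.
    """
    versions = {}
    for name in package_names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return {"python": platform.python_version(), "packages": versions}
```

Calling, say, `describe_environment(["numpy", "pandas"])` at the start of a run and saving the result next to the outputs gives later readers a starting point for rebuilding the same environment.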

Lastly, for more advanced users, it is also recommended to use a collaborative platform that re-executes the code with the exact versions used to produce the results. One such platform is Code Ocean. The Nature research team has recently chosen the platform for peer-reviewing code (Perkel, 2019). The Nature editors and peer reviewers strongly believe that coding has become a norm across many disciplines, and hence have asserted that the modelling process, including the quality of data, conciseness, reproducibility, and documentation of the model, should be a requirement. Although the training can be difficult at first, it will lead researchers to work with more responsibility.

Looking for Opinions

With the advent of the era of big data and data science, where people collaborate online and the ‘sharing is caring’ atmosphere has become a norm (Arribas-Bel et al., 2021; Lovelace, 2021), I maintain that open research should no longer be optional. However, one may argue that, although open research is an excellent model that can benefit many of today’s projects, there are certain risks that might concern ECRs, such as intellectual property issues, code quality and technical security. Thus, if you have a different opinion on this issue, or would simply like to add your experiences from your PhD in social simulation, please add your thoughts via a thread.

Notes

[i] http://ccl.northwestern.edu/netlogo/docs/faq.html#how-big-can-my-model-be-how-many-turtles-patches-procedures-buttons-and-so-on-can-my-model-contain

[ii] https://ccl.northwestern.edu/netlogo/bind/

[iii] Tasks can be equations, x + y, or a set of lists [1 2 3 4 5]

References

Arribas-Bel, D., Alvanides, S., Batty, M., Crooks, A., See, L., & Wolf, L. (2021). Urban data/code: A new EP-B section. Environment and Planning B: Urban Analytics and City Science, 23998083211059670. https://doi.org/10.1177/23998083211059670

Lovelace, R. (2021). Open source tools for geographic analysis in transport planning. Journal of Geographical Systems, 23(4), 547–578. https://doi.org/10.1007/s10109-020-00342-2

Perkel, J. M. (2019). Make code accessible with these cloud services. Nature, 575(7781), 247. https://doi.org/10.1038/d41586-019-03366-x

Salecker, J., Sciaini, M., Meyer, K. M., & Wiegand, K. (2019). The nlrx r package: A next-generation framework for reproducible NetLogo model analyses. Methods in Ecology and Evolution, 10(11), 1854–1863. https://doi.org/10.1111/2041-210X.13286

Shin, H. (2021). Assessing Health Vulnerability to Air Pollution in Seoul Using an Agent-Based Simulation. University of Cambridge. https://doi.org/10.17863/CAM.65615

Wilensky, U. (1999). NetLogo. Northwestern University: Evanston, IL, USA. https://ccl.northwestern.edu/netlogo/

Shin, H. (2021) Benefits of Open Research in Social Simulation: An Early-Career Researcher’s Perspective. Review of Artificial Societies and Social Simulation, 24th Nov 2021. https://rofasss.org/2021/11/23/benefits-open-research/

Content

Reply to Frank Dignum

November 10, 2021

By Edmund Chattoe-Brown

This is a reply to Frank Dignum’s reply (about Edmund Chattoe-Brown’s review of Frank’s book)

As my academic career continues, I have become more and more interested in the way that people justify their modelling choices. For example, almost every Agent-Based Modeller makes approving noises about validation (in the sense of comparing real and simulated data) but only a handful actually try to do it (Chattoe-Brown 2020). Thus I think two specific statements that Frank makes in his response should be considered carefully:

“… we do not claim that we have the best or only way of developing an Agent-Based Model (ABM) for crises.” Firstly, negative claims (“This is not a banana”) are not generally helpful in argument. Secondly, readers want to know (or should want to know) what is being claimed and, importantly, how they would decide if it is true “objectively”. Given how many models sprang up under COVID it is clear that what is described here cannot be the only way to do it but the question is how do we know you did it “better?” This was also my point about institutionalisation. For me, the big lesson from COVID was how much the automatic response of the ABM community seems to be to go in all directions and build yet more models in a tearing hurry rather than synthesise them, challenge them or test them empirically. I foresee a problem both with this response and our possible unwillingness to be self-aware about it. Governments will not want a million “interesting” models to choose from but one where they have externally checkable reasons to trust it and that involves us changing our mindset (to be more like climate modellers for example, Bithell & Edmonds 2020). For example, colleagues and I developed a comparison methodology that allowed for the practical difficulties of direct replication (Chattoe-Brown et al. 2021).
The second quotation which amplifies this point is: “But we do think it is an extensive foundation from which others can start, either picking up some bits and pieces, deviating from it in specific ways or extending it in specific ways.” Again, here one has to ask the right question for progress in modelling. On what scientific grounds should people do this? On what grounds should someone reuse this model rather than start their own? Why isn’t the Dignum et al. model built on another “market leader” to set a good example? (My point about programming languages was purely practical not scientific. Frank is right that the model is no less valid because the programming language was changed but a version that is now unsupported seems less useful as a basis for the kind of further development advocated here.)

I am not totally sure I have understood Frank’s point about data so I don’t want to press it but my concern was that, generally, the book did not seem to “tap into” relevant empirical research (and this is a wider problem that models mostly talk about other models). It is true that parameter values can be adjusted arbitrarily in sensitivity analysis but that does not get us any closer to empirically justified parameter values (which would then allow us to attempt validation by the “generative methodology”). Surely it is better to build a model that says something about the data that exists (however imperfect or approximate) than to rely on future data collection or educated guesses. I don’t really have the space to enumerate the times the book said “we did this for simplicity”, “we assumed that” etc. but the cumulative effect is quite noticeable. Again, we need to be aware of the models which use real data in whatever aspects and “take forward” those inputs so they become modelling standards. This has to be a collective and not an individualistic enterprise.

References

Bithell, M. and Edmonds, B. (2020) The Systematic Comparison of Agent-Based Policy Models – It’s time we got our act together!. Review of Artificial Societies and Social Simulation, 11th May 2021. https://rofasss.org/2021/05/11/SystComp/

Chattoe-Brown, E. (2020) A Bibliography of ABM Research Explicitly Comparing Real and Simulated Data for Validation. Review of Artificial Societies and Social Simulation, 12th June 2020. https://rofasss.org/2020/06/12/abm-validation-bib/

Chattoe-Brown, E. (2021) A review of “Social Simulation for a Crisis: Results and Lessons from Simulating the COVID-19 Crisis”. Journal of Artificial Societies and Social Simulation. 24(4). https://www.jasss.org/24/4/reviews/1.html

Chattoe-Brown, E., Gilbert, N., Robertson, D. A., & Watts, C. J. (2021). Reproduction as a Means of Evaluating Policy Models: A Case Study of a COVID-19 Simulation. medRxiv 2021.01.29.21250743; DOI: https://doi.org/10.1101/2021.01.29.21250743

Dignum, F. (2020) Response to the review of Edmund Chattoe-Brown of the book “Social Simulations for a Crisis”. Review of Artificial Societies and Social Simulation, 4th Nov 2021. https://rofasss.org/2021/11/04/dignum-review-response/

Dignum, F. (Ed.) (2021) Social Simulation for a Crisis: Results and Lessons from Simulating the COVID-19 Crisis. Springer. DOI:10.1007/978-3-030-76397-8

Chattoe-Brown, E. (2021) Reply to Frank Dignum. Review of Artificial Societies and Social Simulation, 10th November 2021. https://rofasss.org/2021/11/10/reply-to-dignum/

Content

Response to the review of Edmund Chattoe-Brown of the book “Social Simulations for a Crisis”

November 4, 2021

By Frank Dignum

This is a reply to a review in JASSS (Chattoe-Brown 2021) of (Dignum 2021).

Before responding to some of Edmund’s specific concerns, I would like to thank him for the thorough review. I am especially happy with his conclusion that the book is solid enough to make it a valuable contribution to scientific progress in modelling crises. That was the main aim of the book and it seems that it has been achieved. I want to reiterate what we already remarked in the book: we do not claim that we have the best or only way of developing an Agent-Based Model (ABM) for crises. Nor do we claim that our simulations were without limitations. But we do think it is an extensive foundation from which others can start, either picking up some bits and pieces, deviating from it in specific ways or extending it in specific ways.

The concerns that are expressed by Edmund are certainly valid. I agree with some of them, but will nuance some others. First of all, there is the concern that we seem to abandon the NetLogo implementation and move to Repast. This fact does not make the ABM itself any less valid! In itself it is also an important finding: it is not possible to scale such a complex model in NetLogo beyond around two thousand agents. This is not just a limitation of our particular implementation, but a more general limitation of the platform. It leads to the important challenge of getting more computer scientists involved in developing platforms for social simulations that both support the modelers adequately and provide efficient and scalable implementations.

It is completely true that the sheer size of the model and the results makes it difficult to trace back the importance and validity of every factor for the results. We have tried our best to highlight the most important aspects every time. But this leaves questions as to whether we made the right selection of highlighted aspects. As an illustration of this, we spent two months justifying the results of our simulations of the effectiveness of the track and tracing apps. We basically concluded that we need much better integrated analysis tools in the simulation platform. NetLogo is geared towards creating one simulation scenario, running the simulation and analyzing the results based on a few parameters. This is no longer sufficient when we have a model with which we can create many scenarios and that has many parameters influencing a result. We used R to interpret the flood of data that was produced with every scenario. But R is not really the most user-friendly tool, nor is it specifically meant for analyzing the data from social simulations.

Let me jump to Edmund’s third concern and link it to the analysis of the results as well. While we tried to justify the results of our simulation of the effectiveness of the track and tracing app, we compared our simulation with an epidemiologically based model. This is described in chapter 12 of the book. Here we encountered a difference in the assumed number of contacts that a person has per day with other persons. One can take the results from empirical work, as quoted by Edmund as well, of 8 or 13 and use them in the model. However, the dispute is not about the number of contacts a person has per day, but about what counts as a contact! For the COVID-19 simulations, standing next to a person in the queue in a supermarket for five minutes can count as a contact, while such a contact is not a meaningful contact in the cited literature. Thus, we see that what we take as empirically validated numbers might not be the right ones at all for our purpose. We have tried to justify all the values of parameters and outcomes in the context for which the simulations were created. We have also done quite a few sensitivity analyses, not all of which we reported, just to keep the volume of the book to a reasonable size. Although we think we did a proper job in justifying all results, that does not mean that one cannot have different opinions on the value that some parameters should have. It would be very good to check the influence on the results of changes in these parameters. This would also advance scientific insight into the usefulness of complex models like the one we made!

I really think that an ABM crisis response should be institutional. That does not mean that one institution determines the best ABM, but rather that the ABM that is put forward by that institution is the result of a continuous debate among scientists working on ABMs for that type of crisis. For us, one of the more important outcomes of the ASSOCC project is that we really need much better tools to support the types of simulations that are needed for a crisis situation. However, it is very difficult to develop these tools as a single group. A lot of the effort needed is not publishable and thus not valued in an academic environment. I really think that the efforts that have been put into platforms such as NetLogo and Repast are laudable. They have been made possible by some generous grants and institutional support. We argue that this continuous support is also needed in order to be well equipped for a next crisis. But we do not argue that an institution would by definition have the last word on which is the best ABM. In an ideal case it would accumulate all academic efforts, as is done in the climate models, but even more restricted models would still be better than just having a thousand individuals all claiming to have a usable ABM while governments have to react quickly to a crisis.

Edmund’s final concern is about the empirical scale of our simulations. This is completely true! Given the scale and details of what we can incorporate, we can only simulate some phenomena and certainly not everything around the COVID-19 crisis. We tried to be clear about this limitation. We had discussions about the Unity interface concerning this as well. It is in principle not very difficult to show people walking in the street, taking a car or a bus, etc. However, we decided to show a more abstract representation, just to make clear that our model is not a complete model of a small town functioning in all aspects. We have very carefully chosen the scenarios that we can realistically simulate and that give some insight into reality. Maybe we should also have discussed more explicitly all the scenarios that we did not run, with the reasons why they would be difficult or unrealistic in our ABM. One never likes to discuss all the limitations of one’s labor, but it definitely can be very insightful. I have made up for this a little bit by submitting an article to a special issue on predictions with ABM, in which I explain in more detail what the considerations should be when using a particular ABM to try to predict some state of affairs. Anyone interested in learning more about this can contact me.

To conclude this response to the review, I again express my gratitude for the good and thorough work done. The concerns that were raised are all very valuable to consider. What I tried to do in this response is to highlight that these concerns should be taken as a call to arms to put effort into social simulation platforms that give better support for creating simulations for a crisis.

References

Dignum, F. (Ed.) (2021) Social Simulation for a Crisis: Results and Lessons from Simulating the COVID-19 Crisis. Springer. DOI:10.1007/978-3-030-76397-8

Dignum, F. (2021) Response to the review of Edmund Chattoe-Brown of the book “Social Simulations for a Crisis”. Review of Artificial Societies and Social Simulation, 4th Nov 2021. https://rofasss.org/2021/11/04/dignum-review-response/
