Tag Archives: Empirical Agent-Based Modelling

Why we are failing at connecting opinion dynamics to the empirical world

By Dino Carpentras

ETH Zürich – Department of Humanities, Social and Political Sciences (GESS)

The big mystery

Opinion dynamics (OD) is field dedicated to studying the dynamic evolution of opinions and is currently facing some extremely cryptic mysteries. Since 2009 there have been multiple calls for OD models to be strongly grounded in empirical data (Castellano et al., 2009, Valori et al., 2012, Flache et al., 2017; Dong et al., 2018), however the number of articles moving in this direction is still extremely limited. This is especially puzzling when compared with the increase in the number of publications in this field (see Fig 1). Another surprising issue, which extends also beyond OD, is that validated models are not cited as often as we would expect them to be (Chattoe-Brown, 2022; Kejjzer, 2022).

Some may argue that this could be explained by a general lack of people interested in the empirical side of opinion dynamics. However, the World seems in desperate need of empirically grounded OD models that could help us in shape policies on topics such as vaccination and climate change. Thus, it is very surprising to see that almost nobody is interested in meeting such a big and pressing demand.

In this short piece, I will share my experience both as a writer and as a reviewer for empirical OD papers, as well as the information I gathered from discussions with other researchers in similar roles. This will help us understand much better what is going on in the world of empirical OD and, more in general, in the empirical parts of agent-based modelling (ABM) related to psychological phenomena.

fig 1 rofasss

Publications containing the term “opinion dynamics” in abstract or title. Total 2,527. Obtained from dimensions.ai

Theoretical versus empirical OD

The main issue I have noticed with works in empirical OD is that these papers do not conform to the standard framework of ABM papers. Indeed, in “classical” ABM we usually try to address research questions like:

  1. Can we develop a toy model to show how variables X and Y are linked?
  2. Can we explain some macroscopic phenomenon as the result of agents’ interaction?
  3. What happens to the outputs of a popular model if we add a new variable?

However, empirical papers do not fit into this framework. Indeed, empirical ABM papers ask questions such as:

  1. How accurate are the predictions made by a certain model when compared with data?
  2. How close is the micro-dynamic to the experimental data?
  3. How can we refine previous models to improve their predicting ability?

Unfortunately, many reviewers do not view the latter questions as genuine research inquiries, ending up in pushing the authors to modify their papers to meet the first set of questions.

For instance, my empirical works often receive the critique that “the research question is not clear”, even though the question was explicitly stated in the main text, abstract and even in the title (See, for example “Deriving An Opinion Dynamics Model From Experimental Data”, Carpentras et al. 2022). Similarly, once a reviewer acknowledged that the experiment presented in the paper was an interesting addition to it, but they requested me to demonstrate why it was useful. Notice that, also in this case, the paper was on developing a model from the dynamical behavior observed in an experiment; therefore, the experiment was not just “an add on”, but core of the paper. I also have reviewed some empirical OD papers where the authors are asked, by other reviewers, to showcase how their model informs us about the world in a novel way.

As we will see in a moment, this approach does not just make authors’ life harder, but it also generates a cascade of consequences on the entire field of opinion dynamics. But to better understand our world, let us move first to a fictitious scenario.

A quick tale of natural selection of researcher

Let us now imagine a hypothetical world where people have almost no knowledge of the principles of physics. However, to keep the thought experiment simple, let us also suppose they have already developed the peer-review process. Of course, this fictious scenario is far from being realistic, but it should still help us understand what is going on with empirical OD.

In this world, a scientist named Alice writes a paper suggesting that there is an upward force when objects enter water. She also shows that many objects can float on water, therefore “validating” her model. The community is excited about this new paper which took Alice 6 months to write.

Now, consider another scientist named Bob. Bob, inspired by Alice’s paper, in 6 months conducts a series of experiments demonstrating that when an object is submerged in water, it experiences an upward force that is proportional to its submerged volume. This pushes knowledge forward as Bob does not just claim that this force exists, but he shows how this force has some clear quantitative relationship to the volume of the object.

However, when reviewers read Bob’s work, they are unimpressed. They question the novelty of his research and fail to see the specific research question he is attempting to address. After all, Alice already showed that this force exists, so what is new in this paper? One of the reviewers suggests that Bob should show how his study may impact their understanding of the world.

As a result, Bob spends an additional six months to demonstrate that he could technically design a floating object made out of metal (i.e. a ship). He also describes the advantages for society if such an object was invented. Unfortunately, one of the reviewers is extremely skeptical as metal is known to be extremely heavy and should not float in water, and requests additional proof.

After multiple revisions, Bob’s work is eventually published. However, the publication process takes significantly longer than Alice’s work, and the final version of the paper addresses a variety of points, including empirical validation, the feasibility of constructing a metal boat, and evidence to support this claim. Consequently, the paper becomes densely technical, making it challenging for most people to read and understand.

At the end, Bob is left with a single paper which is hardly readable (and therefore citable), while Alice, in the meanwhile, published many other easier-to-read papers having a much bigger impact.

Solving the mystery of empirical opinion dynamics

The previous sections helped us in understanding the following points: (1) validation and empirical grounding are often not seen as a legitimate research goal by many members of the ABM community. (2) This leads to bigger struggle when trying to publish this kind of research, and (3) reviewers often try to push the paper into the more classic research questions, possibly resulting in a monster-paper which tries to address multiple points all at once. (4) This also generates lower readability and so less impact.

So to sum it up: empirical OD gives you the privilege of working much more to obtain way less. This, combined with the “natural selection” of the “publish or perish” explains the scarcity of publications in this field, as authors need either to adapt to more standard ABM formulas or to “perish.” I also personally know an ex-researcher who tried to publish empirical OD until he got fed up and left the field.

Some clarifications

Let me make clear that this is a bit of a simplification and that, of course, it is definitely possible to publish empirical work in opinion dynamics even without “perishing.” However, choosing this instead of the traditional ABM approach strongly enhances the difficulty. This is a little like running while carrying extra weight: it is still possible that you will win the race, but the weight strongly decreases the probability of this happening.

I also want to say that while here I am offering an explanation of the puzzles I presented, I do not claim that this is the only possible explanation. Indeed, I am sure that what I am offering here is only part of the full story.

Finally, I want to clarify that I do not believe anyone in the system has bad intentions. Indeed, I think reviewers are in good faith when suggesting empirically-oriented papers to take a more classical approach. However, even with good intentions, we are creating a lot of useless obstacles for an entire research field.

Trying to solve the problem

To address this issue, in the past I have suggested dividing ABM researchers into theoretical and empirically oriented (Carpentras, 2020). The division of research into two streams could help us in developing better standards for both developing toy models and for empirical ABMs.

To give you a practical example, my empirical ABM works usually receive long and detailed comments about the model properties and almost no comment on the nature of the experiment or data analysis. Am I that good in these last two steps? Or maybe reviewers in ABM focus very little on the empirical side of empirical ABMs? While the first explanation would be flattering for me, I am afraid that the reality is better depicted by the second option.

With this in mind, together with other members of the community, we have created a special interest group for Experimental ABM (see http://www.essa.eu.org/sig/sig-experimental-abm/). However, for this to be successful, we really need people to recognize the distinction between these two fields. We need to acknowledge that empirically-related research questions are still valid and not push papers towards the more classical approach.

I really believe empirical OD will raise, but how this will happen is still to decide. Will it be at the cost of many researchers facing bigger struggle or will we develop a more fertile environment? Or maybe some researchers will create an entire new niche outside of the ABM community? The choice is up to us!


Carpentras, D., Maher, P. J., O’Reilly, C., & Quayle, M. (2022). Deriving An Opinion Dynamics Model From Experimental Data. Journal of Artificial Societies & Social Simulation, 25(4). https://www.jasss.org/25/4/4.html

Carpentras, D. (2020) Challenges and opportunities in expanding ABM to other fields: the example of psychology. Review of Artificial Societies and Social Simulation, 20th December 2021. https://rofasss.org/2021/12/20/challenges/

Castellano, C., Fortunato, S., & Loreto, V. (2009). Statistical physics of social dynamics. Reviews of modern physics, 81(2), 591. DOI: 10.1103/RevModPhys.81.591

Chattoe-Brown, E. (2022). If You Want To Be Cited, Don’t Validate Your Agent-Based Model: A Tentative Hypothesis Badly In Need of Refutation. Review of Artificial Societies and Social Simulation, 1 Feb 2022. https://rofasss.org/2022/02/01/citing-od-models/

Dong, Y., Zhan, M., Kou, G., Ding, Z., & Liang, H. (2018). A survey on the fusion process in opinion dynamics. Information Fusion, 43, 57-65. DOI: 10.1016/j.inffus.2017.11.009

Flache, A., Mäs, M., Feliciani, T., Chattoe-Brown, E., Deffuant, G., Huet, S., & Lorenz, J. (2017). Models of social influence: Towards the next frontiers. Journal of Artificial Societies and Social Simulation, 20(4). https://www.jasss.org/20/4/2.html

Keijzer, M. (2022). If you want to be cited, calibrate your agent-based model: a reply to Chattoe-Brown. Review of Artificial Societies and Social Simulation.  9th Mar 2022. https://rofasss.org/2022/03/09/Keijzer-reply-to-Chattoe-Brown

Valori, L., Picciolo, F., Allansdottir, A., & Garlaschelli, D. (2012). Reconciling long-term cultural diversity and short-term collective social behavior. Proceedings of the National Academy of Sciences, 109(4), 1068-1073. DOI: 10.1073/pnas.1109514109

Carpentras, D. (2023) Why we are failing at connecting opinion dynamics to the empirical world. Review of Artificial Societies and Social Simulation, 8 Mar 2023. https://rofasss.org/2023/03/08/od-emprics

© The authors under the Creative Commons’ Attribution-NoDerivs (CC BY-ND) Licence (v4.0)

Antisocial simulation: using shared high-performance computing clusters to run agent-based models

By Gary Polhill

Information and Computational Sciences Department, The James Hutton Institute, Aberdeen AB15 8QH, UK.

High-performance computing (HPC) clusters are increasingly being used for agent-based modelling (ABM) studies. There are reasons why HPC provides a significant benefit for ABM work, and to expect a growth in HPC/ABM applications:

  1. ABMs typically feature stochasticity, which require multiple runs using the same parameter settings and initial conditions to ascertain the scope of the behaviour of the model. The ODD protocol has stipulated the explicit specification of this since it was first conceived (Grimm et al. 2006). Some regard stochasticity as ‘inelegant’ and to be avoided in models, but asynchrony in agents’ actions can avoid artefacts (results being a ‘special case’ rather than a ‘typical case’) and introduces an extra level of complexity affecting the predictability of the system even when all data are known (Polhill et al. 2021).
  2. ABMs often have high-dimensional parameter spaces, which need to be sampled for sensitivity analyses and, in the case of empirical ABMs, for calibration and validation. The so-called ‘curse of dimensionality’ means that the problem of exploring parameter space grows exponentially with the number of parameters. While ABMs’ parameters may not all be ‘orthogonal’ (i.e. each point in parameter space does not uniquely specify model behaviour – a situation sometimes referred to as ‘equifinality’), diminishing the ‘curse’, the exponential growth means the challenge of parameter search does not need many dimensions before it becomes intractable exhaustively.
  3. Both the above points are exacerbated in empirical applications of ABMs given Sun et al.’s (2016) observations about the ‘medawar zone’ of model complicatedness in relation to that of theoretical models. In empirical applications, we also may be more interested in knowing that an undesirable outcome cannot occur, or has a very low probability of occurring, requiring more runs with the same conditions. Further, the additional complicatedness of empirical ABM will entail more parameters, and the empirical application will place greater emphasis on searching parameter space for calibrating and validating to data.

HPC clusters are shared computing resources, and it is now commonplace for research organizations and universities to have them. There can be few academic disciplines without some sort of scientific computing requirement – typical applications include particle physics, astronomy, meteorology, materials, chemistry, neuroscience, medicine and genetics. And social science. As a shared resource, an HPC cluster is subject to norms and institutions frequently observed in common-pool resource dilemmas. Users of HPC clusters are asked to request allocations of computing time, memory and long-term storage space to accommodate their needs. The requests are made in advance of the runs being executed; sometimes so far in advance that the calculations form part of the research project proposal. Hence, as a user, if you do not know, or cannot calculate, the resources you will require, you have a dilemma: ask for more than it turns out you really need and risk normative sanctions; or ask for less than it turns out you really need and impair the scientific quality of your research. Normative sanctions are in the job description of the HPC cluster administrator. This can lead to emails such as those in Figure 1.

Can I once again remind everyone to please be sensible (and considerate) in your allocation of memory for jobs on the cluster. We now have a situation on the cluster where jobs are unable to run because large amounts of memory have been requested yet only a tiny amount is actually active - check the attached image, where light green shows allocated and dark green shows used. Over allocating resources can block the cluster for others, as well as waste a huge amount of energy as additional machines need to power up unnecessarily. Picture 1b

Figure 1: Example email and accompanying visualization from an HPC cluster administrator reminding users that it is antisocial to request more resources than you will use when submitting jobs.

The ‘managerialist’ turn in academia has been lamented in various articles. Kolsaker (2008), while presenting a nuanced view of the relationship between managerialist and academic modes of working, says that “managerialism represents a distinctive discourse based upon a set of values that justify the assumed right of one group to monitor and control the activities of others.” Steinþórsdóttir et al. (2019) note in the abstract to their article that their results from a case study in Iceland support arguments that managerialism discriminates against women and early-career researchers, in part because of a systemic bias towards natural sciences. Both observations are relevant in this context.

Measurement and control as the tools of managerialist conduct renders Goodhart’s Law (the principle that when a metric becomes a target, the metric is useless) relevant. Goodhart’s Law has been found to have led to bibliometrics now being useless for comparing researchers’ performance – both within and between departments (Fire and Guestrin 2019). We may therefore expect that if an HPC cluster’s administrator has the accurate prediction of computing resource as a target for their own performance assessment, or if they give it as a target for users – e.g. by prioritizing jobs submitted by users on the basis of the accuracy of their predicted resource use, or denying access to those consistently over-estimating requirements – this accuracy will be useless. To give a concrete example, programming languages such as C give the programmer direct control over memory allocation. Hence, were access to an HPC conditional on the accurate prediction of memory allocation requirements, a savvy C programmer would have the (excessive) memory allowance in the batch job submission as a command-line argument to their program, which on execution would immediately request that allocation from the server’s operating system. The rest of the program would use bespoke memory allocation functions that allocated the memory the program actually needed from the memory initially reserved. Similar principles can be used for CPU cycles – if the program runs too quickly, then calculate digits of π until the predicted CPU time has elapsed; and disk space – if too much disk space has been requested, then pad files with random data. These activities waste the programmer’s time, and entail additional use of computing resources with energy cost implications for the cluster administrator.

With respect to the normative statements such as those in Figure 1, Griesemer (2020, p. 77), discussing the use of metrics leading to ‘gaming the system’ in academia generally (the savvy C programmer’s behaviour being an example in the context of HPC usage) claims that “it is … problematic to moralize and shame [such] practices as if it were clear what constitutes ethical … practice in social worlds where Goodhart’s law operates” [emphasis mine]. In computer science, however, there are theoretical (in the mathematical sense of the term) reasons why such norms are problematic over-and-above the social context of measurement-and-control.

The theory of computer science is founded in mathematics and logic, and the work of notable thinkers such as Gödel, Turing, Hilbert, Kolmogorov, Chomsky, Shannon, Tarski, Russell and von Neumann. The growth in areas of computer science (e.g. artificial intelligence, internet-of-things) means that undergraduate degrees have increasingly less space to devote to teaching this theory. Blumenthal (2021, p. 46), comparing computer science curricula in 2014 and 2021, found that the proportion of courses with required modules on computational theory had dropped from 46% to 40%, though the sample size meant this result was not significant (P = 0.09 under a two-population z-test). Similarly, the time dedicated to algorithmics and complexity in CS2013 fell to 28 (of which 19 are ‘tier-1’ – required of every curriculum; and 9 are ‘tier-2’ – in which 80% topic coverage is the stipulated minimum) from 31 in CS2008 (Joint Task Force on Computing Curricula 2013).

One of the most critical theoretical results in computer science is the so-called Halting Problem (Turing 1937), which proves that it is impossible to write a computer program that (in the general case) takes as input another computer program and its input data and gives as output whether the latter program will halt or run forever. The halting problem is ‘tier-1’ in CS2013, and so should be taught to every computer scientist. Rice (1953) generalized Turing’s finding to prove that any ‘non-trivial’ properties of computer programs could not be decided algorithmically. These results mean that the automated job scheduling and resource allocation algorithms in HPC, such as SLURM (Yoo et al. 2003), cannot take a user’s submitted job as input and calculate the computing resources it will need. Any requirement for such prediction is thus pushed to the user. In the general case, this means users of HPC clusters are being asked to solve formally undecidable problems when submitting jobs. Qualified computer scientists should know this – but possibly not all cluster administrators, and certainly not all cluster users, are qualified computer scientists. The power dynamic implied by Kolsaker’s (2008) characterization of a managerialist working culture puts users as a disadvantage, while Steinþórsdóttir et al.’s (2019) observations suggest this practice may be indirectly discriminatory on the basis of age and gender; the latter particularly when social scientists are seeking access to shared HPC facilities.

I emphasized ‘in the general case’ above because in many specific cases, computing resources can be accurately estimated. Sorting a list of strings in alphabetical order, for example is known to grow in execution time with as a function of n log n, where n is the length of the list. Integers can even be sorted in linear time, but with demands on memory that are exponential in the number of bits used to store an integer (Andersson et al. 1998).

However, agent-based modellers should not expect to be so lucky. There are various features that ABMs may implement that make their computing resources difficult (perhaps impossible) to predict:

  • Birth and death of agents can render computing time and memory requirements difficult to predict. Indeed, the size of the population and any fluctuation in it may be the purpose of the simulation study. With each agent having memory needed to store its attributes, and execution time for its behaviour, if the maximum population size of a specific run is not predictable from its initial conditions and parameter settings without first running the model, then computing resources cannot be predicted for HPC job submission.
    • A more dramatic corollary of birth and death is the question of extinction – i.e. where all agents die before they can reproduce. At this point, a run would typically terminate – far sooner than the computing time budgeted.
  • Interactions among agents, where the set of other agents with which one agent interacts is not predetermined, will also typically result in unpredictable computing times, even if the time needed for any one interaction is known. In some cases, agents’ social networks may be formally represented using data structures (‘links’ in NetLogo), and if these connections can be created or destroyed as a result of the model’s dynamics, then the memory requirements will typically be unpredictable.
  • Memories of agents, where implemented, are most trivially stored in lists that may have arbitrary length. The algorithms implementing the agents’ behaviours that use their memories will have computing times that are a function of the list length at any one time. These lists may not have a predictable length (e.g. if the agent ‘forgets’ some memories) and hence their behavioural algorithms won’t have predictable execution time.
  • Gotts and Polhill (2010) have shown that running a specific model with larger spaces led to qualitatively different results than with smaller spaces. This suggests that smaller (personal) computers (such as desktops and laptops) cannot necessarily be used to accurately estimate execution times and memory requirements prior to submitting larger-scale simulations requiring resources only available on HPC clusters.

Worse, a job will typically comprise several runs in a ‘batch’ covering multiple parameter settings and/or initial conditions. Even if the maximum time and memory requirements of any of the runs in a batch were known, there is no guarantee that all of the other runs will use anything like as much. These matters combine to make agent-based modellers ‘antisocial’ users of HPC clusters where the ‘performance’ of the clusters’ users is measured by their ability to accurately predict resource requirements, or there isn’t an ‘accommodating’ relationship between the administrator and researcher. Further, the social environment in which researchers access these resources put early-career and female researchers at a potential systemic disadvantage

The main purpose of making these points is to lay down the foundations for more equitable access to HPC for social scientists, and provide tentative users of these facilities with the arguments they need to develop constructive working arrangements with cluster administrators for them to run their agent-based models on shared HPC equipment.


This work was supported by the Scottish Government Rural and Environment Science and Analytical Services Division (project reference JHI-C5-1)


Andersson, A., Hagerup, T., Nilsson, S. and Raman, R. (1998) Sorting in linear time? Journal of Computer and System Sciences 57, 74-93. https://doi.org/10.1006/jcss.1998.1580

Blumenthal, R. (2021) Walking the curricular talk: a longitudinal study of computer science departmental course requirements. In Lu, B. and Smallwood, P. (eds.) The Journal of Computing Sciences in Colleges: Papers of the 30th Annual CCSC Rocky Mountain Conference, October 15th-16th, 2021, Utah Valley University (virtual), Orem, UT. Volume 37, Number 2, pp. 40-50.

Fire, M. and Guestrin, C. (2019) Over-optimization of academic publishing metrics: observing Goodhart’s Law in action. GigaScience 8 (6), giz053. https://doi.org/10.1093/gigascience/giz053

Gotts, N. M. and Polhill, J. G. (2010) Size matters: large-scale replications of experiments with FEARLUS. Advances in Complex Systems 13 (4), 453-467. https://doi.org/10.1142/S0219525910002670

Griesemer, J. (2020) Taking Goodhart’s Law meta: gaming, meta-gaming, and hacking academic performance metrics. In Kippmann, A. and Biagioli, M. (eds.) Gaming the Metrics: Misconduct and Manipulation in Academic Research. Cambridge, MA, USA: The MIT Press, pp. 77-87.

Grimm, V., Berger, U., Bastiansen, F., Eliassen, S., Ginot, V., Giske, J., Goss-Custard, J., Grand, T., Heinz, S. K., Huse, G., Huth, A., Jepsen, J. U., Jørgensen, C., Mooij, W. M., Müller, B., Pe’er, G., Piou, C., Railsback, S. F., Robbins, A. M., Robbins, M. M., Rossmanith, E., Rüger, N., Strand, E., Souissi, S., Stillman, R. A., Vabø, R., Visser, U. and DeAngelis, D. L. (2006) A standard protocol for describing individual-based and agent-based models. Ecological Modelling 198, 115-126. https://doi.org/10.1016/j.ecolmodel.2006.04.023

(The) Joint Task Force on Computing Curricula, Association for Computing Machinery (ACM) IEEE Computer Society (2013) Computer Science Curricula 2013: Curriculum Guidelines for Undergraduate Degree Programs in Computer Science. https://doi.org/10.1145/2534860

Kolsaker, A. (2008) Academic professionalism in the managerialist era: a study of English universities. Studies in Higher Education 33 (5), 513-525. https://doi.org/10.1080/03075070802372885

Polhill, J. G., Hare, M., Bauermann, T., Anzola, D., Palmer, E., Salt, D. and Antosz, P. (2021) Using agent-based models for prediction in complex and wicked systems. Journal of Artificial Societies and Social Simulation 24 (3), 2. https://doi.org/10.18564/jasss.4597

Rice, H. G. (1953) Classes of recursively enumerable sets and their decision problems. Transactions of the American Mathematical Society 74, 358-366. https://doi.org/10.1090/S0002-9947-1953-0053041-6

Steinþórsdóttir, F. S., Brorsen Smidt, T., Pétursdóttir, G. M., Einarsdóttir, Þ, and Le Feuvre, N. (2019) New managerialism in the academy: gender bias and precarity. Gender, Work & Organization 26 (2), 124-139. https://doi.org/10.1111/gwao.12286

Sun, Z., Lorscheid, I., Millington, J. D., Lauf, S., Magliocca, N. R., Groeneveld, J., Balbi, S., Nolzen, H., Müller, B., Schulze, J. and Buchmann, C. M. (2016) Simple or complicated agent-based models? A complicated issue. Environmental Modelling & Software 86, 56-67. https://doi.org/10.1016/j.envsoft.2016.09.006

Turing, A. M. (1937) On computable numbers, with an application to the Entscheidungsproblem. Proceedings of the London Mathematical Society s2-42 (1), 230-265. https://doi.org/10.1112/plms/s2-42.1.230

Yoo, A. B., Jette, M. A. and Grondona, M. (2003) SLURM: Simple Linux utility for resource management. In Feitelson, D., Rudolph, L. and Schwiegelshohn, U. (eds.) Job Scheduling Strategies for Parallel Processing. 9th International Workshop, JSSPP 2003, Seattle, WA, USA, June 2003, Revised Papers. Lecture Notes in Computer Science 2862, pp. 44-60. Berlin, Germany: Springer. https://doi.org/10.1007/10968987_3

Polhill, G. (2022) Antisocial simulation: using shared high-performance computing clusters to run agent-based models. Review of Artificial Societies and Social Simulation, 14 Dec 2022. https://rofasss.org/2022/12/14/antisoc-sim

© The authors under the Creative Commons’ Attribution-NoDerivs (CC BY-ND) Licence (v4.0)