Chapter 3 The biggest myth of object tracking
What I consider to be the biggest myth about object tracking involves three misconceptions:
- There is a fixed capacity limit of about four or five objects that can be tracked, after which performance falls rapidly.
- A softer version of the above claim: that performance falls to a particular level once the number of targets is increased to four or five objects.
- Different tasks show the same limit.
These three claims are widespread in the scholarly literature. One researcher writing about the “object tracking system” in 2010, for example, stated: “One of the defining properties of this system is that it is limited in capacity to three to four individuals at a time” (Piazza 2010). Similarly, Fougnie and Marois (2006) wrote that “People’s ability to attentively track a number of randomly moving objects among like distractors is limited to four or five items”. This idea is sometimes perpetuated with more ambiguous statements such as “participants can track about four objects simultaneously” (Van der Burg, Cass, and Theeuwes 2019).
Misconception #1 in my list above, with its idea of a sharp fall in performance after a limit, is one aspect of the statements quoted in the previous paragraph. It is fully explicit in Doran and Hoffman’s (2010) take on the literature, where they wrote that “the main finding” of the object tracking literature is that “observers can accurately track approximately four objects and that once this limit is exceeded, accuracy declines precipitously” (Doran and Hoffman 2010). Vaguer statements in other papers, such as “researchers have consistently found that approximately 4 objects can be tracked” (G. A. Alvarez and Franconeri 2007) and “people typically can track four or five items” (Chesney and Haladjian 2011), also bolster misconception #1 in the minds of readers.
To examine the evidence behind each of the quotations in the two preceding paragraphs, I checked the evidence provided, the papers cited, and the papers those cited papers cite. None of these papers contains evidence that performance decreases very rapidly once the number of targets is increased above some value. Instead, a gradual decrease in performance is seen as the number of targets increases, with no discontinuity, not even a conspicuous inflection point. For example, Oksama and Hyönä (2004), which is sometimes cited in this context, assessed performance with up to six targets. After a five-second phase of random motion of the multiple moving objects, one object was flashed repeatedly and participants hit a key to indicate whether they thought it was one of the targets. The proportion of trials that participants got wrong increased steadily with the number of targets, from 3% incorrect with two targets to 16% incorrect with six targets.
Although Z. W. Pylyshyn and Storm (1988) is the paper most frequently cited when a limit of four objects is claimed, even they found a quite gradual decrease in performance (their Figure 1) as the number of targets was increased from one to five (five targets was the most that they tested). And nowhere in their paper did Z. W. Pylyshyn and Storm (1988) state that there is a value beyond which performance rapidly declines. Six years later, however, Z. Pylyshyn et al. (1994) did write that it is “possible to track about four randomly moving objects”. By 2007, when he published his book Things and Places: How the Mind Connects with the World, Pylyshyn was writing sentences like “And as long as there are not more than 4 or 5 of these individuals the visual system can treat them as though it had a concept of ‘individual object’” (Z. W. Pylyshyn 2007). I suspect that this sort of slide toward endorsing a hard limit is caused in part by the desire for a simple story. It may also stem from an unconscious oversimplification of one’s own data, and/or Pylyshyn’s commitment to his theory that tracking is limited by a set of discrete mental pointers.
I have so far addressed only one aspect of misconception #1, the claim that there is a limit after which performance decreases rapidly. Another aspect of misconception #1 is that the limit is consistently found to be four or five. This isn’t viable if there is no limit after which performance decreases rapidly, but a researcher could retreat to misconception #2: the idea that tracking performance falls to some particular level at about four targets, even if this does not mark a hard limit (or even an inflection point). The particular performance level might be 75% correct, another criterion such as the halfway point between ceiling and chance (Alex O. Holcombe and Chen 2013), or the “effective number of items tracked”, calculated by applying a formula to percent correct together with the number of targets and distractors (B. J. Scholl, Pylyshyn, and Feldman 2001). In a charitable reading, this may be what researchers like G. A. Alvarez and Franconeri (2007) meant when they wrote phrases such as: “researchers have consistently found that approximately 4 objects can be tracked (Intriligator & Cavanagh, 2001; Pylyshyn & Storm, 1988; Yantis, 1992)”. The early studies cited may indeed be consistent with this statement, albeit not strongly supportive. However, work published over the last fifteen years has revealed this apparent agreement on a soft limit to be an accident of researchers using similar display and task parameters.
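To make the “effective number of items tracked” concrete, here is a minimal version of the guessing-correction logic behind such formulas (a sketch of the general approach; the exact formula of B. J. Scholl, Pylyshyn, and Feldman (2001) differs in details, such as how guesses are distributed over the remaining items). Assume the observer tracks $m$ of the $n$ targets perfectly and guesses about the rest, a guess being correct with probability $g$. The expected proportion correct is then

$$p = \frac{m}{n} + \left(1 - \frac{m}{n}\right)g,$$

which, solved for $m$, gives an estimate of the effective number tracked from observed accuracy:

$$m = n\,\frac{p - g}{1 - g}.$$

For instance (illustrative numbers only), with $n = 6$ targets, chance level $g = 0.5$, and observed accuracy $p = 0.84$, the estimate is $m = 6 \times 0.34/0.5 \approx 4$ items effectively tracked.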
One of the most important display parameters is object speed. Its influence was demonstrated in dramatic fashion by G. A. Alvarez and Franconeri (2007), who tested participants with a display of sixteen wandering discs. When the speed of the discs was very high, participants could, at the end of a trial, correctly pick out the targets only if there were just a few of them. But at very slow speeds, participants could track up to eight targets accurately. Thus, how many objects people can track is highly contingent on the speed of those objects. Additional evidence for this was found by others (Alex O. Holcombe and Chen 2012; Feria 2013), and other display parameters that strongly affect the number of objects that can be tracked were also discovered, in particular object spacing (S. L. Franconeri et al. 2008; Alex O. Holcombe, Chen, and Howe 2014).
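One way to see how a smoothly varying limit can masquerade as a fixed number is a toy resource model (my own illustrative sketch, not Alvarez and Franconeri’s analysis; all parameter values are assumptions): suppose a fixed pool of attentional resource, with each target demanding resource in proportion to its speed. The number of trackable targets then varies continuously with speed, and “four” emerges only at one particular speed.

```python
def max_targets_tracked(speed, total_resource=8.0, demand_per_unit_speed=1.0):
    """Toy model: each target demands resource proportional to its speed,
    and targets can be tracked only while their total demand fits within
    a fixed resource pool. All parameter values are illustrative assumptions."""
    demand_per_target = demand_per_unit_speed * speed
    return int(total_resource // demand_per_target)

for speed in [1, 2, 4, 8]:  # arbitrary speed units
    print(f"speed {speed}: ~{max_targets_tracked(speed)} targets trackable")
# Prints 8, 4, 2, 1 targets: the "limit" slides with speed,
# with no privileged status for four.
```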
In summary, it is incorrect to say that people can track about four moving objects, or even that once some number of targets (varying with circumstances) is reached, performance declines very rapidly with additional targets. The number that can be tracked is quite specific to the display arrangement, object spacing, and object speeds. If a researcher is tempted to write that “people can track about four objects”, then to reduce confusion I think they should stipulate that this refers to a particular combination of display characteristics and performance measures.
This issue of how to characterize a human cognitive limit has also bedeviled the study of short-term memory, a literature in which one of the most famous papers is titled “The magical number seven, plus or minus two: Some limits on our capacity for processing information” (Miller 1956). Two dozen working memory researchers convened in 2013 to identify empirical “benchmarks” for models of working memory. One issue they considered was how to talk about how many items people can remember. In the paper that they published in 2018, the researchers pointed out that “observed item limits vary substantially between materials and testing procedures” (Oberauer et al. 2018). They suggested, however, that much of this variability could be explained by humans’ ability to store groups of items as “chunks”, and thus the group endorsed a statement that there is a limit of “three to four chunks” (Cowan 2001). In the case of short-term memory, then, the observed variability in experiments’ results can potentially be explained by a common underlying limit of three to four chunks that manifests as different observed item limits depending on circumstances (in particular, the opportunity for chunking). Evidently there is no simple task parameter unrelated to chunking opportunity, analogous to object speed in MOT, that smoothly varies how many items people can remember over a wide range. However, whether there is an inflection in performance after four items, or at any point, remains debated (e.g. Robinson, Benjamin, and Irwin 2020).
Another strong candidate for a real capacity limit is the human ability to “subitize”, or judge nearly exactly the numerosity of a small collection of objects. For this task of reporting how many items are in a briefly-presented display, there really does seem to be an inflection point in accuracy when the number of objects shown goes from fewer than four to more than four (Revkin et al. 2008). Four objects and fewer is frequently referred to as the “subitizing range”, with performance approximately as good for rapidly counting four objects as for two or one. Note that this is very different from tracking, for which speed thresholds decline markedly from one target to two, and again from two to three and four.
In the case of MOT, it remains possible that researchers will identify a set of circumstances that consistently yields a mean tracking limit, in the modal human, of three or four targets, if a “limit” is defined as performance falling to a particular level on some performance metric. Probably these circumstances will simply be certain spacings, speeds, object trajectories, and numbers of objects in a display. It would be nice if some underlying construct, the counterpart of memory’s “chunks”, were identified that explains how performance changes when other circumstances are used. That would constitute real progress in theory development. However, I don’t see much prospect of that in the current literature.
3.1 Claim #3: Different tasks, same limit?
Even after discarding the idea that there is a particular number of objects that one can track, misconception #3 might still be viable. This claim is frequently tangled up in the myth reviewed above, and is sometimes stated as “there is a magical number four”. If we discard the idea of a specific number that does not vary with circumstances, there remains the notion that different tasks have the same number-of-objects limit when tested in comparable circumstances. For example, Bettencourt, Michalka, and Somers (2011) stated that visual short-term memory and MOT show “an equivalent four-object limit”, and Piazza (2010) similarly claimed that visuo-spatial short-term memory, ultra-rapid counting (subitizing), and multiple object tracking all share a common limit of “three or four items”. So far, however, there is no good evidence that object tracking has the same limit as visual working memory and subitizing.
Ideally, evidence for a common limit would come from measuring the limits for all three tasks using the same stimuli, but it is unclear how to equate the information available across tasks. Especially difficult is comparing performance with the briefly-presented static stimuli used in subitizing and working memory tasks to the extended exposures of moving stimuli needed to assess object tracking. A stronger understanding of the processes mediating tracking would be required to model performance on the tasks within a common framework, so that they could be compared at the level of underlying psychological constructs. There is another approach, too: measure the tasks of interest in large numbers of individuals and see whether the different task limits strongly co-vary across individuals. However, the relationships found so far, which are reviewed in Section 11, are not strong enough to support that conclusion.
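As a sketch of the individual-differences logic (hypothetical data and parameter values throughout; this is not an analysis from any of the cited studies), one can simulate per-participant capacity estimates for the three tasks and examine their pairwise correlations. A genuinely common limit predicts correlations approaching the tasks’ measurement reliabilities; weak correlations instead suggest distinct limits.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200  # hypothetical number of participants

# Simulate per-participant capacity estimates: a shared latent ability
# plus task-specific noise. The 0.5 shared-ability weight and the noise
# levels are free assumptions, not empirical estimates.
shared = rng.normal(0.0, 1.0, n)
mot = 3.5 + 0.5 * shared + rng.normal(0.0, 0.8, n)         # MOT capacity
vwm = 3.0 + 0.5 * shared + rng.normal(0.0, 0.8, n)         # visual WM capacity
subitizing = 3.8 + 0.5 * shared + rng.normal(0.0, 0.8, n)  # subitizing span

# Pairwise correlations between the three task limits.
r = np.corrcoef(np.vstack([mot, vwm, subitizing]))
print(f"MOT-VWM r = {r[0, 1]:.2f}")
print(f"MOT-subitizing r = {r[0, 2]:.2f}")
print(f"VWM-subitizing r = {r[1, 2]:.2f}")
```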
In summary, the idea of a limit of four or five targets is a myth. What’s most disappointing is that at no point did it have good evidence behind it, which makes me worry that the way we do science, or at least the way we do this kind of science, does not result in the community of researchers knowing the basics of what the evidence supports. The general issues around that are beyond the scope of this Element. Let’s stick with the facts of tracking and consider the following: given that tracking performance depends greatly on circumstances and falls gradually, rather than displaying a discontinuity at a particular target number, what are the implications for how tracking works?