Chapter 3 The biggest myth of object tracking
What I consider to be the biggest myth about object tracking involves three misconceptions:
- There is a fixed capacity limit of about four or five objects that can be tracked, after which performance falls rapidly.
- A softer version of the above claim: that performance falls to a particular level once the number of targets is increased to four or five objects.
- Different tasks show the same limit.
These three claims are widespread in the scholarly literature. One researcher writing about the “object tracking system” in 2010, for example, stated: “One of the defining properties of this system is that it is limited in capacity to three to four individuals at a time” (Piazza 2010). Similarly, Fougnie and Marois (2006) wrote that “People’s ability to attentively track a number of randomly moving objects among like distractors is limited to four or five items”. This idea is sometimes perpetuated with more ambiguous statements such as “participants can track about four objects simultaneously” (Van der Burg, Cass, and Theeuwes 2019).
Misconception #1 in my list above, with its idea of a sharp fall in performance after a limit, is one aspect of the statements quoted in the previous paragraph. It is fully explicit in Doran and Hoffman’s (2010) take on the literature, where they wrote that “the main finding” of the object tracking literature is that “observers can accurately track approximately four objects and that once this limit is exceeded, accuracy declines precipitously” (Doran and Hoffman 2010). Vaguer statements in other papers, such as “researchers have consistently found that approximately 4 objects can be tracked” (G. A. Alvarez and Franconeri 2007) and “people typically can track four or five items” (Chesney and Haladjian 2011), also bolster misconception #1 in the minds of readers.
To examine the evidence behind each of the quotations in the two preceding paragraphs, I checked the evidence provided, the papers cited, and the papers those cited papers cite. None of these papers contains evidence that performance decreases very rapidly once the number of targets is increased above some value. Instead, a gradual decrease in performance is seen as the number of targets increases, with no discontinuity, not even a conspicuous inflection point. For example, Oksama and Hyönä (2004), which is sometimes cited in this context, assessed performance with up to six targets. After a five-second phase of random motion of the multiple moving objects, one object was flashed repeatedly and participants hit a key to indicate whether they thought it was one of the targets. The proportion of trials that participants got wrong increased steadily with the number of targets, from 3% incorrect with two targets to 16% incorrect with six targets.
Although Z. W. Pylyshyn and Storm (1988) is the paper most frequently cited when a limit of four objects is claimed, even they found a quite gradual decrease in performance (their Figure 1) as the number of targets was increased from one to five (five targets was the most that they tested). And nowhere in their paper did Z. W. Pylyshyn and Storm (1988) state that there is a value beyond which performance rapidly declines. Six years later, however, Z. Pylyshyn et al. (1994) did write that it is “possible to track about four randomly moving objects”. By 2007, when he published his book Things and Places: How the Mind Connects with the World, Pylyshyn was writing sentences like “And as long as there are not more than 4 or 5 of these individuals the visual system can treat them as though it had a concept of ‘individual object’” (Z. W. Pylyshyn 2007). I suspect that this sort of slide toward endorsing a hard limit is caused in part by the desire for a simple story. It may also stem from an unconscious oversimplification of one’s own data, and/or Pylyshyn’s commitment to his theory that tracking is limited by a set of discrete mental pointers.
I have so far addressed only one aspect of misconception #1, the claim that there is a limit after which performance decreases rapidly. Another aspect of misconception #1 is that the limit is consistently found to be four or five. This isn’t viable if there is no limit after which performance decreases rapidly, but a researcher could retreat to misconception #2: the idea that tracking performance falls to some particular level at about four targets, even if this does not mark a hard limit (or even an inflection point). The particular performance level might be 75% correct, another criterion such as the halfway point between ceiling and chance (Alex O. Holcombe and Chen 2013), or the “effective number of items tracked”, calculated by applying a formula to percent correct together with the number of targets and distractors (B. J. Scholl, Pylyshyn, and Feldman 2001). In a charitable reading, this may be what researchers like G. A. Alvarez and Franconeri (2007) meant when they wrote phrases such as: “researchers have consistently found that approximately 4 objects can be tracked (Intriligator & Cavanagh, 2001; Pylyshyn & Storm, 1988; Yantis, 1992)”. The early studies cited may indeed be consistent with this statement, albeit not strongly supportive. However, work published over the last fifteen years has revealed this apparent agreement on a soft limit to be an accident of researchers using similar display and task parameters.
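To make the “effective number of items tracked” concrete, here is a minimal version of the guessing-correction logic behind such formulas (a sketch of the general approach; the exact formula of B. J. Scholl, Pylyshyn, and Feldman (2001) differs in details, such as how guesses are distributed over the remaining items). Assume the observer tracks $m$ of the $n$ targets perfectly and guesses about the rest, a guess being correct with probability $g$. The expected proportion correct is then

$$p = \frac{m}{n} + \left(1 - \frac{m}{n}\right)g,$$

which, solved for $m$, gives an estimate of the effective number tracked from observed accuracy:

$$m = n\,\frac{p - g}{1 - g}.$$

For instance (illustrative numbers only), with $n = 6$ targets, chance level $g = 0.5$, and observed accuracy $p = 0.84$, the estimate is $m = 6 \times 0.34/0.5 \approx 4$ items effectively tracked.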
One of the most important display parameters is object speed. Its influence was demonstrated in dramatic fashion by G. A. Alvarez and Franconeri (2007), who tested participants with a display of sixteen wandering discs. When the speed of the discs was very high, participants could, at the end of a trial, correctly pick out the targets only if there were just a few of them. But at very slow speeds, participants could track up to eight targets accurately. Thus, how many objects people can track is highly contingent on the speed of those objects. Additional evidence for this was found by others (Alex O. Holcombe and Chen 2012; Feria 2013), and other display parameters that strongly affect the number of objects that can be tracked were also discovered, in particular object spacing (S. L. Franconeri et al. 2008; Alex O. Holcombe, Chen, and Howe 2014).
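One way to see how a smoothly varying limit can masquerade as a fixed number is a toy resource model (my own illustrative sketch, not Alvarez and Franconeri’s analysis; all parameter values are assumptions): suppose a fixed pool of attentional resource, with each target demanding resource in proportion to its speed. The number of trackable targets then varies continuously with speed, and “four” emerges only at one particular speed.

```python
def max_targets_tracked(speed, total_resource=8.0, demand_per_unit_speed=1.0):
    """Toy model: each target demands resource proportional to its speed,
    and targets can be tracked only while their total demand fits within
    a fixed resource pool. All parameter values are illustrative assumptions."""
    demand_per_target = demand_per_unit_speed * speed
    return int(total_resource // demand_per_target)

for speed in [1, 2, 4, 8]:  # arbitrary speed units
    print(f"speed {speed}: ~{max_targets_tracked(speed)} targets trackable")
# Prints 8, 4, 2, 1 targets: the "limit" slides with speed,
# with no privileged status for four.
```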
In summary, it is incorrect to say that people can track about four moving objects, or even that once some number of targets (varying with circumstances) is reached, performance declines very rapidly with additional targets. The number that can be tracked is quite specific to the display arrangement, object spacing, and object speeds. If a researcher is tempted to write that “people can track about four objects”, then to reduce confusion I think they should stipulate that this refers to a particular combination of display characteristics and performance measures.
This issue of how to characterize a human cognitive limit has also bedeviled the study of short-term memory, a literature in which one of the most famous papers is titled “The magical number seven, plus or minus two: Some limits on our capacity for processing information” (Miller 1956). Two dozen working memory researchers convened in 2013 to identify empirical “benchmarks” for models of working memory. One issue they considered was how to talk about how many items people can remember. In the paper that they published in 2018, the researchers pointed out that “observed item limits vary substantially between materials and testing procedures” (Oberauer et al. 2018). They suggested, however, that much of this variability could be explained by humans’ ability to store groups of items as “chunks”, and thus the group endorsed a statement that there is a limit of “three to four chunks” (Cowan 2001). In the case of short-term memory, then, the observed variability in experiments’ results can potentially be explained by a common underlying limit of three to four chunks that manifests as different observed item limits depending on circumstances (in particular, the opportunity for chunking). Evidently there is no simple task parameter unrelated to chunking opportunity, analogous to object speed in MOT, that smoothly varies how many items people can remember over a wide range. However, whether there is an inflection in performance after four items, or at any point, remains debated (e.g. Robinson, Benjamin, and Irwin 2020).
Another strong candidate for a real capacity limit is the human ability to “subitize”, or judge nearly exactly the numerosity of a small collection of objects. For this task of reporting how many items are in a briefly-presented display, there really does seem to be an inflection point in accuracy when the number of objects shown goes from fewer than four to more than four (Revkin et al. 2008). Four objects and fewer is frequently referred to as the “subitizing range”, with performance approximately as good for rapidly counting four objects as for two or one. Note that this is very different from tracking, for which speed thresholds decline markedly from one target to two, and again from two to three and four.
In the case of MOT, it remains possible that researchers will identify a set of circumstances that consistently yields a mean tracking limit, in the modal human, of three or four targets, if a “limit” is defined as performance falling to a particular level on some performance metric. Probably these circumstances will simply be certain spacings, speeds, object trajectories, and numbers of objects in a display. It would be nice if some underlying construct, the counterpart of memory’s “chunks”, were identified that explains how performance changes when other circumstances are used. That would constitute real progress in theory development. However, I don’t see much prospect of that in the current literature.
3.1 Claim #3: Different tasks, same limit?
Even after discarding the idea that there is a particular number of objects that one can track, misconception #3 might still be viable. This claim is frequently tangled up in the myth reviewed above, and is sometimes stated as “there is a magical number four”. If we discard the idea of a specific number that does not vary with circumstances, there remains the notion that different tasks have the same number-of-objects limit when tested in comparable circumstances. For example, Bettencourt, Michalka, and Somers (2011) stated that visual short-term memory and MOT show “an equivalent four-object limit”, and Piazza (2010) similarly claimed that visuo-spatial short-term memory, ultra-rapid counting (subitizing), and multiple object tracking all share a common limit of “three or four items”. So far, however, there is no good evidence that object tracking has the same limit as visual working memory and subitizing.
Ideally, evidence for a common limit would come from measuring the limits for all three tasks using the same stimuli, but it is unclear how to equate the information available across tasks. Especially difficult is comparing performance with the briefly-presented static stimuli used in subitizing and working memory tasks to the extended exposures of moving stimuli needed to assess object tracking. A stronger understanding of the processes mediating tracking would be required to model performance on the tasks within a common framework, so that they could be compared at the level of underlying psychological constructs. There is another approach, too: measure the tasks of interest in large numbers of individuals and see whether the different task limits strongly co-vary across individuals. However, the relationships found so far, which are reviewed in Section 11, are not strong enough to support that conclusion.
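As a sketch of the individual-differences logic (hypothetical data and parameter values throughout; this is not an analysis from any of the cited studies), one can simulate per-participant capacity estimates for the three tasks and examine their pairwise correlations. A genuinely common limit predicts correlations approaching the tasks’ measurement reliabilities; weak correlations instead suggest distinct limits.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200  # hypothetical number of participants

# Simulate per-participant capacity estimates: a shared latent ability
# plus task-specific noise. The 0.5 shared-ability weight and the noise
# levels are free assumptions, not empirical estimates.
shared = rng.normal(0.0, 1.0, n)
mot = 3.5 + 0.5 * shared + rng.normal(0.0, 0.8, n)         # MOT capacity
vwm = 3.0 + 0.5 * shared + rng.normal(0.0, 0.8, n)         # visual WM capacity
subitizing = 3.8 + 0.5 * shared + rng.normal(0.0, 0.8, n)  # subitizing span

# Pairwise correlations between the three task limits.
r = np.corrcoef(np.vstack([mot, vwm, subitizing]))
print(f"MOT-VWM r = {r[0, 1]:.2f}")
print(f"MOT-subitizing r = {r[0, 2]:.2f}")
print(f"VWM-subitizing r = {r[1, 2]:.2f}")
```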
In summary, the idea of a limit of four or five targets is a myth. What’s most disappointing is that at no point did it have good evidence behind it, which makes me worry that the way we do science, or at least the way we do this kind of science, does not result in the community of researchers knowing the basics of what the evidence supports. The general issues around that are beyond the scope of this Element. Let’s stick with the facts of tracking and consider the following: given that tracking performance depends greatly on circumstances and falls gradually, rather than displaying a discontinuity at a particular target number, what are the implications for how tracking works?