Chapter 13 Progress and recommendations
Near the beginning of this Element, I suggested that five findings about multiple object tracking were particularly important. Now that I’ve explained them and gone through the associated evidence, it’s time to sum up. The five findings are:
- The number of moving objects humans can track is limited, but not to a particular number such as four or five (Section 3).
- The number of targets has little effect on spatial interference, whereas it greatly increases temporal interference (Section 5).
- Predictability of movement paths benefits tracking only for one or two targets, not for more (Section 6).
- Tracking capacity is hemifield specific: capacity nearly doubles when targets are presented in different hemifields (Section 9).
- When tracking multiple targets, people often don’t know which target is which, and updating of non-location features is poor (Section 10).
The first theory of multiple object tracking, Pylyshyn’s FINST theory, debuted in the first paper that established that people can actually do the task. Although hundreds of MOT experiments have been published since then, as of this writing, the FINST theory is the only theory mentioned on the Wikipedia page for MOT (Editors 2021). Based on what they write in their papers, many active researchers as well as Wikipedia’s editors do not seem to appreciate how much the main points of FINST theory have been rebutted. Core to the theory was the idea that tracking is mediated by a small set of discrete and pre-attentive indices. As we have seen, however, as object speed increases, the number of targets that can be tracked steadily decreases, to just one target, which doesn’t sit well with a fixed set of indices (G. A. Alvarez and Franconeri 2007; Alex O. Holcombe and Chen 2012; see also Brian J. Scholl 2009). Instead, it suggests that tracking reflects a more continuous resource that can both be allocated entirely to one or two objects and spread thinly among several objects. However, it could also be explained by a process that has to serially switch among the targets.
Another prediction of FINST theory was that participants would be aware of which target is which among the targets they are tracking. Pylyshyn himself reported evidence against this, to his credit, and the evidence that updating of target identities is poor has increased since then (important finding #5 above). Explaining the dissociation between the updating of positions and the maintenance and updating of other features is an integral part of two recent theories, by Li, Oksama, and Hyönä (2019) and by Lovett, Bridewell, and Bello (2019). Both concur with FINST theory that position updating happens in parallel, but they suggest that other features of targets are maintained and updated by a process that switches among the targets one by one.
Humans’ poor awareness of which monitored object is which has consequences for the quest to explain our cognitive abilities. Our minds can represent structure in a content-independent fashion, as in language, where syntax assigns distinct roles: the word “giving”, for example, can involve a giver, a recipient, and an item. A recent paper suggested that this could be implemented by Pylyshyn’s FINSTs (O’Reilly, Ranganath, and Russin 2022). As we have seen, however, during multiple object tracking the distinct identities of the targets often are not represented, so this approach to explaining cognition may not work.
In positing a serial process for updating of features other than position, Lovett, Bridewell, and Bello (2019) further proposed that the serial process can compute the motion history of a target. This can explain important finding #3, that predictability of motion trajectories yields a measurable advantage only when there are few targets (Piers DL Howe and Holcombe 2012; Luu and Howe 2015), because with more targets, the benefit may be too small to be detectable.
In summary, spatial selection appears to occur in parallel, at a hemifield-specific processing stage, with other features subsequently updated and linked in by a visual field-wide, possibly serial process. Some evidence about position updating, however, suggests that it may be more limited-capacity than it appears, which I grapple with in another manuscript (Alex O. Holcombe 2022).
13.1 Recommendations for future work
The MOT paradigm is important not only because of the insights that its findings provide, but also because it has the potential to reveal many more insights about human abilities. MOT’s test-retest reliability, on the order of .8 or .9, has been found to be higher than that of other attentional tasks. High reliability means that MOT results are often highly credible (because with a non-noisy task, less data is needed to have high statistical power) and have high potential for revealing individual differences (Section 11).
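As a rough illustration of why reliability matters for individual-differences work, the sketch below combines two standard results: Spearman's attenuation formula, under which the expected observed correlation equals the true correlation multiplied by the square root of the product of the two measures' reliabilities, and the Fisher z approximation for the sample size needed to detect a correlation. The true correlation of .4, the comparison task with reliability .5, and the reliability of .8 for the other measure are hypothetical values chosen only for illustration; they are not estimates from any study discussed in this Element.

```python
import math
from statistics import NormalDist


def attenuated_r(true_r: float, rel_x: float, rel_y: float) -> float:
    """Expected observed correlation under Spearman's attenuation formula."""
    return true_r * math.sqrt(rel_x * rel_y)


def n_for_power(r: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate sample size needed to detect correlation r
    (two-sided test), using the Fisher z approximation."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    z_beta = std_normal.inv_cdf(power)
    return math.ceil(((z_alpha + z_beta) / math.atanh(r)) ** 2 + 3)


# Hypothetical values, chosen only for illustration.
TRUE_R = 0.4     # assumed true correlation between tracking and another ability
OTHER_REL = 0.8  # assumed reliability of the other measure

for label, task_rel in [("reliability .85", 0.85), ("reliability .50", 0.50)]:
    r_obs = attenuated_r(TRUE_R, task_rel, OTHER_REL)
    print(f"{label}: expected observed r = {r_obs:.2f}, "
          f"approx. N for 80% power = {n_for_power(r_obs)}")
```

Under these assumptions, the more reliable task yields a larger expected observed correlation and therefore requires fewer participants for the same power. The exact numbers depend entirely on the assumed values.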
The discovery that tracking’s capacity limit reflects two resources, one in each hemisphere, was one of the greatest advances in tracking research, but it’s disappointing how little that discovery has been built upon. Consider, for example, the issue of whether tracking draws on the same mental resources as other tasks. FINST theory proposed that the tracking process is preattentive, but dual-task studies show substantial interference from other tasks (Oksama and Hyönä 2016; Alnaes et al. 2014). Sadly, however, such studies do not seem to have ruled out the possibility that these findings were caused entirely by a process with a capacity of only one object (what I have called System B) rather than the hemifield-specific tracking processes. “Carving nature at its joints”, or dissociating the components of a biological system, is important for scientific progress but can be difficult in psychology (Fodor 1983) — general cognition (System B) can do many different things and thereby contaminate the study of any processing specific to object tracking. Testing for hemifield specificity can help us tease System B apart.
I’d like to see fewer missed opportunities to study what makes tracking distinctive, hence my top recommendations for future research emphasize this point. Those recommendations are:
- To dilute the influence of capacity-one System B processing, use several targets, not just two or three. But remember that even with several targets, a small effect could be explained by a capacity-one process. Test for hemifield specificity, as that can help rule out a capacity-one process.
- Always test for hemifield specificity! Besides helping to rule out the possibility that a factor has its effect only on a capacity-one process, such tests are valuable because we know very little about which limited-capacity brain processes are hemisphere-specific, so any results here are likely to be interesting.
- As we have seen, MOT is a complex task, so it’s difficult to interpret individual differences and predict whether they will translate to other tasks. Individual-difference studies should use task variations that help isolate the component processes that contribute to overall success or failure, such as spatial interference, temporal interference, and cognitive processing.
- In computational modelling as well, don’t restrict yourself to standard MOT tasks with unconstrained trajectories, as that sort of data may not constrain models very much. Show that a model succeeds at task variations that isolate component processes.
13.2 Omissions
Several topics that I originally planned to cover could not be included here, due to limited space. Some of the most important are the role of retinotopic, spatiotopic, and configural representations in tracking (see Yantis 1992; Bill et al. 2020; Piers D. L. Howe, Pinto, and Horowitz 2010; Meyerhoff et al. 2015; G. Liu et al. 2005; Maechler, Cavanagh, and Tse 2021), the role of distractor suppression, the role of surface features (Papenmeier et al. 2014), and the findings from dual-task paradigms. I hope those readers whose favorite topic was left out can take some consolation in the fact that my own favorite, the temporal limits on tracking (Alex O. Holcombe and Chen 2013; Roudaia and Faubert 2017), was not covered either. Because that topic has major implications for what the tracking resource actually does during tracking, and whether processing is serial or parallel, I have a separate manuscript about it (Alex O. Holcombe 2022).