Cognitive performance of military men and women during prolonged load carriage
  1. Nicola C Armstrong1,2,
  2. S J R Smith3,
  3. D Risius1,
  4. D Doyle1,
  5. S L Wardle4,
  6. J P Greeves4,5,6,
  7. J R House2,
  8. M Tipton2 and
  9. M Lomax2
  1. 1 Human Sciences Group, Defence Science and Technology Laboratory, Salisbury, UK
  2. 2 School of Sport, Health and Exercise Science, University of Portsmouth, Portsmouth, UK
  3. 3 Human Sciences Group, Defence Science and Technology Laboratory, Fareham, UK
  4. 4 Department of Army Health and Physical Performance Research, UK Ministry of Defence, Andover, UK
  5. 5 Faculty of Medicine and Health Science, University of East Anglia Norwich Medical School, Norwich, UK
  6. 6 Division of Surgery and Interventional Science, University College London, London, United Kingdom
  1. Correspondence to Dr Nicola C Armstrong, Defence Science and Technology Laboratory, Salisbury, UK; ncarmstrong{at}


Background This study evaluated cognitive workload in soldiers undertaking a long duration march wearing different loads.

Methods Military participants (n=12 men and n=10 women) performed four 3-hour loaded marches (12.25 km at 4.9 km/hour) wearing either 21 kg, 26 kg, 33 kg or 43 kg. During the march, accuracy and response time were measured using the verbal working memory n-back test (0, 1, 2 and 3) and two bespoke Go/No Go tests (visual/auditory) to assess inhibition of a pre-potent response.

Results The physical demands of the march increased with load and march duration but remained at moderate intensity. N-back test accuracy ranged from 74% to 98% in men and 62% to 98% in women. Reduced accuracy was observed as load and time increased. Accuracy during the visual Go/No Go test also reduced with load; accuracy ranged from 69% to 89% in men and 65% to 90% in women. No differences due to load or time were observed during completion of the auditory Go/No Go task; accuracy ranged from 93% to 97% in men and 77% to 95% in women. A number of participants were unable to complete the march due to discomfort. Reports of discomfort were more frequent in women, which may have contributed to the greater reductions in accuracy observed.

Conclusion These data provide further evidence that cognitive performance of military personnel can be affected during long duration loaded marching. Women reported discomfort from equipment more frequently than men, which may make them more susceptible to declines in cognitive performance. These findings highlight important considerations for equipment procurement.

  • sports medicine
  • physiology
  • education & training

Data availability statement

All data relevant to the study are included in the article or uploaded as supplemental information.

This material is licensed under the terms of the Open Government Licence except where otherwise stated. To view this licence, visit or write to the Information Policy Team, The National Archives, Kew, London TW9 4DU, or email:

Key messages

  • This study provides further evidence that cognitive performance is negatively affected during prolonged load carriage; preliminary data indicate that women may be more affected than men.

  • Increased cognitive decrement in women was attributed to increased discomfort and work rate in women and greater load carriage experience of men.

  • These data highlight important considerations for equipment procurement and future research of this nature.


The mass carried by soldiers during operational deployments has increased due to an increase in the volume of combat protective equipment, electronic devices (including batteries) and the amount of ammunition carried.1 During operations in Afghanistan (2002–2014), high mean (57 kg) and peak (71 kg) patrol loads were recorded for UK soldiers.1

The physiological, biomechanical and clinical implications of carrying load have been studied,2–5 but the effect of load carriage on cognition has received less attention. Several critical military tasks involve a cognitive component, including maintaining vigilance through attentional processes, perceiving and processing verbal communications and using executive processes to respond to threats and stimuli in the environment. The evolving demands on the future warfighter mean there is likely to be an increased requirement for high levels of cognitive performance in the field, for example, as the requirement to integrate warfighter-worn technology to support situation awareness continues. As such, it is essential to consider both the physical and cognitive aspects of military tasks when assessing soldier performance, to understand any resulting decrements and consider how we train personnel to optimise performance under dual and multi-tasking conditions.6

Early work exploring the effect of load on cognition measured cognitive task performance before and after a loaded march.7 This approach was limited as it allowed for recovery from the physical demands of the task and therefore potentially underestimated the impact of load on cognition. More recent load carriage studies have examined the physical and cognitive demands concurrently. Under these dual-task conditions, there is evidence that increased cognitive demands are placed on a soldier as the mass carried rises. Reductions in accuracy and decreased sensitivity have been observed during a vigilance task in participants wearing 40 kg in comparison with no load8; impaired target detection and memory recall reported with 61 kg loads9 and performance declines (demonstrated by reductions in accuracy and decreased sensitivity) during an auditory Go/No Go task when loads ≥40 kg were carried in both laboratory10 and field settings.11

The cognitive demands placed on the wearer during load carriage8–12 are compounded by complex movements, such as navigating obstacles,8 suggesting that physical movement and cognitive processing compete for shared neural resources. This is supported by findings showing that dual-task conditions, for example, combining running while completing a word recall task, result in impaired performance of one of the tasks compared with single-task performance.13 The effects of combining tasks that are competing for resources require further investigation to understand the implications for soldier task performance.

When considering the influence of physical exertion on cognition,14–16 it is evident that the characteristics of the task and of participants influence performance outcomes, which may make certain individuals more susceptible to declines in cognitive performance. This can be explained in part by the arousal–performance relationship which is based on work by Yerkes and Dodson17 and proposes an inverted-U relationship between arousal and performance. During familiar tasks, performance is greatest at moderate levels of arousal, but is reduced at the extremes. This relationship may shift left or right depending on the complexity of the task, demands of the environment or the characteristics of the individual.

Exercise is a stimulus that increases arousal18 and there is evidence of declines in cognitive performance as exercise intensity increases.19 As women experience greater physiological strain than men during load carriage,20 this may move them towards the extremes of the arousal–performance relationship and result in greater decline in the performance of cognitive tasks. However, this may only become clear after performing the task for some time; previous evidence shows that decrements resulting from the effects of differing loads may only become apparent after 30 minutes of physical activity.11 It may be that the differences between men and women are only observed with greater loads and after a longer period of sustained physical performance.

Women, who are now eligible to join ground close combat roles, will carry the same load as men in combat. There are no published studies examining the effect of load on cognition in men and women during a prolonged loaded march. Load carriage research must consider the performance of both men and women, to fully understand the implications of such tasks for both sexes.

The primary objective of this work was to compare the cognitive performance of military men and women during prolonged loaded marching with representative military loads. The experimental hypotheses were: (1) accuracy on cognitive tests would decrease with time during a 3-hour loaded march; (2) accuracy on cognitive tests would decrease with increased loads; and (3) accuracy on cognitive tests would decrease to the same extent in men and women.



This study was part of a larger body of work exploring the response of Infantry soldiers to load carriage. At the time this study was undertaken, there were no women in Infantry roles. Rather than exclude women from this study, participants were recruited from the Royal Artillery as the physical demands associated with these roles were considered the closest match to the Infantry. Twelve men from the Infantry and ten women from the Royal Artillery volunteered to participate and provided written informed consent.

Study inclusion criteria included: (1) unit sign-off as fit to undertake loaded marching; (2) passing a study entrance medical examination and (3) lifting a minimum of 30 kg to a height of 1.45 m using the Army Power Bag single lift test. At the time of the study, this test was one of the physical selection standards for the British Army. In this context, it was used to assess participant suitability for load carriage tasks. Prior to each test session, participants were reviewed by the study medical officer to ensure they remained fit to take part.

Study design

Participants performed four 3-hour loaded marches on a slatted belt treadmill (Woodway, model PPS70, Germany) over a 2-week period; each loaded march differed in the mass carried. Participants conducted one march per day with at least one rest day between marches; no physical activity was undertaken on rest days. During each loaded march, a cognitive test battery was performed to evaluate the effect of march duration and mass carried on overall accuracy. Response time was also measured during the cognitive tests to better understand accuracy data.

Participant characteristics

Prior to commencing the load carriage element of the study, stature (Seca, model 225, Germany), body mass (Ian Fellows, model Lucid, UK) and body composition (from a whole body scan; dual-energy X-ray absorptiometry, GE Healthcare, model Lunar iDXA, USA) were assessed. An incremental exercise test was performed to determine maximal rate of oxygen uptake (V̇O2max) and gas exchange threshold (GET) (Metamax 3B (stationary mode), Cortex, Germany). Work rate (Metamax 3B (stationary mode), Cortex, Germany) was measured during each loaded march and compared to the GET to quantify the physical demands placed on the participants.

Loaded march

The loaded march was developed in collaboration with subject matter experts, operational analysts and with reference to unpublished data collected by the Defence Science and Technology Laboratory (Dstl) during military training exercises/operations. Relevant published literature was also considered. The exercise profile represented a 3-hour march (4.9 km/hour; 0% gradient) with a 10-minute water stop every 50 minutes. The total distance marched on each occasion was 12.25 km. Marches were performed in a fed state, and participants were allowed water ad libitum during the rest periods. Rating of perceived exertion21 was measured at three points every hour. Oxygen uptake (V̇O2) was measured continuously during each loaded march. All testing was conducted at 19.8°C (0.7°C) and 48.9% (3.3%) relative humidity. At the end of each march, participants were interviewed by an investigator to capture subjective feedback including reports of pain, soreness and discomfort.
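The stated distance follows from the march profile. The sketch below reproduces the arithmetic, assuming each hour consists of 50 minutes of marching followed by the 10-minute stop; the function name and its parameters are illustrative and are not taken from the study.

```python
def march_distance_km(total_min=180, march_block_min=50, rest_min=10, speed_kmh=4.9):
    """Distance covered when marching blocks and rest stops alternate.

    Assumption: every cycle is `march_block_min` of marching followed by
    `rest_min` of rest, and the total time is a whole number of cycles.
    """
    cycle_min = march_block_min + rest_min
    cycles = total_min // cycle_min            # 3 cycles in a 3-hour march
    marching_min = cycles * march_block_min    # 150 minutes actually marching
    return round(speed_kmh * marching_min / 60, 2)


print(march_distance_km())  # 12.25
```

Under these assumptions, 150 minutes of marching at 4.9 km/hour gives the 12.25 km reported.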

Load configurations

Four load configurations were investigated (Table 1). These configurations were selected in collaboration with Dstl military advisors and designed to represent the mass typically carried during military tasks. While in-service equipment was used where possible, it was not the purpose of this study to provide an equipment evaluation. The order in which the load configurations were worn was counterbalanced using a Latin Square. Participants were issued with body armour that was self-reported as the most comfortable with the greatest range of movement. Investigators confirmed that the appropriate landmarks on the torso were covered.22
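The study does not state which Latin square was used to counterbalance the order of loads. As an illustration only, the sketch below builds the simplest (cyclic) 4×4 Latin square from the load labels used in the Results; the function name is hypothetical. A cyclic square balances the position of each load but not first-order carry-over effects, for which a Williams design would be needed.

```python
def cyclic_latin_square(conditions):
    """Return an n x n Latin square by rotating the condition list.

    Each row is one order of conditions; every condition appears exactly
    once in each row (order) and once in each column (session position).
    """
    n = len(conditions)
    return [[conditions[(row + col) % n] for col in range(n)] for row in range(n)]


LOADS = ["LOW", "MED", "HIGH", "V-HIGH"]
orders = cyclic_latin_square(LOADS)

# Hypothetical allocation: participants cycle through the four orders in turn.
for participant in range(1, 9):
    order = orders[(participant - 1) % len(orders)]
    print(f"Participant {participant}: {' -> '.join(order)}")
```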

Table 1

Load configurations

Cognitive tests

Three cognitive tests were conducted each hour during the loaded march. These tests included visual and auditory Go/No Go tests (two per hour) to measure response inhibition and n-back working memory tests (once per hour). Unpublished work conducted by Dstl has developed a framework for research studies which maps military tasks to cognitive tests.23 These cognitive tests were selected as they represent important cognitive functions that are critical to the dismounted soldier’s operational performance. These include attention, working memory, that is, the maintenance and manipulation of information, and response inhibition which allows a participant to withhold a pre-potent response to a target.23 Other published research has demonstrated that these tasks are reliable and valid24 25 and have been implemented successfully in other load carriage studies.8 10 11

The visual cognitive tests were displayed on a 297×214 cm projector screen, 4.5 m from the treadmill and 118 cm from the ground to approximately eye level. The visual Go/No Go test was presented using the Virtual Battlespace 2 program. Familiarisation and training were undertaken on all cognitive tests to ensure participants understood the test, were responding correctly and had an opportunity to practise responding while walking on the treadmill.

Go/No Go tests

Two Go/No Go response inhibition tests (one visual, one auditory) were administered at minutes 5 and 20 of each hour during the march. In total, each test was completed six times during the 3 hours and each element (visual/auditory) lasted approximately 3 minutes.

In both tests, participants were presented with 60 targets; 48 (80%) targets were enemy and 12 (20%) targets were friendly. Participants were asked to respond using a hand-held device whenever an enemy target was presented (‘Go’ stimuli) and not respond whenever a friendly target was presented (‘No Go’ stimuli). This test design results in a prevalent ‘Go’ response to the enemy target, so participants must slow their responses to correctly inhibit a response when a friendly ‘No Go’ stimulus is presented. The order of friendly and enemy targets was randomised and the interstimulus interval was 1000–2000 ms. Stimuli were presented for 500 ms and participant responses were recorded for 1000 ms from stimulus onset. Response accuracy (%) and time to respond correctly (response time: ms) are reported.
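The trial structure described above can be sketched as follows. This is an illustration, not the software used in the study: the function names, the uniform integer sampling of the interstimulus interval and the scoring rule (a trial is correct when a ‘Go’ stimulus receives a response within the 1000 ms window or a ‘No Go’ stimulus receives none) are assumptions.

```python
import random


def make_go_nogo_block(n_go=48, n_nogo=12, isi_range_ms=(1000, 2000), seed=None):
    """Build one randomised block of Go ('enemy') and No Go ('friendly') trials."""
    rng = random.Random(seed)
    kinds = ["go"] * n_go + ["nogo"] * n_nogo
    rng.shuffle(kinds)
    # Assumption: the interstimulus interval is drawn uniformly from the range.
    return [{"kind": kind, "isi_ms": rng.randint(*isi_range_ms)} for kind in kinds]


def score_block(trials, response_times_ms, window_ms=1000):
    """Return (accuracy %, mean correct Go response time in ms or None).

    response_times_ms[i] is the time from stimulus onset in ms,
    or None when no response was made on trial i.
    """
    correct = 0
    go_rts = []
    for trial, rt in zip(trials, response_times_ms):
        responded = rt is not None and rt <= window_ms
        if trial["kind"] == "go" and responded:
            correct += 1
            go_rts.append(rt)
        elif trial["kind"] == "nogo" and not responded:
            correct += 1
    accuracy = 100.0 * correct / len(trials)
    mean_rt = sum(go_rts) / len(go_rts) if go_rts else None
    return accuracy, mean_rt


block = make_go_nogo_block(seed=1)
always_respond = [400] * len(block)        # never inhibits the pre-potent response
print(score_block(block, always_respond))  # (80.0, 400.0)
```

A participant who responds to every stimulus scores 80% under this rule, which shows why accuracy on the 12 ‘No Go’ trials drives most of the variation in the overall score.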

The visual test comprised a static image of an apartment block with 12 possible target locations (Figure 1). Friendly targets were depicted as uniformed soldiers and enemy targets as non-uniformed individuals (Figure 2). The stimuli for the auditory test consisted of the sound of a single shot from a Dragunov sniper rifle as the enemy target, and the sound of a single shot from an L85 A2 as the friendly target. Sounds were controlled for volume.

Figure 1

Visual Go/No Go balcony scene and target locations.

Figure 2

Friendly target (left), enemy target (centre), enemy target appearing at position 12 (right).

N-back verbal working memory test

The n-back test can be conducted at a number of different levels, usually 0, 1, 2 or 3, with the ‘n’ referring to the level of difficulty. In the easiest verbal version of the test, the 0-back, a series of letters is presented on the screen and participants respond to each letter by deciding whether it is the same as or different from a target letter presented at the beginning of the test. At the next level, 1-back, instead of comparing the letter on the screen with a target letter, it is compared with the letter presented on the previous screen (i.e. one screen back). In the 2-back test, the letter is compared with the letter on the screen two screens back, and the 3-back involves seeing if the current letter matches the letter three screens back. The demands of the ‘n-back’ gradually increase such that the easiest versions of the test place cognitive demands in terms of attention and vigilance for targets, whereas the harder versions require continuous updating of working memory.

In the current study, the test required participants to respond to each letter by pressing a green button on a hand-held device whenever the letter presented matched the letter presented ‘n’ letters before. For letters that did not match, participants had to respond by pressing a red button on the same hand-held device. Each test consisted of 45 stimulus presentations (letters) and lasted approximately 3 minutes. The stimuli were presented centrally on a screen for 500 ms followed by a blank screen for 2500 ms. Of the stimuli presented, 15 (33%) were targets and 30 (67%) were non-targets.25

The n-back tests were administered at minutes 30–45 of each hour (three times in each load configuration) and lasted for approximately 12 minutes. Mean response time (ms) and mean accuracy (%) to targets and non-targets were recorded.
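The matching rule for each n-back level can be sketched as below. The function names are hypothetical, and treating a missed response as incorrect is an assumption; the study reports only that accuracy to targets and non-targets was recorded.

```python
def nback_targets(letters, n, target_letter=None):
    """Mark which presentations are targets for a given n-back level.

    n == 0: a letter is a target when it equals `target_letter`.
    n >= 1: a letter is a target when it equals the letter n screens back.
    """
    if n == 0:
        if target_letter is None:
            raise ValueError("the 0-back needs a target letter")
        return [letter == target_letter for letter in letters]
    return [i >= n and letters[i] == letters[i - n] for i in range(len(letters))]


def nback_accuracy(letters, n, responses, target_letter=None):
    """Percentage of presentations answered correctly.

    responses[i] is True for 'match' (green button), False for 'no match'
    (red button) or None for no response (assumed to count as incorrect).
    """
    truth = nback_targets(letters, n, target_letter)
    correct = sum(1 for t, r in zip(truth, responses) if r is not None and r == t)
    return 100.0 * correct / len(letters)


sequence = list("ABACAC")
print(nback_targets(sequence, 2))
# [False, False, True, False, True, True]
```

The same sequence contains no 1-back targets, which illustrates how the level changes what must be held in working memory rather than what is shown on screen.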

Data analysis

Statistical analysis was performed using IBM SPSS Statistics V.26. Parametric statistics (analysis of variance (ANOVA), t-test) were applied to data that were normally distributed (Shapiro-Wilk >0.05). Non-parametric analyses (Friedman, Wilcoxon, Mann-Whitney U) were performed on data that were not normally distributed. Data were accepted as significant at the level of p<0.05. Due to a number of incomplete cognitive tests, data for men and women were pooled and differences between the sexes explored using t-tests or descriptive statistics where n<6.

Parametric data

A two-way repeated measures ANOVA was performed to investigate the interaction between load (×4) and time (×3 for n-backs; ×6 for Go/No Go) and the F-value reported. The Greenhouse-Geisser correction was applied where the assumption of sphericity was not met. When a significant main effect was observed, the Bonferroni post-hoc test was used to identify differences between the loads; for the effect of time, data from the first test undertaken were compared with all subsequent tests. Differences between men and women were explored using an unpaired t-test (n≥6).
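For readers unfamiliar with the Bonferroni procedure, the sketch below shows the adjustment for the six pairwise comparisons that four loads produce. It is a generic illustration rather than a reproduction of the SPSS output; the function name and the p-values are hypothetical.

```python
from itertools import combinations


def bonferroni_adjust(p_values, alpha=0.05):
    """Multiply each p-value by the number of comparisons (capped at 1.0)."""
    m = len(p_values)
    adjusted = [min(1.0, p * m) for p in p_values]
    return adjusted, [p < alpha for p in adjusted]


LOADS = ["LOW", "MED", "HIGH", "V-HIGH"]
pairs = list(combinations(LOADS, 2))  # 6 pairwise load comparisons

# Hypothetical unadjusted p-values, one per pair, for illustration only.
raw_p = [0.004, 0.020, 0.300, 0.009, 0.045, 0.700]
adjusted, significant = bonferroni_adjust(raw_p)
for (a, b), p_adj, sig in zip(pairs, adjusted, significant):
    print(f"{a} vs {b}: adjusted p={p_adj:.3f}, significant={sig}")
```

With six comparisons, only an unadjusted p-value below roughly 0.0083 remains significant at the 0.05 level.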

Non-parametric data

When data were not normally distributed, Friedman’s one-way ANOVA was used to investigate the effects of load and time without exploring the interaction effects. Where this test was used, χ2 is reported. This test was followed by Wilcoxon’s test when a main effect was detected and the U-value reported. The Mann-Whitney U test was used to compare men and women (n≥6).

Effect sizes

Partial eta squared (ηp2) was reported for the main ANOVA effects to inform future power calculations. For pairwise comparisons, Cohen’s d (d) was calculated for parametric data and interpreted as 0.2 (small), 0.6 (moderate), 1.2 (large), 2.0 (very large) and 4.0 (extremely large).26 Pearson’s r (r) was calculated for non-parametric data and interpreted as 0.1 (small), 0.3 (medium) and 0.5 (large).
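As a worked illustration of these effect sizes, the sketch below recovers partial eta squared from an F-value and its degrees of freedom using the standard identity ηp2 = (F × df_effect) / (F × df_effect + df_error), and applies the Cohen’s d thresholds listed above. The function names are hypothetical; the pooled-SD form of Cohen’s d and the ‘trivial’ label for values below 0.2 are assumptions, as the study does not specify them.

```python
import math
import statistics


def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


def cohens_d(group_a, group_b):
    """Cohen's d for two independent groups using the pooled SD (assumed form)."""
    na, nb = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)
    var_b = statistics.variance(group_b)
    pooled_sd = math.sqrt(((na - 1) * var_a + (nb - 1) * var_b) / (na + nb - 2))
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd


def label_d(d):
    """Map |d| onto the thresholds quoted in the text."""
    thresholds = [(4.0, "extremely large"), (2.0, "very large"),
                  (1.2, "large"), (0.6, "moderate"), (0.2, "small")]
    for cutoff, label in thresholds:
        if abs(d) >= cutoff:
            return label
    return "trivial"  # assumption: label for values below the smallest threshold


# F(3, 12)=4.087 is the main effect of load on 0-back response time.
print(round(partial_eta_squared(4.087, 3, 12), 3))  # 0.505
print(label_d(0.7))                                  # moderate
```

The first result agrees with the ηp2 of 0.505 reported for that effect in the Results.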


Participant characteristics

Participant characteristics are presented in Table 2. On average, men were taller and leaner, with a greater V̇O2max, although there was overlap between the groups for all parameters (Table 2). Women reported their hormonal contraceptive use as follows: combined contraceptive pill, n=4; contraceptive implant, n=2; intrauterine contraceptive device, n=1; no hormonal contraception, n=3.

Table 2

Participant characteristics

Loaded march

Only six men and one woman were able to complete the full exercise test under all conditions (Table 3). Test withdrawals were related to self-reported discomfort caused by the load carried (shoulder discomfort, blisters and chafing around the hips and groin) or muscular discomfort. Reports of discomfort increased with load in both sexes, but the frequency of reporting was greatest in women (see online supplemental table 1). No withdrawals were self-reported as being related to work rate.

Table 3

Number of participants who completed the march in each load configuration

Participants reported discomfort when they were required to lift their head to look up at the projector screen in some load configurations. This discomfort caused some participants to withdraw from the exercise test altogether, while others withdrew from the cognitive tests but continued marching.

Physical demands

V̇O2 increased with load (F(2, 12)=49.316, p≤0.0001, ηp2=0.851) and time (F(8, 48)=12.193, p≤0.0001, ηp2=0.680), but there was no interaction between load and time (F(16, 96)=0.845, p>0.05, ηp2=0.134). When V̇O2 was expressed relative to lean body mass, differences between the sexes were observed for each load (t(16)=−3.261, p=0.005, d=1.1–1.6), indicating that physical work rate was greater in women. Men marched at 31%–41% of V̇O2max and women at 36%–55% of V̇O2max. V̇O2 did not exceed GET in either group during the march, regardless of the load carried or duration of the march, indicating that all participants worked at a moderate exercise intensity.
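The way work rate is expressed here (as a percentage of maximal oxygen uptake and relative to the GET) can be sketched as follows. The function name and the input values are hypothetical and are not participant data; the only rule taken from the text is that oxygen uptake below the GET is treated as moderate intensity.

```python
def relative_demand(vo2, vo2max, get):
    """Express oxygen uptake as % of maximum and classify it against the GET.

    All three values must share the same units (for example L/min).
    """
    percent_of_max = 100.0 * vo2 / vo2max
    intensity = "moderate" if vo2 < get else "above GET"
    return round(percent_of_max, 1), intensity


# Hypothetical values for illustration only.
print(relative_demand(vo2=1.2, vo2max=3.0, get=1.8))  # (40.0, 'moderate')
```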

N-back working memory test


Accuracy data are presented in Figure 3. Visual inspection indicates that some participants were more susceptible to reductions in accuracy in the heaviest loads than others (Figure 3B,C). These data points represent those participants who attempted the cognitive tests but were unable to continuously look at the projector screen as they adjusted their backpacks or sought relief from the discomfort of lifting their heads. These data points could be considered outliers, but the authors opted to include them in the data as they represent the demands placed on the participants rather than measurement errors.

Figure 3

Accuracy measured during the n-backs. (A) 1-back, hour 1; (B) 2-back, hour 3; (C) 3-back, hour 1; (D) 3-back, hour 3. Dashed line represents mean data for women, complete line represents mean data for men. * indicates a difference from V-HIGH; Φ indicates a difference between the sexes (p<0.05).

There was no effect of load (χ2 (3)=6.155, p>0.05) or time (χ2 (2)=0.913, p>0.05) during the 0-backs. During the 1-backs, there was a main effect of time in participants wearing HIGH (χ2 (2)=9.160, p=0.010). Accuracy decreased during hour 3 when compared with hour 1 (p=0.003, r=−0.7) and hour 2 (p=0.045, r=−0.5). There was also a main effect of load during hour 1 (χ2 (3)=11.541, p=0.009). Accuracy was reduced in V-HIGH when compared with LOW (p=0.003, r=−0.8), MED (p=0.004, r=−0.7) and HIGH (p=0.007, r=−0.7) (Figure 3A).

During the 2-backs, there was no effect of time (χ2 (2)=1.849, p>0.05), but there was a main effect of load during hour 1 (χ2 (3)=12.459, p=0.006) and hour 3 (χ2 (3)=13.791, p=0.003). Accuracy was reduced in V-HIGH when compared with HIGH during hour 1 (p=0.028, r=−0.6) and when compared with LOW (p=0.018, r=−0.8), MED (p=0.028, r=−0.8) and HIGH (p=0.017, r=−0.8) during hour 3 (Figure 3B).

A similar pattern was observed during the 3-backs, with no effect of time (χ2 (2)=2.769, p>0.05), but a main effect of load during hour 1 (χ2 (3)=10.886, p=0.012) and hour 3 (χ2 (3)=9.800, p=0.020). Accuracy was reduced in V-HIGH when compared with LOW (p=0.018, r=−0.7) and HIGH (p=0.008, r=−0.8) during hour 1 (Figure 3C) and MED (p=0.028, r=−0.9) and HIGH (p=0.018, r=−0.9) during hour 3 (Figure 3D).

Accuracy ranged from 74% to 98% in men and 62% to 98% in women (see online supplemental table 2). Accuracy was reduced in women during the 0-backs in HIGH (hour 2: U=12.000, p=0.015, r=−0.6) and V-HIGH (hour 1: U=13.500, p=0.046, r=−0.5). Descriptive statistics indicate that accuracy in women was reduced earlier in the march in the heavier loads. Women with a higher accuracy tended to march for longer; thus, the difference between men and women reduced as time progressed.

Response time

During the 0-back, there was a main effect of load (F(3, 12)=4.087, p=0.033, ηp2=0.505), but not time (F(2, 8)=2.025, p≥0.05, ηp2=0.336) and there was no interaction between load and time (F(6, 24)=1.363, p≥0.05, ηp2=0.254). During hour 3, reaction time was reduced in V-HIGH when compared with MED (p=0.010, d=0.7). No other differences in reaction time were observed. Descriptive data indicated that women took longer to respond to a target than men.

Visual Go/No Go test


During hour 1, there was a main effect of load (χ2 (3)=9.478, p=0.024). Accuracy was greatest in LOW when compared with MED (p=0.010, r=−0.5), HIGH (p=0.002, r=−0.6) and V-HIGH (p=0.017, r=−0.5) (Figure 4). No other differences were observed between loads or over time.

Figure 4

Left: visual Go/No Go accuracy measured during hour 1. Right: visual Go/No Go response time in LOW. Dashed line represents mean data for women, complete line represents mean data for men. * indicates a difference from LOW; † indicates a difference from hour 1; Φ indicates a difference between the sexes (p<0.05).

Accuracy ranged from 69% to 89% in men and 65% to 90% in women (see online supplemental table 3). Descriptive data suggest that women tended to have a higher accuracy in lighter loads, but accuracy in men was higher in the heavier loads.

Response time

There was a main effect of time when participants wore LOW (χ2 (5)=17.629, p=0.003). Response time was quicker in the final hour of the march when compared with the first hour (p=0.004, r=−0.7) (Figure 4). Men responded more quickly than women in the final hour of the march in LOW (p=0.028, r=−0.5) (Figure 4). No differences in response time were observed with load (p>0.05).

Auditory Go/No Go test


Accuracy data are provided in the online supplemental table 4. Due to a technical error, data were only analysed for eight men. No differences in accuracy were observed with load (F(3, 12)=0.945, p>0.05, ηp2=0.191) or time (F(5, 20)=2.135, p>0.05, ηp2=0.348) with no interaction between load and time (F(15, 60)=1.209, p>0.05, ηp2=0.232). Accuracy ranged from 93% to 97% in men and 77% to 95% in women. Descriptive data indicated that accuracy was reduced in the heavier loads in women.

Response time

Overall response time was unaffected by load (F(3, 12)=2.132, p>0.05, ηp2=0.348) and time (F(5, 20)=1.040, p>0.05, ηp2=0.206) and there was no interaction between load and time (F(15, 60)=0.927, p>0.05, ηp2=0.1688). No differences in response time were observed between the sexes.


This study aimed to compare cognitive performance in service men and women conducting a long duration loaded march in doctrinal military loads. In spite of stringent medical and physical selection criteria, only six men and one woman were able to complete all conditions, with discomfort being a common reason for withdrawal. Due to incomplete data sets, it was not possible to fully test the study hypotheses; however, this work provides evidence of the impact of discomfort on military relevant physical and cognitive task performance. This work also highlights important issues that have implications for the procurement of soldier equipment as well as methodological considerations for future research.

Notwithstanding the high number of test withdrawals, decreased accuracy was observed during the n-back tests and visual Go/No Go tests with both load and time. This supports the findings of other studies which have reported reductions in cognitive performance during load carriage.8–12 In addition, review of the descriptive statistics from the current study provides initial indication that reductions in accuracy occurred during less complex tests and with ‘lighter’ loads in women, although this observation should be confirmed with follow-on work.

Reports of discomfort increased in both sexes with load, but the frequency of these reports was greatest in women (see online supplemental table 1). Previous work has demonstrated that the configuration of equipment can impact on marching performance.7 In the current study, participants attributed the discomfort experienced to issues with equipment fit and integration. Despite five sizes of armour being available, 9 out of 10 women reported issues related to fit, compared with 0 out of 12 men. The mass and stature of the participants were in close agreement with data from the UK military anthropometric database. This highlights that women should not be issued with smaller sizes of equipment that has been designed to fit men; equipment should be designed so that it is inclusive of different body shapes.

When considering body armour and load carriage systems, it is challenging to define ‘good fit’ beyond the subjective experience of the wearer. Stirling et al 27 have attempted to define fit based on three characteristics which include static, dynamic and cognitive fit. These characteristics consider how the equipment interacts with the wearer during static postures, dynamic movements and the impact of the equipment on cognitive capabilities, specifically somatosensation and executive function. In relation to cognitive fit, the authors consider how the equipment affects movement, posture and sensory feedback as well as cognitive processes. The current study supports the approach proposed by Stirling et al 27 as it highlights the impact that discomfort and poor fit can have on cognitive performance and the importance of assessing these characteristics during the equipment procurement process.

Anecdotal reports from investigators also indicate that more experienced soldiers were better able to mitigate the discomfort caused by load. In the current study, the experience of the participants ranged from newly trained soldiers to those who had undertaken several operational tours as section commanders. As the women who took part in this study were not Infantry soldiers, a lack of load carriage experience may have contributed to some of the observed results.

V̇O2 data were used to quantify work rate so that the results of the cognitive tests could be interpreted within the context of the physical demands. V̇O2 was greater in women, which likely reflects the additional dead mass carried by this group (5 kg), where dead mass was the sum of load carried and fat mass. V̇O2 remained below GET, which indicates that the work rate remained moderate throughout the march regardless of the mass carried or sex. As moderate intensity exercise can be sustained for longer than 4 hours,28 the work rate should not have influenced the participants’ ability to complete the march. However, perceived exertion was increased in women compared with men (see online supplemental tables 5 and 6), which may have reduced the available resource to devote to the cognitive tasks, and suggests that women found the marching task more challenging than men.

Our data suggest that increasing the mass carried led to increased conflict between the dual cognitive/physical task conditions, that is, there was a greater impact of the physical task on the behavioural ability to complete the cognitive task, and there are initial indications that this occurred to a greater extent in the women. These potential differences between the sexes may be related to a number of factors including discomfort caused by poor-fitting equipment, load carriage experience given that the women tested were not Infantry, and perceived exertion.

The preliminary findings suggest an interaction between the behavioural requirements of the visual task, specifically the requirement to look at the screen, and the discomfort or physical demands of carrying load, which affected participants’ posture. This observation illustrates the conflicts that may occur in a dual task involving both physical and cognitive demands and the effect they can have on performance. For example, Kobus et al 9 reported decreased accuracy during a visual choice reaction time test after 85 min of marching at 3.2 km/hour while carrying 61 kg. These authors presented eye-tracking data, which identified that visual focus was directed towards the lower part of the projector screen and threat detection was reduced towards the upper portion of the visual field as mass increased.

This study indicates that both perceived exertion and discomfort have the potential to add to the cognitive load and negatively affect the subjective experience of the soldier. There is a growing body of evidence exploring the relationship between physical exertion and cognition14–16 29; however, few studies have considered discomfort.30 In the current study, the participants experienced difficulties completing the cognitive tests as it was uncomfortable for them to lift their head to look up at the projector screen in some load configurations. This discomfort caused some participants to withdraw from the exercise test altogether, while others withdrew from the cognitive tests but continued marching. The implications of this behavioural response in an operational scenario may include reduced situational awareness, reduced vigilance and an increase in task errors. It is also interesting to note that reductions in accuracy were not observed during the auditory Go/No Go test, which conflicts with the findings of others who have used auditory tests to assess cognitive performance during load carriage.10 11 In the current study, there was no requirement to look at the projector screen during the auditory tests. This could have reduced the conflict between the posture needed to look at the screen and the limitations associated with the load configurations, resulting in reduced demands placed on the participant during this test.


These findings should be considered preliminary given the small number of participants included in statistical analyses. Military participants were recruited to ensure the highest ecological validity, with the trade-off that they were only available for a limited time period and it was not possible to recruit participants until the desired power was achieved. This limitation also reduced the number of recovery days between tests and may have contributed to a number of test withdrawals due to the cumulative effects of blisters, chafing and muscular discomfort. This schedule, however, could be considered representative of the demands placed on soldiers during operations. Future studies should consider fewer test configurations or longer recovery periods to mitigate the risk of data loss resulting from test withdrawals, or increased participant recruitment to allow for withdrawals if the research requirement necessitates testing on multiple successive days to better replicate demands in the field.

Women were recruited from the Royal Artillery as there were no women in the Infantry at the time of the study. Although the Royal Artillery have physical employment standards of similar demands to the Infantry, they do not routinely carry loads of this magnitude, which may have contributed to increased reports of discomfort and higher ratings of perceived exertion.

The test battery was designed to represent the demands of operational scenarios in terms of the loads carried and the cognitive tasks required by the soldier. Conducting this study in a more immersive simulated environment or in a field environment would further increase face validity. The test battery was developed for a controlled environment to reduce the impact of confounding factors such as varying weather and light levels on the findings. Practically, the participants who were trained soldiers, many with operational experience, may have suffered boredom from the monotony of treadmill-based exercise, which prevented self-paced exercise. Future work should investigate how this test battery could be used in the field environment to evaluate, ecologically, the cognitive demands on the soldier, or a hybrid of simulated laboratory environment and field testing.6 10 11


This study provides further evidence that reductions in cognitive performance will be observed during a long duration loaded march. Initial observations indicate that women may experience reductions in task accuracy earlier, in lighter loads and during less complex tasks than men, but this finding requires further confirmation. The differences observed between the sexes were attributed to increased discomfort in women, possibly caused by poor-fitting equipment and poor integration, greater load carriage experience of men and a greater perceived exertion experienced by women. These physical factors will have influenced the amount of reserve available for women to devote to the cognitive tasks. These data should be considered during future equipment procurement to ensure that equipment issues are not a limiting factor for the physical and cognitive performance of service personnel.

Data availability statement

All data relevant to the study are included in the article or uploaded as supplemental information.

Ethics statements

Patient consent for publication

Ethics approval

This study involves human participants and received favourable ethical opinion from the Ministry of Defence Research Ethics Committee (protocol 694/MODREC/15). Participants gave informed consent to participate in the study before taking part.


The authors would like to thank the project management team Mrs Lorraine Beavis and Mr Jon Russell; the military advisors involved in the study design Lt Col Mike Potter and Lt Col Howard Long, and Dr Thomas O’Leary and Dr Charlotte Coombs for providing technical review.

© Crown copyright (2022), Dstl. This material is licensed under the terms of the Open Government Licence except where otherwise stated. To view this licence, visit or write to the Information Policy Team, The National Archives, Kew, London TW9 4DU, or email:




  • Twitter @nic_c_armstrong, @JulieGreeves

  • Contributors NCA, SLW, JPG, SJRS and JRH designed the study. Data collection was conducted by NCA, DR, DD and SLW. NA and DD analysed the data. NCA and SJRS prepared the manuscript. All authors reviewed the manuscript. MT, JRH and ML supervised the project (PhD supervisors). NCA is the guarantor for the study.

  • Funding This work was jointly funded by the UK Ministry of Defence, through the Dstl Land Integrated Survivability Programme, Dismounted Protection project (contract number STECH/0008) and the Women in Ground Close Combat Review Programme (Army Contract number: ARMYHQ2/00061).

  • Competing interests None declared.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.