Table 1

The 25 most common tokens, bigrams and trigrams in the text corpus. Numbers in parentheses denote the frequency of the n-gram in the corpus. The token list was generated from a corpus in which common specific bigrams had been reduced to their shortest stem (see the online supplementary table for a complete list); the corpus used for the analysis of bigrams and trigrams had undergone selective stemming without the removal of bigrams.

Rank  Token            Bigram                     Trigram
1     case (1022)      army medical (245)         royal army medical (179)
2     medical (736)    royal army (190)           army medical corps (163)
3     note (587)       medical corps (163)        journal royal army (47)
4     military (569)   medical service (153)      army medical service (39)
5     army (555)       enteric fever (110)        defence medical service (21)
6     treatment (490)  note case (100)            mitchiner memorial lecture (19)
7     report (416)     medical officer (80)       south africa war (19)
8     service (374)    military hospital (73)     experience territorial medical (18)
9     war (347)        british army (64)          territorial medical officer (18)
10    hospital (316)   south africa (64)          war experience territorial (18)
11    soldier (311)    field ambulance (63)       uk armed forces (17)
12    india (276)      british soldier (62)       case enteric fever (16)
13    british (260)    special reference (59)     army medical college (15)
14    field (260)      typhoid fever (57)         case gunshot wound (15)
15    injury (259)     gunshot wound (53)         committee royal society (15)
16    fever (246)      report case (53)           medical research council (14)
17    malaria (245)    journal royal (47)         reminiscences army surgeon (14)
18    africa (220)     general hospital (44)      report medical research (14)
19    operation (204)  armed forces (42)          report royal commission (14)
20    disease (200)    military personnel (42)    21 other trigrams (13)
21    wound (180)      unusual case (39)
22    training (175)   treatment gonorrhoea (35)
23    use (171)        army surgeon (34)
24    trauma (167)     field hospital (34)
25    enteric (162)    active service (32)
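
The frequency counts reported in Table 1 can be illustrated with a minimal sketch of n-gram counting. This is not the authors' pipeline: the function names `ngrams` and `top_ngrams` and the three sample documents are hypothetical, and the sketch assumes the text has already been tokenised, stemmed and stripped of stop words. N-grams are counted within each document so that no n-gram spans a document boundary.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def top_ngrams(documents, n, k=25):
    """Count n-grams across all documents and return the k most frequent."""
    counts = Counter()
    for tokens in documents:
        # Count per document so n-grams never cross document boundaries.
        counts.update(ngrams(tokens, n))
    return counts.most_common(k)


# Hypothetical pre-processed documents (illustrative only, not the paper's data).
documents = [
    ["royal", "army", "medical", "corps"],
    ["note", "case", "enteric", "fever"],
    ["case", "enteric", "fever", "india"],
]

for n, label in [(1, "Token"), (2, "Bigram"), (3, "Trigram")]:
    print(label)
    for gram, freq in top_ngrams(documents, n, k=3):
        # Same "ngram (frequency)" layout as the table cells.
        print(f"  {' '.join(gram)} ({freq})")
```

Setting `n` to 1, 2 or 3 yields the token, bigram and trigram columns respectively. Ties in frequency are returned in the order the n-grams were first seen, so a real analysis would need an explicit tie-breaking rule for entries such as the grouped "21 other trigrams (13)" row.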