Abstract
Pulmonary thromboembolism as a cause of respiratory complaints is frequently undiagnosed and requires expensive imaging modalities
to diagnose. The objective of this study was to determine if genetic programming could be used to classify patients as having
or not having pulmonary thromboembolism using exhaled ventilatory and gas indices as genetic material. Using a custom-built
exhaled oxygen and carbon dioxide analyzer, we recorded exhaled flows, volumes, and gas partial pressures from patients
during a series of deep exhalations and 30 s periods of tidal volume breathing. A diagnosis of pulmonary embolism was made by contrast-enhanced
computerized tomography angiography of the chest and indirect venography supplemented by 90-day follow-up. Genetic programming
developed a series of genomes comprising genes of exhaled CO2, O2, flow, volume, vital signs, and patient demographics from these data, and their predictions were compared to the radiological
results. We found that 24 of 178 patients had pulmonary embolism. The best genome consisted of four genes: the minimum flow
rate during the third 30 s period of tidal breathing, the average peak exhaled CO2 during the first 30 s period of tidal breathing,
the average peak exhaled O2 during the first 30 s period of tidal breathing, and the average peak exhaled CO2 during the fourth
period of tidal breathing, which immediately followed a deep exhalation. This genome had 100% sensitivity and
91% specificity on the construction population and 100% and 82%, respectively, when tested on the separate validation population.
Genetic programming using only data obtained from exhaled breaths classified patients with suspected
pulmonary thromboembolism with 100% sensitivity and 82% specificity in the validation population.
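
The sensitivity and specificity figures quoted above follow from comparing each classifier prediction with the radiological reference diagnosis. As a minimal sketch of that arithmetic only, the Python below computes both values from paired labels. The function name and the six-patient toy cohort are hypothetical and are not the study's data or code; the abstract does not describe how the four genes are combined into a decision, so no classifier is shown.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    truth: Sequence[bool], predicted: Sequence[bool]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for paired binary labels.

    truth[i] is True when the reference standard found an embolism;
    predicted[i] is True when the classifier called the patient positive.
    """
    if len(truth) != len(predicted):
        raise ValueError("truth and predicted must have the same length")

    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t and p)          # true positives
    fn = sum(1 for t, p in pairs if t and not p)      # missed emboli
    tn = sum(1 for t, p in pairs if not t and not p)  # true negatives
    fp = sum(1 for t, p in pairs if not t and p)      # false alarms

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")

    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy cohort (NOT the study data): 2 emboli, 4 without.
truth = [True, True, False, False, False, False]
predicted = [True, True, True, False, False, False]

sens, spec = sensitivity_specificity(truth, predicted)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

In this toy cohort both emboli are detected (sensitivity 1.00) and one of the four patients without embolism is flagged (specificity 0.75). A classifier with 100% sensitivity and lower specificity, as reported here, misses no emboli but sends some unaffected patients on to imaging.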