01 February 2011: Diagnostics and Medical Technology
Evaluation of the accuracy of manual and automatic scoring of a single airflow channel in patients with a high probability of obstructive sleep apnea
Ahmed S BaHammam ADEFG , Munir Sharif BC , Divinagracia E. Gacuan BD , Smitha George BD
DOI: 10.12659/MSM.881379
Med Sci Monit 2011; 17(2): MT13-19
Background
Obstructive sleep apnea (OSA) is a common sleep disorder with serious complications, including hypertension, atherosclerosis, stroke and insulin resistance [1–3]. The gold standard method for diagnosing OSA is in-lab, attended, full overnight polysomnography (PSG) (Type 1 sleep study). As awareness among the public and health care providers has increased, referrals for OSA diagnosis have increased; consequently, the waiting time for a Type 1 sleep study has increased significantly. Type 1 sleep studies are expensive, labor intensive and time consuming. Additionally, a lack of trained technologists capable of performing proper in-lab neuro-cardio-pulmonary monitoring is a major obstacle, particularly in developing countries [4]. As a result, many people with OSA remain undiagnosed and untreated [5]. In addition, many patients consider sleeping in the laboratory inconvenient and say that it does not reflect their typical sleep at home.
Portable monitoring (PM) has been proposed as an alternative to an in-lab, attended sleep study for the diagnosis of OSA. PM advocates suggest that it reduces the need for expensive, time- and labor-intensive laboratory testing, requires less technical expertise and facilitates the diagnosis and initiation of appropriate therapy. Nevertheless, the accuracy and validity of different PM devices in the diagnosis of OSA remain a major concern. The American Sleep Disorders Association has classified PM into 4 types [6]: Type 1 (standard in-lab, attended PSG) is the gold standard to which other monitoring types are compared; Type 2 (comprehensive portable PSG) includes a minimum of 7 channels, including neuro-cardio-respiratory monitoring; Type 3 incorporates a minimum of 4 channels, including cardio-respiratory monitoring; and Type 4 comprises 1 or 2 channels, usually oxygen saturation or airflow. Most of the previous studies that have addressed the validity and accuracy of PM in OSA patients used Type 3 PM.
Consequently, a 2007 practice parameter from the American Academy of Sleep Medicine (AASM) stated that attended Type 3 PM is acceptable for the diagnosis of OSA with certain limitations, including a required review of the raw data, the absence of comorbidities, and the need for a definitive evaluation if the study is nondiagnostic [7]. The guidelines discouraged the use of Type 4 studies due to lack of sufficient evidence [7]. In the guidelines, the AASM gave conditional approval for the use of unattended PM to diagnose OSA. The guidelines concluded that unattended PM can be performed only in conjunction with a comprehensive sleep evaluation by a qualified specialist, in patients with a high pre-test probability of moderate to severe OSA, and in the absence of other sleep disorders and comorbid conditions [7]. Therefore, there remains a need to study the accuracy of PM devices, particularly Type 4, in the diagnosis of OSA. One frequently used Type 4 device is ApneaLink™ (AL).
A limited number of studies have assessed the accuracy of AL in patients with clinical suspicion of OSA [8–11]. Although AL permits the manual checking and scoring of raw data, previous studies have examined only the results of the default automatic scoring system. Therefore, we designed the present study to assess the sensitivity and specificity of both automatic and manual AL scoring compared with PSG for the diagnosis of OSA in a selected group of patients.
Material and Methods
SUBJECTS:
Consecutive patients referred to the Sleep Disorders Center between September 2008 and November 2009 with a history of snoring and clinical suspicion of sleep-disordered breathing were assessed by a sleep disorders specialist. Patients between 18 and 65 years old with a high clinical suspicion of OSA based on the presence of loud, interrupted snoring, daytime sleepiness or witnessed apneas in the absence of symptoms of other sleep disorders were included. Patients with chronic pulmonary diseases, elevated PaCO2, congestive heart failure, neuromuscular diseases, and those on home oxygen or mechanical ventilation were excluded. Patients with a history of nasal blockage at the time of the PSG were not included. The study was approved by our institutional review board (IRB). Informed consent was obtained from all participants.
PROTOCOL:
All participants underwent a simultaneous overnight sleep study with the portable AL device and in-lab Type 1 PSG (Alice 5; Respironics, Inc., Murrysville, PA). Standard in-lab PSG monitored brain activity (electroencephalogram with electrodes placed at C3A2, C4A1, O1A2, O2A1); muscle tone (electromyogram of the chin and both legs); eye movements (electrooculogram); heart rate (electrocardiogram); oxygen saturation (finger pulse oximeter); chest and abdominal wall movements (thoracic and abdominal belts); airflow (thermistor and nasal prong pressure transducer); sleep position (body position sensor); and snoring (microphone). PSG data were scored manually according to established criteria [12,13]. Obstructive apnea was defined as a drop in the peak thermal sensor excursion greater than or equal to 90% of baseline for at least 10 seconds in the presence of continued respiratory effort. At least 90% of the event’s duration had to meet the amplitude reduction criteria. Hypopnea was defined as a reduction in airflow by 50% of baseline lasting for more than 10 seconds, resulting in a ≥3% decrease in oxygen saturation or an arousal. At least 90% of the duration of the event had to meet the amplitude reduction criteria [13]. The apnea-hypopnea index (AHI) was calculated by dividing the total number of obstructive apnea and hypopnea episodes by the total sleep time (TST) in hours.
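The AHI calculation described above can be sketched in a few lines. This is a minimal illustration, not the scoring software used in the study; the function name, the minutes-based input and the example counts are assumptions made for the sketch.

```python
def apnea_hypopnea_index(n_apneas: int, n_hypopneas: int,
                         total_sleep_time_min: float) -> float:
    """Return events per hour of sleep (hypothetical helper, not study code)."""
    if total_sleep_time_min <= 0:
        raise ValueError("total sleep time must be positive")
    hours_asleep = total_sleep_time_min / 60.0
    return (n_apneas + n_hypopneas) / hours_asleep


# Illustrative counts only (not study data):
# 120 apneas + 60 hypopneas over 360 min of sleep.
print(apnea_hypopnea_index(120, 60, 360.0))  # 30.0
```

The same function applies to a device that cannot measure sleep time if the evaluation time is passed in place of TST, at the cost of a possibly lower index.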
APNEALINK™ (AL):
AL is a single-channel device that measures flow via a nasal cannula connected to a pressure transducer. AL was connected to one end of a Y-shaped nasal cannula. The other peripheral end of the Y-shaped cannula was connected directly to the nasal prong pressure transducer. The nasal cannula is attached to a small case that includes the pressure transducer. The device is powered by two 1.5-V AA batteries and is affixed to the patient’s chest with a belt. The sampling rate of the flow signal is 100 Hz, with a flow-sensor effective range of −10 to 10 cmH2O and a 16-bit signal processor. The signal is processed by linearizing, filtering the noise and zeroing to baseline. Flow measurements are digitized and downloaded to a PC. The device is capable of approximately 10 hours of data collection, with an internal storage memory of 15 MB. The signal is displayed in epochs, with full data disclosure capability that allows both automatic and manual scoring. The default settings for apneas and hypopneas were used in the automatic scoring. Apnea was defined as a reduction of 80% of baseline airflow for at least 10 s, with a maximum duration of 80 s. Hypopnea was defined as a reduction of 50% to 80% of baseline airflow for at least 10 s, with a maximum duration of 100 s. AL firmware version 2.97 and software version 5.13 were used in the automatic scoring. AHIs obtained from AL automatic scoring were based on the total evaluation times with good signals. Evaluation times with poor/bad signals were automatically excluded by AL. For manual scoring, apnea was defined as a 90% or greater reduction in baseline airflow for at least 10 s, and hypopnea was defined as a reduction of 30% or more of baseline airflow for at least 10 s. At least 90% of the duration of the event had to meet amplitude reduction criteria.
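To make the default automatic thresholds concrete, the following sketch applies them to a flow trace. It is not the ApneaLink algorithm, whose internals are not described here. It assumes a hypothetical input in which each sample is one second of airflow amplitude already expressed as a fraction of baseline, and it ignores baseline estimation, signal-quality exclusion and the 90%-of-duration rule.

```python
from itertools import groupby

# Default automatic-scoring thresholds quoted in the text.
APNEA_MAX_FLOW = 0.20     # flow at or below 20% of baseline (>= 80% reduction)
HYPOPNEA_MAX_FLOW = 0.50  # flow at or below 50% of baseline (>= 50% reduction)
MIN_DURATION_S = 10
MAX_DURATION_S = {"apnea": 80, "hypopnea": 100}


def label_second(relative_flow: float) -> str:
    """Label one second of flow (fraction of baseline) by reduction band."""
    if relative_flow <= APNEA_MAX_FLOW:
        return "apnea"
    if relative_flow <= HYPOPNEA_MAX_FLOW:
        return "hypopnea"
    return "normal"


def score_events(relative_flow_per_second):
    """Return (label, start_second, duration_s) for each qualifying run."""
    events = []
    start = 0
    for label, run in groupby(relative_flow_per_second, key=label_second):
        duration = len(list(run))
        if label != "normal" and MIN_DURATION_S <= duration <= MAX_DURATION_S[label]:
            events.append((label, start, duration))
        start += duration
    return events


# Synthetic trace (not study data): one 12-s apnea, one 15-s hypopnea,
# and one 6-s dip that is too short to count.
trace = [1.0] * 30 + [0.1] * 12 + [1.0] * 20 + [0.4] * 15 + [1.0] * 5 + [0.4] * 6 + [1.0] * 10
print(score_events(trace))  # [('apnea', 30, 12), ('hypopnea', 62, 15)]
```

Swapping the constants for the manual criteria (90% reduction for apnea, 30% for hypopnea, no maximum duration) shows why manual scoring counts more hypopneas than the default automatic settings.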
Manual respiratory event scoring in AL was performed in 2 separate passes by 2 experienced technicians working independently and blinded to each other’s results (the interscorer agreement for respiratory events indices was 94.4%). Each scorer was also blinded to the PSG scoring results and the patients’ clinical data. The average of the 2 scores was used in the final analysis. A minimum recording time of 4 hours with a good signal was required. We defined normal as PSG AHI less than 5/h; mild OSA as AHI 5 to 15/h; moderate OSA as AHI 15 to 30/h; and severe OSA as AHI greater than 30/h [14].
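The severity categories can be written as a small lookup. This is an illustrative sketch; the function name is hypothetical, and because the stated ranges share their endpoints, assigning an AHI of exactly 15 or 30 to the moderate category is an assumption of the sketch rather than something the text specifies.

```python
def classify_osa_severity(ahi: float) -> str:
    """Map a PSG AHI (events/h) to the severity categories used in the text.

    Boundary handling at exactly 15 and 30 is an assumption of this sketch.
    """
    if ahi < 0:
        raise ValueError("AHI cannot be negative")
    if ahi < 5:
        return "normal"
    if ahi < 15:
        return "mild"
    if ahi <= 30:
        return "moderate"
    return "severe"


# Illustrative values only (not study data).
for value in (3.2, 9.0, 22.5, 48.0):
    print(value, classify_osa_severity(value))
```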
STATISTICAL ANALYSIS:
Data are expressed as the mean ± SD. Pearson’s correlation was used to assess the relationship between PSG (PSG AHI) and AL (Auto and Manual AHI). An agreement analysis between the PSG and AL scoring results was performed following the Bland and Altman method [15]. The Bland-Altman plot shows, for each patient, the difference between each AL score (AL Auto AHI or AL Manual AHI) and PSG AHI against the mean of the two values ([AL Auto AHI + PSG AHI]/2 and [AL Manual AHI + PSG AHI]/2). The limits of agreement were defined as the mean difference ±1.96 SD. The receiver operating characteristic (ROC) curves between AL Auto AHI and PSG AHI and AL Manual AHI and PSG AHI were compared with cut-off values of 5, 10, 15 and 30, based on the PSG AHI. To assess the degree of rise of the ROC curve to the upper left-hand corner, the area under the curve (AUC) was measured. In general, a steeper rise of the curve corresponds to better test performance. An area of 1 represents perfect agreement, and an area of 0.5 represents discrimination no better than chance. In this study, we adopted the AUC values used by Erman et al, where excellent is 0.9 to 1, very good is 0.8 to 0.9 and good is 0.7 to 0.8 [8].
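The agreement and ROC measures can be illustrated with the Python standard library. This is a sketch under stated assumptions, not the SPSS procedure used in the study: the function names are hypothetical, the difference is taken as device minus reference, the AUC is computed with the rank-based (Mann-Whitney) formulation, and the five paired values are invented for the example.

```python
from statistics import mean, stdev


def bland_altman(device, reference):
    """Return bias and limits of agreement (bias +/- 1.96 SD of differences)."""
    if len(device) != len(reference) or len(device) < 2:
        raise ValueError("need two paired series with at least 2 values")
    diffs = [d - r for d, r in zip(device, reference)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return {"bias": bias, "lower": bias - 1.96 * sd, "upper": bias + 1.96 * sd}


def roc_auc(scores, has_disease):
    """AUC as the probability that a diseased case outscores a non-diseased one."""
    pos = [s for s, y in zip(scores, has_disease) if y]
    neg = [s for s, y in zip(scores, has_disease) if not y]
    if not pos or not neg:
        raise ValueError("need at least one diseased and one non-diseased case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented paired AHI values (not study data).
psg_ahi = [2, 8, 20, 40, 60]
al_ahi = [3, 6, 15, 35, 50]

agreement = bland_altman(al_ahi, psg_ahi)
print(round(agreement["bias"], 2))  # -4.2

# Disease status defined by the reference test at a PSG AHI cut-off of 15.
status = [value >= 15 for value in psg_ahi]
print(roc_auc(al_ahi, status))  # 1.0
```

A negative bias under this sign convention means the device reads lower than PSG on average.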
Sensitivity, specificity, positive predictive values (PPV), negative predictive values (NPV) and positive and negative likelihood ratios (LR) were calculated for the same cut-off values of PSG AHI. In general, a likelihood ratio less than 1 indicates that the test result is associated with the absence of the disease, whereas a likelihood ratio greater than 1 indicates that the test result is associated with the presence of disease. Likelihood ratios below 0.1 and above 10 are considered to provide strong evidence to rule out or rule in diagnoses, respectively. When tests report results as being either positive or negative, the 2 likelihood ratios are called the positive likelihood ratio (PLR) and the negative likelihood ratio (NLR) [16].
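These diagnostic indices all follow from a 2×2 table of device result against PSG result at a given cut-off. The sketch below shows the standard formulas; the function name and the example counts are assumptions for illustration and are not taken from the study tables.

```python
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute standard accuracy indices from 2x2 counts (hypothetical helper)."""
    if min(tp + fn, tn + fp, tp + fp, tn + fn) == 0:
        raise ValueError("every row and column of the 2x2 table must be non-empty")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        # PLR = sensitivity / (1 - specificity); infinite when there are no false positives.
        "plr": sensitivity / (1 - specificity) if fp > 0 else float("inf"),
        # NLR = (1 - sensitivity) / specificity; infinite when there are no true negatives.
        "nlr": (1 - sensitivity) / specificity if tn > 0 else float("inf"),
    }


# Invented counts (not study data): 50 patients with disease, 50 without.
result = diagnostic_metrics(tp=40, fp=5, fn=10, tn=45)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

With these invented counts, sensitivity is 0.80 and specificity 0.90, giving a PLR of about 8 and an NLR of about 0.22.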
The Statistical Package for Social Sciences Program (SPSS 16.0, Chicago, IL, USA) and MS Excel 2007 were used in the analysis.
Sample size calculation was performed based on the data reported by Chen et al. [11]. In their study population, the standard deviation of the AHI calculated by AL was 22 events/hr. For the current study, the same number was chosen. The mean difference was assumed to be 8 events/hr. For a power of 85% and an alpha (α) of 0.05, 87 participants were required. A dropout rate of 7% (due to equipment failure or data loss) was calculated. Therefore, a minimum of 95 participants was required. We recruited 107 consecutive patients.
Results
One hundred and seven patients underwent simultaneous PSG and AL monitoring. Three of the patients were excluded because they were diagnosed with hypoventilation during sleep, and 9 patients were excluded due to data loss or short evaluation duration. Complete data were available for 95 patients. The mean evaluation duration with a good signal was 4.5±1.6 hr. The mean duration with a weak signal was 0.7±1.0 hr. The mean signal failure rate was 16%.
The mean age of the studied group was 46.3±12.6 yr (18 to 65 yr), and their body mass index was 34.1±7.9 kg/m² (18.6–57.7 kg/m²). Based on PSG findings, 14 patients (15%) did not have OSA (AHI<5/h). Mild OSA (5≤ AHI ≤15) was present in 22 patients (23%), moderate (15≤ AHI ≤30) in 18 patients (19%), and 41 patients (43%) had severe OSA (AHI ≥30). The patients’ demographic and polysomnographic characteristics are presented in Table 1. The following comorbidities were present: hypertension (35%), ischemic heart disease without heart failure (7%), diabetes mellitus (29%), hypothyroidism on treatment (13%), and hyperlipidemia (37%).
The mean AHI for PSG was 34.1±32.4/hr (0–142/hr). AL Auto AHI was 20.1±25.2/hr (1–110/hr) and AL Manual AHI was 39.5±30.4/hr (1–123/hr). There was a strong correlation between PSG AHI and AL Auto AHI (r=0.883, p<0.005) and AL Manual AHI (r=0.966, p<0.005) (Figure 1A, B). Bland-Altman agreement analyses for AL Auto AHI, AL Manual AHI and PSG AHI are shown in Figure 2A, B. For AL Auto AHI, the plot revealed that 94% of the differences were within the limits of agreement (±1.96 SD). However, the mean difference between PSG AHI and AL Auto AHI was −5.8, indicating an underestimation of the score by AL Auto. Additionally, AHI scores drifted below the identity line, indicating some degree of under-recognition by AL Auto. The plots also revealed that AL Auto tended to better estimate disease severity at lower AHI values (<20/hr), whereas it underestimated severity at higher AHI values. Ninety-five percent of AL Manual scores were within the limits of agreement (±1.96 SD).
Figure 3 shows the comparisons of ROC curves, with cut-off values of AHI of 5, 10, 15 and 30, for both AL Auto and Manual with the corresponding area under the curve (AUC). At all AHI cut-off points, AUC values for AL Auto and Manual indicated very good to excellent agreement. However, AL Manual showed better agreement with PSG AHI at all cut-off points.
Table 2 shows the sensitivity, specificity, positive and negative predictive values (PPV/NPV), area under the curve (AUC), and positive and negative LR of AL Auto at the different AHI cut-off points. At AHI cut-off points of 5, 10, 15, and 30, the sensitivity/specificity values were 0.79/0.68, 0.70/0.89, 0.65/0.94 and 0.63/0.98, respectively. The PPV/NPV for the same cut-off points were 0.89/0.50, 0.91/0.63, 0.91/0.72, and 0.95/0.85, respectively. The positive and negative likelihood ratios ranged from 2.50 to 41.17 and 0.30 to 0.38, respectively. Table 3 shows the sensitivity, specificity, PPV/NPV and AUC of AL Manual at the different AHI cut-off points. At AHI cut-off points of 5, 10, 15, and 30, the sensitivity/specificity values were 1.00/0.43, 1.00/0.56, 0.98/0.58 and 1.00/0.80, respectively. The PPV/NPV for the same cut-off points were 0.91/1.00, 0.86/1.00, 0.79/0.95, and 0.79/1.00, respectively. The positive and negative likelihood ratios ranged from 1.175 to 4.91 and 0.00 to 0.03, respectively.
Discussion
A simple, portable screening device for OSA is sorely needed. AL is a simple device that requires limited training and short preparation times. It has good potential as a screening device, particularly because it permits manual review and scoring of the raw data. Nevertheless, information about the accuracy of Type 4 devices, including AL, remains limited. Therefore, their use as attended or unattended monitoring devices for OSA patients has not been widely accepted [7]. In the current study, we compared AL automatic and manual scoring results with PSG results. The results showed very good to excellent positive predictive values for both AL Auto and Manual, and excellent negative predictive values for AL Manual in patients with a high pre-test probability of OSA. AL Manual showed excellent sensitivity, indicating its potential use as a screening tool, and AL Auto showed very good specificity, indicating its potential usefulness in case elimination. Hence, AL can be endorsed as a good screening tool for patients with a high probability of OSA. Manual scoring improved the reliability of the device. The AASM has recommended that PM devices allow the display of raw data and permit manual scoring or the editing of automated scoring by a qualified sleep technician [7]. In our study, the raw data were scored in 2 separate passes by 2 trained and experienced sleep technicians, working independently and blinded to each other’s results, using the same criteria used for the in-laboratory PSG scoring. A Bland-Altman plot revealed good agreement between PSG and AL, particularly AL Manual. Additionally, AUC values at all AHI cut-off levels indicated very good agreement between PSG and both AL Auto and AL Manual.
Given the good agreement, we propose that the AL findings (combining automatic and manual scoring; automatic scoring followed by manual check) can be used to justify the initiation of positive airway pressure therapy in sick patients with high clinical suspicion of obstructive sleep apnea, who cannot wait for PSG confirmation. Nevertheless, we still believe that a scheduled, formal, Type 1 sleep study should be conducted to confirm the findings and to properly titrate the pressure. We do not recommend this device for patients who are likely to have central sleep apnea, such as patients with heart failure and elderly patients (>65 years).
Table 4 presents the findings of previous studies that evaluated AL automatic scoring in patients with a high probability of OSA. In general, all studies found good case-finding and elimination abilities for AL. Studies recruited patients with a wide range of ages, AHIs, races and sexes, which may account for some of the differences observed between the studies [11]. Additionally, one should be cautious when interpreting research data on the performance of any scoring software, as investigators may use different versions of software [7]. The duration of monitoring may also influence the results. Erman et al demonstrated that 4 hours or more of recording are needed for optimal results [8]. Recordings performed for less than 4 hours led to more frequent false negative results at low AHI cut-off points [8]. The presence of comorbid conditions may also influence the outcome of results.
A limitation of the present study is that sleep monitoring was performed in the laboratory. Ragette et al reported better event recognition and diagnostic performance in attended AL monitoring compared to unattended home studies [10]. Nevertheless, simultaneous recording in the laboratory reduces the night-to-night variability in respiratory events. The findings of this study cannot be applied to the general public because we studied a highly specific group of patients with a high pre-test probability of OSA. Future studies should test event recognition and diagnostic performance of combined manual and automatic scoring of AL in an unattended home environment. Because we could not accurately assess TST in AL Auto, total recording time was used as the denominator when calculating AHI, which may have resulted in an underestimation of the true AHI.
Conclusions
The current study demonstrated that combining automatic and manual scoring of data (automatic scoring followed by manual scoring) recorded by single-channel AL provides good diagnostic agreement with conventional PSG recordings. Manual scoring improved the reliability of the device. Therefore, we recommend the manual review and scoring of AL data, as it increases the OSA case-finding ability of ApneaLink™. ApneaLink™ is not recommended for patients with central sleep apnea.
References
1. Schahin SP, Nechanitzky T, Dittel C, Long-term improvement of insulin sensitivity during CPAP therapy in the obstructive sleep apnoea syndrome: Med Sci Monit, 2008; 14(3); CR117-21, pmid: 18301354
2. Tkacova R, Dorkova Z, Molcanyiova A, Cardiovascular risk and insulin resistance in patients with obstructive sleep apnea: Med Sci Monit, 2008; 14(9); CR438-44, pmid: 18758413
3. Selim B, Won C, Yaggi HK, Cardiovascular consequences of sleep apnea: Clin Chest Med, 2010; 31(2); 203-20, pmid: 20488282
4. Bahammam AS, Aljafen B, Sleep medicine service in Saudi Arabia. A quantitative assessment: Saudi Med J, 2007; 28(6); 917-21, pmid: 17530111
5. Young T, Evans L, Finn L, Palta M, Estimation of the clinically diagnosed proportion of sleep apnea syndrome in middle-aged men and women: Sleep, 1997; 20(9); 705-6, pmid: 9406321
6. Ferber R, Millman R, Coppola M, Portable recording in the assessment of obstructive sleep apnea. ASDA standards of practice: Sleep, 1994; 17(4); 378-92, pmid: 7973323
7. Collop N, Anderson W, Boehlecke B, Clinical guidelines for the use of unattended portable monitors in the diagnosis of obstructive sleep apnea in adult patients: J Clin Sleep Med, 2007; 3(7); 737-47, pmid: 18198809
8. Erman MK, Stewart D, Einhorn D, Validation of the ApneaLink for the screening of sleep apnea: a novel and simple single-channel recording device: J Clin Sleep Med, 2007; 3(4); 387-92, pmid: 17694728
9. Clark AL, Crabbe S, Aziz A, Reddy P, Greenstone M, Use of a screening tool for detection of sleep-disordered breathing: J Laryngol Otol, 2009; 123(7); 746-49, pmid: 19222876
10. Ragette R, Wang Y, Weinreich G, Teschler H, Diagnostic performance of single airflow channel recording (ApneaLink) in home diagnosis of sleep apnea: Sleep Breath, 2010; 14(2); 109-14, pmid: 19714380
11. Chen H, Lowe AA, Bai Y, Evaluation of a portable recording device (ApneaLink) for case selection of obstructive sleep apnea: Sleep Breath, 2009; 13(3); 213-19, pmid: 19052790
12. Rechtschaffen A, Kales A, A Manual of Standardized Terminology, Techniques and Scoring System for Sleep Stages of Human Subjects: NIH publication 204, 1968, Washington, DC, US Government Printing Office
13. Iber C, Ancoli-Israel S, Chesson AL, Quan SF, The AASM manual for the scoring of sleep and associated events: rules, terminology and technical specifications, 2007, Westchester, Illinois, American Academy of Sleep Medicine
14. Sleep-related breathing disorders in adults: recommendations for syndrome definition and measurement techniques in clinical research. The Report of an American Academy of Sleep Medicine Task Force: Sleep, 1999; 22(5); 667-89, pmid: 10450601
15. Bland JM, Altman DG, Statistical methods for assessing agreement between two methods of clinical measurement: Lancet, 1986; 1(8476); 307-10, pmid: 2868172
16. Deeks JJ, Altman DG, Diagnostic tests 4: likelihood ratios: BMJ, 2004; 329(7458); 168-69, pmid: 15258077