LETTER 

Year: 2014  Volume: 60  Issue: 3  Pages: 343-344

Understanding correlation in the context of outliers
SK Raina, Department of Community Medicine, Dr. Rajendra Prasad Government Medical College (RPGMC), Kangra, Himachal Pradesh, India
Correspondence Address:
Dr. S K Raina, Department of Community Medicine, Dr. Rajendra Prasad Government Medical College (RPGMC), Kangra, Himachal Pradesh, India
How to cite this article:
Raina S K. Understanding correlation in the context of outliers. J Postgrad Med 2014;60:343-344

How to cite this URL:
Raina S K. Understanding correlation in the context of outliers. J Postgrad Med [serial online] 2014 [cited 2022 May 21];60:343-344
Available from: https://www.jpgmonline.com/text.asp?2014/60/3/343/138830 
Sir,
This letter pertains to "Correlation between measures of hypoglycaemia and glycemic improvement in sulfonylurea treated patients with type 2 diabetes in India: Results from the OBSTACLE hypoglycaemia study" by Kalra et al., published in a recent issue. [1] The authors are to be complimented for their effort in linking measures of hypoglycemia and glycemic improvement in sulfonylurea-treated patients. However, since the basis of this linkage is purely statistical (correlation), I have a few concerns.
I would like to draw the attention of the authors to the numbers presented in [Table 2]. For gender, the number presented is 943 (526 + 417); for age, it is 949 (45 + 171 + 595 + 138); for BMI, it is 947 (370 + 394 + 183); and for duration of diabetes, it is 928 (640 + 239 + 49). The authors state in their results section that "of the 1069 patients that were enrolled in the study, 950 patients, having the values for HbA1c at week 12 and hypoglycaemia score without any major protocol deviation, were considered evaluable for primary analysis." None of the totals in [Table 2] matches this figure of 950. The authors also state that, keeping 0.70 as the anticipated sample correlation and 0.74-0.66 as the population correlation, at a 5% level of significance and 80% power for a two-sided hypothesis, a planned sample size of 1138 patients was considered adequate. With an evaluable sample smaller than the planned one, how do they expect to arrive at a correlation that has some external validity or generalizability?
[INLINE:1]
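The totals quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch that uses only the counts cited in this letter; the dictionary keys and the helper name `summarize` are illustrative choices, not anything taken from the study.

```python
# Counts as quoted in this letter from Table 2 of Kalra et al.
EVALUABLE = 950   # patients considered evaluable for primary analysis
PLANNED = 1138    # planned sample size quoted from the study

subgroups = {
    "gender": [526, 417],
    "age": [45, 171, 595, 138],
    "BMI": [370, 394, 183],
    "duration of diabetes": [640, 239, 49],
}


def summarize(groups, evaluable):
    """Return {name: (total, shortfall)} with shortfall = evaluable - total."""
    return {
        name: (sum(counts), evaluable - sum(counts))
        for name, counts in groups.items()
    }


summary = summarize(subgroups, EVALUABLE)
for name, (total, shortfall) in summary.items():
    print(f"{name}: n = {total} ({shortfall} fewer than {EVALUABLE})")

# Share of the planned sample that was actually evaluable.
print(f"evaluable / planned = {EVALUABLE / PLANNED:.1%}")
```

Every subgroup total falls short of the 950 evaluable patients, by between 1 and 22 patients, and the evaluable sample itself is smaller than the planned 1138.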
A correlation shows whether two variables are related, how strongly they are related, and in what manner (positive or negative). In statistical terms, the relationship between variables is denoted by the correlation coefficient, whose magnitude is a number between 0 and 1.0 and whose sign gives the direction of the relationship. If there is no linear relationship between the variables under investigation (or between the predicted values and the actual values), then the correlation coefficient is 0. As the strength of the relationship between the variables increases, so does the magnitude of the correlation coefficient, with a value of 1 (or -1 for a negative relationship) showing a perfect relationship.
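To make the coefficient concrete, here is a minimal sketch of the Pearson product-moment correlation computed from its definition. The function name `pearson_r` and the three small data sets are invented for illustration and are not taken from the study. The sign of the result gives the direction of the relationship and its magnitude gives the strength.

```python
import math


def pearson_r(x, y):
    """Pearson correlation: covariance scaled by both standard deviations."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / math.sqrt(sxx * syy)


x = [1, 2, 3, 4, 5]
r_positive = pearson_r(x, [3, 5, 7, 9, 11])   # y rises in step with x
r_negative = pearson_r(x, [10, 8, 6, 4, 2])   # y falls in step with x
r_none = pearson_r(x, [3, 1, 2, 1, 3])        # no linear trend in y

print(r_positive, r_negative, r_none)
```

In this toy data the first pair gives a coefficient of 1, the second gives -1, and the third gives 0. A weak correlation, such as the one discussed in this letter, lies close to 0.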
The authors state that, using linear regression analysis, a weak negative correlation was observed between hypoglycemia scores and HbA1c values at the end of the study (correlation coefficient -0.12; 95% CI -0.18 to -0.06). However, looking at the figure used in the study [Figure 1], the relationship appears more curvilinear than linear. [1] Hence the interpretation needs to be revisited. A correlation coefficient can be misleading when the association is curvilinear or subject to ceiling effects. As an example, think of the relationship between height and age in a population where age ranges from 1 to 90 years. Height rises from age 1 to about 16 or so, and stays more or less constant thereafter. Correlations can also be affected by outliers (extreme scores), so it is useful to plot the data on a scatterplot first. As in the example above, the relationship in this study is affected by extreme scores, since almost 70% of the patients belong to the group (with diabetes of less than 5 years) that shows a highly significant correlation. Moving further through the groups in [Table 2] of the study (given below), this correlation becomes less significant and then nonsignificant. A scatterplot in this case would have provided a better understanding of this phenomenon.
{Figure 1}
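The two cautions above, a curvilinear relationship and the influence of extreme scores, can be illustrated with a short sketch. All numbers here are invented for illustration: the height model (a straight rise until age 16, then a plateau) and the six data points in the outlier example are assumptions, not data from the study, and `pearson_r` is a hypothetical helper defined in the block.

```python
import math


def pearson_r(x, y):
    """Pearson correlation computed from its definition."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def toy_height_cm(age):
    """Assumed toy model: height rises linearly to age 16, then stays flat."""
    return 75.0 + 6.5 * (min(age, 16) - 1)


# 1. Curvilinear relationship with a plateau (ceiling effect).
ages = list(range(1, 91))
heights = [toy_height_cm(a) for a in ages]
r_all_ages = pearson_r(ages, heights)             # ages 1 to 90
r_growth_only = pearson_r(ages[:16], heights[:16])  # ages 1 to 16

# 2. A single extreme point added to otherwise unrelated scores.
x = [1, 2, 3, 4, 5]
y = [3, 1, 2, 1, 3]
r_without_outlier = pearson_r(x, y)
r_with_outlier = pearson_r(x + [20], y + [20])

print(f"height vs age, ages 1-90: r = {r_all_ages:.2f}")
print(f"height vs age, ages 1-16: r = {r_growth_only:.2f}")
print(f"without outlier: r = {r_without_outlier:.2f}")
print(f"with one outlier: r = {r_with_outlier:.2f}")
```

In the toy height data the coefficient over the full age range is well below the near-perfect value obtained for the growth years alone, even though height is completely determined by age in both cases. In the second example a single extreme point turns a coefficient of zero into one close to 1. Neither effect is visible from the coefficient itself, which is why a scatterplot is worth drawing before the coefficient is interpreted.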
References
1. Kalra S, Deepak MC, Narang P, Singh V, Maheshwari A. Correlation between measures of hypoglycaemia and glycemic improvement in sulfonylurea treated patients with type 2 diabetes in India: Results from the OBSTACLE hypoglycaemia study. J Postgrad Med 2014;60:151-5.
