<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE article
  PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.0 20120330//EN" "http://jats.nlm.nih.gov/publishing/1.0/JATS-journalpublishing1.dtd">
<article article-type="editorial" dtd-version="1.0" specific-use="sps-1.7" xml:lang="en" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">
	<front>
		<journal-meta>
			<journal-id journal-id-type="nlm-ta">Braz J Cardiovasc Surg</journal-id>
			<journal-id journal-id-type="publisher-id">rbccv</journal-id>
			<journal-title-group>
				<journal-title>Brazilian Journal of Cardiovascular Surgery</journal-title>
				<abbrev-journal-title abbrev-type="publisher">Braz. J. Cardiovasc.
					Surg.</abbrev-journal-title>
			</journal-title-group>
			<issn pub-type="ppub">0102-7638</issn>
			<issn pub-type="epub">1678-9741</issn>
			<publisher>
				<publisher-name>Sociedade Brasileira de Cirurgia Cardiovascular</publisher-name>
			</publisher>
		</journal-meta>
		<article-meta>
			<article-id pub-id-type="doi">10.21470/1678-9741-2018-0378</article-id>
			<article-id pub-id-type="publisher-id">00003</article-id>
			<article-categories>
				<subj-group subj-group-type="heading">
					<subject>EDITORIAL</subject>
				</subj-group>
			</article-categories>
			<title-group>
				<article-title>Operating with Data - Statistics for the Cardiovascular Surgeon: Part
					III. Comparing Groups</article-title>
			</title-group>
			<contrib-group>
				<contrib contrib-type="author">
					<name>
						<surname>Liguori</surname>
						<given-names>Gabriel Romero</given-names>
					</name>
					<xref ref-type="aff" rid="aff1">1</xref>
					<role>MD</role>
				</contrib>
				<contrib contrib-type="author">
					<name>
						<surname>Moreira</surname>
						<given-names>Luiz Felipe Pinho</given-names>
					</name>
					<xref ref-type="aff" rid="aff1">1</xref>
					<role>MD, PhD</role>
				</contrib>
			</contrib-group>
				<aff id="aff1">
					<label>1</label>
					<institution content-type="orgname">Hospital das Clínicas da Universidade de São Paulo </institution>
					<institution content-type="orgdiv1">Hospital das Clínicas da Universidade de São Paulo </institution>
					<institution content-type="orgdiv2">Hospital das Clínicas da Universidade de São Paulo </institution>
					<addr-line>
        <named-content content-type="city">São Paulo</named-content>
        <named-content content-type="state">SP</named-content>
					</addr-line>
					<country country="BR">Brazil</country>
					<institution content-type="original">Laboratório de Cirurgia Cardiovascular e
						Fisiopatologia da Circulação (LIM-11), Instituto do Coração (InCor),
						Hospital das Clínicas HCFMUSP, Faculdade de Medicina, Universidade de São
						Paulo, São Paulo, SP, Brazil.</institution>
				</aff>
			<pub-date pub-type="epub-ppub">
				<season>Nov-Dec</season>
				<year>2018</year>
			</pub-date>
			<volume>33</volume>
			<issue>6</issue>
			<fpage>V</fpage>
			<lpage>X</lpage>
			<permissions>
				<license license-type="open-access"
					xlink:href="http://creativecommons.org/licenses/by/4.0/" xml:lang="en">
					<license-p>This is an Open Access article distributed under the terms of the
						Creative Commons Attribution License, which permits unrestricted use,
						distribution, and reproduction in any medium, provided the original work is
						properly cited.</license-p>
				</license>
			</permissions>
		</article-meta>
	</front>
	<body>
		<p>In the previous issues of the Brazilian Journal of Cardiovascular Surgery (BJCVS) we
			discussed, first, the fundamental concepts required for understanding
				biostatistics<sup>[</sup><xref ref-type="bibr" rid="B1">1</xref><sup>]</sup> and,
			then, how to demonstrate associations and assess risk<sup>[</sup><xref ref-type="bibr"
				rid="B2">2</xref><sup>]</sup>. In this third part of the editorial series entitled
			"Operating with Data - Statistics for the Cardiovascular Surgeon", we will examine the
			methods for comparing groups.</p>
		<sec>
			<title>Comparing What?</title>
			<p>Here, again, it is important to clarify what we mean by "comparing groups". One
				could argue that, in our last editorial<sup>[</sup><xref ref-type="bibr" rid="B2"
					>2</xref><sup>]</sup>, we were also comparing groups. Indeed, we were comparing
				the occurrence of certain events among two or more groups. However, since both
				variables were qualitative (or categorical), we defined those cases as an analysis
				of association. Now, we are referring to the comparison of quantitative (or
				numerical) variables in two or more groups. In these cases, the object of analysis
				is not the frequency of the events, as before, but the central value of a
				quantitative variable. Strictly speaking, the methods we will present in this
				editorial are valid for those cases in which the independent variable is
				qualitative (<italic>i.e</italic>., the groups) and the dependent variable is
				quantitative.</p>
			<p>Luckily, tests for comparing groups, as defined above, are probably the simplest
				statistical tests to understand, provided you are not willing to go deep into
				their mathematical side, which is our case here. In fact, the whole editorial
				could be summarized in three simple questions (<xref ref-type="fig" rid="f1">Figure
					1</xref>): 1) How many groups are being compared?; 2) Are the data normally
				distributed?; and 3) Are the groups paired? The concepts required to answer
				questions 2 and 3 (<italic>i.e</italic>., data distribution and pairing) are fully
				explained in our first editorial<sup>[</sup><xref ref-type="bibr" rid="B1"
					>1</xref><sup>]</sup>. Still, it is important to understand what is behind each
				of these different tests and, thus, comprehend why they are the choice for each of
				these different situations.</p>
			<p>
				<fig id="f1">
					<label>Fig. 1</label>
					<caption>
						<title>Decision flowchart for group comparison.</title>
					</caption>
					<graphic xlink:href="0102-7638-rbccv-33-06-000V-gf01.jpg"/>
				</fig>
			</p>
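			<p>The flowchart in Figure 1 can be sketched as a small piece of code. The Python
				function below is purely illustrative (its name and the string labels are ours,
				not any standard library API), but it maps the three questions above onto the
				tests discussed in the remainder of this editorial.</p>

```python
# A minimal sketch of the Figure 1 decision flowchart. The function name and
# its labels are illustrative, not a standard statistical API.
def choose_test(n_groups, normal, paired):
    """Return the test suggested by the flowchart for comparing groups."""
    if n_groups == 2:
        if normal:
            return "paired Student's t-test" if paired else "Student's t-test"
        return "Wilcoxon signed rank test" if paired else "Mann-Whitney U test"
    # three or more groups
    if normal:
        return "RM One-way ANOVA" if paired else "One-way ANOVA"
    return "Friedman test" if paired else "Kruskal-Wallis test"

print(choose_test(2, normal=True, paired=False))   # Student's t-test
print(choose_test(3, normal=False, paired=True))   # Friedman test
```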
		</sec>
		<sec>
			<title>Comparison Tests for Two Groups</title>
			<p>If you are working with only two groups, let's say intervention and control, you
				will most likely be using some form of t-test. The t-distribution (and the
				consequent t-test) was first proposed in 1908 by William Gosset<sup>[</sup><xref ref-type="bibr" rid="B3"
					>3</xref><sup>]</sup>, a chemist at the Guinness brewery who could not publish
				his findings under his own name and, thus, did so under the pseudonym "Student",
				which is why the most used t-test is named Student's t-test. The idea behind all
				t-tests is the same: to answer whether the observed difference is larger than we
				should expect from random variation.</p>
			<p>For that, the t-test calculates the ratio between the difference between group means
				and the variance within the groups (<xref ref-type="fig" rid="f2">Figure 2</xref>) to
				define a t-value. Thus, if the difference between the means is small and the
				variance within the groups is large, the t-value is low (<xref ref-type="fig"
					rid="f2">Figure 2A</xref>). Conversely, if the difference is large and the
				variance is small, the t-value is high (<xref ref-type="fig" rid="f2">Figure
					2B</xref>). The higher the t-value, the more significant the difference. Using
				this t-value and the degrees of freedom of the sample (which are related to the
				number of observations), the t-test calculates the <italic>P</italic>-value for that
				difference. In the case of paired data, you can use the paired version of the
				Student's t-test. The details of the test's mathematical formula are beyond the
				scope of this editorial but can be easily found online. The Student's t-test,
				however, is somewhat limited because it assumes that the data are normally
				distributed and that the standard deviation is the same for both groups. When data
				are not normally distributed, other approaches are necessary to test the difference
				between the groups.</p>
			<p>
				<fig id="f2">
					<label>Fig. 2</label>
					<caption>
						<title>Graphical representation of the rationale behind the Student's
							t-test. A) Groups not significantly different. B) Groups significantly
							different.</title>
					</caption>
					<graphic xlink:href="0102-7638-rbccv-33-06-000V-gf02.jpg"/>
				</fig>
			</p>
			<p>Frank Wilcoxon, in 1945, proposed a modification to the Student's t-test that allowed
				Gosset's calculations to be used for non-normal distributions<sup>[</sup><xref
					ref-type="bibr" rid="B4">4</xref><sup>]</sup>. Briefly, the proposed approach
				was to put all the data from both groups together and organize them in ascending
				order, so that each value would now occupy a position within an ordered set of
				values. This position was called a rank, and this rank was used for calculating the
				t-test instead of the actual value. This test was called the Wilcoxon rank-sum test
				(do not confuse it with the Wilcoxon signed rank test). To illustrate this concept,
				let's imagine we have two groups of five patients, A and B, undergoing
				cardiopulmonary bypass (CPB). The time under CPB was registered and tabulated as in
					<xref ref-type="fig" rid="f3">Figure 3A</xref>. These values were then brought
				together and reorganized in an ordered sequence, as in <xref ref-type="fig" rid="f3"
					>Figure 3B</xref>, so that each value was assigned a rank. The rank then
				substitutes the original value in the original table, originating a new table, as in
					<xref ref-type="fig" rid="f3">Figure 3C</xref>. The values in <xref
					ref-type="fig" rid="f3">Figure 3C</xref> are the ones used to calculate the
				Student's t-test. Naturally, you do not need to perform all these steps when running
				a Wilcoxon rank-sum test (the statistics software does it all automatically), but
				it is interesting to understand how the test is performed so that you can better
				comprehend its applications. The Wilcoxon rank-sum test, however, also presented
				limitations, one of them being the fact that it could only be used for groups with
				equal numbers of subjects. In order to solve this issue, two statisticians, Mann and
				Whitney, proposed, in 1947, a modification to the formula of the Wilcoxon rank-sum
				test so that groups of different sizes could be
					evaluated<sup>[</sup><xref ref-type="bibr" rid="B5">5</xref><sup>]</sup>,
				originating the Mann-Whitney U test, also referred to as the Mann-Whitney-Wilcoxon
				(MWW) test. Still, the whole concept behind this test is also the ranking of the
				original values and further calculations with the rank values.</p>
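			<p>The ranking step can be reproduced in a few lines of Python with scipy. The CPB
				times below are invented for illustration (they are not the values of Figure 3);
				the point is only to show the pooling, ranking, and final test.</p>

```python
# Sketch of the ranking idea behind the Wilcoxon rank-sum / Mann-Whitney U
# test, on invented CPB times for two groups of five patients.
from scipy import stats

group_a = [72, 85, 90, 110, 131]   # CPB time, minutes (invented)
group_b = [60, 78, 95, 102, 120]

# Pool the data, rank it, then split the ranks back into the two groups.
ranks = stats.rankdata(group_a + group_b)
ranks_a, ranks_b = ranks[:5], ranks[5:]
print(ranks_a, ranks_b)

# The statistics software does all of this internally:
u, p = stats.mannwhitneyu(group_a, group_b, alternative="two-sided")
print(f"U = {u}, P = {p:.3f}")
```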
			<p>
				<fig id="f3">
					<label>Fig. 3</label>
					<caption>
						<title>Statistics based on ranks. A) Original data. B) Ranked data. C.
							Modified data.</title>
					</caption>
					<graphic xlink:href="0102-7638-rbccv-33-06-000V-gf03.jpg"/>
				</fig>
			</p>
			<p>Still, neither the Wilcoxon rank-sum test nor the Mann-Whitney U test was designed
				to evaluate paired data. To do this, Wilcoxon proposed, in the same publication in
				which he proposed the rank-sum test, another test specific for paired
					data<sup>[</sup><xref ref-type="bibr" rid="B4">4</xref><sup>]</sup>. This test
				is the so-called Wilcoxon signed rank test. Because two groups of paired data
				will always have the same size, this test did not require the modifications proposed
				by Mann and Whitney and thus continues to be the choice for the comparison of two
				paired groups with non-normal distribution.</p>
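			<p>A minimal sketch of the signed rank test, again with scipy and invented values:
				the same patients measured before and after some intervention, with the test run
				on the paired differences.</p>

```python
# Wilcoxon signed rank test on paired, possibly non-normal data.
# The before/after values are invented for illustration.
from scipy import stats

before = [140, 155, 132, 148, 160, 151, 145]
after  = [128, 141, 130, 135, 142, 140, 136]

# The test ranks the absolute paired differences and compares the rank sums
# of the positive and negative differences.
stat, p = stats.wilcoxon(before, after)
print(f"W = {stat}, P = {p:.4f}")
```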
		</sec>
		<sec>
			<title>Comparison Tests For Three or More Groups</title>
			<p>When working with three or more groups, like comparing treatments A, B, and C,
				another type of statistical test must be used, the so-called analysis of variance
				(ANOVA). Maybe some readers are asking themselves why not perform multiple
				t-tests to compare the multiple groups in separate analyses. The answer is
				actually very simple: by using this approach, <italic>i.e</italic>., multiple
				t-tests, the researcher would be increasing the type I error. This means he or she
				would be rejecting the null hypothesis when it is, in fact, true or, putting it in
				simpler words, affirming there is a difference among the groups which actually does
				not exist. Suppose you have five different groups: A, B, C, D, and E. If you used
				only t-tests, you would have to perform ten different t-tests (A
					<italic>vs</italic>. B; A <italic>vs</italic>. C; A <italic>vs</italic>. D; A
					<italic>vs</italic>. E; B <italic>vs</italic>. C; B <italic>vs</italic>. D; B
					<italic>vs</italic>. E; C <italic>vs</italic>. D; C <italic>vs</italic>. E; and
				D <italic>vs</italic>. E). In each of these tests, you assume an acceptable error
				of 5%, <italic>i.e</italic>., <italic>P</italic>=0.05. Thus, after performing all
				the tests, your cumulative error is up to 50%, meaning that if you find a difference
				between two of those five groups by running multiple t-tests, the probability of
				that difference being due to chance may be as high as 50%! To solve this issue, the
				analysis of variance was created.</p>
			<p>To compare several groups at once while keeping a fixed type I error, the analysis
				of variance calculates two types of variance, <italic>i.e</italic>., two measures
				of the spread of the numbers in a data set. The first is the variance within each
				group, which is obtained by measuring how much each observation in a group deviates
				from that group's mean. Then, the variance between the groups is calculated, which,
				in turn, is done by measuring how much each group mean deviates from the overall
				mean (the mean of all values in all groups). Finally, the ratio between the
				between-groups variance and the within-groups variance (between/within) is
				calculated. If the ratio is large, it means the groups differ; if the ratio is low,
				it means the groups do not differ. <xref ref-type="fig" rid="f4"
					>Figure 4</xref> illustrates this rationale very well. When the variance between
				groups is smaller than the variance within the groups (<xref ref-type="fig" rid="f4"
					>Figure 4A</xref>), there is probably no difference among these groups. On the
				other hand, when the variance between groups is larger than the variance within the
				groups (<xref ref-type="fig" rid="f4">Figure 4B</xref>), there is probably a
				difference among them. To calculate whether this ratio, <italic>i.e</italic>., the
				difference among groups, is statistically significant, the F-test is used.
				Similarly to the t-test, several variables are used in this calculation, and the
				mathematical details of the formula will not be covered by this editorial. Besides
				understanding the rationale behind the analysis of variance, it is also important to
				recognize that, although this test can inform if there is at least one group that
				differs from the others, it cannot state which is the different group or what is
				the size or direction of this difference. For that, <italic>post-hoc</italic> tests
				are necessary, and we will discuss them later in this editorial.</p>
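			<p>The between/within ratio can be computed by hand and checked against scipy's
				One-way ANOVA. The three groups below are invented; the manual F-value and the one
				returned by the software coincide.</p>

```python
# Manual between/within decomposition, checked against scipy's One-way ANOVA.
import numpy as np
from scipy import stats

# Three invented groups of a quantitative outcome
a = np.array([10.0, 11.0, 9.0, 10.5, 9.5])
b = np.array([14.0, 15.0, 13.0, 14.5, 13.5])
c = np.array([18.0, 19.0, 17.0, 18.5, 17.5])

groups = [a, b, c]
grand_mean = np.concatenate(groups).mean()

# Between-groups and within-groups sums of squares, as described above
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

# F is the ratio of the two mean squares; df: k-1 = 2 and N-k = 12 here
f_manual = (ss_between / 2) / (ss_within / 12)

f_scipy, p = stats.f_oneway(a, b, c)        # the same F, computed by scipy
print(f"F = {f_scipy:.1f}, P = {p:.2e}")
```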
			<p>
				<fig id="f4">
					<label>Fig. 4</label>
					<caption>
						<title>Graphical representation of the rationale behind the analysis of
							variance (ANOVA). A) Groups not significantly different. B) Groups
							significantly different.</title>
					</caption>
					<graphic xlink:href="0102-7638-rbccv-33-06-000V-gf04.jpg"/>
				</fig>
			</p>
			<p>The analysis of variance described above is what is traditionally called One-way
				ANOVA and is valid for normally distributed data and non-paired groups. Other types
				of analysis of variance, however, can also be performed in cases where data are not
				normally distributed and/or the groups are paired. First, for the cases in which
				data are normally distributed but paired, the test of choice is the Repeated
				Measures (RM) One-way ANOVA. As any test for paired samples, the RM One-way ANOVA
				will consider the variations within each subject when making the previously
				explained calculations. The second variation of the One-way ANOVA is the so-called
				Kruskal-Wallis test, also known as One-way ANOVA on Ranks, which was designed by
				the statisticians William Kruskal and W. Allen Wallis for variables which are both
				not paired and not normally distributed<sup>[</sup><xref ref-type="bibr"
					rid="B6">6</xref><sup>]</sup>. The Kruskal-Wallis test is a derivation of the
				Mann-Whitney U test and, thus, does not assume a normal distribution of the data and
				uses its ranked values for calculations. Finally, the third variation of the One-way
				ANOVA is valid for variables which are paired but not normally distributed. This
				test was developed by the Nobel laureate Milton Friedman<sup>[</sup><xref
					ref-type="bibr" rid="B7">7</xref><sup>]</sup> and, for this reason, is called
				the Friedman test. Put simply, it can be thought of as a combination of the two
				previous tests, <italic>i.e</italic>., a ranked test for repeated measures.</p>
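			<p>Both non-parametric variants named above are also available in scipy; the values
				below are invented and serve only to show how each test is called. For the
				Friedman test, each position within the lists corresponds to one subject.</p>

```python
# The two non-parametric One-way ANOVA variants, on invented data.
from scipy import stats

# Unpaired, non-normal: Kruskal-Wallis (One-way ANOVA on Ranks)
h, p_kw = stats.kruskal([7, 9, 12], [15, 18, 20], [25, 28, 31])
print(f"H = {h:.2f}, P = {p_kw:.3f}")

# Paired, non-normal: Friedman test (same 4 subjects under 3 conditions)
chi2, p_fr = stats.friedmanchisquare(
    [7, 9, 12, 8], [15, 18, 20, 16], [25, 28, 31, 27]
)
print(f"chi2 = {chi2:.2f}, P = {p_fr:.3f}")
```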
		</sec>
		<sec>
			<title><italic>Post-hoc</italic> tests</title>
			<p>As commented above, all the previously described tests only inform if there is at
				least one group that differs from the others, but do not state which is the
				different group or what is the size or direction of this difference. For dissecting
				these differences, a <italic>post-hoc</italic> test is necessary.
					<italic>Post-hoc</italic> tests should only be performed after a statistically
				significant difference has been found in the analysis of variance. These tests use
				different means to determine which group differs from the others. In fact,
				the <italic>post-hoc</italic> test can be performed in three different ways: 1)
				comparing all groups against each other (all pairwise comparison); 2) comparing
				specific pairs of interest (specific pairwise comparison); or 3) comparing all
				treatment groups against one control group. Not all <italic>post-hoc</italic> tests
				can be used for all of these three situations. Also, not all
					<italic>post-hoc</italic> tests can be used after any parametric or
				non-parametric analysis of variance. <xref ref-type="table" rid="t1">Table 1</xref>
				summarizes when each of the most used <italic>post-hoc</italic> tests should be
				used. Another important observation is that each <italic>post-hoc</italic> test is
				more or less prone to type I or type II errors (for definitions, check our first
					editorial<sup>[</sup><xref ref-type="bibr" rid="B1">1</xref><sup>]</sup>), so
				that it is more liberal or more conservative in regard to accepting
				false-positives in order to not risk false-negatives. <xref
					ref-type="table" rid="t1">Table 1</xref> also lists the limitations of each
				test, such as the type of error each test is more prone to incur and other
				statistical pitfalls. Other <italic>post-hoc</italic> tests not described in <xref
					ref-type="table" rid="t1">Table 1</xref> exist, but this editorial does not
				intend to cover all of them.</p>
			<table-wrap id="t1">
				<label>Table 1</label>
				<caption>
					<title><italic>Post-hoc</italic> tests.</title>
				</caption>
						<alternatives>
							<graphic xlink:href="t1.jpg"/>
				<table frame="hsides" rules="all">
					<colgroup>
						<col width="16%"/>
						<col width="28%"/>
						<col width="28%"/>
						<col width="28%"/>
					</colgroup>
					<thead>
						<tr>
							<th align="left">Test</th>
							<th align="center">ANOVA</th>
							<th align="center">Comparison</th>
							<th align="center">Requirements and limitations</th>
						</tr>
					</thead>
					<tbody>
						<tr>
							<td align="left">Fisher's LSD</td>
							<td align="center">Parametric (One-way ANOVA<break/>or RM One-way
								ANOVA)</td>
							<td align="center">All pairwise comparisons, specific<break/>pairwise
								comparisons and compare<break/>treatments with a control</td>
							<td align="center">Prone to type I error</td>
						</tr>
						<tr>
							<td align="left">Holm-Sidak</td>
							<td align="center">Parametric (One-way ANOVA<break/>or RM One-way
								ANOVA)</td>
							<td align="center">All pairwise comparisons, specific<break/>pairwise
								comparisons and compare<break/>treatments with a control</td>
							<td align="center">Prone to type II error and does<break/>not give
								confidence interval (only<break/>significance)</td>
						</tr>
						<tr>
							<td align="left">Bonferroni</td>
							<td align="center">Parametric (One-way ANOVA<break/>or RM One-way
								ANOVA)</td>
							<td align="center">All pairwise comparisons, specific<break/>pairwise
								comparisons and compare<break/>treatments with a control</td>
							<td align="center">Prone to type II error</td>
						</tr>
						<tr>
							<td align="left">Tukey-Kramer</td>
							<td align="center">Parametric (One-way ANOVA<break/>or RM One-way
								ANOVA)</td>
							<td align="center">Only for all pairwise comparisons</td>
							<td align="center">Prone to type I error (less than
								Fisher's<break/>LSD)</td>
						</tr>
						<tr>
							<td align="left">Newman-Keuls</td>
							<td align="center">Parametric (One-way ANOVA<break/>or RM One-way
								ANOVA)</td>
							<td align="center">Only for all pairwise comparisons</td>
							<td align="center">Require an equal number of subjects<break/>in all
								groups; prone to type II error;<break/>and does not give confidence
								interval<break/>(only significance)</td>
						</tr>
						<tr>
							<td align="left">Dunnet</td>
							<td align="center">Parametric (One-way ANOVA<break/>or RM One-way
								ANOVA)</td>
							<td align="center">Only when comparing treatments<break/>with a
								control</td>
							<td align="center">Prone to type II error</td>
						</tr>
						<tr>
							<td align="left">Dunn's</td>
							<td align="center">Non-parametric (Kruskal-<break/>Wallis or
								Friedman)</td>
							<td align="center">All pairwise comparisons, specific<break/>pairwise
								comparisons and compare<break/>treatments with a control</td>
							<td align="center">Prone to type II error and does<break/>not give
								confidence interval (only<break/>significance)</td>
						</tr>
					</tbody>
				</table>
			</alternatives>
			</table-wrap>
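			<p>To make the idea of a <italic>post-hoc</italic> correction concrete, the sketch
				below implements one strategy from Table 1, the Bonferroni correction: all
				pairwise t-tests are run and each <italic>P</italic>-value is multiplied by the
				number of comparisons. The three groups are invented for illustration.</p>

```python
# Bonferroni-corrected all-pairwise comparisons after a significant ANOVA.
# Data are invented; real analyses would use the software's built-in post-hoc.
from itertools import combinations
from scipy import stats

groups = {
    "A": [10.0, 11.0, 9.0, 10.5, 9.5],
    "B": [14.0, 15.0, 13.0, 14.5, 13.5],
    "C": [18.0, 19.0, 17.0, 18.5, 17.5],
}
pairs = list(combinations(groups, 2))

adjusted = {}
for name1, name2 in pairs:
    t, p = stats.ttest_ind(groups[name1], groups[name2])
    # Bonferroni: multiply each raw P by the number of comparisons (cap at 1)
    adjusted[(name1, name2)] = min(1.0, p * len(pairs))

for pair, p_adj in adjusted.items():
    print(pair, round(p_adj, 6))
```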
		</sec>
		<sec>
			<title>The Two-way ANOVA</title>
			<p>Finally, it is important to point out the existence of the Two-way ANOVA. The Two-way
				ANOVA is a type of analysis of variance for when you have two independent variables
				being analyzed at the same time. One example could be the evaluation of cardiac
				function 30, 120, and 180 days after patients were submitted to two different
				approaches, A and B, of myocardial revascularization. The first variable is the
				intervention, which could be A or B. The second variable is the time at evaluation,
				which could be 30, 120, or 180 days. Performing a Two-way ANOVA can lead to three
				different conclusions: 1) whether there are differences due to the intervention
				group; 2) whether there are differences due to the time point; and 3) whether there
				are differences due to a combination of intervention group and time point. This
				combination is called interaction and, if significant, means that differences found
				in one of the independent variables could also be partially attributed to the
				other, making it difficult to determine what is, in fact, the main variable
				responsible for the observed effect. The Two-way ANOVA can also be followed by
					<italic>post-hoc</italic> tests, many of which are the same used for One-way
				ANOVA.</p>
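			<p>As a minimal numerical sketch of the intervention-by-time example above, assuming a
				balanced design with invented cardiac-function scores (two patients per cell), the
				Two-way ANOVA decomposition and the F-test for the interaction can be computed
				directly.</p>

```python
# Two-way ANOVA decomposition for a balanced design: intervention (A/B)
# x time (30/120/180 days), with invented scores and 2 patients per cell.
import numpy as np
from scipy import stats

# data[intervention][time] = scores of the patients in that cell (invented)
data = np.array([
    [[50.0, 52.0], [55.0, 57.0], [60.0, 62.0]],   # intervention A
    [[48.0, 50.0], [58.0, 60.0], [70.0, 72.0]],   # intervention B
])
a_levels, b_levels, n = data.shape
grand = data.mean()

# Main-effect sums of squares from the marginal means
ss_a = b_levels * n * ((data.mean(axis=(1, 2)) - grand) ** 2).sum()
ss_b = a_levels * n * ((data.mean(axis=(0, 2)) - grand) ** 2).sum()

# Interaction: cell variability not explained by the two main effects
cell_means = data.mean(axis=2)
ss_cells = n * ((cell_means - grand) ** 2).sum()
ss_ab = ss_cells - ss_a - ss_b

# Within-cell (error) sum of squares
ss_within = ((data - cell_means[..., None]) ** 2).sum()

df_a, df_b = a_levels - 1, b_levels - 1
df_ab, df_w = df_a * df_b, a_levels * b_levels * (n - 1)
f_ab = (ss_ab / df_ab) / (ss_within / df_w)
p_ab = stats.f.sf(f_ab, df_ab, df_w)
print(f"interaction F = {f_ab:.2f}, P = {p_ab:.4f}")
```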
		</sec>
	</body>
	<back>
		<ref-list>
			<title>REFERENCES</title>
			<ref id="B1">
				<label>1</label>
				<mixed-citation>Liguori GR, Moreira LFP. Operating with Data - Statistics for the
					cardiovascular surgeon: Part I. Fundamentals of Biostatistics. Braz J Cardiovasc
					Surg. 2018;33(3):III-VIII.</mixed-citation>
				<element-citation publication-type="journal">
					<person-group person-group-type="author">
						<name>
							<surname>Liguori</surname>
							<given-names>GR</given-names>
						</name>
						<name>
							<surname>Moreira</surname>
							<given-names>LFP</given-names>
						</name>
					</person-group>
					<article-title>Operating with Data - Statistics for the cardiovascular surgeon:
						Part I. Fundamentals of Biostatistics</article-title>
					<source>Braz J Cardiovasc Surg</source>
					<year>2018</year>
					<volume>33</volume>
					<issue>3</issue>
					<fpage>III</fpage>
					<lpage>VIII</lpage>
				</element-citation>
			</ref>
			<ref id="B2">
				<label>2</label>
				<mixed-citation>Liguori GR, Moreira LFP. Operating with Data - Statistics for the
					cardiovascular surgeon: Part II. Association and risk. Braz J Cardiovasc Surg.
					2018;33(4):IV-VIII.</mixed-citation>
				<element-citation publication-type="journal">
					<person-group person-group-type="author">
						<name>
							<surname>Liguori</surname>
							<given-names>GR</given-names>
						</name>
						<name>
							<surname>Moreira</surname>
							<given-names>LFP</given-names>
						</name>
					</person-group>
					<article-title>Operating with Data - Statistics for the cardiovascular surgeon:
						Part II. Association and risk</article-title>
					<source>Braz J Cardiovasc Surg</source>
					<year>2018</year>
					<volume>33</volume>
					<issue>4</issue>
					<fpage>IV</fpage>
					<lpage>VIII</lpage>
				</element-citation>
			</ref>
			<ref id="B3">
				<label>3</label>
				<mixed-citation>Student. The probable error of a mean. Biometrika.
					1908;6(1):1-25.</mixed-citation>
				<element-citation publication-type="journal">
					<person-group person-group-type="author">
						<collab>Student</collab>
					</person-group>
					<article-title>The probable error of a mean</article-title>
					<source>Biometrika</source>
					<year>1908</year>
					<volume>6</volume>
					<issue>1</issue>
					<fpage>1</fpage>
					<lpage>25</lpage>
				</element-citation>
			</ref>
			<ref id="B4">
				<label>4</label>
				<mixed-citation>Wilcoxon F. Individual comparisons by ranking methods. Biometrics
					Bulletin. 1945;1(6):80-3.</mixed-citation>
				<element-citation publication-type="journal">
					<person-group person-group-type="author">
						<name>
							<surname>Wilcoxon</surname>
							<given-names>F</given-names>
						</name>
					</person-group>
					<article-title>Individual comparisons by ranking methods</article-title>
					<source>Biometrics Bulletin</source>
					<year>1945</year>
					<volume>1</volume>
					<issue>6</issue>
					<fpage>80</fpage>
					<lpage>83</lpage>
				</element-citation>
			</ref>
			<ref id="B5">
				<label>5</label>
				<mixed-citation>Mann HB, Whitney DR. On a test of whether one of two random
					variables is stochastically larger than the other. Ann Math Statist.
					1947;18(1):50-60.</mixed-citation>
				<element-citation publication-type="journal">
					<person-group person-group-type="author">
						<name>
							<surname>Mann</surname>
							<given-names>HB</given-names>
						</name>
						<name>
							<surname>Whitney</surname>
							<given-names>DR</given-names>
						</name>
					</person-group>
					<article-title>On a test of whether one of two random variables is
						stochastically larger than the other</article-title>
					<source>Ann Math Statist</source>
					<year>1947</year>
					<volume>18</volume>
					<issue>1</issue>
					<fpage>50</fpage>
					<lpage>60</lpage>
				</element-citation>
			</ref>
			<ref id="B6">
				<label>6</label>
				<mixed-citation>Kruskal WH, Wallis WA. Use of ranks in one-criterion variance
					analysis. J Am Stat Assoc. 1952;47(260):583-621.</mixed-citation>
				<element-citation publication-type="journal">
					<person-group person-group-type="author">
						<name>
							<surname>Kruskal</surname>
							<given-names>WH</given-names>
						</name>
						<name>
							<surname>Wallis</surname>
							<given-names>WA</given-names>
						</name>
					</person-group>
					<article-title>Use of ranks in one-criterion variance analysis</article-title>
					<source>J Am Stat Assoc</source>
					<year>1952</year>
					<volume>47</volume>
					<issue>260</issue>
					<fpage>583</fpage>
					<lpage>621</lpage>
				</element-citation>
			</ref>
			<ref id="B7">
				<label>7</label>
				<mixed-citation>Friedman M. The use of ranks to avoid the assumption of normality
					implicit in the analysis of variance. J Am Stat Assoc.
					1937;32(200):675-701.</mixed-citation>
				<element-citation publication-type="journal">
					<person-group person-group-type="author">
						<name>
							<surname>Friedman</surname>
							<given-names>M</given-names>
						</name>
					</person-group>
					<article-title>The use of ranks to avoid the assumption of normality implicit in
						the analysis of variance</article-title>
					<source>J Am Stat Assoc</source>
					<year>1937</year>
					<volume>32</volume>
					<issue>200</issue>
					<fpage>675</fpage>
					<lpage>701</lpage>
				</element-citation>
			</ref>
		</ref-list>
	</back>
</article>
