Scaling

Unidimensional Scaling (UDS)

 Begin by considering a graphic representation of points plotted along a number line. Let $S = \{O_1, O_2, \ldots, O_n\}$ represent a set of objects indexed $i = 1, \ldots, n$ and let $x_i$ be the coordinate of object $O_i$. In this example, $x_1 = 5$, $x_2 = 10$, $x_3 = 1$ and $x_4 = 7$. A picture is worth a thousand words, and quite a number of matrix entries. The visual representation of the placement of the objects on a number line contains information about both the order of the objects and the distances between them.

Converting the information into a matrix, $P$, each entry, $p_{ij}$, is the distance or dissimilarity between $x_i$ and $x_j$; for example, the entry for $O_3$ and $O_1$ is $|1 - 5| = 4$. Although dissimilarity can refer to other kinds of data in other contexts, the term describes Euclidean distances in this example. Notice that the rows and columns are in the order of the objects placed on the number line ($O_3 - O_1 - O_4 - O_2$). In this arrangement, matrix entries increase in each row moving from the diagonal to the right and, by symmetry, increase in each column moving from the diagonal to the bottom, which is known as Anti-Robinson structure (Robinson, 1951).

Of course, information is usually recorded sequentially, by object index. Rearranging or permuting the rows and columns, while maintaining accurate information, results in the matrix of Table 2. This is the usual form in which proximity data are presented. Notice that the entries no longer increase as they move away from the diagonal.

Working from the information recorded in the matrix, $P = [p_{ij}]$, to the actual observed coordinates of the objects ($x_1 = 5$, $x_2 = 10$, $x_3 = 1$, $x_4 = 7$), we can check the coordinates, $x_i$, against the matrix entries, $p_{ij}$, using the Least Squares Loss Function,

$$L(x) = \sum_{i<j} \left( p_{ij} - |x_i - x_j| \right)^2. \qquad (1)$$

Because this example is perfectly constructed, the Least Squares Loss Function is minimized at zero. Notice that the equation takes advantage of the symmetry in the dissimilarity matrix by summing only over $i < j$.
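The number-line example can be checked with a few lines of Python. This is only a sketch: it assumes NumPy is available, and the function names (`distance_matrix`, `least_squares_loss`) are illustrative, not taken from the original page. It builds the dissimilarity matrix in object-index order, reorders it by position on the line, and evaluates equation (1).

```python
import numpy as np

# Coordinates from the example: O1 = 5, O2 = 10, O3 = 1, O4 = 7.
x = np.array([5.0, 10.0, 1.0, 7.0])

def distance_matrix(coords):
    """Pairwise absolute differences |x_i - x_j| for points on a line."""
    return np.abs(coords[:, None] - coords[None, :])

def least_squares_loss(p, coords):
    """Equation (1): sum over i < j of (p_ij - |x_i - x_j|)^2."""
    d = distance_matrix(coords)
    i, j = np.triu_indices(len(coords), k=1)
    return float(np.sum((p[i, j] - d[i, j]) ** 2))

P = distance_matrix(x)               # rows/columns in index order O1, O2, O3, O4
order = np.argsort(x)                # order along the line: O3, O1, O4, O2
P_ordered = P[np.ix_(order, order)]  # same data, rows/columns permuted
loss = least_squares_loss(P, x)

print(order)      # [2 0 3 1]
print(P_ordered)  # entries grow moving away from the diagonal
print(loss)       # 0.0
```

Permuting rows and columns changes only the presentation of the matrix; the loss is computed from the same pairs either way.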
Real data from parapsychology, psychology or any other field rarely conform to such a "clean" interpretation. Fortunately, seriation methods allow analysts to find structure in otherwise obscure or disorderly proximity matrices. Analysis is aided by the use of the combinatorial equivalent of equation (1). Let $\Psi$ be the set of all $n!$ permutations, $\psi$, of the rows and columns of an $n \times n$ matrix. Then $\psi(k)$ is the object in the $k$th position of permutation $\psi$, and $p_{\psi(k)\psi(l)}$ is the matrix entry in row $\psi(k)$ and column $\psi(l)$. For example, if $\psi = (O_2 - O_4 - O_3 - O_1)$, then $p_{\psi(4)\psi(2)} = p_{14} = 2$, because the matrix in Table 2 is ordered according to object indices. The Least Squares Loss Function can now be restated as a combinatorial problem: minimizing equation (1) is equivalent to maximizing

$$\sum_{k=1}^{n} \left( \sum_{l=1}^{k-1} p_{\psi(k)\psi(l)} - \sum_{l=k+1}^{n} p_{\psi(k)\psi(l)} \right)^2$$

for $\psi \in \Psi$.

Consider the visual and ESP matrices for suits in Kelly et al.'s 1975 experiment. These are confusion matrices in which larger entries reflect greater confusion and, hence, greater similarity. A column for row sums has been added to the right of each matrix. The lack of symmetry is apparent in these similarity matrices, which is characteristic of confusion matrices. Although there is a temptation to induce symmetry by adding entries mirrored along the diagonal (replacing both $c_{ij}$ and $c_{ji}$ with $c_{ij} + c_{ji}$), there is a danger of skewing data with simple aggregation and data pooling techniques. A similar technique can be used that will preserve more information. Specifically, use row sums to normalize the aggregation. Now, the ESP and visual matrices are transformed into symmetric matrices, with entries $a_{ij}$.

The presented data are in the form of a similarity matrix, i.e., larger numbers indicate greater similarity. Conversion to a dissimilarity matrix is rather intuitive. In both matrices, off-diagonal entries are less than one. Hence, to preserve the information while converting to dissimilarities, i.e., larger numbers indicating greater dissimilarity, simply subtract each entry from 1, i.e., $p_{ij} = 1 - a_{ij}$.
For a similarity matrix with larger numbers, entries can be subtracted from the largest entry to preserve information while converting to a dissimilarity matrix, i.e., $p_{ij} = \max(a_{ij}) - a_{ij}$, or entries can be subtracted from the sum of the largest and smallest entries, i.e., $p_{ij} = (\max(a_{ij}) + \min(a_{ij})) - a_{ij}$.

In these symmetric dissimilarity matrices, we can still see problems with structure. Specifically, matrix entries do not always increase in each row moving from the diagonal to the right, nor do they always increase in each column moving from the diagonal to the bottom. A casual examination of the patterns reveals that the data are unlikely to conform strictly to this Anti-Robinson structure for unidimensional scaling, regardless of which permutation we try for the rows and columns in either matrix.

A technique for optimal seriation (also see the seriation chapters in the branch-and-bound monograph) can be used to find the permutations that minimize the least squares loss function. For the symmetric dissimilarity ESP matrix for suits, the optimal permutations are (S-C-H-D) and (D-H-C-S), reflecting the symmetry of the matrix. For the symmetric dissimilarity visual matrix for suits, the optimal permutations are (H-D-C-S) and (S-C-D-H). Ignoring the diagonal entries, multiplying by 1000 and rounding, the reordered matrices are presented in Table 6. If left in decimal form, the targets would be scaled approximately on a (0, 1) interval. The data can now be used to calculate optimal unidimensional values (coordinates), as shown in Table 7. The ESP data are thus unidimensionally scaled, and the visual data likewise. Although these are easily interpretable scales, the least squares error shows that neither is a particularly good fit. [For the symmetric dissimilarity data, equation (1) calculates the loss for the ESP scaling as 190918.5 and the loss for the visual scaling as 554513.5.]
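The similarity-to-dissimilarity conversions and the Anti-Robinson check described above can be sketched in Python. Several things here are assumptions rather than part of the original analysis: the function names are illustrative, the maximum and minimum are taken over off-diagonal entries only (the text does not say), and the check tests that entries never decrease moving away from the diagonal in either direction within a row. The similarity matrix `A` is a made-up rescaling of the number-line example, not the Kelly et al. data.

```python
import numpy as np

def to_dissimilarity(a, method="one_minus"):
    """Convert a symmetric similarity matrix to dissimilarities.

    Assumption: max/min are taken over off-diagonal entries only.
    """
    a = np.asarray(a, dtype=float)
    off = ~np.eye(a.shape[0], dtype=bool)       # mask of off-diagonal cells
    if method == "one_minus":
        p = 1.0 - a                             # p_ij = 1 - a_ij
    elif method == "max":
        p = a[off].max() - a                    # p_ij = max(a) - a_ij
    elif method == "max_plus_min":
        p = (a[off].max() + a[off].min()) - a   # p_ij = (max + min) - a_ij
    else:
        raise ValueError(f"unknown method: {method}")
    p[~off] = 0.0                               # self-dissimilarity is zero
    return p

def is_anti_robinson(p):
    """True if entries never decrease moving away from the diagonal in each row."""
    p = np.asarray(p, dtype=float)
    for i in range(p.shape[0]):
        right = p[i, i:]      # diagonal entry, then entries to its right
        left = p[i, i::-1]    # diagonal entry, then entries to its left
        if np.any(np.diff(right) < 0) or np.any(np.diff(left) < 0):
            return False
    return True

# Number-line example: O1 = 5, O2 = 10, O3 = 1, O4 = 7.
x = np.array([5.0, 10.0, 1.0, 7.0])
P_index = np.abs(x[:, None] - x[None, :])       # index order O1, O2, O3, O4
order = np.argsort(x)
P_ordered = P_index[np.ix_(order, order)]       # order O3, O1, O4, O2

# Hypothetical similarity matrix with off-diagonal entries below one.
A = 1.0 - P_ordered / 10.0

print(is_anti_robinson(P_ordered))              # True
print(is_anti_robinson(P_index))                # False
print(is_anti_robinson(to_dissimilarity(A, "max")))
```

All three conversions reverse the ordering of the entries in the same way, so they lead to the same row and column orderings; they differ only in where the dissimilarity scale starts.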
Perhaps multidimensional scaling (MDS) will yield more illuminating results and, from there, we can examine the concordance between the two matrices. [MDS is apt to converge quickly for these data because, for four points, there will be no more than three dimensions. (?)]

You can try a small UDS program for this (or other) matrices. The program uses dynamic programming to find the optimal permutation and is designed to accommodate up to five matrices for multiobjective programming. If there are multiple matrices, simply change the weights (separate the weights with spaces only; no commas). If the matrix needs to have symmetry and dissimilarity induced, check the appropriate box. The default is a single, symmetric, dissimilarity matrix. Finally, you have the option of entering object/target names to label the rows and columns; the default is simple enumeration.
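As a rough stand-in for the program described above, the sketch below searches all $n!$ permutations exhaustively instead of using dynamic programming, which is practical only for very small matrices. It assumes a single symmetric dissimilarity matrix and relies on the standard result that, for a given ordering, the centered least squares coordinate of the object in position $k$ is the sum of its dissimilarities to earlier objects minus the sum to later objects, divided by $n$. The function name `uds_brute_force` is hypothetical.

```python
from itertools import permutations

import numpy as np

def uds_brute_force(p):
    """Exhaustive unidimensional scaling of a small symmetric dissimilarity matrix.

    Returns (best permutation, centered coordinates by object index, loss).
    """
    p = np.asarray(p, dtype=float)
    n = p.shape[0]
    best_val, best_perm, best_t = -1.0, None, None
    for perm in permutations(range(n)):
        q = p[np.ix_(perm, perm)]   # matrix reordered by this permutation
        # t_k = (sum of entries left of the diagonal) - (sum right of it)
        t = np.tril(q, -1).sum(axis=1) - np.triu(q, 1).sum(axis=1)
        val = float(np.sum(t ** 2))  # combinatorial objective to maximize
        if val > best_val:
            best_val, best_perm, best_t = val, perm, t
    coords = np.empty(n)
    coords[list(best_perm)] = best_t / n        # coordinate of each object
    d = np.abs(coords[:, None] - coords[None, :])
    iu = np.triu_indices(n, k=1)
    loss = float(np.sum((p[iu] - d[iu]) ** 2))  # equation (1) at the solution
    return best_perm, coords, loss

# Number-line example: O1 = 5, O2 = 10, O3 = 1, O4 = 7 (index order).
x = np.array([5.0, 10.0, 1.0, 7.0])
P = np.abs(x[:, None] - x[None, :])

perm, coords, loss = uds_brute_force(P)
print(perm)    # an optimal ordering as 0-based object indices
print(coords)  # centered coordinates, possibly reflected
print(loss)
```

The recovered coordinates are determined only up to translation and reflection, which is why an ordering and its reverse are both optimal, as with (S-C-H-D) and (D-H-C-S) above.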

(Metric) MultiDimensional Scaling (MDS)

 Our unidimensional scaling example began with an illustration on a line segment. Ergo, our multidimensional example should begin with a simple 2-dimensional illustration. Once again, we can record the distance (or dissimilarity) information in tabular form. Again, we see the near Anti-Robinson structure in the matrix. In unidimensional scaling, an optimal permutation for minimizing loss is (A - B - C - D), determined in the same manner as in the previous section. However, if we attempt to fit these data to a line segment, we quickly discover the error of our ways. The actual loss function is minimized at 14, which is relatively high for the given data. Yet, the data are a perfect fit in the 2-dimensional illustration of Fig. 2. Clearly, multidimensional scaling is preferable in this instance.

The matrix in Table 8, $D = [d_{ij}]$, is a symmetric distance matrix with zeros on the diagonal and all other $d_{ij} > 0$, which allows us to use metric MDS. To begin scaling, we convert the matrix to $B = [b_{ij}]$ using the formula

$$b_{ij} = -\frac{1}{2} \left( d_{ij}^2 - \frac{1}{n} \sum_{k=1}^{n} d_{ik}^2 - \frac{1}{n} \sum_{k=1}^{n} d_{kj}^2 + \frac{1}{n^2} \sum_{k=1}^{n} \sum_{l=1}^{n} d_{kl}^2 \right).$$

There is an easy way to accomplish this. (This is ridiculously easy to code.) First, square the entries in $D$ and calculate the row averages. Subtract the row average from each squared entry, $d_{ij}^2$, and calculate the column averages. An important procedural note is that the column averages are calculated after the row averages have been subtracted. Subtract the column average from each entry. Notice that this step forces the row and column averages to zero. Finally, multiply by $-1/2$.

Now, we use the power method to determine the eigenvalues, $\lambda_i$, and eigenvectors, $v_i$, of $B$. The characteristic equation, $\lambda^2 (\lambda - 18)(\lambda - 12) = 0$, for $B$ yields $\lambda_1 = 18$ and $\lambda_2 = 12$. For $\lambda_1 = 18$, $v_1 = [.707, 0, -.707, 0]$; for $\lambda_2 = 12$, $v_2 = [-.289, -.289, -.289, .866]$. The point coordinates are revealed in the last calculation, multiplying the matrix of eigenvectors, $V = [v_1, v_2]$, by the matrix with the square roots of the eigenvalues along the diagonal.
Thus, on the familiar Cartesian coordinate system, A = (3, -1), B = (0, -1), C = (-3, -1) and D = (0, 3), which reproduces the configuration in Figure 2 exactly. The metric MDS has worked beautifully.
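The metric MDS steps above can be sketched in Python. Two assumptions are worth flagging: the distance matrix is rebuilt here from the coordinates just given (A, B, C, D), since Table 8 is not reproduced in the text, and `numpy.linalg.eigh` is swapped in for the power method. The function name `classical_mds` is illustrative.

```python
import numpy as np

def classical_mds(D, dims=2):
    """Metric (classical) MDS by double centering and eigendecomposition."""
    sq = np.asarray(D, dtype=float) ** 2        # squared distances
    sq = sq - sq.mean(axis=1, keepdims=True)    # subtract row averages
    sq = sq - sq.mean(axis=0, keepdims=True)    # then subtract column averages
    B = -0.5 * sq                               # multiply by -1/2
    vals, vecs = np.linalg.eigh(B)              # eigenvalues in ascending order
    top = np.argsort(vals)[::-1][:dims]         # indices of the largest ones
    vals, vecs = vals[top], vecs[:, top]
    # Scale each eigenvector by the square root of its eigenvalue.
    coords = vecs * np.sqrt(np.maximum(vals, 0.0))
    return vals, coords

# Points from the worked example: A, B, C, D.
pts = np.array([[3.0, -1.0], [0.0, -1.0], [-3.0, -1.0], [0.0, 3.0]])
D = np.sqrt(((pts[:, None, :] - pts[None, :, :]) ** 2).sum(axis=-1))

vals, coords = classical_mds(D)
D_fit = np.sqrt(((coords[:, None, :] - coords[None, :, :]) ** 2).sum(axis=-1))

print(np.round(vals, 6))      # the two largest eigenvalues of B
print(np.round(coords, 3))    # recovered configuration
print(np.allclose(D_fit, D))  # distances are reproduced
```

Eigenvectors are defined only up to sign, so the recovered configuration may be a reflection of Figure 2; the interpoint distances are unaffected.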

(Nonmetric) MultiDimensional Scaling (MDS)

 Of course, psychological and parapsychological spaces are often multidimensional and MDS is more appropriate in many instances. Classic, metric MDS methodology is fairly straightforward and comfortably Euclidean. However, nonmetric MDS is the methodology of choice when proximity data are presented as ordinal or ranked rather than interval or ratio distances. Although the term "nonmetric" can evoke connotations of the absence of any measure or distance, the nonmetric MDS methodology attempts to fit a monotone relationship between the dissimilarities and the distances. Hence, because psychological and parapsychological data rarely present dissimilarities that are linearly related to absolute distances, nonmetric MDS is more commonly used in psychology and parapsychology. More to the point, confusion data are certainly nonlinear, as evidenced by the asymmetry of confusion matrices. Thus, nonmetric MDS is the proper methodology to multidimensionally scale confusion data.

Nonmetric MDS has been used in the parapsychological literature (Van Quekelberghe, Altstötter-Gleich, & Hertweck, 1991) and even augmented with canonical correlation analysis (Kelly, Kanthamani, Child, & Young, 1975) [Note: I think this is Young of ALSCAL fame.]. Nonmetric MDS is, indeed, an exceptionally powerful tool with a rich history in the psychological literature (see Carroll & Arabie, 1980, 1998, for excellent reviews of these methods and their applications). However, the technique has not been widely used in parapsychology, and it does require some important modeling choices from the analyst. Among the critical decisions in an MDS analysis are the appropriate number of dimensions and the appropriate distance metric (e.g., city-block, Minkowski or Euclidean). Most implementations of MDS assume that the stimulus space is Euclidean; however, city-block distance is typically more appropriate if the dimensions of the space are separable (see Arabie, 1991; Shepard, 1991, for reviews).
For example, following Kelly's (1980) suggestion for ESP experiments that are explicitly designed to foster confusion, we could select targets that differ along two dimensions: shape and color. As these are ready-made separable dimensions, we might hypothesize a two-dimensional city-block structure. As observed by Burdick and Kelly (1977), MDS is an extraordinarily powerful tool and would seem to hold tremendous promise for future research in parapsychology.

ARABIE, P. (1991). Was Euclid an unnecessarily sophisticated psychologist? Psychometrika, 56, 567-587.
CARROLL, J. D., & ARABIE, P. (1980). Multidimensional scaling. Annual Review of Psychology, 31, 607-649.
CARROLL, J. D., & ARABIE, P. (1998). Multidimensional scaling. In M. H. Birnbaum (Ed.), Measurement, judgment, and decision making (pp. 179-250). San Diego: Academic Press.
KELLY, E. F., KANTHAMANI, H., CHILD, I. L., & YOUNG, F. W. (1975). On the relation between visual and ESP confusion structures in an exceptional subject. Journal of the American Society for Psychical Research, 69, 1-31.
SHEPARD, R. N. (1991). Integrality versus separability of stimulus dimensions: From an early convergence of evidence to a proposed theoretical basis. In J. R. Pomerantz & G. R. Lockhead (Eds.), The perception of structure (pp. 53-71). Washington, DC: American Psychological Association.
VAN QUEKELBERGHE, R., ALTSTÖTTER-GLEICH, C., & HERTWECK, E. (1991). Journal of Parapsychology, 55, 375-390.