Research reported in, and applications available from, this website exploit a new method for determining and representing the similarity of meaning of words and passages by statistical analysis of large text corpora. After processing a large sample of machine-readable language, Latent Semantic Analysis (LSA) represents the words used in it, and any set of these words (such as those contained in a sentence, paragraph, or essay, either taken from the original corpus or new) as points in a very high-dimensional (e.g., 50-1,000 dimensions) "semantic space". LSA is based on singular value decomposition, a mathematical matrix decomposition technique closely akin to factor analysis that has recently become applicable to databases approaching the volume of relevant language experienced by people. Word and discourse meaning representations derived by LSA have been found capable of simulating a variety of human cognitive phenomena, ranging from acquisition of recognition vocabulary to sentence-word semantic priming and judgments of essay quality.
The first step is to represent the text as a matrix in which each row stands for a unique word and each column stands for a text passage or other context. Each cell contains the frequency with which the word of its row appears in the passage denoted by its column. Next, the cell entries are subjected to a preliminary transformation in which each cell frequency is weighted by a function that expresses both the word's importance in the particular passage and the degree to which the word type carries information in the domain of discourse in general.
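This step can be sketched on a toy corpus. The passages and helper names below are illustrative, and log-entropy weighting is used as one common choice for the transformation the text describes (a local log term for importance within a passage, a global entropy term for informativeness across the domain):

```python
import numpy as np

# Toy corpus: each passage is one "context" (a column of the matrix).
passages = [
    "human machine interface",
    "human computer interaction",
    "graph of trees",
    "graph minors trees",
]

# Word-by-passage count matrix: one row per unique word.
vocab = sorted({w for p in passages for w in p.split()})
counts = np.array([[p.split().count(w) for p in passages] for w in vocab],
                  dtype=float)

# Log-entropy weighting: log1p dampens raw frequency within a passage;
# the entropy term down-weights words spread evenly over many contexts.
p = counts / counts.sum(axis=1, keepdims=True)
with np.errstate(divide="ignore", invalid="ignore"):
    ent = np.where(p > 0, p * np.log(p), 0.0).sum(axis=1)
global_weight = 1.0 + ent / np.log(len(passages))
weighted = np.log1p(counts) * global_weight[:, None]
```

A word confined to a single passage keeps its full global weight of 1.0, while a word spread evenly over several passages is discounted, matching the intuition that widely scattered words carry less information about any one context.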
Next, LSA applies singular value decomposition (SVD) to the matrix. This is a form of factor analysis, or more properly the mathematical generalization of which factor analysis is a special case. In SVD, a rectangular matrix is decomposed into the product of three other matrices. One component matrix describes the original row entities as vectors of derived orthogonal factor values, another describes the original column entities in the same way, and the third is a diagonal matrix containing scaling values such that when the three components are matrix-multiplied, the original matrix is reconstructed. There is a mathematical proof that any matrix can be so decomposed perfectly, using no more factors than the smallest dimension of the original matrix. When fewer than the necessary number of factors are used, the reconstructed matrix is a least-squares best fit. One can reduce the dimensionality of the solution simply by deleting coefficients in the diagonal matrix, ordinarily starting with the smallest. (In practice, for computational reasons, only a limited number of dimensions can be constructed for very large corpora.)
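The decomposition, exact reconstruction, and truncation steps can be sketched directly with NumPy (the matrix here is random, standing in for a weighted word-by-passage matrix):

```python
import numpy as np

# Any rectangular matrix decomposes exactly as X = U @ diag(s) @ Vt.
rng = np.random.default_rng(0)
X = rng.standard_normal((6, 4))
U, s, Vt = np.linalg.svd(X, full_matrices=False)

# Using all factors reconstructs X perfectly (up to floating-point error).
assert np.allclose(U @ np.diag(s) @ Vt, X)

# Deleting all but the k largest diagonal values yields the rank-k
# least-squares best approximation to X (the Eckart-Young theorem).
k = 2
X_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]
```

NumPy returns the singular values in `s` already sorted in descending order, so truncating to the first `k` entries is exactly the "start deleting with the smallest" rule described above.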
LSA can be construed in two ways: (1) simply as a practical expedient for obtaining approximate estimates of the contextual usage substitutability of words in larger text segments, and of the kinds of (as yet incompletely specified) meaning similarities among words and text segments that such relations may reflect, or (2) as a model of the computational processes and representations underlying substantial portions of the acquisition and utilization of knowledge. We next sketch both views.
LSA differs from other statistical approaches in two significant respects. First, the LSA analysis (at least as currently practiced) uses as its initial data not just the summed contiguous pairwise (or tuple-wise) co-occurrences of words, but the detailed patterns of occurrences of words over very large numbers of local meaning-bearing contexts, such as sentences or paragraphs, treated as unitary wholes. Second, the LSA method assumes that the choice of dimensionality in which all of the local word-context relations are jointly represented is of great importance: reducing the dimensionality (the number of parameters by which a word or passage is described) of the observed data from the number of initial contexts to a much smaller, but still large, number will often produce much better approximations to human cognitive relations. Thus, an important component of applying the technique is finding the optimal dimensionality for the final representation. A possible interpretation of this step, in terms familiar to researchers in psycholinguistics, is that the resulting dimensions of the description are analogous to the semantic features often postulated as the basis of word meaning, although establishing concrete relations to mentalistically interpretable features poses daunting technical and conceptual problems and has not yet been seriously attempted. Finally, LSA, unlike many other methods, employs a preprocessing step in which the overall distribution of words over usage contexts, independent of their correlations, is taken into account; pragmatically, this step improves LSA's results considerably.
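Once words are represented in the reduced space, their meaning similarity is typically measured by the cosine between their vectors. A minimal sketch, with an illustrative hand-built count matrix and hypothetical helper names (`lsa_embed` is not a library API):

```python
import numpy as np

def lsa_embed(counts, k):
    """Project the rows (words) of a word-by-context count matrix
    into a k-dimensional LSA space via truncated SVD."""
    U, s, Vt = np.linalg.svd(counts, full_matrices=False)
    return U[:, :k] * s[:k]   # one k-dimensional vector per word

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Rows: hypothetical words; columns: contexts. The first two words
# occur in the same contexts, the third in disjoint ones.
counts = np.array([
    [2.0, 1.0, 0.0, 0.0],   # "doctor"
    [1.0, 2.0, 0.0, 0.0],   # "nurse"
    [0.0, 0.0, 2.0, 1.0],   # "guitar"
])
vecs = lsa_embed(counts, k=2)
assert cosine(vecs[0], vecs[1]) > cosine(vecs[0], vecs[2])
```

Words sharing patterns of contexts end up close in the reduced space even when they never co-occur in the same passage, which is the sense in which the analysis captures "latent" similarity rather than raw co-occurrence.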
Latent Semantic Analysis is a fully automatic mathematical/statistical technique for extracting and inferring relations of expected contextual usage of words in passages of discourse. It is not a traditional natural language processing or artificial intelligence program; it uses no humanly constructed dictionaries, knowledge bases, semantic networks, grammars, syntactic parsers, or morphologies, and takes as its input only raw text parsed into words, defined as unique character strings, and separated into meaningful passages or samples such as sentences or paragraphs.