Latent Semantic Analysis (LSA), sometimes referred to as Latent Semantic Indexing (LSI), is an information modeling and analysis technique. It lets you organize and examine large collections of information based on meaning (semantics), not just on matching the words or items used. It does this automatically and objectively, which makes large collections easier to understand and manage.
Most existing technology falls short of meeting the needs of today’s information-rich culture. Other automated information analysis and search techniques are based primarily on matching common items such as terms or phrases. Even sophisticated text mining models built on this sort of matching do not capture the underlying semantics, that is, the meaning, of the information they work with.
LSA is different. Latent Semantic Analysis embodies both a computational model and an underlying theory of meaning. It builds a mapping of meaning from the subject information itself, based entirely on the semantic relationships between the items of information in the collection. This requires no specific foreknowledge of the information in the collection. An LSA-based system can identify semantically similar items using nothing but the collection being analyzed.
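To make the idea of a mapping learned from the collection itself more concrete, here is a minimal sketch. The text above does not spell out the computation, so this assumes the commonly used formulation of LSA: a truncated singular value decomposition (SVD) of a term-document count matrix. The toy corpus, the choice of `k=2`, and the helper names `term_document_matrix`, `lsa_document_vectors`, and `cosine` are illustrative assumptions, and practical systems usually apply term weighting before the decomposition.

```python
import numpy as np

# Toy collection (hypothetical). Documents 2 and 3 share no words at all.
terms = ["car", "automobile", "engine", "flower", "petal"]
docs = [
    "car engine",
    "automobile engine",
    "car",
    "automobile",
    "flower petal",
    "flower",
]


def term_document_matrix(docs, terms):
    """Count how often each term (row) occurs in each document (column)."""
    index = {term: i for i, term in enumerate(terms)}
    A = np.zeros((len(terms), len(docs)))
    for j, doc in enumerate(docs):
        for word in doc.split():
            A[index[word], j] += 1.0
    return A


def lsa_document_vectors(A, k):
    """Return one k-dimensional latent vector per document (rows).

    Assumes the standard truncated-SVD formulation: keep only the k
    largest singular values and the matching right singular vectors.
    """
    _, s, Vt = np.linalg.svd(A, full_matrices=False)
    return (np.diag(s[:k]) @ Vt[:k, :]).T


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


A = term_document_matrix(docs, terms)
doc_vecs = lsa_document_vectors(A, k=2)

# Word matching alone: "car" vs "automobile" have nothing in common.
raw_sim = cosine(A[:, 2], A[:, 3])
# Latent space: both words co-occur with "engine", so the documents align.
latent_sim = cosine(doc_vecs[2], doc_vecs[3])
# A vehicle document and a flower document stay unrelated.
unrelated_sim = cosine(doc_vecs[2], doc_vecs[5])
```

In this toy case `raw_sim` is 0 because the two documents share no terms, while `latent_sim` comes out close to 1 and `unrelated_sim` close to 0. No synonym list was supplied: the link between "car" and "automobile" comes only from both appearing alongside "engine" elsewhere in the collection.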
LSA can be applied to any kind of data item that has a collective meaning, even if the items are not strictly text-based. It can be used with any language, including collections that contain items in multiple languages, without prior translation. It can be used to search, compare, evaluate, and understand the information in a collection. In fact, LSA provides a computational model that can perform many of the cognitive tasks humans carry out with information, essentially as well as humans do them.
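As an illustration of the search use case, the sketch below ranks documents against a query by projecting both into the same latent space. It assumes the commonly used truncated-SVD formulation of LSA, which the text does not itself specify. The count matrix, the `search` helper, and `k=2` are hypothetical choices made for the example, not a description of any particular product.

```python
import numpy as np

terms = ["car", "automobile", "engine", "flower", "petal"]

# Hypothetical term-document counts: rows follow `terms`, columns are
# documents 0..5 ("car engine", "automobile engine", "car",
# "automobile", "flower petal", "flower").
A = np.array([
    [1, 0, 1, 0, 0, 0],  # car
    [0, 1, 0, 1, 0, 0],  # automobile
    [1, 1, 0, 0, 0, 0],  # engine
    [0, 0, 0, 0, 1, 1],  # flower
    [0, 0, 0, 0, 1, 0],  # petal
], dtype=float)


def search(A, terms, query, k=2):
    """Rank documents by cosine similarity to the query in latent space.

    Query and documents are both projected onto the top-k left singular
    vectors, so they are compared in the same k-dimensional space.
    """
    U, _, _ = np.linalg.svd(A, full_matrices=False)
    Uk = U[:, :k]
    words = query.split()
    q = np.array([float(words.count(term)) for term in terms])
    q_latent = Uk.T @ q          # shape (k,)
    d_latent = Uk.T @ A          # shape (k, n_docs)
    norms = np.linalg.norm(d_latent, axis=0) * np.linalg.norm(q_latent)
    scores = (q_latent @ d_latent) / norms
    order = np.argsort(-scores)  # best match first
    return [(int(j), float(scores[j])) for j in order]


results = search(A, terms, "automobile")
```

Here the query "automobile" ranks all four vehicle documents ahead of the two flower documents, including document 2, which contains only the word "car". A plain term-matching search would have scored that document at zero.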