Version 0.1 of the LSA_Toolkit™ has been released!
The LSA_Toolkit provides an optimized, scalable Latent Semantic Analysis implementation with a C++ API that is ready to incorporate into your applications.
We have rebuilt Latent Semantic Analysis processing from the ground up. Going back to the core mathematics, we have designed a complete system architecture, developed new storage mechanisms, and thoroughly re-examined computational accuracy to provide the best implementation possible. Our goals have been to create a system that is easy to use, robust, scalable, and performant.
This release provides tools to do the following:
- Set up a basic LSA_Toolkit Environment for constructing an LSA Space.
- Load data collection information from an ASCII data file into our easy-to-use sparse matrix data storage mechanism.
- Perform basic construction of the LSA Space using an efficient Lanczos implementation and optimized diagonalization techniques to produce the truncated Singular Value Decomposition.
- Dump the resulting singular values, document vectors, and term vectors to ASCII output files for evaluation.
Supported platforms:
- Linux_X86 64bit
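To give a feel for the loading and decomposition steps listed above, here is a minimal, self-contained C++ sketch. It is not the LSA_Toolkit API: the `SparseMatrix` type, the function names, and the ASCII layout (a "rows cols" header followed by 0-based "row col value" triplets) are all assumptions made for illustration. It also swaps in plain power iteration on A^T A, which recovers only the largest singular value, in place of the Lanczos-based truncated SVD that the toolkit uses.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical coordinate-format sparse matrix (not the toolkit's storage class).
struct SparseMatrix {
    struct Entry { std::size_t row, col; double value; };
    std::size_t rows = 0, cols = 0;
    std::vector<Entry> entries;  // only nonzero cells are stored
};

// Parse the assumed ASCII layout: "rows cols", then "row col value" triplets.
SparseMatrix parse_triplets(const std::string& text) {
    std::istringstream in(text);
    SparseMatrix m;
    in >> m.rows >> m.cols;
    SparseMatrix::Entry e;
    while (in >> e.row >> e.col >> e.value) {
        assert(e.row < m.rows && e.col < m.cols);  // indices must fit the header
        m.entries.push_back(e);
    }
    return m;
}

// y = A x, touching only the stored nonzeros.
std::vector<double> multiply(const SparseMatrix& a, const std::vector<double>& x) {
    std::vector<double> y(a.rows, 0.0);
    for (const auto& e : a.entries) y[e.row] += e.value * x[e.col];
    return y;
}

// y = A^T x, again using only the stored nonzeros.
std::vector<double> multiply_transposed(const SparseMatrix& a, const std::vector<double>& x) {
    std::vector<double> y(a.cols, 0.0);
    for (const auto& e : a.entries) y[e.col] += e.value * x[e.row];
    return y;
}

// Euclidean norm of a dense vector.
double norm(const std::vector<double>& v) {
    double sum = 0.0;
    for (double x : v) sum += x * x;
    return std::sqrt(sum);
}

// Largest singular value via power iteration on A^T A (a stand-in for Lanczos).
double largest_singular_value(const SparseMatrix& a, int iterations = 200) {
    std::vector<double> v(a.cols, 1.0);  // arbitrary nonzero starting vector
    double sigma = 0.0;
    for (int i = 0; i < iterations; ++i) {
        const double n = norm(v);
        if (n == 0.0) return 0.0;                // A maps the start vector to zero
        for (double& x : v) x /= n;              // normalize the current estimate
        const std::vector<double> u = multiply(a, v);
        sigma = norm(u);                         // ||A v|| approaches sigma_1
        v = multiply_transposed(a, u);           // v <- A^T A v
    }
    return sigma;
}
```

Calling `largest_singular_value(parse_triplets("2 2\n0 0 3\n1 1 4\n"))` on a 2x2 diagonal matrix with entries 3 and 4 converges to 4. Lanczos-type methods are generally preferred in practice because they build several leading singular triplets from the same sparse matrix-vector products, whereas power iteration yields one at a time.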
While this initial release does not have a large number of visible features, a lot is going on behind the scenes. This version provides a solid foundation to build on. Future releases will expand the feature set to provide all the tools needed to deploy LSA technology in your solutions.