Hopefully other areas of our website have given you enough information to understand what Latent Semantic Analysis (LSA) is and what it can do for you. But who are these people who claim to be the LSA Experts, and how can they make that claim?
The start of it all
Back in the last century, 1995 to be exact, Dian Martin, one of our founders, took a class in information retrieval while working on a graduate degree in Computer Science at the University of Tennessee (UT). That class only briefly touched on the subject of Latent Semantic Indexing (LSI), but it opened a door to a graduate research position exploring LSI, which led to her Master’s thesis, “Downdating the LSI Model for Information Retrieval.”
Meanwhile, John Martin, our other founder, who was also working on a graduate degree in Computer Science at UT, did NOT take the information retrieval course. Instead, his studies focused on formal software engineering methods and producing high-reliability software for critical applications.
Well, as sometimes happens, the two of them met, fell in love, and decided to marry. After graduating, Dian took a postgraduate research position at the University of Colorado Institute of Cognitive Science working with Dr. Tom Landauer, one of the original inventors of LSI/LSA, exploring cross-language information retrieval using LSA. John continued working as a formal methods software engineering consultant with a few different companies.
The story continues
Dian continued to work in research and as a consultant on several LSA-related projects: with Dr. Landauer’s company KAT (developer of the Intelligent Essay Assessor), in continued academic research (at both the University of Colorado and the University of Tennessee), and with other clients who were exploring the use of LSA. John started working in support of these various projects as the software development became more and more demanding. During this time Dian authored several papers on LSA and the related mathematics, contributed a chapter to the Handbook of Latent Semantic Analysis, and was an invited speaker at the first European conference on LSA.
After using various bits of LSA software developed in the academic environment, repairing and replacing clients’ home-grown LSA implementations, and struggling with the errors, limitations, and generally poor condition of these various code bases, we realized that we needed to wipe the slate clean and build our own implementation of LSA from the ground up. An initial version of our software was first used in a research project on out-of-core LSA processing, published in 2008, because the existing academic software simply could not scale to the required volume. Over the next few years we continued to develop our LSA library as a side project while engaged in various LSA consulting and research projects.
SBT is born
In 2011 we re-incorporated as Small Bear Technologies to focus solely on development and support for LSA and began offering the first commercially available versions of the LSA_Toolkit™, a robust, scalable library for performing LSA processing. The LSA_Toolkit incorporates Dian’s strong mathematics background, John’s rigorous software engineering practices, and many years of combined consulting experience in LSA. After seeing all of the failures, limitations, and inaccuracies present in the previous LSA software that has been in circulation from academic sources, we have endeavoured to create an implementation that is correct in its calculations, scalable to meet the demands of today’s “Big Data” environments, maintainable and flexible enough to adapt to the needs of any application, and commercially supported by a professional team.
Where we are going
Over the past few years we have supported both government research projects and commercial applications deploying LSA in various areas including communications analysis, academic evaluation, legal discovery, advertising placement, and information analysis. Our team continues to pursue new research areas to expand and improve LSA and its application potential. Those advances are continually being incorporated into each new release of the LSA_Toolkit.