TML has moved to http://www.villalon.cl/tml.html and the code to https://github.com/villalon/tml
Features
- Document indexing and selection using Apache's Lucene
- Fast VSM generation with several local and global weights (term - doc matrix)
- Dimensionality reduction using SVD or NMF for LSA and related methods.
- Meta-data annotators (PennTree grammar parsing).
- Operations: Document distances, topic clustering, keyword extraction, and many more!
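
As an illustration of the term-document matrix and document-distance features listed above, here is a minimal sketch in plain Python. It does not use TML's API: the function names `term_doc_matrix` and `cosine_distance` are hypothetical, and the weighting (log term frequency as the local weight, inverse document frequency as the global weight) is an assumed choice among the "several local and global weights" the list mentions.

```python
import math
from collections import Counter


def term_doc_matrix(docs):
    """Build a weighted term-document matrix (hypothetical helper, not TML's API).

    Returns (vocab, matrix) where matrix[i][j] is the weight of term i
    in document j.
    """
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    counts = [Counter(toks) for toks in tokenized]
    n_docs = len(docs)
    matrix = []
    for term in vocab:
        # Global weight: inverse document frequency (assumed choice).
        df = sum(1 for c in counts if term in c)
        idf = math.log(n_docs / df)
        # Local weight: log(1 + raw term frequency) (assumed choice).
        matrix.append([math.log(1 + c[term]) * idf for c in counts])
    return vocab, matrix


def cosine_distance(matrix, a, b):
    """Cosine distance between document columns a and b of the matrix."""
    col_a = [row[a] for row in matrix]
    col_b = [row[b] for row in matrix]
    dot = sum(x * y for x, y in zip(col_a, col_b))
    norm = math.sqrt(sum(x * x for x in col_a)) * math.sqrt(sum(y * y for y in col_b))
    # Treat an all-zero document vector as maximally distant.
    return 1.0 - dot / norm if norm else 1.0


docs = [
    "text mining with lucene",
    "text mining with svd",
    "cats chase mice",
]
vocab, matrix = term_doc_matrix(docs)
d01 = cosine_distance(matrix, 0, 1)  # documents sharing three terms
d02 = cosine_distance(matrix, 0, 2)  # documents sharing no terms
```

Here the first two documents share terms and so end up closer to each other than either is to the third. A dimensionality reduction step such as SVD would operate on the same matrix before distances are computed.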
License
Apache License V2.0
User Reviews
-
It seems to be good, but there are some errors that don't let the program load the library correctly (the Abstract Annotator constructor receives parameters, but PennTreeAnnotator doesn't receive any).
-
very good library for doing text mining
-
great