Analytics1305 Machine Learning Library
Our software library, written in pure C++, combines a rich framework of advanced data structures, optimization and linear algebra utilities, and ETL tools. Cutting-edge machine learning algorithms build on these components for data mining tasks such as classification, regression, clustering, and outlier detection. The library has been built from the ground up to be:
- Super Fast and Scalable. We combine efficient mathematical algorithms, well-chosen data structures, automatic computational techniques, and fast C++ code, all designed to make our library run faster than the alternatives. Check out the benchmarks: all of them were performed on the Amazon Cloud, and we have taken every precaution to be extremely fair. We encourage you to try it yourself, and we are confident you will come back for more.
- Accurate. In machine learning, using more data during the modeling process generally translates to higher accuracy and greater confidence in the result. Our library not only makes this possible but also applies established statistical validation techniques, reporting confidence bands and expected error, all within measurable and configurable time frames.
- In-Database. Our library is available for standalone use but integrates just as easily into back-end systems such as databases, data marts, and data warehouses, as well as into enterprise processes such as CRM systems. We attack the data where it lives, thereby removing data movement overhead. We have already prototyped such systems on commercial DBMSs and data warehouses, using native data structures and functionality to speed up the analysis task.
- Extensible. New and breakthrough algorithms are discovered regularly, and different organizations, whether commercial or scientific, need customized solutions. We can quickly build new algorithms, data mining flows, and domain knowledge into our library, allowing our users to get the best from an efficient yet extensible framework.
- Feature Rich. Our library has many advanced features. To name a few:
  - Data Types. You can use any numerical precision, e.g. float, short, double, int, or long double. We also support Boolean and categorical data.
  - Run Modes. We support progressive, iterative, and standard modes.
  - Sparse Data. All our algorithms can handle sparse data.
  - Metric Learning. Automatically tune parameters.
  - Hadoop. Our classification and regression algorithms can make use of Hadoop to analyze massive datasets.
  - Cloud Support. We've deployed on the Amazon Cloud and have AMIs ready for you to launch.
  - Coming Soon. Server Mode, Streaming Data, Text Mining, and much more.
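
To make the Data Types and Sparse Data features above more concrete, here is a minimal sketch of how a C++ library can support any numerical precision together with sparse data. It is an illustration only, not the library's actual API: the names `SparseVector`, `push_back`, and `dot` are hypothetical, and the storage layout (index/value pairs sorted by index) is an assumption chosen for simplicity.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical type for illustration; not part of the library's real API.
// Stores only the non-zero entries of a vector as (index, value) pairs,
// kept sorted by index. T is the numeric precision (float, double, ...).
template <typename T>
class SparseVector {
public:
    explicit SparseVector(std::size_t dimension) : dimension_(dimension) {}

    // Appends a non-zero entry; indices must arrive in increasing order.
    void push_back(std::size_t index, T value) {
        assert(index < dimension_);
        assert(entries_.empty() || entries_.back().first < index);
        entries_.emplace_back(index, value);
    }

    std::size_t dimension() const { return dimension_; }
    std::size_t nonzeros() const { return entries_.size(); }
    const std::vector<std::pair<std::size_t, T>>& entries() const {
        return entries_;
    }

private:
    std::size_t dimension_;
    std::vector<std::pair<std::size_t, T>> entries_;
};

// Dot product that walks both sorted entry lists once, so the cost depends
// on the number of stored non-zeros rather than on the full dimension.
template <typename T>
T dot(const SparseVector<T>& a, const SparseVector<T>& b) {
    assert(a.dimension() == b.dimension());
    T sum = T(0);
    auto ia = a.entries().begin();
    auto ib = b.entries().begin();
    while (ia != a.entries().end() && ib != b.entries().end()) {
        if (ia->first < ib->first) {
            ++ia;                              // index present only in a
        } else if (ib->first < ia->first) {
            ++ib;                              // index present only in b
        } else {
            sum += ia->second * ib->second;    // index present in both
            ++ia;
            ++ib;
        }
    }
    return sum;
}
```

With this sketch, writing `SparseVector<float>` or `SparseVector<long double>` selects the precision at compile time, and the same `dot` template serves every instantiation. A real implementation would likely use a more compact layout and vectorized kernels; the point here is only that templates let one algorithm cover every supported precision.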