Let's learn the Apache Mahout™ machine learning library together

+3 votes

 

The goal of the Apache Mahout™ machine learning library is to build scalable machine learning libraries.

Mahout currently has

  • Collaborative Filtering
  • User- and item-based recommenders
  • K-Means, Fuzzy K-Means clustering
  • Mean Shift clustering
  • Dirichlet process clustering
  • Latent Dirichlet Allocation
  • Singular value decomposition
  • Parallel Frequent Pattern mining
  • Complementary Naive Bayes classifier
  • Random forest (decision-tree-based) classifier
  • High-performance Java collections (previously Colt collections)
  • A vibrant community
  • and much more cool stuff to come this summer, thanks to Google Summer of Code

By scalable we mean:

Scalable to reasonably large data sets. Our core algorithms for clustering, classification and batch-based collaborative filtering are implemented on top of Apache Hadoop using the map/reduce paradigm. However, we do not restrict contributions to Hadoop-based implementations: contributions that run on a single node or on a non-Hadoop cluster are welcome as well. The core libraries are highly optimized so that non-distributed algorithms also perform well.
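To make the map/reduce paradigm mentioned above more concrete, here is a minimal single-process sketch in plain Python. It does not use Hadoop or any Mahout API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `run_job`) and the sample sessions are made up for illustration. It counts in how many sessions each item appears, which is the kind of simple aggregate a batch job might compute.

```python
from collections import defaultdict


def map_phase(session):
    """Map step: emit an (item, 1) pair for every distinct item in one session."""
    for item in set(session):
        yield item, 1


def shuffle(pairs):
    """Group emitted values by key, as a framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: collapse all values for one key into a single count."""
    return key, sum(values)


def run_job(sessions):
    """Run map, shuffle and reduce in sequence on one machine (illustration only)."""
    emitted = (pair for session in sessions for pair in map_phase(session))
    grouped = shuffle(emitted)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical input: each inner list is one user's session.
sessions = [["milk", "bread"], ["bread", "eggs"], ["bread", "milk", "milk"]]
item_counts = run_job(sessions)
```

In a real cluster the map and reduce steps would run on many nodes and the shuffle would be handled by the framework; here everything runs in one process just to show the data flow.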

Scalable to support your business case. Mahout is distributed under the commercially friendly Apache Software License.

Scalable community. The goal of Mahout is to build a vibrant, responsive, diverse community to facilitate discussions not only on the project itself but also on potential use cases. Come to the mailing lists to find out more.

Currently Mahout mainly supports four use cases:

  • Recommendation mining takes users' behavior and tries to find items those users might like.
  • Clustering takes, for example, text documents and groups them into sets of topically related documents.
  • Classification learns from existing categorized documents what documents of a specific category look like, and can then assign unlabelled documents to the (hopefully) correct category.
  • Frequent itemset mining takes a set of item groups (terms in a query session, shopping cart contents) and identifies which individual items usually appear together.
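As a rough illustration of the frequent itemset mining use case, the sketch below counts how often pairs of items appear in the same shopping cart and keeps the pairs that reach a minimum support. This is a brute-force pair count in plain Python, not Mahout's Parallel Frequent Pattern mining implementation; the function name `frequent_pairs`, the `min_support` parameter and the sample baskets are assumptions made for the example.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(baskets, min_support):
    """Return the item pairs that occur together in at least min_support baskets."""
    pair_counts = Counter()
    for basket in baskets:
        # Deduplicate and sort so each unordered pair has one canonical form.
        for pair in combinations(sorted(set(basket)), 2):
            pair_counts[pair] += 1
    return {pair: n for pair, n in pair_counts.items() if n >= min_support}


# Hypothetical shopping carts.
baskets = [
    ["milk", "bread", "eggs"],
    ["milk", "bread"],
    ["bread", "eggs"],
    ["milk", "tea"],
]
pairs = frequent_pairs(baskets, min_support=2)
```

With these baskets, only the pairs (bread, eggs) and (bread, milk) appear together twice, so they are the ones returned. Counting every pair does not scale to large catalogs, which is the reason dedicated frequent pattern algorithms exist.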

Algorithm list: https://cwiki.apache.org/confluence/display/MAHOUT/Algorithms

Introduction in Chinese: http://www.ibm.com/developerworks/cn/java/j-mahout/

Asked: June 24, 2012 Category: Open Source Projects Author: shinchen (220 points)
Recategorized June 24, 2012 by fandywang
Good Idea!
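Good idea! I studied some recommender systems in graduate school and am now looking at Mahout; I started with the book Mahout in Action. I hope we can all learn together.
Nice, you are welcome to share your experience!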

1 Answer

0 votes

I have created a new "Open Source Projects" board; source code analyses are welcome.

Answered June 24, 2012 by fandywang (2,360 points)
perfect
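I have looked at Mahout before, but I work in C++. I wonder whether it would be better to implement these algorithms in C++ on a distributed platform!
Nice idea. It would be best if we all got together and started an open-source project.
Let's start one, then.
My own time and energy are really limited, and I may not be able to look after it for a while. You are welcome to kick it off here, and I will join in when I get the chance.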

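This site is hosted on DigitalOcean and is licensed under a Creative Commons license requiring attribution, non-commercial use, and share-alike. Reposts of this site's content must also follow the Creative Commons "Attribution-NonCommercial-ShareAlike" license.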