Simple User-Based Recommendation Using Mahout

Tonight I want to share a brief introduction to user-based recommendation. First we need to know how user-based recommendation works. It compares how similar one user is to another, and here we use Pearson correlation as the similarity measure. Pearson correlation gives a value between -1 and 1, and three values are worth remembering:

  • -1 => the two users have opposite tastes, so we will not recommend the other user's items.
  • 0 => there is no linear correlation between the two users' ratings, so the other user tells us nothing.
  • 1 => the two users rate items the same way, so we can recommend items that the other user likes.
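To make the three cases concrete, here is a minimal sketch of Pearson correlation in plain Java. It is not Mahout's implementation: the class name PearsonSketch, the ratings helper, and the choice to compare only items that both users have rated are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class for illustration; not part of Mahout.
final class PearsonSketch {

    /** Builds an itemId -> rating map from two parallel arrays. */
    static Map<Long, Double> ratings(long[] itemIds, double[] values) {
        Map<Long, Double> result = new HashMap<>();
        for (int i = 0; i < itemIds.length; i++) {
            result.put(itemIds[i], values[i]);
        }
        return result;
    }

    /**
     * Pearson correlation over the items that both users have rated.
     * Returns NaN when it is undefined (fewer than two common items,
     * or one user gave the same rating to every common item).
     */
    static double pearson(Map<Long, Double> a, Map<Long, Double> b) {
        // Collect the ratings for the items the two users have in common.
        List<double[]> common = new ArrayList<>();
        for (Map.Entry<Long, Double> entry : a.entrySet()) {
            Double other = b.get(entry.getKey());
            if (other != null) {
                common.add(new double[] {entry.getValue(), other});
            }
        }
        int n = common.size();
        if (n < 2) {
            return Double.NaN;
        }

        // First pass: mean rating of each user over the common items.
        double meanA = 0.0;
        double meanB = 0.0;
        for (double[] pair : common) {
            meanA += pair[0];
            meanB += pair[1];
        }
        meanA /= n;
        meanB /= n;

        // Second pass: covariance and variances around those means.
        double cov = 0.0;
        double varA = 0.0;
        double varB = 0.0;
        for (double[] pair : common) {
            double da = pair[0] - meanA;
            double db = pair[1] - meanB;
            cov += da * db;
            varA += da * da;
            varB += db * db;
        }
        double denominator = Math.sqrt(varA * varB);
        return denominator == 0.0 ? Double.NaN : cov / denominator;
    }
}
```

Two users whose ratings rise and fall together get a value close to 1, two users whose ratings move in opposite directions get a value close to -1, and users with no items in common get NaN, which a recommender would normally skip.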

Now we know the basic concept. What's next? Of course we need to code it. I'm using IntelliJ as my IDE, but you can use your own, like Eclipse, Vim, or Sublime. First you need to add mahout-core as a dependency using Maven. The latest version at the time of writing is 0.9. All you need to know is:

  1. Load the data from a file. FileDataModel expects one preference per line in the form userID,itemID,preference:
  • DataModel model = new FileDataModel(new File("data.csv"));
  2. Choose the similarity algorithm and the neighborhood. Here the neighborhood keeps every user whose similarity is at least 0.1:
  • UserSimilarity userSimilarity = new PearsonCorrelationSimilarity(model);
  • UserNeighborhood neighborhood = new ThresholdUserNeighborhood(0.1, userSimilarity, model);
  3. Build the recommender and wrap it in a cache:
  • Recommender recommender = new GenericUserBasedRecommender(model, neighborhood, userSimilarity);
    Recommender cachingRecommender = new CachingRecommender(recommender);
  4. Ask for recommendations for a specific user, here 3 items for user 2:
  • List<RecommendedItem> recommendations = cachingRecommender.recommend(2, 3);

Yeay, it will give a result like this:

[
            RecommendedItem[item:12, value:4.8328104],
            RecommendedItem[item:13, value:4.6656213],
            RecommendedItem[item:14, value:4.331242]
        ]
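The value next to each item is the recommender's estimated preference for that user. A common way to get such an estimate in user-based recommendation is a similarity-weighted average of the neighbors' ratings. The sketch below shows that idea in plain Java; it is an illustration under assumptions, not Mahout's actual code. The class name UserBasedEstimateSketch, the estimate method, and the sample numbers are all made up for this example, and the similarities are passed in already computed.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper class for illustration; not part of Mahout.
final class UserBasedEstimateSketch {

    /**
     * Estimates the target user's rating for one item.
     *
     * similarities: neighbor user ID -> similarity to the target user
     * ratings:      user ID -> (item ID -> rating)
     * threshold:    minimum similarity for a user to count as a neighbor
     *               (assumed to be positive, like the 0.1 used above)
     *
     * Returns NaN when no neighbor above the threshold has rated the item.
     */
    static double estimate(Map<Long, Double> similarities,
                           Map<Long, Map<Long, Double>> ratings,
                           long itemId,
                           double threshold) {
        double weightedSum = 0.0;
        double totalSimilarity = 0.0;
        for (Map.Entry<Long, Double> entry : similarities.entrySet()) {
            double similarity = entry.getValue();
            // Threshold neighborhood: skip users who are not similar enough.
            if (Double.isNaN(similarity) || similarity < threshold) {
                continue;
            }
            Map<Long, Double> neighborRatings = ratings.get(entry.getKey());
            Double rating = neighborRatings == null ? null : neighborRatings.get(itemId);
            if (rating == null) {
                continue; // this neighbor has not rated the item
            }
            weightedSum += similarity * rating;
            totalSimilarity += similarity;
        }
        return totalSimilarity == 0.0 ? Double.NaN : weightedSum / totalSimilarity;
    }

    public static void main(String[] args) {
        // Made-up similarities of users 3, 4 and 5 to the target user.
        Map<Long, Double> similarities = new HashMap<>();
        similarities.put(3L, 1.0);
        similarities.put(4L, 0.5);
        similarities.put(5L, 0.05); // below the 0.1 threshold, so ignored

        // Made-up ratings of item 12 by those users.
        Map<Long, Map<Long, Double>> ratings = new HashMap<>();
        double[] values = {5.0, 2.0, 1.0};
        long[] users = {3L, 4L, 5L};
        for (int i = 0; i < users.length; i++) {
            Map<Long, Double> userRatings = new HashMap<>();
            userRatings.put(12L, values[i]);
            ratings.put(users[i], userRatings);
        }

        // (1.0 * 5.0 + 0.5 * 2.0) / (1.0 + 0.5) = 4.0
        System.out.println(estimate(similarities, ratings, 12L, 0.1));
    }
}
```

A recommender would run an estimate like this for every item the target user has not rated yet and return the items with the highest values.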

It's easy, right? What's next: I will implement the item-based algorithm and use matrix factorization to do the recommendation.

If you want the source code and dataset, you can go to this gist:
https://gist.github.com/linggom/c1f1b265dd7002d4a476

If you are still confused, you can see the Excel visualization here:
https://docs.google.com/spreadsheets/d/1YfChbvijxcho61dFSWURQ0WMUeVWWTPwqahBZhnF_0k/edit?usp=sharing
