Monday, March 16, 2015

PAPER Improving the Scalability of ALS-based Large Recommender Systems with Similar User Index

Alternating Least Squares (ALS) is a popular method to compute matrix factorization in parallel. However, due to the time complexity of predicting users' preferences, ALS is not scalable to large-scale datasets. In this paper, we propose a similar user index-based parallel matrix factorization approach. Since the group of similar users is indexed in advance, there is no need to compute similarities between all users in the dataset. Furthermore, the size of the matrix is reduced because the matrix is composed only of the indexed users' ratings and items. Current cloud computing technologies, including Hadoop, MapReduce and Amazon EC2, are employed to implement the proposed approach. We empirically show that the use of a similar user index resolves the scalability issue of ALS and improves the performance of large-scale recommender systems in a distributed computing environment.
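To make the idea in the abstract concrete, here is a minimal single-machine sketch in Python with NumPy. It is not the paper's implementation: the abstract gives no algorithmic details, so the regularized ALS update, the toy ratings matrix, the `similar_user_index` dictionary and the function names are all assumptions made for illustration. The sketch only shows the two points the abstract states: similar users are looked up from a pre-built index, and factorization runs on the smaller matrix made of those users' ratings and items.

```python
import numpy as np

def als(R, mask, rank=2, reg=0.1, iters=20, seed=0):
    """Factor R ≈ U @ V.T using only the observed entries (mask == True)."""
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    U = rng.normal(scale=0.1, size=(n_users, rank))
    V = rng.normal(scale=0.1, size=(n_items, rank))
    ridge = reg * np.eye(rank)
    for _ in range(iters):
        # Fix V and solve a small ridge regression for each user row.
        for u in range(n_users):
            obs = mask[u]
            if obs.any():
                Vo = V[obs]
                U[u] = np.linalg.solve(Vo.T @ Vo + ridge, Vo.T @ R[u, obs])
        # Fix U and solve a small ridge regression for each item row.
        for i in range(n_items):
            obs = mask[:, i]
            if obs.any():
                Uo = U[obs]
                V[i] = np.linalg.solve(Uo.T @ Uo + ridge, Uo.T @ R[obs, i])
    return U, V

def factorize_for_group(R, mask, group):
    """Run ALS on the sub-matrix of one pre-indexed group of similar users."""
    group = np.asarray(group)
    sub_mask = mask[group]
    items = np.flatnonzero(sub_mask.any(axis=0))  # items rated by the group
    sub_R = R[np.ix_(group, items)]
    U, V = als(sub_R, sub_mask[:, items])
    return U, V, items

# Toy ratings (0 = not rated); all values are made up for illustration.
R = np.array([
    [5, 4, 0, 0, 1],
    [4, 5, 0, 0, 0],
    [5, 0, 0, 1, 1],
    [0, 0, 5, 4, 0],
    [0, 1, 4, 5, 0],
    [1, 0, 5, 0, 0],
], dtype=float)
mask = R > 0

# Hypothetical index built in advance: group key -> similar users.
similar_user_index = {0: [0, 1, 2], 3: [3, 4, 5]}

U, V, items = factorize_for_group(R, mask, similar_user_index[0])
predictions = U @ V.T  # predicted ratings for the group's users and items
```

In this toy case the full matrix is 6 x 5, while the matrix actually factorized for the first group is 3 x 4, because only the group's users and the items they rated are kept. Each group could in principle be handled by a separate worker, which is where a MapReduce-style setup would come in.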

Please see also:
PAPER "A new similarity measure using Bhattacharyya coefficient"

Please remember:
Personality Based Recommender Systems are the next generation of recommender systems because they perform far better than Behavioural ones (which rely on past actions and patterns of personal preferences).

That is the only way to improve recommender systems: include the personality traits of their users. They need to calculate personality similarity between users, but there are different formulas to calculate similarity.
Similarity is a word that has different meanings for different persons or companies; it depends on exactly how it is defined mathematically. In case you had not noticed, recommender systems are morphing into ... compatibility matching engines, the same kind used in the Online Dating Industry for years, with low success rates until now because they mostly use the BIG 5 to assess personality and the Pearson correlation coefficient to calculate similarity.
The BIG 5 (Big Five) normative personality test is obsolete. The HEXACO (a.k.a. Big Six) is another oversimplification. Online Dating sites have very big databases, in the range of 20,000,000 (twenty million) profiles, so neither the BIG 5 model nor the HEXACO model is enough for predictive purposes. That is why I suggest the 16PF5 test instead and another method to calculate similarity. I calculate similarity in personality patterns with a (proprietary) pattern recognition by correlation method. It takes into account the score and the trend to score of any pattern. It also takes into account women under hormonal treatment, because several studies showed that contraceptive pill users make different mate choices, on average, compared to non-users. "Only short-term but not long-term partner preferences tend to vary with the menstrual cycle".

If you want to be first in the "personalization arena" == Personality Based Recommender Systems, you should first of all understand the ... Online Dating Industry!

Please see: "How to calculate personality similarity between users"
Short answer: the key is the ENSEMBLE!
(the whole set of different valid possibilities)

Worldwide there are over 5,000 online dating sites, but none uses the 16PF5, none is scientifically proven yet, and none can show you compatibility distribution curves, i.e. if you are a man seeking women, none can show how compatible you are with each woman in a database of 20,000,000 women and then select a group of 100 women from that database.
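As a rough illustration of what a compatibility distribution curve and a "top 100" selection involve computationally, here is a small Python sketch. It is not my proprietary method: it uses the plain Pearson correlation coefficient (the baseline criticized above) as a stand-in similarity, and the profiles are random, made-up data with 16 scores per person on a 1 to 10 sten scale, as in the 16PF. The function names and the database size of 10,000 are assumptions chosen so that the example runs quickly.

```python
import numpy as np

def pearson_to_all(query, profiles):
    """Pearson correlation between one profile and every row of `profiles`."""
    q = query - query.mean()
    P = profiles - profiles.mean(axis=1, keepdims=True)
    denom = np.linalg.norm(q) * np.linalg.norm(P, axis=1)
    # Profiles with zero variance get similarity 0 instead of NaN.
    r = np.divide(P @ q, denom, out=np.zeros(len(P)), where=denom > 0)
    return np.clip(r, -1.0, 1.0)  # guard against floating-point overshoot

def top_k(scores, k):
    """Indices of the k highest scores, best first."""
    k = min(k, len(scores))
    idx = np.argpartition(-scores, k - 1)[:k]  # unordered k best
    return idx[np.argsort(-scores[idx])]       # order only those k

# Synthetic data for illustration only: 16 sten scores (1..10) per profile.
rng = np.random.default_rng(0)
profiles = rng.integers(1, 11, size=(10_000, 16)).astype(float)
query = rng.integers(1, 11, size=16).astype(float)

scores = pearson_to_all(query, profiles)

# The "compatibility distribution curve": how many profiles fall in each
# similarity bin between -1 and +1.
hist, edges = np.histogram(scores, bins=20, range=(-1.0, 1.0))

# The 100 most similar profiles, best first.
best = top_k(scores, 100)
```

Plotting `hist` against the bin edges gives the distribution curve for one person against the whole database. Using `argpartition` avoids sorting every score, which matters when the database has millions of rows rather than the few thousand used here.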


What comes after the Social Networking wave?
The Next Big Investment Opportunity on the Internet will be .... Personalization!
Personality Based Recommender Systems and Strict Personality Based Compatibility Matching Engines for serious Online Dating with the normative 16PF5 personality test. 
