Recommendations with Neo4j (11): Collaborative Filtering (Pearson Similarity)
阿新 • Published: 2018-11-16
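Pearson similarity, or Pearson correlation, is another similarity metric we can use. It is particularly well suited to product recommendations because it accounts for the fact that different users have different mean ratings: on average, some users tend to give higher ratings than others. Pearson similarity subtracts each user's mean rating before comparing, so two users who rank movies alike come out as similar even when one of them rates consistently higher than the other.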
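For reference, the quantity being computed can be written out as follows. The notation is an assumption introduced here, not the original author's: $M_{12}$ is the set of movies rated by both users, $r_{i,m}$ is the rating user $i$ gave movie $m$, and $\bar{r}_i$ is the mean over all of user $i$'s ratings (this is how the Cypher query in this post computes the means, rather than averaging over the shared movies only).

```latex
\[
\mathrm{pearson}(u_1, u_2) =
\frac{\sum_{m \in M_{12}} (r_{1,m} - \bar{r}_1)\,(r_{2,m} - \bar{r}_2)}
     {\sqrt{\sum_{m \in M_{12}} (r_{1,m} - \bar{r}_1)^2 \;\cdot\; \sum_{m \in M_{12}} (r_{2,m} - \bar{r}_2)^2}}
\]
```

The result lies between -1 and 1: values near 1 mean the two users deviate from their own averages in the same direction on the same movies, and values near -1 mean they deviate in opposite directions.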
Find the users most similar to Cynthia Freeman, according to Pearson similarity:
// Mean of all ratings given by Cynthia Freeman
MATCH (u1:User {name:"Cynthia Freeman"})-[r:RATED]->(m:Movie)
WITH u1, avg(r.rating) AS u1_mean

// Rating pairs for movies both users rated; keep users with more than 10 in common
MATCH (u1)-[r1:RATED]->(m:Movie)<-[r2:RATED]-(u2)
WITH u1, u1_mean, u2, COLLECT({r1: r1, r2: r2}) AS ratings
WHERE size(ratings) > 10

// Mean of all ratings given by the other user
MATCH (u2)-[r:RATED]->(m:Movie)
WITH u1, u1_mean, u2, avg(r.rating) AS u2_mean, ratings

// Numerator and denominator of the Pearson coefficient over the shared movies
UNWIND ratings AS r
WITH sum( (r.r1.rating - u1_mean) * (r.r2.rating - u2_mean) ) AS nom,
     sqrt( sum( (r.r1.rating - u1_mean)^2 ) * sum( (r.r2.rating - u2_mean)^2 ) ) AS denom,
     u1, u2
WHERE denom <> 0

RETURN u1.name, u2.name, nom/denom AS pearson
ORDER BY pearson DESC LIMIT 100
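To make the arithmetic in the query easier to follow, here is a minimal Python sketch of the same calculation for a single pair of users, working on plain dictionaries instead of a graph. The function name `pearson`, the `min_overlap` parameter, and the sample ratings are all hypothetical and only serve as illustration. The sketch mirrors the query's choices: each mean is taken over all of that user's ratings, pairs with too few shared movies are skipped, and a zero denominator yields no result.

```python
from math import sqrt


def pearson(ratings_a, ratings_b, min_overlap=10):
    """Pearson similarity between two users.

    ratings_a, ratings_b: dicts mapping movie title -> rating.
    Returns None when the users share min_overlap movies or fewer
    (the query's `size(ratings) > 10` filter) or when the denominator
    is zero (the query's `denom <> 0` filter).
    """
    shared = ratings_a.keys() & ratings_b.keys()
    if len(shared) <= min_overlap:
        return None

    # Means over ALL ratings of each user, as in the query
    mean_a = sum(ratings_a.values()) / len(ratings_a)
    mean_b = sum(ratings_b.values()) / len(ratings_b)

    # Sums run over the shared movies only
    nom = sum((ratings_a[m] - mean_a) * (ratings_b[m] - mean_b) for m in shared)
    denom = sqrt(
        sum((ratings_a[m] - mean_a) ** 2 for m in shared)
        * sum((ratings_b[m] - mean_b) ** 2 for m in shared)
    )
    if denom == 0:
        return None
    return nom / denom


# Hypothetical sample data: both users order the movies the same way,
# but the first one spreads ratings more widely.
alice = {"Movie A": 5, "Movie B": 3, "Movie C": 1}
bob = {"Movie A": 4, "Movie B": 3, "Movie C": 2}
score = pearson(alice, bob, min_overlap=0)
```

With the default `min_overlap=10`, this tiny sample would be filtered out, just as the query drops users who share 10 or fewer rated movies with Cynthia Freeman.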