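With the question of what a user likes and dislikes settled, we can move on to the algorithm itself. I won't paste all of the code here, just the overall flow; the full code is on my GitHub.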
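# Read the raw file data
test_contents=readFile(file_name)
# Parse the data into a 2-D list: [[user id, movie id, rating], ...]
test_rates=getRatingInformation(test_contents)
# Convert the list into two dictionaries:
# 1. User dictionary: dic[user id]=[(movie id, rating), ...]
# 2. Movie-to-user inverted index: dic[movie id]=[user id 1, user id 2, ...]
test_dic,test_item_to_user=createUserRankDic(test_rates)
# Find the k nearest neighbors of the target user
neighbors=calcNearestNeighbor(userid,test_dic,test_item_to_user)[:k]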
# Score each movie: sum neighbor[0] (the neighbor's score) over every neighbor who rated it
recommend_dic={}
for neighbor in neighbors:
    neighbor_user_id=neighbor[1]
    movies=test_dic[neighbor_user_id]
    for movie in movies:
        if movie[0] not in recommend_dic:
            recommend_dic[movie[0]]=neighbor[0]
        else:
            recommend_dic[movie[0]]+=neighbor[0]
# Build the recommendation list as [score, movie id] pairs, highest score first
recommend_list=[]
for key in recommend_dic:
    recommend_list.append([recommend_dic[key],key])
recommend_list.sort(reverse=True)
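
The flow above calls createUserRankDic and calcNearestNeighbor without showing them. Here is a minimal sketch of what those two steps could look like, so the flow can be tried on toy data. The names build_user_dicts, cosine_similarity and nearest_neighbors are hypothetical, and cosine similarity over raw ratings is an assumption made for illustration; the measure actually used is in the repository.

```python
from math import sqrt


def build_user_dicts(rates):
    """Turn [[user_id, movie_id, rating], ...] into the two lookup tables.

    Hypothetical stand-in for createUserRankDic.
    """
    user_dic = {}       # user_id -> [(movie_id, rating), ...]
    item_to_user = {}   # movie_id -> [user_id, ...]
    for user_id, movie_id, rating in rates:
        user_dic.setdefault(user_id, []).append((movie_id, rating))
        item_to_user.setdefault(movie_id, []).append(user_id)
    return user_dic, item_to_user


def cosine_similarity(ratings_a, ratings_b):
    """Cosine similarity of two rating lists (an assumed measure)."""
    a, b = dict(ratings_a), dict(ratings_b)
    common = set(a) & set(b)
    norm_a = sqrt(sum(r * r for r in a.values()))
    norm_b = sqrt(sum(r * r for r in b.values()))
    if not common or norm_a == 0 or norm_b == 0:
        return 0.0
    dot = sum(a[m] * b[m] for m in common)
    return dot / (norm_a * norm_b)


def nearest_neighbors(user_id, user_dic, item_to_user):
    """Return [similarity, neighbor_id] pairs, most similar first.

    Hypothetical stand-in for calcNearestNeighbor. The inverted index
    limits the candidates to users sharing at least one movie.
    """
    candidates = set()
    for movie_id, _rating in user_dic[user_id]:
        candidates.update(item_to_user[movie_id])
    candidates.discard(user_id)
    neighbors = [
        [cosine_similarity(user_dic[user_id], user_dic[other]), other]
        for other in candidates
    ]
    neighbors.sort(reverse=True)
    return neighbors


# Toy ratings, made up for illustration: [user_id, movie_id, rating]
rates = [
    [1, 10, 5], [1, 11, 3],
    [2, 10, 5], [2, 11, 3], [2, 12, 4],
    [3, 11, 1], [3, 13, 5],
]
user_dic, item_to_user = build_user_dicts(rates)
neighbors = nearest_neighbors(1, user_dic, item_to_user)
print(neighbors)
```

In this toy data, user 2 shares both of user 1's movies with identical ratings, so it ranks ahead of user 3, who shares only one movie. The inverted index is what keeps the neighbor search cheap: only users who have at least one movie in common with the target are ever compared.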
For an arbitrarily chosen user, we get the following recommendations:
movie name                               release
=======================================================
Contact (1997)                           11-Jul-1997
Scream (1996)                            20-Dec-1996
Liar Liar (1997)                         21-Mar-1997
Saint, The (1997)                        14-Mar-1997
English Patient, The (1996)              15-Nov-1996
Titanic (1997)                           01-Jan-1997
Air Force One (1997)                     01-Jan-1997
Star Wars (1977)                         01-Jan-1977
Conspiracy Theory (1997)                 08-Aug-1997
Toy Story (1995)                         01-Jan-1995
Fargo (1996)                             14-Feb-1997
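
The recommendation list holds movie ids, while the table above shows titles and release dates, so a lookup step sits in between. Here is a rough sketch of that step, assuming a pipe-delimited metadata file whose first three fields are id, title and release date (the layout used by u.item in MovieLens 100K). The names parse_movie_info and format_recommendations are hypothetical, the sample lines are made up for illustration, and treating ids as strings is an assumption as well.

```python
def parse_movie_info(lines):
    """Map movie id -> (title, release date) from 'id|title|release|...' lines."""
    info = {}
    for line in lines:
        fields = line.rstrip("\n").split("|")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        info[fields[0]] = (fields[1], fields[2])
    return info


def format_recommendations(recommend_list, movie_info, top_n=10):
    """Turn [[score, movie_id], ...] into aligned 'title  release' rows."""
    rows = []
    for _score, movie_id in recommend_list[:top_n]:
        title, release = movie_info.get(movie_id, ("<unknown>", ""))
        rows.append("%-40s %s" % (title, release))
    return rows


# Illustrative metadata lines (assumed layout, not real file contents)
sample_lines = [
    "1|Toy Story (1995)|01-Jan-1995||http://example.invalid/1|0|0",
    "50|Star Wars (1977)|01-Jan-1977||http://example.invalid/50|0|0",
]
movie_info = parse_movie_info(sample_lines)

# Illustrative scores; the real list comes from recommend_list above
rows = format_recommendations([[2.5, "50"], [1.2, "1"]], movie_info)
print("%-40s %s" % ("movie name", "release"))
print("=" * 55)
for row in rows:
    print(row)
```

With a real metadata file, the lines would come from opening the file and iterating over it; the rest stays the same.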