Solution | #Consumption Feature Scoring for a Store's Users#
Consumption Feature Scoring for a Store's Users
https://www.nowcoder.com/practice/200c824e9ed4428491c27d65ec56067d
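import pandas as pd

# Load the data. The rest of the script refers to the frame as `sales`,
# so it is read under that name (the original assigned it to `data`).
sales = pd.read_csv('sales.csv')

# Convert the type as required by the expected output
sales['monetary'] = sales['monetary'].astype('float32')

# Quartiles: take the 25%, 50% and 75% rows from the describe() result
des = sales[['recency', 'frequency', 'monetary']].describe().loc['25%':'75%']

# Compute the RFM scores by applying a function to every value of a column.
# Recency is scored in reverse: a smaller value earns a higher score.
sales['R_Quartile'] = sales['recency'].apply(lambda x: 4 if x <= des.iloc[0, 0] else (3 if x <= des.iloc[1, 0] else (2 if x <= des.iloc[2, 0] else 1)))
sales['F_Quartile'] = sales['frequency'].apply(lambda x: 1 if x <= des.iloc[0, 1] else (2 if x <= des.iloc[1, 1] else (3 if x <= des.iloc[2, 1] else 4)))
sales['M_Quartile'] = sales['monetary'].apply(lambda x: 1 if x <= des.iloc[0, 2] else (2 if x <= des.iloc[1, 2] else (3 if x <= des.iloc[2, 2] else 4)))

# print(sales.head())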
Type casting: astype()
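As a minimal sketch of the cast, assuming a small made-up frame in place of sales.csv (the values below are hypothetical and not taken from the problem's data):

```python
import pandas as pd

# Hypothetical toy data standing in for sales.csv
sales = pd.DataFrame({"monetary": [10, 25, 40]})

# The integer column becomes float32 after the cast
sales["monetary"] = sales["monetary"].astype("float32")
monetary_dtype = str(sales["monetary"].dtype)
print(monetary_dtype)
```

Assigning the result back to the column is what makes the change stick, since astype() returns a new object rather than modifying the column in place.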
Computing quartiles:
All columns at once: data[['recency', 'frequency', 'monetary']].describe().loc['25%':'75%'] (the slice also includes the 50% row)
One quantile at a time: data.quantile(0.25), data.quantile(0.75)
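The two approaches above can be compared side by side. This is a sketch on hypothetical toy data; passing a list to quantile() is used here so that one call returns all three rows:

```python
import pandas as pd

# Hypothetical toy data with the same column names as the problem
sales = pd.DataFrame({
    "recency": [1, 2, 3, 4, 5],
    "frequency": [10, 20, 30, 40, 50],
    "monetary": [100.0, 200.0, 300.0, 400.0, 500.0],
})
cols = ["recency", "frequency", "monetary"]

# Approach 1: slice the 25%..75% rows out of describe()
des = sales[cols].describe().loc["25%":"75%"]

# Approach 2: ask for the quantiles directly
q = sales[cols].quantile([0.25, 0.5, 0.75])

# Same numbers, different row labels ('25%' versus 0.25)
same_values = bool((des.to_numpy() == q.to_numpy()).all())
print(des.index.tolist(), q.index.tolist(), same_values)
```

The describe() route is convenient when the other summary statistics are also of interest; quantile() avoids computing them when only the cut points are needed.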
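Applying a function to every value of a column: data.apply(lambda x: condition)
The condition can look up a value from data or perform an operation on data. When data is a single column (a Series), each of its values is passed to the lambda as x in turn.
In this example: 4 if x <= des.iloc[0,0] else (3 if x <= des.iloc[1,0] else (2 if x <= des.iloc[2,0] else 1))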
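The nested conditional can be hard to read, so here is one possible way to spell it out as a named function. The helper quartile_score and its reverse parameter are hypothetical names introduced for illustration, and the data is made up:

```python
import pandas as pd

def quartile_score(x, q1, q2, q3, reverse=False):
    """Map x to a score of 1..4 by quartile; reverse=True flips it to 4..1."""
    if x <= q1:
        score = 1
    elif x <= q2:
        score = 2
    elif x <= q3:
        score = 3
    else:
        score = 4
    return 5 - score if reverse else score

# Hypothetical toy data
sales = pd.DataFrame({
    "recency": [1, 2, 3, 4, 5],
    "frequency": [10, 20, 30, 40, 50],
})

# Quartile cut points for each column
r1, r2, r3 = sales["recency"].quantile([0.25, 0.5, 0.75])
f1, f2, f3 = sales["frequency"].quantile([0.25, 0.5, 0.75])

# Recency is reversed (smaller is better); frequency is not
sales["R_Quartile"] = sales["recency"].apply(
    lambda x: quartile_score(x, r1, r2, r3, reverse=True))
sales["F_Quartile"] = sales["frequency"].apply(
    lambda x: quartile_score(x, f1, f2, f3))
print(sales)
```

Values equal to a cut point fall into the lower bucket, matching the <= comparisons in the original lambda.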