题解 | #多报名表的运动项目人数统计#
多报名表的运动项目人数统计
https://www.nowcoder.com/practice/423da9d9768243c89e3c724ba7e2ff4d
pd.concat:Concatenate pandas objects along a particular axis with optional set logic along the other axes.
pd.DataFrame.join: Join columns with other DataFrame either on index or on a key column
join(self, other, on=None, how='left', lsuffix='', rsuffix='', sort=False) -> 'DataFrame'If we want to join using the key columns, we need to set key to be the index in both
dfandother. The joined DataFrame will have key as its index.
思路:
- 先合并 signup 和 signup1 两个表
pd.concat([signup, signup1])得到 signup_all - 根据 item_id 字段内连接 items 和 signup_all 两个表
join(on='item_id', how='inner')(注意到题目要求只输出报名人数不为0的项目及其对应的报名人数,因此使用内连接) - 再进行分组聚合统计各个项目的人数
groupby('item_name')['employee_id'].count()
import pandas as pd
signup = pd.read_csv('signup.csv')
signup1 = pd.read_csv('signup1.csv')
items = pd.read_csv('items.csv')
signup_all = pd.concat([signup, signup1])
df = items.set_index('item_id').join(signup_all.set_index('item_id'),
on='item_id', how='inner')
print(df.groupby('item_name')['name'].count())


