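The problem: based on the user_id column, I want to get the values of the rating and product_id columns. There can be multiple entries with the same user_id, both in the same file and in other files. The table below shows some of the data from the first file.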
| product_id | user_id | user_name | rating |
|-------------|-----------------|----------------------------------------------|--------|
| B0009XRZ92 | A2JFZLAUG3YFQ7 | Entropy Babe "EB" | 5 |
| B0009XRZ92 | A22HGAAO8KZ2N3 | R. Metzelar | 5 |
| B000067A8B | A2NJO6YE954DBH | Lawrance M. Bernabo | 4 |
| B0009XRZ92 | A3HE4MYMWK4AER | Rebecca M. Eddy "Foster Mom and Untbunny" | 5 |
| B003A3R3ZY | A9A2PR663ED1V | Roger D. Goff | 5 |
| B0009XRZ92 | A2MRZDJF90JC1U | Suzanne K. Armstrong "Suzy Q" | 5 |
| B0009XRZ92 | A2YNBDT3170PCR | C. O'Hern | 5 |
| B0009XRZ92 | A10VJ7BDVCPKEZ | Carol S. Bottom | 5 |
| B0009XRZ92 | AAAQO894MG80B | Paul J. Michko | 5 |
| B00067BBQE | A9A2PR663ED1V | Roger D. Goff | 5 |
| B0009XRZ92 | A31S5QUMFR8NH2 | Dana L. Jordan "Mom of Twins" | 5 |
| B0009XRZ92 | A2DS24DHXUH0GM | Gaz Rev(iewer) | 4 |
| B00006AUMZ | A2NJO6YE954DBH | Lawrance M. Bernabo | 4 |
| B0009XRZ92 | A16FRHL2ZC7EUR | M. Claytor | 5 |
| B0009XRZ92 | A3AV8R0A62PP1N | MARCUSHELBLINZ "mmmacman" | 5 |
| B0009XRZ92 | A3QN84C38DE9FU | Gillian M. Kratzer | 5 |
| B0009XRZ92 | A36MLTLVQFEQYL | Yossarian "alienated socialist" | 5 |
| B00006AUMD | A2NJO6YE954DBH | Lawrance M. Bernabo | 4 |
What I want to do is:
Take one user_id at a time, from the first file only, and display the rating and product_id values for that user across all the movies in all the files. If the user didn't rate a movie, the record should still be displayed, with the product_id value and a rating of NaN. The whole process should be repeated for every user in the first file, and only for those users.
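To make the requirement concrete, here is a minimal sketch of the long-format result described above: one record per (user, product) pair, with NaN where the user has no rating. The two small frames are illustrative stand-ins for the first file and one further file, and the user `ZZZ_NOT_IN_FIRST` is made up to show that users outside the first file are dropped. Averaging repeated (user, product) entries is an assumption; the question does not say how duplicates should be handled.

```python
import pandas as pd

# Illustrative stand-ins for the first file and one further file.
first = pd.DataFrame({
    "product_id": ["B0009XRZ92", "B000067A8B", "B003A3R3ZY"],
    "user_id": ["A2JFZLAUG3YFQ7", "A2NJO6YE954DBH", "A9A2PR663ED1V"],
    "rating": [5, 4, 5],
})
other = pd.DataFrame({
    "product_id": ["B00067BBQE", "B0009XRZ92", "B00006AUMZ"],
    "user_id": ["A9A2PR663ED1V", "ZZZ_NOT_IN_FIRST", "A2NJO6YE954DBH"],
    "rating": [5, 3, 4],
})

# Ratings from every file, stacked into one frame.
all_ratings = pd.concat([first, other], ignore_index=True)

users = first["user_id"].unique()                # users from the first file only
products = all_ratings["product_id"].unique()    # products from all files

# Every (user, product) combination that should appear in the result.
full_index = pd.MultiIndex.from_product(
    [users, products], names=["user_id", "product_id"]
)

per_user = (
    all_ratings[all_ratings["user_id"].isin(users)]
    .groupby(["user_id", "product_id"])["rating"]
    .mean()                  # assumption: average repeated entries
    .reindex(full_index)     # pairs without a rating become NaN
    .reset_index()
)
print(per_user)
```

Selecting a single user is then a plain filter, for example `per_user[per_user["user_id"] == "A9A2PR663ED1V"]`.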
By using pivot_table:
import pandas as pd
df = pd.read_csv('LCM1.csv')
df_new = df.pivot_table(index='user_id', columns='product_id', values='rating').rename_axis(None, axis=1)
print(df_new)
The result will be the following:
B000067A8B B00006AUMD B00006AUMZ B00067BBQE \
user_id
A10VJ7BDVCPKEZ NaN NaN NaN NaN
A16FRHL2ZC7EUR NaN NaN NaN NaN
A2DS24DHXUH0GM NaN NaN NaN NaN
A2NJO6YE954DBH 4.0 4.0 4.0 NaN
A2YNBDT3170PCR NaN NaN NaN NaN
A36MLTLVQFEQYL NaN NaN NaN NaN
A3HE4MYMWK4AER NaN NaN NaN NaN
A3QN84C38DE9FU NaN NaN NaN NaN
AAAQO894MG80B NaN NaN NaN NaN
A22HGAAO8KZ2N3 NaN NaN NaN NaN
A2JFZLAUG3YFQ7 NaN NaN NaN NaN
A2MRZDJF90JC1U NaN NaN NaN NaN
A31S5QUMFR8NH2 NaN NaN NaN NaN
A3AV8R0A62PP1N NaN NaN NaN NaN
A9A2PR663ED1V NaN NaN NaN 5.0
B0009XRZ92 B003A3R3ZY
user_id
A10VJ7BDVCPKEZ 5.0 NaN
A16FRHL2ZC7EUR 5.0 NaN
A2DS24DHXUH0GM 4.0 NaN
A2NJO6YE954DBH NaN NaN
A2YNBDT3170PCR 5.0 NaN
A36MLTLVQFEQYL 5.0 NaN
A3HE4MYMWK4AER 5.0 NaN
A3QN84C38DE9FU 5.0 NaN
AAAQO894MG80B 5.0 NaN
A22HGAAO8KZ2N3 5.0 NaN
A2JFZLAUG3YFQ7 5.0 NaN
A2MRZDJF90JC1U 5.0 NaN
A31S5QUMFR8NH2 5.0 NaN
A3AV8R0A62PP1N 5.0 NaN
A9A2PR663ED1V NaN 5.0
But what I want to do is take the user_id values from the first file only and search for the product_id and rating values in all files against that user_id.
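One possible way to get there with the same pivot_table approach is sketched below. It assumes every file has the same product_id, user_id and rating columns. The function name `ratings_for_first_file_users` is hypothetical, the in-memory `io.StringIO` objects stand in for real CSV paths, and the rows for the second file are made up for illustration.

```python
import io
import pandas as pd

def ratings_for_first_file_users(first_file, other_files):
    """Return a user x product rating table limited to users in first_file.

    Arguments may be paths or file-like objects, anything pd.read_csv accepts.
    """
    cols = ["product_id", "user_id", "rating"]
    first = pd.read_csv(first_file, usecols=cols)
    frames = [first] + [pd.read_csv(f, usecols=cols) for f in other_files]
    combined = pd.concat(frames, ignore_index=True)

    # Columns cover every product seen in any file; repeated
    # (user, product) pairs are averaged.
    wide = combined.pivot_table(index="user_id", columns="product_id",
                                values="rating", aggfunc="mean")

    # Keep only the users from the first file, in order of first appearance.
    return wide.reindex(first["user_id"].unique()).rename_axis(None, axis=1)

# In-memory stand-ins for two CSV files (second file's rows are made up).
file1 = io.StringIO(
    "product_id,user_id,user_name,rating\n"
    "B0009XRZ92,A2JFZLAUG3YFQ7,Entropy Babe,5\n"
    "B003A3R3ZY,A9A2PR663ED1V,Roger D. Goff,5\n"
)
file2 = io.StringIO(
    "product_id,user_id,user_name,rating\n"
    "B00067BBQE,A9A2PR663ED1V,Roger D. Goff,5\n"
    "B00067BBQE,OTHER_USER,Someone Else,2\n"
)

table = ratings_for_first_file_users(file1, [file2])
print(table)
```

With real data the call would look like `ratings_for_first_file_users('LCM1.csv', [...])`, with the remaining file names filled in. Users that appear only in the other files, such as `OTHER_USER` here, are dropped by the final `reindex`.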
I hope my question is clear. If anything is hard to understand, please comment below. Thanks.
Check whether this meets your requirements.
The output is: