How to compare two str values in a pandas DataFrame (Python)

Posted 2024-06-28 05:37:57

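I am trying to compare two different values in a DataFrame. The existing questions and answers I found were not ones I could apply to my case.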

import pandas as pd
# from datetime import timedelta

"""
read csv file
clean date column
convert date str to datetime
sort for equity options
replace date str column with datetime column
"""
trade_reader = pd.read_csv('TastyTrades.csv')
trade_reader['Date'] = trade_reader['Date'].replace({'T': ' ', '-0500': ''}, regex=True)
date_converter = pd.to_datetime(trade_reader['Date'], format="%Y-%m-%d %H:%M:%S")
options_frame = trade_reader.loc[(trade_reader['Instrument Type'] == 'Equity Option')]
clean_frame = options_frame.replace(to_replace=['Date'], value='date_converter')

# Separate opening transaction from closing transactions, combine frames
opens = clean_frame[clean_frame['Action'].isin(['BUY_TO_OPEN', 'SELL_TO_OPEN'])]
closes = clean_frame[clean_frame['Action'].isin(['BUY_TO_CLOSE', 'SELL_TO_CLOSE'])]
open_close_set = set(opens['Symbol']) & set(closes['Symbol'])
open_close_frame = clean_frame[clean_frame['Symbol'].isin(open_close_set)]

'''
convert Value to float
sort for trade readability
write
'''
ocf_float = open_close_frame['Value'].astype(float)
ocf_sorted = open_close_frame.sort_values(by=['Date', 'Call or Put'], ascending=True)
# for readability, revert back to ocf_sorted below
ocf_list = ocf_sorted.drop(
    ['Type', 'Instrument Type', 'Description', 'Quantity', 'Average Price', 'Commissions', 'Fees', 'Multiplier'], axis=1
    )
ocf_list.reset_index(drop=True, inplace=True)
ocf_list['Strategy'] = ''
# ocf_list.to_csv('Sorted.csv')

# create strategy list
debit_single = []
debit_vertical = []
debit_calendar = []
credit_vertical = []
iron_condor = []

# shift columns
ocf_list['Symbol Shift'] = ocf_list['Underlying Symbol'].shift(1)
ocf_list['Symbol Check'] = ocf_list['Underlying Symbol'] == ocf_list['Symbol Shift']

# compare symbols, append depending on criteria met
for row in ocf_list:
    if row['Symbol Shift'] is row['Underlying Symbol']:
        debit_vertical.append(row)

print(type(ocf_list['Underlying Symbol']))
ocf_list.to_csv('Sorted.csv')
print(debit_vertical)
# delta = timedelta(seconds=10)

The error I get is:

line 51, in <module>
    if row['Symbol Check'][-1] is row['Underlying Symbol'][-1]:
TypeError: string indices must be integers

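I am trying to compare the newly created shifted column with the original column and, if the two values are the same, append the row to a list. Is there a way to compare two string values in Python? I also tried checking whether Symbol Check is True, but it still returns the same error about str: indices must be integers. .iterrows() did not work either.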


1 answer
User
Answer 1 · posted 2024-06-28 05:37:57

Here you are actually iterating over the DataFrame's column labels, not its rows:

for row in ocf_list:
    if row['Symbol Shift'] is row['Underlying Symbol']:
        debit_vertical.append(row)
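
A minimal sketch of why this loop raises the TypeError, using a tiny made-up frame (the tickers are hypothetical; only the column names come from the question). Iterating a DataFrame directly yields its column labels, which are plain strings, so row['Symbol Shift'] ends up indexing a string with a string:

```python
import pandas as pd

# Hypothetical sample data; only the column names mirror the question.
df = pd.DataFrame({'Underlying Symbol': ['AAPL', 'AAPL', 'MSFT']})
df['Symbol Shift'] = df['Underlying Symbol'].shift(1)

# Iterating a DataFrame directly walks its column labels, not its rows.
labels = [row for row in df]
print(labels)

# Each label is a str, so indexing it with another str raises TypeError.
try:
    labels[0]['Symbol Shift']
    error_type = None
except TypeError as exc:
    error_type = type(exc).__name__
    print(error_type, exc)
```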

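You can iterate over rows with either the iterrows or the itertuples method, but neither hands you a row in the shape this loop expects. iterrows yields (index, Series) pairs, so the loop has to unpack two values before the row can be indexed by column name. itertuples yields namedtuples, whose fields are read by attribute or position, so row['Symbol Shift'] does not work on them.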
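
As a rough sketch of row iteration with both methods (the sample tickers are hypothetical, and the variable names matches_iterrows and matches_itertuples are introduced only for illustration): in current pandas, iterrows() yields (index, Series) pairs, and itertuples(index=False, name=None) yields plain tuples in column order.

```python
import pandas as pd

# Hypothetical sample data; only the column names mirror the question.
df = pd.DataFrame({'Underlying Symbol': ['AAPL', 'AAPL', 'MSFT']})
df['Symbol Shift'] = df['Underlying Symbol'].shift(1)

# iterrows() yields (index, Series) pairs; unpack both parts, then the
# Series can be indexed by column label.
matches_iterrows = []
for idx, row in df.iterrows():
    if row['Symbol Shift'] == row['Underlying Symbol']:
        matches_iterrows.append(idx)

# itertuples(index=False, name=None) yields plain tuples in column order,
# so the values are unpacked by position instead of by column name.
matches_itertuples = []
for pos, (symbol, shifted) in enumerate(df.itertuples(index=False, name=None)):
    if symbol == shifted:
        matches_itertuples.append(pos)

print(matches_iterrows, matches_itertuples)
```

Both loops skip the first row, because shift(1) leaves a missing value there and a missing value never compares equal to a string.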

Second, you should use == rather than is, because you are presumably comparing values, not identity.
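
A small illustration of the difference, with made-up strings: == compares the characters, while is asks whether both names refer to the very same object. For strings, the result of is depends on interpreter details such as interning, so it should not be relied on for value comparison.

```python
# Two strings with equal contents, built in different ways.
original = 'AAPL'
rebuilt = ''.join(['AA', 'PL'])  # constructed at run time

# == compares the contents of the two strings.
same_value = original == rebuilt

# "is" compares object identity; for strings the result is an
# implementation detail and may differ between interpreters.
same_object = original is rebuilt

print(same_value, same_object)
```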

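Finally, I would skip iterating over rows altogether, since pandas is designed for selecting rows based on a condition. You should be able to replace the code above with the following: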

debit_vertical = ocf_list[ocf_list['Symbol Shift'] == ocf_list['Underlying Symbol']].values.tolist()
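
To see what this one-liner does, here is a sketch on a small stand-in for ocf_list (the tickers and Value figures are invented for illustration; the real frame comes from the CSV in the question). The comparison produces a boolean Series, which then selects the matching rows:

```python
import pandas as pd

# Hypothetical stand-in for ocf_list; the data values are made up.
ocf_list = pd.DataFrame({
    'Underlying Symbol': ['AAPL', 'AAPL', 'MSFT', 'MSFT'],
    'Value': [-1.50, 2.10, -0.80, 1.25],
})
ocf_list['Symbol Shift'] = ocf_list['Underlying Symbol'].shift(1)

# Element-wise comparison gives one boolean per row. The first row is
# compared against the missing value left by shift(1), which is False.
mask = ocf_list['Symbol Shift'] == ocf_list['Underlying Symbol']

# Boolean indexing keeps only the rows where the mask is True.
debit_vertical = ocf_list[mask].values.tolist()
print(debit_vertical)
```

Each inner list holds one matching row in column order, so debit_vertical ends up as a list of lists rather than a list of Series objects.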
