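<p>To take this a step further, I replace every column whose strings are all either valid or empty with its parsed datetimes, then raise an error for the remaining columns that could not be parsed:</p>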
<pre><code>dtCols = ['eventDate', 'registerDate']
dts = dfBad[dtCols].apply(lambda x: pd.to_datetime(x, errors='coerce', format='%m/%d/%Y'))
mask = pd.isnull(dts) & (dfBad[dtCols] != '')
colHasError = mask.any()
invalidCols = colHasError[colHasError].index.tolist()
validCols = list(set(dtCols) - set(invalidCols))
dfBad[validCols] = dts[validCols] # replace the completely valid/empty string cols with dates
if colHasError.any():
    raise ValueError("bad dates in col(s) {0}".format(invalidCols))
# raises: ValueError: bad dates in col(s) ['registerDate']
print(dfBad) # eventDate got converted, registerDate didn't
</code></pre>
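<p>However, the accepted answer contains the main insight: go ahead and coerce errors to <code>NaT</code>, then use a mask to distinguish strings that are non-empty but invalid from strings that are simply empty.</p>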
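<p>As a minimal, self-contained sketch of that masking idea: the sample frame <code>df</code> and the helper name <code>find_invalid_dates</code> below are hypothetical and not part of the original question, and the date format is assumed to be <code>%m/%d/%Y</code> as in the snippet above.</p>

```python
import pandas as pd

# Hypothetical sample data; the original dfBad is not shown in this answer.
df = pd.DataFrame({
    "eventDate": ["01/15/2020", "", "03/02/2021"],
    "registerDate": ["02/30/2020", "", "12/25/2019"],  # Feb 30 does not exist
})


def find_invalid_dates(frame, cols, fmt="%m/%d/%Y"):
    """Parse cols as dates and flag cells that are non-empty but unparseable."""
    # errors="coerce" turns both empty and malformed strings into NaT.
    parsed = frame[cols].apply(
        lambda s: pd.to_datetime(s, errors="coerce", format=fmt)
    )
    # A cell is a real error only if parsing failed AND the input was not "".
    invalid = parsed.isna() & (frame[cols] != "")
    return parsed, invalid


parsed, invalid = find_invalid_dates(df, ["eventDate", "registerDate"])
print(invalid.any())  # per-column flag: does the column contain a bad date?
```

<p>Here an empty string becomes <code>NaT</code> without being flagged, while the impossible date in <code>registerDate</code> is the only cell marked invalid. Note that this sketch treats only <code>''</code> as "empty"; a true <code>NaN</code> in the input would compare unequal to <code>''</code> and so would be flagged.</p>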