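I want to convert the code below (which runs in pandas) into code that runs in cuDF.
Sample data from the .head() of the series being manipulated is inserted into the original code in the third code cell, so it should run as a straight copy/paste.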
# both are float columns now
# rawcensustractandblock
s_rawcensustractandblock = df_train['rawcensustractandblock'].apply(lambda x: str(x))
# adjust/set new tract number
df_train['census_tractnumber'] = s_rawcensustractandblock.str.slice(4,11)
# adjust block number
df_train['block_number'] = s_rawcensustractandblock.str.slice(start=11)
df_train['block_number'] = df_train['block_number'].apply(lambda x: x[:4]+'.'+x[4:]+'0' )
df_train['block_number'] = df_train['block_number'].apply(lambda x: int(round(float(x),0)) )
df_train['block_number'] = df_train['block_number'].apply(lambda x: str(x).ljust(4,'0') )
# series of values from df_train['rawcensustractandblock'].head()
data = pd.Series([60371066.461001, 60590524.222024, 60374638.00300401,
60372963.002002, 60590423.381006])
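Below is how the code looks when it uses the data provided above instead of the entire dataframe.
Judging by the errors I hit while trying to convert it, the problem is at the Series level, so converting the cell below to execute in cuDF should solve it.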
import pandas as pd
# series of values from df_train['rawcensustractandblock'].head()
data = pd.Series([60371066.461001, 60590524.222024, 60374638.00300401,
60372963.002002, 60590423.381006])
# how the first line looks using the series
s_rawcensustractandblock = data.apply(lambda x: str(x))
# adjust/set new tract number
census_tractnumber = s_rawcensustractandblock.str.slice(4,11)
# adjust block number
block_number = s_rawcensustractandblock.str.slice(start=11)
block_number = block_number.apply(lambda x: x[:4]+'.'+x[4:]+'0' )
block_number = block_number.apply(lambda x: int(round(float(x),0)) )
block_number = block_number.apply(lambda x: str(x).ljust(4,'0') )
df_train['census_tractnumber'].head()
# out
0 1066.46
1 0524.22
2 4638.00
3 2963.00
4 0423.38
Name: census_tractnumber, dtype: object
df_train['block_number'].head()
0 1001
1 2024
2 3004
3 2002
4 1006
Name: block_number, dtype: object
For-loop solution
pandas (original code)
cuDF (solution code)
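You can use cuDF string methods (via nvStrings) to do almost everything you want here. You lose some precision when converting these floats to strings in cuDF (though that may not matter for the example above), so for this example I simply did the conversion beforehand. If possible, I would recommend creating rawcensustractandblock as a string column in the first place, rather than as a float column.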
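To make the "start from strings" suggestion concrete, here is a minimal pure-Python sketch of the per-value logic from the question, applied to values that are already strings. It is not cuDF code and not the answer's solution. The helper name `split_tract_block` is hypothetical, and the raw values are the question's sample data written as string literals. It only shows that once the input is a string, the tract and block fields come from slicing, with no float-to-string conversion in the way.

```python
from typing import Tuple


def split_tract_block(raw: str) -> Tuple[str, str]:
    """Split a raw census tract-and-block string into (tract, block).

    Hypothetical helper mirroring the question's pandas steps:
    characters 4..10 are the tract number; everything from index 11 on
    holds the block digits, which are rounded to a whole number.
    """
    tract = raw[4:11]
    block_digits = raw[11:]
    # Same trick as the question: put a decimal point after the first
    # four digits, then round to the nearest integer.
    block_value = float(block_digits[:4] + "." + block_digits[4:] + "0")
    block = str(int(round(block_value))).ljust(4, "0")
    return tract, block


# The question's sample values, kept as strings from the start.
raw_values = [
    "60371066.461001",
    "60590524.222024",
    "60374638.00300401",
    "60372963.002002",
    "60590423.381006",
]

for raw in raw_values:
    print(raw, split_tract_block(raw))
```

In a column-oriented library the same slicing would be done with vectorized string methods rather than a Python loop; the loop here is only for illustration.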