I am building a DataFrame from the output of a shell command, with the following code:
import os
import pandas as pd
df_output_lines = [s.split() for s in os.popen("my command linecode").read().splitlines()]
df_output_lines = list(filter(None, df_output_lines))
Then I convert it to a DataFrame:
df=pd.DataFrame(df_output_lines)
df
The data looks like this:
abc = pd.DataFrame([['time:"08:59:38.000"', 'instance:"(null)"','id:"3214039276626790405"'],['time:"08:59:38.000"', 'instance:"(Ops-MacBook-Pro.local)"','id:"3214039276626790405"'],['time:"08:59:38.000"', 'instance:"(Ops-MacBook-Pro.local)"','id:"3214039276626790405"']])
abc
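I want to transform it so that the text before the : becomes the column name and the text inside the double quotes " " becomes the value, and I want this for every column, giving a frame with the columns time, instance and id.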
This is what I have tried so far:
abc.rename(columns={0:'time',1:'instance',2:'id'},inplace=True)
and then:
abc['time'] = abc['time'].map(lambda x: str(x)[:-1])
abc['time'] = abc['time'].map(lambda x: str(x)[6:])
abc['instance'] = abc['instance'].map(lambda x: str(x)[:-1])
abc['instance'] = abc['instance'].map(lambda x: str(x)[10:])
abc['id'] = abc['id'].str.extract(r'(\d+)', expand=False).astype(int)
Any suggestions for a lambda expression or a one-liner would be appreciated.
My log output looks like this:
time:"11:22:20.000" instance:"(null)" id:"723927731576482920" channel:"sip:confctl.com" type:"control" elapsedtime:"0.000631" level:"info" operation:"Init" message:"Initialize (version 4.9.0002.30618) ... "
time:"11:22:21.000" instance:"Ops-MacBook-Pro.local" id:"723927731576482920" channel:"sip:confctl.com" type:"control" elapsedtime:"0.067122" level:"info" operation:"Connect" message:"Connecting to https://hrpd.www.vivox.com/api2/"
time:"11:22:23.000" instance:"Ops-MacBook-Pro.local" id:"723927731576482920" channel:"sip:confctl-.com" type:"control" elapsedtime:"2.685700" level:"info" operation:"Connect" message:"Connected to https://hrpd.www.vivox.com/api2/"
time:"11:22:23.000" instance:"Ops-MacBook-Pro.local" id:"723927731576482920" channel:"sip:confctl-.com" type:"control" elapsedtime:"2.814268" level:"info" operation:"Login" message:"Logged in .tester_food."
time:"11:22:23.000" instance:"Ops-MacBook-Pro.local" id:"723927731576482920" channel:"sip:confctl-.com" type:"control" elapsedtime:"2.912255" level:"error" operation:"Call" message:".tester_food. failed to join sip:confctl-2@hrpd.vivox.com error:Access token has invalid signature(403)"
time:"12:30:41.000" instance:"Ops-MacBook-Pro.local" id:"10316899144153251411" channel:"sip:confctl-2@hrpd.vivox.com" type:"media" sampleperiod:"0.000000" incomingpktsreceived:"0" incomingpktsexpected:"0" incomingpktsloss:"0" incomingpktssoutoftime:"0" incomingpktsdiscarded:"0" outgoingpktssent:"0" predictedmos:"3" latencypktssent:"0" latencycount:"0" latencysum:"0.000000" latencymin:"0.000000" latencymax:"0.000000" callid:"2477580077" r_factor:"0.000000"
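Given your sample input, which comes from your os.popen command: filter out the blank lines and run shlex.split on each line, so that spaces inside the quotes are preserved (the quotes themselves are removed). That gives you rows, where rows[0] is the list of key:value tokens for the first log line. Then partition each token on the first : to separate the key from the value, and feed the result to pd.DataFrame, which gives you the df.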
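The steps above can be sketched roughly as follows. This is a minimal illustration, not the answerer's original code: raw_output is a hypothetical stand-in for the text returned by os.popen(...).read(), and the two log lines are shortened copies of the sample (some fields dropped for brevity).

```python
import shlex
import pandas as pd

# Hypothetical stand-in for os.popen("my command linecode").read();
# the lines are shortened versions of the sample log, with a blank line between them.
raw_output = '''time:"11:22:20.000" instance:"(null)" id:"723927731576482920" level:"info" message:"Initialize (version 4.9.0002.30618) ... "

time:"11:22:21.000" instance:"Ops-MacBook-Pro.local" id:"723927731576482920" level:"info" message:"Connecting to https://hrpd.www.vivox.com/api2/"
'''

# Skip blank lines; shlex.split keeps quoted spaces together and drops the quotes.
rows = [shlex.split(line) for line in raw_output.splitlines() if line.strip()]

# partition(':') splits on the first colon only, so "11:22:20.000" stays intact;
# [::2] keeps the key and the value and discards the separator.
records = [dict(token.partition(':')[::2] for token in row) for row in rows]

# A list of dicts becomes one row per dict; keys missing from a row become NaN.
df = pd.DataFrame(records)
print(df)
```

Because each row is turned into its own dictionary, lines with different sets of keys (such as the type:"media" line in the log) still line up by column name. One caveat under this approach: shlex treats an apostrophe as a quote character, so a message containing one would need extra handling.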
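Although an answer has already been given, I would like to add a regex-based approach that achieves the same result. Just pass regex=True when applying the pattern to the DataFrame.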
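The original snippet for this answer is not shown, so the following is only a sketch of one way a regex-based version might look, assuming DataFrame.replace with regex=True is what was meant. The pattern and the names col_names and cleaned are illustrative choices. It uses the abc frame from the question and assumes every row has the same keys in the same order.

```python
import pandas as pd

abc = pd.DataFrame([
    ['time:"08:59:38.000"', 'instance:"(null)"', 'id:"3214039276626790405"'],
    ['time:"08:59:38.000"', 'instance:"(Ops-MacBook-Pro.local)"', 'id:"3214039276626790405"'],
    ['time:"08:59:38.000"', 'instance:"(Ops-MacBook-Pro.local)"', 'id:"3214039276626790405"'],
])

# Take the column names from the first row: the text before the first colon.
col_names = abc.iloc[0].str.split(':').str[0].tolist()

# ^\w+:   the key and the colon at the start of the cell
# "(.*)"$ the quoted value, captured as group 1
# With regex=True and a string replacement, each cell is rewritten to group 1.
cleaned = abc.replace(r'^\w+:"(.*)"$', r'\1', regex=True)
cleaned.columns = col_names
print(cleaned)
```

All values remain strings here; converting id to an integer type would be a separate step.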
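Feed a list of dictionaries to pd.DataFrame
The pd.DataFrame constructor accepts a list of dictionaries directly. You can use str.rstrip and str.split in a list comprehension to build them.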
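A minimal sketch of that idea, assuming the tokens look like the ones in the question's abc frame (the names rows and records are illustrative):

```python
import pandas as pd

# Tokens as produced by s.split() on each line, taken from the question's sample.
rows = [
    ['time:"08:59:38.000"', 'instance:"(null)"', 'id:"3214039276626790405"'],
    ['time:"08:59:38.000"', 'instance:"(Ops-MacBook-Pro.local)"', 'id:"3214039276626790405"'],
]

# rstrip('"') removes the closing quote; split(':"', 1) then separates the key
# from the value at the colon plus opening quote, leaving colons in the value alone.
records = [
    dict(token.rstrip('"').split(':"', 1) for token in row)
    for row in rows
]

df = pd.DataFrame(records)
print(df)
```

This relies on each token being a complete key:"value" pair. Values that contain spaces, such as the message field in the log, are broken apart by a plain s.split() and would need a quote-aware splitter first.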
It is not clear what logic you use to decide that only the 'null' string should be surrounded by parentheses.