<blockquote>
<p>I have assumed that the data you provided is meant to be a list of JSON objects, because there is a "," between your entries.</p>
</blockquote>
<p>This is my <code>input.json</code> file. Note that I added <code>[</code> and <code>]</code> at the top and bottom, because that gives your data a valid JSON structure.</p>
<pre><code>[
  {
    "campaignId": "all",
    "startDate": "2020-06-11",
    "endDate": "2020-06-11",
    "device": "Computers",
    "network": "Display Network",
    "channel": "all",
    "accLevelQS": -1.0,
    "impressions": 389,
    "clicks": 3,
    "ctr": 0.0,
    "avgCPC": 0.0,
    "convValuePerClick": 0.0,
    "convValuePerCost": 0.0,
    "costConv1PerClick": 0.0,
    "convRate1PerClick": 0.0,
    "cost": 0.142884,
    "conv1PerClick": 0.0,
    "totalConvValue": 0.00,
    "allConversions": 0.0,
    "allConversionValue": 0.00,
    "avgPosition": 0.0,
    "intr": 3,
    "searchImprShare": 0.0,
    "contImprShare": 5.0,
    "impressionShare": 5.0
  },
  {
    "campaignId": "all",
    "startDate": "2020-06-11",
    "endDate": "2020-06-11",
    "device": "Mobile devices with full browsers",
    "network": "Display Network",
    "channel": "all",
    "accLevelQS": -1.0,
    "impressions": 6101,
    "clicks": 90,
    "ctr": 0.0,
    "avgCPC": 0.0,
    "convValuePerClick": 0.0,
    "convValuePerCost": 0.0,
    "costConv1PerClick": 0.0,
    "convRate1PerClick": 0.0,
    "cost": 4.342799,
    "conv1PerClick": 0.0,
    "totalConvValue": 0.00,
    "allConversions": 0.0,
    "allConversionValue": 0.00,
    "avgPosition": 0.0,
    "intr": 90,
    "searchImprShare": 0.0,
    "contImprShare": 5.0077566465021217,
    "impressionShare": 5.0077566465021217
  }
]
</code></pre>
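<p>To illustrate why the surrounding brackets matter, here is a minimal sketch using only the standard <code>json</code> module. The two shortened records and the variable names are made up for illustration; the point is that comma-separated objects on their own are not a valid JSON document, while the same text wrapped in <code>[</code> and <code>]</code> parses as a list.</p>

```python
import json

# Two comma-separated objects, shortened versions of the entries above.
raw = '{"device": "Computers", "clicks": 3},\n{"device": "Mobile", "clicks": 90}'

# Without the brackets the parser stops after the first object and complains.
try:
    json.loads(raw)
    parsed_without_brackets = True
except json.JSONDecodeError:
    parsed_without_brackets = False

# Wrapping the text in [ and ] turns it into a JSON list of objects.
records = json.loads("[" + raw + "]")

print(parsed_without_brackets)
print(len(records), records[1]["clicks"])
```

<p>If your files do not already contain the brackets, wrapping the file contents like this before parsing is one possible way to handle them.</p>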
<p>The code below uses the <code>pandas</code> library to load the data into a DataFrame and then writes it to a CSV file.</p>
<pre><code>import json  # Available by default, no install required.
import glob  # Available by default, no install required.

import pandas as pd  # Requires installation via pip.

# Initialise a list to store our results.
combined_json = []

# Set a glob pattern to *.txt since your files are txt files.
# You can also write the full path e.g. /home/user/textfiles/*.txt
text_files = glob.glob("*.txt")

# Loop through all the text files and combine them into a single JSON list.
# As for 70,000 files, I am unsure how the performance will turn out.
for json_text in text_files:
    with open(json_text, 'r') as text_file:
        # Each file is expected to contain a JSON list of objects.
        combined_json.extend(json.load(text_file))

# Write all the records to a single JSON file for future use.
# You can also read directly from the combined_json variable.
with open('input.json', 'w') as json_file:
    json.dump(combined_json, json_file, indent=2)

# Convert the JSON data into a dataframe, using the combined_json variable.
json_df = pd.json_normalize(combined_json)

# Write the data from the dataframe to the CSV file.
# Mode "w" will always overwrite the CSV file; use mode "a" to append instead of overwriting.
json_df.to_csv("dataframe.csv", mode="w")
</code></pre>
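<p>Regarding the 70,000 files: if holding every record in memory turns out to be a problem, one alternative is to write rows to the CSV as each file is read. The following is only a sketch using the standard <code>csv</code> module instead of <code>pandas</code>. The function name <code>stream_json_files_to_csv</code> and the demo file names are hypothetical, and it assumes every record has the same keys as the first one, as in the sample data.</p>

```python
import csv
import glob
import json
import os
import tempfile


def stream_json_files_to_csv(paths, csv_path):
    """Write the records of each JSON-list file to csv_path, one file at a time."""
    writer = None
    rows_written = 0
    with open(csv_path, "w", newline="") as csv_file:
        for path in paths:
            with open(path, "r") as text_file:
                records = json.load(text_file)
            for record in records:
                if writer is None:
                    # Assumption: the first record's keys define the columns.
                    writer = csv.DictWriter(csv_file, fieldnames=list(record))
                    writer.writeheader()
                writer.writerow(record)
                rows_written += 1
    return rows_written


# Small demo with two hypothetical files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp_dir:
    for name, clicks in [("a.txt", 3), ("b.txt", 90)]:
        with open(os.path.join(tmp_dir, name), "w") as f:
            json.dump([{"device": name, "clicks": clicks}], f)

    out_path = os.path.join(tmp_dir, "combined.csv")
    paths = sorted(glob.glob(os.path.join(tmp_dir, "*.txt")))
    row_count = stream_json_files_to_csv(paths, out_path)

    # Read the result back to check what was written.
    with open(out_path, newline="") as f:
        csv_rows = list(csv.DictReader(f))

print(row_count, csv_rows)
```

<p>Only one file's records are held in memory at a time here. The trade-off is that you lose the flattening that <code>pd.json_normalize</code> performs, so this only suits flat records like the ones shown above.</p>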
<p>For more information on how <code>pd.json_normalize</code> works, you can refer <a href="https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.json_normalize.html" rel="nofollow noreferrer">here</a>.</p>
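<p>As a small illustration of what <code>pd.json_normalize</code> does, the sketch below feeds it two records with a nested object. The nested <code>stats</code> key is hypothetical (the sample data in this answer is flat) and is only there to show how nested fields become dotted column names.</p>

```python
import pandas as pd

# Hypothetical records: "stats" is nested purely for demonstration.
records = [
    {"campaignId": "all", "stats": {"clicks": 3, "impressions": 389}},
    {"campaignId": "all", "stats": {"clicks": 90, "impressions": 6101}},
]

# Nested dictionaries are flattened into columns named "parent.child".
df = pd.json_normalize(records)

print(sorted(df.columns))
print(df["stats.clicks"].tolist())
```

<p>For flat records such as those in <code>input.json</code>, the result has one column per key and one row per record.</p>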
<p>To get started with the <code>pandas</code> library, you can refer <a href="https://pandas.pydata.org/pandas-docs/stable/getting_started/index.html" rel="nofollow noreferrer">here</a>.</p>
<p>If you want to load JSON from a string instead of a file, you can refer <a href="https://www.w3schools.com/python/python_json.asp" rel="nofollow noreferrer">here</a>.</p>
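<p>To learn more about <code>glob</code>, you can refer <a href="https://www.w3schools.com/php/func_filesystem_glob.asp" rel="nofollow noreferrer">here</a>. Note that this page describes PHP's <code>glob</code> function; Python's <code>glob</code> module uses similar wildcard patterns such as <code>*.txt</code>.</p>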