Splitting a JSONArray into smaller JSONArrays in Java
I have a situation where an org.json.JSONArray object grows very large, which eventually causes latency and other problems. We therefore decided to split the JSONArray into smaller chunks. For example, if the JSONArray looks like this:
- [{"alt_party_id_type":"xyz","first_name":"child1ss","status":"1","dob":"2014-10-02 00:00:00.0","last_name":"childSs"},
{"alt_party_id_type":"xyz","first_name":"suga","status":"1","dob":"2014-11-05 00:00:00.0","last_name":"test"},
{"alt_party_id_type":"xyz","first_name":"test4a","status":"1","dob":"2000-11-05 00:00:00.0","last_name":"test4s"},
{"alt_party_id_type":"xyz","first_name":"demo56","status":"0","dob":"2000-11-04 00:00:00.0","last_name":"Demo5"},
{"alt_party_id_type":"xyz","first_name":"testsss","status":"1","dob":"1900-01-01 00:00:00.0","last_name":"testssssssssss"},
{"alt_party_id_type":"xyz","first_name":"Demo1234","status":"0","dob":"2014-11-21 00:00:00.0","last_name":"Demo1"},
{"alt_party_id_type":"xyz","first_name":"demo2433","status":"1","dob":"2014-11-13 00:00:00.0","last_name":"demo222"},
{"alt_party_id_type":"xyz","first_name":"demo333","status":"0","dob":"2014-11-12 00:00:00.0","last_name":"demo344"},
{"alt_party_id_type":"xyz","first_name":"Student","status":"1","dob":"2001-12-03 00:00:00.0","last_name":"StudentTest"}]
then I need help dividing it into three JSONArrays:
- [{"alt_party_id_type":"xyz","first_name":"child1ss","status":"1","dob":"2014-10-02 00:00:00.0","last_name":"childSs"}, {"alt_party_id_type":"xyz","first_name":"suga","status":"1","dob":"2014-11-05 00:00:00.0","last_name":"test"}, {"alt_party_id_type":"xyz","first_name":"test4a","status":"1","dob":"2000-11-05 00:00:00.0","last_name":"test4s"}]
- [{"alt_party_id_type":"xyz","first_name":"demo56","status":"0","dob":"2000-11-04 00:00:00.0","last_name":"Demo5"}, {"alt_party_id_type":"xyz","first_name":"testsss","status":"1","dob":"1900-01-01 00:00:00.0","last_name":"testssssssssss"}, {"alt_party_id_type":"xyz","first_name":"Demo1234","status":"0","dob":"2014-11-21 00:00:00.0","last_name":"Demo1"}]
- [{"alt_party_id_type":"xyz","first_name":"demo2433","status":"1","dob":"2014-11-13 00:00:00.0","last_name":"demo222"}, {"alt_party_id_type":"xyz","first_name":"demo333","status":"0","dob":"2014-11-12 00:00:00.0","last_name":"demo344"}, {"alt_party_id_type":"xyz","first_name":"Student","status":"1","dob":"2001-12-03 00:00:00.0","last_name":"StudentTest"}]
Can anyone help me with this? I have tried many approaches, but all of them failed.
# Answer 1
When processing huge input files, you should use a streaming approach rather than loading the whole document into memory: it reduces the memory footprint, avoids
OutOfMemoryError
, and makes it possible to start processing while the input is still being read. JSONArray has almost no support for streaming, so I suggest looking into Jackson's streaming API, GSON streaming, or something similar. That said, if you insist on using JSONArray, you can piece together a streaming approach using JSONTokener. Below is an example program that streams the input and produces separate JSON documents, each containing at most 10 elements.
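The original example program was not preserved in this extract; the following is a minimal reconstruction sketch of the JSONTokener technique the answer describes. It assumes org.json is on the classpath; the method name `split`, the `batchSize` parameter, and returning the chunks as a list (rather than writing one file per chunk, as the answer's program did) are my own choices for illustration.

```java
import org.json.JSONArray;
import org.json.JSONException;
import org.json.JSONTokener;

import java.io.Reader;
import java.util.ArrayList;
import java.util.List;

public class JsonSplit {
    // Splits a top-level JSON array into chunks of at most batchSize
    // elements, pulling one element at a time through JSONTokener so the
    // whole document never has to sit in memory at once. A production
    // version would write each chunk to its own output file instead of
    // collecting them in a list.
    public static List<JSONArray> split(Reader in, int batchSize) throws JSONException {
        JSONTokener tokener = new JSONTokener(in);
        if (tokener.nextClean() != '[') {
            throw new IllegalArgumentException("expected a JSON array");
        }
        List<JSONArray> chunks = new ArrayList<>();
        JSONArray current = new JSONArray();
        char c = tokener.nextClean();
        if (c != ']') {                 // non-empty array
            tokener.back();             // push the first element's char back
            while (true) {
                current.put(tokener.nextValue());   // read one element
                if (current.length() == batchSize) {
                    chunks.add(current);
                    current = new JSONArray();
                }
                c = tokener.nextClean();            // ',' between elements, ']' at end
                if (c == ']') {
                    break;
                }
                if (c != ',') {
                    throw new IllegalArgumentException("malformed array near position " + c);
                }
            }
        }
        if (current.length() > 0) {
            chunks.add(current);        // final, possibly short, chunk
        }
        return chunks;
    }
}
```

For the nine-element array in the question, `split(reader, 3)` yields exactly the three JSONArrays shown above.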
To see why large files call for a streaming approach, download or create a large JSON file, then try running a straightforward implementation that does not stream. The answer used a short Perl one-liner (not reproduced in this extract) to create a JSON array with 1,000,000 elements and a file size of roughly 16 MB.
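Since the Perl one-liner is missing here, this stdlib-only Java generator (the class name `MakeBigJson` and the `{"id":N}` element shape are my own assumptions) produces a comparable test file:

```java
import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;

public class MakeBigJson {
    // Writes a JSON array of `count` small objects to `path`, emitting
    // one element at a time so the generator itself needs almost no memory.
    public static void write(String path, int count) throws IOException {
        try (BufferedWriter out = new BufferedWriter(new FileWriter(path))) {
            out.write("[");
            for (int i = 0; i < count; i++) {
                if (i > 0) {
                    out.write(",");
                }
                out.write("{\"id\":" + i + "}");
            }
            out.write("]");
        }
    }

    public static void main(String[] args) throws IOException {
        // Roughly comparable to the answer's test input: a million elements.
        write("big.json", 1_000_000);
    }
}
```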
If you run
JsonSplit
on this input, it completes quickly with a small memory footprint, producing 100,000 files with 10 elements in each, and it starts emitting output files immediately on startup. By contrast, if you run the following
JsonSplitNaive
program, which reads the entire JSON document in one go, it will visibly do nothing for a long time and then terminate with an OutOfMemoryError.
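The JsonSplitNaive program itself is also missing from this extract; the sketch below reconstructs the non-streaming approach being criticized (again assuming org.json; the `split` helper returning a list is my own framing so the slicing logic is visible):

```java
import org.json.JSONArray;

import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class JsonSplitNaive {
    // Naive slicing: assumes the whole array is already parsed in memory.
    public static List<JSONArray> split(JSONArray all, int batchSize) {
        List<JSONArray> chunks = new ArrayList<>();
        for (int start = 0; start < all.length(); start += batchSize) {
            JSONArray chunk = new JSONArray();
            int end = Math.min(start + batchSize, all.length());
            for (int i = start; i < end; i++) {
                chunk.put(all.get(i));
            }
            chunks.add(chunk);
        }
        return chunks;
    }

    public static void main(String[] args) throws Exception {
        // Reads the entire file into one String, then parses the entire
        // array -- on a huge input, this is where the program stalls and
        // eventually dies with OutOfMemoryError, before any output appears.
        String text = new String(Files.readAllBytes(Paths.get(args[0])));
        for (JSONArray chunk : split(new JSONArray(text), 10)) {
            System.out.println(chunk);
        }
    }
}
```

The slicing loop itself is identical in spirit to the streaming version; the difference is only that everything is parsed up front, which is exactly what a large input cannot afford.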