
Parsing Very Large JSON Files (Java)

Parsing a very large JSON file

1. Requirement

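A recent project required exporting a JSON file larger than 800 MB to Excel. I tried reading the file line by line and reading it as a stream with JSONReader, but the file was so large that both attempts ended in an OOM (OutOfMemoryError).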

2. Solution

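The JSON array in the file contains so many JSON objects that loading them into memory, whether line by line or through the stream reader, overflows the heap.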

In the end I went with a solution based on JsonToken, the token type of Jackson's streaming parser.

package com.godfrey.poi.util;


import com.fasterxml.jackson.core.JsonFactory;
import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.core.JsonToken;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.MappingJsonFactory;

import java.io.File;

/**
 * @author godfrey
 * @since 2021-12-05
 */
public class ParseJsonUtil {
    public static void main(String[] args) throws Exception {
        JsonFactory f = new MappingJsonFactory();
        // createParser replaces the deprecated createJsonParser;
        // try-with-resources closes the parser and the underlying file
        try (JsonParser jp = f.createParser(new File("F:/FeaturesToJSON.json"))) {
            JsonToken current = jp.nextToken();
            if (current != JsonToken.START_OBJECT) {
                System.out.println("Error: root should be object: quitting.");
                return;
            }
            while (jp.nextToken() != JsonToken.END_OBJECT) {
                String fieldName = jp.getCurrentName();
                // move from field name to field value
                current = jp.nextToken();
                if ("features".equals(fieldName)) {
                    if (current == JsonToken.START_ARRAY) {
                        // For each of the records in the array
                        while (jp.nextToken() != JsonToken.END_ARRAY) {
                            // read the record into a tree model,
                            // this moves the parsing position to the end of it
                            JsonNode node = jp.readValueAsTree();
                            // And now we have random access to everything in the object;
                            // path() yields a "missing" node rather than null when a field is absent
                            System.out.println("field1: " + node.path("field1").asText());
                            System.out.println("field2: " + node.path("field2").asText());
                        }
                    } else {
                        System.out.println("Error: features should be an array: skipping.");
                        jp.skipChildren();
                    }
                } else {
                    System.out.println("Unprocessed property: " + fieldName);
                    jp.skipChildren();
                }
            }
        }
    }
}

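The code reads the file with a combination of streaming and tree-model parsing. Each individual record is read into a tree structure, but the file as a whole is never loaded into memory, so the JVM heap does not blow up. Only one record is held at a time, which is what finally solved the problem of reading the very large file.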
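
To make the same loop reusable for the export step, one option is to wrap it in a method that hands each record to a callback. The following is only a minimal sketch under a few assumptions: the class name FeatureStreamer, the method forEachFeature and the tiny inline JSON sample are hypothetical and not part of the original project, and the input is taken from a Reader so the example can run without the 800 MB file.

```java
import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.core.JsonToken;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.MappingJsonFactory;

import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.function.Consumer;

// Hypothetical helper, not from the original article.
public class FeatureStreamer {

    /**
     * Streams the elements of the top-level "features" array to the handler,
     * one JsonNode at a time, and returns how many elements were handled.
     */
    public static int forEachFeature(Reader input, Consumer<JsonNode> handler) throws IOException {
        int count = 0;
        // MappingJsonFactory attaches an ObjectMapper, which readValueAsTree() needs
        try (JsonParser jp = new MappingJsonFactory().createParser(input)) {
            if (jp.nextToken() != JsonToken.START_OBJECT) {
                throw new IOException("root should be a JSON object");
            }
            while (jp.nextToken() != JsonToken.END_OBJECT) {
                String fieldName = jp.getCurrentName();
                // move from field name to field value
                JsonToken value = jp.nextToken();
                if ("features".equals(fieldName) && value == JsonToken.START_ARRAY) {
                    while (jp.nextToken() != JsonToken.END_ARRAY) {
                        // only this single record is materialized as a tree
                        JsonNode node = jp.readValueAsTree();
                        handler.accept(node);
                        count++;
                    }
                } else {
                    // skip any other property (no-op for scalar values)
                    jp.skipChildren();
                }
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Small made-up sample standing in for the real file
        String json = "{\"type\":\"FeatureCollection\","
                + "\"features\":[{\"field1\":\"a\"},{\"field1\":\"b\"}]}";
        int n = forEachFeature(new StringReader(json),
                node -> System.out.println(node.path("field1").asText()));
        System.out.println("records: " + n);
    }
}
```

For the real file, a FileReader or an InputStream-based overload could be passed in instead of the StringReader, and the handler could append one row per record to a streaming Excel writer. Whatever the handler does, memory stays bounded only as long as it does not keep the nodes around after processing them.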