Python JSON serialization: excluding certain fields

Published 2024-09-27 00:21:12


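Summary

I have a Python object hierarchy that I want to serialize using JSON (just via https://docs.python.org/3/library/json.html, without any extra third-party library). I want to exclude certain fields/attributes/sub-objects. I have found it surprisingly hard to find a simple answer on how to achieve this.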

Example

I will have a derived class instance that ends up like this:

class MyItemClass(BaseItemClass):
    def __init__(self):
        super().__init__()
        self.saveThisProperty = 999
        self.dontSaveThisProperty = "Something"
        self.saveThisObject = ObjectType1()
        self.dontSaveThisObject = ObjectType2()

If I were serializing to XML, I would want it to look like

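<MyItemClass>
    <saveThisProperty>999</saveThisProperty>
    <saveThisObject>
        ...
    </saveThisObject>
</MyItemClass>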

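Note that I only serialize specific attributes/sub-objects, and I do not want to serialize the whole BaseItemClass from which the class instance is derived.

With XML I am fine. I know how to output pieces of XML as I go along for what I want, either to a temporary in-memory document that I save at the end, or by writing individual nodes/elements to a stream incrementally. I do not have to serialize everything. E.g.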

xmlStream.writeStartElement("MyItemClass")
    xmlStream.writeElementWithValue("saveThisProperty", 999)
    xmlStream.writeStartElement("saveThisObject")
        ...
    xmlStream.writeEndElement("saveThisObject")
xmlStream.writeEndElement("MyItemClass")

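For JSON I cannot do that, can I? Do I need to create some new, "standalone" object hierarchy (with no derivation from BaseClass) by copying only the attributes/sub-objects I want into it, and then JSON-serialize that?

I did see that there is json.dump(default = ...), but the documentation states: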

If specified, default should be a function that gets called for objects that can’t otherwise be serialized. It should return a JSON encodable version of the object

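However, it is not that the original objects cannot be serialized by the default Python->JSON mechanism; it is that I do not want that default, serialize-everything behaviour. I want my own "selective" behaviour.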
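A minimal sketch of how the `default` hook from the quoted documentation behaves, which may help with the concern above. It is only consulted for objects the encoder cannot serialize natively, so a plain dict never reaches it, while an instance of an ordinary custom class does. The `Point` class, the `default_hook` function and the `calls` list are hypothetical names used purely for illustration.

```python
import json

calls = []  # records the type name of every object passed to the hook


class Point:
    """Hypothetical class; instances are not natively JSON-serializable."""

    def __init__(self, x, y, secret):
        self.x = x
        self.y = y
        self.secret = secret  # attribute we do not want in the output


def default_hook(obj):
    calls.append(type(obj).__name__)
    if isinstance(obj, Point):
        # Return only the attributes to keep; json then encodes this dict.
        return {"x": obj.x, "y": obj.y}
    raise TypeError(f"Cannot serialize {type(obj).__name__}")


# A plain dict/list structure is handled natively; the hook is not called.
plain = json.dumps({"a": [1, 2]}, default=default_hook)

# A Point instance cannot be handled natively, so the hook is called for it.
custom = json.dumps({"p": Point(1, 2, "hidden")}, default=default_hook)
```

Since instances of ordinary custom classes (ones that do not subclass `dict`, `list`, `str` and so on) fall into the "can't otherwise be serialized" category, the hook does get the chance to pick fields selectively.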


2 Answers

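I am the OP. I am posting here to clarify what I ended up using in my case.

I have marked @Sina Rezaei's post in this thread as the accepted solution, because that (the last part of his post) and @snakecharmerb's comments are what helped me understand what was needed.

The outline of my solution looks like this: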

import json

# QGraphicsScene and QAbstractListModel are assumed to be imported from the
# Qt binding in use (e.g. PyQt5 or PySide2).

class ModelScene(QGraphicsScene):

  # Serialize whole scene to JSON into stream
  def json_serialize(self, stream) -> None:
    # Get `json.dump()` to call `ModelScene.json_serialize_dump_obj()` on
    # every object it cannot serialize natively
    json.dump(self, stream, indent=4, default=ModelScene.json_serialize_dump_obj)

  # Static method to be called from `json.dump(default=ModelScene.json_serialize_dump_obj)`
  # This method is called on every object that `json` cannot serialize by itself
  @staticmethod
  def json_serialize_dump_obj(obj):
    # if object has a `json_dump_obj()` method call that...
    if hasattr(obj, "json_dump_obj"):
      return obj.json_dump_obj()
    # ...else report the failure; returning `obj` unchanged here would make
    # `json` fail with a misleading "Circular reference detected" error
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")

  # Return dict object suitable for serialization via json.dump()
  # This one is in `ModelScene(QGraphicsScene)` class
  def json_dump_obj(self) -> dict:
    return {
      "_classname_": self.__class__.__name__,
      "node_data": self.node_data
      }

class CanvasModelData(QAbstractListModel):

  # Return dict object suitable for serialization via json.dump()
  # This one is class CanvasModelData(QAbstractListModel)
  def json_dump_obj(self) -> dict:
    _data = {}
    for key, value in self._data.items():
      _data[key] = value
    return {
      "_classname_": self.__class__.__name__,
      "data_type": self.data_type,
      "_data": _data
      }
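
  • Each "complex" class defines a def json_dump_obj(self) -> dict: method.
  • That method returns just the attributes/sub-objects wanted in the serialization.
  • The top-level json.dump(self, stream, default=ModelScene.json_serialize_dump_obj) serializes each node it visits incrementally to the stream, via the static method ModelScene.json_serialize_dump_obj. That calls myobj.json_dump_obj() if available; otherwise the default JSON serialization for basic object types is used.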
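The pattern in the bullets above can be tried without Qt. The following is a minimal sketch under that assumption: `Scene` and `NodeData` are hypothetical stand-ins for `ModelScene` and `CanvasModelData`, and the attributes `view_state` and `cache` are invented here only to show fields being left out.

```python
import io
import json


def dump_obj(obj):
    """Hook for json.dump(default=...): delegate to json_dump_obj() if present."""
    if hasattr(obj, "json_dump_obj"):
        return obj.json_dump_obj()
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


class NodeData:  # hypothetical stand-in for CanvasModelData
    def __init__(self):
        self.data_type = "points"
        self._data = {"a": 1}
        self.cache = object()  # deliberately not serialized

    def json_dump_obj(self) -> dict:
        return {
            "_classname_": type(self).__name__,
            "data_type": self.data_type,
            "_data": dict(self._data),
        }


class Scene:  # hypothetical stand-in for ModelScene
    def __init__(self):
        self.node_data = NodeData()
        self.view_state = object()  # deliberately not serialized

    def json_dump_obj(self) -> dict:
        # node_data is returned as an object; json calls the hook again for it.
        return {"_classname_": type(self).__name__, "node_data": self.node_data}


stream = io.StringIO()
json.dump(Scene(), stream, indent=4, default=dump_obj)
result = json.loads(stream.getvalue())
```

Each object decides for itself what to expose, and nested objects are resolved by further calls to the hook, so no separate "standalone" copy of the hierarchy has to be built by hand.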

Interestingly, I came across someone who had the same concern as me. From "What is the difference between json.dump() and json.dumps() in python?", answer https://stackoverflow.com/a/57087055/489865:

In memory usage and speed.

When you call jsonstr = json.dumps(mydata) it first creates a full copy of your data in memory and only then you file.write(jsonstr) it to disk. So this is a faster method but can be a problem if you have a big piece of data to save.

When you call json.dump(mydata, file) without 's', new memory is not used, as the data is dumped by chunks. But the whole process is about 2 times slower.

Source: I checked the source code of json.dump() and json.dumps() and also tested both the variants measuring the time with time.time() and watching the memory usage in htop.
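As a small illustration of the difference quoted above, here is a sketch showing that `json.dumps()` returns one complete string, `json.dump()` writes to a stream, and `json.JSONEncoder.iterencode()` yields the output piece by piece. The sample `data` is made up for the example, and the sketch makes no claim about the speed or memory figures in the quote.

```python
import io
import json

data = {"items": list(range(5)), "name": "demo"}  # made-up sample data

# json.dumps() builds and returns the whole JSON text as a single string.
text = json.dumps(data)

# json.dump() writes the JSON text to a file-like object instead.
stream = io.StringIO()
json.dump(data, stream)

# iterencode() exposes the encoder's output as a sequence of string chunks.
chunks = list(json.JSONEncoder().iterencode(data))
```

Both routes produce the same JSON text; they differ only in how that text is delivered.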

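I can think of three solutions for your situation:

Solution 1: Use the third-party library Pykson and define the fields you want to be serialized as pykson fields.

Sample: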

import pykson

class MyItemClass(pykson.JsonObject):
    saved_property = pykson.IntegerField()

my_object = MyItemClass(saved_property=1, accept_unknown=True)
my_object.unsaved_property = 2
pykson.Pykson().to_json(my_object)

Disclaimer: I am the developer of the pykson library.

Solution 2: The second solution is to use a wrapper class with a custom default serializer.

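One way Solution 2 might look is sketched below. This is an assumption, not the answer author's code: `MyItemWrapper` and `default_handler` are hypothetical names, and `ObjectType1` is given a made-up `value` attribute so that the example runs.

```python
import json


class ObjectType1:
    def __init__(self):
        self.value = 1  # made-up attribute for illustration


class MyItemClass:
    def __init__(self):
        self.saveThisProperty = 999
        self.dontSaveThisProperty = "Something"
        self.saveThisObject = ObjectType1()


class MyItemWrapper:
    """Hypothetical wrapper that copies only the fields to serialize."""

    def __init__(self, item):
        self.saveThisProperty = item.saveThisProperty
        self.saveThisObject = item.saveThisObject


def default_handler(obj):
    # Assumption: any object json cannot encode is dumped via its __dict__.
    try:
        return vars(obj)
    except TypeError:
        raise TypeError(
            f"Object of type {type(obj).__name__} is not JSON serializable"
        ) from None


wrapped_json = json.dumps(MyItemWrapper(MyItemClass()), default=default_handler)
```

In this sketch the wrapper filters only the top level; the nested `ObjectType1` is written out with all of its attributes unless it gets a wrapper of its own.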

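Solution 3: This may be a bad idea, but if you have a deep hierarchy, you can also add a function to every class that will be serialized, use that function to get a dictionary, and then easily convert the dictionary to JSON.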

import json

class MyChildClass:
    def __init__(self, serialized_property, not_serialized_property):
        self.serialized_property = serialized_property
        self.not_serialized_property = not_serialized_property

    def to_dict(self):
        # only add serialized property here
        return {
            "serialized_property": self.serialized_property
        }

class MyParentClass:
    def __init__(self, child_property, some_other_property):
        self.child_property = child_property
        self.some_other_property = some_other_property

    def to_dict(self):
        return {
            'child_property': self.child_property.to_dict(),
            'some_other_property': self.some_other_property
        }

my_child_object = MyChildClass(serialized_property=1, not_serialized_property=2)
my_parent_object = MyParentClass(child_property=my_child_object, some_other_property='some string here')
json.dumps(my_parent_object.to_dict())

You can also get the same result with a default handler:

import json

class MyChildClass:
    def __init__(self, serialized_property, not_serialized_property):
        self.serialized_property = serialized_property
        self.not_serialized_property = not_serialized_property

    def to_dict(self):
        # only add serialized property here
        return {
            "serialized_property": self.serialized_property
        }

class MyParentClass:
    def __init__(self, child_property, some_other_property):
        self.child_property = child_property
        self.some_other_property = some_other_property

    def to_dict(self):
        return {
            'child_property': self.child_property,
            'some_other_property': self.some_other_property
        }

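def handle_default(obj):
    # called by json.dumps() only for objects it cannot serialize natively
    if isinstance(obj, (MyChildClass, MyParentClass)):
        return obj.to_dict()
    # raise instead of returning None, which would silently emit `null`
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")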

my_child_object = MyChildClass(serialized_property=1, not_serialized_property=2)
my_parent_object = MyParentClass(child_property=my_child_object, some_other_property='some string here')
json.dumps(my_parent_object, default=handle_default)
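A variation on the default-handler approach, offered only as a sketch: instead of listing classes with `isinstance`, a `json.JSONEncoder` subclass can call `to_dict()` on any object that provides one. `ToDictEncoder`, `Child` and `Parent` are hypothetical names, not part of the answer above.

```python
import json


class ToDictEncoder(json.JSONEncoder):
    """Encoder that uses an object's to_dict() method when it has one."""

    def default(self, obj):
        to_dict = getattr(obj, "to_dict", None)
        if callable(to_dict):
            return to_dict()
        # The base implementation raises TypeError for unsupported objects.
        return super().default(obj)


class Child:
    def __init__(self):
        self.kept = 1
        self.dropped = 2  # left out of to_dict(), so never serialized

    def to_dict(self):
        return {"kept": self.kept}


class Parent:
    def __init__(self):
        self.child = Child()

    def to_dict(self):
        # The Child object is returned as-is; the encoder resolves it.
        return {"child": self.child}


encoded = json.dumps(Parent(), cls=ToDictEncoder)
```

With this shape, adding a new serializable class only requires giving it a `to_dict()` method; the handler itself does not need to change.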
