
I have a Contract class that is composed of lists of pages and paragraphs. The pages and paragraphs are defined as separate EmbeddedDocument classes. (Note that a Paragraph can span multiple pages, so it cannot be a child of Page.)

Each Page and each Paragraph, in turn, has a list of Line objects. A line belongs to a page and to a paragraph at the same time.

The structure of the classes is as follows:

import mongoengine
from bson import ObjectId

class Line(mongoengine.EmbeddedDocument):
    _id = mongoengine.ObjectIdField(required=True, default=ObjectId)
    text = mongoengine.StringField()

class Page(mongoengine.EmbeddedDocument):
    lines = mongoengine.ListField(mongoengine.EmbeddedDocumentField(Line))

class Paragraph(mongoengine.EmbeddedDocument):
    lines = mongoengine.ListField(mongoengine.EmbeddedDocumentField(Line))

class Contract(mongoengine.Document):
    pages = mongoengine.ListField(mongoengine.EmbeddedDocumentField(Page))
    paragraphs = mongoengine.ListField(mongoengine.EmbeddedDocumentField(Paragraph))

When I create a new contract and add a line to both the first page and paragraph, the line is one single object. This can be seen below:

# create a new contract
contract = Contract()

# create a new line
line = Line()
line.text = 'This is a test'

# create a new page and add the new line
page = Page()
page.lines.append(line)
contract.pages.append(page)

# create a new paragraph and add the new line
paragraph = Paragraph()
paragraph.lines.append(line)
contract.paragraphs.append(paragraph)

contract.save()

print(contract.pages[0].lines[0] is contract.paragraphs[0].lines[0])
>> True

print(contract.pages[0].lines[0]._id)
>> 5e7b85ebd3844b44ee1a0c8e

print(contract.paragraphs[0].lines[0]._id)
>> 5e7b85ebd3844b44ee1a0c8e

The problem is that after I save the Contract object to MongoDB and then load it again in Python, the line objects are no longer the same object. They still have the same _id, but if I test for identity with `is`, it now returns False:

print(contract.pages[0].lines[0] is contract.paragraphs[0].lines[0])
>> False

print(contract.pages[0].lines[0]._id)
>> 5e7b85ebd3844b44ee1a0c8e

print(contract.paragraphs[0].lines[0]._id)
>> 5e7b85ebd3844b44ee1a0c8e

This is a problem, because when I now update the Line under Page, the change will not be reflected in Paragraph.

Is there a way to ensure that Python/MongoEngine understand that the line is the same?

I'm running Python 3.6, mongoengine 0.19.1

RogB

1 Answer


`is` returns True if two variables point to the same object; `==` returns True if the objects the variables refer to are equal.
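To make the distinction concrete, here is a minimal sketch in plain Python. The `Line` dataclass below is only a stand-in for illustration, not the mongoengine class from the question; it assumes equality is defined by comparing field values, which is what `@dataclass` generates.

```python
from dataclasses import dataclass


@dataclass
class Line:
    # Stand-in for the question's Line; @dataclass generates a
    # field-by-field __eq__ for us.
    text: str


a = Line("This is a test")
b = a                        # second name for the same object
c = Line("This is a test")   # separate object with equal contents

same_object = a is b         # identity: both names refer to one object
equal_contents = a == c      # equality: field values match
distinct_objects = a is not c
```

Here `a` and `c` compare equal, yet changing `a.text` would leave `c` untouched, because they are two separate objects.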

See also: Is there a difference between "==" and "is"?

Change to:

print(contract.pages[0].lines[0] == contract.paragraphs[0].lines[0])

Explanation

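Before you saved your contract instance, both the page and the paragraph held the same Line instance.

When you saved it, pymongo serialized your object into a BSON structure. Because Line is an embedded document, its data was written twice: once inside the page and once inside the paragraph.

When you retrieve the contract from MongoDB, Python creates new instances for the pages and paragraphs from the BSON structure (deserialization), so each stored copy of the line becomes its own object.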
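The effect of a serialize/deserialize round trip can be reproduced without a database. The sketch below uses plain dicts and the standard `json` module as a stand-in for BSON and pymongo; that substitution is an assumption made purely for illustration, but the mechanism (a shared object is written out twice and read back as two objects) is the same.

```python
import json

# One line object shared by a page and a paragraph, as in the question.
line = {"_id": "5e7b85ebd3844b44ee1a0c8e", "text": "This is a test"}
contract = {
    "pages": [{"lines": [line]}],
    "paragraphs": [{"lines": [line]}],
}

# Before the round trip, both lists hold the very same object.
shared_before = contract["pages"][0]["lines"][0] is contract["paragraphs"][0]["lines"][0]

# Serializing writes the line's data out twice; nothing in the output
# records that the two copies came from one object.
stored = json.dumps(contract)
loaded = json.loads(stored)

page_line = loaded["pages"][0]["lines"][0]
para_line = loaded["paragraphs"][0]["lines"][0]

shared_after = page_line is para_line   # identity is lost
still_equal = page_line == para_line    # contents still match

# Updating one copy no longer affects the other.
page_line["text"] = "Updated"
```

After the last line, `para_line["text"]` still holds the original text, which is exactly the symptom described in the question.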

Valijon
  • But that was exactly my question. I need to have only one instance of `Line`. So they must point to the same object when I retrieve from MongoDB. – RogB Mar 25 '20 at 19:48
  • @RogB Actually, it's bad practice to work with the same instance in different objects; it can give unexpected results as your app grows. All primitive types are immutable, except lists, dicts, objects... Pymongo deserialization creates cloned, totally independent objects. The answer is no. – Valijon Mar 25 '20 at 19:59
  • I understand what you're saying, but in this case they must be the same instance. The same line belongs to both a page and a paragraph. They must be the same or, when I try to update the line, it will only be updated in the page and not the paragraph. – RogB Mar 25 '20 at 20:03
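One possible workaround, not proposed anywhere in this thread, is to re-link the duplicates by id after loading, so that each id maps to a single instance again. The sketch below uses plain dataclasses as stand-ins; `Container`, `line_id` (standing in for `_id`), and the helper `relink_lines` are hypothetical names, and whether this pattern behaves well with mongoengine's own list types and change tracking is an untested assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Line:
    line_id: str   # stands in for the question's _id field
    text: str


@dataclass
class Container:
    # Stand-in for both Page and Paragraph.
    lines: List[Line] = field(default_factory=list)


def relink_lines(*containers: Container) -> None:
    """Hypothetical helper: keep the first Line seen for each id and
    point every later occurrence of that id at the same instance."""
    canonical: Dict[str, Line] = {}
    for container in containers:
        for i, line in enumerate(container.lines):
            container.lines[i] = canonical.setdefault(line.line_id, line)


# Simulate the state after loading: two separate objects, same id.
page = Container([Line("5e7b85ebd3844b44ee1a0c8e", "This is a test")])
paragraph = Container([Line("5e7b85ebd3844b44ee1a0c8e", "This is a test")])

relink_lines(page, paragraph)
page.lines[0].text = "Updated"   # now visible through the paragraph too
```

The stored data would still contain two copies of each line; the helper only restores shared identity in memory, so it has to be run after every load.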