Adam's Debugging Adventures: The Immutable Mutable Object

Adam Williamson

2018-06-27 16:03

Here's a puzzle for you, Python fans:

[adamw@adam dnf (master %)]$ python3
Python 3.6.5 (default, Apr 23 2018, 22:53:50) 
[GCC 8.0.1 20180410 (Red Hat 8.0.1-0.21)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from dnf.conf import MainConf
>>> mc = MainConf()
>>> print(mc.group_package_types)
['mandatory', 'default', 'conditional']
>>> mc.group_package_types.append('foo')
>>> print(mc.group_package_types)
['mandatory', 'default', 'conditional']
>>>

Note: if you want to try reproducing this...make sure you use DNF 3. It works as expected with DNF < 3. That's why it just showed up as a problem.

Before I explain what's going on there...let's unpick the problem a bit for non-Pythonistas.

In Python (and in many other languages) some things - objects - are 'mutable', and some are 'immutable'. A 'mutable' object can be changed. An 'immutable' object cannot.

In the Python session above, we create an object, mc, which is an instance of the class MainConf (don't worry if you don't entirely understand that, it's not compulsory). We then examine one attribute of mc: mc.group_package_types. Python tells us that this is ['mandatory', 'default', 'conditional'] - which is a Python list containing the values 'mandatory', 'default' and 'conditional'.

In Python, lists are 'mutable'. That means you can take a list and change it somehow - remove an item from it, add an item to it, re-order it - and it's still the same object. Any existing reference to the object is still valid, and now refers to the changed list.

For comparison, an example of an 'immutable' object is a string. If you do this:

mystring = "here's some text"

you can't then change the actual string object referenced by mystring there. It has no methods that let you change it in any way, and there are no functions that can operate on it and change it. You can do this:

mystring = "here's some text"
mystring = "here's some different text"
mystring = mystring.replace("some", "some more")

and at each step the contents of the string to which the name mystring refers are different - but also, at each step mystring refers to a DIFFERENT object. (That's why the replace() string method returns a new string - it can't mutate the existing string). So if you did this:

mystring = "here's some text"
otherref = mystring
mystring = "here's some different text"

then at the end, otherref still points to the first-created string object and its value is still "here's some text", while mystring points to a new string object and its value is "here's some different text". Let's compare with a similar case using a mutable object:

mylist = [1, 2, 3]
otherref = mylist
mylist.append(4)
print(mylist)
print(otherref)

In this case, when we get to the end, both mylist and otherref are still pointing to the same object, the original object, and both prints will print [1, 2, 3, 4]. No new list object was created at any point after the initial creation of mylist.

So with that understood, take a look back at the original example, and maybe you can see why it's so weird. By all appearances, it looks like we have a pretty simple scenario here: we have an object that has an attribute which is a list, and we just append a value to that list. Then we go look at its value again, and it...hasn't changed at all? But we didn't get any kind of crash, or error, or anything. Our append call returned apparently successfully. It just...didn't seem to change anything. The list is an immutable mutable object!

This is a real problem in real code: it broke the most recent Fedora Rawhide compose. So, obviously, I was pretty keen to figure out what the hell was going on, here! It turns out that it's down to dnf getting clever (arguably over-clever).

Python's a very...flexible language. The key to the problem here turned out to be exactly what happens when we actually look up the group_package_types attribute of mc, the dnf.conf.MainConf instance.

Getting and setting attributes of objects in Python is usually a kind of core operation that you never really think about, you just expect it to work in the 'usual way'. A simple approximation of how it works is that the object has a Python dict (like a 'hash' in some languages - a key/value store, more or less) whose keys are attribute names and whose values are the attribute values. When you ask for an attribute of an object, Python checks if its name is one of the keys in that object's dict, and if it is, returns the value. If it's not, Python raises an error. When you set an attribute, Python stuffs it into the dict.

But since Python's flexible, it provides some mechanisms to let you mess around with this stuff, if you want to. You can define __setattr__, __getattr__ and/or __getattribute__ methods in a class, and they'll affect this behaviour.

The base object class that almost all Python classes inherit from defines the default __setattr__ and __getattribute__, which work sort of like the approximation I gave above. If you override __setattr__ in a class, then when something tries to set an attribute for an instance of that class, that method will get called instead of the default object.__setattr__. If you override __getattribute__, then that method will get called instead of object.__getattribute__ when something tries to look up an attribute of an instance of that class.

If you leave __getattribute__ alone but define __getattr__, then when something tries to look up an attribute, first the stock object.__getattribute__ will be used to try and look it up, but if that doesn't find it, rather than raising an exception immediately, Python will try your __getattr__ method.

We can actually override __setattr__ and __getattr__ to do a very simplified demonstration of how the default implementation usually works:

#!/bin/python3

class TestClass(object):
    def __init__(self):
        self.__dict__["_attrs"] = {}

    def __setattr__(self, name, value):
        print("Hi, setattr here, just settin' attrs...")
        self._attrs[name] = value

    def __getattr__(self, name):
        print("Hi, getattr here, here's your attr!")
        return self._attrs[name]

tc = TestClass()
tc.foo = [1, 2, 3]
print(tc.foo)
tc.foo.append(4)
print(tc.foo)

Note that __dict__ is the store that object.__getattribute__ uses, so that's why we set up our backing store with self.__dict__["_attrs"] = {} - that ensures that when we look up self._attrs, we will find it via __getattribute__. We can't just do self._attrs = {} because then we wind up in an infinite recursion in our __getattr__.

If you save that and run it, you'll see [1, 2, 3] then [1, 2, 3, 4] (plus the messages that prove our new methods are being used). Our mutable attribute is nice and properly mutable. We can append things to it and everything. Notice that when we append the value, we hit __getattr__ but not __setattr__.

So, how does this manage not to work with dnf config classes? (MainConf is a subclass of BaseConfig, and so are a ton of other config-related classes in dnf - we actually encountered this bug with another subclass, RepoConf). It turns out to be because dnf overrides BaseConfig.__setattr__ and BaseConfig.__getattr__ to do some "clever" stuff, and it breaks this!

We don't need to go into what its __setattr__ does in detail, except to note one thing: it doesn't store the values in the __dict__ store, so object.__getattribute__ can never find them. When looking up any attribute on an instance of one of these classes except _config (which is the store the class' __setattr__ and __getattr__ methods themselves use, just like _attrs in our example, and is created directly in __dict__ in the same way), we always wind up in the class's __getattr__.

Here's the whole of current dnf's BaseConfig.__getattr__:

def __getattr__(self, name):
    option = getattr(self._config, name, None)
    if option is None:
        return None
    try:
        value = option().getValue()
    except Exception as ex:
        return None
    if isinstance(value, cfg.VectorString):
        return list(value)
    if isinstance(value, str):
        return ucd(value)
    return value

There is some more stuff going on in the background here that we don't need to worry about too much (a feature of DNF, I have found, is that it has layers upon layers. It contains multitudes. You usually can't debug anything in DNF without going through at least eight levels of things calling other things that turn out to be yet other things that turn out to be written in C++ just cuz.) In the case of the group_package_types option, and also the option we were actually dealing with in the buggy case (the baseurl option for a repo), the option is basically a list-y type, so we wind up in the if isinstance(value, cfg.VectorString): branch here.

So if you follow it through, when we asked for mc.group_package_types, unlike in the default Python implementation or our simplified example, we didn't just pull an existing list object out from some sekrit store in the mc object. No. We got some kind of object (fact fans: it's a libdnf.conf.OptionStringList - libdnf is the C++ bit I mentioned earlier...) out of the self._config dict that's acting as our sort-of attribute store, and ran its getValue method to get some other sort of object (fact fans: it's a libdnf.conf.VectorString), then we ran list() on that object, and returned that.

The problem is that the thing that gets returned is basically a temporary copy of the 'real' backing value. It's a mutable object - it really is a list! - and we can mutate it...but the next time anyone looks up the same attribute we looked up to get that list, they won't get the same list object we got. This wacky __getattr__ will run through the same indirection maze and return a new listified copy of the backing value. Every time you look up the attribute, it does that. We can mutate the copies all we like, but doing that doesn't actually change the backing value.

Now, it's easy enough to work around this, once you know what's going on. The overridden __setattr__, of course, actually does change the backing store. So if you explicitly try to set an attribute (rather than getting one and mutating it), that does work:

[adamw@adam dnf (master %)]$ python3
Python 3.6.5 (default, Apr 23 2018, 22:53:50) 
[GCC 8.0.1 20180410 (Red Hat 8.0.1-0.21)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from dnf.conf import MainConf
>>> mc = MainConf()
>>> print(mc.group_package_types)
['mandatory', 'default', 'conditional']
>>> mc.group_package_types = mc.group_package_types + ['foo']
>>> print(mc.group_package_types)
['mandatory', 'default', 'conditional', 'foo']
>>>

That time, it worked because we didn't try to mutate our magical immutable mutable object. We just flat out replaced it with a new list. When we explicitly set the attribute like that, we hit the overridden __setattr__ method, which does the necessary magic to write the new value to the backing store.

But any regular Pythonista who sees that the instance attribute foo.bar is a mutable object is naturally going to assume that they can go ahead and mutate it. That's just...standard. They aren't going to figure they should ignore the fact that it's a mutable object and just replace it every time they want to change it. That's exactly what we did do in the code that got broken. That's the exact code that used to work with DNF 2 but no longer does with DNF 3.

So, that took a few hours to figure out! But I got there in the end. I really just wanted to write up my latest 'interesting' debugging adventure! But if there's a moral to this story...I guess it's "think really hard about whether messing with core behaviour like this is the right way to go about implementing your special thing"?

Oh, and by the way: comments should be working again now! I just disabled the plugin that was interfering with them. So, you know, comment away.

Comments