I have to warn you. This is a bit wacky, and I have no illusion that any of this will go into Python. It's just some stuff that Python has inspired me to think about, mainly because Python exposes namespaces to the programmer, and programmers do cool things with it. How far can that go? (You're supposed to run away yelling when you hear someone say "How far can something go?") This idea touches on stackless (because stack frames become objects, which are on the heap) and nested scopes, too, if that'll help you run screaming. But if you're ready to read some crazy stuff, here it is:
[2012-12-02] I wrote this article in 2003 and didn't realize that there was a more general pattern underneath. See Steve Yegge's article on properties.
[2017-01-23] Older versions of Lua have something like this. Just as you can modify the lookup process on objects with metatables, you can modify the lookup process on scopes with the _ENV
table. Then you can use this to implement objects, classes, modules, stack frames, with statement, methods of modules, changing parents, etc. Cool!
[2020] Smalltalk has something like this too, which shouldn't surprise me.
Namespace Unification
Python and many other OO languages have a nice way of looking up
field names in objects — they look in an object, then in
the class, then superclass, and so on. It's a chained lookup
system. Python also uses a chained lookup for variable names.
It first looks in locals()
and then looks in
globals()
. If we have nested scopes in P3k, it
will look in the local namespace, then in the parent namespace,
then in its parent, and so on until we reach the global
namespace. Jim Fulton's ExtensionClass
implements
a chained lookup system that can go from one object to its
"container" to find values.
I'd like to see all of the chained namespace implementations unified. Objects, classes, modules, activation records ("stack frames")— all are namespaces underneath.
In this proposal, the __parent__
name is treated
specially in an object. For looking up a name, you first check
the namespace and then if you don't have the name, you ask your
parent. If you have no __parent__
, you give a
NameError
. For assigning to a name, you always
assign the name in the current namespace, and you never look at
the parent.
Note that this is just like what we do now, except we call the
special field "__class__
" if it's in an object and
"__bases__
" if it's a class. In this system,
"ordinary" objects would simply have their class as the
__parent__
. An ordinary class would have its base
class as its __parent__
, as long as there's no
multiple inheritance [1]. Environmental
acquisition (as implemented in
ExtensionClass
) would have
its container has its __parent__
. At this point,
an ordinary class or an acquisition class is no different than a
"normal" object, so we may not need classes. (Already in
Python, classes and objects aren't so different.)
The lookup semantics (chaining for getattr
and not
chaining for setattr
) almost match both how fields
in objects work and how local variables work [2]. So we could make namespaces into objects
too. In the following example:
# module spam.py x = 5 def foo(y): print x x = 8 print x def cheese(z): print x, y, z return spam bar = foo(10) bar(35)
The global (module) namespace is an object. (In fact, it's the module
object.) It's the object <x = 5, foo = function....>. The function
foo needs to remember its scope, so its co_namespace
points to the
spam module object.
When we call foo(10)
, we need to create a new
namespace for the local variables. This is an object <y=10,
__parent__ = spam-module>. When we print x, we look for x in
this object, and it isn't there. So we look for x in the
__parent__
, which is the spam module object. It's
there, so we return 5 and print it. The next thing we do is
assign x to 8. Since there's no chaining on assignment, we
create an entry x=8 in the namespace object, which becomes
<y=10, x=8, __parent__ = spam-module>. Now we try to
print x again. We look in the namespace object and find x is 8,
so that's what we print. Next we define a function cheese,
which gets its co_namespace
set to the current
namespace object. Then we return that function object.
The global module now gets bar=function... added to it.
And we call bar(35)
. What happens here? When we
call a function, we have to create a new namespace object where
the parent is set to the co_namespace
. So that's
<z=35, __parent__ = <y=10, x=8, __parent__ =
spam-module> >. So when we try to print x, y, z, we find
8, 10, 35.
Does that all make sense?
Summary: rules for variable lookup look and smell a lot like rules for object lookup. If we combine the two, we can also put in classes, acquisition classes, and modules. It's "everything's an object" pushed a bit farther, into the area of local variables. I would keep the syntax that distinguishes classes from objects, but it'd be just syntax— nothing deeply different underneath.
Notes
- Methods on objects
-
We have to resolve the issue of "self" being a magic argument. If objects and classes are the same, then either
def foo(self, x)
or.self.foo = lambda x:
.. isn't consistent.We might want to say "method" introduces an unbound method and "def" introduces a bound method (i.e., a function that isn't going to take 'self').
- A magic "self" variable
-
We could have it so that when you invoke a method, it sets not only
__parent__
to theco_namespace
, but also sets self to the object. That way, you'd definefoo()
without listing "self", but you'd still have a "self" object and you'd access fields with "self.f".method foo(x): print self.f + x
I'm not convinced this is the right thing to do.
- Inline objects
-
It'd be nice to have a lightweight way of declaring objects, like the <name=value, name=value, ...> syntax I used in the examples above. If objects are everything, you want to be able to make them all the time. It's possible to use
{}
even though{}
is used for dictionaries. If it's{key:value, ..}
then it's a dictionary and if it's{name=value, ..}
then it's an object. (Maybe objects and dictionaries become the same thing underneath.) But that may be too confusing. - The "global" keyword
-
We still want some way of getting to the next level of scope. Since objects and namespaces are combined, and the object has a special
__parent__
, the__parent__
is exposed as a local variable. So we could do:x = 5 def foo(): __parent__.x = 10
to assign to the global x. If it's nested more:
x = 5 def foo(): def cheese(): __parent__.__parent__.x = 5
Not ideal, but then, I don't deal with globals too much anyway. Maybe '
global
' can be a magic variable (not keyword) that gets set to the last non-empty__parent__
. Then you'd end up doing:x = 5 def foo(): def cheese(): global.x = 5
- Cycles and refcounting
-
If we do nested scopes, we already get the cycles problem. I don't think making namespaces into objects makes anything worse.
- Modules
-
Modules already exhibit this unification of namespaces and objects. From inside the module, you can refer to variables. From outside the module, you refer to fields. Unifying namespaces and objects wouldn't really do anything to modules. Instead, it'd make all namespaces behave like a module in that they're objects.
- Methods on modules
-
I think you could get methods (including things like
__repr__
) on modules free here. I haven't thought about it enough, but my intuition tells me that since classes have gone away, and you can put methods on objects, you should be able to put methods on modules too. We just have resolve the bound/unbound method issue. - The "
with
" statement from Pascal -
I suspect you can do
with
pretty easily but I'm not sure about the details. It may require multiple inheritance (aiee) if you want to be able to see local variables at the same time you see an object's fields. - Assignment to
__parent__
-
So far I've been assuming that
__parent__
is something set by Python for internal use. For objects, it's set when you call a class constructor. For classes, it's set when you declare the class using the "class" construct. For namespaces, it's set when you call a function.But really, the cool stuff is really when the user can assign to
__parent__
!For example,
__parent__
= self would make it so that you can assign to fields in your object without using "self.
". (You'd also lose your local and global variables, which would discourage anyone from doing this.) The environmental acquisition aspect ofExtensionClass
is pretty easy now — you just create a raw object and set its__parent__
to its container. You could "reparent" an object on the fly to change its class. Or you could use prototype-based programming techniques, as in Self. - Efficiency
-
It'll probably be harder to optimize name lookup with something like this (especially if you can change
__parent__
). However .. with only one underlying implementation for namespaces, objects, modules, and classes, any optimization work you perform here will benefit all of those language features. - Related language features
-
Simula's objects were originally activation records / stack frames. So making the two similar or even the same is a really old idea. Smalltalk makes everything an object, so my guess is that activation records too were objects. Smalltalk blocks are objects too. JavaScript tries to do something that unifies the two, but gets many aspects totally wrong, by mixing up static scopes and dynamic activation records. (For example, functions get an object with local variables when they're declared, even though they haven't been called yet!) Pascal and JavaScript have a "with" statement that lets you view an object's fields as variables.
Footnotes
[1] One snag is multiple inheritance. I don't think we really need multiple inheritance once we have nested scopes, but MI is a really controversial issue. Here's an example of MI:
class TCPServer: ... class ForkingMixIn: ... class ForkingTCPServer(ForkingMixIn, TCPServer): pass
What I'd do instead is write ForkingMixIn
this way:
def ForkingMixIn(AnyServer): class ForkingMixIn(AnyServer): ... return ForkingMixIn
(Note: I wouldn't want to use this syntax, but this illustrates the underlying construct.)
Now I can write ForkingTCPServer
this way:
ForkingTCPServer = ForkingMixIn(TCPServer)
The advantage of this over MI (in Python) is that you can call
the methods from the base class. Right now, ForkingMixIn
can't
easily define method foo()
to call
TCPServer
's foo()
and then do
one more thing. With the above setup, ForkingMixIn
has a handle
to the base class, and can call AnyServer.foo(self)
.
Alternatively, a namespace could have a __parents__
tuple, and it could search each one of them. However, this would
be unnecessary for variable scoping (unless your mind is really
twisted!). It'd be useful for the "with" statement, but it seems
confusing.
[2] One thing I find confusing in Python is that a local name definition applies even before it's been executed:
x = 5 def foo(): print x x = 3 print x
IMO, this should print 5 and then 3. The unified namespace
proposal would do this. However, Python currently gives a
NameError
.