Since this has been posted to lobste.rs and a lot of people seem to be missing the point, here’s the tl;dr:
The C standard bakes in enough details about pointers such that the amount of memory a C program can access (even on a hypothetical infinite-memory machine) is bounded and statically known. Access to an unbounded amount of memory is necessary (but not sufficient) for Turing completeness. Therefore C is not Turing complete.
This is an argument about the specification of C, not any particular implementation. The fact that no real machine has unbounded memory is totally irrelevant.
This is not a criticism of C.
A friend told me that C isn’t actually Turing-complete due to the semantics of pointers, so I decided to dig through the (C11) spec to find evidence for this claim. The two key bits are 22.214.171.124.4 and 126.96.36.199:
Values stored in non-bit-field objects of any other object type consist of
n × CHAR_BITbits, where
nis the size of an object of that type, in bytes. The value may be copied into an object of type
unsigned char [n](e.g., by
memcpy); the resulting set of bytes is called the object representation of the value. Values stored in bit-fields consist of
mis the size specified for the bit-field. The object representation is the set of
mbits the bit-field comprises in the addressable storage unit holding it. Two values (other than NaNs) with the same object representation compare equal, but values that compare equal may have different object representations.
The important bit is the use of the definite article in the first sentence, “where
n is the size of an object of that type”, this means that all types have a size which is known statically.
Two pointers compare equal if and only if both are null pointers, both are pointers to the same object (including a pointer to an object and a subobject at its beginning) or function, both are pointers to one past the last element of the same array object, or one is a pointer to one past the end of one array object and the other is a pointer to the start of a different array object that happens to immediately follow the first array object in the address space.
Pointers to distinct objects of the same type00Interestingly, you could have a distinct heap for every type, with overlapping pointer values. And this is totally fine according to the spec! This doesn’t help you, however, because the number of types is finite: they’re specified in the text of the program, which is necessarily finite.
compare unequal. As pointers are fixed in size, this means that there’s only a finite number of them. You can take a pointer to any object 11“The unary
& operator yields the address of its operand.”, first sentence of 188.8.131.52.3.
, therefore there are a finite number of objects that can exist at any one time!
However, C is slightly more interesting than a finite-state machine. We have one more mechanism to store values: the return value of a function! Fortunately, the C spec doesn’t impose a maximum stack depth22“Recursive function calls shall be permitted, both directly and indirectly through any chain of other functions.”, 184.108.40.206.11, nothing else is said on the matter.
, and so we can in principle implement a pushdown automata.
Just an interesting bit of information about C, because it’s so common to see statements like “because C is Turing-complete…”. Of course, on a real computer, nothing is Turing-complete, but C doesn’t even manage it in theory.
Addendum: Virtual Memory
In a discussion about this on Twitter, the possibility of doing some sort of virtual memory shenanigans to make a pointer see different things depending on its context of use came up. I believe that this is prohibited by the semantics of object lifetimes (220.127.116.11):
The lifetime of an object is the portion of program execution during which storage is guaranteed to be reserved for it. An object exists, has a constant address, and retains its last-stored value throughout its lifetime. If an object is referred to outside of its lifetime, the behavior is undefined. The value of a pointer becomes indeterminate when the object it points to (or just past) reaches the end of its lifetime.
The lifetime for heap-allocated objects is from the allocation until the deallocation (18.104.22.168):
The lifetime of an allocated object extends from the allocation until the deallocation. Each such allocation shall yield a pointer to an object disjoint from any other object.
Addendum 2: Non-Object Information
I had a fun discussion on IRC, where someone argued that the definition of pointer equality does not mention the object representation, therefore the fixed object representation size is irrelevant! Therefore, pointers could have extra information somehow which is not part of the object representation.
It took a while to resolve, but I believe the final sentence of the object representation quote and the first clause of the pointer equality quote, together with the fact that pointers are values, resolves this:
- Pointers are values.
Two values (other than NaNs) with the same object representation compare equal, but values that compare equal may have different object representations.
- Points (1) and (2) mean that pointers with the same object representation compare equal.
Two pointers compare equal if and only if…
- The “only if” in (4) means that if two pointers compare equal, then the rest of the rules apply.
- Points (3) and (5) mean that two pointers with the same object representation compare equal, and therefore point to the same object (or are both null pointers, etc).
This means that there cannot be any further information that what is stored in the object representation.
Interestingly, I believe this forbids something I initially thought to be the case: I say in a footnote that different types could have different heaps. They could, but that doesn’t let you use the same object representation for pointers of different types!