Memory Leak Behavior with ArrayIterator, SplHeap, and SplDoublyLinkedList in PHP 5.{3,4,5,6}
UPDATE: 2014-09-27 01:44 -0400 (EDT)
When I originally wrote this post, I had intended it to be the details of a bug report that I would post to bugs.php.net. I never got around to doing this, but I finally discovered that this has already been reported as a bug.
Since PHP 5.3 it has been possible for memory occupied by objects having cyclic references to be reclaimed via the new garbage collection feature. However, there is a case in which the garbage collector does not appear to be able to reclaim memory.
The pattern we will examine is the following (an arrow indicates a “has a reference to” relationship):
Object #1 <-- intermediary <-- Object #2 <-- Object #1
Here we see a typical case of cyclic references of objects. An object refers to another object which ultimately has a reference to the original object. The “intermediary” can be either an object or an array in the examples we'll consider (in case 0 of the test.php script [see below] the intermediary does not exist).
We will demonstrate that memory leaks when the intermediary object is an instance of ArrayIterator, SplHeap, or SplDoublyLinkedList.
First, consider the following classes:
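// Y counts its live instances and holds the reference that closes the cycle.
class Y
{
    public static $instcount = 0;
    public $z;  // direct reference to a Z (case 0)
    public $za; // intermediary that holds a Z (cases 1-8)

    public function __construct()
    {
        ++self::$instcount;
    }

    public function __destruct()
    {
        --self::$instcount;
    }
}

// Z counts its live instances and always refers back to a Y.
class Z
{
    public static $instcount = 0;
    public $y;

    public function __construct(Y $y)
    {
        ++self::$instcount;
        $this->y = $y;
    }

    public function __destruct()
    {
        --self::$instcount;
    }
}

// Za is a plain user-defined class that wraps an array (case 2).
class Za
{
    public $a;

    public function __construct(array $a)
    {
        $this->a = $a;
    }
}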
These will serve as the backdrop of our example.
Now, assume the following function definitions:
// Force a collection run, then report how many Y and Z instances are still alive.
function do_gc()
{
    echo "Cycles Collected: " . gc_collect_cycles() . "\n";
    echo "Count of Y: " . Y::$instcount . "\n";
    echo "Count of Z: " . Z::$instcount . "\n";
}

// Build a Y/Z cycle through the intermediary selected by $a.
// $y and $z go out of scope on return, so only the cycle keeps them alive.
function run_test($a)
{
    $y = new Y;
    $z = new Z($y);
    switch ($a) {
        case 0:
            echo "Option $a (direct)\n";
            $y->z = $z;
            break;
        case 1:
            echo "Option $a (using array)\n";
            $y->za = array(
                $z
            );
            break;
        case 2:
            echo "Option $a (using Za object)\n";
            $y->za = new Za(array(
                $z
            ));
            break;
        case 3:
            echo "Option $a (using a stdClass)\n";
            $y->za = (object) array(
                'z' => $z
            );
            break;
        case 4:
            echo "Option $a (using ArrayIterator)\n";
            $y->za = new ArrayIterator(array(
                $z
            ));
            break;
        case 5:
            echo "Option $a (using SplDoublyLinkedList)\n";
            $s = new SplDoublyLinkedList();
            $s->push($z);
            $y->za = $s;
            $s = null;
            break;
        case 6:
            echo "Option $a (using SplMaxHeap)\n";
            $h = new SplMaxHeap();
            $h->insert($z);
            $y->za = $h;
            $h = null;
            break;
        case 7:
            echo "Option $a (using SplFixedArray)\n";
            $fa = new SplFixedArray(1);
            $fa[0] = $z;
            $y->za = $fa;
            $fa = null;
            break;
        case 8:
            echo "Option $a (using SplObjectStorage)\n";
            $os = new SplObjectStorage();
            $os->attach($z);
            $y->za = $os;
            $os = null;
            break;
        default:
            die("Bad option '$a'\n");
    }
}
Finally, consider the following program (hereafter referred to as “test.php”):
if ($argc < 2) $a = 0;
else $a = intval($argv[1]);
gc_enable();
run_test($a);
do_gc();
For all of the tests, PHP was built using default options (except for --prefix):
php5.6-201406241630$ ./configure --prefix=/home/davidp/local
..snip..
php5.6-201406241630$ make
..snip..
php5.6-201406241630$ sapi/cli/php -v
PHP 5.6.0-dev (cli) (built: Jun 24 2014 16:01:35)
Copyright (c) 1997-2014 The PHP Group
Zend Engine v2.6.0-dev, Copyright (c) 1998-2014 Zend Technologies
Running this program in the following ways (tested under PHP 5.3.10, 5.4.29, 5.5.13, and 5.6.0-dev [snapshot 201406241630]) gives the following results:
php5.6-201406241630$ sapi/cli/php test.php 0
Option 0 (direct)
Cycles Collected: 2
Count of Y: 0
Count of Z: 0
php5.6-201406241630$ sapi/cli/php test.php 1
Option 1 (using array)
Cycles Collected: 3
Count of Y: 0
Count of Z: 0
php5.6-201406241630$ sapi/cli/php test.php 2
Option 2 (using Za object)
Cycles Collected: 4
Count of Y: 0
Count of Z: 0
php5.6-201406241630$ sapi/cli/php test.php 3
Option 3 (using a stdClass)
Cycles Collected: 3
Count of Y: 0
Count of Z: 0
php5.6-201406241630$ sapi/cli/php test.php 4
Option 4 (using ArrayIterator)
Cycles Collected: 0
Count of Y: 1
Count of Z: 1
Here the last two lines of output are of interest. The counts of Y and Z instances are not zero, and the call to gc_collect_cycles() collected no instances. We'll see this in the next couple of examples as well.
php5.6-201406241630$ sapi/cli/php test.php 5
Option 5 (using SplDoublyLinkedList)
Cycles Collected: 0
Count of Y: 1
Count of Z: 1
php5.6-201406241630$ sapi/cli/php test.php 6
Option 6 (using SplMaxHeap)
Cycles Collected: 0
Count of Y: 1
Count of Z: 1
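The runs above each build a single cycle. To see whether the effect accumulates, one could build many such cycles in a loop and compare memory usage before and after a forced collection. The following is only a minimal sketch under that assumption: make_cycles() is a hypothetical helper that is not part of test.php, it uses stdClass objects rather than Y and Z, and its output depends on the PHP version, so no expected numbers are shown.

```php
<?php
// Hypothetical helper (not part of test.php): builds $n cycles of the form
// holder -> ArrayIterator -> member -> holder and drops each one immediately.
function make_cycles($n)
{
    for ($i = 0; $i < $n; ++$i) {
        $holder = new stdClass;
        $member = new stdClass;
        $member->holder = $holder;
        $holder->items = new ArrayIterator(array($member));
        // On the next iteration the locals are overwritten, so only the
        // cycle itself keeps the previous objects alive.
    }
}

gc_enable();
$before = memory_get_usage();
make_cycles(5000);
$collected = gc_collect_cycles();
$after = memory_get_usage();

echo "Collected: $collected\n";
echo "Bytes still in use after collection: " . ($after - $before) . "\n";
```

If the collector cannot see through the intermediary, the second figure should stay large no matter how often gc_collect_cycles() is called. If it can, the figure should fall back to roughly zero.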
But in the last two cases, suddenly garbage collection performs as one would expect (as it did in the first four examples).
php5.6-201406241630$ sapi/cli/php test.php 7
Option 7 (using SplFixedArray)
Cycles Collected: 3
Count of Y: 0
Count of Z: 0
php5.6-201406241630$ sapi/cli/php test.php 8
Option 8 (using SplObjectStorage)
Cycles Collected: 5
Count of Y: 0
Count of Z: 0
In the cases of creating a direct cyclic reference (option 0), using an array, using a custom class, using a stdClass object, using an SplFixedArray, and using an SplObjectStorage, garbage collection performs as one expects (all instances of Z and Y have been destroyed after we invoke gc_collect_cycles()).
Consequently, it seems that ArrayIterator, SplHeap, and SplDoublyLinkedList are not playing nicely with cyclic garbage collection.
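One way to sidestep the problem, offered here as a sketch rather than a fix for the underlying behavior, is to break the cycle by hand before the last outside reference disappears. Ordinary reference counting can then free everything without the cycle collector being involved. The Owner and Item classes and the build_and_release() function below are hypothetical stand-ins for Y, Z, and run_test(); they are not part of test.php.

```php
<?php
// Hypothetical stand-in for Y: counts live instances, holds the container.
class Owner
{
    public static $live = 0;
    public $items;

    public function __construct() { ++self::$live; }
    public function __destruct() { --self::$live; }
}

// Hypothetical stand-in for Z: counts live instances, refers back to its Owner.
class Item
{
    public static $live = 0;
    public $owner;

    public function __construct(Owner $owner)
    {
        ++self::$live;
        $this->owner = $owner;
    }

    public function __destruct() { --self::$live; }
}

// Build Owner -> ArrayIterator -> Item -> Owner, then cut the first link.
function build_and_release()
{
    $owner = new Owner;
    $item = new Item($owner);
    $owner->items = new ArrayIterator(array($item));

    // Dropping the container reference breaks the cycle, so plain
    // reference counting frees both objects when the locals go out of scope.
    $owner->items = null;
}

build_and_release();
echo "Owner: " . Owner::$live . ", Item: " . Item::$live . "\n";
// prints "Owner: 0, Item: 0" (gc_collect_cycles() is never called)
```

The cost of this approach is that the code owning the cycle has to know when it is finished with the objects, which is exactly the bookkeeping the cycle collector was meant to remove.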