WeakReferences

18 July 2019 - 12:25am

Among the many wonderful improvements coming in PHP 7.4 are WeakReferences. What is a weak reference? To answer that question, we first need to quickly skim how references work in PHP.

Strong references

In PHP, objects are not stored in the variables they are assigned to. Instead, the variable stores a reference to that object. In this way, multiple variables can all point to the same object.
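As a small illustration of this, the sketch below assigns one object to two variables and changes it through one of them. The Counter class and the variable names are invented for the example.

```php
<?php
// Hypothetical class, used only for this illustration.
class Counter
{
    public $count = 0;
}

$first  = new Counter();
$second = $first;        // copies the reference to the object, not the object itself

$second->count = 5;      // change the object through the second variable

$sameObject   = ($first === $second);  // true: both variables point to one object
$seenViaFirst = $first->count;         // 5: the change is visible through $first
```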

Up until now, all references in PHP have been implicitly strong references. As long as at least one reference to an object exists, that object is retained in memory and will not be collected by the garbage collector.
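One way to watch this happen is to count destructor calls. The following is only a sketch: the Tracked class is hypothetical, and it assumes there are no reference cycles, in which case PHP destroys an object as soon as its last reference goes away.

```php
<?php
// Hypothetical class that records when an instance is destroyed.
class Tracked
{
    public static $destroyed = 0;

    public function __destruct()
    {
        self::$destroyed++;
    }
}

$a = new Tracked();
$b = $a;                                  // a second strong reference

unset($a);                                // one strong reference still remains
$afterFirstUnset = Tracked::$destroyed;   // 0: the object is still alive

unset($b);                                // the last strong reference is gone
$afterSecondUnset = Tracked::$destroyed;  // 1: the destructor has now run
```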

This creates a problem, however: an object that is still referenced somewhere continues to take up space in memory, even if nothing will ever use it again. This is not usually an issue in PHP, as all memory is released as soon as the script finishes, which is typically the length of time it takes to serve a single page request.

Our problem, therefore, tends to occur when we combine two things:

  • The first is an object cache of some sort. Pulling data from the database is expensive, and having multiple copies of the same data floating around can cause confusing issues. An object cache will ameliorate both issues, as data only needs to be pulled once, and the data is contained within a single object that can be updated as necessary.
  • The second is any long-running task that works through a large amount of data. Examples include import and export tasks, and certain types of maintenance tasks.

Essentially, even though this sort of task doesn't need a large amount of memory at any specific time, most of the data it has touched remains stored in memory long after it is no longer needed.
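To make the problem concrete, here is a rough sketch of such a task. The Record class, the plain array standing in for the cache, and the figure of 1000 records are all assumptions made for the illustration. It counts live objects rather than measuring real memory.

```php
<?php
// Hypothetical record class that counts how many instances are alive.
class Record
{
    public static $alive = 0;
    public $id;

    public function __construct($id)
    {
        $this->id = $id;
        self::$alive++;
    }

    public function __destruct()
    {
        self::$alive--;
    }
}

// A strong cache: a plain array holding the objects themselves.
$cache = [];

// Stand-in for a long-running import: each record is only needed briefly.
for ($id = 1; $id <= 1000; $id++)
{
    $record = new Record($id);  // imagine this was loaded from the database
    $cache[$id] = $record;      // the cache keeps a strong reference
    // ... work with $record ...
    unset($record);             // the task itself is finished with it
}

$aliveWithStrongCache = Record::$alive;  // 1000: every record is still in memory
```

The loop only ever works with one record at a time, yet all of them stay alive because the cache still refers to them.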

Weak references to the rescue

A weak reference is similar to a normal reference, except that it doesn't prevent the garbage collector from collecting the object. Once no strong references to that object remain, it will be destroyed at the next opportunity. This gives us a way to keep most of the benefits of a cache while avoiding the memory issues described above.
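In its simplest form, that behaviour looks roughly like the sketch below. It needs PHP 7.4 or later and uses only WeakReference::create() and get(). The variable names are arbitrary.

```php
<?php
// Requires PHP 7.4 or later, where the WeakReference class is available.
$object = new stdClass();
$weak   = WeakReference::create($object);

$whileAlive = ($weak->get() === $object);  // true: the object is still reachable

unset($object);                            // drop the only strong reference

$afterUnset = $weak->get();                // null: the object has been destroyed
```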

A very basic cache might look something like this:

<?php
class Cache
{
    /** @var object[] */
    private $objects = [];

    /**
     * @param   int|string   $id  An identifier to store the object
     * @return  object|null       The stored object, or null if that object is not in the cache
     */
    public function getObject($id)
    {
        return $this->objects[$id] ?? null;
    }

    /**
     * @param  int|string  $id      The identifier for the stored object
     * @param  object      $object  The object to store
     */
    public function setObject($id, $object)
    {
        $this->objects[$id] = $object;
    }
}

With the new WeakReference class, we can change it to this:

<?php
class Cache
{
    /** @var WeakReference[] */
    private $objects = [];

    /**
     * @param   int|string   $id  An identifier to store the object
     * @return  object|null       The stored object, or null if that object is not in the cache
     */
    public function getObject($id)
    {
        $reference = $this->objects[$id] ?? null;

        if($reference === null)
        {
            return null;
        }

        return $reference->get();
    }

    /**
     * @param  int|string  $id      The identifier for the stored object
     * @param  object      $object  The object to store
     */
    public function setObject($id, $object)
    {
        $this->objects[$id] = WeakReference::create($object);
    }
}
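A quick way to check how such a cache behaves is sketched below. To keep the example self-contained it uses a condensed copy of the class above under the made-up name WeakCache, along with a hypothetical TrackedRecord class that counts live instances.

```php
<?php
// Hypothetical record class that counts how many instances are alive.
class TrackedRecord
{
    public static $alive = 0;

    public function __construct()
    {
        self::$alive++;
    }

    public function __destruct()
    {
        self::$alive--;
    }
}

// Condensed copy of the weak cache above, renamed for this example.
class WeakCache
{
    private $objects = [];

    public function getObject($id)
    {
        $reference = $this->objects[$id] ?? null;

        return $reference === null ? null : $reference->get();
    }

    public function setObject($id, $object)
    {
        $this->objects[$id] = WeakReference::create($object);
    }
}

$cache = new WeakCache();

$record = new TrackedRecord();
$cache->setObject(42, $record);

$hit = ($cache->getObject(42) === $record);  // true: cache hit while $record is in use

unset($record);                              // the task no longer needs the record

$miss  = $cache->getObject(42);              // null: the cache did not keep it alive
$alive = TrackedRecord::$alive;              // 0: nothing is left in memory
```

Note that the cache only helps while some other part of the code still holds the object. Once the last strong reference is gone, the next lookup is a miss and the data has to be loaded again.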

Notes:

  • If the object that a WeakReference was referencing has been collected by the garbage collector, the get() method will return null.
  • Due to the nature of weak references, the value returned by get() can change between calls: if some other part of your codebase releases the last strong reference to the object, the next call to get() will return null. You should not rely on that value remaining unchanged if you call another function or method.
  • WeakReference should not be confused with WeakRef, which is provided by a PHP extension and is not a native part of PHP.
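The first two notes can be seen in the following sketch. The Registry class and the releaseCurrent() function are hypothetical stand-ins for "other parts of your codebase". The pattern shown in the second half (call get() once and keep the result in a local variable) is one possible way to guard against the value disappearing, not the only one.

```php
<?php
// Requires PHP 7.4 or later.

// Hypothetical holder of the only strong reference to an object.
class Registry
{
    public $current;
}

// Hypothetical function standing in for unrelated code elsewhere.
function releaseCurrent(Registry $registry)
{
    $registry->current = null;  // drops the registry's strong reference
}

$registry = new Registry();
$registry->current = new stdClass();
$weak = WeakReference::create($registry->current);

$before = ($weak->get() !== null);  // true: the object exists right now
releaseCurrent($registry);
$after = ($weak->get() !== null);   // false: the same call now returns null

// Safer pattern: call get() once and keep the result in a local variable.
$registry->current = new stdClass();
$weak   = WeakReference::create($registry->current);
$object = $weak->get();             // $object is now a strong reference
releaseCurrent($registry);
$stillUsable = ($object !== null && $weak->get() === $object);  // true
```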