Rector is a reconstructor tool for PHP. It takes source code and transformation rules as inputs and modifies the code according to the rules as output.
Even if we don’t think about it, we often use tooling to transform PHP code. For instance, PHP CodeSniffer can validate whether the code abides by PSR standards and, when it doesn’t, executing the phpcbf command can automatically fix it. Or PHP-Scoper will scope the dependencies in the project to avoid potential conflicts.
Rector is different from these tools in that it is a meta application. It doesn’t have a predetermined objective, such as fixing styles or scoping the project. Instead, it transforms the code following whichever rules it is given. As a result, Rector can reproduce the behavior of PHP CodeSniffer, PHP-Scoper, or any other code transformation tool.
In this article, I’ll share a few tips for creating rules in Rector.
The Rector pillars
Rector stands on the shoulders of two giants:
- PHP Parser: a library that parses PHP code, enabling static code analysis and manipulation
- PHPStan: a static analysis tool
Thanks to PHP Parser, Rector can manipulate the code using nodes in an AST (short for Abstract Syntax Tree). And thanks to PHPStan, Rector can understand the code, so it is able to map, browse, and validate the relationships among entities in the code, such as getting the ancestor for a class or all its implemented interfaces.
It’s a good idea to have a basic understanding of these two libraries before starting with Rector and to keep learning from their documentation as we work with Rector. Indeed, the more complex the Rector rule, the more important it becomes to have a good grasp of these two libraries.
What are Rector rules?
A rule is a PHP class inheriting from AbstractRector, which executes the transformations on the nodes from the AST (corresponding to the parsed PHP file).
It is composed of three main methods, which we must implement:
- getRuleDefinition: used to document the rule
- getNodeTypes: the types of node on which the rule will be applied
- refactor: the logic to produce the new AST node
For instance, rule DowngradeNullCoalescingOperatorRector replaces the ??= operator introduced in PHP 7.4 with its equivalent from PHP 7.3. It has this implementation:
use PhpParser\Node;
use PhpParser\Node\Expr\Assign;
use PhpParser\Node\Expr\AssignOp\Coalesce as AssignCoalesce;
use PhpParser\Node\Expr\BinaryOp\Coalesce;
use Rector\Core\Rector\AbstractRector;
use Symplify\RuleDocGenerator\ValueObject\CodeSample\CodeSample;
use Symplify\RuleDocGenerator\ValueObject\RuleDefinition;

final class DowngradeNullCoalescingOperatorRector extends AbstractRector
{
    public function getRuleDefinition(): RuleDefinition
    {
        return new RuleDefinition('Remove null coalescing operator ??=', [
            new CodeSample(
                <<<'CODE_SAMPLE'
$array = [];
$array['user_id'] ??= 'value';
CODE_SAMPLE
                ,
                <<<'CODE_SAMPLE'
$array = [];
$array['user_id'] = $array['user_id'] ?? 'value';
CODE_SAMPLE
            ),
        ]);
    }

    /**
     * @return string[]
     */
    public function getNodeTypes(): array
    {
        return [AssignCoalesce::class];
    }

    /**
     * @param AssignCoalesce $node
     */
    public function refactor(Node $node): ?Node
    {
        return new Assign($node->var, new Coalesce($node->var, $node->expr));
    }
}
Let’s see how it works.
getRuleDefinition
We must provide an example of the code before and after the transformation. Rector then uses these two states to document the changes, using the diff format, as done here:
$array = [];
-$array['user_id'] ??= 'value';
+$array['user_id'] = $array['user_id'] ?? 'value';
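To convince ourselves that the "before" and "after" states of the sample behave the same, here is a minimal plain-PHP sketch. It needs PHP 7.4 or later because it uses ??= itself, and the two helper function names are hypothetical, introduced only for this comparison:

```php
<?php

declare(strict_types=1);

// Hypothetical helper: the "before" code, using the PHP 7.4 operator.
function withAssignCoalesce(array $array): array
{
    $array['user_id'] ??= 'value';

    return $array;
}

// Hypothetical helper: the "after" code, valid on PHP 7.3 and below.
function withPlainCoalesce(array $array): array
{
    $array['user_id'] = $array['user_id'] ?? 'value';

    return $array;
}

// Both forms fill in a missing key and leave an existing value untouched.
var_dump(withAssignCoalesce([]) === withPlainCoalesce([]));
var_dump(withAssignCoalesce(['user_id' => 7]) === withPlainCoalesce(['user_id' => 7]));
```

Both comparisons should hold, which is what makes the rewritten code a safe replacement for the original.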
getNodeTypes
In this function, we indicate on which node from the AST the transformation will be applied. These nodes come directly from PHP Parser.
In the example above, the rule is applied only on nodes of type AssignOp\Coalesce (aliased as AssignCoalesce), which is the node representing ??=.
Some examples of other nodes are:
- FuncCall: whenever calling a function, such as var_dump("hello")
- MethodCall: whenever calling a method from a class, such as $foo->bar()
- Assign: when assigning a value via =
- Equal, NotEqual, Identical, and NotIdentical: whenever using the binary operator ==, !=, ===, or !==, respectively
refactor
This function performs the transformation, if needed. It has return type ?Node, which means:
- Either return a new node, which will replace the old node; or
- Return null, to signify no change
Please note that returning null means “do not modify the node”; it does not mean “remove the node.”
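This contract can be pictured with a toy model in plain PHP. It is not Rector's traversal code: applyRule and the array-based "nodes" are hypothetical stand-ins that only illustrate how a null result leaves the original node in place:

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical stand-in for the traverser: apply a rule to one node.
 * A null result keeps the original node; anything else replaces it.
 */
function applyRule(callable $refactor, array $node): array
{
    return $refactor($node) ?? $node;
}

// Toy rule on function-call "nodes": rename sizeof() to count(),
// and signal "no change" for every other call.
$refactor = function (array $funcCall): ?array {
    if ($funcCall['name'] !== 'sizeof') {
        return null; // do not modify the node
    }

    return ['name' => 'count', 'args' => $funcCall['args']];
};

$changed = applyRule($refactor, ['name' => 'sizeof', 'args' => ['$items']]);
$kept = applyRule($refactor, ['name' => 'var_dump', 'args' => ['"hello"']]);

echo $changed['name'], PHP_EOL; // count
echo $kept['name'], PHP_EOL;    // var_dump: still present, not removed
```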
The rule from above aims to replace $foo ??= $bar with its equivalent $foo = $foo ?? $bar. Function refactor then returns this new node:
return new Assign(
    $node->var,
    new Coalesce(
        $node->var,
        $node->expr
    )
);
The new node is of type Assign, which is the = in $foo = $foo ?? $bar. This type requires two elements:
- The variable $foo, which is retrieved from the original node as $node->var
- The expression $foo ?? $bar
To create the expression, we nest a new node inside it, of type [Coalesce](https://github.com/nikic/PHP-Parser/blob/master/lib/PhpParser/Node/Expr/BinaryOp/Coalesce.php), which is the ?? in $foo ?? $bar. The coalesce operator requires two elements:
- The expression on the left, $foo, which is retrieved from the original node as $node->var
- The expression on the right, $bar, which is retrieved from the original node as $node->expr
This example shows the basic concept of what creating a rule involves:
- Find what new node satisfies the target code
- Identify the data it requires
- Port data (variables, expressions) from the old node to the new node
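The three steps above can be sketched with a toy AST in plain PHP (7.4 or later, for typed properties). The classes ToyNode, ToyVariable, and ToyBinary and the function downgradeAssignCoalesce are hypothetical; they only mimic the shape of the PHP Parser nodes and are not part of PHP Parser or Rector:

```php
<?php

declare(strict_types=1);

// Toy stand-ins for AST nodes, able to print themselves back as code.
interface ToyNode
{
    public function toCode(): string;
}

final class ToyVariable implements ToyNode
{
    private string $name;

    public function __construct(string $name)
    {
        $this->name = $name;
    }

    public function toCode(): string
    {
        return '$' . $this->name;
    }
}

// Generic two-operand node: "??=", "=" and "??" all have a left and a right side.
final class ToyBinary implements ToyNode
{
    public string $operator;
    public ToyNode $left;
    public ToyNode $right;

    public function __construct(string $operator, ToyNode $left, ToyNode $right)
    {
        $this->operator = $operator;
        $this->left = $left;
        $this->right = $right;
    }

    public function toCode(): string
    {
        return $this->left->toCode() . ' ' . $this->operator . ' ' . $this->right->toCode();
    }
}

// Pick the target nodes ("=" wrapping "??"), then port the left and
// right sides from the old node into them.
function downgradeAssignCoalesce(ToyBinary $node): ?ToyNode
{
    if ($node->operator !== '??=') {
        return null; // nothing to change
    }

    return new ToyBinary('=', $node->left, new ToyBinary('??', $node->left, $node->right));
}

$old = new ToyBinary('??=', new ToyVariable('foo'), new ToyVariable('bar'));
$new = downgradeAssignCoalesce($old);

echo $old->toCode(), PHP_EOL; // $foo ??= $bar
echo $new->toCode(), PHP_EOL; // $foo = $foo ?? $bar
```

Note how the left-hand side of the old node is reused twice in the new tree, exactly as $node->var is in the real rule.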
Reusing code from existing rules
At the time of writing, the Rector repo provides almost 700 rules, involving transformations of many kinds. These existing rules are a wonderful source to help us implement our own custom rules.
So this is my advice: whenever you need to create a custom rule, check first if a similar logic has already been coded in any of the existing rules. Chances are, there will be.
For instance, I have implemented rule DowngradeStripTagsCallWithArrayRector, which converts the array parameter passed to strip_tags (supported from PHP 7.4 onward) into a string parameter that can be used with PHP 7.3 and below:
-strip_tags($string, ['a', 'p']);
+strip_tags($string, '<' . implode('><', ['a', 'p']) . '>');
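As a quick check of the replacement expression, here is a minimal sketch in plain PHP. The helper name allowedTagsToString and the sample input string are assumptions made for this illustration; only the concatenation itself comes from the diff above:

```php
<?php

declare(strict_types=1);

// Hypothetical helper mirroring the replacement expression in the diff.
function allowedTagsToString(array $tags): string
{
    return '<' . implode('><', $tags) . '>';
}

$string = '<p>Hello <a href="#">world</a> <b>!</b></p>';

echo allowedTagsToString(['a', 'p']), PHP_EOL; // <a><p>

// The string form works on PHP 7.3 and below: <b> is stripped, <a> and <p> survive.
echo strip_tags($string, allowedTagsToString(['a', 'p'])), PHP_EOL;
// <p>Hello <a href="#">world</a> !</p>
```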
Now, we may not know the type of the parameter during the static analysis. For instance, this function returns either a string or an array:
function getStringOrArray()
{
    if (rand(0, 1)) {
        return ['a', 'p'];
    }
    return 'ap';
}
Then, our code needs to check the type of the parameter during runtime:
-strip_tags($string, getStringOrArray());
+strip_tags($string, is_array(getStringOrArray()) ? ( '<' . implode('><', getStringOrArray()) . '>' ) : getStringOrArray());
But now we have a problem: function getStringOrArray() is executed twice, which could be expensive, or even worse, it could produce unintended side effects (for instance, if it increases a global counter, it will do it twice).
So the solution is to assign the value from getStringOrArray() to a variable first:
-strip_tags($string, getStringOrArray());
+$var = getStringOrArray();
+strip_tags($string, is_array($var) ? ( '<' . implode('><', $var) . '>' ) : $var);
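The effect of the temporary variable can be observed with a small runnable sketch. Two things here are assumptions added for the demonstration: the CallCounter class, which makes the side effect visible, and the string branch returning '<a><p>' (rather than 'ap') so that both branches allow the same tags and the result is deterministic:

```php
<?php

declare(strict_types=1);

// Hypothetical counter, added only to make the side effect observable.
final class CallCounter
{
    public static $calls = 0;
}

// Same shape as the function in the article, plus the counter.
function getStringOrArray()
{
    CallCounter::$calls++;

    return rand(0, 1) ? ['a', 'p'] : '<a><p>';
}

$string = '<p>Hi <b>there</b></p>';

// Call the function once, then reuse the stored value in every branch.
$var = getStringOrArray();
$result = strip_tags($string, is_array($var) ? ('<' . implode('><', $var) . '>') : $var);

echo $result, PHP_EOL;             // <p>Hi there</p>
echo CallCounter::$calls, PHP_EOL; // 1
```

Inlining getStringOrArray() into the ternary instead would raise the counter to 2 (on the string branch) or 3 (on the array branch), which is the side effect the rule has to avoid.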
But then, I can’t arbitrarily choose the name for the variable as $var (or anything else) since it may already exist, and I’d be overriding its value:
$var = "blah blah blah";
-strip_tags($string, getStringOrArray());
+$var = getStringOrArray();
+strip_tags($string, is_array($var) ? ( '<' . implode('><', $var) . '>' ) : $var);
var_dump($var);
// It expects "blah blah blah". It got "ap"
I had no idea how to deal with this. So I browsed the list of all the rules in the repo, checking if any of them would deal with this problem, i.e., creating a new variable with an unused name.
And I found it. Rule ForRepeatedCountToOwnVariableRector does this transformation:
 class SomeClass
 {
     public function run($items)
     {
-        for ($i = 5; $i <= count($items); $i++) {
+        $itemsCount = count($items);
+        for ($i = 5; $i <= $itemsCount; $i++) {
             echo $items[$i];
         }
     }
 }
The variable $itemsCount is being created out of nowhere. Checking how it’s done, I discovered the VariableNaming service, which can identify whether variable $itemsCount already exists. If it does, it tries again with $itemsCount2, and so on until it finds a variable name that is not already in use.
Then I copy/pasted the logic to use the service, from here:
$variableName = $this->variableNaming->resolveFromFuncCallFirstArgumentWithSuffix(
    $node,
    'Count',
    'itemsCount',
    $forScope
);
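The idea behind such a service can be sketched in a few lines of plain PHP. This is not the VariableNaming implementation: createUniqueVariableName is a hypothetical helper, and the list of used names is passed in by hand, whereas Rector works them out from the scope:

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not Rector's VariableNaming API): return $baseName if
 * it is free, otherwise append 2, 3, ... until the name is unused.
 *
 * @param string[] $usedNames variable names already present in the scope
 */
function createUniqueVariableName(string $baseName, array $usedNames): string
{
    $candidate = $baseName;
    $suffix = 2;

    while (in_array($candidate, $usedNames, true)) {
        $candidate = $baseName . $suffix;
        $suffix++;
    }

    return $candidate;
}

echo createUniqueVariableName('itemsCount', ['items', 'i']), PHP_EOL;                // itemsCount
echo createUniqueVariableName('itemsCount', ['itemsCount', 'itemsCount2']), PHP_EOL; // itemsCount3
```

Reusing the existing service, rather than writing a helper like this one, has the advantage that the names are checked against the real scope of the node being refactored.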
As a general note, I find the source code in the Rector repo quite elegant. I particularly like that it makes extensive use of the Symfony components, including for dependency injection, CLI commands, and file and directory finder. And I learned quite a bit about programming best practices while browsing it, so I’d recommend you do, too.
Tips for testing
Here are a few helpful tips for testing rules connected to PHPUnit.
When executing phpunit to test a rule, if the rule has many tests and only one is failing, we can execute only that one by passing --filter=test#X, where X is the order number of the fixture test.
For instance, when executing the following:
vendor/bin/phpunit rules/downgrade-php73/tests/Rector/List_/DowngradeListReferenceAssignmentRector/DowngradeListReferenceAssignmentRectorTest.php
I would get this error:
There was 1 failure:
1) Rector\DowngradePhp73\Tests\Rector\List_\DowngradeListReferenceAssignmentRector\DowngradeListReferenceAssignmentRectorTest::test with data set #4 (Symplify\SmartFileSystem\SmartFileInfo Object (...))
rules/downgrade-php73/tests/Rector/List_/DowngradeListReferenceAssignmentRector/Fixture/nested_list.php.inc
Failed asserting that string matches format description.
From the error, we can tell that test nested_list.php.inc is #4, so I could execute only that test like this:
vendor/bin/phpunit rules/downgrade-php73/tests/Rector/List_/DowngradeListReferenceAssignmentRector/DowngradeListReferenceAssignmentRectorTest.php --filter=test#4
This is useful for debugging: we can take the quick and easy approach of dumping the output to the screen to see where the problem may be.
If we need to dump the content of a node, we can do so inside the rule class, like this:
dump($this->print($node)); die;
We must use dump, from Symfony’s VarDumper component, instead of var_dump because:
- It formats the output to make it more understandable
- The node may contain cyclical references; dump identifies and stops them, whereas the var_dump output for such a node can become too long to be usable
Conclusion
Rector is a wonderful tool to transform PHP code. I am using it to transpile my application from PHP 7.4 to 7.1 so I can code it using modern PHP features, yet deploy it to the environment supported by my clients.