Modify an HTML attribute with PHP

I have an HTML string that contains exactly one a element. Example:

   <a href="http://www.test.com" rel="nofollow external">test</a>

In PHP I have to test whether rel contains external and, if so, modify href and save the string.

I have looked at DOM nodes and objects, but they seem like too much for a single a element: I have to iterate to get the HTML nodes, and I am not sure how to test whether rel exists and contains external.

$html = new DOMDocument();
$html->loadHtml($txt);
$a = $html->getElementsByTagName('a');
$attr = $a->item(0)->attributes; // a property (DOMNamedNodeMap), not a method
...

At this point I get a DOMNamedNodeMap, which seems like overhead. Is there a simpler way to do this, or should I do it with DOM?

Solutions:

Solution 1

Is there any simpler way for this or should I do it with DOM?

Do it with DOM.

Here’s an example:

<?php
$html = '<a href="http://example.com" rel="nofollow external">test</a>';
$dom = new DOMDocument;
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
$nodes = $xpath->query("//a[contains(concat(' ', normalize-space(@rel), ' '), ' external ')]");
foreach($nodes as $node) {
    $node->setAttribute('href', 'http://example.org');
}
echo $dom->saveHTML();
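
Note that saveHTML() called without an argument serializes the whole document, including the doctype and the html/body wrappers that loadHTML() adds. Since the question asks to save the modified string, a minimal sketch that returns only the a element might look like the following. The function name rewriteExternalHref and the $newHref parameter are hypothetical names chosen for illustration; they are not part of the original answer.

```php
<?php
// Hypothetical helper: rewrites href on <a rel="... external ..."> and
// returns only the <a> markup instead of a full HTML document.
function rewriteExternalHref(string $html, string $newHref): string
{
    $dom = new DOMDocument();
    // The XML encoding hint makes libxml read the input as UTF-8.
    $dom->loadHTML('<?xml encoding="utf-8" ?>' . $html);

    $xpath = new DOMXPath($dom);
    $query = "//a[contains(concat(' ', normalize-space(@rel), ' '), ' external ')]";
    $nodes = $xpath->query($query);

    if ($nodes->length === 0) {
        return $html; // no matching link: leave the input untouched
    }

    $node = $nodes->item(0);
    $node->setAttribute('href', $newHref);

    // Passing a node to saveHTML() serializes only that node's subtree.
    return $dom->saveHTML($node);
}

echo rewriteExternalHref(
    '<a href="http://www.test.com" rel="nofollow external">test</a>',
    'http://example.org'
);
```

Returning the input unchanged when nothing matches is a design choice of this sketch; it avoids round-tripping strings that do not need to be modified.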

Solution 2

I went ahead and did it with DOM. This is what I ended up with:

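$html = new DOMDocument();
// The XML encoding hint makes libxml read the input as UTF-8.
$html->loadHTML('<?xml encoding="utf-8" ?>' . $txt);
$nodes = $html->getElementsByTagName('a');
foreach ($nodes as $node) {
    foreach ($node->attributes as $att) {
        if ($att->name == 'rel') {
            // strpos() returns 0 when the value starts with 'external',
            // so compare against false explicitly.
            if (strpos($att->value, 'external') !== false) {
                $node->setAttribute('href', 'modified_url_goes_here');
            }
        }
    }
}
$txt = $html->saveHTML();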

I did not want to load any other library for just this one string.
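
The inner loop over all attributes can be avoided, because DOMElement::getAttribute() returns an empty string when the attribute is missing. The sketch below also splits rel into whitespace-separated tokens so that a value such as "externally" does not count as a match. The helper name relHasToken is hypothetical, and lowercasing the value assumes that rel keywords should be compared case-insensitively.

```php
<?php
// Hypothetical helper: true when the rel attribute lists $token as a whole word.
function relHasToken(DOMElement $a, string $token): bool
{
    // getAttribute() returns '' when the attribute is absent.
    $rel = strtolower(trim($a->getAttribute('rel')));
    $tokens = preg_split('/\s+/', $rel, -1, PREG_SPLIT_NO_EMPTY);
    return in_array($token, $tokens, true);
}

$txt = '<a href="http://www.test.com" rel="external nofollow">test</a>';

$doc = new DOMDocument();
$doc->loadHTML('<?xml encoding="utf-8" ?>' . $txt);

foreach ($doc->getElementsByTagName('a') as $a) {
    if (relHasToken($a, 'external')) {
        $a->setAttribute('href', 'modified_url_goes_here');
    }
}

// Serialize only the first <a> element rather than the whole document.
$result = $doc->saveHTML($doc->getElementsByTagName('a')->item(0));
echo $result;
```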

Solution 3

The best way is to use an HTML parser/DOM, but here’s a regex solution:

$html = '<a href="http://www.test.com" rel="nofollow noreferrer noopener" rel="nofollow noreferrer noopener" rel="nofollow external">test</a><br>
<p> Some text</p>
<a href="http://test.com" rel="nofollow noreferrer noopener">test2</a><br>
<a rel="external">test3</a> <-- This won\'t work since there is no href in it.
';

$new = preg_replace_callback('/<a.+?rel\s*=\s*"([^"]*)"[^>]*>/i', function($m){
    if(strpos($m[1], 'external') !== false){
        $m[0] = preg_replace('/href\s*=\s*(("[^"]*")|(\'[^\']*\'))/i', 'href="http://example.com"', $m[0]);
    }
    return $m[0];
}, $html);

echo $new;


Solution 4

You could use a regular expression. If the string matches
/\s+rel\s*=\s*".*external.*"/
then do a regex replace like
/(<a.*href\s*=\s*")([^"]*)("[^>]*>)/\1[your new href here]\3/

Though using a library that can do this kind of thing for you is much easier (like jQuery for JavaScript).
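
As a rough illustration, the two steps described above could be written in PHP as follows. This is a sketch under several assumptions: the attributes are double-quoted, the input holds a single a element, and $newHref is a hypothetical replacement URL. The patterns are tightened slightly compared to the ones above (for example, [^"]* instead of .*), and the \b word boundary is only an approximation of a whole-token match.

```php
<?php
$txt = '<a href="http://www.test.com" rel="nofollow external">test</a>';
$newHref = 'http://example.org'; // hypothetical replacement URL

// Step 1: does a double-quoted rel attribute contain the word "external"?
if (preg_match('/\srel\s*=\s*"[^"]*\bexternal\b[^"]*"/i', $txt)) {
    // Step 2: swap the double-quoted href value, keeping the rest of the tag.
    $txt = preg_replace_callback(
        '/(<a\b[^>]*?\shref\s*=\s*")[^"]*(")/i',
        function (array $m) use ($newHref) {
            // Escape the URL so it cannot break out of the attribute.
            return $m[1] . htmlspecialchars($newHref) . $m[2];
        },
        $txt,
        1 // replace at most one tag
    );
}

echo $txt;
```

A callback is used instead of a replacement string so that characters such as $ or \ in the new URL are not interpreted as backreferences.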


All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.
