By : AlexG
Date : May 03 2020, 08:53 AM

There are images I want to scrape, using DOMXPath as the scraping tool. But XPath can't find the src attributes, although I can see them in the website's source code.

Normally I should find the image's src attribute, but XPath returns an empty value.

$html = pageContent($link."photo");
$path = new \DOMXPath($html);
$route = $path->query("//ul[@class='categoryBox']//li[@class='photoList_item']/a/img");
foreach($route as $val){
    $images[] = trim($val->getAttribute("src"));
}


The website is https://hana-yume.net/174/photo/ and you can check the path there.

What are the possible reasons?

And if you need to see the pageContent() function, here it is:

function pageContent(string $url): \DOMDocument
{
    // Fetch the page once and cache the raw HTML indefinitely.
    $html = cache()->rememberForever($url, function () use ($url) {
        $opts = array(
            "http" => array(
                "header" => "Content-Type: text/html; charset=utf-8"
            )
        );
        $context = stream_context_create($opts);
        $result = @file_get_contents($url, false, $context);
        return $result;
    });

    // Convert to HTML entities so DOMDocument handles the Japanese text.
    $parser = new \DOMDocument();
    $parser->loadHTML($html = mb_convert_encoding($html, "HTML-ENTITIES", "ASCII, JIS, UTF-8, EUC-JP, SJIS"));
    return $parser;
}

Answer :

You need to target it in another way.

If you carefully examine the markup:

<a data-lightbox="tile10" href="/uploads/hall_photo/174/1/0/main_0.jpg?1566895565" onClick="ga('send', 'event', 'kanto', 'hall/photo', 'photo/1_0_main0_174', 1, {nonInteraction: true});">
    <img alt="アニヴェルセル 柏 挙式会場" width="750" height="330" class="lazy" data-original="/uploads/hall_photo/174/1/0/main_0_s.jpg?1566895565" />
    <noscript><img alt="アニヴェルセル 柏 挙式会場" width="750" height="330" src="/uploads/hall_photo/174/1/0/main_0_s.jpg?1566895565" /></noscript>
</a>

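The src of the <img> tag isn't static: in the HTML the server sends, the lazy-loaded <img> has no src attribute, and JS fills it in after the page loads. The src you see in the page source belongs to the second <img>, the one inside <noscript>, which is not a direct child of <a> and so is not matched by /a/img. But as you can see, the image URL is still there, in data-original.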
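One way to confirm this is to dump the attributes the parser actually sees on each matched node. The following is only a minimal sketch: the markup is a shortened, hypothetical copy of the snippet above (ASCII alt text, no <noscript>), not a fetch from the live site.

```php
<?php
// Hypothetical, trimmed-down copy of the markup above (not fetched from the site).
$markup = <<<'HTML'
<ul class="categoryBox"><li class="photoList_item">
<a href="/uploads/hall_photo/174/1/0/main_0.jpg"><img alt="photo" class="lazy" data-original="/uploads/hall_photo/174/1/0/main_0_s.jpg" /></a>
</li></ul>
HTML;

libxml_use_internal_errors(true); // keep parser warnings out of the output
$doc = new DOMDocument();
$doc->loadHTML($markup);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

$attributeNames = [];
$hasSrc = null;
foreach ($xpath->query("//li[@class='photoList_item']/a/img") as $img) {
    // List every attribute present on the node the XPath matched.
    foreach ($img->attributes as $attr) {
        $attributeNames[] = $attr->nodeName;
    }
    $hasSrc = $img->hasAttribute('src');
    $srcValue = $img->getAttribute('src'); // empty string when the attribute is missing
}

print_r($attributeNames);
var_dump($hasSrc);
```

If src is missing from the list while data-original is present, the XPath itself is fine and only the attribute name needs to change.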

So just target the data attribute instead:

$html = pageContent('https://hana-yume.net/174/photo/');
$path = new \DOMXPath($html);
$images = [];
$route = $path->query("//ul[@class='categoryBox']//li[contains(@class, 'photoList_item')]/a/img");
foreach($route as $val){
    $images[] = trim($val->getAttribute('data-original'));
}
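If you want something a bit more defensive, here is a rough sketch that prefers data-original, falls back to src when it exists, and turns site-relative paths into absolute URLs. extractImageUrls() and its $origin parameter are hypothetical names, not part of the question's code, and the sketch assumes the paths start with a single "/" as in the markup above (protocol-relative "//host/..." URLs are not handled).

```php
<?php
// Hypothetical helper, not part of the original code.
function extractImageUrls(DOMDocument $doc, string $origin): array
{
    $xpath = new DOMXPath($doc);
    $nodes = $xpath->query("//ul[@class='categoryBox']//li[contains(@class, 'photoList_item')]/a/img");

    $urls = [];
    foreach ($nodes as $img) {
        // Prefer the lazy-load attribute, fall back to src if it is set.
        $path = trim($img->getAttribute('data-original'));
        if ($path === '') {
            $path = trim($img->getAttribute('src'));
        }
        if ($path === '') {
            continue; // nothing usable on this node
        }
        // Assumption: site-relative paths begin with a single "/".
        $urls[] = strpos($path, '/') === 0 ? rtrim($origin, '/') . $path : $path;
    }
    return $urls;
}

// Usage with a small inline sample instead of a live request.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML('<ul class="categoryBox"><li class="photoList_item first"><a href="#"><img class="lazy" data-original="/uploads/hall_photo/174/1/0/main_0_s.jpg?1566895565"></a></li></ul>');
libxml_clear_errors();

$urls = extractImageUrls($doc, 'https://hana-yume.net');
print_r($urls);
```

With the real page you would pass the DOMDocument returned by pageContent() instead of the inline sample.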

