沙滩星空的博客

An Introduction to Regular Expressions in PHP

Regular expression functions

preg_match()

Performs a regular expression match. The search stops after the first match.

int preg_match ( string $pattern , string $subject [, array &$matches [, int $flags = 0 [, int $offset = 0 ]]] )

Returns the number of times pattern matched, which is either 0 (no match) or 1. Returns FALSE if an error occurred.
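As a minimal sketch of the three possible return values, the snippet below runs preg_match() against a made-up string. The sample text and variable names are assumptions for illustration and do not come from the example later in this post.

```php
<?php
// Hypothetical sample input.
$subject = 'Order #1024 shipped, order #2048 pending';

// preg_match() stops at the first match, so only "#1024" is found.
$result = preg_match('/#(\d+)/', $subject, $matches);

if ($result === false) {
    // An error occurred; preg_last_error() returns the error code.
    echo 'regex error code: ' . preg_last_error() . PHP_EOL;
} elseif ($result === 1) {
    echo $matches[0] . PHP_EOL; // full match: #1024
    echo $matches[1] . PHP_EOL; // first capture group: 1024
} else {
    echo 'no match' . PHP_EOL;
}
```

Comparing with === matters here, because 0 (no match) and FALSE (error) are both falsy in a loose comparison.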

preg_match_all()

Performs a global regular expression match. The search continues until the end of the subject string.

int preg_match_all ( string $pattern , string $subject [, array &$matches [, int $flags = PREG_PATTERN_ORDER [, int $offset = 0 ]]] )

Returns the number of full pattern matches (possibly 0), or FALSE if an error occurred.
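The $flags argument controls how $matches is laid out. The sketch below compares the default PREG_PATTERN_ORDER with PREG_SET_ORDER; the input string and variable names are made up for illustration.

```php
<?php
// Hypothetical sample input.
$subject = 'a=1, b=22, c=333';

// Default PREG_PATTERN_ORDER: $byPattern[0] holds all full matches,
// $byPattern[1] the first group of every match, $byPattern[2] the second.
$count = preg_match_all('/(\w)=(\d+)/', $subject, $byPattern);

// PREG_SET_ORDER: one sub-array per match, e.g. $bySet[1] is ['b=22', 'b', '22'].
preg_match_all('/(\w)=(\d+)/', $subject, $bySet, PREG_SET_ORDER);

echo $count . PHP_EOL;                   // 3
echo implode(',', $byPattern[2]) . PHP_EOL; // 1,22,333
echo implode('|', $bySet[1]) . PHP_EOL;  // b=22|b|22
```

PREG_SET_ORDER tends to be more convenient when each match should be processed as one record.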

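Regular expressions - metacharacters

Character        Description
(pattern)        Matches pattern and captures the match. To match a literal parenthesis, put a backslash in front of it: '\(' or '\)'.
(?:pattern)      Matches pattern but does not capture the result; this is a non-capturing group. It is handy for grouping alternatives joined with the "or" character (|). For example, 'industr(?:y|ies)' is a more compact equivalent of 'industry|industries'.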
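To make the difference between the two group types concrete, here is a small sketch. The input strings and variable names are assumptions chosen for illustration.

```php
<?php
// Hypothetical sample input.
$subject = 'heavy industries';

// Capturing group: the matched alternative is stored in index 1.
preg_match('/industr(y|ies)/', $subject, $capturing);
print_r($capturing);    // [0] => industries, [1] => ies

// Non-capturing group: same match, but nothing is stored for the group.
preg_match('/industr(?:y|ies)/', $subject, $nonCapturing);
print_r($nonCapturing); // [0] => industries

// Escaped parentheses match literal "(" and ")"; the inner pair still captures.
preg_match('/\((\d+)\)/', 'call (42) now', $literal);
print_r($literal);      // [0] => (42), [1] => 42
```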

Regular expressions - modifiers

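Commonly cited pattern modifiers include i, g, m, s, x, and e, and modifiers can be combined. Note that PHP's PCRE functions do not support g, and e is no longer available in PHP 7 and later.

i  Case-insensitive matching.
g  Global matching in other regex flavors such as JavaScript. PHP has no g modifier; use preg_match_all() instead.
m  Multiline mode: ^ and $ also match at the start and end of each line, not only at the start and end of the whole string.
s  Dot-all mode: the dot (.) also matches newline characters, so the subject is treated as a single line.
x  Extended mode: unescaped whitespace in the pattern is ignored, except inside a character class.
A  Anchored: the match is forced to start at the beginning of the subject string.
D  Dollar-end-only: when $ is used, it matches only at the very end of the subject and not before a trailing newline.
U  Ungreedy: inverts the greediness of quantifiers, so they match as little as possible by default.
e  Used with preg_replace() to evaluate the replacement string as PHP code. It was deprecated in PHP 5.5.0 and removed in PHP 7.0.0; use preg_replace_callback() instead.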
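The following sketch shows the effect of a few of these modifiers side by side, plus preg_replace_callback() as the usual replacement for the old e modifier. All input strings and variable names are made up for illustration.

```php
<?php
// i: case-insensitive matching.
$ci = preg_match('/php/i', 'I like PHP');

// m: ^ matches at the start of every line, so all three words are found.
$lineCount = preg_match_all('/^\w+/m', "one\ntwo\nthree", $lineMatches);

// s: without it the dot does not match a newline; with it, it does.
$withoutS = preg_match('/a.b/', "a\nb");
$withS    = preg_match('/a.b/s', "a\nb");

// U: quantifiers become lazy, so .+ stops at the first possible ">".
preg_match('/<.+>/', '<b>bold</b>', $greedy);
preg_match('/<.+>/U', '<b>bold</b>', $lazy);

// Instead of the removed e modifier, pass a callback that builds the replacement.
$upper = preg_replace_callback(
    '/\b[a-z]/',
    function ($match) {
        return strtoupper($match[0]);
    },
    'hello world'
);

echo $greedy[0] . PHP_EOL; // <b>bold</b>
echo $lazy[0] . PHP_EOL;   // <b>
echo $upper . PHP_EOL;     // Hello World
```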

Practice or test with an online tool

http://c.runoob.com/front-end/854

Example

Prepare the content to parse

Create a file named rival_goods.html to hold the content that the regular expressions will parse.

    <tbody class="ant-table-tbody">
        <tr class="ant-table-row oui-table-row-tree-node-1 ant-table-row-level-0" data-row-key="tree-node-1">
            <td class="">
                <span class="ant-table-row-indent indent-level-0" style="padding-left: 0px;"></span><!-- react-empty: 1350 --><div class="sycm-goods-td" style="width: 260px;"><a class="goodsImg pull-left" href="//detail.tmall.com/item.htm?id=626987898197" target="_blank" rel="noopener noreferrer" title="秋冬季2020新款高帮男鞋潮流百搭运动休闲加绒保暖棉鞋老爹潮鞋" style="width: 38px; height: 38px;"><img class="mediaObject" src="//img.alicdn.com/bao/uploaded/i1/2074376818/O1CN01U3Rzep20Egzw0ViDL_!!2074376818-0-lubanu-s.jpg_36x36.jpg"></a><div class="goodsInfo" style="width: 202px; max-height: 76px;"><p class="singleGoodsName"><a href="//detail.tmall.com/item.htm?id=626987898197" target="_blank" rel="noopener noreferrer" title="秋冬季2020新款高帮男鞋潮流百搭运动休闲加绒保暖棉鞋老爹潮鞋">秋冬季2020新款高帮男鞋潮流百搭运动休闲加绒保暖棉鞋老爹潮鞋</a></p><p class="goodsShopName" style="width: 202px;">较前一日</p></div></div>
            </td>
            <td class="">
                <div class="alife-dt-card-common-table-sortable-td alife-dt-card-common-table-cateRankId"><span class="alife-dt-card-common-table-sortable-value">26</span><span class="alife-dt-card-common-table-sortable-ratio-value"></span><div class="alife-dt-card-common-table-sortable-cycleCrc" style="margin-right: 0px;"><span style="color: red;">升6名</span></div><div class="alife-dt-card-common-table-sortable-syncCrc" style="margin-right: 0px;"></div></div>
            </td>
            <td class="">
                <div class="alife-dt-card-common-table-sortable-td alife-dt-card-common-table-tradeIndex"><span class="alife-dt-card-common-table-sortable-value">37,135</span><span class="alife-dt-card-common-table-sortable-ratio-value"></span><div class="alife-dt-card-common-table-sortable-cycleCrc" style="margin-right: 0px;"><span style="color: gray;">-0.78%</span></div><div class="alife-dt-card-common-table-sortable-syncCrc" style="margin-right: 0px;"></div></div>
            </td>
            <td class="alife-dt-card-common-table-right-column">
                <a href="/mc/ci/item/analysis?rivalItem1Id=626987898197&amp;cateId=50011740" target="_blank">竞品分析</a>
            </td>
        </tr>
        <tr class="ant-table-row oui-table-row-tree-node-2 ant-table-row-level-0" data-row-key="tree-node-2">
            <td class="">
                <span class="ant-table-row-indent indent-level-0" style="padding-left: 0px;"></span><!-- react-empty: 1377 --><div class="sycm-goods-td" style="width: 260px;"><a class="goodsImg pull-left" href="//detail.tmall.com/item.htm?id=629272537596" target="_blank" rel="noopener noreferrer" title="aj男鞋正品官网旗舰店官空军一号2020新款aj1莆田篮球高帮潮鞋男" style="width: 38px; height: 38px;"><img class="mediaObject" src="//img.alicdn.com/bao/uploaded/i1/2932519149/O1CN016qZWP32HSIFYZUvdB_!!0-item_pic.jpg_36x36.jpg"></a><div class="goodsInfo" style="width: 202px; max-height: 76px;"><p class="singleGoodsName"><a href="//detail.tmall.com/item.htm?id=629272537596" target="_blank" rel="noopener noreferrer" title="aj男鞋正品官网旗舰店官空军一号2020新款aj1莆田篮球高帮潮鞋男">aj男鞋正品官网旗舰店官空军一号2020新款aj1莆田篮球高帮潮鞋男</a></p><p class="goodsShopName" style="width: 202px;">较前一日</p></div></div>
            </td>
            <td class="">
                <div class="alife-dt-card-common-table-sortable-td alife-dt-card-common-table-cateRankId"><span class="alife-dt-card-common-table-sortable-value">28</span><span class="alife-dt-card-common-table-sortable-ratio-value"></span><div class="alife-dt-card-common-table-sortable-cycleCrc" style="margin-right: 0px;"><span style="color: red;">升3名</span></div><div class="alife-dt-card-common-table-sortable-syncCrc" style="margin-right: 0px;"></div></div>
            </td>
            <td class="">
                <div class="alife-dt-card-common-table-sortable-td alife-dt-card-common-table-tradeIndex"><span class="alife-dt-card-common-table-sortable-value">35,899</span><span class="alife-dt-card-common-table-sortable-ratio-value"></span><div class="alife-dt-card-common-table-sortable-cycleCrc" style="margin-right: 0px;"><span style="color: gray;">-7.61%</span></div><div class="alife-dt-card-common-table-sortable-syncCrc" style="margin-right: 0px;"></div></div>
            </td>
            <td class="alife-dt-card-common-table-right-column">
                <a href="/mc/ci/item/analysis?rivalItem1Id=629272537596&amp;cateId=50011740" target="_blank">竞品分析</a>
            </td>
        </tr>
    </tbody>

Parse and extract the data with regular expressions

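// Load the saved HTML fragment.
$html = file_get_contents("rival_goods.html");
// One result array per field; preg_match_all() fills each with [0] => full matches, [1] => captured values.
$vars = array('detailUrls'=>array(),'titles'=>array(), 'images'=>array());
// The U modifier makes .+ lazy, so each capture stops at the first delimiter that follows it.
preg_match_all("/(?:<p class=\"singleGoodsName\"><a href=\")(.+)(?:\" target=\".+\">.+<\/a>)/U", $html, $vars['detailUrls']);
preg_match_all("/(?:<img class=\"mediaObject\" src=\")(.+)(?:\">)/U", $html, $vars['images']);
preg_match_all("/(?:<p class=\"singleGoodsName\"><a .+>)(.+)(?:<\/a><\/p>)/U", $html, $vars['titles']);
print_r($vars);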

Extraction result

Array
(
    [detailUrls] => Array
        (
            [0] => Array
                (
                    [0] => <p class="singleGoodsName"><a href="//detail.tmall.com/item.htm?id=626987898197" target="_blank" rel="noopener noreferrer" title="秋冬季2020新款高帮男鞋潮流百搭运动休闲加绒保暖棉鞋老爹潮鞋">秋冬季2020新款高帮男鞋潮流百搭运动休闲加绒保暖棉鞋老爹潮鞋</a>
                    [1] => <p class="singleGoodsName"><a href="//detail.tmall.com/item.htm?id=629272537596" target="_blank" rel="noopener noreferrer" title="aj男鞋正品官网旗舰店官空军一号2020新款aj1莆田篮球高帮潮鞋男">aj男鞋正品官网旗舰店官空军一号2020新款aj1莆田篮球高帮潮鞋男</a>
                )

            [1] => Array
                (
                    [0] => //detail.tmall.com/item.htm?id=626987898197
                    [1] => //detail.tmall.com/item.htm?id=629272537596
                )

        )

    [titles] => Array
        (
            [0] => Array
                (
                    [0] => <p class="singleGoodsName"><a href="//detail.tmall.com/item.htm?id=626987898197" target="_blank" rel="noopener noreferrer" title="秋冬季2020新款高帮男鞋潮流百搭运动休闲加绒保暖棉鞋老爹潮鞋">秋冬季2020新款高帮男鞋潮流百搭运动休闲加绒保暖棉鞋老爹潮鞋</a></p>
                    [1] => <p class="singleGoodsName"><a href="//detail.tmall.com/item.htm?id=629272537596" target="_blank" rel="noopener noreferrer" title="aj男鞋正品官网旗舰店官空军一号2020新款aj1莆田篮球高帮潮鞋男">aj男鞋正品官网旗舰店官空军一号2020新款aj1莆田篮球高帮潮鞋男</a></p>
                )

            [1] => Array
                (
                    [0] => 秋冬季2020新款高帮男鞋潮流百搭运动休闲加绒保暖棉鞋老爹潮鞋
                    [1] => aj男鞋正品官网旗舰店官空军一号2020新款aj1莆田篮球高帮潮鞋男
                )

        )

    [images] => Array
        (
            [0] => Array
                (
                    [0] => <img class="mediaObject" src="//img.alicdn.com/bao/uploaded/i1/2074376818/O1CN01U3Rzep20Egzw0ViDL_!!2074376818-0-lubanu-s.jpg_36x36.jpg">
                    [1] => <img class="mediaObject" src="//img.alicdn.com/bao/uploaded/i1/2932519149/O1CN016qZWP32HSIFYZUvdB_!!0-item_pic.jpg_36x36.jpg">
                )

            [1] => Array
                (
                    [0] => //img.alicdn.com/bao/uploaded/i1/2074376818/O1CN01U3Rzep20Egzw0ViDL_!!2074376818-0-lubanu-s.jpg_36x36.jpg
                    [1] => //img.alicdn.com/bao/uploaded/i1/2932519149/O1CN016qZWP32HSIFYZUvdB_!!0-item_pic.jpg_36x36.jpg
                )

        )

)
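The output above keeps URLs, titles, and images in three parallel arrays. One possible follow-up step, sketched below, is to zip the captured values into one record per product. The function name extractGoods(), the trimmed sample markup, and the img.example.com URLs are hypothetical stand-ins so the snippet runs without rival_goods.html; the patterns are simplified versions of the ones used above, and the code assumes PHP 7 or later.

```php
<?php
// Trimmed, hypothetical stand-in for the contents of rival_goods.html.
$html = '<img class="mediaObject" src="//img.example.com/a.jpg">'
      . '<p class="singleGoodsName"><a href="//detail.tmall.com/item.htm?id=1" target="_blank" title="Shoe A">Shoe A</a></p>'
      . '<img class="mediaObject" src="//img.example.com/b.jpg">'
      . '<p class="singleGoodsName"><a href="//detail.tmall.com/item.htm?id=2" target="_blank" title="Shoe B">Shoe B</a></p>';

// Hypothetical helper: returns one associative array per product.
function extractGoods(string $html): array
{
    // Index 1 of each result holds the captured values, in document order.
    preg_match_all('/<p class="singleGoodsName"><a href="(.+)" target=/U', $html, $urls);
    preg_match_all('/<img class="mediaObject" src="(.+)">/U', $html, $images);
    preg_match_all('/<p class="singleGoodsName"><a .+>(.+)<\/a><\/p>/U', $html, $titles);

    $goods = [];
    foreach ($urls[1] as $i => $url) {
        $goods[] = [
            'detailUrl' => $url,
            'image'     => $images[1][$i] ?? null, // null if the counts differ
            'title'     => $titles[1][$i] ?? null,
        ];
    }
    return $goods;
}

$goods = extractGoods($html);
print_r($goods);
```

This pairing relies on every product having exactly one image, one link, and one title in the same order. If the markup is less regular, an HTML parser is likely a safer choice than regular expressions.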

PHP regular expressions (PCRE): https://www.runoob.com/php/php-pcre.html
Regular expressions tutorial: https://www.runoob.com/regexp/regexp-tutorial.html
Pattern modifiers in regular expressions explained (i, g, m, s, x, e): https://www.cnblogs.com/kevin-yuan/archive/2012/09/25/2702167.html

