
A Quick Test of Chaitin SafeLine WAF's “Dynamic Protection”

Anye


2024-06-03

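Preface

The SafeLine WAF Community Edition ships updates at a startling pace: a minor release almost every week and a major one every two months. The engineers are relentless, and it is getting hard to keep up with testing.

Enough preamble. A couple of days ago I read this article and got curious about SafeLine's “Dynamic Protection” feature, so I decided to try it out.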

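Installation and Deployment

This post focuses on evaluation, so I won't walk through deployment. My test environment is as follows:

VM1: OpenResty deployed via 1Panel, hosting the project Anyeの导航, IP 192.168.0.220

VM2: SafeLine WAF Community Edition, with the site added and “Dynamic Protection” enabled, IP 192.168.0.225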

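Testing

Copying the Page

This is how I usually replicate a theme: most often I pull the page's HTML, CSS, and JS files straight out of the browser's developer tools and rebuild the theme from them.

A page with SafeLine's Dynamic Protection enabled goes through a decryption step when it loads, which is really just JavaScript executing in the browser.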

HTML

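This step greatly lengthens the page load time; it comes to roughly 3 s.

Once the page has loaded, inspecting the elements shows the same DOM structure as the original, so the encryption does not distort the rendered page.

The page source is clearly encrypted, but the encryption also makes the index page much larger 🤣. Hopefully this can be optimized in later releases.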

JS

I also tried it on an encrypted JS file: the encrypted output returned is different on every request.
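One quick way to confirm that every response differs is to fetch the same script several times and compare digests. The sketch below is only an illustration: the helper names `sha256_hex` and `all_distinct` are made up for this example, and fetching the bodies (for instance with `urllib.request.urlopen` against the protected site) is left out so the snippet runs without a network.

```python
import hashlib
from typing import Iterable


def sha256_hex(body: bytes) -> str:
    """Return the SHA-256 digest of one response body as a hex string."""
    return hashlib.sha256(body).hexdigest()


def all_distinct(bodies: Iterable[bytes]) -> bool:
    """True if no two response bodies are byte-for-byte identical."""
    digests = [sha256_hex(body) for body in bodies]
    return len(set(digests)) == len(digests)
```

Passing in three bodies downloaded from the same JS URL and getting `True` back would match the behaviour described above.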

It has clearly been obfuscated, but a text diff against the original gave the game away.

The full JS code is reproduced here:
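The comparison can be approximated in a few lines of Python. This is a rough sketch, not the tool I used: `normalize` and `surviving_fraction` are hypothetical helpers, and the normalization rules (drop trailing semicolons, treat `const`/`let` as `var`) are assumptions based on the differences visible in the two listings.

```python
import re


def normalize(line: str) -> str:
    """Reduce a source line to a form that survives the cosmetic changes."""
    line = line.strip().rstrip(";").strip()
    # The obfuscated output uses `var` where the original used const/let.
    line = re.sub(r"^(const|let)\b", "var", line)
    return re.sub(r"\s+", " ", line)


def surviving_fraction(original: str, obfuscated: str) -> float:
    """Fraction of non-empty original lines found verbatim in the output."""
    obfuscated_lines = {normalize(l) for l in obfuscated.splitlines()}
    original_lines = [normalize(l) for l in original.splitlines()]
    original_lines = [l for l in original_lines if l]
    if not original_lines:
        return 0.0
    kept = sum(1 for l in original_lines if l in obfuscated_lines)
    return kept / len(original_lines)
```

A high fraction means most of the original statements pass through unchanged and only extra code has been added around them.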

// Original JS file
/*! * Lazy Load - JavaScript plugin for lazy loading images * * Copyright (c) 2007-2017 Mika Tuupola * * Licensed under the MIT license: *   http://www.opensource.org/licenses/mit-license.php * * Project home: *   https://appelsiini.net/projects/lazyload * * Version: 2.0.0-beta.2 * */  
(function(root, factory) {  
    if (typeof exports === "object") {  
        module.exports = factory(root);  
    } else if (typeof define === "function" && define.amd) {  
        define([], factory(root));  
    } else {  
        root.LazyLoad = factory(root);  
    }  
}  
)(typeof global !== "undefined" ? global : this.window || this.global, function(root) {  
    "use strict";  
    const defaults = {  
        src: "data-src",  
        srcset: "data-srcset",  
        selector: ".lazyload"  
    };  
    /** * Merge two or more objects. Returns a new object. * @private * @param {Boolean}  deep     If true, do a deep (or recursive) merge [optional] * @param {Object}   objects  The objects to merge together * @returns {Object}          Merged values of defaults and options */  
    const extend = function() {  
        let extended = {};  
        let deep = false;  
        let i = 0;  
        let length = arguments.length;  
        /* Check if a deep merge */  
        if (Object.prototype.toString.call(arguments[0]) === "[object Boolean]") {  
            deep = arguments[0];  
            i++;  
        }  
        /* Merge the object into the extended object */  
        let merge = function(obj) {  
            for (let prop in obj) {  
                if (Object.prototype.hasOwnProperty.call(obj, prop)) {  
                    /* If deep merge and property is an object, merge properties */  
                    if (deep && Object.prototype.toString.call(obj[prop]) === "[object Object]") {  
                        extended[prop] = extend(true, extended[prop], obj[prop]);  
                    } else {  
                        extended[prop] = obj[prop];  
                    }  
                }  
            }  
        };  
        /* Loop through each object and conduct a merge */  
        for (; i < length; i++) {  
            let obj = arguments[i];  
            merge(obj);  
        }  
        return extended;  
    };  
    function LazyLoad(images, options) {  
        this.settings = extend(defaults, options || {});  
        this.images = images || document.querySelectorAll(this.settings.selector);  
        this.observer = null;  
        this.init();  
    }  
    LazyLoad.prototype = {  
        init: function() {  
            /* Without observers load everything and bail out early. */  
            if (!root.IntersectionObserver) {  
                this.loadImages();  
                return;  
            }  
            let self = this;  
            let observerConfig = {  
                root: null,  
                rootMargin: "0px",  
                threshold: [0]  
            };  
            this.observer = new IntersectionObserver(function(entries) {  
                entries.forEach(function(entry) {  
                    if (entry.intersectionRatio > 0) {  
                        self.observer.unobserve(entry.target);  
                        self.loadImage(entry.target);  
                    }  
                });  
            }  
            ,observerConfig);  
            this.images.forEach(function(image) {  
                self.observer.observe(image);  
            });  
        },  
        loadAndDestroy: function() {  
            if (!this.settings) {  
                return;  
            }  
            this.loadImages();  
            this.destroy();  
        },  
        loadImage: function(image) {  
            image.onerror = function() {  
                image.onerror = null;  
                image.src = image.srcset = image.dataset.original;  
            }  
            ;  
            let src = image.getAttribute(this.settings.src);  
            let srcset = image.getAttribute(this.settings.srcset);  
            if ("img" === image.tagName.toLowerCase()) {  
                if (src) {  
                    image.dataset.original = image.src;  
                    image.src = src;  
                }  
                if (srcset) {  
                    image.srcset = srcset;  
                }  
            } else {  
                image.style.backgroundImage = "url(" + src + ")";  
            }  
        },  
        loadImages: function() {  
            if (!this.settings) {  
                return;  
            }  
            let self = this;  
            this.images.forEach(function(image) {  
                self.loadImage(image);  
            });  
        },  
        destroy: function() {  
            if (!this.settings) {  
                return;  
            }  
            this.observer.disconnect();  
            this.settings = null;  
        }  
    };  
    root.lazyload = function(images, options) {  
        return new LazyLoad(images,options);  
    }  
    ;  
    if (root.jQuery) {  
        const $ = root.jQuery;  
        $.fn.lazyload = function(options) {  
            options = options || {};  
            options.attribute = options.attribute || "data-src";  
            new LazyLoad($.makeArray(this),options);  
            return this;  
        }  
        ;  
    }  
    return LazyLoad;  
});
// JS file after Dynamic Protection encryption
function vgo8rYXzpS() {  
    var YIhUo91Nlh = 99.6174697329428;  
    while (YIhUo91Nlh < 6) {  
        YIhUo91Nlh++  
    }  
    var kJsBQ2iTCw = 77.7991427720637;  
    while (kJsBQ2iTCw < 8) {  
        kJsBQ2iTCw++  
    }  
    var Uv8SujYUUJ = 54.122410119766634;  
    62.94717341414315 + 14.215159769026501;  
    "eCDkWHqKcu";  
    20.29250300507593 + 96.90578776550426;  
    var hKDl2Z6IyR = 2.1154780179250436;  
    while (hKDl2Z6IyR < 9) {  
        hKDl2Z6IyR++  
    }  
    var jJJdYPyWC8 = 96.35369160356686;  
    while (jJJdYPyWC8 < 10) {  
        jJJdYPyWC8++  
    }  
    var q1lUq8lALI = 79.3826780702858;  
    var KSm4kSmK5Q = 16.811363665066132;  
    while (KSm4kSmK5Q < 5) {  
        KSm4kSmK5Q++  
    }  
    while (q1lUq8lALI < 5) {  
        q1lUq8lALI++  
    }  
    var k8jxtioSu1 = 46.12863667479478;  
    if (k8jxtioSu1 < 50)  
        VdgkMuAloP("dbKMKN3DiD");  
    else  
        VdgkMuAloP("Z_GUlDIf7g");  
    32.61116098968565 + 39.92340222133316;  
    var WOIqRFoBWI = 35.570788142150256;  
    while (WOIqRFoBWI < 10) {  
        WOIqRFoBWI++  
    }  
    var REP52ajkkB = 68.57029249635578;  
    while (REP52ajkkB < 9) {  
        REP52ajkkB++  
    }  
    var UvGT8ugsmm = 77.45257249038768;  
    var c_XLMPoMhw = 70.0508383263844;  
    var oXNng_nyI3 = 61.714023740614785;  
    "f3dzUmlSrt";  
    while (oXNng_nyI3 < 8) {  
        oXNng_nyI3++  
    }  
    "oiou9de1Yg";  
    "jdpOma9ApF";  
    var NeReO5OH2M = 63.89278655453103;  
    while (NeReO5OH2M < 8) {  
        NeReO5OH2M++  
    }  
    var _p_ydR_UZY = 83.88263735619535;  
    var F85mcn2g_m = 17.165604886412726;  
    while (F85mcn2g_m < 10) {  
        F85mcn2g_m++  
    }  
    24.428701219017995 + 36.33105120927406;  
    var E_btPRjrmk = 95.02151619364821;  
    var No5m6438qj = 4.1049208686863246;  
    while (No5m6438qj < 9) {  
        No5m6438qj++  
    }  
    while (E_btPRjrmk < 6) {  
        E_btPRjrmk++  
    }  
    var NW66eJHW18 = 80.32092123501981;  
    if (NW66eJHW18 < 50)  
        VdgkMuAloP("wVsjIS9XQo");  
    else  
        VdgkMuAloP("GoLW5hTcVj");  
    43.682142399473555 + 22.477837399452866;  
    var UNuZiogsXq = 70.37483244640134;  
    while (UNuZiogsXq < 6) {  
        UNuZiogsXq++  
    }  
    var JyjoRvjvV4 = 60.33084553561216;  
    "PPLO3pqrCR";  
    function VdgkMuAloP() {  
        "CCutBlYuiL";  
        "MgmL5Sv_33";  
        19.82880014765916 + 66.83450544153038;  
        var eH8bW0LeRO = 91.84505170089825;  
        var GGEb99P0LW = 92.75726773787632;  
        "eM_DBnNLNQ";  
        "LTExkL39fU";  
        var ayFeMZ7J9o = 4.08422739984733;  
        var rvzNYoM37B = 42.468405912837106;  
        while (rvzNYoM37B < 6) {  
            rvzNYoM37B++  
        }  
        "YLovxab17O";  
        var dpUNCcw57i = 4.9146145098517575;  
        while (dpUNCcw57i < 10) {  
            dpUNCcw57i++  
        }  
        var ZjzssshCHy = 22.711581319339555;  
        "lsyj2Pu6bi";  
        "mAdio22F97";  
        95.9152148251555 + 18.563789346616783;  
        "Kisq7F_TOW";  
        var EO9rGZSTTK = 53.3184198670574;  
        while (EO9rGZSTTK < 6) {  
            EO9rGZSTTK++  
        }  
        var Lfrg2SayBj = 96.40296951052316;  
        var SR4gkdmFPm = 24.691037844119176;  
        while (SR4gkdmFPm < 7) {  
            SR4gkdmFPm++  
        }  
        "njAS_NShim";  
        "DIoi_JwNCk";  
        var qZALlgtAos = 65.84687374547939;  
        while (qZALlgtAos < 5) {  
            qZALlgtAos++  
        }  
        "i8UnwEQqP2";  
        var mkzN8inJtT = 89.67717243925355;  
        "EXgVlnAkaM";  
        var HGgVbs9bD5 = 56.50704313244045;  
        var myQFrz2kY4 = 54.55344568694437;  
        while (myQFrz2kY4 < 8) {  
            myQFrz2kY4++  
        }  
        var VcC388Sonl = 78.22901590625897;  
        while (VcC388Sonl < 5) {  
            VcC388Sonl++  
        }  
        var EDQM5T4i5x = 58.12080899105871;  
        while (EDQM5T4i5x < 8) {  
            EDQM5T4i5x++  
        }  
        var lS3HRC8N0e = 42.29800294748819;  
        while (lS3HRC8N0e < 5) {  
            lS3HRC8N0e++  
        }  
    }  
}  
(function(that, a) {  
    var checkF = new RegExp("\\w *\\(\\){.+}");  
    "Saz5menkPn";  
    var checkR = new RegExp("(\\[x|u](\\w){2,4})+");  
    var checkFunction = function checkFunction1() {  
        if (checkR.test(checkFunction.toString())) {  
            f2([2, 15, 12])  
        }  
        ;return '\x63\x68\x65\x63\x6b\x46\x75\x6e\x63\x74\x69\x6f\x6e'  
    };  
    var f1 = function f1(a) {  
        a.push[a];  
        f2(a)  
    };  
    var f2 = function f2(a) {  
        a.push[a];  
        f1(a)  
    };  
    if (!checkF.test(checkFunction.toString())) {  
        f2([])  
    } else if (checkR.test(checkFunction.toString())) {  
        f2([1, 3, 7])  
    }  
    ;"ISXAG9bapu";  
    var KOaO2Jk15j = 13.366279497772231;  
    KOaO2Jk15j = 65.44109390187671;  
    return a(that);  
    function f5OTtZ1pUr() {  
        "PFAZrkUkjJ";  
        "yohosCBZku";  
        "czcc_QG98P";  
        var Y7ZMWHKbB5 = 54.61165307973092;  
        while (Y7ZMWHKbB5 < 7) {  
            Y7ZMWHKbB5++  
        }  
        "RAAE3i3HNJ";  
        var yYJU5WMbNs = 81.78463513401672;  
        var Dsj0YbE3nh = 60.12962678573978;  
        while (Dsj0YbE3nh < 8) {  
            Dsj0YbE3nh++  
        }  
        "OrNvH3Vm8U";  
        var FcVeaK_8CJ = 70.65213865609662;  
        var V10P1fXl1e = 93.28700416475893;  
        while (V10P1fXl1e < 7) {  
            V10P1fXl1e++  
        }  
        var _jQaUeEOlz = 50.09958863458343;  
        while (_jQaUeEOlz < 6) {  
            _jQaUeEOlz++  
        }  
        "mkyDT6LuXp";  
        "i_d0Jej01W";  
        93.1178573977863 + 65.0171586053574;  
        "MdcdXdZD8e";  
        var FvwOTr68cW = 63.96686898919228;  
        while (FvwOTr68cW < 5) {  
            FvwOTr68cW++  
        }  
        12.007514883700402 + 67.83201664204582;  
        var lVk87OnDY0 = 10.352772574019035;  
        while (lVk87OnDY0 < 10) {  
            lVk87OnDY0++  
        }  
        "ZIxYt7RDz5";  
        var HOjmKYxZYn = 73.07394273998264;  
        var xf5LYnXM_h = 34.42670716048105;  
        while (xf5LYnXM_h < 7) {  
            xf5LYnXM_h++  
        }  
        25.31980979829108 + 70.92299314386324;  
        53.64987099085665 + 15.95767193794893;  
        81.04615728361688 + 53.03190420900158;  
        "xMyhHs6tqa";  
        var CicbLkYxKL = 71.43209809856342;  
        while (CicbLkYxKL < 9) {  
            CicbLkYxKL++  
        }  
        81.82768180662697 + 44.54696909044475;  
        "a7S1hc6l6e";  
        40.02457556515699 + 50.13884740950273  
    }  
}  
)(this, function(that) {  
    var EfanXdEsAo = 45.044183852209066;  
    var YiD8rJkjM4 = 60.40560906974519;  
    while (YiD8rJkjM4 < 9) {  
        YiD8rJkjM4++  
    }  
    var SFXCSJnYT5 = 5.6590674829357;  
    function DROWk3baLH() {  
        var zJ2eGCAqG6 = 88.73517894514487;  
        while (zJ2eGCAqG6 < 7) {  
            zJ2eGCAqG6++  
        }  
        "fVZRgjskF7";  
        var imVKfLWElR = 98.4392536479853;  
        var JRZm9ZRXt9 = 65.52198475056669;  
        "B07bldh2rt";  
        94.61531928891331 + 21.79165407508193;  
        "H7umcmyF_g";  
        var kUwyNiUzjX = 48.5975080540464;  
        while (kUwyNiUzjX < 5) {  
            kUwyNiUzjX++  
        }  
        var G5q4i5ptYT = 17.20767169078439;  
        while (G5q4i5ptYT < 9) {  
            G5q4i5ptYT++  
        }  
        "S0RQAJX7ZD";  
        var sbwEsBL3on = 31.621188048769;  
        var Rbqnn2M5lo = 1.6814855430946412;  
        var wXdYUzqLKS = 53.13735728383625;  
        var iIMDA_Qowp = 87.59602310423611;  
        55.899852486295956 + 23.463153124052145;  
        49.64055650210554 + 21.699124979927305  
    }  
});  
(function(root, factory) {  
    if (typeof exports === "object") {  
        module.exports = factory(root)  
    } else if (typeof define === "function" && define.amd) {  
        define([], factory(root))  
    } else {  
        root.LazyLoad = factory(root)  
    }  
    20.075305793669145 + 96.72088665263502;  
    function mCbgSrID5z() {  
        var ymZDdeQQne = 84.05118287547904;  
        while (ymZDdeQQne < 8) {  
            ymZDdeQQne++  
        }  
        "oeSlUltLf7";  
        var vRcSMA7HZy = 95.63940228416716;  
        while (vRcSMA7HZy < 5) {  
            vRcSMA7HZy++  
        }  
        42.847416356461686 + 64.228599747433;  
        var S4X325xZd0 = 36.8477478357082;  
        var OjdsnE6IgU = 54.6819719737484;  
        while (OjdsnE6IgU < 7) {  
            OjdsnE6IgU++  
        }  
        var tsdIhKV6Tu = 21.849203694513204;  
        "IBEZHnHB9P";  
        var W1xUDNOclb = 97.45176245010938;  
        var qBhKSMePxI = 53.918246237085604;  
        while (qBhKSMePxI < 8) {  
            qBhKSMePxI++  
        }  
        var tG5h3fCZpA = 8.9527278684316;  
        1.8795339533222326 + 85.30147367116075;  
        3.4838274267666733 + 52.70631782675951;  
        var sglVSvKjZv = 13.453736652916252;  
        while (sglVSvKjZv < 9) {  
            sglVSvKjZv++  
        }  
        var bzJ8IfE03K = 72.4963090140686;  
        var IzYpLOgN6D = 1.7126081902147487;  
        86.90125410102027 + 60.096220564929666;  
        55.32420449194843 + 93.21714769547813;  
        15.44123941754805 + 88.74042551968007;  
        "ayjAX7QOFR";  
        "nsviM21tO7";  
        "RGvq8LBnOO";  
        2.5365166268296333 + 58.41895276641477;  
        var NwT9TZgChj = 3.2736264569624316;  
        "ZVRMQfyCrJ";  
        "fwutOlKiEI";  
        var Ejy2yBkKAt = 51.83509013431559;  
        while (Ejy2yBkKAt < 7) {  
            Ejy2yBkKAt++  
        }  
        92.07691206149254 + 13.437580090223227;  
        "qdd7jYm20k";  
        var j2KtWorODN = 42.14264503067741;  
        2.478519996620122 + 58.627727544483704  
    }  
}  
)(typeof global !== "undefined" ? global : this.window || this.global, function(root) {  
    "use strict";  
    var defaults = {  
        src: "data-src",  
        srcset: "data-srcset",  
        selector: ".lazyload"  
    };  
    var extend = function extend1() {  
        var extended = {};  
        var deep = false;  
        var i = 0;  
        var length = arguments.length;  
        if (Object.prototype.toString.call(arguments[0]) === "[object Boolean]") {  
            deep = arguments[0];  
            i++  
        }  
        var merge = function merge(obj) {  
            for (var prop in obj) {  
                if (Object.prototype.hasOwnProperty.call(obj, prop)) {  
                    if (deep && Object.prototype.toString.call(obj[prop]) === "[object Object]") {  
                        extended[prop] = extend(true, extended[prop], obj[prop])  
                    } else {  
                        extended[prop] = obj[prop]  
                    }  
                }  
            }  
        };  
        for (; i < length; i++) {  
            var obj = arguments[i];  
            merge(obj)  
        }  
        return extended  
    };  
    function LazyLoad(images, options) {  
        this.settings = extend(defaults, options || {});  
        this.images = images || document.querySelectorAll(this.settings.selector);  
        this.observer = null;  
        this.init()  
    }  
    LazyLoad.prototype = {  
        init: function init() {  
            if (!root.IntersectionObserver) {  
                this.loadImages();  
                return  
            }  
            var self = this;  
            var observerConfig = {  
                root: null,  
                rootMargin: "0px",  
                threshold: [0]  
            };  
            this.observer = new IntersectionObserver(function(entries) {  
                entries.forEach(function(entry) {  
                    if (entry.intersectionRatio > 0) {  
                        self.observer.unobserve(entry.target);  
                        self.loadImage(entry.target)  
                    }  
                })  
            }  
            ,observerConfig);  
            this.images.forEach(function(image) {  
                self.observer.observe(image)  
            })  
        },  
        loadAndDestroy: function loadAndDestroy() {  
            if (!this.settings) {  
                return  
            }  
            this.loadImages();  
            this.destroy()  
        },  
        loadImage: function loadImage(image) {  
            image.onerror = function() {  
                image.onerror = null;  
                image.src = image.srcset = image.dataset.original  
            }  
            ;  
            var src = image.getAttribute(this.settings.src);  
            var srcset = image.getAttribute(this.settings.srcset);  
            if ("img" === image.tagName.toLowerCase()) {  
                if (src) {  
                    image.dataset.original = image.src;  
                    image.src = src  
                }  
                if (srcset) {  
                    image.srcset = srcset  
                }  
            } else {  
                image.style.backgroundImage = "url(" + src + ")"  
            }  
        },  
        loadImages: function loadImages() {  
            if (!this.settings) {  
                return  
            }  
            var self = this;  
            this.images.forEach(function(image) {  
                self.loadImage(image)  
            })  
        },  
        destroy: function destroy() {  
            if (!this.settings) {  
                return  
            }  
            this.observer.disconnect();  
            this.settings = null  
        }  
    };  
    root.lazyload = function(images, options) {  
        return new LazyLoad(images,options)  
    }  
    ;  
    if (root.jQuery) {  
        var $ = root.jQuery;  
        $.fn.lazyload = function(options) {  
            options = options || {};  
            options.attribute = options.attribute || "data-src";  
            new LazyLoad($.makeArray(this),options);  
            return this  
        }  
    }  
    98.21293314139757 + 7.022202427695869;  
    return LazyLoad;  
    var zb5YrDEzX8 = 49.67145292566205;  
    function EMQ2TiywJ2() {  
        "kDDG4hcurX";  
        "LmMQDl5Guf";  
        "H1d2hSNdZu";  
        "vR3uU0dztV";  
        "BYg6Cwwew1";  
        "Z7Cgb85The";  
        var vqCn2LKHiQ = 79.8849526204871;  
        while (vqCn2LKHiQ < 6) {  
            vqCn2LKHiQ++  
        }  
        "HzKhtzFo0S";  
        var Ruo3QF3HKv = 57.10873587557603;  
        while (Ruo3QF3HKv < 8) {  
            Ruo3QF3HKv++  
        }  
        var NIVPEUabT_ = 25.838978101412078;  
        var pmvJIgXrA7 = 41.71629707156116;  
        while (pmvJIgXrA7 < 7) {  
            pmvJIgXrA7++  
        }  
        var cTqhNJmnqp = 7.679109504729966;  
        var Xkldnu7eiS = 89.26080892492617;  
        while (Xkldnu7eiS < 8) {  
            Xkldnu7eiS++  
        }  
        var Iq6adjqOVj = 86.04734679658776;  
        while (Iq6adjqOVj < 7) {  
            Iq6adjqOVj++  
        }  
        var TyC7F0mXPj = 57.405394830228786;  
        while (TyC7F0mXPj < 7) {  
            TyC7F0mXPj++  
        }  
        var q7oS54FoCf = 19.715578974920984;  
        6.354381716758419 + 48.514464467999424;  
        var JPOOCo51Cg = 94.50513995137923;  
        18.85838453981073 + 55.22787281970704;  
        "rUww9Es4UQ"  
    }  
});

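From this, the scheme appears to use:

  • Identifier renaming: variable and function names are meaningless random strings such as vgo8rYXzpS and YIhUo91Nlh. In this sample those names belong to the injected code, while the original identifiers (defaults, extend, LazyLoad) are left intact.

  • String obfuscation: strings in the code are rewritten into a hard-to-read form, for example the hex-escaped string returned by checkFunction.

  • Control-flow changes: meaningless loops and conditionals are added, so the execution path looks more complicated than it is.

  • Code splitting: the code is split into several parts that are combined and executed dynamically.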

So it is still fairly easy to deobfuscate. I look forward to it being strengthened in later releases.
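To give a sense of how little effort that takes, here is a minimal sketch that strips the most obvious junk with regular expressions. It is a heuristic built only from the patterns visible in the sample above (10-character bare string statements, bare numeric sums, and loops that only increment their own counter); `strip_junk` is a made-up name, and a serious tool would work on an AST rather than on text.

```python
import re

# A bare 10-character string statement such as "eCDkWHqKcu";
# ("use strict" contains a space, so it is not matched).
JUNK_STRING = re.compile(r'^\s*"[A-Za-z0-9_]{10}";?\s*$')
# A bare numeric sum such as 62.947 + 14.215;
JUNK_SUM = re.compile(r'^\s*\d+(?:\.\d+)?\s*\+\s*\d+(?:\.\d+)?;?\s*$')
# A loop that does nothing but increment the variable it tests.
JUNK_LOOP = re.compile(r'while\s*\(\s*(\w+)\s*<\s*\d+\s*\)\s*\{\s*\1\+\+\s*\}\s*')


def strip_junk(js: str) -> str:
    """Remove junk statements matching the three heuristics above."""
    js = JUNK_LOOP.sub("", js)
    kept = [
        line for line in js.splitlines()
        if not JUNK_STRING.match(line) and not JUNK_SUM.match(line)
    ]
    return "\n".join(line for line in kept if line.strip())
```

The unused `var` declarations are left in place, because a regex cannot tell whether a variable is referenced elsewhere.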

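Scraping with Python

According to the official documentation, Dynamic Protection is meant to better prevent crawlers and automated attack tools from analyzing a site, so let's write some Python to test scraping the HTML content.

For example, to scrape this site's navigation page, using Microsoft's playwright library: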

import asyncio  
from playwright.async_api import async_playwright  
  
async def scrape_data():  
    async with async_playwright() as p:  
        browser = await p.chromium.launch(headless=True)  
        page = await browser.new_page()  
  
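        # Load the page (this is the origin, VM1; use 192.168.0.225 to go through SafeLine)
        await page.goto('http://192.168.0.220')

        # Wait for the page to finish loading
        await page.wait_for_load_state('networkidle')

        # Extract links, icons, and descriptions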
        items = await page.query_selector_all('.list-item.block')  
        data = []  
  
        for item in items:  
            link = await item.query_selector('a.list-content')  
            if link:  
                href = await link.get_attribute('href')  
                title = await link.query_selector('.list-title')  
                desc = await link.query_selector('.list-desc')  
                img = await item.query_selector('img')  
  
                data.append({  
                    'link': href,  
                    'title': await title.inner_text() if title else None,  
                    'desc': await desc.inner_text() if desc else None,  
                    'icon': await img.get_attribute('src') if img else None,  
                })  
  
        await browser.close()  
        return data  
  
# Run
async def main():  
    data = await scrape_data()  
    for entry in data:  
        print(entry)  
  
asyncio.run(main())

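Output from scraping the origin site:

With SafeLine protection:

The effect is obvious. Still, it feels like there is room to bypass it. What about running the browser in headed mode?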

import asyncio  
from playwright.async_api import async_playwright  
  
async def scrape_data():  
    async with async_playwright() as p:  
        # Specify the browser path and enable headed mode
        browser = await p.chromium.launch(
            headless=False,  # False shows the browser window
            executable_path="C:\\Users\\Anye\\AppData\\Local\\Chromium\\Application\\chrome.exe"  
        )  
        page = await browser.new_page()  
  
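        # Load the page on the local server
        await page.goto('http://192.168.0.225/')

        # Handle the human verification manually
        print("Please complete the human verification on the page manually...")
        await page.wait_for_selector('.list-item.block', timeout=0)  # Wait for the content to appear; timeout=0 disables the timeout

        # Extract links, icons, and descriptions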
        items = await page.query_selector_all('.list-item.block')  
        data = []  
  
        for item in items:  
            link = await item.query_selector('a.list-content')  
            if link:  
                href = await link.get_attribute('href')  
                title = await link.query_selector('.list-title')  
                desc = await link.query_selector('.list-desc')  
                img = await item.query_selector('img')  
  
                data.append({  
                    'link': href,  
                    'title': await title.inner_text() if title else None,  
                    'desc': await desc.inner_text() if desc else None,  
                    'icon': await img.get_attribute('src') if img else None,  
                })  
  
        await browser.close()  
        return data  
  
# Run
async def main():  
    data = await scrape_data()  
    for entry in data:  
        print(entry)  
  
asyncio.run(main())

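Result: the content was fetched successfully.

In that case, can't we simply fetch the content once decryption has finished? Let's try waiting 5 seconds.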

import asyncio  
from playwright.async_api import async_playwright  
  
async def scrape_data():  
    async with async_playwright() as p:  
        browser = await p.chromium.launch(headless=True)  
        page = await browser.new_page()  
  
        # Load the page
        await page.goto('http://192.168.0.225')

        # Wait for the page to finish loading
        await page.wait_for_load_state('networkidle')

        # Wait 5 seconds
        await page.wait_for_timeout(5000)

        # Extract links, icons, and descriptions
        items = await page.query_selector_all('.list-item.block')  
        data = []  
  
        for item in items:  
            link = await item.query_selector('a.list-content')  
            if link:  
                href = await link.get_attribute('href')  
                title = await link.query_selector('.list-title')  
                desc = await link.query_selector('.list-desc')  
                img = await item.query_selector('img')  
  
                data.append({  
                    'link': href,  
                    'title': await title.inner_text() if title else None,  
                    'desc': await desc.inner_text() if desc else None,  
                    'icon': await img.get_attribute('src') if img else None,  
                })  
  
        await browser.close()  
        return data  
  
# Run
async def main():  
    data = await scrape_data()  
    for entry in data:  
        print(entry)  
  
asyncio.run(main())

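Fetched successfully.

SafeLine developers, please don't hit me; HTML really is hard to encrypt.

That said, for now it does block most crawlers: an ordinary crawler won't wait long for a page to load, nor will it run in headed mode.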
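To illustrate why a crawler that never executes JavaScript comes away empty-handed, here is a small sketch that counts `.list-item.block` elements in raw HTML using only the standard library. The two sample strings are hypothetical stand-ins, not captures from my site; in particular, the shape of the protected response is an assumption, since the real encrypted markup is not reproduced in this post.

```python
from html.parser import HTMLParser


class ClassCounter(HTMLParser):
    """Count start tags whose class attribute contains all required classes."""

    def __init__(self, required):
        super().__init__()
        self.required = set(required)
        self.count = 0

    def handle_starttag(self, tag, attrs):
        classes = set((dict(attrs).get("class") or "").split())
        if self.required <= classes:
            self.count += 1


def count_items(raw_html: str, required=("list-item", "block")) -> int:
    """Number of matching elements visible without running any script."""
    parser = ClassCounter(required)
    parser.feed(raw_html)
    return parser.count


# Hypothetical samples (assumptions, not real captures).
ORIGIN_HTML = '<div class="list-item block"><a class="list-content" href="/a">A</a></div>'
PROTECTED_HTML = "<html><body><script>/* encrypted payload */</script></body></html>"
```

With these samples, `count_items(ORIGIN_HTML)` finds the item while `count_items(PROTECTED_HTML)` finds nothing, which is the situation a plain HTTP crawler is in until the page's script has run.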

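Testing++

Further testing shows that with Human Verification enabled, crawlers are effectively blocked from fetching the content.

Still, I hope the SafeLine developers keep working on strengthening the “Dynamic Protection” algorithm 😘.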

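Afterword

Don't let hackers take a single step past SafeLine.

This test was carried out in an internal test environment; do not use it for attacks. SafeLine will also strengthen its encryption algorithm in the future to protect web security.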
