Layoff Countdown: A Record of gpt4 Outworking Me to My Face

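To be honest, I was initially skeptical about how effective gpt would be in security work, or more specifically in code review, because my earlier tests with gpt3.5 led to the conclusion that it was somewhat useful but not very. Today I had my colleague 滴滴 put the same questions to gpt4 for me. The test was simple: 1 code demo and 6 rounds of questions. The result shocked me and stung: this thing outworked me to my face. The test process follows.

Test code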

```python
from flask import Flask
from flask import request
from jinja2.sandbox import SandboxedEnvironment
from jinja2 import Environment
from jinja2 import Template
import uuid

app = Flask(__name__)


class MyTemplate(Template):
    def xiaoc(self):
        print("xiaoc called of MyTemplate")

class MyEnvironment(Environment):
    def xiaoc(self):
        print("xiaoc called of MyEnvironment")

class MySandboxedEnvironment(SandboxedEnvironment):
    def xiaoc(self):
        print("xiaoc called of SandboxedEnvironment")

@app.route('/test1', methods=['GET'])
def test1():
    tpl = request.args.get("tpl")
    template = Template(tpl)
    return template.render()

@app.route('/test11', methods=['GET'])
def test11():
    tpl = request.args.get("tpl")
    return Template(tpl).render()

@app.route('/test2', methods=['GET'])
def test2():
    tpl = request.args.get("tpl")
    env = Environment()
    template = env.from_string(tpl)
    return template.render()

@app.route('/test21', methods=['GET'])
def test21():
    tpl = request.args.get("tpl")
    template = Environment().from_string(tpl)
    return template.render()

@app.route('/test22', methods=['GET'])
def test22():
    tpl = request.args.get("tpl")
    env = Environment()
    return env.from_string(tpl).render()

@app.route('/test23', methods=['GET'])
def test23():
    tpl = request.args.get("tpl")
    return Environment().from_string(tpl).render()

@app.route('/test3', methods=['GET'])
def test3():
    tpl = request.args.get("tpl")
    env = SandboxedEnvironment()
    template = env.from_string(tpl)
    return template.render()

@app.route('/test4', methods=['GET'])
def test4():
    tpl = request.args.get("tpl")
    kwargs = {}
    kwargs.update({"uuid": uuid})
    env = SandboxedEnvironment()
    template = env.from_string(tpl)
    return template.render(kwargs)

@app.route('/test5', methods=['GET'])
def test5():
    tpl = request.args.get("tpl")
    return MyTemplate(tpl).render()

@app.route('/test6', methods=['GET'])
def test6():
    tpl = request.args.get("tpl")
    return MyEnvironment().from_string(tpl).render()

@app.route('/test7', methods=['GET'])
def test7():
    tpl = request.args.get("tpl")
    kwargs = {}
    kwargs.update({"uuid": uuid})
    return MySandboxedEnvironment().from_string(tpl).render(kwargs)

if __name__ == '__main__':
    app.run()
```

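The test code is a jinja2 rce model built on flask as the web base. flask and jinja2 are each something like the semi-official framework of their field, so they are popular enough. Each route function represents one rce-related scenario, API, or coding style: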

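test1: direct injection, Template#render, executed in separate steps
test11: direct injection, Template#render, one-liner
test2: direct injection, Environment#from_string#render, executed in 3 steps
test21: direct injection, Environment#from_string#render, executed in 2 steps (first 2 merged)
test22: direct injection, Environment#from_string#render, executed in 2 steps (last 2 merged)
test23: direct injection, Environment#from_string#render, one-liner
test3: not injectable, sandbox
test4: sandbox bypass, injectable
test5: inherits from a non-sandbox class, direct injection, MyTemplate#render, one-liner
test6: inherits from a non-sandbox class, direct injection, MyEnvironment#from_string#render, one-liner
test7: inherits from the sandbox, sandbox bypass, MySandboxedEnvironment#from_string#render, one-liner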

Design logic of the test code:

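The test code was originally part of my tests for codeql rules (the original test also had a cross-file scenario). codeql is the most comprehensive and best-balanced code review tool I know of, so one purpose here is to compare how the two perform today.
Neither the sources nor the sinks under test are builtins. I picked two popular open-source frameworks instead; testing builtins is not very interesting, given how much the ai has learned.
The data flow is fairly simple, all local flow & parameter passing, which cancels out codeql's advantage of having a large number of rules that connect data flow through builtins.
The different step-splitting scenarios are there because splitting has a fairly large effect on ast parsing. Anyone who has designed around an ast will understand, although I don't know whether gpt uses one.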

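Cases like test4 and test7 are there to tune the difficulty. I don't expect gpt to work them out directly; if it could, I might as well be laid off on the spot.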
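To make the step-splitting point concrete, here is a minimal sketch of a toy taint check over Python's `ast` module. It is not how codeql or gpt works; the helper names (`dotted`, `is_tainted`, `render_sinks`) and the propagation rules are assumptions made up for illustration. It treats `request.args.get(...)` as the source and a `.render()` call on a tainted receiver as the sink, and it has to cope with both the split and the chained styles because they produce differently shaped trees.

```python
import ast

def dotted(node):
    """Return 'a.b.c' for a Name/Attribute chain, else None."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return None

def is_tainted(node, tainted):
    """Toy rule set: sources, tainted names, and calls fed by tainted data."""
    if isinstance(node, ast.Name):
        return node.id in tainted
    if isinstance(node, ast.Call):
        if dotted(node.func) == "request.args.get":   # the source
            return True
        if any(is_tainted(arg, tainted) for arg in node.args):
            return True                               # e.g. Template(tpl)
        if isinstance(node.func, ast.Attribute):
            return is_tainted(node.func.value, tainted)  # method on tainted object
    return False

def render_sinks(src):
    """Return line numbers of .render() calls whose receiver is tainted."""
    func = ast.parse(src).body[0]
    tainted, hits = set(), []
    for stmt in func.body:
        # Look for sinks first, using the taint state before this statement.
        for node in ast.walk(stmt):
            if (isinstance(node, ast.Call)
                    and isinstance(node.func, ast.Attribute)
                    and node.func.attr == "render"
                    and is_tainted(node.func.value, tainted)):
                hits.append(node.lineno)
        # Then propagate taint through simple assignments.
        if isinstance(stmt, ast.Assign) and is_tainted(stmt.value, tainted):
            for target in stmt.targets:
                if isinstance(target, ast.Name):
                    tainted.add(target.id)
    return hits

TEST1 = '''def test1():
    tpl = request.args.get("tpl")
    template = Template(tpl)
    return template.render()
'''
TEST2 = '''def test2():
    tpl = request.args.get("tpl")
    env = Environment()
    template = env.from_string(tpl)
    return template.render()
'''
TEST23 = '''def test23():
    tpl = request.args.get("tpl")
    return Environment().from_string(tpl).render()
'''
SAFE = '''def safe():
    return Template("hello").render()
'''

for name, src in [("test1", TEST1), ("test2", TEST2), ("test23", TEST23), ("safe", SAFE)]:
    print(name, render_sinks(src))
```

This toy does not model the sandbox at all, so it would flag a test3-style route just like test2. Telling those apart needs knowledge of the classes involved, which is exactly the part that rule authors normally encode by hand.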

Test process

Focus on gpt4; the gpt3.5 output is just a pile of filler.

gpt3.5

Only asked 1 round

```
Please describe the security vulnerabilities in the code below
[paste code demo]

There are several security vulnerabilities in the code above, including:

1. Injection attacks: The code takes user input via the tpl parameter and renders it as a template using Jinja2. This can lead to injection attacks if an attacker is able to inject malicious code into the template, potentially leading to remote code execution, SQL injection, or other security vulnerabilities.
2. Sandbox bypass: The code uses the SandboxedEnvironment class to provide a sandboxed environment for rendering templates. However, it also defines custom MyTemplate and MyEnvironment classes that do not inherit from SandboxedEnvironment, which could potentially allow a malicious template to bypass the sandbox and execute arbitrary code.
3. UUID usage: The code uses uuid as a keyword argument to the template rendering functions, which is a module that generates unique identifiers. However, this is not a secure way to generate secrets or tokens, as UUIDs are predictable and can be easily guessed or brute-forced.
4. Debugging information: The code includes debugging information in the xiaoc functions, which could reveal sensitive information or provide useful information to attackers.
5. Lack of input validation: The code does not validate or sanitize user input before rendering templates, which could allow an attacker to inject malicious code or bypass input validation.

To mitigate these security vulnerabilities, it is recommended to:

1. Validate and sanitize user input before rendering templates.
2. Use only the SandboxedEnvironment class for rendering templates and avoid custom classes that may bypass the sandbox.
3. Use a secure method for generating secrets or tokens, such as the secrets module.
4. Remove any debugging information from the code.
5. Apply security best practices, such as input validation, output encoding, and proper error handling, to prevent injection attacks and other security vulnerabilities.
```

It did not report its findings at the function level, so I lost interest in asking further.

gpt4

Round 1

```
Please describe the security vulnerabilities in the code below
[paste code demo]

The code provided defines a Flask web application with several routes that accept a template string as a parameter and then render the template using different Jinja2 environments. There are several security vulnerabilities in this code:

1. Template Injection: In all routes (test1, test11, test2, test21, test22, test23, test3, test4, test5, test6, and test7), the application accepts a template string as a GET parameter and then renders it using Jinja2. This allows an attacker to inject a malicious template string that can execute arbitrary code on the server.
2. Lack of input validation/sanitization: The code does not perform any input validation or sanitization on the received "tpl" parameter. This makes it easier for an attacker to exploit the template injection vulnerability.
3. Insecure Jinja2 environment: In routes test1, test11, test2, test21, test22, test23, test5, and test6, the templates are rendered using the default Jinja2 Environment. This environment does not have any security restrictions in place, allowing an attacker to access sensitive data or execute arbitrary code on the server.
4. Partially secure Jinja2 environment: In routes test3, test4, and test7, the templates are rendered using the SandboxedEnvironment, which provides some security restrictions. However, in routes test4 and test7, the "uuid" module is passed to the template, which might provide the attacker with the ability to bypass the sandbox restrictions or gain access to sensitive information.
```

Seeing that it could tell the route functions apart, I became interested in asking more. Since test3 cannot reach rce on the latest jinja2, I continued:

Round 2

```
why test3 suffers from Template Injection, prove it

My apologies for the confusion in my previous response. The test3 route does use SandboxedEnvironment, which is designed to restrict the execution of potentially dangerous code. However, it is still possible for attackers to exploit the route in a limited manner using template injection.
While the SandboxedEnvironment prevents direct access to dangerous functions and attributes, it might not be foolproof in every situation. It's important to keep in mind that Jinja2's sandbox is not a perfect security solution and should be treated as a last line of defense.
To demonstrate a limited template injection in the test3 route, an attacker could submit a crafted template string to perform calculations or other limited operations:
http://example.com/test3?tpl={{ 7 * 7 }}
This would return 49, as the template calculates 7 multiplied by 7.
While this example is not inherently dangerous, it showcases that the template is being processed, and depending on the application's context, more sophisticated attacks might be possible.
The most effective solution to prevent template injection is to not allow user-supplied templates at all. Instead, use predefined templates and allow users to choose from a set of options, as mentioned in the previous response.
```

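What it said is indeed correct. It was talking about template injection while I was thinking about rce; I was the one being imprecise, and the example it gave is fine too. I wanted to make things harder for it, so:

Round 3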
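The difference between "the template is evaluated" and "the template can reach rce" can be checked locally without flask. The following is a minimal sketch, assuming jinja2 is installed; the variable names are made up for illustration, and the dunder-chain expression is only a probe that reads the class hierarchy, not a working exploit.

```python
from jinja2 import Template
from jinja2.exceptions import SecurityError
from jinja2.sandbox import SandboxedEnvironment

probe = "{{ 7 * 7 }}"                    # harmless arithmetic, as in gpt4's answer
escape = "{{ ''.__class__.__mro__ }}"    # first hop of the usual dunder chain

# Both the plain Template (test1 style) and the sandbox (test3 style)
# evaluate the arithmetic probe, so both are "injectable" in that sense.
plain_probe = Template(probe).render()
sandbox_probe = SandboxedEnvironment().from_string(probe).render()

# Only the plain Template lets the expression walk into Python internals.
plain_escape = Template(escape).render()

try:
    SandboxedEnvironment().from_string(escape).render()
    sandbox_blocked = False
except SecurityError:
    # The sandbox marks underscore-prefixed attributes as unsafe and
    # refuses to continue the chain.
    sandbox_blocked = True

print(plain_probe, sandbox_probe)
print(plain_escape)
print("sandbox blocked the chain:", sandbox_blocked)
```

So a `49` coming back from test3 supports gpt4's template injection claim, while the blocked attribute chain is why I had filed test3 under "no rce".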

```
why test4 suffers from Template Injection, prove it

You are correct that the test4 route uses the SandboxedEnvironment, which is designed to restrict the execution of potentially dangerous code. However, the route suffers from a limited template injection vulnerability due to the exposure of the "uuid" module in the template context. The SandboxedEnvironment itself provides a certain level of security, but exposing the "uuid" module might allow attackers to exploit the route in a limited manner.
To demonstrate a template injection in the test4 route, an attacker could submit a crafted template string to access the uuid module's functions:
http://example.com/test4?tpl={{ uuid.uuid4() }}
This would return a newly generated UUID. Although generating a UUID is not inherently dangerous, it indicates that the attacker can access the uuid module's functions, which might lead to more sophisticated attacks depending on the application's context and other exposed objects.
To mitigate the risk of template injection in the test4 route, it is crucial to avoid exposing any unnecessary modules or functions in the template context. Instead of passing the entire uuid module, you can pass only the required function or attribute. Moreover, consider using predefined templates and allow users to choose from a set of options, rather than accepting user-supplied templates. This approach significantly reduces the risk of template injection.
```

It knows to use the uuid module in the poc. Not bad. Moving on.

Round 4

```
let's continue test4, Is there any way to achieve rce for attacker who has web access to test4

In the test4 route, although the SandboxedEnvironment is used, and the "uuid" module is exposed, achieving Remote Code Execution (RCE) is not straightforward because the SandboxedEnvironment is specifically designed to restrict potentially dangerous code execution. However, it is essential to remain cautious because creative attackers may still find ways to exploit the system, especially if additional modules or objects are exposed in the template context in the future.
Considering the current code and exposed "uuid" module, achieving RCE appears to be unlikely. The "uuid" module's primary purpose is to generate and manipulate universally unique identifiers (UUIDs), and it doesn't provide any functionality that can be directly used to execute arbitrary code.
......
```

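It did not produce a uuid poc, which matches expectations. Up to this point I was still calm: gpt4 did well, but today's codeql can manage all of this too, and nothing went beyond what I could understand. At this point in the test the network stalled once, so I cleared the session and asked again from scratch. That is when things got outrageous.

Round 5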

```
Please describe the security vulnerabilities in the code below
[paste code demo]

1. Template Injection (Server-Side Template Injection - SSTI)
The code is vulnerable to template injection in the following routes:
  test1, test11
  test2, test21, test22, test23
  test5
  test6
2. Lack of Sandboxing in Custom Environment:
The custom environment classes MyTemplate, MyEnvironment, and MySandboxedEnvironment extend the Jinja2 classes Template, Environment, and SandboxedEnvironment, respectively. However, they do not enforce proper sandboxing or input validation, which may lead to security vulnerabilities.
3. Insufficient Sandboxing in SandboxedEnvironment:
In the route test3, although the SandboxedEnvironment is used, it is not configured with a restricted set of filters, tags, or functions. This can still allow an attacker to access sensitive data or execute harmful operations.

In the routes test4 and test7, the entire uuid module is passed into the template rendering context. This can potentially expose sensitive functionality and allow an attacker to call any method available in the uuid module. Instead, only the required functions should be exposed to the template context, such as uuid.uuid4().
To mitigate these security vulnerabilities:
......
```

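Same question, but after our chat it had inexplicably evolved: it now knew to separate test4 and test7 from the other route functions. All of this happened within 10 min. In the earlier conversation I had only asked about test3 and test4, so how did it do that? By now I no longer doubted that it could produce a direct-injection payload, but I asked anyway.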

Round 6

```
In the test1 route, is there any way to achieve rce for attacker who has web access to test1

Yes, it is possible for an attacker to achieve Remote Code Execution (RCE) via the test1 route. Because the route directly passes user input to the Jinja2 template without proper input validation or output escaping, an attacker can exploit the Server-Side Template Injection (SSTI) vulnerability to execute arbitrary code.
For example, an attacker can use the following payload to achieve RCE through the test1 route:
{{ ''.__class__.__mro__[1].__subclasses__()[<subclass-index>](<command>, shell=True, stdout=-1).communicate() }}
The <subclass-index> should be replaced with the index of the subprocess.Popen class in the list of subclasses returned by ''.__class__.__mro__[1].__subclasses__(). The <command> should be replaced with the command the attacker wants to execute.
To mitigate this vulnerability, you should apply input validation, output escaping, and proper sandboxing as described in the previous response.
```

The rce poc is entirely correct. The test ends here.

Test conclusions
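Why the payload leaves `<subclass-index>` as a placeholder can be seen in plain Python, outside of any template. This is a minimal sketch for a local interpreter only; it looks up the index and runs no command, and the variable names are made up for illustration.

```python
import subprocess  # importing the module is what makes Popen appear in the list

# ''.__class__ is str, and str.__mro__ is (str, object), so index 1 is object.
base = "".__class__.__mro__[1]
subclasses = base.__subclasses__()

# The position of Popen depends on the interpreter version and on which
# modules have been imported so far, so it has to be looked up per target
# rather than hard-coded.
popen_index = next(
    i for i, cls in enumerate(subclasses) if cls is subprocess.Popen
)

print("object has", len(subclasses), "direct subclasses in this process")
print("Popen index here:", popen_index)

# stdout=-1 in the payload is the numeric value of subprocess.PIPE.
print("subprocess.PIPE =", subprocess.PIPE)
```

The index printed on one machine generally does not carry over to another process, which is why gpt4's answer describes how to find it instead of giving a number.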

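For code review, and even for vulnerability hunting: gpt3.5 aside, gpt4 demonstrated that fully unattended operation is possible. In my test cases, gpt4 cleared every case except sandbox-bypass scenarios like test4 and test7, which it did not fully work out, possibly for lack of training material and data.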

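What impressed me most in the whole test was round 5, where gpt4 showed an ability to evolve. The evolution happened within 10 min, for reasons unknown. It reminded me of late 2017, when I first encountered lgtm, which is today's codeql, and was also amazed. But that amazement was still something my own body of knowledge could explain. Even today's codeql is, in essence, a brute-force-works-wonders tool, just a very polished one that still needs engineers to assist it. On brute force, gpt4 clearly does not lose to other tools built by piling up hand-written rules, but the ability to evolve that it showed in such a short time is something I do not understand, and it shocks me. When I have time, I will design more code demos to test gpt4's ability to analyze and learn complex data flow and call flow. Over.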

长亭百川云 - Article details