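Editor's note
Today we are sharing a technical blog post that Apple published on its security site about Private Cloud Compute: "Private Cloud Compute: A new frontier for AI privacy in the cloud". Apple proposed Private Cloud Compute mainly to address the personal information protection risks that could arise from bringing large language models (such as ChatGPT) into the Apple ecosystem (see this account's earlier article "OpenAI partners with Apple to integrate ChatGPT into iOS 18"). The original post is at: https://security.apple.com/blog/private-cloud-compute/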
Apple Intelligence is the personal intelligence system that brings powerful generative models to iPhone, iPad, and Mac. For advanced features that need to reason over complex data with larger foundation models, we created Private Cloud Compute (PCC), a groundbreaking cloud intelligence system designed specifically for private AI processing. For the first time ever, Private Cloud Compute extends the industry-leading security and privacy of Apple devices into the cloud, making sure that personal user data sent to PCC isn’t accessible to anyone other than the user — not even to Apple. Built with custom Apple silicon and a hardened operating system designed for privacy, we believe PCC is the most advanced security architecture ever deployed for cloud AI compute at scale.
Apple has long championed on-device processing as the cornerstone for the security and privacy of user data. Data that exists only on user devices is by definition disaggregated and not subject to any centralized point of attack. When Apple is responsible for user data in the cloud, we protect it with state-of-the-art security in our services — and for the most sensitive data, we believe end-to-end encryption is our most powerful defense. For cloud services where end-to-end encryption is not appropriate, we strive to process user data ephemerally or under uncorrelated randomized identifiers that obscure the user’s identity.
Secure and private AI processing in the cloud poses a formidable new challenge. Powerful AI hardware in the data center can fulfill a user’s request with large, complex machine learning models — but it requires unencrypted access to the user's request and accompanying personal data. That precludes the use of end-to-end encryption, so cloud AI applications have to date employed traditional approaches to cloud security. Such approaches present a few key challenges:
Cloud AI security and privacy guarantees are difficult to verify and enforce. If a cloud AI service states that it does not log certain user data, there is generally no way for security researchers to verify this promise — and often no way for the service provider to durably enforce it. For example, a new version of the AI service may introduce additional routine logging that inadvertently logs sensitive user data without any way for a researcher to detect this. Similarly, a perimeter load balancer that terminates TLS may end up logging thousands of user requests wholesale during a troubleshooting session.
It’s difficult to provide runtime transparency for AI in the cloud. Cloud AI services are opaque: providers do not typically specify details of the software stack they are using to run their services, and those details are often considered proprietary. Even if a cloud AI service relied only on open source software, which is inspectable by security researchers, there is no widely deployed way for a user device (or browser) to confirm that the service it’s connecting to is running an unmodified version of the software that it purports to run, or to detect that the software running on the service has changed.
It’s challenging for cloud AI environments to enforce strong limits to privileged access. Cloud AI services are complex and expensive to run at scale, and their runtime performance and other operational metrics are constantly monitored and investigated by site reliability engineers and other administrative staff at the cloud service provider. During outages and other severe incidents, these administrators can generally make use of highly privileged access to the service, such as via SSH and equivalent remote shell interfaces. Though access controls for these privileged, break-glass interfaces may be well-designed, it’s exceptionally difficult to place enforceable limits on them while they’re in active use. For example, a service administrator who is trying to back up data from a live server during an outage could inadvertently copy sensitive user data in the process. More perniciously, criminals such as ransomware operators routinely strive to compromise service administrator credentials precisely to take advantage of privileged access interfaces and make away with user data.
When on-device computation with Apple devices such as iPhone and Mac is possible, the security and privacy advantages are clear: users control their own devices, researchers can inspect both hardware and software, runtime transparency is cryptographically assured through Secure Boot, and Apple retains no privileged access (as a concrete example, the Data Protection file encryption system cryptographically prevents Apple from disabling or guessing the passcode of a given iPhone).
However, to process more sophisticated requests, Apple Intelligence needs to be able to enlist help from larger, more complex models in the cloud. For these cloud requests to live up to the security and privacy guarantees that our users expect from our devices, the traditional cloud service security model isn't a viable starting point. Instead, we need to bring our industry-leading device security model, for the first time ever, to the cloud.
The rest of this post is an initial technical overview of Private Cloud Compute, to be followed by a deep dive after PCC becomes available in beta. We know researchers will have many detailed questions, and we look forward to answering more of them in our follow-up post.
Designing Private Cloud Compute
We set out to build Private Cloud Compute with a set of core requirements:
Stateless computation on personal user data. Private Cloud Compute must use the personal user data that it receives exclusively for the purpose of fulfilling the user’s request. This data must never be available to anyone other than the user, not even to Apple staff, not even during active processing. And this data must not be retained, including via logging or for debugging, after the response is returned to the user. In other words, we want a strong form of stateless data processing where personal data leaves no trace in the PCC system.
Enforceable guarantees. Security and privacy guarantees are strongest when they are entirely technically enforceable, which means it must be possible to constrain and analyze all the components that critically contribute to the guarantees of the overall Private Cloud Compute system. To use our example from earlier, it’s very difficult to reason about what a TLS-terminating load balancer may do with user data during a debugging session. Therefore, PCC must not depend on such external components for its core security and privacy guarantees. Similarly, operational requirements such as collecting server metrics and error logs must be supported with mechanisms that do not undermine privacy protections.
No privileged runtime access. Private Cloud Compute must not contain privileged interfaces that would enable Apple’s site reliability staff to bypass PCC privacy guarantees, even when working to resolve an outage or other severe incident. This also means that PCC must not support a mechanism by which the privileged access envelope could be enlarged at runtime, such as by loading additional software.
Non-targetability. An attacker should not be able to attempt to compromise personal data that belongs to specific, targeted Private Cloud Compute users without attempting a broad compromise of the entire PCC system. This must hold true even for exceptionally sophisticated attackers who can attempt physical attacks on PCC nodes in the supply chain or attempt to obtain malicious access to PCC data centers. In other words, a limited PCC compromise must not allow the attacker to steer requests from specific users to compromised nodes; targeting users should require a wide attack that’s likely to be detected. To understand this more intuitively, contrast it with a traditional cloud service design where every application server is provisioned with database credentials for the entire application database, so a compromise of a single application server is sufficient to access any user’s data, even if that user doesn’t have any active sessions with the compromised application server.
Verifiable transparency. Security researchers need to be able to verify, with a high degree of confidence, that our privacy and security guarantees for Private Cloud Compute match our public promises. We already have an earlier requirement for our guarantees to be enforceable. Hypothetically, then, if security researchers had sufficient access to the system, they would be able to verify the guarantees. But this last requirement, verifiable transparency, goes one step further and does away with the hypothetical: security researchers must be able to verify the security and privacy guarantees of Private Cloud Compute, and they must be able to verify that the software that’s running in the PCC production environment is the same as the software they inspected when verifying the guarantees.
This is an extraordinary set of requirements, and one that we believe represents a generational leap over any traditional cloud service security model.
Introducing Private Cloud Compute nodes
The root of trust for Private Cloud Compute is our compute node: custom-built server hardware that brings the power and security of Apple silicon to the data center, with the same hardware security technologies used in iPhone, including the Secure Enclave and Secure Boot. We paired this hardware with a new operating system: a hardened subset of the foundations of iOS and macOS tailored to support Large Language Model (LLM) inference workloads while presenting an extremely narrow attack surface. This allows us to take advantage of iOS security technologies such as Code Signing and sandboxing.
On top of this foundation, we built a custom set of cloud extensions with privacy in mind. We excluded components that are traditionally critical to data center administration, such as remote shells and system introspection and observability tools. We replaced those general-purpose software components with components that are purpose-built to deterministically provide only a small, restricted set of operational metrics to SRE staff. And finally, we used Swift on Server to build a new Machine Learning stack specifically for hosting our cloud-based foundation model.
Let’s take another look at our core Private Cloud Compute requirements and the features we built to achieve them.
Stateless computation and enforceable guarantees
With services that are end-to-end encrypted, such as iMessage, the service operator cannot access the data that transits through the system. One of the key reasons such designs can assure privacy is specifically because they prevent the service from performing computations on user data. Since Private Cloud Compute needs to be able to access the data in the user’s request to allow a large foundation model to fulfill it, complete end-to-end encryption is not an option. Instead, the PCC compute node must have technical enforcement for the privacy of user data during processing, and must be incapable of retaining user data after its duty cycle is complete.
We designed Private Cloud Compute to make several guarantees about the way it handles user data:
A user’s device sends data to PCC for the sole, exclusive purpose of fulfilling the user’s inference request. PCC uses that data only to perform the operations requested by the user.
User data stays on the PCC nodes that are processing the request only until the response is returned. PCC deletes the user’s data after fulfilling the request, and no user data is retained in any form after the response is returned.
User data is never available to Apple — even to staff with administrative access to the production service or hardware.
When Apple Intelligence needs to draw on Private Cloud Compute, it constructs a request — consisting of the prompt, plus the desired model and inferencing parameters — that will serve as input to the cloud model. The PCC client on the user’s device then encrypts this request directly to the public keys of the PCC nodes that it has first confirmed are valid and cryptographically certified. This provides end-to-end encryption from the user’s device to the validated PCC nodes, ensuring the request cannot be accessed in transit by anything outside those highly protected PCC nodes. Supporting data center services, such as load balancers and privacy gateways, run outside of this trust boundary and do not have the keys required to decrypt the user’s request, thus contributing to our enforceable guarantees.
Next, we must protect the integrity of the PCC node and prevent any tampering with the keys used by PCC to decrypt user requests. The system uses Secure Boot and Code Signing for an enforceable guarantee that only authorized and cryptographically measured code is executable on the node. All code that can run on the node must be part of a trust cache that has been signed by Apple, approved for that specific PCC node, and loaded by the Secure Enclave such that it cannot be changed or amended at runtime. This also ensures that JIT mappings cannot be created, preventing compilation or injection of new code at runtime. Additionally, all code and model assets use the same integrity protection that powers the Signed System Volume. Finally, the Secure Enclave provides an enforceable guarantee that the keys that are used to decrypt requests cannot be duplicated or extracted.
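The trust-cache rule described above (only code whose measurement was approved ahead of time may run, and the approved set cannot grow at runtime) can be illustrated with a minimal Python sketch. The `TrustCache` class, the `measure` helper, and the use of SHA-256 are assumptions made for illustration; the post does not describe the actual data structures or hash functions PCC uses.

```python
import hashlib


def measure(code: bytes) -> str:
    """Return the SHA-256 digest of a code image as a hex string."""
    return hashlib.sha256(code).hexdigest()


class TrustCache:
    """Hypothetical stand-in for a signed, immutable set of code measurements."""

    def __init__(self, approved_images):
        # The set is frozen at construction: nothing can be added later,
        # which mirrors "cannot be changed or amended at runtime".
        self._approved = frozenset(measure(img) for img in approved_images)

    def may_execute(self, code: bytes) -> bool:
        # Code runs only if its measurement is already in the cache.
        return measure(code) in self._approved


inference_binary = b"inference-engine v1"  # placeholder bytes, not real code
cache = TrustCache([inference_binary])
```

In this sketch, `cache.may_execute(inference_binary)` is true, while any byte string that was not present when the cache was built is refused.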
The Private Cloud Compute software stack is designed to ensure that user data is not leaked outside the trust boundary or retained once a request is complete, even in the presence of implementation errors. The Secure Enclave randomizes the data volume’s encryption keys on every reboot and does not persist these random keys, ensuring that data written to the data volume cannot be retained across reboot. In other words, there is an enforceable guarantee that the data volume is cryptographically erased every time the PCC node’s Secure Enclave Processor reboots. The inference process on the PCC node deletes data associated with a request upon completion, and the address spaces that are used to handle user data are periodically recycled to limit the impact of any data that may have been unexpectedly retained in memory.
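The cryptographic-erasure idea above can be sketched as follows: data is only ever stored encrypted under a random key that lives in memory, so replacing that key on reboot makes everything written earlier unreadable. The `EphemeralVolume` class and the SHA-256 counter keystream are hypothetical and deliberately simplistic; the cipher here is a toy for illustration, not the mechanism the Secure Enclave uses, and not something to deploy.

```python
import hashlib
import secrets


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256 counter keystream (toy cipher, illustration only)."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        pad = hashlib.sha256(key + counter).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ p for b, p in zip(chunk, pad))
    return bytes(out)


class EphemeralVolume:
    """Hypothetical data volume whose key is never persisted."""

    def __init__(self):
        self._key = secrets.token_bytes(32)  # held in memory only
        self.blocks = []                     # ciphertext as it would sit on disk

    def write(self, plaintext: bytes) -> None:
        # A real design would use a unique nonce per write; omitted for brevity.
        self.blocks.append(keystream_xor(self._key, plaintext))

    def read(self, index: int) -> bytes:
        return keystream_xor(self._key, self.blocks[index])

    def reboot(self) -> None:
        # A fresh random key replaces the old one, which is simply discarded.
        self._key = secrets.token_bytes(32)
```

After `reboot()`, the old ciphertext is still present in `blocks`, but reading it yields noise because the key that produced it no longer exists.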
Finally, for our enforceable guarantees to be meaningful, we also need to protect against exploitation that could bypass these guarantees. Technologies such as Pointer Authentication Codes and sandboxing act to resist such exploitation and limit an attacker’s horizontal movement within the PCC node. The inference control and dispatch layers are written in Swift, ensuring memory safety, and use separate address spaces to isolate initial processing of requests. This combination of memory safety and the principle of least privilege removes entire classes of attacks on the inference stack itself and limits the level of control and capability that a successful attack can obtain.
No privileged runtime access
We designed Private Cloud Compute to ensure that privileged access doesn’t allow anyone to bypass our stateless computation guarantees.
First, we intentionally did not include remote shell or interactive debugging mechanisms on the PCC node. Our Code Signing machinery prevents such mechanisms from loading additional code, but this sort of open-ended access would provide a broad attack surface to subvert the system’s security or privacy. Beyond simply not including a shell, remote or otherwise, PCC nodes cannot enable Developer Mode and do not include the tools needed by debugging workflows.
Next, we built the system’s observability and management tooling with privacy safeguards that are designed to prevent user data from being exposed. For example, the system doesn’t even include a general-purpose logging mechanism. Instead, only pre-specified, structured, and audited logs and metrics can leave the node, and multiple independent layers of review help prevent user data from accidentally being exposed through these mechanisms. With traditional cloud AI services, such mechanisms might allow someone with privileged access to observe or collect user data.
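One way to picture "only pre-specified, structured logs and metrics can leave the node" is an emitter that validates every record against a fixed schema and has no free-form text field at all. The field names and the `emit_metric` function below are invented for illustration; the post does not publish the actual schema or tooling.

```python
# Hypothetical schema: every field that may leave the node is declared up front.
NUMERIC_FIELDS = {"request_count", "latency_ms"}
ENUM_FIELDS = {"status": {"ok", "error", "timeout"}}


def emit_metric(record: dict) -> dict:
    """Validate a metric record against the fixed schema and return a copy."""
    for name, value in record.items():
        if name in NUMERIC_FIELDS:
            # bool is a subclass of int in Python, so exclude it explicitly.
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                raise TypeError(f"{name} must be numeric")
        elif name in ENUM_FIELDS:
            if value not in ENUM_FIELDS[name]:
                raise ValueError(f"{name} has a value outside its fixed set")
        else:
            # There is deliberately no catch-all: unknown fields are refused.
            raise ValueError(f"field not pre-specified: {name}")
    return dict(record)
```

Because string values are limited to small enumerated sets, a record in this sketch has no slot that could carry a user's prompt, even by accident.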
Together, these techniques provide enforceable guarantees that only specifically designated code has access to user data and that user data cannot leak outside the PCC node during system administration.
Non-targetability
Our threat model for Private Cloud Compute includes an attacker with physical access to a compute node and a high level of sophistication — that is, an attacker who has the resources and expertise to subvert some of the hardware security properties of the system and potentially extract data that is being actively processed by a compute node.
We defend against this type of attack in two ways:
We supplement the built-in protections of Apple silicon with a hardened supply chain for PCC hardware, so that performing a hardware attack at scale would be both prohibitively expensive and likely to be discovered.
We limit the impact of small-scale attacks by ensuring that they cannot be used to target the data of a specific user.
Private Cloud Compute hardware security starts at manufacturing, where we inventory and perform high-resolution imaging of the components of the PCC node before each server is sealed and its tamper switch is activated. When they arrive in the data center, we perform extensive revalidation before the servers are allowed to be provisioned for PCC. The process involves multiple Apple teams that cross-check data from independent sources, and the process is further monitored by a third-party observer not affiliated with Apple. At the end, a certificate is issued for keys rooted in the Secure Enclave UID for each PCC node. The user’s device will not send data to any PCC nodes if it cannot validate their certificates.
These processes broadly protect hardware from compromise. To guard against smaller, more sophisticated attacks that might otherwise avoid detection, Private Cloud Compute uses an approach we call target diffusion to ensure requests cannot be routed to specific nodes based on the user or their content.
Target diffusion starts with the request metadata, which leaves out any personally identifiable information about the source device or user, and includes only limited contextual data about the request that’s required to enable routing to the appropriate model. This metadata is the only part of the user’s request that is available to load balancers and other data center components running outside of the PCC trust boundary. The metadata also includes a single-use credential, based on RSA Blind Signatures, to authorize valid requests without tying them to a specific user. Additionally, PCC requests go through an OHTTP relay — operated by a third party — which hides the device’s source IP address before the request ever reaches the PCC infrastructure. This prevents an attacker from using an IP address to identify requests or associate them with an individual. It also means that an attacker would have to compromise both the third-party relay and our load balancer to steer traffic based on the source IP address.
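The single-use credential above relies on blind signatures: the issuer signs a value it never sees, so a credential presented later cannot be linked back to the issuance. The sketch below shows the underlying arithmetic with textbook, unpadded RSA and a toy key (p = 61, q = 53). It is an assumption-laden illustration only; a real deployment would use full-size keys and a standardized encoding, and the post does not describe Apple's implementation.

```python
import math

# Textbook RSA toy key (p=61, q=53); far too small for any real use.
N, E, D = 3233, 17, 2753


def blind(message: int, r: int) -> int:
    """Client: hide the message with a random factor r before sending it."""
    if math.gcd(r, N) != 1:
        raise ValueError("blinding factor must be coprime to N")
    return (message * pow(r, E, N)) % N


def sign_blinded(blinded: int) -> int:
    """Issuer: sign the blinded value without learning the message."""
    return pow(blinded, D, N)


def unblind(blinded_signature: int, r: int) -> int:
    """Client: strip the blinding factor to obtain a normal signature."""
    return (blinded_signature * pow(r, -1, N)) % N


def verify(message: int, signature: int) -> bool:
    """Anyone: check the signature against the issuer's public key (N, E)."""
    return pow(signature, E, N) == message % N


token = 1234                      # stand-in for a hashed credential value
r = 7                             # would be chosen at random in practice
signature = unblind(sign_blinded(blind(token, r)), r)
```

The issuer only ever sees `blind(token, r)`, yet `verify(token, signature)` holds, because (m * r^e)^d = m^d * r (mod N) and multiplying by the inverse of r leaves m^d.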
User devices encrypt requests only for a subset of PCC nodes, rather than the PCC service as a whole. When asked by a user device, the load balancer returns a subset of PCC nodes that are most likely to be ready to process the user’s inference request — however, as the load balancer has no identifying information about the user or device for which it’s choosing nodes, it cannot bias the set for targeted users. By limiting the PCC nodes that can decrypt each request in this way, we ensure that if a single node were ever to be compromised, it would not be able to decrypt more than a small portion of incoming requests. Finally, the selection of PCC nodes by the load balancer is statistically auditable to protect against a highly sophisticated attack where the attacker compromises a PCC node as well as obtains complete control of the PCC load balancer.
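The effect of encrypting each request to only a subset of nodes can be estimated with a short simulation. It assumes, purely for illustration, that the load balancer picks k of n nodes uniformly at random; the post says only that it returns nodes likely to be ready and cannot bias the choice by user. The function names and numbers are hypothetical.

```python
import random


def pick_nodes(rng: random.Random, total_nodes: int, subset_size: int) -> list:
    """Hypothetical load balancer: choose a subset with no knowledge of the user."""
    return rng.sample(range(total_nodes), subset_size)


def exposure(total_nodes: int, subset_size: int, compromised_node: int,
             requests: int, seed: int = 0) -> float:
    """Fraction of requests that one compromised node would be able to decrypt."""
    rng = random.Random(seed)
    hits = sum(
        compromised_node in pick_nodes(rng, total_nodes, subset_size)
        for _ in range(requests)
    )
    return hits / requests


fraction = exposure(total_nodes=1000, subset_size=10,
                    compromised_node=42, requests=20000)
```

Under uniform selection the expected value is `subset_size / total_nodes`, here 1%. A statistical audit in the same spirit would compare observed selection frequencies per node against that expectation and flag a node that appears far more often than chance allows.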
Verifiable transparency
We consider allowing security researchers to verify the end-to-end security and privacy guarantees of Private Cloud Compute to be a critical requirement for ongoing public trust in the system. Traditional cloud services do not make their full production software images available to researchers — and even if they did, there’s no general mechanism to allow researchers to verify that those software images match what’s actually running in the production environment. (Some specialized mechanisms exist, such as Intel SGX and AWS Nitro attestation.)
When we launch Private Cloud Compute, we’ll take the extraordinary step of making software images of every production build of PCC publicly available for security research. This promise, too, is an enforceable guarantee: user devices will be willing to send data only to PCC nodes that can cryptographically attest to running publicly listed software. We want to ensure that security and privacy researchers can inspect Private Cloud Compute software, verify its functionality, and help identify issues — just like they can with Apple devices.
Our commitment to verifiable transparency includes:
Publishing the measurements of all code running on PCC in an append-only and cryptographically tamper-proof transparency log.
Making the log and associated binary software images publicly available for inspection and validation by privacy and security experts.
Publishing and maintaining an official set of tools for researchers analyzing PCC node software.
Rewarding important research findings through the Apple Security Bounty program.
Every production Private Cloud Compute software image will be published for independent binary inspection — including the OS, applications, and all relevant executables, which researchers can verify against the measurements in the transparency log. Software will be published within 90 days of inclusion in the log, or after relevant software updates are available, whichever is sooner. Once a release has been signed into the log, it cannot be removed without detection, much like the log-backed map data structure used by the Key Transparency mechanism for iMessage Contact Key Verification.
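The property that a release "cannot be removed without detection" can be illustrated with a simple hash chain, in which each log head commits to every earlier entry. This is a simplified stand-in: the post compares the real log to the log-backed map used by Key Transparency, which is a more elaborate structure, and the function names below are hypothetical.

```python
import hashlib

GENESIS = "0" * 64  # arbitrary starting head for an empty log


def extend_head(head: str, measurement: str) -> str:
    """Fold one new measurement into the running log head."""
    return hashlib.sha256((head + measurement).encode()).hexdigest()


def chain_head(measurements) -> str:
    """Recompute the head of a log from its full list of entries."""
    head = GENESIS
    for measurement in measurements:
        head = extend_head(head, measurement)
    return head


releases = ["release-a", "release-b", "release-c"]  # placeholder measurements
trusted_head = chain_head(releases)                 # remembered by an auditor
```

An auditor who remembers `trusted_head` can detect a dropped, reordered, or altered entry, because recomputing the head over the modified list gives a different value. Appending is still possible: the new head is just `extend_head(trusted_head, new_measurement)`.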
As we mentioned, user devices will ensure that they’re communicating only with PCC nodes running authorized and verifiable software images. Specifically, the user’s device will wrap its request payload key only to the public keys of those PCC nodes whose attested measurements match a software release in the public transparency log. And the same strict Code Signing technologies that prevent loading unauthorized software also ensure that all code on the PCC node is included in the attestation.
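The client-side rule described above amounts to a filter: the device wraps its request key only for nodes whose attested measurement appears in the public log. The sketch below shows that selection step alone. The `NodeAttestation` type and `eligible_keys` function are hypothetical, and verification of the attestation's signature and certificate chain, which a real client must perform first, is omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeAttestation:
    """Hypothetical summary of what a node attests to."""
    node_id: str
    public_key: str
    measurement: str  # digest of the software the node claims to run


def eligible_keys(attestations, published_measurements):
    """Return public keys of nodes running software listed in the public log."""
    published = set(published_measurements)
    return [a.public_key for a in attestations if a.measurement in published]


log = {"m-release-1", "m-release-2"}  # placeholder measurements
nodes = [
    NodeAttestation("node-1", "pk-1", "m-release-2"),
    NodeAttestation("node-2", "pk-2", "m-unlisted"),
    NodeAttestation("node-3", "pk-3", "m-release-1"),
]
keys = eligible_keys(nodes, log)
```

Here `node-2` never receives a wrapped key, so it cannot decrypt the request even if the load balancer offered it as a candidate.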
Making Private Cloud Compute software logged and inspectable in this way is a strong demonstration of our commitment to enable independent research on the platform. But we want to ensure researchers can rapidly get up to speed, verify our PCC privacy claims, and look for issues, so we’re going further with three specific steps:
We’ll release a PCC Virtual Research Environment: a set of tools and images that simulate a PCC node on a Mac with Apple silicon, and that can boot a version of PCC software minimally modified for successful virtualization.
While we’re publishing the binary images of every production PCC build, to further aid research we will periodically also publish a subset of the security-critical PCC source code.
In a first for any Apple platform, PCC images will include the sepOS firmware and the iBoot bootloader in plaintext, making it easier than ever for researchers to study these critical components.
The Apple Security Bounty will reward research findings in the entire Private Cloud Compute software stack — with especially significant payouts for any issues that undermine our privacy claims.
More to come
Private Cloud Compute continues Apple’s profound commitment to user privacy. With sophisticated technologies to satisfy our requirements of stateless computation, enforceable guarantees, no privileged access, non-targetability, and verifiable transparency, we believe Private Cloud Compute is nothing short of the world-leading security architecture for cloud AI compute at scale.
We look forward to sharing many more technical details about PCC, including the implementation and behavior behind each of our core requirements. And we’re especially excited to soon invite security researchers for a first look at the Private Cloud Compute software and our PCC Virtual Research Environment.
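The Data Protection Officer (DPO) community is made up mainly of frontline practitioners in personal information protection and data security. Most of them come from leading Chinese internet companies, security firms, law firms, accounting firms, universities, and research institutions. Alongside their day jobs, DPO community members follow the latest global developments, progress, and trends in data security and privacy protection. The community held its first offline salon in May 2018; the salon runs monthly, with each session devoted to a different topic. The community currently has more than 400 members.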