Understanding Cysic: The Dawn of Hardware Acceleration and the Emergence of ZK Mining

Intermediate · Aug 14, 2024
This article introduces the ZK proof system workflow and explores the challenges and optimization strategies for accelerating MSM and NTT computations.

In April, Vitalik attended the Hong Kong Blockchain Summit and delivered a speech titled “Reaching the Limits of Protocol Design,” where he highlighted the potential of ZK-SNARKs within Ethereum’s Danksharding roadmap and discussed the promising role of ASIC chips in accelerating ZK processes. Earlier, Scroll co-founder Zhang Ye suggested that the potential applications of ZK could be even greater in traditional sectors than in Web3, with significant demand in areas such as trusted computing, databases, verifiable hardware, content authentication, and zkML. Should real-time ZK proof generation become feasible, it could lead to transformative changes across both Web3 and traditional industries. However, from the standpoint of efficiency and cost, widespread adoption of ZK is still some way off.

Back in 2022, leading venture capital firms a16z and Paradigm released reports underscoring the importance of ZK hardware acceleration. Paradigm went so far as to predict that future earnings for ZK miners could rival those of Bitcoin or Ethereum miners, with hardware acceleration solutions based on GPU, FPGA, and ASIC poised to capture a significant market. Following the rise of mainstream ZK Rollups like Scroll and Starknet, hardware acceleration has become a hot topic, and interest has intensified with the approaching launch of projects like Cysic.

Given the vast demand for ZK, ZK mining pools and real-time ZKP generation SaaS models could give rise to a new industry. In this emerging market, ZK hardware manufacturers with strong capabilities and an early-mover advantage could become the next Bitmain, dominating the field of hardware acceleration. Cysic stands out as one of the most promising players in this space. The team has won notable awards from the ZKP technology competition platform ZPrize and began mentoring for ZPrize in 2023. Its roadmap features ToB (business-to-business) ZK mining pools and ToC (business-to-consumer) ZK DePIN hardware, and it has raised nearly $20 million from top VCs such as Polychain, ABCDE, OKX Ventures, and HashKey.

As Cysic prepares to launch its testnet at the end of July and open its ZK mining pool, discussions about the company are heating up across various communities. This article aims to introduce more people to Cysic’s product concepts and business model while providing an accessible overview of ZK hardware acceleration principles. In the sections that follow, we will briefly outline the key aspects of Cysic, making it easier for readers to understand.

Understanding ZK Proof Systems: A Workflow Perspective

The ZK (Zero-Knowledge) proof system is intricate, but we can simplify its understanding by breaking it down through its functions and workflow. Here’s a basic overview of how a system designed to apply ZK to ordinary computations works: First, the user interacts with the ZK system via a front-end interface, submitting the content they want to prove. The front-end then converts this content into a format suitable for processing by the ZK proof system. The system uses a specific proof system or framework (like Halo2 or Plonk) to generate a ZK Proof. This process includes several key steps:

  1. Defining the Problem: The first step is to identify the specific content that needs to be proven. For instance, the Prover may claim to know or possess certain data, such as stating, “I know a solution N to the equation F(x)=w,” without revealing the actual value of N.
  2. Arithmetic Conversion and Constraint Satisfaction Problems (CSP): After the Prover submits the content, the system creates a specialized mathematical model or program that accurately represents the content to be proven. This is then converted into a format that the proof system can process. For example, the statement “I know a solution N to the equation F(x)=w” is transformed from its original mathematical equation into a form represented by logic gate circuits and polynomials.
  3. Compiling into ZKP: Next, the system selects an appropriate proof system, like Halo or Plonk, and compiles the previously generated content into a ZKP program. The Prover then uses this program to generate a proof, which the Verifier checks for validity.
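
As a rough illustration of step 2, the sketch below flattens the statement "I know x such that x^3 + x + 5 = 35" into individual multiplication and addition gates and checks a witness against them. The concrete statement, the prime modulus, and the names `build_witness` and `constraints_hold` are assumptions made for this example; they are not taken from Halo2 or Plonk, and a real proof system would prove that the constraints hold without revealing the witness.

```python
# Toy arithmetization of "I know x such that x**3 + x + 5 == 35".
# The statement, the modulus P and all helper names are illustrative assumptions.

P = 2**31 - 1  # small prime field modulus chosen only for this example


def build_witness(x):
    """Flatten the computation into one wire value per gate."""
    v1 = (x * x) % P      # gate 1: v1 = x * x
    v2 = (v1 * x) % P     # gate 2: v2 = v1 * x
    v3 = (v2 + x) % P     # gate 3: v3 = v2 + x
    out = (v3 + 5) % P    # gate 4: out = v3 + 5
    return {"x": x, "v1": v1, "v2": v2, "v3": v3, "out": out}


def constraints_hold(w, public_out):
    """Check every gate constraint plus the public output."""
    return all([
        (w["x"] * w["x"] - w["v1"]) % P == 0,
        (w["v1"] * w["x"] - w["v2"]) % P == 0,
        (w["v2"] + w["x"] - w["v3"]) % P == 0,
        (w["v3"] + 5 - w["out"]) % P == 0,
        w["out"] == public_out % P,
    ])


witness = build_witness(3)
print(constraints_hold(witness, 35))
```

Only the public output (35) and the gate structure would be visible to the Verifier; the wire values in `witness` stay with the Prover.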

For systems like zkEVM, commonly used in Ethereum Layer 2 solutions, smart contracts are first compiled into EVM (Ethereum Virtual Machine) bytecode. Each opcode is then converted into logic gate circuits or polynomial constraints before being processed further by the back-end ZK proof system.

It’s important to note that zk-SNARKs (Zero-Knowledge Succinct Non-Interactive Argument of Knowledge) are the most commonly used ZKP technology in blockchain today. Many ZK Rollups leverage the succinctness of SNARKs rather than their zero-knowledge property. Succinctness refers to the ZKP’s ability to compress large amounts of data into a few hundred bytes, significantly reducing verification costs. This results in an asymmetry between the workload of the Prover and Verifier: while it is costly for the Prover to generate the ZKP, it is relatively inexpensive for the Verifier to verify it. By exploiting this asymmetry, a scenario with one Prover and multiple Verifiers can significantly lower the overall cost on the Verifier’s side. This model is particularly advantageous for decentralized verification, as envisioned by Ethereum’s Layer 2 solutions.

However, this model of shifting cost from verification onto ZKP generation is not a cure-all. For ZK Rollup projects, the high cost of generating ZKPs will inevitably be passed on to users through a slower experience and higher transaction fees, which could hinder the long-term adoption of ZK Rollups. Despite ZK’s potential in trustless and decentralized verification, the time and cost of proof generation currently make large-scale deployment of zkEVM, zkVM, ZK Rollups, or ZK bridges uneconomical. This has led to the rise of ZK acceleration projects like Cysic, Ingonyama, and Irreducible, each working to reduce the cost of ZKP generation from different angles. In the following section, we will briefly discuss the main computational costs and acceleration techniques for ZKP generation, and why Cysic holds significant potential in the ZK acceleration space.

Computational Challenges: MSM and NTT

It’s widely known that generating proofs in ZK systems is time-consuming for the Prover. In the ZK-SNARK protocol, a Verifier might be able to verify a proof in just one second, but it could take the Prover half a day or even a full day to generate that proof. To optimize the use of ZKP computations, it is necessary to convert the computation format from classical programming to a ZK-friendly format.

There are currently two primary methods for achieving this: one involves writing circuits directly with proof system frameworks like Halo2, while the other involves using domain-specific languages (DSLs) such as Cairo or Circom to translate computations into an intermediate format that can then be submitted to the proof system. The proof system generates ZK proofs based on these circuits or the intermediate formats compiled by the DSLs. The more complex the operations, the longer it takes to generate the proof. Moreover, some operations are inherently ZK-unfriendly and require additional effort to implement. For example, hash functions like SHA or Keccak are ZKP-unfriendly, meaning that using them increases proof generation time. Even operations that are inexpensive to execute on classical computers may not be efficient for ZKP.
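
To make the "cheap on a CPU, expensive in a circuit" point concrete, here is a minimal sketch of how a 32-bit XOR, a single instruction on a classical processor, might be expressed with arithmetic constraints. The bit-decomposition approach and the way constraints are counted are simplifying assumptions for illustration; actual costs depend on the proof system and on optimizations such as lookup tables.

```python
# Toy illustration: a 32-bit XOR expressed as arithmetic constraints.
# The bit-decomposition approach and the constraint count are a simplified,
# assumed model, not the exact cost in any specific proof system.


def xor_as_constraints(a, b, bits=32):
    """Return (a XOR b, number of multiplication constraints used)."""
    constraints = 0
    result = 0
    for i in range(bits):
        x = (a >> i) & 1
        y = (b >> i) & 1
        # Booleanity checks: each bit wire must satisfy v * (v - 1) == 0.
        assert x * (x - 1) == 0
        assert y * (y - 1) == 0
        constraints += 2
        # XOR on bits written with field arithmetic: z = x + y - 2xy.
        z = x + y - 2 * x * y
        constraints += 1
        result |= z << i
    return result, constraints


value, cost = xor_as_constraints(0xDEADBEEF, 0x12345678)
print(hex(value), cost)
```

Under this model one XOR costs three multiplication constraints per bit, and hash functions such as SHA or Keccak perform many thousands of such bitwise operations per invocation.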

Excluding these ZK-unfriendly tasks, the bottlenecks in the proof generation process are quite similar across different proof systems. Two computational tasks consume most of the resources in ZK proof generation: MSM (Multi-Scalar Multiplication) and NTT (Number Theoretic Transform). Together they can account for 80-95% of the proof generation time, depending on the ZKP commitment scheme and specific implementation. MSM computes a sum of scalar multiplications of elliptic curve points, while NTT is an FFT (Fast Fourier Transform) over a finite field, used to speed up polynomial multiplication. Depending on the proof system, the load is distributed differently between NTT and MSM. For example, STARKs use FRI, a hash-based commitment scheme that does not involve MSM, unlike elliptic curve-based schemes such as KZG or IPA. Generally, the more NTT operations required, the fewer MSM operations, and vice versa.
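
For readers who prefer code to definitions, the following sketch spells out both operations in their most naive form. The toy curve y^2 = x^3 + 2x + 3 over F_97, the field F_17 with 4 as a fourth root of unity, and all function names are assumptions chosen so the example stays small; production systems use curves such as BN254 or BLS12-381 with far larger parameters and far better algorithms.

```python
# Naive reference definitions of MSM and NTT on toy parameters.
# Assumptions: curve y^2 = x^3 + 2x + 3 over F_97, NTT field F_17,
# and every name below is chosen only for illustration.

CURVE_P, CURVE_A = 97, 2


def ec_add(p1, p2):
    """Add two affine points; None represents the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % CURVE_P == 0:
        return None  # the points are inverses of each other
    if p1 == p2:
        lam = (3 * x1 * x1 + CURVE_A) * pow(2 * y1, -1, CURVE_P) % CURVE_P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % CURVE_P, -1, CURVE_P) % CURVE_P
    x3 = (lam * lam - x1 - x2) % CURVE_P
    y3 = (lam * (x1 - x3) - y1) % CURVE_P
    return (x3, y3)


def ec_mul(k, point):
    """Double-and-add scalar multiplication k * point."""
    result = None
    while k:
        if k & 1:
            result = ec_add(result, point)
        point = ec_add(point, point)
        k >>= 1
    return result


def msm(scalars, points):
    """MSM: the sum of k_i * P_i over all pairs."""
    acc = None
    for k, pt in zip(scalars, points):
        acc = ec_add(acc, ec_mul(k, pt))
    return acc


def ntt_naive(coeffs, omega, q):
    """O(n^2) NTT: evaluate the polynomial at every power of omega mod q."""
    n = len(coeffs)
    return [
        sum(c * pow(omega, i * j, q) for j, c in enumerate(coeffs)) % q
        for i in range(n)
    ]


G = (3, 6)  # lies on the toy curve: 6^2 = 36 = 3^3 + 2*3 + 3
print(msm([2, 3], [G, G]) == ec_mul(5, G))
print(ntt_naive([0, 1, 0, 0], 4, 17))
```

In real provers the vectors hold millions of entries, which is why the quadratic or per-point costs visible here are replaced by bucket methods for MSM and butterfly networks for NTT.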

Optimization Strategies

MSM operations are characterized by predictable memory access, which allows for high parallelization but demands significant memory resources. However, MSM also presents scalability challenges; even with parallelization, it can still be slow. While hardware acceleration can help speed up MSM, it requires substantial memory and parallel computing resources.
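
The trade-off between parallelism and memory can be seen in the bucket (Pippenger-style) method that most MSM accelerators build on. The sketch below is a simplified version under loud assumptions: integers modulo a prime under addition stand in for elliptic curve points, and the window size and scalar width are arbitrary illustrative values. Each window is independent of the others, which is what makes the method parallel, while each window also needs its own array of buckets, which is where the memory goes.

```python
# Bucket (Pippenger-style) MSM over a stand-in group.
# Assumption: integers modulo Q under addition replace elliptic-curve points,
# so "point addition" is modular addition and the identity element is 0.

Q = 2**61 - 1


def add(a, b):
    return (a + b) % Q


def msm_naive(scalars, points):
    return sum(k * p for k, p in zip(scalars, points)) % Q


def msm_bucket(scalars, points, window_bits=4, scalar_bits=16):
    """Scalars must be smaller than 2**scalar_bits."""
    num_windows = (scalar_bits + window_bits - 1) // window_bits
    mask = (1 << window_bits) - 1
    window_sums = []
    for w in range(num_windows):          # windows are independent of each other
        buckets = [0] * (1 << window_bits)  # per-window memory; bucket 0 unused
        for k, p in zip(scalars, points):
            digit = (k >> (w * window_bits)) & mask
            if digit:
                buckets[digit] = add(buckets[digit], p)
        # Running-sum trick: computes sum_j j * buckets[j] with additions only.
        running, total = 0, 0
        for j in range(len(buckets) - 1, 0, -1):
            running = add(running, buckets[j])
            total = add(total, running)
        window_sums.append(total)
    # Combine from the most significant window, doubling window_bits times.
    result = 0
    for total in reversed(window_sums):
        for _ in range(window_bits):
            result = add(result, result)
        result = add(result, total)
    return result


scalars = [5, 65535, 1234, 0]
points = [7, 11, 13, 17]
print(msm_bucket(scalars, points) == msm_naive(scalars, points))
```

Larger windows mean fewer windows to combine but exponentially more buckets, so the window size is effectively a dial between compute and memory.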

NTT, on the other hand, involves random memory access, making it less suited for hardware acceleration and challenging to handle in distributed systems. This is because NTT’s random access nature often requires accessing data from other nodes in a distributed environment. When network interaction is necessary, performance can suffer dramatically.

Therefore, the access and movement of stored data become major bottlenecks, limiting the ability to parallelize NTT operations. Most efforts to accelerate NTT focus on managing how computation interacts with memory.
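
The memory access problem shows up directly in the structure of the algorithm. The sketch below is a textbook iterative radix-2 NTT that also records, for each stage, the distance between the two elements combined by a butterfly. The prime 998244353 with primitive root 3 is a commonly used NTT-friendly modulus picked here as an assumption; ZK systems use the scalar field of their own curve instead.

```python
# Iterative radix-2 NTT that records the butterfly stride of each stage.
# Assumption: Q = 998244353 (= 119 * 2**23 + 1) with primitive root 3 is used
# only because it is a convenient NTT-friendly prime.

Q = 998244353
G = 3


def bit_reverse_permute(a):
    n = len(a)
    bits = n.bit_length() - 1
    out = [0] * n
    for i in range(n):
        rev = int(format(i, f"0{bits}b")[::-1], 2) if bits else 0
        out[rev] = a[i]
    return out


def ntt(values):
    """Return (transformed values, list of strides); length must be a power of two."""
    n = len(values)
    a = bit_reverse_permute(values)
    strides = []
    length = 2
    while length <= n:
        half = length // 2
        strides.append(half)  # distance between the two butterfly inputs
        w_len = pow(G, (Q - 1) // length, Q)
        for start in range(0, n, length):
            w = 1
            for j in range(half):
                u = a[start + j]
                v = a[start + j + half] * w % Q
                a[start + j] = (u + v) % Q
                a[start + j + half] = (u - v) % Q
                w = w * w_len % Q
        length *= 2
    return a, strides


def ntt_naive(values):
    """O(n^2) reference used to check the fast version."""
    n = len(values)
    omega = pow(G, (Q - 1) // n, Q)
    return [
        sum(x * pow(omega, i * j, Q) for j, x in enumerate(values)) % Q
        for i in range(n)
    ]


data = [1, 2, 3, 4, 5, 6, 7, 8]
result, strides = ntt(data)
print(result == ntt_naive(data), strides)
```

The stride doubles at every stage, so in the final stage each butterfly combines elements half the array apart. For an input split across several devices or nodes, those late stages are the ones that force data to cross device boundaries.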

In fact, the simplest way to address the efficiency bottleneck of MSM and NTT is to eliminate these operations altogether. Some newly proposed algorithms, like Hyperplonk, modify Plonk to remove NTT operations, which makes Hyperplonk easier to accelerate, though it introduces a new bottleneck in the form of the computationally expensive sumcheck protocol. Another example is the STARK algorithm, which eliminates MSM but adds significant hash computation through its FRI protocol.

ZK Hardware Acceleration and Cysic’s Ultimate Goal

While software and algorithmic optimizations are essential and valuable, they have clear limitations. To fully optimize the efficiency of ZKP generation, hardware acceleration is crucial, much like how ASICs and GPUs eventually dominated the BTC and ETH mining markets.

The question then becomes: what is the best hardware for accelerating ZKP generation? Currently, several hardware options are available for ZK acceleration, such as GPUs, FPGAs, or ASICs, each with its own set of advantages and disadvantages.

Comparing GPU, FPGA, and ASIC Hardware

To better understand the differences in development processes across GPU, FPGA, and ASIC hardware, let’s consider a simple example: implementing parallel multiplication.

  • GPU: Using the CUDA SDK, developers can write code that takes advantage of parallel computing, similar to writing native code.
  • FPGA: Developers need to learn a hardware description language (HDL) to control hardware-level connections and implement parallel algorithms.
  • ASIC: The chip’s transistor layout is fixed during the design phase and cannot be modified later.
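
To show what "parallel multiplication" means as a workload, here is a minimal software sketch: element-wise modular multiplication of two vectors, split into chunks that are processed independently. The modulus, the chunking scheme, and the function names are assumptions for illustration. Python threads are used only to show the decomposition (CPython's global interpreter lock prevents a real speed-up here); on a GPU each product would map to its own thread, and on an FPGA or ASIC to dedicated multiplier circuits.

```python
# Element-wise modular multiplication split into independent chunks.
# Assumptions: the modulus Q and the chunking scheme are illustrative; threads
# demonstrate the data-parallel structure, not an actual performance gain.

from concurrent.futures import ThreadPoolExecutor

Q = 2**61 - 1


def multiply_chunk(chunk):
    xs, ys = chunk
    return [(x * y) % Q for x, y in zip(xs, ys)]


def parallel_multiply(xs, ys, workers=4):
    """Multiply xs[i] * ys[i] mod Q for every i; no product depends on another."""
    size = max(1, (len(xs) + workers - 1) // workers)
    chunks = [(xs[i:i + size], ys[i:i + size]) for i in range(0, len(xs), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(multiply_chunk, chunks))
    return [value for part in parts for value in part]


print(parallel_multiply([1, 2, 3, 4, 5], [6, 7, 8, 9, 10]))
```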

Each hardware option has its strengths and weaknesses, making them suitable for different stages of ZK technology development. Cysic’s goal is to become the ultimate solution for ZK hardware acceleration, using a phased strategy:

  1. GPU: Develop an SDK to provide solutions for ZK applications and integrate GPU resources across the network.
  2. FPGA: Leverage FPGA’s flexibility to quickly create customized ZK hardware acceleration.
  3. ASIC: Independently develop ASIC-based ZK DePIN hardware.
  4. Cysic Network: Integrate all the computing power of ZK DePIN devices and GPUs into a SaaS platform and mining pool that provides computing power and verification solutions for the entire ZK industry.

Let’s explore these various sub-fields to better understand the distinctions between ZK acceleration solutions and Cysic’s development approach.

ZK Mining Pool and SaaS Platform: Cysic Network

Both Scroll and Polygon zkEVM have proposed the concept of a “decentralized Prover” in their roadmaps, which essentially means building ZK mining pools. This market-driven approach helps ZK Rollup projects reduce their workload while incentivizing miners and mining pool operators to continuously optimize ZK acceleration solutions. Cysic’s roadmap includes the development of a ZK mining pool and SaaS platform called Cysic Network, which will integrate Cysic’s computing power and attract third-party resources through mining incentives, including idle GPUs and consumer-owned ZK DePIN devices. The entire verification workflow works as follows:

  1. Task Submission: The ZK project team submits a proof generation task to an agent, who forwards the task to the verification network. Initially, these agents will be operated by Cysic, but later, asset staking will allow anyone to become an agent.
  2. Proof Generation: The Prover accepts the task and uses hardware to generate the ZK proof. The Prover must stake tokens to participate and will be rewarded after completing the task.
  3. Validation: The Validator Committee checks the proof’s validity and votes on it. Once a certain number of votes is reached, the proof is considered valid. Validators join the committee by staking tokens, participating in voting, and earning rewards. This process may incorporate EigenLayer’s AVS concept to reuse existing Restaking facilities.
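
The three steps above can be sketched as a small simulation. Everything here is hypothetical: the class names, the `run_task` function, the reward amount, and the use of a SHA-256 digest as a stand-in for a real ZK proof are assumptions made for illustration and do not describe Cysic Network's actual interfaces. Validator rewards and agent staking are left out to keep the sketch short.

```python
# Hypothetical simulation of task submission, proof generation and validation.
# All names and numbers are assumptions; a hash digest stands in for a ZK proof.

import hashlib
from dataclasses import dataclass


def fake_proof(task: bytes) -> bytes:
    """Stand-in for expensive ZK proof generation."""
    return hashlib.sha256(b"proof:" + task).digest()


@dataclass
class Prover:
    name: str
    stake: int
    rewards: int = 0

    def prove(self, task: bytes) -> bytes:
        return fake_proof(task)


@dataclass
class Validator:
    name: str
    stake: int

    def vote(self, task: bytes, proof: bytes) -> bool:
        # Stand-in for cheap proof verification.
        return proof == fake_proof(task)


def run_task(task, prover, committee, threshold, min_stake=1, reward=10):
    """Return True if the proof collects at least `threshold` yes-votes."""
    if prover.stake < min_stake:
        raise ValueError("prover must stake tokens to participate")
    proof = prover.prove(task)
    yes_votes = sum(
        1 for v in committee if v.stake >= min_stake and v.vote(task, proof)
    )
    accepted = yes_votes >= threshold
    if accepted:
        prover.rewards += reward  # the Prover is rewarded after completing the task
    return accepted


prover = Prover("prover-1", stake=100)
committee = [Validator(f"validator-{i}", stake=50) for i in range(5)]
print(run_task(b"rollup-batch-42", prover, committee, threshold=3), prover.rewards)
```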

[Figure: the detailed interaction process]

In this process, certain actions, such as asset staking, incentive distribution, and task submission, require a dedicated platform supported by blockchain infrastructure. To meet this need, Cysic Network has developed a dedicated public chain with its own consensus algorithm called Proof of Compute (PoC). This algorithm uses a verifiable random function (VRF) together with the Prover’s historical performance, such as device availability, the number of proofs submitted, and proof accuracy, to select the block producers responsible for creating blocks (these blocks likely record device information and distribute token incentives).

Beyond the ZK mining pool and SaaS platform, Cysic has made extensive deployments in ZK acceleration solutions based on different hardware. Let’s explore Cysic’s achievements in GPU, FPGA, and ASIC technology.
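
One way to picture performance-weighted random selection is the sketch below. It is purely hypothetical: the article does not specify how Proof of Compute combines the VRF output with performance history, so the scoring formula, the field names, and the use of a SHA-256 hash as a stand-in for a real VRF are all assumptions.

```python
# Hypothetical performance-weighted selection of a block producer.
# Assumptions: the score formula and field names are invented for illustration,
# and a SHA-256 hash of a seed stands in for a real VRF output.

import hashlib


def performance_score(prover):
    """Combine the factors the article lists into one integer weight."""
    return (
        prover["availability_pct"]
        * prover["accuracy_pct"]
        * prover["proofs_submitted"]
    )


def select_block_producer(provers, seed: bytes):
    scores = [(p["name"], performance_score(p)) for p in provers]
    total = sum(score for _, score in scores)
    if total <= 0:
        raise ValueError("no eligible prover")
    # Map the pseudo-random output onto the range [0, total).
    ticket = int.from_bytes(hashlib.sha256(seed).digest(), "big") % total
    cumulative = 0
    for name, score in scores:
        cumulative += score
        if ticket < cumulative:
            return name


provers = [
    {"name": "alice", "availability_pct": 99, "accuracy_pct": 100, "proofs_submitted": 500},
    {"name": "bob", "availability_pct": 80, "accuracy_pct": 95, "proofs_submitted": 120},
    {"name": "carol", "availability_pct": 60, "accuracy_pct": 90, "proofs_submitted": 0},
]
print(select_block_producer(provers, b"block-1001"))
```

Under this scheme a Prover with a better track record holds a larger share of the ticket range, while a Prover that has never submitted a proof (carol above) cannot be selected at all.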

GPU, FPGA, and ASIC: A Comparison

The essence of ZK (Zero-Knowledge) hardware acceleration lies in maximizing the parallelization of key computations. From a hardware perspective, CPUs are designed for maximum flexibility and general-purpose use. However, a significant portion of the CPU’s chip area is dedicated to control functions and various levels of cache, which limits its parallel computing capabilities. In contrast, a larger proportion of a GPU’s chip area is allocated to computation, enabling it to support large-scale parallel processing. GPUs are now widely available, and libraries like Nvidia CUDA allow developers to leverage GPU parallelism without needing deep knowledge of the underlying hardware. The CUDA SDK provides a framework for accelerating MSM (Multi-Scalar Multiplication) and NTT (Number Theoretic Transform) computations using CUDA ZK libraries.

FPGA (Field-Programmable Gate Array) takes a different approach, comprising arrays of numerous small configurable logic units. To program an FPGA, developers must use a specialized hardware description language (HDL), which is then compiled into a configuration of those logic units and the connections between them. Essentially, an FPGA implements specific algorithms directly as hardware circuits, bypassing the instruction fetch and decode steps of a traditional processor. This approach offers much greater customization and flexibility compared to GPUs. Currently, FPGA prices are about one-third of GPU prices, and FPGAs can be more than ten times more energy-efficient. This energy efficiency advantage is partly because GPUs need to be connected to a host device, which typically consumes a lot of power. An FPGA can add more computing modules to meet the demands of MSM and NTT with little additional energy consumption, making it particularly suitable for ZK proof scenarios that are computationally intensive, require high data throughput, and need low response times. However, the biggest challenge with FPGAs is the scarcity of developers with the necessary programming experience. For ZK project teams, assembling a team with both cryptography expertise and FPGA engineering knowledge is extremely challenging.

ASIC (Application-Specific Integrated Circuit) is the most specialized of the three, essentially implementing a program entirely in hardware. Once an ASIC is designed, the hardware configuration is fixed and cannot be altered, meaning it can only perform specific tasks. The advantages of FPGA in accelerating MSM and NTT also apply to ASIC, but because ASIC is designed for a specific application, it offers the highest efficiency and the lowest power consumption among all hardware options. For mainstream ZK circuits today, Cysic aims to achieve proof times of 1-5 seconds, which only ASIC can deliver. While these benefits are highly appealing, ZK technology is rapidly evolving, and ASIC design and production cycles typically take 1-2 years and cost between $10 million and $20 million. Therefore, large-scale production must wait until ZK technology stabilizes to avoid producing chips that quickly become obsolete.

To address these challenges, Cysic has made comprehensive investments in all three hardware categories: GPU, FPGA, and ASIC. In GPU acceleration, Cysic has adapted to the emergence of various new ZK proof systems through its self-developed CUDA acceleration SDK. By consolidating community resources, Cysic has connected tens of thousands of top-tier GPUs into its GPU computing network, achieving speed improvements of 50%-80% or more compared to the latest open-source frameworks. In the FPGA space, Cysic has developed solutions that set global performance benchmarks for MSM, NTT, and Poseidon Merkle tree modules, covering the most critical components of ZK computation. These solutions have been prototype-tested and validated by several leading ZK projects. Cysic’s proprietary SolarMSM can complete 2^30-scale MSM computations in just 0.195 seconds, while SolarNTT can perform 2^30-scale NTT computations in 0.218 seconds, making them the highest-performing FPGA hardware acceleration results currently available.
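
For a sense of scale, the quoted benchmark figures can be turned into throughput with simple arithmetic. The sketch below uses only the sizes and times stated above; the curve and field behind these benchmarks are not given in the article, so the results should be read as rough orders of magnitude rather than comparable specifications.

```python
# Back-of-the-envelope throughput from the benchmark figures quoted above.
# Only the problem size and timings come from the article; the rest is arithmetic.

SIZE = 2**30           # problem size stated for both benchmarks
MSM_SECONDS = 0.195    # SolarMSM time quoted in the article
NTT_SECONDS = 0.218    # SolarNTT time quoted in the article


def throughput(elements, seconds):
    """Elements processed per second."""
    return elements / seconds


msm_rate = throughput(SIZE, MSM_SECONDS)
ntt_rate = throughput(SIZE, NTT_SECONDS)
print(f"MSM: {msm_rate / 1e9:.2f} billion points per second")
print(f"NTT: {ntt_rate / 1e9:.2f} billion elements per second")
```

Both figures come out at roughly five billion elements per second.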

In the ASIC field, while widespread adoption of ZK ASICs may still be some time away, Cysic has already positioned itself in this emerging market by developing its own ZK DePIN chips and devices. To appeal to consumer users and meet the diverse performance and cost requirements of different ZK projects, Cysic plans to introduce two ZK hardware products: ZK Air and ZK Pro.

  • ZK Air: This device is compact, similar in size to a power bank or a laptop charger, allowing everyday users to connect it via a Type-C interface to laptops, iPads, or even smartphones. It provides computational support for specific ZK projects while earning rewards for the user. Despite its small size, ZK Air’s computing power exceeds that of consumer-grade GPUs, making it capable of accelerating smaller-scale ZK proof generation tasks.
  • ZK Pro: Designed for more intensive applications, ZK Pro resembles traditional mining rigs and offers computing power equivalent to a multi-GPU server. It significantly speeds up ZK proof generation for large-scale projects such as ZK-Rollup and ZKML (Zero-Knowledge Machine Learning).

Through these two devices, Cysic aims to create a stable and reliable ZK-DePIN network. Both ZK Air and ZK Pro are currently under development, with an anticipated release in 2025. Additionally, the Cysic Network will enable consumer users to enter the ZK hardware acceleration market with very low barriers to entry. Coupled with the high demand for computational power from ZK project teams, this could ignite a new wave of enthusiasm similar to the Bitcoin mining boom, potentially leading to explosive growth in the ZK computing market.

Reference

https://medium.com/amber-group/need-for-speed-zero-knowledge-1e29d4a82fcd
https://figmentcapital.medium.com/accelerating-zero-knowledge-proofs-cfc806de611b

Disclaimer:

  1. This article is reprinted from Geek Web3. The copyright belongs to the original authors, [Nickqiao & Wuyue]. If there are any objections to the reprint, please contact the Gate Learn team, and the team will process it promptly according to the relevant procedures.
  2. Disclaimer: The views and opinions expressed in this article are solely those of the authors and do not constitute any investment advice.
  3. Other language versions of the article have been translated by the Gate Learn team. The translated articles may not be copied, distributed, or plagiarized without mentioning Gate.io.

Understanding Cysic: The Dawn of Hardware Acceleration and the Emergence of ZK Mining

IntermediateAug 14, 2024
This article introduces the ZK proof system workflow and explores the challenges and optimization strategies for accelerating MSM and NTT computations.
Understanding Cysic: The Dawn of Hardware Acceleration and the Emergence of ZK Mining

In April, Vitalik attended the Hong Kong Blockchain Summit and delivered a speech titled “Reaching the Limits of Protocol Design,” where he highlighted the potential of ZK-SNARKs within Ethereum’s Danksharding roadmap and discussed the promising role of ASIC chips in accelerating ZK processes. Earlier, Scroll co-founder Zhang Ye suggested that the potential applications of ZK could be even greater in traditional sectors than in Web3, with significant demand in areas such as trusted computing, databases, verifiable hardware, content authentication, and zkML. Should real-time ZK proof generation become feasible, it could lead to transformative changes across both Web3 and traditional industries. However, from the standpoint of efficiency and cost, widespread adoption of ZK is still some way off.

Back in 2022, leading venture capital firms a16z and Paradigm released reports underscoring the importance of ZK hardware acceleration. Paradigm went so far as to predict that future earnings for ZK miners could rival those of Bitcoin or Ethereum miners, with hardware acceleration solutions based on GPU, FPGA, and ASIC poised to capture a significant market. Following the rise of mainstream ZK Rollups like Scroll and Starknet, hardware acceleration has become a hot topic, and interest has intensified with the approaching launch of projects like Cysic.

Given the vast demand for ZK, it is likely that ZK mining pools and real-time ZKP generation SaaS models could give rise to a new industry. In this emerging market, ZK hardware manufacturers with strong capabilities and early-mover advantage could potentially become the next Bitmain, dominating the field of hardware acceleration. Cysic stands out as one of the most promising players in this space. The team has won notable awards from the ZKP technology competition platform ZPrize and began mentoring for ZPrize in 2023. Their roadmap features ToB (business-to-business) ZK mining pools and ToC (business-to-consumer) ZK-Depin hardware, attracting substantial investment from top VCs like Polychain, ABCDE, OKX Ventures, and Hashkey, resulting in nearly $20 million in funding.

As Cysic prepares to launch its testnet at the end of July and open its ZK mining pool, discussions about the company are heating up across various communities. This article aims to introduce more people to Cysic’s product concepts and business model while providing an accessible overview of ZK hardware acceleration principles. In the sections that follow, we will briefly outline the key aspects of Cysic, making it easier for readers to understand.

Understanding ZK Proof Systems: A Workflow Perspective

The ZK (Zero-Knowledge) proof system is intricate, but we can simplify its understanding by breaking it down through its functions and workflow. Here’s a basic overview of how a system designed to apply ZK to ordinary computations works: First, the user interacts with the ZK system via a front-end interface, submitting the content they want to prove. The front-end then converts this content into a format suitable for processing by the ZK proof system. The system uses a specific proof system or framework (like Halo2 or Plonk) to generate a ZK Proof. This process includes several key steps:

  1. Defining the Problem: The first step is to identify the specific content that needs to be proven. For instance, the Prover may claim to know or possess certain data, such as stating, “I know a solution N to the equation F(x)=w,” without revealing the actual value of N.
  2. Arithmetic Conversion and Constraint Satisfaction Problems (CSP): After the Prover submits the content, the system creates a specialized mathematical model or program that accurately represents the content to be proven. This is then converted into a format that the proof system can process. For example, the statement “I know a solution N to the equation F(x)=w” is transformed from its original mathematical equation into a form represented by logic gate circuits and polynomials.

  1. Compiling into ZKP: Next, the system selects an appropriate proof system, like Halo or Plonk, and compiles the previously generated content into a ZKP program. The Prover then uses this program to generate a proof, which the Verifier checks for validity.

For systems like zkEVM, commonly used in Ethereum Layer 2 solutions, smart contracts are first compiled into EVM (Ethereum Virtual Machine) bytecode. Each opcode is then converted into logic gate circuits or polynomial constraints before being processed further by the back-end ZK proof system.

It’s important to note that zk-SNARKs (Zero-Knowledge Succinct Non-Interactive Argument of Knowledge) are the most commonly used ZKP technology in blockchain today. Many ZK Rollups leverage the succinctness of SNARKs rather than their zero-knowledge property. Succinctness refers to the ZKP’s ability to compress large amounts of data into a few hundred bytes, significantly reducing verification costs. This results in an asymmetry between the workload of the Prover and Verifier: while it is costly for the Prover to generate the ZKP, it is relatively inexpensive for the Verifier to verify it. By exploiting this asymmetry, a scenario with one Prover and multiple Verifiers can significantly lower the overall cost on the Verifier’s side. This model is particularly advantageous for decentralized verification, as envisioned by Ethereum’s Layer 2 solutions.

However, this model of offloading verification costs onto the ZKP generation process is not a cure-all. For ZK Rollup projects, the high cost of generating ZKP will inevitably be passed on to the user experience and transaction fees, which could hinder the long-term adoption of ZK Rollups. Despite ZK’s potential in trustless and decentralized verification, the current economic conditions do not support large-scale implementation of zkEVM, zkVM, ZK Rollups, or ZK bridges due to the time constraints associated with proof generation. This has led to the rise of ZK acceleration projects like Cysic, Ingonyama, and Irreducible, each working to reduce the cost of ZKP generation from different angles. In the following section, we will briefly discuss the main computational costs and acceleration techniques for ZKP generation, and why Cysic holds significant potential in the ZK acceleration space.

Computational Challenges: MSM and NTT

It’s widely known that generating proofs in ZK systems is time-consuming for the Prover. In the ZK-SNARK protocol, a Verifier might be able to verify a proof in just one second, but it could take the Prover half a day or even a full day to generate that proof. To optimize the use of ZKP computations, it is necessary to convert the computation format from classical programming to a ZK-friendly format.

There are currently two primary methods for achieving this: one involves writing circuits using proof system frameworks like Halo2, while the other involves using domain-specific languages (DSLs) such as Cairo or Circom, to translate computations into an intermediate format that can then be submitted to the proof system. The proof system generates ZK proofs based on these circuits or the intermediate formats compiled by the DSLs. The more complex the operations, the longer it takes to generate the proof. Moreover, some operations are inherently ZK-unfriendly and require additional effort to implement. For example, hash functions like SHA or Keccak are ZKP-unfriendly, meaning using them increases proof generation time. Even operations that are inexpensive to execute on classical computers may not be efficient for ZKP.

Excluding these ZK-unfriendly tasks, the bottlenecks in the proof generation process are quite similar across different proof systems. There are two main computational tasks that consume most of the resources in ZK proof generation: MSM (Multi-Scalar Multiplication) and NTT (Number Theoretic Transform). These two tasks can account for 80-95% of the proof generation time, depending on the ZKP commitment scheme and specific implementation. MSM involves performing multi-scalar multiplication on elliptic curves, while NTT is an FFT (Fast Fourier Transform) on finite fields used to speed up polynomial multiplication. Different combinations of these tasks can result in varying load distributions between FFT and MSM. For example, Stark uses FRI, a hash-based commitment scheme that does not involve MSM, unlike the elliptic curve-based schemes like KZG or IPA. Generally, the more FFT operations required, the fewer MSM operations, and vice versa.

Optimization Strategies

MSM operations are characterized by predictable memory access, which allows for high parallelization but demands significant memory resources. However, MSM also presents scalability challenges; even with parallelization, it can still be slow. While hardware acceleration can help speed up MSM, it requires substantial memory and parallel computing resources.

NTT, on the other hand, involves random memory access, making it less suited for hardware acceleration and challenging to handle in distributed systems. This is because NTT’s random access nature often requires accessing data from other nodes in a distributed environment. When network interaction is necessary, performance can suffer dramatically.

Therefore, the access and movement of stored data become major bottlenecks, limiting the ability to parallelize NTT operations. Most efforts to accelerate NTT focus on managing how computation interacts with memory.

In fact, the simplest way to address the efficiency bottleneck of MSM and NTT is to eliminate these operations altogether. Some newly proposed algorithms, like Hyperplonk, modify Plonk to remove NTT operations, making Hyperplonk easier to accelerate, though it introduces new bottlenecks. Other examples include the computationally expensive sumcheck protocol or the STARK algorithm, which eliminates MSM but adds significant hash computation through its FRI protocol.

ZK Hardware Acceleration and Cysic’s Ultimate Goal

While software and algorithmic optimizations are essential and valuable, they have clear limitations. To fully optimize the efficiency of ZKP generation, hardware acceleration is crucial, much like how ASICs and GPUs eventually dominated the BTC and ETH mining markets.

The question then becomes: what is the best hardware for accelerating ZKP generation? Currently, several hardware options are available for ZK acceleration, such as GPUs, FPGAs, or ASICs, each with its own set of advantages and disadvantages.

Comparing GPU, FPGA, and ASIC Hardware

To better understand the differences in development processes across GPU, FPGA, and ASIC hardware, let’s consider a simple example: implementing parallel multiplication.

  • GPU: Using the CUDA SDK, developers can write code that takes advantage of parallel computing, similar to writing native code.
  • FPGA: Developers need to learn a hardware description language (HDL) to control hardware-level connections and implement parallel algorithms.
  • ASIC: The chip’s transistor layout is fixed during the design phase and cannot be modified later.

Each hardware option has its strengths and weaknesses, making them suitable for different stages of ZK technology development. Cysic’s goal is to become the ultimate solution for ZK hardware acceleration, using a phased strategy:

  1. GPU: Develop an SDK to provide solutions for ZK applications and integrate GPU resources across the network.
  2. FPGA: Leverage FPGA’s flexibility to quickly create customized ZK hardware acceleration.
  3. ASIC: Independently develop ASIC-based ZK DePIN hardware.
  4. Cysic Network: Integrate the computing power of ZK DePIN devices and GPUs into a SaaS platform and mining pool that provides computing power and verification solutions for the entire ZK industry.

Let’s explore these various sub-fields to better understand the distinctions between ZK acceleration solutions and Cysic’s development approach.

ZK Mining Pool and SaaS Platform: Cysic Network

Both Scroll and Polygon zkEVM have proposed the concept of a “decentralized Prover” in their roadmaps, which essentially means building ZK mining pools. This market-driven approach helps ZK Rollup projects reduce their workload while incentivizing miners and mining pool operators to continuously optimize ZK acceleration solutions. Cysic’s roadmap includes a ZK mining pool and SaaS platform called Cysic Network, which will integrate Cysic’s own computing power and attract third-party resources, including idle GPUs and consumer-owned ZK DePIN devices, through mining incentives. The verification workflow is as follows:

  1. Task Submission: The ZK project team submits a proof generation task to an agent, who forwards the task to the verification network. Initially, these agents will be operated by Cysic, but later, asset staking will allow anyone to become an agent.
  2. Proof Generation: The Prover accepts the task and uses hardware to generate the ZK proof. The Prover must stake tokens to participate and will be rewarded after completing the task.
  3. Validation: The Validator Committee checks the proof’s validity and votes on it. Once a certain number of votes is reached, the proof is considered valid. Validators join the committee by staking tokens, take part in voting, and earn rewards. This process may incorporate EigenLayer’s AVS concept to reuse existing restaking infrastructure.
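The three steps above can be sketched as a small Python model. This is a hypothetical illustration, not Cysic’s actual protocol or API: the class and function names, the stake and reward amounts, the vote threshold, and the hash standing in for a real ZK proof are all assumptions. It also assumes the reward is paid only once the committee accepts the proof.

```python
import hashlib
from dataclasses import dataclass

@dataclass
class Prover:
    name: str
    stake: int          # tokens staked to be allowed to prove (hypothetical units)
    rewards: int = 0

def generate_proof(task: bytes) -> str:
    """Placeholder for real proof generation on GPU/FPGA/ASIC hardware."""
    return hashlib.sha256(b"proof:" + task).hexdigest()

def verify_proof(task: bytes, proof: str) -> bool:
    """Placeholder check that an honest validator would run."""
    return proof == hashlib.sha256(b"proof:" + task).hexdigest()

def run_task(task, prover, validators, threshold, min_stake=1, reward=10):
    """Agent forwards the task, the prover proves, the committee votes."""
    if prover.stake < min_stake:
        raise ValueError("prover must stake tokens to participate")
    proof = generate_proof(task)                           # step 2: proof generation
    votes = sum(1 for check in validators if check(task, proof))   # step 3: voting
    accepted = votes >= threshold
    if accepted:
        prover.rewards += reward                           # assumed: pay on acceptance
    return accepted, votes
```

A committee here is simply a list of callables, so an honest validator can be `verify_proof` and a faulty one any function that returns the wrong answer. That makes it easy to see how the threshold tolerates a minority of bad votes.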

The detailed interaction process is as follows

In this process, certain actions, such as asset staking, incentive distribution, and task submission, require a dedicated platform supported by blockchain infrastructure. To meet this need, Cysic Network has developed a dedicated public chain with a unique consensus algorithm called Proof of Compute (PoC). This algorithm uses a VRF (verifiable random function) together with each Prover’s historical performance, such as device availability, number of proofs submitted, and proof accuracy, to select the block producers responsible for creating blocks (these blocks likely record device information and distribute token incentives). Beyond the ZK mining pool and SaaS platform, Cysic has also invested heavily in ZK acceleration solutions based on different hardware. Let’s explore Cysic’s achievements in GPU, FPGA, and ASIC technology.
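As a rough illustration of the Proof of Compute selection step described above, the Python sketch below picks a block producer with probability proportional to a performance score. Everything specific in it is an assumption: the article does not give the scoring formula, so the weighting is invented for illustration, and a SHA-256 hash stands in for a real VRF (which would also produce a proof that others can verify against the producer’s public key).

```python
import hashlib

def performance_score(availability, proofs_submitted, accuracy):
    """Hypothetical score; the real PoC weighting is not specified in the article."""
    return availability * accuracy * (1 + proofs_submitted)

def pseudo_vrf(seed: bytes, round_no: int) -> float:
    """Stand-in for a VRF output: a deterministic value in [0, 1]."""
    digest = hashlib.sha256(seed + round_no.to_bytes(8, "big")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64

def select_producer(provers, seed, round_no):
    """Pick one prover, weighted by score, using the pseudo-random value."""
    scores = {name: performance_score(**stats) for name, stats in provers.items()}
    total = sum(scores.values())
    target = pseudo_vrf(seed, round_no) * total
    cumulative = 0.0
    for name in sorted(scores):            # fixed order keeps selection deterministic
        cumulative += scores[name]
        if target < cumulative:
            return name
    return sorted(scores)[-1]              # guard against floating-point rounding

provers = {
    "a_offline": {"availability": 0.0, "proofs_submitted": 50, "accuracy": 1.0},
    "b_steady": {"availability": 0.99, "proofs_submitted": 120, "accuracy": 0.98},
    "c_new": {"availability": 0.90, "proofs_submitted": 3, "accuracy": 1.0},
}
```

Calling `select_producer(provers, b"epoch-seed", round_no)` for successive rounds favors provers with better track records, while a prover with zero availability is never chosen.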

GPU, FPGA, and ASIC: A Comparison

The essence of ZK (Zero-Knowledge) hardware acceleration lies in maximizing the parallelization of key computations. From a hardware perspective, CPUs are designed for maximum flexibility and general-purpose use. However, a significant portion of a CPU’s chip area is dedicated to control functions and various levels of cache, which limits its parallel computing capabilities. In contrast, a larger proportion of a GPU’s chip area is allocated to computation, enabling it to support large-scale parallel processing. GPUs are now widely available, and platforms like NVIDIA CUDA allow developers to leverage GPU parallelism without needing deep knowledge of the underlying hardware. The CUDA SDK provides a framework for accelerating MSM and NTT computations using CUDA ZK libraries.

FPGA (Field-Programmable Gate Array) takes a different approach: the chip consists of a large array of small configurable logic units. To program an FPGA, developers must use a specialized hardware description language (HDL), which is synthesized into a configuration of the chip’s logic blocks and interconnect. Essentially, an FPGA implements a specific algorithm directly as a circuit, bypassing the instruction fetch-and-execute process of a conventional processor. This approach offers much greater customization and flexibility compared to GPUs. Currently, FPGA prices are about one-third of GPU prices, and FPGAs can be more than ten times more energy-efficient. This energy efficiency advantage is partly because GPUs need to be connected to a host device, which typically consumes a lot of power. An FPGA can add more computing modules to meet the demands of MSM and NTT without increasing energy consumption, making it particularly suitable for ZK proof scenarios that are computationally intensive, require high data throughput, and need low response times. However, the biggest challenge with FPGAs is the scarcity of developers with the necessary programming experience. For ZK project teams, assembling a team with both cryptography expertise and FPGA engineering knowledge is extremely challenging.

ASIC (Application-Specific Integrated Circuit) is the most specialized of the three, essentially implementing a program entirely in hardware. Once an ASIC is designed, the hardware configuration is fixed and cannot be altered, meaning it can only perform specific tasks. The advantages of FPGA in accelerating MSM and NTT also apply to ASIC, but because ASIC is designed for a specific application, it offers the highest efficiency and the lowest power consumption among all hardware options. For mainstream ZK circuits today, Cysic aims to achieve proof times of 1-5 seconds, which only ASIC can deliver. While these benefits are highly appealing, ZK technology is rapidly evolving, and ASIC design and production cycles typically take 1-2 years and cost between $10 million and $20 million. Therefore, large-scale production must wait until ZK technology stabilizes to avoid producing chips that quickly become obsolete.

To address these challenges, Cysic has made comprehensive investments in all three hardware categories: GPU, FPGA, and ASIC. In GPU acceleration, Cysic has adapted to the emergence of various new ZK proof systems through its self-developed CUDA acceleration SDK. By consolidating community resources, Cysic has connected tens of thousands of top-tier GPUs into its GPU computing network, achieving speed improvements of 50%-80% or more compared to the latest open-source frameworks. In the FPGA space, Cysic has developed solutions that set global performance benchmarks for MSM, NTT, and Poseidon Merkle tree modules, covering the most critical components of ZK computation. These solutions have been prototype-tested and validated by several leading ZK projects. Cysic’s proprietary SolarMSM can complete 2^30-scale MSM computations in just 0.195 seconds, while SolarNTT can perform 2^30-scale NTT computations in 0.218 seconds, making them the highest-performing FPGA hardware acceleration results currently available.

In the ASIC field, while widespread adoption of ZK ASICs may still be some time away, Cysic has already positioned itself in this emerging market by developing its own ZK DePIN chips and devices. To appeal to consumer users and meet the diverse performance and cost requirements of different ZK projects, Cysic plans to introduce two ZK hardware products: ZK Air and ZK Pro.

  • ZK Air: This device is compact, similar in size to a power bank or a laptop charger, allowing everyday users to connect it via a Type-C interface to laptops, iPads, or even smartphones. It provides computational support for specific ZK projects while earning rewards for the user. Despite its small size, ZK Air’s computing power exceeds that of consumer-grade GPUs, making it capable of accelerating smaller-scale ZK proof generation tasks.
  • ZK Pro: Designed for more intensive applications, ZK Pro resembles traditional mining rigs and offers computing power equivalent to a multi-GPU server. It significantly speeds up ZK proof generation for large-scale projects such as ZK-Rollup and ZKML (Zero-Knowledge Machine Learning).

Through these two devices, Cysic aims to create a stable and reliable ZK-DePIN network. Both ZK Air and ZK Pro are currently under development, with an anticipated release in 2025. Additionally, the Cysic Network will enable consumer users to enter the ZK hardware acceleration market with very low barriers to entry. Coupled with the high demand for computational power from ZK project teams, this could ignite a new wave of enthusiasm similar to the Bitcoin mining boom, potentially leading to explosive growth in the ZK computing market.

Reference

https://medium.com/amber-group/need-for-speed-zero-knowledge-1e29d4a82fcd
https://figmentcapital.medium.com/accelerating-zero-knowledge-proofs-cfc806de611b

Disclaimer:

  1. This article is reprinted from Geek Web3. The copyright belongs to the original authors, [Nickqiao & Wuyue]. If there are any objections to the reprint, please contact the Gate Learn team, and the team will process it promptly according to the relevant procedures.
  2. Disclaimer: The views and opinions expressed in this article are solely those of the authors and do not constitute any investment advice.
  3. Other language versions of the article have been translated by the Gate Learn team. The translated articles may not be copied, distributed, or plagiarized without mentioning Gate.io.