In-Depth Analysis: What Kind of Sparks Can AI and Web3 Create?

Advanced · Jun 08, 2024
This article explores the rapid development of Artificial Intelligence (AI) and Web3 technologies and the potential value and impact of their integration. AI excels in enhancing productivity, while Web3 transforms production relationships through decentralization. The combination of these technologies brings innovative applications in data analysis, personalized user services, and security and privacy protection.

Introduction: Development of AI+Web3

In recent years, the rapid development of Artificial Intelligence (AI) and Web3 technologies has garnered widespread global attention. AI, a technology that simulates and mimics human intelligence, has achieved significant breakthroughs in fields such as facial recognition, natural language processing, and machine learning. The swift advancement of AI technology has brought tremendous transformation and innovation across various industries.

The AI industry reached a market size of $200 billion in 2023, with industry giants and prominent players like OpenAI, Character.AI, and Midjourney emerging rapidly and leading the AI boom.

Simultaneously, Web3, an emerging internet model, is gradually changing our perception and usage of the internet. Based on decentralized blockchain technology, Web3 realizes data sharing and control, user autonomy, and the establishment of trust mechanisms through features like smart contracts, distributed storage, and decentralized identity verification. The core idea of Web3 is to liberate data from centralized authorities, granting users control over and the ability to share the value of their data.

Currently, the total market value of the Web3 industry has reached approximately $2.5 trillion. From Bitcoin, Ethereum, and Solana to application-level players like Uniswap and Stepn, new narratives and scenarios are continuously emerging, attracting more and more people to join the Web3 industry.

It is evident that the integration of AI and Web3 is a focal point for builders and venture capitalists from both the East and the West. Exploring how to effectively combine these two technologies is a highly worthwhile endeavor.

This article will focus on the current state of AI+Web3 development, exploring the potential value and impact of their integration. We will first introduce the basic concepts and characteristics of AI and Web3, then discuss their interrelationship. Following this, we will analyze the current state of AI+Web3 projects and delve into the limitations and challenges they face. Through this research, we aim to provide valuable references and insights for investors and industry professionals.

How AI Interacts with Web3

The development of AI and Web3 can be seen as two sides of a scale: AI brings productivity enhancements, while Web3 revolutionizes production relationships. So, what kind of sparks can AI and Web3 create when they collide? We will first analyze the challenges and potential improvements in the AI and Web3 industries, and then explore how they can help solve each other’s problems.

  1. Challenges and Potential Improvements in the AI Industry
  2. Challenges and Potential Improvements in the Web3 Industry

2.1 Challenges in the AI Industry

To explore the challenges faced by the AI industry, we must first understand its essence. The core of the AI industry revolves around three key elements: computational power, algorithms, and data.

  1. Computational power: Computational power refers to the ability to perform large-scale computations and processing. AI tasks typically require handling large amounts of data and performing complex computations, such as training deep neural network models. High computational power can accelerate model training and inference, enhancing the performance and efficiency of AI systems. In recent years, advancements in hardware such as graphics processing units (GPUs) and dedicated AI chips (like TPUs) have significantly boosted computational power, driving the development of the AI industry. Nvidia, a major GPU provider, has seen its stock price soar in recent years, capturing a large market share and earning substantial profits.
  2. Algorithms: Algorithms are the core components of AI systems. They are the mathematical and statistical methods used to solve problems and perform tasks. AI algorithms can be categorized into traditional machine learning algorithms and deep learning algorithms, with deep learning having made significant breakthroughs in recent years. The choice and design of algorithms are crucial for the performance and effectiveness of AI systems; continuous improvement and innovation can enhance their accuracy, robustness, and generalization capabilities. Different algorithms yield different results, so advancements in algorithms are essential for task performance.
  3. Data: The core task of AI systems is to extract patterns and rules from data through learning and training. Data forms the foundation for training and optimizing models. With large-scale data samples, AI systems can learn more accurate and intelligent models. Rich datasets provide comprehensive and diverse information, enabling models to generalize better to unseen data and helping AI systems better understand and solve real-world problems. A minimal sketch of how the three elements interact follows this list.
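To make the interplay of the three elements concrete, here is a minimal, self-contained sketch in plain Python/NumPy: the data is a synthetic sample set, the algorithm is gradient descent on a linear model, and computational power is simply how fast the training loop runs on the available hardware. Everything here is illustrative and not tied to any real AI framework.

```python
import numpy as np
import time

# Data: synthetic samples the model will learn from
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))           # 1,000 samples, 3 features
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + rng.normal(scale=0.1, size=1000)

# Algorithm: gradient descent on a linear model (mean squared error)
w = np.zeros(3)
lr = 0.1
start = time.time()
for step in range(500):                  # more data / more steps => more compute
    grad = 2 * X.T @ (X @ w - y) / len(y)
    w -= lr * grad

# Computational power: roughly, how quickly this loop runs on the hardware at hand
print(f"learned weights {w.round(2)} in {time.time() - start:.3f}s")
```

Scaling any of the three, richer data, a better algorithm, or faster hardware, improves the outcome, which is exactly the dynamic the rest of this section builds on.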

After understanding the three core elements of current AI, let’s examine the difficulties and challenges AI faces in these areas.

First, in terms of computational power, AI tasks usually require a large amount of computational resources for model training and inference, especially for deep learning models. However, obtaining and managing large-scale computational power is an expensive and complex challenge. The cost, energy consumption, and maintenance of high-performance computing equipment are significant issues. This is particularly challenging for startups and individual developers, for whom acquiring sufficient computational power can be difficult.

In terms of algorithms, despite the significant successes of deep learning algorithms in many fields, there are still challenges and difficulties. For instance, training deep neural networks requires a large amount of data and computational resources. Additionally, for certain tasks, the interpretability and explainability of models may be insufficient. The robustness and generalization capabilities of algorithms are also crucial issues, as model performance on unseen data can be unstable. Finding the best algorithm to provide optimal performance among many algorithms is an ongoing exploration.

In terms of data, data is the driving force behind AI, but obtaining high-quality, diverse data remains a challenge. Data in some fields, such as sensitive health data in the medical sector, can be difficult to obtain. Furthermore, the quality, accuracy, and labeling of data are critical issues, as incomplete or biased data can lead to erroneous model behavior or biases. Protecting data privacy and security is also a significant consideration.

Moreover, there are issues related to interpretability and transparency. The “black box” nature of AI models is a public concern. In certain applications, such as finance, healthcare, and justice, the decision-making process of models needs to be interpretable and traceable. However, existing deep learning models often lack transparency. Explaining the decision-making process of models and providing reliable explanations remain challenging.

Additionally, the business models of many AI startup projects are not very clear, which also causes confusion for many AI entrepreneurs.

2.2 Challenges in the Web3 Industry

In the Web3 industry, there are numerous challenges that need to be addressed, spanning from data analysis and user experience to smart contract vulnerabilities and hacker attacks. AI, as a tool to enhance productivity, holds significant potential in these areas.

Firstly, there’s room for improvement in data analysis and predictive capabilities. AI applications in data analysis and prediction have had a significant impact on the Web3 industry. Through intelligent analysis and mining by AI algorithms, Web3 platforms can extract valuable information from vast amounts of data and make more accurate predictions and decisions. This is particularly significant for risk assessment, market forecasting, and asset management in decentralized finance (DeFi).

Additionally, there’s the potential for enhancing user experience and personalization services. AI applications enable Web3 platforms to offer better user experiences and personalized services. By analyzing and modeling user data, Web3 platforms can provide personalized recommendations, customized services, and intelligent interaction experiences. This helps increase user engagement and satisfaction, fostering the development of the Web3 ecosystem. For instance, many Web3 protocols integrate AI tools like ChatGPT to better serve users.

In terms of security and privacy protection, AI applications also have a profound impact on the Web3 industry. AI technology can be used to detect and defend against network attacks, identify abnormal behavior, and provide stronger security measures. Additionally, AI can be applied to data privacy protection, safeguarding users’ personal information on Web3 platforms through techniques like data encryption and privacy computing. Regarding smart contract auditing, as vulnerabilities and security risks may exist in the writing and auditing processes of smart contracts, AI technology can be used for automated contract auditing and vulnerability detection, enhancing the security and reliability of contracts.

It’s evident that AI can contribute significantly to addressing the challenges and potential improvements in the Web3 industry across various aspects.

Analysis of the Current State of AI+Web3 Projects

Projects combining AI and Web3 primarily focus on two directions: leveraging blockchain technology to enhance AI projects, and utilizing AI technology to improve Web3 projects. Numerous projects have emerged along this path, including io.net, Gensyn, and Ritual, among others. The following analysis will delve into the different subdomains where Web3 helps AI and where AI helps Web3.

3.1 Web3 Helps AI

3.1.1 Decentralized Computing Power

The launch of ChatGPT by OpenAI at the end of 2022 ignited a frenzy in the AI field. Within five days of its release, the user base reached one million, far outpacing Instagram, which took approximately two and a half months to reach the same milestone. ChatGPT then grew rapidly, with monthly active users reaching 100 million within two months and weekly active users reaching 100 million by November 2023. With the advent of ChatGPT, the AI sector rapidly transitioned from a niche field to a highly regarded industry.

According to TrendForce, ChatGPT requires around 30,000 NVIDIA A100 GPUs to operate, and future models such as GPT-5 will require even more computational power. This has sparked an arms race among AI companies, as possessing sufficient computational power is crucial for maintaining a competitive edge in the AI arena, and it has led to a shortage of GPUs.

Prior to the rise of AI, the major GPU provider, NVIDIA, primarily served clients from the three major cloud services: AWS, Azure, and GCP. With the rise of artificial intelligence, numerous new buyers emerged, including major tech companies like Meta and Oracle, as well as other data platforms and AI startups, all joining the race to stockpile GPUs for training AI models. Large tech companies like Meta and Tesla significantly increased their GPU purchases for custom AI models and internal research. Foundation model companies like Anthropic and data platforms like Snowflake and Databricks also purchased more GPUs to help their clients provide AI services.

As SemiAnalysis noted last year, there is a divide between “GPU rich” and “GPU poor” companies, with only a few possessing over 20,000 A100/H100 GPUs and able to let team members use between 100 and 1,000 GPUs for their projects. These companies are either cloud providers or have built their own large language models (LLMs); they include OpenAI, Google, Meta, Anthropic, Inflection, Tesla, Oracle, and Mistral, among others.

However, the majority of companies fall into the “GPU poor” category, struggling with far fewer GPUs and spending considerable time and effort on work that does little to advance the ecosystem. Nor is this situation limited to startups. Some of the best-known AI companies, such as Hugging Face, Databricks (MosaicML), Together, and even Snowflake, each have fewer than 20,000 A100/H100 GPUs. Despite having world-class technical talent, these companies are constrained by the limited GPU supply, placing them at a disadvantage compared to larger companies in the AI competition.

The shortage is not limited to the “GPU poor”: at the end of 2023, even the leading AI player, OpenAI, had to temporarily pause paid sign-ups because it could not obtain sufficient GPUs, and had to procure additional GPU supplies.

It’s evident that the rapid development of AI has led to a serious mismatch between the demand and supply of GPUs, creating an imminent supply shortage.

To address this issue, some Web3 projects have begun to explore decentralized computing power solutions, leveraging the unique characteristics of Web3 technology. These projects include Akash, Render, Gensyn, among others. The common feature among these projects is the use of tokens to incentivize users to provide idle GPU computing power, thereby becoming the supply side of computing power to support AI clients.

The supply side mainly consists of three groups: cloud service providers, cryptocurrency miners, and enterprises. Cloud service providers include the major clouds (such as AWS, Azure, and GCP) and GPU clouds (such as Coreweave, Lambda, and Crusoe), whose idle computing power users can resell to generate income. With Ethereum’s transition from PoW to PoS, idle GPU computing power from cryptocurrency miners has become another important potential source of supply. Additionally, large enterprises like Tesla and Meta, which have purchased large quantities of GPUs for strategic purposes, can also contribute their idle GPU capacity to the supply side.

Currently, players in this field can generally be divided into two categories: those using decentralized computing power for AI inference and those using it for AI training. The former category includes projects like Render (although focused on rendering, it can also be used for AI computing), Akash, Aethir, while the latter category includes projects like io.net (supporting both inference and training) and Gensyn. The key difference between the two lies in the different requirements for computing power.

Let’s first discuss the projects focusing on AI inference. These projects attract users to provide computing power through token incentives and then offer the resulting computing power network to the demand side, thereby matching idle supply with demand. Details about such projects are covered in Ryze Labs’ DePIN research report; feel free to read it.

The core lies in the token incentive mechanism: the project first attracts suppliers and then users, achieving a cold start and establishing its core operating loop, which enables further expansion and development. In this cycle, the supply side receives increasingly valuable token rewards, while the demand side enjoys more cost-effective services. The token’s value grows in step with the number of supply- and demand-side participants, and as the token price rises, more participants and speculators are attracted, creating a value capture loop.
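As a rough illustration of this supply-and-demand flywheel, the sketch below models a hypothetical marketplace in which GPU suppliers register idle capacity, demand-side jobs are matched to them, and each completed job pays the supplier both the client’s fee and a protocol token reward. All names, numbers, and the matching rule are invented for illustration and do not correspond to any specific project’s mechanism.

```python
from dataclasses import dataclass

@dataclass
class Supplier:
    name: str
    idle_gpus: int
    earnings: float = 0.0        # payment from clients (e.g. in stablecoins)
    token_balance: float = 0.0   # protocol token rewards

@dataclass
class Job:
    client: str
    gpus_needed: int
    price_per_gpu: float

TOKEN_REWARD_PER_GPU = 1.0       # emission that bootstraps the supply side

def match(job: Job, suppliers: list[Supplier]) -> Supplier | None:
    """Naive matching: the first supplier with enough idle GPUs takes the job."""
    for s in suppliers:
        if s.idle_gpus >= job.gpus_needed:
            s.idle_gpus -= job.gpus_needed
            s.earnings += job.gpus_needed * job.price_per_gpu
            s.token_balance += job.gpus_needed * TOKEN_REWARD_PER_GPU
            return s
    return None                  # demand exceeds supply: the shortage case above

suppliers = [Supplier("miner_a", 8), Supplier("cloud_reseller_b", 64)]
winner = match(Job("ai_startup", gpus_needed=16, price_per_gpu=0.5), suppliers)
print(winner)
```

In a real protocol the token emission and matching logic live in smart contracts rather than a Python loop, but the incentive structure is the same: rewards attract supply first, cheaper capacity then attracts demand.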

Another category involves using decentralized computing power for AI training, such as Gensyn and io.net (which support both AI training and inference). In fact, the operational logic of these projects is not fundamentally different from AI inference projects. They still rely on token incentives to attract participation from the supply side to provide computing power, which is then utilized by the demand side.

io.net, a decentralized computing power network, currently boasts over 500,000 GPUs, making it a standout performer among decentralized computing power projects. It has also integrated computing power from Render and Filecoin, demonstrating the continuous development of its ecosystem.

Furthermore, Gensyn facilitates the allocation of machine learning tasks and their rewards through smart contracts to enable AI training. According to Gensyn, the hourly cost of machine learning training work on its network is approximately $0.4, significantly lower than the more than $2 per hour charged on AWS and GCP.

The Gensyn ecosystem involves four participating entities (a minimal sketch of their interaction follows the list):

  • Submitters: These are the demand-side users who submit and pay for AI training tasks.
  • Executors: Executors carry out the tasks of model training and provide proofs of task completion for verification.
  • Verifiers: Verifiers connect the non-deterministic training process with deterministic linear computation. They compare the proofs provided by executors with the expected thresholds.
  • Reporters: Reporters inspect the work of verifiers and raise challenges to earn rewards upon identifying issues.
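To make the division of labor concrete, here is a toy sketch of the submit → execute → verify → challenge flow described above. It is not Gensyn’s protocol code: the training step is simulated and the “proof” is stood in for by a simple hash.

```python
import hashlib

def execute(task: str) -> tuple[str, str]:
    """Executor 'trains' (simulated) and returns (result, proof of work done)."""
    result = f"model_weights_for<{task}>"
    proof = hashlib.sha256(result.encode()).hexdigest()
    return result, proof

def verify(result: str, proof: str) -> bool:
    """Verifier recomputes the expected proof and compares it to the executor's."""
    return hashlib.sha256(result.encode()).hexdigest() == proof

def challenge(result: str, proof: str) -> str:
    """Reporter re-checks the verifier's work and raises a dispute on failure."""
    return "ok" if verify(result, proof) else "challenge raised"

task = "train sentiment classifier"          # posted and paid for by a submitter
result, proof = execute(task)                # executor
assert verify(result, proof)                 # verifier
print(challenge(result, proof))              # reporter
```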

As we can see, Gensyn aims to become a massively scalable and cost-effective computing protocol for global deep-learning models. However, looking at this field, why do most projects choose decentralized computing power for AI inference rather than training?

For readers unfamiliar with AI training and inference, here is the difference between the two:

  • AI Training: If we liken artificial intelligence to a student, then training is similar to providing the AI with a large amount of knowledge and examples (what we usually mean by data). The AI learns from these examples. Because learning involves understanding and memorizing a large amount of information, this process requires a significant amount of computational power and time.
  • AI Inference: Inference can be understood as using the knowledge learned to solve problems or take exams. During inference, the AI uses what it has learned to provide answers rather than acquiring new knowledge, so the computational requirements are comparatively small.

It can be seen that the computational power requirements for both AI inference and AI training differ significantly. The availability of decentralized computing power for AI inference and AI training will be further analyzed in the upcoming challenge section.
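The gap can be made concrete with a common rule of thumb for transformer models: training costs roughly 6 × parameters × training tokens in floating-point operations, while generating one token at inference costs roughly 2 × parameters. The figures below are purely illustrative, not measurements of any particular model.

```python
params = 70e9            # a 70B-parameter model (illustrative)
train_tokens = 2e12      # trained on ~2 trillion tokens (illustrative)
gen_tokens = 1e3         # one inference request generating ~1,000 tokens

train_flops = 6 * params * train_tokens     # ~8.4e23 FLOPs for the full training run
infer_flops = 2 * params * gen_tokens       # ~1.4e14 FLOPs for one request

print(f"training  : {train_flops:.1e} FLOPs")
print(f"inference : {infer_flops:.1e} FLOPs")
print(f"ratio     : {train_flops / infer_flops:.1e}x")   # around a billion-fold gap
```

A single inference request is small enough to run on one distributed node, while the training total has to be spread over thousands of tightly coupled GPUs, which is the crux of the decentralization challenge discussed later.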

Furthermore, Ritual aims to combine distributed networks with model creators to maintain decentralization and security. Its first product, Infernet, enables smart contracts on the blockchain to access AI models off-chain, allowing such contracts to access AI in a way that maintains verification, decentralization, and privacy protection.

The coordinator of Infernet is responsible for managing the behavior of nodes in the network and responding to computational requests from consumers. When users use Infernet, tasks such as inference and proof are performed off-chain, with the output returned to the coordinator and ultimately transmitted to consumers on-chain via smart contracts.

In addition to decentralized computing power networks, there are also decentralized bandwidth networks like Grass, which aim to improve the speed and efficiency of data transmission. Overall, the emergence of decentralized computing power networks provides a new possibility for the supply side of AI computing power, driving AI forward in new directions.

3.1.2 Decentralized Algorithm Models

As mentioned in the second section, the three core elements of AI are computational power, algorithms, and data. Since computational power can form a supply network through decentralization, can algorithms follow a similar approach and form a supply network of algorithm models?

Before analyzing projects in this field, let’s first understand the significance of decentralized algorithm models. Many people may wonder, since we already have OpenAI, why do we need a decentralized algorithm network?

Essentially, a decentralized algorithm network is a decentralized marketplace for AI algorithm services that connects many different AI models. Each AI model has its own expertise and skills. When a user poses a question, the marketplace selects the most suitable AI model to answer it. ChatGPT, developed by OpenAI, is one such AI model, able to understand and generate human-like text.

In simple terms, ChatGPT is like a highly capable student helping to solve different types of problems, while a decentralized algorithm network is like a school with many students helping to solve problems. Although the current student (ChatGPT) is highly capable, in the long run, there is great potential for a school that can recruit students from around the globe.

Currently, in the field of decentralized algorithm models, there are also some projects that are experimenting and exploring. Next, we will use the representative project Bittensor as a case study to help understand the development of this niche field.

In Bittensor, algorithm model providers (miners) contribute their machine learning models to the network. These models can analyze data and provide insights. Model providers receive cryptocurrency tokens, known as TAO, as rewards for their contributions.

To ensure the quality of answers, Bittensor uses a unique consensus mechanism to reach a consensus on the best answer. When a question is posed, multiple model miners provide answers. Then, validators in the network start working to determine the best answer, which is then sent back to the user.

The token TAO in the Bittensor ecosystem plays two main roles throughout the process. On one hand, it serves as an incentive for miners to contribute algorithm models to the network. On the other hand, users need to spend tokens to ask questions and have the network complete tasks.
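A toy version of this loop, miners answer, validators score the answers, the best answer is returned, and the user’s fee flows to the winning miner, might look like the sketch below. The scoring rule and reward amounts are invented for illustration and do not reflect Bittensor’s actual consensus mechanism or TAO emission schedule.

```python
miners = {
    "miner_1": lambda q: "Answer A",
    "miner_2": lambda q: "Answer B (more detailed)",
    "miner_3": lambda q: "Answer C",
}
balances = {m: 0.0 for m in miners}

def validator_score(answer: str) -> float:
    """Stand-in scoring rule: longer answers score higher (purely illustrative)."""
    return float(len(answer))

def ask(question: str, fee: float = 1.0) -> str:
    answers = {m: fn(question) for m, fn in miners.items()}
    scores = {m: validator_score(a) for m, a in answers.items()}
    best = max(scores, key=scores.get)
    balances[best] += fee        # the user's fee rewards the best-scoring miner
    return answers[best]

print(ask("What is restaking?"))
print(balances)
```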

Because Bittensor is decentralized, anyone with internet access can join the network, either as a user asking questions or as a miner providing answers. This allows more people to harness the power of artificial intelligence.

In summary, decentralized algorithm model networks like Bittensor have the potential to create a more open and transparent landscape. In this ecosystem, AI models can be trained, shared, and utilized in a secure and decentralized manner. Additionally, other networks like BasedAI are attempting similar endeavors, with the intriguing aspect of using Zero-Knowledge Proofs (ZK) to protect user-model interactive data privacy, which will be further discussed in the fourth subsection.

As decentralized algorithm model platforms evolve, they will enable small companies to compete with large organizations in using cutting-edge AI tools, potentially having significant impacts across various industries.

3.1.3 Decentralized Data Collection

For the training of AI models, a large supply of data is indispensable. However, most Web2 companies currently still monopolize user data. Platforms like X, Reddit, TikTok, Snapchat, Instagram, and YouTube prohibit data collection for AI training, which poses a significant obstacle to the development of the AI industry.

On the other hand, some Web2 platforms sell user data to AI companies without sharing any profits with the users. For instance, Reddit reached a $60 million agreement with Google, allowing Google to train AI models using its posts. This results in data collection rights being monopolized by major capital and big data companies, pushing the industry towards a capital-intensive direction.

In response to this situation, some projects are leveraging Web3 and token incentives to achieve decentralized data collection. Take PublicAI as an example: users can participate in two roles:

  • One category is AI data providers. Users can find valuable content on X, tag @PublicAI official account with their insights, and use hashtags #AI or #Web3 to categorize the content, thereby sending it to the PublicAI data center for collection.
  • The other category is data validators. Users can log into the PublicAI data center and vote on the most valuable data for AI training.

As a reward, users can earn tokens through these contributions, fostering a win-win relationship between data contributors and the AI industry.
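The two roles can be pictured as a simple pipeline: providers submit tagged content, validators vote on it, and token rewards go to contributors whose data passes the vote. The thresholds and reward amounts below are invented for illustration and are not PublicAI’s actual rules.

```python
submissions = []   # (provider, content) pairs tagged e.g. #AI or #Web3
votes = {}         # submission index -> number of validator up-votes
rewards = {}       # address -> token balance

def submit(provider: str, content: str) -> int:
    submissions.append((provider, content))
    votes[len(submissions) - 1] = 0
    return len(submissions) - 1

def vote(validator: str, idx: int) -> None:
    votes[idx] += 1
    rewards[validator] = rewards.get(validator, 0) + 0.1   # small validator reward

def settle(idx: int, threshold: int = 2, reward: float = 1.0) -> bool:
    """If enough validators up-vote the data, the provider earns the data reward."""
    if votes[idx] >= threshold:
        provider = submissions[idx][0]
        rewards[provider] = rewards.get(provider, 0) + reward
        return True
    return False

i = submit("alice", "Thread on L2 data availability #Web3")
vote("val_1", i); vote("val_2", i)
print(settle(i), rewards)
```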

In addition to projects like PublicAI, which specifically collect data for AI training, there are many other projects using token incentives for decentralized data collection. For example, Ocean collects user data through data tokenization to serve AI, Hivemapper uses users’ car cameras to collect map data, Dimo collects car data, and WiHi collects weather data. These projects, through decentralized data collection, also serve as potential data sources for AI training. Thus, in a broad sense, they can be included in the paradigm of Web3 aiding AI.

3.1.4 ZK Protects User Privacy in AI

Blockchain technology offers decentralization benefits and also introduces a crucial feature: zero-knowledge proofs. Zero-knowledge technology allows for information verification while maintaining privacy.

In traditional machine learning, data typically needs to be stored and processed centrally, which can lead to privacy risks. Methods to protect data privacy, such as encryption or data anonymization, may limit the accuracy and performance of machine learning models.

Zero-knowledge proof technology helps resolve this dilemma by addressing the conflict between privacy protection and data sharing. Zero-Knowledge Machine Learning (ZKML) uses zero-knowledge proof technology to enable machine learning model training and inference without exposing the original data. Zero-knowledge proofs ensure that the features of the data and the results of the model can be verified as correct without revealing the actual data content.

The core goal of ZKML is to balance privacy protection and data sharing. It can be applied in various scenarios such as healthcare data analysis, financial data analysis, and cross-organizational collaboration. By using ZKML, individuals can protect the privacy of their sensitive data while sharing data with others to gain broader insights and collaborative opportunities without the risk of data privacy breaches. This field is still in its early stages, with most projects still under exploration. For example, BasedAI proposes a decentralized approach by seamlessly integrating Fully Homomorphic Encryption (FHE) with Large Language Models (LLMs) to maintain data confidentiality. Zero-Knowledge Large Language Models (ZK-LLMs) embed privacy into their distributed network infrastructure, ensuring that user data remains confidential throughout the network’s operation.

Here’s a brief explanation of Fully Homomorphic Encryption (FHE). FHE is an encryption technique that allows computations to be performed on encrypted data without needing to decrypt it. This means that various mathematical operations (such as addition, multiplication, etc.) performed on FHE-encrypted data yield the same results as if they were performed on the original unencrypted data, thereby protecting user data privacy.
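A real FHE scheme is far beyond a short snippet, but the basic property, computing on ciphertexts and decrypting to the same result you would get on the plaintexts, can be demonstrated with textbook RSA, which is homomorphic for multiplication only (FHE additionally supports addition, which is what makes it “fully” homomorphic). The key sizes below are toy values for illustration; textbook RSA should never be used in practice.

```python
# Toy RSA key (textbook sizes, insecure, for demonstration only)
p, q = 61, 53
n = p * q                            # 3233
e = 17
d = pow(e, -1, (p - 1) * (q - 1))    # private exponent

enc = lambda m: pow(m, e, n)
dec = lambda c: pow(c, d, n)

m1, m2 = 4, 6
c1, c2 = enc(m1), enc(m2)

# Multiply the *ciphertexts* -- without ever decrypting the inputs
c_product = (c1 * c2) % n

assert dec(c_product) == (m1 * m2) % n    # 24: same result as on the plaintexts
print("Enc(4) * Enc(6) decrypts to", dec(c_product))
```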

In addition to the aforementioned methods, Web3 also supports AI through projects like Cortex, which enables on-chain execution of AI programs. Running machine learning programs on traditional blockchains faces a challenge as virtual machines are highly inefficient at running any non-trivial machine learning models. Most believe running AI on the blockchain is impossible. However, the Cortex Virtual Machine (CVM) utilizes GPUs to execute AI programs on-chain and is compatible with the Ethereum Virtual Machine (EVM). In other words, the Cortex chain can execute all Ethereum DApps and integrate AI machine learning into these DApps. This allows machine learning models to run in a decentralized, immutable, and transparent manner, with network consensus verifying each step of AI inference.

3.2 AI Helps Web3

In the collision between AI and Web3, beyond Web3’s assistance to AI, AI’s assistance to the Web3 industry is also worthy of attention. The core contribution of artificial intelligence is improved productivity, so there are many attempts to apply AI to smart contract auditing, data analysis and prediction, personalized services, security and privacy protection, and more.

3.2.1 Data Analysis and Prediction

Many Web3 projects are integrating existing AI services (like ChatGPT) or developing their own to provide data analysis and prediction services for Web3 users. These services cover a broad range, including AI algorithms for investment strategies, on-chain analysis tools, and price and market forecasts.

For example, Pond uses AI graph algorithms to predict valuable future alpha tokens, offering investment advisory services to users and institutions. BullBear AI trains on user historical data, price history, and market trends to provide accurate information supporting price trend predictions, helping users achieve profits.

Platforms like Numerai host investment competitions where participants use AI and large language models to predict stock markets. They train models on high-quality data provided by the platform and submit daily predictions. Numerai evaluates these predictions over the following month, and participants can stake NMR tokens on their models to earn rewards based on performance.

Arkham, a blockchain data analysis platform, also integrates AI into its services. Arkham links blockchain addresses to entities like exchanges, funds, and whales, displaying key data and analyses to give users a decision-making edge. Arkham Ultra matches addresses to real-world entities using algorithms developed over three years with support from Palantir and OpenAI founders.

3.2.2 Personalized Services

AI applications in search and recommendation are prevalent in Web2 projects, serving users’ personalized needs. Web3 projects similarly integrate AI to enhance user experience.

For instance, the well-known data analysis platform Dune recently introduced the Wand tool, which uses large language models to write SQL queries. With Wand Create, users can generate SQL queries from natural language questions, making it easy for those unfamiliar with SQL to search data.
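Under the hood, tools of this kind prompt a large language model with the table schema and the user’s question and ask it to return SQL. The sketch below shows that general pattern using the OpenAI Python client; the model name, schema, and prompt wording are assumptions, and Dune’s Wand may work quite differently internally.

```python
from openai import OpenAI   # pip install openai; requires OPENAI_API_KEY to be set

client = OpenAI()

SCHEMA = """
Table dex_trades(block_time timestamp, token_pair text,
                 amount_usd numeric, trader address)
"""

def question_to_sql(question: str) -> str:
    """Ask the model to translate a natural-language question into SQL."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",   # assumed model name
        messages=[
            {"role": "system",
             "content": f"You write SQL for this schema only:\n{SCHEMA}\n"
                        "Reply with a single SQL query and nothing else."},
            {"role": "user", "content": question},
        ],
    )
    return resp.choices[0].message.content.strip()

print(question_to_sql("What was the total USD volume of dex trades last week?"))
```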

Content platforms like Followin integrate ChatGPT to summarize viewpoints and updates in specific sectors. The Web3 encyclopedia IQ.wiki aims to be the primary source of objective, high-quality knowledge on blockchain technology and cryptocurrency. It integrates GPT-4 to summarize wiki articles, making blockchain information more accessible worldwide. The LLM-based search engine Kaito aims to revolutionize Web3 information retrieval.

In the creative domain, projects like NFPrompt reduce the cost of content creation. NFPrompt allows users to generate NFTs more easily with AI, providing various personalized creative services.

3.2.3 AI Auditing Smart Contracts

Auditing smart contracts is a crucial task in Web3, and AI can enhance efficiency and accuracy in identifying code vulnerabilities.

Vitalik Buterin has noted that one of the biggest challenges in the cryptocurrency space is errors in our code, and that AI holds the promise of significantly simplifying the use of formal verification tools to prove code correctness. Achieving this could lead to a nearly error-free EVM (Ethereum Virtual Machine) and enhance the security of the whole space, since fewer errors mean greater overall safety.

For example, the 0x0.ai project offers an AI-powered smart contract auditor. This tool uses advanced algorithms to analyze smart contracts and identify potential vulnerabilities or issues that could lead to fraud or other security risks. Auditors use machine learning to detect patterns and anomalies in the code, flagging potential problems for further review.
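As a very rough illustration of the pattern-flagging idea (not 0x0.ai’s actual tooling, which reportedly uses machine learning), even simple pattern matching can surface obvious red flags in Solidity source; a learned model generalizes this far beyond fixed rules.

```python
import re

# Each (pattern, warning) pair is a crude stand-in for a learned detector
RISK_PATTERNS = [
    (r"\btx\.origin\b",      "tx.origin used for auth (phishing risk)"),
    (r"\bdelegatecall\s*\(", "delegatecall: storage-hijack risk if target is untrusted"),
    (r"\bselfdestruct\s*\(", "selfdestruct present: contract can be removed"),
    (r"\.call\{value:",      "low-level call with value: check reentrancy guards"),
]

def scan(solidity_source: str) -> list[str]:
    findings = []
    for lineno, line in enumerate(solidity_source.splitlines(), 1):
        for pattern, warning in RISK_PATTERNS:
            if re.search(pattern, line):
                findings.append(f"line {lineno}: {warning}")
    return findings

sample = """
function withdraw() external {
    require(tx.origin == owner);
    (bool ok, ) = msg.sender.call{value: balance[msg.sender]}("");
}
"""
print("\n".join(scan(sample)))
```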

There are other cases where AI natively aids Web3. PAAL helps users create personalized AI bots that can be deployed on Telegram and Discord to serve Web3 users. The AI-driven multi-chain DEX aggregator Hera uses AI to find the best trading paths between any pair of tokens. Overall, AI’s contribution to Web3 is primarily at the tool level, enhancing various processes and functionalities.

Limitations and Current Challenges of AI+Web3 Projects

4.1 Realistic Obstacles in Decentralized Computing Power

Currently, many Web3 projects assisting AI focus on decentralized computing power. Using token incentives to encourage users around the world to join the computing power supply side is a very interesting innovation. On the other hand, there are some practical issues that need to be addressed:

Compared to centralized providers, decentralized computing power products typically rely on nodes and participants distributed around the globe to provide computing resources. Because network connections between these nodes may suffer from latency and instability, performance and stability may be worse than with centralized computing power products.

In addition, the availability of decentralized computing power products is affected by the matching degree between supply and demand. If there are not enough suppliers or if the demand is too high, it may lead to a shortage of resources or an inability to meet user needs.

Finally, compared to centralized computing power products, decentralized computing power products usually involve more technical details and complexity. Users might need to understand and handle aspects of distributed networks, smart contracts, and cryptocurrency payments, which increases the cost of user understanding and usage.

After in-depth discussions with numerous decentralized computing power project teams, it was found that the current decentralized computing power is still mostly limited to AI inference rather than AI training.

Next, I will use four questions to help everyone understand the reasons behind this:

  1. Why do most decentralized computing power projects choose to do AI inference rather than AI training?

  2. What makes NVIDIA so powerful? What are the reasons that decentralized computing power training is difficult?

  3. What will be the endgame for decentralized computing power (Render, Akash, io.net, etc.)?

  4. What will be the endgame for decentralized algorithms (Bittensor)?

Let’s delve into the details step by step:

1) Observing this field, most decentralized computing power projects choose to focus on AI inference rather than training, primarily due to different requirements for computing power and bandwidth.

To help everyone better understand, let’s compare AI to a student:

  • AI Training: If we compare artificial intelligence to a student, training is similar to providing the AI with a large amount of knowledge and examples, akin to what we often refer to as data. The AI learns from these examples. Since learning involves understanding and memorizing vast amounts of information, this process requires substantial computing power and time.

  • AI Inference: Inference can be understood as using the acquired knowledge to solve problems or take exams. During inference, the AI utilizes the learned knowledge to answer questions rather than acquiring new information, hence the computational requirements are relatively lower.

It’s easy to see that the fundamental difference in difficulty lies in the fact that large model AI training requires enormous data volumes and extremely high bandwidth for data transmission, making it very challenging to achieve with decentralized computing power. In contrast, inference requires much less data and bandwidth, making it more feasible.

For large models, stability is crucial: if training is interrupted, it must be restarted, resulting in high sunk costs. On the other hand, demands with relatively lower computing power requirements, such as AI inference or the training of small and medium-sized models in certain specific scenarios, can be met. In decentralized computing power networks, some relatively large node service providers can cater to these relatively higher computing power demands.

2) So, where are the bottlenecks in data and bandwidth? Why is decentralized training hard to achieve?

This involves two key elements of large model training: single-card computing power and multi-card parallelism.

Single-card computing power: A supercomputing center that trains large models can be compared to the human body, where the underlying unit, the GPU, is like a cell. If the computing power of a single cell (GPU) is strong, then the overall computing power (single-cell power × quantity) can also be very strong.

Multi-card parallelism: Training a large model often involves hundreds of billions of parameters and enormous volumes of data. Supercomputing centers that train large models require at least tens of thousands of A100 GPUs, which means mobilizing thousands of cards for a single training run. However, training a large model is not a simple serial process; it doesn’t just run on the first A100 card and then move to the second. Instead, different parts of the model are trained on different GPUs simultaneously, and training part A may require results from part B, so the cards must work in parallel and communicate constantly.
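The bandwidth problem can be quantified with simple arithmetic. In plain data-parallel training, every GPU must exchange a full copy of the gradients each step (a ring all-reduce moves roughly twice the model size per GPU per step). The numbers below are illustrative order-of-magnitude values, but they show why datacenter-class interconnects are needed and why ordinary internet links between globally dispersed nodes cannot keep up.

```python
params = 70e9                    # 70B-parameter model (illustrative)
bytes_per_param = 2              # fp16 gradients
grad_bytes = params * bytes_per_param          # ~140 GB of gradients

allreduce_bytes = 2 * grad_bytes               # ~2x model size moved per GPU per step

def sync_time(bandwidth_bytes_per_s: float) -> float:
    return allreduce_bytes / bandwidth_bytes_per_s

nvlink   = 450e9      # ~450 GB/s-class intra-node interconnect (order of magnitude)
home_net = 12.5e6     # ~100 Mbit/s home connection = 12.5 MB/s

print(f"gradient sync over NVLink-class link : {sync_time(nvlink):.1f} s/step")
print(f"gradient sync over home broadband    : {sync_time(home_net)/3600:.0f} h/step")
```

Less than a second versus several hours per step is the gap that keeps large model training inside tightly connected supercomputing centers.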

NVIDIA’s dominance and soaring market value, and the difficulty AMD and Chinese companies like Huawei and Horizon have in catching up, stem from two aspects: the CUDA software environment and NVLink multi-card communication.

CUDA Software Environment: Whether there is a software ecosystem to match the hardware is crucial, like NVIDIA’s CUDA system. Building a new system is challenging, akin to creating a new language with high replacement costs.

NVLink Multi-card Communication: Multi-card communication is essentially about moving information in and out of each card; how the work is parallelized and how data is transmitted are crucial. Because NVLink is proprietary to NVIDIA, NVIDIA and AMD cards cannot communicate over it; moreover, NVLink limits the physical distance between GPUs, requiring them to sit in the same supercomputing center. This makes it difficult for decentralized computing power spread across the globe to form a cohesive cluster for large model training.

The first point explains why AMD and Chinese companies like Huawei and Horizon struggle to catch up; the second point explains why decentralized training is hard to achieve.

3) What will be the endgame for decentralized computing power? Decentralized computing power currently struggles with large model training because stability is paramount. Interruptions necessitate retraining, resulting in high sunk costs. The high requirements for multi-card parallelism are limited by physical bandwidth constraints. NVIDIA’s NVLink achieves multi-card communication, but within a supercomputing center, NVLink limits the physical distance between GPUs. Thus, dispersed computing power cannot form a computing cluster for large model training.

However, for demands with relatively lower computing power requirements, such as AI inference or certain specific scenarios involving medium to small model training, decentralized computing power networks with some relatively large node service providers have potential. Additionally, scenarios like edge computing for rendering are relatively easier to implement.

4) What will be the endgame for decentralized algorithm models? The future of decentralized algorithm models depends on the ultimate direction of AI. I believe the future of AI might feature 1-2 closed-source model giants (like ChatGPT) alongside a plethora of models. In this context, application-layer products do not need to bind to a single large model but rather cooperate with multiple large models. In this scenario, the model of Bittensor shows significant potential.

4.2 The Integration of AI and Web3 Is Rough, Failing to Achieve 1+1>2

In current projects combining Web3 and AI, particularly those where AI aids Web3 initiatives, most projects merely use AI superficially without demonstrating a deep integration between AI and cryptocurrencies. This superficial application is evident in the following two aspects:

  • First, whether using AI for data analysis and prediction, in recommendation and search scenarios, or for code auditing, there is little difference compared to the integration of AI in Web2 projects. These projects simply leverage AI to enhance efficiency and analysis without showcasing a native fusion of AI and cryptocurrencies or presenting innovative solutions.
  • Second, many Web3 teams incorporate AI more as a marketing gimmick, purely capitalizing on the AI concept. They apply AI technology in very limited areas and then start promoting the trend of AI, creating a facade of close integration with AI. However, these projects lack substantial innovation.

Although current Web3 and AI projects have these limitations, we should recognize that this is only the early stage of development. In the future, we can expect more in-depth research and innovation to achieve a tighter integration between AI and cryptocurrencies, creating more native and meaningful solutions in areas such as finance, decentralized autonomous organizations (DAOs), prediction markets, and NFTs.

4.3 Token Economics Serve as a Buffer for AI Project Narratives

As mentioned initially, AI projects face challenges in their business models, especially as more and more large models are gradually becoming open source. Many AI + Web3 projects, often pure AI projects struggling to thrive and secure funding in the Web2 space, choose to overlay narratives and token economics from Web3 to encourage user participation.

However, the crucial question is whether the integration of token economics truly helps AI projects address real-world needs or if it simply serves as a narrative or short-term value proposition. Currently, most AI + Web3 projects are far from reaching a practical stage. It is hoped that more grounded and thoughtful teams will not only use tokens as a means to hype AI projects but also genuinely fulfill practical use cases.

Summary

Currently, numerous cases and applications have emerged in AI + Web3 projects. Firstly, AI technology can provide more efficient and intelligent use cases for Web3. Through AI’s capabilities in data analysis and prediction, users in Web3 can have better tools for investment decisions and other scenarios. Additionally, AI can audit smart contract code, optimize contract execution, and enhance the performance and efficiency of blockchain. Moreover, AI technology can offer more precise and intelligent recommendations and personalized services for decentralized applications, thus improving user experience.

At the same time, the decentralized and programmable nature of Web3 also presents new opportunities for AI technology. Through token incentives, decentralized computing power projects provide new solutions to the dilemma of insufficient AI computing power. The smart contracts and distributed storage mechanisms of Web3 also offer a broader space and resources for AI algorithm sharing and training. The user autonomy and trust mechanisms of Web3 also bring new possibilities for AI development, allowing users to autonomously choose to participate in data sharing and training, thereby enhancing the diversity and quality of data and further improving the performance and accuracy of AI models.

Although current AI + Web3 cross-over projects are still in their early stages and face many challenges, they also bring many advantages. For example, decentralized computing power products have some drawbacks, but they reduce reliance on centralized institutions, provide greater transparency and auditability, and enable broader participation and innovation. For specific use cases and user needs, decentralized computing power products may be a valuable choice. The same applies to data collection; decentralized data collection projects offer advantages such as reducing reliance on single data sources, providing broader data coverage, and promoting data diversity and inclusivity. In practice, it is necessary to balance these advantages and disadvantages and take appropriate management and technical measures to overcome challenges, ensuring that decentralized data collection projects have a positive impact on AI development.

Overall, the integration of AI + Web3 offers infinite possibilities for future technological innovation and economic development. By combining the intelligent analysis and decision-making capabilities of AI with the decentralized and user-autonomous nature of Web3, it is believed that we can build a smarter, more open, and fairer economic and even social system.

Disclaimer:

  1. This article is reprinted from [Ryze Labs]. All copyrights belong to the original author [Fred]. If there are objections to this reprint, please contact the Gate Learn team, and they will handle it promptly.
  2. Liability Disclaimer: The views and opinions expressed in this article are solely those of the author and do not constitute any investment advice.
  3. Translations of the article into other languages are done by the Gate Learn team. Unless mentioned, copying, distributing, or plagiarizing the translated articles is prohibited.

In-Depth Analysis: What Kind of Sparks Can AI and Web3 Create?

AdvancedJun 08, 2024
This article explores the rapid development of Artificial Intelligence (AI) and Web3 technologies and the potential value and impact of their integration. AI excels in enhancing productivity, while Web3 transforms production relationships through decentralization. The combination of these technologies brings innovative applications in data analysis, personalized user services, and security and privacy protection.
In-Depth Analysis: What Kind of Sparks Can AI and Web3 Create?

Introduction: Development of AI+Web3

In recent years, the rapid development of Artificial Intelligence (AI) and Web3 technologies has garnered widespread global attention. AI, a technology that simulates and mimics human intelligence, has achieved significant breakthroughs in fields such as facial recognition, natural language processing, and machine learning. The swift advancement of AI technology has brought tremendous transformation and innovation across various industries.

The AI industry reached a market size of $200 billion in 2023, with industry giants and prominent players like OpenAI, Character.AI, and Midjourney emerging rapidly and leading the AI boom.

Simultaneously, Web3, an emerging internet model, is gradually changing our perception and usage of the internet. Based on decentralized blockchain technology, Web3 realizes data sharing and control, user autonomy, and the establishment of trust mechanisms through features like smart contracts, distributed storage, and decentralized identity verification. The core idea of Web3 is to liberate data from centralized authorities, granting users control over and the ability to share the value of their data.

Currently, the market value of the Web3 industry has reached $25 trillion. From Bitcoin, Ethereum, and Solana to application-level players like Uniswap and Stepn, new narratives and scenarios are continuously emerging, attracting more and more people to join the Web3 industry.

It is evident that the integration of AI and Web3 is a focal point for builders and venture capitalists from both the East and the West. Exploring how to effectively combine these two technologies is a highly worthwhile endeavor.

This article will focus on the current state of AI+Web3 development, exploring the potential value and impact of their integration. We will first introduce the basic concepts and characteristics of AI and Web3, then discuss their interrelationship. Following this, we will analyze the current state of AI+Web3 projects and delve into the limitations and challenges they face. Through this research, we aim to provide valuable references and insights for investors and industry professionals.

How AI Interacts with Web3

The development of AI and Web3 can be seen as two sides of a scale: AI brings productivity enhancements, while Web3 revolutionizes production relationships. So, what kind of sparks can AI and Web3 create when they collide? We will first analyze the challenges and potential improvements in the AI and Web3 industries, and then explore how they can help solve each other’s problems.

  1. Challenges and Potential Improvements in the AI Industry
  2. Challenges and Potential Improvements in the Web3 Industry

2.1 Challenges in the AI Industry

To explore the challenges faced by the AI industry, we must first understand its essence. The core of the AI industry revolves around three key elements: computational power, algorithms, and data.

  1. First, computational power: Computational power refers to the ability to perform large-scale computations and processing. AI tasks typically require handling large amounts of data and performing complex computations, such as training deep neural network models. High computational power can accelerate model training and inference processes, enhancing the performance and efficiency of AI systems. In recent years, advancements in hardware technology, such as graphics processing units (GPUs) and dedicated AI chips (like TPUs), have significantly boosted computational power, driving the development of the AI industry. Nvidia, a major GPU provider, has seen its stock price soar in recent years, capturing a large market share and earning substantial profits.
  2. What is an algorithm: Algorithms are the core components of AI systems. They are mathematical and statistical methods used to solve problems and perform tasks. AI algorithms can be categorized into traditional machine learning algorithms and deep learning algorithms, with deep learning algorithms having made significant breakthroughs in recent years. The choice and design of algorithms are crucial for the performance and effectiveness of AI systems. Continuous improvement and innovation in algorithms can enhance the accuracy, robustness, and generalization capabilities of AI systems. Different algorithms yield different results, so advancements in algorithms are essential for task performance.
  3. Why data is important: The core task of AI systems is to extract patterns and rules from data through learning and training. Data forms the foundation for training and optimizing models. With large-scale data samples, AI systems can learn more accurate and intelligent models. Rich datasets provide comprehensive and diverse information, enabling models to generalize better to unseen data and helping AI systems better understand and solve real-world problems.

After understanding the three core elements of current AI, let’s examine the difficulties and challenges AI faces in these areas.

First, in terms of computational power, AI tasks usually require a large amount of computational resources for model training and inference, especially for deep learning models. However, obtaining and managing large-scale computational power is an expensive and complex challenge. The cost, energy consumption, and maintenance of high-performance computing equipment are significant issues. This is particularly challenging for startups and individual developers, for whom acquiring sufficient computational power can be difficult.

In terms of algorithms, despite the significant successes of deep learning algorithms in many fields, there are still challenges and difficulties. For instance, training deep neural networks requires a large amount of data and computational resources. Additionally, for certain tasks, the interpretability and explainability of models may be insufficient. The robustness and generalization capabilities of algorithms are also crucial issues, as model performance on unseen data can be unstable. Finding the best algorithm to provide optimal performance among many algorithms is an ongoing exploration.

In terms of data, data is the driving force behind AI, but obtaining high-quality, diverse data remains a challenge. Data in some fields, such as sensitive health data in the medical sector, can be difficult to obtain. Furthermore, the quality, accuracy, and labeling of data are critical issues, as incomplete or biased data can lead to erroneous model behavior or biases. Protecting data privacy and security is also a significant consideration.

Moreover, there are issues related to interpretability and transparency. The “black box” nature of AI models is a public concern. In certain applications, such as finance, healthcare, and justice, the decision-making process of models needs to be interpretable and traceable. However, existing deep learning models often lack transparency. Explaining the decision-making process of models and providing reliable explanations remain challenging.

Additionally, the business models of many AI startup projects are not very clear, which also causes confusion for many AI entrepreneurs.

2.2 Challenges in the Web3 Industry

In the Web3 industry, there are numerous challenges that need to be addressed, spanning from data analysis and user experience to smart contract vulnerabilities and hacker attacks. AI, as a tool to enhance productivity, holds significant potential in these areas.

Firstly, there’s room for improvement in data analysis and predictive capabilities. AI applications in data analysis and prediction have had a significant impact on the Web3 industry. Through intelligent analysis and mining by AI algorithms, Web3 platforms can extract valuable information from vast amounts of data and make more accurate predictions and decisions. This is particularly significant for risk assessment, market forecasting, and asset management in decentralized finance (DeFi).

Additionally, there’s the potential for enhancing user experience and personalization services. AI applications enable Web3 platforms to offer better user experiences and personalized services. By analyzing and modeling user data, Web3 platforms can provide personalized recommendations, customized services, and intelligent interaction experiences. This helps increase user engagement and satisfaction, fostering the development of the Web3 ecosystem. For instance, many Web3 protocols integrate AI tools like ChatGPT to better serve users.

In terms of security and privacy protection, AI applications also have a profound impact on the Web3 industry. AI technology can be used to detect and defend against network attacks, identify abnormal behavior, and provide stronger security measures. Additionally, AI can be applied to data privacy protection, safeguarding users’ personal information on Web3 platforms through techniques like data encryption and privacy computing. Regarding smart contract auditing, as vulnerabilities and security risks may exist in the writing and auditing processes of smart contracts, AI technology can be used for automated contract auditing and vulnerability detection, enhancing the security and reliability of contracts.

It’s evident that AI can contribute significantly to addressing the challenges and potential improvements in the Web3 industry across various aspects.

Analysis of Current Situation of AI+Web3 Project

Combining AI and Web3 projects primarily focuses on two main aspects: leveraging blockchain technology to enhance AI projects and utilizing AI technology to serve the improvement of Web3 projects. Numerous projects have emerged along this path, including Io.net, Gensyn, Ritual, among others. The following analysis will delve into different subdomains where AI aids Web3 and where Web3 enhances AI.

3.1 Web3 Helps AI

3.1.1 Decentralized Computing Power

Since the launch of ChatGPT by OpenAI at the end of 2022, it ignited a frenzy in the AI field. Within five days of its release, the user base reached one million, surpassing the download rate of Instagram, which took approximately two and a half months to reach the same milestone. Subsequently, ChatGPT experienced rapid growth, with monthly active users reaching 100 million within two months and weekly active users reaching 100 million by November 2023. With the advent of ChatGPT, the AI sector rapidly transitioned from a niche field to a highly regarded industry.

According to Trendforce’s report, ChatGPT requires 30,000 NVIDIA A100 GPUs to operate, and future models like GPT-5 will require even more computational power. This has sparked an arms race among various AI companies, as possessing sufficient computational power is crucial for maintaining a competitive edge in the AI arena, leading to a shortage of GPUs.

Prior to the AI boom, the major GPU provider NVIDIA primarily served clients from the three major cloud services: AWS, Azure, and GCP. With the rise of artificial intelligence, numerous new buyers emerged, including major tech companies like Meta and Oracle, as well as other data platforms and AI startups, all joining the race to stockpile GPUs for training AI models. Large tech companies like Meta and Tesla significantly increased GPU purchases for customized AI models and internal research, while foundation model companies like Anthropic and data platforms like Snowflake and Databricks also bought more GPUs to help their clients deliver AI services.

As SemiAnalysis noted last year, there is a divide between “GPU rich” and “GPU poor” companies: only a few possess over 20,000 A100/H100 GPUs, allowing team members to use between 100 and 1,000 GPUs for their projects. These companies are either cloud providers or have built their own large language models (LLMs), including OpenAI, Google, Meta, Anthropic, Inflection, Tesla, Oracle, and Mistral, among others.

However, the majority of companies fall into the “GPU poor” category, making do with far fewer GPUs and spending considerable time and effort on work that does little to advance the ecosystem. Nor is this situation limited to startups. Some of the best-known AI companies, such as Hugging Face, Databricks (MosaicML), Together, and even Snowflake, have fewer than 20,000 A100/H100 GPUs. Despite having world-class technical talent, these companies are constrained by limited GPU supply, placing them at a disadvantage against larger companies in the AI competition.

The shortage is not even limited to the “GPU poor”: at the end of 2023, OpenAI, the leading AI player, had to temporarily close paid registrations because it could not obtain enough GPUs, and had to procure additional GPU supply.

It’s evident that the rapid development of AI has led to a serious mismatch between the demand and supply of GPUs, creating an imminent supply shortage.

To address this issue, some Web3 projects have begun to explore decentralized computing power solutions, leveraging the unique characteristics of Web3 technology. These projects include Akash, Render, Gensyn, among others. The common feature among these projects is the use of tokens to incentivize users to provide idle GPU computing power, thereby becoming the supply side of computing power to support AI clients.

The supply side mainly consists of three groups: cloud service providers, cryptocurrency miners, and enterprises. Cloud service providers include the major clouds (such as AWS, Azure, and GCP) and GPU clouds (such as Coreweave, Lambda, and Crusoe), whose idle computing power users can resell to generate income. With Ethereum’s transition from PoW to PoS, idle GPU computing power from cryptocurrency miners has become an important potential supply source. Additionally, large enterprises like Tesla and Meta, which have purchased large quantities of GPUs for strategic purposes, can also contribute their idle GPU computing power to the supply side.

Currently, players in this field can generally be divided into two categories: those using decentralized computing power for AI inference and those using it for AI training. The former category includes projects like Render (although focused on rendering, it can also be used for AI computing), Akash, Aethir, while the latter category includes projects like io.net (supporting both inference and training) and Gensyn. The key difference between the two lies in the different requirements for computing power.

Let’s first discuss the projects focusing on AI inference. These projects attract users to provide computing power through token incentives and then offer the resulting computing network to the demand side, thereby matching idle computing power supply with demand. Details about such projects are covered in Ryze Labs’ DePIN research report, which is worth reading.

The core lies in the token incentive mechanism: the project first attracts suppliers and then users, achieving cold start and establishing its core operating mechanism, which enables further expansion and development. In this cycle, the supply side earns more valuable token rewards while the demand side enjoys more cost-effective services. The token’s value and the growth of both supply- and demand-side participants reinforce each other: as the token price rises, more participants and speculators are attracted, creating a value capture loop.

Another category involves using decentralized computing power for AI training, such as Gensyn and io.net (which support both AI training and inference). In fact, the operational logic of these projects is not fundamentally different from AI inference projects. They still rely on token incentives to attract participation from the supply side to provide computing power, which is then utilized by the demand side.

io.net, as a decentralized computing power network, currently boasts over 500,000 GPUs, a standout figure among decentralized computing power projects. It has also integrated computing power from Render and Filecoin, demonstrating the continuous development of its ecosystem.

Furthermore, Gensyn uses smart contracts to allocate machine learning tasks and distribute rewards, enabling decentralized AI training. According to Gensyn’s cost comparison, the hourly cost of machine learning training on Gensyn is approximately $0.4, significantly lower than the $2+ cost on AWS and GCP.

The Gensyn ecosystem involves four participating entities:

  • Submitters: These are the demand-side users who submit and pay for AI training tasks.
  • Executors: Executors carry out the tasks of model training and provide proofs of task completion for verification.
  • Verifiers: Verifiers connect the non-deterministic training process with deterministic linear computation. They compare the proofs provided by executors with the expected thresholds.
  • Reporters: Reporters inspect the work of verifiers and raise challenges to earn rewards upon identifying issues.
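
To make this division of labor concrete, below is a minimal Python sketch of how the four roles might interact. The class and function names (Task, Proof, submit_task, and so on) are illustrative assumptions, not Gensyn’s actual data structures or verification logic.

```python
from dataclasses import dataclass

# Illustrative only: a toy model of the four Gensyn roles described above,
# not the protocol's real data structures or verification logic.

@dataclass
class Task:
    task_id: int
    payment: float             # paid by the submitter
    expected_threshold: float  # target the verifier checks proofs against

@dataclass
class Proof:
    task_id: int
    claimed_loss: float        # executor's claimed training result

def submit_task(task_id: int, payment: float, threshold: float) -> Task:
    """Submitter: posts an AI training task and pays for it."""
    return Task(task_id, payment, threshold)

def execute_task(task: Task) -> Proof:
    """Executor: trains the model off-chain and returns a proof of completed work."""
    claimed_loss = 0.05        # placeholder for the (non-deterministic) training outcome
    return Proof(task.task_id, claimed_loss)

def verify_proof(task: Task, proof: Proof) -> bool:
    """Verifier: compares the executor's proof against the expected threshold."""
    return proof.claimed_loss <= task.expected_threshold

def report(task: Task, proof: Proof, verifier_decision: bool) -> str:
    """Reporter: re-checks the verifier's work and raises a challenge on a mismatch."""
    recheck = proof.claimed_loss <= task.expected_threshold
    return "accepted" if recheck == verifier_decision else "challenged"

task = submit_task(task_id=1, payment=0.4, threshold=0.1)
proof = execute_task(task)
decision = verify_proof(task, proof)
print(report(task, proof, decision))   # -> "accepted"
```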

As we can see, Gensyn aims to become a massively scalable and cost-effective computing protocol for global deep-learning models. However, looking at this field, why do most projects choose decentralized computing power for AI inference rather than training?

For readers who are not yet familiar with AI training and inference, here is a brief explanation of the difference between the two:

  • AI Training: If we liken artificial intelligence to a student, then training is similar to providing the AI with a large amount of knowledge and examples, which can be understood as data. The AI learns from these examples. Because learning involves understanding and memorizing a large amount of information, this process requires a significant amount of computational power and time.
  • AI Inference: Inference can be understood as using the learned knowledge to solve problems or take an exam. During inference, artificial intelligence applies what it has learned to provide answers rather than acquiring new knowledge, so the computational requirements are relatively small.

It can be seen that the computational power requirements for both AI inference and AI training differ significantly. The availability of decentralized computing power for AI inference and AI training will be further analyzed in the upcoming challenge section.

Furthermore, Ritual aims to combine distributed networks with model creators to maintain decentralization and security. Its first product, Infernet, enables smart contracts on the blockchain to access AI models off-chain, allowing such contracts to access AI in a way that maintains verification, decentralization, and privacy protection.

The coordinator of Infernet is responsible for managing the behavior of nodes in the network and responding to computational requests from consumers. When users use Infernet, tasks such as inference and proof are performed off-chain, with the output returned to the coordinator and ultimately transmitted to consumers on-chain via smart contracts.
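
As a rough sketch of this request flow, the pattern of off-chain computation with on-chain delivery might look like the following. The Coordinator class and the run_inference and deliver_onchain names are hypothetical and do not reflect Infernet’s real API.

```python
# Toy illustration of the flow described above: a consumer contract requests AI
# output, the computation happens off-chain, and the result is returned on-chain.
# The Coordinator class, run_inference, and deliver_onchain names are hypothetical.

def run_inference(prompt: str) -> str:
    """Off-chain node: runs the AI model (placeholder implementation)."""
    return f"model output for: {prompt}"

class Coordinator:
    """Routes consumer requests to off-chain nodes and returns results on-chain."""

    def __init__(self) -> None:
        self.pending: list[tuple[str, str]] = []   # (consumer address, prompt)

    def request(self, consumer: str, prompt: str) -> None:
        """Queue a computation request coming from a consumer contract."""
        self.pending.append((consumer, prompt))

    def process(self) -> None:
        """Run queued inference off-chain and deliver each output back on-chain."""
        for consumer, prompt in self.pending:
            output = run_inference(prompt)           # inference and proof happen off-chain
            self.deliver_onchain(consumer, output)   # result sent back via a contract call
        self.pending.clear()

    def deliver_onchain(self, consumer: str, output: str) -> None:
        """Stand-in for the on-chain callback to the consumer's smart contract."""
        print(f"callback to {consumer}: {output}")

coordinator = Coordinator()
coordinator.request("0xConsumerContract", "classify this transaction")
coordinator.process()
```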

In addition to decentralized computing power networks, there are also decentralized bandwidth networks like Grass, which aim to improve the speed and efficiency of data transmission. Overall, the emergence of decentralized computing power networks provides a new possibility for the supply side of AI computing power, driving AI forward in new directions.

3.1.2 Decentralized Algorithm Models

As mentioned in the second chapter, the three core elements of AI are computational power, algorithms, and data. Since computational power can form a supply network through decentralization, can algorithms follow a similar approach and form a supply network for algorithm models?

Before analyzing projects in this field, let’s first understand the significance of decentralized algorithm models. Many people may wonder, since we already have OpenAI, why do we need a decentralized algorithm network?

Essentially, a decentralized algorithm network is a decentralized marketplace for AI algorithm services that connects many different AI models. Each AI model has its own expertise and skills. When users pose questions, the marketplace selects the most suitable AI model to answer them. ChatGPT, developed by OpenAI, is one such AI model, able to understand and generate human-like text.

In simple terms, ChatGPT is like a highly capable student helping to solve different types of problems, while a decentralized algorithm network is like a school with many students helping to solve problems. Although the current student (ChatGPT) is highly capable, in the long run, there is great potential for a school that can recruit students from around the globe.

Currently, in the field of decentralized algorithm models, there are also some projects that are experimenting and exploring. Next, we will use the representative project Bittensor as a case study to help understand the development of this niche field.

In Bittensor, the suppliers of algorithm models (the miners) contribute their machine learning models to the network. These models can analyze data and provide insights, and model providers receive the network’s cryptocurrency token, TAO, as a reward for their contributions.

To ensure the quality of answers, Bittensor uses a unique consensus mechanism to reach a consensus on the best answer. When a question is posed, multiple model miners provide answers. Then, validators in the network start working to determine the best answer, which is then sent back to the user.

The token TAO in the Bittensor ecosystem plays two main roles throughout the process. On one hand, it serves as an incentive for miners to contribute algorithm models to the network. On the other hand, users need to spend tokens to ask questions and have the network complete tasks.
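
The following is a heavily simplified sketch of this question-answer-reward loop. The scoring rule and the proportional reward split are illustrative assumptions only and do not represent Bittensor’s actual consensus mechanism.

```python
# Heavily simplified sketch of the loop described above: several miners answer a
# question, validators score the answers, and TAO-style rewards are split by score.
# The scoring rule and the proportional reward split are illustrative assumptions.

def validator_score(answer: str) -> float:
    """Placeholder scoring; real validators use far richer quality criteria."""
    return float(len(answer))   # toy heuristic: longer answers score higher

def answer_question(question: str, miners: dict, reward_pool: float):
    answers = {name: model(question) for name, model in miners.items()}
    scores = {name: validator_score(ans) for name, ans in answers.items()}
    best = max(scores, key=scores.get)
    total = sum(scores.values())
    rewards = {name: reward_pool * s / total for name, s in scores.items()}
    return answers[best], rewards

miners = {
    "miner_a": lambda q: "a short answer",
    "miner_b": lambda q: "a longer, more detailed answer",
}
best_answer, rewards = answer_question("What is Web3?", miners, reward_pool=1.0)
print(best_answer)
print(rewards)
```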

Because Bittensor is decentralized, anyone with internet access can join the network, either as a user asking questions or as a miner providing answers. This allows more people to harness the power of artificial intelligence.

In summary, decentralized algorithm model networks like Bittensor have the potential to create a more open and transparent landscape. In this ecosystem, AI models can be trained, shared, and utilized in a secure and decentralized manner. Additionally, other networks like BasedAI are attempting similar endeavors, with the intriguing aspect of using Zero-Knowledge Proofs (ZK) to protect user-model interactive data privacy, which will be further discussed in the fourth subsection.

As decentralized algorithm model platforms evolve, they will enable small companies to compete with large organizations in using cutting-edge AI tools, potentially having significant impacts across various industries.

3.1.3 Decentralized Data Collection

For the training of AI models, a large supply of data is indispensable. However, most Web2 companies currently still monopolize user data. Platforms like X, Reddit, TikTok, Snapchat, Instagram, and YouTube prohibit data collection for AI training, which poses a significant obstacle to the development of the AI industry.

On the other hand, some Web2 platforms sell user data to AI companies without sharing any profits with the users. For instance, Reddit reached a $60 million agreement with Google, allowing Google to train AI models using its posts. This results in data collection rights being monopolized by major capital and big data companies, pushing the industry towards a capital-intensive direction.

In response to this situation, some projects are leveraging Web3 and token incentives to achieve decentralized data collection. Take PublicAI as an example: users can participate in two roles:

  • One category is AI data providers. Users can find valuable content on X, tag @PublicAI official account with their insights, and use hashtags #AI or #Web3 to categorize the content, thereby sending it to the PublicAI data center for collection.
  • The other category is data validators. Users can log into the PublicAI data center and vote on the most valuable data for AI training.

As a reward, users can earn tokens through these contributions, fostering a win-win relationship between data contributors and the AI industry.

In addition to projects like PublicAI, which specifically collect data for AI training, there are many other projects using token incentives for decentralized data collection. For example, Ocean collects user data through data tokenization to serve AI, Hivemapper uses users’ car cameras to collect map data, Dimo collects car data, and WiHi collects weather data. These projects, through decentralized data collection, also serve as potential data sources for AI training. Thus, in a broad sense, they can be included in the paradigm of Web3 aiding AI.

3.1.4 ZK Protects User Privacy in AI

Blockchain technology offers decentralization benefits and also introduces a crucial feature: zero-knowledge proofs. Zero-knowledge technology allows for information verification while maintaining privacy.

In traditional machine learning, data typically needs to be stored and processed centrally, which can lead to privacy risks. Methods to protect data privacy, such as encryption or data anonymization, may limit the accuracy and performance of machine learning models.

Zero-knowledge proof technology helps resolve this dilemma by addressing the conflict between privacy protection and data sharing. Zero-Knowledge Machine Learning (ZKML) uses zero-knowledge proof technology to enable machine learning model training and inference without exposing the original data. Zero-knowledge proofs ensure that the features of the data and the results of the model can be verified as correct without revealing the actual data content.

The core goal of ZKML is to balance privacy protection and data sharing. It can be applied in various scenarios such as healthcare data analysis, financial data analysis, and cross-organizational collaboration. By using ZKML, individuals can protect the privacy of their sensitive data while sharing data with others to gain broader insights and collaborative opportunities without the risk of data privacy breaches. This field is still in its early stages, with most projects still under exploration. For example, BasedAI proposes a decentralized approach by seamlessly integrating Fully Homomorphic Encryption (FHE) with Large Language Models (LLMs) to maintain data confidentiality. Zero-Knowledge Large Language Models (ZK-LLMs) embed privacy into their distributed network infrastructure, ensuring that user data remains confidential throughout the network’s operation.
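
Stated slightly more formally, in simplified notation where $f_W$ denotes the model with weights $W$, $x$ the private input, $y$ the public output, $c$ a public commitment to $x$, and $\pi$ the proof, the statement a ZKML proof attests to is roughly:

```latex
% Simplified ZKML statement: the proof pi convinces a verifier that the public
% output y was computed correctly from a committed private input x, without
% revealing x itself.
\pi \;\text{proves}\;\; \exists\, x :\; y = f_W(x) \;\wedge\; \mathsf{Com}(x) = c
```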

Here’s a brief explanation of Fully Homomorphic Encryption (FHE). FHE is an encryption technique that allows computations to be performed on encrypted data without needing to decrypt it. This means that various mathematical operations (such as addition, multiplication, etc.) performed on FHE-encrypted data yield the same results as if they were performed on the original unencrypted data, thereby protecting user data privacy.
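
In symbols, a minimal statement of this property for addition and multiplication is:

```latex
% Homomorphic property: operations on ciphertexts mirror operations on plaintexts.
\mathrm{Dec}\big(\mathrm{Enc}(a) \oplus \mathrm{Enc}(b)\big) = a + b,
\qquad
\mathrm{Dec}\big(\mathrm{Enc}(a) \otimes \mathrm{Enc}(b)\big) = a \times b
```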

In addition to the aforementioned methods, Web3 also supports AI through projects like Cortex, which enables on-chain execution of AI programs. Running machine learning programs on traditional blockchains faces a challenge as virtual machines are highly inefficient at running any non-trivial machine learning models. Most believe running AI on the blockchain is impossible. However, the Cortex Virtual Machine (CVM) utilizes GPUs to execute AI programs on-chain and is compatible with the Ethereum Virtual Machine (EVM). In other words, the Cortex chain can execute all Ethereum DApps and integrate AI machine learning into these DApps. This allows machine learning models to run in a decentralized, immutable, and transparent manner, with network consensus verifying each step of AI inference.

3.2 AI Helps Web3

In the collision between AI and Web3, beyond Web3’s assistance to AI, AI’s assistance to the Web3 industry also deserves attention. Artificial intelligence’s core contribution is improved productivity, so there are many attempts at AI-assisted smart contract auditing, data analysis and prediction, personalized services, security and privacy protection, and more.

3.2.1 Data Analysis and Prediction

Many Web3 projects are integrating existing AI services (like ChatGPT) or developing their own to provide data analysis and prediction services for Web3 users. These services cover a broad range, including AI algorithms for investment strategies, on-chain analysis tools, and price and market forecasts.

For example, Pond uses AI graph algorithms to predict valuable future alpha tokens, offering investment advisory services to users and institutions. BullBear AI trains on user historical data, price history, and market trends to provide accurate information supporting price trend predictions, helping users achieve profits.

Platforms like Numerai host investment competitions where participants use AI and large language models to predict stock markets. They train models on high-quality data provided by the platform and submit daily predictions. Numerai evaluates these predictions over the following month, and participants can stake NMR tokens on their models to earn rewards based on performance.

Arkham, a blockchain data analysis platform, also integrates AI into its services. Arkham links blockchain addresses to entities like exchanges, funds, and whales, displaying key data and analyses to give users a decision-making edge. Arkham Ultra matches addresses to real-world entities using algorithms developed over three years with support from Palantir and OpenAI founders.

3.2.2 Personalized Services

AI applications in search and recommendation are prevalent in Web2 projects, serving users’ personalized needs. Web3 projects similarly integrate AI to enhance user experience.

For instance, the well-known data analysis platform Dune recently introduced the Wand tool, which uses large language models to write SQL queries. With Wand Create, users can generate SQL queries from natural language questions, making it easy for those unfamiliar with SQL to search data.
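
To illustrate the general natural-language-to-SQL pattern (and not Dune Wand’s actual implementation), such a helper typically wraps the user’s question and the table schema in a prompt and asks an LLM to emit SQL. The table name, prompt wording, and call_llm placeholder below are hypothetical.

```python
# Illustrative natural-language-to-SQL pattern, not Dune Wand's real API.
# The schema, prompt wording, and call_llm placeholder are hypothetical.

SCHEMA = """
table dex_trades(block_time timestamp, token_symbol text, amount_usd double)
"""

def build_prompt(question: str) -> str:
    """Wrap the user's question and the table schema into an instruction for the LLM."""
    return (
        "You are a SQL assistant. Given this schema:\n"
        f"{SCHEMA}\n"
        f"Write a SQL query answering: {question}\n"
        "Return only SQL."
    )

def call_llm(prompt: str) -> str:
    """Placeholder for a real LLM call (e.g., an API request)."""
    return ("SELECT token_symbol, SUM(amount_usd) AS volume\n"
            "FROM dex_trades\n"
            "WHERE block_time > now() - interval '7' day\n"
            "GROUP BY token_symbol ORDER BY volume DESC LIMIT 10;")

print(call_llm(build_prompt("What were the top 10 tokens by DEX volume last week?")))
```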

Content platforms like Followin integrate ChatGPT to summarize viewpoints and updates in specific sectors. The Web3 encyclopedia IQ.wiki aims to be the primary source of objective, high-quality knowledge on blockchain technology and cryptocurrency. It integrates GPT-4 to summarize wiki articles, making blockchain information more accessible worldwide. The LLM-based search engine Kaito aims to revolutionize Web3 information retrieval.

In the creative domain, projects like NFPrompt reduce the cost of content creation. NFPrompt allows users to generate NFTs more easily with AI, providing various personalized creative services.

3.2.3 AI Auditing Smart Contracts

Auditing smart contracts is a crucial task in Web3, and AI can enhance efficiency and accuracy in identifying code vulnerabilities.

Vitalik Buterin has noted that one of the biggest challenges in the cryptocurrency space is bugs in our code. AI holds the promise of significantly simplifying the use of formal verification tools to prove code correctness. Achieving this could lead to a nearly bug-free EVM (Ethereum Virtual Machine), and fewer errors mean greater overall security for the space.

For example, the 0x0.ai project offers an AI-powered smart contract auditor. This tool uses advanced algorithms to analyze smart contracts and identify potential vulnerabilities or issues that could lead to fraud or other security risks. Auditors use machine learning to detect patterns and anomalies in the code, flagging potential problems for further review.
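
As a toy illustration of the general idea of pattern-based flagging for human review (and not a description of 0x0.ai’s actual system), a naive scanner might look like this:

```python
import re

# Toy pattern-based scanner: flags a few well-known risky Solidity patterns
# for human review. Real AI auditors combine learned models with far more
# sophisticated static and dynamic analysis; this is illustration only.

RISKY_PATTERNS = {
    r"tx\.origin": "tx.origin used for authorization (phishing risk)",
    r"\.delegatecall\(": "delegatecall to external address (storage hijack risk)",
    r"block\.timestamp": "timestamp dependence (miner manipulation risk)",
    r"selfdestruct\(": "selfdestruct present (contract can be destroyed)",
}

def scan_contract(source: str) -> list[str]:
    """Return human-readable warnings for each risky pattern found."""
    findings = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        for pattern, warning in RISKY_PATTERNS.items():
            if re.search(pattern, line):
                findings.append(f"line {line_no}: {warning}")
    return findings

sample = """
contract Wallet {
    function withdraw() public {
        require(tx.origin == owner);
        selfdestruct(payable(msg.sender));
    }
}
"""
print("\n".join(scan_contract(sample)))
```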

There are other native cases where AI aids Web3. PAAL helps users create personalized AI bots that can be deployed on Telegram and Discord to serve Web3 users. The AI-driven multi-chain DEX aggregator Hera uses AI to find the best trading paths between any pair of tokens. Overall, AI’s contribution to Web3 is primarily at the tool level, enhancing various processes and functionalities.

Limitations and Current Challenges of AI+Web3 Projects

4.1 Realistic Obstacles in Decentralized Computing Power

Currently, many Web3 projects assisting AI focus on decentralized computing power. Using token incentives to encourage users around the world to become part of the computing power supply side is a genuinely interesting innovation. On the other hand, some practical issues still need to be addressed:

Compared to centralized providers, decentralized computing power products typically rely on nodes and participants distributed around the globe to provide computing resources. Because the network connections between these nodes can suffer from latency and instability, performance and stability may be worse than with centralized computing power products.

In addition, the availability of decentralized computing power products is affected by the matching degree between supply and demand. If there are not enough suppliers or if the demand is too high, it may lead to a shortage of resources or an inability to meet user needs.

Finally, compared to centralized computing power products, decentralized computing power products usually involve more technical details and complexity. Users might need to understand and handle aspects of distributed networks, smart contracts, and cryptocurrency payments, which increases the cost of user understanding and usage.

After in-depth discussions with numerous decentralized computing power project teams, it was found that the current decentralized computing power is still mostly limited to AI inference rather than AI training.

Next, I will use four questions to help everyone understand the reasons behind this:

  1. Why do most decentralized computing power projects choose to do AI inference rather than AI training?

  2. What makes NVIDIA so powerful? What are the reasons that decentralized computing power training is difficult?

  3. What will be the endgame for decentralized computing power (Render, Akash, io.net, etc.)?

  4. What will be the endgame for decentralized algorithms (Bittensor)?

Let’s delve into the details step by step:

1) Observing this field, most decentralized computing power projects choose to focus on AI inference rather than training, primarily due to different requirements for computing power and bandwidth.

To help everyone better understand, let’s compare AI to a student:

  • AI Training: If we compare artificial intelligence to a student, training is similar to providing the AI with a large amount of knowledge and examples, akin to what we often refer to as data. The AI learns from these examples. Since learning involves understanding and memorizing vast amounts of information, this process requires substantial computing power and time.

  • AI Inference: Inference can be understood as using the acquired knowledge to solve problems or take exams. During inference, the AI utilizes the learned knowledge to answer questions rather than acquiring new information, hence the computational requirements are relatively lower.

It’s easy to see that the fundamental difference in difficulty lies in the fact that large model AI training requires enormous data volumes and extremely high bandwidth for data transmission, making it very challenging to achieve with decentralized computing power. In contrast, inference requires much less data and bandwidth, making it more feasible.
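
A common back-of-the-envelope estimate makes the gap concrete: training cost is often approximated as roughly 6 x parameters x training tokens in FLOPs, while a single forward pass at inference costs roughly 2 x parameters FLOPs per generated token. The sketch below applies these rough rules of thumb; the model size and token counts are illustrative assumptions.

```python
# Rough rule-of-thumb comparison of training vs. inference compute.
# Approximations: training ~ 6 * N * D FLOPs, inference ~ 2 * N FLOPs per token.
# The parameter and token counts below are illustrative assumptions.

params = 70e9          # 70B-parameter model (assumed)
train_tokens = 2e12    # 2 trillion training tokens (assumed)
gen_tokens = 1000      # tokens generated by one inference request (assumed)

train_flops = 6 * params * train_tokens   # one full training run
infer_flops = 2 * params * gen_tokens     # one inference request

print(f"training:  {train_flops:.1e} FLOPs")
print(f"inference: {infer_flops:.1e} FLOPs")
print(f"ratio:     {train_flops / infer_flops:.1e}x")
```

Under these assumptions, one full training run costs on the order of a billion times more compute than a single inference request, before even counting the bandwidth needed to keep thousands of GPUs synchronized.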

For large models, stability is crucial. If training is interrupted, it must restart, resulting in high sunk costs. On the other hand, demands with relatively lower computing power requirements, such as AI inference or certain specific scenarios involving medium to small model training, can be achieved. In decentralized computing power networks, some relatively large node service providers can cater to these relatively higher computing power demands.

2) So, where are the bottlenecks in data and bandwidth? Why is decentralized training hard to achieve?

This involves two key elements of large model training: single-card computing power and multi-card parallelism.

Single-card computing power: A supercomputing center used for large model training can be compared to the human body, with each underlying GPU as a cell. If the computing power of a single cell (GPU) is strong, then the overall computing power (single-card power × quantity) can also be very strong.

Multi-card parallelism: Training a large model often involves hundreds of billions of parameters, and supercomputing centers training such models need at least tens of thousands of A100-class GPUs, which means mobilizing vast numbers of cards at once. However, training a large model is not a simple serial process; it doesn’t just train on the first A100 card and then move to the second. Instead, different parts of the model are trained on different GPUs simultaneously, and training part A may require results from part B, making parallel processing essential.

NVIDIA’s dominance and soaring market value, and the difficulty that AMD and Chinese companies like Huawei and Horizon have in catching up, stem from two aspects: the CUDA software environment and NVLink multi-card communication.

CUDA Software Environment: A software ecosystem that matches the hardware is crucial, and NVIDIA’s CUDA system is exactly that. Building a new ecosystem from scratch is as hard as creating a new language, and the replacement cost is extremely high.

NVLink Multi-card Communication: Essentially, multi-card communication is about moving information in and out; how to parallelize and transmit it is crucial. Because NVLink is proprietary, NVIDIA and AMD cards cannot communicate over it; in addition, NVLink limits the physical distance between GPUs, requiring them to sit in the same supercomputing center. This makes it difficult for decentralized computing power spread across the globe to form a cohesive cluster for large model training.
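
To see why this bandwidth constraint matters, a rough calculation (all figures are order-of-magnitude assumptions) compares moving one set of fp16 gradients for a 70B-parameter model over NVLink-class interconnect versus a typical 1 Gbps home connection:

```python
# Rough comparison of the time to move one set of fp16 gradients for a
# 70B-parameter model. All figures are order-of-magnitude assumptions.

params = 70e9
bytes_per_param = 2                             # fp16
payload_gb = params * bytes_per_param / 1e9     # ~140 GB per synchronization

nvlink_gb_s = 900      # NVLink-class interconnect, on the order of 900 GB/s
home_gb_s = 0.125      # 1 Gbps home connection ~ 0.125 GB/s

print(f"payload per sync: ~{payload_gb:.0f} GB")
print(f"over NVLink:      ~{payload_gb / nvlink_gb_s:.2f} s")
print(f"over 1 Gbps link: ~{payload_gb / home_gb_s / 60:.0f} minutes")
```

Even under these crude assumptions, a single gradient exchange that takes a fraction of a second inside a data center would take roughly twenty minutes over consumer broadband, and a large training run requires many thousands of such exchanges.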

The first point explains why AMD and domestic companies like Huawei and Horizon struggle to catch up; the second point explains why decentralized training is hard to achieve.

3) What will be the endgame for decentralized computing power? Decentralized computing power currently struggles with large model training because stability is paramount. Interruptions necessitate retraining, resulting in high sunk costs. The high requirements for multi-card parallelism are limited by physical bandwidth constraints. NVIDIA’s NVLink achieves multi-card communication, but within a supercomputing center, NVLink limits the physical distance between GPUs. Thus, dispersed computing power cannot form a computing cluster for large model training.

However, for demands with relatively lower computing power requirements, such as AI inference or certain specific scenarios involving medium to small model training, decentralized computing power networks with some relatively large node service providers have potential. Additionally, scenarios like edge computing for rendering are relatively easier to implement.

4) What will be the endgame for decentralized algorithm models? The future of decentralized algorithm models depends on the ultimate direction of AI. I believe the future of AI might feature 1-2 closed-source model giants (like ChatGPT) alongside a plethora of models. In this context, application-layer products do not need to bind to a single large model but rather cooperate with multiple large models. In this scenario, the model of Bittensor shows significant potential.

4.2 The Integration of AI and Web3 Is Rough, Failing to Achieve 1+1>2

In current projects combining Web3 and AI, particularly those where AI aids Web3 initiatives, most projects merely use AI superficially without demonstrating a deep integration between AI and cryptocurrencies. This superficial application is evident in the following two aspects:

  • First, whether using AI for data analysis and prediction, in recommendation and search scenarios, or for code auditing, there is little difference compared to the integration of AI in Web2 projects. These projects simply leverage AI to enhance efficiency and analysis without showcasing a native fusion of AI and cryptocurrencies or presenting innovative solutions.
  • Second, many Web3 teams incorporate AI more as a marketing gimmick, purely capitalizing on the AI concept. They apply AI technology in very limited areas and then start promoting the trend of AI, creating a facade of close integration with AI. However, these projects lack substantial innovation.

Although current Web3 and AI projects have these limitations, we should recognize that this is only the early stage of development. In the future, we can expect more in-depth research and innovation to achieve a tighter integration between AI and cryptocurrencies, creating more native and meaningful solutions in areas such as finance, decentralized autonomous organizations (DAOs), prediction markets, and NFTs.

4.3 Token Economics Serve as a Buffer for AI Project Narratives

As mentioned initially, AI projects face challenges in their business models, especially as more and more large models are gradually becoming open source. Many AI + Web3 projects, often pure AI projects struggling to thrive and secure funding in the Web2 space, choose to overlay narratives and token economics from Web3 to encourage user participation.

However, the crucial question is whether the integration of token economics truly helps AI projects address real-world needs or if it simply serves as a narrative or short-term value proposition. Currently, most AI + Web3 projects are far from reaching a practical stage. It is hoped that more grounded and thoughtful teams will not only use tokens as a means to hype AI projects but also genuinely fulfill practical use cases.

Summary

Currently, numerous cases and applications have emerged in AI + Web3 projects. Firstly, AI technology can provide more efficient and intelligent use cases for Web3. Through AI’s capabilities in data analysis and prediction, users in Web3 can have better tools for investment decisions and other scenarios. Additionally, AI can audit smart contract code, optimize contract execution, and enhance the performance and efficiency of blockchain. Moreover, AI technology can offer more precise and intelligent recommendations and personalized services for decentralized applications, thus improving user experience.

At the same time, the decentralized and programmable nature of Web3 also presents new opportunities for AI technology. Through token incentives, decentralized computing power projects provide new solutions to the dilemma of insufficient AI computing power. The smart contracts and distributed storage mechanisms of Web3 also offer a broader space and resources for AI algorithm sharing and training. The user autonomy and trust mechanisms of Web3 also bring new possibilities for AI development, allowing users to autonomously choose to participate in data sharing and training, thereby enhancing the diversity and quality of data and further improving the performance and accuracy of AI models.

Although current AI + Web3 cross-over projects are still in their early stages and face many challenges, they also bring many advantages. For example, decentralized computing power products have some drawbacks, but they reduce reliance on centralized institutions, provide greater transparency and auditability, and enable broader participation and innovation. For specific use cases and user needs, decentralized computing power products may be a valuable choice. The same applies to data collection; decentralized data collection projects offer advantages such as reducing reliance on single data sources, providing broader data coverage, and promoting data diversity and inclusivity. In practice, it is necessary to balance these advantages and disadvantages and take appropriate management and technical measures to overcome challenges, ensuring that decentralized data collection projects have a positive impact on AI development.

Overall, the integration of AI + Web3 offers infinite possibilities for future technological innovation and economic development. By combining the intelligent analysis and decision-making capabilities of AI with the decentralized and user-autonomous nature of Web3, it is believed that we can build a smarter, more open, and fairer economic and even social system.

Disclaimer:

  1. This article is reprinted from [Ryze Labs]. All copyrights belong to the original author [Fred]. If there are objections to this reprint, please contact the Gate Learn team, and they will handle it promptly.
  2. Liability Disclaimer: The views and opinions expressed in this article are solely those of the author and do not constitute any investment advice.
  3. Translations of the article into other languages are done by the Gate Learn team. Unless mentioned, copying, distributing, or plagiarizing the translated articles is prohibited.