Deconstructing the AI framework: From intelligent agents to the exploration of decentralization


Original author: Zeke, Researcher at YBB Capital

Preface

In previous articles, we have repeatedly discussed the current state of AI Memes and the future development of AI Agents. Even so, the AI Agent narrative is developing so quickly and changing so dramatically that it is hard to keep up. In the two short months since Truth Terminal kicked off "Agent Summer", the narrative around combining AI and Crypto has shifted almost every week. Recently, the market's attention has turned to framework projects, which are driven by technical narratives. In the past few weeks alone, this niche has produced several dark horses with market values above 100 million or even 1 billion. These projects have also given rise to a new asset issuance paradigm: projects issue tokens based on their GitHub code repositories, and Agents built on a framework can issue tokens of their own. The framework forms the base layer and the Agents sit on top. This looks like an asset issuance platform, but it is in fact an emerging infrastructure model unique to the AI era. How should we examine this new trend? This article starts with an introduction to frameworks and then adds our own thinking on what AI frameworks mean for Crypto.

1. What is a framework?

By definition, an AI framework is a low-level development tool or platform that integrates a set of pre-built modules, libraries, and tools to simplify the process of building complex AI models. These frameworks usually also include functions for processing data, training models, and making predictions. In short, you can think of a framework as an operating system for the AI era, much like Windows and Linux on the desktop or iOS and Android on mobile devices. Each framework has its own strengths and weaknesses, and developers are free to choose according to their specific needs.

Although the term "AI framework" is still new to the Crypto field, AI frameworks themselves have been developing for nearly 14 years, counting from the birth of Theano in 2010. In traditional AI circles, both academia and industry already have very mature frameworks to choose from, such as Google's TensorFlow, Meta's PyTorch, Baidu's PaddlePaddle, and ByteDance's MagicAnimate, each with its own advantages in different scenarios.

The framework projects that have emerged in Crypto grew out of the heavy demand for Agents at the start of this wave of AI enthusiasm, then branched into other Crypto tracks, and eventually formed AI frameworks for different sub-sectors. Let's take several mainstream frameworks in the space as examples to unpack this sentence.

1.1 Eliza

First, let's take Eliza from ai16z as an example. Eliza is a multi-agent simulation framework built specifically for creating, deploying, and managing autonomous AI agents. It is written in TypeScript, which gives it better compatibility and easier API integration.

According to the official documentation, Eliza mainly targets social media scenarios and offers multi-platform integration. The framework provides full-featured Discord integration with voice channel support, automated accounts on X/Twitter, Telegram integration, and direct API access. For media content, it supports PDF reading and analysis, link content extraction and summarization, audio transcription, video content processing, image analysis and description, and conversation summarization.

Eliza currently supports four main use cases:

AI assistant applications: customer support agents, community managers, personal assistants;
Social media roles: automated content creators, interactive bots, brand representatives;
Knowledge workers: research assistants, content analysts, document processors;
Interactive roles: role-playing characters, educational counselors, entertainment bots.

Eliza currently supports the following models:

Local inference with open-source models such as Llama 3, Qwen 1.5, and BERT;
Cloud-based inference through OpenAI's API;
Nous Hermes Llama 3.1 B as the default configuration;
Claude integration for complex queries.
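
The model options above suggest a routing pattern: a default model handles ordinary requests and a stronger model is reserved for complex queries. The following is a minimal Python sketch of that idea only (Eliza itself is written in TypeScript). The class name, the stub backends, and the length-based complexity threshold are all hypothetical and are not part of Eliza's API.

```python
from dataclasses import dataclass
from typing import Callable, Dict


# Hypothetical stub backends. A real backend would run a local model
# or call a cloud API; these only tag the prompt so routing is visible.
def local_backend(prompt: str) -> str:
    return f"[local] {prompt}"


def cloud_backend(prompt: str) -> str:
    return f"[cloud] {prompt}"


@dataclass
class ModelRouter:
    """Illustrative router: default backend unless the prompt looks complex."""
    backends: Dict[str, Callable[[str], str]]
    default: str
    complex_backend: str
    complexity_threshold: int = 200  # assumption: prompt length as a crude proxy

    def choose(self, prompt: str) -> str:
        # Long prompts are treated as "complex" purely for illustration.
        if len(prompt) > self.complexity_threshold:
            return self.complex_backend
        return self.default

    def complete(self, prompt: str) -> str:
        return self.backends[self.choose(prompt)](prompt)


router = ModelRouter(
    backends={"local": local_backend, "cloud": cloud_backend},
    default="local",
    complex_backend="cloud",
)
```

In practice the complexity check would be something richer than prompt length, but the shape stays the same: one interface, several interchangeable backends.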

1.2 GAME

GAME (Generative Autonomous Multimodal Entities Framework) is a multimodal AI framework launched by Virtuals Protocol for automatically generating and managing agents. Its main scenario is the design of intelligent NPCs in games. Another distinctive feature is that it can be used by people with little or no coding background: judging from its trial interface, users only need to modify parameters to take part in Agent design.

In terms of architecture, GAME's core is a modular design in which multiple subsystems work together. The detailed architecture is shown in the figure below.

Agent Prompting Interface: The interface through which developers interact with the AI framework. Through it, developers can initialize a session and specify parameters such as the session ID, agent ID, and user ID;
Perception Subsystem: The perception subsystem receives input, synthesizes it, and sends it to the strategic planning engine. It also handles responses from the dialogue processing module;
Strategic Planning Engine: The strategic planning engine is the core of the entire framework and is divided into a high-level planner and a low-level policy. The high-level planner formulates long-term goals and plans, while the low-level policy converts these plans into specific action steps;
World Context: The world context contains data such as environment information, world state, and game state. This information helps the agent understand its current situation;
Dialogue Processing Module: The dialogue processing module processes messages and responses. It can generate dialogue or reactions as output;
On Chain Wallet Operator: The on-chain wallet operator is presumably involved in blockchain application scenarios, but its specific functions are unclear;
Learning Module: The learning module learns from feedback and updates the agent's knowledge base;
Working Memory: Working memory stores short-term information such as the agent's recent actions, results, and current plans;
Long Term Memory Processor: The long-term memory processor extracts important information about the agent and its working memory and ranks it by factors such as importance score, recency, and relevance;
Agent Repository: The agent repository stores the agent's goals, reflections, experiences, and personality attributes;
Action Planner: The action planner generates a specific action plan based on the low-level policy;
Plan Executor: The plan executor executes the action plan generated by the action planner.
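
The long-term memory processor described above ranks memories by importance, recency, and relevance. One common way to combine such factors is a weighted sum with exponentially decaying recency. The Python sketch below illustrates that approach under stated assumptions: the equal weights, the 24-hour half-life, and all names are hypothetical, and GAME's actual scoring method is not documented in this article.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Memory:
    text: str
    importance: float  # assumed range 0..1, assigned when the memory is stored
    age_hours: float   # time since the memory was created
    relevance: float   # assumed range 0..1, similarity to the current situation


def recency(age_hours: float, half_life_hours: float = 24.0) -> float:
    """Exponential decay: the value halves every `half_life_hours`."""
    return 0.5 ** (age_hours / half_life_hours)


def score(memory: Memory, w_importance: float = 1.0,
          w_recency: float = 1.0, w_relevance: float = 1.0) -> float:
    """Weighted sum of the three factors (weights are illustrative)."""
    return (w_importance * memory.importance
            + w_recency * recency(memory.age_hours)
            + w_relevance * memory.relevance)


def top_memories(memories: List[Memory], k: int = 2) -> List[Memory]:
    """Return the k highest-scoring memories, best first."""
    return sorted(memories, key=score, reverse=True)[:k]
```

Tuning the weights shifts the agent's character: a high recency weight makes it reactive, while a high importance weight makes it hold on to formative events.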

Workflow: The developer starts the agent through the agent prompting interface, and the perception subsystem receives input and passes it to the strategic planning engine. The strategic planning engine uses information from the memory system, the world context, and the agent repository to develop and execute action plans. The learning module continuously monitors the results of the agent's actions and adjusts the agent's behavior accordingly.
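
The workflow above can be sketched as a single loop: perceive, plan at a high level, translate the plan into concrete steps, execute, then learn from the outcome. The Python below is a toy illustration of that control flow only. Every name, rule, and data structure here is hypothetical and much simpler than anything GAME actually does.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Agent:
    goal: str
    working_memory: List[str] = field(default_factory=list)
    action_stats: Dict[str, int] = field(default_factory=dict)

    def perceive(self, raw_input: str, world: Dict[str, str]) -> Dict[str, str]:
        # Perception: normalize the input and merge it with world context.
        return {**world, "input": raw_input.strip().lower()}

    def high_level_plan(self, observation: Dict[str, str]) -> str:
        # High-level planner: choose a goal (toy rule, for illustration).
        if "enemy" in observation["input"]:
            return "stay safe"
        return self.goal

    def low_level_policy(self, plan: str) -> List[str]:
        # Low-level policy: convert the plan into concrete action steps.
        steps = {
            "stay safe": ["retreat", "call for help"],
            "gather wood": ["find tree", "chop"],
        }
        return steps.get(plan, ["explore"])

    def execute(self, actions: List[str]) -> List[str]:
        # Plan executor: run the actions and record results in working memory.
        results = [f"did:{action}" for action in actions]
        self.working_memory.extend(results)
        return results

    def learn(self, actions: List[str], success: bool) -> None:
        # Learning module: keep a running score per action from feedback.
        for action in actions:
            delta = 1 if success else -1
            self.action_stats[action] = self.action_stats.get(action, 0) + delta

    def step(self, raw_input: str, world: Dict[str, str],
             success: bool = True) -> List[str]:
        observation = self.perceive(raw_input, world)
        plan = self.high_level_plan(observation)
        actions = self.low_level_policy(plan)
        results = self.execute(actions)
        self.learn(actions, success)
        return results
```

A real system would replace the hard-coded rules with model calls, but the separation between choosing a goal and choosing the steps is the part the architecture emphasizes.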

Application scenarios: Judging from the overall technical architecture, this framework focuses mainly on the Agent's decision-making, feedback, perception, and personality in a virtual environment. In terms of use cases, it applies to the Metaverse as well as games. In the project list under Virtuals, you can see that a large number of projects have already been built with this framework.

1.3 Rig

Rig is an open-source tool written in Rust, designed to simplify the development of large language model (LLM) applications. It provides a unified interface that lets developers easily interact with multiple LLM service providers (such as OpenAI and Anthropic) and multiple vector databases (such as MongoDB and Neo4j).

Core Features:

Unified interface: Regardless of the LLM provider or vector store, Rig provides consistent access, greatly reducing the complexity of integration work;
Modular architecture: The framework adopts a modular design with key parts such as a provider abstraction layer, a vector store interface, and an agent system, which keeps the system flexible and extensible;
Type safety: Rust's type system is used to implement type-safe embedding operations, supporting code quality and runtime safety;
Efficient performance: Rig supports asynchronous programming and optimizes concurrent processing; built-in logging and monitoring facilitate maintenance and troubleshooting.

Workflow: When a user request enters the Rig system, it first passes through the provider abstraction layer, which smooths over the differences between providers and keeps error handling consistent. Next, in the core layer, the agent can call various tools or query the vector store to obtain the information it needs. Finally, through mechanisms such as retrieval-augmented generation (RAG), the system combines document retrieval with context understanding to generate accurate, meaningful responses and return them to the user.
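
To illustrate the workflow above, here is a minimal Python sketch of the same pattern: a provider interface, a vector store, and a RAG step that retrieves context before calling the model. This is not Rig's API (Rig is a Rust library). The stub provider, the bag-of-words "embedding", and all names are assumptions chosen so the example runs without any external service.

```python
import math
from collections import Counter
from typing import List, Protocol


class CompletionProvider(Protocol):
    """Provider abstraction: any backend exposing `complete` fits."""
    def complete(self, prompt: str) -> str: ...


class EchoProvider:
    """Hypothetical stub standing in for a real LLM provider."""
    def complete(self, prompt: str) -> str:
        return f"ANSWER based on: {prompt}"


def embed(text: str) -> Counter:
    # Toy embedding: word counts. Real systems use learned dense vectors.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class InMemoryVectorStore:
    def __init__(self, documents: List[str]) -> None:
        self.docs = [(doc, embed(doc)) for doc in documents]

    def top_k(self, query: str, k: int = 1) -> List[str]:
        q = embed(query)
        ranked = sorted(self.docs, key=lambda pair: cosine(q, pair[1]), reverse=True)
        return [doc for doc, _ in ranked[:k]]


def rag_answer(provider: CompletionProvider, store: InMemoryVectorStore,
               question: str, k: int = 1) -> str:
    # Retrieve the most similar documents, then pass them as context.
    context = "\n".join(store.top_k(question, k))
    return provider.complete(f"Context: {context}\nQuestion: {question}")
```

Because `rag_answer` depends only on the `CompletionProvider` interface, swapping in a different provider requires no change to the retrieval code, which is the benefit the unified interface is meant to deliver.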

Application scenarios: Rig is suitable for building question-answering systems that must respond quickly and accurately, and it can also be used to create efficient document search tools and context-aware chatbots or virtual assistants. It can even support content creation by automatically generating text or other forms of content based on existing data patterns.

1.4 ZerePy

ZerePy is a Python-based open source framework designed to simplify the process of deploying and managing AI Agents on the X (formerly Twitter) platform. It was born out of the Zerebro project and inherits its core functionality, but is designed in a more modular and extensible way. Its goal is to enable developers to easily create personalized AI Agents and implement various automated tasks and content creation on X.

ZerePy provides a command line interface (CLI) that allows users to manage and control their deployed AI Agents [1]. Its core architecture is based on modular design, allowing developers to flexibly integrate different functional modules, such as:

LLM integration: ZerePy supports large language models (LLMs) from OpenAI and Anthropic, and developers can choose the model that best suits their application scenario. This enables Agents to generate high-quality text content;
X platform integration: The framework integrates directly with the X platform's API, allowing Agents to post, reply, like, repost, and so on;
Modular connection system: This system allows developers to easily add support for other social platforms or services, extending the functionality of the framework;
Memory system (future plan): Although it may not be fully implemented in the current version, ZerePy's design goals include a memory system that lets the agent remember previous interactions and contextual information, so that it can generate more coherent and personalized content.
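
A modular connection system of the kind described above is often built as a registry: each connection exposes named actions, and the agent dispatches to them by name. The Python sketch below shows that general pattern. It is not ZerePy's actual code; the class names and the stub X connection (which appends to a list instead of calling any API) are hypothetical.

```python
from typing import Any, Callable, Dict, List


class Connection:
    """A pluggable integration, for example a social platform or an LLM."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._actions: Dict[str, Callable[..., Any]] = {}

    def register_action(self, action: str, handler: Callable[..., Any]) -> None:
        self._actions[action] = handler

    def list_actions(self) -> List[str]:
        return sorted(self._actions)

    def perform(self, action: str, **kwargs: Any) -> Any:
        if action not in self._actions:
            raise ValueError(f"{self.name} has no action '{action}'")
        return self._actions[action](**kwargs)


class ConnectionManager:
    """Holds all connections so an agent can dispatch by name."""

    def __init__(self) -> None:
        self._connections: Dict[str, Connection] = {}

    def add(self, connection: Connection) -> None:
        self._connections[connection.name] = connection

    def perform(self, connection: str, action: str, **kwargs: Any) -> Any:
        return self._connections[connection].perform(action, **kwargs)


def make_stub_x_connection(outbox: List[str]) -> Connection:
    """Stub X connection: records messages locally, makes no network calls."""

    def post(text: str) -> int:
        outbox.append(text)
        return len(outbox)  # pretend the position in the outbox is the post ID

    def reply(post_id: int, text: str) -> int:
        outbox.append(f"re:{post_id}:{text}")
        return len(outbox)

    connection = Connection("x")
    connection.register_action("post", post)
    connection.register_action("reply", reply)
    return connection
```

Adding support for another platform then means writing one more factory like `make_stub_x_connection` and registering it, without touching the agent logic.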

Although both ZerePy and ai16z's Eliza project are dedicated to building and managing AI agents, their architectures and goals differ slightly. Eliza focuses more on multi-agent simulation and broader AI research, while ZerePy focuses on simplifying the deployment of AI agents on one specific social platform (X) and leans toward ease of use in practical applications.

2. A replica of the BTC ecosystem

In terms of development path, AI Agents have a lot in common with the BTC ecosystem of late 2023 and early 2024. The BTC ecosystem's path can be summarized simply as: BRC-20 → multi-protocol competition among Atomicals, Runes, and others → BTC L2 → BTCFi centered on Babylon. AI Agents have developed faster because they build on a mature traditional AI technology stack, but the overall path is indeed similar in many ways. I would summarize it briefly as: GOAT/ACT → social Agents and analytical AI Agents → framework competition. Judging by the trend, infrastructure projects focused on Agent decentralization and security are likely to take over from the current framework craze and become the main theme of the next stage.

Will this track become homogenized and turn into a bubble, as the BTC ecosystem did? I think not. First, the AI Agent narrative is not about replaying the history of smart contract chains. Second, whether the existing AI framework projects are technically strong, stuck at the slide-deck stage, or simply copy-pasted, they at least offer a new idea for infrastructure development. Many articles compare AI frameworks to asset issuance platforms and Agents to assets. Compared with Memecoin launchpads and inscription protocols, however, I personally think AI frameworks are more like future public chains, and Agents are more like future Dapps.

In today's Crypto, we have thousands of public chains and tens of thousands of Dapps. Among general-purpose chains, we have BTC, Ethereum, and various heterogeneous chains, while application chains take more diverse forms, such as game chains, storage chains, and DEX chains. Public chains map closely onto AI frameworks, and Dapps map just as well onto Agents.

Crypto in the AI era is very likely to move toward this form. The future debate will shift from EVM versus heterogeneous chains to framework versus framework. The more pressing question now is how to decentralize these frameworks, or put them on-chain. I think subsequent AI infrastructure projects will build on this basis. A further question is what the point of doing this on a blockchain is.

3. What is the significance of going on-chain?

Whatever blockchain is combined with, it eventually faces one question: is it meaningful? In last year's article, I criticized GameFi for putting the cart before the horse and Infra for getting too far ahead of demand. In previous articles about AI, I also expressed skepticism about combining AI and Crypto in practical fields at this stage. After all, narrative has become a weaker and weaker driver for traditional projects. The few traditional projects whose token prices performed well last year generally had the strength to match or exceed their price. What can AI do for Crypto? I had previously thought of ideas that are unoriginal but have real demand, such as Agents operating on a user's behalf, the Metaverse, and Agents as employees. However, these needs do not have to be fully on-chain, and they do not form a closed loop in terms of business logic. The Agent browser mentioned in the previous issue can give rise to needs such as data labeling and inference compute, but the combination of the two is still not tight enough, and compute overall is still dominated by centralized providers.

Consider again why DeFi succeeded. DeFi was able to take a share from traditional finance because it offers higher accessibility, better efficiency, lower costs, and security that does not require trusting a centralized party. Following this line of thought, I think there may be several more reasons to support putting Agents on-chain.

1. Can putting Agents on-chain lower usage costs and thereby increase accessibility and choice? Ultimately, the AI "rental rights" that today belong exclusively to Web2 giants could be shared by ordinary users;

2. Security. By the simplest definition, an AI that can be called an Agent should be able to interact with the virtual or real world. If an Agent can intervene in reality or in my virtual wallet, then a blockchain-based security solution becomes a hard requirement;

3. Can Agents enable a set of financial mechanisms unique to blockchain? For example, LPs in an AMM allow ordinary people to participate in automated market making. Similarly, Agents need compute, data labeling, and so on, and users who are optimistic about an Agent could invest in the protocol in the form of U. Agents built for different application scenarios could also form new financial mechanisms;

4. DeFi does not have perfect interoperability at present. If Agents combined with blockchain can achieve transparent and traceable reasoning, they may be more attractive than the agent browsers offered by the traditional Internet giants mentioned in the previous article.

4. Creativity?

Framework projects will also offer an entrepreneurial opportunity similar to the GPT Store in the future. Although releasing an Agent through a framework is still very complicated for ordinary users, I believe that frameworks which simplify the Agent building process and provide combinations of more complex functions will prevail, forming a Web3 creative economy more interesting than the GPT Store.

The current GPT Store still leans toward practical uses in traditional fields, most popular apps are built by traditional Web2 companies, and the revenue accrues exclusively to the creators. According to OpenAI's official explanation, this strategy provides financial support only to some outstanding developers in the United States, in the form of a certain amount of subsidies.

On the demand side, Web3 still has many gaps to fill, and its economic system can make the unfair policies of Web2 giants fairer. In addition, a community economy can be introduced naturally to keep improving Agents. The Agent creative economy will be an opportunity for ordinary people to participate, and future AI Memes will be much smarter and more interesting than the Agents issued on GOAT and Clanker.

References:

1. Historical evolution and trend exploration of AI frameworks

2. Bybit: AI Rig Complex (ARC): AI agent framework

3. Deep Value Memetics: Horizontal comparison of the four major Crypto×AI frameworks: adoption status, advantages and disadvantages, and growth potential

4. Eliza official documentation

5. Virtuals Protocol official documentation
