Want to track the latest trends in Web3? Web3Caff Research selects and interprets the latest publicly disclosed Web3 financing projects for you. Look past the surface to the substance, and follow us to stay on top of market trends.
Author: ShirleyLi, Web3Caff Research Researcher
Cover: Logo from this project; typography by Web3Caff Research
Word Count: Over 3,200 words in total
According to The Block, on July 23, a16z led a $15 million seed round for Poseidon, a decentralized data layer project for AI.
In the early days of AI model training, the training data for general-purpose models, especially large language models (LLMs), came largely from open sources such as books, Reddit forums, Baidu Tieba, and Wikipedia pages. This crawling approach, however, means that many models end up sharing the same generic data. In addition, as platforms become more copyright-conscious and data protection policies tighten, crawling data at scale has become harder for AI models. More importantly, AI applications are gradually expanding into specialized fields, moving beyond purely digital settings such as conversational systems and search recommendations into physical, real-world settings such as autonomous driving, voice systems, and robotics. Because these real-world applications rely on more complex AI Agent architectures, what they need is not crawled generic data but finer-grained video, image, audio, and environmental-state data captured in real environments, and even data from interactions with the physical world. Such data is typically fragmented across sources such as dashcams, smartphones, home devices, and storage systems. How to collect high-quality data at scale and in a compliant manner has therefore become key to building the data network for next-generation AI Agents.
Poseidon, the project behind this funding announcement, is a full-stack decentralized data layer incubated by Story and built on Story Protocol. It offers a full-stack IP solution covering data collection, rights confirmation, trading, and tracing: training data is registered as on-chain IP assets, which unlocks high-quality, compliant data sources and gives AI models a trusted supply of data.