LYS Labs’ Graph-Based Architecture for Web3 Analytics

Unlocking DeFi's Hidden Insights with LYS Labs

DeFi moves extremely fast, as we all know. The sheer volume and complexity of data generated by the interplay of smart contracts, liquidity pools, and user interactions have created a real challenge for analytics.

As DeFi continues to grow and evolve (at, quite frankly, an unstoppable pace), traditional data management tools like SQL databases will struggle to keep up. The intricate web of relationships between tokens, users, and protocols means we need a new approach: one that can represent these complexities and uncover the hidden insights within all these on-chain actions.

LYS Labs uses the power of real-time data, ontologies and advanced graph databases to change the way we navigate and understand Web3.

The Web3 Data Dilemma

To really appreciate the challenge that Web3 analytics presents, we need to understand the nature of the data itself. In the on-chain world, every transaction, liquidity provision, and smart contract interaction creates a data point that is linked to countless others. A single user might interact with multiple protocols, each of which is connected to even more tokens, liquidity pools, and other smart contracts. This results in a complex, ever-evolving graph of relationships that traditional databases are not equipped to handle.

Take SQL databases. They were once considered the go-to solution for data management, but in this environment they struggle to keep up with demand. As the number of interconnected entities grows, the performance of relationship-heavy SQL queries degrades rapidly: complex queries involving multiple joins and aggregations across tables can become prohibitively slow and resource-intensive. Additionally, the rigid tabular structure of SQL databases makes it difficult to integrate the diverse types of data needed to understand Web3 (such as unstructured data from social media, news articles, and community forums).

Using Graph Databases

To tackle these challenges, LYS Labs turned to graph databases as its foundation. Graph databases (like Neo4j) are specifically designed to store and query highly interconnected data. By representing data as a network of nodes (entities) and edges (relationships), they help draw context and meaning out of complex and unstructured Web3 data.
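To make the node-and-edge model concrete, here is a minimal sketch in plain Python. It is an illustration only, not LYS Labs' implementation or the Neo4j API: the `Graph` class, the node ids, and relationship names such as `PROVIDES_LIQUIDITY` are all hypothetical.

```python
from collections import defaultdict, deque


class Graph:
    """Tiny in-memory property graph: nodes have properties, edges have a type."""

    def __init__(self):
        self.nodes = {}                     # node id -> properties
        self.out_edges = defaultdict(list)  # node id -> [(relationship, target id)]

    def add_node(self, node_id, **props):
        self.nodes[node_id] = props

    def add_edge(self, source, relationship, target):
        self.out_edges[source].append((relationship, target))

    def neighbors(self, node_id, relationship=None):
        """Targets of outgoing edges, optionally filtered by relationship type."""
        return [t for r, t in self.out_edges[node_id] if relationship in (None, r)]

    def reachable(self, start, max_hops):
        """Breadth-first search: hop distance to every node within max_hops of start."""
        seen = {start: 0}
        queue = deque([start])
        while queue:
            current = queue.popleft()
            if seen[current] == max_hops:
                continue  # do not expand beyond the hop limit
            for _, target in self.out_edges[current]:
                if target not in seen:
                    seen[target] = seen[current] + 1
                    queue.append(target)
        return seen


# Hypothetical example data: one wallet, one pool, two tokens.
g = Graph()
g.add_node("wallet:alice", kind="wallet")
g.add_node("pool:ETH-USDC", kind="pool")
g.add_node("token:ETH", kind="token")
g.add_node("token:USDC", kind="token")
g.add_edge("wallet:alice", "PROVIDES_LIQUIDITY", "pool:ETH-USDC")
g.add_edge("pool:ETH-USDC", "HOLDS", "token:ETH")
g.add_edge("pool:ETH-USDC", "HOLDS", "token:USDC")

hops = g.reachable("wallet:alice", max_hops=2)
```

Here the two-hop question "which tokens is this wallet exposed to?" is answered by following edges from the wallet rather than by joining tables, which is the property a native graph database is built around.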

Native graph databases offer significant advantages over traditional SQL databases when it comes to handling dynamic, interconnected data. With graph databases, relationship-heavy queries that would cripple an SQL database can often be executed in milliseconds, even across massive datasets with billions of nodes and relationships. This performance boost is essential in an environment where the ability to analyze data in real time can mean the difference between seizing an opportunity and missing out.

What’s more, graph databases provide a great deal of flexibility in integrating data from diverse sources. By combining on-chain and off-chain data, such as transaction records, liquidity pool states, social media sentiment, and market data, graph databases let LYS Labs create a holistic view of the entire Web3 landscape. This multi-dimensional data integration enables analysts to uncover hidden connections, identify emerging trends, and gain a deeper understanding of the complex dynamics at play.

LYS Labs Is At the Forefront of Graph-Powered Analytics

LYS Labs uses advanced Graph Data Science techniques to transform raw on-chain data into a rich, semantically meaningful knowledge graph. The knowledge graph captures the intricate relationships between wallets, tokens, liquidity pools, and smart contracts, providing a comprehensive and nuanced representation of the ecosystem.

Building upon this graph-based foundation, LYS Labs utilizes cutting-edge machine learning algorithms, such as Graph Neural Networks (GNNs), to uncover patterns, detect anomalies, and make predictions about future behaviors and trends. GNNs can also be used to leverage historical responses generated by LLMs, enhancing RAG for long-context global summarization. One proposed method of this kind reports consistent improvements of 8-19% over baseline methods, such as sparse retrievers (BM25), dense retrievers (Contriever, Dense Passage Retrieval (DPR)), and long-context LLMs (Gemma-8K, Mistral-8K).
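For readers new to GNNs, the core idea is that each node repeatedly updates its representation from its neighbours. The sketch below shows one round of mean aggregation in plain Python. It is a deliberately simplified illustration: a real GNN layer also applies learned weights and a non-linearity, and the node names and feature vectors here are made up.

```python
def message_passing_round(features, edges):
    """One round of mean aggregation over an undirected graph.

    features: node id -> list of floats (all the same length)
    edges:    iterable of (node_a, node_b) pairs
    Returns new features where each node becomes the mean of itself
    and its direct neighbours.
    """
    # Start every neighbourhood with the node itself (a self-loop).
    neighbourhood = {node: [node] for node in features}
    for a, b in edges:
        neighbourhood[a].append(b)
        neighbourhood[b].append(a)

    updated = {}
    for node, group in neighbourhood.items():
        dim = len(features[node])
        updated[node] = [
            sum(features[n][i] for n in group) / len(group) for i in range(dim)
        ]
    return updated


# Hypothetical two-dimensional features for a wallet, a pool, and a token.
features = {"wallet": [1.0, 0.0], "pool": [0.0, 1.0], "token": [0.0, 0.0]}
edges = [("wallet", "pool"), ("pool", "token")]
after_one_round = message_passing_round(features, edges)
```

After one round the token's vector already carries information from the pool, and after a second round it would carry information from the wallet as well, which is how multi-hop structure ends up encoded in node representations.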

One of the most exciting parts of LYS Labs is the application of Graph Retrieval-Augmented Generation (RAG) techniques. The effectiveness of RAG has been demonstrated through its performance on challenging benchmarks like HotpotQA (19.6% F1 improvement) and 2WikiMultiHopQA (33.5% F1 improvement) compared to HippoRAG. These benchmarks are designed to test a system's ability to handle complex, multi-step questions that require reasoning over multiple pieces of information.

Additionally, another proposed GNN-RAG optimization model achieved strong performance across all indicators, with a quality score of 0.90, knowledge consistency of 0.85, and reasoning ability of 0.91, surpassing other models by 5.67–28.74%.

By combining graph databases with advanced language models, LYS Labs has been able to create a solution that can answer complex, ad-hoc questions about Web3 in natural language. This means that users can simply ask questions like:

"Which liquidity pools have seen the highest volume of transactions from whales in the past week?"

When asked a question like this, the RAG system not only finds the relevant data but also provides deep, contextually rich insights. It does this by traversing the knowledge graph, retrieving the necessary information, and then generating a comprehensive, easily understandable response.
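The retrieval half of that example question can be sketched as a filter and aggregation over wallet-to-pool transactions. This is a simplified illustration under stated assumptions, not LYS Labs' query engine: the function name, the record fields, the sample data, and the 1,000,000 USD "whale" cutoff are all hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta

WHALE_THRESHOLD_USD = 1_000_000  # assumed cutoff for calling a wallet a "whale"


def top_whale_pools(balances_usd, transactions, now, days=7, limit=3):
    """Rank pools by transaction volume from whale wallets in the last `days` days."""
    cutoff = now - timedelta(days=days)
    whales = {w for w, bal in balances_usd.items() if bal >= WHALE_THRESHOLD_USD}

    volume = defaultdict(float)
    for tx in transactions:
        if tx["wallet"] in whales and tx["time"] >= cutoff:
            volume[tx["pool"]] += tx["amount_usd"]

    # Highest volume first, truncated to the requested number of pools.
    return sorted(volume.items(), key=lambda item: item[1], reverse=True)[:limit]


# Hypothetical sample data.
now = datetime(2024, 6, 8)
balances_usd = {"0xA": 5_000_000, "0xB": 20_000, "0xC": 2_500_000}
transactions = [
    {"wallet": "0xA", "pool": "ETH-USDC", "amount_usd": 400_000, "time": datetime(2024, 6, 5)},
    {"wallet": "0xC", "pool": "WBTC-ETH", "amount_usd": 900_000, "time": datetime(2024, 6, 7)},
    {"wallet": "0xB", "pool": "ETH-USDC", "amount_usd": 10_000, "time": datetime(2024, 6, 6)},   # not a whale
    {"wallet": "0xA", "pool": "ETH-USDC", "amount_usd": 700_000, "time": datetime(2024, 5, 1)},  # too old
]
ranking = top_whale_pools(balances_usd, transactions, now)
```

The ranked list is what would then be handed to the language model as context, so that the generated answer is built from retrieved figures rather than from the model's memory.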

But RAG goes beyond simply locating and presenting data. Its true power lies in its ability to offer profound context and nuanced understanding. By leveraging the intricate web of relationships captured in the knowledge graph, RAG can uncover hidden patterns, identify key influencers, and surface insights that might otherwise remain buried in the vast expanse of on-chain data.

For instance, in addition to identifying the liquidity pools with the highest whale activity, RAG might also reveal:

  • The specific whale addresses driving this volume and their historical trading patterns
  • Correlations between whale activity and price movements of the associated tokens
  • Potential ripple effects on other related pools and protocols
  • Comparisons to past periods of similar activity and their outcomes

This depth of context is invaluable for making truly informed decisions. It means users can not only see what is happening but also understand why it's happening and what it might mean for the future.

Also, by presenting these insights in plain English, RAG democratizes access to powerful on-chain analytics. It enables anyone, regardless of their technical background or familiarity with data science, to harness the information contained within the crypto networks.

Advanced Functionality (GNN-RAG and Beyond)

The GNN-RAG developed by LYS Labs utilizes the power of graph neural networks to encode the complex relationships between entities in Web3. However, the RAG component serves a much broader purpose than simply acting as a natural language processing layer. RAG is a framework that alleviates Large Language Model (LLM) hallucinations by enriching the input context with up-to-date and accurate information retrieved from a VectorDB or a Knowledge Graph. This integration ensures that the generated responses are not only coherent and well-formed but also grounded in the most relevant and factual data available. 
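The grounding step described here, retrieving facts and placing them in the model's input, can be sketched in a few lines. The example below is a toy under stated assumptions: the knowledge graph is a list of (subject, predicate, object) triples, retrieval is naive substring matching rather than vector or GNN-based retrieval, and no LLM is actually called. All names and data are hypothetical.

```python
def retrieve_triples(triples, question, limit=5):
    """Naive retrieval: keep triples whose subject or object is named in the question."""
    q = question.lower()
    hits = [t for t in triples if t[0].lower() in q or t[2].lower() in q]
    return hits[:limit]


def build_prompt(question, facts):
    """Assemble an LLM prompt that restricts the answer to the retrieved facts."""
    lines = [f"- {s} {p} {o}" for s, p, o in facts]
    context = "\n".join(lines) if lines else "- (no matching facts)"
    return (
        "Answer using only the facts below. If they are insufficient, say so.\n"
        f"Facts:\n{context}\n"
        f"Question: {question}"
    )


# Hypothetical knowledge-graph triples.
triples = [
    ("wallet 0xA", "provides liquidity to", "ETH-USDC"),
    ("ETH-USDC", "holds", "USDC"),
    ("wallet 0xB", "voted on", "proposal 12"),
]
question = "Who provides liquidity to ETH-USDC?"
facts = retrieve_triples(triples, question)
prompt = build_prompt(question, facts)
```

Only the two triples that mention ETH-USDC reach the prompt; the unrelated governance fact is left out. Constraining the model to retrieved, current facts is the mechanism by which RAG reduces hallucination.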

Some key advantages of the RAG component include:

  • Facilitating the acquisition of contextually rich sub-graph data, positioning it as a formidable asset for enhancing AI models.
  • Guaranteeing that queries yield the most pertinent and contextual insights by strategically using graphs that delineate entities and their interconnections.
  • Providing profound context from intricate relationships, thereby strengthening the LLM’s ability to provide the most relevant answers.

KG-RAG demonstrates a 4.32% increase in answer correctness, relevancy, and faithfulness, along with a 5.71% improvement in semantic similarity. Other implementations of KG-RAG show substantial improvements over traditional methods, with a 20-30% increase in Answer Entity Recall for multi-hop reasoning tasks and up to 40% gains in accuracy-based metrics.

Ultimately, this means that users can ask complex, multi-hop questions about Web3 in plain English and receive accurate, insightful responses in real time. The GNN-RAG parses the query, traverses the knowledge graph to identify the relevant entities and relationships, and generates a detailed response, complete with data visualizations and actionable insights.

This level of advanced functionality opens up a whole new realm of possibilities for Web3 analytics. Users can:

  • Monitor the real-time flow of assets across the entirety of Web3, identifying emerging trends and opportunities.
  • Perform complex, multi-dimensional analyses of liquidity positions, yield farming strategies, and trading patterns.
  • Investigate suspicious activities and potential security threats with depth and accuracy.
  • Gain a holistic understanding of the complex interplay between protocols, tokens, and users.
  • Make data-driven decisions based on timely, actionable insights.

GNN-RAG also achieves state-of-the-art performance on challenging benchmarks. Using a tuned 7B-parameter LLM, GNN-RAG excels on multi-hop and multi-entity questions, outperforming competing approaches by 8.9 to 15.5 percentage points in answer F1. The F1 score is a common evaluation metric that combines precision (the proportion of returned results that are relevant) and recall (the proportion of relevant results that are returned) as their harmonic mean.
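To make the metric concrete, here is a small worked example that computes answer F1 for a single question. The predicted and reference answer sets are invented for illustration, and benchmark implementations may normalize answers differently before comparing them.

```python
def f1_score(predicted, relevant):
    """F1 = harmonic mean of precision and recall over two sets of answers."""
    predicted, relevant = set(predicted), set(relevant)
    true_positives = len(predicted & relevant)
    if true_positives == 0:
        return 0.0  # also covers empty inputs, avoiding division by zero
    precision = true_positives / len(predicted)  # share of returned answers that are correct
    recall = true_positives / len(relevant)      # share of correct answers that were returned
    return 2 * precision * recall / (precision + recall)


# Hypothetical case: 4 answers returned, 3 answers expected, 2 in common.
score = f1_score(predicted=["A", "B", "C", "D"], relevant=["A", "B", "E"])
```

In this case precision is 2/4 and recall is 2/3, giving an F1 of 4/7, or roughly 0.571. Because the harmonic mean is pulled toward the lower of the two values, a system cannot score well by returning many answers indiscriminately.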

These results show the effectiveness of integrating graph structure information and the model's ability to capture complex relationships between nodes and edges. This improves GNN-RAG's capability to handle sophisticated knowledge reasoning tasks, resulting in text generation with improved knowledge consistency.

Real-World Impact and Use Cases

The impact of LYS Labs' graph-powered analytics extends far beyond theoretical insights. It's already driving tangible value for Web3 users across a wide range of use cases, such as:

AI Training

One of the most significant applications of LYS Labs is AI training. The LYS Sandbox provides a powerful, privacy-centric environment for training and testing AI models using real-time data streams and pre-loaded datasets from Web3.

The AI copilot assists users in refining models and iterating on their performance. This collaboration between human expertise and AI-powered tools accelerates the development of high-performing models customized to the unique challenges of Web3.

Through this powerful AI training infrastructure, LYS Labs is enabling the development of next-generation AI models that can navigate the complexities of Web3 data with accuracy and insight.

Protocol Governance and Transparency

Graph analytics also plays an important role in improving the transparency and accountability of Web3 protocol governance (something that is increasingly in demand). By analyzing voting patterns, proposal histories, the relationships between key holders, and more, LYS Labs can identify potential conflicts of interest, collusion, or other governance risks.

With this level of transparency, building trust in Web3 protocols becomes less of a challenge, and it becomes easier to check that governance decisions are aligned with the best interests of the community. Using LYS, users and token holders can easily track the activities of major holders, verify the integrity of governance processes, and make informed decisions about their participation.
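One simple form of voting-pattern analysis is measuring how often two wallets vote the same way on proposals they both voted on. The sketch below is an illustrative heuristic with hypothetical wallets, votes, and thresholds, not LYS Labs' method. High agreement alone does not prove collusion; it only marks a pair as worth a closer look.

```python
from itertools import combinations


def voting_agreement(votes_a, votes_b):
    """Share of jointly voted proposals on which two wallets cast the same vote."""
    shared = set(votes_a) & set(votes_b)
    if not shared:
        return 0.0
    same = sum(1 for proposal in shared if votes_a[proposal] == votes_b[proposal])
    return same / len(shared)


def flag_aligned_pairs(votes, threshold=0.9, min_shared=3):
    """Return wallet pairs that voted together often enough to merit review."""
    flagged = []
    for a, b in combinations(sorted(votes), 2):
        shared = set(votes[a]) & set(votes[b])
        if len(shared) >= min_shared and voting_agreement(votes[a], votes[b]) >= threshold:
            flagged.append((a, b))
    return flagged


# Hypothetical voting records: wallet -> {proposal id: vote}.
votes = {
    "0xA": {1: "yes", 2: "no", 3: "yes", 4: "yes"},
    "0xB": {1: "yes", 2: "no", 3: "yes", 4: "yes"},
    "0xC": {1: "no", 2: "no", 3: "no"},
}
pairs = flag_aligned_pairs(votes)
```

In a graph setting, each flagged pair could become a weighted edge between wallet nodes, so that clusters of aligned voters can then be examined alongside token flows between the same wallets.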

Risk Management and Security

One of the key applications of graph analytics in Web3 is in the realm of risk management and security. By utilizing advanced graph pattern matching and anomaly detection, LYS Labs helps protocols and users stay one step ahead of potential threats.

For example, LYS Labs can detect suspicious transaction patterns that may indicate a flash loan attack or a rug pull in progress. By identifying these anomalies in real-time and alerting the relevant users, LYS Labs enables proactive mitigation of risks, helping to keep user funds safe and maintain the integrity of Web3 protocols.
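As a rough illustration of pattern matching on transaction data, the sketch below flags transactions in which a large borrow is repaid within the same transaction, which is the defining shape of a flash loan. The event schema, the sample data, and the 1,000,000 USD threshold are assumptions made for this example. Flash loans are often legitimate, so a match is a candidate for review and not proof of an attack.

```python
from collections import defaultdict


def find_flash_loan_candidates(events, min_amount_usd=1_000_000):
    """Return tx hashes where a large borrow is fully repaid in the same transaction."""
    by_tx = defaultdict(list)
    for event in events:
        by_tx[event["tx"]].append(event)

    candidates = []
    for tx_hash, tx_events in by_tx.items():
        borrowed = sum(e["amount_usd"] for e in tx_events if e["type"] == "borrow")
        repaid = sum(e["amount_usd"] for e in tx_events if e["type"] == "repay")
        if borrowed >= min_amount_usd and repaid >= borrowed:
            candidates.append(tx_hash)
    return candidates


# Hypothetical decoded on-chain events.
events = [
    {"tx": "0x01", "type": "borrow", "amount_usd": 5_000_000},
    {"tx": "0x01", "type": "swap", "amount_usd": 5_000_000},
    {"tx": "0x01", "type": "repay", "amount_usd": 5_004_500},
    {"tx": "0x02", "type": "borrow", "amount_usd": 2_000_000},  # borrowed, not repaid in-tx
    {"tx": "0x03", "type": "swap", "amount_usd": 30_000},
]
suspicious = find_flash_loan_candidates(events)
```

A production detector would combine a structural match like this with further signals, such as abnormal price movement in the pools touched by the same transaction, before raising an alert.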

An evaluation of the KG-RAG approach on the UltraDomain benchmark, comparing it against three chunk-based retrieval methods, showed significant improvements. The KG-RAG solutions outperformed the closest chunk-based baseline by 15.4% to 221.2%, demonstrating substantial advancements in retrieval quality and response generation.

Looking Ahead

As Web3 continues to grow and evolve at a rapid pace, the importance of advanced analytics tools like those LYS Labs has to offer will only continue to increase. With the power of graph databases and AI, LYS Labs is well-positioned to stay at the forefront, continuously pushing the boundaries of what's possible.

Looking ahead, LYS Labs is committed to expanding its overall capabilities to keep pace with the ever-changing needs of Web3.

Some of the key areas of focus for future development could include:

  • Integrating new data sources and types, such as cross-chain data and off-chain data from TradFi.
  • Enhancing the natural language processing capabilities of the RAG component to enable even more sophisticated and nuanced querying.
  • Developing new ML models and architectures specifically customized to the unique challenges of Web3 analytics, such as temporal graph networks for modeling time-series data.
  • Collaborating with leading Web3 protocols to create custom analytics solutions that address their specific needs and use cases.
  • Expanding the explainability and interpretability features to provide users with even greater transparency and trust in the insights generated.

In Conclusion…

Web3 innovation has moved quickly, but it has also created new challenges in terms of data complexity and analytical demands. Traditional tools and approaches simply aren't equipped to handle the scale, speed, and intricacy of this data landscape.

LYS Labs is at the forefront of addressing these challenges, creating new and exciting solutions using graph-powered analytics that utilize the latest advances in databases, AI, and natural language processing. By harnessing the power of graph databases and techniques like GNNs and RAG, LYS Labs is unlocking previously inaccessible insights, enabling Web3 users to navigate the ecosystem with unparalleled clarity and confidence.

If you're as passionate about Web3 and the power of data as we are, we invite you to join us. Whether you're a developer, a liquidity provider, a trader, or a researcher, LYS Labs has something to offer. Together, we can unlock the full potential of Web3, one insight at a time.