Web3.0 GPT4 LLM Artificial Intelligence Experiment

-------------------------------------------  Completed 6/16/2023 ------------------------------------------

Our goal is to find ways to leverage LLM and Knowledge Base technologies to enhance eCommerce digital experiences.

This investigation will center on capabilities similar to Zillow's AI-powered natural language search.

There are numerous use-cases that LLM AI can address in the eCommerce space.  We are going to focus on the specific Question and Answer (QA) use case as described by Zillow.  

Commerce Question and Answer (QA)

Commerce QA is a unique beast that blends fluid customer discovery with a very specific product-centric language centered on product providers, catalog categories, products/assets, clients, loyalty, offers, orders and payment.  The customer is not always sure of what they want at the beginning.  For example, the Zillow flow must incorporate a lot of customer-specific requirements, including some needs that customers might not even be aware of at first.

We believe a pure LLM with fine-tuning will not provide the needed multi-step logical reasoning, perception of topological factors, or handling of temporal progression.

We believe the solutions will incorporate components of both knowledge graph and natural language search.  

This experiment will consist of multiple prototypes that demonstrate the various component capabilities.  Over the next few weeks we want to better understand the fundamental LLM and Graph reasoning blocks that are needed.  The actual end-to-end solution will be demonstrated in a future experiment.

Technologies


Experiment demonstrations


First demonstration


Second demonstration


Third demonstration


Conclusions:

The overview and conclusions for each of the three demonstrations are captured in separate demonstration-focused documents.

Our findings were relatively conclusive.  An LLM on its own was not adequate.  Augmenting the LLM with semantic knowledge is going to be key for an eCommerce QA process.  Our best option was to leverage LangChain knowledge base embeddings within the AI natural language search flow.  This seems to be a common conclusion that folks are coming to for use-cases like eCommerce.

Our analysis also showed the incredible pace at which this technology is moving.  All of our demonstrations required us to upgrade Python libraries to get the code working.  Most of the Python libraries we used were less than a month old.  We will need to revisit these technologies and findings every month or so.

We recommend that commerce shops focus on developing rich ontologies and focused, domain-specific knowledge graphs, and build tools to maintain those knowledge graphs over time.  Find ways to embed knowledge graph information (JSON-LD) into your site content (semantic web) so that third-party engines can leverage it.
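
As a minimal sketch of the JSON-LD recommendation above, the following Python builds a schema.org Product description and wraps it in the script tag used to embed JSON-LD in a page.  The product name, SKU, price, and the helper names product_jsonld and jsonld_script_tag are hypothetical and chosen only for illustration; a real catalog would likely carry many more properties.

```python
import json


def product_jsonld(name, sku, price, currency="USD"):
    """Build a schema.org Product description as a JSON-LD dictionary."""
    return {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": name,
        "sku": sku,
        "offers": {
            "@type": "Offer",
            "price": f"{price:.2f}",
            "priceCurrency": currency,
        },
    }


def jsonld_script_tag(data):
    """Wrap a JSON-LD dictionary in a script tag for embedding in HTML."""
    body = json.dumps(data, indent=2)
    # Escape "</" so a string value can never close the script element early.
    body = body.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{body}\n</script>'


# Hypothetical product used only for illustration.
example = product_jsonld("Trail Running Shoe", "SKU-0001", 89.5)
print(jsonld_script_tag(example))
```

The printed tag can be placed in a product page template so that third-party engines can read the same facts that live in the knowledge graph.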

We also recommend looking at Natural Language Processing search techniques that leverage your knowledge graph data to enhance your current site search and navigation.

This will all prepare brands for the future.  We anticipate that open source technology for a blended LLM and Knowledge Graph based QA flow will be available within the next few months.  The time to market for a solution like this will be mostly determined by the brand's ability to get its data ready for this capability.

It's interesting to note the Google search trends for keywords like "Artificial Intelligence", "semantic web" and "ontology".  The initial AI hype has clearly centered on LLM-based Question and Answer flows.  In the past 12 years, semantic web and ontology development has just not caught on.  It seems pretty clear that the push for AI-driven use-cases will probably invigorate the semantic metadata discussion.

 



Notes:


https://makersuite.google.com/app/home

PaLM API


Semantic Search Engine

https://www.deepset.ai/blog/how-to-build-a-semantic-search-engine-in-python


LLM Search Solution (good examples)

https://github.com/ray-project/langchain-ray

https://www.anyscale.com/blog/llm-open-source-search-engine-langchain-ray


https://github.com/ggerganov/llama.cpp



https://beebom.com/how-run-chatgpt-like-language-model-pc-offline/


https://blog.replit.com/llm-training

AI engine that extracts information from various metaverse data sources
Supports a query API
https://whaleanalytica.com/metaverse/


Third Experiment:  Run entire system locally

LLM Engines

Open Source LLMs

Closed LLMs


Definitions:

A GPT model is a type of neural network that uses the transformer architecture to learn from large amounts of text data

Moat: Moats are defensibility mechanisms that prevent competitors from copying your product and business


Datasource

Interesting Projects


Project


My first working example of an LLM application using LLMChain and OpenAI

https://coinsbench.com/chat-with-your-databases-using-langchain-bb7d31ed2e76


nice overview

https://www.leewayhertz.com/build-private-llm/


Interesting but did not build

https://ajay-arunachalam08.medium.com/a-simple-and-easy-web-interface-for-large-language-models-c32698caea2b

https://medium.com/mlearning-ai/an-open-source-low-code-python-wrapper-for-easy-usage-of-the-large-language-models-such-as-e833985c9062


Looks like a simple example

https://hackernoon.com/a-practical-5-step-guide-to-do-semantic-search-on-your-private-data-with-the-help-of-llms


Local Project

https://github.com/codemaker2015/sqldatabasechain-langchain-demo

C:\Web3Store\LLMService\venv

python db.py

python app.py


example of including documents in the use case

https://python.langchain.com/en/latest/use_cases/question_answering.html

https://github.com/hwchase17


good examples

https://github.com/hwchase17


embeddings using langchain and gpt4all

https://artificialcorner.com/gpt4all-is-the-local-chatgpt-for-your-documents-and-it-is-free-df1016bc335

GPT4All Experiment

GPT4All is an open source project that allows you to run the engine on your local computer, disconnected from the internet.  The demo gave me a great response to my "explain slippery slope" question.

GPT4All provides a chat client that hooks to your local server.

GPT4All provides an easy to use client python library.

GPT4All has a discord community

The model architecture is based on LLaMA


Getting started with GPT4All

download and install from this exe: https://gpt4all.io/index.html

github: https://github.com/nomic-ai/gpt4all

it put a shortcut on my windows desktop.  launched the client application

provides about 10 models to choose from.  I chose gpt4all-j-v1.3-groovy because it has a commercial use license and is only 3.53GB in size.

ask question.  "explain slippery slope"  got a great answer.


Ok, how do we leverage our RDF Graph within GPT4All?

(knowledge graphs, semantic network, JSON-LD) here 

RDF Graph is a curated set of facts.  Links back to schema.org. 40% of sites support JSON-LD data. 

Here is a good overview, but not the answer link link

One issue with AI is hallucinations, which we need to watch out for.

Realize that fine-tuning does not add knowledge.  Today, you cannot retrain the model yourself; your data would have to be added to the model as it is being created.


LLM and Knowledge Graphs for ecommerce link 


Approach: Fine-Tuning LLM

Supervised Fine-Tuning of an LLM

Supervised Training Phase

The fine-tuning step is relatively cheap in computation cost thanks to available techniques such as LoRA and QLoRA.

The NaLLM project focuses on fine-tuning with an RDF Graph

structured (JSON-LD) vs unstructured (web content)
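
To make the cost claim concrete, here is a small worked calculation, offered as a sketch rather than a benchmark.  LoRA freezes a d x k weight matrix and trains two low-rank factors of shape d x r and r x k instead, so the trainable parameter count drops from d*k to r*(d+k).  The layer size 4096 and rank 8 below are assumed values for illustration, not figures from our demonstrations, and the count covers trainable parameters only, not total compute.

```python
def full_params(d, k):
    """Trainable parameters when the whole d x k weight matrix is updated."""
    return d * k


def lora_params(d, k, r):
    """Trainable parameters for LoRA factors of shape d x r and r x k."""
    return r * (d + k)


# Assumed layer size and rank, chosen only for illustration.
d = k = 4096
r = 8

full = full_params(d, k)
lora = lora_params(d, k, r)
ratio = lora / full

print(f"full fine-tuning: {full:,} trainable parameters")
print(f"LoRA (r={r}): {lora:,} trainable parameters")
print(f"LoRA trains {ratio:.2%} of the full matrix")
```

For this one matrix, LoRA trains 65,536 parameters instead of 16,777,216, which is 1/256 of the full count.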

Approach: retrieval augmented generation 

In real time, the LLM reaches out to the graph.  LangChain supports this; LlamaIndex (formerly GPT Index) is another option.

The idea behind retrieval-augmented LLM applications like ChatGPT Plugins and LangChain is to avoid relying on internal LLM knowledge only to generate answers. 
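
The retrieval-augmented idea can be sketched without any LLM library: look up the facts most relevant to the question, then place them in the prompt so the model answers from them rather than from its internal knowledge.  This is a toy sketch under stated assumptions.  The facts, the stopword list, and the helper names tokenize, retrieve and build_prompt are hypothetical, and simple keyword overlap stands in for the embedding search that LangChain or LlamaIndex would perform.

```python
import re

# Hypothetical curated facts, standing in for knowledge graph content.
FACTS = [
    "The Trail Running Shoe is available in sizes 7 through 13.",
    "Orders over 50 dollars ship free within the United States.",
    "Loyalty members earn 2 points per dollar spent.",
]

# Small assumed stopword list so common words do not count as matches.
STOPWORDS = {"the", "a", "an", "is", "in", "what", "does", "do", "of"}


def tokenize(text):
    """Lowercase the text and return its set of non-stopword tokens."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {w for w in words if w not in STOPWORDS}


def retrieve(question, facts, top_k=2):
    """Return up to top_k facts sharing at least one token with the question."""
    q = tokenize(question)
    scored = [(len(q & tokenize(fact)), fact) for fact in facts]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [fact for score, fact in scored[:top_k] if score > 0]


def build_prompt(question, facts):
    """Assemble a prompt that grounds the answer in the retrieved facts."""
    context = "\n".join(f"- {fact}" for fact in retrieve(question, facts))
    return (
        "Answer using only the facts below. "
        "If the facts are not enough, say you do not know.\n"
        f"Facts:\n{context}\n"
        f"Question: {question}\nAnswer:"
    )


prompt = build_prompt("What sizes does the trail running shoe come in?", FACTS)
print(prompt)
```

The resulting prompt string would then be sent to whichever LLM is in use; that call is left out here because it depends on the engine chosen.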

Generate SPARQL queries in real time.
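
One hedged way to picture real-time query generation is a template that is filled in from parameters extracted from the customer's question.  The sketch below only builds the query text; it does not run it against a graph.  The function name products_under_price is hypothetical, and the choice of schema.org properties (category, offers, price) is an assumption about how the product graph might be modeled.

```python
def escape_literal(value):
    """Escape backslashes and double quotes for a SPARQL string literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def products_under_price(category, max_price, limit=10):
    """Build a SPARQL query for products in a category under a price cap."""
    return f'''PREFIX schema: <https://schema.org/>
SELECT ?name ?price WHERE {{
  ?product a schema:Product ;
           schema:name ?name ;
           schema:category "{escape_literal(category)}" ;
           schema:offers ?offer .
  ?offer schema:price ?price .
  FILTER(?price <= {float(max_price)})
}}
ORDER BY ?price
LIMIT {int(limit)}'''


# Hypothetical parameters, as if extracted from "show me shoes under $100".
print(products_under_price("shoes", 100))
```

Using a fixed template with escaped parameters, rather than letting the LLM write the whole query, is a design choice that keeps the generated SPARQL predictable; an LLM could instead be asked only to pick the template and supply the parameter values.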

implementation of this approach

https://github.com/tomasonjo/blogs/blob/master/llm/Neo4jOpenAIApoc.ipynb


Knowledge Graph Completion Models

Neo4j unveils generative AI features for Google Cloud Vertex

example of using graph with google ai

https://neo4j.com/labs/apoc/5/ml/vertexai/


step by step implementation

https://neo4j.com/blog/use-graphs-for-smarter-ai-with-neo4j-and-google-cloud-vertex-ai/


Data graph catalog


Knowledge Graph Conference

LinkedIn source of data


Code examples

https://gpt-index.readthedocs.io/en/v0.6.9/examples/index_structs/knowledge_graph/KnowledgeGraphDemo.html