search: show me houses on a lake that are for sale in Littleton
Google smartly returns me a link like "https://www.zillow.com/littleton-co/waterfront/"
And then zillow.com shows me a bunch of homes that are not on the water.
So, how will future applications both understand the customer request, match that request with their data, and then carry on a meaningful interaction?
This is where semantic metadata comes to the rescue.
The application needs to understand that a spatial feature, which is represented by a bunch of GPS points, is a body of water and more specifically a lake. It needs to understand that the listing for sale is a home. And that the listing has a property boundary that is next to the lake.
The app, together with all its friend apps, has to collect, calculate, and infer information relating to "Listing XYZ":
"Listing XYZ" "is next to" "a lake"
"Listing XYZ" "is a" "home"
"Listing XYZ" "is in" "Littleton"
"Listing XYZ" "is in state of" "For Sale"
Each of these statements is referred to as a Triple: a Subject (Listing XYZ), a Predicate (is next to), and an Object (a lake). Triples are the building blocks of knowledge graphs, which give search and machine learning systems structured facts to work with.
These triples map onto the Entity Attribute Value (EAV) model, in which the subject is the entity, the predicate is the attribute, and the object is the value. In RDF, subjects and predicates (and often objects) are identified by a Uniform Resource Identifier, or URI. URIs look like web page addresses. Together the triples form a graph: subjects and objects are the nodes, and predicates are the links between them. link
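To make the triple idea concrete, here is a minimal Python sketch of the "Listing XYZ" statements above, stored as plain tuples with a wildcard pattern match. The `https://example.com/` URIs, the predicate names, and the helper functions are all made up for illustration; a real system would use a triple store and published vocabulary terms.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Hypothetical namespace; these URIs are placeholders, not real identifiers.
EX = "https://example.com/"

TRIPLES = [
    Triple(EX + "listing/XYZ", EX + "isNextTo", EX + "place/a-lake"),
    Triple(EX + "listing/XYZ", EX + "isA", EX + "type/Home"),
    Triple(EX + "listing/XYZ", EX + "isIn", EX + "place/Littleton"),
    Triple(EX + "listing/XYZ", EX + "isInStateOf", EX + "status/ForSale"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


def subjects_matching_all(triples, conditions):
    """Return the subjects that satisfy every (predicate, object) condition."""
    result = None
    for predicate, obj in conditions:
        found = {t.subject for t in match(triples, predicate=predicate, obj=obj)}
        result = found if result is None else result & found
    return result or set()


# "Houses on a lake that are for sale in Littleton" as three conditions.
hits = subjects_matching_all(TRIPLES, [
    (EX + "isNextTo", EX + "place/a-lake"),
    (EX + "isIn", EX + "place/Littleton"),
    (EX + "isInStateOf", EX + "status/ForSale"),
])
print(hits)
```

The point of the sketch is that the search request becomes a set of (predicate, object) conditions, and answering it is a matter of intersecting the subjects that satisfy each one.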
For this magic to happen the app, and its friends, need to share a "common language". This common language has to extend to the actual person asking the question.
Or, you can just rely on Google to do the smarts for you. The problem comes when your application actually has to do something. Like in the example above, the Zillow site just didn't have the smarts to take the customer forward with their request.
Here are some challenges facing most ecommerce environments as defined by a group of smart folks link
existing product data are not suitable for automated processing
product data often lack interoperability between siloed environments
insufficient use of unique product identifiers
heterogeneity of product category taxonomies
incomplete, inconsistent, or outdated product descriptions
weakness of current product recommender systems
These are actually really big issues when you need to go beyond a simple ecommerce site with a siloed product catalog. Addressing these issues is at the heart of this discussion.
Let's get started.
Google quote: “At its core, Search is about understanding language.”
In 2011 Google, in collaboration with Microsoft and Yahoo, introduced the Schema.org vocabulary. Your application creates "Semantic Metadata" that uses the Schema.org vocabulary.
Here is a good overview of "Semantic metadata" link
Google introduced the Knowledge Graph to allow apps to incorporate these language capabilities into their content (things not strings) link
transform data into standard language/terms (schema.org ontology)
draw connections between standard terms (RDF Triple: subject => predicate => object)
term: application of ontologies, meaning formal vocabularies that define concepts, data objects, and the relationships between them
The examples below turn HTML into semantic HTML and JSON into semantic JSON. These are two relatively simple practices that make the data much more useful to applications.
Web3 HTML5 semantic markup
<div itemscope itemtype="https://schema.org/Organization">
  <span itemprop="name">WordLift</span>
  <img src="logo.jpg" itemprop="logo" alt="WordLift's logo" />
  WordLift's home page:
  <a href="https://wordlift.io" itemprop="url">wordlift.io</a>
</div>
JSON-LD
{
  "@context": {
    "@vocab": "https://schema.org/"
  },
  "@type": "Person",
  "givenName": "John",
  "familyName": "Doe"
}
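Applied to the house-by-the-lake example, the same idea might look like the Python sketch below, which builds a JSON-LD document for a listing. The function name and the lake name are made up, and choosing the Schema.org terms `House`, `LakeBodyOfWater`, and `geoTouches` to express "is next to a lake" is my assumption about a reasonable mapping, not an established convention for real estate listings. The "for sale" state is left out to keep the sketch short.

```python
import json


def listing_to_jsonld(listing_name, city, lake_name):
    """Build a JSON-LD document describing a home listing next to a lake."""
    return {
        "@context": "https://schema.org",
        "@type": "House",
        "name": listing_name,
        "address": {
            "@type": "PostalAddress",
            "addressLocality": city,
        },
        # Assumption: geoTouches is this sketch's stand-in for "is next to".
        "geoTouches": {
            "@type": "LakeBodyOfWater",
            "name": lake_name,
        },
    }


# "Example Lake" is a placeholder name, not a real data point.
doc = listing_to_jsonld("Listing XYZ", "Littleton", "Example Lake")
print(json.dumps(doc, indent=2))
```

Because the output is ordinary JSON, it can be embedded in a page or returned from an API without changing how the rest of the application handles the data.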
Here is a good overview of how web content markup is used in eCommerce sites. link
There are numerous large data sources to link your data to. In the house-by-the-lake example above, you can link the lake to your house by referencing the GeoNames database.
GeoNames is a database with over 25 million geographical entities: states, regions, cities, municipalities, and places of interest such as villas and monuments. link
There are also tools that help you decorate your content with this semantic data. link link
But, your main source will be data you manage within your system. This discussion will focus on commerce solutions.
The hardest part of this process is having information that has integrity and supports a single ontology across all its data silos. In some cases it is important to have a unified ontology across the entire industry (e.g. brickschema.org) so that customers and industry partners speak a common language.
This is where composability and Packaged Business Capabilities (PBCs) come in.
Your front-end reaches out to its friends, the various PBCs, for the "Triples".
Product Asset Service PBC knows all about available products and services
Identity Service PBC knows all about the person
Commerce PBC knows all about the customer, customer offers, customer cart, and checkout
The Product Asset Service is responsible for the management of NFT Assets on the blockchain. It also provides a Knowledge Base used for NFT Asset smart search. This service leverages Semantic Ontology concepts being developed by POC4COMMERCE, which is part of the European Union ONTOCHAIN Project. link This service is essential to establishing a common e-Commerce ontology that will support smart search (machine learning and artificial intelligence).
Here is a typical flow supported by the Product Asset Service.
Provider publishes Smart Contract Factory used to mint NFT Assets
Provider mints NFT Asset via the Factory. Provider pushes NFT Asset Triples to Knowledge base. NFT Address and TokenId are included.
Customer searches Knowledge base for NFT Assets. Knowledge base uses NFT Asset Triples in search.
Customer selects NFT Asset and makes an offer for it.
Provider accepts offer. Knowledge base is updated. Provider transfers NFT to Customer
Customer gets confirmation.
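The Knowledge Base side of the flow above can be sketched in a few lines of Python. This is only an in-memory stand-in: the `KnowledgeBase` class, the predicate names, and the placeholder NFT address are all hypothetical, and the on-chain steps (publishing the factory, minting, transferring the NFT) are represented by comments only.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeBase:
    """In-memory stand-in that stores (subject, predicate, object) triples."""
    triples: set = field(default_factory=set)

    def push(self, subject, predicate, obj):
        self.triples.add((subject, predicate, obj))

    def replace(self, subject, predicate, obj):
        """Set a single-valued predicate, dropping any previous value."""
        self.triples = {
            t for t in self.triples
            if not (t[0] == subject and t[1] == predicate)
        }
        self.push(subject, predicate, obj)

    def search(self, **conditions):
        """Return subjects whose triples satisfy every predicate=object pair."""
        subjects = {s for s, _, _ in self.triples}
        return sorted(
            s for s in subjects
            if all((s, p, o) in self.triples for p, o in conditions.items())
        )


# Placeholder values; a real NFT address and token id come from the mint.
NFT_ADDRESS = "0x0000000000000000000000000000000000000001"
TOKEN_ID = "42"
asset = f"{NFT_ADDRESS}/{TOKEN_ID}"

kb = KnowledgeBase()

# Provider mints the asset (on-chain, not shown) and pushes its triples.
kb.push(asset, "nftAddress", NFT_ADDRESS)
kb.push(asset, "tokenId", TOKEN_ID)
kb.push(asset, "isA", "home")
kb.push(asset, "isNextTo", "a lake")
kb.push(asset, "owner", "provider")
kb.push(asset, "status", "for sale")

# Customer searches the Knowledge Base using the asset triples.
results = kb.search(isA="home", isNextTo="a lake", status="for sale")

# Provider accepts the offer: update the Knowledge Base, then transfer (not shown).
kb.replace(asset, "status", "sold")
kb.replace(asset, "owner", "customer")
```

Keeping the NFT address and token id as triples on the asset is what lets a search hit be traced back to the on-chain token.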
Brick provides an abstract ontology for buildings
brickschema.org tries to create an abstraction
Abstract entities of given types can be related to each other
Classes, Entities, Relationships
Graph
Nodes (things) - asset, building
Edges (relationships) - location, control, composition
Resource Description Framework (RDF) - RDF Triple
Way of describing a Graph
Storing in a database
Google and the OGC have been working on geospatial ontology specifications for over 25 years. I developed a battlespace awareness solution on OGC ontology 20 years ago. To this day they debate the value of each of their "standards".
Our Product Asset Service and its Ethereum ERC-721 collection extension are loosely based on the concepts within the OGC API specification released several years ago.
Commerce solutions use a very different taxonomy approach. Products and Attributes are at the heart of these solutions, and the taxonomy has to be very abstract so that it can support many domains. Features like Product/Variant/Asset definition, Faceted Search and Navigation, Dynamic Pricing, Product Comparison, Attribute Based Pricing, and Order Management rely on entities that support highly sparse attribute sets.
Entity - Attribute - Value (EAV Model), sometimes referred to as "Open Schema" link
Entity Type
Attribute Type
Neural Matching
Semantic Triple, RDF Triple or just Triple
RDF
Row Modeling
also known as "generic data modeling" or "open schema",
https://en.wikipedia.org/wiki/Data_modeling#Generic_data_modeling
EAV model is for highly sparse, heterogeneous attributes
https://en.wikipedia.org/wiki/Entity%E2%80%93attribute%E2%80%93value_model
These are very abstract and represent all types of entities with varying attribute values
Product
Offer, Order
Custom and Entity-Attribute-Value (EAV) attributes
Custom Attributes, Extension Attributes/Properties
Dynamic Bundling, Dynamic Pricing, Dynamic Product Comparison on Attribute Type 1
Service A => Attribute Type 1
Service B => Attribute Type 1
Service C => Attribute Type 1
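As a rough illustration of the notes above, the Python sketch below stores sparse attributes as EAV rows and compares three services on one shared attribute type. The attribute names and values are invented for the example, and `pivot` and `compare` are hypothetical helpers, not part of any PIM or commerce product.

```python
from collections import defaultdict

# Each row is (entity, attribute, value). An entity only has rows for the
# attributes it actually carries, which is what makes the model sparse.
EAV_ROWS = [
    ("Service A", "monthly_price_usd", 30),
    ("Service A", "download_mbps", 100),
    ("Service B", "monthly_price_usd", 45),
    ("Service B", "support_hours", "24x7"),
    ("Service C", "monthly_price_usd", 25),
]


def pivot(rows):
    """Group EAV rows into one attribute dictionary per entity."""
    entities = defaultdict(dict)
    for entity, attribute, value in rows:
        entities[entity][attribute] = value
    return dict(entities)


def compare(rows, attribute):
    """Return (entity, value) pairs for one attribute, sorted by value."""
    pairs = [(e, v) for e, a, v in rows if a == attribute]
    return sorted(pairs, key=lambda pair: pair[1])


by_entity = pivot(EAV_ROWS)
price_comparison = compare(EAV_ROWS, "monthly_price_usd")
print(price_comparison)
```

The comparison only works because all three services use the same attribute type for price, which is the practical payoff of a shared taxonomy.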
Bluestone PIM
commercetools
PIMs all represent their data using Product Attributes
Attributes can be configured from many different data types, including numbers, text, and enums, as well as references to other objects and JSON documents
Contentstack
Contentful
Strapi
Prismic
Approach:
Accurately model your domain and build a robust information taxonomy
Align that taxonomy with industry standards
Build out your data and APIs to align with your domain-driven information taxonomy
APIs must adopt standard schema representations
APIs (discoverability)
Back to our house-by-the-lake use case.
First, let's look at the system data (providers) needed to make this happen.
House is on a piece of property.
Property has a plat (property line basemap) associated with a survey. But, this is not in GPS coordinates.
Property line has to be translated into GPS coordinates
Get bodies of water with GPS coordinates from third party OGC service. Get those that are tagged as Lakes.
Calculate the minimum distance between these two collections of GPS points.
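The last step can be sketched in Python with the haversine formula, assuming the property boundary and the lake shoreline are already available as lists of (latitude, longitude) points in degrees. The coordinates and the 50 meter "next to" threshold are made up for illustration. Comparing only the points is an approximation: it ignores the edges between them, and the brute-force loop would need a spatial index for large collections.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(p, q):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, p)
    lat2, lon2 = map(math.radians, q)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    a = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def min_distance_m(points_a, points_b):
    """Smallest point-to-point distance between two collections (brute force)."""
    return min(haversine_m(p, q) for p in points_a for q in points_b)


def is_next_to(boundary, shoreline, threshold_m=50.0):
    """Treat the property as next to the lake if any two points are close enough."""
    return min_distance_m(boundary, shoreline) <= threshold_m


# Made-up coordinates in the general area of Littleton, Colorado.
property_boundary = [(39.6100, -105.0200), (39.6100, -105.0195), (39.6104, -105.0195)]
lake_shoreline = [(39.6105, -105.0195), (39.6110, -105.0190)]

print(is_next_to(property_boundary, lake_shoreline))
```

When the result is true, the application can record the "Listing XYZ" "is next to" "a lake" triple so the distance does not have to be recomputed at search time.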
semantic types
field's primitive type such as BOOLEAN, STRING, NUMBER
A field with a NUMBER data type may semantically represent a currency amount or percentage and a field with a STRING data type may semantically represent a city
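One way to picture the difference is a field definition that carries both a primitive type and an optional semantic type, as in the Python sketch below. The `Field` class, the `Semantic` values, and the US-dollar formatting are assumptions made for this example, not a description of any particular product's schema.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Primitive(Enum):
    BOOLEAN = "BOOLEAN"
    STRING = "STRING"
    NUMBER = "NUMBER"


class Semantic(Enum):
    CURRENCY = "CURRENCY"
    PERCENT = "PERCENT"
    CITY = "CITY"


@dataclass(frozen=True)
class Field:
    name: str
    primitive: Primitive
    semantic: Optional[Semantic] = None


def render(field, value):
    """Format a value using its semantic type, falling back to plain text."""
    if field.semantic is Semantic.CURRENCY:
        return f"${value:,.2f}"  # assumes US dollars for this sketch
    if field.semantic is Semantic.PERCENT:
        return f"{value:.1%}"
    return str(value)


price = Field("list_price", Primitive.NUMBER, Semantic.CURRENCY)
discount = Field("discount", Primitive.NUMBER, Semantic.PERCENT)
city = Field("city", Primitive.STRING, Semantic.CITY)

print(render(price, 525000), render(discount, 0.25), render(city, "Littleton"))
```

Both `list_price` and `discount` are plain numbers to the database, but the semantic type tells the application how to display, compare, and search them.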
JSON schema vs. attribute value triples
Abstraction with dynamic set of attributes
product attribute schema
The display taxonomy leverages the attribute schema for faceted search and guided navigation
Traditional Product management w/ attributes
Product definition
https://support.google.com/merchants/answer/7052112?sjid=3635292693576216074-NA&visit_id=638218740854519227-731767354&rd=1