Why Semantic Metadata?

search: show me houses on a lake that are for sale in Littleton

Google smartly returns a link like "https://www.zillow.com/littleton-co/waterfront/"

And then zillow.com shows me a bunch of homes that are not on the water. 

So, how will future applications understand the customer request, match that request with their data, and then carry on a meaningful interaction?

This is where semantic metadata comes to the rescue.  

The application needs to understand that a spatial feature, which is represented by a bunch of GPS points, is a body of water and more specifically a lake.  It needs to understand that the listing for sale is a home.  And that the listing has a property boundary that is next to the lake.
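
How an application might decide that a property boundary "is next to" a lake can be sketched in a few lines. This is a minimal illustration, not a real GIS workflow: the coordinates, the 50 m threshold, and the function names are all assumptions, and comparing vertices only approximates the true distance between two shapes.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(a, b):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def is_next_to(boundary, shoreline, threshold_m=50.0):
    """True if any boundary vertex lies within threshold_m of any shoreline vertex."""
    closest = min(haversine_m(p, q) for p in boundary for q in shoreline)
    return closest <= threshold_m


# Made-up GPS points, for illustration only.
LAKE_SHORELINE = [(39.6103, -105.0100), (39.6105, -105.0102), (39.6107, -105.0100)]
LISTING_XYZ = [(39.6100, -105.0100), (39.6102, -105.0100), (39.6102, -105.0098)]
LISTING_FAR = [(39.6300, -105.0100), (39.6302, -105.0100), (39.6302, -105.0098)]

print(is_next_to(LISTING_XYZ, LAKE_SHORELINE))
print(is_next_to(LISTING_FAR, LAKE_SHORELINE))
```

The result of a check like this is what the application would record as a fact about the listing, rather than recomputing it for every search.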

The app, with all its friend apps, has to collect, calculate and interpolate information relating to "Listing XYZ":

This data is referred to as a Subject (Listing XYZ), Predicate (is next to), Object (a lake) triple. Triples like this are the foundation of knowledge graphs and of the machine reasoning built on top of them.

These triples follow an Entity-Attribute-Value (EAV) pattern, in which the subject is the entity, the predicate is the attribute, and the object is the value. Each part of a triple can be identified by a Uniform Resource Identifier, or URI (an object may also be a plain literal value). URIs look like web page addresses. Together, triples form a graph: subjects and objects are the nodes, and predicates are the links between them. link
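
To make the triple idea concrete, here is a minimal sketch in Python that stores triples as plain tuples and answers the "homes next to a lake" question by pattern matching. The `https://example.com/` URIs and the `isA` / `isNextTo` predicates are invented for illustration; a real system would use a shared vocabulary and a proper triple store.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Hypothetical URIs, invented for this example only.
EX = "https://example.com/"
IS_A = EX + "vocab/isA"
IS_NEXT_TO = EX + "vocab/isNextTo"
HOME = EX + "vocab/Home"
LAKE = EX + "vocab/Lake"

TRIPLES = [
    Triple(EX + "listing/XYZ", IS_A, HOME),
    Triple(EX + "listing/XYZ", IS_NEXT_TO, EX + "place/lake-1"),
    Triple(EX + "place/lake-1", IS_A, LAKE),
    Triple(EX + "listing/ABC", IS_A, HOME),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


def homes_next_to_a_lake(triples):
    """Join three patterns: X isA Home, X isNextTo Y, Y isA Lake."""
    homes = {t.subject for t in match(triples, predicate=IS_A, obj=HOME)}
    lakes = {t.subject for t in match(triples, predicate=IS_A, obj=LAKE)}
    return sorted(
        t.subject for t in match(triples, predicate=IS_NEXT_TO)
        if t.subject in homes and t.obj in lakes
    )


print(homes_next_to_a_lake(TRIPLES))
```

Only Listing XYZ is returned, because Listing ABC has no `isNextTo` triple pointing at something that is a lake.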


How does your application get Smart?

For this magic to happen the app, and its friends, need to share a "common language".  This common language has to extend to the actual person asking the question.

Or, you can just rely on Google to do the smarts for you.  The problem comes when your application actually has to do something.  Like in the example above, the Zillow site just didn't have the smarts to take the customer forward with their request.

Here are some challenges facing most ecommerce environments as defined by a group of smart folks link

These are actually really big issues when you need to go beyond a simple ecommerce site with a siloed product catalog. Addressing these issues is at the heart of this discussion.

Let's get started.

Google quote: “At its core, Search is about understanding language.”

About 13 years ago, Google, collaborating with Microsoft and Yahoo, introduced the Schema.org vocabulary. Your application creates "Semantic Metadata" that leverages the Schema.org language.

Here is a good overview of "Semantic metadata" link 



Google introduced the Knowledge Graph to allow apps to incorporate these language capabilities into their content (things not strings) link 


term: application of ontologies — formal vocabularies to define concepts, data objects and the relationships between them

Below are examples of transforming HTML into semantic HTML and JSON into semantic JSON. These are two relatively simple practices that make this data much more useful to applications.

Web3 HTML5 semantic markup

<div itemscope itemtype="https://schema.org/Organization">
  <span itemprop="name">WordLift</span>
  <img src="logo.jpg" itemprop="logo" alt="WordLift's logo" />
  WordLift's home page:
  <a href="https://wordlift.io" itemprop="url">wordlift.io</a>
</div>

JSON-LD

{
  "@context": {
    "@vocab": "https://schema.org/"
  },
  "@type": "Person",
  "givenName": "John",
  "familyName": "Doe"
}
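
The same approach extends to the lake house listing. The sketch below builds a JSON-LD document in Python using Schema.org terms. `SingleFamilyResidence`, `LakeBodyOfWater` and `geoTouches` are Schema.org names, but choosing them to model "a home next to a lake" is my assumption, and the function name and sample values are hypothetical.

```python
import json


def listing_json_ld(listing_name, lake_name):
    """Describe a home that touches a lake, using Schema.org vocabulary."""
    return {
        "@context": "https://schema.org",
        "@type": "SingleFamilyResidence",
        "name": listing_name,
        # geoTouches expresses that two places share a boundary.
        "geoTouches": {"@type": "LakeBodyOfWater", "name": lake_name},
    }


doc = listing_json_ld("Listing XYZ", "Example Lake")
print(json.dumps(doc, indent=2))
```

The printed JSON can be embedded in a page inside a `<script type="application/ld+json">` element, which is the usual way to publish JSON-LD alongside HTML.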


Here is a good overview of how web content markup is used in eCommerce sites. link


So, where does my front-end application get the "Triples" needed to markup my content?

There are numerous large data sources to link your data to. Take the lakeside house example above: you can link the lake to your house by referencing the GeoNames database.

GeoNames is a database with over 25 million geographical entities: states, regions, cities, municipalities, places of interest such as villas, monuments, etc. link
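
Linking a local record to an external source usually means attaching that source's URI to your own data. The sketch below adds a `sameAs` link to a lake record. The `https://sws.geonames.org/<id>/` pattern is how I understand GeoNames feature URIs to look, so treat it as an assumption; `0000000` is a placeholder, not a real GeoNames ID, and the function name is hypothetical.

```python
# Assumed URI pattern for GeoNames features (verify against GeoNames documentation).
GEONAMES_URI = "https://sws.geonames.org/{geoname_id}/"


def link_to_geonames(record, geoname_id):
    """Return a copy of the record with a sameAs link to a GeoNames feature."""
    linked = dict(record)  # leave the caller's record untouched
    linked["sameAs"] = GEONAMES_URI.format(geoname_id=geoname_id)
    return linked


lake = {"@type": "LakeBodyOfWater", "name": "Example Lake"}
linked_lake = link_to_geonames(lake, "0000000")  # placeholder ID
print(linked_lake["sameAs"])
```

Once the lake carries a shared URI, any other application that references the same URI is talking about the same lake.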

There are also tools that help you decorate your content with this semantic data. link link

But your main source will be data you manage within your system. This discussion will focus on commerce solutions.

The hardest part of this process is having information that has integrity and supports a single ontology over all its data silos. In some cases it is important to have a unified ontology over the entire industry (e.g. brickschema.org) so that customers and industry partners speak a common language.

This is where composability and Packaged Business Capabilities (PBCs) come in.

Your front-end reaches out to its friends, the various PBCs, for the "Triples".


Product Asset Service

Product Asset Service is responsible for the management of NFT Assets on the blockchain. The Product Asset Service also provides a Knowledge Base used for NFT Asset smart search. This service leverages Semantic Ontology concepts being developed by POC4COMMERCE, which is part of the European Union ONTOCHAIN Project. link  This service is essential to establishing a common e-Commerce ontology that will support smart search (machine learning and artificial intelligence).

 Here is a typical flow supported by the Product Asset Service.


brickschema.org ontology overview

Brick provides an abstract ontology for buildings

brickschema.org tries to create an abstraction

Classes, Entities, Relationships

Resource Description Framework (RDF) - RDF Triple


Geospatial ontology overview

Google and the OGC have been working on geospatial ontology specifications for over 25 years.  I developed a battlespace awareness solution on OGC ontology 20 years ago.  To this day they debate the value of each of their "standards".

Our Product Asset Service and Ethereum ERC-721 collection extension are loosely based on the concepts within the OGC API specification released several years ago.



Other models...

Commerce solutions use a very different taxonomy approach. Products and Attributes are at the heart of these solutions. Commerce solutions use a very abstract taxonomy that must support many domains. Features like Product/Variant/Asset definition, Faceted Search and Navigation, Dynamic Pricing, Product Comparison, Attribute Based Pricing, Order Mgt, etc. all rely on entities that support a highly sparse attribute set.

Entity, attribute, value model


Entity - Attribute - Value (EAV Model), sometimes referred to as  "Open Schema" link
Entity Type
Attribute Type

Neural Matching

Semantic Triple, RDF Triple or just Triple


RDF

Row Modeling

also known as  "generic data modeling" or "open schema", 

https://en.wikipedia.org/wiki/Data_modeling#Generic_data_modeling

The EAV model is for highly sparse, heterogeneous attributes


https://en.wikipedia.org/wiki/Entity%E2%80%93attribute%E2%80%93value_model

These are very abstract and represent all types of entities with varying attribute values
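
A small sketch may help show why EAV copes well with sparse, heterogeneous attributes. Every fact is one (entity, attribute, value) row, so a house and a boat can share one table without any empty columns. The entity IDs and attribute names below are invented for illustration.

```python
from collections import defaultdict

# One row per fact: (entity, attribute, value). Sample data is made up.
EAV_ROWS = [
    ("house-1", "bedrooms", 3),
    ("house-1", "waterfront", True),
    ("boat-7", "length_ft", 22),
    ("boat-7", "engine", "outboard"),
]


def pivot(rows):
    """Group EAV rows into one attribute dictionary per entity."""
    entities = defaultdict(dict)
    for entity, attribute, value in rows:
        entities[entity][attribute] = value
    return dict(entities)


def entities_with(rows, attribute, value):
    """Find entities that carry a given attribute value."""
    return sorted({e for e, a, v in rows if a == attribute and v == value})


print(pivot(EAV_ROWS)["house-1"])
print(entities_with(EAV_ROWS, "waterfront", True))
```

Adding a new attribute type is just a new row; no schema change is needed, which is the trade-off EAV makes against type safety and simple queries.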

Product

Offer, Order

Custom and Entity-Attribute-Value (EAV) attributes 

Custom Attributes, Extension Attributes/Properties


Dynamic Bundling, Dynamic Pricing, Dynamic Product Comparison on Attribute Type 1

Service A => Attribute Type 1

Service B => Attribute Type 1

Service C => Attribute Type 1




MACH,  Packaged Business Capabilities

Bluestone PIM

commercetools

PIMs all represent their data using Product Attributes

The attributes can be configured using many different data types, including numbers, text, and enums, but also references to other objects and JSON documents


CMSs use GraphQL and a specific schema structure

Contentstack

Contentful

Strapi

Prismic


Approach:


APIs (discoverability)



Back to our House by the lake use-case.

First, let's look at the system data (providers) needed to make this happen.




semantic types


field's primitive type such as BOOLEAN, STRING, NUMBER

A field with a NUMBER data type may semantically represent a currency amount or a percentage, and a field with a STRING data type may semantically represent a city.
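
The difference between a primitive type and a semantic type can be sketched as a field definition that carries both. The enum members, field names, and the US-dollar formatting below are assumptions made for illustration; they are not taken from any particular product.

```python
from dataclasses import dataclass
from enum import Enum


class Primitive(Enum):
    BOOLEAN = "BOOLEAN"
    STRING = "STRING"
    NUMBER = "NUMBER"


class Semantic(Enum):
    NONE = "NONE"
    CURRENCY = "CURRENCY"
    PERCENT = "PERCENT"
    CITY = "CITY"


@dataclass(frozen=True)
class Field:
    name: str
    primitive: Primitive
    semantic: Semantic = Semantic.NONE


def render(field, value):
    """Format a value using the field's semantic type, not just its primitive type."""
    if field.semantic is Semantic.CURRENCY:
        return f"${value:,.2f}"  # assumes US dollars
    if field.semantic is Semantic.PERCENT:
        return f"{value:.1%}"
    return str(value)


price = Field("list_price", Primitive.NUMBER, Semantic.CURRENCY)
rate = Field("mortgage_rate", Primitive.NUMBER, Semantic.PERCENT)
city = Field("city", Primitive.STRING, Semantic.CITY)

print(render(price, 450000))
print(render(rate, 0.065))
print(render(city, "Littleton"))
```

Both `list_price` and `mortgage_rate` are plain numbers to the database, yet an application that knows their semantic types can display, compare, and search them correctly.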


JSON schema   vs.  attribute value triples

Abstraction with dynamic set of attributes



product attribute schema 

Product display taxonomies 

The display taxonomy leverages the attribute schema for faceted search and guided navigation 
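
As a rough sketch of how a display layer might use an attribute schema for faceted search, the code below counts the values of each schema attribute across a product list and then filters by selected facets. The schema, the product records, and the function names are all hypothetical.

```python
from collections import Counter

# Hypothetical attribute schema: attribute name -> expected Python type.
ATTRIBUTE_SCHEMA = {"city": str, "waterfront": bool, "bedrooms": int}

# Made-up listings; note that attributes may be missing (sparse data).
PRODUCTS = [
    {"id": "XYZ", "city": "Littleton", "waterfront": True, "bedrooms": 3},
    {"id": "ABC", "city": "Littleton", "waterfront": False, "bedrooms": 4},
    {"id": "DEF", "city": "Denver", "waterfront": True},
]


def facet_counts(products, schema):
    """Count how many products carry each value of each schema attribute."""
    counts = {attribute: Counter() for attribute in schema}
    for product in products:
        for attribute in schema:
            if attribute in product:
                counts[attribute][product[attribute]] += 1
    return counts


def filter_by(products, **selected):
    """Keep products whose attributes equal every selected facet value."""
    return [
        p for p in products
        if all(p.get(name) == value for name, value in selected.items())
    ]


print(facet_counts(PRODUCTS, ATTRIBUTE_SCHEMA)["city"])
print([p["id"] for p in filter_by(PRODUCTS, city="Littleton", waterfront=True)])
```

The facet counts drive the navigation sidebar, and each click narrows the result set through a filter like `filter_by`.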


Traditional Product management w/ attributes

Product definition

https://support.google.com/merchants/answer/7052112?sjid=3635292693576216074-NA&visit_id=638218740854519227-731767354&rd=1