Semantic Web Conquers E-Commerce Branch
1st October
(by our guest author Marco Steinhaeuser)
To be spoilt for choice at the bakery around the corner
Imagine a lovely Saturday, before noon: you want to do your beloved a favour, let her sleep and take a walk to the bakery around the corner. The spring sun shines through the freshly leafed trees, the city blackbirds sing in the shrubbery. Full of energy and looking forward to a good breakfast in bed, you enter your bakery of choice. “Ten rolls please!” you holler through the small shop, full of good Saturday morning mood. Instantly you have the full attention of the saleslady: “Sunday or evening rolls, wheat or rye mix, with grain or without, double or single rolls?” Be quick now. “Five regular, three with sesame and two with pumpkin seeds please.” A bit deflated, you leave the bakery and realize later that you didn’t ask for a loaf today…
What has a bakery to do with the Internet?
You probably can guess what this example is about.
Try to search for “camcorder” in the search engine of your choice and enjoy – believe it or not – 198,000,000 search results.
Unlike the saleslady in our bakery, the search engine doesn’t interact with you. You are expected to act and to narrow down your search term yourself. At the bakery it goes without saying that you want to buy something rather than just be informed about the products. A search engine has to be told this explicitly (“purchase camcorder online”), along with attributes such as product title, color, shape etc. Even then, your search may not be successful, or the right result may only show up on the third page. What’s the reason for this?
The Internet as a gigantic shredding machine
In the IT world, we try hard to store our data in a structured way. In the e-commerce sector, for example, we use databases for our ERP and our shopping cart software to handle products, prices, orders etc. in an efficient way. In an online shop, this data is read from the database and displayed in a browser. We polish this display down to the last dot on the i and work with graphic designers and usability experts to pitch our products to the visitors and potential customers of the site. However, we didn’t think about automatic processing at this point. So the original, cleanly structured data becomes a display without any structure, intended only for human eyes.
Every search engine works with crawlers: small programs that read the content of web sites and report back to the search engine. The search engine then processes this information and structures the data again, but so far we have had no influence on how it does so. On the basis of this structure, the search engine sets up heuristics that ultimately influence the ranking in its index.
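To illustrate why re-structuring display-only markup is guesswork, here is a minimal Python sketch. The HTML fragment, the product and the `guess_prices` helper are all made up for this example; real crawlers use far more elaborate heuristics that we do not know in detail.

```python
import re

# Hypothetical presentation-only markup: it looks fine to a human visitor,
# but nothing tells a machine which of the numbers is the current price.
PAGE = '<div class="box"><b>Pumpkin seed roll</b><br>Now only <i>0,45 €</i> (was 0,60 €)</div>'


def guess_prices(html):
    """Naive heuristic: strip the tags, then collect every number followed by a euro sign."""
    text = re.sub(r"<[^>]+>", " ", html)
    return re.findall(r"\d+,\d{2}(?=\s*€)", text)


candidates = guess_prices(PAGE)
print(candidates)  # two candidates; the crawler still has to guess which one applies
```

The heuristic finds both the current and the old price and has no reliable way of telling them apart. This is exactly the information that was still unambiguous in the shop database.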
Presenting data on a silver platter
In the past, we have already seen that presenting data in a more structured way on websites has a positive influence on search engine rankings: div-based sites with standardized headline markup via H1 tags are preferred to older table-based layouts.
This means we have to strike a balance between design and usability on the one hand and machine readability on the other. A possible solution could be the so-called Rich Snippets.
Amongst others, there are three formats of Rich Snippets accepted by search engines:
- microformats,
- microdata and
- RDFa.
In essence, these are markup annotations that crawlers can detect and interpret as defined structures.
Of course, the big search engines have to agree on a standard at this point. Today, as far as we know, Google, Yahoo and Bing have agreed to drop microformats. Although microformats were easy to implement, they lacked a vocabulary for mapping between the terminologies of different systems. Microdata uses schema.org as its default vocabulary and will be supported by the big search engines. However, schema.org is intended for much more general use on any kind of website. According to our research, the GoodRelations ontology (vocabulary) specializes in the e-commerce sector and is accepted by the search engines as a de facto standard.
Why reinvent markup when we have META tags?
Data in the META tags has to be entered by hand. Metadata such as Rich Snippets (RDFa + GoodRelations in this case), by contrast, points to the places on the page where this information already appears. This way, you use the existing pool of data effectively, without any additional maintenance effort. With GoodRelations there is a standardized ontology that does not fall prey to misuse through self-edited, possibly irrelevant information. Other relevant data that is not visible on a product detail page, such as shipping costs, can be exposed as well (assuming that only this one product is put in the shopping cart after the search).
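As a rough illustration of how such markup can be generated from data the shop already holds, consider the following Python sketch. The product record and the `render_offer` function are hypothetical, and this is not how OXID eShop implements the feature. Only the GoodRelations terms (`gr:Offering`, `gr:name`, `gr:description`, `gr:hasPriceSpecification`, `gr:UnitPriceSpecification`, `gr:hasCurrency`, `gr:hasCurrencyValue`) and the RDFa attributes come from the published vocabularies; details such as datatypes are left out for brevity.

```python
from html import escape

# Hypothetical product record, standing in for a row from the shop database.
product = {
    "name": "Pumpkin seed roll",
    "description": "Wheat roll topped with pumpkin seeds",
    "price": "0.45",
    "currency": "EUR",
}


def render_offer(item):
    """Render one product as HTML annotated with RDFa + GoodRelations (simplified)."""
    name = escape(item["name"])
    description = escape(item["description"])
    price = escape(item["price"])
    currency = escape(item["currency"])
    return (
        '<div xmlns:gr="http://purl.org/goodrelations/v1#" typeof="gr:Offering">\n'
        f'  <h1 property="gr:name">{name}</h1>\n'
        f'  <p property="gr:description">{description}</p>\n'
        '  <div rel="gr:hasPriceSpecification">\n'
        '    <span typeof="gr:UnitPriceSpecification">\n'
        f'      <span property="gr:hasCurrency" content="{currency}"></span>\n'
        f'      <span property="gr:hasCurrencyValue" content="{price}">{price} {currency}</span>\n'
        '    </span>\n'
        '  </div>\n'
        '</div>'
    )


html_snippet = render_offer(product)
print(html_snippet)
```

The visitor still sees an ordinary headline, description and price, while the attributes carry the same values in machine-readable form. Nothing is maintained twice: both views are rendered from the same record.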
Every one of us has already seen search results that include star ratings, maybe also prices or the song titles on a CD. This means the search engine can display important product data right in the results listing, data that we formerly saw only after clicking on a result. In online marketing terms: we expect the CTR (click-through rate) to increase while the bounce rate decreases dramatically. With search results displayed this way, you get customers who are really interested in you, your shopping platform or your products!
Search engine crawlers could be served content on a silver platter. Search engines could then store this content in a structured way as it is, without any additional checks.
The search engine’s heuristics would improve in the short term, and you can probably guess how you might be rewarded. Of course, your ranking could improve! We state this assumption very cautiously: so far there is hardly any experience or insider knowledge about it. Nevertheless, we believe in it ;)
The decision to build this feature into the standard delivery of OXID eShop can be seen as forward-thinking: not many shopping cart systems offer this functionality by default. Of course, modules and plugins are available for other e-commerce systems. But if nobody understands what they are for, these modules will never be used. In OXID eShop this new feature (like every other) has to be switched on explicitly. There is also a module for older OXID eShop installations that, for various reasons, could not be updated to the latest version.
But the most interesting idea is the following: if a search engine crawler is able to read structured data straight from the storefront, other services can probably do so as well. New business models could emerge for marketplaces that do not depend on CSV or XML exports. Price comparison services would no longer have to adapt their crawlers to new website designs etc. To put it plainly: entire industries could die out or be born with a global introduction of this new technology.
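How little such a service would need can be sketched in a few lines of Python. The storefront fragment and the `PropertyCollector` class below are invented for this example. The class is deliberately not a full RDFa parser: it only collects `property` attributes using the standard library’s `html.parser`, taking the `content` attribute if there is one and the element text otherwise.

```python
from html.parser import HTMLParser

# Hypothetical storefront fragment annotated with RDFa + GoodRelations.
PAGE = """
<div xmlns:gr="http://purl.org/goodrelations/v1#" typeof="gr:Offering">
  <h1 property="gr:name">Pumpkin seed roll</h1>
  <div rel="gr:hasPriceSpecification">
    <span typeof="gr:UnitPriceSpecification">
      <span property="gr:hasCurrency" content="EUR"></span>
      <span property="gr:hasCurrencyValue" content="0.45">0.45 EUR</span>
    </span>
  </div>
</div>
"""


class PropertyCollector(HTMLParser):
    """Collect RDFa property values from a page (simplified, no nesting or typing)."""

    def __init__(self):
        super().__init__()
        self.found = {}
        self._pending = None  # property still waiting for its element text

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        prop = attributes.get("property")
        if prop is None:
            return
        if "content" in attributes:
            # The machine-readable value wins over the visible text.
            self.found[prop] = attributes["content"]
            self._pending = None
        else:
            self._pending = prop

    def handle_data(self, data):
        if self._pending and data.strip():
            self.found[self._pending] = data.strip()
            self._pending = None

    def handle_endtag(self, tag):
        self._pending = None


collector = PropertyCollector()
collector.feed(PAGE)
offer = collector.found
print(offer)
```

The collector does not care about CSS classes, layout or the order of the elements, so a redesign of the shop would not break it as long as the annotations stay in place.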
By the way, according to The Wall Street Journal, Google has already announced that its search mechanisms will change fundamentally and will be developed step by step over the next few years towards semantic search. Be prepared!
Anyone who wants to know how RDFa + GoodRelations is set up in OXID eShop should take a look at the feature description on OXIDforge.
(Bakery image provided by Martin Hepp)