I am often asked by people with ideas for extending or enhancing Schema.org how they should go about it. These requests inevitably fall into two categories – either ‘How do I decide upon and organise my new types & properties and relate them to other vocabularies and ontologies?’ or ‘now I have my proposals, how do I test, share, and submit them to the Schema.org community?’
I touch on both of these areas in a free webinar I recorded for DCMI/ASIS&T a couple of months ago. It is the second in a two-part series, Schema.org in Two Parts: From Use to Extension. The first part covers the history of Schema.org and the development of extensions. That part is based upon my experiences applying and encouraging the use of Schema.org with bibliographic resources, including the setup and work of the Schema Bib Extend W3C Community Group – bibliographically focused but of interest to anyone looking to extend Schema.org.
To add to those webinars, the focus of this post is answering the ‘now I have my proposals, how do I test, share, and submit them to the Schema.org community?’ question. In later posts I will move on to how the vocabulary, its examples, and extensions are defined, and how to decide where and how to extend.
What skills do you need?
Not many. If you want to add to the vocabulary and/or examples you will naturally need some basic understanding of the vocabulary and the way you navigate around the Schema.org site, viewing examples etc. Beyond that you need to be able to run a few command line instructions on your computer and interact with GitHub. If you are creating examples, you will need to understand how Microdata, RDFa, and JSON-LD markup are added to HTML.
I am presuming that you want to do more than tweak a typo, which could be done directly in the GitHub interface, so in this post I step through the practice of working locally, sharing with others, and proposing your efforts via a GitHub Pull Request.
How do I start?
You need to set up the environment on your PC. This needs a local installation of Git, so that you can interact with the Schema.org source, and a local copy of the Google App Engine SDK to run your local copy of the Schema.org site. The following links should help you get these going.
- Git – https://git-scm.com/book/en/v2/Getting-Started-Installing-Git
- App Engine SDK – https://cloud.google.com/appengine/downloads – you need the Python version
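Once both are installed, it can be worth confirming that they are reachable from your command line. The following is only a minimal sketch: check_tool is a hypothetical helper name, and it assumes the SDK installer has put its dev_appserver.py and appcfg.py scripts on your PATH.

```shell
# check_tool is a hypothetical helper, not part of Git or the SDK.
# It reports whether a command can be found on the PATH.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found"
  else
    echo "$1: not found on PATH"
  fi
}

check_tool git               # needed to fetch and update the source
check_tool dev_appserver.py  # local server from the Python App Engine SDK
check_tool appcfg.py         # upload tool from the same SDK
```

If any of these report as not found, revisit the installation links above before continuing.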
Getting the Source
This is a two-step process. Firstly you need your own parallel fork of the Schema.org repository. If you do not have one yet, create a user account at GitHub.com. Accounts are free, unless you want to keep your work private.
Logged into GitHub, go to the Schema.org repository page – https://github.com/schemaorg/schemaorg – and select Fork. This will create a copy of the schemaorg repository under your account.
Create yourself a working area on your PC and via a command line/terminal window place yourself in that directory to run the following git command, with MyAccount being replaced with your Github account name:
git clone https://github.com/MyAccount/schemaorg.git
This will download and unwrap a copy of the code into a schemaorg subdirectory of your working directory.
Running a Local Version
In the directory where you downloaded the code, run the following command:
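The command itself did not survive in this copy of the post. As a sketch only: the Python App Engine SDK starts a local instance with its dev_appserver.py script, so, assuming the schemaorg/ subdirectory created by the clone above, the step would look something like this:

```shell
# Assumption: dev_appserver.py (from the Python App Engine SDK) is on your PATH
# and you are in the working directory that contains the schemaorg/ clone.
APP_DIR="schemaorg/"

if command -v dev_appserver.py >/dev/null 2>&1; then
  dev_appserver.py "$APP_DIR"   # serves the site until you stop it with Ctrl-C
else
  echo "dev_appserver.py not found; check the App Engine SDK installation"
fi
```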
This should result in the output at the command line that looks something like this:
The important line is the one telling you module “default” running at: http://localhost:8080. If you drop that web address into your favourite browser you should end up looking at a familiar screen.
Success! You should now be looking at a version that operates exactly like the live version, but is totally contained on your local PC. Note the message on the home page reminding you which version you are viewing.
Running a Shared Public Version
It is common practice to want to share proposed changes with others before applying them to the Schema.org repository in GitHub. Fortunately there is an easy, free way of running a Google App Engine instance in the cloud. To do this you will need a Google account, which most of us have. When logged in to your Google account visit this page: https://console.cloud.google.com
From the ‘Select a project’ menu, choose Create a project. Give your project a name – choose a name that is globally unique. There is a convention that we use names that start with ‘sdo-’ as an indication that it is a project running a Schema.org instance.
To ready your local code to be uploaded into the public instance you need to make a minor change in a file named app.yaml in the schemaorg directory. Use your favourite text editor to change the line near the top of the file that begins application so that its value is the same as the project name you have just created. Note that lines beginning with a ‘#’ character are commented out and have no effect on operation. For this post I have created an App Engine project named sdo-blogpost.
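If you would rather make that edit from the command line, something like the following would do it. This is a sketch under assumptions: set_application is a hypothetical helper name, and it relies on the live setting being a single uncommented line that starts with application:, as described above.

```shell
# set_application is a hypothetical helper, not part of the Schema.org tooling.
# It rewrites the uncommented "application:" line of an app.yaml file;
# lines starting with '#' do not match the pattern and are left alone.
set_application() {
  yaml_file="$1"
  project="$2"
  sed "s/^application:.*/application: $project/" "$yaml_file" > "$yaml_file.tmp" &&
    mv "$yaml_file.tmp" "$yaml_file"
}

# Usage, run from the directory that contains schemaorg/:
if [ -f schemaorg/app.yaml ]; then
  set_application schemaorg/app.yaml sdo-blogpost
  grep '^application:' schemaorg/app.yaml   # show the line that is now in effect
fi
```

Writing to a temporary file and moving it into place avoids relying on in-place editing options, which differ between versions of sed.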
To upload the code run the following command:
appcfg.py update schemaorg/
You should get output that indicates the upload process has happened successfully. Depending on your login state, you may find a browser window appearing to ask you to log in to Google. Make sure at this point you log in as the user that created the project.
To view your new shared instance go to the following address http://sdo-blogpost.appspot.com – modified to take account of your project name http://<project name>.appspot.com.
Working on the Files
I will go into the internal syntax of the controlling files in a later post. However, if you would like a preview, take a look in the data directory, where you will find a large file named schema.rdfa. This contains the specification for the core of the Schema.org vocabulary – for simple tweaks and changes you may find things self-explanatory. Also in that directory you will find several files that end in ‘-examples.txt’. As you might guess, these contain the examples that appear in the Schema.org pages.
Evolving and Sharing
How much you use your personal GitHub schemaorg repository fork to collaborate with like-minded colleagues, or just use it as a scratch working area for yourself, is up to you. However you choose to organise yourself, you will find the following git commands, which should be run when located in the schemaorg subdirectory, useful:
- git status – shows how your local copy differs from what has been committed
- git add <filename> – stages a new or changed file so that it is included in the next commit
- git commit <filename> – commits the named changed file to your local copy of the repository
- git commit -a – commits all changed files that Git is already tracking
- git push – uploads your local commits to your repository on GitHub
It is recommended to commit as you go.
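To see those commands working together, here is a rehearsal you can run safely in a scratch directory. It is only a sketch: a local bare repository stands in for your GitHub fork, and the file name, commit message, and identity are made up for the demonstration. In your real clone you would run just the status, add, commit, and push steps.

```shell
# Scratch area: a bare repository plays the part of your GitHub fork.
work=$(mktemp -d)
git init --bare -q "$work/fork.git"
git clone -q "$work/fork.git" "$work/schemaorg"
cd "$work/schemaorg"

# Make a change: a made-up examples file in the data directory.
mkdir -p data
echo "placeholder example" > data/demo-examples.txt

git status --short                  # reports data/ as untracked
git add data/demo-examples.txt      # stage the new file

# The -c options supply a throwaway identity for this demo only.
git -c user.name="Demo User" -c user.email="demo@example.org" \
  commit -q -m "Add demo example"   # record the change locally

git push -q origin HEAD             # upload the commit to the stand-in fork
git log --oneline                   # the commit that was just pushed
```

The point to notice is that the commit only records the change on your PC; it is the push that makes it visible in the fork.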
The mechanism for requesting a change of any type to Schema.org is to raise a Github Pull Request. Each new release of Schema.org is assembled by the organising team reviewing and hopefully accepting each Pull Request. You can see the current list of requests awaiting acceptance in Github. To stop the comments associated with individual requests getting out of hand, and to make it easier to track progress, the preferred way of working is to raise a Pull Request as a final step in completing work on an Issue.
Raising an Issue first enables discussion to take place around proposals as they take shape. It is not uncommon for a final request to differ greatly from an original idea after interaction in the comment stream.
So I suggest that you raise an Issue in the Schema.org repository for what you are attempting to solve. Try to give it a good explanatory Title, and explain what you intend in the comment. This is where the code in your repository and the appspot.com working version can be very helpful in explaining and exploring the issue.
When ready to request, take yourself to your repository’s home page to create a New Pull request. Providing you do not create a new branch in the code, any new commits you push to your repository will become part of that Pull Request. A very handy feature in the real world where inevitably you want to make minor changes just after you say that you are done!
Look out for the next post in this series – Working Within the Vocabulary – in which I’ll cover working in the different file types that make up Schema.org and its extensions.