I am often asked by people with ideas for extending or enhancing Schema.org how they should go about it. These requests inevitably fall into two categories – either ‘How do I decide upon and organise my new types & properties and relate them to other vocabularies and ontologies?’ or ‘Now I have my proposals, how do I test, share, and submit them to the Schema.org community?’
I touch on both of these areas in a free webinar I recorded for DCMI/ASIS&T a couple of months ago. It is the second in a two-part series, Schema.org in Two Parts: From Use to Extension. The first part covers the history of Schema.org and the development of extensions. That part is based upon my experiences applying and encouraging the use of Schema.org with bibliographic resources, including the set up and work of the Schema Bib Extend W3C Community Group – bibliographically focused but of interest to anyone looking to extend Schema.org.
To add to those webinars, the focus of this post is on answering the ‘Now I have my proposals, how do I test, share, and submit them to the Schema.org community?’ question. In later posts I will move on to how the vocabulary, its examples, and its extensions are defined, and how to decide where and how to extend.
This post was updated in June 2020 to reflect changes in the processes required to work with the Schema.org sources that have occurred over the preceding months and years.
What skills do you need
Not many. If you want to add to the vocabulary and/or examples, you will naturally need some basic understanding of the vocabulary and of the way you navigate around the Schema.org site, viewing examples etc. Beyond that, you need to be able to run a few command line instructions on your computer and interact with GitHub. If you are creating examples, you will need to understand how Microdata, RDFa, and JSON-LD markup is added to HTML.
I am presuming that you want to do more than tweak a typo, which could be done directly in the GitHub interface, so in this post I step through the practice of working locally, sharing with others, and proposing your efforts via a GitHub Pull Request.
How do I start
You need to set up the environment on your Linux or Mac OS X system. This needs a local installation of Git, so that you can interact with the Schema.org source, and a local copy of the Google App Engine SDK to run your local copy of the Schema.org site. The following links should help you get these going.
- Git – https://git-scm.com/book/en/v2/Getting-Started-Installing-Git
- App Engine SDK – https://cloud.google.com/appengine/downloads – you need the Python version
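Before going further, it can be worth confirming that the tools are actually on your PATH. The following is a minimal sketch rather than part of the Schema.org tooling: check_tool is a hypothetical helper, and the command names in the loop are assumptions, since the SDK's commands vary with the version you install (gcloud and dev_appserver.py are common ones).

```shell
#!/bin/sh
# Hypothetical helper: report whether a command is available on PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: missing"
        return 1
    fi
}

# Assumed command names; adjust them to match your own installation.
for tool in git python gcloud dev_appserver.py; do
    check_tool "$tool" || true   # keep going so every tool is reported
done
```

Anything reported as missing should be installed, or added to your PATH, before you continue.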
Getting the Source
Note: Many of the steps referenced here are documented in the SOFTWARE_README.md file located in the Schema.org repository. It is always worth checking that file for any updates.
This is a two-step process. Firstly, you need your own parallel fork of the Schema.org repository. If you do not yet have one, create a user account at Github.com. They are free, unless you want to keep your work private.
Logged into GitHub, go to the Schema.org repository page – https://github.com/schemaorg/schemaorg – and select Fork. This will create a copy of the schemaorg repository under your account.
Create yourself a working area on your PC and via a command line/terminal window place yourself in that directory to run the following git command, with MyAccount being replaced with your Github account name:
git clone --recurse-submodules https://github.com/MyAccount/schemaorg.git
This will download and unpack a copy of the code into a schemaorg subdirectory of your working directory.
Running a Local Version
In the directory where you downloaded the code, run the following command:
- In most circumstances, using the value ‘L’ (for local configuration files) and the default ‘N’ (for building site static files) will be sufficient.
- To ensure up-to-date supplementary files (data dump files, jsonld context, owl file), select ‘Y’.
This should result in output at the command line that looks something like this:
The important line is the one telling you that module “default” is running at http://localhost:8080. If you drop that web address into your favourite browser, you should end up looking at a familiar screen.
Success! You should now be looking at a version that operates exactly like the live version, but is entirely contained on your local PC. Note the message on the home page reminding you which version you are viewing.
Running a Shared Public Version
It is common practice to want to share proposed changes with others before applying them to the Schema.org repository in GitHub. Fortunately, there is an easy, free way of running a Google App Engine instance in the cloud. To do this you will need a Google account, which most of us have. When logged in to your Google account, visit this page: https://console.cloud.google.com
From the ‘Select a project’ menu, choose Create a project. Give your project a name that is globally unique. By convention, we use names that start with ‘sdo-’ to indicate that the project is running a Schema.org instance.
To upload the code run the following command:
- In most circumstances, using the value ‘L’ (for local configuration files) and the default ‘Y’ (for building site static files) will be sufficient.
- Version for release should be entered as relevant to the vocabulary release version (e.g. 3.8, 3.9, 4.0, 5.0, etc.)
- Project: This is entered as a valid name for a Google cloud project that you have write permission to access.
- Version: This is the version of code that is running within the project – this is different from the release version of Schema.org
- If a version that is already running in the appengine project is selected, you will be asked to confirm its overwrite.
- After upload you can choose to Exercise site (to pre-load caches) – this should only be necessary for uploading a new version to a busy site.
You should get output indicating that the upload has completed successfully. Depending on your login state, a browser window may appear asking you to log in to Google. Make sure at this point that you log in as the user that created the project.
To view your new shared instance, go to http://<project name>.appspot.com, replacing <project name> with your project name, for example http://sdo-blogpost.appspot.com.
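If you share several instances, a tiny helper can save retyping the address. This is only an illustrative sketch: sdo_site_url is a hypothetical function name, and the prefix check reflects nothing more than the ‘sdo-’ naming convention mentioned earlier.

```shell
#!/bin/sh
# Hypothetical helper: print the appspot.com address for a project name.
sdo_site_url() {
    case "$1" in
        sdo-*) ;;  # follows the 'sdo-' naming convention
        *) echo "note: '$1' does not start with 'sdo-'" >&2 ;;
    esac
    echo "http://$1.appspot.com"
}

sdo_site_url "sdo-blogpost"
```

The note about the prefix goes to standard error, so the printed address can still be captured cleanly, for example with url=$(sdo_site_url my-project).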
Working on the Files
I will go into the internal syntax of the controlling files in a later post. However, if you would like a preview, take a look in the data directory, where you will find a large file named schema.rdfa. This contains the specification for the core of the Schema.org vocabulary; for simple tweaks and changes you may find things self-explanatory. Also in that directory you will find several files whose names end in ‘-examples.txt’. As you might guess, these contain the examples that appear in the Schema.org pages.
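To get a quick overview of which example files are present, something like the following can be run from the command line. It is a rough sketch: list_example_files is a hypothetical helper, and it assumes only the layout described above (a data directory containing files ending in ‘-examples.txt’).

```shell
#!/bin/sh
# Hypothetical helper: list the example files in a schemaorg checkout.
# $1 is the path to the checkout (defaults to the current directory).
list_example_files() {
    for f in "${1:-.}"/data/*-examples.txt; do
        # If nothing matches, the pattern is left as-is, so test for existence.
        if [ -e "$f" ]; then
            basename "$f"
        fi
    done
}

list_example_files .
```

Run from inside the schemaorg subdirectory, this prints one file name per line; run anywhere else, it simply prints nothing.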
Evolving and Sharing
How much you use your personal GitHub schemaorg repository fork to collaborate with like-minded colleagues, or just use it as a scratch working area for yourself, is up to you. However you choose to organise yourself, you will find the following git commands, run from within the schemaorg subdirectory, useful:
- git status – shows how far your local copy is in step with your repository
- git add <filename> – adds a file to those being tracked against your repository
- git commit <filename> – commits a changed or added file to your local copy of the repository
- git commit -a – commits all changed tracked files to your local copy of the repository
- git push – uploads your commits to your repository on GitHub
It is recommended to commit as you go.
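The cycle of changing, adding, committing, and uploading can be tried out safely away from your real fork. The sketch below is an illustration only: it uses a throwaway bare repository as a stand-in for the GitHub fork, and the file name data/my-examples.txt, its contents, and the commit identity are all hypothetical.

```shell
#!/bin/sh
set -e

# Scratch area: a bare repository plays the part of the GitHub fork.
work=$(mktemp -d)
remote="$work/fork.git"
git init --quiet --bare "$remote"
git clone --quiet "$remote" "$work/schemaorg" 2>/dev/null
cd "$work/schemaorg"

# Create a new (hypothetical) examples file with placeholder content.
mkdir data
echo "placeholder example text" > data/my-examples.txt

git status --short              # lists data/ as untracked ("??")
git add data/my-examples.txt    # start tracking the file
# -c supplies a throwaway identity so the commit works on any machine.
git -c user.name="Example" -c user.email="example@example.org" \
    commit --quiet -m "Add example file"
git push --quiet origin HEAD    # upload the commit to the stand-in fork

# The commit message is now visible in the stand-in fork.
git --git-dir="$remote" log --all --format=%s
```

In your real working copy only the status, add, commit, and push lines apply; the rest is scaffolding to make the sketch self-contained.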
The mechanism for requesting a change of any type to Schema.org is to raise a Github Pull Request. Each new release of Schema.org is assembled by the organising team reviewing and hopefully accepting each Pull Request. You can see the current list of requests awaiting acceptance in Github. To stop the comments associated with individual requests getting out of hand, and to make it easier to track progress, the preferred way of working is to raise a Pull Request as a final step in completing work on an Issue.
Raising an Issue first enables discussion to take place around proposals as they take shape. It is not uncommon for a final request to differ greatly from an original idea after interaction in the comment stream.
So I suggest that you raise an Issue in the Schema.org repository for what you are attempting to solve. Try to give it a good explanatory Title, and explain what you intend in the comment. This is where the code in your repository and the appspot.com working version can be very helpful in explaining and exploring the issue.
When you are ready to make the request, go to your repository’s home page and create a New Pull Request. Provided you do not create a new branch in the code, any new commits you push to your repository will become part of that Pull Request. This is a very handy feature in the real world, where inevitably you want to make minor changes just after you say that you are done!
Look out for the next post in this series – Working Within the Vocabulary – in which I’ll cover working in the different file types that make up Schema.org and its extensions.