OER Roadmap : The First Quarter in Review

System-search

Goldilocks and the 3 Bears Investigation:
To start, it was important to look for an existing API to adapt for publishing OER that can support a learning ecosystem around open education repositories (especially delicious remixable ones). Like Goldilocks, I found a cottage in the woods with several to try. WebDAV, CMIS, and gdata are all interesting and well-established protocols for publishing to the web, but they are much too hot, or in other words they are either too complex for the task at hand, or too specific to particular services. Next I tried Atom Publishing Protocol, but unfortunately too cool — it was too general to specify the work flow natural to publishing packages of learning content. I found two bowls of very, very similar and tasty looking publishing APIs, called Simple Publishing Interface and Simple Webservice Offering Repository Deposit. The bowl I found most satisfying will become evident as the story unravels.

Birth of an OER Roadmap wiki and reserved parking spot for code (March 14): After some investigation of different hosting sites, I chose Google Code, because it had a lot of functionality that was easily available, and a very small learning curve. Welcome to OER-Roadmap.

Hewlett OER Grantees Meeting and Wikimedia: (Mar 29 – Apr 1) I went to the Hewlett Foundation’s annual OER grantees meeting and spent time reconnecting with old friends and meeting new ones regarding the broad goals of this fellowship and the potential to help catalyze education content production and consumption. After the grantees meeting I met with Erik Möller of Wikimedia about potential Wikipedia/WikiEducator bridges to Connexions and the potential for implementing the API in Wikimedia projects.

I attended the NITLE Summit (April 6,7) where liberal arts college leaders think about education in a digital age. John Seely Brown’s keynote (my notes here) on educating for change provided a thought-provoking challenge regarding the kind of education needed when content changes constantly. I participated in the OER workshop led by Hal Plotkin of the US Department of Education.

Cataloging the current Connexions’ API: (April 14th on) : Because Connexions software provides a publishing platform that supports “frictionless remix”, its functionality is a good model for the actions that a publishing API for OER should support. With the help of Connexions’ Systems Engineer, we catalogued the data and metadata that is available, the current publishing implementation, and ideas for how to build licensing, versioning, derived copies, and authorship roles into an API. Those details are found on the Roadmap Wiki here.

Connexions’ Google Summer of Code projects were accepted and two students were chosen (April 25th on): Both of Connexions’ Google Summer of Code projects have the potential to increase OER production in open repositories.

  1. Creating a Google Docs editor for Connexions would result in a simple pathway for authors to produce content, and the publishing API from this fellowship would allow docs authors to push their content in from wherever they create it. 
  2. The second project, Enhanced Author Profiles and Kudos, is also relevant to APIs for OER, because it will make it much easier for authors to advertise their publication of open education materials from Connexions.

Sprinting at the Plone East Symposium (May 19-22): We sprinted (communal coding) to extend an existing partial publishing implementation in Connexions. The extension allows creation of modules from deposited Word files or CNXML (Connexions semantic document format) files and improves the handling of metadata (title, language, etc). With the help of fellow fellow, Mark Horner, we found and invited Carl Scheffler, whose background is in machine learning, and whose interests include improving education, to participate in the sprint, along with Connexions’ own Phil Schatz and Ross Reedstrom, with Penn State’s team Mike Halm and Michael Mulich advising. The sprint planning is here and the full description of the day is here.

IMS Learning Impact and an emphasis on phased implementation strategy: (May 16-18) The IMS Learning Impact conference was the perfect place to corner, I mean get advice from, those with experience creating APIs and making pathways between software and services. Many thanks to Chuck Severance, Jeff Kahn, Gerry Hanley, Brad Felix and several folks I met at the conference. These discussions led to general approval of (minor spoiler alert) SWORD, and a planned phased approach to implementing it in Connexions (sooner is better than complete). A simple first cut at the API and an implementation in Connexions allows us to start building tools that use the API, which should generate interest and excitement as well as software that others can use, improve, and copy.

Open Repositories 2011: The SWORD Workshop: The Choice Revealed (June 7,8)
After meeting with the SWORD technical team, learning more about the second version of SWORD, and conferring with Connexions’ Systems Engineer, SWORD built on AtomPub became the clear winner among the API servings. (And now that Goldilocks metaphor officially ends.) SWORD is simple, flexible, popular, and has a head start in Connexions where we will test it first. The full reasoning is published in the blog entry before this one and a bit more detailed reasoning here.

Client investigations: Translation tools
So with the choice of SWORD, and V2 in particular, investigating potential clients is in full swing. Translating content allows multilingual domain experts to contribute to OER without creating something from scratch. Siyavula’s translation sprints for the Free High School Science textbooks demonstrate that people who know the subject and the language are willing to help out. Carl Scheffler is investigating a couple of different approaches to translation.  

The Specification of the API and implementation are underway. The Specification of an OER Publishing API extension of SWORD V2 has begun. The SWORD protocol should work as is, so the extensions should not change the protocol, but rather make use of the natural SWORD flexibility to include extra metadata and return repository specific information. For Connexions, the first phase of implementation of a SWORD V2 service will support creating, updating, publishing, versioning, and deriving copies of modules through the existing editing spaces (to hold the modules as they are being constructed). The following pages show the progress and ongoing work. 

Choosing a SWORD for publishing OER (a pen may be mightier but …)

Choosing or developing a standard way to publish open educational resources (OER) to libraries (repositories) that encourage remixability (sharing and adapting) was one of my main goals for the first three months of my fellowship with the Shuttleworth Foundation. Fortuitously, the way forward seems clear and smooth by using an existing publishing standard called SWORD.

For non-techies and techies alike, I highly recommend watching the video below that Cottage Labs made for SWORD V2. It is very short, clear, and quite nicely done. If you are a Connexions person, think of the package as a module (the document/page/topic itself, plus all the goodies like images, movies, sound clips, and handouts that the document contains). The package could also be an entire collection. If you work with other educational materials in learning management systems, the package could contain anything from a single PDF or geo-tagged image, to a whole common cartridge course.

As the video shows, SWORD is a simple protocol for depositing content into repositories. It is a specialization of the Atom Publishing Protocol which is itself widely used for publishing web content like blogs. After attending Open Repositories 2011, meeting some of the SWORD technical team (Stuart Lewis and Richard Jones), and getting a few technical details ironed out, SWORD looks like a winner. In particular, the second version (V2) has everything that we need for publishing open education resources.
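
Because SWORD builds on AtomPub, a client typically starts by reading a service document that lists the collections it may deposit into. Here is a minimal sketch of that first step in Python. The sample document and its `repository.example.org` URLs are made up for illustration; a real service document would come from the repository, and Connexions' own may look different.

```python
import xml.etree.ElementTree as ET

# Namespaces defined by AtomPub (RFC 5023) and Atom (RFC 4287).
APP_NS = "http://www.w3.org/2007/app"
ATOM_NS = "http://www.w3.org/2005/Atom"

# Hypothetical service document, for illustration only.
EXAMPLE_SERVICE_DOCUMENT = """
<service xmlns="http://www.w3.org/2007/app"
         xmlns:atom="http://www.w3.org/2005/Atom">
  <workspace>
    <atom:title>Example repository</atom:title>
    <collection href="http://repository.example.org/sword/workarea-1">
      <atom:title>Personal work area</atom:title>
    </collection>
    <collection href="http://repository.example.org/sword/workarea-2">
      <atom:title>Shared work area</atom:title>
    </collection>
  </workspace>
</service>
"""


def list_collections(service_document_xml):
    """Return (title, href) pairs for every collection in a service document."""
    root = ET.fromstring(service_document_xml.strip())
    collections = []
    for collection in root.iter(f"{{{APP_NS}}}collection"):
        # Only the collection's own title is read, not the workspace title.
        title = collection.findtext(f"{{{ATOM_NS}}}title", default="")
        collections.append((title, collection.get("href")))
    return collections


for title, href in list_collections(EXAMPLE_SERVICE_DOCUMENT):
    print(title, href)
```

Each `href` is a location a client could deposit to. In Connexions terms, these would plausibly map to the work areas where unpublished modules live, though that mapping is my assumption here.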

The reasons for choosing SWORD in a nutshell (well really in a blog).

  1. SWORD is simple, but not too simple. SWORD V2 will handle all the basics: finding locations to publish to, creating items, updating them, and signaling that they are ready to publish. And that is about it. SWORD doesn’t specify authentication and authorization, so you can use other standards for that. SWORD doesn’t get into details of organization (like creating and managing folders and such), so much of the complexity of CMIS-like systems is avoided. SWORD V2 specifies one simple packaging format (an Atom entry plus a zip) that must be supported, and then leaves all other packaging up to the repository to negotiate with the client. So creating a SWORD service is straightforward, which encourages lots of implementations. Client toolkits can provide lots of usable code, but clients do have to know a bit about how the repositories they want to use expect content to be formatted. But that is true anyhow. You can’t send a bunch of PDFs to Flickr, because it wants images, right?
  2. SWORD is flexible. SWORD provides specific returned URLs that can be used to give repository specific requirements like signing a license. Repository specific metadata can be added to the entry at will (with a nice namespace to keep them sorted). Repositories can choose to replace content or to create new versions.
  3. SWORD is popular. SWORD is implemented in many existing repositories (DSpace, EPrints, Fedora, arXiv, Zentity, Invenio), Open Journal Systems (OJS), and budding data repositories like Chem#. The US Government-led Learning Registry is also including a SWORD service for depositing metadata and paradata about learning materials. The SWORD working team includes many different organizations including JISC, UKOLN, the US Library of Congress, and the teams from all those implementors. Client toolkits are available in a variety of popular languages including Java, Python, Ruby, and PHP.
  4. SWORD has a head start in Connexions. Since I want to show how clients and services can make publishing OER easier while keeping them in remixable formats, implementing SWORD in Connexions is crucial to the process. Connexions already has a partial implementation of the first version of SWORD that was built for a very specific work flow with OJS. So a full implementation of SWORD has a head start in Connexions.
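
To make "creating items" and "signaling that they are ready to publish" a little more concrete, here is a rough Python sketch of what a client might prepare for a metadata-only SWORD V2 deposit: an Atom entry carrying Dublin Core terms, plus the `In-Progress` header that tells the repository whether more changes are coming. The function name `build_deposit` and the choice of metadata fields are mine, not part of any spec or of Connexions; a real entry would usually carry more (an id, an author, an updated date) and would be POSTed to a collection URL taken from the service document.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
DCTERMS_NS = "http://purl.org/dc/terms/"

# Use readable prefixes when the entry is serialized.
ET.register_namespace("atom", ATOM_NS)
ET.register_namespace("dcterms", DCTERMS_NS)


def build_deposit(title, language, in_progress=True):
    """Return (headers, body) for a minimal Atom entry deposit.

    in_progress=True asks the repository to keep the item open for
    further edits; in_progress=False signals that the deposit is complete.
    """
    entry = ET.Element(f"{{{ATOM_NS}}}entry")
    ET.SubElement(entry, f"{{{ATOM_NS}}}title").text = title
    ET.SubElement(entry, f"{{{DCTERMS_NS}}}language").text = language

    headers = {
        "Content-Type": "application/atom+xml;type=entry",
        "In-Progress": "true" if in_progress else "false",
    }
    body = ET.tostring(entry, encoding="utf-8")
    return headers, body


headers, body = build_deposit("Electric Circuits - Grade 10", "en")
print(headers)
print(body.decode("utf-8"))
```

Nothing here touches the network. Sending the request, handling authentication, and reading the deposit receipt are left out on purpose, since SWORD leaves authentication to other standards.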

Publishing API and a new service could make translating Connexions modules easy

Specialized tools for translating and publishing OER are one of the possible uses of an API for publishing to open education repositories. Repositories may have general purpose editors for creating content, but they aren’t likely to have great facilities for translating content.
Carl Scheffler and I spent some time in the Geneva airport investigating whether Google Translator Toolkit could be the translation editor of choice for Connexions modules. Translator Toolkit has to be convinced and helped along, however, because it was designed for HTML (web pages), rather than for the structured XML format of Connexions’ modules. It just might be possible, however, and advice and comments would be most welcome.
The workflow would be just a bit more complicated than the normal route for translation and would look something like this:
  1. Find a module that you want to translate on Connexions and record its ID. Let’s say the module is Electric Circuits – Grade 10, http://cnx.org/content/m32830/latest/. Then the ID is “m32830”.
  2. Open Google Translator Toolkit and select a URL something like this: http://www.coolhelperservice.org/cnxtranslate/m32830. This would fetch the module in a format that Google Translate can use well.
  3. Translate it using the Translator Toolkit.
  4. Save the file to your laptop.
  5. Go to something like http://www.coolhelperservice.org/cnxpublish and upload the saved file. Fill out a bit of information and then push a button to sign the license and publish it to Connexions.
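
Steps 1 and 2 are mechanical enough to sketch in a few lines of Python. This is only an illustration: the “coolhelperservice” does not exist yet, its URL layout is the placeholder from the steps above, and the function names are made up.

```python
import re

# Placeholder address from the workflow above; no such service exists yet.
HELPER_BASE = "http://www.coolhelperservice.org"


def module_id_from_url(module_url):
    """Extract a module ID such as 'm32830' from a cnx.org content URL."""
    match = re.search(r"/content/(m\d+)(?:/|$)", module_url)
    if match is None:
        raise ValueError(f"No module ID found in {module_url!r}")
    return match.group(1)


def translate_url(module_id):
    """Build the address a translator would paste into Translator Toolkit."""
    return f"{HELPER_BASE}/cnxtranslate/{module_id}"


module_id = module_id_from_url("http://cnx.org/content/m32830/latest/")
print(module_id)
print(translate_url(module_id))
```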

Although it would be more straightforward to enter a cnx.org web address into the Translator Toolkit and then publish straight from the toolkit, we don’t have the technical hooks into the Translator Toolkit to be able to do that. So instead, we would create this new “coolhelperservice” that would know how to format Connexions content for Translator Toolkit and how to take translations and reformat them and publish them to Connexions.

Does that work flow seem reasonable? Is there a better work flow that you can think of and suggest?

Some technical details for those that are interested. Those that aren’t can safely stop here and still be able to give feedback on the process from a translator’s perspective.

Google Translator Toolkit doesn’t work with XML formats. But Connexions does produce an HTML format for modules that can be converted back into Connexions XML without any loss. So the “coolhelperservice” needs to retrieve the module, format it in HTML for the Translator Toolkit, and then do the opposite transform (HTML → CNXML) on the way back into Connexions.

To get the HTML for the body of a module from Connexions, you append “/body” to the module URL. And the module metadata (title and such) is available by appending /metadata to the module URL. So with the module ID, the “coolhelperservice” can put together a nice package of HTML for the translator to use, and still be able to reconstruct the XML to publish the translated version.
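
In Python, the retrieval step might look roughly like this. It assumes the “/body” and “/metadata” addresses behave exactly as described above, and that the suffix goes directly after the module URL with no doubled slash. The function names are hypothetical, and the fetch is defined but not called here.

```python
from urllib.request import urlopen


def module_resource_urls(module_url):
    """Return the body and metadata addresses for a module URL."""
    base = module_url.rstrip("/")  # avoid a doubled slash before the suffix
    return {
        "body": f"{base}/body",
        "metadata": f"{base}/metadata",
    }


def fetch(url):
    """Download a resource and return its raw bytes."""
    with urlopen(url) as response:
        return response.read()


urls = module_resource_urls("http://cnx.org/content/m32830/latest/")
print(urls["body"])
print(urls["metadata"])
```

The helper service would call `fetch` on both addresses, hand the body HTML to the translator, and keep the metadata for rebuilding the module afterwards.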

One tricky bit is that Google Translator Toolkit makes a mess of the mathematics that comes in from Connexions, so the math has to be protected somehow. Carl and I experimented with a few ideas for how to do that, and the Toolkit didn’t cooperate with most of those, but Carl came up with the idea of putting all the math into an HTML id attribute. Amazingly, that worked. It comes out all escaped, but that is good enough. (The Toolkit won’t keep around a random attribute, so “id” was the way to go.) Carl is pretty sure that there is a web service that will take a snippet of MathML and give back an image. He is going to investigate that further. So in principle, you can stuff the math into an image ID (so it doesn’t get lost) and replace the math with a URL to this service that will render the math. The translator won’t be able to translate words that were inside the math, but Carl had previously looked around and that isn’t very common, so this might just be good enough.
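
Here is one way the math-in-an-id trick could be sketched in Python. It is not the code Carl and I experimented with. The `mathml-` prefix, the percent-encoding, and the function names are all assumptions for illustration, and the image `src` pointing at a rendering service is left out because that service hasn't been identified yet.

```python
import re
from urllib.parse import quote, unquote

# Matches one MathML island, including its opening and closing tags.
MATH_PATTERN = re.compile(r"<math\b.*?</math>", re.DOTALL)
# Matches the placeholder images that protect_math() produces.
PLACEHOLDER_PATTERN = re.compile(r'<img id="mathml-([^"]*)"\s*/>')


def protect_math(html):
    """Replace each MathML island with an <img> whose id holds the math."""
    def to_placeholder(match):
        # Percent-encode everything so the math survives as an attribute value.
        encoded = quote(match.group(0), safe="")
        return '<img id="mathml-' + encoded + '"/>'
    return MATH_PATTERN.sub(to_placeholder, html)


def restore_math(html):
    """Turn the placeholder images back into the original MathML."""
    return PLACEHOLDER_PATTERN.sub(lambda m: unquote(m.group(1)), html)


original = (
    "<p>Ohm's law: "
    '<math xmlns="http://www.w3.org/1998/Math/MathML">'
    "<mi>V</mi><mo>=</mo><mi>I</mi><mi>R</mi></math></p>"
)
protected = protect_math(original)
print(protected)
print(restore_math(protected) == original)
```

A real version would need to be more forgiving on the way back, since the Toolkit may rewrite the tag (reorder attributes, change quoting) before the translated file is saved.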

At the end, the “coolhelperservice” will use a publishing API (SWORD V2) to publish the translation back to Connexions. Implementing that API is part of my fellowship work so it is coming later this year. There will have to be a bit of license signing back at Connexions, but the “coolhelperservice” can make that smoother also.

I think something like this could work. What do you think? And did we miss some clever idea or service that could be of help? Actually, I am sure we did since this was a 2 hour experiment. So send help, advice, etc. Carl will keep investigating, and maybe we will have some screenshots to clarify all this for a future post.

Sprinting with Connexions

First progress implementing a bit of a publishing API for OER, based on SWORD and AtomPub.
 

Last week at the Plone East Symposium in State College, PA, Plone developers from across the US gathered to learn and share about using Plone in educational settings. At the end of the week, Friday and Saturday, about half the attendees stayed to “sprint” (original plan, full report). At sprints, people develop working code together on various projects in order to share expertise, learn from each other, and expand networks of technical mentors. Knowing that Connexions already had a partial implementation of SWORD for creating modules from Word documents, and that SWORD is likely to be the backbone for the OER Publishing API (your comments, approval, concerns welcome), I brought a sprint topic to the symposium: “OER Publishing API: Extend Connexions SWORD implementation”. Connexions provided an expert, Phil Schatz, to lead the sprint and we created a milestone to track the work. Carl Scheffler joined Phil and me working on SWORD and we got advice and help from Michael Mulich (Penn State) and from Ross Reedstrom and Ed Woodward at Connexions.

What the Connexions/Rhaptos SWORD service does now:

The current Connexions SWORD service is tailored to a very specific client, Open Journal Systems (OJS). It takes a zip of a Word file and a METS file with some metadata and a bibliographic entry that is used to insert a reference to the original publication of the article in a journal. The service then creates a new, unpublished module with the content of the Word file, and puts it in a work area chosen by the client.

What we got done at the sprint:

  1. Reorganized the existing SWORD code to make the coding cleaner.
  2. Extended the service so that it would take a Word file, or the Connexions native format.
  3. Changed the service to get the title and abstract from standard locations.
  4. Got the SWORD client toolkit, EasyDeposit, to work with the new code (and partially work with the existing code.)

Why a Publishing API? or What’s a Publishing API?

The fellowship that I have through the Shuttleworth Foundation is officially “to foster an ecosystem around open education resources”.  The community creates great teaching and learning resources, and then the ecosystem makes it possible for educators, innovators, students, and life-long learners to use the resources, improve them, and connect them together in novel ways that enhance learning.

When you look at the details of my fellowship proposal, the first thing that I am starting with is creating a publishing API (application programming interface). The API part of that means that it is a language that software programs use to speak to each other. If you use TweetDeck to update your Twitter and Facebook accounts, then you are using software programs that are talking to each other using APIs. (Because Twitter provided a simple API early on, tons of services I had never heard of use the Twitter API. A cooler-than-me friend sent me this list of programs that use Twitter’s API: “Brizzly, Seesmic, Tweetie, DestroyTwitter, TwitterFox, SoBees, Mixero”.) Another friend mentioned that all of our email programs communicate with each other via a lovely open API called SMTP. Email succeeded because it is easy for programs all over the world to talk to each other and deliver your email. In fact, the web and the way browsers display web pages are all built around APIs.

What do I mean by “publishing”? Here, I am talking about creating educational resources and then making them available to others in a library or repository. Making them available is the publishing part. So if you create a nice lesson for teaching fractions to middle school students and then want to share that lesson, you might publish it by uploading it to Connexions (cnx.org), or a Google Doc you make public, or your school’s website or Open Courseware (OCW) site. And then you might post a link to it on Facebook and Twitter and email the link to colleagues. Some of those colleagues might want to download it, add an exercise that they have used successfully, and share that updated version.

Now imagine that I have an idea for a simple editor that also creates really nice equations and that I want to build in a “publish” button that will let you choose where your content should be published, figure out what format it has to be in and convert it if need be, and will also advertise your content to your social networks.  If places to share the content (like Connexions) support a publishing API, then my editor can do all that for you. This scenario is only one example. I intend to blog many more examples from conversations I am having with other people and projects that are building software and content for education. These are the sorts of possibilities that a publishing API will support. I am also going to do a few more technical blogs on what things I think should be in that publishing API and what other APIs already do much of the work and provide a good place to start.