Using RDFAlchemy together with RDFLib’s SPARQLStore to query DBpedia and process resources in an OO way

I’ve been searching for interesting ways to manipulate RDF graphs in Python, to create an application that would handle Linked Data resources in an OO way, i.e. using Python classes rather than tables/sets/lists of triples. The data will be persisted in graphs in a triple store, accessed through a SPARQL endpoint.

In this post, I’ll illustrate how I managed to tie RDFLib’s SPARQLStore plugin and RDFAlchemy to reach a rather nice looking result.
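To give an idea of what "OO way" means here, the following is a minimal plain-Python sketch. It is not the RDFAlchemy or RDFLib API: the `SingleValue` descriptor, the `Person` class and the sample triples are all hypothetical, and the list of tuples merely stands in for data that would come from a SPARQL endpoint.

```python
# Plain-Python illustration only: this is NOT the RDFAlchemy or RDFLib API.
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"  # FOAF "name" property URI


class SingleValue:
    """Descriptor exposing one predicate of a resource as a Python attribute."""

    def __init__(self, predicate):
        self.predicate = predicate

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        # Return the first object found for (subject, predicate), else None.
        for s, p, o in obj.triples:
            if s == obj.uri and p == self.predicate:
                return o
        return None


class Person:
    """Hypothetical class wrapping a resource stored as triples."""

    name = SingleValue(FOAF_NAME)

    def __init__(self, uri, triples):
        self.uri = uri
        self.triples = triples


# Made-up sample data standing in for what a SPARQL endpoint would return.
triples = [
    ("http://example.org/alice", FOAF_NAME, "Alice"),
    ("http://example.org/bob", FOAF_NAME, "Bob"),
]

alice = Person("http://example.org/alice", triples)
print(alice.name)
```

The point is only that application code reads `alice.name` instead of filtering triples by hand.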


Posted in Uncategorized | 2 Comments

Debian docker containers using a modified baseimage-docker

I have been testing Docker for a few weeks now, and investigated the use of baseimage-docker, which provides support for supervising services with runit, and includes OpenSSH, among other things, based on an Ubuntu base system. Of course, I’m interested in a Debian counterpart.

I had initially followed instructions provided by Steve Kemp which also prepared a Debian image including OpenSSH and runit, but it appears that baseimage-docker provides more tiny bits that avoid reinventing the wheel.

I then forked baseimage-docker to do a quick and dirty adaptation for Debian. There’s a sid variant (my ‘debian’ branch) and a wheezy one (my ‘wheezy’ branch, unsurprisingly). I haven’t used all the features of baseimage-docker, so things may well break.

For the record, I’m using it as a base image to build a Docker-based container running the FusionForge test suite.

Did I warn you it’s quick and dirty and without any warranty? I hope it is useful anyway.

Posted in Uncategorized | Leave a comment

Tagged a first version of the TWiki to FusionForge’s MediaWiki converter

As announced previously, I’ve been hacking on a migration tool that converts the contents of a TWiki wiki and imports the result into the MediaWiki of a FusionForge project.

I’ve successfully imported a first project (from PicoForge to FusionForge) using the tool, so I’ve decided to tag a first release and make the Git repo accessible.

More details at: https://fusionforge.int-evry.fr/projects/pytwiki2mediawi/

Feel free to ask here in the comments or by email, in case of need.

And, yes, my Python is most likely awful, but at least it works, and it is much more featureful than the existing tools I could test.

Posted in Projects, Uncategorized | 1 Comment

Working on a TWiki to MediaWiki converter (targeting FusionForge wikis)

I’m currently working on a wiki converter allowing me to transfer old TWiki wikis (hosted on PicoForge) to MediaWikis hosted on FusionForge.

Unlike the existing tools I’ve found, which more or less target the same needs, mine will address two particular requirements:

  • using MediaWiki’s API to perform the import, where many tools seemed to use SQL requests: this should allow non-administrator users to do the job,
  • importing to wikis of projects hosted on FusionForge instances, even when the project is not public, which means that the API calls need to authenticate to FusionForge first.
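As a rough illustration of the API-based approach, here is a minimal sketch that prepares (without sending) a page edit through MediaWiki's `action=edit`. The endpoint URL, page title and token value are hypothetical, and the FusionForge authentication step mentioned above is not shown; obtaining a valid session and edit token is assumed to have happened beforehand.

```python
import urllib.parse
import urllib.request


def build_edit_request(api_url, title, text, token):
    """Build (but do not send) a POST request for MediaWiki's action=edit."""
    params = {
        "action": "edit",
        "format": "json",
        "title": title,
        "text": text,
        "token": token,  # edit token obtained beforehand from the API
    }
    data = urllib.parse.urlencode(params).encode("utf-8")
    return urllib.request.Request(api_url, data=data, method="POST")


# Hypothetical values, for illustration only.
req = build_edit_request(
    "https://forge.example.org/wiki/api.php",
    "Main Page",
    "Imported from TWiki",
    "dummy-token+\\",
)
print(req.get_method(), req.full_url)
```

Since the edit goes through the same API as any other client, it is subject to the wiki's normal permission checks rather than requiring database access.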

The tool is written in Python and will include my own crappy wiki syntax converter, instead of spawning existing Perl scripts as others did.
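To give a flavour of what such a syntax converter involves, here is a minimal sketch handling a small subset of TWiki markup (headings, external links, bold, italic). It is not the tool's actual converter: the function name `convert_line` and the regular expressions are illustrative assumptions, and many TWiki constructs (tables, lists, macros) are ignored.

```python
import re


def convert_line(line):
    """Convert one line of TWiki markup to MediaWiki markup (small subset)."""
    # Headings: "---++ Title" becomes "== Title ==".
    m = re.match(r"^---(\++)\s*(.*)$", line)
    if m:
        marks = "=" * len(m.group(1))
        return f"{marks} {m.group(2).strip()} {marks}"
    # External links: [[URL][label]] becomes [URL label].
    line = re.sub(r"\[\[(https?://[^\]\s]+)\]\[([^\]]+)\]\]", r"[\1 \2]", line)
    # Bold: *text* becomes '''text'''.
    line = re.sub(r"(?<!\S)\*(\S(?:.*?\S)?)\*(?!\w)", r"'''\1'''", line)
    # Italic: _text_ becomes ''text''.
    line = re.sub(r"(?<!\S)_(\S(?:.*?\S)?)_(?!\w)", r"''\1''", line)
    return line


print(convert_line("---++ Installation"))
print(convert_line("See [[http://example.org][the site]] for *more* _details_."))
```

A line-by-line, regex-based approach like this is easy to extend, but it breaks down on nested or multi-line constructs, which is where most of the real work lies.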

It may happen to work for FosWiki too, but I don’t intend to use it beyond our old TWiki installations, for a start.

Stay tuned for more progress updates.

Edit: I’ve now released a first version.

Posted in Uncategorized | 1 Comment

Switched from gnome flashback-session to XFCE

I’ve gotten fed up with the annoyances of the Gnome flashback session on Debian testing, so I’ve been running XFCE 4 for a bit more than a month now.

So far so good.

YMMV, but if you haven’t made the switch to Gnome 3’s shell and all its JS-enabled novelties, XFCE might be an option.

Posted in Uncategorized | Leave a comment

Generating WebID profiles for Debian project members

I’ve been investigating the generation of WebID profiles for Debian project members for some time.

Posted in Uncategorized | Leave a comment

What is Linked Data

Here is a French translation of a short introductory document on Linked Data, originally written by Luca Matteis:

http://www-public.telecom-sudparis.eu/~berger_o/linkeddatawhat/whatislinkeddata.html

Enjoy the read.

Posted in Uncategorized | Leave a comment

Publishing RDF views for tastypie/django resources

Here’s some documentation about a hack I’ve been working on to allow publication of RDF views for Tastypie resources (in Django applications).

While working on implementing support for RDF meta-data for descriptions of Debian packages for the Debian Package Tracking System rewrite, I tried to identify which libraries/frameworks would make it possible to create RDF views for Django model objects with minimal effort.

Ideally, this would save the hassle of writing code, and could just be a matter of mapping some Django model fields to proper ontology attributes.

I haven’t found an existing tool to do so, but it seemed to me that Tastypie could offer a nice starting point. Tastypie (I focused on v. 0.9.15, which is currently in Debian testing) offers some REST content-negotiation support, and other niceties.

This post is an attempt at documenting an initial implementation for my problem. I’ve implemented it as some code in the example blog application described in the Tastypie documentation. Unfortunately, it’s not yet a patch that could be applied to Tastypie, to add this as a standard feature.

The code is available in my git clone of Tastypie, based on 0.9.15, and lies in the commit(s) between the example_myapp and rdfviews branches. It needs an up to date RDFLib to work (Debian’s is too old, btw).

It basically relies on the addition of a _rdf_mapping dict in the Django model objects, and an RdfModelResource as a base class for Tastypie ModelResources.

To provide an RDF view for a ModelResource, one also needs to declare a particular metaclass and add a few bits to its inner Meta class.

The principle is that the dehydration steps of the Tastypie system replace field values with their RDF object counterparts, as either RDFLib Literals or URIRefs. The resulting Bundle is then converted to an RDFLib Graph. It is then just a matter of serializing that Graph to Turtle (I’ve not added another format, but it should be pretty straightforward).
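The idea of driving the conversion from a field-to-property mapping can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the shape of `RDF_MAPPING`, the field names and both helper functions are hypothetical, plain strings stand in for RDFLib's Literal and URIRef objects, and the serializer is deliberately naive (no escaping, no datatypes, no prefixes).

```python
# Hypothetical stand-ins; the actual code relies on RDFLib's Literal and URIRef.
DCTERMS = "http://purl.org/dc/terms/"
SIOC = "http://rdfs.org/sioc/ns#"

# Assumed shape of a field-to-property mapping (names are illustrative).
RDF_MAPPING = {
    "title": DCTERMS + "title",
    "body": SIOC + "content",
}


def to_triples(subject, fields, mapping):
    """Turn mapped field values into (subject, predicate, object) triples."""
    return [
        (subject, mapping[name], value)
        for name, value in fields.items()
        if name in mapping  # unmapped fields are simply skipped
    ]


def to_turtle(triples):
    """Serialize triples as naive Turtle lines (plain string literals only)."""
    return "\n".join(f'<{s}> <{p}> "{o}" .' for s, p, o in triples)


fields = {"title": "Another Post", "slug": "another-post"}
triples = to_triples("/api/v1/entry/1/#post", fields, RDF_MAPPING)
print(to_turtle(triples))
```

With a real RDF library, the last step would be replaced by adding the triples to a graph and calling its Turtle serializer.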

There’s a bit of hackish dark magic in the code, in particular when introspecting the Tastypie / Django resource / model objects to be able to dynamically add some dehydrate_FOO methods that perform the conversion to Literals or URIRefs.

Custom dehydrate_FOO methods can still be written to process the fields in particular ways.

I’ve added some LDP-style paging along the way, so I hope it will be quite usable by LDP-compatible tools.

Here’s the result:

First, listing all resources:

$ curl -H 'Accept: text/turtle' "http://localhost:8000/api/v1/entry/"
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix ldp: <http://www.w3.org/ns/ldp#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sioc: <http://rdfs.org/sioc/ns#> .
@prefix xml: <http://www.w3.org/XML/1998/namespace> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

</api/v1/entry/#list> dcterms:hasPart </api/v1/entry/1/#post>,
        </api/v1/entry/2/#post>,
        </api/v1/entry/3/#post> .

Then, details of one particular post:

$ curl -L -H 'Accept: text/turtle' "http://localhost:8000/api/v1/entry/1"
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sioc: <http://rdfs.org/sioc/ns#> .
@prefix xml: <http://www.w3.org/XML/1998/namespace> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

</api/v1/entry/1/#post> a sioc:Post ;
    dcterms:created "2011-05-21T22:46:38+00:00"^^xsd:dateTime ;
    dcterms:title "Another Post" ;
    sioc:content "MESSAGE: This will prbbly be my lst post. /MESSAGE" ;
    sioc:has_creator </api/v1/user/1/> ;
    owl:sameAs </posts/another-post#post> .

I hope this is usable, and welcome any feedback.

Posted in Uncategorized | 2 Comments

Experimenting with Linked Open Data about FLOSS projects : matching Debian upstream projects

I’ve been experimenting with Linked Open Data about FLOSS projects harvested from different sources of DOAP or ADMS.SW descriptions. I’ve tried to match upstream projects of Debian packages with upstream projects hosted at Apache, Gnome, or Alioth.debian.org, or catalogued on PyPI.

I’m matching them on identical values of the Homepage field (comparing the Homepage Control field set by Debian packagers with the doap:homepage meta-data in the RDF documents harvested from the upstream project catalogues).
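The matching step amounts to a join on the homepage value. A minimal sketch, assuming the harvested data has already been reduced to (name, homepage) pairs; the function name and the sample records below are made up for illustration and are not taken from the actual data set:

```python
def match_by_homepage(debian_packages, upstream_projects):
    """Pair Debian packages and upstream projects sharing the same homepage."""
    # Index upstream projects by homepage so each lookup is a dict access.
    by_homepage = {}
    for name, homepage in upstream_projects:
        by_homepage.setdefault(homepage, []).append(name)
    matches = []
    for package, homepage in debian_packages:
        for project in by_homepage.get(homepage, []):
            matches.append((package, project, homepage))
    return matches


# Made-up sample records (name, homepage).
debian = [("foo", "http://foo.example.org/"), ("bar", "http://bar.example.org/")]
upstream = [("Foo", "http://foo.example.org/"), ("baz", "http://baz.example.org/")]

matches = match_by_homepage(debian, upstream)
for package, project, homepage in matches:
    same_name = package == project
    same_name_ci = package.lower() == project.lower()
    print(package, project, same_name, same_name_ci)
```

The two booleans correspond to the "exact same name" and "same name regardless of case" comparisons applied to each matched pair.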

Here are the initial results of my little experiment: the number of matched projects, and how similar the project names are:

Upstream catalogue   Total matching projects   Exact same project name   Same project name (case-insensitive)
apache               31                        0 (0 %)                   0 (0 %)
alioth               16                        13 (81 %)                 13 (81 %)
pypi                 439                       217 (49 %)                273 (62 %)
gnome                21                        0 (0 %)                   7 (33 %)
Total                507                       230 (45 %)                293 (58 %)

The data set contains tens of thousands of projects, with probably many duplicates, but from all of these, only 507 have common homepages.

As you can see, in some cases, the Debian source package names match the upstream project name (sometimes with lower/upper case variants), but in general, the project names aren’t identical, so it is interesting to try and match them by homepage.

For the curious ones, the Apache, Gnome and PyPI project catalogues have been providing RDF meta-data for quite some time. More recently, we introduced ADMS.SW meta-data for Debian source packages, and even more recently for the Alioth projects (through the ADMS.SW exporter plugin for FusionForge).

There is still room for improvement, for instance by normalizing homepage URLs, which tend to vary (trailing slashes, or different HTTP/HTTPS schemes).
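Such a normalization could look like the following minimal sketch, using only the standard library. The function name and the exact rules (treating http and https as equivalent, lower-casing the host, dropping trailing slashes) are assumptions about what "normalize" would mean here, not something already implemented in the experiment.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_homepage(url):
    """Return a canonical form of a homepage URL for comparison purposes."""
    parts = urlsplit(url.strip())
    # Treat http and https as the same scheme.
    scheme = "http" if parts.scheme in ("http", "https") else parts.scheme
    # Host names are case-insensitive.
    netloc = parts.netloc.lower()
    # Ignore trailing slashes in the path.
    path = parts.path.rstrip("/")
    return urlunsplit((scheme, netloc, path, parts.query, parts.fragment))


print(normalize_homepage("https://Example.org/project/"))
print(normalize_homepage("http://example.org/project"))
```

Comparing the normalized values instead of the raw ones would let both URLs above count as the same homepage.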

Stay tuned for more details.

Posted in Uncategorized | 2 Comments

Training on algorithmics and Python for computer science teachers in scientific preparatory classes

Last week, we ran two training sessions on algorithmics, Python and the scientific Python libraries (pylab), aimed at future computer science teachers in scientific Classes Préparatoires aux Grandes Écoles (CPGE), as part of the LIESSE training programme.

This training is part of the effort to train the new computer science teachers, who often come from other subjects and who will have to teach computer science to all first-year ("Sup") students from the start of the school year in September 2013 (new curriculum).

This two-day training was set up jointly by Télécom SudParis and ENSIIE.

Below you will find the slides for part of the training (mostly the first day). The rest consists of IPython Notebooks on scientific computing in Python with Numpy, Scipy and Matplotlib, and is available online (see the links in this PDF document to downloadable versions of the notebooks).

Download: slides (PDF, 1.6 MB)

A more detailed archive (with examples, Org-Mode and Python source code, etc.) is also available: contact us if you are interested.

Posted in Uncategorized | 8 Comments

New paper “Authoritative linked data descriptions of debian source packages using ADMS.SW” accepted at OSS 2013

I’ll be presenting “Authoritative linked data descriptions of debian source packages using ADMS.SW” at OSS 2013.

Here’s the abstract:

The Debian Package Tracking System is a Web dashboard for Debian contributors and advanced users. This central tool publishes the status of subsequent releases of source packages in the Debian distribution.

It has been improved to generate RDF meta-data documenting the source packages, their releases and links to other packaging artifacts, using the ADMS.SW 1.0 model. This constitutes an authoritative source of machine-readable Debian “facts” and proposes a reference URI naming scheme for Linked Data resources about Debian packages.

This should enable the interlinking of these Debian package descriptions with other ADMS.SW or DOAP descriptions of FLOSS projects available on the Semantic Web also using Linked Data principles. This will be particularly interesting for traceability with upstream projects whose releases are packaged in Debian, derivative distributions reusing Debian source packages, or with other FLOSS distributions.

Update: If you are interested, a preprint is available here in HTML form. See also previous installments on ADMS.SW in this blog.

Update: The slides of the presentation I made at Isola are here.

Posted in Publications, Uncategorized | 3 Comments

Slides + Manual + programs generated from single org-mode source

I’ve been working on maintaining lecture slides and a manual from a single org-mode source file.

From that single source I want to be able to generate different outputs, only changing a few switches:

  • a slide deck
  • a manual document
  • source files for examples

The slides may contain notes.

Here’s an archive that contains an example document and complementary files. See this documentation for more details (it is itself maintained from such an .org source).

Posted in Uncategorized | 1 Comment

Managing Python code with UTF-8 (French chars) in org-mode + babel + minted for LaTeX export

The goal of this article is to illustrate how to manage Python code that includes comments with UTF-8 characters inside a latin-1 org-mode source, for LaTeX export.

Note that I’ve pasted the HTML generated by org-mode into WordPress, so I hope it isn’t broken too much.

My typical use case is a French lecture on Python where the text is written in French, as well as some of the code comments and examples.

We’ll use org-mode’s babel module to include and manage the Python examples. The goal is to write the source of the Python programs directly in the same org source as the class book’s text, and to extract them into a subdir (with the “tangle” feature), so that they can be shipped to the students to experiment with.

The minted LaTeX environment is used, through babel, to provide the Python syntax highlighting.


Posted in Uncategorized | Leave a comment