Tag Archives: java

Creating your own DBpedia

In this first part I will show you how to extract RDF from Wikipedia with the help of the DBpedia Extraction Framework. In future parts we will import this data into Virtuoso and create a Solr/Lucene index for it.

First of all, you need to install Java 7, Scala, Mercurial and Maven. Then open a terminal and go to the directory where you want to install the extraction framework. You can then check out the DBpedia Extraction Framework with the following command (this is one line!):

$ hg clone http://dbpedia.hg.sourceforge.net:8000/hgroot/dbpedia/extraction_framework dbpedia-extraction
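If the clone or the later build fails, one thing worth ruling out is a missing prerequisite. The following is a minimal sketch that checks whether the four tools named above can be found on your PATH. The function name `check_tools` is my own invention for illustration, and the command names `java`, `scala`, `hg` and `mvn` are the usual ones but may differ on your system. It does not check versions.

```shell
#!/bin/sh
# Report which of the given command-line tools cannot be found on PATH.
# Prints one "missing: NAME" line per absent tool and returns non-zero
# if at least one tool is absent. (check_tools is a hypothetical helper.)
check_tools() {
    rc=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            rc=1
        fi
    done
    return "$rc"
}

# Tools named in this post: Java, Scala, Mercurial (hg) and Maven (mvn).
check_tools java scala hg mvn || echo "Install the tools listed above first."
```

If nothing is printed, all four commands were found.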

Once the clone has finished, build the framework and change into the dump directory:

$ cd /path/to/mercurialrepo/dbpedia-extraction
$ mvn install
$ cd dump

Once the build is done, you need to edit the config files according to your needs. First we need to configure what we want to download. You can download this config file and adjust it.

Now we need to configure what we want to extract. Here are two config files which work for English and German. I’ve only activated some of the extractors; a full list can be found here. Also make sure that you only have “extractor.$Language_Code” entries for the languages you want to extract. Otherwise you will get error messages for trying to extract from non-existent data.

The last thing to edit is the pom.xml in the dump directory. Go to the download launcher in the pom file and adjust the name of your config file (no need to change it if you use the config I supplied). Then go to the extraction launcher and change the name of the configuration file there as well. Now you should be able to start the download via:

$ mvn scala:run -Dlauncher=download

and run the extraction with:

$ mvn scala:run -Dlauncher=extraction
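If the extraction complains about non-existent data, a quick check is to list the language codes that have “extractor.$Language_Code” entries in your config. The sketch below assumes a properties-style file in which those entries start a line, as described above. The function name `list_extractor_languages` and the file name `extraction.properties` are placeholders I made up, so substitute the name of your own config file.

```shell
#!/bin/sh
# Print the distinct language codes that have an "extractor.<code>" entry
# in a properties-style config file (one code per line, sorted).
# (list_extractor_languages is a hypothetical helper.)
list_extractor_languages() {
    # Keep lines that start with "extractor.", then cut the code off
    # at the first "=", ":" or whitespace character.
    grep -E '^[[:space:]]*extractor\.' "$1" \
        | sed -E 's/^[[:space:]]*extractor\.([^=:[:space:]]+).*/\1/' \
        | sort -u
}

# Hypothetical file name; point CONFIG at your own extraction config.
CONFIG="${CONFIG:-extraction.properties}"
if [ -f "$CONFIG" ]; then
    list_extractor_languages "$CONFIG"
else
    echo "config not found: $CONFIG"
fi
```

Every code that is printed should correspond to a language you actually downloaded.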

The data directory specified in the German and English config files should now contain several N-Triples or Turtle files. Congratulations! If anything went wrong, please drop me a comment!
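To see what the extraction produced, you can list the RDF files in the data directory. This is only a sketch: it assumes the files carry the conventional `.nt` and `.ttl` extensions (optionally followed by a compression suffix), and both the function name `list_rdf_files` and the default path `./data` are placeholders, so use the directory from your own config.

```shell
#!/bin/sh
# Print the N-Triples (.nt) and Turtle (.ttl) files found below a directory,
# including names with a further suffix such as .nt.gz or .ttl.bz2.
# (list_rdf_files is a hypothetical helper.)
list_rdf_files() {
    find "$1" -type f \
        \( -name '*.nt' -o -name '*.nt.*' -o -name '*.ttl' -o -name '*.ttl.*' \) \
        | sort
}

# Hypothetical path; use the data directory set in your config files.
DATA_DIR="${DATA_DIR:-./data}"
if [ -d "$DATA_DIR" ]; then
    list_rdf_files "$DATA_DIR"
else
    echo "data directory not found: $DATA_DIR"
fi
```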

Filed under DBpedia, Development

Analyze your music (listening habits) in iTunes

I stumbled upon a great tool to analyze your iTunes library. It’s called SuperAnalyzer, by the guys at nosleep.net. It’s written in Java and therefore platform independent (Mac OS X & Windows), and it is even open source under a GPL license, so you can add your own statistics. Go get it! Here are some of its features:

  1. show the growth rate of your library
  2. how many songs are rated
  3. # of songs per genre/year
  4. listening times
  5. quality
  6. most played artists/genres/albums/songs
  7. all sorts of play counts
  8. most frequent song title names
  9. export statistics to pdf or html
  10. many more!!

Filed under iTunes, Mac OS X Software, Music