Entries from October 2009 ↓

Designing robust lyric and cover art retrievers

One of the very last sessions at the GSoC Mentor Summit was about media players. Lead devs from Amarok and XMMS2 were there, and it was cool to speak with them in person. One frequent issue for Amarok (I can’t remember if it was also an issue for XMMS2) was that lyric sites keep going down and changing their format, sometimes adding ads in the middle of the lyrics. Another was that Amazon no longer lets them use its album cover art, and the substitute, last.fm, has very small cover art images.

My suggestion for both, although they would need to be implemented in somewhat different ways, would be to draw on a variety of sources. For lyrics, query several lyrics sites, then use text similarity matching to work out which part of each page is the actual lyrics. For images, you could use Google image search and return the image that turns up most frequently, with some heuristic that prefers square images. I think that, although not perfect, this would make the system a lot more robust against further changes.
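The cover art half of this idea could be sketched roughly as follows. This is only an illustration under assumptions: `CoverCandidate`, `squareness`, `pick_cover` and the `square_weight` parameter are names made up here, and the `fingerprint` field stands in for whatever image hash you would compute after downloading the search results (fetching and hashing are not shown).

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class CoverCandidate:
    """One image returned by a search (hypothetical structure)."""
    url: str
    fingerprint: str  # assumed: a hash of the image content, computed elsewhere
    width: int
    height: int


def squareness(c: CoverCandidate) -> float:
    """1.0 for a perfect square, approaching 0.0 for very elongated images."""
    longest = max(c.width, c.height)
    if longest <= 0:
        return 0.0
    return min(c.width, c.height) / longest


def pick_cover(candidates: List[CoverCandidate],
               square_weight: float = 1.0) -> Optional[CoverCandidate]:
    """Prefer the most frequently seen image, nudged towards square ones."""
    if not candidates:
        return None
    counts = Counter(c.fingerprint for c in candidates)

    def score(c: CoverCandidate) -> float:
        # Frequency dominates; squareness (at most square_weight) breaks ties.
        return counts[c.fingerprint] + square_weight * squareness(c)

    return max(candidates, key=score)
```

With `square_weight` at 1.0 the squareness bonus can only outweigh a frequency difference of one, so it mostly acts as a tie-breaker. Whether that balance is right would need testing against real search results.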

Text similarity and overlap detection are well understood as computer science problems. They’re used by the shotgun approach to DNA sequencing… as well as a variety of search and indexing problems. Hopefully I’ll release a usable library for it over the summer – I’ll call it libshotgun-lyrics 😉
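To make the lyrics idea a bit more concrete, here is a minimal sketch in Python. It is not shotgun-style assembly and not the library mentioned above; it is a much simpler line-level vote using the standard library’s `difflib`. It assumes the pages have already been fetched and stripped down to plain text, and the names `consensus_lyrics`, `normalise` and the `threshold` value are invented for illustration.

```python
import difflib
from typing import List


def normalise(line: str) -> str:
    """Lower-case and collapse whitespace so trivial differences don't matter."""
    return " ".join(line.lower().split())


def similar(a: str, b: str, threshold: float) -> bool:
    """True if two normalised lines are close enough to count as the same."""
    return difflib.SequenceMatcher(None, a, b).ratio() >= threshold


def consensus_lyrics(pages: List[str], threshold: float = 0.85) -> List[str]:
    """Keep the lines of one page that a majority of all pages agree on.

    Each page is tried as the reference; the reference that keeps the
    most lines wins (the first one, on a tie).
    """
    raw_pages = [[l.strip() for l in p.splitlines() if l.strip()] for p in pages]
    norm_pages = [[normalise(l) for l in lines] for lines in raw_pages]
    needed = len(pages) // 2 + 1  # simple majority, counting the page itself

    best: List[str] = []
    for i, lines in enumerate(raw_pages):
        kept = []
        for raw, norm in zip(lines, norm_pages[i]):
            # Count the other pages that contain a near-identical line.
            support = 1 + sum(
                any(similar(norm, other, threshold) for other in norm_pages[j])
                for j in range(len(pages)) if j != i
            )
            if support >= needed:
                kept.append(raw)
        if len(kept) > len(best):
            best = kept
    return best
```

Ads and navigation text tend to differ from site to site, so they fail the majority vote, while the lyric lines survive. The comparison is quadratic in the number of lines, which is probably fine for a handful of pages but is one reason a proper overlap-based approach might be worth the effort.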

Views on copyright

Someone I know is quite vehement about the obsolescence of copyright, or that it at least needs to be radically reworked to be tenable in today’s environment: one of (almost) zero-cost duplication for many copyrighted products. When it comes down to it, writing is data, music is data, and potentially even physical objects will be easy to duplicate. I’m close to that camp, but I don’t believe all data should automatically be free.

On creating something, I think you should be able to profit from your labour, but attempting to control the unofficial spread of something is usually futile [1]. The big music industry would be well advised to learn something from that, but I’m sure they’ll opt to go down kicking and screaming.


Crime and punishment: existential style

Following on from others’ recent discussions of crime and punishment, I offer these completely unhelpful transhumanist thoughts:

  • Over time, a mind can become completely different from the one that committed the crime. So is it fair to punish someone in the present, when their current mind state bears as much similarity to the mind that committed the crime as it does to a completely separate person?
  • A body replaces most of its cells over the course of many years. So it’s not really someone’s body we convict, but their structure. What happens when people can upload? Supposing we can represent that structure digitally or otherwise (but in a form of easily copyable data), what happens to the replicas of that individual? Are they convicted as well? Does it become illegal for other people to harbour that sequence of data, even if it’s in stasis and getting no processor time (which is essentially the same as being dead, but with the difference of being revivable at a moment’s notice)?
  • Continuing from the assumption that it’s the structure of a criminal we want to punish/remove from society: since a baby is essentially derived from a fair proportion of a parent’s structure, if the parent commits a crime, then shouldn’t the child also be considered a criminal? Even though the child takes some of its structure from the other, hopefully non-criminal, parent, the first point seems to imply that exact similarity isn’t required.

(Note: most of these thoughts are just me musing on a theoretical level that is not at all pragmatic. I don’t actually believe children of criminals are also guilty.)