Re: Usability, hierarchies and IO-slaves
Simon, I must agree that hierarchies are confusing and not very usable. But you fell into the very trap you warned about: proposing to create yet another user-targeted hierarchy to replace the Unix filesystem. That’s pretty much wrong in the long term. Anyway, the system:/ ioslave does exactly that: it hides the ioslaves and the Unix filesystem. It is definitely a step in the right direction. Since we can’t get rid of hierarchies anyway (for now, at least), introducing a simple one as a stepping stone is not that bad. (And again, you propose the very same thing!)
I expect Tenor/Kat to let us gain some ground here, with non-hierarchical data access. Tagged collections may also be of some interest, but the problem is that we probably can’t generate tags automagically and no one is going to tag their data by hand. I suppose one could use a naive Bayesian categorizer to do some of the job, but you still need training data. Still, this sounds like a not-so-completely-unworkable idea, since one can always pre-train the categorizer with loads of sample data so that it does at least a half-decent job at the start. If users spot that a file is in the wrong place, they can fix that and thereby give the categorizer some learning material.
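To make the pre-train-then-correct loop a bit more concrete, here is a minimal sketch of a multinomial naive Bayes categorizer with add-one smoothing. Everything in it is an assumption for illustration: the class name, the whitespace tokenizer and the tiny training strings are made up, and nothing here is taken from Tenor or Kat.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesCategorizer:
    """Multinomial naive Bayes with add-one smoothing (illustrative only)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # category -> word -> count
        self.doc_counts = Counter()              # category -> number of documents
        self.vocabulary = set()

    def learn(self, text, category):
        """Add one labelled document; used for pre-training and for user fixes."""
        words = text.lower().split()
        self.word_counts[category].update(words)
        self.doc_counts[category] += 1
        self.vocabulary.update(words)

    def classify(self, text):
        """Return the category with the highest log-posterior, or None if untrained."""
        words = text.lower().split()
        total_docs = sum(self.doc_counts.values())
        best_category, best_score = None, -math.inf
        for category, n_docs in self.doc_counts.items():
            counts = self.word_counts[category]
            denom = sum(counts.values()) + len(self.vocabulary)
            score = math.log(n_docs / total_docs)
            for word in words:
                score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_category, best_score = category, score
        return best_category


# Pre-training with (made-up) sample data shipped with the system.
nb = NaiveBayesCategorizer()
nb.learn("theorem proof lemma integral", "field::mathematics")
nb.learn("compiler kernel pointer thread", "field::compsci")

first_guess = nb.classify("integral image filter")

# The user spots the file in the wrong place and moves it;
# the move doubles as one more training example.
nb.learn("integral image filter", "field::compsci")
second_guess = nb.classify("integral image filter")
```

With this toy data the first guess lands in field::mathematics (because of "integral"), and after the single correction the same text is filed under field::compsci. A real categorizer would need proper tokenizing and far more sample data, but the feedback loop would look about the same.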
However, again, we can’t let ourselves fall into the hierarchy pit. With Bayesian sorting, the view on the data needs to be multi-faceted: there need to be several facets (sets of categories), each with its own “meaning”. Some of those could be decided pretty well in software without much guessing, say type::music, type::image, etc. Things like field::mathematics, field::compsci, etc. can probably be pre-trained fairly well. Maybe area::work (as in documents related to the owner’s work; maybe “job” fits better?) could get some hits with sample data, but probably quite a bit fewer. Of course, there are going to be categories that are plain unguessable.
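A rough sketch of what "multi-faceted" could mean in practice: every file carries at most one category per facet, the type facet is derived from the file name without any guessing, and a query simply names the facets it cares about. The helper names, the MIME-to-category mapping and the sample files are all hypothetical; the field and area values are filled in by hand here where a trained categorizer would supply them.

```python
import mimetypes

# Hypothetical mapping from MIME major type to the "type" facet.
TYPE_BY_MIME_MAJOR = {
    "audio": "type::music",
    "image": "type::image",
    "text": "type::text",
}


def type_facet(filename):
    """Decide the type facet from the file name alone; no training needed."""
    mime, _ = mimetypes.guess_type(filename)
    if mime is None:
        return None
    return TYPE_BY_MIME_MAJOR.get(mime.split("/")[0])


def tag_file(filename, guessed=None):
    """Combine the deterministic type facet with facets guessed elsewhere."""
    tags = dict(guessed or {})
    kind = type_facet(filename)
    if kind is not None:
        tags["type"] = kind
    return tags


def select(index, **wanted):
    """Return files matching every requested facet; facets left out are ignored."""
    return sorted(
        name for name, tags in index.items()
        if all(tags.get(facet) == value for facet, value in wanted.items())
    )


index = {
    "lecture.mp3": tag_file("lecture.mp3", {"field": "field::mathematics"}),
    "diagram.png": tag_file("diagram.png", {"field": "field::compsci", "area": "area::work"}),
    "proof.txt": tag_file("proof.txt", {"field": "field::mathematics", "area": "area::work"}),
}

maths_files = select(index, field="field::mathematics")
work_images = select(index, area="area::work", type="type::image")
```

The point of the flat dictionary is that no facet sits "above" another: asking for field::mathematics and asking for area::work are equally direct, so the facets never turn back into a tree.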
Also, there is a problem with non-text data. Automatically categorizing images is going to be highly nontrivial, I guess. Anyway, all this is a pretty wild idea that could possibly support Tenor-based searches: if you get too many hits, the computer can always suggest some of the categories that appear in the search results to refine the search. There are quite decent algorithms written for this kind of use.
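As a very naive stand-in for those algorithms, the refinement step could just count the categories attached to the hits and offer the ones that split the result set most evenly. This is a plain counting heuristic chosen for the sketch, not one of the established algorithms; the function name, the thresholds and the sample hits are assumptions.

```python
from collections import Counter


def suggest_refinements(hits, max_hits=3, limit=3):
    """hits: list of (name, set_of_categories). Returns categories worth offering."""
    if len(hits) <= max_hits:
        return []  # few enough results; nothing to refine
    counts = Counter(cat for _, cats in hits for cat in cats)
    # A category shared by every hit cannot narrow the search.
    useful = [(cat, n) for cat, n in counts.items() if n < len(hits)]
    # Prefer categories that split the results closest to half; ties broken by name.
    useful.sort(key=lambda item: (abs(item[1] - len(hits) / 2), item[0]))
    return [cat for cat, _ in useful[:limit]]


hits = [
    ("a.txt", {"type::text", "field::mathematics"}),
    ("b.txt", {"type::text", "field::mathematics"}),
    ("c.txt", {"type::text", "field::compsci"}),
    ("d.txt", {"type::text", "field::compsci", "area::work"}),
]

suggestions = suggest_refinements(hits)
```

Here type::text is dropped because all four hits share it, the two field categories each cover half of the hits and come first, and area::work trails behind because it would cut the list down to a single file.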