Which tags do we allow?

Just everything

This is what everybody does right now: any combination of letters is valid as a single tag.


  • Different people might understand different things by the same term, though this may not matter if you search for more than a single word.
  • SPAM

Another approach

Allow only words that are the title of a Wikipedia article, or at least use Wikipedia as follows to improve the quality of tags automatically:

This could solve some problems:

  • SPAM
  • Acronyms with several meanings could be resolved using the information already contained in Wikipedia (e.g. the tagging interface could automatically present the user with a list of meanings and let them choose one, replacing the tag with the corresponding Wikipedia article name).
  • Each tag has a described meaning, so people are more likely to understand it the same way.
  • A search through tags could offer the searcher the same list of meanings, so they can choose which meaning of a word to search for.
  • Often several words mean the same thing. This can also be solved automatically with the help of Wikipedia redirects. Redirects are Wikipedia pages that contain nothing but a reference to the article with the correct lemma for that meaning. Such tags could be automatically replaced with the correct lemma. (e.g.:


  • Some words might be missing from Wikipedia completely.
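
The rules above could be sketched roughly as follows. This is only an illustration under assumptions: the tables ARTICLES, REDIRECTS and DISAMBIGUATIONS and the function resolve_tag are hypothetical, and their entries are made-up sample data. A real implementation would look titles up in Wikipedia itself (or a dump of it) instead of in hand-written tables.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical stand-ins for Wikipedia data (sample entries for illustration only).
ARTICLES = {"Integrated development environment", "Parallel ATA", "Text editor"}
REDIRECTS = {"Texteditor": "Text editor"}  # synonym -> article with the correct lemma
DISAMBIGUATIONS = {"IDE": ["Integrated development environment", "Parallel ATA"]}


@dataclass
class TagResult:
    status: str                    # "ok", "redirected", "ambiguous" or "rejected"
    tag: Optional[str] = None      # the accepted article title, if any
    choices: Tuple[str, ...] = ()  # meanings to offer the user when ambiguous


def resolve_tag(raw: str) -> TagResult:
    """Map a user-entered tag onto an article title, if possible."""
    title = raw.strip()
    if title in DISAMBIGUATIONS:
        # Several meanings: let the tagging interface show a list to choose from.
        return TagResult("ambiguous", choices=tuple(DISAMBIGUATIONS[title]))
    if title in REDIRECTS:
        # A synonym: replace it with the article carrying the correct lemma.
        return TagResult("redirected", tag=REDIRECTS[title])
    if title in ARTICLES:
        return TagResult("ok", tag=title)
    # No such article: spam, a typo, or a word that is missing completely.
    return TagResult("rejected")
```

With this sample data, resolve_tag("IDE") reports an ambiguous tag with two meanings to choose from, resolve_tag("Texteditor") is replaced by "Text editor", and an unknown string is rejected. The last case also covers legitimate words that simply have no article yet, so a real system would need some policy for those.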

Lots of other good ideas about a Software Thesaurus can be found here:

List of possible Software Thesauri

