Here are some of the projects I have worked on or been involved in, and software I maintain.
A collection of Italian-language tweets from Twitter.
Delicious Folksonomy Dataset
A dataset obtained by crawling Delicious, the social bookmarking website.
C&C/Boxer Web Interface
Python module to search and download messages from Twitter.
GNU Octave implementation of the ListNet learning-to-rank algorithm.
The Groningen Meaning Bank
A large corpus of semantically annotated English text that anyone can edit.
A Game With A Purpose for collecting linguistic annotations.
Word and sentence boundary detection software (a tokenizer, that is), based on supervised statistical methods.