Context of previous post

Just realized that my last post was very context-free… so here’s a stab at some context. I’m at a camp this week called CSST, or the 2010 Summer Research Institute for the Science of Socio-Technical Systems. I’ll hopefully get a chance to write more about it later, but it’s a great coming together of junior and senior researchers from all across the fields of sociotechnical systems. I don’t usually get to hang out or talk with people from management and sociology backgrounds about my research, and this was a venue in which to do so. It has been an amazing experience so far (we leave tomorrow), and one of the interesting parts has been the unstructured social time every day. We documented yesterday’s time and turned it into a random, in-joke-filled comic strip. Having this software makes me think that maybe I’ll mess with this more as a form of storytelling, so maybe someday you non-CSST folks will understand what the heck is going on (in comic book form).

PS. Apologies to my mother for the language. I was quoting others…Professionally. Also it was a technical term, so it’s ok 🙂

Alive and well

I’m still alive, but teaching and research are eating up most of my time and I sleep and sing in what’s left.

However, I came back to the blog to alert you all to a post I made on the GroupLens blog earlier today. It’s one of a series of posts I have planned about my research, etc… so stay tuned.
Survey writing woes blog post on GroupLens Blog

Foreign Language Help

I’m currently doing research that involves online communities and multiple languages. As part of this, I’m analyzing some exceedingly popular languages (Spanish, German, Japanese…) as well as some less studied communities (Volapük, Ukrainian, Esperanto).

The idea is that we’re doing some basic text processing. To reduce the amount of time this takes and increase the value of the analysis, we want to exclude a standard list of stop words. In English, these are words such as “in”, “to”, “a”, “and”, “the”, “that”, etc. (Examples in English, German, French) I can find these lists for most European languages, and I have learned that some other languages (Japanese, Chinese) don’t really have a concept of stop words.
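For anyone unfamiliar with the idea, here is a minimal sketch of what stop word filtering looks like in Python. This is an illustration, not our actual pipeline: the function name `remove_stop_words` and the tiny `ENGLISH_STOP_WORDS` set are made up for this example, and the simple regex tokenizer assumes a language that separates words with spaces or punctuation.

```python
import re

# Hypothetical, deliberately tiny stop word list for illustration only;
# a real analysis would load a full list for each language.
ENGLISH_STOP_WORDS = {"in", "to", "a", "and", "the", "that"}


def remove_stop_words(text, stop_words):
    """Lowercase the text, split it into word tokens, and drop stop words."""
    # \w+ is Unicode-aware for Python 3 strings, so it is not limited to ASCII.
    tokens = re.findall(r"\w+", text.lower())
    return [token for token in tokens if token not in stop_words]


print(remove_stop_words("The cat went to the park and sat in a tree", ENGLISH_STOP_WORDS))
# prints ['cat', 'went', 'park', 'sat', 'tree']
```

Swapping in a different set of words is all it takes to handle another language, which is why having a list for each one matters.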

While I’ve found stop word lists for most of the languages, I’m stumped on four: Esperanto, Volapük, Ukrainian, and Bengali. Any insights would be appreciated.