Red Hen Lab is a distributed consortium of researchers in multimodal communication, with participants all over the world. We are senior professors at major research universities, senior developers in technology corporations, and also junior professors, postdoctoral students, graduate students, undergraduate students, and even a few advanced high school students. Red Hen develops code in Natural Language Processing, audio parsing, computer vision, and joint multimodal analysis.
Red Hen's multimodal communication research involves locating, identifying, and characterizing auditory and visual elements in videos and pictures. We may provide annotated clips or images and present the challenge of developing the machine learning tools to find additional instances in a much larger dataset. Some examples are gestures, eye movements, and tone of voice. We favor projects that combine more than one modality and have a clear communicative function; floor-holding techniques are one example. Once a feature has been successfully identified in our full dataset of several hundred thousand hours of news videos, cognitive linguists, communication scholars, and political scientists can use this information to study higher-level phenomena in language, culture, and politics and develop a better understanding of the full spectrum of human communication. Our dataset is recorded in a large number of languages, giving Red Hen a global perspective.
For GSoC 2018, we invite proposals from students for components for a unified multimodal processing pipeline, whose aim is to extract information from text, audio, and video, and to develop integrative cross-modal feature detection tasks. Red Hen Lab is directed jointly by Francis Steen (UCLA) and Mark Turner (Case Western Reserve University).
Please clearly state your proposal, whether it is a new project or an improvement to an existing system.
Bear in mind that your project should result in a module that is installed on our high-performance computing cluster, fully tested, with clear instructions, and ready to be deployed to process a massive dataset. The module should include a well-documented API file that can be used by a wide variety of coders, especially those who come after you, and those who are not experts in your problem domain.
Your project should be scaled to the appropriate level of ambition, so that at the end of the summer you have a working product. Be realistic and honest with yourself about what you think you will be able to accomplish in the course of the summer. Provide a detailed list of the steps you believe are needed, the tools you propose to use, and a weekly schedule of deliverables. Clear and proper documentation can take much longer than expected.
If you are proposing a machine learning project, base the proposal on techniques that have already been successful in other similar projects. Include as much information about previous research and results as possible. Accuracy measurements will be required, and they should be as automated as possible.
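As an illustration of what an automated accuracy measurement might look like, here is a minimal Python sketch that compares a detector's output against hand-annotated clips and reports precision, recall, and F1. The clip identifiers, the `Scores` class, and the `evaluate` function are hypothetical names chosen for this example; they are not part of any existing Red Hen tool, and a real project would pick the metrics suited to its task.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Scores:
    """Container for the three summary metrics (hypothetical helper)."""
    precision: float
    recall: float
    f1: float


def evaluate(predicted: set[str], annotated: set[str]) -> Scores:
    """Compare predicted clip IDs against hand-annotated clip IDs.

    Both arguments are sets of clip identifiers in which the feature
    (for example, a gesture) is claimed or known to occur.
    """
    true_positives = len(predicted & annotated)
    # Guard against empty sets so the script never divides by zero.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(annotated) if annotated else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return Scores(precision, recall, f1)


if __name__ == "__main__":
    # Illustrative data only: made-up clip IDs, not real annotations.
    annotated = {"clip_001", "clip_002", "clip_003", "clip_004"}
    predicted = {"clip_002", "clip_003", "clip_005"}
    scores = evaluate(predicted, annotated)
    print(
        f"precision={scores.precision:.2f} "
        f"recall={scores.recall:.2f} "
        f"f1={scores.f1:.2f}"
    )
```

A script of this kind can be rerun after every change to the model, so the reported numbers always reflect the current code rather than a one-off manual check.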
Please be prepared to follow code formatting standards closely and to work with Singularity (Linux container) images throughout the summer.