I am interested in developing efficient machine learning and deep learning algorithms to solve challenging problems in natural language processing and natural language understanding. I completed my PhD in computer science at Worcester Polytechnic Institute in May 2021. My advisor was Professor Elke Rundensteiner. My PhD research focused mainly on text classification using machine learning, deep transfer learning, and natural language processing methods.
My current research focuses on learning from few labeled examples through transfer learning and semi-supervised learning.
- Maryam Hasan, Elke Rundensteiner, Emmanuel Agu, DeepEmotex: Classifying Emotion in Text Messages using Deep Transfer Learning, IEEE Big Data: Machine Learning on Big Data (IEEE BigData, MLBD), December 2021.
- Maryam Hasan, Elke Rundensteiner, Emmanuel Agu, Automatic Emotion Detection in Text Streams by Analyzing Twitter data, International Journal of Data Science and Analytics, Springer 2019.
- Maryam Hasan, Elke Rundensteiner, Xiangnan Kong, Using Social Sensing to Discover Trends in Public Emotion, In Proceedings of IEEE International Conference on Semantic Computing (IEEE ICSC), 2017.
- Maryam Hasan, Elke Rundensteiner, Emmanuel Agu, Using Hashtags as Labels for Supervised Learning of Emotions in Twitter Messages, In Proceedings of ACM SIGKDD Workshop on Health Informatics (HI-KDD), August 2014.
- Maryam Hasan, Elke Rundensteiner, Emmanuel Agu, EMOTEX: Detecting Emotions in Twitter Messages, In Proceedings of the 6th ASE International Conference on Social Computing, SocialCom, May 2014.
- Di Yang, Kaiyu Zhao, Maryam Hasan, Hanyuan Lu, Elke Rundensteiner and Matthew Ward, Mining and Linking Patterns across Live Data Streams and Stream Archives, VLDB, October 2013.
- Yi Shi, Maryam Hasan, Zhipeng Cai, Guohui Lin and Dale Schuurmans, Linear Coherent Bi-clustering via Beam Searching and Sample Set Clustering, International Journal of Discrete Mathematics, Algorithms and Applications (DMAA), December 2011.
- Maryam Hasan, Eleni Stroulia, Denilson Barbosa and Manar Alalfi, Analyzing Natural Language Artifacts of the Software Process, Early Research Achievement track of the 26th IEEE International Conference on Software Maintenance (ICSM'2010), Timisoara, Romania, September 2010.
- Yi Shi, Maryam Hasan, Zhipeng Cai, Guohui Lin and Dale Schuurmans, Linear coherent bi-cluster discovery via beam detection and sample set clustering, International Conference on Combinatorial Optimization and Applications (COCOA 2010), The Big Island, Hawaii, United States, December 2010.
- Developed a deep transfer learning method to learn domain-specific features from context.
- Fine-tuned pre-trained natural language models (e.g., BERT and USE) on the target classification task.
- Developed a baseline neural network model (i.e., Bi-LSTM) and evaluated DeepEmotex models.
- The proposed DeepEmotex-BERT model outperformed the baseline model by 23%.
- Implemented in Python (scikit-learn, NumPy, pandas, TensorFlow, PyTorch).
- Developed a binary classification model to classify text messages into emotion and no-emotion classes.
- Developed and evaluated an online method to measure public emotion and detect temporal changes of emotion in a stream of messages during events.
- Used Hoeffding’s inequality to define an upper bound on the probability that the sum of bounded independent random variables deviates from its expected value by more than a given amount. Implemented in Java.
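
As a rough illustration of the bound mentioned above, here is a minimal Python sketch. It assumes the standard two-sided form of Hoeffding's inequality for the mean of n independent values that each lie in an interval of known width. The function names and the `value_range` parameter are hypothetical and chosen for illustration only; the original Java implementation and its exact use in the emotion stream are not shown here.

```python
import math


def hoeffding_bound(n, eps, value_range=1.0):
    """Upper bound on P(|sample mean - expected mean| >= eps).

    Assumes n independent random variables, each confined to an
    interval of width value_range (illustrative parameter name).
    """
    if n <= 0 or eps <= 0 or value_range <= 0:
        raise ValueError("n, eps and value_range must be positive")
    # Two-sided Hoeffding bound: 2 * exp(-2 * n * eps^2 / R^2), capped at 1.
    return min(1.0, 2.0 * math.exp(-2.0 * n * eps ** 2 / value_range ** 2))


def hoeffding_epsilon(n, delta, value_range=1.0):
    """Deviation eps at which the two-sided bound equals delta."""
    if n <= 0 or value_range <= 0 or not 0 < delta < 1:
        raise ValueError("need n > 0, value_range > 0 and 0 < delta < 1")
    # Solve 2 * exp(-2 * n * eps^2 / R^2) = delta for eps.
    return value_range * math.sqrt(math.log(2.0 / delta) / (2.0 * n))


if __name__ == "__main__":
    # Hypothetical numbers: 1000 messages, scores in [0, 1].
    print(hoeffding_bound(1000, 0.05))
    print(hoeffding_epsilon(1000, 0.05))
```

One possible use, stated as an assumption rather than the method actually used: for a window of n messages with scores in [0, 1], `hoeffding_epsilon(n, delta)` gives a deviation that the window mean exceeds with probability at most delta, which could serve as a threshold when comparing windows.
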
- Collected and processed a large corpus of labeled messages for supervised learning of emotions in text.
- Developed and evaluated machine learning models for classifying text messages, including Support Vector Machines (SVM), Naïve Bayes, and Decision Trees.
- Implemented in Python (scikit-learn, NumPy) and Java.
LinCoh: A Feature Selection approach using Linear Coherent Bi-Clustering via Beam Searching and Sample Set Clustering, 2012
- Developed a method to find linear coherent bi-clusters in gene expression microarray data. The method combines a robust technique for identifying conditionally correlated genes with an efficient density-based search for clustering sample sets. Implemented in MATLAB.
- Designed and developed a tool to extract structured knowledge from textual data in software repositories (using Java, XQuery, DB2, XML, Stanford NLP tools: PoS tagger, Dependency Parser)
- Extracted relations among named entities via Hierarchical Clustering of blogs (using Java, MySQL, Stanford NLP)
- Collaborated on the Annoki project, a social wiki tool for researchers.
- Developed a graphical web interface for Annoki
- Implemented using Adobe Spring-Flex, PHP and MySQL