[OSS101] Task 8: Classification research: the role of developers in the GitHub open source community #64

Open
huangfan0 opened this issue May 14, 2024 · 2 comments

Comments

@huangfan0

huangfan0 commented May 14, 2024

The aim of this task is to classify the roles of developers in the GitHub open source community. Based on a developer's behavior and influence in a project, the role can be roughly divided into four categories: observer, contributor, maintainer, and leader. You need to construct a dataset and build a classification model that assigns each developer a role. You need to explain how the dataset was constructed and why, and the classification algorithm must be compared with other algorithms. Ideally, you should also analyze in depth the behavior patterns revealed by the classification results, so that we can better understand the collaboration mechanisms and the open source ecosystem.

The relevant code and dataset for this task need to be provided in the repository.
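
As a rough illustration of the "compare with other algorithms" requirement, the sketch below scores a nearest-centroid classifier against a majority-class baseline on held-out data. Everything in it is an assumption made for illustration: the two features (commit count, review count), the tiny synthetic dataset, and the choice of these two models. It is not the method the task prescribes.

```python
from collections import Counter

def majority_baseline(train_y):
    """Baseline: always predict the most frequent training label."""
    most_common = Counter(train_y).most_common(1)[0][0]
    return lambda x: most_common

def nearest_centroid(train_x, train_y):
    """Predict the role whose mean feature vector is closest to x."""
    groups = {}
    for x, y in zip(train_x, train_y):
        groups.setdefault(y, []).append(x)
    centroids = {y: [sum(col) / len(xs) for col in zip(*xs)]
                 for y, xs in groups.items()}

    def predict(x):
        return min(centroids,
                   key=lambda y: sum((a - b) ** 2 for a, b in zip(x, centroids[y])))
    return predict

def accuracy(predict, xs, ys):
    """Fraction of held-out samples predicted correctly."""
    return sum(predict(x) == y for x, y in zip(xs, ys)) / len(ys)

# Synthetic toy data (NOT real GitHub data): [commits, reviews] per developer.
train_x = [[0, 0], [1, 0], [10, 2], [12, 3], [50, 40], [55, 45], [200, 150], [210, 160]]
train_y = ["observer", "observer", "contributor", "contributor",
           "maintainer", "maintainer", "leader", "leader"]
test_x = [[0, 1], [9, 2], [48, 38], [190, 140]]
test_y = ["observer", "contributor", "maintainer", "leader"]

baseline_acc = accuracy(majority_baseline(train_y), test_x, test_y)
centroid_acc = accuracy(nearest_centroid(train_x, train_y), test_x, test_y)
print(f"baseline={baseline_acc:.2f} nearest_centroid={centroid_acc:.2f}")
```

On real data the same `accuracy` helper could be applied to any number of candidate models, as long as all of them are evaluated on the same held-out split.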

@huangfan0 changed the title from "[OSS101] Task 8: Classification reasearch : the role of Github in open source community" to "[OSS101] Task 8: Classification reasearch : the role of developer in Github open source community" on May 14, 2024
@vitaminzl

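Is this task meant to train a supervised or an unsupervised learning model? And do we need to collect the dataset ourselves?
If it is supervised, do we have to assign the observer, contributor, maintainer, and leader labels ourselves? GitHub does not seem to provide ready-made labels.
If it is unsupervised, should we use an unsupervised method to split the developers into 4 groups and then, through data analysis, map those groups to the observer, contributor, maintainer, and leader labels?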
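
The unsupervised route described in the last question could look roughly like the sketch below: a plain k-means clustering into 4 groups, followed by a mapping from clusters to role names. The mapping rule (rank clusters by total activity of their centroid) is an assumption made for illustration, as are the three features and the synthetic numbers. Real projects may not order the roles this cleanly.

```python
ROLES = ["observer", "contributor", "maintainer", "leader"]

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=100):
    """Plain Lloyd's k-means with a deterministic start (requires k >= 2).

    Initial centroids are evenly spaced picks from the points sorted by
    total activity, so repeated runs give the same result.
    """
    ordered = sorted(points, key=sum)
    centroids = [list(ordered[i * (len(ordered) - 1) // (k - 1)]) for i in range(k)]
    assign = None
    for _ in range(iters):
        new_assign = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                      for p in points]
        if new_assign == assign:  # converged
            break
        assign = new_assign
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assign

def clusters_to_roles(centroids):
    """ASSUMPTION: more total activity means a more senior role."""
    order = sorted(range(len(centroids)), key=lambda c: sum(centroids[c]))
    return {cluster: ROLES[rank] for rank, cluster in enumerate(order)}

# Synthetic toy data (NOT real GitHub data): [commits, reviews, issue comments].
features = [
    [0, 0, 1], [0, 0, 2], [0, 1, 1],
    [10, 2, 5], [12, 3, 6], [9, 2, 4],
    [50, 40, 30], [55, 45, 28], [48, 38, 33],
    [200, 150, 100], [210, 160, 90], [190, 140, 110],
]
centroids, assign = kmeans(features, k=4)
role_of = clusters_to_roles(centroids)
roles = [role_of[a] for a in assign]
print(roles)
```

Real activity counts are usually heavily skewed, so scaling or log-transforming the features before clustering is worth considering.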

@huangfan0
Author

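Either supervised or unsupervised training is fine. You collect the dataset yourself, for example through the REST API or GraphQL.
A supervised approach requires labels, assigned manually or derived from rules that you define.
For an unsupervised approach, it is best to report some evaluation metrics that show how good the final result is.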
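
Two of the points above can be sketched in a few lines: deriving labels from hand-written rules, and scoring a clustering with an evaluation metric. The rule set, the field names (`commits`, `prs_merged_for_others`, `releases_published`), and the choice of the silhouette coefficient as the metric are all assumptions made for illustration. The thread does not prescribe any of them.

```python
import math

def label_by_rules(stats):
    """Derive a role label from activity counts using HYPOTHETICAL rules."""
    if stats["releases_published"] > 0:
        return "leader"
    if stats["prs_merged_for_others"] > 0:
        return "maintainer"
    if stats["commits"] > 0:
        return "contributor"
    return "observer"

def silhouette(points, labels):
    """Mean silhouette coefficient in [-1, 1]; higher means tighter,
    better-separated clusters. Needs at least two clusters."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    total = 0.0
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            continue  # a singleton cluster contributes a score of 0
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: mean distance to the members of the nearest other cluster
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != l)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)

# Synthetic toy points (NOT real GitHub data).
points = [[0, 0], [0, 1], [10, 10], [10, 11]]
good = silhouette(points, [0, 0, 1, 1])  # labels follow the two visible groups
bad = silhouette(points, [0, 1, 0, 1])   # labels cut across the groups
print(label_by_rules({"commits": 3, "prs_merged_for_others": 0,
                      "releases_published": 0}))
print(good > bad)
```

Rule-derived labels are only as good as the rules, so a model trained on them mostly learns to reproduce the rules. Checking a manually labeled sample would help to validate them.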
