Emily's internship blog #192

Open: EmilyXinyi wants to merge 6 commits into main


EmilyXinyi

Blog post about Emily's summer internship

cc: @francoisgoupil


welcome bot commented Aug 13, 2024


💖 Thanks for opening this pull request! 💖
The scikit-learn community really appreciates your time and effort to contribute to the project.
Please make sure you have read our Contributing Guidelines and filled in our pull request template to the best of your ability.

@francoisgoupil
Member

@EmilyXinyi Could you maybe add a link to your vlog, or embed it, here? I think it would be a nice addition to this well-written blog post about your internship experience.

@EmilyXinyi
Author

> @EmilyXinyi Could you maybe add a link to your vlog, or embed it, here? I think it would be a nice addition to this well-written blog post about your internship experience.

I embedded the video for now, though I am not sure how good it would look on the blog because it's a portrait (vertical) video. Alternatively I can post it on some social media first and link it here.

Member

@francoisgoupil francoisgoupil left a comment


Some minor edits


### Open Source Development

I started my contributions by adapting certain metrics (Tweedie, mean absolute percentage error, etc.) to be Array API compatible under the guidance of my mentor, Olivier. The Array API standard is a cross-library API for array operations in Python, designed to improve interoperability and consistency across different array libraries. This also means that scikit-learn algorithms written with NumPy for the CPU can run on other hardware (such as GPUs) with PyTorch or CuPy, which can greatly improve performance. As I gained more familiarity with the scikit-learn codebase and the Array API, I began adapting “larger” functions to be Array API compatible: these are a lot more fundamental, have a lot more dependencies, and are a lot more challenging and a lot more fun.
Member


Suggested change
I started my contributions by adapting certain metrics (Tweedie, mean absolute percentage error, etc.) to be Array API compatible under the guidance of my mentor, [Olivier](https://github.com/ogrisel). The Array API standard is a cross-library API for array operations in Python, designed to improve interoperability and consistency across different array libraries. This also means that scikit-learn algorithms written with NumPy for the CPU can run on other hardware (such as GPUs) with PyTorch or CuPy, which can greatly improve performance. As I gained more familiarity with the scikit-learn codebase and the Array API, I began adapting “larger” functions to be Array API compatible: these are a lot more fundamental, have a lot more dependencies, and are a lot more challenging and a lot more fun.
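
For readers who have not met the Array API before, here is a rough sketch of what "Array API compatible" can mean for a metric such as mean absolute percentage error. It is only an illustration, not the code from scikit-learn: `namespace_of` and `mape_sketch` are hypothetical names made up for this example, and the fallback to NumPy for arrays without `__array_namespace__` is an assumption made so the snippet runs on older NumPy versions.

```python
import numpy as np


def namespace_of(x):
    """Hypothetical helper: return the array namespace that ``x`` belongs to.

    Arrays implementing the Array API standard expose ``__array_namespace__``.
    Falling back to NumPy is an assumption made for this sketch.
    """
    if hasattr(x, "__array_namespace__"):
        return x.__array_namespace__()
    return np


def mape_sketch(y_true, y_pred):
    """Illustrative MAPE that only calls functions from the namespace ``xp``.

    No NumPy-specific function is called on the inputs, so the same code
    could in principle run on any array type that follows the standard.
    """
    xp = namespace_of(y_true)
    eps = xp.finfo(y_true.dtype).eps
    abs_true = xp.abs(y_true)
    # Avoid dividing by zero: clip the denominator from below at eps.
    denom = xp.maximum(abs_true, xp.full_like(abs_true, eps))
    return float(xp.mean(xp.abs(y_pred - y_true) / denom))


y_true = np.asarray([1.0, 2.0, 4.0])
y_pred = np.asarray([1.1, 1.8, 4.0])
score = mape_sketch(y_true, y_pred)
```

The point of the pattern is that the function asks the input for its namespace (`xp`) instead of importing one library up front, so the choice of NumPy, PyTorch, or CuPy is left to whoever passes in the arrays.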


### Chinese Community Outreach

China has the second-largest scikit-learn user group. As a community, we believe we can be more inclusive, make it easier for Chinese users to contribute, and do what is necessary to recruit more Chinese contributors. Therefore, I need to find out who is using scikit-learn and where, find out whether development is happening on platforms other than GitHub (which tends to be very slow in China), and establish scikit-learn’s official presence in the Chinese community.
Member


Suggested change
China has the second-largest scikit-learn user group according to documentation web analytics. As a community, we believe we can be more inclusive, make it easier for Chinese users to contribute, and do what is necessary to onboard more Chinese contributors. Therefore, I need to find out who is using scikit-learn and where, find out whether development is happening on platforms other than GitHub (which tends to be very slow in China), and establish scikit-learn’s official presence in the Chinese community.


I also had weekly Peer Programming sessions with Loïc and Stefanie, where my piled-up questions from the week outside of Array API would be answered, and I would almost always learn something new about developer tools or programming fundamentals.

On the Chinese community outreach side, I worked with the scikit-learn communications team throughout. Here I must give a special shoutout to my manager François, who is also part of the communications team, for always being supportive and believing in my outreach efforts, especially because I was nervous about doing this kind of task and about using Chinese in a professional context for the first time. I also got to interact with [Charlie](https://charlie-xiao.github.io/) (yes, the core-dev Charlie), who is located in China and helped me tremendously with tasks that require physical presence.
Member


Suggested change
On the Chinese community outreach side, I worked with the scikit-learn communications team throughout. Here I must give a special shoutout to my manager [François](https://www.linkedin.com/in/françois-goupil/), who is also part of the communications team, for always being supportive and believing in my outreach efforts, especially because I was nervous about doing this kind of task and about using Chinese in a professional context for the first time. I also got to interact with [Charlie](https://charlie-xiao.github.io/) (yes, the core-dev Charlie), who is located in China and helped me tremendously with tasks that require physical presence.


@Charlie-XIAO Charlie-XIAO left a comment


It also seems that assets/videos/emily_blog_vid.MOV can be removed now that we are using the TikTok link?


4 participants