
Follow-up Questions and Eager for Your Insights #1

Open · huskydoge opened this issue Sep 5, 2024 · 2 comments

@huskydoge

Thank you so much for sharing your insights in the blog—it’s been incredibly helpful as I’m just starting out on my research journey. It’s sparked a few thoughts and questions, and I’d really appreciate hearing your opinions on them.

**1. Do Junior Students Really Need a Clear Research Focus When Applying for PhD Programs?**

You talked about how junior students value publishing their first few papers to learn the research process and make progress:

> Junior students learn to place a lot of value in publishing their first couple of papers. And that's reasonable: it's how you learn to conduct research, explore initial topics, and demonstrate early progress.

While I see many peers already making solid contributions to specific areas, I’m still broadly interested in a range of topics. This leaves me feeling a bit uncertain, especially since I’m preparing to apply for PhD programs. I’ve been told my Statement of Purpose should tell a coherent story, but much of my experience comes from wanting to explore different things rather than from diving deep into one particular area. Is that okay for a PhD application? I’d love to hear your advice on how to handle this as a junior student.

**2. How should junior researchers prioritize "timely, large headroom, and fanout" in research? Which can be sacrificed, and which is most essential?**

Another challenge I’ve encountered is spotting and selecting problems. As you mentioned, it’s crucial to focus on problems that are timely, have large headroom, and offer “fanout”. However, with the rapid growth of the AI community, it feels increasingly difficult for individual researchers or small labs to pursue something both unique and valuable. When I presented an idea for 3D positional embeddings for video to my advisor, he pointed out that large AI companies were likely already working on something similar. I believed the idea was timely: long, complex video understanding is still a pressing issue, especially with the rise of embodied AI, where accurately interpreting continuous visual input is key. It had headroom, since the abundant positional-embedding techniques from LLMs could be adapted to VLMs. And if successful, it would have “fanout,” since it could quickly be adopted across various architectures. Yet just a few months later, I saw a published paper that was nearly identical to what I had been working on, which confirmed my advisor’s concerns.
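
For concreteness, the kind of design I had in mind looks roughly like the sketch below: a factorized sinusoidal embedding over (time, height, width), splitting the channel dimension evenly across the three axes. This is only an illustration of the general idea, not the method from the paper I mentioned; the function names and the even channel split are my own assumptions.

```python
import torch

def sinusoidal_1d(positions: torch.Tensor, dim: int) -> torch.Tensor:
    # Standard 1D sinusoidal embedding over one axis.
    freqs = torch.exp(
        -torch.arange(0, dim, 2, dtype=torch.float32)
        * (torch.log(torch.tensor(10000.0)) / dim)
    )
    angles = positions.float().unsqueeze(-1) * freqs         # (N, dim/2)
    return torch.cat([angles.sin(), angles.cos()], dim=-1)  # (N, dim)

def video_pos_embed_3d(t: int, h: int, w: int, dim: int) -> torch.Tensor:
    # Embed each of the (time, height, width) axes independently with
    # dim/3 channels each, then concatenate along the channel dimension.
    assert dim % 3 == 0 and (dim // 3) % 2 == 0, "dim must split evenly across 3 axes"
    d = dim // 3
    grid = torch.stack(
        torch.meshgrid(torch.arange(t), torch.arange(h), torch.arange(w), indexing="ij"),
        dim=-1,
    ).reshape(-1, 3)  # (t*h*w, 3): one (time, row, col) index per video patch
    return torch.cat([sinusoidal_1d(grid[:, i], d) for i in range(3)], dim=-1)

# E.g., an 8-frame clip of 14x14 patches with 384-dim tokens:
pe = video_pos_embed_3d(8, 14, 14, 384)  # shape (1568, 384)
```

A per-axis RoPE variant could be swapped into the same factorized structure; the factorization itself is what I mean by adapting positional-embedding techniques from LLMs to video.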

As a junior student, it’s tough to find high-quality problems that haven’t already been explored by larger research teams. Even if you discover something first, companies can catch up fast. So, some compromise seems necessary, especially for less experienced researchers like me. With that in mind, I’d love to hear your thoughts on prioritizing these three dimensions. If one had to be sacrificed, which would it be? And which do you consider absolutely essential for a great project?

**3. Could raising code quality standards, including reviews, reduce "paper manufacturing" while promoting more meaningful contributions, and is it practical?**

Your point about “investing in projects, not papers” also brought up some reflections from my first research experience. When I started doing literature reviews, I noticed that many papers lacked accessible code or had repositories with minimal code that didn’t align well with the experiments. Later, when I learned about the open review process for AI papers, I was surprised to find that clean, ready-to-use code wasn’t required at submission. I had initially assumed that the code would be submitted and reviewed along with the paper.

Lately, I’ve seen more papers that focus on storytelling rather than on providing practical, reusable code with clear downstream utility and minimal friction. In AI research, I believe papers should be closely tied to their code, with detailed documentation such as a thorough README, so that others can build on the work. And while submission numbers at major AI conferences continue to rise, ML-systems papers, which often require higher coding standards, have been growing at a slower pace; given that, I even think it might be beneficial for conferences to include some level of code review. Do you think raising the bar for code quality could help slow down the rapid pace of “paper manufacturing” and encourage more meaningful, high-quality contributions? What might be the potential downsides, and is this practical in today’s academic environment?


Lastly, I noticed a few minor typos as I carefully read through your blog. I just wanted to mention them in case it’s helpful for future revisions.

> **Milestone 4: Understand that there are categories of users, and leverage that to grow.** When I started both ColBERT and DSPy, the original audience I sought were researchers and expert ML engineers. I learned over time to let go of that and to understand that you can reach much larger audiences, but that they require different things. Before anything else, stop blocking different potential categories of users indirectly or even directly. This is more commone than one may thing. Second, when seeking users, seek a balance between advanced usecases (which may require substantial investment on your part) and public builders, who generally will lead a much larger fraction of any massive growth of funnel/excitement and will teach you a lot about your initial assumptions.
`commone than one may thing` -> `common than one may think`
> Just to illustrate, ColBERT isn't one paper from early 2020. It's probably around ten papers now, with investements into improved training, lower memory footprint, faster retrieval infrastructure, better domain adaptation, and better alignment with downstream NLP tasks. Similarly, DSPy isn't one paper but is a large collection of papers on programming abstractions, prompt optimization, and downstream programs. So many of these papers are written by [different, amazing primary authors](https://github.com/stanfordnlp/dspy?tab=readme-ov-file#dspy-programmingnot-promptingfoundation-models). A good open-source artifact creates modular pieces that can be explored, owned, and grown by new researchers and contributors.
`investements` -> `investments`

Thanks again :) for all the valuable insights you’ve shared, and I truly appreciate DSPy. I find it more powerful and elegant than other popular frameworks, and it’s a really great example of how to apply tips 4-6 when building a project code repository. I feel fortunate to be part of your receptive audience, and I’ve been actively sharing your blog and DSPy with my friends and classmates.

Please forgive me if I made any incorrect statements in this post.

@okhat (Owner)

okhat commented Sep 5, 2024

Hey @huskydoge , thanks for the great questions and the feedback. I just fixed the typos you listed!

A lot of my feedback is useful for pre-PhD students, but just to be clear it's primarily directed at PhD students who have 3-5 years ahead of them to invest during and even after their PhD. If you're still an undergraduate student, for example, it's unusual to expect you to have the level of impact my post discusses.

Your other questions are great. I'll have to defer my thoughts to another time.

@tongyx361

  1. I've wondered why we should always "think two steps ahead." Seeing @huskydoge's issue made me realize it might be a good way to avoid experiences like "large AI companies were likely already working on something similar" in question 2.
  2. I am also wondering how to become well qualified for a PhD program. As @okhat said, it might be unusual for pre-PhD students to pursue the level of impact your post discusses, so I'd like to follow up: what do you think a pre-PhD student should pursue to be well qualified for a PhD program? I would appreciate any advice.
