Improve the observability and readability of BootstrapFewShot optimizer #1574

Open: chenmoneygithub wants to merge 3 commits into main

chenmoneygithub (Collaborator)

The PR has two parts:

  1. A better progress bar. The current one is confusing because it never finishes. Instead of setting its length to the training set size, it should reflect the actual number of bootstrapping steps. See below for the logging change made by this PR:

Before the PR:

 40%|█████████████████████████████████████████████▏                                                                   | 4/10 [00:00<00:00, 800.94it/s]
  0%|                                                                                                                          | 0/10 [00:00<?, ?it/s]
  0%|                                                                                                                          | 0/10 [00:00<?, ?it/s]
  0%|                                                                                                                          | 0/10 [00:00<?, ?it/s]
  0%|                                                                                                                          | 0/10 [00:00<?, ?it/s]
Bootstrapped 4 full traces after 1 examples in round 4.

After the PR:

Bootstrapping 4 examples: 100%|████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 721.88it/s]
2024-10-02T00:33:26.452003Z [info     ] Bootstrapped 4 full traces after 5 examples in round 1. [dspy.teleprompt.bootstrap] filename=bootstrap.py lineno=155
  2. Improved readability: renaming vague variables and methods, adding comments where the code is not self-explanatory, and deleting unused code.
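
The first change can be sketched roughly as follows: the bar's total is the number of examples to bootstrap rather than the training set size, and it advances only when a bootstrap succeeds. This is a minimal sketch, not the PR's code; `bootstrap_until_full`, `try_bootstrap`, and `on_progress` are hypothetical names. With tqdm, `on_progress` would roughly correspond to calling `update(1)` on a bar created with `total=max_bootstraps`.

```python
from typing import Callable, Iterable, List, Tuple


def bootstrap_until_full(
    trainset: Iterable[int],
    max_bootstraps: int,
    try_bootstrap: Callable[[int], bool],
    on_progress: Callable[[int, int], None],
) -> int:
    """Return how many examples were attempted before the target was reached."""
    done = 0
    attempted = 0
    for example in trainset:
        if done >= max_bootstraps:
            break
        attempted += 1
        if try_bootstrap(example):
            done += 1
            # Progress is reported per success, so the bar can reach 100%.
            on_progress(done, max_bootstraps)
    return attempted


events: List[Tuple[int, int]] = []
attempted = bootstrap_until_full(
    trainset=range(10),
    max_bootstraps=4,
    try_bootstrap=lambda x: x != 0,  # hypothetical: only the first example fails
    on_progress=lambda done, total: events.append((done, total)),
)
print(attempted, events)  # 5 [(1, 4), (2, 4), (3, 4), (4, 4)]
```

Note that a failed attempt produces no progress event here, which is exactly the trade-off raised later in the review.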


if success:
    bootstrapped[example_idx] = True
if success:
Collaborator review comment:

Did we remove the check below?

if example_idx not in bootstrapped:

Collaborator Author reply:

Yeah, my bad. I thought this condition was always true, but in fact the same example can go through bootstrapping more than once.
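
To make the point concrete, here is a minimal sketch of why the guard matters, assuming a loop that revisits the whole training set once per round. `run_rounds` and `try_bootstrap` are hypothetical stand-ins, not the actual code in bootstrap.py.

```python
from typing import Callable, Dict, List


def run_rounds(
    trainset: List[str],
    max_rounds: int,
    try_bootstrap: Callable[[str, int], bool],
    skip_bootstrapped: bool,
) -> List[int]:
    """Return the example indices passed to try_bootstrap, in call order."""
    bootstrapped: Dict[int, bool] = {}
    attempted: List[int] = []
    for round_idx in range(max_rounds):
        for example_idx, example in enumerate(trainset):
            # The guard under discussion: skip examples that already succeeded.
            if skip_bootstrapped and example_idx in bootstrapped:
                continue
            attempted.append(example_idx)
            if try_bootstrap(example, round_idx):
                bootstrapped[example_idx] = True
    return attempted


def only_a(example: str, round_idx: int) -> bool:
    # Hypothetical outcome: only example "a" ever bootstraps successfully.
    return example == "a"


with_guard = run_rounds(["a", "b"], 2, only_a, skip_bootstrapped=True)
without_guard = run_rounds(["a", "b"], 2, only_a, skip_bootstrapped=False)
print(with_guard)     # [0, 1, 1]
print(without_guard)  # [0, 1, 0, 1]
```

Without the guard, example 0 is bootstrapped again in the second round even though it already succeeded in the first.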

okhat (Collaborator) commented Oct 7, 2024

Thanks @chenmoneygithub! Overall, I like the goals of this PR, but I don't think we should merge it yet.

I think some functionality may have changed accidentally(?), see above. We plan to redesign this optimizer so that it runs in parallel, among other things, so maybe we should jump straight to that.

On logging, I'm not sure I like the proposed approach. It makes it really hard to see whether the bootstrapping keeps failing or how much progress it's making. Keep in mind that in the general case, bootstrapping can fail hundreds of times and may never succeed. With the proposal here, the user would see zero progress in that case.

Maybe the better thing to do is to just bump the progress bar to full after bootstrapping exits :D
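
A rough sketch of that alternative, assuming a bar object that exposes `n`, `total`, and `update()` the way a tqdm bar does. `StubBar` is a hypothetical stand-in so the example runs without tqdm, and `bootstrap_with_bar` and `try_bootstrap` are illustrative names rather than anything in the codebase.

```python
from typing import Callable, Iterable


class StubBar:
    """Minimal stand-in mirroring the tqdm attributes used below."""

    def __init__(self, total: int) -> None:
        self.total = total
        self.n = 0

    def update(self, k: int = 1) -> None:
        self.n += k


def bootstrap_with_bar(
    trainset: Iterable[int],
    max_bootstraps: int,
    try_bootstrap: Callable[[int], bool],
    bar: StubBar,
) -> int:
    """Advance the bar once per attempt, then fill it when bootstrapping exits."""
    successes = 0
    for example in trainset:
        if successes >= max_bootstraps:
            break
        if try_bootstrap(example):
            successes += 1
        bar.update(1)  # every attempt is visible, including failures
    # Bump the bar to full so it does not look stuck at, e.g., 4/10.
    bar.update(bar.total - bar.n)
    return successes


bar = StubBar(total=10)
successes = bootstrap_with_bar(range(10), 4, lambda x: True, bar)
print(successes, bar.n, bar.total)  # 4 10 10
```

Because the bar still advances on failed attempts, repeated failures remain visible, while the final bump avoids a bar that never completes.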

But as I said, this all will change with parallelism. We should just jump right into that.

chenmoneygithub (Collaborator Author) commented Oct 7, 2024

@okhat Thanks for reviewing! Yeah, this progress bar will no longer be valid once we enable parallelism.

> It makes it really hard to see if the bootstrapping keeps failing or how much progress it's making.

Good point! That's actually an area I am uncertain about. I doubt our users can understand what "bootstrap failure" means: I couldn't understand what bootstrapping means until I read through the source code, which our users are unlikely to do. For us, seeing bootstrapping failures helps with debugging, but users may find that logging to be pure noise and have no idea what it means. However, as you mentioned, if bootstrapping keeps failing, users will see zero progress, which is bad as well.

Agreed, we should not merge this PR! I will open a separate one that contains only the trivial readability fixes from this PR.
