Dynamically set max_new_tokens based on output feature length, GMSL and model window size
#11294
| Job | Run time |
|---|---|
| | 1s |
| | 2s |
| | 2m 48s |
| | 2m 48s |
| | 2m 47s |
| | 2m 48s |
| | 1s |
| | 2m 48s |
| | 1s |
| | 1s |
| | 2m 38s |
| | 1s |
| | 1s |
| | 1s |
| | 1s |
| Total | 16m 47s |