
[BUG]: Wrong behavior on InferenceParams.AntiPrompts #1056

Open
SlimeNull opened this issue Jan 24, 2025 · 0 comments
Description

Given the content of my messages, it is impossible for the model to actually reply with the anti-prompt text "Never gonna give you up", yet every time generation hits an EOS token, that anti-prompt string is appended to the end of the response.

[Screenshot: the anti-prompt text appears at the end of the model's response]

Reproduction Steps

Use this code

using LLama;
using LLama.Common;
using LLama.Sampling;
using LLama.Transformers;
using System.Text;

namespace LlamaTest
{

    internal class Program
    {
        static async Task Main(string[] args)
        {
            string modelPath = @"C:\Users\Xavier\.ollama\models\blobs\sha256-60cfdbde0472c3b850493551288a152f0858a0d1974964d6925c2b908035db76";
            var parameters = new ModelParams(modelPath)
            {
                ContextSize = 1024, // Maximum context length (the chat memory window).
                GpuLayerCount = 5,
            };

            using var model = LLamaWeights.LoadFromFile(parameters);
            using var context = model.CreateContext(parameters);

            var executor = new InteractiveExecutor(context);

            var chatHistory = new ChatHistory();

            ChatSession session = new ChatSession(executor, chatHistory);
            session.WithHistoryTransform(new PromptTemplateTransformer(model, true));

            InferenceParams inferenceParams = new InferenceParams()
            {
                //MaxTokens = 256, // No more than 256 tokens should appear in answer. Remove it if antiprompt is enough for control.
                AntiPrompts = new List<string> { "Never gonna give you up" }, // Stop generation once antiprompts appear.
                SamplingPipeline = new DefaultSamplingPipeline(),
            };

            while (true)
            {
                var input = Console.ReadLine();
                var message = new ChatHistory.Message(AuthorRole.User, input);

                await foreach (var text in session.ChatAsync(message, true, inferenceParams))
                {
                    Console.Write(text);
                }
            }
        }
    }
}

And here is the model: https://ollama.org.cn/library/deepseek-llm
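
For illustration only (not part of the original report): a minimal caller-side sketch, assuming the symptom is that the matched anti-prompt string is left at the end of the reply whenever generation stops at an EOS token. The TrimTrailingAntiPrompt helper below is hypothetical and not a LLamaSharp API; it reuses session, message, and inferenceParams from the reproduction code above and simply strips a trailing anti-prompt from the accumulated text.

    // Hypothetical helper (not part of LLamaSharp): removes a trailing anti-prompt
    // from the accumulated reply, if one is present.
    static string TrimTrailingAntiPrompt(string reply, IEnumerable<string> antiPrompts)
    {
        foreach (var antiPrompt in antiPrompts)
        {
            if (reply.EndsWith(antiPrompt, StringComparison.Ordinal))
            {
                // Drop the anti-prompt and any whitespace left before it.
                return reply[..^antiPrompt.Length].TrimEnd();
            }
        }
        return reply;
    }

    // Usage inside the chat loop: buffer the streamed chunks, then trim once the stream completes.
    var sb = new StringBuilder();
    await foreach (var text in session.ChatAsync(message, true, inferenceParams))
    {
        sb.Append(text);
    }
    Console.WriteLine(TrimTrailingAntiPrompt(sb.ToString(), inferenceParams.AntiPrompts));

This only hides the symptom in the console output; it does not address why the anti-prompt text is emitted in the first place.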

Environment & Configuration

  • Operating system: Windows 11 23H2 (22631.4751)
  • .NET runtime version: .NET 8
  • LLamaSharp version: 0.20.0
  • CUDA version (if you are using the CUDA backend): not using a GPU
  • CPU & GPU device: 12th Gen Intel(R) Core(TM) i7-12700H, NVIDIA GeForce RTX 3060 Laptop GPU

Known Workarounds

No response
