Error when running "python train.py" #20
Comments
Hi Minhbk, I'm getting the same error, and it's just as the message says: you're running into encoding problems. The byte 0xad (apparently a soft hyphen in a single-byte encoding such as Latin-1) is not a valid start byte in UTF-8, so the file is not valid UTF-8. Try opening it with a different encoding if you can; otherwise modify the input text to use standard hyphens.
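A minimal sketch of the "different encoding" suggestion, assuming the data file was saved in a single-byte Windows encoding such as cp1252 (an assumption; the thread never confirms the file's actual encoding). The sample bytes are made up for illustration.

```python
# Hypothetical sample: a lone 0xAD byte (soft hyphen) and a 0x97 byte
# (long dash), as a cp1252 editor would save them.
raw = b"co\xadoperate \x97 really"

# Strict UTF-8 decoding fails: 0xAD can only appear as a continuation
# byte in UTF-8, never as the first byte of a character.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError as exc:
    utf8_ok = False
    failing_byte = raw[exc.start]  # the byte the decoder choked on

# Decoding with the assumed single-byte encoding succeeds.
text = raw.decode("cp1252")
```

When reading a file directly, the equivalent would be `open(path, encoding="cp1252")`. The traceback in this issue shows the project reading through TensorFlow's gfile wrapper, so the change would have to be made wherever data_utils.py opens the file.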
I ran this code with Python 2.7, and it works! I don't understand why it works :D
Solution:
Enjoy, and be ready for a loooooong training time =) The problem is that the dataset contains non-ASCII characters (about 3k occurrences), such as 0xAD (a short hyphen) and 0x97 (a long dash). Works on Win 10 with GPU TensorFlow and Python 3.5.
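To check a claim like "about 3k occurrences" on your own copy of the data, something along these lines could be used. This is only a sketch: the function name is hypothetical, and it counts raw bytes rather than decoded characters so that it works regardless of the file's encoding.

```python
from collections import Counter

def count_non_ascii(data: bytes) -> Counter:
    """Count each byte value >= 0x80 (outside 7-bit ASCII) in data."""
    return Counter(b for b in data if b >= 0x80)

# Made-up sample; for a real file, read it with open(path, "rb").read().
counts = count_non_ascii(b"a\xadb\xad \x97 c")
for byte_value, n in counts.most_common():
    print(hex(byte_value), n)
```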
@FrayaMiner Thanks for your response. You are using the same OS configuration that I am.
(tensorflow-gpu) C:\Users\user1>python C:\Users\user1\Downloads\tf_seq2seq_chatbot\tf_seq2seq_chatbot\train.py
@dsblr
@FrayaMiner Thank you for your response. I am now getting the following error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xad in position 2329: invalid start byte
@dsblr You can try to search and clean the file manually, but I prefer cleaning with a regex. Alternatively, you can try Encoding -> Convert to ANSI in your editor and then save. The byte 0xad at position 2329 is a non-ASCII "soft hyphen".
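The regex cleanup mentioned above might look like the following sketch. It is not the commenter's actual script: the function names are hypothetical, and it works on raw bytes so that it does not depend on the file's encoding. Note that it drops every non-ASCII byte, which also removes any legitimate accented letters.

```python
import re

# Matches any single byte outside the 7-bit ASCII range (0x80-0xFF).
NON_ASCII = re.compile(rb"[\x80-\xff]")

def strip_non_ascii(data: bytes) -> bytes:
    """Return data with every non-ASCII byte removed."""
    return NON_ASCII.sub(b"", data)

def clean_file(src_path: str, dst_path: str) -> int:
    """Write a cleaned copy of src_path to dst_path; return bytes removed."""
    with open(src_path, "rb") as src:
        data = src.read()
    cleaned = strip_non_ascii(data)
    with open(dst_path, "wb") as dst:
        dst.write(cleaned)
    return len(data) - len(cleaned)
```

Because a manual search can easily miss a few of the roughly 3k occurrences, running something like `clean_file("chat.in", "chat_clean.in")` (file names assumed here) over the whole data file is less error-prone than editing by hand.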
@FrayaMiner Thanks once again for your response. I tried to remove the soft hyphens manually, but the error still persists. I believe I might have missed a few of them. Thanks
@FrayaMiner Is there a way to solve the following error? I am getting it while running test.py:
"C:\Work\tf_seq2seq_chatbot\tf_seq2seq_chatbot\lib\seq2seq_model_utils.py", line 43, in get_predicted_sentence
The above error was solved after importing xrange into seq2seq_model_utils.py. But while executing chat.py, I am getting another error:
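For readers hitting the same problem: the error text itself is missing from this thread, but assuming it was a NameError for xrange, the cause is that xrange exists only on Python 2. On Python 3, range is already lazy. One possible compatibility shim, not necessarily the import the poster used, is:

```python
# xrange exists only on Python 2; on Python 3, fall back to range,
# which is already a lazy sequence.
try:
    xrange
except NameError:
    xrange = range

total = sum(i for i in xrange(5))
```

If the six package is installed, `from six.moves import xrange` achieves the same thing.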
Guys, Traceback (most recent call last):
Have you solved the problem?
Can anyone solve the errors below? Traceback (most recent call last):
File "train.py", line 184, in C:\TensorFlow\models\research\object_detection>
Hello, I want to ask if you have solved the problem.
I am very worried. Did you solve the label_map problem?
Preparing dialog data in /var/lib/tf_seq2seq_chatbot/data
Creating vocabulary /var/lib/tf_seq2seq_chatbot/data/vocab20000.in from data /var/lib/tf_seq2seq_chatbot/data/chat.in
Traceback (most recent call last):
  File "train.py", line 15, in <module>
    tf.app.run()
  File "/home/minh/opt/anaconda3/envs/ml/lib/python3.5/site-packages/tensorflow/python/platform/app.py", line 30, in run
    sys.exit(main(sys.argv))
  File "train.py", line 12, in main
    train()
  File "/home/minh/source/machine-learing/chatbot/tf_seq2seq_chatbot/tf_seq2seq_chatbot/lib/train.py", line 22, in train
    train_data, dev_data, _ = data_utils.prepare_dialog_data(FLAGS.data_dir, FLAGS.vocab_size)
  File "/home/minh/source/machine-learing/chatbot/tf_seq2seq_chatbot/tf_seq2seq_chatbot/lib/data_utils.py", line 200, in prepare_dialog_data
    create_vocabulary(vocab_path, train_path + ".in", vocabulary_size)
  File "/home/minh/source/machine-learing/chatbot/tf_seq2seq_chatbot/tf_seq2seq_chatbot/lib/data_utils.py", line 70, in create_vocabulary
    for line in f:
  File "/home/minh/opt/anaconda3/envs/ml/lib/python3.5/site-packages/tensorflow/python/platform/gfile.py", line 176, in next
    return next(self._fp)
  File "/home/minh/opt/anaconda3/envs/ml/lib/python3.5/codecs.py", line 321, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xad in position 2329: invalid start byte