can't encode character #295

Open
mrx23dot opened this issue May 24, 2022 · 2 comments

@mrx23dot

[898 | 64206.12] loss=0.76 avg=0.84
[899 | 64275.59] loss=0.40 avg=0.84
[900 | 64345.04] loss=0.53 avg=0.83
======== SAMPLE 1 ========

Traceback (most recent call last):
  File "C:\tmp\btc_all\btc_LUT\generators\gen_infintext_gpt2.py", line 27, in <module>
  File "C:\Python37\lib\site-packages\gpt_2_simple\gpt_2.py", line 334, in finetune
    generate_samples()
  File "C:\Python37\lib\site-packages\gpt_2_simple\gpt_2.py", line 309, in generate_samples
    fp.write('\n'.join(all_text))
  File "C:\Python37\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u0e4e' in position 1908: character maps to <undefined>

I could work around it by encoding the text explicitly and ignoring errors, e.g. "test".encode("utf-8", "ignore").
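
For anyone reproducing this, here is a minimal sketch of the failure and two ways around it. It assumes the samples are written to a file opened without an explicit encoding, as the traceback suggests. The helper names `write_text` and `demo` and the sample string are made up for illustration; they are not part of gpt_2_simple.

```python
import os
import tempfile

# Sample text containing U+0E4E, the character named in the traceback.
SAMPLE = "loss sample \u0e4e end"


def write_text(path, text, encoding, errors="strict"):
    """Write text using an explicit encoding instead of the platform default."""
    with open(path, "w", encoding=encoding, errors=errors) as fp:
        fp.write(text)


def demo():
    """Try three ways of writing SAMPLE and report what happened."""
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "samples.txt")

        # cp1252 has no mapping for U+0E4E, so a strict write fails.
        try:
            write_text(path, SAMPLE, "cp1252")
            results["cp1252_strict"] = "ok"
        except UnicodeEncodeError:
            results["cp1252_strict"] = "UnicodeEncodeError"

        # errors="ignore" drops the unencodable character silently.
        write_text(path, SAMPLE, "cp1252", errors="ignore")
        with open(path, encoding="cp1252") as fp:
            results["cp1252_ignore"] = fp.read()

        # UTF-8 can represent the character, so nothing is lost.
        write_text(path, SAMPLE, "utf-8")
        with open(path, encoding="utf-8") as fp:
            results["utf8"] = fp.read()
    return results
```

Passing `encoding="utf-8"` keeps the character, while `errors="ignore"` silently discards it, which may or may not be acceptable for generated samples.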

@kreas

kreas commented Oct 4, 2022

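I fought with this myself. It comes from Python's default text encoding, which on Windows is the system code page (cp1252 in the traceback above). As detailed here, you can fix it by setting the environment variable PYTHONUTF8=1 under System Properties > Advanced > Environment Variables.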
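
To check whether the environment variable took effect, something like the following can be run in the same environment as the training script. This is only a diagnostic sketch: `encoding_report` is a hypothetical helper name, and `sys.flags.utf8_mode` requires Python 3.7 or later.

```python
import locale
import sys


def encoding_report():
    """Collect the settings that decide how open() encodes text by default."""
    return {
        # True when Python runs in UTF-8 mode (e.g. PYTHONUTF8=1 or -X utf8).
        "utf8_mode": bool(sys.flags.utf8_mode),
        # Encoding used by open() when no encoding argument is given.
        "preferred_encoding": locale.getpreferredencoding(False),
        # Encoding used when printing samples to the console.
        "stdout_encoding": getattr(sys.stdout, "encoding", None),
    }
```

Calling `print(encoding_report())` before and after setting the variable should show whether the default changed. A new terminal or IDE session is usually needed for the variable to be picked up.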

@mrx23dot
Author

mrx23dot commented Oct 4, 2022

I think you need to put these two lines at the beginning of the file and save it as UTF-8:

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
