generalCHAT

Manually collected datasets for training language models.

Whenever I finish a new version, I'll post it here.

The license is specified at the beginning of the datasets, but I don't know how GitHub works.