bj600800/EOCA

Enzyme-Optimal-Condition-Analysis

We investigate the optimal condition of an enzyme by directly analyzing the sequence. We propose an embedding method to represent the amino acids and the construct information as vectors in the latent space.

File descriptions:

1.change_to_csv.java: Convert .fas data to .csv data

2.change_to_number.java: Convert the string data in the .csv file to real numbers or one-hot vectors

3.Split_train_test.java: Split the data proportionally into a training set and a test set

4.create_probability.java: Generate the sampling ratio of positive and negative samples

5.create_samples.java: Generate positive and negative samples according to the sampling ratio

6.protein_learning.java: Learn the embeddings

7.check_matrix.java: Evaluate the accuracy of the model
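
To make the first three preprocessing steps more concrete, here is a minimal Java sketch. It is not the repository's code. It assumes that the .fas files are ordinary FASTA text (a `>` header line followed by one or more sequence lines), that the CSV has two columns named `header` and `sequence`, and that one-hot encoding is done over the 20 standard amino acids. The class name `PipelineSketch` and the methods `parseFasta`, `toCsv`, `oneHot` and `split` are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Illustrative sketch only: all names and formats here are assumptions,
// not the implementation used in this repository.
class PipelineSketch {
    // Assumed alphabet: the 20 standard amino acids in one-letter codes.
    static final String AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY";

    // Step 1a (assumed FASTA layout): parse text into {header, sequence} pairs.
    static List<String[]> parseFasta(String text) {
        List<String[]> records = new ArrayList<>();
        String header = null;
        StringBuilder seq = new StringBuilder();
        for (String raw : text.split("\\R")) {
            String line = raw.trim();
            if (line.isEmpty()) continue;
            if (line.startsWith(">")) {
                // A new header closes the previous record, if any.
                if (header != null) records.add(new String[] {header, seq.toString()});
                header = line.substring(1);
                seq.setLength(0);
            } else {
                seq.append(line); // sequences may span several lines
            }
        }
        if (header != null) records.add(new String[] {header, seq.toString()});
        return records;
    }

    // Quote a CSV field and double any embedded quotes.
    static String csvField(String s) {
        return "\"" + s.replace("\"", "\"\"") + "\"";
    }

    // Step 1b: write the records as CSV text with an assumed two-column header.
    static String toCsv(List<String[]> records) {
        StringBuilder out = new StringBuilder("header,sequence\n");
        for (String[] r : records) {
            out.append(csvField(r[0])).append(',').append(csvField(r[1])).append('\n');
        }
        return out.toString();
    }

    // Step 2: one-hot encode a residue; unknown residues map to the all-zero vector.
    static int[] oneHot(char residue) {
        int[] v = new int[AMINO_ACIDS.length()];
        int idx = AMINO_ACIDS.indexOf(Character.toUpperCase(residue));
        if (idx >= 0) v[idx] = 1;
        return v;
    }

    // Step 3: shuffle with a fixed seed, then cut into [train, test] by proportion.
    static <T> List<List<T>> split(List<T> items, double trainFraction, long seed) {
        if (trainFraction < 0.0 || trainFraction > 1.0) {
            throw new IllegalArgumentException("trainFraction must be in [0, 1]");
        }
        List<T> shuffled = new ArrayList<>(items);
        Collections.shuffle(shuffled, new Random(seed));
        int cut = (int) Math.round(shuffled.size() * trainFraction);
        List<List<T>> parts = new ArrayList<>();
        parts.add(new ArrayList<>(shuffled.subList(0, cut)));
        parts.add(new ArrayList<>(shuffled.subList(cut, shuffled.size())));
        return parts;
    }

    public static void main(String[] args) {
        String fasta = ">enzyme_1\nACDE\nFG\n>enzyme_2\nHIKL\n"; // made-up input
        List<String[]> records = parseFasta(fasta);
        System.out.print(toCsv(records));
        System.out.println(Arrays.toString(oneHot('C')));
        List<List<String[]>> parts = split(records, 0.5, 42L);
        System.out.println(parts.get(0).size() + " train, " + parts.get(1).size() + " test");
    }
}
```

The fixed seed in `split` is a design choice for reproducibility, so that the same records land in the training and test sets on every run. The actual split ratio and sampling scheme used by the repository are not stated here and may differ.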

Citation

Li, X., Dou, Z., Sun, Y. et al. A sequence embedding method for enzyme optimal condition analysis. BMC Bioinformatics 21, 512 (2020). https://doi.org/10.1186/s12859-020-03851-5
