Commit
Merge branch 'ht23' of https://github.com/NBISweden/workshop-python into ht23
Showing 68 changed files with 39,535 additions and 29,182 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -14,3 +14,4 @@ dependencies: | |
- beautifulsoup4=4.10.0 | ||
- fsspec=2021.10.0 | ||
- openpyxl=3.0.9 | ||
- scikit-learn=1.3.0 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,89 @@ | ||
{ | ||
"cells": [ | ||
{ | ||
"cell_type": "markdown", | ||
"id": "c8d1efdc", | ||
"metadata": {}, | ||
"source": [ | ||
"<span style=\"color:green; font-size:30px\">ChatGPT</span><span style=\"color:red; font-size:30px\"> Exercise</span> \n", | ||
"\n", | ||
"<br>\n", | ||
"ChatGPT can be a tremendous help in writing and understanding code. However, it often makes mistakes, so you cannot always trust the result. Even so, it is worth learning to use it as a tool in the work we do.\n", | ||
"\n", | ||
"In this exercise you will be given a piece of code that is much more complex than what you have worked with so far. As there are no comments explaining what the code does, we will use ChatGPT to help us understand and modify it. It often happens that you receive code, or find code online that you want to use, without fully understanding it. This is where ChatGPT comes in handy.\n", | ||
"\n", | ||
"The tasks below vary in difficulty; try a few of them:\n", | ||
"\n", | ||
"1. Paste the code below into ChatGPT and have it explain, line by line, what the code does\n", | ||
"2. Modify the code (on your own, not using ChatGPT) to use 4 clusters instead of 2, and write the results to a file instead of printing them\n", | ||
"3. Use ChatGPT to see if you can generate code that clusters the data into 2 clusters and then further clusters the larger of the two into 3 subclusters\n", | ||
"4. Use ChatGPT to see if you can use t-SNE to plot the results from both task 2 and task 3. Try coloring the points by cluster\n", | ||
"\n", | ||
"If there are parts of ChatGPT's answers that you do not understand, try prompting it for further explanations." | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "4ba19843", | ||
"metadata": {}, | ||
"source": [ | ||
"### The code to understand and modify:" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "50460557", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"import numpy as np\n", | ||
"from sklearn.cluster import KMeans\n", | ||
"\n", | ||
"def one_hot_encode(sequence):\n", | ||
"    encoding = {'A': [1, 0, 0, 0], 'C': [0, 1, 0, 0], 'G': [0, 0, 1, 0], 'T': [0, 0, 0, 1]}\n", | ||
"    return np.array([encoding[nucleotide] for nucleotide in sequence]).flatten()\n", | ||
"\n", | ||
"def generate_random_dna_sequence(length):\n", | ||
"    nucleotides = ['A', 'C', 'G', 'T']\n", | ||
"    return ''.join(np.random.choice(nucleotides, size=length))\n", | ||
"\n", | ||
"def main():\n", | ||
"    sequences = [generate_random_dna_sequence(20) for _ in range(50)]\n", | ||
"\n", | ||
"    encoded_data = [one_hot_encode(sequence) for sequence in sequences]\n", | ||
"\n", | ||
"    kmeans = KMeans(n_clusters=2, random_state=42)\n", | ||
"    clusters = kmeans.fit_predict(encoded_data)\n", | ||
"\n", | ||
"    for i, sequence in enumerate(sequences):\n", | ||
"        cluster_label = clusters[i]\n", | ||
"        print(f\"Sequence: {sequence}, Cluster: {cluster_label}\")\n", | ||
"\n", | ||
"if __name__ == \"__main__\":\n", | ||
"    main()" | ||
] | ||
} | ||
], | ||
"metadata": { | ||
"kernelspec": { | ||
"display_name": "Python 3 (ipykernel)", | ||
"language": "python", | ||
"name": "python3" | ||
}, | ||
"language_info": { | ||
"codemirror_mode": { | ||
"name": "ipython", | ||
"version": 3 | ||
}, | ||
"file_extension": ".py", | ||
"mimetype": "text/x-python", | ||
"name": "python", | ||
"nbconvert_exporter": "python", | ||
"pygments_lexer": "ipython3", | ||
"version": "3.9.4" | ||
} | ||
}, | ||
"nbformat": 4, | ||
"nbformat_minor": 5 | ||
} |
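
As a reading aid for the notebook's `one_hot_encode` function, here is a minimal sketch of the same encoding written with plain Python lists instead of NumPy. The helper name `one_hot_encode_plain` is hypothetical and not part of the commit. It is only meant to show how each nucleotide becomes four numbers, so that a 20-nucleotide sequence becomes an 80-element feature vector before it is passed to `KMeans`.

```python
# Same lookup table as in the notebook's one_hot_encode.
ENCODING = {'A': [1, 0, 0, 0], 'C': [0, 1, 0, 0], 'G': [0, 0, 1, 0], 'T': [0, 0, 0, 1]}

def one_hot_encode_plain(sequence):
    """Hypothetical list-based stand-in for the notebook's one_hot_encode."""
    flat = []
    for nucleotide in sequence:
        # Append the four indicator values for this nucleotide.
        flat.extend(ENCODING[nucleotide])
    return flat

example = one_hot_encode_plain("ACGT")
print(example)  # [1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1]

# A sequence of length 20, as generated in main(), yields 20 * 4 = 80 features.
print(len(one_hot_encode_plain("A" * 20)))  # 80
```

With 50 sequences, the data handed to `fit_predict` in `main()` is therefore 50 rows of 80 values each.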