Merge branch 'ht23' of https://github.com/NBISweden/workshop-python i…
…nto ht23
richelbilderbeek committed Oct 13, 2023
2 parents 8af243a + 9813567 commit cd66a25
Showing 68 changed files with 39,535 additions and 29,182 deletions.
1 change: 1 addition & 0 deletions environment.yml
@@ -14,3 +14,4 @@ dependencies:
- beautifulsoup4=4.10.0
- fsspec=2021.10.0
- openpyxl=3.0.9
+ - scikit-learn=1.3.0
89 changes: 89 additions & 0 deletions exercises/day2/Day_2_Exercise_ChatGPT.ipynb
@@ -0,0 +1,89 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "c8d1efdc",
"metadata": {},
"source": [
"<span style=\"color:green; font-size:30px\">ChatGPT</span><span style=\"color:red; font-size:30px\"> Exercise</span> \n",
"\n",
"<br>\n",
"ChatGPT can be a tremendous help in writing and understanding code. But, ChatGPT often makes mistakes, and you cannot always trust the result. However, we should try to use it as a tool in the work we do.\n",
"\n",
"In this exercise you will be given a piece of code that is much more complex than what you have worked with so far. As there are no comments explaining what this code does, we will use ChatGPT to help us understand what it does, and modify the code. It often happens that you receive code, or find some code online that you would want to use, but you don't understand it fully. Here, ChatGPT comes in handy.\n",
"\n",
"This exercise has some different levels of difficulty, try around with a few of the tasks below:\n",
"\n",
"1. Input the code below to ChatGPT, and have it explain, line by line, what the code does\n",
"2. Modify the code (on your own, not using ChatGPT) to use 4 clusters instead of 2, and write the results to a file instead of printing it\n",
"3. Use ChatGPT to see if you can generate code that clusters the data into 2 clusters, of which you further cluster the biggest of those clusters into 3 subclusters\n",
"4. Use ChatGPT to see if you can use t-sne to plot the results of the results from both question 2 and question 3. Try using the cluster groups as colors\n",
"\n",
"And if there are parts ChatGPT says that you do not understand, try prompting it for further explanations."
]
},
{
"cell_type": "markdown",
"id": "4ba19843",
"metadata": {},
"source": [
"### The code to understand and modify:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "50460557",
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"from sklearn.cluster import KMeans\n",
"\n",
"def one_hot_encode(sequence):\n",
" encoding = {'A': [1, 0, 0, 0], 'C': [0, 1, 0, 0], 'G': [0, 0, 1, 0], 'T': [0, 0, 0, 1]}\n",
" return np.array([encoding[nucleotide] for nucleotide in sequence]).flatten()\n",
"\n",
"def generate_random_dna_sequence(length):\n",
" nucleotides = ['A', 'C', 'G', 'T']\n",
" return ''.join(np.random.choice(nucleotides, size=length))\n",
"\n",
"def main():\n",
" sequences = [generate_random_dna_sequence(20) for _ in range(50)]\n",
"\n",
" encoded_data = [one_hot_encode(sequence) for sequence in sequences]\n",
"\n",
" kmeans = KMeans(n_clusters=4, random_state=42)\n",
" clusters = kmeans.fit_predict(encoded_data)\n",
"\n",
" for i, sequence in enumerate(sequences):\n",
" cluster_label = clusters[i]\n",
" print(f\"Sequence: {sequence}, Cluster: {cluster_label}\")\n",
"\n",
"if __name__ == \"__main__\":\n",
" main()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.4"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
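As a reading aid for the notebook added above, here is a minimal sketch of what `one_hot_encode` computes. It uses plain lists instead of NumPy so it runs without the workshop environment. The name `one_hot_encode_plain` is hypothetical and not part of this commit; only the encoding table is taken from the notebook.

```python
# Encoding table copied from the notebook cell above.
ENCODING = {
    "A": [1, 0, 0, 0],
    "C": [0, 1, 0, 0],
    "G": [0, 0, 1, 0],
    "T": [0, 0, 0, 1],
}


def one_hot_encode_plain(sequence):
    """Return the flattened one-hot encoding of a DNA string as a plain list.

    Hypothetical helper: mirrors the notebook's one_hot_encode, minus NumPy.
    """
    flat = []
    for nucleotide in sequence:
        # Each base contributes four numbers; exactly one of them is 1.
        flat.extend(ENCODING[nucleotide])
    return flat


encoded = one_hot_encode_plain("ACGT")
print(encoded)       # four numbers per base, concatenated in sequence order
print(len(encoded))  # length is 4 * len(sequence)
```

A 20-base sequence therefore becomes a vector of 80 numbers, which is the shape of each row that `KMeans.fit_predict` receives in the notebook.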
18 changes: 11 additions & 7 deletions exercises/day4/Day_4_exercise_1_hints.ipynb
@@ -101,7 +101,8 @@
"source": [
"This gives us the code\n",
"```py\n",
" fields = line.split('|')```"
"fields = line.split('|')\n",
"```"
]
},
{
@@ -158,7 +159,8 @@
"source": [
"The genres are at position 5, which is the second last position\n",
"```py\n",
"genres = fields[5].strip()```\n",
"genres = fields[5].strip()\n",
"```\n",
"or\n",
"```py\n",
"genres = fields[-2].strip()\n",
@@ -170,7 +172,8 @@
"to `[\"Action\", \"Drama\"]` we must split the string at `,`:\n",
"\n",
"```py\n",
"genres = fields[5].strip().split(',')```"
"genres = fields[5].strip().split(',')\n",
"```"
]
},
{
@@ -192,7 +195,8 @@
" rating = float(fields[1])\n",
" title = fields[-1].strip()\n",
" m_year = int(fields[2])\n",
" genres = fields[-2].strip().split(',')```\n",
" genres = fields[-2].strip().split(',')\n",
"```\n",
" \n",
" \n",
"------------"
@@ -287,7 +291,7 @@
" movie_ok = False\n",
" if rating_max and rating_max < rating:\n",
" movie_ok = False\n",
" ``` "
"``` "
]
},
{
@@ -436,7 +440,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
@@ -450,7 +454,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.9"
"version": "3.9.4"
}
},
"nbformat": 4,