diff --git a/Contents.md b/Contents.md index 5d271ed..9344337 100644 --- a/Contents.md +++ b/Contents.md @@ -22,7 +22,7 @@ * Compositional * Structural * Graphs - * _Exercise: Crystal space_ + * _Exercise: Navigating crystal space_ 5. **Classical Learning** * _k_-nearest neighbours diff --git a/Lecture1.ipynb b/Lecture1.ipynb index ddf4e28..3a3cded 100644 --- a/Lecture1.ipynb +++ b/Lecture1.ipynb @@ -11,11 +11,9 @@ }, { "cell_type": "markdown", - "metadata": { - "id": "4qYrIcVKciGR" - }, + "metadata": {}, "source": [ - "
\n", + "
\n", " 💡 Ada Lovelace: The more I study, the more insatiable do I feel my genius for it to be.\n", "
" ] @@ -49,7 +47,44 @@ "There are a few components to be aware of:\n", "\n", "### Python\n", - "A working knowledge of the [Python](https://www.python.org) programming language is assumed for this course. If you are rusty, Chapters 1-4 of [Datacamp](https://www.datacamp.com/courses/intro-to-python-for-data-science) cover the base concepts, as do many other online resources including Imperial's [Introduction to Python](https://www.imperial.ac.uk/students/academic-support/graduate-school/students/doctoral/professional-development/research-computing-data-science/courses/intro-to-python) course.\n", + "A working knowledge of the [Python](https://www.python.org) programming language is assumed for this course. If you are rusty, Chapters 1-4 of [Datacamp](https://www.datacamp.com/courses/intro-to-python-for-data-science) cover the base concepts, as do many other online resources including Imperial's [Introduction to Python](https://www.imperial.ac.uk/students/academic-support/graduate-school/professional-development/doctoral-students/research-computing-data-science/courses/python-for-researchers) course.\n", + "\n", + "Choose your degree programme: \n", + "\n", + "
\n", + " \n", + " \n", + "
\n", + "
\n", + " \n", + " \n", + "
\n", + "\n", + "If MSc, have you completed the introductory Python course:\n", + "\n", + "
\n", + " \n", + " \n", + "
\n", + "
\n", + " \n", + " \n", + "
\n", + "\n", + "Rate your current Python level:\n", + "\n", + "
\n", + " \n", + " \n", + "
\n", + "
\n", + " \n", + " \n", + "
\n", + "
\n", + " \n", + " \n", + "
\n", "\n", "### Markdown\n", "Markdown is a markup language that allows easy formatting of text. It is widely used for creating and formatting online content. It is easier to read and write than html. A guide to the syntax can be found [here](https://www.markdownguide.org/basic-syntax/).\n", @@ -64,7 +99,7 @@ "[GitHub](https://github.com) is a platform for writing and sharing code. There are many materials science projects hosted there, which enable researchers from around the world to contribute to their development. These notebooks are hosted on GitHub too. If you find an error, you can raise an [issue](https://github.com/aronwalsh/MLforMaterials/issues) or even better fix it yourself with a [pull request](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/about-pull-requests).\n", "\n", "### Live coding\n", - "The weekly notebooks are designed to be run online directly in your browser. You can activate the server by clicking the rocket icon on the top right and selecting `Live Code`. There is an option to open in [Binder](https://mybinder.org) or [Google Colab](https://colab.research.google.com), which you may prefer if you are an advanced user, but the formatting won't be as nice. Those services will be used for the research challenge later in the course. You can opt to install Python on your own computer with [Anaconda](https://www.anaconda.com/products/distribution) and run the notebooks locally, but we do not offer support if things go wrong." + "The weekly notebooks are designed to be run online directly in your browser. You can activate the server by clicking the rocket icon on the top right and selecting `Live Code`. There is an option to open in [Binder](https://mybinder.org) or [Google Colab](https://colab.research.google.com). Colab is more powerful, but the formatting won't be as nice. 
You can opt to install Python on your own computer with [Anaconda](https://www.anaconda.com/products/distribution) and run the notebooks locally, but we do not offer support if things go wrong." ] }, { @@ -119,7 +154,7 @@ }, "outputs": [], "source": [ - "print(\"Beware of 妖精\") # anything after '#' is a comment and ignored" + "print(\"Beware of 小妖精\") # anything after '#' is a comment and ignored" ] }, { @@ -411,7 +446,7 @@ "id": "oteHuO9DciGY" }, "source": [ - "The syntax employed here is Markdown. It can be used in notebooks, is also popular on Github for documentation, and is even a fast way to take notes during lectures.\n", + "The syntax employed here is Markdown. It can be used in notebooks, is popular on GitHub for documentation, and can even be a fast way to take notes during lectures.\n", "\n", "`![](https://media.giphy.com/media/cxk3z6nMhpf7a/giphy.gif)`\n", "\n", @@ -441,7 +476,7 @@ "- $k_B$ is the Boltzmann constant, and\n", "- $T$ is the temperature.\n", "\n", - "Let's write a function for it, which will take advantage of the wonderful [NumPy](https://numpy.org) package. It also uses the [physical constants](https://docs.scipy.org/doc/scipy/reference/constants.html#physical-constants) in scipy, and explains the function with a [docstring](https://en.wikipedia.org/wiki/Docstring)." + "Let's write a function for it, which will take advantage of the wonderful [NumPy](https://numpy.org) package. It also uses the [physical constants](https://docs.scipy.org/doc/scipy/reference/constants.html#physical-constants) in [SciPy](https://scipy.org), and explains the function with a [docstring](https://en.wikipedia.org/wiki/Docstring)." 
] }, { @@ -472,7 +507,7 @@ " float: the rate of the reaction.\n", " \"\"\"\n", " if np.any(temperature <= 0):\n", - " raise ValueError(\"Temperature must be greater than 0 K.\")\n", + " raise ValueError(\"Temperature must be greater than 0 K\")\n", " return D0 * np.exp(-activation_energy / (k_B * temperature))" ] }, { @@ -482,7 +517,7 @@ "id": "R8aKxKtuciGY" }, "source": [ - "This function takes `activation_enery` (eV) and `temperature` (K) as inputs and returns the corresponding diffusion coefficient. Recall that the units of the exponential term have to cancel out, so $D_{ion}$ takes the same units as $D_0$. Now let's use it:" + "This function takes `activation_energy` (eV) and `temperature` (K) as inputs and returns the corresponding diffusion coefficient. Recall that the exponent is dimensionless, so $D_{ion}$ takes the same units as $D_0$. Now let's use the function:" ] }, { @@ -498,7 +533,8 @@ }, "outputs": [], "source": [ - "arrhenius(0.12, 1000) # Calls the function for Ea = 0.12 eV; T = 1000 K" + "# Call the function for Ea = 0.12 eV; T = 1000 K\n", + "arrhenius(0.12, 1000)" ] }, { @@ -509,52 +545,7 @@ "source": [ "This value tells us the likelihood that each attempt has of overcoming the thermodynamic barrier for ionic diffusion. Decrease the temperature to 100 K and see the difference.\n", "\n", - "Now let's take advantage of the function to make a plot. We will use the numpy function `linspace`, which is documented over [here](https://numpy.org/doc/stable/reference/generated/numpy.linspace.html). It simply generates 100 numbers evenly spaced between 100 and 1000 to represent the temperature range." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "tags": [] - }, - "outputs": [], - "source": [ - "import matplotlib.pyplot as plt\n", - "\n", - "# Pre-exponential term in cm^2/s\n", - "D0 = 0.5\n", - "\n", - "# Temperature range in K\n", - "T = np.linspace(0, 1000, 100)\n", - "\n", - "# Example activation energy in eV\n", - "activation_energy = 0.83\n", - "\n", - "# Calculate rates\n", - "rates = arrhenius(activation_energy, T, D0)\n", - "\n", - "# Plotting\n", - "plt.figure(figsize=(5, 3))\n", - "plt.plot(T, rates, label=f'Activation Energy = {activation_energy} eV')\n", - "plt.xlabel('Temperature (K)')\n", - "plt.ylabel('$D_{ion}$ (cm$^2$/s)') # Adding units to y-axis\n", - "plt.title('Thermally Activated Transport')\n", - "plt.legend()\n", - "plt.grid(True)\n", - "plt.show()" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "kZ4t29owb5zj" - }, - "source": [ - "
\n", - " Code hint \n", - "Start the temperature range from 100 instead of 0 K.\n", - "
" + "Now let's take advantage of the function to make a plot. We will use the NumPy function `linspace`, which is documented [here](https://numpy.org/doc/stable/reference/generated/numpy.linspace.html). It is used here to generate 100 numbers evenly spaced between 100 and 5000 that represent the temperature range of our \"experiments\"." ] }, { @@ -571,14 +562,16 @@ }, "outputs": [], "source": [ - "# Pre-exponential term in cm2/s\n", + "import matplotlib.pyplot as plt\n", + "\n", + "# Pre-exponential term in cm^2/s\n", "D0 = 0.5\n", "\n", "# Range of activation energies in eV\n", - "activation_energies = np.linspace(0.1, 1.0, 4) # Range from 0.1 to 1.0 eV in n steps\n", + "activation_energies = np.linspace(0.1, 1, 0) # Range from 0.1 to 1 eV in n steps\n", "\n", "# Temperature range in K\n", - "T = np.linspace(100, 1000, 0)\n", + "T = np.linspace(100, 5000, 100)\n", "\n", "# Calculate rates and plot curves\n", "plt.figure(figsize=(5, 3)) \n", @@ -589,7 +582,7 @@ "\n", "plt.xlabel('Temperature (K)')\n", "plt.ylabel('$D_{ion}$ (cm$^2$/s)') \n", - "plt.title('Diffusion with varying activation energies')\n", + "plt.title('Varying activation energy')\n", "plt.legend()\n", "plt.grid(True)\n", "plt.show()" @@ -603,7 +596,7 @@ "source": [ "
\n", " Code hint \n", - "'np.linspace' requires three arguments (start, stop, number of points). 0 points won't work. Try changing it to 10.\n", + "'np.linspace' takes three arguments here (start, stop, number of points). With 0 points it returns an empty array, so nothing is plotted. Try changing it to 5.\n", "
" ] }, @@ -613,7 +606,7 @@ "id": "uWlaJMBQciGZ" }, "source": [ - "To better visualise the trends, we can make an Arrhenius plot by plotting the logarithm of D versus the inverse temperature, 1/T. We use 1000/T to give a nicer range on the $x$-axis." + "To better visualise the trends, we can make an Arrhenius plot by plotting the natural logarithm of $D$ versus the inverse temperature, 1/T. We use 1000/T to give a nicer range on the $x$-axis." ] }, { @@ -651,7 +644,7 @@ "id": "GzN2cRN0ciGZ" }, "source": [ - "The last technique to picked up in this class is data fitting. Later in the module, this will be expanded to more complex functions in high dimensions, but we'll start with linear regression in just 2D. There is no need to code this by hand as we can use a [function](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html) in the machine learning package [scikit-learn](https://scikit-learn.org). The real power of Python is the quality and quantity of available libraries such as this one." + "The last technique to pick up in this class is data fitting. Later in the module, we will use more complex functions in high dimensions, but let's start with linear regression. There is no need to code this by hand as we can use a [function](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html) in the machine learning package [scikit-learn](https://scikit-learn.org). The real power of Python is the quality and quantity of available libraries such as this one." 
] }, { @@ -752,7 +745,7 @@ }, "outputs": [], "source": [ - "# Print the model parameters and performance with enhanced formatting\n", + "# Print the model parameters and performance\n", "try:\n", " print(f'Slope: {model2.coef_[0]:.2f}') # Assuming model.coef_ might be an array for multidimensional X\n", " print(f'Intercept: {model2.intercept_:.2f}')\n", @@ -782,12 +775,9 @@ "source": [ "## 🚨 Exercise 1\n", "\n", - "```{admonition} Coding exercises\n", - ":class: note\n", - "The exercises are designed to apply what you have learned with room for creativity. It is fine to discuss solutions with your classmates, but the actual code should not be directly copied.\n", - "\n", - "The completed notebooks are to be submitted at the end of class, but you can revist later, experiment with the code, and follow the further reading suggestions.\n", - "```\n", + "
\n", + " 💡 Coding exercises: The exercises are designed to apply what you have learned with room for creativity. It is fine to discuss solutions with your classmates, but the actual code should not be directly copied.\n", + "
\n", "\n", "### Your details" ] @@ -825,7 +815,25 @@ "id": "DIia0_h9ciGa" }, "source": [ - "### Tasks" + "### Problem\n", + "\n", + "Because of its importance in the electronics industry, the diffusion of atoms in semiconductors has been studied for decades. Below is a set of data for impurity diffusion in crystalline Si [Source: [Casey and Pearson (1975)](https://link.springer.com/chapter/10.1007/978-1-4684-0904-8_2)]. It has been arranged into a DataFrame for your convenience.\n", + "\n", + "```python\n", + "import pandas as pd\n", + "\n", + "data = {\n", + " 'Impurity': ['B', 'Al', 'Ga', 'In', 'P', 'As', 'Sb', 'Bi'],\n", + " 'Mass': [10.81, 26.98, 69.72, 114.82, 30.97, 74.92, 121.76, 208.98], # atomic mass in g/mol\n", + " 'D0': [5.1, 8.0, 3.6, 16.5, 10.5, 60.0, 12.9, 1.03E3], # cm^2/s\n", + " 'Eact': [3.70, 3.47, 3.51, 3.91, 3.69, 4.20, 3.98, 4.63] # eV\n", + "}\n", + "\n", + "df = pd.DataFrame(data)\n", + "print(df)\n", + "```\n", + "\n", + "Two tasks will be given in class." ] }, { @@ -901,10 +909,9 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "```{admonition} Submission\n", - ":class: note\n", - "When your notebook is complete, click on the download icon on the top right, select `.ipynb`, save the file and upload it to Blackboard. If you are using Google Colab, you have to File -> Download and choose `.ipynb`.\n", - "```" + "
\n", + " 📓 Submission: When your notebook is complete, click on the download icon on the top right and select .ipynb. If you are using Google Colab, go to File > Download and choose .ipynb. The completed file should be uploaded to Blackboard under assignments for MATE70026.\n", + "
" ] }, { @@ -916,7 +923,7 @@ "source": [ "## 🌊 Dive deeper\n", "\n", - "* _Level 1:_ Read Chapter 1 of [Machine Learning Refined](https://github.com/jermwatt/machine_learning_refined#what-is-new-in-the-second-edition) for a complementary introduction to the field.\n", + "* _Level 1:_ Read Chapter 1 of [Machine Learning Refined](https://github.com/neonwatty/machine_learning_refined) for a complementary introduction to the field.\n", "\n", "* _Level 2:_ Taylor Sparks has a collection of video lectures on [Python for Materials Engineers](https://www.youtube.com/watch?v=tn1wpfpLx6Y&list=PLL0SWcFqypCmkHClksnGlab3wglEVMqNN&index=2).\n", "\n", @@ -930,7 +937,7 @@ "toc_visible": true }, "kernelspec": { - "display_name": "Python 3 (ipykernel)", + "display_name": "vscode24", "language": "python", "name": "python3" }, @@ -944,7 +951,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.11.7" + "version": "3.12.4" } }, "nbformat": 4, diff --git a/_config.yml b/_config.yml index 8fcca83..0725ca5 100644 --- a/_config.yml +++ b/_config.yml @@ -4,7 +4,7 @@ title: Machine Learning for Materials author: Aron Walsh logo: logo.png -copyright: "2024" +copyright: "2025" # Force re-execution of notebooks on each build. # See https://jupyterbook.org/content/execute.html diff --git a/slides/MLforMaterials_Lecture1_Intro_25.pdf b/slides/MLforMaterials_Lecture1_Intro_25.pdf new file mode 100644 index 0000000..9d82e78 Binary files /dev/null and b/slides/MLforMaterials_Lecture1_Intro_25.pdf differ
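
For reference, the revised plotting cell in Lecture1.ipynb can be sketched on its own, without a plot, to show what the zero-point `np.linspace` call does and what the Arrhenius form implies. This is a minimal sketch under stated assumptions, not the notebook's code: the `arrhenius` signature (with a `D0=1.0` default) is inferred from the calls visible in the diff, `K_B_EV` is a hardcoded Boltzmann constant standing in for the SciPy constant the notebook uses, and `np.polyfit` is swapped in for the scikit-learn `LinearRegression` that the notebook links to.

```python
import numpy as np

# Assumption: Boltzmann constant in eV/K, hardcoded so the sketch needs no SciPy
K_B_EV = 8.617333262e-5


def arrhenius(activation_energy, temperature, D0=1.0):
    """Return D0 * exp(-Ea / (k_B * T)) for Ea in eV and T in K.

    The signature is an assumption based on the calls shown in the diff.
    """
    temperature = np.asarray(temperature, dtype=float)
    if np.any(temperature <= 0):
        raise ValueError("Temperature must be greater than 0 K")
    return D0 * np.exp(-activation_energy / (K_B_EV * temperature))


# Zero points gives an empty array, so a loop over it draws no curves
empty = np.linspace(0.1, 1, 0)

# Five evenly spaced activation energies in eV, endpoints included
activation_energies = np.linspace(0.1, 1, 5)

# Temperature grid in K, matching the revised cell
T = np.linspace(100, 5000, 100)

# One curve per activation energy, with D0 = 0.5 cm^2/s as in the notebook
curves = {Ea: arrhenius(Ea, T, D0=0.5) for Ea in activation_energies}

# Arrhenius form: ln(D) is linear in 1/T with slope -Ea / k_B
first_curve = curves[activation_energies[0]]
slope, intercept = np.polyfit(1.0 / T, np.log(first_curve), 1)
print(f"Recovered Ea = {-slope * K_B_EV:.3f} eV")
```

The straight-line fit of ln(D) against 1/T is the same idea as the Arrhenius plot in the notebook: the slope carries the activation energy and the intercept carries ln(D0).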