diff --git a/docs/index.html b/docs/index.html
index 4f85e73..369561c 100644
--- a/docs/index.html
+++ b/docs/index.html
@@ -225,7 +225,7 @@

Teaching Fellows

Frank Cheng xcheng@g.harvard.edu
-Fri 4:00-5:00pm (SEC 2.112+2.123)
+Fri 4:00-5:00pm (SEC 2.122+2.123)
Thu 1:00-2:15pm (SEC 2.122+2.123)
@@ -267,7 +267,7 @@

Teaching Fellows

Haitian Liu hliu3@g.harvard.edu
-Fri 2:45-3:45pm (SEC 2.112+2.123)
+Fri 2:45-3:45pm (SEC 2.122+2.123)
Fri 1:30-2:45pm (SEC 2.122+2.123)
diff --git a/docs/tipuesearch_content.js b/docs/tipuesearch_content.js
index 3fdb6dd..0f82d20 100644
--- a/docs/tipuesearch_content.js
+++ b/docs/tipuesearch_content.js
@@ -1 +1 @@
-var tipuesearch = {"pages":[{"title":"CS107/AC207 Project","text":"Project Overview Goal You will develop a software library for a client (the teaching staff). The development of this library will leverage modern software development practices covered in the course. By the end of the semester, the client should be able to easily install and run your package. Topic TBD Project Milestones TBD Groups You will work in groups of 4-5 students. You are free to choose your project partners but groups sizes must consist of the number of students mentioned before. Some members of the group will be stronger than others. It is expected that you work together and help each other as needed. This is an opportunity for less experienced coders to improve their skills by working with more experienced coders. Every person must contribute. Expectations This project has a few non-negotiable expectations, which are outlined in basic expectations . The project also has a more open-ended component, which is described in additional expectations . Basic Expectations The client should be able to easily install the library, run the tests, access the documentation, and use the library for their application. Documentation for every subsystem in the project must be provided. Link to the docs from the README.md in each folder. The top level README.md should contain an overview, links to other docs, and an installation guide which will help us install and test your system. The project must utilize a proper packaging system for distribution and installation of the library. The project must ship with a test suite. Documentation on how to run the tests is mandatory. 
Additional Expectations TBD Broader Impact You must write a broader impact statement for your library. The broader impact should consider the accessibility of your software library to different groups of people. This statement should be around 250 words (approximately 1/2 page). It can be placed in the README.md of your library. Things to consider when writing this statement are: How will you make your library accessible to different groups? What process will contributions to your library need to go through? How will you ensure that this process is fair and welcoming to all groups?","tags":"pages","url":"pages/project.html"},{"title":"Resources","text":"Books No book is required. But we highly recommend two books for this course. Fluent Python: Clear, Concise, and Effective Programming, by Luciano Ramalho. Publisher: O'Reilly Media. 2015. Designing Data Intensive Applications, by http://dataintensive.net/ , The Big Ideas Behind Reliable, Scalable, and Maintainable Systems by Martin Kleppmann. Publisher: O'Reilly Media 2014 Other useful books The Practice of Programming by Brian W. Kernighan and Rob Pike, Addison-Wesley, 1999. Skiena: The Algorithm Design Manual Abelson, Sussmann and Sussmann: SICP and python based online version based on it: http://composingprograms.com/ High Performance Python: By Micha Gorelick, Ian Ozsvald. Oreilly Media 2014. 
Evaluating Derivatives: Principles and Techniques of Algorithmic Differentiation by Andreas Griewank Papers and other readings Python pep8 An opinionated guide to python style Git Recommended: Git from the bottom up Recommended: Git Book GitHub Videos and Training GitHub Interactive Tutorial Git - the simple guide Git Reference Git Cheat Sheet Git Immersion Tutorial Git Atlassian Tutorial Python python Rich overview of Python 3 language features (recommended to work through) Scientific visualization with Python and Matplotlib C/C++ Fall 2021 C/C++ primer material C Tutorial C++ Tutorial C++ Cheat Sheet C++ Reference Vim Spend 30 minutes to complete the vimtutor . After you have installed vim , execute the following command in your command line: vimtutor Vim Cheat Sheet Vimcasts Recommended book Bash Command Line Reference Cheat Sheet Bash scripting Cheat Sheet Unix-Related Basic Computing Tools Windows Users Using Linux Subsystem on Windows 10 PuTTY SSH client for Windows Ubuntu Docker Image You can get an Ubuntu based Docker container with docker pull iacs/cs107_ubuntu The container is hosted here . 
The Dockerfile and run_cs107_docker.sh launch script can be found in the class repository .","tags":"Resources","url":"pages/resources.html"},{"title":"Schedule","text":"1 9/5, 9/7 Lecture 1: Unix and Linux Lecture 2: Command line 2 9/12, 9/14 Lecture 3: Bash Scripting Lecture 4: Version Control / git Pair Programming Wk1(9/22) HW1: (9/12 - 9/27) 3 9/19, 9/21 Lecture 5: git Lecture 6: Python Pair Programming Wk2(9/29) 4 9/26, 9/28 Lecture 7: Python / OOP Lecture 8: Python Pair Programming Wk3(10/06) HW2: (9/27 - 10/11) 5 10/3, 10/5 Lecture 9: Python Lecture 10: Databases I Pair Programming Wk4(10/13) 6 10/10, 10/12 7 10/17, 10/19 8 10/24, 10/26 9 10/31, 11/2 10 11/7, 11/9 11 11/14, 11/16 12 11/21, 11/23 Thanksgiving Break 13 11/28, 11/30 14 12/5, 12/7 Reading Period 15 12/12,12/14 Final Exam Period Final Exam Period","tags":"pages","url":"pages/schedule.html"},{"title":"Schedule","text":"All due events with a given date are due on 21:59pm that day . Wk Tuesday Thursday Labs Events 1(35) Lecture 1: 2023-09-05 Class introduction/organization History of Bell Labs, Unix and Linux Command line introduction Lecture 2: 2023-09-07 More command line Pipes Regular expressions File attributes 2(36) Lecture 3: 2023-09-12 Command line customization I/O redirection Environment variables Shell scripting Process management Lecture 4: 2023-09-14 Version control systems (VCS) Centralized and distributed models Intro to Git PP01: (2023-09-12) Setup private class repository, tmate 3(37) Lecture 5: 2023-09-19 Version control systems (VCS) Managing repositories Remote repositories Branching Lecture 6: 2023-09-21 Python basics Objects and Functions Environments Closures PP02: (2023-09-18) Bash scripting, Git workflow Note: PP01 deadline (2023-09-22) 4(38) Lecture 7: 2023-09-26 TOPIC 1 TOPIC 2 TOPIC 3 Lecture 8: 2023-09-28 TOPIC 1 TOPIC 2 TOPIC 3 PP03: (2023-09-25) Topics PP03 Note: HW1 deadline (2023-09-27) PP02 deadline (2023-09-29) 5(39) Lecture 9: 2023-10-03 Lecture 10: 2023-10-05 
6(40) Lecture 11: 2023-10-10 Lecture 12: 2023-10-12 7(40) Lecture 13: 2023-10-17 Lecture 14: 2023-10-19 8(40) Lecture 15: 2023-10-24 Lecture 16: 2023-10-26 9(40) Lecture 17: 2023-10-31 Lecture 18: 2023-11-02 10(40) Lecture 19: 2023-11-07 Lecture 20: 2023-11-09 11(40) Lecture 21: 2023-11-14 Lecture 22: 2023-11-16 12(40) Lecture 23: 2023-11-21 Thanksgiving break: 2023-11-23 11(40) Lecture 24: 2023-11-28 Lecture 25: 2023-11-30 11(40) Lecture 26: 2023-12-05 Reading period: 2023-12-07 11(40) Final exam period: 2023-12-12 Final exam period: 2023-12-14","tags":"pages","url":"pages/schedule_static.html"},{"title":"Syllabus","text":"Course Objective The primary goal of this course is to teach you how to develop effective software for scientific applications. In order to achieve this goal, there are several non-negotiable topics that must be included in the course. We will be concerned with two primary thrusts: System and Software Engineering and Language . Moreover, we aim to provide you with a suite of modern software development techniques and workflows. Learning Objective After successful completion of this course, you will be able to: Use Python, including its advanced features to write scientific programs. Have a basic idea how the Python interpreter works. Understand what features of Python make up its language execution model and how these features impact the code you write: e.g. how modularity, abstraction, and encapsulation can be used to solve problems. Write programs with good software engineering practices. These practices include: working on remote machines, version control, continuous integration, documentation and testing. Utilize data management techniques to store data, starting from a good understanding of data structures to databases. Combine these techniques together to write large pieces of software working in a team. Develop pipelines to integrate data aquisition and processing. Evaluate and test software as part of the development process. 
Be able to contribute on both the science and software engineering sides of things. Prerequisites You should have some basic familiarity with programming (functions, variables, constants, differences between integer and floating point, etc.) at the level of CS50. Some comfort with a tool to edit text files is beneficial. Any text editor or IDE will suit this purpose. The student should have passed a basic calculus class. The lectures will review the necessary fundamentals required to succeed with the class project. Besides this, you should have interest or investment in scientific computing. You can download Homework 0 for self-assessment here (not graded). You do not need to be able to solve all problems in order to take this class. Jupyter Notebooks Jupyter notebooks are great for code prototyping and learning how to use new features and APIs. However, they are not suitable for large software development projects! One reason for this is because code development in Jupyter notebooks is a nonlinear development process and there is presently no good solution for version control of Jupyter notebooks. A second reason is the question of efficient source editing. A helpful tool to convert (back and forth) Jupyter notebooks to pure python code is Jupytext . Homework assignments and lecture exercises turned in as Jupyter notebooks will not be graded. Textbooks There is no required course textbook. However, the course content will draw from various sources. We will cite the source when appropriate. Please consult the resources page for recommended textbooks and additional helpful material. Course Format The delivery of course content will occur via two weekly lectures as well as weekly pair-programming sections. Attending these sessions is mandatory . Lectures will consist of considerable interaction and discussion and will be greatly enhanced by student participation. The course contains the following main components: Lectures: Deliver the main content of the class. 
Attendance is mandatory. Quizzes: Graded in-class quizzes intended to assess the learning progress. Pair-programming: Pair-programming (PP) sections offer practice on topics addressed in class and help assess the skills to program in a collaborative environment. Attendance is mandatory. Homeworks: Homework assignments deepen the lecture material and include coding exercises. Exercises may be of theoretical or practical nature. Projects: The class is accompanied by a project (teams of 4-5 students) to practice the methods learned in class on a real Python application. The project topic is given by the teaching staff. The main programming language taught throughout the course is Python. Grading The following weight table is used for individual components of the class. The class does not have standard midterm or final exams. Total Weight Homework (7 Homeworks) 35% Project 35% Quizzes (4 Quizzes) 15% Pair-programming (12 sections) 15% Homework There are 7 homework assignments where each contributes equally to the final grade. The homework is focused on the topics discussed in class and involves programming and theoretical work. The teaching staff is determined to return solutions and graded assignments with feedback after the due date. It is your responsibility to check the consistency between your graded work and the assignment solution. You have the option to address possible inconsistencies in office hours or request a regrading for the assignment (see the homework grading inconsistencies section below). Homework will be released on the CS107/AC207 class repository . Push notifications for that repository will be distributed through the class mailing list . Homework will be graded on a 100 point scale: 100 = Solid / no mistakes (or really minor ones) 80 = Good / some mistakes 60 = Fair / some major conceptual errors 40 = Poor / did not finish 20 = Very Poor / little to no attempt. 
0 = Did not participate / did not hand in Homework Submission Homework must be submitted via commits in your private git repository hosted in the CS107 organization at https://code.harvard.edu/CS107 . Grading and feedback for homework is done through the Gradescope platform which is connected to the class' Canvas site . Your homework solutions must therefore be zipped and uploaded in the Gradescope section of the class canvas. See the homework workflow tutorial for more details. The homework due date is indicated on the problem sheet and displayed in the schedule as well as shown on Canvas and Gradescope. Homework submissions will be graded on: Correctness: your code must run and must produce the correct result. We are not debugging issues when grading submissions. Presentation: presentation means structure and readability. We expect you to write high-quality, readable and tested code. A quality code is well commented in places where it is not straight forward to deduce the logic from code itself (from the reviewers perspective). We expect you to think about aspects such as modularity, reusability, code duplication and error handling when you design and write code. Presentation of results also means that unnecessary or superfluous files like editor backup files or other unrelated data should not be included in the submission commits (use .gitignore for this purpose). See the following tutorials to help you get started with homework submissions: How to setup your private class repository (onetime setup) Homework workflow Homework Late Days Homework submissions are accepted before the deadline of the assignment is due. You have three late days at your disposal that can be consumed for late submissions and two consecutive late days can be used at most for any of the homework assignments. Please note that any commits on your homework branch pushed after the deadline has passed are not considered for grading by default. 
If you wish that we consider a late commit for grading, please contact the teaching staff at cs107-staff@g.harvard.edu with appropriate explanation. This will count towards your late day budget. It is your responsibility to plan your work ahead and commit on time. If you have consumed all your late days and you have another late submission, it is in your benefit to still commit the work. We assume the Harvard Honor Code for all late submissions in case solutions are already posted. If you have a verifiable medical condition or other special circumstances that interfere with your coursework please let us know via cs107-staff@g.harvard.edu as soon as possible. Homework Grading Errors If you believe there is an error in your assignment grading, you can submit a regrade request through the Gradescope platform . Note: The entire assignment will be regraded. This may cause your total grade go up or down . An assignment can only be regraded once . Regrade requests are due within 2 days after the release of the grades . Project Please see the project section for more details. Quizzes There are 4 quizzes out of class which are graded and intended to assess the learning progress. Each quiz addresses topics from the lecture material . Quizzes are open book/ www and include multiple choice questions with at most back of the envelope calculations. Quizzes contain 12 questions and take 25 minutes. They are accessible on canvas within a 12 hour time window from 9am to 9pm at the day of the quiz. Note: if a quiz takes 25 minutes and you start the quiz on 8:50pm, you will have only 10 minutes to work on the quiz. Please see the class schedule as well. Pair-Programming Sections Pair-Programming will form an essential part of the course. Pair-programming will take place in mandatory pair-programming sections led by members of the teaching staff. You are required to sign-up for your preferred pair-programming section at the beginning of the semester. 
You are expected to attend your chosen section during the semester. Should you not be able to attend one of your sections, please coordinate with your section TF to attend another section this week in order to obtain the attendance credit. In CS107/AC207 we are focusing on command line tools for the development of software projects in computational science. It is important that you get familiar with a small selection of such tools and integrate them in your development process. The pair-programming sections aim at combining some of these tools together to provide you with hands-on experience while developing software. The key is the \"pair\" in pair-programming. The exchange of knowledge between team mates in these pair-programming sections is essential for learning said tools or learning something new from your peers. Please see the following file in the class git repository for the details: https://code.harvard.edu/CS107/main/blob/master/lab_groups.xls Pair-programming Submissions Pair-programming exercise solutions must be submitted in your private git repository hosted in the CS107 organization at https://code.harvard.edu/CS107 . Only commits made on or before the due date will be considered for grading . The deadline for submission is usually one week after the last section for the exercise. Given this extra time for completion, late days do not apply to pair-programming exercises . The submission due date is indicated on the problem sheet and displayed in the schedule . As you are working in groups of 3-4 students for the lab exercises, the solution files you come up with in the group are submitted by each group member individually in her/his own private Git repository. Pair-programming submissions will be graded based on the following criteria: Attendance: your attendance will be recorded by the TF who leads the section. Joining the section at the beginning and then leaving 10-15 minutes later will not reward attendance credit. 
If you need to leave because of another appointment then it is expected that you communicate beforehand and coordinate with your TF. Please see the attendance policy section below as well. Your pair-programming session is determined at the beginning of the class by choosing lab sections in my.harvard . You can select your preferences depending on your schedule. Once determined, you can lookup your session details in the https://code.harvard.edu/CS107/main/blob/master/lab_groups.xls sheet. Completion: pair-programming submissions should reveal effort that the student attempted to solve the tasks. If you experience difficulties in a particular problem and you are not able to complete the task, please indicate the issues you had in your code using comments; the teaching staff will take that reasoning into account. Handing in an empty skeleton (same as hand-out) does not meet the expected standard and will not award credit for the submission. See the following tutorial to help you get started with pair-programming submissions: Pair-programming workflow Office Hours The teaching staff holds weekly office hours. Office hour times and locations are listed on the class main page. Office hours offer an opportunity to review course materials and receive additional guidance on your homework. Please see the following file in the class git repository for the details: https://code.harvard.edu/CS107/main/blob/master/office_hours.xls Attendance Policy Attendance at lectures and pair-programming sections is mandatory as they are core parts of the class. Pair-programming sections (labs) will be held on weekdays that we determine at the beginning of the class according to a best fit of the students' individual schedules for the term. You are required to attend the labs on the assigned day. Rescheduling of a lab to a different day due to an unforeseeable event must be coordinated with the responsible TF by sending an email to cs107-staff@g.harvard.edu . 
To be excused from a lecture or a lab, we ask you to follow the Harvard Honor Code and send an email to cs107-staff@g.harvard.edu at least one day before the lecture or lab. Lecture recordings are available only when students are excused for a lecture. Collaboration Policy You are welcome to discuss the course material and homework with others in order to better understand it, but the work you turn in must be your own (with exception of the project where collaborative work is permitted). Any work submitted as your own without properly citing the original author(s), is considered plagiarism. Failure to follow the academic integrity and dishonesty guidelines outlined in the Harvard Student Handbook will have an adverse effect on your final grade. This includes the removal of copyright notices in code. You may not submit the same or similar work to this course that you have submitted or will submit to another without permission. The teaching staff may use tools to compute correlations between submitted work. Use of AI Models Purpose of Policy: This policy outlines the acceptable use of AI models, including but not limited to ChatGPT, in completing assignments for this course. Policy Guidelines: Original Work: Students are expected to complete assignments using their original thoughts and interpretations. AI models can be used to help understand concepts, generate ideas, or learn about different perspectives, but they should not write or complete assignments for students. Collaboration with AI: Students may use AI models for brainstorming or generating preliminary ideas, but the final work submitted must be substantially their own. Students should be able to explain their reasoning, logic, and conclusions without relying on the model's output. Restrictions for Specific Assignments: There may be specific assignments (e.g. quiz part of the midterms) or parts of the course where the use of AI models is entirely prohibited. 
These restrictions will be clearly stated in the assignment guidelines. Ethical Considerations: Students are encouraged to approach the use of AI with ethical considerations in mind, including issues related to privacy, bias, and authenticity. Consequences for Non-Compliance: Failure to adhere to this policy may result in academic penalties as outlined in the course's academic integrity policy. Questions and Clarifications: If students have questions about the appropriate use of AI models in an assignment, they should consult the course instructor or teaching assistants before proceeding. Please refer to the University's policy for further information. Accessibility If you have a documented disability (physical or cognitive) that may impair your ability to complete assignments or otherwise participate in the course and satisfy course criteria, please contact the teaching staff or directly the Accessible Education Office to receive an AEO letter that will authorize us to help you with corresponding accommodations. Diversity Statement All participants in this class are expected to foster empathy and respect towards each other. This includes instructors, teaching staff or students. The motivation to take this course shall be to experience the joy of learning in an environment that allows for a diversity of thoughts, perspectives and experiences and honors your identity including race, gender, class, sexuality, religion, ability, etc. Any constructive feedback for improving the class environment is welcome and I encourage you to reach out to the instructor or teaching staff with any concerns you may have. 
If you prefer to speak with someone outside of the course, you may find helpful resources at the Harvard Office of Diversity and Inclusion .","tags":"pages","url":"pages/syllabus.html"},{"title":"Tutorials","text":"How to Setup your Private Class Repository Steps to Setup Your Private Class Repository Add an SSH Key to Your Account Homework Workflow Example Homework Workflow Step 1: Branch Off Step 2: Solving the Homework Step 3: Create a Pull Request Creating a Web Pull Request Step 4: Submit on Gradescope Pair-programming Workflow Protocol How to launch tmate Recommended Workflow How to Setup your Private Class Repository All of your work in CS107/AC207 will be committed in your private class git repository hosted in the CS107 organization at https://code.harvard.edu/CS107 . (The class project will be hosted in another repository in the same organization, see Milestone 1A for this separate task.) This tutorial walks you through the steps to create your private class repository. If you have already created git repositories on GitHub, then there is nothing new to learn in this tutorial and you should be familiar with the process already. A note on https://code.harvard.edu/ : this is an instance of a GitHub Enterprise edition hosted by Harvard University. The user interface is identical to the public GitHub site. The main difference is that https://code.harvard.edu/ is owned by Harvard University , whereas GitHub belongs to Microsoft which gives rise to security concerns regarding data belonging to classes held at Harvard University. Steps to Setup Your Private Class Repository Obtain your Harvard NetID Send an email to cs107-staff@g.harvard.edu (using your .harvard.edu email) to request access to the CS107 organization . Include your NetID from step 1 in the body of the email and choose an appropriate subject line. 
Once added to the organization, navigate to https://code.harvard.edu/CS107 (login if necessary) and click the green \"New\" button to add a new repository. Your repository must be named after your NetID . You can add an optional description if you like. Make sure the private radio button is checked and click \"Create repository\". You do not need to check any other options. This is all you have to do for now. In the first homework we will focus on how to setup your new repository such that you can work with it from your laptop (you can skip the landing page after you have created the repository). When you navigate back to https://code.harvard.edu/CS107 you should see something similar to this: The blurred repository is your private class repository that was the focus of this tutorial. The main repository is the main CS107/AC207 class repository which is used to distribute all of the class material during the semester. Any updates to this repository will be broadcast via email message such that you will not miss out on new material. In the first homework we will set this repository as an upstream such that you can conveniently unpack class material into your private repository. Note: private repositories are only visible to you within the organization. Please do not create other repositories in the https://code.harvard.edu/CS107 organization. You have your own user account on https://code.harvard.edu/ just like you have on GitHub or other providers. Your user account requires your Harvard login credentials and is a good alternative to hosts like GitHub. Feel free to create as many repositories in your user account as you like. Add an SSH Key to Your Account In order to access content on https://code.harvard.edu using Git you need to setup an SSH key. Check if you already have the file ~/.ssh/id_rsa.pub (assuming RSA). If you do not have such a file you can create one with ssh-keygen -t rsa -b 4096 Choose the default location by just hitting enter. 
You may enter a password for the key or just hit enter to go without password. If go with password you will have to enter it every time you use the key. To upload the public key to your Harvard GitHub account , click on your icon in the top right corner on your https://code.harvard.edu page, then click on \"Settings\" and then \"SSH and GPG keys\" in the left panel. Alternatively use this link https://code.harvard.edu/settings/keys . Click on the green \"New SSH key\" button in the top right corner and give your new key a title (e.g. the name of your laptop). In the key field paste the contents of your public key found in ~/.ssh/id_rsa.pub . Use for example cat ~/.ssh/id_rsa.pub and copy paste the output into the \"Key\" field on your GitHub page. You are now able to access any repositories on https://code.harvard.edu with corresponding permissions. Never share your private key ~/.ssh/id_rsa with anybody. Note: do not create a key in the class Docker container since the key will be lost when you exit the container. For security reasons, sensitive keys like this should not be put in containers. Homework Workflow The following are the basic rules we apply for homework submissions: Naming convention for homework directories: your private repository should contain one homework directory on the repository root with hwX sub-directories for each homework assignment. The X in hwX is to be replaced with the assignment number. For example hw1 , hw2 and so on. Which files will be considered for grading: within the sub-directory hwX , place the assignment files that you want us to grade in a directory called submission . We will only grade data in these directories . Pull request (PR): your homework assignments must be completed on git branches called hwX , where X is again to be substituted with the assignment number. 
Your homework X submission requires an open pull request to merge the hwX branch into your main (or deprecated master ) branch for full points (both branches are inside your private class repository in the CS107 organization ). Some implications of this: Solving homework on the main or master branch is always wrong. For each homework submission you need to issue one open PR. Merging an open PR before the teaching staff has reviewed and graded your work will make the PR disappear . Only files inside submission in PR X will contribute to your hwX grade (see next item). Gradescope: your homework will be graded on the Gradescope platform that has been setup and linked to the class canvas page. The platform does currently not support submission directly via your Git repository. You therefore have to create a zip archive of your submission directory created in step 2 above and upload the archive on Gradescope . It is important that you zip-up the directory and not individual files inside. You can use the command zip -r submission.zip submission/ , where the -r option means add files recursively , submission.zip is the name of zip archive and submission is your homework submission directory from step 2 containing your solution. This assumes you are in homework/hwX inside your Git repository. Points will be lost if any of these requirements are violated . The teaching staff will review the open PR for each homework and grade your work accordingly. Grades will be released on canvas and feedback is provided through the Gradescope platform. Once you have received the grade and feedback, your open PR for homework X can be merged into your main or master branch if there are no more pending issues. After the PR is closed, you may delete the hwX branch in your repository. This concludes a homework submission. Example Homework Workflow This example is intended to help you internalize the three basic rules described above. 
Each homework awards 10 points by performing these steps correctly. Note: Specific instructions provided in each homework assignment may override the following basic approach. Suppose we want to work on homework assignment 3, which consists of 4 problems. Step 1: Branch Off The ease of branching is the main strength of git . Branches allow you to be destructive without affecting production code or data. The reason we solve homeworks on individual branches is to help you develop a feel for this protection and to materialize the required steps to create branches. Branches will provide you true comfort when working on real projects outside of this class. Make sure your master or main branch is in the state you want your new branch to be based on. If you need to synchronize with your default remote branch you can type git pull The next step consists of creating and switching to a new branch that is based off the current branch. For this you can use git checkout -b hw3 which is how you did it before git 2.23.0 . Since the checkout command is ambiguous , the preferred way for more recent versions of git is git switch -c hw3 You are now on a new branch called hw3 as required. You will need to issue a pull request into main or master from this branch such that your homework will be graded. You can create the PR now (see below) or once you are done with solving hw3 , it does not matter to git . (Pull requests are not something designed by git itself, but rather by platforms like GitHub or GitLab.) Note: you will lose 5 points if you are not solving your hwX (in this example it is hw3 ) on a branch named hwX . You are of course free to create additional branches besides hwX if suitable. Step 2: Solving the Homework The files the teaching staff will consider for grading have to be located in the directory homework/hw3/submission . You are free to put other files below homework/hw3 that might be useful when you revisit your work sometime later. 
The problem sheet might be one of those files. Class handouts are distributed in the main class repository . You can manually create these directories and copy the files you want into your hw3 directory using, for example: mkdir -p homework/hw3/submission cp /homework/hw3/hw3.pdf homework/hw3 Alternatively you can use git by configuring the main class repository as another remote in your local git repository (see homework 1). In this case you can checkout all the distributed homework files at once with git checkout class/master -- homework/hw3 assuming that the remote points to https://code.harvard.edu/CS107/main and is locally named \" class \". You may need to update your refs with git fetch --all before you invoke the checkout command above. The homework sheet will state what files have to be submitted. For this hw3 we assume they are P1.py , P2.py , P3.py and P4.py , one for each of the four problems. These files should run and return the required output. They have to be submitted inside the homework/hw3/submission directory. You should commit your work often in logical chunks. Your commits are to be done on the hw3 branch, of course. The following are a few commands that might be helpful: Use git status often to check your local state. Use the git add command to stage files you have changed for a commit. Use git commit -m to create a commit with an appropriate commit message. Use git stash to temporarily stash modified files (similar to a commit but it is not written to the history). Later on use git stash list to list all your stashed changes if you have used git stash multiple times. You can check what will change when you apply the stash with git stash show -p and apply the stashed changes with git stash apply (or git stash pop which also removes the stash from the list). Note that these commands work on the first stash object in the list stash@{0} if you do not explicitly specify the stash object you want to apply. 
Use git push to push local branch/commits to your remote repository. Use git restore to undo changes to a single file. Use git revert to undo the changes in a specific commit. Make sure you have committed your solution you want to submit inside the homework/hw3/submission directory with the required file names. Step 3: Create a Pull Request If you have local commits not pushed to the remote issue the git push command. You are now ready to issue a pull request (you could also have done this step at the very beginning of solving this homework, this is up to you). The goal is to merge the hw3 branch into your main or master branch eventually. The teaching staff must review and grade your work first, however. There are two ways to accomplish a PR on GitHub: Through the web browser at https://code.harvard.edu/CS107/ . Through the GitHub command line client . This method is helpful if you get distracted from the context switch that is associated with the first method. Note: you will lose 2 points if you do not create a PR. Disclaimer: you would not typically issue a PR for projects you are the sole contributor. Pull requests are typical for large projects at a company in which someone else will review your code before you can merge your code to the production branch. We want you to become accustomed to this type of workflow. It is a good idea to always use separate development branches. You should never commit straight to your main or master branches until the changes have thoroughly been tested. Creating a Web Pull Request Navigate to your https://code.harvard.edu/CS107/ private class repository and click on the \"Pull Requests\" tab in the top left part of the window. Click on the \"New pull request\" button Choose your main or master branch as the base (the one you want to merge into) and your hw3 branch as the one you want to compare to. This should automatically reload the page and show the changes that will be applied. Click on the \"Create pull request\" button. 
You can optionally add comments to this pull request if you desire. Click on the \"Create pull request\" button once more to create and open the pull request. The pull request is now open. You can even push more commits to the hw3 branch if you need to correct something (before the deadline has passed of course). Therefore, you could also create the PR at the beginning of the homework. Note: DO NOT click on the button that says \"Merge pull request\" until you have received your grade and feedback for that homework. You will lose 3 points if you prematurely merge your PR. Step 4: Submit on Gradescope Your submission is now ready to be submitted for grading on Gradescope . Simply create a zip archive of your submission directory you have created in your Git repository, e.g. submission.zip , and upload it to Gradescope by following the link above. You can use the command zip -r submission.zip submission/ , where the -r option means add files recursively , submission.zip is the name of zip archive and submission is your homework submission directory. Since you track the change history of your work in Git, you should not add *.zip files to your Git history. You can simply ignore such archives by adding the line *.zip to your .gitignore file in your repository root. Pair-programming Workflow Exercises performed during pair-programming sections should be put under version control similar to homework assignments (see the Homework Workflow section above). You must not branch off and create a pull-request for pair-programming exercises . Just add and commit your work on the main or master branch and push them to your repository ( make sure you are on the correct branch before you commit! ). The following are the basic rules we apply for pair-programming submissions: Your private repository should contain a directory named lab with sub-directories for each session. The sub-directories should be named ppX where X is the session number. 
Within the sub-directory ppX , place the exercise files that you completed during the pair-programming sections. The exercises must have the name exercise_Y.ext where Y corresponds to the exercise number and ext is the proper extension ( .py , .sh , .c , .cpp ) depending on the exercise. Here is an example how it may look like: The pair-programming exercises will be graded for completeness and help us ensure you are on the right track. You may lose points for the completeness part if you do not follow these two basic rules. Protocol In class we are focusing on command line tools for the development of software projects in computational science. It is important that you get familiar with a small selection of such tools and integrate them in your development process. The pair-programming sections aim at combining some of these tools together to provide you with hands-on experience while developing software. The key is the \"pair\" in pair-programming. The exchange of knowledge between team mates in these pair-programming sections is essential for learning said tools or learning something new you did not know before. The pair-programming works by using a tool called tmate which is based on the tmux terminal multiplexer . It allows for easy sharing of a command line session or a specific instance of a program in read/write and read-only modes via ssh or web browsers. Check out this blog post for more. Text file editing will be performed in any text editor that supports a text-based user interface (TUI). Recommended choices are vim or emacs . If you are mainly working on a Windows operating system, you should install the Windows Subsystem for Linux . A small guide for doing so can be found here . See the How to Launch tmate section below for the steps to launch tmate on your laptop. Note: tmate is perfect for any coding related communication. For example, debugging, work on the project and of course pair-programming. There is no audio channel integrated in tmate . 
If students are remote, a zoom session or similar must be established for oral communication. The exercises in the pair-programming sections are necessarily collaborative. Each member of the group will turn in the same script. Adhere to the following workflow when solving pair-programming exercises: For each exercise (or sub-exercise for big problems), there will be one sharer , one coder , and one listener . This assumes a group size of 3 . If the group only has two people, then either one of you can take the sharer's role. The sharer will start each coding session and document interactions including points of contention and challenges. The coder will be in charge of writing the code. The listener will make suggestions and may offer tweaks from time to time. The sharer starts a new tmate session and invites the other team mates to join the session. Ideally you want to start the session inside the directory of the current pair-programming exercise in your git repository. You may share a read/write link either through ssh or a web browser. Note: the sharer allows the others access to her/his computer. Any abusive behavior that may cause harm on the sharer's system will not be tolerated and are forwarded to the dean's office. After the team mates have accepted the invite, they will be able to share the terminal instance and can create new files or execute Python together. The team should discuss a strategy on how to approach the exercise. The coder should start writing some code with input from the other two team members. Before each section that you work on, place a comment indicating which team member worked on that section. For example, a bash script could look like this: #!/usr/bin/env bash # File : exercise_1.sh # Created : Sat Aug 07 2021 04:58:49 PM (-0400) # Coder : Alice # Listener : Bob # Sharer : Alice echo 'Hello World' ### Main point of contention: whether to capitalize \"W\" in \"world\" For small exercises, each team member can play a single role once. 
For large exercises, the team members may rotate roles. The exercise will make it clear when you should rotate. At the end of the exercise, the developed code is inside the sharer's git repository that can readily be committed. Links to download these files can be shared with the other team mates such that they can update their repositories as well. Note that the exercise code will contain comments pertaining to who worked on which section. How to launch tmate Disclaimer: tmate is a tool to share a terminal session and interact with other people. If you host a session, it means the instance runs on your local computer and you are in control of how much permissions you want assign to your mates. There are 2 ways to share a session: read-only: mates that connect to your session can only read data (this is safe provided the data you expose is safe). read-write: mates that connect to your session can read and write data. This is unsafe if you share a session with an mistrusted person. Recommended Workflow Launch the CS107/AC207 docker container with the working directory mounted (see the provided run_cs107_docker.sh launch script ) Start tmate inside the docker container Wrapping tmate in a docker container provides another layer of security. You can also install tmate using your distribution package manager (on Linux or homebrew on MacOSX) and skip step 1 if you wish not to use a container. Steps: Assume Docker is installed and we have pulled the CS107/AC207 docker image . You can install the run_cs107_docker.sh launch script in your PATH for convenience (e.g. ~/bin/run_cs107_docker.sh and add this directory to your PATH environment variable). Be sure that the run_cs107_docker.sh script is executable . See the chmod command to change the permissions of the script. Assume you want to work on the PP1 exercise and you are in the lab directory of your private Git repo and pp1 exists. 
Launch the docker container and mount the pp1 directory in your repository: $ run_cs107_docker.sh pp1/ root@0a076feb425f:~# You are now inside a running docker container. Note that the hostname 0a076feb425f is arbitrary and yours will differ. Launch tmate (it is already installed in the container): root@0a076feb425f:~# tmate Tip: if you wish to use tmate only for remote access, run: tmate -F To see the following messages again, run in a tmate session: tmate show-messages Press or to continue --------------------------------------------------------------------- Connecting to ssh.tmate.io... Note: clear your terminal before sharing readonly access web session read only: https://tmate.io/t/ro-qNRV5QRVWkW3qr55sfATkBegr ssh session read only: ssh ro-qNRV5QRVWkW3qr55sfATkBegr@nyc1.tmate.io web session: https://tmate.io/t/nMWurZc7Q6Zbv8EnX2wdhf6GB ssh session: ssh nMWurZc7Q6Zbv8EnX2wdhf6GB@nyc1.tmate.io The tmate instance is now running and you can choose between 4 possible links to share with your mates: 2 that can be run in your web browser and another 2 to be used with ssh in your terminal (either read-only and read/write). Choose the appropriate link you want to share with your pair-programming mates. If you press q or ctrl-c you are dropped back to the shell. The server will tell you whenever mates join. You can print the links again with tmate show-messages (be careful when you are sharing screens on zoom for example). Note that pressing ctrl-d or typing exit in the shell will close the active terminal and if only one is left, also the active tmate session. This will close the connections to all connected clients. You can now work together on the exercise. For example: root@0a076feb425f:~# vim exercise_1.py The image below shows a terminal session (left) and two mates connected in a web browser window (right): Note: We run tmate with root in the docker container. 
Do not run tmate as root in any other situation (even here we could create a regular user) and be careful with password-less sudo (avoid password-less sudo in the first place). In order to use ssh you need to setup an ssh key if you have not done so already. If you do not have such a key, you may create one by running ssh-keygen -t rsa -b 4096 If you are not dropped into a shell after you execute tmate it may be because you are using a shell different than zsh . Install zsh on your system using your package manager and run tmate like this SHELL = /bin/zsh tmate","tags":"pages","url":"pages/tutorials.html"},{"title":"Systems Development for Computational Science","text":"Computation has emerged as the third pillar of science alongside the pillars of theory and experiment. Computational science is maturing rapidly and has found considerable and significant use in supporting scientists from various disciplines (including all engineering disciplines, mathematics, physics, chemistry, finance, biology, and data analysis to name a few). Many burgeoning scientists are still taught to write \"a code\" for some problem and to debug when things look wrong. Given the ever-increasing complexity of software solutions to scientific problems, this old paradigm is no longer tenable and at best inefficient. CS107/AC207 is an applications course highlighting the use of software engineering and computer science in solving scientific problems. You will learn the fundamentals of developing scientific software systems including abstract thinking, the handling of data, and assessment of computational approaches: all in the context of good software engineering practices. The class syllabus can be found by following this link. Teaching Staff The preferred way to reach the teaching staff is described in the Teaching Staff Mailing List section below. 
Instructor Ignacio Becker ( iebecker@g.harvard.edu ) Office: SEC, Office 1.312-05 Office Hours: Wed 5:00-6:00pm Teaching Fellows Fellow Email Office Hours Pair-Programming Sections Kimon Vogt kvogt@g.harvard.edu Sat 8:00-10:00am (Zoom) Mon 8:00-9:15am (Zoom) Fri 8:00-9:15am (Zoom) Yixian Gan ygan@g.harvard.edu Tue 5:00-6:00pm (SEC 6.301+6.302) Mon 6:00-7:15pm (SEC 6.301+6.302) Allison Karp akarp@mde.harvard.edu Thu 9:30-10:30am (SEC 6.301,6.302) Tue 9:30-10:45am (SEC 6.301+6.302) Gekai Liao gekailiao@g.harvard.edu Thu 4:00-5:00pm (MD PierceHall 100F) Tue 3:45-5:00pm (SEC 6.301+6.302) Victor Zhu dunminzhu@g.harvard.edu Mon 10:00-11:00am (Zoom) Thu 6:00-7:15pm (Zoom) Frank Cheng xcheng@g.harvard.edu Fri 4:00-5:00pm (SEC 2.112+2.123) Thu 1:00-2:15pm (SEC 2.122+2.123) Danni Lai danninglai@g.harvard.edu Wed 4:00-5:00pm (Zoom) Tue 7:00-8:15pm (Zoom) Isabella Bossa isabellabossa@g.harvard.edu Tue 10:45-11:45am (MD 223) Thu 8:00-9:15am (MD 123) Tanner Marsh tam997@g.harvard.edu Fri 10:00-11:00am (SEC 2.112) Thu 1:00-2:15pm (SEC 2.122+2.123) Boxiang Wang bwang@g.harvard.edu Mon 1:00-2:00pm (SEC 6.301+6.302) Thu 3:45-5:00pm (SEC 4.405) Shuheng Liu shuheng_liu@g.harvard.edu Mon 7:00-8:00pm (Zoom) Tue 7:00-8:15pm (Zoom) Cyrus Asgari cyrusasgari@college.harvard.edu Wed 3:00-4:00pm (MD 123) Fri 12:00-1:15pm (SEC 2.122+2.123) Haitian Liu hliu3@g.harvard.edu Fri 2:45-3:45pm (SEC 2.112+2.123) Fri 1:30-2:45pm (SEC 2.122+2.123) Legend: SEC : Science and Engineering Complex, Northwestern Av 150, Allston MD : Maxwell-Dworkin, Cambridge Please see Pages section in Canvas for a Google Calendar. Lecture Hours All lectures are of 75 minutes duration. Time is given in Eastern Standard Time (Boston). Lecture attendance is mandatory : Time Room Tuesday 2:15 - 3:30 PM SEC 1.321 Thursday 2:15 - 3:30 PM SEC 1.321 Important Information Canvas: Is used for posting grades and other sensitive content. 
The class can be found on Canvas at this link https://canvas.harvard.edu/courses/122565 Class git repository: All handouts in CS107/AC207 are provided through the main repository hosted in the CS107 organization at https://code.harvard.edu/CS107/main . You can set this repository as an upstream in your private class repository or clone it once you have joined the CS107 organization git clone git@code.harvard.edu:CS107/main.git Updates to the main repository are posted on the class mailing list. Your Harvard ID is required to login to https://code.harvard.edu . You can request membership in the CS107 organization (AC207 students join the CS107 organization as well) by sending an email to cs107-staff@g.harvard.edu (using your .harvard.edu email). You must include your NetID in the body of your email, which is also your https://code.harvard.edu username (something similar to abc123 ). Once you have been added to the CS107 organization, create your own private repository inside the organization. Your private repository must have the exact name as your NetID . This will be your private class repository where you submit your homework and pair-programming exercises. See the following tutorial to help you get started with your git repository: How to setup your private class repository Class Discussion Forum We will use the Ed Discussion forum on our Canvas page as our main communication platform. Questions regarding homework, labs or lecture material must be posted on this forum and you are encouraged to reply to questions if you know the answer or you can share a useful contribution. A fraction of your participation grade is computed by how often you visit and the frequency you post on the forum. Class Mailing List You can optionally sign up to our class mailing list if you would like to be notified whenever there is new class content available in the class git repository. Replies to posts in this list will be sent to all list members. 
To sign up, send an email to: cs107+subscribe@g.harvard.edu (subscribe by sending a blank email to this address; use the email address associated with your HarvardID ) You are required to confirm your subscription. Simply reply to the confirmation email with a blank message to complete the subscription. Teaching Staff Mailing List You can reach the teaching staff directly by sending your email to the following mailing list cs107-staff@g.harvard.edu (email sent to this list is only seen by the teaching staff; only email ending with .harvard.edu is accepted) You are not required to register for this mailing list but only email addresses ending with .harvard.edu are accepted (you will receive a rejection message otherwise). Getting Started Checklist Sign up with the CS107 organization on https://code.harvard.edu/CS107 and create your own private repository inside the organization . Information flow: Canvas → Grades and discussion forum https://code.harvard.edu/CS107 Assignment submissions inside your private repository (homework, pair-programming exercises) Group repositories for project work All course handouts are published in the https://code.harvard.edu/CS107/main repository Need help? → cs107-staff@g.harvard.edu OPTIONAL: Sign up on the class mailing list to receive push notifications when new content is available in the https://code.harvard.edu/CS107/main class repository. You can get an Ubuntu docker container with the necessary class tools by docker pull iacs/cs107_ubuntu . Note that no ssh keys are contained in that image for use with git . See also the docker resources page .","tags":"pages","url":"pages/systems-development-for-computational-science/"},{"title":"Final Deliverables","text":"Due: Saturday, December 10th 2021, 11:59 PM Submission Instructions Your project should be available in your private project repo in the CS107 organization . 
Your submission should be in the following format: teamXX/ ├── docs │ ├── documentation │ ├── milestone1 │ └── milestone2 ├── LICENSE ├── README.md ├── src │ └── ... └── ... Note that src is a generic name used for source code containers. You may choose a different name for the directory that holds your project source code. It should be a concise name. Software Requirements Here are the main requirements for the final project: Working forward mode implementation See the sections below for more specific details Test suite Updated / extended documentation Your updated documentation will be the final package documentation. Please name it documentation . Do not name it milestone3 . New features Working Forward Mode Implementation You must have a working forward mode implementation. Your library should be able to handle real functions of one or more variables. This includes the situation where a user might have multiple functions each of multiple variables. Your library should be able to handle vector functions with multiple real scalar or vector inputs. Minimum Package Requirements The software should be available in your project repository. The software should also be installable via PyPI . You should provide a pyproject.toml or requirements.txt (depending on your packaging choice) file with your software so other developers are able to install the necessary dependencies. After a user installs your package, they should be able to use it without difficulty. Minimum Implementation Requirements The following is a description of a typical use case. A user downloads your package through PyPI . (It is a good idea to use https://test.pypi.org/ instead of the production URL for this class project.) They install the dependencies. They run the tests if they're a fellow developer. They create a \"driver\" script in the top level. Note: How they interact with your package will depend on your implementation. 
The interface and other implementation details should be described in your documentation. The next few steps may sound somewhat abstract, but that is only because they hinge on your specific implementation. In the driver script, they import your package. They instantiate an automatic differentiation object to be used in the forward mode. They use the automatic differentiation objects in their own applications (root-finding, optimization, etc). What Kinds of Functions should be Implemented? All basic operations and elementary functions should be implemented. Basic Operations Addition (commutative) Subtraction Multiplication (commutative) Division Power Negation Comparison Operators It is up to you which comparison operators to implement. Here are some options: __lt__ (less than) __gt__ (greater than) __le__ (less than or equal to) __ge__ (greater than or equal to) __eq__ (equal to) __ne__ (not equal to) You may or may not need them. If you have nodes in a tree you may find it useful to have at least __eq__ and __ne__ for node comparison. Similarly, for dual numbers some of these operators may make sense. We will be interested to see what (if any) uses you find for these operators. Elementary Functions Trig functions (at the very least, you must have sine, cosine, tangent) Inverse trig functions (e.g. arcsine, arccosine, arctangent) Exponentials Should be able to handle any base You can treat the natural base (e) as a special case This is what numpy does. Hyperbolic functions (sinh, cosh, tanh) Note that these can be formed from the natural exponential (e) Logistic function Again, this can be formed from the natural exponential Logarithms Should be able to handle any base. Square root Note that the hyperbolic functions and the logistic function are arguably not elemental functions since they can be formed from the natural exponential. On the other hand, they do form a key ingredient in some algorithms (e.g. 
neural networks) and can therefore be considered as elemental functions. You should implement these functions. Test Suite You should have a test suite that preferably runs with pytest (or unittest ). Your tests should be designed to run unit tests and integration tests if there are dependencies among individual units. Unit tests include class methods (dunder methods are methods of a class) and possibly overloaded functions that are part of a module. The test suite should be designed to integrate with GitHub Action workflows such that you can exploit regression testing. You should have a test harness that allows you to run all your tests easily and compute coverage reports in the same flawless manner. This harness must run for a git repo that is cloned to a local directory as well as in a container if it was run via a third party CI provider. Your project README.md file should contain a badge showing the pass/fail status of your CI builds. The badge should show that your build is passing all tests. You should also have your code test coverage workflow setup. Your project repo should have a badge defined in the README.md file reporting a pass/fail of the test coverage based on the following criterion: Pass if the coverage satisfies 90% or more, fail otherwise. You can implement this behavior in your coverage workflow using shell commands or write a script that is being executed in the workflow. When the criterion is met the script should return a success exit code such that the workflow succeeds. Useful commands are sed and awk . You will need to obtain the total coverage percentage by parsing an output file generated with pytest and the pytest-cov package and then process this information further to decide on the pass or fail criterion. These steps must be done in your CI setup (already done in M2). In addition you are required to publish a GitHub page that hosts an HTML version of your code coverage results. 
The pytest-cov plugin can generate these sources using the --cov-report=html option. An easy way to achieve this is to use this action in your coverage workflow. If you use pytest-cov and this action you must make sure that you remove .gitignore file inside the HTML sources generated by pytest-cov before you use the publish action (otherwise the plugin will push nothing to the gh-pages branch). Your website will be accessible at https://code.harvard.edu/pages/CS107/teamXX/ , where teamXX is your team ID. Documentation Your documentation must be complete, easy to navigate, and clear. Remember to update the Background and How to Use sections of your documentation as you add more functionality to your package, so that the user has a good understanding of what they can do. Call the final form of your documentation \" documentation \". Please update and consolidate all relevant documentation from milestone1 and milestone2 and make any changes suggested by the teaching staff. Your documentation should be a mix of text and hands-on demos. As always, it is up to you and your group to determine the best way to accomplish this (e.g. Jupyter notebook, GitHub README, Sphinx/Read the Docs). You will receive full points as long as you have a docs/ directory and your documentation is complete. However, you may want to consider alternative ways of hosting your documentation. For example: Read the Docs or Sphinx . Documentation Sections The following sections should be present: Introduction Describe the problem the software solves and why it is important to solve that problem. This can be built off of the milestones, but you may need to update it depending on what new feature you proposed. Background The automatic differentiation background can probably stay the same as in the milestones, unless you were told to update it considerably. Be sure to include any necessary background for your new feature. How to use your package How to install? Include a basic demo for the user. 
This can be based off of the milestone, but it may change depending on what your new feature is. You may want to consider more than one basic demo: one demo just for automatic differentiation and one demo for your new feature. Note that this is very much dependent on your final deliverable! Keep the basic demos to a manageable number. Software organization High-level overview of how the software is organized. Directory structure Basic modules and what they do Where do the tests live? How are they run? How are they integrated? How can someone install your package? Should developers and consumers follow a different installation procedure? Implementation details Description of current implementation. This section goes deeper than the high-level software organization section. Try to think about the following: Core data structures Core classes Important attributes External dependencies Elementary functions Your extension To start, copy over the new/future feature section of the documentation from Milestone 2 and update it to reflect the M2 Feedback. Description of your extension (the feature(s) you implemented in addition to the minimum requirements.) Additional information or background needed to understand your extension This could include required mathematics or other concepts Broader Impact and Inclusivity Statement (See Below) Future What else do you want to add? What is missing? Don't just think about mathematical things here. Try to think about applications that you'd like to have use your code. Just about every area of science can use automatic differentiation (physics, biology, genetics, applied mathematics, optimization, statistics / machine learning, health science, etc.). Broader Impact and Inclusivity Statement Include a section in your documentation and in your GitHub README on Broader Impact and Inclusivity. This section should be around a 1/2 page in length and it can be the same between your documentation and your README. 
It should address two points: The potential broader impacts and implications of your software. How is your software inclusive to the broader community? Broader Impact Regarding the broader impact portion, try to think about the ways people will use or misuse your software. What are the consequences? How should people use it responsibly? Are there any ethical implications? The NeurIPS website has a number of references to get you started on thinking about this: Please read through the paper: It's Time to Do Something: Mitigating the Negative Impacts of Computing Through a Change to the Peer Review Process Suggestions for Writing NeurIPS 2020 Broader Impacts Statements A Guide to Writing the NeurIPS Impact Statement Example paper Software Inclusivity In principle, there should be no barrier whatsoever for other developers to contribute to your code base. In practice, these barriers do exist and can be rather subtle. For example, are there any subtle barriers to underrepresented groups? What about working parents? What about people from different countries or non-native English speakers? Do people from rural communities feel they have something to offer? Carefully think about the code contribution process for your software project. How are pull requests being reviewed and approved? Who is reviewing and approving these requests? Python has a Diversity Statement , but it is fairly generic and contains a lot of boiler plate. Different Python groups may have more concrete policies. Can you do better? How this will be graded Both topics in this section are subjective and there is no perfectly correct answer. I am looking for effort and honest attempts to address these issues. Code quality The quality of the written code will be assessed largely based on the material of lecture 14 , slides 32-42 in lecture 8 , object oriented design principles (interface design of your library in addition to dunder methods) as well as lectures 3 and 4 dedicated to version control (VCS). 
To give you an idea what this means, here are a few practical examples: Are tests separated from source code? Why: when you deploy your project, you do not want to ship development related code. Having tests separated simplifies the packaging process of your source code (lectures 8 and 14). How does your git commit history look like? Why: one of the first things a new developer that has been added to your project group does is browsing through the commit history of your source base. The information that can be extracted in that process depends on how descriptive your commit messages are. Is the commit subject a concise description or just a mere \"Fixed bug\"? Are the commits of reasonable size with logical contributions of code or do they mix many things together (could also be an indicator of last-minute commits)? How is the distribution of commits among group members? How well is your code covered? Why: an important quality metric of your code is coverage. While in the \"Test Suite\" section above we assess the correct usage of pytest and/or unittest as well as the structure of your tests (unit tests, integration and regression), the quality of these tests is assessed here. How high is the line coverage (at least 90%)? Do your tests take into account boundary/edge cases? Can we make your code produce side effects? If that is the case your tests are not robust, even if you achieved 90% minimum coverage. A high coverage percentage does not necessarily imply good tests. Do you use docstrings for your modules, functions, classes, methods? Why: this is the main tool to document your code . A high quality code necessitates consistent docstrings through out your code base. The \"Documentation\" section above requires you to document your project and usage of your library with demos and examples. The documentation of your code is assessed here. A nice way to combine the docstrings with your documentation is the sphinx package (this is optional for your project). 
Docstrings are further required for a proper working of the pydoc tool that will be utilized by arbitrary users of your library. Ideally, well written docstrings are decorated with short doctests to illustrate example usage of the code (see lecture 14). How are comments utilized in the code? Why: these are important for your fellow colleagues who collaborate or continue development. It is very easy to 'forget' about them because to you (at the time you write the code) the intention is clear. This will not be the case for the other person or you in two weeks from writing the code (especially for complex sections in the code). It is important to keep that actively in mind when writing code and it is a major contributor to high quality code. How accessible is your library (through the interface)? Example: in many cases we are interested in gradients (e.g. Newton's method, learning or optimization). Does your library provide such an interface that returns the gradient of a function f(x) or is the user required to 'build' the gradient by multiple library calls if x is higher dimensional? What are the returned types? Example: following the gradient example above, is the returned type a numpy array or a naive python list? Since gradients are often used in numerical computation, which type would provide more value for your end user? Is your code formatted consistently? Why: a consistent formatting is very important for readability of your code. Do you use consistent indentation in your python code? Are comments formatted consistently throughout? Docstrings should also have a consistent formatting. It is the same with typography in printed media. The font family, font size or font color does not arbitrarily change because it would be very hard on the eye to read such a document. The same applies to code. See slides 6-8 in lecture 6. Final Deliverables The deliverable for the project is a video that describes all of the work done on your project throughout the semester. 
You are free to decide upon the format of the video yourself (with some restrictions - see below). Some ideas include: an actual video of your group presenting in front of a screen a narrated video of presentation slides with diagrams and illustrations a narrated live code demo a mix of all of the above You may want to consider tools such as asciinema . We won't be judging you on the quality of your camera or your video editing ability so don't worry about that. We will judge you only on the content you present. However, if the quality of the video is so poor as to prevent us from judging the quality of the work, then we will unfortunately need to deduct points. Video Requirements The video should be narrated by all members of your group. Every group member should speak an equal amount in the video. For a group of n people, you should change speakers exactly n-1 times. (i.e. you should all speak exactly once during the video - you shouldn't continuously change who is speaking). The Introduction / Background and Implementation details/Software organization/How to use sections should contain information related to the minimum project requirements only. If you want to provide additional Introduction / Background or Implementation details/software organization/how to use for your extension to the project, you should do this in the Your additional feature(s) and extension section of the video. Section Minimum Length Introduction/background 2 minutes Implementation details/Software organization/How to use 4 minutes Your additional feature(s) and extension 5 minutes Future work/possible extensions 2 minutes Video Submission instructions Your video will be submitted to a group assignment on Canvas. Your video should be at maximum 15 minutes. WARNING: DO NOT exceed the time! We will not grade portions of your project that exceed the time. Make sure the title of your video includes your teamXX ID. 
Things to keep in mind Remember, the teaching staff already has full access to your code, so there is no need to focus on small implementation details. DO NOT include snippets of your actual library code in your presentation! Pseudo-code and flowcharts can be very useful to give the big idea of how your package works. Library demos can be very useful, but be careful. If they don't work well then you'll waste all your video time. You should provide sufficient background for the project. Don't overdo the mathematical details for automatic differentiation. We are already familiar with them. Instead, provide the big ideas behind automatic differentiation and the motivation for using it. Spend a fair bit of time on your new feature. You may need to present some mathematical background to get your audience oriented, but this will depend on your extension. Be sure to conclude with future work and possible additional extensions. Grading Breakdown Points Task 20 Complete forward mode functionality (see above) 15 Documentation, Test Suite & Coverage with GitHub page (see above) 4 Broader Impact Statement 20 Video Presentation 30 New Feature (see above) 15 Code quality (see above) 104 Total","tags":"Project","url":"project/FD/"},{"title":"Milestone 1A","text":"Due: Thursday, September 22nd, 11:59 PM You will now begin your final project to develop a Python package for automatic differentiation. Please get together with your project group and complete the tasks below for Milestone 1A. Steps to complete Find team members you would like to work with and establish a way to communicate. Once your team is complete, request a team ID from cs107-staff@g.harvard.edu . Your team ID will be team01 if you are team 1 or team10 if you are team 10 and so on. The project code will be hosted in private repositories in the CS107 organization at code.harvard.edu . 
Once you have received your team ID from item 2, a member of your team creates a private repository in the CS107 organization at https://code.harvard.edu/CS107 named after your team ID (e.g. team01 if you are team 1) and adds its team members to the repository (you do not need to add the teaching staff, we will have access already). Final Deliverables Form a project team and request team ID from teaching staff. Create a private team repository in CS107 organization. Grading breakdown Points Task 1 Team formation 1 Creation of team repository 2 Total","tags":"Project","url":"project/M1A/"},{"title":"Milestone 1B","text":"Due: Tuesday, October 4th, 11:59 PM You will now further configure your group repository Git Conventions We expect all work from this point onward to be done on feature branches and merged into master or main via Pull Requests. Try to work with different branches and \"approve\" each other's pull requests by reviewing their code and then merge into your default project branch. You must work with your project Git repository. The teaching staff will frequently check the history of your project. Steps to complete Within your project repo, you must set up two workflows with GitHub Actions . One workflow will be used for tests and the other for code coverage . You will need two .yml files in the .github/workflows directory in your project repository. The .yml files do not need to have meaningful declarations at this point but you should have at least the name: option and the on: option defined. See this link for more details. Make sure the README.md file at the root of your repo includes badges indicating whether your CI workflows are passing or failing. Your workflows are expected to be failing at this point. You should end up with a rendered README.md file that looks like this (workflows may fail or have no status ): In the root of your project repo, you should create a directory called docs . 
You can use this directory to organize documentation and tutorials for your final package. You will begin creating this documentation as part of the next milestone 1. Grading breakdown Points Task 1 Configuring test action 1 Configuring coverage action 1 Creating project structure 3 Total","tags":"Project","url":"project/M1B/"},{"title":"Milestone 2A","text":"Due: Tuesday, November 1st, 11:59 PM Submission Instructions Please update your milestone1 submission to reflect the feedback given by the teaching staff. Add a Feedback section to your milestone1 document which outlines the feedback given and the updates that you made to address the feedback. One convenient way to organize this would be to create a section titled Feedback and a subsection titled Milestone 1 . Then, under this subsection, you can create a list of the individual comments that you received along with your response on how you addressed each comment. If you receive feedback on the README.md file, please update that document as well. Your submission should be in the same format as milestone 1: teamXX/ ├── docs │ └── milestone1 ├── LICENSE ├── README.md └── ... Grading Breakdown Points Task 1 Adding Milestone1 Feedback to milestone1 doc 1 Updating milestone1 submission to reflect feedback 2 Total","tags":"Project","url":"project/M2A/"},{"title":"Milestone 2B","text":"Due: Thursday, November 10th, 11:59 PM Submission Instructions Milestone 2B is a progress report for Milestone 2 . Specifically, for each member of the group, please answer the following questions. What tasks has each group member been assigned to for Milestone 2. What has each group member done since the submission of Milestone 1. This progress report should be no longer than 1/2 page (maximum 1 page). Its main purpose is to help your group structure project work and delegate tasks. We will be specifically checking to make sure there is equitable distribution of tasks. 
All members of the group should contribute code to the core library and all members should contribute to the documentation. Your submission should be in the following format: teamXX/ ├── docs │ ├── milestone1 │ └── milestone2_progress ├── LICENSE ├── README.md └── ... Grading Breakdown Points Task 2 Progress report completed 2 Total","tags":"Project","url":"project/M2B/"}]} \ No newline at end of file +var tipuesearch = {"pages":[{"title":"CS107/AC207 Project","text":"Project Overview Goal You will develop a software library for a client (the teaching staff). The development of this library will leverage modern software development practices covered in the course. By the end of the semester, the client should be able to easily install and run your package. Topic TBD Project Milestones TBD Groups You will work in groups of 4-5 students. You are free to choose your project partners but groups sizes must consist of the number of students mentioned before. Some members of the group will be stronger than others. It is expected that you work together and help each other as needed. This is an opportunity for less experienced coders to improve their skills by working with more experienced coders. Every person must contribute. Expectations This project has a few non-negotiable expectations, which are outlined in basic expectations . The project also has a more open-ended component, which is described in additional expectations . Basic Expectations The client should be able to easily install the library, run the tests, access the documentation, and use the library for their application. Documentation for every subsystem in the project must be provided. Link to the docs from the README.md in each folder. The top level README.md should contain an overview, links to other docs, and an installation guide which will help us install and test your system. The project must utilize a proper packaging system for distribution and installation of the library. 
The project must ship with a test suite. Documentation on how to run the tests is mandatory. Additional Expectations TBD Broader Impact You must write a broader impact statement for your library. The broader impact should consider the accessibility of your software library to different groups of people. This statement should be around 250 words (approximately 1/2 page). It can be placed in the README.md of your library. Things to consider when writing this statement are: How will you make your library accessible to different groups? What process will contributions to your library need to go through? How will you ensure that this process is fair and welcoming to all groups?","tags":"pages","url":"pages/project.html"},{"title":"Resources","text":"Books No book is required. But we highly recommend two books for this course. Fluent Python: Clear, Concise, and Effective Programming, by Luciano Ramalho. Publisher: O'Reilly Media. 2015. Designing Data Intensive Applications, by http://dataintensive.net/ , The Big Ideas Behind Reliable, Scalable, and Maintainable Systems by Martin Kleppmann. Publisher: O'Reilly Media 2014 Other useful books The Practice of Programming by Brian W. Kernighan and Rob Pike, Addison-Wesley, 1999. Skiena: The Algorithm Design Manual Abelson, Sussmann and Sussmann: SICP and python based online version based on it: http://composingprograms.com/ High Performance Python: By Micha Gorelick, Ian Ozsvald. Oreilly Media 2014. 
Evaluating Derivatives: Principles and Techniques of Algorithmic Differentiation by Andreas Griewank Papers and other readings Python pep8 An opinionated guide to python style Git Recommended: Git from the bottom up Recommended: Git Book GitHub Videos and Training GitHub Interactive Tutorial Git - the simple guide Git Reference Git Cheat Sheet Git Immersion Tutorial Git Atlassian Tutorial Python python Rich overview of Python 3 language features (recommended to work through) Scientific visualization with Python and Matplotlib C/C++ Fall 2021 C/C++ primer material C Tutorial C++ Tutorial C++ Cheat Sheet C++ Reference Vim Spend 30 minutes to complete the vimtutor . After you have installed vim , execute the following command in your command line: vimtutor Vim Cheat Sheet Vimcasts Recommended book Bash Command Line Reference Cheat Sheet Bash scripting Cheat Sheet Unix-Related Basic Computing Tools Windows Users Using Linux Subsystem on Windows 10 PuTTY SSH client for Windows Ubuntu Docker Image You can get an Ubuntu based Docker container with docker pull iacs/cs107_ubuntu The container is hosted here . 
The Dockerfile and run_cs107_docker.sh launch script can be found in the class repository .","tags":"Resources","url":"pages/resources.html"},{"title":"Schedule","text":"1 9/5, 9/7 Lecture 1: Unix and Linux Lecture 2: Command line 2 9/12, 9/14 Lecture 3: Bash Scripting Lecture 4: Version Control / git Pair Programming Wk1(9/22) HW1: (9/12 - 9/27) 3 9/19, 9/21 Lecture 5: git Lecture 6: Python Pair Programming Wk2(9/29) 4 9/26, 9/28 Lecture 7: Python / OOP Lecture 8: Python Pair Programming Wk3(10/06) HW2: (9/27 - 10/11) 5 10/3, 10/5 Lecture 9: Python Lecture 10: Databases I Pair Programming Wk4(10/13) 6 10/10, 10/12 7 10/17, 10/19 8 10/24, 10/26 9 10/31, 11/2 10 11/7, 11/9 11 11/14, 11/16 12 11/21, 11/23 Thanksgiving Break 13 11/28, 11/30 14 12/5, 12/7 Reading Period 15 12/12,12/14 Final Exam Period Final Exam Period","tags":"pages","url":"pages/schedule.html"},{"title":"Schedule","text":"All due events with a given date are due on 21:59pm that day . Wk Tuesday Thursday Labs Events 1(35) Lecture 1: 2023-09-05 Class introduction/organization History of Bell Labs, Unix and Linux Command line introduction Lecture 2: 2023-09-07 More command line Pipes Regular expressions File attributes 2(36) Lecture 3: 2023-09-12 Command line customization I/O redirection Environment variables Shell scripting Process management Lecture 4: 2023-09-14 Version control systems (VCS) Centralized and distributed models Intro to Git PP01: (2023-09-12) Setup private class repository, tmate 3(37) Lecture 5: 2023-09-19 Version control systems (VCS) Managing repositories Remote repositories Branching Lecture 6: 2023-09-21 Python basics Objects and Functions Environments Closures PP02: (2023-09-18) Bash scripting, Git workflow Note: PP01 deadline (2023-09-22) 4(38) Lecture 7: 2023-09-26 TOPIC 1 TOPIC 2 TOPIC 3 Lecture 8: 2023-09-28 TOPIC 1 TOPIC 2 TOPIC 3 PP03: (2023-09-25) Topics PP03 Note: HW1 deadline (2023-09-27) PP02 deadline (2023-09-29) 5(39) Lecture 9: 2023-10-03 Lecture 10: 2023-10-05 
6(40) Lecture 11: 2023-10-10 Lecture 12: 2023-10-12 7(40) Lecture 13: 2023-10-17 Lecture 14: 2023-10-19 8(40) Lecture 15: 2023-10-24 Lecture 16: 2023-10-26 9(40) Lecture 17: 2023-10-31 Lecture 18: 2023-11-02 10(40) Lecture 19: 2023-11-07 Lecture 20: 2023-11-09 11(40) Lecture 21: 2023-11-14 Lecture 22: 2023-11-16 12(40) Lecture 23: 2023-11-21 Thanksgiving break: 2023-11-23 11(40) Lecture 24: 2023-11-28 Lecture 25: 2023-11-30 11(40) Lecture 26: 2023-12-05 Reading period: 2023-12-07 11(40) Final exam period: 2023-12-12 Final exam period: 2023-12-14","tags":"pages","url":"pages/schedule_static.html"},{"title":"Syllabus","text":"Course Objective The primary goal of this course is to teach you how to develop effective software for scientific applications. In order to achieve this goal, there are several non-negotiable topics that must be included in the course. We will be concerned with two primary thrusts: System and Software Engineering and Language . Moreover, we aim to provide you with a suite of modern software development techniques and workflows. Learning Objective After successful completion of this course, you will be able to: Use Python, including its advanced features to write scientific programs. Have a basic idea how the Python interpreter works. Understand what features of Python make up its language execution model and how these features impact the code you write: e.g. how modularity, abstraction, and encapsulation can be used to solve problems. Write programs with good software engineering practices. These practices include: working on remote machines, version control, continuous integration, documentation and testing. Utilize data management techniques to store data, starting from a good understanding of data structures to databases. Combine these techniques together to write large pieces of software working in a team. Develop pipelines to integrate data acquisition and processing. Evaluate and test software as part of the development process. 
Be able to contribute on both the science and software engineering sides of things. Prerequisites You should have some basic familiarity with programming (functions, variables, constants, differences between integer and floating point, etc.) at the level of CS50. Some comfort with a tool to edit text files is beneficial. Any text editor or IDE will suit this purpose. The student should have passed a basic calculus class. The lectures will review the necessary fundamentals required to succeed with the class project. Besides this, you should have interest or investment in scientific computing. You can download Homework 0 for self-assessment here (not graded). You do not need to be able to solve all problems in order to take this class. Jupyter Notebooks Jupyter notebooks are great for code prototyping and learning how to use new features and APIs. However, they are not suitable for large software development projects! One reason for this is because code development in Jupyter notebooks is a nonlinear development process and there is presently no good solution for version control of Jupyter notebooks. A second reason is the question of efficient source editing. A helpful tool to convert (back and forth) Jupyter notebooks to pure python code is Jupytext . Homework assignments and lecture exercises turned in as Jupyter notebooks will not be graded. Textbooks There is no required course textbook. However, the course content will draw from various sources. We will cite the source when appropriate. Please consult the resources page for recommended textbooks and additional helpful material. Course Format The delivery of course content will occur via two weekly lectures as well as weekly pair-programming sections. Attending these sessions is mandatory . Lectures will consist of considerable interaction and discussion and will be greatly enhanced by student participation. The course contains the following main components: Lectures: Deliver the main content of the class. 
Attendance is mandatory. Quizzes: Graded in-class quizzes intended to assess the learning progress. Pair-programming: Pair-programming (PP) sections offer practice on topics addressed in class and help assess the skills to program in a collaborative environment. Attendance is mandatory. Homeworks: Homework assignments deepen the lecture material and include coding exercises. Exercises may be of theoretical or practical nature. Projects: The class is accompanied by a project (teams of 4-5 students) to practice the methods learned in class on a real Python application. The project topic is given by the teaching staff. The main programming language taught throughout the course is Python. Grading The following weight table is used for individual components of the class. The class does not have standard midterm or final exams. Total Weight Homework (7 Homeworks) 35% Project 35% Quizzes (4 Quizzes) 15% Pair-programming (12 sections) 15% Homework There are 7 homework assignments where each contributes equally to the final grade. The homework is focused on the topics discussed in class and involves programming and theoretical work. The teaching staff is determined to return solutions and graded assignments with feedback after the due date. It is your responsibility to check the consistency between your graded work and the assignment solution. You have the option to address possible inconsistencies in office hours or request a regrading for the assignment (see the homework grading inconsistencies section below). Homework will be released on the CS107/AC207 class repository . Push notifications for that repository will be distributed through the class mailing list . Homework will be graded on a 100 point scale: 100 = Solid / no mistakes (or really minor ones) 80 = Good / some mistakes 60 = Fair / some major conceptual errors 40 = Poor / did not finish 20 = Very Poor / little to no attempt. 
0 = Did not participate / did not hand in Homework Submission Homework must be submitted via commits in your private git repository hosted in the CS107 organization at https://code.harvard.edu/CS107 . Grading and feedback for homework is done through the Gradescope platform which is connected to the class' Canvas site . Your homework solutions must therefore be zipped and uploaded in the Gradescope section of the class canvas. See the homework workflow tutorial for more details. The homework due date is indicated on the problem sheet and displayed in the schedule as well as shown on Canvas and Gradescope. Homework submissions will be graded on: Correctness: your code must run and must produce the correct result. We are not debugging issues when grading submissions. Presentation: presentation means structure and readability. We expect you to write high-quality, readable and tested code. A quality code is well commented in places where it is not straight forward to deduce the logic from code itself (from the reviewers perspective). We expect you to think about aspects such as modularity, reusability, code duplication and error handling when you design and write code. Presentation of results also means that unnecessary or superfluous files like editor backup files or other unrelated data should not be included in the submission commits (use .gitignore for this purpose). See the following tutorials to help you get started with homework submissions: How to setup your private class repository (onetime setup) Homework workflow Homework Late Days Homework submissions are accepted before the deadline of the assignment is due. You have three late days at your disposal that can be consumed for late submissions and two consecutive late days can be used at most for any of the homework assignments. Please note that any commits on your homework branch pushed after the deadline has passed are not considered for grading by default. 
If you wish that we consider a late commit for grading, please contact the teaching staff at cs107-staff@g.harvard.edu with appropriate explanation. This will count towards your late day budget. It is your responsibility to plan your work ahead and commit on time. If you have consumed all your late days and you have another late submission, it is in your benefit to still commit the work. We assume the Harvard Honor Code for all late submissions in case solutions are already posted. If you have a verifiable medical condition or other special circumstances that interfere with your coursework please let us know via cs107-staff@g.harvard.edu as soon as possible. Homework Grading Errors If you believe there is an error in your assignment grading, you can submit a regrade request through the Gradescope platform . Note: The entire assignment will be regraded. This may cause your total grade go up or down . An assignment can only be regraded once . Regrade requests are due within 2 days after the release of the grades . Project Please see the project section for more details. Quizzes There are 4 quizzes out of class which are graded and intended to assess the learning progress. Each quiz addresses topics from the lecture material . Quizzes are open book/ www and include multiple choice questions with at most back of the envelope calculations. Quizzes contain 12 questions and take 25 minutes. They are accessible on canvas within a 12 hour time window from 9am to 9pm at the day of the quiz. Note: if a quiz takes 25 minutes and you start the quiz on 8:50pm, you will have only 10 minutes to work on the quiz. Please see the class schedule as well. Pair-Programming Sections Pair-Programming will form an essential part of the course. Pair-programming will take place in mandatory pair-programming sections led by members of the teaching staff. You are required to sign-up for your preferred pair-programming section at the beginning of the semester. 
You are expected to attend your chosen section during the semester. Should you not be able to attend one of your sections, please coordinate with your section TF to attend another section this week in order to obtain the attendance credit. In CS107/AC207 we are focusing on command line tools for the development of software projects in computational science. It is important that you get familiar with a small selection of such tools and integrate them in your development process. The pair-programming sections aim at combining some of these tools together to provide you with hands-on experience while developing software. The key is the \"pair\" in pair-programming. The exchange of knowledge between team mates in these pair-programming sections is essential for learning said tools or learning something new from your peers. Please see the following file in the class git repository for the details: https://code.harvard.edu/CS107/main/blob/master/lab_groups.xls Pair-programming Submissions Pair-programming exercise solutions must be submitted in your private git repository hosted in the CS107 organization at https://code.harvard.edu/CS107 . Only commits made on or before the due date will be considered for grading . The deadline for submission is usually one week after the last section for the exercise. Given this extra time for completion, late days do not apply to pair-programming exercises . The submission due date is indicated on the problem sheet and displayed in the schedule . As you are working in groups of 3-4 students for the lab exercises, the solution files you come up with in the group are submitted by each group member individually in her/his own private Git repository. Pair-programming submissions will be graded based on the following criteria: Attendance: your attendance will be recorded by the TF who leads the section. Joining the section at the beginning and then leaving 10-15 minutes later will not reward attendance credit. 
If you need to leave because of another appointment then it is expected that you communicate beforehand and coordinate with your TF. Please see the attendance policy section below as well. Your pair-programming session is determined at the beginning of the class by choosing lab sections in my.harvard . You can select your preferences depending on your schedule. Once determined, you can lookup your session details in the https://code.harvard.edu/CS107/main/blob/master/lab_groups.xls sheet. Completion: pair-programming submissions should reveal effort that the student attempted to solve the tasks. If you experience difficulties in a particular problem and you are not able to complete the task, please indicate the issues you had in your code using comments; the teaching staff will take that reasoning into account. Handing in an empty skeleton (same as hand-out) does not meet the expected standard and will not award credit for the submission. See the following tutorial to help you get started with pair-programming submissions: Pair-programming workflow Office Hours The teaching staff holds weekly office hours. Office hour times and locations are listed on the class main page. Office hours offer an opportunity to review course materials and receive additional guidance on your homework. Please see the following file in the class git repository for the details: https://code.harvard.edu/CS107/main/blob/master/office_hours.xls Attendance Policy Attendance at lectures and pair-programming sections is mandatory as they are core parts of the class. Pair-programming sections (labs) will be held on weekdays that we determine at the beginning of the class according to a best fit of the students' individual schedules for the term. You are required to attend the labs on the assigned day. Rescheduling of a lab to a different day due to an unforeseeable event must be coordinated with the responsible TF by sending an email to cs107-staff@g.harvard.edu . 
To be excused from a lecture or a lab, we ask you to follow the Harvard Honor Code and send an email to cs107-staff@g.harvard.edu at least one day before the lecture or lab. Lecture recordings are available only when students are excused for a lecture. Collaboration Policy You are welcome to discuss the course material and homework with others in order to better understand it, but the work you turn in must be your own (with exception of the project where collaborative work is permitted). Any work submitted as your own without properly citing the original author(s), is considered plagiarism. Failure to follow the academic integrity and dishonesty guidelines outlined in the Harvard Student Handbook will have an adverse effect on your final grade. This includes the removal of copyright notices in code. You may not submit the same or similar work to this course that you have submitted or will submit to another without permission. The teaching staff may use tools to compute correlations between submitted work. Use of AI Models Purpose of Policy: This policy outlines the acceptable use of AI models, including but not limited to ChatGPT, in completing assignments for this course. Policy Guidelines: Original Work: Students are expected to complete assignments using their original thoughts and interpretations. AI models can be used to help understand concepts, generate ideas, or learn about different perspectives, but they should not write or complete assignments for students. Collaboration with AI: Students may use AI models for brainstorming or generating preliminary ideas, but the final work submitted must be substantially their own. Students should be able to explain their reasoning, logic, and conclusions without relying on the model's output. Restrictions for Specific Assignments: There may be specific assignments (e.g. quiz part of the midterms) or parts of the course where the use of AI models is entirely prohibited. 
These restrictions will be clearly stated in the assignment guidelines. Ethical Considerations: Students are encouraged to approach the use of AI with ethical considerations in mind, including issues related to privacy, bias, and authenticity. Consequences for Non-Compliance: Failure to adhere to this policy may result in academic penalties as outlined in the course's academic integrity policy. Questions and Clarifications: If students have questions about the appropriate use of AI models in an assignment, they should consult the course instructor or teaching assistants before proceeding. Please refer to the University's policy for further information. Accessibility If you have a documented disability (physical or cognitive) that may impair your ability to complete assignments or otherwise participate in the course and satisfy course criteria, please contact the teaching staff or directly the Accessible Education Office to receive an AEO letter that will authorize us to help you with corresponding accommodations. Diversity Statement All participants in this class are expected to foster empathy and respect towards each other. This includes instructors, teaching staff or students. The motivation to take this course shall be to experience the joy of learning in an environment that allows for a diversity of thoughts, perspectives and experiences and honors your identity including race, gender, class, sexuality, religion, ability, etc. Any constructive feedback for improving the class environment is welcome and I encourage you to reach out to the instructor or teaching staff with any concerns you may have. 
If you prefer to speak with someone outside of the course, you may find helpful resources at the Harvard Office of Diversity and Inclusion .","tags":"pages","url":"pages/syllabus.html"},{"title":"Tutorials","text":"How to Setup your Private Class Repository Steps to Setup Your Private Class Repository Add an SSH Key to Your Account Homework Workflow Example Homework Workflow Step 1: Branch Off Step 2: Solving the Homework Step 3: Create a Pull Request Creating a Web Pull Request Step 4: Submit on Gradescope Pair-programming Workflow Protocol How to launch tmate Recommended Workflow How to Setup your Private Class Repository All of your work in CS107/AC207 will be committed in your private class git repository hosted in the CS107 organization at https://code.harvard.edu/CS107 . (The class project will be hosted in another repository in the same organization, see Milestone 1A for this separate task.) This tutorial walks you through the steps to create your private class repository. If you have already created git repositories on GitHub, then there is nothing new to learn in this tutorial and you should be familiar with the process already. A note on https://code.harvard.edu/ : this is an instance of a GitHub Enterprise edition hosted by Harvard University. The user interface is identical to the public GitHub site. The main difference is that https://code.harvard.edu/ is owned by Harvard University , whereas GitHub belongs to Microsoft which gives rise to security concerns regarding data belonging to classes held at Harvard University. Steps to Setup Your Private Class Repository Obtain your Harvard NetID Send an email to cs107-staff@g.harvard.edu (using your .harvard.edu email) to request access to the CS107 organization . Include your NetID from step 1 in the body of the email and choose an appropriate subject line. 
Once added to the organization, navigate to https://code.harvard.edu/CS107 (login if necessary) and click the green \"New\" button to add a new repository. Your repository must be named after your NetID . You can add an optional description if you like. Make sure the private radio button is checked and click \"Create repository\". You do not need to check any other options. This is all you have to do for now. In the first homework we will focus on how to setup your new repository such that you can work with it from your laptop (you can skip the landing page after you have created the repository). When you navigate back to https://code.harvard.edu/CS107 you should see something similar to this: The blurred repository is your private class repository that was the focus of this tutorial. The main repository is the main CS107/AC207 class repository which is used to distribute all of the class material during the semester. Any updates to this repository will be broadcast via email message such that you will not miss out on new material. In the first homework we will set this repository as an upstream such that you can conveniently unpack class material into your private repository. Note: private repositories are only visible to you within the organization. Please do not create other repositories in the https://code.harvard.edu/CS107 organization. You have your own user account on https://code.harvard.edu/ just like you have on GitHub or other providers. Your user account requires your Harvard login credentials and is a good alternative to hosts like GitHub. Feel free to create as many repositories in your user account as you like. Add an SSH Key to Your Account In order to access content on https://code.harvard.edu using Git you need to setup an SSH key. Check if you already have the file ~/.ssh/id_rsa.pub (assuming RSA). If you do not have such a file you can create one with ssh-keygen -t rsa -b 4096 Choose the default location by just hitting enter. 
You may enter a password for the key or just hit enter to go without a password. If you go with a password, you will have to enter it every time you use the key. To upload the public key to your Harvard GitHub account , click on your icon in the top right corner on your https://code.harvard.edu page, then click on \"Settings\" and then \"SSH and GPG keys\" in the left panel. Alternatively use this link https://code.harvard.edu/settings/keys . Click on the green \"New SSH key\" button in the top right corner and give your new key a title (e.g. the name of your laptop). In the key field paste the contents of your public key found in ~/.ssh/id_rsa.pub . Use for example cat ~/.ssh/id_rsa.pub and copy paste the output into the \"Key\" field on your GitHub page. You are now able to access any repositories on https://code.harvard.edu with corresponding permissions. Never share your private key ~/.ssh/id_rsa with anybody. Note: do not create a key in the class Docker container since the key will be lost when you exit the container. For security reasons, sensitive keys like this should not be put in containers. Homework Workflow The following are the basic rules we apply for homework submissions: Naming convention for homework directories: your private repository should contain one homework directory on the repository root with hwX sub-directories for each homework assignment. The X in hwX is to be replaced with the assignment number. For example hw1 , hw2 and so on. Which files will be considered for grading: within the sub-directory hwX , place the assignment files that you want us to grade in a directory called submission . We will only grade data in these directories . Pull request (PR): your homework assignments must be completed on git branches called hwX , where X is again to be substituted with the assignment number. 
Your homework X submission requires an open pull request to merge the hwX branch into your main (or deprecated master ) branch for full points (both branches are inside your private class repository in the CS107 organization ). Some implications of this: Solving homework on the main or master branch is always wrong. For each homework submission you need to issue one open PR. Merging an open PR before the teaching staff has reviewed and graded your work will make the PR disappear . Only files inside submission in PR X will contribute to your hwX grade (see next item). Gradescope: your homework will be graded on the Gradescope platform that has been setup and linked to the class canvas page. The platform does currently not support submission directly via your Git repository. You therefore have to create a zip archive of your submission directory created in step 2 above and upload the archive on Gradescope . It is important that you zip-up the directory and not individual files inside. You can use the command zip -r submission.zip submission/ , where the -r option means add files recursively , submission.zip is the name of zip archive and submission is your homework submission directory from step 2 containing your solution. This assumes you are in homework/hwX inside your Git repository. Points will be lost if any of these requirements are violated . The teaching staff will review the open PR for each homework and grade your work accordingly. Grades will be released on canvas and feedback is provided through the Gradescope platform. Once you have received the grade and feedback, your open PR for homework X can be merged into your main or master branch if there are no more pending issues. After the PR is closed, you may delete the hwX branch in your repository. This concludes a homework submission. Example Homework Workflow This example is intended to help you internalize the three basic rules described above. 
Each homework awards 10 points by performing these steps correctly. Note: Specific instructions provided in each homework assignment may override the following basic approach. Suppose we want to work on homework assignment 3, which consists of 4 problems. Step 1: Branch Off The ease of branching is the main strength of git . Branches allow you to be destructive without affecting production code or data. The reason we solve homeworks on individual branches is to help you develop a feel for this protection and to materialize the required steps to create branches. Branches will provide you true comfort when working on real projects outside of this class. Make sure your master or main branch is in the state you want your new branch to be based on. If you need to synchronize with your default remote branch you can type git pull The next step consists of creating and switching to a new branch that is based off the current branch. For this you can use git checkout -b hw3 which is how you did it before git 2.23.0 . Since the checkout command is ambiguous , the preferred way for more recent versions of git is git switch -c hw3 You are now on a new branch called hw3 as required. You will need to issue a pull request into main or master from this branch such that your homework will be graded. You can create the PR now (see below) or once you are done with solving hw3 , it does not matter to git . (Pull requests are not something designed by git itself, but rather by platforms like GitHub or GitLab.) Note: you will lose 5 points if you are not solving your hwX (in this example it is hw3 ) on a branch named hwX . You are of course free to create additional branches besides hwX if suitable. Step 2: Solving the Homework The files the teaching staff will consider for grading have to be located in the directory homework/hw3/submission . You are free to put other files below homework/hw3 that might be useful when you revisit your work sometime later. 
The problem sheet might be one of those files. Class handouts are distributed in the main class repository . You can manually create these directories and copy the files you want into your hw3 directory using, for example: mkdir -p homework/hw3/submission cp /homework/hw3/hw3.pdf homework/hw3 Alternatively you can use git by configuring the main class repository as another remote in your local git repository (see homework 1). In this case you can checkout all the distributed homework files at once with git checkout class/master -- homework/hw3 assuming that the remote points to https://code.harvard.edu/CS107/main and is locally named \" class \". You may need to update your refs with git fetch --all before you invoke the checkout command above. The homework sheet will state what files have to be submitted. For this hw3 we assume they are P1.py , P2.py , P3.py and P4.py , one for each of the four problems. These files should run and return the required output. They have to be submitted inside the homework/hw3/submission directory. You should commit your work often in logical chunks. Your commits are to be done on the hw3 branch, of course. The following are a few commands that might be helpful: Use git status often to check your local state. Use the git add command to stage files you have changed for a commit. Use git commit -m to create a commit with an appropriate commit message. Use git stash to temporarily stash modified files (similar to a commit but it is not written to the history). Later on use git stash list to list all your stashed changes if you have used git stash multiple times. You can check what will change when you apply the stash with git stash show -p and apply the stashed changes with git stash apply (or git stash pop which also removes the stash from the list). Note that these commands work on the first stash object in the list stash@{0} if you do not explicitly specify the stash object you want to apply. 
Use git push to push local branch/commits to your remote repository. Use git restore to undo changes to a single file. Use git revert to undo the changes in a specific commit. Make sure you have committed the solution you want to submit inside the homework/hw3/submission directory with the required file names. Step 3: Create a Pull Request If you have local commits not pushed to the remote, issue the git push command. You are now ready to issue a pull request (you could also have done this step at the very beginning of solving this homework, this is up to you). The goal is to merge the hw3 branch into your main or master branch eventually. The teaching staff must review and grade your work first, however. There are two ways to accomplish a PR on GitHub: Through the web browser at https://code.harvard.edu/CS107/ . Through the GitHub command line client . This method is helpful if you get distracted by the context switch that is associated with the first method. Note: you will lose 2 points if you do not create a PR. Disclaimer: you would not typically issue a PR for projects where you are the sole contributor. Pull requests are typical for large projects at a company in which someone else will review your code before you can merge your code to the production branch. We want you to become accustomed to this type of workflow. It is a good idea to always use separate development branches. You should never commit straight to your main or master branches until the changes have thoroughly been tested. Creating a Web Pull Request Navigate to your https://code.harvard.edu/CS107/ private class repository and click on the \"Pull Requests\" tab in the top left part of the window. Click on the \"New pull request\" button Choose your main or master branch as the base (the one you want to merge into) and your hw3 branch as the one you want to compare to. This should automatically reload the page and show the changes that will be applied. Click on the \"Create pull request\" button. 
You can optionally add comments to this pull request if you desire. Click on the \"Create pull request\" button once more to create and open the pull request. The pull request is now open. You can even push more commits to the hw3 branch if you need to correct something (before the deadline has passed of course). Therefore, you could also create the PR at the beginning of the homework. Note: DO NOT click on the button that says \"Merge pull request\" until you have received your grade and feedback for that homework. You will lose 3 points if you prematurely merge your PR. Step 4: Submit on Gradescope Your submission is now ready to be submitted for grading on Gradescope . Simply create a zip archive of your submission directory you have created in your Git repository, e.g. submission.zip , and upload it to Gradescope by following the link above. You can use the command zip -r submission.zip submission/ , where the -r option means add files recursively , submission.zip is the name of zip archive and submission is your homework submission directory. Since you track the change history of your work in Git, you should not add *.zip files to your Git history. You can simply ignore such archives by adding the line *.zip to your .gitignore file in your repository root. Pair-programming Workflow Exercises performed during pair-programming sections should be put under version control similar to homework assignments (see the Homework Workflow section above). You must not branch off and create a pull-request for pair-programming exercises . Just add and commit your work on the main or master branch and push them to your repository ( make sure you are on the correct branch before you commit! ). The following are the basic rules we apply for pair-programming submissions: Your private repository should contain a directory named lab with sub-directories for each session. The sub-directories should be named ppX where X is the session number. 
Within the sub-directory ppX , place the exercise files that you completed during the pair-programming sections. The exercises must have the name exercise_Y.ext where Y corresponds to the exercise number and ext is the proper extension ( .py , .sh , .c , .cpp ) depending on the exercise. Here is an example of how it may look: The pair-programming exercises will be graded for completeness and help us ensure you are on the right track. You may lose points for the completeness part if you do not follow these two basic rules. Protocol In class we are focusing on command line tools for the development of software projects in computational science. It is important that you get familiar with a small selection of such tools and integrate them in your development process. The pair-programming sections aim at combining some of these tools together to provide you with hands-on experience while developing software. The key is the \"pair\" in pair-programming. The exchange of knowledge between team mates in these pair-programming sections is essential for learning said tools or learning something new you did not know before. The pair-programming works by using a tool called tmate which is based on the tmux terminal multiplexer . It allows for easy sharing of a command line session or a specific instance of a program in read/write and read-only modes via ssh or web browsers. Check out this blog post for more. Text file editing will be performed in any text editor that supports a text-based user interface (TUI). Recommended choices are vim or emacs . If you are mainly working on a Windows operating system, you should install the Windows Subsystem for Linux . A small guide for doing so can be found here . See the How to Launch tmate section below for the steps to launch tmate on your laptop. Note: tmate is perfect for any coding related communication. For example, debugging, work on the project and of course pair-programming. There is no audio channel integrated in tmate . 
If students are remote, a zoom session or similar must be established for oral communication. The exercises in the pair-programming sections are necessarily collaborative. Each member of the group will turn in the same script. Adhere to the following workflow when solving pair-programming exercises: For each exercise (or sub-exercise for big problems), there will be one sharer , one coder , and one listener . This assumes a group size of 3 . If the group only has two people, then either one of you can take the sharer's role. The sharer will start each coding session and document interactions including points of contention and challenges. The coder will be in charge of writing the code. The listener will make suggestions and may offer tweaks from time to time. The sharer starts a new tmate session and invites the other team mates to join the session. Ideally you want to start the session inside the directory of the current pair-programming exercise in your git repository. You may share a read/write link either through ssh or a web browser. Note: the sharer allows the others access to her/his computer. Any abusive behavior that may cause harm on the sharer's system will not be tolerated and will be forwarded to the dean's office. After the team mates have accepted the invite, they will be able to share the terminal instance and can create new files or execute Python together. The team should discuss a strategy on how to approach the exercise. The coder should start writing some code with input from the other two team members. Before each section that you work on, place a comment indicating which team member worked on that section. For example, a bash script could look like this: #!/usr/bin/env bash # File : exercise_1.sh # Created : Sat Aug 07 2021 04:58:49 PM (-0400) # Coder : Alice # Listener : Bob # Sharer : Alice echo 'Hello World' ### Main point of contention: whether to capitalize \"W\" in \"world\" For small exercises, each team member can play a single role once. 
For large exercises, the team members may rotate roles. The exercise will make it clear when you should rotate. At the end of the exercise, the developed code is inside the sharer's git repository that can readily be committed. Links to download these files can be shared with the other team mates such that they can update their repositories as well. Note that the exercise code will contain comments pertaining to who worked on which section. How to launch tmate Disclaimer: tmate is a tool to share a terminal session and interact with other people. If you host a session, it means the instance runs on your local computer and you are in control of how many permissions you want to assign to your mates. There are 2 ways to share a session: read-only: mates that connect to your session can only read data (this is safe provided the data you expose is safe). read-write: mates that connect to your session can read and write data. This is unsafe if you share a session with an untrusted person. Recommended Workflow Launch the CS107/AC207 docker container with the working directory mounted (see the provided run_cs107_docker.sh launch script ) Start tmate inside the docker container Wrapping tmate in a docker container provides another layer of security. You can also install tmate using your distribution package manager (on Linux or homebrew on MacOSX) and skip step 1 if you wish not to use a container. Steps: Assume Docker is installed and we have pulled the CS107/AC207 docker image . You can install the run_cs107_docker.sh launch script in your PATH for convenience (e.g. ~/bin/run_cs107_docker.sh and add this directory to your PATH environment variable). Be sure that the run_cs107_docker.sh script is executable . See the chmod command to change the permissions of the script. Assume you want to work on the PP1 exercise and you are in the lab directory of your private Git repo and pp1 exists. 
Launch the docker container and mount the pp1 directory in your repository: $ run_cs107_docker.sh pp1/ root@0a076feb425f:~# You are now inside a running docker container. Note that the hostname 0a076feb425f is arbitrary and yours will differ. Launch tmate (it is already installed in the container): root@0a076feb425f:~# tmate Tip: if you wish to use tmate only for remote access, run: tmate -F To see the following messages again, run in a tmate session: tmate show-messages Press or to continue --------------------------------------------------------------------- Connecting to ssh.tmate.io... Note: clear your terminal before sharing readonly access web session read only: https://tmate.io/t/ro-qNRV5QRVWkW3qr55sfATkBegr ssh session read only: ssh ro-qNRV5QRVWkW3qr55sfATkBegr@nyc1.tmate.io web session: https://tmate.io/t/nMWurZc7Q6Zbv8EnX2wdhf6GB ssh session: ssh nMWurZc7Q6Zbv8EnX2wdhf6GB@nyc1.tmate.io The tmate instance is now running and you can choose between 4 possible links to share with your mates: 2 that can be run in your web browser and another 2 to be used with ssh in your terminal (either read-only and read/write). Choose the appropriate link you want to share with your pair-programming mates. If you press q or ctrl-c you are dropped back to the shell. The server will tell you whenever mates join. You can print the links again with tmate show-messages (be careful when you are sharing screens on zoom for example). Note that pressing ctrl-d or typing exit in the shell will close the active terminal and if only one is left, also the active tmate session. This will close the connections to all connected clients. You can now work together on the exercise. For example: root@0a076feb425f:~# vim exercise_1.py The image below shows a terminal session (left) and two mates connected in a web browser window (right): Note: We run tmate with root in the docker container. 
Do not run tmate as root in any other situation (even here we could create a regular user) and be careful with password-less sudo (avoid password-less sudo in the first place). In order to use ssh you need to setup an ssh key if you have not done so already. If you do not have such a key, you may create one by running ssh-keygen -t rsa -b 4096 If you are not dropped into a shell after you execute tmate it may be because you are using a shell different than zsh . Install zsh on your system using your package manager and run tmate like this SHELL=/bin/zsh tmate","tags":"pages","url":"pages/tutorials.html"},{"title":"Systems Development for Computational Science","text":"Computation has emerged as the third pillar of science alongside the pillars of theory and experiment. Computational science is maturing rapidly and has found considerable and significant use in supporting scientists from various disciplines (including all engineering disciplines, mathematics, physics, chemistry, finance, biology, and data analysis to name a few). Many burgeoning scientists are still taught to write \"a code\" for some problem and to debug when things look wrong. Given the ever-increasing complexity of software solutions to scientific problems, this old paradigm is no longer tenable and at best inefficient. CS107/AC207 is an applications course highlighting the use of software engineering and computer science in solving scientific problems. You will learn the fundamentals of developing scientific software systems including abstract thinking, the handling of data, and assessment of computational approaches: all in the context of good software engineering practices. The class syllabus can be found by following this link. Teaching Staff The preferred way to reach the teaching staff is described in the Teaching Staff Mailing List section below. 
Instructor Ignacio Becker ( iebecker@g.harvard.edu ) Office: SEC, Office 1.312-05 Office Hours: Wed 5:00-6:00pm Teaching Fellows Fellow Email Office Hours Pair-Programming Sections Kimon Vogt kvogt@g.harvard.edu Sat 8:00-10:00am (Zoom) Mon 8:00-9:15am (Zoom) Fri 8:00-9:15am (Zoom) Yixian Gan ygan@g.harvard.edu Tue 5:00-6:00pm (SEC 6.301+6.302) Mon 6:00-7:15pm (SEC 6.301+6.302) Allison Karp akarp@mde.harvard.edu Thu 9:30-10:30am (SEC 6.301,6.302) Tue 9:30-10:45am (SEC 6.301+6.302) Gekai Liao gekailiao@g.harvard.edu Thu 4:00-5:00pm (MD PierceHall 100F) Tue 3:45-5:00pm (SEC 6.301+6.302) Victor Zhu dunminzhu@g.harvard.edu Mon 10:00-11:00am (Zoom) Thu 6:00-7:15pm (Zoom) Frank Cheng xcheng@g.harvard.edu Fri 4:00-5:00pm (SEC 2.122+2.123) Thu 1:00-2:15pm (SEC 2.122+2.123) Danni Lai danninglai@g.harvard.edu Wed 4:00-5:00pm (Zoom) Tue 7:00-8:15pm (Zoom) Isabella Bossa isabellabossa@g.harvard.edu Tue 10:45-11:45am (MD 223) Thu 8:00-9:15am (MD 123) Tanner Marsh tam997@g.harvard.edu Fri 10:00-11:00am (SEC 2.112) Thu 1:00-2:15pm (SEC 2.122+2.123) Boxiang Wang bwang@g.harvard.edu Mon 1:00-2:00pm (SEC 6.301+6.302) Thu 3:45-5:00pm (SEC 4.405) Shuheng Liu shuheng_liu@g.harvard.edu Mon 7:00-8:00pm (Zoom) Tue 7:00-8:15pm (Zoom) Cyrus Asgari cyrusasgari@college.harvard.edu Wed 3:00-4:00pm (MD 123) Fri 12:00-1:15pm (SEC 2.122+2.123) Haitian Liu hliu3@g.harvard.edu Fri 2:45-3:45pm (SEC 2.122+2.123) Fri 1:30-2:45pm (SEC 2.122+2.123) Legend: SEC : Science and Engineering Complex, Northwestern Av 150, Allston MD : Maxwell-Dworkin, Cambridge Please see Pages section in Canvas for a Google Calendar. Lecture Hours All lectures are of 75 minutes duration. Time is given in Eastern Standard Time (Boston). Lecture attendance is mandatory : Time Room Tuesday 2:15 - 3:30 PM SEC 1.321 Thursday 2:15 - 3:30 PM SEC 1.321 Important Information Canvas: Is used for posting grades and other sensitive content. 
The class can be found on Canvas at this link https://canvas.harvard.edu/courses/122565 Class git repository: All handouts in CS107/AC207 are provided through the main repository hosted in the CS107 organization at https://code.harvard.edu/CS107/main . You can set this repository as an upstream in your private class repository or clone it once you have joined the CS107 organization git clone git@code.harvard.edu:CS107/main.git Updates to the main repository are posted on the class mailing list. Your Harvard ID is required to login to https://code.harvard.edu . You can request membership in the CS107 organization (AC207 students join the CS107 organization as well) by sending an email to cs107-staff@g.harvard.edu (using your .harvard.edu email). You must include your NetID in the body of your email, which is also your https://code.harvard.edu username (something similar to abc123 ). Once you have been added to the CS107 organization, create your own private repository inside the organization. Your private repository must have the exact name as your NetID . This will be your private class repository where you submit your homework and pair-programming exercises. See the following tutorial to help you get started with your git repository: How to setup your private class repository Class Discussion Forum We will use the Ed Discussion forum on our Canvas page as our main communication platform. Questions regarding homework, labs or lecture material must be posted on this forum and you are encouraged to reply to questions if you know the answer or you can share a useful contribution. A fraction of your participation grade is computed by how often you visit and the frequency you post on the forum. Class Mailing List You can optionally sign up to our class mailing list if you would like to be notified whenever there is new class content available in the class git repository. Replies to posts in this list will be sent to all list members. 
To sign up, send an email to: cs107+subscribe@g.harvard.edu (subscribe by sending a blank email to this address; use the email address associated with your HarvardID ) You are required to confirm your subscription. Simply reply to the confirmation email with a blank message to complete the subscription. Teaching Staff Mailing List You can reach the teaching staff directly by sending your email to the following mailing list cs107-staff@g.harvard.edu (email sent to this list is only seen by the teaching staff; only email ending with .harvard.edu is accepted) You are not required to register for this mailing list but only email addresses ending with .harvard.edu are accepted (you will receive a rejection message otherwise). Getting Started Checklist Sign up with the CS107 organization on https://code.harvard.edu/CS107 and create your own private repository inside the organization . Information flow: Canvas → Grades and discussion forum https://code.harvard.edu/CS107 Assignment submissions inside your private repository (homework, pair-programming exercises) Group repositories for project work All course handouts are published in the https://code.harvard.edu/CS107/main repository Need help? → cs107-staff@g.harvard.edu OPTIONAL: Sign up on the class mailing list to receive push notifications when new content is available in the https://code.harvard.edu/CS107/main class repository. You can get an Ubuntu docker container with the necessary class tools by docker pull iacs/cs107_ubuntu . Note that no ssh keys are contained in that image for use with git . See also the docker resources page .","tags":"pages","url":"pages/systems-development-for-computational-science/"},{"title":"Final Deliverables","text":"Due: Saturday, December 10th 2021, 11:59 PM Submission Instructions Your project should be available in your private project repo in the CS107 organization . 
Your submission should be in the following format: teamXX/ ├── docs │ ├── documentation │ ├── milestone1 │ └── milestone2 ├── LICENSE ├── README.md ├── src │ └── ... └── ... Note that src is a generic name used for source code containers. You may choose a different name for the directory that holds your project source code. It should be a concise name. Software Requirements Here are the main requirements for the final project: Working forward mode implementation See the sections below for more specific details Test suite Updated / extended documentation Your updated documentation will be the final package documentation. Please name it documentation . Do not name it milestone3 . New features Working Forward Mode Implementation You must have a working forward mode implementation. Your library should be able to handle real functions of one or more variables. This includes the situation where a user might have multiple functions each of multiple variables. Your library should be able to handle vector functions with multiple real scalar or vector inputs. Minimum Package Requirements The software should be available in your project repository. The software should also be installable via PyPI . You should provide a pyproject.toml or requirements.txt (depending on your packaging choice) file with your software so other developers are able to install the necessary dependencies. After a user installs your package, they should be able to use it without difficulty. Minimum Implementation Requirements The following is a description of a typical use case. A user downloads your package through PyPI . (It is a good idea to use https://test.pypi.org/ instead of the production URL for this class project.) They install the dependencies. They run the tests if they're a fellow developer. They create a \"driver\" script in the top level. Note: How they interact with your package will depend on your implementation. 
The interface and other implementation details should be described in your documentation. The next few steps may sound somewhat abstract, but that is only because they hinge on your specific implementation. In the driver script, they import your package. They instantiate an automatic differentiation object to be used in the forward mode. They use the automatic differentiation objects in their own applications (root-finding, optimization, etc). What Kinds of Functions should be Implemented? All basic operations and elementary functions should be implemented. Basic Operations Addition (commutative) Subtraction Multiplication (commutative) Division Power Negation Comparison Operators It is up to you which comparison operators to implement. Here are some options: __lt__ (less than) __gt__ (greater than) __le__ (less than or equal to) __ge__ (greater than or equal to) __eq__ (equal to) __ne__ (not equal to) You may or may not need them. If you have nodes in a tree you may find it useful to have at least __eq__ and __ne__ for node comparison. Similarly, for dual numbers some of these operators may make sense. We will be interested to see what (if any) uses you find for these operators. Elementary Functions Trig functions (at the very least, you must have sine, cosine, tangent) Inverse trig functions (e.g. arcsine, arccosine, arctangent) Exponentials Should be able to handle any base You can treat the natural base (e) as a special case This is what numpy does. Hyperbolic functions (sinh, cosh, tanh) Note that these can be formed from the natural exponential (e) Logistic function Again, this can be formed from the natural exponential Logarithms Should be able to handle any base. Square root Note that the hyperbolic functions and the logistic function are arguably not elemental functions since they can be formed from the natural exponential. On the other hand, they do form a key ingredient in some algorithms (e.g. 
neural networks) and can therefore be considered as elemental functions. You should implement these functions. Test Suite You should have a test suite that preferably runs with pytest (or unittest ). Your tests should be designed to run unit tests and integration tests if there are dependencies among individual units. Unit tests include class methods (dunder methods are methods of a class) and possibly overloaded functions that are part of a module. The test suite should be designed to integrate with GitHub Action workflows such that you can exploit regression testing. You should have a test harness that allows you to run all your tests easily and compute coverage reports in the same flawless manner. This harness must run for a git repo that is cloned to a local directory as well as in a container if it was run via a third party CI provider. Your project README.md file should contain a badge showing the pass/fail status of your CI builds. The badge should show that your build is passing all tests. You should also have your code test coverage workflow setup. Your project repo should have a badge defined in the README.md file reporting a pass/fail of the test coverage based on the following criterion: Pass if the coverage satisfies 90% or more, fail otherwise. You can implement this behavior in your coverage workflow using shell commands or write a script that is being executed in the workflow. When the criterion is met the script should return a success exit code such that the workflow succeeds. Useful commands are sed and awk . You will need to obtain the total coverage percentage by parsing an output file generated with pytest and the pytest-cov package and then process this information further to decide on the pass or fail criterion. These steps must be done in your CI setup (already done in M2). In addition you are required to publish a GitHub page that hosts an HTML version of your code coverage results. 
The pytest-cov plugin can generate these sources using the --cov-report=html option. An easy way to achieve this is to use this action in your coverage workflow. If you use pytest-cov and this action you must make sure that you remove .gitignore file inside the HTML sources generated by pytest-cov before you use the publish action (otherwise the plugin will push nothing to the gh-pages branch). Your website will be accessible at https://code.harvard.edu/pages/CS107/teamXX/ , where teamXX is your team ID. Documentation Your documentation must be complete, easy to navigate, and clear. Remember to update the Background and How to Use sections of your documentation as you add more functionality to your package, so that the user has a good understanding of what they can do. Call the final form of your documentation \" documentation \". Please update and consolidate all relevant documentation from milestone1 and milestone2 and make any changes suggested by the teaching staff. Your documentation should be a mix of text and hands-on demos. As always, it is up to you and your group to determine the best way to accomplish this (e.g. Jupyter notebook, GitHub README, Sphinx/Read the Docs). You will receive full points as long as you have a docs/ directory and your documentation is complete. However, you may want to consider alternative ways of hosting your documentation. For example: Read the Docs or Sphinx . Documentation Sections The following sections should be present: Introduction Describe the problem the software solves and why it is important to solve that problem. This can be built off of the milestones, but you may need to update it depending on what new feature you proposed. Background The automatic differentiation background can probably stay the same as in the milestones, unless you were told to update it considerably. Be sure to include any necessary background for your new feature. How to use your package How to install? Include a basic demo for the user. 
This can be based off of the milestone, but it may change depending on what your new feature is. You may want to consider more than one basic demo: one demo just for automatic differentiation and one demo for your new feature. Note that this is very much dependent on your final deliverable! Keep the basic demos to a manageable number. Software organization High-level overview of how the software is organized. Directory structure Basic modules and what they do Where do the tests live? How are they run? How are they integrated? How can someone install your package? Should developers and consumers follow a different installation procedure? Implementation details Description of current implementation. This section goes deeper than the high-level software organization section. Try to think about the following: Core data structures Core classes Important attributes External dependencies Elementary functions Your extension To start, copy over the new/future feature section of the documentation from Milestone 2 and update it to reflect the M2 Feedback. Description of your extension (the feature(s) you implemented in addition to the minimum requirements.) Additional information or background needed to understand your extension This could include required mathematics or other concepts Broader Impact and Inclusivity Statement (See Below) Future What else do you want to add? What is missing? Don't just think about mathematical things here. Try to think about applications that you'd like to have use your code. Just about every area of science can use automatic differentiation (physics, biology, genetics, applied mathematics, optimization, statistics / machine learning, health science, etc.). Broader Impact and Inclusivity Statement Include a section in your documentation and in your GitHub README on Broader Impact and Inclusivity. This section should be around a 1/2 page in length and it can be the same between your documentation and your README. 
It should address two points: The potential broader impacts and implications of your software. How is your software inclusive to the broader community? Broader Impact Regarding the broader impact portion, try to think about the ways people will use or misuse your software. What are the consequences? How should people use it responsibly? Are there any ethical implications? The NeurIPS website has a number of references to get you started on thinking about this: Please read through the paper: It's Time to Do Something: Mitigating the Negative Impacts of Computing Through a Change to the Peer Review Process Suggestions for Writing NeurIPS 2020 Broader Impacts Statements A Guide to Writing the NeurIPS Impact Statement Example paper Software Inclusivity In principle, there should be no barrier whatsoever for other developers to contribute to your code base. In practice, these barriers do exist and can be rather subtle. For example, are there any subtle barriers to underrepresented groups? What about working parents? What about people from different countries or non-native English speakers? Do people from rural communities feel they have something to offer? Carefully think about the code contribution process for your software project. How are pull requests being reviewed and approved? Who is reviewing and approving these requests? Python has a Diversity Statement , but it is fairly generic and contains a lot of boiler plate. Different Python groups may have more concrete policies. Can you do better? How this will be graded Both topics in this section are subjective and there is no perfectly correct answer. I am looking for effort and honest attempts to address these issues. Code quality The quality of the written code will be assessed largely based on the material of lecture 14 , slides 32-42 in lecture 8 , object oriented design principles (interface design of your library in addition to dunder methods) as well as lectures 3 and 4 dedicated to version control (VCS). 
To give you an idea of what this means, here are a few practical examples: Are tests separated from source code? Why: when you deploy your project, you do not want to ship development-related code. Having tests separated simplifies the packaging process of your source code (lectures 8 and 14). What does your git commit history look like? Why: one of the first things a new developer who has been added to your project group does is browse through the commit history of your source base. The information that can be extracted in that process depends on how descriptive your commit messages are. Is the commit subject a concise description or just a mere \"Fixed bug\"? Are the commits of reasonable size with logical contributions of code or do they mix many things together (could also be an indicator of last-minute commits)? How is the distribution of commits among group members? How well is your code covered? Why: an important quality metric of your code is coverage. While in the \"Test Suite\" section above we assess the correct usage of pytest and/or unittest as well as the structure of your tests (unit tests, integration and regression), the quality of these tests is assessed here. How high is the line coverage (at least 90%)? Do your tests take into account boundary/edge cases? Can we make your code produce side effects? If that is the case your tests are not robust, even if you achieved 90% minimum coverage. A high coverage percentage does not necessarily imply good tests. Do you use docstrings for your modules, functions, classes, methods? Why: this is the main tool to document your code . High-quality code necessitates consistent docstrings throughout your code base. The \"Documentation\" section above requires you to document your project and usage of your library with demos and examples. The documentation of your code is assessed here. A nice way to combine the docstrings with your documentation is the sphinx package (this is optional for your project). 
Docstrings are further required for a proper working of the pydoc tool that will be utilized by arbitrary users of your library. Ideally, well written docstrings are decorated with short doctests to illustrate example usage of the code (see lecture 14). How are comments utilized in the code? Why: these are important for your fellow colleagues who collaborate or continue development. It is very easy to 'forget' about them because to you (at the time you write the code) the intention is clear. This will not be the case for the other person or you in two weeks from writing the code (especially for complex sections in the code). It is important to keep that actively in mind when writing code and it is a major contributor to high quality code. How accessible is your library (through the interface)? Example: in many cases we are interested in gradients (e.g. Newton's method, learning or optimization). Does your library provide such an interface that returns the gradient of a function f(x) or is the user required to 'build' the gradient by multiple library calls if x is higher dimensional? What are the returned types? Example: following the gradient example above, is the returned type a numpy array or a naive python list? Since gradients are often used in numerical computation, which type would provide more value for your end user? Is your code formatted consistently? Why: a consistent formatting is very important for readability of your code. Do you use consistent indentation in your python code? Are comments formatted consistently throughout? Docstrings should also have a consistent formatting. It is the same with typography in printed media. The font family, font size or font color does not arbitrarily change because it would be very hard on the eye to read such a document. The same applies to code. See slides 6-8 in lecture 6. Final Deliverables The deliverable for the project is a video that describes all of the work done on your project throughout the semester. 
You are free to decide upon the format of the video yourself (with some restrictions - see below). Some ideas include: an actual video of your group presenting in front of a screen a narrated video of presentation slides with diagrams and illustrations a narrated live code demo a mix of all of the above You may want to consider tools such as asciinema . We won't be judging you on the quality of your camera or your video editing ability so don't worry about that. We will judge you only on the content you present. However, if the quality of the video is so poor as to prevent us from judging the quality of the work, then we will unfortunately need to deduct points. Video Requirements The video should be narrated by all members of your group. Every group member should speak an equal amount in the video. For a group of n people, you should change speakers exactly n-1 times. (i.e. you should all speak exactly once during the video - you shouldn't continuously change who is speaking). The Introduction / Background and Implementation details/Software organization/How to use sections should contain information related to the minimum project requirements only. If you want to provide additional Introduction / Background or Implementation details/software organization/how to use for your extension to the project, you should do this in the Your additional feature(s) and extension section of the video. Section Minimum Length Introduction/background 2 minutes Implementation details/Software organization/How to use 4 minutes Your additional feature(s) and extension 5 minutes Future work/possible extensions 2 minutes Video Submission instructions Your video will be submitted to a group assignment on Canvas. Your video should be at maximum 15 minutes. WARNING: DO NOT exceed the time! We will not grade portions of your project that exceed the time. Make sure the title of your video includes your teamXX ID. 
Things to keep in mind Remember, the teaching staff already has full access to your code, so there is no need to focus on small implementation details. DO NOT include snippets of your actual library code in your presentation! Pseudo-code and flowcharts can be very useful to give the big idea of how your package works. Library demos can be very useful, but be careful. If they don't work well then you'll waste all your video time. You should provide sufficient background for the project. Don't overdo the mathematical details for automatic differentiation. We are already familiar with them. Instead, provide the big ideas behind automatic differentiation and the motivation for using it. Spend a fair bit of time on your new feature. You may need to present some mathematical background to get your audience oriented, but this will depend on your extension. Be sure to conclude with future work and possible additional extensions. Grading Breakdown Points Task 20 Complete forward mode functionality (see above) 15 Documentation, Test Suite & Coverage with GitHub page (see above) 4 Broader Impact Statement 20 Video Presentation 30 New Feature (see above) 15 Code quality (see above) 104 Total","tags":"Project","url":"project/FD/"},{"title":"Milestone 1A","text":"Due: Thursday, September 22nd, 11:59 PM You will now begin your final project to develop a Python package for automatic differentiation. Please get together with your project group and complete the tasks below for Milestone 1A. Steps to complete Find team members you would like to work with and establish a way to communicate. Once your team is complete, request a team ID from cs107-staff@g.harvard.edu . Your team ID will be team01 if you are team 1 or team10 if you are team 10 and so on. The project code will be hosted in private repositories in the CS107 organization at code.harvard.edu . 
Once you have received your team ID from item 2, a member of your team creates a private repository in the CS107 organization at https://code.harvard.edu/CS107 named after your team ID (e.g. team01 if you are team 1) and adds its team members to the repository (you do not need to add the teaching staff, we will have access already). Final Deliverables Form a project team and request team ID from teaching staff. Create a private team repository in CS107 organization. Grading breakdown Points Task 1 Team formation 1 Creation of team repository 2 Total","tags":"Project","url":"project/M1A/"},{"title":"Milestone 1B","text":"Due: Tuesday, October 4th, 11:59 PM You will now further configure your group repository Git Conventions We expect all work from this point onward to be done on feature branches and merged into master or main via Pull Requests. Try to work with different branches and \"approve\" each other's pull requests by reviewing their code and then merge into your default project branch. You must work with your project Git repository. The teaching staff will frequently check the history of your project. Steps to complete Within your project repo, you must set up two workflows with GitHub Actions . One workflow will be used for tests and the other for code coverage . You will need two .yml files in the .github/workflows directory in your project repository. The .yml files do not need to have meaningful declarations at this point but you should have at least the name: option and the on: option defined. See this link for more details. Make sure the README.md file at the root of your repo includes badges indicating whether your CI workflows are passing or failing. Your workflows are expected to be failing at this point. You should end up with a rendered README.md file that looks like this (workflows may fail or have no status ): In the root of your project repo, you should create a directory called docs . 
You can use this directory to organize documentation and tutorials for your final package. You will begin creating this documentation as part of the next milestone 1. Grading breakdown Points Task 1 Configuring test action 1 Configuring coverage action 1 Creating project structure 3 Total","tags":"Project","url":"project/M1B/"},{"title":"Milestone 2A","text":"Due: Tuesday, November 1st, 11:59 PM Submission Instructions Please update your milestone1 submission to reflect the feedback given by the teaching staff. Add a Feedback section to your milestone1 document which outlines the feedback given and the updates that you made to address the feedback. One convenient way to organize this would be to create a section titled Feedback and a subsection titled Milestone 1 . Then, under this subsection, you can create a list of the individual comments that you received along with your response on how you addressed each comment. If you receive feedback on the README.md file, please update that document as well. Your submission should be in the same format as milestone 1: teamXX/ ├── docs │ └── milestone1 ├── LICENSE ├── README.md └── ... Grading Breakdown Points Task 1 Adding Milestone1 Feedback to milestone1 doc 1 Updating milestone1 submission to reflect feedback 2 Total","tags":"Project","url":"project/M2A/"},{"title":"Milestone 2B","text":"Due: Thursday, November 10th, 11:59 PM Submission Instructions Milestone 2B is a progress report for Milestone 2 . Specifically, for each member of the group, please answer the following questions. What tasks has each group member been assigned to for Milestone 2? What has each group member done since the submission of Milestone 1? This progress report should be no longer than 1/2 page (maximum 1 page). Its main purpose is to help your group structure project work and delegate tasks. We will be specifically checking to make sure there is equitable distribution of tasks. 
All members of the group should contribute code to the core library and all members should contribute to the documentation. Your submission should be in the following format: teamXX/ ├── docs │ ├── milestone1 │ └── milestone2_progress ├── LICENSE ├── README.md └── ... Grading Breakdown Points Task 2 Progress report completed 2 Total","tags":"Project","url":"project/M2B/"}]} \ No newline at end of file