diff --git a/docs/project/M6/index.html b/docs/project/M6/index.html
index ccf2d47..a2e1bcd 100644
--- a/docs/project/M6/index.html
+++ b/docs/project/M6/index.html
@@ -158,7 +158,7 @@

Final Deliverables

For this milestone, all final deliverables should be uploaded exclusively to the Github page. There's no need to send emails.

   1. Publish your library on Test PyPI. Include the link in the main branch's README file.
-  2. A Jupyter notebook that showcases how your library works.
+  2. A Jupyter notebook that showcases how your library works. Place it in a new tutorial/ directory on the dev branch.
   3. Record and upload a video presentation demonstrating the library's installation and functionality.
   4. Each team member's self-evaluation in the dev branch README.
diff --git a/docs/tipuesearch_content.js b/docs/tipuesearch_content.js
index e6510ee..cb63ed9 100644
--- a/docs/tipuesearch_content.js
+++ b/docs/tipuesearch_content.js
@@ -1 +1 @@
-var tipuesearch = {"pages":[{"title":"CS107/AC207 Project","text":"Project Overview Goal You will develop a software library for a client, the teaching staff. The development of this library will leverage modern software development practices covered in the course. By the end of the semester, the client should be able to easily install and run your package. Topic The project topic is spectral analysis , which consists of the analysis of data obtained from publicly available sources currently used by professional astronomers to perform state-of-the-art research. Moreover, spectral data appears in many fields of science and engineering, and you are likely to encounter it in your professional careers. Your final project is to write a Python library. Your library is not required to have every module implemented; that would simply be too much for a single semester. However, your library should meet the basic project expectations outlined in the Software Requirements Specification (SRS). Project Milestones The following weight table is used for individual milestones of the project. The individual milestones make up the final project grade listed under the Grading section in the syllabus. Additional milestones will be included in the near future. The due date for the final milestone is December 14th 2023, 09:59 PM. The due date for the final milestone is December 17th 2023, 09:59 PM. Milestone Due Total Points Milestone 1 Thu, November 2nd, 09:59 PM 1 Milestone 2 Thu, November 9th, 09:59 PM 1 Milestone 3 Tue, November 14th, 09:59 PM 21 Milestone 4 Tue, November 28th, 09:59 PM 23 Milestone 5 Mon, December 11th, 09:59 PM 55 Milestone 6 (Final) Sun, December 17th, 09:59 PM 225 + 15X Total 326 + 15X Groups You will work in groups of 4-5 students. 
You are free to choose your project partners but groups sizes must consist of the number of students mentioned before. Some members of the group will be stronger than others. It is expected that you work together and help each other as needed. This is an opportunity for less experienced coders to improve their skills by working with more experienced coders. Every person must contribute. Expectations This project encompasses several mandatory requirements, detailed under basic expectations and within Annex A of the Contract. Furthermore, the project includes supplementary elements, specified under additional expectations and delineated in Annex B of the Contract. Basic Expectations Python library that can be used for astronomical spectral analysis. The library must comply with the API described the Contract. The client should be able to easily install the library, run the tests, access the documentation, and use the library for their application. Documentation for every subsystem in the project must be provided. Link to the docs from the README.md in each folder. The top level README.md should contain an overview, links to other docs, and an installation guide which will help us install and test your system. The project must utilize a proper packaging system for distribution and installation of the library. The project must ship with a test suite. Documentation on how to run the tests is mandatory. Additional Expectations In addition to the basic requirements of the library, you must also extend your package with at least two additional modules. Cross-Matching Machine Learning Interactive Visualization Spectral Feature Extraction You are more than welcome to pitch your own idea, which must be approved by the Teaching Staff.","tags":"pages","url":"pages/project.html"},{"title":"Resources","text":"Books No book is required. But we highly recommend two books for this course. Fluent Python: Clear, Concise, and Effective Programming, by Luciano Ramalho. 
Publisher: O'Reilly Media. 2015. Designing Data Intensive Applications, by http://dataintensive.net/ , The Big Ideas Behind Reliable, Scalable, and Maintainable Systems by Martin Kleppmann. Publisher: O'Reilly Media 2014 Other useful books The Practice of Programming by Brian W. Kernighan and Rob Pike, Addison-Wesley, 1999. Skiena: The Algorithm Design Manual Abelson, Sussmann and Sussmann: SICP and python based online version based on it: http://composingprograms.com/ High Performance Python: By Micha Gorelick, Ian Ozsvald. Oreilly Media 2014. Evaluating Derivatives: Principles and Techniques of Algorithmic Differentiation by Andreas Griewank Papers and other readings Python pep8 An opinionated guide to python style Git Recommended: Git from the bottom up Recommended: Git Book GitHub Videos and Training GitHub Interactive Tutorial Git - the simple guide Git Reference Git Cheat Sheet Git Immersion Tutorial Git Atlassian Tutorial Python python Rich overview of Python 3 language features (recommended to work through) Scientific visualization with Python and Matplotlib C/C++ Fall 2021 C/C++ primer material C Tutorial C++ Tutorial C++ Cheat Sheet C++ Reference Vim Spend 30 minutes to complete the vimtutor . After you have installed vim , execute the following command in your command line: vimtutor Vim Cheat Sheet Vimcasts Recommended book Bash Command Line Reference Cheat Sheet Bash scripting Cheat Sheet Unix-Related Basic Computing Tools Windows Users Using Linux Subsystem on Windows 10 PuTTY SSH client for Windows Ubuntu Docker Image You can get an Ubuntu based Docker container with docker pull iacs/cs107_ubuntu The container is hosted here . The Dockerfile and run_cs107_docker.sh launch script can be found in the class repository .","tags":"Resources","url":"pages/resources.html"},{"title":"Schedule","text":"","tags":"pages","url":"pages/schedule.html"},{"title":"Schedule","text":"All due events with a given date are due on 09:59pm that day . 
Wk Tuesday Thursday Labs Events 1(35) Lecture 1: 2023-09-05 Class introduction/organization History of Bell Labs, Unix and Linux Command line introduction Lecture 2: 2023-09-07 More command line Pipes Regular expressions File attributes 2(36) Lecture 3: 2023-09-12 Command line customization I/O redirection Environment variables Shell scripting Process management Lecture 4: 2023-09-14 Version control systems (VCS) Centralized and distributed models Intro to Git PP01: (2023-09-12) Setup private class repository, tmate . 3(37) Lecture 5: 2023-09-19 Version control systems (VCS) Managing repositories Remote repositories Branching Lecture 6: 2023-09-21 Python basics Objects and Functions Environments Closures PP02: (2023-09-18) Bash scripting, Git workflow. Note: PP01 deadline (2023-09-22) 4(38) Lecture 7: 2023-09-26 OOP in Python Classes Inheritance Polymorphism Lecture 8: 2023-09-28 Python data model Dunder methods Software licenses PP03: (2023-09-25) Git local branches, merge conflics and merge tool. Note: HW1 deadline (2023-09-27) PP02 deadline (2023-09-29) 5(39) Lecture 9: 2023-10-03 Classes and methods Modules and packages Python Package Index Lecture 10: 2023-10-05 Databases SQL SQLite PP04: (2023-10-02) Python closure, fully connected neural networks. PP03 deadline (2023-10-06) 6(40) Lecture 11: 2023-10-10 Databases: OLAP & OLTP SQL: Joins Lecture 12: 2023-10-12 SQL Joins Pipelines Case Study PP05: (2023-10-10) SQL and SQLite in Python. HW2 deadline (2023-10-13) PP04 deadline (2023-10-13) 7(40) Lecture 13: 2023-10-17 Pipelines Software systems Documentation Lecture 14: 2023-10-19 Testing PP06: (2023-10-16) SQL and pipelines. 
PP05 deadline (2023-10-20) 8(40) Lecture 15: 2023-10-24 Testing revisited Exeptions Test coverage Lecture 16: 2023-10-26 Continuous integration PP07: (2023-10-23) Documentation and testing Quiz #2 deadline (2023-10-25) HW3 deadline (2023-10-27) PP06 deadline (2023-10-27) 9(40) Lecture 17: 2023-10-31 Containers Virtual environments Docker Lecture 18: 2023-11-02 Data structures Linked lists Iterators PP08: (2023-10-30) Package deployment PP07 deadline (2023-11-03) 10(40) Lecture 19: 2023-11-07 Binary search trees Tree traversal Priority queues Lecture 20: 2023-11-09 Heaps PP09: (2023-11-06) BST, Docker images HW4 deadline (2023-11-10) PP08 deadline (2023-11-10) 11(40) Lecture 21: 2023-11-14 Generators Coroutines Lecture 22: 2023-11-16 Python internals Memory PP10: (2023-11-13) TBD PP09 deadline (2023-11-17) 12(40) Lecture 23: 2023-11-21 CATCH UP lecture Thanksgiving break: 2023-11-23 No PP11 PP10 deadline (2023-11-17) 11(40) Lecture 24: 2023-11-28 Performance Lecture 25: 2023-11-30 Project work PP12: (2023-11-27) TBD Quiz #3 deadline (2023-11-30) 11(40) No lecture: 2023-12-05 Work on the project Work on other projects Rest and relax Reading period: 2023-12-07 PP12 deadline (2023-12-01) 11(40) Final exam period: 2023-12-12 Final exam period: 2023-12-14","tags":"pages","url":"pages/schedule_static.html"},{"title":"Syllabus","text":"Course Objective The primary goal of this course is to teach you how to develop effective software for scientific applications. In order to achieve this goal, there are several non-negotiable topics that must be included in the course. We will be concerned with two primary thrusts: System and Software Engineering and Language . Moreover, we aim to provide you with a suite of modern software development techniques and workflows. Learning Objective After successful completion of this course, you will be able to: Use Python, including its advanced features to write scientific programs. Have a basic idea how the Python interpreter works. 
Understand what features of Python make up its language execution model and how these features impact the code you write: e.g. how modularity, abstraction, and encapsulation can be used to solve problems. Write programs with good software engineering practices. These practices include: working on remote machines, version control, continuous integration, documentation and testing. Utilize data management techniques to store data, starting from a good understanding of data structures to databases. Combine these techniques together to write large pieces of software working in a team. Develop pipelines to integrate data aquisition and processing. Evaluate and test software as part of the development process. Be able to contribute on both the science and software engineering sides of things. Prerequisites You should have some basic familiarity with programming (functions, variables, constants, differences between integer and floating point, etc.) at the level of CS50. Some comfort with a tool to edit text files is beneficial. Any text editor or IDE will suit this purpose. The student should have passed a basic calculus class. The lectures will review the necessary fundamentals required to succeed with the class project. Besides this, you should have interest or investment in scientific computing. You can download Homework 0 for self-assessment here (not graded). You do not need to be able to solve all problems in order to take this class. Jupyter Notebooks Jupyter notebooks are great for code prototyping and learning how to use new features and APIs. However, they are not suitable for large software development projects! One reason for this is because code development in Jupyter notebooks is a nonlinear development process and there is presently no good solution for version control of Jupyter notebooks. A second reason is the question of efficient source editing. A helpful tool to convert (back and forth) Jupyter notebooks to pure python code is Jupytext . 
Homework assignments and lecture exercises turned in as Jupyter notebooks will not be graded. Textbooks There is no required course textbook. However, the course content will draw from various sources. We will cite the source when appropriate. Please consult the resources page for recommended textbooks and additional helpful material. Course Format The delivery of course content will occur via two weekly lectures as well as weekly pair-programming sections. Attending these sessions is mandatory . Lectures will consist of considerable interaction and discussion and will be greatly enhanced by student participation. The course contains the following main components: Lectures: Deliver the main content of the class. Attendance is mandatory. Quizzes: Graded in-class quizzes intended to assess the learning progress. Pair-programming: Pair-programming (PP) sections offer practice on topics addressed in class and help assess the skills to program in a collaborative environment. Attendance is mandatory. Homeworks: Homework assignments deepen the lecture material and include coding exercises. Exercises may be of theoretical or practical nature. Projects: The class is accompanied by a project (teams of 4-5 students) to practice the methods learned in class on a real Python application. The project topic is given by the teaching staff. The main programming language taught throughout the course is Python. Grading The following weight table is used for individual components of the class. The class does not have standard midterm or final exams. Total Weight Homework (5 Homeworks) 35% Project 35% Quizzes (3 Quizzes) 15% Pair-programming (11 sections) 15% Homework There are 5 homework assignments where each contributes equally to the final grade. The homework is focused on the topics discussed in class and involves programming and theoretical work. The teaching staff is determined to return solutions and graded assignments with feedback after the due date. 
It is your responsibility to check the consistency between your graded work and the assignment solution. You have the option to address possible inconsistencies in office hours or request a regrading for the assignment (see the homework grading inconsistencies section below). Homework will be released on the CS107/AC207 class repository . Push notifications for that repository will be distributed through the class mailing list . Homework will be graded on a 100 point scale: 100 = Solid / no mistakes (or really minor ones) 80 = Good / some mistakes 60 = Fair / some major conceptual errors 40 = Poor / did not finish 20 = Very Poor / little to no attempt. 0 = Did not participate / did not hand in Homework Submission Homework must be submitted via commits in your private git repository hosted in the CS107 organization at https://code.harvard.edu/CS107 . Grading and feedback for homework is done through the Gradescope platform which is connected to the class' Canvas site . Your homework solutions must therefore be zipped and uploaded in the Gradescope section of the class canvas. See the homework workflow tutorial for more details. The homework due date is indicated on the problem sheet and displayed in the schedule as well as shown on Canvas and Gradescope. Homework submissions will be graded on: Correctness: your code must run and must produce the correct result. We are not debugging issues when grading submissions. Presentation: presentation means structure and readability. We expect you to write high-quality, readable and tested code. A quality code is well commented in places where it is not straight forward to deduce the logic from code itself (from the reviewers perspective). We expect you to think about aspects such as modularity, reusability, code duplication and error handling when you design and write code. 
Presentation of results also means that unnecessary or superfluous files like editor backup files or other unrelated data should not be included in the submission commits (use .gitignore for this purpose). See the following tutorials to help you get started with homework submissions: How to setup your private class repository (onetime setup) Homework workflow Homework Late Days Homework submissions are accepted before the deadline of the assignment is due. You have three late days at your disposal that can be consumed for late submissions and two consecutive late days can be used at most for any of the homework assignments. Please note that any commits on your homework branch pushed after the deadline has passed are not considered for grading by default. If you wish that we consider a late commit for grading, please contact the teaching staff at cs107-staff@g.harvard.edu with appropriate explanation. This will count towards your late day budget. It is your responsibility to plan your work ahead and commit on time. If you have consumed all your late days and you have another late submission, it is in your benefit to still commit the work. We assume the Harvard Honor Code for all late submissions in case solutions are already posted. If you have a verifiable medical condition or other special circumstances that interfere with your coursework please let us know via cs107-staff@g.harvard.edu as soon as possible. Homework Grading Errors If you believe there is an error in your assignment grading, you can submit a regrade request through the Gradescope platform . Note: The entire assignment will be regraded. This may cause your total grade go up or down . An assignment can only be regraded once . Regrade requests are due within 2 days after the release of the grades . Project Please see the project section for more details. Quizzes There are 4 quizzes out of class which are graded and intended to assess the learning progress. 
Each quiz addresses topics from the lecture material . Quizzes are open book/ www and include multiple choice questions with at most back of the envelope calculations. Quizzes contain around 15-20 questions and take 30 minutes. They are accessible on canvas within a 24 hour time window from 8pm. Note: if a quiz takes 30 minutes and you start the quiz on 8:50pm, you will have only 10 minutes to work on the quiz. Please see the class schedule as well. Pair-Programming Sections Pair-Programming will form an essential part of the course. Pair-programming will take place in mandatory pair-programming sections led by members of the teaching staff. You are required to sign-up for your preferred pair-programming section at the beginning of the semester. You are expected to attend your chosen section during the semester. Should you not be able to attend one of your sections, please coordinate with your section TF to attend another section this week in order to obtain the attendance credit. In CS107/AC207 we are focusing on command line tools for the development of software projects in computational science. It is important that you get familiar with a small selection of such tools and integrate them in your development process. The pair-programming sections aim at combining some of these tools together to provide you with hands-on experience while developing software. The key is the \"pair\" in pair-programming. The exchange of knowledge between team mates in these pair-programming sections is essential for learning said tools or learning something new from your peers. Pair-programming Submissions Pair-programming exercise solutions must be submitted in your private git repository hosted in the CS107 organization at https://code.harvard.edu/CS107 . Only commits made on or before the due date will be considered for grading . The deadline for submission is usually one week after the last section for the exercise. 
Given this extra time for completion, late days do not apply to pair-programming exercises . The submission due date is indicated on the problem sheet and displayed in the schedule . As you are working in groups of 3-4 students for the lab exercises, the solution files you come up with in the group are submitted by each group member individually in her/his own private Git repository. Pair-programming submissions will be graded based on the following criteria: Attendance: your attendance will be recorded by the TF who leads the section. Joining the section at the beginning and then leaving 10-15 minutes later will not reward attendance credit. If you need to leave because of another appointment then it is expected that you communicate beforehand and coordinate with your TF. Please see the attendance policy section below as well. Your pair-programming session is determined at the beginning of the class by choosing lab sections in my.harvard . You can select your preferences depending on your schedule. Once determined, you can lookup your session details in the https://code.harvard.edu/CS107/main/blob/master/lab_groups.xls sheet. Completion: pair-programming submissions should reveal effort that the student attempted to solve the tasks. If you experience difficulties in a particular problem and you are not able to complete the task, please indicate the issues you had in your code using comments; the teaching staff will take that reasoning into account. Handing in an empty skeleton (same as hand-out) does not meet the expected standard and will not award credit for the submission. See the following tutorial to help you get started with pair-programming submissions: Pair-programming workflow Office Hours The teaching staff holds weekly office hours. Office hour times and locations are listed on the class main page. Office hours offer an opportunity to review course materials and receive additional guidance on your homework. 
Please see the following file in the class git repository for the details: https://code.harvard.edu/CS107/main/blob/master/office_hours.xls Attendance Policy Attendance at lectures and pair-programming sections is mandatory as they are core parts of the class. Pair-programming sections (labs) will be held on weekdays that we determine at the beginning of the class according to a best fit of the students' individual schedules for the term. You are required to attend the labs on the assigned day. Rescheduling of a lab to a different day due to an unforeseeable event must be coordinated with the responsible TF by sending an email to cs107-staff@g.harvard.edu . To be excused from a lecture or a lab, we ask you to follow the Harvard Honor Code and send an email to cs107-staff@g.harvard.edu at least one day before the lecture or lab. Lecture recordings are available only when students are excused for a lecture. Collaboration Policy You are welcome to discuss the course material and homework with others in order to better understand it, but the work you turn in must be your own (with exception of the project where collaborative work is permitted). Any work submitted as your own without properly citing the original author(s), is considered plagiarism. Failure to follow the academic integrity and dishonesty guidelines outlined in the Harvard Student Handbook will have an adverse effect on your final grade. This includes the removal of copyright notices in code. You may not submit the same or similar work to this course that you have submitted or will submit to another without permission. The teaching staff may use tools to compute correlations between submitted work. Use of AI Models Purpose of Policy: This policy outlines the acceptable use of AI models, including but not limited to ChatGPT, in completing assignments for this course. Policy Guidelines: Original Work: Students are expected to complete assignments using their original thoughts and interpretations. 
AI models can be used to help understand concepts, generate ideas, or learn about different perspectives, but they should not write or complete assignments for students. Collaboration with AI: Students may use AI models for brainstorming or generating preliminary ideas, but the final work submitted must be substantially their own. Students should be able to explain their reasoning, logic, and conclusions without relying on the model's output. Restrictions for Specific Assignments: There may be specific assignments (e.g. quiz part of the midterms) or parts of the course where the use of AI models is entirely prohibited. These restrictions will be clearly stated in the assignment guidelines. Ethical Considerations: Students are encouraged to approach the use of AI with ethical considerations in mind, including issues related to privacy, bias, and authenticity. Consequences for Non-Compliance: Failure to adhere to this policy may result in academic penalties as outlined in the course's academic integrity policy. Questions and Clarifications: If students have questions about the appropriate use of AI models in an assignment, they should consult the course instructor or teaching assistants before proceeding. Please refer to the University's policy for further information. Accessibility If you have a documented disability (physical or cognitive) that may impair your ability to complete assignments or otherwise participate in the course and satisfy course criteria, please contact the teaching staff or directly the Accessible Education Office to receive an AEO letter that will authorize us to help you with corresponding accommodations. Diversity Statement All participants in this class are expected to foster empathy and respect towards each other. This includes instructors, teaching staff or students. 
The motivation to take this course shall be to experience the joy of learning in an environment that allows for a diversity of thoughts, perspectives and experiences and honors your identity including race, gender, class, sexuality, religion, ability, etc. Any constructive feedback for improving the class environment is welcome and I encourage you to reach out to the instructor or teaching staff with any concerns you may have. If you prefer to speak with someone outside of the course, you may find helpful resources at the Harvard Office of Diversity and Inclusion .","tags":"pages","url":"pages/syllabus.html"},{"title":"Tutorials","text":"How to Setup your Private Class Repository Steps to Setup Your Private Class Repository Add an SSH Key to Your Account Homework Workflow Example Homework Workflow Step 1: Branch Off Step 2: Solving the Homework Step 3: Create a Pull Request Creating a Web Pull Request Step 4: Submit on Gradescope Pair-programming Workflow Protocol How to launch tmate Recommended Workflow How to Setup your Private Class Repository All of your work in CS107/AC207 will be committed in your private class git repository hosted in the CS107 organization at https://code.harvard.edu/CS107 . (The class project will be hosted in another repository in the same organization, see Milestone 1 for this separate task.) This tutorial walks you through the steps to create your private class repository. If you have already created git repositories on GitHub, then there is nothing new to learn in this tutorial and you should be familiar with the process already. A note on https://code.harvard.edu/ : this is an instance of a GitHub Enterprise edition hosted by Harvard University. The user interface is identical to the public GitHub site. The main difference is that https://code.harvard.edu/ is owned by Harvard University , whereas GitHub belongs to Microsoft which gives rise to security concerns regarding data belonging to classes held at Harvard University. 
Steps to Setup Your Private Class Repository Obtain your Harvard NetID Send an email to cs107-staff@g.harvard.edu (using your .harvard.edu email) to request access to the CS107 organization . Include your NetID from step 1 in the body of the email and choose an appropriate subject line. Once added to the organization, navigate to https://code.harvard.edu/CS107 (login if necessary) and click the green \"New\" button to add a new repository. Your repository must be named after your NetID . You can add an optional description if you like. Make sure the private radio button is checked and click \"Create repository\". You do not need to check any other options. This is all you have to do for now. In the first homework we will focus on how to setup your new repository such that you can work with it from your laptop (you can skip the landing page after you have created the repository). When you navigate back to https://code.harvard.edu/CS107 you should see something similar to this: The blurred repository is your private class repository that was the focus of this tutorial. The main repository is the main CS107/AC207 class repository which is used to distribute all of the class material during the semester. Any updates to this repository will be broadcast via email message such that you will not miss out on new material. In the first homework we will set this repository as an upstream such that you can conveniently unpack class material into your private repository. Note: private repositories are only visible to you within the organization. Please do not create other repositories in the https://code.harvard.edu/CS107 organization. You have your own user account on https://code.harvard.edu/ just like you have on GitHub or other providers. Your user account requires your Harvard login credentials and is a good alternative to hosts like GitHub. Feel free to create as many repositories in your user account as you like. 
Add an SSH Key to Your Account In order to access content on https://code.harvard.edu using Git you need to setup an SSH key. Check if you already have the file ~/.ssh/id_rsa.pub (assuming RSA). If you do not have such a file you can create one with ssh-keygen -t rsa -b 4096 Choose the default location by just hitting enter. You may enter a password for the key or just hit enter to go without password. If go with password you will have to enter it every time you use the key. To upload the public key to your Harvard GitHub account , click on your icon in the top right corner on your https://code.harvard.edu page, then click on \"Settings\" and then \"SSH and GPG keys\" in the left panel. Alternatively use this link https://code.harvard.edu/settings/keys . Click on the green \"New SSH key\" button in the top right corner and give your new key a title (e.g. the name of your laptop). In the key field paste the contents of your public key found in ~/.ssh/id_rsa.pub . Use for example cat ~/.ssh/id_rsa.pub and copy paste the output into the \"Key\" field on your GitHub page. You are now able to access any repositories on https://code.harvard.edu with corresponding permissions. Never share your private key ~/.ssh/id_rsa with anybody. Note: do not create a key in the class Docker container since the key will be lost when you exit the container. For security reasons, sensitive keys like this should not be put in containers. Homework Workflow The following are the basic rules we apply for homework submissions: Naming convention for homework directories: your private repository should contain one homework directory on the repository root with hwX sub-directories for each homework assignment. The X in hwX is to be replaced with the assignment number. For example hw1 , hw2 and so on. Which files will be considered for grading: within the sub-directory hwX , place the assignment files that you want us to grade in a directory called submission . 
We will only grade data in these directories . Pull request (PR): your homework assignments must be completed on git branches called hwX , where X is again to be substituted with the assignment number. Your homework X submission requires an open pull request to merge the hwX branch into your main (or deprecated master ) branch for full points (both branches are inside your private class repository in the CS107 organization ). Some implications of this: Solving homework on the main or master branch is always wrong. For each homework submission you need to issue one open PR. Merging an open PR before the teaching staff has reviewed and graded your work will make the PR disappear . Only files inside submission in PR X will contribute to your hwX grade (see next item). Gradescope: your homework will be graded on the Gradescope platform that has been setup and linked to the class canvas page. The platform does currently not support submission directly via your Git repository. You therefore have to create a zip archive of your submission directory created in step 2 above and upload the archive on Gradescope . It is important that you zip-up the directory and not individual files inside. You can use the command zip -r submission.zip submission/ , where the -r option means add files recursively , submission.zip is the name of zip archive and submission is your homework submission directory from step 2 containing your solution. This assumes you are in homework/hwX inside your Git repository. Points will be lost if any of these requirements are violated . The teaching staff will review the open PR for each homework and grade your work accordingly. Grades will be released on canvas and feedback is provided through the Gradescope platform. Once you have received the grade and feedback, your open PR for homework X can be merged into your main or master branch if there are no more pending issues. After the PR is closed, you may delete the hwX branch in your repository. 
This concludes a homework submission. Example Homework Workflow This example is intended to help you internalize the three basic rules described above. Each homework awards 10 points by performing these steps correctly. Note: Specific instructions provided in each homework assignment may override the following basic approach. Suppose we want to work on homework assignment 3, which consists of 4 problems. Step 1: Branch Off The ease of branching is the main strength of git . Branches allow you to be destructive without affecting production code or data. The reason we solve homeworks on individual branches is to help you develop a feel for this protection and to materialize the required steps to create branches. Branches will provide you true comfort when working on real projects outside of this class. Make sure your master or main branch is in the state you want your new branch to be based on. If you need to synchronize with your default remote branch you can type git pull The next step consists of creating and switching to a new branch that is based off the current branch. For this you can use git checkout -b hw3 which is how you did it before git 2.23.0 . Since the checkout command is ambiguous , the preferred way for more recent versions of git is git switch -c hw3 You are now on a new branch called hw3 as required. You will need to issue a pull request into main or master from this branch such that your homework will be graded. You can create the PR now (see below) or once you are done with solving hw3 , it does not matter to git . (Pull requests are not something designed by git itself, but rather by platforms like GitHub or GitLab.) Note: you will lose 5 points if you are not solving your hwX (in this example it is hw3 ) on a branch named hwX . You are of course free to create additional branches besides hwX if suitable. 
Step 2: Solving the Homework The files the teaching staff will consider for grading have to be located in the directory homework/hw3/submission . You are free to put other files below homework/hw3 that might be useful when you revisit your work sometime later. The problem sheet might be one of those files. Class handouts are distributed in the main class repository . You can manually create these directories and copy the files you want into your hw3 directory using, for example: mkdir -p homework/hw3/submission cp /homework/hw3/hw3.pdf homework/hw3 Alternatively you can use git by configuring the main class repository as another remote in your local git repository (see homework 1). In this case you can checkout all the distributed homework files at once with git checkout class/master -- homework/hw3 assuming that the remote points to https://code.harvard.edu/CS107/main and is locally named \" class \". You may need to update your refs with git fetch --all before you invoke the checkout command above. The homework sheet will state what files have to be submitted. For this hw3 we assume they are P1.py , P2.py , P3.py and P4.py , one for each of the four problems. These files should run and return the required output. They have to be submitted inside the homework/hw3/submission directory. You should commit your work often in logical chunks. Your commits are to be done on the hw3 branch, of course. The following are a few commands that might be helpful: Use git status often to check your local state. Use the git add command to stage files you have changed for a commit. Use git commit -m to create a commit with an appropriate commit message. Use git stash to temporarily stash modified files (similar to a commit but it is not written to the history). Later on use git stash list to list all your stashed changes if you have used git stash multiple times. 
You can check what will change when you apply the stash with git stash show -p and apply the stashed changes with git stash apply (or git stash pop which also removes the stash from the list). Note that these commands work on the first stash object in the list stash@{0} if you do not explicitly specify the stash object you want to apply. Use git push to push local branch/commits to your remote repository. Use git restore to undo changes to a single file. Use git revert to undo the changes in a specific commit. Make sure you have committed the solution you want to submit inside the homework/hw3/submission directory with the required file names. Step 3: Create a Pull Request If you have local commits not pushed to the remote, issue the git push command. You are now ready to issue a pull request (you could also have done this step at the very beginning of solving this homework, this is up to you). The goal is to merge the hw3 branch into your main or master branch eventually. The teaching staff must review and grade your work first, however. There are two ways to accomplish a PR on GitHub: Through the web browser at https://code.harvard.edu/CS107/ . Through the GitHub command line client . This method is helpful if you get distracted by the context switch that is associated with the first method. Note: you will lose 2 points if you do not create a PR. Disclaimer: you would not typically issue a PR for projects where you are the sole contributor. Pull requests are typical for large projects at a company in which someone else will review your code before you can merge your code to the production branch. We want you to become accustomed to this type of workflow. It is a good idea to always use separate development branches. You should never commit straight to your main or master branches until the changes have thoroughly been tested. 
Creating a Web Pull Request Navigate to your https://code.harvard.edu/CS107/ private class repository and click on the \"Pull Requests\" tab in the top left part of the window. Click on the \"New pull request\" button Choose your main or master branch as the base (the one you want to merge into) and your hw3 branch as the one you want to compare to. This should automatically reload the page and show the changes that will be applied. Click on the \"Create pull request\" button. You can optionally add comments to this pull request if you desire. Click on the \"Create pull request\" button once more to create and open the pull request. The pull request is now open. You can even push more commits to the hw3 branch if you need to correct something (before the deadline has passed of course). Therefore, you could also create the PR at the beginning of the homework. Note: DO NOT click on the button that says \"Merge pull request\" until you have received your grade and feedback for that homework. You will lose 3 points if you prematurely merge your PR. Step 4: Submit on Gradescope Your submission is now ready to be submitted for grading on Gradescope . Simply create a zip archive of your submission directory you have created in your Git repository, e.g. submission.zip , and upload it to Gradescope by following the link above. You can use the command zip -r submission.zip submission/ , where the -r option means add files recursively , submission.zip is the name of zip archive and submission is your homework submission directory. Since you track the change history of your work in Git, you should not add *.zip files to your Git history. You can simply ignore such archives by adding the line *.zip to your .gitignore file in your repository root. Pair-programming Workflow Exercises performed during pair-programming sections should be put under version control similar to homework assignments (see the Homework Workflow section above). 
You must not branch off and create a pull-request for pair-programming exercises . Just add and commit your work on the main or master branch and push them to your repository ( make sure you are on the correct branch before you commit! ). The following are the basic rules we apply for pair-programming submissions: Your private repository should contain a directory named lab with sub-directories for each session. The sub-directories should be named ppX where X is the session number. Within the sub-directory ppX , place the exercise files that you completed during the pair-programming sections. The exercises must have the name exercise_Y.ext where Y corresponds to the exercise number and ext is the proper extension ( .py , .sh , .c , .cpp ) depending on the exercise. Here is an example how it may look like: The pair-programming exercises will be graded for completeness and help us ensure you are on the right track. You may lose points for the completeness part if you do not follow these two basic rules. Protocol In class we are focusing on command line tools for the development of software projects in computational science. It is important that you get familiar with a small selection of such tools and integrate them in your development process. The pair-programming sections aim at combining some of these tools together to provide you with hands-on experience while developing software. The key is the \"pair\" in pair-programming. The exchange of knowledge between team mates in these pair-programming sections is essential for learning said tools or learning something new you did not know before. The pair-programming works by using a tool called tmate which is based on the tmux terminal multiplexer . It allows for easy sharing of a command line session or a specific instance of a program in read/write and read-only modes via ssh or web browsers. Check out this blog post for more. 
Text file editing will be performed in any text editor that supports a text-based user interface (TUI). Recommended choices are vim or emacs . If you are mainly working on a Windows operating system, you should install the Windows Subsystem for Linux . A small guide for doing so can be found here . See the How to Launch tmate section below for the steps to launch tmate on your laptop. Note: tmate is perfect for any coding related communication. For example, debugging, work on the project and of course pair-programming. There is no audio channel integrated in tmate . If students are remote, a zoom session or similar must be established for oral communication. The exercises in the pair-programming sections are necessarily collaborative. Each member of the group will turn in the same script. Adhere to the following workflow when solving pair-programming exercises: For each exercise (or sub-exercise for big problems), there will be one sharer , one coder , and one listener . This assumes a group size of 3 . If the group only has two people, then either one of you can take the sharer's role. The sharer will start each coding session and document interactions including points of contention and challenges. The coder will be in charge of writing the code. The listener will make suggestions and may offer tweaks from time to time. The sharer starts a new tmate session and invites the other team mates to join the session. Ideally you want to start the session inside the directory of the current pair-programming exercise in your git repository. You may share a read/write link either through ssh or a web browser. Note: the sharer allows the others access to her/his computer. Any abusive behavior that may cause harm on the sharer's system will not be tolerated and will be forwarded to the dean's office. After the team mates have accepted the invite, they will be able to share the terminal instance and can create new files or execute Python together. 
The team should discuss a strategy on how to approach the exercise. The coder should start writing some code with input from the other two team members. Before each section that you work on, place a comment indicating which team member worked on that section. For example, a bash script could look like this: #!/usr/bin/env bash # File : exercise_1.sh # Created : Sat Aug 07 2021 04:58:49 PM (-0400) # Coder : Alice # Listener : Bob # Sharer : Alice echo 'Hello World' ### Main point of contention: whether to capitalize \"W\" in \"world\" For small exercises, each team member can play a single role once. For large exercises, the team members may rotate roles. The exercise will make it clear when you should rotate. At the end of the exercise, the developed code is inside the sharer's git repository that can readily be committed. Links to download these files can be shared with the other team mates such that they can update their repositories as well. Note that the exercise code will contain comments pertaining to who worked on which section. How to launch tmate Disclaimer: tmate is a tool to share a terminal session and interact with other people. If you host a session, it means the instance runs on your local computer and you are in control of how many permissions you want to assign to your mates. There are 2 ways to share a session: read-only: mates that connect to your session can only read data (this is safe provided the data you expose is safe). read-write: mates that connect to your session can read and write data. This is unsafe if you share a session with an untrusted person. Recommended Workflow Launch the CS107/AC207 docker container with the working directory mounted (see the provided run_cs107_docker.sh launch script ) Start tmate inside the docker container Wrapping tmate in a docker container provides another layer of security. 
You can also install tmate using your distribution package manager (on Linux or homebrew on MacOSX) and skip step 1 if you wish not to use a container. Steps: Assume Docker is installed and we have pulled the CS107/AC207 docker image . You can install the run_cs107_docker.sh launch script in your PATH for convenience (e.g. ~/bin/run_cs107_docker.sh and add this directory to your PATH environment variable). Be sure that the run_cs107_docker.sh script is executable . See the chmod command to change the permissions of the script. Assume you want to work on the PP1 exercise and you are in the lab directory of your private Git repo and pp1 exists. Launch the docker container and mount the pp1 directory in your repository: $ run_cs107_docker.sh pp1/ root@0a076feb425f:~# You are now inside a running docker container. Note that the hostname 0a076feb425f is arbitrary and yours will differ. Launch tmate (it is already installed in the container): root@0a076feb425f:~# tmate Tip: if you wish to use tmate only for remote access, run: tmate -F To see the following messages again, run in a tmate session: tmate show-messages Press <q> or <ctrl-c> to continue --------------------------------------------------------------------- Connecting to ssh.tmate.io... Note: clear your terminal before sharing readonly access web session read only: https://tmate.io/t/ro-qNRV5QRVWkW3qr55sfATkBegr ssh session read only: ssh ro-qNRV5QRVWkW3qr55sfATkBegr@nyc1.tmate.io web session: https://tmate.io/t/nMWurZc7Q6Zbv8EnX2wdhf6GB ssh session: ssh nMWurZc7Q6Zbv8EnX2wdhf6GB@nyc1.tmate.io The tmate instance is now running and you can choose between 4 possible links to share with your mates: 2 that can be run in your web browser and another 2 to be used with ssh in your terminal (either read-only and read/write). Choose the appropriate link you want to share with your pair-programming mates. If you press q or ctrl-c you are dropped back to the shell. The server will tell you whenever mates join. 
You can print the links again with tmate show-messages (be careful when you are sharing screens on zoom for example). Note that pressing ctrl-d or typing exit in the shell will close the active terminal and if only one is left, also the active tmate session. This will close the connections to all connected clients. You can now work together on the exercise. For example: root@0a076feb425f:~# vim exercise_1.py The image below shows a terminal session (left) and two mates connected in a web browser window (right): Note: We run tmate with root in the docker container. Do not run tmate as root in any other situation (even here we could create a regular user) and be careful with password-less sudo (avoid password-less sudo in the first place). In order to use ssh you need to set up an ssh key if you have not done so already. If you do not have such a key, you may create one by running ssh-keygen -t rsa -b 4096 If you are not dropped into a shell after you execute tmate it may be because you are using a shell other than zsh . Install zsh on your system using your package manager and run tmate like this SHELL=/bin/zsh tmate","tags":"pages","url":"pages/tutorials.html"},{"title":"Milestone 6 (Final)","text":"Sunday, December 17th, 09:59 PM In this final milestone , you're tasked with delivering the library as outlined in the Contract. Library Your goal is to publish the first version of your library. Once the documentation is complete, it passes all tests, and every feature is implemented, you can merge it into the main branch. Use this branch to publish the library on Test PyPI. Each module will be graded and is worth 15 points: six for unit tests, six for implementation, and three for documentation. Each library implementation will have a different number of modules and they will be given points. In this milestone, we will grade the remaining X ungraded modules. All the integration tests must be implemented to check the API you defined (and refined) earlier. 
If the integration test suite is incomplete, your score will be reduced. Tutorials and video For the final presentation, you need to create a Jupyter notebook that explains your library's functionalities. This includes demonstrating how to execute required functions and handle exceptions. Upload a video to your repository, Youtube, or Google Drive in private or shared mode. You must show the real-time installation and execution of the library. Every team member must participate . Feel free to use any VM, virtual environment, or Docker container. The explanation in the video should not exceed 7 minutes. While there's no strict limit on installation time, please keep it within a reasonable duration. Write the link at the end of your Jupyter notebook. It is your responsibility to give the appropriate permissions. If the video is not present or not accessible by the teaching staff, you will not receive the points. Self-evaluation Each team member is required to upload an estimate of the hours they dedicated to the project in the dev branch of your repository. Along with this, include a brief summary of your main contributions. Submit these estimates through pull requests, which should not be accepted by the author . We'll review the commit history of the repository. Special considerations may be made in exceptional cases. Final Deliverables For this milestone, all final deliverables should be uploaded exclusively to the Github page. There's no need to send emails. Publish your library on Test PyPI. Include the link in the main branch's README file. A Jupyter notebook that showcases how your library works. Record and upload a video presentation demonstrating the library's installation and functionality. Each team member's self-evaluation in the dev branch README. Exceptions Submission of the Jupyter Notebook example is mandatory, while the video is optional. Both serve to demonstrate your work. 
If you choose not to submit the video: No points will be awarded for 'Library on Test PyPI' and 'Video Presentation' if the library installation fails. If the library installation succeeds but tests fail, you will receive points for 'Library on Test PyPI' but lose points for 'Video Presentation'. Points for both 'Library on Test PyPI' and 'Video Presentation' will be awarded if the installation and tests are successful. Grading breakdown Points Task 15 Library on Test PyPI 15X X dev of remaining modules 60 Integration tests 60 Jupyter notebook tutorial 45 Video presentation 45 Self-evaluation 225+15X Total","tags":"Project","url":"project/M6/"},{"title":"Milestone 5","text":"Monday, December 11th, 09:59 PM With the development in full swing, many modules should now be ready. This milestone is to ensure that some of these modules work together correctly. API tuning During project development, you'll gain insights about the structure and modules. It's possible that the initial API isn't ideal, so you might need to revise it. This involves updating the API documentation, diagrams, and, most importantly, the code to align with the new API. If your API draft still meets the contract requirements, you can choose not to modify it. In this case, add a small appendix explaining why it remains unchanged. In the remainder of this document, we'll refer to the latest version of the API as the modified version. Features and Integration Developing individual features is usually straightforward. The real challenge lies in integrating them smoothly. For this item, develop features from different modules and conduct integration tests using GitHub Actions. You can merge features into the dev branch once they pass your tests. However, keep the feature development branches until reviewed by the teaching staff. The teaching staff will grade the dev branch. Make sure to commit often in the local repository, as it is part of the evaluation . 
Tests must be committed and pushed before any code is written. Design and write your integration tests based on the modified API. Work on at least two consecutive modules. This includes writing unit tests, coding, and documentation (docstring). If the modules are already created: Merge the integration tests into the dev branch. If only one module is developed: Merge the integration tests into dev before starting the second module. If no modules are created: Merge the integration tests into dev first, before any module development. Successful integration of two modules is confirmed when all integration tests pass. Note : Regularly commit to your local repository and tidy up the history as needed. Push your changes only after passing the unit tests. SRS clarifications and modifications Your main focus should be to complete the pipeline. Use placeholder functions if necessary. Any modifications, as per the contract, should be straightforward. Clarifications of the Software Requirements Specifications For Annex A: 3.A: Each task listed in this item should be applied to one spectrum. 3.B: Include the class (STAR, GALAXY, or QSO) in the metadata. The spectrum is the data; metadata is everything else related to it. 3.C: Aligning in wavelength means sub-sampling the wavelengths. For a given list of target wavelengths, return a flux value for each. 4: The inferred continuum is a line derived from the data, excluding any emission or absorption lines. 5: Data augmentation module execution is optional. If used, the user inputs the degree of required derivatives. For Annex B: 2: The machine learning module should primarily use spectral data. It can include other metadata. The results should report a confusion matrix. 3: Total flux of spectral lines must use an inferred continuum. The method for calculating line area is defined by the Developer, and should be well-documented and easily modifiable. 
Modifications of the Software Requirements Specification Annex A - 3.B: Replace chemical abundances with the value of Equivalent Width for each line detected by the SDSS pipeline. Annex B - New task: Report the chemical abundances of stars from the APOGEE survey . Steps to complete Re-evaluate the document written in the folder API_draft . Make the required modifications both in your diagram and the document. Based on the modified API, reorganize your library in the dev branch. Write the integration tests on the dev branch. Complete the implementation of at least two modules of your choosing and test their integration. Every change in the library should trigger integration tests via Github Actions. In milestone5 , describe the rationale behind any API changes. This should include: Why the API was modified, focusing on how the changes improve functionality, usability, or adaptability to project requirements. Discuss how these changes enhance the integration of different modules. List the specific module names that need to be evaluated by the teaching staff for their integration. Final Deliverables The final deliverables for this milestone should be uploaded only on the Github page. No emails are necessary. Updated API document and its diagram. Place these in the draft_API folder. The docs/ directory should include a document called milestone5 . Integration tests for at least two modules. Note 1 : By now you should have implemented many modules, and most of the items requested in the milestone. Note 2 : The reading period ranges from December 6th to December 10th. You are highly encouraged to submit earlier (notifying your liaison by email) to receive feedback. Grading breakdown Points Task 15 API 20 Modules 20 Integration tests 55 Total","tags":"Project","url":"project/M5/"},{"title":"Milestone 4","text":"Due: Monday, November 27th, 09:59 PM Tuesday, November 28th, 09:59 PM You will now start the development of the library modules. 
As part of test-driven design, you should first write the tests of a functionality, and then write the code, based on the API you defined in Milestone 3. Software Organization Before any code is written, discuss how you plan on organizing your software package. With the idea of classes/modules in mind, organize your code according to the API identified in milestone 3. This is a more detailed organization and should reflect said API (If you already did this for milestone 3, you may use it for milestone 4 or expand it if you need it). What will the directory structure look like? Where will your test suite live? How will you distribute your package (e.g. PyPI with PEP517/518 or simply setuptools )? Other considerations? Describe your choices in the milestone4 document (even if you already wrote it for milestone 3). You have to follow the guidelines shown during lectures. Licensing Licensing is an essential consideration when you create new software. You should choose a suitable license for your project. A comprehensive list of licenses can be found here . The license you choose depends on factors such as what other software or libraries you use in your code ( copyleft , copyright). Will you have to deal with patents? How can others advertise software that makes use of your code (or parts thereof)? You may consult the following reading to aid you in choosing a license: Helper to choose a license Licenses License recommendations License compatibility Extensive list of open source licenses Briefly motivate your license choice in the milestone4 document and add a LICENSE file to the root of your project. Implementation New features should be developed on independent branches. For this, you will leave the main branch only for the code graded by the teaching staff. The branch dev will contain the code in development. You are free to create as many branches as you need and merge them into dev . 
Do not delete the branches used to develop new features until the teaching staff review them. Remember that you can also create as many workflows as you want. Select at least one module identified in Milestone 3 and implement it. What method and name attributes will your class have? What methods and attributes will you expose to the user? Do you want/need to depend on other libraries? (e.g. NumPy) Write a comprehensive test suite for this module(s), according to your API. This might change in the future as you learn more about your code. Write the code of the module along with its documentation. Note : You can commit multiple times to your local repository and clean the local history if needed. Push only when you pass the tests. Steps to complete In the main branch and within your docs sub-directory, create a file called milestone4 . The type of file is up to you and your group. Two acceptable choices are markdown ( milestone4.md ) or a Jupyter notebook ( milestone4.ipynb ). Your milestone4 document submission should be in the following format: teamXX/ ├── docs │ └── milestone4 ├── LICENSE ├── README.md └── ... Describe the software organization and licensing in milestone4 . Create branch dev . Create the branch featurename to implement your module. Replace \"featurename\" with the name of the module you want to implement. In that branch, write the tests for the module you want to implement. You should commit them before writing/pushing any other code . Write the code for the module you wrote the tests for. Every test for the modules should pass. The code coverage must be at least 90%. Merge the branch into dev . (Optional) You can implement another feature, following steps 5-10. You are encouraged to develop the main modules as soon as possible, to focus on the integration later on. Final Deliverables The docs/ directory should include a document called milestone4 (the extension is up to you, but .md or .ipynb are recommended). Proper licensing of your project. 
Tests and implementation for your module(s). Grading breakdown Points Task 4 Software Organization 4 License 15 Implementation 15 (optional) Additional implementation 23(38) Total","tags":"Project","url":"project/M4/"},{"title":"Milestone 1","text":"Due: Thursday, November 2nd, 09:59 PM You will now begin your final project to develop a Python package for astronomical research. Please get together with your project group and complete the tasks below for Milestone 1. Steps to complete Find team members you would like to work with and establish a way to communicate. Register your team on Canvas. Send an email to cs107-staff@g.harvard.edu with your team number and members. You should also create your team name, which will be used to represent your team. Your team ID will be team01 if you are Project - Group # 1 or team10 if you are Project - Group # 10 and so on. Final Deliverables Form a project team and communicate with the teaching staff. Grading breakdown Points Task 1 Team formation 1 Total","tags":"Project","url":"project/M1/"},{"title":"Milestone 2","text":"Due: Thursday, November 9th, 09:59 PM You are required to review the mockup contract and identify the critical elements that need to be developed. This exercise is designed to emulate a professional environment and is strictly for simulation purposes; it carries no legal obligations. Software Requirements Specification (SRS) We expect you and your team to read and understand the SRS. From the SRS, identify the API your library should present. We expect you and your team to thoroughly read and comprehend the SRS. Steps to complete Every team member must sign the contract and upload the signed document into the root folder of the team's repository. 
Grading breakdown Points Task 1 Uploading the signed contract 1 Total","tags":"Project","url":"project/M2/"},{"title":"Milestone 3","text":"Due: Tuesday, November 14th, 09:59 PM You will now further configure your group repository. Software Requirements Specification (SRS) Based on the SRS, you are to identify the Application Programming Interface (API) that your library is required to provide. Git Conventions We expect all work from this point onward to be done on feature branches and merged into master or main via Pull Requests. Try to work with different branches and \"approve\" each other's pull requests by reviewing their code and then merge into your default project branch. You must work with your project Git repository. The teaching staff will frequently check the history of your project. Steps to complete Create a private team repository in CS107 organization. The project code will be hosted in private repositories within the CS107 organization . A member of your team must create a private repository named after your team ID (e.g., team01_2023 for Team 1). After creating the repository, add all team members to it. The teaching staff will have automatic access and do not need to be added. Within your project repository, create a folder named API_draft . Inside this folder, provide a README file detailing the modules, classes, and functions planned for inclusion to meet the SRS requirements. Use this phase to outline your pipeline and begin task allocation among team members. Don't try to be overly specific on the details, as this is likely to change as your code evolves. Within the API_draft folder, also upload a schematic diagram that illustrates the modules and the API structure your library will present. Within your project repository, you must set up two workflows with GitHub Actions . One workflow will be used for tests and the other for code coverage . You will need two .yml files in the .github/workflows directory in your project repository. 
The .yml do not need to have meaningful declarations at this point but you should have at least the name: option and the on: option defined. See this link for more details. Make sure the README.md file at the root of your repo includes badges indicating whether your CI workflows are passing or failing. Your workflows are expected to be failing at this point. You should end up with a rendered README.md file that looks like this (workflows may fail or have no status ): In the root of your project repo, you should create a directory called docs . You can use this directory to organize documentation and tutorials for your final package. You will begin creating this documentation as part of the next milestone. Grading breakdown Points Task 1 Creation of team repository 4 Describing the API 2 Diagram 5 Configuring test action 5 Configuring coverage action 4 Creating project structure 21 Total","tags":"Project","url":"project/M3/"},{"title":"Systems Development for Computational Science","text":"Computation has emerged as the third pillar of science alongside the pillars of theory and experiment. Computational science is maturing rapidly and has found considerable and significant use in supporting scientists from various disciplines (including all engineering disciplines, mathematics, physics, chemistry, finance, biology, and data analysis to name a few). Many burgeoning scientists are still taught to write \"a code\" for some problem and to debug when things look wrong. Given the ever-increasing complexity of software solutions to scientific problems, this old paradigm is no longer tenable and at best inefficient. CS107/AC207 is an applications course highlighting the use of software engineering and computer science in solving scientific problems. You will learn the fundamentals of developing scientific software systems including abstract thinking, the handling of data, and assessment of computational approaches: all in the context of good software engineering practices. 
The class syllabus can be found by following this link. Teaching Staff The preferred way to reach the teaching staff is described in the Teaching Staff Mailing List section below. Instructor Ignacio Becker ( iebecker@g.harvard.edu ) Office: SEC, Office 1.312-05 Office Hours: Wed 5:00-6:00pm Teaching Fellows Fellow Email Office Hours Pair-Programming Sections Kimon Vogt kvogt@g.harvard.edu Sat 8:00-10:00am (Zoom) Mon 8:00-9:15am (Zoom) Fri 8:00-9:15am (Zoom) Yixian Gan ygan@g.harvard.edu Tue 5:00-6:00pm (SEC 6.301+6.302) Mon 6:00-7:15pm (SEC 6.301+6.302) Allison Karp akarp@mde.harvard.edu Thu 9:30-10:30am (SEC 6.301,6.302) Tue 9:30-10:45am (SEC 6.301+6.302) Gekai Liao gekailiao@g.harvard.edu Thu 4:00-5:00pm (MD PierceHall 100F) Tue 3:45-5:00pm (SEC 6.301+6.302) Victor Zhu dunminzhu@g.harvard.edu Mon 10:00-11:00am (Zoom) Thu 6:00-7:15pm (Zoom) Frank Cheng xcheng@g.harvard.edu Fri 4:00-5:00pm (SEC 2.122+2.123) Thu 1:00-2:15pm (SEC 2.122+2.123) Danni Lai danninglai@g.harvard.edu Wed 4:00-5:00pm (Zoom) Tue 7:00-8:15pm (Zoom) Isabella Bossa isabellabossa@g.harvard.edu Tue 10:45-11:45am (MD 223) Thu 8:00-9:15am (MD 123) Tanner Marsh tam997@g.harvard.edu Fri 10:00-11:00am (SEC 2.112) Thu 1:00-2:15pm (SEC 2.122+2.123) Boxiang Wang bwang@g.harvard.edu Mon 1:00-2:00pm (SEC 6.301+6.302) Thu 3:45-5:00pm (SEC 4.405) Shuheng Liu shuheng_liu@g.harvard.edu Mon 7:00-8:00pm (Zoom) Tue 7:00-8:15pm (Zoom) Cyrus Asgari cyrusasgari@college.harvard.edu Wed 3:00-4:00pm (MD 123) Fri 12:00-1:15pm (SEC 2.122+2.123) Haitian Liu hliu3@g.harvard.edu Fri 2:45-3:45pm (SEC 2.122+2.123) Fri 1:30-2:45pm (SEC 2.122+2.123) Legend: SEC : Science and Engineering Complex, Northwestern Av 150, Allston MD : Maxwell-Dworkin, Cambridge Please see Pages section in Canvas for a Google Calendar. Lecture Hours All lectures are of 75 minutes duration. Time is given in Eastern Standard Time (Boston). 
Lecture attendance is mandatory : Time Room Tuesday 2:15 - 3:30 PM SEC 1.321 Thursday 2:15 - 3:30 PM SEC 1.321 Important Information Canvas: Is used for posting grades and other sensitive content. The class can be found on Canvas at this link https://canvas.harvard.edu/courses/122565 Class git repository: All handouts in CS107/AC207 are provided through the main repository hosted in the CS107 organization at https://code.harvard.edu/CS107/main . You can set this repository as an upstream in your private class repository or clone it once you have joined the CS107 organization git clone git@code.harvard.edu:CS107/main.git Updates to the main repository are posted on the class mailing list. Your Harvard ID is required to login to https://code.harvard.edu . You can request membership in the CS107 organization (AC207 students join the CS107 organization as well) by sending an email to cs107-staff@g.harvard.edu (using your .harvard.edu email). You must include your NetID in the body of your email, which is also your https://code.harvard.edu username (something similar to abc123 ). Once you have been added to the CS107 organization, create your own private repository inside the organization. Your private repository must have the exact name as your NetID . This will be your private class repository where you submit your homework and pair-programming exercises. See the following tutorial to help you get started with your git repository: How to setup your private class repository Class Discussion Forum We will use the Ed Discussion forum on our Canvas page as our main communication platform. Questions regarding homework, labs or lecture material must be posted on this forum and you are encouraged to reply to questions if you know the answer or you can share a useful contribution. A fraction of your participation grade is computed by how often you visit and the frequency you post on the forum. 
Class Mailing List You can optionally sign up to our class mailing list if you would like to be notified whenever there is new class content available in the class git repository. Replies to posts in this list will be sent to all list members. To sign up, send an email to: cs107+subscribe@g.harvard.edu (subscribe by sending a blank email to this address; use the email address associated with your HarvardID ) You are required to confirm your subscription. Simply reply to the confirmation email with a blank message to complete the subscription. Teaching Staff Mailing List You can reach the teaching staff directly by sending your email to the following mailing list cs107-staff@g.harvard.edu (email sent to this list is only seen by the teaching staff; only email ending with .harvard.edu is accepted) You are not required to register for this mailing list but only email addresses ending with .harvard.edu are accepted (you will receive a rejection message otherwise). Getting Started Checklist Sign up with the CS107 organization on https://code.harvard.edu/CS107 and create your own private repository inside the organization . Information flow: Canvas → Grades and discussion forum https://code.harvard.edu/CS107 Assignment submissions inside your private repository (homework, pair-programming exercises) Group repositories for project work All course handouts are published in the https://code.harvard.edu/CS107/main repository Need help? → cs107-staff@g.harvard.edu OPTIONAL: Sign up on the class mailing list to receive push notifications when new content is available in the https://code.harvard.edu/CS107/main class repository. You can get an Ubuntu docker container with the necessary class tools by docker pull iacs/cs107_ubuntu . Note that no ssh keys are contained in that image for use with git . 
See also the docker resources page .","tags":"pages","url":"pages/systems-development-for-computational-science/"}]} \ No newline at end of file +var tipuesearch = {"pages":[{"title":"CS107/AC207 Project","text":"Project Overview Goal You will develop a software library for a client, the teaching staff. The development of this library will leverage modern software development practices covered in the course. By the end of the semester, the client should be able to easily install and run your package. Topic The project topic is spectral analysis , which consists of the analysis of data obtained from publicly available sources currently used by professional astronomers to perform state-of-the-art research. Moreover, spectral data appears in many fields of science and engineering, and you are likely to encounter it in your professional careers. Your final project is to write a Python library. Your library is not required to have every module implemented; that would simply be too much for a single semester. However, your library should meet the basic project expectations outlined in the Software Requirements Specification (SRS). Project Milestones The following weight table is used for individual milestones of the project. The individual milestones make up the final project grade listed under the Grading section in the syllabus. Additional milestones will be included in the near future. The due date for the final milestone is December 14th 2023, 09:59 PM. The due date for the final milestone is December 17th 2023, 09:59 PM. Milestone Due Total Points Milestone 1 Thu, November 2nd, 09:59 PM 1 Milestone 2 Thu, November 9th, 09:59 PM 1 Milestone 3 Tue, November 14th, 09:59 PM 21 Milestone 4 Tue, November 28th, 09:59 PM 23 Milestone 5 Mon, December 11th, 09:59 PM 55 Milestone 6 (Final) Sun, December 17th, 09:59 PM 225 + 15X Total 326 + 15X Groups You will work in groups of 4-5 students. 
You are free to choose your project partners but groups sizes must consist of the number of students mentioned before. Some members of the group will be stronger than others. It is expected that you work together and help each other as needed. This is an opportunity for less experienced coders to improve their skills by working with more experienced coders. Every person must contribute. Expectations This project encompasses several mandatory requirements, detailed under basic expectations and within Annex A of the Contract. Furthermore, the project includes supplementary elements, specified under additional expectations and delineated in Annex B of the Contract. Basic Expectations Python library that can be used for astronomical spectral analysis. The library must comply with the API described the Contract. The client should be able to easily install the library, run the tests, access the documentation, and use the library for their application. Documentation for every subsystem in the project must be provided. Link to the docs from the README.md in each folder. The top level README.md should contain an overview, links to other docs, and an installation guide which will help us install and test your system. The project must utilize a proper packaging system for distribution and installation of the library. The project must ship with a test suite. Documentation on how to run the tests is mandatory. Additional Expectations In addition to the basic requirements of the library, you must also extend your package with at least two additional modules. Cross-Matching Machine Learning Interactive Visualization Spectral Feature Extraction You are more than welcome to pitch your own idea, which must be approved by the Teaching Staff.","tags":"pages","url":"pages/project.html"},{"title":"Resources","text":"Books No book is required. But we highly recommend two books for this course. Fluent Python: Clear, Concise, and Effective Programming, by Luciano Ramalho. 
Publisher: O'Reilly Media. 2015. Designing Data Intensive Applications, by http://dataintensive.net/ , The Big Ideas Behind Reliable, Scalable, and Maintainable Systems by Martin Kleppmann. Publisher: O'Reilly Media 2014 Other useful books The Practice of Programming by Brian W. Kernighan and Rob Pike, Addison-Wesley, 1999. Skiena: The Algorithm Design Manual Abelson, Sussmann and Sussmann: SICP and python based online version based on it: http://composingprograms.com/ High Performance Python: By Micha Gorelick, Ian Ozsvald. Oreilly Media 2014. Evaluating Derivatives: Principles and Techniques of Algorithmic Differentiation by Andreas Griewank Papers and other readings Python pep8 An opinionated guide to python style Git Recommended: Git from the bottom up Recommended: Git Book GitHub Videos and Training GitHub Interactive Tutorial Git - the simple guide Git Reference Git Cheat Sheet Git Immersion Tutorial Git Atlassian Tutorial Python python Rich overview of Python 3 language features (recommended to work through) Scientific visualization with Python and Matplotlib C/C++ Fall 2021 C/C++ primer material C Tutorial C++ Tutorial C++ Cheat Sheet C++ Reference Vim Spend 30 minutes to complete the vimtutor . After you have installed vim , execute the following command in your command line: vimtutor Vim Cheat Sheet Vimcasts Recommended book Bash Command Line Reference Cheat Sheet Bash scripting Cheat Sheet Unix-Related Basic Computing Tools Windows Users Using Linux Subsystem on Windows 10 PuTTY SSH client for Windows Ubuntu Docker Image You can get an Ubuntu based Docker container with docker pull iacs/cs107_ubuntu The container is hosted here . The Dockerfile and run_cs107_docker.sh launch script can be found in the class repository .","tags":"Resources","url":"pages/resources.html"},{"title":"Schedule","text":"","tags":"pages","url":"pages/schedule.html"},{"title":"Schedule","text":"All due events with a given date are due on 09:59pm that day . 
Wk Tuesday Thursday Labs Events 1(35) Lecture 1: 2023-09-05 Class introduction/organization History of Bell Labs, Unix and Linux Command line introduction Lecture 2: 2023-09-07 More command line Pipes Regular expressions File attributes 2(36) Lecture 3: 2023-09-12 Command line customization I/O redirection Environment variables Shell scripting Process management Lecture 4: 2023-09-14 Version control systems (VCS) Centralized and distributed models Intro to Git PP01: (2023-09-12) Setup private class repository, tmate . 3(37) Lecture 5: 2023-09-19 Version control systems (VCS) Managing repositories Remote repositories Branching Lecture 6: 2023-09-21 Python basics Objects and Functions Environments Closures PP02: (2023-09-18) Bash scripting, Git workflow. Note: PP01 deadline (2023-09-22) 4(38) Lecture 7: 2023-09-26 OOP in Python Classes Inheritance Polymorphism Lecture 8: 2023-09-28 Python data model Dunder methods Software licenses PP03: (2023-09-25) Git local branches, merge conflicts and merge tool. Note: HW1 deadline (2023-09-27) PP02 deadline (2023-09-29) 5(39) Lecture 9: 2023-10-03 Classes and methods Modules and packages Python Package Index Lecture 10: 2023-10-05 Databases SQL SQLite PP04: (2023-10-02) Python closure, fully connected neural networks. PP03 deadline (2023-10-06) 6(40) Lecture 11: 2023-10-10 Databases: OLAP & OLTP SQL: Joins Lecture 12: 2023-10-12 SQL Joins Pipelines Case Study PP05: (2023-10-10) SQL and SQLite in Python. HW2 deadline (2023-10-13) PP04 deadline (2023-10-13) 7(40) Lecture 13: 2023-10-17 Pipelines Software systems Documentation Lecture 14: 2023-10-19 Testing PP06: (2023-10-16) SQL and pipelines.
PP05 deadline (2023-10-20) 8(40) Lecture 15: 2023-10-24 Testing revisited Exceptions Test coverage Lecture 16: 2023-10-26 Continuous integration PP07: (2023-10-23) Documentation and testing Quiz #2 deadline (2023-10-25) HW3 deadline (2023-10-27) PP06 deadline (2023-10-27) 9(40) Lecture 17: 2023-10-31 Containers Virtual environments Docker Lecture 18: 2023-11-02 Data structures Linked lists Iterators PP08: (2023-10-30) Package deployment PP07 deadline (2023-11-03) 10(40) Lecture 19: 2023-11-07 Binary search trees Tree traversal Priority queues Lecture 20: 2023-11-09 Heaps PP09: (2023-11-06) BST, Docker images HW4 deadline (2023-11-10) PP08 deadline (2023-11-10) 11(40) Lecture 21: 2023-11-14 Generators Coroutines Lecture 22: 2023-11-16 Python internals Memory PP10: (2023-11-13) TBD PP09 deadline (2023-11-17) 12(40) Lecture 23: 2023-11-21 CATCH UP lecture Thanksgiving break: 2023-11-23 No PP11 PP10 deadline (2023-11-17) 11(40) Lecture 24: 2023-11-28 Performance Lecture 25: 2023-11-30 Project work PP12: (2023-11-27) TBD Quiz #3 deadline (2023-11-30) 11(40) No lecture: 2023-12-05 Work on the project Work on other projects Rest and relax Reading period: 2023-12-07 PP12 deadline (2023-12-01) 11(40) Final exam period: 2023-12-12 Final exam period: 2023-12-14","tags":"pages","url":"pages/schedule_static.html"},{"title":"Syllabus","text":"Course Objective The primary goal of this course is to teach you how to develop effective software for scientific applications. In order to achieve this goal, there are several non-negotiable topics that must be included in the course. We will be concerned with two primary thrusts: System and Software Engineering and Language . Moreover, we aim to provide you with a suite of modern software development techniques and workflows. Learning Objective After successful completion of this course, you will be able to: Use Python, including its advanced features to write scientific programs. Have a basic idea how the Python interpreter works.
Understand what features of Python make up its language execution model and how these features impact the code you write: e.g. how modularity, abstraction, and encapsulation can be used to solve problems. Write programs with good software engineering practices. These practices include: working on remote machines, version control, continuous integration, documentation and testing. Utilize data management techniques to store data, starting from a good understanding of data structures to databases. Combine these techniques together to write large pieces of software working in a team. Develop pipelines to integrate data acquisition and processing. Evaluate and test software as part of the development process. Be able to contribute on both the science and software engineering sides of things. Prerequisites You should have some basic familiarity with programming (functions, variables, constants, differences between integer and floating point, etc.) at the level of CS50. Some comfort with a tool to edit text files is beneficial. Any text editor or IDE will suit this purpose. The student should have passed a basic calculus class. The lectures will review the necessary fundamentals required to succeed with the class project. Besides this, you should have interest or investment in scientific computing. You can download Homework 0 for self-assessment here (not graded). You do not need to be able to solve all problems in order to take this class. Jupyter Notebooks Jupyter notebooks are great for code prototyping and learning how to use new features and APIs. However, they are not suitable for large software development projects! One reason for this is because code development in Jupyter notebooks is a nonlinear development process and there is presently no good solution for version control of Jupyter notebooks. A second reason is the question of efficient source editing. A helpful tool to convert (back and forth) Jupyter notebooks to pure python code is Jupytext .
Homework assignments and lecture exercises turned in as Jupyter notebooks will not be graded. Textbooks There is no required course textbook. However, the course content will draw from various sources. We will cite the source when appropriate. Please consult the resources page for recommended textbooks and additional helpful material. Course Format The delivery of course content will occur via two weekly lectures as well as weekly pair-programming sections. Attending these sessions is mandatory . Lectures will consist of considerable interaction and discussion and will be greatly enhanced by student participation. The course contains the following main components: Lectures: Deliver the main content of the class. Attendance is mandatory. Quizzes: Graded in-class quizzes intended to assess the learning progress. Pair-programming: Pair-programming (PP) sections offer practice on topics addressed in class and help assess the skills to program in a collaborative environment. Attendance is mandatory. Homeworks: Homework assignments deepen the lecture material and include coding exercises. Exercises may be of theoretical or practical nature. Projects: The class is accompanied by a project (teams of 4-5 students) to practice the methods learned in class on a real Python application. The project topic is given by the teaching staff. The main programming language taught throughout the course is Python. Grading The following weight table is used for individual components of the class. The class does not have standard midterm or final exams. Total Weight Homework (5 Homeworks) 35% Project 35% Quizzes (3 Quizzes) 15% Pair-programming (11 sections) 15% Homework There are 5 homework assignments where each contributes equally to the final grade. The homework is focused on the topics discussed in class and involves programming and theoretical work. The teaching staff is determined to return solutions and graded assignments with feedback after the due date. 
It is your responsibility to check the consistency between your graded work and the assignment solution. You have the option to address possible inconsistencies in office hours or request a regrading for the assignment (see the homework grading inconsistencies section below). Homework will be released on the CS107/AC207 class repository . Push notifications for that repository will be distributed through the class mailing list . Homework will be graded on a 100 point scale: 100 = Solid / no mistakes (or really minor ones) 80 = Good / some mistakes 60 = Fair / some major conceptual errors 40 = Poor / did not finish 20 = Very Poor / little to no attempt. 0 = Did not participate / did not hand in Homework Submission Homework must be submitted via commits in your private git repository hosted in the CS107 organization at https://code.harvard.edu/CS107 . Grading and feedback for homework is done through the Gradescope platform which is connected to the class' Canvas site . Your homework solutions must therefore be zipped and uploaded in the Gradescope section of the class canvas. See the homework workflow tutorial for more details. The homework due date is indicated on the problem sheet and displayed in the schedule as well as shown on Canvas and Gradescope. Homework submissions will be graded on: Correctness: your code must run and must produce the correct result. We are not debugging issues when grading submissions. Presentation: presentation means structure and readability. We expect you to write high-quality, readable and tested code. A quality code is well commented in places where it is not straight forward to deduce the logic from code itself (from the reviewers perspective). We expect you to think about aspects such as modularity, reusability, code duplication and error handling when you design and write code. 
Presentation of results also means that unnecessary or superfluous files like editor backup files or other unrelated data should not be included in the submission commits (use .gitignore for this purpose). See the following tutorials to help you get started with homework submissions: How to setup your private class repository (onetime setup) Homework workflow Homework Late Days Homework submissions are accepted before the deadline of the assignment is due. You have three late days at your disposal that can be consumed for late submissions and two consecutive late days can be used at most for any of the homework assignments. Please note that any commits on your homework branch pushed after the deadline has passed are not considered for grading by default. If you wish that we consider a late commit for grading, please contact the teaching staff at cs107-staff@g.harvard.edu with appropriate explanation. This will count towards your late day budget. It is your responsibility to plan your work ahead and commit on time. If you have consumed all your late days and you have another late submission, it is in your benefit to still commit the work. We assume the Harvard Honor Code for all late submissions in case solutions are already posted. If you have a verifiable medical condition or other special circumstances that interfere with your coursework please let us know via cs107-staff@g.harvard.edu as soon as possible. Homework Grading Errors If you believe there is an error in your assignment grading, you can submit a regrade request through the Gradescope platform . Note: The entire assignment will be regraded. This may cause your total grade go up or down . An assignment can only be regraded once . Regrade requests are due within 2 days after the release of the grades . Project Please see the project section for more details. Quizzes There are 4 quizzes out of class which are graded and intended to assess the learning progress. 
Each quiz addresses topics from the lecture material . Quizzes are open book/ www and include multiple choice questions with at most back of the envelope calculations. Quizzes contain around 15-20 questions and take 30 minutes. They are accessible on canvas within a 24 hour time window from 8pm. Note: if a quiz takes 30 minutes and you start the quiz on 8:50pm, you will have only 10 minutes to work on the quiz. Please see the class schedule as well. Pair-Programming Sections Pair-Programming will form an essential part of the course. Pair-programming will take place in mandatory pair-programming sections led by members of the teaching staff. You are required to sign-up for your preferred pair-programming section at the beginning of the semester. You are expected to attend your chosen section during the semester. Should you not be able to attend one of your sections, please coordinate with your section TF to attend another section this week in order to obtain the attendance credit. In CS107/AC207 we are focusing on command line tools for the development of software projects in computational science. It is important that you get familiar with a small selection of such tools and integrate them in your development process. The pair-programming sections aim at combining some of these tools together to provide you with hands-on experience while developing software. The key is the \"pair\" in pair-programming. The exchange of knowledge between team mates in these pair-programming sections is essential for learning said tools or learning something new from your peers. Pair-programming Submissions Pair-programming exercise solutions must be submitted in your private git repository hosted in the CS107 organization at https://code.harvard.edu/CS107 . Only commits made on or before the due date will be considered for grading . The deadline for submission is usually one week after the last section for the exercise. 
Given this extra time for completion, late days do not apply to pair-programming exercises . The submission due date is indicated on the problem sheet and displayed in the schedule . As you are working in groups of 3-4 students for the lab exercises, the solution files you come up with in the group are submitted by each group member individually in her/his own private Git repository. Pair-programming submissions will be graded based on the following criteria: Attendance: your attendance will be recorded by the TF who leads the section. Joining the section at the beginning and then leaving 10-15 minutes later will not reward attendance credit. If you need to leave because of another appointment then it is expected that you communicate beforehand and coordinate with your TF. Please see the attendance policy section below as well. Your pair-programming session is determined at the beginning of the class by choosing lab sections in my.harvard . You can select your preferences depending on your schedule. Once determined, you can lookup your session details in the https://code.harvard.edu/CS107/main/blob/master/lab_groups.xls sheet. Completion: pair-programming submissions should reveal effort that the student attempted to solve the tasks. If you experience difficulties in a particular problem and you are not able to complete the task, please indicate the issues you had in your code using comments; the teaching staff will take that reasoning into account. Handing in an empty skeleton (same as hand-out) does not meet the expected standard and will not award credit for the submission. See the following tutorial to help you get started with pair-programming submissions: Pair-programming workflow Office Hours The teaching staff holds weekly office hours. Office hour times and locations are listed on the class main page. Office hours offer an opportunity to review course materials and receive additional guidance on your homework. 
Please see the following file in the class git repository for the details: https://code.harvard.edu/CS107/main/blob/master/office_hours.xls Attendance Policy Attendance at lectures and pair-programming sections is mandatory as they are core parts of the class. Pair-programming sections (labs) will be held on weekdays that we determine at the beginning of the class according to a best fit of the students' individual schedules for the term. You are required to attend the labs on the assigned day. Rescheduling of a lab to a different day due to an unforeseeable event must be coordinated with the responsible TF by sending an email to cs107-staff@g.harvard.edu . To be excused from a lecture or a lab, we ask you to follow the Harvard Honor Code and send an email to cs107-staff@g.harvard.edu at least one day before the lecture or lab. Lecture recordings are available only when students are excused for a lecture. Collaboration Policy You are welcome to discuss the course material and homework with others in order to better understand it, but the work you turn in must be your own (with exception of the project where collaborative work is permitted). Any work submitted as your own without properly citing the original author(s), is considered plagiarism. Failure to follow the academic integrity and dishonesty guidelines outlined in the Harvard Student Handbook will have an adverse effect on your final grade. This includes the removal of copyright notices in code. You may not submit the same or similar work to this course that you have submitted or will submit to another without permission. The teaching staff may use tools to compute correlations between submitted work. Use of AI Models Purpose of Policy: This policy outlines the acceptable use of AI models, including but not limited to ChatGPT, in completing assignments for this course. Policy Guidelines: Original Work: Students are expected to complete assignments using their original thoughts and interpretations. 
AI models can be used to help understand concepts, generate ideas, or learn about different perspectives, but they should not write or complete assignments for students. Collaboration with AI: Students may use AI models for brainstorming or generating preliminary ideas, but the final work submitted must be substantially their own. Students should be able to explain their reasoning, logic, and conclusions without relying on the model's output. Restrictions for Specific Assignments: There may be specific assignments (e.g. quiz part of the midterms) or parts of the course where the use of AI models is entirely prohibited. These restrictions will be clearly stated in the assignment guidelines. Ethical Considerations: Students are encouraged to approach the use of AI with ethical considerations in mind, including issues related to privacy, bias, and authenticity. Consequences for Non-Compliance: Failure to adhere to this policy may result in academic penalties as outlined in the course's academic integrity policy. Questions and Clarifications: If students have questions about the appropriate use of AI models in an assignment, they should consult the course instructor or teaching assistants before proceeding. Please refer to the University's policy for further information. Accessibility If you have a documented disability (physical or cognitive) that may impair your ability to complete assignments or otherwise participate in the course and satisfy course criteria, please contact the teaching staff or directly the Accessible Education Office to receive an AEO letter that will authorize us to help you with corresponding accommodations. Diversity Statement All participants in this class are expected to foster empathy and respect towards each other. This includes instructors, teaching staff or students. 
The motivation to take this course shall be to experience the joy of learning in an environment that allows for a diversity of thoughts, perspectives and experiences and honors your identity including race, gender, class, sexuality, religion, ability, etc. Any constructive feedback for improving the class environment is welcome and I encourage you to reach out to the instructor or teaching staff with any concerns you may have. If you prefer to speak with someone outside of the course, you may find helpful resources at the Harvard Office of Diversity and Inclusion .","tags":"pages","url":"pages/syllabus.html"},{"title":"Tutorials","text":"How to Setup your Private Class Repository Steps to Setup Your Private Class Repository Add an SSH Key to Your Account Homework Workflow Example Homework Workflow Step 1: Branch Off Step 2: Solving the Homework Step 3: Create a Pull Request Creating a Web Pull Request Step 4: Submit on Gradescope Pair-programming Workflow Protocol How to launch tmate Recommended Workflow How to Setup your Private Class Repository All of your work in CS107/AC207 will be committed in your private class git repository hosted in the CS107 organization at https://code.harvard.edu/CS107 . (The class project will be hosted in another repository in the same organization, see Milestone 1 for this separate task.) This tutorial walks you through the steps to create your private class repository. If you have already created git repositories on GitHub, then there is nothing new to learn in this tutorial and you should be familiar with the process already. A note on https://code.harvard.edu/ : this is an instance of a GitHub Enterprise edition hosted by Harvard University. The user interface is identical to the public GitHub site. The main difference is that https://code.harvard.edu/ is owned by Harvard University , whereas GitHub belongs to Microsoft which gives rise to security concerns regarding data belonging to classes held at Harvard University. 
Steps to Setup Your Private Class Repository Obtain your Harvard NetID Send an email to cs107-staff@g.harvard.edu (using your .harvard.edu email) to request access to the CS107 organization . Include your NetID from step 1 in the body of the email and choose an appropriate subject line. Once added to the organization, navigate to https://code.harvard.edu/CS107 (login if necessary) and click the green \"New\" button to add a new repository. Your repository must be named after your NetID . You can add an optional description if you like. Make sure the private radio button is checked and click \"Create repository\". You do not need to check any other options. This is all you have to do for now. In the first homework we will focus on how to setup your new repository such that you can work with it from your laptop (you can skip the landing page after you have created the repository). When you navigate back to https://code.harvard.edu/CS107 you should see something similar to this: The blurred repository is your private class repository that was the focus of this tutorial. The main repository is the main CS107/AC207 class repository which is used to distribute all of the class material during the semester. Any updates to this repository will be broadcast via email message such that you will not miss out on new material. In the first homework we will set this repository as an upstream such that you can conveniently unpack class material into your private repository. Note: private repositories are only visible to you within the organization. Please do not create other repositories in the https://code.harvard.edu/CS107 organization. You have your own user account on https://code.harvard.edu/ just like you have on GitHub or other providers. Your user account requires your Harvard login credentials and is a good alternative to hosts like GitHub. Feel free to create as many repositories in your user account as you like. 
Add an SSH Key to Your Account In order to access content on https://code.harvard.edu using Git you need to set up an SSH key. Check if you already have the file ~/.ssh/id_rsa.pub (assuming RSA). If you do not have such a file you can create one with ssh-keygen -t rsa -b 4096 Choose the default location by just hitting enter. You may enter a password for the key or just hit enter to go without a password. If you go with a password, you will have to enter it every time you use the key. To upload the public key to your Harvard GitHub account , click on your icon in the top right corner on your https://code.harvard.edu page, then click on \"Settings\" and then \"SSH and GPG keys\" in the left panel. Alternatively use this link https://code.harvard.edu/settings/keys . Click on the green \"New SSH key\" button in the top right corner and give your new key a title (e.g. the name of your laptop). In the key field paste the contents of your public key found in ~/.ssh/id_rsa.pub . Use for example cat ~/.ssh/id_rsa.pub and copy and paste the output into the \"Key\" field on your GitHub page. You are now able to access any repositories on https://code.harvard.edu with corresponding permissions. Never share your private key ~/.ssh/id_rsa with anybody. Note: do not create a key in the class Docker container since the key will be lost when you exit the container. For security reasons, sensitive keys like this should not be put in containers. Homework Workflow The following are the basic rules we apply for homework submissions: Naming convention for homework directories: your private repository should contain one homework directory on the repository root with hwX sub-directories for each homework assignment. The X in hwX is to be replaced with the assignment number. For example hw1 , hw2 and so on. Which files will be considered for grading: within the sub-directory hwX , place the assignment files that you want us to grade in a directory called submission .
We will only grade data in these directories . Pull request (PR): your homework assignments must be completed on git branches called hwX , where X is again to be substituted with the assignment number. Your homework X submission requires an open pull request to merge the hwX branch into your main (or deprecated master ) branch for full points (both branches are inside your private class repository in the CS107 organization ). Some implications of this: Solving homework on the main or master branch is always wrong. For each homework submission you need to issue one open PR. Merging an open PR before the teaching staff has reviewed and graded your work will make the PR disappear . Only files inside submission in PR X will contribute to your hwX grade (see next item). Gradescope: your homework will be graded on the Gradescope platform that has been setup and linked to the class canvas page. The platform does currently not support submission directly via your Git repository. You therefore have to create a zip archive of your submission directory created in step 2 above and upload the archive on Gradescope . It is important that you zip-up the directory and not individual files inside. You can use the command zip -r submission.zip submission/ , where the -r option means add files recursively , submission.zip is the name of zip archive and submission is your homework submission directory from step 2 containing your solution. This assumes you are in homework/hwX inside your Git repository. Points will be lost if any of these requirements are violated . The teaching staff will review the open PR for each homework and grade your work accordingly. Grades will be released on canvas and feedback is provided through the Gradescope platform. Once you have received the grade and feedback, your open PR for homework X can be merged into your main or master branch if there are no more pending issues. After the PR is closed, you may delete the hwX branch in your repository. 
This concludes a homework submission. Example Homework Workflow This example is intended to help you internalize the three basic rules described above. Each homework awards 10 points by performing these steps correctly. Note: Specific instructions provided in each homework assignment may override the following basic approach. Suppose we want to work on homework assignment 3, which consists of 4 problems. Step 1: Branch Off The ease of branching is the main strength of git . Branches allow you to be destructive without affecting production code or data. The reason we solve homeworks on individual branches is to help you develop a feel for this protection and to materialize the required steps to create branches. Branches will provide you true comfort when working on real projects outside of this class. Make sure your master or main branch is in the state you want your new branch to be based on. If you need to synchronize with your default remote branch you can type git pull The next step consists of creating and switching to a new branch that is based off the current branch. For this you can use git checkout -b hw3 which is how you did it before git 2.23.0 . Since the checkout command is ambiguous , the preferred way for more recent versions of git is git switch -c hw3 You are now on a new branch called hw3 as required. You will need to issue a pull request into main or master from this branch such that your homework will be graded. You can create the PR now (see below) or once you are done with solving hw3 , it does not matter to git . (Pull requests are not something designed by git itself, but rather by platforms like GitHub or GitLab.) Note: you will lose 5 points if you are not solving your hwX (in this example it is hw3 ) on a branch named hwX . You are of course free to create additional branches besides hwX if suitable. 
Step 2: Solving the Homework The files the teaching staff will consider for grading have to be located in the directory homework/hw3/submission . You are free to put other files below homework/hw3 that might be useful when you revisit your work sometime later. The problem sheet might be one of those files. Class handouts are distributed in the main class repository . You can manually create these directories and copy the files you want into your hw3 directory using, for example: mkdir -p homework/hw3/submission cp /homework/hw3/hw3.pdf homework/hw3 Alternatively you can use git by configuring the main class repository as another remote in your local git repository (see homework 1). In this case you can checkout all the distributed homework files at once with git checkout class/master -- homework/hw3 assuming that the remote points to https://code.harvard.edu/CS107/main and is locally named \" class \". You may need to update your refs with git fetch --all before you invoke the checkout command above. The homework sheet will state what files have to be submitted. For this hw3 we assume they are P1.py , P2.py , P3.py and P4.py , one for each of the four problems. These files should run and return the required output. They have to be submitted inside the homework/hw3/submission directory. You should commit your work often in logical chunks. Your commits are to be done on the hw3 branch, of course. The following are a few commands that might be helpful: Use git status often to check your local state. Use the git add command to stage files you have changed for a commit. Use git commit -m to create a commit with an appropriate commit message. Use git stash to temporarily stash modified files (similar to a commit but it is not written to the history). Later on use git stash list to list all your stashed changes if you have used git stash multiple times. 
You can check what will change when you apply the stash with git stash show -p and apply the stashed changes with git stash apply (or git stash pop which also removes the stash from the list). Note that these commands work on the first stash object in the list stash@{0} if you do not explicitly specify the stash object you want to apply. Use git push to push local branch/commits to your remote repository. Use git restore to undo changes to a single file. Use git revert to undo the changes in a specific commit. Make sure you have committed your solution you want to submit inside the homework/hw3/submission directory with the required file names. Step 3: Create a Pull Request If you have local commits not pushed to the remote issue the git push command. You are now ready to issue a pull request (you could also have done this step at the very beginning of solving this homework, this is up to you). The goal is to merge the hw3 branch into your main or master branch eventually. The teaching staff must review and grade your work first, however. There are two ways to accomplish a PR on GitHub: Through the web browser at https://code.harvard.edu/CS107/ . Through the GitHub command line client . This method is helpful if you get distracted from the context switch that is associated with the first method. Note: you will lose 2 points if you do not create a PR. Disclaimer: you would not typically issue a PR for projects you are the sole contributor. Pull requests are typical for large projects at a company in which someone else will review your code before you can merge your code to the production branch. We want you to become accustomed to this type of workflow. It is a good idea to always use separate development branches. You should never commit straight to your main or master branches until the changes have thoroughly been tested. 
Creating a Web Pull Request Navigate to your https://code.harvard.edu/CS107/ private class repository and click on the \"Pull Requests\" tab in the top left part of the window. Click on the \"New pull request\" button Choose your main or master branch as the base (the one you want to merge into) and your hw3 branch as the one you want to compare to. This should automatically reload the page and show the changes that will be applied. Click on the \"Create pull request\" button. You can optionally add comments to this pull request if you desire. Click on the \"Create pull request\" button once more to create and open the pull request. The pull request is now open. You can even push more commits to the hw3 branch if you need to correct something (before the deadline has passed of course). Therefore, you could also create the PR at the beginning of the homework. Note: DO NOT click on the button that says \"Merge pull request\" until you have received your grade and feedback for that homework. You will lose 3 points if you prematurely merge your PR. Step 4: Submit on Gradescope Your submission is now ready to be submitted for grading on Gradescope . Simply create a zip archive of your submission directory you have created in your Git repository, e.g. submission.zip , and upload it to Gradescope by following the link above. You can use the command zip -r submission.zip submission/ , where the -r option means add files recursively , submission.zip is the name of zip archive and submission is your homework submission directory. Since you track the change history of your work in Git, you should not add *.zip files to your Git history. You can simply ignore such archives by adding the line *.zip to your .gitignore file in your repository root. Pair-programming Workflow Exercises performed during pair-programming sections should be put under version control similar to homework assignments (see the Homework Workflow section above). 
You must not branch off and create a pull-request for pair-programming exercises . Just add and commit your work on the main or master branch and push it to your repository ( make sure you are on the correct branch before you commit! ). The following are the basic rules we apply for pair-programming submissions: Your private repository should contain a directory named lab with sub-directories for each session. The sub-directories should be named ppX where X is the session number. Within the sub-directory ppX , place the exercise files that you completed during the pair-programming sections. The exercises must have the name exercise_Y.ext where Y corresponds to the exercise number and ext is the proper extension ( .py , .sh , .c , .cpp ) depending on the exercise. Here is an example of how it may look: The pair-programming exercises will be graded for completeness and help us ensure you are on the right track. You may lose points for the completeness part if you do not follow these two basic rules. Protocol In class we are focusing on command line tools for the development of software projects in computational science. It is important that you get familiar with a small selection of such tools and integrate them in your development process. The pair-programming sections aim at combining some of these tools together to provide you with hands-on experience while developing software. The key is the \"pair\" in pair-programming. The exchange of knowledge between team mates in these pair-programming sections is essential for learning said tools or learning something new you did not know before. The pair-programming works by using a tool called tmate which is based on the tmux terminal multiplexer . It allows for easy sharing of a command line session or a specific instance of a program in read/write and read-only modes via ssh or web browsers. Check out this blog post for more.
Text file editing will be performed in any text editor that supports a text-based user interface (TUI). Recommended choices are vim or emacs . If you are mainly working on a Windows operating system, you should install the Windows Subsystem for Linux . A small guide for doing so can be found here . See the How to Launch tmate section below for the steps to launch tmate on your laptop. Note: tmate is perfect for any coding related communication. For example, debugging, work on the project and of course pair-programming. There is no audio channel integrated in tmate . If students are remote, a zoom session or similar must be established for oral communication. The exercises in the pair-programming sections are necessarily collaborative. Each member of the group will turn in the same script. Adhere to the following workflow when solving pair-programming exercises: For each exercise (or sub-exercise for big problems), there will be one sharer , one coder , and one listener . This assumes a group size of 3 . If the group only has two people, then either one of you can take the sharer's role. The sharer will start each coding session and document interactions including points of contention and challenges. The coder will be in charge of writing the code. The listener will make suggestions and may offer tweaks from time to time. The sharer starts a new tmate session and invites the other team mates to join the session. Ideally you want to start the session inside the directory of the current pair-programming exercise in your git repository. You may share a read/write link either through ssh or a web browser. Note: the sharer allows the others access to her/his computer. Any abusive behavior that may cause harm on the sharer's system will not be tolerated and will be forwarded to the dean's office. After the team mates have accepted the invite, they will be able to share the terminal instance and can create new files or execute Python together.
The team should discuss a strategy on how to approach the exercise. The coder should start writing some code with input from the other two team members. Before each section that you work on, place a comment indicating which team member worked on that section. For example, a bash script could look like this: #!/usr/bin/env bash # File : exercise_1.sh # Created : Sat Aug 07 2021 04:58:49 PM (-0400) # Coder : Alice # Listener : Bob # Sharer : Alice echo 'Hello World' ### Main point of contention: whether to capitalize \"W\" in \"world\" For small exercises, each team member can play a single role once. For large exercises, the team members may rotate roles. The exercise will make it clear when you should rotate. At the end of the exercise, the developed code is inside the sharer's git repository that can readily be committed. Links to download these files can be shared with the other team mates such that they can update their repositories as well. Note that the exercise code will contain comments pertaining to who worked on which section. How to launch tmate Disclaimer: tmate is a tool to share a terminal session and interact with other people. If you host a session, it means the instance runs on your local computer and you are in control of how many permissions you want to assign to your mates. There are 2 ways to share a session: read-only: mates that connect to your session can only read data (this is safe provided the data you expose is safe). read-write: mates that connect to your session can read and write data. This is unsafe if you share a session with an untrusted person. Recommended Workflow Launch the CS107/AC207 docker container with the working directory mounted (see the provided run_cs107_docker.sh launch script ) Start tmate inside the docker container Wrapping tmate in a docker container provides another layer of security.
You can also install tmate using your distribution package manager (on Linux or homebrew on MacOSX) and skip step 1 if you wish not to use a container. Steps: Assume Docker is installed and we have pulled the CS107/AC207 docker image . You can install the run_cs107_docker.sh launch script in your PATH for convenience (e.g. ~/bin/run_cs107_docker.sh and add this directory to your PATH environment variable). Be sure that the run_cs107_docker.sh script is executable . See the chmod command to change the permissions of the script. Assume you want to work on the PP1 exercise and you are in the lab directory of your private Git repo and pp1 exists. Launch the docker container and mount the pp1 directory in your repository: $ run_cs107_docker.sh pp1/ root@0a076feb425f:~# You are now inside a running docker container. Note that the hostname 0a076feb425f is arbitrary and yours will differ. Launch tmate (it is already installed in the container): root@0a076feb425f:~# tmate Tip: if you wish to use tmate only for remote access, run: tmate -F To see the following messages again, run in a tmate session: tmate show-messages Press or to continue --------------------------------------------------------------------- Connecting to ssh.tmate.io... Note: clear your terminal before sharing readonly access web session read only: https://tmate.io/t/ro-qNRV5QRVWkW3qr55sfATkBegr ssh session read only: ssh ro-qNRV5QRVWkW3qr55sfATkBegr@nyc1.tmate.io web session: https://tmate.io/t/nMWurZc7Q6Zbv8EnX2wdhf6GB ssh session: ssh nMWurZc7Q6Zbv8EnX2wdhf6GB@nyc1.tmate.io The tmate instance is now running and you can choose between 4 possible links to share with your mates: 2 that can be run in your web browser and another 2 to be used with ssh in your terminal (either read-only and read/write). Choose the appropriate link you want to share with your pair-programming mates. If you press q or ctrl-c you are dropped back to the shell. The server will tell you whenever mates join. 
You can print the links again with tmate show-messages (be careful when you are sharing screens on zoom for example). Note that pressing ctrl-d or typing exit in the shell will close the active terminal and if only one is left, also the active tmate session. This will close the connections to all connected clients. You can now work together on the exercise. For example: root@0a076feb425f:~# vim exercise_1.py The image below shows a terminal session (left) and two mates connected in a web browser window (right): Note: We run tmate with root in the docker container. Do not run tmate as root in any other situation (even here we could create a regular user) and be careful with password-less sudo (avoid password-less sudo in the first place). In order to use ssh you need to set up an ssh key if you have not done so already. If you do not have such a key, you may create one by running ssh-keygen -t rsa -b 4096 If you are not dropped into a shell after you execute tmate it may be because you are using a shell different than zsh . Install zsh on your system using your package manager and run tmate like this SHELL=/bin/zsh tmate","tags":"pages","url":"pages/tutorials.html"},{"title":"Milestone 6 (Final)","text":"Sunday, December 17th, 09:59 PM In this final milestone , you're tasked with delivering the library as outlined in the Contract. Library Your goal is to publish the first version of your library. Once the documentation is complete, it passes all tests, and every feature is implemented, you can merge it into the main branch. Use this branch to publish the library on Test PyPI. Each module will be graded and is worth 15 points: six for unit tests, six for implementation, and three for documentation. Each library implementation will have a different number of modules and they will be given points. In this milestone, we will grade the remaining X ungraded modules. All the integration tests must be implemented to check the API you defined (and refined) earlier.
If the integration test suite is incomplete, your score will be reduced. Tutorials and video For the final presentation, you need to create a Jupyter notebook that explains your library's functionalities. This includes demonstrating how to execute required functions and handle exceptions. Upload a video to YouTube or Google Drive in private or shared mode. You must show the real-time installation and execution of the library. Every team member must participate . Feel free to use any VM, virtual environment, or Docker container. The explanation in the video should not exceed 7 minutes. While there's no strict limit on installation time, please keep it within a reasonable duration. Write the link at the end of your Jupyter notebook. It is your responsibility to give the appropriate permissions. If the video is not present or not accessible by the teaching staff, you will not receive the points. Self-evaluation Each team member is required to upload an estimate of the hours they dedicated to the project in the dev branch of your repository. Along with this, include a brief summary of your main contributions. Submit these estimates through pull requests, which should not be accepted by the author . We'll review the commit history of the repository. Special considerations may be made in exceptional cases. Final Deliverables For this milestone, all final deliverables should be uploaded exclusively to the Github page. There's no need to send emails. Publish your library on Test PyPI. Include the link in the main branch's README file. A Jupyter notebook that showcases how your library works. Place it in a new tutorial/ directory on the dev branch. Record and upload a video presentation demonstrating the library's installation and functionality. Each team member's self-evaluation in the dev branch README. Exceptions Submission of the Jupyter Notebook example is mandatory, while the video is optional. Both serve to demonstrate your work.
If you choose not to submit the video: No points will be awarded for 'Library on Test PyPI' and 'Video Presentation' if the library installation fails. If the library installation succeeds but tests fail, you will receive points for 'Library on Test PyPI' but lose points for 'Video Presentation'. Points for both 'Library on Test PyPI' and 'Video Presentation' will be awarded if the installation and tests are successful. Grading breakdown Points Task 15 Library on Test PyPI 15X X dev of remaining modules 60 Integration tests 60 Jupyter notebook tutorial 45 Video presentation 45 Self-evaluation 225+15X Total","tags":"Project","url":"project/M6/"},{"title":"Milestone 5","text":"Monday, December 11th, 09:59 PM With the development in full swing, many modules should now be ready. This milestone is to ensure that some of these modules work together correctly. API tuning During project development, you'll gain insights about the structure and modules. It's possible that the initial API isn't ideal, so you might need to revise it. This involves updating the API documentation, diagrams, and, most importantly, the code to align with the new API. If your API draft still meets the contract requirements, you can choose not to modify it. In this case, add a small appendix explaining why it remains unchanged. In the remainder of this document, we'll refer to the latest version of the API as the modified version. Features and Integration Developing individual features is usually straightforward. The real challenge lies in integrating them smoothly. For this item, develop features from different modules and conduct integration tests using GitHub Actions. You can merge features into the dev branch once they pass your tests. However, keep the feature development branches until reviewed by the teaching staff. The teaching staff will grade the dev branch. Make sure to commit often in the local repository, as it is part of the evaluation . 
Tests must be committed and pushed before any code is written. Design and write your integration tests based on the modified API. Work on at least two consecutive modules. This includes writing unit tests, coding, and documentation (docstring). If the modules are already created: Merge the integration tests into the dev branch. If only one module is developed: Merge the integration tests into dev before starting the second module. If no modules are created: Merge the integration tests into dev first, before any module development. Successful integration of two modules is confirmed when all integration tests pass. Note : Regularly commit to your local repository and tidy up the history as needed. Push your changes only after passing the unit tests. SRS clarifications and modifications Your main focus should be to complete the pipeline. Use placeholder functions if necessary. Any modifications, as per the contract, should be straightforward. Clarifications of the Software Requirements Specifications For Annex A: 3.A: Each task listed in this item should be applied to one spectrum. 3.B: Include the class (STAR, GALAXY, or QSO) in the metadata. The spectrum is the data; metadata is everything else related to it. 3.C: Aligning in wavelength means sub-sampling the wavelengths. For a given list of target wavelengths, return a flux value for each. 4: The inferred continuum is a line derived from the data, excluding any emission or absorption lines. 5: Data augmentation module execution is optional. If used, the user inputs the degree of required derivatives. For Annex B: 2: The machine learning module should primarily use spectral data. It can include other metadata. The results should report a confusion matrix. 3: Total flux of spectral lines must use an inferred continuum. The method for calculating line area is defined by Developer, and should be well-documented and easily modifiable.
Modifications of the Software Requirements Specification Annex A - 3.B: Replace chemical abundances with the value of Equivalent Width for each line detected by the SDSS pipeline. Annex B - New task: Report the chemical abundances of stars from the APOGEE survey . Steps to complete Re-evaluate the document written in the folder API_draft . Make the required modifications both in your diagram and the document. Based on the modified API, reorganize your library in the dev branch. Write the integration tests on the dev branch. Complete the implementation of at least two modules of your choosing and test their integration. Every change in the library should trigger integration tests via Github Actions. In milestone5 , describe the rationale behind any API changes. This should include: Why the API was modified, focusing on how the changes improve functionality, usability, or adaptability to project requirements. Discuss how these changes enhance the integration of different modules. List the specific module names that need to be evaluated by the teaching staff for their integration. Final Deliverables The final deliverables for this milestone should be uploaded only on the Github page. No emails are necessary. Updated API document and its diagram. Place these in the draft_API folder. The docs/ directory should include a document called milestone5 . Integration tests for at least two modules. Note 1 : By now you should have implemented many modules, and most of the items requested in the milestone. Note 2 : The reading period ranges from December 6th to December 10th. You are highly encouraged to submit earlier (notifying your liaison by email) to receive feedback. Grading breakdown Points Task 15 API 20 Modules 20 Integration tests 55 Total","tags":"Project","url":"project/M5/"},{"title":"Milestone 4","text":"Due: Tuesday, November 28th, 09:59 PM You will now start the development of the library modules.
As part of test-driven design, you should first write the tests of a functionality, and then write the code, based on the API you defined in Milestone 3. Software Organization Before any code is written, discuss how you plan on organizing your software package. With the idea of classes/modules in mind, organize your code according to the API identified in milestone 3. This is a more detailed organization and should reflect said API (If you already did this for milestone 3, you may use it for milestone 4 or expand it if you need it). What will the directory structure look like? Where will your test suite live? How will you distribute your package (e.g. PyPI with PEP517/518 or simply setuptools )? Other considerations? Describe your choices in the milestone4 document (even if you already wrote it for milestone 3). You have to follow the guidelines shown during lectures. Licensing Licensing is an essential consideration when you create new software. You should choose a suitable license for your project. A comprehensive list of licenses can be found here . The license you choose depends on factors such as what other software or libraries you use in your code ( copyleft , copyright). Will you have to deal with patents? How can others advertise software that makes use of your code (or parts thereof)? You may consult the following reading to aid you in choosing a license: Helper to choose a license Licenses License recommendations License compatibility Extensive list of open source licenses Briefly motivate your license choice in the milestone4 document and add a LICENSE file to the root of your project. Implementation New features should be developed on independent branches. For this, you will leave the main branch only for the code graded by the teaching staff. The branch dev will contain the code in development. You are free to create as many branches as you need and merge them into dev . 
Do not delete the branches used to develop new features until the teaching staff review them. Remember that you can also create as many workflows as you want. Select at least one module identified in Milestone 3 and implement it. What method and name attributes will your class have? What methods and attributes will you expose to the user? Do you want/need to depend on other libraries? (e.g. NumPy) Write a comprehensive test suite for this module(s), according to your API. This might change in the future as you learn more about your code. Write the code of the module along with its documentation. Note : You can commit multiple times to your local repository and clean the local history if needed. Push only when you pass the tests. Steps to complete In the main branch and within your docs sub-directory, create a file called milestone4 . The type of file is up to you and your group. Two acceptable choices are markdown ( milestone4.md ) or a Jupyter notebook ( milestone4.ipynb ). Your milestone4 document submission should be in the following format: teamXX/ ├── docs │ └── milestone4 ├── LICENSE ├── README.md └── ... Describe the software organization and licensing in milestone4 . Create branch dev . Create the branch featurename to implement your module. Replace \"featurename\" with the name of the module you want to implement. In that branch, write the tests for the module you want to implement. You should commit them before writing/pushing any other code . Write the code for the module you wrote the tests for. Every test for the modules should pass. The code coverage must be at least 90%. Merge the branch into dev . (Optional) You can implement another feature, following steps 5-10. You are encouraged to develop the main modules as soon as possible, to focus on the integration later on. Final Deliverables The docs/ directory should include a document called milestone4 (the extension is up to you, but .md or .ipynb are recommended). Proper licensing of your project. 
Tests and implementation for your module(s). Grading breakdown Points Task 4 Software Organization 4 License 15 Implementation 15 (optional) Additional implementation 23(38) Total","tags":"Project","url":"project/M4/"},{"title":"Milestone 1","text":"Due: Thursday, November 2nd, 09:59 PM You will now begin your final project to develop a Python package for astronomical research. Please get together with your project group and complete the tasks below for Milestone 1. Steps to complete Find team members you would like to work with and establish a way to communicate. Register your team on Canvas. Send an email to cs107-staff@g.harvard.edu with your team number and members. You should also create your team name, which will be used to represent your team. Your team ID will be team01 if you are Project - Group # 1 or team10 if you are Project - Group # 10 and so on. Final Deliverables Form a project team and communicate with the teaching staff. Grading breakdown Points Task 1 Team formation 1 Total","tags":"Project","url":"project/M1/"},{"title":"Milestone 2","text":"Due: Thursday, November 9th, 09:59 PM You are required to review the mockup contract and identify the critical elements that need to be developed. This exercise is designed to emulate a professional environment and is strictly for simulation purposes; it carries no legal obligations. Software Requirements Specification (SRS) We expect you and your team to read and understand the SRS. From the SRS, identify the API your library should present. We expect you and your team to thoroughly read and comprehend the SRS. Steps to complete Every team member must sign the contract and upload the signed document into the root folder of the team's repository. 
Grading breakdown Points Task 1 Uploading the signed contract 1 Total","tags":"Project","url":"project/M2/"},{"title":"Milestone 3","text":"Due: Tuesday, November 14th, 09:59 PM You will now further configure your group repository Software Requirements Specification (SRS) Based on the SRS, you are to identify the Application Programming Interface (API) that your library is required to provide. Git Conventions We expect all work from this point onward to be done on feature branches and merged into master or main via Pull Requests. Try to work with different branches and \"approve\" each other's pull requests by reviewing their code and then merge into your default project branch. You must work with your project Git repository. The teaching staff will frequently check the history of your project. Steps to complete Create a private team repository in the CS107 organization. The project code will be hosted in private repositories within the CS107 organization . A member of your team must create a private repository named after your team ID (e.g., team01_2023 for Team 1). After creating the repository, add all team members to it. The teaching staff will have automatic access and do not need to be added. Within your project repository, create a folder named API_draft . Inside this folder, provide a README file detailing the modules, classes, and functions planned for inclusion to meet the SRS requirements. Use this phase to outline your pipeline and begin task allocation among team members. Don't try to be overly specific on the details, as this is likely to change as your code evolves. Within the API_draft folder, also upload a schematic diagram that illustrates the modules and the API structure your library will present. Within your project repository, you must set up two workflows with GitHub Actions . One workflow will be used for tests and the other for code coverage . You will need two .yml files in the .github/workflows directory in your project repository. 
The .yml files do not need to have meaningful declarations at this point but you should have at least the name: option and the on: option defined. See this link for more details. Make sure the README.md file at the root of your repo includes badges indicating whether your CI workflows are passing or failing. Your workflows are expected to be failing at this point. You should end up with a rendered README.md file that looks like this (workflows may fail or have no status ): In the root of your project repo, you should create a directory called docs . You can use this directory to organize documentation and tutorials for your final package. You will begin creating this documentation as part of the next milestone. Grading breakdown Points Task 1 Creation of team repository 4 Describing the API 2 Diagram 5 Configuring test action 5 Configuring coverage action 4 Creating project structure 21 Total","tags":"Project","url":"project/M3/"},{"title":"Systems Development for Computational Science","text":"Computation has emerged as the third pillar of science alongside the pillars of theory and experiment. Computational science is maturing rapidly and has found considerable and significant use in supporting scientists from various disciplines (including all engineering disciplines, mathematics, physics, chemistry, finance, biology, and data analysis to name a few). Many burgeoning scientists are still taught to write \"a code\" for some problem and to debug when things look wrong. Given the ever-increasing complexity of software solutions to scientific problems, this old paradigm is no longer tenable and at best inefficient. CS107/AC207 is an applications course highlighting the use of software engineering and computer science in solving scientific problems. You will learn the fundamentals of developing scientific software systems including abstract thinking, the handling of data, and assessment of computational approaches: all in the context of good software engineering practices. 
The class syllabus can be found by following this link. Teaching Staff The preferred way to reach the teaching staff is described in the Teaching Staff Mailing List section below. Instructor Ignacio Becker ( iebecker@g.harvard.edu ) Office: SEC, Office 1.312-05 Office Hours: Wed 5:00-6:00pm Teaching Fellows Fellow Email Office Hours Pair-Programming Sections Kimon Vogt kvogt@g.harvard.edu Sat 8:00-10:00am (Zoom) Mon 8:00-9:15am (Zoom) Fri 8:00-9:15am (Zoom) Yixian Gan ygan@g.harvard.edu Tue 5:00-6:00pm (SEC 6.301+6.302) Mon 6:00-7:15pm (SEC 6.301+6.302) Allison Karp akarp@mde.harvard.edu Thu 9:30-10:30am (SEC 6.301,6.302) Tue 9:30-10:45am (SEC 6.301+6.302) Gekai Liao gekailiao@g.harvard.edu Thu 4:00-5:00pm (MD PierceHall 100F) Tue 3:45-5:00pm (SEC 6.301+6.302) Victor Zhu dunminzhu@g.harvard.edu Mon 10:00-11:00am (Zoom) Thu 6:00-7:15pm (Zoom) Frank Cheng xcheng@g.harvard.edu Fri 4:00-5:00pm (SEC 2.122+2.123) Thu 1:00-2:15pm (SEC 2.122+2.123) Danni Lai danninglai@g.harvard.edu Wed 4:00-5:00pm (Zoom) Tue 7:00-8:15pm (Zoom) Isabella Bossa isabellabossa@g.harvard.edu Tue 10:45-11:45am (MD 223) Thu 8:00-9:15am (MD 123) Tanner Marsh tam997@g.harvard.edu Fri 10:00-11:00am (SEC 2.112) Thu 1:00-2:15pm (SEC 2.122+2.123) Boxiang Wang bwang@g.harvard.edu Mon 1:00-2:00pm (SEC 6.301+6.302) Thu 3:45-5:00pm (SEC 4.405) Shuheng Liu shuheng_liu@g.harvard.edu Mon 7:00-8:00pm (Zoom) Tue 7:00-8:15pm (Zoom) Cyrus Asgari cyrusasgari@college.harvard.edu Wed 3:00-4:00pm (MD 123) Fri 12:00-1:15pm (SEC 2.122+2.123) Haitian Liu hliu3@g.harvard.edu Fri 2:45-3:45pm (SEC 2.122+2.123) Fri 1:30-2:45pm (SEC 2.122+2.123) Legend: SEC : Science and Engineering Complex, Northwestern Av 150, Allston MD : Maxwell-Dworkin, Cambridge Please see Pages section in Canvas for a Google Calendar. Lecture Hours All lectures are of 75 minutes duration. Time is given in Eastern Standard Time (Boston). 
Lecture attendance is mandatory : Time Room Tuesday 2:15 - 3:30 PM SEC 1.321 Thursday 2:15 - 3:30 PM SEC 1.321 Important Information Canvas: Is used for posting grades and other sensitive content. The class can be found on Canvas at this link https://canvas.harvard.edu/courses/122565 Class git repository: All handouts in CS107/AC207 are provided through the main repository hosted in the CS107 organization at https://code.harvard.edu/CS107/main . You can set this repository as an upstream in your private class repository or clone it once you have joined the CS107 organization git clone git@code.harvard.edu:CS107/main.git Updates to the main repository are posted on the class mailing list. Your Harvard ID is required to login to https://code.harvard.edu . You can request membership in the CS107 organization (AC207 students join the CS107 organization as well) by sending an email to cs107-staff@g.harvard.edu (using your .harvard.edu email). You must include your NetID in the body of your email, which is also your https://code.harvard.edu username (something similar to abc123 ). Once you have been added to the CS107 organization, create your own private repository inside the organization. Your private repository must have the exact name as your NetID . This will be your private class repository where you submit your homework and pair-programming exercises. See the following tutorial to help you get started with your git repository: How to setup your private class repository Class Discussion Forum We will use the Ed Discussion forum on our Canvas page as our main communication platform. Questions regarding homework, labs or lecture material must be posted on this forum and you are encouraged to reply to questions if you know the answer or you can share a useful contribution. A fraction of your participation grade is computed by how often you visit and the frequency you post on the forum. 
Class Mailing List You can optionally sign up to our class mailing list if you would like to be notified whenever there is new class content available in the class git repository. Replies to posts in this list will be sent to all list members. To sign up, send an email to: cs107+subscribe@g.harvard.edu (subscribe by sending a blank email to this address; use the email address associated with your HarvardID ) You are required to confirm your subscription. Simply reply to the confirmation email with a blank message to complete the subscription. Teaching Staff Mailing List You can reach the teaching staff directly by sending your email to the following mailing list cs107-staff@g.harvard.edu (email sent to this list is only seen by the teaching staff; only email ending with .harvard.edu is accepted) You are not required to register for this mailing list but only email addresses ending with .harvard.edu are accepted (you will receive a rejection message otherwise). Getting Started Checklist Sign up with the CS107 organization on https://code.harvard.edu/CS107 and create your own private repository inside the organization . Information flow: Canvas → Grades and discussion forum https://code.harvard.edu/CS107 Assignment submissions inside your private repository (homework, pair-programming exercises) Group repositories for project work All course handouts are published in the https://code.harvard.edu/CS107/main repository Need help? → cs107-staff@g.harvard.edu OPTIONAL: Sign up on the class mailing list to receive push notifications when new content is available in the https://code.harvard.edu/CS107/main class repository. You can get an Ubuntu docker container with the necessary class tools by docker pull iacs/cs107_ubuntu . Note that no ssh keys are contained in that image for use with git . See also the docker resources page .","tags":"pages","url":"pages/systems-development-for-computational-science/"}]} \ No newline at end of file