Commit

revise for lesson ease
jm1021 committed Mar 6, 2024
1 parent eae929a commit cf7b593
Showing 1 changed file with 9 additions and 11 deletions.
20 changes: 9 additions & 11 deletions _notebooks/2024-03-05-DS-python-pandas-df_titanic.ipynb
@@ -239,20 +239,18 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"## Machine Learning <a href=\"https://www.tutorialspoint.com/scikit_learn/scikit_learn_introduction.htm#:~:text=Scikit%2Dlearn%20(Sklearn)%20is,a%20consistence%20interface%20in%20Python\">Visit Tutorials Point</a>\n",
-"> Scikit-learn (SciPy Toolkit) is the most useful and robust library for machine learning in Python. It provides a selection of efficient tools for machine learning and statistical modeling including classification, regression, clustering and dimensionality reduction via a consistence interface in Python.\n",
+"## Machine Learning \n",
+"<a href=\"https://www.tutorialspoint.com/scikit_learn/scikit_learn_introduction.htm#:~:text=Scikit%2Dlearn%20(Sklearn)%20is,a%20consistence%20interface%20in%20Python\">Visit Tutorials Point</a>\n",
 "\n",
-"- Description from ChatGPT... The Titanic dataset is a popular dataset for data analysis and machine learning. \n",
+"> Scikit-learn is a powerful Python library for machine learning, offering tools for classification, regression, clustering, and dimensionality reduction.\n",
 "\n",
-"- In the context of machine learning, accuracy refers to the percentage of correctly classified instances in a set of predictions. In this case, the testing data is a subset of the original Titanic dataset that the decision tree model has not seen during training.\n",
-" - After training the decision tree model on the training data, we can evaluate its performance on the testing data by making predictions on the testing data and comparing them to the actual outcomes. The accuracy of the decision tree classifier on the testing data tells us how well the model generalizes to new data that it hasn't seen before.\n",
-" - For example, if the accuracy of the decision tree classifier on the testing data is 0.8 (or 80%), this means that 80% of the predictions made by the model on the testing data were correct.\n",
-" - Chance of survival could be done using various machine learning techniques, including decision trees, logistic regression, or support vector machines, among others.\n",
+"- The Titanic dataset is a classic for data analysis and machine learning. We'll use machine learning techniques like Decision Trees and Logistic Regression to predict passenger survival. \n",
 "\n",
-"- Code Below prepares data for further analysis and provides an Accuracy. \n",
-" - [Decision Trees](https://scikit-learn.org/stable/modules/tree.html#tree), prediction by a piecewise constant approximation.\n",
-" \n",
-" - [Logistic Regression](https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression), the probabilities describing the possible outcomes."
+"- [Decision Trees](https://scikit-learn.org/stable/modules/tree.html#tree) are a type of model used for both classification and regression. They work by creating a tree-like model of decisions based on the features. For example, in the context of the Titanic dataset, a Decision Tree might make decisions based on features like 'age', 'sex', and 'fare' to predict whether a passenger survived. The tree might first split by 'sex', then for each sex, split by 'age', and so on, creating a tree of decisions.\n",
+"\n",
+"- [Logistic Regression](https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression) is a statistical model used in machine learning for binary classification problems. It models the probabilities of the default class (e.g., the probability of a passenger surviving, in the context of the Titanic dataset). \n",
+"\n",
+"- After training our models, we'll evaluate their performance using accuracy, the percentage of correct predictions on unseen data."
 ]
 },
 {
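
The markdown cell changed in this commit describes two ideas: a decision tree that splits first on 'sex' and then on 'age', and accuracy as the share of correct predictions. The sketch below illustrates both in plain Python. It is not the notebook's code. The passenger rows, the `predict_survived` rule, and the age threshold are hypothetical, and a real tree trained with scikit-learn would learn its own splits from the data.

```python
# Hypothetical toy rows for illustration only; NOT taken from the Titanic dataset.
passengers = [
    {"sex": "female", "age": 29, "survived": 1},
    {"sex": "male", "age": 8, "survived": 1},
    {"sex": "male", "age": 40, "survived": 0},
    {"sex": "female", "age": 50, "survived": 0},
    {"sex": "male", "age": 30, "survived": 1},
]


def predict_survived(passenger):
    """Hand-written two-level decision rule: split on sex, then on age."""
    if passenger["sex"] == "female":
        return 1
    # Assumed threshold; a trained tree would choose this from the data.
    return 1 if passenger["age"] < 10 else 0


def accuracy(actual, predicted):
    """Fraction of predictions that match the actual labels."""
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return correct / len(actual)


actual = [p["survived"] for p in passengers]
predicted = [predict_survived(p) for p in passengers]
score = accuracy(actual, predicted)
print(f"accuracy = {score:.2f}")
```

On these toy rows the rule gets three of five labels right, so the printed accuracy is 0.60. In the lesson itself, the same comparison would be made against held-out test rows that the model did not see during training.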
