The AI software developed in this project focuses on enhancing diagnostic processes in healthcare. The AI is trained on a comprehensive dataset comprising various symptoms associated with common diseases. It predicts which disease a person is most likely suffering from based on the symptoms they present.
The main focus of the project is to improve efficiency in healthcare, saving healthcare professionals both time and effort and empowering patients to participate actively in their healthcare journey. It is important to emphasize that the software is not intended to replace healthcare professionals, but rather to complement them and serve as a valuable tool to enhance diagnostic accuracy.
During the development of this project, I designed, tested, and compared six different AI models. This evaluation aimed to identify the strengths and weaknesses of each model in order to choose the one that would work best for the given problem and dataset. By harnessing the potential of AI, this software represents a promising leap forward in healthcare, with the aim of revolutionizing medical decision-making, reducing errors, and optimizing healthcare resources.
NumPy:
• Fundamental library for scientific computing
• Provides support for large, multi-dimensional arrays and matrices, along with a collection of mathematical functions to operate on these arrays
• Used for data manipulation, mathematical operations, linear algebra, and statistical analysis
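As a rough illustration of the array handling described above, the sketch below encodes presented symptoms as 0/1 vectors and stacks them into a matrix. The symptom names and the `encode_symptoms` helper are hypothetical; the project's actual feature list is not given in this text.

```python
import numpy as np

# Hypothetical symptom vocabulary (an assumption, not the project's real features).
SYMPTOMS = ["fever", "cough", "headache", "fatigue", "nausea"]


def encode_symptoms(present):
    """Return a 0/1 vector with a 1 for each symptom named in `present`."""
    index = {name: i for i, name in enumerate(SYMPTOMS)}
    vector = np.zeros(len(SYMPTOMS), dtype=np.int8)
    for name in present:
        vector[index[name]] = 1  # raises KeyError for an unknown symptom
    return vector


# One row per (made-up) patient, one column per symptom.
patients = np.stack([
    encode_symptoms(["fever", "cough"]),
    encode_symptoms(["headache", "fatigue", "nausea"]),
    encode_symptoms(["fever", "fatigue"]),
])

# Fraction of patients reporting each symptom (column-wise mean).
prevalence = patients.mean(axis=0)
```

A binary matrix of this shape is a common input format for the kinds of models compared in the project, though the real encoding may differ.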
SciPy:
• Builds on NumPy and offers additional functionality for scientific and technical computing
• Includes modules for optimization, integration, interpolation, signal and image processing, linear algebra, etc.
• Used for solving complex mathematical problems and conducting scientific experiments
Keras:
• Used for developing and experimenting with deep neural networks
Pandas:
• Provides data structures and functions for working with structured data
• Suitable for data analysis and cleaning
• Used for data preprocessing, cleaning, exploration, and transformation
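The preprocessing and cleaning steps listed above might look roughly like the following sketch. The records, the column names (`disease`, `symptom_1`, `symptom_2`), and the `preprocess` helper are all made up for illustration; the project's real dataset layout is not described in this text.

```python
import pandas as pd

# Made-up records; the column names are assumptions.
raw = pd.DataFrame({
    "disease": ["Flu", "Flu", "Migraine", "Flu "],
    "symptom_1": ["fever", "fever", "headache", "cough"],
    "symptom_2": ["cough", "cough", None, "fatigue"],
})


def preprocess(df):
    """Clean labels, drop duplicate rows, and build 0/1 symptom columns."""
    df = df.copy()
    df["disease"] = df["disease"].str.strip()  # remove stray whitespace
    df = df.drop_duplicates().reset_index(drop=True)
    symptom_cols = [c for c in df.columns if c.startswith("symptom_")]
    # Stack the symptom columns into one long Series, drop missing entries,
    # one-hot encode, then collapse back to one row per patient.
    long = df[symptom_cols].stack().dropna()
    onehot = pd.get_dummies(long).groupby(level=0).max().astype(int)
    return df[["disease"]].join(onehot)


clean = preprocess(raw)
```

The result has one label column plus one 0/1 column per distinct symptom, which is a convenient form for model training.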
Seaborn:
• Used for data visualization (high-level library)
Tabulate:
• Used for formatting and displaying data in tabular form
Matplotlib:
• Used for data visualization (low-level library)
Scikit-learn:
• Used for building, training, and evaluating machine learning models for a variety of data science tasks
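A model comparison of the kind carried out in this project could be sketched with scikit-learn as below. This is only an illustration under stated assumptions: the six models actually compared are not named in this text, so the three classifiers here are stand-ins, and the data is synthetic rather than the project's dataset.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import BernoulliNB
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Synthetic stand-in data: 3 hypothetical diseases, 8 binary symptoms.
n_diseases, n_symptoms, n_per_class = 3, 8, 60
# Each disease gets its own probability of showing each symptom.
profiles = rng.uniform(0.1, 0.9, size=(n_diseases, n_symptoms))
y = np.repeat(np.arange(n_diseases), n_per_class)
X = (rng.random((y.size, n_symptoms)) < profiles[y]).astype(int)

# Stand-in candidates (assumptions, not the project's six models).
candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "naive_bayes": BernoulliNB(),
    "decision_tree": DecisionTreeClassifier(random_state=0),
}

# Mean 5-fold cross-validated accuracy for each candidate.
scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()
    for name, model in candidates.items()
}
best = max(scores, key=scores.get)
```

Cross-validation is used here so that each candidate is scored on data it was not trained on; accuracy is only one possible metric, and a medical application would likely also look at per-class recall.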
Gradio:
• User-friendly library for building web interfaces to demonstrate and share machine learning models
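A minimal sketch of how a Gradio interface could wrap a symptom-based predictor is shown below. The symptoms, diseases, toy training rows, and the `predict` and `build_demo` helpers are hypothetical and do not reflect the project's actual model or data; the Gradio import is placed inside `build_demo` so the prediction function can be used on its own.

```python
import numpy as np
from sklearn.naive_bayes import BernoulliNB

# Hypothetical symptoms, diseases, and toy training rows (not the project's data).
SYMPTOMS = ["fever", "cough", "headache", "nausea"]
DISEASES = ["flu", "migraine"]
X_toy = np.array([[1, 1, 0, 0], [1, 1, 1, 0], [0, 0, 1, 1], [0, 0, 1, 0]])
y_toy = np.array([0, 0, 1, 1])
model = BernoulliNB().fit(X_toy, y_toy)


def predict(selected):
    """Map the ticked symptoms to a {disease: probability} dict."""
    row = np.array([[int(s in selected) for s in SYMPTOMS]])
    probs = model.predict_proba(row)[0]
    return {DISEASES[i]: float(p) for i, p in zip(model.classes_, probs)}


def build_demo():
    """Build (but do not launch) a web form around predict()."""
    import gradio as gr  # imported lazily so predict() works without Gradio

    return gr.Interface(
        fn=predict,
        inputs=gr.CheckboxGroup(choices=SYMPTOMS, label="Symptoms"),
        outputs=gr.Label(num_top_classes=2),
    )
```

Calling `build_demo().launch()` would start a local web server with a checkbox for each symptom and a ranked list of predicted diseases.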
While the project represents a promising leap forward, several limitations and areas for future improvement should be acknowledged.
Future Work:
• Introducing a user-friendly GUI that predicts disease probabilities from user-entered values, which would greatly enhance accessibility
• Including data on a broader spectrum of diseases
• Using more extensive and diverse datasets to train the AI model
Limitations:
• Time required for data acquisition and model training
• Small number of people involved in the project, which could benefit from the collaboration of a larger, more diverse team
• Limitations in the programming interface
- https://numpy.org/
- https://scipy.org/
- https://keras.io/
- https://pandas.pydata.org/
- https://seaborn.pydata.org/
- https://pypi.org/project/tabulate/
- https://matplotlib.org/
- https://scikit-learn.org/stable/
- https://www.gradio.app/
- https://huggingface.co/
- https://www.kaggle.com/
- What is Linear Regression? - Linear Regression Explained - AWS, https://aws.amazon.com/what-is/linear-regression/
- What is Logistic regression? | IBM, https://www.ibm.com/topics/logistic-regression