Commit
* new contents added
* Thesis document added
* dependencies updated
* enhanced:img added && pagefind disabled temporarily
* pagefind added again
1 parent 11562cb, commit 63648a0
Showing 18 changed files with 225 additions and 98 deletions.
File renamed without changes
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,32 @@ | ||
--- | ||
title: My Thesis | ||
description: My Master Thesis Work | ||
date: '2024-7-16' | ||
categories: | ||
- programming | ||
- thesis | ||
published: true | ||
--- | ||
|
||
# ON THE USE OF LARGE LANGUAGE MODEL FOR VIRTUAL SCREENING | ||
|
||
## Introduction | ||
|
||
- Due to the abundance of drug candidates, conducting in-lab experiments to find an effective compound for a given target is a costly and time-consuming task in drug discovery. | ||
- This thesis aims to reduce the number of drug candidates during early drug discovery by clustering the compounds. | ||
- ChemBERTa, a Bidirectional Encoder Representations from Transformers (BERT) model, is employed to extract the descriptors for a compound. | ||
- The compounds are clustered with respect to the learned features, and several clustering algorithms, including the k-means clustering algorithm and the Butina algorithm, are used. | ||
- Finally, the obtained clusters are evaluated with measures such as the Silhouette Score and the Homogeneity Score. | ||
- Our empirical findings show that using learned descriptors of ChemBERTa produces results that are comparable with traditional and graph-based models, as shown by metrics of cluster accuracy and computing runtime. | ||
- Keywords: drug-target interaction, compound descriptors, representation learning, natural language processing, clustering | ||
- [Thesis Pdf](../pdfs/Thesis.pdf) | ||
- [Github Repository](https://github.com/ilkersigirci/thesis-work) | ||
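
The Silhouette Score mentioned above can be illustrated with a minimal, dependency-free sketch. This is not the thesis code: the toy 2-D points stand in for ChemBERTa descriptors, Euclidean distance is an assumption, and the function name `silhouette_score` is chosen here for illustration only.

```python
from math import dist


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance)."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("the silhouette score needs at least two clusters")

    total = 0.0
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            continue  # convention: a singleton cluster contributes 0
        # a: mean distance to the other members of the point's own cluster
        # (the distance from the point to itself is 0, so it drops out).
        a = sum(dist(point, other) for other in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(dist(point, other) for other in members) / len(members)
            for other_label, members in clusters.items()
            if other_label != label
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


if __name__ == "__main__":
    # Toy stand-ins for learned descriptors: two well-separated groups.
    points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
    print(silhouette_score(points, [0, 0, 1, 1]))  # sensible grouping
    print(silhouette_score(points, [0, 1, 0, 1]))  # poor grouping
```

Values close to 1 indicate compact, well-separated clusters, while negative values suggest that points sit closer to another cluster than to their own.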
|
||
## Method | ||
|
||
- Our method consists of 5 main stages. | ||
- We use 3 main compound SMILES datasets, with 3 different descriptors. | ||
- We also use dimensionality reduction techniques before clustering. | ||
- We use 4 main clustering algorithms and evaluate their performance with 3 different metrics. | ||
- [Image: Thesis.Method] |
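
The Butina algorithm named in the introduction can be sketched in a few lines. The version below is a simplified illustration under stated assumptions, not the thesis implementation: fingerprints are modelled as plain Python sets of "on" bits, the distance is 1 minus Tanimoto similarity, and the names `tanimoto`, `butina_cluster`, and `cutoff` are hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto similarity between two fingerprints given as sets of bits."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0


def butina_cluster(fingerprints, cutoff=0.5):
    """Group items whose Tanimoto distance to a centroid is <= cutoff."""
    n = len(fingerprints)
    # Step 1: build the neighbour list of every item.
    neighbors = [set() for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if 1.0 - tanimoto(fingerprints[i], fingerprints[j]) <= cutoff:
                neighbors[i].add(j)
                neighbors[j].add(i)

    # Step 2: visit items from most to fewest neighbours; each unassigned
    # item becomes a centroid and claims its still-unassigned neighbours.
    order = sorted(range(n), key=lambda i: len(neighbors[i]), reverse=True)
    assigned = set()
    clusters = []
    for i in order:
        if i in assigned:
            continue
        members = [i] + sorted(neighbors[i] - assigned)
        assigned.update(members)
        clusters.append(members)
    return clusters


if __name__ == "__main__":
    # Hypothetical toy fingerprints; real ones would come from a toolkit.
    fingerprints = [
        {1, 2, 3, 4},
        {1, 2, 3, 5},
        {1, 2, 3, 4, 5},
        {10, 11, 12, 13},
        {10, 11, 12, 14},
        {20},
    ]
    print(butina_cluster(fingerprints))
```

Unlike k-means, this procedure does not need the number of clusters up front; the distance cutoff controls how many clusters emerge, and items without neighbours end up as singletons.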
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,117 @@ | ||
--- | ||
title: Python Template | ||
description: My Custom Python Project Template | ||
date: '2024-7-16' | ||
categories: | ||
- programming | ||
- python | ||
published: true | ||
--- | ||
|
||
## Project Structure | ||
|
||
- [Github Repository](https://github.com/ilkersigirci/python-template) | ||
- It uses `pyproject.toml` instead of `setup.py` and `setup.cfg`. The reasoning is as follows: | ||
  - As the [official setuptools guide](https://github.com/pypa/setuptools/blob/main/docs/userguide/quickstart.rst) says, "configuring new projects via setup.py is discouraged". | ||
  - One of the biggest problems with setuptools is that `setup.py` is an executable file that cannot be run without knowing its dependencies, and there is no way to know those dependencies without executing the very file that declares them. | ||
  - The `pyproject.toml` file is meant to solve this build-tool chicken-and-egg problem, since pip itself can read `pyproject.toml` along with the version of setuptools or wheel the project requires. | ||
  - The `pyproject.toml` file was introduced in PEP 518 (2016) as a way of separating the configuration of the build system from a specific, optional library (setuptools), and of enabling setuptools to install itself without already being installed. Subsequently, PEP 621 (2020) introduced the idea of using `pyproject.toml` for wider project configuration, and PEP 660 (2021) proposed removing the need for `setup.py` for editable installs with pip. | ||
- It uses [rye](https://github.com/astral-sh/rye) for Python dependency operations and virtual environment management. | ||
- It uses the `src` layout, which is the recommended layout for Python projects to avoid common [pitfalls](https://blog.ionelmc.ro/2014/05/25/python-packaging/#the-structure). | ||
|
||
## Install | ||
|
||
### Default installation | ||
|
||
- Install rye system-wide | ||
|
||
```bash | ||
make -s install-rye | ||
``` | ||
|
||
- Install the project dependencies | ||
|
||
```bash | ||
make -s install | ||
``` | ||
|
||
- After running the above command, the project is installed in editable mode with all development and test dependencies. | ||
- Moreover, a dummy `entry point` called `placeholder` will be available as a CLI command. | ||
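
For readers unfamiliar with entry points: a console command such as `placeholder` is typically mapped to a Python callable through the `[project.scripts]` table in `pyproject.toml`. The function below is a hypothetical sketch of what such a callable might look like; the `--name` option and the greeting are invented for illustration and are not taken from the template.

```python
import argparse


def main(argv=None) -> int:
    """Hypothetical callable that a `placeholder` command could point to."""
    parser = argparse.ArgumentParser(
        prog="placeholder",
        description="Dummy command installed together with the package.",
    )
    # Illustrative option only; the real command may take no arguments.
    parser.add_argument("--name", default="world", help="who to greet")
    args = parser.parse_args(argv)
    print(f"Hello, {args.name}!")
    return 0  # becomes the process exit code


if __name__ == "__main__":
    raise SystemExit(main())
```

Accepting `argv` as a parameter keeps the function easy to call from tests without touching `sys.argv`.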
|
||
### Docker | ||
|
||
```bash | ||
# Development build (800 MB) | ||
docker build --tag python-template --file docker/Dockerfile --target development . | ||
|
||
# Production build (145 MB) | ||
docker build --tag python-template --file docker/Dockerfile --target production . | ||
``` | ||
|
||
- To run a command inside the container: | ||
|
||
```bash | ||
docker run -it python-template:latest bash | ||
|
||
# Temporary container | ||
docker run --rm -it python-template:latest bash | ||
``` | ||
|
||
## IDE Settings | ||
|
||
### PyCharm | ||
|
||
- Line-length: `Editor -> Code Style -> Hard wrap at 88` | ||
|
||
#### Inspections | ||
|
||
Settings -> Editor -> Inspections -> Python | ||
|
||
Enable all except: | ||
|
||
- Accessing a protected member of a class or a module | ||
- Assignment can be replaced with augmented assignments | ||
- Classic style class usage | ||
- Incorrect BDD Behave-specific definitions | ||
- No encoding specified for file | ||
- The function argument is equal to the default parameter | ||
- Type checker compatible with Pydantic | ||
- For "PEP 8 coding style violation": | ||
Ignore = E266, E501 | ||
- For "PEP 8 naming convention violation": | ||
Ignore = N803 | ||
|
||
#### Plugins | ||
|
||
- Ruff | ||
- Pydantic | ||
|
||
### VS Code | ||
|
||
- All recommended settings and extensions can be found in the `.vscode` directory. | ||
|
||
## Useful Makefile commands | ||
|
||
```bash | ||
# All available commands | ||
make | ||
make help | ||
|
||
# Run all tests | ||
make -s test | ||
|
||
# Run specific tests | ||
make -s test-one TEST_MARKER=<TEST_MARKER> | ||
|
||
# Remove unnecessary files such as build, test, and cache files | ||
make -s clean | ||
|
||
# Run all pre-commit hooks | ||
make -s pre-commit | ||
|
||
# Lint the project | ||
make -s lint | ||
|
||
# Profile a file | ||
make -s profile PROFILE_FILE_PATH=<PATH_TO_FILE> | ||
``` |
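
The post does not show which profiler the `profile` target invokes. As a rough illustration of what profiling produces, here is a minimal sketch using the standard library's `cProfile` and `pstats`; treating this as equivalent to the Makefile target is an assumption, and `profile_callable` and `slow_sum` are hypothetical names.

```python
import cProfile
import io
import pstats


def profile_callable(func, *args, top=5, **kwargs) -> str:
    """Run func under cProfile and return a short cumulative-time report."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        func(*args, **kwargs)
    finally:
        profiler.disable()

    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Show only the `top` entries, ordered by cumulative time.
    stats.sort_stats("cumulative").print_stats(top)
    return buffer.getvalue()


def slow_sum(n):
    """Toy workload to profile."""
    return sum(i * i for i in range(n))


if __name__ == "__main__":
    print(profile_callable(slow_sum, 100_000))
```

For a whole script rather than a single function, `python -m cProfile -s cumulative <PATH_TO_FILE>` gives a comparable report from the command line.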
File renamed without changes.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,7 +1,30 @@ | ||
<script context="module"> | ||
const images_base_path = '/src/assets/images' | ||
// From: https://github.com/sveltejs/kit/issues/11535#issuecomment-2207645048 | ||
const images = import.meta.glob(['/src/assets/images/*.{avif,gif,heif,jpeg,jpg,png,tiff,webp}'], { | ||
eager: true, | ||
query: { enhanced: true } | ||
}) | ||
const get_full = (desired_image) => { | ||
desired_image = `${images_base_path}/${desired_image}` | ||
return images[desired_image].default | ||
} | ||
</script> | ||
|
||
<script lang="ts"> | ||
export let src: string | ||
export let alt: string | ||
const image = get_full(src) | ||
</script> | ||
|
||
<!-- NOTE: Enables lazy loading images in mdsvex rendered markdown --> | ||
<img {src} {alt} loading="lazy" /> | ||
<!-- <img {src} {alt} loading="lazy" /> --> | ||
|
||
<!-- NOTE: /src/.. like paths only works with enhanced:img --> | ||
<!-- <img src={`${images_base_path}/${src}`} {alt} /> --> | ||
|
||
<figure class="image"> | ||
<enhanced:img src={image} {alt} /> | ||
</figure> |
This file was deleted.
This file was deleted.
This file was deleted.
Binary file not shown.