How did I obtain the paratope information?

Here I describe how I obtained the paratope information stored in the hla2paratopeTable_aligned.txt file.

Step1: Web-scraping the IMGT 3D database

The IMGT 3D database contains abundant information on the experimentally solved crystal structures of pMHC1 complexes, which can serve as the source for extracting the paratope information (the HLA residues that are likely to interact with the neoepitope).

On the page linked above, select human as the species and pMH1 as the type, then click submit. You will be directed to a page listing all the solved complex structures. When I conducted the project, the returned page had 550 entries; the number has increased since then.

entry

Let's click any one of the structure hyperlinks:

complex

Then click "paratope and epitope"; the contacting residues will be shown on the page:

paratope

For each HLA, we want to collect all the available contact residue information for each of its experimentally validated paired epitopes.
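To make the collection goal concrete, here is a minimal sketch of how scraped contacts could be grouped per HLA and per epitope. The record layout (allele, epitope, list of residue positions), the function names, and all the values are assumptions for illustration; they are not the actual scraper output or real IMGT data.

```python
from collections import defaultdict

# Hypothetical scraped records: (HLA allele, epitope label, contacting HLA positions).
# All values are made up for illustration and are NOT taken from IMGT.
records = [
    ("HLA-A*02:01", "epitope_1", [7, 9, 63, 66]),
    ("HLA-A*02:01", "epitope_2", [7, 63, 70, 99]),
    ("HLA-B*07:02", "epitope_3", [9, 45, 67]),
]


def collect_contacts(records):
    """Group contact positions by HLA allele, then by epitope."""
    table = defaultdict(dict)
    for hla, epitope, positions in records:
        # Merge repeated (HLA, epitope) pairs, e.g. several structures of one complex.
        table[hla].setdefault(epitope, set()).update(positions)
    return {hla: dict(per_epitope) for hla, per_epitope in table.items()}


def union_positions(per_epitope):
    """All HLA positions seen in contact with at least one epitope, sorted."""
    return sorted(set().union(*per_epitope.values()))


contacts = collect_contacts(records)
print(union_positions(contacts["HLA-A*02:01"]))  # [7, 9, 63, 66, 70, 99]
```

Keeping the per-epitope sets separate, rather than merging everything right away, leaves room to compare contacts across epitopes later.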

We definitely want to automate this retrieval process, so we use the web scraping tools Scrapy and Selenium.

The code and instructions for running the scraper can be found in this folder. For more information, please refer to the official Scrapy and Selenium websites. You can also retrieve the paratope information in any other way you prefer.

Step2: Draw consensus paratope information

We perform two rounds of Clustal Omega runs; concrete examples are shown in our Supplemental Figure 1.

The corresponding text description can be found in our Supplemental Methods, lines 36-40.

They should be clear enough, but feel free to contact me if anything is unclear or you want to know more about the process.
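For readers who want to script the alignment, below is a rough sketch of a single Clustal Omega call from Python. It assumes the `clustalo` executable is on your PATH. The file names and helper functions are hypothetical, and what goes into each of the two rounds is described in the supplement rather than reproduced here.

```python
import shutil
import subprocess


def clustalo_command(in_fasta, out_aln, outfmt="clu"):
    """Build the argument list for one Clustal Omega run (does not execute it)."""
    return [
        "clustalo",
        "-i", in_fasta,          # input sequences in FASTA format
        "-o", out_aln,           # output alignment file
        f"--outfmt={outfmt}",    # "clu" writes Clustal format
        "--force",               # overwrite the output file if it exists
    ]


def run_clustalo(in_fasta, out_aln, outfmt="clu"):
    """Run Clustal Omega once; raises if the binary is missing or the run fails."""
    if shutil.which("clustalo") is None:
        raise RuntimeError("clustalo executable not found on PATH")
    subprocess.run(clustalo_command(in_fasta, out_aln, outfmt), check=True)


# Hypothetical file names, shown only to illustrate the resulting command line.
print(" ".join(clustalo_command("round1_input.fasta", "round1.aln")))
```

A second round would call `run_clustalo` again on whichever input that round requires.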