
Identification of clathrin proteins by incorporating hyperparameter optimization in deep learning and PSSM profiles

• A deep learning technique for identifying molecular functions of clathrin with high performance.
• The proposed idea is to transform the position-specific scoring matrices to 2D images and feed into 2D convolutional neural networks.
• Compared with the other state-of-the-art techniques, our method had...

Bibliographic Details
Published in: Computer methods and programs in biomedicine 2019-08, Vol.177, p.81-88
Main Authors: Le, Nguyen Quoc Khanh; Huynh, Tuan-Tu; Yapp, Edward Kien Yee; Yeh, Hui-Yuan
Format: Article
Language: English
cited_by cdi_FETCH-LOGICAL-c466t-7d8a286aded32eb541dd0c37ba29dcc0fc37cc855d4be137928ea2f2d5ead3693
cites cdi_FETCH-LOGICAL-c466t-7d8a286aded32eb541dd0c37ba29dcc0fc37cc855d4be137928ea2f2d5ead3693
container_end_page 88
container_issue
container_start_page 81
container_title Computer methods and programs in biomedicine
container_volume 177
creator Le, Nguyen Quoc Khanh
Huynh, Tuan-Tu
Yapp, Edward Kien Yee
Yeh, Hui-Yuan
description • A deep learning technique for identifying molecular functions of clathrin with high performance.
• The proposed idea is to transform the position-specific scoring matrices into 2D images and feed them into 2D convolutional neural networks.
• Compared with the other state-of-the-art techniques, our method showed a significant improvement in all of the measurement metrics.
• A powerful model to help biologists discover new sequences that belong to clathrin.
• A basis for further research that can improve the performance of protein function prediction using deep neural networks.
Clathrin is an adaptor protein that serves as the principal element of the vesicle-coating complex and is important for the membrane cleavage that releases the invaginated vesicle from the plasma membrane. The functional loss of clathrins has been tied to many human diseases, such as neurodegenerative disorders, cancer, and Alzheimer's disease. Therefore, creating a precise model to identify its functions is a crucial step towards understanding human diseases and designing drug targets. We present a deep learning model using a two-dimensional convolutional neural network (CNN) and position-specific scoring matrix (PSSM) profiles to identify clathrin proteins from high-throughput sequences. Because 2D CNNs conventionally take images as input, we treated the PSSM profile, a 20 × 20 matrix, as an image of 20 × 20 pixels. The input PSSM profile was then connected to our 2D CNN, for which we tuned a variety of parameters to improve the performance of the model. Based on the 10-fold cross-validation results, a hyperparameter optimization process was employed to find the best model for our dataset. Finally, an independent dataset was used to assess the predictive ability of the final model. Our model could identify clathrin proteins with a sensitivity of 92.2%, specificity of 91.2%, accuracy of 91.8%, and MCC of 0.83 on the independent dataset. Compared to state-of-the-art traditional neural networks, our method achieved a significant improvement in all typical measurement metrics. Through this study, we provide an effective tool for investigating clathrin proteins, and our results could promote the use of deep learning in biomedical research. We also provide the source code and dataset freely at https://www.github.com/khanhlee/deep-clathrin/.
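The two computational ideas in the description can be illustrated with a minimal Python sketch. It is not the authors' code. The abstract does not say how a PSSM of length L is reduced to 20 × 20, so the sketch assumes one common approach: summing the rows that belong to the same residue type. The names `fold_pssm`, `binary_metrics`, and `AMINO_ACIDS` are hypothetical, and the column order is assumed to follow PSI-BLAST's ASCII PSSM output. The metric formulas are the standard definitions of sensitivity, specificity, accuracy, and the Matthews correlation coefficient (MCC).

```python
import math

# Assumed column order, as in PSI-BLAST ASCII PSSM files.
AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"


def fold_pssm(sequence, pssm_rows):
    """Collapse an L x 20 PSSM into a fixed 20 x 20 matrix.

    Assumption (not stated in the abstract): rows that belong to the same
    residue type are summed, so row i of the result aggregates the scores
    of every sequence position occupied by amino acid i.
    """
    if len(sequence) != len(pssm_rows):
        raise ValueError("sequence and PSSM must have the same length")
    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    folded = [[0.0] * 20 for _ in range(20)]
    for residue, row in zip(sequence, pssm_rows):
        if residue not in index:  # skip non-standard residues such as 'X'
            continue
        target = folded[index[residue]]
        for j, score in enumerate(row):
            target[j] += score
    return folded


def binary_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Toy data only: a 3-residue sequence with constant PSSM rows.
    toy = fold_pssm("AAR", [[1] * 20, [2] * 20, [3] * 20])
    print(toy[0][:3], toy[1][:3])
    # Illustrative counts, not the study's confusion matrix.
    print(binary_metrics(tp=9, fp=1, tn=9, fn=1))
```

The folded matrix has a fixed shape regardless of sequence length, which is what lets it be treated as a 20 × 20 single-channel image for a 2D CNN. Scaling the scores before training is a separate choice that this sketch leaves out.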
doi_str_mv 10.1016/j.cmpb.2019.05.016
format article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_2261241023</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S016926071930080X</els_id><sourcerecordid>2261241023</sourcerecordid><originalsourceid>FETCH-LOGICAL-c466t-7d8a286aded32eb541dd0c37ba29dcc0fc37cc855d4be137928ea2f2d5ead3693</originalsourceid><addsrcrecordid>eNp9kM1q3DAYRUVJaKZpX6CLomU2dvUzkm3oJoQmHUhIIOlayNLnjgZbciVNYfr0kZkky6wkLuceoYvQV0pqSqj8vqvNNPc1I7SriahL9AGtaNuwqhFSnKBVSbqKSdKcoU8p7QghTAj5EZ1xymnXSb5CaWPBZzc4o7MLHocBm1HnbXQezzFkcD7h_oCdNyHOIRbK_8Hbwwxx1lFPkCHiMGc3uf9HQylagBmPoKNfYO0tfnh8vFt8gxshfUangx4TfHk5z9Hv659PV7-q2_ubzdXlbWXWUuaqsa1mrdQWLGfQizW1lhje9Jp11hgylLsxrRB23QPlTcda0GxgVoC2XHb8HF0cveXhv3tIWU0uGRhH7SHsk2JMUramhPGCsiNqYkgpwqDm6CYdD4oStYytdmoZWy1jKyJUiUrp24t_309g3yqv6xbgxxGA8st_DqJKxoE3YF0Ek5UN7j3_Mz7Ck9U</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2261241023</pqid></control><display><type>article</type><title>Identification of clathrin proteins by incorporating hyperparameter optimization in deep learning and PSSM profiles</title><source>ScienceDirect Freedom Collection</source><creator>Le, Nguyen Quoc Khanh ; Huynh, Tuan-Tu ; Yapp, Edward Kien Yee ; Yeh, Hui-Yuan</creator><creatorcontrib>Le, Nguyen Quoc Khanh ; Huynh, Tuan-Tu ; Yapp, Edward Kien Yee ; Yeh, Hui-Yuan</creatorcontrib><description>•A deep learning technique for identifying molecular functions of clathrin with high performance.•The proposed idea is to transform the position-specific scoring matrices to 2D images and feed into 2D convolutional neural networks.•Compared with the other state-of-the-art techniques, our method had a significant improvement in all of the measurement metrics.•A powerful model to help biologists discover the new sequences that belong to clathrin.•A basis for further research that can improve the performance of protein function prediction using 
deep neural networks. Clathrin is an adaptor protein that serves as the principal element of the vesicle-coating complex and is important for the membrane cleavage to dispense the invaginated vesicle from the plasma membrane. The functional loss of clathrins has been tied to a lot of human diseases, i.e., neurodegenerative disorders, cancer, Alzheimer's diseases, and so on. Therefore, creating a precise model to identify its functions is a crucial step towards understanding human diseases and designing drug targets. We present a deep learning model using a two-dimensional convolutional neural network (CNN) and position-specific scoring matrix (PSSM) profiles to identify clathrin proteins from high throughput sequences. Traditionally, the 2D CNNs take images as an input so we treated the PSSM profile with a 20 × 20 matrix as an image of 20 × 20 pixels. The input PSSM profile was then connected to our 2D CNN in which we set a variety of parameters to improve the performance of the model. Based on the 10-fold cross-validation results, hyper-parameter optimization process was employed to find the best model for our dataset. Finally, an independent dataset was used to assess the predictive ability of the current model. Our model could identify clathrin proteins with sensitivity of 92.2%, specificity of 91.2%, accuracy of 91.8%, and MCC of 0.83 in the independent dataset. Compared to state-of-the-art traditional neural networks, our method achieved a significant improvement in all typical measurement metrics. Throughout the proposed study, we provide an effective tool for investigating clathrin proteins and our achievement could promote the use of deep learning in biomedical research. 
We also provide source codes and dataset freely at https://www.github.com/khanhlee/deep-clathrin/.</description><identifier>ISSN: 0169-2607</identifier><identifier>EISSN: 1872-7565</identifier><identifier>DOI: 10.1016/j.cmpb.2019.05.016</identifier><identifier>PMID: 31319963</identifier><language>eng</language><publisher>Ireland: Elsevier B.V</publisher><subject>Adaptor protein complex ; Algorithms ; Cell Membrane - chemistry ; Clathrin - chemistry ; Clathrin coated pits ; Convolutional neural network ; Deep Learning ; Humans ; Molecular function ; Neural Networks, Computer ; Position specific scoring matrix ; Position-Specific Scoring Matrices ; Reproducibility of Results ; Sensitivity and Specificity ; Software ; Vesicular transport</subject><ispartof>Computer methods and programs in biomedicine, 2019-08, Vol.177, p.81-88</ispartof><rights>2019 Elsevier B.V.</rights><rights>Copyright © 2019 Elsevier B.V. All rights reserved.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c466t-7d8a286aded32eb541dd0c37ba29dcc0fc37cc855d4be137928ea2f2d5ead3693</citedby><cites>FETCH-LOGICAL-c466t-7d8a286aded32eb541dd0c37ba29dcc0fc37cc855d4be137928ea2f2d5ead3693</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>315,786,790,27957,27958</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/31319963$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Le, Nguyen Quoc Khanh</creatorcontrib><creatorcontrib>Huynh, Tuan-Tu</creatorcontrib><creatorcontrib>Yapp, Edward Kien Yee</creatorcontrib><creatorcontrib>Yeh, Hui-Yuan</creatorcontrib><title>Identification of clathrin proteins by incorporating hyperparameter optimization in deep learning and PSSM profiles</title><title>Computer methods and programs in 
biomedicine</title><addtitle>Comput Methods Programs Biomed</addtitle><description>•A deep learning technique for identifying molecular functions of clathrin with high performance.•The proposed idea is to transform the position-specific scoring matrices to 2D images and feed into 2D convolutional neural networks.•Compared with the other state-of-the-art techniques, our method had a significant improvement in all of the measurement metrics.•A powerful model to help biologists discover the new sequences that belong to clathrin.•A basis for further research that can improve the performance of protein function prediction using deep neural networks. Clathrin is an adaptor protein that serves as the principal element of the vesicle-coating complex and is important for the membrane cleavage to dispense the invaginated vesicle from the plasma membrane. The functional loss of clathrins has been tied to a lot of human diseases, i.e., neurodegenerative disorders, cancer, Alzheimer's diseases, and so on. Therefore, creating a precise model to identify its functions is a crucial step towards understanding human diseases and designing drug targets. We present a deep learning model using a two-dimensional convolutional neural network (CNN) and position-specific scoring matrix (PSSM) profiles to identify clathrin proteins from high throughput sequences. Traditionally, the 2D CNNs take images as an input so we treated the PSSM profile with a 20 × 20 matrix as an image of 20 × 20 pixels. The input PSSM profile was then connected to our 2D CNN in which we set a variety of parameters to improve the performance of the model. Based on the 10-fold cross-validation results, hyper-parameter optimization process was employed to find the best model for our dataset. Finally, an independent dataset was used to assess the predictive ability of the current model. 
Our model could identify clathrin proteins with sensitivity of 92.2%, specificity of 91.2%, accuracy of 91.8%, and MCC of 0.83 in the independent dataset. Compared to state-of-the-art traditional neural networks, our method achieved a significant improvement in all typical measurement metrics. Throughout the proposed study, we provide an effective tool for investigating clathrin proteins and our achievement could promote the use of deep learning in biomedical research. We also provide source codes and dataset freely at https://www.github.com/khanhlee/deep-clathrin/.</description><subject>Adaptor protein complex</subject><subject>Algorithms</subject><subject>Cell Membrane - chemistry</subject><subject>Clathrin - chemistry</subject><subject>Clathrin coated pits</subject><subject>Convolutional neural network</subject><subject>Deep Learning</subject><subject>Humans</subject><subject>Molecular function</subject><subject>Neural Networks, Computer</subject><subject>Position specific scoring matrix</subject><subject>Position-Specific Scoring Matrices</subject><subject>Reproducibility of Results</subject><subject>Sensitivity and Specificity</subject><subject>Software</subject><subject>Vesicular transport</subject><issn>0169-2607</issn><issn>1872-7565</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2019</creationdate><recordtype>article</recordtype><recordid>eNp9kM1q3DAYRUVJaKZpX6CLomU2dvUzkm3oJoQmHUhIIOlayNLnjgZbciVNYfr0kZkky6wkLuceoYvQV0pqSqj8vqvNNPc1I7SriahL9AGtaNuwqhFSnKBVSbqKSdKcoU8p7QghTAj5EZ1xymnXSb5CaWPBZzc4o7MLHocBm1HnbXQezzFkcD7h_oCdNyHOIRbK_8Hbwwxx1lFPkCHiMGc3uf9HQylagBmPoKNfYO0tfnh8vFt8gxshfUangx4TfHk5z9Hv659PV7-q2_ubzdXlbWXWUuaqsa1mrdQWLGfQizW1lhje9Jp11hgylLsxrRB23QPlTcda0GxgVoC2XHb8HF0cveXhv3tIWU0uGRhH7SHsk2JMUramhPGCsiNqYkgpwqDm6CYdD4oStYytdmoZWy1jKyJUiUrp24t_309g3yqv6xbgxxGA8st_DqJKxoE3YF0Ek5UN7j3_Mz7Ck9U</recordid><startdate>201908</startdate><enddate>201908</enddate><creator>Le, Nguyen Quoc Khanh</creator><creator>Huynh, 
Tuan-Tu</creator><creator>Yapp, Edward Kien Yee</creator><creator>Yeh, Hui-Yuan</creator><general>Elsevier B.V</general><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope></search><sort><creationdate>201908</creationdate><title>Identification of clathrin proteins by incorporating hyperparameter optimization in deep learning and PSSM profiles</title><author>Le, Nguyen Quoc Khanh ; Huynh, Tuan-Tu ; Yapp, Edward Kien Yee ; Yeh, Hui-Yuan</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c466t-7d8a286aded32eb541dd0c37ba29dcc0fc37cc855d4be137928ea2f2d5ead3693</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2019</creationdate><topic>Adaptor protein complex</topic><topic>Algorithms</topic><topic>Cell Membrane - chemistry</topic><topic>Clathrin - chemistry</topic><topic>Clathrin coated pits</topic><topic>Convolutional neural network</topic><topic>Deep Learning</topic><topic>Humans</topic><topic>Molecular function</topic><topic>Neural Networks, Computer</topic><topic>Position specific scoring matrix</topic><topic>Position-Specific Scoring Matrices</topic><topic>Reproducibility of Results</topic><topic>Sensitivity and Specificity</topic><topic>Software</topic><topic>Vesicular transport</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Le, Nguyen Quoc Khanh</creatorcontrib><creatorcontrib>Huynh, Tuan-Tu</creatorcontrib><creatorcontrib>Yapp, Edward Kien Yee</creatorcontrib><creatorcontrib>Yeh, Hui-Yuan</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><jtitle>Computer methods and programs in 
biomedicine</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Le, Nguyen Quoc Khanh</au><au>Huynh, Tuan-Tu</au><au>Yapp, Edward Kien Yee</au><au>Yeh, Hui-Yuan</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Identification of clathrin proteins by incorporating hyperparameter optimization in deep learning and PSSM profiles</atitle><jtitle>Computer methods and programs in biomedicine</jtitle><addtitle>Comput Methods Programs Biomed</addtitle><date>2019-08</date><risdate>2019</risdate><volume>177</volume><spage>81</spage><epage>88</epage><pages>81-88</pages><issn>0169-2607</issn><eissn>1872-7565</eissn><notes>ObjectType-Article-1</notes><notes>SourceType-Scholarly Journals-1</notes><notes>ObjectType-Feature-2</notes><notes>content type line 23</notes><abstract>•A deep learning technique for identifying molecular functions of clathrin with high performance.•The proposed idea is to transform the position-specific scoring matrices to 2D images and feed into 2D convolutional neural networks.•Compared with the other state-of-the-art techniques, our method had a significant improvement in all of the measurement metrics.•A powerful model to help biologists discover the new sequences that belong to clathrin.•A basis for further research that can improve the performance of protein function prediction using deep neural networks. Clathrin is an adaptor protein that serves as the principal element of the vesicle-coating complex and is important for the membrane cleavage to dispense the invaginated vesicle from the plasma membrane. The functional loss of clathrins has been tied to a lot of human diseases, i.e., neurodegenerative disorders, cancer, Alzheimer's diseases, and so on. Therefore, creating a precise model to identify its functions is a crucial step towards understanding human diseases and designing drug targets. 
We present a deep learning model using a two-dimensional convolutional neural network (CNN) and position-specific scoring matrix (PSSM) profiles to identify clathrin proteins from high throughput sequences. Traditionally, the 2D CNNs take images as an input so we treated the PSSM profile with a 20 × 20 matrix as an image of 20 × 20 pixels. The input PSSM profile was then connected to our 2D CNN in which we set a variety of parameters to improve the performance of the model. Based on the 10-fold cross-validation results, hyper-parameter optimization process was employed to find the best model for our dataset. Finally, an independent dataset was used to assess the predictive ability of the current model. Our model could identify clathrin proteins with sensitivity of 92.2%, specificity of 91.2%, accuracy of 91.8%, and MCC of 0.83 in the independent dataset. Compared to state-of-the-art traditional neural networks, our method achieved a significant improvement in all typical measurement metrics. Throughout the proposed study, we provide an effective tool for investigating clathrin proteins and our achievement could promote the use of deep learning in biomedical research. We also provide source codes and dataset freely at https://www.github.com/khanhlee/deep-clathrin/.</abstract><cop>Ireland</cop><pub>Elsevier B.V</pub><pmid>31319963</pmid><doi>10.1016/j.cmpb.2019.05.016</doi><tpages>8</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 0169-2607
ispartof Computer methods and programs in biomedicine, 2019-08, Vol.177, p.81-88
issn 0169-2607
1872-7565
language eng
recordid cdi_proquest_miscellaneous_2261241023
source ScienceDirect Freedom Collection
subjects Adaptor protein complex
Algorithms
Cell Membrane - chemistry
Clathrin - chemistry
Clathrin coated pits
Convolutional neural network
Deep Learning
Humans
Molecular function
Neural Networks, Computer
Position specific scoring matrix
Position-Specific Scoring Matrices
Reproducibility of Results
Sensitivity and Specificity
Software
Vesicular transport
title Identification of clathrin proteins by incorporating hyperparameter optimization in deep learning and PSSM profiles
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-09-21T15%3A28%3A49IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Identification%20of%20clathrin%20proteins%20by%20incorporating%20hyperparameter%20optimization%20in%20deep%20learning%20and%20PSSM%20profiles&rft.jtitle=Computer%20methods%20and%20programs%20in%20biomedicine&rft.au=Le,%20Nguyen%20Quoc%20Khanh&rft.date=2019-08&rft.volume=177&rft.spage=81&rft.epage=88&rft.pages=81-88&rft.issn=0169-2607&rft.eissn=1872-7565&rft_id=info:doi/10.1016/j.cmpb.2019.05.016&rft_dat=%3Cproquest_cross%3E2261241023%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c466t-7d8a286aded32eb541dd0c37ba29dcc0fc37cc855d4be137928ea2f2d5ead3693%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2261241023&rft_id=info:pmid/31319963&rfr_iscdi=true