[Update] added papers about transformer-based music generation and RL-based methods
ybayle · May 25, 2019 · commit a1e1ee4 · 1 parent 4dd6bf5

Showing 14 changed files with 102 additions and 7 deletions.
diff --git a/README.md b/README.md
@@ -172,6 +172,7 @@ However, these surveys do not cover music information retrieval tasks that are i
 | 2017 | [Designing efficient architectures for modeling temporal features with convolutional neural networks](http://ieeexplore.ieee.org/document/7952601/) | [GitHub](https://github.com/jordipons/ICASSP2017) |
 | 2017 | [Timbre analysis of music audio signals with convolutional neural networks](https://github.com/ronggong/EUSIPCO2017) | [GitHub](https://github.com/jordipons/EUSIPCO2017) |
 | 2017 | [Deep learning and intelligent audio mixing](http://www.semanticaudio.co.uk/wp-content/uploads/2017/09/WIMP2017_Martinez-RamirezReiss.pdf) | No |
+| 2017 | [A SeqGAN for Polyphonic Music Generation](https://arxiv.org/pdf/1710.11418v2.pdf) | [GitHub](https://github.com/L0SG/seqgan-music) |
 | 2017 | [Deep learning for event detection, sequence labelling and similarity estimation in music signals](http://ofai.at/~jan.schlueter/pubs/phd/phd.pdf) | No |
 | 2017 | [Music feature maps with convolutional neural networks for music genre classification](https://www.researchgate.net/profile/Thomas_Pellegrini/publication/319326354_Music_Feature_Maps_with_Convolutional_Neural_Networks_for_Music_Genre_Classification/links/59ba5ae3458515bb9c4c6724/Music-Feature-Maps-with-Convolutional-Neural-Networks-for-Music-Genre-Classification.pdf?origin=publication_detail&ev=pub_int_prw_xdl&msrp=wzXuHZAa5zAnqEmErYyZwIRr2H0q01LnNEd4Wd7A15CQfdVLwdy98pmE-AdnrDvoc3-bVENSFrHt0yhaOiE2mQrYllVS9CJZOk-c9R0j_R1rbgcZugS6RtQ_.AUjPuJSF5P_DMngf-woH7W-7jdnQlbNQziR4_h6NnCHfR_zGcEa8vOyyOz5gx5nc4azqKTPQ5ZgGGLUxkLj1qCQLEQ5ThkhGlWHLyA.s6MBZE20-EO_RjRGCOCV4wk0WSFdN56Aloiraxz9hKCbJwRM2Et27RHVUA8jj9H8qvXIB6f7zSIrQgjXGrL2yCpyQlLffuf57rzSwg.KMMXbZrHsihV8DJM53xkHAWf3VebCJESi4KU4btNv9nQsyK2KnkhSQaTILKv0DSZY3c70a61LzywCBuoHtIhVOFhW5hVZN2n5O9uKQ) | No |
 | 2017 | [Automatic drum transcription for polyphonic recordings using soft attention mechanisms and convolutional neural networks](https://carlsouthall.files.wordpress.com/2017/12/ismir2017adt.pdf) | [GitHub](https://github.com/CarlSouthall/ADTLib) |
@@ -191,7 +192,10 @@ However, these surveys do not cover music information retrieval tasks that are i
 | 2017 | [Attention and localization based on a deep convolutional recurrent model for weakly supervised audio tagging](https://arxiv.org/pdf/1703.06052.pdf) | [GitHub](https://github.com/yongxuUSTC/att_loc_cgrnn) |
 | 2017 | [Surrey-CVSSP system for DCASE2017 challenge task4](https://www.cs.tut.fi/sgn/arg/dcase2017/documents/challenge_technical_reports/DCASE2017_Xu_146.pdf) | [GitHub](https://github.com/yongxuUSTC/dcase2017_task4_cvssp) |
 | 2017 | [A study on LSTM networks for polyphonic music sequence modelling](https://qmro.qmul.ac.uk/xmlui/handle/123456789/24946) | [Website](http://www.eecs.qmul.ac.uk/~ay304/code/ismir17) |
+| 2018 | [MUSIC TRANSFORMER:GENERATING MUSIC WITH LONG-TERM STRUCTURE](https://arxiv.org/pdf/1809.04281.pdf) | No |
 | 2018 | [MuseGAN: Multi-track sequential generative adversarial networks for symbolic music generation and accompaniment](https://arxiv.org/pdf/1709.06298.pdf) | [GitHub](https://github.com/salu133445/musegan) |
+| 2018 | [Music Theory Inspired Policy Gradient Method for Piano Music Transcription](https://nips2018creativity.github.io/doc/music_theory_inspired_policy_gradient.pdf) | No |
+| 2019 | [Generating Long Sequences with Sparse Transformers](https://arxiv.org/pdf/1904.10509.pdf) | [GitHub](https://github.com/openai/sparse_attention) |
 
 [Go back to top](https://github.com/ybayle/awesome-deep-learning-music#deep-learning-for-music-dl4m-)
 
@@ -238,24 +242,24 @@ Each entry in [dl4m.bib](dl4m.bib) also displays additional information:
 
 ## Statistics and visualisations
 
-- 160 papers referenced. See the details in [dl4m.bib](dl4m.bib).
+- 164 papers referenced. See the details in [dl4m.bib](dl4m.bib).
 There are more papers from 2017 than any other years combined.
 Number of articles per year:
 ![Number of articles per year](fig/articles_per_year.png)
-- If you are applying DL to music, there are [329 other researchers](authors.md) in your field.
-- 33 tasks investigated. See the list of [tasks](tasks.md).
+- If you are applying DL to music, there are [348 other researchers](authors.md) in your field.
+- 35 tasks investigated. See the list of [tasks](tasks.md).
 Tasks pie chart:
 ![Tasks pie chart](fig/pie_chart_task.png)
-- 48 datasets used. See the list of [datasets](datasets.md).
+- 51 datasets used. See the list of [datasets](datasets.md).
 Datasets pie chart:
 ![Datasets pie chart](fig/pie_chart_dataset.png)
-- 27 architectures used. See the list of [architectures](architectures.md).
+- 29 architectures used. See the list of [architectures](architectures.md).
 Architectures pie chart:
 ![Architectures pie chart](fig/pie_chart_architecture.png)
-- 9 frameworks used. See the list of [frameworks](frameworks.md).
+- 10 frameworks used. See the list of [frameworks](frameworks.md).
 Frameworks pie chart:
 ![Frameworks pie chart](fig/pie_chart_framework.png)
-- Only 42 articles (26%) provide their source code.
+- Only 44 articles (26%) provide their source code.
 Repeatability is the key to good science, so check out the [list of useful resources on reproducibility for MIR and ML](reproducibility.md).
 
 [Go back to top](https://github.com/ybayle/awesome-deep-learning-music#deep-learning-for-music-dl4m-)
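
The updated counts in the hunk above can be cross-checked against the rest of this diff. The following is a minimal sketch and not part of the commit: the names `COUNTS`, `inconsistent`, and `share_with_code` are hypothetical, and the "added" figures were tallied by hand from the `+` lines in the other files. It also shows that 44 of 164 is about 26.8%, so the unchanged 26% in the README matches only if the percentage is truncated rather than rounded (how the repository computes it is not shown here).

```python
# Counts quoted in the README hunk: (before, after, items added by this commit).
# The "added" numbers are hand-tallied from the "+" lines of this diff.
COUNTS = {
    "papers": (160, 164, 4),
    "researchers": (329, 348, 19),
    "tasks": (33, 35, 2),
    "datasets": (48, 51, 3),
    "architectures": (27, 29, 2),
    "frameworks": (9, 10, 1),
}


def inconsistent(counts):
    """Return the names whose before/after delta differs from the tally."""
    return [
        name
        for name, (before, after, added) in counts.items()
        if after - before != added
    ]


def share_with_code(with_code, total):
    """Return the percentage of entries that provide source code."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * with_code / total


mismatches = inconsistent(COUNTS)
share_before = share_with_code(42, 160)
share_after = share_with_code(44, 164)

print("mismatched counters:", mismatches)
print(f"share before: {share_before:.2f}%  share after: {share_after:.2f}%")
```

If `mismatches` is empty, every counter in the README moved by exactly the number of items this commit adds.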

diff --git a/architectures.md b/architectures.md
@@ -27,5 +27,7 @@ Please refer to the list of useful acronyms used in deep learning and music: [ac
 - RNN
 - RNN-LSTM
 - ResNet
+- SeqGAN
+- Transformer
 - U-Net
 - VPNN
diff --git a/authors.md b/authors.md
@@ -1,8 +1,11 @@
 # List of authors
 
 - Adavanne, Sharath
+- Alec Radford
+- Andrew M. Dai
 - Arumugam, Muthumari
 - Arzt, Andreas
+- Ashish Vaswani
 - Badeau, Roland
 - Bammer, Roswitha
 - Barbieri, Francesco
@@ -33,6 +36,7 @@
 - Chen, Tanfang
 - Chen, Wenxiao
 - Cheng, Wen-Huang
+- Cheng{-}Zhi Anna Huang
 - Chesmore, David
 - Chiang, Chin-Chin
 - Cho, Kyunghyun
@@ -41,6 +45,7 @@
 - Costa, Yandre MG
 - Courville, Aaron
 - Coutinho, Eduardo
+- Curtis Hawthorne
 - Dannenberg, Roger B
 - David, Bertrand
 - De Haas, W Bas
@@ -52,6 +57,7 @@
 - Doerfler, Monika
 - Dong, Hao-Wen
 - Dorfer, Matthias
+- Douglas Eck
 - Drossos, Konstantinos
 - Duppada, Venkatesh
 - Durand, Simon
@@ -111,9 +117,11 @@
 - Hutchings, P.
 - Huttunen, Heikki
 - Ide, Ichiro
+- Ilya Sutskever
 - Imenina, Alina
 - Jackson, Philip J. B.
 - Jain, Shubham
+- Jakob Uszkoreit
 - Janer Mestres, Jordi
 - Janer, Jordi
 - Jang, Jyh-Shing R
@@ -158,6 +166,7 @@
 - Lee, Tan
 - Leglaive, Simon
 - Lewis, J. P.
+- Li, Juncheng
 - Li, Lihua
 - Li, Peter
 - Li, Siyan
@@ -178,11 +187,13 @@
 - Materka, Andrzej
 - Mathulaprangsan, Seksan
 - Matityaho, Benyamin
+- Matthew D. Hoffman
 - McFee, Brian
 - Medhat, Fady
 - Mehri, Soroush
 - Meng, Fanhang
 - Mertins, Alfred
+- Metze, Florian
 - Mimilakis, Stylianos Ioannis
 - Miron, Marius
 - Mitsufuji, Yuki
@@ -198,6 +209,7 @@
 - Nielsen, Frank
 - Nieto, Oriol
 - Niewiadomski, Adam
+- Noam Shazeer
 - Ogihara, Mitsunori
 - Oliveira, Luiz S
 - Oramas, Sergio
@@ -223,17 +235,20 @@
 - Prockup, Matthew
 - Qian, Jiyuan
 - Qian, Sheng
+- Qu, Shuhui
 - Radenen, Mathieu
 - Ramírez, Marco A. Martínez
 - Reiss, Joshua D.
 - Ren, Gang
+- Rewon Child
 - Richard, Gaël
 - Riedmiller, Martin
 - Rigaud, François
 - Robinson, John
 - Roma, Gerard
 - Rosasco, Lorenzo
 - Sandler, Mark Brian
+- Sang{-}gil Lee
 - Santos, João Felipe
 - Santoso, Andri
 - Saurous, Rif A.
@@ -246,7 +261,9 @@
 - Schuller, Björn W
 - Schuller, Gerald
 - Schultz, Tanja
+- Scott Gray
 - Senac, Christine
+- Seonwoo Min
 - Serra, Xavier
 - Seybold, Bryan
 - Shi, Zhengshan
@@ -265,6 +282,7 @@
 - Stoller, Daniel
 - Sturm, Bob L.
 - Su, Hong
+- Sungroh Yoon
 - Takahashi, Naoya
 - Takiguchi, Tetsuya
 - Tanaka, Hidehiko
@@ -277,6 +295,7 @@
 - Tsaptsinos, Alexandros
 - Tsipas, Nikolaos
 - Uhlich, Stefan
+- Uiwon Hwang
 - Ullrich, Karen
 - Valin, Jean-Marc
 - Van Gemert, JC

diff --git a/datasets.md b/datasets.md
@@ -23,6 +23,7 @@ Please refer to the list of useful acronyms used in deep learning and music: [ac
 - [Homburg](http://www-ai.cs.uni-dortmund.de/audio.html)
 - [IDMT-SMT-Drums](https://www.idmt.fraunhofer.de/en/business_units/m2d/smt/drums.html)
 - [IRMAS](https://www.upf.edu/web/mtg/irmas)
+- [J.S. Bach chorales dataset](https://github.com/czhuang/JSB-Chorales-dataset)
 - [JSB Chorales](ftp://i11ftp.ira.uka.de/pub/neuro/dominik/midifiles/bach.zip)
 - [Jamendo](http://www.mathieuramona.com/wp/data/jamendo/)
 - [LMD](https://sites.google.com/site/carlossillajr/resources/the-latin-music-database-lmd)
@@ -39,7 +40,9 @@ Please refer to the list of useful acronyms used in deep learning and music: [ac
 - [MedleyDB](http://medleydb.weebly.com/)
 - [MusicNet](https://homes.cs.washington.edu/~thickstn/musicnet.html)
 - [NTT MLS](http://www.ntt-at.com/product/speech/)
+- [Nottingham dataset](http://abc.sourceforge.net/NMD/)
 - [Open Multitrack Testbed](http://www.semanticaudio.co.uk/projects/omtb/)
+- [Piano-e-Competition dataset (competition history)](http://www.piano-e-competition.com/)
 - [Piano-midi.de](Piano-midi.de)
 - [RWC](https://staff.aist.go.jp/m.goto/RWC-MDB/)
 - [SALAMI](http://ddmal.music.mcgill.ca/research/salami/annotations)

diff --git a/dl4m.bib b/dl4m.bib
@@ -1,3 +1,5 @@
+@comment{}}
+
 @inproceedings{Bharucha1988,
   author = {Bharucha, J.},
   booktitle = {Proceedings of the First Workshop on Artificial Intelligence and Music},
@@ -1782,6 +1784,23 @@ @inproceedings{Ramirez2017
   year = {2017}
 }
 
+@inproceedings{Lee2017,
+  architecture = {SeqGAN},
+  author = {Sang{-}gil Lee and Uiwon Hwang and Seonwoo Min and Sungroh Yoon},
+  batch = {No},
+  booktitle = {CoRR},
+  code = {https://github.com/L0SG/seqgan-music},
+  dataaugmentation = {No},
+  dataset = {[Nottingham dataset](http://abc.sourceforge.net/NMD/)},
+  framework = {Tensorflow},
+  input = {MIDI},
+  link = {https://arxiv.org/pdf/1710.11418v2.pdf},
+  loss = {No},
+  task = {Polyphonic music sequence modelling},
+  title = {A SeqGAN for Polyphonic Music Generation},
+  year = {2017}
+}
+
 @phdthesis{Schlueter2017,
   author = {Schlüter, Jan},
   link = {http://ofai.at/~jan.schlueter/pubs/phd/phd.pdf},
@@ -2034,6 +2053,22 @@ @inproceedings{Ycart2017
   year = {2017}
 }
 
+@inproceedings{Huang2018,
+  architecture = {Transformer & RNN},
+  author = {Cheng{-}Zhi Anna Huang and Ashish Vaswani and Jakob Uszkoreit and Noam Shazeer and Curtis Hawthorne and Andrew M. Dai and Matthew D. Hoffman and Douglas Eck},
+  batch = {No},
+  booktitle = {CoRR},
+  dataaugmentation = {Time Stretches & pitch transcription},
+  dataset = {[J.S. Bach chorales dataset](https://github.com/czhuang/JSB-Chorales-dataset) & [Piano-e-Competition dataset (competition history)](http://www.piano-e-competition.com/)},
+  framework = {tensor2tensor},
+  input = {MIDI},
+  link = {https://arxiv.org/pdf/1809.04281.pdf},
+  loss = {No},
+  task = {Polyphonic music sequence modelling},
+  title = {MUSIC TRANSFORMER:GENERATING MUSIC WITH LONG-TERM STRUCTURE},
+  year = {2018}
+}
+
 @inproceedings{Dong2018,
   activation = {ReLU & Leaky ReLU},
   architecture = {GAN & CNN},
@@ -2065,3 +2100,27 @@ @inproceedings{Dong2018
   year = {2018}
 }
 
+@article{Li2018,
+  architecture = {CNN & RNN},
+  author = {Li, Juncheng and Qu, Shuhui and Metze, Florian},
+  link = {https://nips2018creativity.github.io/doc/music_theory_inspired_policy_gradient.pdf},
+  task = {Music Transcription},
+  title = {Music Theory Inspired Policy Gradient Method for Piano Music Transcription},
+  year = {2018}
+}
+
+@unpublished{Child2019,
+  architecture = {Transformer},
+  author = {Rewon Child and Scott Gray and Alec Radford and Ilya Sutskever},
+  batch = {No},
+  code = {https://github.com/openai/sparse_attention},
+  input = {Raw Audio},
+  link = {https://arxiv.org/pdf/1904.10509.pdf},
+  loss = {No},
+  note = {this paper is mainly about how sparse transformer are implemented},
+  pages = {8--9},
+  task = {audio generation},
+  title = {Generating Long Sequences with Sparse Transformers},
+  year = {2019}
+}
+
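The README statistic on source-code availability depends on the `code` field of entries like the ones added above. As a rough illustration only, the sketch below parses a shortened excerpt of two of the new entries with regular expressions and lists those that carry a code link. The parser, the names `parse_entries`, `ENTRY_RE`, and `FIELD_RE`, and the rule that a missing field or the value `No` means "no code" are assumptions for this example, not the repository's actual tooling.

```python
import re

# Shortened excerpt of two entries added in this commit (fields trimmed).
SAMPLE = """
@inproceedings{Lee2017,
  architecture = {SeqGAN},
  code = {https://github.com/L0SG/seqgan-music},
  task = {Polyphonic music sequence modelling},
  year = {2017}
}

@article{Li2018,
  architecture = {CNN & RNN},
  task = {Music Transcription},
  year = {2018}
}
"""

# One entry header per line, e.g. "@article{Li2018,".
ENTRY_RE = re.compile(r"^@(\w+)\{(\w+),\s*$")
# One "name = {value}" field per line, with an optional trailing comma.
FIELD_RE = re.compile(r"^\s*(\w+)\s*=\s*\{(.*)\},?\s*$")


def parse_entries(text):
    """Parse one-field-per-line BibTeX text into {key: {field: value}}."""
    entries = {}
    current = None
    for line in text.splitlines():
        header = ENTRY_RE.match(line)
        if header:
            current = {"entrytype": header.group(1)}
            entries[header.group(2)] = current
            continue
        if line.strip() == "}":
            current = None  # closing brace ends the current entry
            continue
        field = FIELD_RE.match(line)
        if field and current is not None:
            current[field.group(1)] = field.group(2)
    return entries


entries = parse_entries(SAMPLE)
# Assumption: a missing code field or the literal "No" means no code is provided.
with_code = [key for key, fields in entries.items()
             if fields.get("code", "No") != "No"]
print(with_code)
```

This line-based approach only handles the one-field-per-line layout used in this file; a full BibTeX parser would be needed for entries whose values span several lines.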
diff --git a/dl4m.tsv b/dl4m.tsv
@@ -139,6 +139,7 @@ Year	Entrytype	Title	Author	Link	Code	Task	Reproducible	Dataset	Framework	Archit
 2017	inproceedings	Designing efficient architectures for modeling temporal features with convolutional neural networks	Pons, Jordi and Serra, Xavier	http://ieeexplore.ieee.org/document/7952601/	https://github.com/jordipons/ICASSP2017	MGR		[Ballroom](http://mtg.upf.edu/ismir2004/contest/tempoContest/node5.html)		CNN												
 2017	inproceedings	Timbre analysis of music audio signals with convolutional neural networks	Pons, Jordi and Slizovskaia, Olga and Gong, Rong and Gómez, Emilia and Serra, Xavier	https://github.com/ronggong/EUSIPCO2017	https://github.com/jordipons/EUSIPCO2017					CNN												
 2017	inproceedings	Deep learning and intelligent audio mixing	Ramírez, Marco A. Martínez and Reiss, Joshua D.	http://www.semanticaudio.co.uk/wp-content/uploads/2017/09/WIMP2017_Martinez-RamirezReiss.pdf	No	Mixing		[Open Multitrack Testbed](http://www.semanticaudio.co.uk/projects/omtb/)		DAE				No						Adam		
+2017	inproceedings	A SeqGAN for Polyphonic Music Generation	Sang{-}gil Lee and Uiwon Hwang and Seonwoo Min and Sungroh Yoon	https://arxiv.org/pdf/1710.11418v2.pdf	https://github.com/L0SG/seqgan-music	Polyphonic music sequence modelling		[Nottingham dataset](http://abc.sourceforge.net/NMD/)	Tensorflow	SeqGAN		No		No	MIDI			No				
 2017	phdthesis	Deep learning for event detection, sequence labelling and similarity estimation in music signals	Schlüter, Jan	http://ofai.at/~jan.schlueter/pubs/phd/phd.pdf																		
 2017	inproceedings	Music feature maps with convolutional neural networks for music genre classification	Senac, Christine and Pellegrini, Thomas and Mouret, Florian and Pinquier, Julien	https://www.researchgate.net/profile/Thomas_Pellegrini/publication/319326354_Music_Feature_Maps_with_Convolutional_Neural_Networks_for_Music_Genre_Classification/links/59ba5ae3458515bb9c4c6724/Music-Feature-Maps-with-Convolutional-Neural-Networks-for-Music-Genre-Classification.pdf?origin=publication_detail&ev=pub_int_prw_xdl&msrp=wzXuHZAa5zAnqEmErYyZwIRr2H0q01LnNEd4Wd7A15CQfdVLwdy98pmE-AdnrDvoc3-bVENSFrHt0yhaOiE2mQrYllVS9CJZOk-c9R0j_R1rbgcZugS6RtQ_.AUjPuJSF5P_DMngf-woH7W-7jdnQlbNQziR4_h6NnCHfR_zGcEa8vOyyOz5gx5nc4azqKTPQ5ZgGGLUxkLj1qCQLEQ5ThkhGlWHLyA.s6MBZE20-EO_RjRGCOCV4wk0WSFdN56Aloiraxz9hKCbJwRM2Et27RHVUA8jj9H8qvXIB6f7zSIrQgjXGrL2yCpyQlLffuf57rzSwg.KMMXbZrHsihV8DJM53xkHAWf3VebCJESi4KU4btNv9nQsyK2KnkhSQaTILKv0DSZY3c70a61LzywCBuoHtIhVOFhW5hVZN2n5O9uKQ		MGR		[GTzan](http://marsyas.info/downloads/datasets.html)		CNN					Spectrograms & common audio features							
 2017	inproceedings	Automatic drum transcription for polyphonic recordings using soft attention mechanisms and convolutional neural networks	Southall, Carl and Stables, Ryan and Hockman, Jason	https://carlsouthall.files.wordpress.com/2017/12/ismir2017adt.pdf	https://github.com/CarlSouthall/ADTLib	Transcription		[IDMT-SMT-Drums](https://www.idmt.fraunhofer.de/en/business_units/m2d/smt/drums.html)		CNN & BRNN												
@@ -158,4 +159,7 @@ Year	Entrytype	Title	Author	Link	Code	Task	Reproducible	Dataset	Framework	Archit
 2017	inproceedings	Attention and localization based on a deep convolutional recurrent model for weakly supervised audio tagging	Xu, Yong and Kong, Qiuqiang and Huang, Qiang and Wang, Wenwu and Plumbley, Mark D.	https://arxiv.org/pdf/1703.06052.pdf	https://github.com/yongxuUSTC/att_loc_cgrnn	DCASE 2016 Task 4 Domestic audio tagging				CRNN												
 2017	techreport	Surrey-CVSSP system for DCASE2017 challenge task4	Xu, Yong and Kong, Qiuqiang and Wang, Wenwu and Plumbley, Mark D.	https://www.cs.tut.fi/sgn/arg/dcase2017/documents/challenge_technical_reports/DCASE2017_Xu_146.pdf	https://github.com/yongxuUSTC/dcase2017_task4_cvssp	Event recognition																
 2017	inproceedings	A study on LSTM networks for polyphonic music sequence modelling	Ycart, Adrien and Benetos, Emmanouil	https://qmro.qmul.ac.uk/xmlui/handle/123456789/24946	http://www.eecs.qmul.ac.uk/~ay304/code/ismir17	Polyphonic music sequence modelling		Inhouse & [Piano-midi.de](Piano-midi.de)		RNN-LSTM				Pitch shift								
+2018	inproceedings	MUSIC TRANSFORMER:GENERATING MUSIC WITH LONG-TERM STRUCTURE	Cheng{-}Zhi Anna Huang and Ashish Vaswani and Jakob Uszkoreit and Noam Shazeer and Curtis Hawthorne and Andrew M. Dai and Matthew D. Hoffman and Douglas Eck	https://arxiv.org/pdf/1809.04281.pdf		Polyphonic music sequence modelling		[J.S. Bach chorales dataset](https://github.com/czhuang/JSB-Chorales-dataset) & [Piano-e-Competition dataset (competition history)](http://www.piano-e-competition.com/)	tensor2tensor	Transformer & RNN		No		Time Stretches & pitch transcription	MIDI			No				
 2018	inproceedings	MuseGAN: Multi-track sequential generative adversarial networks for symbolic music generation and accompaniment	Dong, Hao-Wen and Hsiao, Wen-Yi and Yang, Li-Chia and Yang, Yi-Hsuan	https://arxiv.org/pdf/1709.06298.pdf	https://github.com/salu133445/musegan	Composition	No	[Lakh Pianoroll Datase](https://github.com/salu133445/musegan/blob/master/docs/dataset.md)	No	GAN & CNN	No	No	No	No	Piano-roll	1D	ReLU & Leaky ReLU	No	No	Adam	1 Tesla K40m	
+2018	article	Music Theory Inspired Policy Gradient Method for Piano Music Transcription	Li, Juncheng and Qu, Shuhui and Metze, Florian	https://nips2018creativity.github.io/doc/music_theory_inspired_policy_gradient.pdf		Music Transcription				CNN & RNN												
+2019	unpublished	Generating Long Sequences with Sparse Transformers	Rewon Child and Scott Gray and Alec Radford and Ilya Sutskever	https://arxiv.org/pdf/1904.10509.pdf	https://github.com/openai/sparse_attention	audio generation				Transformer		No			Raw Audio			No				
diff --git a/fig/articles_per_year.png b/fig/articles_per_year.png
diff --git a/fig/pie_chart_architecture.png b/fig/pie_chart_architecture.png
diff --git a/fig/pie_chart_dataset.png b/fig/pie_chart_dataset.png
diff --git a/fig/pie_chart_framework.png b/fig/pie_chart_framework.png
diff --git a/fig/pie_chart_task.png b/fig/pie_chart_task.png
diff --git a/frameworks.md b/frameworks.md
@@ -11,3 +11,4 @@ Please refer to the list of useful acronyms used in deep learning and music: [ac
 - PyTorch
 - Tensorflow
 - Theano
+- tensor2tensor
diff --git a/publication_type.md b/publication_type.md
@@ -22,6 +22,7 @@
 - Biennial Symposium for Arts and Technology
 - CBMI
 - CSMC
+- CoRR
 - Connectionist Models Summer School
 - Convention of Electrical and Electronics Engineers
 - DLRS

diff --git a/tasks.md b/tasks.md
@@ -18,6 +18,7 @@ Please refer to the list of useful acronyms used in deep learning and music: [ac
 - MSR
 - Manifesto
 - Mixing
+- Music Transcription
 - Music/Noise segmentation
 - Noise suppression
 - Onset detection
@@ -35,3 +36,4 @@ Please refer to the list of useful acronyms used in deep learning and music: [ac
 - Syllable segmentation
 - Transcription
 - VAD
+- audio generation