diff --git a/consistency_tta.html b/consistency_tta.html new file mode 100644 index 0000000..b7c3e91 --- /dev/null +++ b/consistency_tta.html @@ -0,0 +1,55 @@ + + + + + + Consistency TTA Generation + + + + +
+

Accelerating Diffusion-Based Text-to-Audio
Generation with Consistency Distillation

+

Yatong Bai, Trung Dang, Dung Tran, Kazuhito Koishida, Somayeh Sojoudi

+ +
+ + + + + + + + + + + +
+
+ +
+

Description

+

+ Diffusion models power the vast majority of text-to-audio generation methods. + Unfortunately, diffusion models suffer from slow inference because they iteratively query the + underlying denoising network, making them unsuitable for applications with time or computational constraints. + This work modifies the recently proposed "consistency distillation" framework to train text-to-audio models + that require only a single neural network query, accelerating generation by hundreds of times. +

+

+ By incorporating classifier-free guidance into the distillation framework, our models retain diffusion models' impressive generation quality and diversity. + Furthermore, the non-recurrent differentiable structure resulting from the distillation allows fine-tuning with novel loss functions. + We use the CLAP loss as an example, confirming that end-to-end fine-tuning further boosts the generation quality. +

+
+ +
+

Contact

+

For any questions regarding our work, please email yatong_bai@berkeley.edu.

+
+ + + + diff --git a/consistency_tta/.gitignore b/consistency_tta/.gitignore new file mode 100644 index 0000000..5509140 --- /dev/null +++ b/consistency_tta/.gitignore @@ -0,0 +1 @@ +*.DS_Store diff --git a/consistency_tta/LICENSE.md b/consistency_tta/LICENSE.md new file mode 100644 index 0000000..47cd78a --- /dev/null +++ b/consistency_tta/LICENSE.md @@ -0,0 +1,141 @@ +## Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International Public License + +By exercising the Licensed Rights (defined below), You accept and agree to be bound by the terms and conditions of this Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International Public License ("Public License"). To the extent this Public License may be interpreted as a contract, You are granted the Licensed Rights in consideration of Your acceptance of these terms and conditions, and the Licensor grants You such rights in consideration of benefits the Licensor receives from making the Licensed Material available under these terms and conditions. + +### Section 1 – Definitions. + +a. __Adapted Material__ means material subject to Copyright and Similar Rights that is derived from or based upon the Licensed Material and in which the Licensed Material is translated, altered, arranged, transformed, or otherwise modified in a manner requiring permission under the Copyright and Similar Rights held by the Licensor. For purposes of this Public License, where the Licensed Material is a musical work, performance, or sound recording, Adapted Material is always produced where the Licensed Material is synched in timed relation with a moving image. + +b. __Copyright and Similar Rights__ means copyright and/or similar rights closely related to copyright including, without limitation, performance, broadcast, sound recording, and Sui Generis Database Rights, without regard to how the rights are labeled or categorized. 
For purposes of this Public License, the rights specified in Section 2(b)(1)-(2) are not Copyright and Similar Rights. + +e. __Effective Technological Measures__ means those measures that, in the absence of proper authority, may not be circumvented under laws fulfilling obligations under Article 11 of the WIPO Copyright Treaty adopted on December 20, 1996, and/or similar international agreements. + +f. __Exceptions and Limitations__ means fair use, fair dealing, and/or any other exception or limitation to Copyright and Similar Rights that applies to Your use of the Licensed Material. + +h. __Licensed Material__ means the artistic or literary work, database, or other material to which the Licensor applied this Public License. + +i. __Licensed Rights__ means the rights granted to You subject to the terms and conditions of this Public License, which are limited to all Copyright and Similar Rights that apply to Your use of the Licensed Material and that the Licensor has authority to license. + +h. __Licensor__ means the individual(s) or entity(ies) granting rights under this Public License. + +i. __NonCommercial__ means not primarily intended for or directed towards commercial advantage or monetary compensation. For purposes of this Public License, the exchange of the Licensed Material for other material subject to Copyright and Similar Rights by digital file-sharing or similar means is NonCommercial provided there is no payment of monetary compensation in connection with the exchange. + +j. __Share__ means to provide material to the public by any means or process that requires permission under the Licensed Rights, such as reproduction, public display, public performance, distribution, dissemination, communication, or importation, and to make material available to the public including in ways that members of the public may access the material from a place and at a time individually chosen by them. + +k. 
__Sui Generis Database Rights__ means rights other than copyright resulting from Directive 96/9/EC of the European Parliament and of the Council of 11 March 1996 on the legal protection of databases, as amended and/or succeeded, as well as other essentially equivalent rights anywhere in the world. + +l. __You__ means the individual or entity exercising the Licensed Rights under this Public License. Your has a corresponding meaning. + +### Section 2 – Scope. + +a. ___License grant.___ + + 1. Subject to the terms and conditions of this Public License, the Licensor hereby grants You a worldwide, royalty-free, non-sublicensable, non-exclusive, irrevocable license to exercise the Licensed Rights in the Licensed Material to: + + A. reproduce and Share the Licensed Material, in whole or in part, for NonCommercial purposes only; and + + B. produce and reproduce, but not Share, Adapted Material for NonCommercial purposes only. + + 2. __Exceptions and Limitations.__ For the avoidance of doubt, where Exceptions and Limitations apply to Your use, this Public License does not apply, and You do not need to comply with its terms and conditions. + + 3. __Term.__ The term of this Public License is specified in Section 6(a). + + 4. __Media and formats; technical modifications allowed.__ The Licensor authorizes You to exercise the Licensed Rights in all media and formats whether now known or hereafter created, and to make technical modifications necessary to do so. The Licensor waives and/or agrees not to assert any right or authority to forbid You from making technical modifications necessary to exercise the Licensed Rights, including technical modifications necessary to circumvent Effective Technological Measures. For purposes of this Public License, simply making modifications authorized by this Section 2(a)(4) never produces Adapted Material. + + 5. __Downstream recipients.__ + + A. 
__Offer from the Licensor – Licensed Material.__ Every recipient of the Licensed Material automatically receives an offer from the Licensor to exercise the Licensed Rights under the terms and conditions of this Public License. + + B. __No downstream restrictions.__ You may not offer or impose any additional or different terms or conditions on, or apply any Effective Technological Measures to, the Licensed Material if doing so restricts exercise of the Licensed Rights by any recipient of the Licensed Material. + + 6. __No endorsement.__ Nothing in this Public License constitutes or may be construed as permission to assert or imply that You are, or that Your use of the Licensed Material is, connected with, or sponsored, endorsed, or granted official status by, the Licensor or others designated to receive attribution as provided in Section 3(a)(1)(A)(i). + +b. ___Other rights.___ + + 1. Moral rights, such as the right of integrity, are not licensed under this Public License, nor are publicity, privacy, and/or other similar personality rights; however, to the extent possible, the Licensor waives and/or agrees not to assert any such rights held by the Licensor to the limited extent necessary to allow You to exercise the Licensed Rights, but not otherwise. + + 2. Patent and trademark rights are not licensed under this Public License. + + 3. To the extent possible, the Licensor waives any right to collect royalties from You for the exercise of the Licensed Rights, whether directly or through a collecting society under any voluntary or waivable statutory or compulsory licensing scheme. In all other cases the Licensor expressly reserves any right to collect such royalties, including when the Licensed Material is used other than for NonCommercial purposes. + +### Section 3 – License Conditions. + +Your exercise of the Licensed Rights is expressly made subject to the following conditions. + +a. ___Attribution.___ + + 1. If You Share the Licensed Material, You must: + + A. 
retain the following if it is supplied by the Licensor with the Licensed Material: + + i. identification of the creator(s) of the Licensed Material and any others designated to receive attribution, in any reasonable manner requested by the Licensor (including by pseudonym if designated); + + ii. a copyright notice; + + iii. a notice that refers to this Public License; + + iv. a notice that refers to the disclaimer of warranties; + + v. a URI or hyperlink to the Licensed Material to the extent reasonably practicable; + + B. indicate if You modified the Licensed Material and retain an indication of any previous modifications; and + + C. indicate the Licensed Material is licensed under this Public License, and include the text of, or the URI or hyperlink to, this Public License. + + For the avoidance of doubt, You do not have permission under this Public License to Share Adapted Material. + + 2. You may satisfy the conditions in Section 3(a)(1) in any reasonable manner based on the medium, means, and context in which You Share the Licensed Material. For example, it may be reasonable to satisfy the conditions by providing a URI or hyperlink to a resource that includes the required information. + + 3. If requested by the Licensor, You must remove any of the information required by Section 3(a)(1)(A) to the extent reasonably practicable. + +### Section 4 – Sui Generis Database Rights. + +Where the Licensed Rights include Sui Generis Database Rights that apply to Your use of the Licensed Material: + +a. for the avoidance of doubt, Section 2(a)(1) grants You the right to extract, reuse, reproduce, and Share all or a substantial portion of the contents of the database for NonCommercial purposes only and provided You do not Share Adapted Material; + +b. 
if You include all or a substantial portion of the database contents in a database in which You have Sui Generis Database Rights, then the database in which You have Sui Generis Database Rights (but not its individual contents) is Adapted Material; and + +c. You must comply with the conditions in Section 3(a) if You Share all or a substantial portion of the contents of the database. + +For the avoidance of doubt, this Section 4 supplements and does not replace Your obligations under this Public License where the Licensed Rights include other Copyright and Similar Rights. + +### Section 5 – Disclaimer of Warranties and Limitation of Liability. + +a. __Unless otherwise separately undertaken by the Licensor, to the extent possible, the Licensor offers the Licensed Material as-is and as-available, and makes no representations or warranties of any kind concerning the Licensed Material, whether express, implied, statutory, or other. This includes, without limitation, warranties of title, merchantability, fitness for a particular purpose, non-infringement, absence of latent or other defects, accuracy, or the presence or absence of errors, whether or not known or discoverable. Where disclaimers of warranties are not allowed in full or in part, this disclaimer may not apply to You.__ + +b. __To the extent possible, in no event will the Licensor be liable to You on any legal theory (including, without limitation, negligence) or otherwise for any direct, special, indirect, incidental, consequential, punitive, exemplary, or other losses, costs, expenses, or damages arising out of this Public License or use of the Licensed Material, even if the Licensor has been advised of the possibility of such losses, costs, expenses, or damages. Where a limitation of liability is not allowed in full or in part, this limitation may not apply to You.__ + +c. 
The disclaimer of warranties and limitation of liability provided above shall be interpreted in a manner that, to the extent possible, most closely approximates an absolute disclaimer and waiver of all liability. + +### Section 6 – Term and Termination. + +a. This Public License applies for the term of the Copyright and Similar Rights licensed here. However, if You fail to comply with this Public License, then Your rights under this Public License terminate automatically. + +b. Where Your right to use the Licensed Material has terminated under Section 6(a), it reinstates: + + 1. automatically as of the date the violation is cured, provided it is cured within 30 days of Your discovery of the violation; or + + 2. upon express reinstatement by the Licensor. + + For the avoidance of doubt, this Section 6(b) does not affect any right the Licensor may have to seek remedies for Your violations of this Public License. + +c. For the avoidance of doubt, the Licensor may also offer the Licensed Material under separate terms or conditions or stop distributing the Licensed Material at any time; however, doing so will not terminate this Public License. + +d. Sections 1, 5, 6, 7, and 8 survive termination of this Public License. + +### Section 7 – Other Terms and Conditions. + +a. The Licensor shall not be bound by any additional or different terms or conditions communicated by You unless expressly agreed. + +b. Any arrangements, understandings, or agreements regarding the Licensed Material not stated herein are separate from and independent of the terms and conditions of this Public License. + +### Section 8 – Interpretation. + +a. For the avoidance of doubt, this Public License does not, and shall not be interpreted to, reduce, limit, restrict, or impose conditions on any use of the Licensed Material that could lawfully be made without permission under this Public License. + +b. 
To the extent possible, if any provision of this Public License is deemed unenforceable, it shall be automatically reformed to the minimum extent necessary to make it enforceable. If the provision cannot be reformed, it shall be severed from this Public License without affecting the enforceability of the remaining terms and conditions. + +c. No term or condition of this Public License will be waived and no failure to comply consented to unless expressly agreed to by the Licensor. + +d. Nothing in this Public License constitutes or may be interpreted as a limitation upon, or waiver of, any privileges and immunities that apply to the Licensor or You, including from the legal processes of any jurisdiction or authority. diff --git a/consistency_tta/README.md b/consistency_tta/README.md new file mode 100644 index 0000000..5863402 --- /dev/null +++ b/consistency_tta/README.md @@ -0,0 +1,5 @@ +## Accelerating Diffusion-Based Text-to-Audio Generation with Consistency Distillation + +This is the official website for the paper Accelerating Diffusion-Based Text-to-Audio Generation with Consistency Distillation by Yatong Bai, Trung Dang, Dung Tran, Kazuhito Koishida, and Somayeh Sojoudi. + +The website has two parts: a demo page and an example of the human evaluation form. 
diff --git a/consistency_tta/audio/0.wav b/consistency_tta/audio/0.wav new file mode 100644 index 0000000..0d6c130 Binary files /dev/null and b/consistency_tta/audio/0.wav differ diff --git a/consistency_tta/audio/1.wav b/consistency_tta/audio/1.wav new file mode 100644 index 0000000..12d0b29 Binary files /dev/null and b/consistency_tta/audio/1.wav differ diff --git a/consistency_tta/audio/10.wav b/consistency_tta/audio/10.wav new file mode 100644 index 0000000..6d412ec Binary files /dev/null and b/consistency_tta/audio/10.wav differ diff --git a/consistency_tta/audio/11.wav b/consistency_tta/audio/11.wav new file mode 100644 index 0000000..a5b0d7d Binary files /dev/null and b/consistency_tta/audio/11.wav differ diff --git a/consistency_tta/audio/12.wav b/consistency_tta/audio/12.wav new file mode 100644 index 0000000..bd5171e Binary files /dev/null and b/consistency_tta/audio/12.wav differ diff --git a/consistency_tta/audio/13.wav b/consistency_tta/audio/13.wav new file mode 100644 index 0000000..4fd174e Binary files /dev/null and b/consistency_tta/audio/13.wav differ diff --git a/consistency_tta/audio/14.wav b/consistency_tta/audio/14.wav new file mode 100644 index 0000000..cf70eab Binary files /dev/null and b/consistency_tta/audio/14.wav differ diff --git a/consistency_tta/audio/15.wav b/consistency_tta/audio/15.wav new file mode 100644 index 0000000..dfc513a Binary files /dev/null and b/consistency_tta/audio/15.wav differ diff --git a/consistency_tta/audio/16.wav b/consistency_tta/audio/16.wav new file mode 100644 index 0000000..2bb59f5 Binary files /dev/null and b/consistency_tta/audio/16.wav differ diff --git a/consistency_tta/audio/17.wav b/consistency_tta/audio/17.wav new file mode 100644 index 0000000..0af1c72 Binary files /dev/null and b/consistency_tta/audio/17.wav differ diff --git a/consistency_tta/audio/18.wav b/consistency_tta/audio/18.wav new file mode 100644 index 0000000..1b8d59c Binary files /dev/null and b/consistency_tta/audio/18.wav differ 
diff --git a/consistency_tta/audio/19.wav b/consistency_tta/audio/19.wav new file mode 100644 index 0000000..cebfab3 Binary files /dev/null and b/consistency_tta/audio/19.wav differ diff --git a/consistency_tta/audio/2.wav b/consistency_tta/audio/2.wav new file mode 100644 index 0000000..ea1a198 Binary files /dev/null and b/consistency_tta/audio/2.wav differ diff --git a/consistency_tta/audio/20.wav b/consistency_tta/audio/20.wav new file mode 100644 index 0000000..17745fe Binary files /dev/null and b/consistency_tta/audio/20.wav differ diff --git a/consistency_tta/audio/21.wav b/consistency_tta/audio/21.wav new file mode 100644 index 0000000..1bea324 Binary files /dev/null and b/consistency_tta/audio/21.wav differ diff --git a/consistency_tta/audio/22.wav b/consistency_tta/audio/22.wav new file mode 100644 index 0000000..856a55b Binary files /dev/null and b/consistency_tta/audio/22.wav differ diff --git a/consistency_tta/audio/23.wav b/consistency_tta/audio/23.wav new file mode 100644 index 0000000..2541001 Binary files /dev/null and b/consistency_tta/audio/23.wav differ diff --git a/consistency_tta/audio/24.wav b/consistency_tta/audio/24.wav new file mode 100644 index 0000000..284a070 Binary files /dev/null and b/consistency_tta/audio/24.wav differ diff --git a/consistency_tta/audio/25.wav b/consistency_tta/audio/25.wav new file mode 100644 index 0000000..fab0093 Binary files /dev/null and b/consistency_tta/audio/25.wav differ diff --git a/consistency_tta/audio/26.wav b/consistency_tta/audio/26.wav new file mode 100644 index 0000000..d54c469 Binary files /dev/null and b/consistency_tta/audio/26.wav differ diff --git a/consistency_tta/audio/27.wav b/consistency_tta/audio/27.wav new file mode 100644 index 0000000..55bd7e1 Binary files /dev/null and b/consistency_tta/audio/27.wav differ diff --git a/consistency_tta/audio/28.wav b/consistency_tta/audio/28.wav new file mode 100644 index 0000000..3d1b962 Binary files /dev/null and b/consistency_tta/audio/28.wav differ 
diff --git a/consistency_tta/audio/29.wav b/consistency_tta/audio/29.wav new file mode 100644 index 0000000..6dec530 Binary files /dev/null and b/consistency_tta/audio/29.wav differ diff --git a/consistency_tta/audio/3.wav b/consistency_tta/audio/3.wav new file mode 100644 index 0000000..86af916 Binary files /dev/null and b/consistency_tta/audio/3.wav differ diff --git a/consistency_tta/audio/30.wav b/consistency_tta/audio/30.wav new file mode 100644 index 0000000..48194c8 Binary files /dev/null and b/consistency_tta/audio/30.wav differ diff --git a/consistency_tta/audio/31.wav b/consistency_tta/audio/31.wav new file mode 100644 index 0000000..f574990 Binary files /dev/null and b/consistency_tta/audio/31.wav differ diff --git a/consistency_tta/audio/32.wav b/consistency_tta/audio/32.wav new file mode 100644 index 0000000..0fb5b86 Binary files /dev/null and b/consistency_tta/audio/32.wav differ diff --git a/consistency_tta/audio/33.wav b/consistency_tta/audio/33.wav new file mode 100644 index 0000000..77099b5 Binary files /dev/null and b/consistency_tta/audio/33.wav differ diff --git a/consistency_tta/audio/34.wav b/consistency_tta/audio/34.wav new file mode 100644 index 0000000..dc60c64 Binary files /dev/null and b/consistency_tta/audio/34.wav differ diff --git a/consistency_tta/audio/35.wav b/consistency_tta/audio/35.wav new file mode 100644 index 0000000..78fa3b7 Binary files /dev/null and b/consistency_tta/audio/35.wav differ diff --git a/consistency_tta/audio/36.wav b/consistency_tta/audio/36.wav new file mode 100644 index 0000000..354b122 Binary files /dev/null and b/consistency_tta/audio/36.wav differ diff --git a/consistency_tta/audio/37.wav b/consistency_tta/audio/37.wav new file mode 100644 index 0000000..efb083c Binary files /dev/null and b/consistency_tta/audio/37.wav differ diff --git a/consistency_tta/audio/38.wav b/consistency_tta/audio/38.wav new file mode 100644 index 0000000..f97b640 Binary files /dev/null and b/consistency_tta/audio/38.wav differ 
diff --git a/consistency_tta/audio/39.wav b/consistency_tta/audio/39.wav new file mode 100644 index 0000000..7308e0c Binary files /dev/null and b/consistency_tta/audio/39.wav differ diff --git a/consistency_tta/audio/4.wav b/consistency_tta/audio/4.wav new file mode 100644 index 0000000..e1701a8 Binary files /dev/null and b/consistency_tta/audio/4.wav differ diff --git a/consistency_tta/audio/40.wav b/consistency_tta/audio/40.wav new file mode 100644 index 0000000..dc5004a Binary files /dev/null and b/consistency_tta/audio/40.wav differ diff --git a/consistency_tta/audio/41.wav b/consistency_tta/audio/41.wav new file mode 100644 index 0000000..8aae52a Binary files /dev/null and b/consistency_tta/audio/41.wav differ diff --git a/consistency_tta/audio/42.wav b/consistency_tta/audio/42.wav new file mode 100644 index 0000000..c2ca4b1 Binary files /dev/null and b/consistency_tta/audio/42.wav differ diff --git a/consistency_tta/audio/43.wav b/consistency_tta/audio/43.wav new file mode 100644 index 0000000..c15b588 Binary files /dev/null and b/consistency_tta/audio/43.wav differ diff --git a/consistency_tta/audio/44.wav b/consistency_tta/audio/44.wav new file mode 100644 index 0000000..00e8466 Binary files /dev/null and b/consistency_tta/audio/44.wav differ diff --git a/consistency_tta/audio/45.wav b/consistency_tta/audio/45.wav new file mode 100644 index 0000000..ba2403c Binary files /dev/null and b/consistency_tta/audio/45.wav differ diff --git a/consistency_tta/audio/46.wav b/consistency_tta/audio/46.wav new file mode 100644 index 0000000..0b7110d Binary files /dev/null and b/consistency_tta/audio/46.wav differ diff --git a/consistency_tta/audio/47.wav b/consistency_tta/audio/47.wav new file mode 100644 index 0000000..7033370 Binary files /dev/null and b/consistency_tta/audio/47.wav differ diff --git a/consistency_tta/audio/48.wav b/consistency_tta/audio/48.wav new file mode 100644 index 0000000..7bd52a9 Binary files /dev/null and b/consistency_tta/audio/48.wav differ 
diff --git a/consistency_tta/audio/49.wav b/consistency_tta/audio/49.wav new file mode 100644 index 0000000..f5d4983 Binary files /dev/null and b/consistency_tta/audio/49.wav differ diff --git a/consistency_tta/audio/5.wav b/consistency_tta/audio/5.wav new file mode 100644 index 0000000..0f50ad3 Binary files /dev/null and b/consistency_tta/audio/5.wav differ diff --git a/consistency_tta/audio/50.wav b/consistency_tta/audio/50.wav new file mode 100644 index 0000000..04e0912 Binary files /dev/null and b/consistency_tta/audio/50.wav differ diff --git a/consistency_tta/audio/51.wav b/consistency_tta/audio/51.wav new file mode 100644 index 0000000..9472ac4 Binary files /dev/null and b/consistency_tta/audio/51.wav differ diff --git a/consistency_tta/audio/52.wav b/consistency_tta/audio/52.wav new file mode 100644 index 0000000..1d26b76 Binary files /dev/null and b/consistency_tta/audio/52.wav differ diff --git a/consistency_tta/audio/53.wav b/consistency_tta/audio/53.wav new file mode 100644 index 0000000..1d9e5bf Binary files /dev/null and b/consistency_tta/audio/53.wav differ diff --git a/consistency_tta/audio/54.wav b/consistency_tta/audio/54.wav new file mode 100644 index 0000000..3e4d43b Binary files /dev/null and b/consistency_tta/audio/54.wav differ diff --git a/consistency_tta/audio/55.wav b/consistency_tta/audio/55.wav new file mode 100644 index 0000000..cf3fb50 Binary files /dev/null and b/consistency_tta/audio/55.wav differ diff --git a/consistency_tta/audio/56.wav b/consistency_tta/audio/56.wav new file mode 100644 index 0000000..0378cff Binary files /dev/null and b/consistency_tta/audio/56.wav differ diff --git a/consistency_tta/audio/57.wav b/consistency_tta/audio/57.wav new file mode 100644 index 0000000..7419ee0 Binary files /dev/null and b/consistency_tta/audio/57.wav differ diff --git a/consistency_tta/audio/58.wav b/consistency_tta/audio/58.wav new file mode 100644 index 0000000..e394a9d Binary files /dev/null and b/consistency_tta/audio/58.wav differ 
diff --git a/consistency_tta/audio/59.wav b/consistency_tta/audio/59.wav new file mode 100644 index 0000000..81aeb3f Binary files /dev/null and b/consistency_tta/audio/59.wav differ diff --git a/consistency_tta/audio/6.wav b/consistency_tta/audio/6.wav new file mode 100644 index 0000000..40f2b63 Binary files /dev/null and b/consistency_tta/audio/6.wav differ diff --git a/consistency_tta/audio/60.wav b/consistency_tta/audio/60.wav new file mode 100644 index 0000000..4266334 Binary files /dev/null and b/consistency_tta/audio/60.wav differ diff --git a/consistency_tta/audio/61.wav b/consistency_tta/audio/61.wav new file mode 100644 index 0000000..cdd56a0 Binary files /dev/null and b/consistency_tta/audio/61.wav differ diff --git a/consistency_tta/audio/62.wav b/consistency_tta/audio/62.wav new file mode 100644 index 0000000..df8a1cf Binary files /dev/null and b/consistency_tta/audio/62.wav differ diff --git a/consistency_tta/audio/63.wav b/consistency_tta/audio/63.wav new file mode 100644 index 0000000..71d6db6 Binary files /dev/null and b/consistency_tta/audio/63.wav differ diff --git a/consistency_tta/audio/64.wav b/consistency_tta/audio/64.wav new file mode 100644 index 0000000..c9f869f Binary files /dev/null and b/consistency_tta/audio/64.wav differ diff --git a/consistency_tta/audio/65.wav b/consistency_tta/audio/65.wav new file mode 100644 index 0000000..eb32a69 Binary files /dev/null and b/consistency_tta/audio/65.wav differ diff --git a/consistency_tta/audio/66.wav b/consistency_tta/audio/66.wav new file mode 100644 index 0000000..b6e18fa Binary files /dev/null and b/consistency_tta/audio/66.wav differ diff --git a/consistency_tta/audio/67.wav b/consistency_tta/audio/67.wav new file mode 100644 index 0000000..ff64fd4 Binary files /dev/null and b/consistency_tta/audio/67.wav differ diff --git a/consistency_tta/audio/68.wav b/consistency_tta/audio/68.wav new file mode 100644 index 0000000..c916f47 Binary files /dev/null and b/consistency_tta/audio/68.wav differ 
diff --git a/consistency_tta/audio/69.wav b/consistency_tta/audio/69.wav new file mode 100644 index 0000000..6362235 Binary files /dev/null and b/consistency_tta/audio/69.wav differ diff --git a/consistency_tta/audio/7.wav b/consistency_tta/audio/7.wav new file mode 100644 index 0000000..084519f Binary files /dev/null and b/consistency_tta/audio/7.wav differ diff --git a/consistency_tta/audio/70.wav b/consistency_tta/audio/70.wav new file mode 100644 index 0000000..c463fdc Binary files /dev/null and b/consistency_tta/audio/70.wav differ diff --git a/consistency_tta/audio/71.wav b/consistency_tta/audio/71.wav new file mode 100644 index 0000000..28f6d91 Binary files /dev/null and b/consistency_tta/audio/71.wav differ diff --git a/consistency_tta/audio/72.wav b/consistency_tta/audio/72.wav new file mode 100644 index 0000000..936ec88 Binary files /dev/null and b/consistency_tta/audio/72.wav differ diff --git a/consistency_tta/audio/73.wav b/consistency_tta/audio/73.wav new file mode 100644 index 0000000..62feac4 Binary files /dev/null and b/consistency_tta/audio/73.wav differ diff --git a/consistency_tta/audio/74.wav b/consistency_tta/audio/74.wav new file mode 100644 index 0000000..4d9a1d0 Binary files /dev/null and b/consistency_tta/audio/74.wav differ diff --git a/consistency_tta/audio/75.wav b/consistency_tta/audio/75.wav new file mode 100644 index 0000000..faaceb9 Binary files /dev/null and b/consistency_tta/audio/75.wav differ diff --git a/consistency_tta/audio/76.wav b/consistency_tta/audio/76.wav new file mode 100644 index 0000000..dee4cef Binary files /dev/null and b/consistency_tta/audio/76.wav differ diff --git a/consistency_tta/audio/77.wav b/consistency_tta/audio/77.wav new file mode 100644 index 0000000..bd0a865 Binary files /dev/null and b/consistency_tta/audio/77.wav differ diff --git a/consistency_tta/audio/78.wav b/consistency_tta/audio/78.wav new file mode 100644 index 0000000..7d07790 Binary files /dev/null and b/consistency_tta/audio/78.wav differ 
diff --git a/consistency_tta/audio/79.wav b/consistency_tta/audio/79.wav new file mode 100644 index 0000000..9996c30 Binary files /dev/null and b/consistency_tta/audio/79.wav differ diff --git a/consistency_tta/audio/8.wav b/consistency_tta/audio/8.wav new file mode 100644 index 0000000..64c14a9 Binary files /dev/null and b/consistency_tta/audio/8.wav differ diff --git a/consistency_tta/audio/80.wav b/consistency_tta/audio/80.wav new file mode 100644 index 0000000..d91b2ab Binary files /dev/null and b/consistency_tta/audio/80.wav differ diff --git a/consistency_tta/audio/81.wav b/consistency_tta/audio/81.wav new file mode 100644 index 0000000..daf6145 Binary files /dev/null and b/consistency_tta/audio/81.wav differ diff --git a/consistency_tta/audio/82.wav b/consistency_tta/audio/82.wav new file mode 100644 index 0000000..1d53329 Binary files /dev/null and b/consistency_tta/audio/82.wav differ diff --git a/consistency_tta/audio/83.wav b/consistency_tta/audio/83.wav new file mode 100644 index 0000000..c63f226 Binary files /dev/null and b/consistency_tta/audio/83.wav differ diff --git a/consistency_tta/audio/84.wav b/consistency_tta/audio/84.wav new file mode 100644 index 0000000..834096f Binary files /dev/null and b/consistency_tta/audio/84.wav differ diff --git a/consistency_tta/audio/85.wav b/consistency_tta/audio/85.wav new file mode 100644 index 0000000..7eb52e3 Binary files /dev/null and b/consistency_tta/audio/85.wav differ diff --git a/consistency_tta/audio/86.wav b/consistency_tta/audio/86.wav new file mode 100644 index 0000000..df52f40 Binary files /dev/null and b/consistency_tta/audio/86.wav differ diff --git a/consistency_tta/audio/87.wav b/consistency_tta/audio/87.wav new file mode 100644 index 0000000..2f8b942 Binary files /dev/null and b/consistency_tta/audio/87.wav differ diff --git a/consistency_tta/audio/88.wav b/consistency_tta/audio/88.wav new file mode 100644 index 0000000..2535fa9 Binary files /dev/null and b/consistency_tta/audio/88.wav differ 
diff --git a/consistency_tta/audio/89.wav b/consistency_tta/audio/89.wav new file mode 100644 index 0000000..cf0e494 Binary files /dev/null and b/consistency_tta/audio/89.wav differ diff --git a/consistency_tta/audio/9.wav b/consistency_tta/audio/9.wav new file mode 100644 index 0000000..67c7271 Binary files /dev/null and b/consistency_tta/audio/9.wav differ diff --git a/consistency_tta/audio/90.wav b/consistency_tta/audio/90.wav new file mode 100644 index 0000000..fca8715 Binary files /dev/null and b/consistency_tta/audio/90.wav differ diff --git a/consistency_tta/audio/91.wav b/consistency_tta/audio/91.wav new file mode 100644 index 0000000..cbea0a7 Binary files /dev/null and b/consistency_tta/audio/91.wav differ diff --git a/consistency_tta/audio/92.wav b/consistency_tta/audio/92.wav new file mode 100644 index 0000000..5cff587 Binary files /dev/null and b/consistency_tta/audio/92.wav differ diff --git a/consistency_tta/audio/93.wav b/consistency_tta/audio/93.wav new file mode 100644 index 0000000..9fa57de Binary files /dev/null and b/consistency_tta/audio/93.wav differ diff --git a/consistency_tta/audio/94.wav b/consistency_tta/audio/94.wav new file mode 100644 index 0000000..f9b85e2 Binary files /dev/null and b/consistency_tta/audio/94.wav differ diff --git a/consistency_tta/audio/95.wav b/consistency_tta/audio/95.wav new file mode 100644 index 0000000..1f07694 Binary files /dev/null and b/consistency_tta/audio/95.wav differ diff --git a/consistency_tta/audio/96.wav b/consistency_tta/audio/96.wav new file mode 100644 index 0000000..f68a303 Binary files /dev/null and b/consistency_tta/audio/96.wav differ diff --git a/consistency_tta/audio/97.wav b/consistency_tta/audio/97.wav new file mode 100644 index 0000000..01fad7f Binary files /dev/null and b/consistency_tta/audio/97.wav differ diff --git a/consistency_tta/audio/98.wav b/consistency_tta/audio/98.wav new file mode 100644 index 0000000..3f04280 Binary files /dev/null and b/consistency_tta/audio/98.wav differ 
diff --git a/consistency_tta/audio/99.wav b/consistency_tta/audio/99.wav new file mode 100644 index 0000000..d6e08f5 Binary files /dev/null and b/consistency_tta/audio/99.wav differ diff --git a/consistency_tta/demo.html b/consistency_tta/demo.html new file mode 100644 index 0000000..a4e5b4e --- /dev/null +++ b/consistency_tta/demo.html @@ -0,0 +1,19 @@ + + + + + + Efficient Audio Generation + + + + +
+

TBD

+
+ + + + diff --git a/consistency_tta/evaluation.html b/consistency_tta/evaluation.html new file mode 100644 index 0000000..d2d4396 --- /dev/null +++ b/consistency_tta/evaluation.html @@ -0,0 +1,2769 @@ + + + + Consistency Model Human Eval + + + + +

Criteria for overall audio quality

+ + The meaning of each rating is:

+ + 5 - Excellent.
+ 4 - Overall slightly synthetic.
+ 3 - Clearly synthetic but recognizable.
+ 2 - Unclear/unidentifiable sound.
+ 1 - Completely unrecognizable.

+ + Since the generative models were not trained on speech data, + they are expected to generate unintelligible speech. + Therefore, please DO NOT consider the intelligibility of speech as a part of the criteria + (the voice quality can be taken into consideration).
+ +

Criteria for audio-text correspondence

+ + The meaning of each rating is:

+ + 5 - Excellent.
+ 4 - Temporal mismatch or other slight mismatches. + E.g., the prompt says one sound after another, but the audio has them simultaneously.
+ 3 - One of the sound components missing/redundant/incorrect. + E.g., the prompt requests four sound components, but the audio only has three or vice versa; + the prompt asks for one person speaking but there are two people in the audio.
+ 2 - More than one sound component missing/redundant/incorrect.
+ 1 - Totally incorrect.
+ +

Before starting the rating, clear the browser local storage using the following button.

+ + +

After completing the ratings, click the following button to download the data into a CSV.

+ + There is also a copy of this button at the bottom of the page. + +
+

Prompt 0

+

Rain and thunder

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 1

+

A loud bang followed by an engine idling loudly

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 2

+

A man speaking while water runs in the background

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 3

+

An electric motor runs then a person speaks

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 4

+

A helicopter engine operating while wind blows heavily into a microphone

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 5

+

A sewing machine sews followed by a man talking

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 6

+

A woman talks briefly as several goats bleat including one that has high pitched bleats. A crunch is followed by a man speaking

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 7

+

High pressure liquid spraying as a radio plays in the background

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 8

+

Male speech and then scraping

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 9

+

Mechanical rotation and then a loud click occurs

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 10

+

A loud bang followed by an engine idling loudly

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 11

+

Humming from a large engine

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 12

+

A motor vehicle engine is revving

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 13

+

A bus engine driving in the distance then nearby followed by compressed air releasing while a woman and a child talk in the distance

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 14

+

A woman speaks, and a motor vehicle revs its engine

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 15

+

A vehicle accelerating then driving by as gusts of wind blow and leaves rustle in the distance

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 16

+

A car engine idling then starts to rev shortly after

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 17

+

Rain and thunder

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 18

+

A man talking followed by a camera muffling and footsteps shuffling then wood lightly clanking

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 19

+

An electric motor runs then a person speaks

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 20

+

A helicopter engine operating while wind blows heavily into a microphone

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 21

+

Mechanical rotation and then a loud click occurs

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 22

+

A machine motor running as a man is speaking followed by rapid buzzing

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 23

+

A vehicle accelerating then driving by as gusts of wind blow and leaves rustle in the distance

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 24

+

Train passing followed by short honk

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 25

+

A woman speaks, and a motor vehicle revs its engine

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 26

+

Several puppies yapping

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 27

+

A person gulping followed by glass tapping then liquid shaking in a container proceeded by liquid pouring before plastic thumps on paper

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 28

+

A nearby insect buzzes with nearby vibrations

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 29

+

A bus engine driving in the distance then nearby followed by compressed air releasing while a woman and a child talk in the distance

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 30

+

A bus engine driving in the distance then nearby followed by compressed air releasing while a woman and a child talk in the distance

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 31

+

High pressure liquid spraying as a radio plays in the background

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 32

+

A loud bang followed by an engine idling loudly

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 33

+

Mechanical rotation and then a loud click occurs

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 34

+

A motor vehicle engine is revving

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 35

+

A woman speaks, and a motor vehicle revs its engine

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 36

+

An electric motor runs then a person speaks

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 37

+

A man speaking while water runs in the background

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 38

+

Man talking in the wind and someone yells in the background while an engine makes squealing and air puffing sounds

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 39

+

A person gulping followed by glass tapping then liquid shaking in a container proceeded by liquid pouring before plastic thumps on paper

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 40

+

Male speech and then scraping

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 41

+

Mechanical rotation and then a loud click occurs

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 42

+

Several puppies yapping

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 43

+

Train passing followed by short honk

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 44

+

An baby laughing

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 45

+

Humming from a large engine

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 46

+

An baby laughing

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 47

+

A man speaking while water runs in the background

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 48

+

A man talking followed by a camera muffling and footsteps shuffling then wood lightly clanking

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 49

+

A horse gallops then trot on grass as gusts of wind blow and thunderclaps in the distance

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 50

+

A sewing machine sews followed by a man talking

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 51

+

An baby laughing

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 52

+

A horse gallops then trot on grass as gusts of wind blow and thunderclaps in the distance

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 53

+

Train passing followed by short honk

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 54

+

A man speaking while water runs in the background

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 55

+

Several puppies yapping

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 56

+

Several puppies yapping

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 57

+

A person gulping followed by glass tapping then liquid shaking in a container proceeded by liquid pouring before plastic thumps on paper

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 58

+

A woman talks briefly as several goats bleat including one that has high pitched bleats. A crunch is followed by a man speaking

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 59

+

Rain and thunder

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 60

+

Humming from a large engine

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 61

+

A car engine idling then starts to rev shortly after

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 62

+

High pressure liquid spraying as a radio plays in the background

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 63

+

A woman speaks, and a motor vehicle revs its engine

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 64

+

A nearby insect buzzes with nearby vibrations

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 65

+

Train passing followed by short honk

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 66

+

Rain and thunder

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 67

+

A bus engine driving in the distance then nearby followed by compressed air releasing while a woman and a child talk in the distance

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 68

+

Male speech and then scraping

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 69

+

An electric motor runs then a person speaks

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 70

+

A machine motor running as a man is speaking followed by rapid buzzing

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 71

+

A vehicle accelerating then driving by as gusts of wind blow and leaves rustle in the distance

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 72

+

A machine motor running as a man is speaking followed by rapid buzzing

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 73

+

A car engine idling then starts to rev shortly after

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 74

+

A helicopter engine operating while wind blows heavily into a microphone

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 75

+

A man talking followed by a camera muffling and footsteps shuffling then wood lightly clanking

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 76

+

A vehicle accelerating then driving by as gusts of wind blow and leaves rustle in the distance

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 77

+

A motor vehicle engine is revving

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 78

+

High pressure liquid spraying as a radio plays in the background

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 79

+

Man talking in the wind and someone yells in the background while an engine makes squealing and air puffing sounds

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 80

+

A woman talks briefly as several goats bleat including one that has high pitched bleats. A crunch is followed by a man speaking

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 81

+

A sewing machine sews followed by a man talking

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 82

+

A machine motor running as a man is speaking followed by rapid buzzing

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 83

+

A loud bang followed by an engine idling loudly

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 84

+

Man talking in the wind and someone yells in the background while an engine makes squealing and air puffing sounds

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 85

+

Male speech and then scraping

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 86

+

An baby laughing

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 87

+

A nearby insect buzzes with nearby vibrations

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 88

+

A horse gallops then trot on grass as gusts of wind blow and thunderclaps in the distance

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 89

+

Humming from a large engine

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 90

+

A nearby insect buzzes with nearby vibrations

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 91

+

A motor vehicle engine is revving

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 92

+

A car engine idling then starts to rev shortly after

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 93

+

A helicopter engine operating while wind blows heavily into a microphone

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 94

+

A horse gallops then trot on grass as gusts of wind blow and thunderclaps in the distance

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 95

+

A man talking followed by a camera muffling and footsteps shuffling then wood lightly clanking

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 96

+

A sewing machine sews followed by a man talking

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 97

+

A person gulping followed by glass tapping then liquid shaking in a container proceeded by liquid pouring before plastic thumps on paper

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 98

+

Man talking in the wind and someone yells in the background while an engine makes squealing and air puffing sounds

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Prompt 99

+

A woman talks briefly as several goats bleat including one that has high pitched bleats. A crunch is followed by a man speaking

+ + +

Rate on overall audio quality.

+
+ + + + + +
+

+ +

Rate on audio-text correspondence.

+
+ + + + + +
+

+ +
+

Please remember to download the ratings into a CSV file.

+ + Please email the CSV file to t-yatongbai@microsoft.com or send it to Yatong Bai via Teams.

+ Thank you for filling in the evaluation. + + + + diff --git a/consistency_tta/js_script.js b/consistency_tta/js_script.js new file mode 100644 index 0000000..7e39913 --- /dev/null +++ b/consistency_tta/js_script.js @@ -0,0 +1,97 @@ +function saveRating(rating, thing, aspect) { + // Construct the key for localStorage + let key = thing + '-' + aspect; + + // Save the rating to localStorage + localStorage.setItem(key, rating); + + // Reset color of all buttons for the current thing and aspect + let buttons = document.querySelectorAll('#' + key + ' .button'); + buttons.forEach(button => button.classList.remove("clicked")); + + // Change color of clicked button + let button = buttons[rating - 1]; + button.classList.add("clicked"); + + // Display message + let messageElement = document.getElementById("message-" + key); + messageElement.textContent = "The rating of " + rating + " has been received."; +} + + +function clearRatings() { + // Confirm with the user before clearing ratings + if (confirm("Are you sure you want to clear all ratings?")) { + // Clear all data from localStorage + localStorage.clear(); + + // Reset all button colors + let buttons = document.querySelectorAll('.button'); + buttons.forEach(button => button.classList.remove("clicked")); + + // Clear all messages + let messages = document.querySelectorAll('[id^="message-"]'); + messages.forEach(messageElement => messageElement.textContent = ""); + + alert("All ratings have been cleared."); + } +} + + +// When the page loads, retrieve ratings from localStorage (if any) and update button colors +document.addEventListener('DOMContentLoaded', function() { + let things = ['thing1']; // Add other things to this array + let aspects = ['aspect1']; // Add other aspects to this array + + things.forEach(thing => { + aspects.forEach(aspect => { + let rating = localStorage.getItem(thing + '-' + aspect); + if (rating) { + let buttons = document.querySelectorAll('#' + thing + '-' + aspect + ' .button'); + 
buttons[rating - 1].classList.add("clicked"); + } + }); + }); +}); + + +function downloadLocalStorageData(name) { + // Create a CSV string + let csvContent = "Index,Aspect,Rating\n"; + + for (let i = 0; i < localStorage.length; i++) { + let key = localStorage.key(i); + let value = localStorage.getItem(key); + + let [thing, aspect] = key.split('-'); + csvContent += thing + "," + aspect + "," + value + "\n"; + } + + // Create a blob from the CSV string + let blob = new Blob([csvContent], { type: "text/csv;charset=utf-8" }); + + // Create a download link and trigger it + let link = document.createElement("a"); + let url = URL.createObjectURL(blob); + link.setAttribute("href", url); + link.setAttribute("download", name + ".csv"); + document.body.appendChild(link); + link.click(); + document.body.removeChild(link); +} + + +// Function to set button colors based on localStorage values +function setButtonColorsFromLocalStorage() { + for (let i = 0; i < localStorage.length; i++) { + let key = localStorage.key(i); + let rating = localStorage.getItem(key); + + let buttons = document.querySelectorAll('#' + key + ' .button'); + buttons.forEach(button => button.classList.remove("clicked")); // Reset all button colors + buttons[rating - 1].classList.add("clicked"); // Set the color of the rated button + } +} + +// Event listener to execute the function when the content is loaded +document.addEventListener('DOMContentLoaded', setButtonColorsFromLocalStorage); diff --git a/consistency_tta/report.pdf b/consistency_tta/report.pdf new file mode 100644 index 0000000..5dd1dc4 Binary files /dev/null and b/consistency_tta/report.pdf differ diff --git a/consistency_tta/styles.css b/consistency_tta/styles.css new file mode 100644 index 0000000..1f47364 --- /dev/null +++ b/consistency_tta/styles.css @@ -0,0 +1,82 @@ +body { + font-family: 'Lato', sans-serif; + margin: 0; + padding: 0; + background-color: #f9f9f9; + color: #333; +} + +header { + background-color: #264653; /* Deep Blue */ + 
color: #fff; + text-align: center; + padding: 2.4rem 0; +} + +header h3 { + font-weight: 300; + font-size: 1.2em; + margin-top: 0.8em; +} + +.description { + font-weight: 400; + max-width: 800px; + margin: 20px auto; + padding: 20px; + box-shadow: 0 0 15px rgba(0, 0, 0, 0.1); + background-color: #fff; + border-radius: 8px; /* Rounded corners */ +} + +a { + text-decoration: none; + margin-right: 30px; /* Add spacing between the two buttons */ +} + +button { + font-family: 'Lato', sans-serif; + color: #fff; + border: none; + padding: 12px 5px; + cursor: pointer; + transition: background-color 0.3s; + font-size: 1.2em; + width: 220px; + border-radius: 4px; /* Rounded button edges */ +} + +.demo-button { + background-color: #7a5947; +} + +.paper-button { + background-color: #158590; +} + +.eval-button { + background-color: #5d5d5d; +} + +button:hover { + opacity: 0.75 /* A generic hover effect for all buttons */ +} + +a:last-child { + margin-right: 0; +} + +a[href^="mailto:"] { + /* styles for email links */ + color: #264653; + text-decoration: underline; +} + + +footer { + background-color: #264653; /* Dark Desaturated Blue */ + color: #e9e9e9; + text-align: center; + padding: 1rem 0; + margin-top: 40px; +}
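A note on the CSV export in `js_script.js` above: `downloadLocalStorageData` writes every `localStorage` key for the origin, and `key.split('-')` keeps only the first two hyphen-separated parts of each key. The sketch below shows one possible way to make that step more defensive. It is not part of the commit: `buildRatingsCsv` is a hypothetical helper, the sample keys such as `12-quality` are made up for illustration, and the 1 to 5 range is assumed from the rating criteria on the evaluation page.

```javascript
// Hypothetical helper (not in js_script.js): builds the ratings CSV from
// [key, value] pairs shaped like localStorage entries.
function buildRatingsCsv(entries) {
  const rows = ["Index,Aspect,Rating"];
  for (const [key, value] of entries) {
    // Split on the LAST hyphen so an item id that contains '-' stays intact.
    const cut = key.lastIndexOf("-");
    if (cut <= 0 || cut === key.length - 1) continue; // not "<thing>-<aspect>"

    // Keep only integer ratings on the assumed 1-5 scale; this also drops
    // unrelated keys that other pages on the same origin may have stored.
    const rating = Number(value);
    if (!Number.isInteger(rating) || rating < 1 || rating > 5) continue;

    rows.push([key.slice(0, cut), key.slice(cut + 1), rating].join(","));
  }
  return rows.join("\n") + "\n";
}

// In a browser, the entries could be collected with the same Storage calls
// the original script uses:
//   const entries = Array.from({ length: localStorage.length }, (_, i) => {
//     const k = localStorage.key(i);
//     return [k, localStorage.getItem(k)];
//   });
//   const csv = buildRatingsCsv(entries);
```

Keeping the CSV construction in a pure function separates it from the Blob and download-link code, so it can be checked outside a browser. The same filtering idea would apply to `setButtonColorsFromLocalStorage`, which reads `buttons[rating - 1]` for every stored key and would throw if a key does not correspond to a rating group on the page.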