Auto. Make Doomgrad HF Review on 30 January
1 parent 8c398dc, commit d914ca4
Showing 8 changed files with 2,135 additions and 82 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,180 @@ | ||
|
||
<!DOCTYPE html> | ||
<html lang="en"> | ||
<head> | ||
<script async src="https://www.googletagmanager.com/gtag/js?id=G-C1CRWDNJ1J"></script> | ||
<script> | ||
window.dataLayer = window.dataLayer || []; | ||
function gtag(){dataLayer.push(arguments);} | ||
gtag('js', new Date()); | ||
gtag('config', 'G-C1CRWDNJ1J'); | ||
</script> | ||
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin> | ||
<link href="https://fonts.googleapis.com/css2?family=Noto+Sans+SC:wght@300;400&display=swap" rel="stylesheet"> | |
<meta charset="UTF-8"> | ||
<meta name="viewport" content="width=device-width, initial-scale=1.0"> | ||
<title>Chinese reading task about ML</title> | ||
<style> | ||
body { | ||
font-family: Arial, sans-serif; | ||
background-color: #f4f4f9; | ||
color: #333; | ||
margin: 0; | ||
padding: 20px; | ||
} | ||
.container { | ||
max-width: 800px; | ||
margin: 0 auto; | ||
background-color: #fff; | ||
padding: 20px; | ||
border-radius: 8px; | ||
box-shadow: 0 4px 8px rgba(0, 0, 0, 0.1); | ||
} | ||
h1 { | ||
color: #0056b3; | ||
text-align: center; | ||
} | ||
p { | ||
line-height: 1.6; | ||
} | ||
.zh-text { | ||
font-size: 1.3em; | ||
font-family: 'Noto Sans SC'; | ||
font-weight: 300; | ||
margin: 0 0 5px 0; | ||
} | ||
.pinyin { | ||
padding-top: 5px; | ||
padding-bottom: 5px; | ||
font-style: italic; | ||
color: #888; | ||
} | ||
table { | ||
width: 100%; | ||
border-collapse: collapse; | ||
margin-top: 20px; | ||
} | ||
th, td { | ||
padding: 12px; | ||
border: 1px solid #ddd; | ||
text-align: left; | ||
} | ||
th { | ||
background-color: #0056b3; | ||
color: #fff; | ||
} | ||
td { | ||
background-color: #f9f9f9; | ||
} | ||
td.zh { | ||
font-family: 'Noto Sans SC'; | ||
font-size: 1.2em; | ||
font-weight: 400; | ||
} | ||
</style> | ||
</head> | ||
<body> | ||
<div class="container"> | ||
<h1>SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training</h1> | ||
<div><p class='zh-text'>1. 这篇文章比较了监督微调(SFT)和强化学习(RL)在基础模型上的作用。</p> | ||
<p class='zh-text'>2. 研究发现,RL在文本和视觉任务上都表现出更好的泛化能力。</p> | ||
<p class='zh-text'>3. SFT倾向于记住训练数据,而RL能够处理未见过的变体。</p> | ||
<p class='zh-text'>4. RL还提高了模型的视觉识别能力。</p> | ||
<p class='zh-text'>5. 然而,SFT对于RL的有效训练仍然不可或缺。</p></div> | ||
<div class="pinyin"> | ||
<p>1. Zhè piān wénzhāng bǐjiào le jiàndū wēitiáo (SFT) hé qiánghuà xuéxí (RL) zài jīchǔ móxíng shàng de zuòyòng</p> | |
<p>2. Yánjiū fāxiàn, RL zài wénběn hé shìjué rènwù shàng dōu biǎoxiàn chū gèng hǎo de fànhuà nénglì</p> | ||
<p>3. SFT qīngxiàng yú jìzhù xùnliàn shùjù, ér RL nénggòu chǔlǐ wèi jiànguò de biàntǐ</p> | ||
<p>4. RL hái tígāo le móxíng de shìjué shíbié nénglì</p> | ||
<p>5. Rán'ér, SFT duìyú RL de yǒuxiào xùnliàn réngrán bùkě huòquē</p> | ||
</div> | ||
<div><p>1. This article compares the effects of supervised fine-tuning (SFT) and reinforcement learning (RL) on foundation models.</p> | |
<p>2. The study found that RL demonstrates better generalization capabilities in both textual and visual tasks.</p> | ||
<p>3. SFT tends to memorize training data, while RL can handle unseen variants.</p> | ||
<p>4. RL also enhances the model's visual recognition capabilities.</p> | ||
<p>5. However, SFT remains indispensable for effective RL training.</p></div> | ||
<h2>Vocabulary</h2> | ||
<table> | ||
<thead> | ||
<tr> | ||
<th>Word</th> | ||
<th>Pinyin</th> | ||
<th>Translation</th> | ||
</tr> | ||
</thead> | ||
<tbody> | ||
|
||
<tr> | ||
<td class="zh">监督</td> | ||
<td>jiàn dū</td> | ||
<td>supervised</td> | ||
</tr> | ||
|
||
<tr> | ||
<td class="zh">微调</td> | ||
<td>wēi tiáo</td> | ||
<td>fine-tuning</td> | ||
</tr> | ||
|
||
<tr> | ||
<td class="zh">强化学习</td> | ||
<td>qiáng huà xué xí</td> | ||
<td>reinforcement learning</td> | ||
</tr> | ||
|
||
<tr> | ||
<td class="zh">基础模型</td> | ||
<td>jī chǔ mó xíng</td> | ||
<td>foundation model</td> | |
</tr> | ||
|
||
<tr> | ||
<td class="zh">作用</td> | ||
<td>zuò yòng</td> | ||
<td>effect</td> | ||
</tr> | ||
|
||
<tr> | ||
<td class="zh">泛化</td> | ||
<td>fàn huà</td> | ||
<td>generalization</td> | ||
</tr> | ||
|
||
<tr> | ||
<td class="zh">倾向于</td> | ||
<td>qīng xiàng yú</td> | ||
<td>tend to</td> | ||
</tr> | ||
|
||
<tr> | ||
<td class="zh">未见过</td> | ||
<td>wèi jiàn guò</td> | ||
<td>unseen</td> | ||
</tr> | ||
|
||
<tr> | ||
<td class="zh">变体</td> | ||
<td>biàn tǐ</td> | ||
<td>variant</td> | ||
</tr> | ||
|
||
<tr> | ||
<td class="zh">视觉识别</td> | ||
<td>shì jué shí bié</td> | ||
<td>visual recognition</td> | ||
</tr> | ||
|
||
<tr> | ||
<td class="zh">不可或缺</td> | ||
<td>bù kě huò quē</td> | ||
<td>indispensable</td> | ||
</tr> | ||
|
||
</tbody> | ||
</table> | ||
</div> | ||
</body> | ||
</html> | ||
|