<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
<title>sangplus's Blog</title>
<link href="http://sangplus.com.cn/atom.xml" rel="self"/>
<link href="http://sangplus.com.cn/"/>
<updated>2023-03-16T03:19:17.589Z</updated>
<id>http://sangplus.com.cn/</id>
<author>
<name>SangPlus</name>
</author>
<generator uri="https://hexo.io/">Hexo</generator>
<entry>
<title>sentence-transformers Explained</title>
<link href="http://sangplus.com.cn/2023/03/15/sentence-transformers%E8%AF%A6%E8%A7%A3/"/>
<id>http://sangplus.com.cn/2023/03/15/sentence-transformers%E8%AF%A6%E8%A7%A3/</id>
<published>2023-03-15T06:07:55.000Z</published>
<updated>2023-03-16T03:19:17.589Z</updated>
<content type="html"><![CDATA[<p>Build a sentence-transformer model from a basic BERT checkpoint.</p><figure class="highlight python"><table><tr><td class="code"><pre><span class="line"><span class="keyword">from</span> sentence_transformers <span class="keyword">import</span> SentenceTransformer, models</span><br><span class="line"></span><br><span class="line"><span class="comment"># Local directory containing the pretrained BERT weights</span></span><br><span class="line">model_path = <span class="string">"/Users/sang/workhome/pretrained_models/pytorch/pytorch_bert"</span></span><br><span class="line"></span><br><span class="line"><span class="comment"># Transformer module: produces one embedding per token, truncating inputs to 256 tokens</span></span><br><span class="line">word_embedding_model = models.Transformer(model_path, max_seq_length=<span class="number">256</span>)</span><br><span class="line"><span class="comment"># Pooling module: reduces the token embeddings to a single fixed-size sentence vector</span></span><br><span class="line">pooling_model = models.Pooling(word_embedding_model.get_word_embedding_dimension())</span><br><span class="line"></span><br><span class="line">model = SentenceTransformer(modules=[word_embedding_model, pooling_model])</span><br><span class="line"></span><br><span class="line">sentences = [<span class="string">'This framework generates embeddings for each input sentence'</span>,</span><br><span class="line">             <span class="string">'Sentences are passed as a list of string.'</span>,</span><br><span class="line">             <span class="string">'The quick brown fox jumps over the lazy dog.'</span>]</span><br><span class="line"></span><br><span class="line"><span class="comment"># Sentences are encoded by calling model.encode()</span></span><br><span class="line">embeddings = model.encode(sentences)</span><br><span class="line"></span><br><span class="line"><span class="comment"># encode() returns a NumPy array by default, one row per sentence</span></span><br><span class="line"><span class="built_in">print</span>(embeddings.shape)</span><br><span class="line"><span class="built_in">print</span>(embeddings.dtype)</span><br></pre></td></tr></table></figure><p>Prepare the training data; this step uses the InputExample class.</p><script type="text/javascript" src="https://unpkg.com/[email protected]/dist/kity.min.js"></script><script type="text/javascript" src="https://unpkg.com/[email protected]/dist/kityminder.core.min.js"></script><script defer="true" type="text/javascript" src="https://unpkg.com/[email protected]/dist/mindmap.min.js"></script><link rel="stylesheet" type="text/css" href="https://unpkg.com/[email protected]/dist/mindmap.min.css">]]></content>
<summary type="html"><p>Build a sentence-transformer model from a basic BERT checkpoint.</p></summary>
<category term="Pre-trained models" scheme="http://sangplus.com.cn/tags/%E9%A2%84%E8%AE%AD%E7%BB%83%E6%A8%A1%E5%9E%8B/"/>
<category term="NLP" scheme="http://sangplus.com.cn/tags/NLP/"/>
</entry>
</feed>