Update ReadMe.md
stefanache authored Feb 8, 2025
1 parent 10263f8 commit e1fadae
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion python/DeepSeek/ReadMe.md
@@ -8,7 +8,7 @@ To begin with, I would like to remind you of a few things about [the term](https://www.

[**Inference**](https://ro.wiktionary.org/wiki/inferen%C8%9B%C4%83) requires a **significant**/**large** number of **NVIDIA** **GPUs** (used for inference computation and for building/training models) and **high-performance** **networks** ([neural networks](https://www.aut.upt.ro/~andreea.robu/Lab1Retele.pdf) for [training/learning](https://staff.fmi.uvt.ro/~daniela.zaharie/am2016/curs/curs12/am2016_slides12_RN.pdf)).

- [**DeepSeek**](https://huggingface.co/deepseek-ai/DeepSeek-V2) is rooted in today's high-performing base architecture, [**transformers**](https://www.unite.ai/ro/deepseek-v3-cum-o-pornire-chinezeasc%C4%83-de-IA-%C3%AEi-dep%C4%83%C8%99e%C8%99te-pe-gigan%C8%9Bii-tehnologiei-%C3%AEn-ceea-ce-prive%C8%99te-costul-%C8%99i-performan%C8%9Ba/), which it [improves](https://epoch.ai/gradient-updates/how-has-deepseek-improved-the-transformer-architecture) considerably (through innovations to the original design currently in circulation, raising the bar). It reaches the [high](https://infobrand.ro/deepseek/) performance levels shown by today's top models with a [reduced](https://codingmall.com/knowledge-base/25-global/248227-care-sunt-avantajele-utilizrii-modelelor-distilate-precum-deepseek-r1-distill-qwen-7b) consumption of compute resources (especially GPU time and energy), that is, far more efficiently, without sacrificing accuracy/precision. (**Note:** you can, of course, see [***distillation***](https://dexonline.ro/definitie/distilare) as a synonym for a [***separation***](https://www.researchgate.net/figure/Architectures-of-the-Linformer-layer-and-its-components-Left-to-right-scaled_fig2_352209326) process carried out with a view to later [***filtering***](https://dexonline.ro/definitie/filtrare/definitii). Here, though, we should perhaps also see the term as a process of *dilution*: a loss of concentration/precision in exchange for <ins>reducing</ins> the model's size and, implicitly, its latency/processing time.)
+ [**DeepSeek**](https://huggingface.co/deepseek-ai/DeepSeek-V2) is rooted in today's high-performing base architecture, [**transformers**](https://www.unite.ai/ro/deepseek-v3-cum-o-pornire-chinezeasc%C4%83-de-IA-%C3%AEi-dep%C4%83%C8%99e%C8%99te-pe-gigan%C8%9Bii-tehnologiei-%C3%AEn-ceea-ce-prive%C8%99te-costul-%C8%99i-performan%C8%9Ba/), which it [improves](https://epoch.ai/gradient-updates/how-has-deepseek-improved-the-transformer-architecture) considerably (through innovations to the original design currently in circulation, raising the bar). It reaches the [high](https://infobrand.ro/deepseek/) performance levels shown by today's top models with a [reduced](https://codingmall.com/knowledge-base/25-global/248227-care-sunt-avantajele-utilizrii-modelelor-distilate-precum-deepseek-r1-distill-qwen-7b) consumption of compute resources (especially GPU time and energy), that is, far more efficiently, without sacrificing accuracy/precision. (**Note:** you can, of course, see [***distillation***](https://dexonline.ro/definitie/distilare) as a synonym for a [***separation***](https://www.researchgate.net/figure/Architectures-of-the-Linformer-layer-and-its-components-Left-to-right-scaled_fig2_352209326) process carried out with a view to later [***filtering***](https://dexonline.ro/definitie/filtrare/definitii). Here, though, we should perhaps also see the term as a process of [*dilution*](https://hotnews.ro/cum-a-fost-distilat-deepseek-modelul-care-a-suprins-lumea-ai-tarul-lui-trump-din-domeniu-veti-auzi-multe-despre-aceasta-tehnica-nu-cred-ca-openai-este-prea-fer-1889941): a loss of concentration/precision in exchange for <ins>reducing</ins> the model's size and, implicitly, its latency/processing time.)
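
To make the "dilution" reading of distillation concrete, here is a minimal sketch of the classic soft-target distillation loss, in which a small student model is trained to match the softened output distribution of a large teacher. This is an illustration under stated assumptions, not necessarily the procedure used for the DeepSeek distilled models: the function names (`softmax`, `distillation_loss`), the temperature value, and the toy logits are all hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature gives a softer distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened outputs, scaled by T^2.

    Assumption: this is the textbook soft-label formulation, used here only
    to illustrate the idea of a student approximating a teacher.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2


# Toy logits (hypothetical): one student close to the teacher, one far from it.
teacher = [4.0, 1.0, 0.5]
close_student = [3.8, 1.1, 0.6]
far_student = [0.5, 1.0, 4.0]

print("close student loss:", distillation_loss(teacher, close_student))
print("far student loss:  ", distillation_loss(teacher, far_student))
```

The loss is zero only when the student reproduces the teacher's distribution exactly. A smaller student usually cannot reach zero, and that residual gap is the precision given up in exchange for lower size and latency.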

Until [**DeepSeek**](https://en.wikipedia.org/wiki/DeepSeek) appeared, there were **2** scaling laws, or rather 2 stages at which scaling (adjusting the size of the model) is applied: **pre**-***training*** and **post**-***training***.
