Yandex just open-sourced a 100B GPT-like model

Sartre

Jr. VIP
Joined
Apr 1, 2010
Messages
530
Reaction score
560
Website
NoSandbox.com
YaLM 100B is a GPT-like neural network for generating and processing text. It can be used freely by developers and researchers from all over the world.

The model leverages 100 billion parameters. It took 65 days to train the model on a cluster of 800 A100 graphics cards and 1.7 TB of online texts, books, and countless other sources in both English and Russian.


Calling all my AI / ML boys to the yard.

@Sartre @Cognitive @spectrejoe @Alma @mdevo @NulledCode @ghengis_khan

That GPU req tho!
love it thanks so much for sharing :)
 

NulledCode

Jr. VIP
Joined
Jun 10, 2010
Messages
1,574
Reaction score
1,252
Website
www.gplcellar.com
content writers close to being unemployed

AI has a long way to go before content writers are no longer needed. AI by today's standard is an attempt to predict "X" based on prior "Y" by finding patterns in "Y" to conclude "X". All of the training data "Y" has come from content writers and these AI tools only know what they've been trained and fed. People have the ability to create something out of nothing, and AI has a long way to go to get there.
 

reaaski

Senior Member
Joined
Mar 11, 2013
Messages
875
Reaction score
1,347
AI has a long way to go before content writers are no longer needed. AI by today's standard is an attempt to predict "X" based on prior "Y" by finding patterns in "Y" to conclude "X". All of the training data "Y" has come from content writers and these AI tools only know what they've been trained and fed. People have the ability to create something out of nothing, and AI has a long way to go to get there.
Case studies, proposals, business brochures, conversion optimized content, technical content.

I'd be more than happy to see AI doing that hard work for me.
 

eugenbg

Newbie
Joined
Jun 27, 2021
Messages
47
Reaction score
17
this is ridiculously expensive to run, i wonder how many prompt requests I can make with this setup
 

Ashk881

Regular Member
Joined
Aug 24, 2021
Messages
462
Reaction score
493
this is ridiculously expensive to run, i wonder how many prompt requests I can make with this setup
All large language models are ridiculously expensive to run, not just this one. In fact, this particular model was designed to be a bit smaller (it uses fp16 data types instead of fp32 for weights, which is smaller and faster).
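
As a rough back-of-the-envelope check on the fp16 point, the sketch below estimates how much memory the weights alone would take at each precision. The 100 billion parameter count comes from the announcement; the function name is made up for illustration, and the numbers ignore activations, caches, and any framework overhead, so real requirements will be higher.

```python
# Rough estimate of weight storage only (no activations, no optimizer state).
# PARAMS is taken from the announcement; the rest is plain arithmetic.

PARAMS = 100_000_000_000  # 100 billion parameters

# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2}


def weight_memory_gb(num_params: int, dtype: str) -> float:
    """Return the size of the weights in gigabytes (1 GB = 10**9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    for dtype in ("fp32", "fp16"):
        print(f"{dtype}: {weight_memory_gb(PARAMS, dtype):.0f} GB")
```

Halving the bytes per parameter halves the weight footprint, which is why the model still needs several high-memory GPUs even in fp16.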

They trained this on 800 A100s for 65 days. That's about $1-8 million in training alone.
Compared to that, spending $30/hr on inference doesn't look too bad :)
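
To see where a figure in that range could come from, here is a quick sketch of the arithmetic. The 800 GPUs and 65 days are from the announcement, and the $30/hr inference figure is the one mentioned above; the per-GPU hourly rates are assumed placeholder values, not quoted prices, so treat the totals as illustrative only.

```python
# Back-of-the-envelope training cost. GPUS and DAYS are from the announcement;
# the hourly rates below are ASSUMED placeholders, not real cloud prices.

GPUS = 800
DAYS = 65
INFERENCE_USD_PER_HOUR = 30.0  # figure mentioned in the post


def gpu_hours(gpus: int, days: int) -> int:
    """Total GPU-hours if every GPU runs around the clock."""
    return gpus * days * 24


def training_cost_usd(gpus: int, days: int, usd_per_gpu_hour: float) -> float:
    """Training cost at a flat hourly rate per GPU."""
    return gpu_hours(gpus, days) * usd_per_gpu_hour


if __name__ == "__main__":
    print(f"GPU-hours: {gpu_hours(GPUS, DAYS):,}")
    for rate in (1.0, 3.0, 6.0):  # assumed $/GPU-hour
        cost = training_cost_usd(GPUS, DAYS, rate)
        # How long the same money would keep a $30/hr inference setup running.
        inference_hours = cost / INFERENCE_USD_PER_HOUR
        print(f"${rate:.2f}/GPU-hr -> ${cost:,.0f} "
              f"(~{inference_hours:,.0f} hours of inference)")
```

The run works out to 1,248,000 GPU-hours, so any assumed rate between roughly $1 and $6 per GPU-hour lands inside the $1-8 million range.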
 

PinguSpy

Jr. VIP
Joined
Dec 7, 2007
Messages
2,661
Reaction score
2,725
Yandex FTW!

Even their image search is more intelligent and advanced than Google's.
Google always gave me zero results.
 

Xyox

Junior Member
Joined
Oct 19, 2017
Messages
153
Reaction score
94
I predict an increase in demand for the services that sell you AWS accounts with preloaded credit for cheap. If not now, then when the 176B model is released.
 