Machine learning - how to start it and not waste months of time?

crnack

Jr. VIP
Joined
Oct 10, 2017
Messages
1,352
Reaction score
1,473
How do I start with machine learning?
My goal is to make software that can rewrite whatever text I give it.
My data set would be small, maybe 50k characters.
I only need it to produce a short piece of text,
so I think it could be precise even with this amount of sample data.

With some models available on the internet, I can get precisely summarized or paraphrased text.
However, the sample I can feed them is really small,
or the models "guess information" instead of using my samples.

They understand language and they are definitely capable of using samples.
They just don't allow massive input for some reason.

"Microsoft announced on September 22, 2020 that it had licensed 'exclusive' use of GPT-3; others can still use the public API to receive output, but only Microsoft has access to GPT-3’s underlying code."
 

crnack

Jr. VIP
Joined
Oct 10, 2017
Messages
1,352
Reaction score
1,473
Spent half the day reading about this and yes, I know how to do it!
I will be on the news! Hahaha. Everything is possible!

I will use 10000000 computers for training the best of the best models.
I understand now my set of data must be 50 terabytes.
 

TheOverlord

Jr. VIP
Joined
Sep 1, 2014
Messages
121
Reaction score
98
Sounds about right, but there are other ways to summarize and paraphrase text. Summarization is probably the easier of the two, as it can be done without any machine learning; see the Wikipedia articles on automatic summarization. Extractive summarization is the simplest approach. For paraphrasing, you can use translation-software tricks/hacks (for example, translating to another language and back) to get a shitty paraphrase.
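To give an idea of what extractive summarization without machine learning can look like, here is a minimal sketch. It is only one simple variant (score each sentence by how frequent its words are in the whole text, keep the top ones), not a reference implementation. The stopword list, the naive sentence splitter and all function names are assumptions made up for this example.

```python
import re
from collections import Counter

# Tiny hand-picked stopword list (an assumption; use a fuller list for real text).
STOPWORDS = {
    "the", "a", "an", "and", "or", "of", "to", "in", "is", "it", "that",
    "this", "for", "on", "with", "as", "are", "was", "be", "by", "at", "from",
}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tokenize(sentence):
    """Lowercase words with stopwords removed."""
    return [w for w in re.findall(r"[a-z']+", sentence.lower())
            if w not in STOPWORDS]


def extractive_summary(text, n_sentences=2):
    """Return the n highest-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    # Word frequencies over the whole text.
    freq = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        words = tokenize(sentence)
        # Average word frequency, so long sentences are not favoured by length alone.
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:n_sentences])
    return " ".join(sentences[i] for i in keep)
```

Calling `extractive_summary(article_text, n_sentences=3)` returns three sentences copied verbatim from the input, so nothing is "guessed"; the trade-off is that the output can read a bit choppy.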
 

crnack

Jr. VIP
Joined
Oct 10, 2017
Messages
1,352
Reaction score
1,473
"Computer + telephone = internet"
This output made my day!
 

AccsHub

Jr. VIP
Joined
Oct 1, 2018
Messages
373
Reaction score
83
Would it be possible to learn something without having to waste time?
As you become more experienced, you will be able to recognize those paths that waste your time from the beginning.
 

crnack

Jr. VIP
Joined
Oct 10, 2017
Messages
1,352
Reaction score
1,473
I found the truth in 6 hours, so not so bad.
One time I "wasted" a couple of months on learning JavaScript. :D
 
Joined
Mar 23, 2018
Messages
20
Reaction score
3
Sounds about right, but there are other ways to summarize and paraphrase text. Summarization is probably the easier of the two, as it can be done without any machine learning; see the Wikipedia articles on automatic summarization. Extractive summarization is the simplest approach. For paraphrasing, you can use translation-software tricks/hacks (for example, translating to another language and back) to get a shitty paraphrase.
Hey, I'm looking for a good text summarization script. Can you recommend any? My coding skills aren't advanced, so I don't think I can build one by myself.
 

Machairodont

Registered Member
Joined
Oct 9, 2021
Messages
60
Reaction score
37
https://huggingface.co/models?pipeline_tag=text2text-generation&sort=downloads
https://huggingface.co/models?pipeline_tag=summarization&sort=downloads
Enjoy
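For anyone wondering how to actually run one of those models on text longer than the model accepts, a rough sketch: split the text into chunks, summarize each chunk, and join the results. The helper names and the 2000-character chunk size are assumptions for illustration, not anything the model pages prescribe. `load_hf_summarizer` relies on the `transformers` library's `pipeline("summarization")` and needs `pip install transformers` plus a backend such as PyTorch; the model is downloaded on first use.

```python
def chunk_text(text, max_chars=2000):
    """Split text on whitespace into chunks of at most max_chars characters.

    A single word longer than max_chars becomes its own oversized chunk.
    """
    chunks, current = [], ""
    for word in text.split():
        candidate = f"{current} {word}" if current else word
        if len(candidate) > max_chars and current:
            chunks.append(current)
            current = word
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def summarize_long(text, summarizer, max_chars=2000):
    """Summarize each chunk with `summarizer` (any str -> str callable) and join."""
    return " ".join(summarizer(chunk) for chunk in chunk_text(text, max_chars))


def load_hf_summarizer(model_name=None):
    """Wrap a Hugging Face summarization pipeline as a str -> str callable.

    model_name=None lets the library pick its default summarization model;
    pass the name of any model from the summarization listing instead.
    """
    from transformers import pipeline  # imported lazily: heavy dependency

    pipe = pipeline("summarization", model=model_name)

    def summarize(chunk):
        # truncation=True guards against a chunk that is still too long.
        return pipe(chunk, truncation=True)[0]["summary_text"]

    return summarize
```

Usage would be something like `summarize_long(my_text, load_hf_summarizer())`. Summarizing chunk by chunk loses context across chunk boundaries, so treat the joined result as a draft.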
 

Alexion

Junior Member
Joined
Sep 9, 2021
Messages
121
Reaction score
133
Website
moderntechtropolis.wordpress.com
Why do you even bother with AI / ML for content if it's going to be trash anyway?
I mean, I normally land on such a page from Google, read it for 3 seconds, notice it's AI and leave. There's no value. You can't monetize 3-second visits.
 

Machairodont

Registered Member
Joined
Oct 9, 2021
Messages
60
Reaction score
37
It's good enough for tier 2/3; nobody wants to monetize those.
 