Hello,
More of my philosophy about the limitations of transformers in Natural Language Processing (NLP) and artificial intelligence..
I am a white Arab from Morocco, and I think I am smart, since I have also invented many scalable algorithms and other algorithms.
I invite you to read the following about Megatron-Turing Natural Language Generation (MT-NLG) from Microsoft and NVIDIA:
https://developer.nvidia.com/blog/using-deepspeed-and-megatron-to-train-megatron-turing-nlg-530b-the-worlds-largest-and-most-powerful-generative-language-model/
I think I am quickly understanding the defects of Megatron-Turing Natural Language Generation (MT-NLG), which is better than GPT-3. The defect is that the "self-attention" of transformers in NLP, even if it scales to very long sequences, has limited
expressiveness: since transformers cannot process the input sequentially, they cannot model hierarchical structures and recursion, and hierarchical structure is widely thought to be essential to modeling natural language, in particular its syntax. So I think that
Megatron-Turing Natural Language Generation (MT-NLG) from Microsoft and NVIDIA, and GPT-3 too, will be practically applied to limited areas, but they cannot make common sense reasoning or the like emerge, which is necessary for general artificial intelligence.
Read the following paper to understand the mathematical proof of it:
https://aclanthology.org/2020.tacl-1.11.pdf
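To make the idea of hierarchical structure and recursion concrete, here is a small sketch of my own (it is not code from the paper). As I understand it, the paper discusses formal languages such as PARITY and the two-bracket Dyck language as test cases. The function names below are hypothetical, and the code only shows how a sequential program recognizes these languages: the bracket checker needs a stack to track nesting, and the parity checker needs to carry a state across the whole input.

```python
# Hypothetical illustration: sequential recognizers for two formal languages.
PAIRS = {")": "(", "]": "["}

def is_dyck2(s):
    """Return True if s is a well-nested string over the brackets ( ) [ ]."""
    stack = []
    for ch in s:
        if ch in "([":
            stack.append(ch)      # open a new nested level
        elif ch in PAIRS:
            # a closing bracket must match the most recently opened one
            if not stack or stack.pop() != PAIRS[ch]:
                return False
        else:
            return False          # symbol outside the bracket alphabet
    return not stack              # every opened bracket must be closed

def parity(bits):
    """Return 1 if the number of 1s in the bit string is odd, else 0."""
    p = 0
    for b in bits:
        p ^= int(b)               # flip the state on every 1
    return p

print(is_dyck2("([()])"), is_dyck2("([)]"), parity("1101"))
```

Both functions read the input one symbol at a time and update a memory (a stack or a single bit), which is the kind of sequential processing that I am talking about above.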
Read my previous thoughts:
More of my philosophy about Natural Language Processing (NLP) in artificial intelligence and more..
I think that the transformers in Natural Language Processing (NLP) use a kind of deep learning, and Natural Language Processing (NLP)
is a branch of Artificial Intelligence (AI) that enables machines to understand human language. So I think that the transformers in NLP are using pruning + quantization, which makes the model much faster and much smaller
so that it scales much better, and I think this is the basic idea of Microsoft's Megatron-Turing Natural Language Generation (MT-NLG) below. So I think that it is the way that can make common sense reasoning, reading comprehension
and natural language inference "emerge" in NLP, by way of ‘brute-force’, when the model attains 1 trillion or more parameters. So read my thoughts below about artificial intelligence to understand more. You can also understand more about pruning +
quantization by looking at the following video of a Jewish PhD researcher called Nir Shavit, who has invented software called Neural Magic that does the pruning + quantization efficiently:
The Software GPU: Making Inference Scale in the Real World by Nir Shavit, PhD
https://www.youtube.com/watch?v=mGj2CJHXXKQ
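To give a rough idea of what pruning + quantization means, here is a minimal sketch of my own in plain Python. It is an assumption-laden toy, not the method of Neural Magic or of any particular library: it uses simple magnitude pruning (zero out the smallest weights) followed by symmetric 8-bit linear quantization, and the function names and the example weights are hypothetical.

```python
# Toy sketch: magnitude pruning followed by symmetric int8 quantization.

def prune_by_magnitude(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest magnitude."""
    k = int(len(weights) * sparsity)          # how many weights to drop
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])                     # indices of the k smallest
    return [0.0 if i in drop else w for i, w in enumerate(weights)]

def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    return [round(w / scale) for w in weights], scale

def dequantize(q, scale):
    """Recover approximate float weights from the integers."""
    return [v * scale for v in q]

weights = [0.02, -1.27, 0.5, -0.01, 0.9, 0.003, -0.4, 0.07]  # made-up values
pruned = prune_by_magnitude(weights, 0.5)
q, scale = quantize_int8(pruned)
print(pruned)
print(q, scale)
print(dequantize(q, scale))
```

The pruned model has many zeros that can be skipped, and the quantized weights take one byte each instead of four, which is why I think the model becomes much smaller and much faster.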
More of my philosophy about the benefits of Exascale supercomputers and more..
As you may have noticed, I have just posted about the following:
Intel's Aurora Supercomputer Now Expected to Exceed 2 ExaFLOPS Performance
Read more here:
https://www.anandtech.com/show/17037/aurora-supercomputer-now-expected-to-exceed-2-exaflops-performance
But Exascale supercomputers will also make it possible to construct an accurate map of the brain, which allows us to "reverse" engineer or understand the brain. Read the following to notice it:
“If we don’t improve today’s technology, the compute time for a whole mouse brain would be something like 1,000,000 days of work on current supercomputers. Using all of Aurora, if everything worked beautifully, it could still take 1,000 days.”
Nicola Ferrier, Argonne senior computer scientist
Read more here to understand:
https://www.anl.gov/article/preparing-for-exascale-argonnes-aurora-supercomputer-to-drive-brain-map-construction
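To put the numbers of the quote above in perspective, here is a small arithmetic sketch of my own. The two figures in days come from the quote; the conversion to years (365.25 days per year) and the variable names are my assumptions.

```python
# Arithmetic on the figures quoted above (days of compute for a mouse brain).
DAYS_PER_YEAR = 365.25       # assumption used for the conversion

days_current = 1_000_000     # "current supercomputers", from the quote
days_aurora = 1_000          # "using all of Aurora", from the quote

speedup = days_current / days_aurora
years_current = days_current / DAYS_PER_YEAR
years_aurora = days_aurora / DAYS_PER_YEAR

print(f"speedup: {speedup:.0f}x")
print(f"current supercomputers: about {years_current:.0f} years")
print(f"Aurora: about {years_aurora:.1f} years")
```

So the quote implies a speedup of about a thousand times, from thousands of years down to a few years.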
Also Exascale supercomputers will allow researchers to tackle problems which were impossible to simulate using the previous generation of machines, due to the massive amounts of data and calculations involved.
Small modular nuclear reactor (SMR) design, wind farm optimization and cancer drug discovery are just a few of the applications that are priorities of the U.S. Department of Energy (DOE) Exascale Computing Project. The outcomes of this project will have
a broad impact and promise to fundamentally change society, both in the U.S. and abroad.
Read more here:
https://www.cbc.ca/news/opinion/opinion-exascale-computing-1.5382505
Also, the goal of delivering safe, abundant, cheap energy from fusion is just one of many challenges in which exascale computing’s power may prove decisive. That’s the hope and expectation. To know more about the other benefits of using Exascale
computing power, read more here:
https://www.hpcwire.com/2019/05/07/ten-great-reasons-among-many-more-to-build-the-1-5-exaflops-frontier/
And more of my philosophy about the future of humanity:
Read more here:
https://groups.google.com/g/alt.culture.morocco/c/0X024jfzNvM
More of my philosophy about artificial intelligence..
AI Generates Hypotheses Human Scientists Have Not Thought Of
Read more here:
https://www.scientificamerican.com/article/ai-generates-hypotheses-human-scientists-have-not-thought-of/
More of my philosophy about artificial intelligence and common sense reasoning..
"Microsoft and Nvidia today announced that they trained what they claim is the largest and most capable AI-powered language model to date: Megatron-Turing Natural Language Generation (MT-NLP). The successor to the companies’ Turing NLG 17B and Megatron-
LM models, MT-NLP contains 530 billion parameters and achieves “unmatched” accuracy in a broad set of natural language tasks, Microsoft and Nvidia say — including reading comprehension, commonsense reasoning, and natural language inferences."
Read more here:
https://venturebeat.com/2021/10/11/microsoft-and-nvidia-team-up-to-train-one-of-the-worlds-largest-language-models/
So I think that one hypothesis is that we should be able to build even bigger models, with trillions of parameters or more, and artificial common sense will eventually emerge. Let’s call this the ‘brute-force’ hypothesis.
Read more here:
https://towardsdatascience.com/the-quest-for-artificial-common-sense-766af7fce292
You can read about the Jewish AI (artificial intelligence) scientist Lex Fridman here:
https://rogantribe.com/who-is-lex-fridman/
I also invite you to look carefully at his following video about artificial intelligence:
Exponential Progress of AI: Moore's Law, Bitter Lesson, and the Future of Computation
https://www.youtube.com/watch?v=Me96OWd44q0
I think that the Jewish AI (artificial intelligence) scientist who is speaking in the video above, Lex Fridman, is making a
big mistake, since he focuses too much on improving deep learning in artificial intelligence through the exponential improvement of the computing power of CPU hardware. I think that it is a "big" mistake, and you can easily notice it by carefully reading my
following thoughts and writing:
More of my philosophy about artificial intelligence and specialized hardware and more..
I think that specialized hardware for deep learning in artificial intelligence, like GPUs and quantum computers, is no longer needed, since you can use only a much less powerful CPU with more memory and do it efficiently. A PhD researcher called Nir
Shavit, who is Jewish and from Israel, has just invented very interesting software called Neural Magic that does it efficiently, and I invite you to look at the following very interesting video of Nir Shavit to know more about it:
The Software GPU: Making Inference Scale in the Real World by Nir Shavit, PhD
https://www.youtube.com/watch?v=mGj2CJHXXKQ
And it is not only Nir Shavit above, who is Jewish, who has invented a very interesting thing; there is also the following Muslim Iranian Postdoctoral Associate who has invented a very interesting thing for artificial intelligence,
and here it is:
Why is MIT's new "liquid" AI a breakthrough innovation?
Read more here:
https://translate.google.com/translate?hl=en&sl=auto&tl=en&u=https%3A%2F%2Fintelligence-artificielle.developpez.com%2Factu%2F312174%2FPourquoi-la-nouvelle-IA-liquide-de-MIT-est-elle-une-innovation-revolutionnaire-Elle-apprend-continuellement-de-son-experience-du-monde%2F
And here is Ramin Hasani, Postdoctoral Associate (he is Iranian):
https://www.csail.mit.edu/person/ramin-hasani
And here he is:
http://www.raminhasani.com/
He is the lead author of the following new study:
New ‘Liquid’ AI Learns Continuously From Its Experience of the World
Read more here:
https://singularityhub.com/2021/01/31/new-liquid-ai-learns-as-it-experiences-the-world-in-real-time/
And here are my thoughts about artificial intelligence and evolutionary algorithms in artificial intelligence:
https://groups.google.com/g/alt.culture.morocco/c/P9OTDTiCZ44
Thank you,
Amine Moulay Ramdane.