S Manish

I am bored

So what am I doing here? Well, I just wanted to write my first blog ever since I created an account. But it turns out I have nothing to blog about. So I decided I'll just talk about the large language models that I am learning about right now.

Did y'all know that a large language model is essentially one extremely huge probability distribution over all the possible sentences that could ever be made from all the possible words ( the vocabulary ) that we have right now? Yes, I was shocked the first time I learnt that too. So now that I have told you that, I would obviously have to tell you the sauce.
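To make that "probability distribution over sentences" idea concrete, here is a minimal sketch. This is NOT how real LLMs work ( they use neural networks over subword tokens, not raw word counts ), it's just a toy bigram model that assigns a probability to a sentence using the chain rule: the probability of a sentence is the product of the probability of each word given the words before it.

```python
from collections import defaultdict

# Toy "training corpus" — a real LLM trains on billions of words.
corpus = "so what am i doing here so what am i writing".split()

# Count bigrams: how often each word follows another.
bigram_counts = defaultdict(lambda: defaultdict(int))
for prev, nxt in zip(corpus, corpus[1:]):
    bigram_counts[prev][nxt] += 1

def next_word_probs(prev):
    """P(next word | previous word), estimated from raw counts."""
    counts = bigram_counts[prev]
    total = sum(counts.values())
    return {word: c / total for word, c in counts.items()}

def sentence_prob(sentence):
    """Chain rule: P(w1..wn) approximated as the product of P(wi | wi-1)."""
    words = sentence.split()
    prob = 1.0
    for prev, nxt in zip(words, words[1:]):
        prob *= next_word_probs(prev).get(nxt, 0.0)
    return prob

print(next_word_probs("i"))             # half the time "doing", half "writing"
print(sentence_prob("so what am i"))    # every step is certain in this tiny corpus
```

A real model replaces these count tables with a neural network conditioned on the whole preceding context, but the object it learns is the same kind of thing: a distribution over what comes next.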

Since training an ML model on fully labelled data is extremely laborious and costly, some people ( don't ask me who, I forgot ) came up with the idea of large language models, which essentially train on unlabelled data: Wikipedia pages, Reddit, Facebook, Instagram, your chats, Stack Overflow, or any other text you can think of, all given to the model as input. Now how does it train on that? Well, that is interestingly cool. I am not gonna go too deep into explaining the underlying mechanism, but essentially, let's take this blog for example. It starts with the following text

So what am I doing here? Well I just wanted to write my first blog

So the model will be given this same sentence in the following ways:

So what

So what am

So what am I

So what am I doing

so on and so forth, essentially making it predict the next word given the preceding words ( this is called Causal Language Modeling ). So given the words "So what", it predicts what word comes next. We train the model to do exactly that, and yea, just do it for the billions and billions of texts available on the internet, and there you go with your own GPT or DeepSeek.

Well for my first blog this was fun. See y'all another time👋👋

#llm #technical