linear regression = finding the right weights to balance a scale

Visualize the data points as weights on one side of a balancing scale. Your task is to find the right combination of weights (coefficients) on the other side to balance the scale perfectly. Linear regression is the process of adjusting those weights until the scale is in equilibrium, meaning the difference between the two sides is minimized. The weights' values and positions correspond to the coefficients of the linear equation, and once the scale is balanced, you can use that equation to make predictions for new data points.
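To make the analogy concrete, here is a minimal sketch in Python of that "keep adjusting until it balances" idea: plain gradient descent on made-up data, nudging a slope and an intercept until the mismatch between the two sides is as small as possible. The data, learning rate, and iteration count are invented purely for illustration.

```python
# A minimal sketch of "adjusting the weights until the scale balances":
# linear regression fit by gradient descent on made-up data.
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 1))             # one feature
y = 3.0 * X[:, 0] + 5.0 + rng.normal(0, 1, 100)   # true line y = 3x + 5, plus noise

w, b = 0.0, 0.0   # the "weights" we keep adjusting
lr = 0.01         # how big each adjustment is

for _ in range(2000):
    pred = w * X[:, 0] + b
    error = pred - y                         # how far the scale is out of balance
    w -= lr * 2 * np.mean(error * X[:, 0])   # nudge the coefficient
    b -= lr * 2 * np.mean(error)             # nudge the intercept

print(f"learned: y ~ {w:.2f}x + {b:.2f}")    # should land close to y = 3x + 5
```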

Unravel, don’t tangle.

context length != 29

The context window (context length) of an LLM is the length of the longest sequence of tokens the model can use to generate a token. If an LLM has to generate a token from a sequence longer than its context window, it must either truncate the sequence down to the context window or rely on certain algorithmic modifications.
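As a rough sketch of the truncation option, here is what "cutting the sequence down to the context window" could look like. The window size and token ids are made-up toy values, not those of any real model, which use windows of thousands of tokens.

```python
# Hedged illustration: if the prompt is longer than the context window,
# keep only the most recent tokens. All values here are invented.
CONTEXT_WINDOW = 8           # toy value; real models use thousands

token_ids = list(range(20))  # pretend these are the 20 token ids of a long prompt

if len(token_ids) > CONTEXT_WINDOW:
    # drop the oldest tokens so the model only sees the last CONTEXT_WINDOW of them
    token_ids = token_ids[-CONTEXT_WINDOW:]

print(token_ids)  # the tail of the sequence that fits in the window
```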

Ok, so the fancy term "context length" is nothing more than a token count. And a token count is nothing more than a word count, right? Not quite. This is what the internet has to say about that:

A general rule of thumb is that one token is roughly equivalent to 4 characters of text in common English text. This means that 100 tokens are approximately equal to 75 words.
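If you want to check that rule of thumb yourself, one way is to count tokens with the tiktoken package (assuming the cl100k_base encoding used by recent OpenAI models) and compare against the character and word counts. The sample sentence below is just an illustration.

```python
# A small check of the ~4 characters-per-token rule of thumb,
# assuming the tiktoken package and the cl100k_base encoding.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
text = "A general rule of thumb is that one token is roughly four characters."

tokens = enc.encode(text)
print(len(text), "characters")
print(len(text.split()), "words")
print(len(tokens), "tokens")
print(f"characters per token: {len(text) / len(tokens):.2f}")
```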

So if we take our two sentences and throw them into the OpenAI Tokenizer, this is what we get:

So is the internet right about the 4-character approximation? Let's ask ChatGPT-4:

Not quite. The approximation can confuse more than it clarifies. So rather than relying on the 4-character rule, just remember that short words, spaces, and punctuation marks often end up being represented by single tokens.
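To see this for yourself, you can decode each token of a short sentence individually. The sketch below again assumes tiktoken with the cl100k_base encoding and uses our own tagline as the sample; the exact split you get depends on the encoding.

```python
# Hedged illustration of the takeaway: decode each token of a short sentence
# one at a time to see how short words and punctuation map to tokens.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
sentence = "Unravel, don't tangle."

pieces = [enc.decode([t]) for t in enc.encode(sentence)]
print(pieces)  # each element is the text behind one token
```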

Unravel, don’t tangle.