Chaitanya Sanivada

@sanivada

here for data, viz, and cats

9 Followers
33 Following
10 Posts
Joined 27.10.2024

Latest posts by Chaitanya Sanivada @sanivada

"PhD-level experts in your back pocket" is a completely nonsensical description of AI but a pretty good description of social media if you follow the right people

09.08.2025 23:07 👍 10431 🔁 2012 💬 136 📌 170

There was a wonderful talk at PyData Amsterdam last year on using machine learning to study and preserve artworks at the Rijksmuseum, strong recommend.

youtu.be/kMfl5SzfkVc?...

15.12.2024 14:04 👍 45 🔁 8 💬 4 📌 2
Journaling 101 - Daystar Eld
I often get asked what the most valuable things people can do to improve their mental health are, and while it's really hard to give a general answer to that sort of thing, what immediately alw...

"Journaling is almost the physical exercise of the mental health world; .....The reason it’s not is that physical exercise is also the physical exercise of the mental health world."
daystareld.com/journaling-1...

09.12.2024 16:28 👍 0 🔁 0 💬 0 📌 0
Guard's patrolling path for part 1 visualized using @matplotlib.bsky.social
adventofcode.com/2024/day/6
#AdventOfCode #day6

06.12.2024 11:07 👍 4 🔁 1 💬 0 📌 0
Day 4 - Advent of Code 2024

Spent almost 30 mins debugging Part 2 only to find I used the wrong variable.
I just completed "Ceres Search" - Day 4 - Advent of Code 2024 #AdventOfCode adventofcode.com/2024/day/4

04.12.2024 12:03 👍 2 🔁 0 💬 0 📌 0
re - Regular expression operations
Source code: Lib/re/ This module provides regular expression matching operations similar to those found in Perl. Both patterns and strings to be searched can be Unicode strings (str) as well as 8-...

re.finditer(pattern, string) returns an iterator yielding match objects. A match object m has m.start() and m.end(), which return the index of the first character and the index of the last character + 1 of the matched substring.

docs.python.org/3/library/re...

03.12.2024 10:33 👍 0 🔁 0 💬 0 📌 0
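A minimal sketch of the re.finditer behaviour described in the post above. The helper name find_spans and the sample text are made up for illustration; only re.finditer, m.group(), m.start() and m.end() come from the standard library.

```python
import re

def find_spans(pattern, text):
    """Return (matched_text, start, end) for every non-overlapping match."""
    return [(m.group(), m.start(), m.end()) for m in re.finditer(pattern, text)]

text = "cat, dog, cat"  # made-up sample text
for matched, start, end in find_spans(r"cat", text):
    # end is exclusive, so slicing with [start:end] reproduces the match
    print(matched, start, end, text[start:end] == matched)
```

Because m.end() is one past the last matched character, the pair (start, end) can be used directly as a slice.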

TIL how to get the index of matched strings when using the re package in the Python standard library.

03.12.2024 10:33 👍 0 🔁 0 💬 1 📌 0
Day 3 - Advent of Code 2024

Part 2 was fun. Using Regex with capture groups makes Part 1 straightforward.
I just completed "Mull It Over" - Day 3 - Advent of Code 2024 #AdventOfCode adventofcode.com/2024/day/3

03.12.2024 10:19 👍 1 🔁 0 💬 1 📌 0
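A rough sketch of how capture groups can make this kind of task straightforward. It is not the poster's solution. It assumes instructions of the form mul(X,Y) with X and Y as 1-3 digit integers; the names MUL_PATTERN and sum_products and the sample string are hypothetical.

```python
import re

# Assumed instruction shape: mul(X,Y) with X and Y as 1-3 digit integers.
MUL_PATTERN = re.compile(r"mul\((\d{1,3}),(\d{1,3})\)")

def sum_products(memory: str) -> int:
    """Add up X * Y for every well-formed mul(X,Y) found in the text."""
    total = 0
    for match in MUL_PATTERN.finditer(memory):
        x, y = match.groups()  # the two capture groups, returned as strings
        total += int(x) * int(y)
    return total

# Made-up input: two well-formed instructions and two malformed ones.
sample = "abmul(3,4)xxmul(10,2)mul(5;6)mul(7,8]"
print(sum_products(sample))  # 3*4 + 10*2 = 32
```

The capture groups pull out both operands in a single pass, so no manual string splitting is needed, and malformed fragments simply fail to match.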
RegexOne - Learn Regular Expressions - Lesson 1: An Introduction, and the ABCs
RegexOne provides a set of interactive lessons and exercises to help you learn regular expressions.

regexone.com is a great resource to start learning regex as well.

03.12.2024 08:39 👍 1 🔁 0 💬 1 📌 0
Day 2 - Advent of Code 2024

Struggled a little with Part 2, but got it done with what I feel is an inefficient algorithm.
#AdventOfCode adventofcode.com/2024/day/2

02.12.2024 11:00 👍 0 🔁 0 💬 0 📌 0

I just completed "Historian Hysteria" - Day 1 - Advent of Code 2024 #AdventOfCode adventofcode.com/2024/day/1

01.12.2024 10:02 👍 2 🔁 0 💬 0 📌 0

I didn't know data visualization tournaments were a thing until now. I'd love to participate.

29.11.2024 05:16 👍 0 🔁 0 💬 1 📌 0
Book outline

Over the past decade, embeddings (numerical representations of
machine learning features used as input to deep learning models) have
become a foundational data structure in industrial machine learning
systems. TF-IDF, PCA, and one-hot encoding have always been key tools
in machine learning systems as ways to compress and make sense of
large amounts of textual data. However, traditional approaches were
limited in the amount of context they could reason about with increasing
amounts of data. As the volume, velocity, and variety of data captured
by modern applications has exploded, creating approaches specifically
tailored to scale has become increasingly important.
Google's Word2Vec paper made an important step in moving from
simple statistical representations to semantic meaning of words. The
subsequent rise of the Transformer architecture and transfer learning, as
well as the latest surge in generative methods has enabled the growth
of embeddings as a foundational machine learning data structure. This
survey paper aims to provide a deep dive into what embeddings are,
their history, and usage patterns in industry.

Cover image

Just realized BlueSky allows sharing valuable stuff cause it doesn't punish links. 🤩

Let's start with "What are embeddings" by @vickiboykis.com

The book is a great summary of embeddings, from history to modern approaches.

The best part: it's free.

Link: vickiboykis.com/what_are_emb...

22.11.2024 11:13 👍 652 🔁 101 💬 22 📌 6