🎶 that's me on the corner
That's me on the spot 🎶
@fingels
Ph.D. in Mathematics. Currently postdoc at @bonsaiseqbioinfo.bsky.social, in Lille. Investigating patterns (substructures) in structured data (sequences, trees, graphs) of predominantly biological origin. More at https://fingels.github.io/
PREPRINT ALERT
I heard you craving for more combinatorics, here are some more for y'all !
Dr Kareem Carr (@kareem_carr, Jan 21):
man: i wish to publish
reviewer 2: your paper is no good
man: i'll do anything to improve
reviewer 2: it's simple. you must read the work of the great scientist Pagliarini
man: *bursts into tears* but i am Pagliarini
Quoting Andre Pagliarini (@apagliar, Jan 21): a first: in rejecting an article I submitted to a journal, reviewer 2 noted I failed to engage the work of one Andre Pagliarini
Jan 21, 2026 • 3:47 PM UTC
I just thought everyone should see this
DSB Program is out !
Seems incredible (as ever)
dsb-meeting.github.io/DSB2026/
Literally a publication for eight-year-olds 40 years ago
We're not short of problems! That's what's great about interdisciplinary work. I favor the ones I enjoy, of course, but it's a mix of enjoyment / tractability / "payoff" (as a postdoc, I don't have the luxury of working on overly obscure topics that are hard to showcase in an application).
It so happens that I do applied math (applied to bioinformatics, in this case). I look for the nitpick, the little corner where a bit of math would help my colleagues, and then I dive in. As it turns out, sometimes what I find helps my colleagues, and sometimes it doesn't.
In the end, it matters little to me whether or not it reaches people, as long as I have nothing to be ashamed of in what I put forward!
My own ideal is to pose a problem (within my reach), tackle it honestly and sincerely, and propose a solution that is as complete as possible and covers the whole subject (no salami slicing). Then, once I have nothing to be ashamed of in what I have, I send it off.
Nicolas Sarkozy receives FIFA's "literature prize" https://www.legorafi.fr/2025/12/12/nicolas-sarkozy-recoit-le-prix-de-litterature-de-la-fifa/
In the article, one application is the numbering of permutations of n elements: you can find the rank of a permutation by computing its value in the factorial number system, and vice versa. For people (like me) who love enumeration, it's honestly super cool.
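For the curious, here is a minimal sketch of that ranking idea. It assumes the usual convention (Lehmer code read as factorial-base digits, 0-indexed lexicographic rank), which may differ from the exact convention used in the article; the function names are made up for illustration.

```python
from math import factorial

def lehmer_code(perm):
    """Digit i counts the later entries that are smaller than perm[i]."""
    return [sum(1 for later in perm[i + 1:] if later < perm[i])
            for i in range(len(perm))]

def rank_permutation(perm):
    """Read the Lehmer code as a number in the factorial number system."""
    n = len(perm)
    return sum(d * factorial(n - 1 - i) for i, d in enumerate(lehmer_code(perm)))

def unrank_permutation(rank, n):
    """Inverse direction: peel off factorial-base digits, most significant first."""
    remaining = list(range(n))
    perm = []
    for i in range(n - 1, -1, -1):
        digit, rank = divmod(rank, factorial(i))
        perm.append(remaining.pop(digit))
    return perm

print(rank_permutation([1, 0, 2]))   # 2
print(unrank_permutation(2, 3))      # [1, 0, 2]
```

With this convention, the permutations of {0, 1, 2} get ranks 0 to 5 in lexicographic order.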
Preprint Alert!
We present new strategies to accelerate large-scale document comparison using MinHash-like sketches.
A thread:
And now, for the 25th post, i.e. CHRISTMAS MORNING, the promised thread by Antoine :
bsky.app/profile/npma...
Preprint alert!
We introduce new ideas to revisit the notion of sampling with window guarantees, also known as minimizers.
A thread:
I found a flowchart which helps you navigate the IT landscape
His book "Mathematica" is a real eye-opener. David Bessis describes accurately and vividly the way we perceive and manipulate mathematical objects. Colleagues who read it also felt he put into words what they had not been able to formulate about the mathematical process. Highly recommended.
This kind of situation appears elsewhere in the simulations, but it happens randomly after Z2, so it is averaged over all 10^6 simulations and the effect is smoothed out. That's why I was speaking of a border effect, as Y1/Z1 is very special in being the only window without prior dependencies
Yeah so basically Y1 is always a rescan, and Y2* (the second selected position) always follows a rescan, so Z2 = Y2* - Y1 is always the gap between a rescan and its successor.
There is some bias, as the next minimizer after a rescan is more likely to be far away, since it follows the min over a whole window
And yes, the result is true when taking the limit to infinity. In the proof of Theorem 1, we establish the following
--- where E[tau_M] / (M-k+1) is the expected specific density of a random sequence of length M (and then M->infty)
You can actually derive an interval for a finite sequence from this
You are right in that eps_i is not *technically* a random variable. And yes, we want the average eps_i to be 0.
We chose to consider the eps_i as some realizations of an underlying random variable eps of mean 0, but it is only for the proof and not a big deal actually.
But as you can see, numerically with Monte Carlo, we obtain that the E[Z_i] are all roughly (w+1)/2, so all good.
(and the values E[Z_i] - (w+1)/2 are, equivalently, somewhat around 0)
They're not, actually ! For instance, this is what we (formally) obtain for Z1 and Z2, with random minimizers. For w=10, E[Z1] = 5.5 whereas E[Z2] = 5.87 (there is a good reason for this, I can explain more if you're curious, but basically it is a border effect).
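If you want to play with it, here is a rough Monte Carlo sketch of Z1 and Z2. It assumes a simplified model of random minimizers (i.i.d. uniform hash values, one per position, leftmost minimum in each window), which may not match the exact setup of the paper; the function names and parameters are made up for illustration.

```python
import random

def minimizer_positions(values, w):
    """Distinct selected positions (1-indexed), in order of appearance."""
    selected = []
    for start in range(len(values) - w + 1):
        window = values[start:start + w]
        pos = start + window.index(min(window)) + 1  # leftmost minimum
        if not selected or selected[-1] != pos:
            selected.append(pos)
    return selected

def simulate_first_gaps(w, trials=20000, seed=0):
    """Estimate E[Z1] and E[Z2], with Z1 = Y1 and Z2 = Y2* - Y1."""
    rng = random.Random(seed)
    sum_z1 = sum_z2 = 0
    for _ in range(trials):
        # 3*w values are enough to guarantee a second selected position.
        values = [rng.random() for _ in range(3 * w)]
        selected = minimizer_positions(values, w)
        y1, y2 = selected[0], selected[1]
        sum_z1 += y1
        sum_z2 += y2 - y1
    return sum_z1 / trials, sum_z2 / trials

mean_z1, mean_z2 = simulate_first_gaps(w=10)
print(f"E[Z1] ~ {mean_z1:.2f}, E[Z2] ~ {mean_z2:.2f}")
```

Under this model the first estimate should land close to (w+1)/2 = 5.5, and the second one noticeably above it.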
Were the Z_i's i.i.d., it would simply be E[Z1], but we didn't want to assume independence or identical distribution, so as to be as general as possible (also, they are not i.i.d. in real life)
There are several ways to define it, actually, but you can think of it as E[E[Z_i]], where Z_i is the i-th gap. So, the average of the average gap.
I would also argue that n / # samples ≠ avg gap.
Take a sequence with n=6, and sample positions 1 and 3.
You get 3 ≠ ((1-0) + (3-1))/2 = 1.5
You need to account for the remaining bits after the last sample. And this implies going to infinity!
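Here is the same toy example as a tiny sketch (the helper name is made up for illustration). It also shows that counting the leftover tail after the last sample brings the two quantities back together on this example.

```python
def average_gap(positions, start=0):
    """Mean distance between consecutive samples, counting from `start`."""
    previous = [start] + positions[:-1]
    gaps = [b - a for a, b in zip(previous, positions)]
    return sum(gaps) / len(gaps)

n = 6
positions = [1, 3]

inverse_density = n / len(positions)      # 6 / 2 = 3.0
avg_gap = average_gap(positions)          # ((1-0) + (3-1)) / 2 = 1.5

# Add the bits remaining after the last sample (6 - 3 = 3).
tail = n - positions[-1]
avg_gap_with_tail = (avg_gap * len(positions) + tail) / len(positions)  # 3.0

print(inverse_density, avg_gap, avg_gap_with_tail)
```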
The only proper formal definition of density I found (the one I just gave) was in the GreedyMini paper. Other references, as far as I know, define it informally, which is usually enough, but not when you want to claim a mathematical truth !
Well, usually when things seem trivial, one must be cautious. Since the density is defined as the limit of the expected specific density as S tends to infinity, maybe there could have been a trick with the limit. Better safe than sorry, I would say