#TIL pymc.Metropolis always assumes a symmetric proposal distribution (defaulting to a normal), and therefore does not implement the Hastings correction. For a custom asymmetric proposal distribution, PyMC's out-of-the-box Metropolis sampler will give incorrect samples. We need to write a custom step method.
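The Hastings correction can be sketched generically (this is a hand-rolled sampler, not PyMC's step-method API; the shifted-normal proposal is a hypothetical example of an asymmetric proposal):

```python
import math
import random

def mh_step(x, log_target, propose, log_q):
    """One Metropolis-Hastings step with the Hastings correction.

    propose(x)  -> candidate x' drawn from q(x'|x)
    log_q(a, b) -> log q(a|b), the proposal log-density (may be asymmetric)
    """
    x_new = propose(x)
    # Hastings ratio: target ratio times the correction q(x|x') / q(x'|x).
    log_alpha = (log_target(x_new) - log_target(x)
                 + log_q(x, x_new) - log_q(x_new, x))
    if math.log(random.random()) < min(0.0, log_alpha):
        return x_new
    return x

# Example: sample a standard normal using an asymmetric (right-shifted) proposal.
def log_target(x):
    return -0.5 * x * x

SHIFT, SCALE = 0.3, 1.0  # illustrative asymmetric proposal parameters

def propose(x):
    return random.gauss(x + SHIFT, SCALE)

def log_q(a, b):
    # log density of N(b + SHIFT, SCALE) at a, up to an additive
    # constant (constants cancel in the acceptance ratio).
    return -0.5 * ((a - b - SHIFT) / SCALE) ** 2

random.seed(0)
x, samples = 0.0, []
for _ in range(20000):
    x = mh_step(x, log_target, propose, log_q)
    samples.append(x)
mean = sum(samples) / len(samples)
```

Dropping the two `log_q` terms reduces this to plain Metropolis, which is only valid when the proposal is symmetric — exactly the bias the post describes.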
16.12.2024 01:03
Mathematically deriving the Metropolis-Hastings algorithm from the Markov chain detailed balance equation makes me feel like I'm back in school. It's thermodynamics.
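For reference, the derivation starts from detailed balance; in the usual notation (target π, proposal q, acceptance probability α):

```latex
% Detailed balance for the transition kernel P:
\pi(x)\,P(x \to x') = \pi(x')\,P(x' \to x)
% With P(x \to x') = q(x' \mid x)\,\alpha(x, x'), this is satisfied by
\alpha(x, x') = \min\!\left(1,\;
    \frac{\pi(x')\,q(x \mid x')}{\pi(x)\,q(x' \mid x)}\right)
% For a symmetric proposal, q(x \mid x') = q(x' \mid x), the Hastings
% factor cancels, and the plain Metropolis rule is recovered.
```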
07.12.2024 14:05
GitHub - ridhoma/monte-cafe: My learning journal studying Markov-Chain Monte Carlo methods
Making my repository public — my journal for studying MCMC. Theory, equations, and interpretations, as well as code and animations/visualizations, can be found there.
github.com/ridhoma/mont...
30.11.2024 23:59
There is another, stronger condition that is fundamental to the Metropolis-Hastings algorithm, called Detailed Balance, which will be my next topic.
All code is available at
github.com/ridhoma/mont...
30.11.2024 23:00
This is a very fundamental concept for developing MCMC sampling algorithms. Because we want the sampling process to converge to our target distribution, we must make sure the process:
1. Satisfies the Markov property (i.e., is a Markov chain)
2. Is irreducible
3. Is aperiodic
30.11.2024 23:00
Moreover, the Markov chain must be irreducible and aperiodic to guarantee reaching a stationary distribution. Irreducible: every state can be reached from every other state. Aperiodic: the chain does not cycle through states with a fixed period.
30.11.2024 23:00
These interesting properties only apply if the random walk follows a Markov process, that is, the probability of transitioning to the next node/state depends only on the current state. Such a graph we may call a Markov chain.
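Written out, the Markov property says the next state depends only on the current one:

```latex
P(X_{t+1} = x \mid X_t, X_{t-1}, \ldots, X_0) = P(X_{t+1} = x \mid X_t)
```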
30.11.2024 23:00
No matter which node we start at, with a long enough simulation the distribution of visit frequencies converges to the same values, called the stationary distribution. This also applies to larger graphs.
30.11.2024 23:00
The simplest illustration of a Markov chain is a random walk on a graph. Take edge Qij as the probability of moving from node i to node j, where Qii represents the probability of staying at node i. Simulate the random walk and, at each time step, count how many times each node has been visited.
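A minimal sketch of this experiment (the 3-node transition matrix here is an illustrative toy, not the graph from the post): simulate the walk, count visits, and observe that different starting nodes give the same frequencies.

```python
import random

# Toy 3-node chain; Q[i][j] is the probability of moving from node i
# to node j (Q[i][i] is the probability of staying put). Rows sum to 1.
Q = [
    [0.5, 0.3, 0.2],
    [0.2, 0.5, 0.3],
    [0.3, 0.3, 0.4],
]

def random_walk(Q, steps, start=0, seed=0):
    """Simulate a random walk and return each node's visit frequency."""
    rng = random.Random(seed)
    counts = [0] * len(Q)
    node = start
    for _ in range(steps):
        node = rng.choices(range(len(Q)), weights=Q[node])[0]
        counts[node] += 1
    return [c / steps for c in counts]

freq = random_walk(Q, steps=500_000)
```

Running it again from a different start node (and a different seed) yields nearly identical frequencies — the stationary distribution of the chain.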
30.11.2024 23:00
Moreover, it took only 3 minutes to run this. So... I'm satisfied.
In the scheme of learning MCMC, this exercise is a good familiarization with the concept of rejection sampling, a simple example of Markov-chain-based state proposals, and a good introduction to the Metropolis-Hastings algorithm.
30.11.2024 18:42
Finally, comparing the final solution with a greedy algorithm, the result is quite good: it gives a 12% shorter distance than greedy. According to ChatGPT, greedy usually gives a sub-optimal solution within 10%-20% of the global optimum. So our solution might be close.
30.11.2024 18:42
During the first 1000 iterations, the temperature is kept high to maximize the search space. Then it's gradually lowered to start the optimization. Finally, the temperature is kept low to fine-tune the final solution.
30.11.2024 18:42
It took me quite a number of trials to fine-tune T0 and the annealing function until I was satisfied with the outcome. I ended up with this annealing schedule.
30.11.2024 18:42
This cooling-down process is called annealing in materials science, hence the name Simulated Annealing. The success of finding the optimal solution depends on the initial temperature (T0) and the annealing schedule function, T(t) = f(T0, t).
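A common choice of schedule function is exponential decay; this sketch uses illustrative values for T0 and the decay rate, not the ones the post settled on:

```python
def temperature(t, T0=10.0, alpha=0.999):
    """Exponential annealing schedule: T(t) = T0 * alpha**t."""
    return T0 * alpha ** t
```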
30.11.2024 18:42
We repeat the process, and after each iteration we slowly reduce the temperature (slowly cool down the system). Once the temperature reaches ~0 we stop, and whatever solution we have is our final solution.
30.11.2024 18:42
We can control how likely we are to accept a worse solution using the temperature parameter T. The higher the temperature, the more likely we accept a route even though it is worse than the current one. This encourages exploration of the search space and helps escape local minima.
30.11.2024 18:42
Here "energy" is just a fancy term for the total travel distance of the TSP route. We accept the proposal if it gives lower energy than the current solution (the distance improves). Otherwise, we may still accept it, but with a probability that drops exponentially as the energy difference increases.
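This acceptance rule is the Metropolis criterion; a minimal sketch:

```python
import math
import random

def accept(delta_e, T, rng=random):
    """Decide whether to accept a proposal with energy change delta_e
    at temperature T: always accept improvements (delta_e <= 0),
    otherwise accept with probability exp(-delta_e / T)."""
    if delta_e <= 0:
        return True
    return rng.random() < math.exp(-delta_e / T)
```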
30.11.2024 18:42
The algorithm basically starts with a random initial state/solution. Then we change the solution with some perturbation function; we call this the proposed next solution. We either accept or reject the proposal depending on the energy of the proposed state compared to the current state.
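The whole loop (random start, perturb, accept/reject by energy, cool down) can be sketched for a small random TSP instance; the 30 random cities, the 2-opt-style perturbation, and the schedule constants here are illustrative choices, not the post's actual setup:

```python
import math
import random

rng = random.Random(0)
cities = [(rng.random(), rng.random()) for _ in range(30)]

def tour_length(order):
    """Total distance of the closed tour visiting cities in this order."""
    return sum(math.dist(cities[order[i]], cities[order[(i + 1) % len(order)]])
               for i in range(len(order)))

def perturb(order):
    """Propose a neighbor by reversing a random segment (2-opt style)."""
    i, j = sorted(rng.sample(range(len(order)), 2))
    return order[:i] + order[i:j + 1][::-1] + order[j + 1:]

current = list(range(len(cities)))
rng.shuffle(current)                       # random initial solution
cur_len = init_len = tour_length(current)
best, best_len = list(current), cur_len
T = 1.0
for step in range(20000):
    cand = perturb(current)
    cand_len = tour_length(cand)
    dE = cand_len - cur_len
    # Metropolis criterion: accept improvements, and worse moves
    # with probability exp(-dE / T).
    if dE <= 0 or rng.random() < math.exp(-dE / T):
        current, cur_len = cand, cand_len
        if cur_len < best_len:
            best, best_len = list(current), cur_len
    T *= 0.9995                            # cool down
```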
30.11.2024 18:42
Long time no update... Quite busy lately at work. But here it is: applying Markov Chain Monte Carlo to approximate a large Traveling Salesman Problem (N=100).
It's quite fascinating that a very simple algorithm relying on random search can be used to approximate an NP-hard optimization problem.
30.11.2024 18:42
Studying MCMC from scratch; so far everything is basically physics.
23.11.2024 19:44
Haha, got it, master.
20.11.2024 19:33
Monte Carlo Methods and Applications
Bayesian analysis generally consists of 2 essentials:
1. Building assumptions/priors
2. A numerical method to get the posterior
I find this course from CMU a good starting point for learning #2 at a fundamental level. I will use it as my guide
gi1242.codeberg.page/cmu-math-cs-...
19.11.2024 16:05
I'd also like to make my GitHub repo public (with all the code, visualizations, equations, and comments), but probably later. It needs tidying up first; it would be embarrassing if it were messy.
19.11.2024 14:41
However, this random behavior is not bad at all. Using a technique called Simulated Annealing, we can take advantage of it and use Monte Carlo to efficiently solve optimization problems, e.g., the Traveling Salesman Problem. Something I will post next time.
19.11.2024 14:41
However, Monte Carlo is not the best method to estimate pi because it converges very slowly, at a rate of 1/√n. Not only that, as more random points are added we might randomly ruin a previously better estimate and end up with a worse one.
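The 1/√n rate comes from the standard error of the Monte Carlo estimator; for the circle-in-a-square estimate it is:

```latex
\hat{\pi}_n = \frac{4}{n}\sum_{i=1}^{n} \mathbf{1}\!\left[x_i^2 + y_i^2 \le 1\right],
\qquad
\operatorname{SE}(\hat{\pi}_n) = \frac{4\sqrt{p(1-p)}}{\sqrt{n}},
\quad p = \frac{\pi}{4}
```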
19.11.2024 14:41
In the circle-in-a-square model, the reality to reconstruct is a "circle in a square". But with only 1,000 points, it barely looks like a circle. With 10,000 we can see some of the circle. Finally, with 100,000 the circle looks clear. Notice that as the circle becomes clearer, our estimate of pi becomes more accurate.
19.11.2024 14:41
The Law of Large Numbers is the theoretical foundation of the Monte Carlo method. This law states that as the number of random samples increases, the average of the samples gets closer to reality (math term: the true or expected value). The closer to reality, the better our estimation accuracy.
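In symbols, for i.i.d. samples X_i, the law states:

```latex
\frac{1}{n}\sum_{i=1}^{n} X_i \;\xrightarrow{\;n \to \infty\;}\; \mathbb{E}[X]
```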
19.11.2024 14:41
In the case of estimating pi above, we throw a bunch of points randomly into a 1x1 square, accept them (blue) if they fall inside the quadrant (quarter circle), and otherwise reject them (gray). The number of accepted points is proportional to the area of the quarter circle, allowing us to estimate pi.
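A minimal sketch of that accept/reject procedure (this is a generic implementation, not the repo's code):

```python
import random

def estimate_pi(n, seed=0):
    """Estimate pi by sampling n uniform points in the unit square and
    counting those inside the quarter circle of radius 1."""
    rng = random.Random(seed)
    accepted = 0
    for _ in range(n):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:   # point falls inside the quarter circle
            accepted += 1
    # accepted/n estimates area(quarter circle)/area(square) = pi/4
    return 4.0 * accepted / n
```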
19.11.2024 14:41
Monte Carlo essentially tries to estimate certain variables/parameters by generating a large number of random samples and accepting/rejecting each based on some criterion. The proportion of accepted samples reveals the information needed to estimate the parameters.
19.11.2024 14:41
First post on Bluesky. This place seems perfect for sharing my learning process with MCMC methods for Bayesian analysis.
I'll start with the very basics, the first chapter of any book: stochastically estimating pi using the circle-in-a-square model.
19.11.2024 14:41