This is what I think of the AI bubble:
A few years back BORUTA was all over the web and data science competition forums.
Since then... silence... is it really dead?
I did some research, and this is what I found out:
www.blog.trainindata.com/is-boruta-de...
New payment method rolled out for all our courses!
You can now pay in your own currency* and avoid hidden bank or country-specific fees.
We look forward to seeing you on our courses.
*At the moment, only 20 currencies are supported.
champ.ly/6WkK6AA3
Should you use imbalanced-learn in 2025?
SMOTE, oversampling and undersampling have been proposed as the workhorses for tackling imbalanced data.
But do they really work?
We talk about that in this article.
www.blog.trainindata.com/should-you-u...
Moving averages have long been used as a benchmark forecasting model.
Did you know that you can also use moving averages as input features?
If not, check out this blog to find out more, together with Python implementations:
www.blog.trainindata.com/master-movin...
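A quick sketch of the idea, not the implementation from the blog: a trailing moving average built only from past values, so the feature at time t never sees the value it is meant to predict. The function name, the window size and the sales numbers are made up for illustration.

```python
def trailing_moving_average(values, window):
    """Mean of the `window` observations strictly before each position.

    Only past values are used, so the feature at time t never includes
    values[t]. Positions without a full window of history get None.
    """
    features = []
    for t in range(len(values)):
        if t < window:
            features.append(None)  # not enough history yet
        else:
            past = values[t - window:t]
            features.append(sum(past) / window)
    return features


sales = [10, 12, 14, 16, 18, 20]  # made-up series
ma3 = trailing_moving_average(sales, window=3)

# Pair each target with its feature, dropping rows without enough history.
rows = [(y, f) for y, f in zip(sales, ma3) if f is not None]
print(rows)
```

Each row pairs a target with a feature a model could actually use at prediction time.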
Discover the latest thoughts on working with imbalanced data with our free booklet.
We discuss 3 recent articles that have changed the conversation on resampling and SMOTE.
www.trainindata.com/p/7-takes-on...
All our courses come with a 30-Day money back guarantee...
If you are unhappy for whatever reason, we give you the money back.
That's how confident we are that you'll ❤️ our courses.
#trainindata
Next Monday on Data Bites: Six Cloud Platforms to Run Jupyter Notebooks for Free
Want to know more?
Click the link below to subscribe and stay tuned!
https://f.mtr.cool/bltkmoeitj
#machinelearning #datascience #jupyter #mlmodels #ML #mltools #notebooks #cloudplatforms
Imbalanced datasets can mess with your ML models. 😬
ADASYN (Adaptive Synthetic Sampling) to the rescue!
Learn how it works + when to use it in our latest blog:
https://f.mtr.cool/rqstrumpnx
#MachineLearning #DataScience #ImbalancedData #ADASYN
MICE is a powerful method for datasets with missing data across multiple variables.
Let this slide guide you through how it works.
#machinelearning #MICE #mlmodels #datascience #dataengineering #imputation #featureengineering
How to construct ensembles from a thousand models?
In this article, Caruana, a prominent figure in machine learning and ensemble methods, tells us more about how they create ensembles from libraries of 1000s of machine learning models.
https://f.mtr.cool/fpaqqnqxms
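To give a flavour of what picking an ensemble from a model library can look like, here is a minimal sketch of greedy forward selection with replacement on a validation set. It illustrates the general idea under our own assumptions and is not the article's exact procedure; the model names, predictions and the choice of mean squared error are all made up.

```python
def mse(pred, target):
    """Mean squared error between two equal-length lists."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def greedy_ensemble_selection(library, target, rounds):
    """Greedy forward selection with replacement.

    library: dict mapping a model name to its validation predictions.
    In each round, add the model whose inclusion gives the averaged
    ensemble the lowest validation error. A model can be picked more
    than once, which acts as a simple weighting.
    """
    chosen = []
    running_sum = [0.0] * len(target)
    for _ in range(rounds):
        best_name, best_score = None, None
        k = len(chosen) + 1
        for name, preds in library.items():
            candidate = [(s + p) / k for s, p in zip(running_sum, preds)]
            score = mse(candidate, target)
            if best_score is None or score < best_score:
                best_name, best_score = name, score
        chosen.append(best_name)
        running_sum = [s + p for s, p in zip(running_sum, library[best_name])]
    return chosen


# Hypothetical validation targets and predictions from three models.
target = [1.0, 0.0, 1.0, 0.0]
library = {
    "a": [0.8, 0.2, 0.6, 0.4],
    "b": [0.6, 0.4, 0.9, 0.1],
    "c": [0.5, 0.5, 0.5, 0.5],
}
chosen = greedy_ensemble_selection(library, target, rounds=3)
print(chosen)
```

The final ensemble prediction is the average over the chosen list, so repeated names count more.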
Clustering & Dimensionality Reduction: your toolkit for finding patterns, simplifying data, and solving real-world problems.
You'll:
✅ Group data (K-means, DBSCAN & more)
✅ Reduce complexity (PCA, UMAP)
✅ Work on real cases like RNA profiling
https://f.mtr.cool/hdjiwbbsbl
Next Monday on Data Bites: Working with imbalanced data? Follow these 3 steps.
Want to know more?
Click the link below to subscribe and stay tuned!
https://f.mtr.cool/svpfklfpda
#machinelearning #datascience #CV #mlmodels #ML #MLCareer #MLresume
Model performance matters! 🎯
In this article, we break down essential evaluation metrics for classification models, starting with the Confusion Matrix. Perfect for anyone looking to build reliable #machinelearning systems!
Have a good read!
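For anyone who wants to see the numbers behind the matrix, here is a small sketch that counts the four cells by hand and derives three common metrics from them. The labels and predictions are invented for the example.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary classification problem."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


y_true = [1, 0, 1, 1, 0, 0, 1, 0]  # made-up ground truth
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]  # made-up predictions

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
precision = tp / (tp + fp)  # of the predicted positives, how many are right
recall = tp / (tp + fn)     # of the actual positives, how many were found
accuracy = (tp + tn) / len(y_true)
print(tp, fp, fn, tn, precision, recall, accuracy)
```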
ELI5 now supports scikit-learn 1.6.0! It wasn't working with the latest version of scikit-learn, but that's a thing of the past.
As of now, ELI5 has released a new version with full support for scikit-learn >1.6.0 and Python >3.10.
Check it out!
Can we use statistical tests to select features?
Turns out, we can!
In the slides below, we'll explore the most commonly used statistical tests for feature selection, along with their advantages and limitations.
#machinelearning #datascience #featureselection
🚨 It's here! Our new course on Clustering & Dimensionality Reduction just dropped.
Learn how to group data (K-Means, DBSCAN, Louvain) + simplify it with PCA & UMAP, no prior experience needed!
Hands-on & practical.
https://f.mtr.cool/zshxexbrds
#MachineLearning #DataScience
Next Monday on Data Bites: How to Write a Winning Data Science CV
Want to know more?
Click the link below to subscribe and stay tuned!
https://f.mtr.cool/nozrfuruar
#machinelearning #datascience #CV #mlmodels #ML #MLCareer #MLresume
Deep learning has transformed our daily lives, but designing neural networks remains a challenge.
Automated hyperparameter optimization (HPO) streamlines the process. This paper reviews key techniques & tools for improving model accuracy & efficiency.
https://f.mtr.cool/wowjcrmwjg
In case you were wondering
#machinelearning #ai #datascience #dataengineering #mlmodels
🚨 SMOTE has long been hailed as the go-to solution for imbalanced datasets, but it only works in specific scenarios.
In this article, we explore when SMOTE is truly effective & why it's remained popular.
Check it out!
https://f.mtr.cool/medbbpfril
🚨 Just launched: our new course on Clustering & Dimensionality Reduction is live at Train in Data!
Learn to group data, reduce complexity with PCA & UMAP, and tackle real-world projects (no experience needed!)
Join us: https://f.mtr.cool/wlhxbboqkl
Next Monday on Data Bites: Everybody says "SMOTE does not work".
Want to know more?
Click the link below to subscribe and stay tuned!
https://f.mtr.cool/pinchbaedf
#machinelearning #datascience #smote #mlmodels #ML
In this video, I review hyperparameter optimization techniques like Grid Search, Random Search, & Bayesian methods.
Learn their pros, cons, and best applications for both low and high-dimensional spaces!
What techniques do you use?
📽️
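If you want to play with the difference between the first two approaches, here is a toy sketch. The "validation loss" is a made-up function of two hypothetical hyperparameters, x and y, standing in for training and scoring a real model. Bayesian methods are left out because they need a surrogate model.

```python
import itertools
import random


def validation_loss(params):
    # Toy stand-in for "train a model and score it"; minimum at x=0.3, y=0.7.
    return (params["x"] - 0.3) ** 2 + (params["y"] - 0.7) ** 2


def grid_search(grid):
    """Evaluate every combination of the listed values."""
    names = list(grid)
    combos = itertools.product(*(grid[name] for name in names))
    candidates = [dict(zip(names, combo)) for combo in combos]
    return min(candidates, key=validation_loss)


def random_search(bounds, n_trials, seed=0):
    """Evaluate n_trials points sampled uniformly within the bounds."""
    rng = random.Random(seed)
    candidates = [
        {name: rng.uniform(low, high) for name, (low, high) in bounds.items()}
        for _ in range(n_trials)
    ]
    return min(candidates, key=validation_loss)


# Same budget for both: 9 evaluations.
best_grid = grid_search({"x": [0.0, 0.5, 1.0], "y": [0.0, 0.5, 1.0]})
best_random = random_search({"x": (0.0, 1.0), "y": (0.0, 1.0)}, n_trials=9)
print(best_grid, validation_loss(best_grid))
print(best_random, validation_loss(best_random))
```

Grid search only ever tries three distinct values per hyperparameter here, while random search tries nine, which is one common argument for random search as the number of dimensions grows.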
Python libraries that implement model-agnostic global explainability methods
#python #machinelearning #MLModel #datascience #dataengineering
Most commonly used encoding techniques ⬇️
1. OneHotEncoder
2. OrdinalEncoder
3. TargetEncoder
When one-hot encoding gets too complex and ordinal encoding leads to inaccuracies, TargetEncoding often becomes the best choice. Learn more at the link below.
#targetencoder #ML
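A bare-bones sketch of what the three encodings do to one categorical column. It is written in plain Python for illustration and does not call any library. The colours and targets are invented, and the target encoding here is just the per-category mean, whereas library implementations typically add smoothing and cross-fitting to limit leakage.

```python
colors = ["red", "blue", "red", "green", "blue", "red"]  # made-up feature
target = [1, 0, 1, 0, 1, 0]                              # made-up binary target
categories = sorted(set(colors))  # ['blue', 'green', 'red']

# One-hot: one 0/1 column per category.
one_hot = [[int(c == cat) for cat in categories] for c in colors]

# Ordinal: one integer per category (the order here is arbitrary).
ordinal = [categories.index(c) for c in colors]

# Target encoding (simplified): replace each category with its mean target.
sums, counts = {}, {}
for c, y in zip(colors, target):
    sums[c] = sums.get(c, 0) + y
    counts[c] = counts.get(c, 0) + 1
category_mean = {c: sums[c] / counts[c] for c in categories}
target_encoded = [category_mean[c] for c in colors]

print(one_hot[0], ordinal, category_mean)
```

With many categories, one-hot produces one column per category and ordinal imposes an order that may not exist, while target encoding stays at a single column.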
🚨 New Course - Clustering & Dimensionality Reduction at Train in Data
Learn to apply unsupervised ML in practice:
✅ K-Means, DBSCAN, HDBSCAN, Graph-based
✅ PCA & UMAP
✅ Real-world projects incl. RNA case study
Find out more: https://f.mtr.cool/cojxgkyhgq
Next Monday on Data Bites: Probe Feature Selection
Want to know more?
Click the link below to subscribe and stay tuned!
https://f.mtr.cool/xefqrzzgeh
#machinelearning #datascience #imbalanceddata #undersampling #mlmodels #ML
The most crucial component of any machine learning project is data!

▶️ 90% of the time is spent on data preprocessing
▶️ 10% of the time is spent on model building, tuning and evaluation.
#machinelearning #ML #MLmodels #preprocessing #modelbuilding #datascience
Discover the truth behind SMOTE for imbalanced data and explore better alternatives.
Learn more about metrics, threshold optimization, and classifier calibration in this video.
If you find it useful, don't hesitate to share with your peers!
https://www.youtube.com/watch?v=blcOOheXNoQ
#ml
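As a taste of threshold optimization, here is a small sketch that keeps the model's predicted probabilities as they are and simply searches for the decision threshold with the best F1. This is our own illustration rather than code from the video; the labels, probabilities and the choice of F1 as the metric are assumptions.

```python
def f1_at_threshold(y_true, proba, threshold):
    """F1 score when predicting positive for proba >= threshold."""
    tp = fp = fn = 0
    for truth, p in zip(y_true, proba):
        pred = int(p >= threshold)
        if pred and truth:
            tp += 1
        elif pred:
            fp += 1
        elif truth:
            fn += 1
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up imbalanced validation set: 7 negatives, 3 positives.
y_true = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1]
proba = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.45, 0.35, 0.40, 0.70]

thresholds = [i / 20 for i in range(1, 20)]  # 0.05, 0.10, ..., 0.95
best = max(thresholds, key=lambda th: f1_at_threshold(y_true, proba, th))

print("F1 at default 0.5:", f1_at_threshold(y_true, proba, 0.5))
print("best threshold:", best, "F1:", f1_at_threshold(y_true, proba, best))
```

In practice the threshold should be chosen on validation data and checked on a separate test set.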