
Scott Lowe

@scottclowe

Machine learning researcher

14 Followers · 9 Following · 15 Posts · Joined 08.12.2024

Latest posts by Scott Lowe @scottclowe

UIP2P: Unsupervised Instruction-based Image Editing via Cycle Edit Consistency We propose an unsupervised model for instruction-based image editing that eliminates the need for ground-truth edited images during training. Existing supervised methods depend on datasets containing ...

There will also be continued interest in methods that allow more controllable manipulation of generated images (e.g. UIP2P arxiv.org/abs/2412.15216, Imagic arxiv.org/abs/2210.09276).

But maybe the pending copyright lawsuits will have major impacts on GenAI.

03.01.2025 17:40 πŸ‘ 0 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0

For GenAI, improvements to generation quality are going to come from better data curation and value functions that drive the model toward high-quality outputs. Standard training yields outputs representative of the training distribution, but users don't want the average; we want the best quality.

03.01.2025 17:39 πŸ‘ 0 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0
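One simple way a value function can steer a generator toward high-quality outputs is best-of-n sampling: draw several candidates and keep the one the value function scores highest. A minimal sketch, where `sample_output` and `value_fn` are stand-in stubs rather than any particular model:

```python
import random

def sample_output(prompt, rng):
    # Stand-in for a generative model: returns a candidate with some quality score.
    return {"text": f"candidate for {prompt!r}", "quality": rng.random()}

def value_fn(candidate):
    # Stand-in for a learned value function rating output quality.
    return candidate["quality"]

def best_of_n(prompt, n=8, seed=0):
    """Draw n candidates and keep the one the value function rates highest."""
    rng = random.Random(seed)
    candidates = [sample_output(prompt, rng) for _ in range(n)]
    return max(candidates, key=value_fn)

best = best_of_n("a photo of a cat", n=8)
```

Sampling more candidates trades extra compute for output quality, which is the same lever the curation-plus-value-function argument above pulls on.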
Frontier Models are Capable of In-context Scheming Frontier models are increasingly trained and deployed as autonomous agents. One safety concern is that AI agents might covertly pursue misaligned goals, hiding their true capabilities and objectives - ...

This will be crucial for AGI, and it poses serious safety concerns: models that are better at thinking outside the box and devising creative solutions will have broader effects than the prompter anticipated arxiv.org/abs/2412.04984.

03.01.2025 17:38 πŸ‘ 0 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0

Humans solve novel problems on the fly using System 2 reasoning; AI needs this too. By learning individual reasoning steps at training time, a model can compose new sequences of those steps at deployment, enabling it to extrapolate beyond its training data.

03.01.2025 17:37 πŸ‘ 0 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0
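Composing learned steps into novel sequences can be pictured as search: primitive operations acquired during training get chained in orders never seen together. A toy breadth-first search over arithmetic steps, purely illustrative and not any specific method:

```python
from collections import deque

# "Learned" primitive reasoning steps: small, reusable transformations.
STEPS = {
    "add3": lambda x: x + 3,
    "double": lambda x: x * 2,
    "sub1": lambda x: x - 1,
}

def plan(start, goal, max_depth=6):
    """Breadth-first search for a sequence of primitive steps reaching the goal."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        value, path = queue.popleft()
        if value == goal:
            return path
        if len(path) >= max_depth:
            continue
        for name, fn in STEPS.items():
            nxt = fn(value)
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [name]))
    return None

route = plan(2, 9)  # a step sequence none of the primitives encodes on its own
```

The individual steps are fixed at "training" time; the novel composition is discovered at "deployment" time, which is the extrapolation the post describes.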

Reasoning capabilities are essential for robust performance in key ML products, e.g. full self-driving. The distribution of driving scenarios is long-tailed, so even a model that handles most situations well may face a novel situation outside its training data, yet it must still respond correctly.

03.01.2025 17:37 πŸ‘ 0 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0
Large Concept Models: Language Modeling in a Sentence Representation Space LLMs have revolutionized the field of artificial intelligence and have emerged as the de-facto tool for many tasks. The current established technology of LLMs is to process input and generate output a...

One way to do this is hierarchical LLMs like the Large Concept Model arxiv.org/abs/2412.08821, Byte Latent Transformer arxiv.org/abs/2412.09871, and Block Transformer arxiv.org/abs/2406.02657.

03.01.2025 17:35 πŸ‘ 1 πŸ” 1 πŸ’¬ 1 πŸ“Œ 0

For agentic models, the focus is shifting to System 2-like reasoning. OpenAI's o1/o3 models demonstrate that step-by-step reasoning can improve output quality by leveraging test-time compute. But their impressive results on ARC are expensive, so there will be a push to improve test-time compute efficiency.

03.01.2025 17:34 πŸ‘ 0 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0
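A cheap form of test-time compute is self-consistency: sample several reasoning chains and majority-vote on the final answer. A toy sketch with a stubbed-out chain sampler standing in for an LLM (the 70% accuracy figure is an arbitrary assumption for illustration):

```python
import random
from collections import Counter

def sample_chain(question, rng):
    # Stand-in for sampling one chain-of-thought; a real system would call an LLM.
    # This fake "reasoner" answers correctly 70% of the time, else guesses.
    return 42 if rng.random() < 0.7 else rng.randrange(100)

def self_consistency(question, k=25, seed=0):
    """Sample k chains and return the most common final answer."""
    rng = random.Random(seed)
    answers = [sample_chain(question, rng) for _ in range(k)]
    return Counter(answers).most_common(1)[0][0]

answer = self_consistency("What is 6 * 7?", k=25)
```

Accuracy grows with `k`, but so does cost, which is exactly the test-time compute efficiency trade-off the post points at.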

But the LLM training corpus is now the majority of worthwhile text humanity has ever written, and can't be meaningfully scaled further. As Ilya Sutskever put it at NeurIPS, "big data is the fossil fuel of AI".

With this in mind, what will be the next stage of AI development?

03.01.2025 17:31 πŸ‘ 0 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0

The turn of the year is a good time for reflection. Here are my thoughts on where ML is headed.

Advances have been driven by scaling: bigger compute, bigger data, bigger models. Moreover, larger data also offers a solution to OOD generalization: just grow the training set until everything is in-domain!

03.01.2025 17:31 πŸ‘ 0 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0

This has some serious AI safety implications. An AI model that classifies what is in an image better than a human doesn't pose an existential threat. But when an AI model can perform long-term planning better than a human, "just unplug it" ceases to be a reliable solution.

09.12.2024 17:16 πŸ‘ 0 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0
System 2 Reasoning Capabilities Are Nigh In recent years, machine learning models have made strides towards human-like reasoning capabilities from several directions. In this work, we review the current state of the literature and describe t...

In "System 2 Reasoning Capabilities Are Nigh", I lay out comparisons between human reasoning and reasoning in AI models, and argue that all the components needed to create AI models that can perform human-like reasoning already exist.
arxiv.org/abs/2410.03662

09.12.2024 17:13 πŸ‘ 0 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0

It's very easy to get started with using the dataset. The commands to download it and load it for PyTorch training fit in less than half a tweet:

!pip install bioscan-dataset
from bioscan_dataset import BIOSCAN5M
ds = BIOSCAN5M("~/Datasets/bioscan-5m", download=True)

09.12.2024 17:08 πŸ‘ 0 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0

The dataset should be useful for a variety of research topics:
- multimodal learning
- fine-grained classification
- hierarchical labelling
- open-world classification/clustering
- semi- and self-supervised learning

09.12.2024 17:00 πŸ‘ 0 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0
BIOSCAN-5M: A Multimodal Dataset for Insect Biodiversity As part of an ongoing worldwide effort to comprehend and monitor insect biodiversity, this paper presents the BIOSCAN-5M Insect dataset to the machine learning community and establish several benchmar...

BIOSCAN-5M is a multimodal dataset for insect biodiversity monitoring. It consists of 5 million insect specimens from around the world, with a high-res microscopy image, DNA barcode, taxonomic labels, size, and geolocation info for each sample.
arxiv.org/abs/2406.127...

09.12.2024 16:57 πŸ‘ 0 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0

I'm looking forward to NeurIPS this week! I'll be presenting two papers there. In the main conference, our new dataset BIOSCAN-5M, and in the System 2 Reasoning At Scale workshop my position paper "System 2 Reasoning Capabilities Are Nigh".

09.12.2024 16:53 πŸ‘ 0 πŸ” 0 πŸ’¬ 2 πŸ“Œ 0