Indonesia said that it would bar anyone under the age of 16 from access to social media, joining a growing list of countries that are enacting such restrictions in a bid to safeguard the well-being of children.
GPT-5.4 Pro (xhigh) also raised the CritPt record from Gemini 3.1 Pro's 17% to 30%. OpenAI appears to have an edge on the hardest math and physics reasoning tasks.
"CritPt evaluates language models on solving unpublished, frontier-level physics problems that require genuine research-scale reasoning."
Wow
The image is a benchmark comparison infographic titled "Qwen3.5-4B vs GPT-4o." It compares the Qwen3.5-4B open-weight model (released March 2026) against OpenAI's GPT-4o (from May 2024).

Summary of Results
* Total Wins: Qwen3.5-4B wins 5 out of 7 benchmarks; GPT-4o wins 2 out of 7.
* Average Advantage: Qwen has a +9.6 average advantage over GPT-4o across the categories shown.

Benchmark Performance (Bar Chart)
The bar chart displays percentage scores across seven specific benchmarks, with Qwen represented in light blue and GPT-4o in gold/brown.

| Benchmark | Leader |
|---|---|
| GPQA Diamond | Qwen3.5-4B (Significant lead) |
| MMLU-Pro | Qwen3.5-4B |
| MATH-500 | Qwen3.5-4B (Largest lead, nearly 95%) |
| MMMU-Pro | Qwen3.5-4B |
| Video-MME | Qwen3.5-4B |
| MMMLU | GPT-4o (Slight lead) |
| MMLU | GPT-4o (Slight lead) |

Key Takeaway
The graphic highlights that the much smaller 4B parameter Qwen model from 2026 outperforms the older 2024 flagship GPT-4o in specialized reasoning and math tasks, while GPT-4o maintains a narrow edge in general knowledge benchmarks like MMLU and MMMLU.
at least on benchmarks, Qwen3.5 4B beats GPT-4o
GPTQ 4-bit quant means it fits into 2 GB
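The 2 GB figure follows from simple arithmetic on the weights alone. A rough sketch, assuming "4B" means exactly 4 billion parameters and ignoring GPTQ's per-group scales and zero-points, any layers left unquantized, and runtime memory such as the KV cache (the function name here is made up for illustration):

```python
def weight_footprint_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in decimal gigabytes (1 GB = 1e9 bytes)."""
    total_bytes = num_params * bits_per_weight / 8  # 8 bits per byte
    return total_bytes / 1e9


params = 4e9  # assumption: "4B" taken as exactly 4 billion parameters

fp16_gb = weight_footprint_gb(params, 16)  # unquantized half-precision baseline
int4_gb = weight_footprint_gb(params, 4)   # 4-bit quantized weights

print(f"fp16: {fp16_gb:.1f} GB, 4-bit: {int4_gb:.1f} GB")
# prints "fp16: 8.0 GB, 4-bit: 2.0 GB"
```

A real checkpoint will likely come out somewhat above this, since quantization metadata adds overhead on top of the raw 4-bit weights.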
You can tell they never read or studied anything by the people who lived through WWII or Vietnam
Rapidly rebranding all my search benchmarks as eval awareness benchmarks
I curse at it but only after I ask it to summarize the current relevant files and functions… then I start a new conversation
I think we're screwed if they give the tools the ability to remember…
This is pretty amazing. Could flip the vast swaths of rural America to EVs
Imagine building a modest solar farm and some battery and capacitor banks… and the rural residents could indefinitely power their vehicles with a short stop, and never have to truck in gasoline: completely self-sufficient
Does anyone know exactly how the new interrupt modes work on the latest models? Are they just interrupting the context and appending the interruption with special tags or something?
A line graph titled "GPT-5.4: 1M Context Reality Check" showing needle-in-a-haystack accuracy (MRCR v2, 8-needle) across different context window ranges. The accuracy starts at 97.3% for the 4-8K range and remains relatively high until 128-256K, where it begins a sharp decline. In the final two ranges, highlighted in red as the "1M context" zone, the accuracy drops significantly to 57.5% (labeled as a "40pt drop") at 256-512K and falls to 36.6% at the 512K-1M range. The source is cited as OpenAI GPT-5.4 eval table, dated March 5, 2026.
GPT-5.4 has 1M token context! wow!
reality:
Does local law clarify though that it's being used to circumvent another nation's rights protections? Seems like they would have had grounds to oppose this if they knew
A court record reviewed by 404 Media shows privacy-focused email provider Proton Mail handed over payment data related to a Stop Cop City email account to the Swiss government, which handed it to the FBI.
I was wondering when someone was going to tackle this. There's a trove of AI-centric features yet to be developed
www.tomshardware.com/tech-industr...
The Very Large Array (VLA) in New Mexico is open to visitors 7 days a week from 9 AM – 4 PM. Come check out our telescopes on a self-guided walking tour, and don't forget to stop by the Visitor Center & Gift Shop! #VisitVLA
Admission: https://public.nrao.edu/visit/very-large-array/
Had early access to GPT-5.4 and Pro. The stats are very good and so are the models.
One fun illustration of progress: this is the prompt "the book Piranesi as a p5js 3d space. do it for me," back in 2024 in GPT-4 (which took multiple corrections) and now in GPT-5.4 Pro, which did it in one prompt.
A year ago, releasing complete source code was necessary for the production of working object code.
Today releasing complete documentation is starting to be sufficient for the production of working object code.
Golly.
I don't know who needs to know this... but...
A VPN is not a privacy application. It doesn't hide your location data, specifically. It doesn't encrypt your data, specifically. All it does is route your traffic through a single server somewhere else.
Let me explain. Or don't, this is the internet.
Interesting development… I guess Alexandr Wang is on the way out. That was a bit quicker than I expected. I would have thought he'd be given at least a year of runway.
timesofindia.indiatimes.com/technology/t...
There's a lot of info missing, but this is potentially a really cool local app builder:
www.glazeapp.com
WaPo is reporting that Claude was used in target selection
"To strike 1,000 targets in 24 hours in Iran, the U.S. military leveraged AI
Anthropic's Claude partnered with the military's Maven Smart System, suggesting targets and issuing precise location coordinates." wapo.st/4rN5sa1
The A18 Pro in the MacBook Neo is 19% faster than the M2 Ultra in the Mac Pro in single-core performance (Geekbench 6).
The MacBook Neo starts at $599.
The Mac Pro, which is still for sale, starts at $6,999.
Look i know ai is "not sentient" but like, if you went back in time to the 90s and told someone about this, they would tell you you had a sentient robot inside your computer
I think one of the things that bothers me the most about the way Polymarket presents itself to the world is that it's adopted the language of journalism to make itself sound more legitimate. Like look at this post from yesterday morning incorrectly "projecting" the winner.
Toyota and Stellantis withdraw from CO₂ pool with Tesla
However, for 2026, two major manufacturers, Toyota and Stellantis, are withdrawing from this pool, likely removing two of Tesla's largest financial contributors.
Over 40% of global shipping by volume exists to move fossil fuels from one place to another.
A huge share of the world's maritime infrastructure has been built around a system that is going to change dramatically as renewable energy and electrification displace fossil fuels.
Ah, yes, the invaluable wisdom of the markets.