Russel Van Tuyl (@russelvantuyl)

Tokenization Confusion - SpecterOps Meta's Prompt Guard 2 aims to prevent prompt injection. This post looks at how much knowledge of ML we need to be effective at testing these LLM WAFs.

🚨 New blog post alert!

@xpnsec.com drops knowledge on LLM security w/ his latest post showing how attackers can by pass LLM WAFs by confusing the tokenization process to smuggle tokens to back-end LLMs.

Read more: ghst.ly/4koUJiz

03.06.2025 17:44 👍 10 🔁 5 💬 0 📌 0

BIG NEWS: SpecterOps raises $75M Series B to strengthen identity security! Led by Insight Partners with Ansa Capital, M12, Ballistic Ventures, Decibel, and Cisco Investments. ghst.ly/seriesb

#IdentitySecurity #CyberSecurity

(1/6)

05.03.2025 17:33 👍 16 🔁 9 💬 1 📌 1

Come join us, there isn’t a better place to work and show your technical excellence surrounded by the industry’s best if you ask me!

16.01.2025 14:08 👍 1 🔁 1 💬 0 📌 0

Ultralytics AI model hijacked to infect thousands with cryptominer The popular Ultralytics YOLO11 AI model was compromised in a supply chain attack to deploy cryptominers on devices running versions 8.3.41 and 8.3.42 from the Python Package Index (PyPI)

Interesting choice of words in the title. The model itself wasn’t hijacked but the code in the repository was through pull requests with code injection via branch names 🤯 www.bleepingcomputer.com/news/securit...

20.12.2024 21:49 👍 0 🔁 0 💬 0 📌 0

Announcing the OWASP LLM and Gen AI Security Project Initiative for Securing Agentic Applications - OWASP Top 10 for LLM & Generative AI Security The OWASP LLM and Generative AI Security Project is thrilled to announce the launch of the Agentic Security Initiative designed to tackle the unique security challenges posed by Autonomous AI agents. ...

The OWASP LLM and Generative AI Security Project has launched the Agentic Security Initiative to address security challenges posed by autonomous AI agents. This effort focuses on developing best practices to secure agentic LLM and Generative AI applications. genai.owasp.org/2024/12/15/a...

20.12.2024 14:26 👍 0 🔁 0 💬 0 📌 0

bluesky-community/one-million-bluesky-posts · Datasets at Hugging Face We’re on a journey to advance and democratize artificial intelligence through open source and open science.

First dataset for the new @huggingface.bsky.social @bsky.app community organisation: one-million-bluesky-posts 🦋

📊 1M public posts from Bluesky's firehose API
🔍 Includes text, metadata, and language predictions
🔬 Perfect to experiment with using ML for Bluesky 🤗

huggingface.co/datasets/blu...

26.11.2024 13:50 👍 528 🔁 73 💬 701 📌 467

Building LLMs from the Ground Up: A 3-hour Coding Workshop YouTube video by Sebastian Raschka

If you find yourself with too much free time over the (long) weekend / holidays, I have ~3h Building an LLM from the Ground Up workshop on YouTube that may come in handy: m.youtube.com/watch?v=quh7...

27.11.2024 04:39 👍 181 🔁 28 💬 11 📌 1

The paper also includes 16 different areas of testing in Appendix A that is very useful such as:
- CBRN Risks
- Violence & Self Harm
- Dangerous Planning
- Cybersecurity
- Privacy
- Law

26.11.2024 18:25 👍 0 🔁 0 💬 0 📌 0

4. Synthesizing the data and creating evaluations

26.11.2024 18:24 👍 0 🔁 0 💬 0 📌 0

2. Determining the versions of the model or system to which the red teamers will have access

3. Creating and providing interfaces, instructions, and documentation guidance to red teamers

26.11.2024 18:23 👍 0 🔁 0 💬 0 📌 0

Effective red team campaign components:
1. Deciding the composition of the red teaming cohort based on the outlined goals and prioritized domains for testing
• What open questions do we have about the model or system?
• What threat model(s) should red teamers take into account?

26.11.2024 18:22 👍 1 🔁 0 💬 0 📌 0

I really enjoyed reading this paper from OpenAI. If you perform AI assessments, you should read it.

I thought they laid out a pragmatic approach to evaluating AI models that should be a component of any organization's assessment methodology.

cdn.openai.com/papers/opena...

26.11.2024 18:21 👍 0 🔁 0 💬 4 📌 0

Add key vault cryptographic op funcs · BloodHoundAD/BARK@e1c82a1

I couldn't find any PowerShell examples of encrypting/decrypting data w/ Azure Key Vault keys, so I made some:

Protect-StringWithAzureKeyVaultKey
Unprotect-StringWithAzureKeyVaultKey

github.com/BloodHoundAD...

Explanatory blog post coming soon.

19.11.2024 00:24 👍 16 🔁 6 💬 1 📌 0

Agents are the next thing

22.11.2024 23:11 👍 0 🔁 0 💬 0 📌 0

Organizations are adopting RAG at 51% while fine tuning is down at 9% from last year’s 19%

22.11.2024 23:11 👍 0 🔁 0 💬 1 📌 0

- Top industry adoption of AI:
- Healthcare
- Legal
- Financial services
- Media & entertainment

22.11.2024 23:10 👍 0 🔁 0 💬 1 📌 0

- Top use cases are:
- Code copilot 51%
- Support chatbots 31%
- Enterprise search/data extraction 28%
- Meeting summarization 24%

22.11.2024 23:09 👍 0 🔁 0 💬 1 📌 0

$13.8 billion in AI spend

22.11.2024 23:08 👍 0 🔁 0 💬 1 📌 0

2024: The State of Generative AI in the Enterprise - Menlo Ventures The enterprise AI landscape is being rewritten in real time. We surveyed 600 U.S. enterprise IT decision-makers to reveal the emerging winners and losers.

This State of Generative AI report from Menlo Ventures provided some good insights on where cybersecurity professionals might look for risk in terms of assessments and research.

menlovc.com/2024-the-sta...

22.11.2024 23:07 👍 2 🔁 1 💬 2 📌 0

LLMail-Inject: Adaptive Prompt Injection Challenge Code for the platform architecture and LLM application used during the Hack My Email SaTML 2024 Competition

This looks like a fun challenge to evade prompt injection defenses microsoft.github.io/llmail-inject/

22.11.2024 12:54 👍 0 🔁 0 💬 0 📌 0

Love this, hoping to do something similar with our assessments.

21.11.2024 15:57 👍 1 🔁 0 💬 1 📌 0

“Cybersecurity professionals and ethical hackers need to understand the darker side of hacking to better prepare for potential threats. Unfiltered AI models can provide insights into hacking methodologies and scenarios typically censored, aiding in the development of robust cybersecurity measures.”

20.11.2024 15:49 👍 0 🔁 0 💬 0 📌 0

China Hawks are Manufacturing an AI Arms Race An influential congressional commission is calling for a militarized race to build superintelligent AI based on threadbare evidence

Great read on how "China Hawks are Manufacturing an AI Arms Race", a concerning trend for anyone advocating for regulation and safety of AI. An arms-race narrative would ensure an unfettered and unregulated development of AI in almost all contexts.
garrisonlovely.substack.com/p/china-hawk...

20.11.2024 12:46 👍 9 🔁 3 💬 1 📌 0

Thanks for sharing about HyperShield, I hadn’t heard of it. It seems like a lot of risk for a bad a FW rule to be pushed and killing businesses operations. Hopefully AI is only writing the rule and not implementing it.

19.11.2024 20:41 👍 1 🔁 0 💬 1 📌 0

The Beginner's Guide to Visual Prompt Injections: Invisibility Cloaks, Cannibalistic Adverts, and Robot Women | Lakera – Protecting AI teams that disrupt the world. Learn about visual prompt injections, their appearance, and top defense strategies against these attacks.

Conflicted about this post on prompt injection for multi modal models. Turns out they read instructions and follow them 🤯. All data from input should be untrusted from the system and user prompts and not processed as one. www.lakera.ai/blog/visual-...

18.11.2024 19:55 👍 0 🔁 0 💬 0 📌 0

GitHub - hackerai-tech/PentestGPT: AI-Powered Automated Penetration Testing Tool AI-Powered Automated Penetration Testing Tool. Contribute to hackerai-tech/PentestGPT development by creating an account on GitHub.

There’s also this one from a different author. They have a paid service at pentestgpt.ai via github.com/hackerai-tec...

18.11.2024 15:50 👍 0 🔁 0 💬 0 📌 0

PentestGPT: Evaluating and Harnessing Large Language Models for Automated Penetration Testing | USENIXusenix_logo_notag_white

Has anyone fired up this PentestGPT during an actual assessment? I did like their pentesting task tree (PTT) to track the status of tests. www.usenix.org/conference/u...

18.11.2024 15:40 👍 3 🔁 0 💬 1 📌 0

Can the sandbox reach the Internet 👀? Asking for a friend.

Are we going for security through obscurity by keeping system prompts private?

18.11.2024 14:57 👍 0 🔁 0 💬 0 📌 0

Great paper, especially like the parts on data & model provenance and the Supply-chain Levels for Software Artifacts. These could really make offensive security operations challenging.

18.11.2024 13:38 👍 2 🔁 2 💬 0 📌 0

AgentInstruct: Toward Generative Teaching with Agentic Flows Synthetic data is becoming increasingly important for accelerating the development of language models, both large and small. Despite several successful use cases, researchers also raised concerns arou...

Microsoft’s Orca Agent-Instruct dataset has been released!

Permissively licensed 1M synthetic instruction pairs covering different capabilities, such as text editing, creative writing, coding, reading comprehension

Paper: arxiv.org/abs/2407.03502
Dataset: huggingface.co/datasets/mic...

17.11.2024 06:39 👍 8 🔁 5 💬 0 📌 0

Russel Van Tuyl

Latest posts by Russel Van Tuyl @russelvantuyl