1 code implementation • 19 Apr 2024 • Manish Bhatt, Sahana Chennabasappa, Yue Li, Cyrus Nikolaidis, Daniel Song, Shengye Wan, Faizan Ahmad, Cornelius Aschermann, Yaohui Chen, Dhaval Kapil, David Molnar, Spencer Whitman, Joshua Saxe
We present BenchmarkName, a novel benchmark to quantify LLM security risks and capabilities.
no code implementations • 7 Dec 2023 • Manish Bhatt, Sahana Chennabasappa, Cyrus Nikolaidis, Shengye Wan, Ivan Evtimov, Dominik Gabi, Daniel Song, Faizan Ahmad, Cornelius Aschermann, Lorenzo Fontana, Sasha Frolov, Ravi Prakash Giri, Dhaval Kapil, Yiannis Kozyrakis, David LeBlanc, James Milazzo, Aleksandar Straumann, Gabriel Synnaeve, Varun Vontimitta, Spencer Whitman, Joshua Saxe
This paper presents CyberSecEval, a comprehensive benchmark developed to help bolster the cybersecurity of Large Language Models (LLMs) employed as coding assistants.
no code implementations • 3 Jun 2019 • Ben Gelman, Bryan Hoyle, Jessica Moore, Joshua Saxe, David Slater
We use Stack Overflow code snippets and their tags to train a language-agnostic, deep convolutional neural network to automatically predict semantic labels for source code documents.
no code implementations • 13 Apr 2018 • Joshua Saxe, Richard Harang, Cody Wild, Hillary Sanders
Malicious web content is a serious problem on the Internet today.
2 code implementations • 27 Feb 2017 • Joshua Saxe, Konstantin Berlin
For years, security machine learning research has promised to obviate the need for signature-based detection by automatically learning to detect indicators of attack.
4 code implementations • 13 Aug 2015 • Joshua Saxe, Konstantin Berlin
Further, we confirm our false positive rates directly on a live stream of files coming in from Invincea's deployed endpoint solution, provide an estimate of how many new binary files we expected to see a day on an enterprise network, and describe how that relates to the false positive rate and translates into an intuitive threat score.
Cryptography and Security