Large Language Model
803 papers with code • 0 benchmarks • 2 datasets
Benchmarks
These leaderboards are used to track progress in Large Language Models.
Libraries
Use these libraries to find Large Language Model models and implementations.
Most implemented papers
Generative Agents: Interactive Simulacra of Human Behavior
Believable proxies of human behavior can empower interactive applications ranging from immersive environments to rehearsal spaces for interpersonal communication to prototyping tools.
CodeGen: An Open Large Language Model for Code with Multi-Turn Program Synthesis
To democratize this, we train and release a family of large language models up to 16.1B parameters, called CODEGEN, on natural language and programming language data, and open-source the training library JAXFORMER.
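The released checkpoints load through the standard Hugging Face transformers API; below is a minimal sampling sketch. The small 350M mono checkpoint is used to keep the example light, and greedy decoding is chosen only to make it deterministic.

```python
# Minimal sketch: code completion with a released CodeGen checkpoint
# via Hugging Face transformers.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Salesforce/codegen-350M-mono")
model = AutoModelForCausalLM.from_pretrained("Salesforce/codegen-350M-mono")

prompt = "# a function that checks if a number is prime\ndef is_prime(n):"
inputs = tokenizer(prompt, return_tensors="pt")
# Greedy decoding keeps the sketch deterministic; sampling also works.
out = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```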
Baize: An Open-Source Chat Model with Parameter-Efficient Tuning on Self-Chat Data
Furthermore, we propose a new technique, Self-Distill with Feedback, which further improves the performance of the Baize models using feedback from ChatGPT.
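Baize's parameter-efficient tuning is LoRA-based; the sketch below shows the general recipe with the peft library. The base model name and LoRA hyperparameters are illustrative assumptions, not the paper's exact configuration.

```python
# Hedged sketch: LoRA-style parameter-efficient tuning with peft,
# in the spirit of Baize's self-chat fine-tuning.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("huggyllama/llama-7b")  # assumed base model
config = LoraConfig(
    r=8, lora_alpha=16, lora_dropout=0.05,      # illustrative hyperparameters
    target_modules=["q_proj", "v_proj"],        # adapt attention projections only
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, config)
model.print_trainable_parameters()  # only the low-rank adapters are trainable
```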
MiniGPT-4: Enhancing Vision-Language Understanding with Advanced Large Language Models
Our work, for the first time, uncovers that properly aligning the visual features with an advanced large language model can endow the model with numerous advanced multi-modal abilities demonstrated by GPT-4, such as detailed image description generation and website creation from hand-drawn drafts.
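The alignment itself is lightweight: MiniGPT-4 trains a single projection layer that maps frozen visual features into the language model's embedding space, so image "tokens" can be prepended to the text prompt. A minimal sketch, with illustrative dimensions:

```python
# Minimal sketch of the alignment idea: one linear layer maps frozen
# vision-encoder features into the LLM's token-embedding space.
import torch
import torch.nn as nn

vision_dim, llm_dim = 1408, 4096  # assumed: ViT feature size -> LLM hidden size
proj = nn.Linear(vision_dim, llm_dim)  # the only trained component

visual_feats = torch.randn(1, 32, vision_dim)   # stand-in for frozen vision encoder output
image_embeds = proj(visual_feats)               # now in LLM embedding space
text_embeds = torch.randn(1, 16, llm_dim)       # stand-in for embedded text prompt
llm_inputs = torch.cat([image_embeds, text_embeds], dim=1)  # fed to the frozen LLM
```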
Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena
Evaluating large language model (LLM) based chat assistants is challenging due to their broad capabilities and the inadequacy of existing benchmarks in measuring human preferences.
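The core pairwise-judging idea is simple: present a question and two candidate answers to a strong model and ask for a verdict. The template and helper below are illustrative sketches, not MT-Bench's exact prompt; in practice each pair is judged in both orders to control for position bias.

```python
# Hedged sketch of pairwise LLM-as-a-judge prompting (illustrative template).
JUDGE_TEMPLATE = """[Question]
{question}

[Assistant A's answer]
{answer_a}

[Assistant B's answer]
{answer_b}

Which answer is better? Reply with "A", "B", or "tie", then a brief reason."""

def build_judge_prompt(question: str, answer_a: str, answer_b: str) -> str:
    """Fill the template; call twice with answers swapped to check position bias."""
    return JUDGE_TEMPLATE.format(
        question=question, answer_a=answer_a, answer_b=answer_b
    )
```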
Point-Bind & Point-LLM: Aligning Point Cloud with Multi-modality for 3D Understanding, Generation, and Instruction Following
We introduce Point-Bind, a 3D multi-modality model aligning point clouds with 2D images, language, audio, and video.
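The usual recipe for this kind of alignment is contrastive: train the point-cloud encoder so its embeddings match those of a frozen multi-modal encoder on paired data. A minimal InfoNCE-style sketch, where the encoders and dimensions are placeholders rather than the paper's exact objective:

```python
# Hedged sketch: contrastive alignment of a trainable point-cloud encoder
# to a frozen multi-modal embedding space (InfoNCE-style).
import torch
import torch.nn.functional as F

def alignment_loss(point_emb: torch.Tensor, anchor_emb: torch.Tensor,
                   temperature: float = 0.07) -> torch.Tensor:
    """point_emb: (B, D) from the trainable point-cloud encoder;
    anchor_emb: (B, D) from a frozen image/text/audio encoder, paired by row."""
    point_emb = F.normalize(point_emb, dim=-1)
    anchor_emb = F.normalize(anchor_emb, dim=-1)
    logits = point_emb @ anchor_emb.t() / temperature  # (B, B) similarity matrix
    targets = torch.arange(logits.size(0))             # matching pairs on the diagonal
    return F.cross_entropy(logits, targets)
```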
Muse: Text-To-Image Generation via Masked Generative Transformers
Compared to pixel-space diffusion models, such as Imagen and DALL-E 2, Muse is significantly more efficient due to the use of discrete tokens and requiring fewer sampling iterations; compared to autoregressive models, such as Parti, Muse is more efficient due to the use of parallel decoding.
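Parallel decoding here is MaskGIT-style: each step predicts every masked token at once, keeps the most confident predictions, and re-masks the rest on a shrinking schedule. A minimal sketch with a placeholder model and an assumed cosine schedule:

```python
# Hedged sketch of MaskGIT-style parallel decoding over discrete tokens.
# `model` is a placeholder returning (1, seq_len, vocab_size) logits.
import math
import torch

def parallel_decode(model, seq_len: int, mask_id: int, steps: int = 12):
    tokens = torch.full((1, seq_len), mask_id)
    for t in range(steps):
        logits = model(tokens)
        conf, pred = logits.softmax(-1).max(-1)     # confidence and argmax per position
        # Already-decoded positions get confidence 1 so they are never re-masked.
        conf = torch.where(tokens == mask_id, conf, torch.ones_like(conf))
        # Cosine schedule: how many positions stay masked after this step.
        n_masked = math.floor(seq_len * math.cos(math.pi / 2 * (t + 1) / steps))
        tokens = torch.where(tokens == mask_id, pred, tokens)  # fill all masks at once
        if n_masked > 0:
            idx = conf.topk(n_masked, largest=False).indices   # least confident positions
            tokens[0, idx[0]] = mask_id                        # re-mask them
    return tokens
```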
DeepTextMark: A Deep Learning-Driven Text Watermarking Approach for Identifying Large Language Model Generated Text
As elaborated in the paper, these attributes are crucial for universal text-source detection, with particular emphasis here on text produced by LLMs.
How is ChatGPT's behavior changing over time?
We find that the performance and behavior of both GPT-3.5 and GPT-4 can vary greatly over time.
Efficient Memory Management for Large Language Model Serving with PagedAttention
On top of PagedAttention, we build vLLM, an LLM serving system that achieves (1) near-zero waste in KV cache memory and (2) flexible sharing of KV cache within and across requests to further reduce memory usage.
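vLLM is open source and exposes a simple offline API; a minimal usage sketch, with an illustrative model name:

```python
# Minimal usage sketch of vLLM, the serving system built on PagedAttention.
from vllm import LLM, SamplingParams

llm = LLM(model="facebook/opt-125m")  # small model assumed just for the sketch
params = SamplingParams(temperature=0.8, max_tokens=64)
outputs = llm.generate(["Explain paged KV-cache memory in one sentence."], params)
for out in outputs:
    print(out.outputs[0].text)
```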