Christopher Ré
Associate Professor
The advent of large language models (LLMs) and their adoption by the leg...
The quality of training data impacts the performance of pre-trained larg...
Recent work has shown that language models' (LMs) prompt-based learning...
We introduce a new class of objectives for optimal transport computation...
Large Language Models (LLMs), despite their recent impressive accomplish...
A major barrier to deploying healthcare AI models is their trustworthine...
Large language models (LLMs) exhibit in-context learning abilities which...
A long-standing goal of the data management community is to develop gene...
Time series modeling is a well-established problem, which often requires...
The high computational and memory requirements of large language model (...
Text-conditional diffusion models generate high-quality, diverse images....
Recent advances in deep learning have relied heavily on the use of large...
State space models (SSMs) have high performance on long sequence modelin...
State space models (SSMs) have demonstrated state-of-the-art sequence mo...
Spectral analysis provides one of the most effective paradigms for infor...
Language models (LMs) are becoming the foundation for almost all major l...
Visual data such as images and videos are typically modeled as discretiz...
Large language models (LLMs) transfer well to new tasks out-of-the-box s...
Commercial ML APIs offered by providers such as Google, Amazon and Micro...
Can foundation models be guided to execute tasks involving legal reasoni...
While large pretrained foundation models (FMs) have shown remarkable zer...
Linear time-invariant state space models (SSM) are a classical model fro...
State space models (SSM) have recently been shown to be very effective a...
Domain generalization in medical image classification is an important pr...
Communication compression is a crucial technique for modern distributed ...
Training foundation models, such as GPT-3 and PaLM, can be extremely exp...
Deep learning (DL) methods find increasing application in mental state d...
Transformers are slow and memory-hungry on long sequences, since the tim...
A key promise of machine learning is the ability to assist users with pe...
Foundation Models (FMs) are models trained on large corpora of data that...
Entity retrieval–retrieving information about entity mentions in a query...
An ideal learned representation should display transferability and robus...
Machine learning models that achieve high overall accuracy often make sy...
Foundation models offer an exciting new paradigm for constructing models...
Users and organizations are generating ever-increasing amounts of privat...
Magnetic resonance imaging (MRI) is a cornerstone of modern medical imag...
Spurious correlations pose a major challenge for robust machine learning...
Developing architectures suitable for modeling raw audio is a challengin...
While neural networks have shown remarkable success on classification ta...
Overparameterized neural networks generalize well but are expensive to t...
A central goal of sequence modeling is designing a single principled mod...
Recent advances in efficient Transformers have exploited either the spar...
Recurrent neural networks (RNNs), temporal convolutions, and neural diff...
Language models (LMs) have made remarkable progress, but still struggle ...
Named entity disambiguation (NED), which involves mapping textual mentio...
In cognitive decoding, researchers aim to characterize a brain region's ...
In recent years, machine learning (ML) has moved from an academic endeav...
Machine learning models are often deployed in different settings than th...
This paper studies Principal Component Analysis (PCA) for data lying in ...
Structured data, or data that adheres to a pre-defined schema, can suffe...