ContextCite: Attributing Model Generation to Context
Benjamin Cohen-Wang*, Harshay Shah*, Kristian Georgiev*, Aleksander Mądry
Neural Information Processing Systems
(NeurIPS), 2024
+ Workshop on Next Generation AI Safety
(ICML NextGenAISafety), 2024
How do language models actually use information provided as context when generating a response? Can we infer whether a particular generated statement is actually grounded in the context, a misinterpretation, or fabricated? To help answer these questions, we introduce the problem of context attribution: pinpointing the parts of the context (if any) that led a model to generate a particular statement. We then present ContextCite, a simple and scalable method for context attribution that can be applied on top of any existing language model. Finally, we showcase the utility of ContextCite through two case studies: (1) automatically verifying statements based on the attributed parts of the context and (2) improving response quality by extracting query-relevant information from the context.
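To make the mechanics concrete, here is a minimal sketch of attributing a response via context ablations and a sparse linear surrogate; this is an illustration, not the ContextCite implementation. The callable score_response is a hypothetical stand-in that returns the model's score (e.g., log-probability) for the fixed response when only the unmasked context sources are included.

import numpy as np
from sklearn.linear_model import Lasso

def attribute_context(sources, score_response, n_ablations=64, seed=0):
    rng = np.random.default_rng(seed)
    # Randomly include (1) or exclude (0) each context source in every ablation.
    masks = rng.integers(0, 2, size=(n_ablations, len(sources)))
    scores = np.array([score_response(sources, mask) for mask in masks])
    # Fit a sparse linear surrogate: which sources drive the response score?
    surrogate = Lasso(alpha=0.01).fit(masks, scores)
    return surrogate.coef_  # one attribution score per context source

# Toy usage with a stand-in scorer (a real use would query a language model):
sources = ["sentence one.", "sentence two.", "sentence three."]
toy_score = lambda srcs, mask: 2.0 * mask[1] + 0.1 * mask[0]
print(attribute_context(sources, toy_score))  # source 1 gets most of the credit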
@article{cohen2024contextcite,
title={ContextCite: Attributing Model Generation to Context},
author={Cohen-Wang, Benjamin and Shah, Harshay and Georgiev, Kristian and Madry, Aleksander},
journal={arXiv preprint arXiv:2409.00729},
year={2024}
}
Decomposing and Editing Predictions by Modeling Model Computation
Harshay Shah, Andrew Ilyas, Aleksander Mądry
International Conference on Machine Learning
(ICML), 2024
+ Workshop on Foundation Model Interventions
(NeurIPS MINT), 2024
How does the internal computation of a machine learning model transform inputs into predictions? In this paper, we introduce a task called component modeling that aims to address this question. The goal of component modeling is to decompose an ML model's prediction in terms of its components: simple functions (e.g., convolution filters, attention heads) that are the "building blocks" of model computation. We focus on a special case of this task, component attribution, where the goal is to estimate the counterfactual impact of individual components on a given prediction. We then present COAR, a scalable algorithm for estimating component attributions; we demonstrate its effectiveness across models, datasets, and modalities. Finally, we show that component attributions estimated with COAR directly enable model editing across five tasks, namely: fixing model errors, "forgetting" specific classes, boosting subpopulation robustness, localizing backdoor attacks, and improving robustness to typographic attacks. We provide code for COAR at github.com/MadryLab/modelcomponents.
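As a rough illustration of the ablate-and-regress recipe (a simplification, not the COAR algorithm itself), the sketch below estimates per-component attributions for a single example. The callable eval_example is a hypothetical stand-in that returns the model's output on that example (e.g., the correct-class margin) when the masked components are ablated.

import numpy as np

def component_attributions(n_components, eval_example, n_samples=200,
                           ablate_frac=0.05, seed=0):
    rng = np.random.default_rng(seed)
    # Each row marks which components are ablated (1) in that trial.
    masks = (rng.random((n_samples, n_components)) < ablate_frac).astype(float)
    outputs = np.array([eval_example(mask) for mask in masks])
    # Regress outputs on ablation masks; the coefficients estimate each
    # component's counterfactual impact on this example's prediction.
    design = np.hstack([masks, np.ones((n_samples, 1))])  # add an intercept column
    coef, *_ = np.linalg.lstsq(design, outputs, rcond=None)
    return coef[:-1]

# Toy usage: a fake "model" whose output depends only on components 3 and 7.
toy_eval = lambda mask: 1.0 - 0.8 * mask[3] - 0.5 * mask[7]
print(component_attributions(10, toy_eval).round(2))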
@article{shah2024decomposing,
title={Decomposing and Editing Predictions by Modeling Model Computation},
author={Shah, Harshay and Ilyas, Andrew and Madry, Aleksander},
journal={arXiv preprint arXiv:2404.11534},
year={2024}
}
Parameters vs FLOPs: Scaling Laws for Optimal Sparsity for Mixture-of-Experts Language Models
Samira Abnar*, Harshay Shah*, Dan Busbridge, Alaa El-Nouby, Josh Susskind, Vimal Thilak*
Workshop on Attributing Model Behavior at Scale
(NeurIPS Attrib), 2024
Scaling the capacity of language models has consistently proven to be a reliable approach for improving performance and unlocking new capabilities. Capacity is primarily defined along two dimensions: the number of model parameters and the compute per example. While scaling typically involves increasing both, the precise interplay between these factors and their combined contribution to overall capacity is not yet fully understood. We explore this relationship in the context of sparse Mixture-of-Experts models (MoEs), which allow scaling the number of parameters without proportionally increasing the FLOPs per example. We investigate how varying the sparsity level, i.e., the ratio of non-active to total parameters, affects both pretraining and downstream performance. We find that under different constraints (e.g., parameter size and total training compute), there is an optimal level of sparsity that improves both training efficiency and model performance. These results provide a better understanding of the impact of sparsity in scaling laws for MoEs and complement existing work in this area, offering insights for designing more efficient architectures.
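To make the sparsity notion concrete, here is a small worked example computing the ratio of non-active to total parameters for a single MoE layer with equal-sized experts and top-k routing; the numbers are made up for illustration and are not a configuration from the paper.

def moe_sparsity(total_experts, active_experts, params_per_expert, shared_params=0):
    # Sparsity = fraction of the layer's parameters NOT used for a given token.
    total = shared_params + total_experts * params_per_expert
    active = shared_params + active_experts * params_per_expert
    return 1.0 - active / total

# e.g., 64 equal-sized experts with top-2 routing and no shared parameters:
print(moe_sparsity(total_experts=64, active_experts=2, params_per_expert=10_000_000))
# 0.96875 -> roughly 97% of the layer's parameters are inactive per token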
@misc{abnar2025parameters,
title={Parameters vs FLOPs: Scaling Laws for Optimal Sparsity for Mixture-of-Experts Language Models},
author={Samira Abnar and Harshay Shah and Dan Busbridge and Alaaeldin Mohamed Elnouby Ali and Josh Susskind and Vimal Thilak},
year={2025},
eprint={2501.12370},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
ModelDiff: A Framework for Comparing Learning Algorithms
Harshay Shah*, Sung Min Park*, Andrew Ilyas*, Aleksander Mądry
International Conference on Machine Learning
(ICML), 2023
+ Workshop on Spurious Correlations, Invariance, and Stability
(ICML SCIS), 2023
We study the problem of (learning) algorithm comparison, where the goal is to find differences between models trained with two different learning algorithms. We begin by formalizing this goal as one of finding distinguishing feature transformations, i.e., input transformations that change the predictions of models trained with one learning algorithm but not the other. We then present ModelDiff, a method that leverages the datamodels framework (Ilyas et al., 2022) to compare learning algorithms based on how they use their training data. We demonstrate ModelDiff through three case studies, comparing models trained with/without data augmentation, with/without pre-training, and with different SGD hyperparameters. Our code is available at github.com/MadryLab/modeldiff.
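In the spirit of the datamodel-based comparison described above (an illustrative sketch, not the ModelDiff procedure), one can look for training-data directions that one algorithm's datamodels use but the other's do not. Here A and B are assumed to be per-example datamodel weight matrices of shape (n_test_examples, n_train_examples) for the two learning algorithms.

import numpy as np

def distinguishing_directions(A, B, k=3):
    # Orthonormal basis for the span of algorithm A's datamodel rows.
    Q, _ = np.linalg.qr(A.T)
    # Residual: the part of B's datamodels not explained by that span,
    # i.e., training data that B "uses" but A does not.
    residual = B - (B @ Q) @ Q.T
    # Top principal directions of the residual over the training set are
    # candidate distinguishing directions.
    _, _, Vt = np.linalg.svd(residual, full_matrices=False)
    return Vt[:k]

# Toy usage with random matrices (100 test examples, 500 training examples):
rng = np.random.default_rng(0)
A, B = rng.normal(size=(100, 500)), rng.normal(size=(100, 500))
print(distinguishing_directions(A, B, k=2).shape)  # (2, 500)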
@inproceedings{shah2023modeldiff,
title={ModelDiff: A Framework for Comparing Learning Algorithms},
author={Shah, Harshay and Park, Sung Min and Ilyas, Andrew and Madry, Aleksander},
booktitle={International Conference on Machine Learning},
pages={30646--30688},
year={2023},
organization={PMLR}
}
The Pitfalls of Simplicity Bias in Neural Networks
Harshay Shah, Kaustav Tamuly, Aditi Raghunathan, Prateek Jain, Praneeth Netrapalli
Neural Information Processing Systems
(NeurIPS), 2020
+ Workshop on Uncertainty and Robustness in Deep Learning
(ICML UDL), 2020
Several works have proposed Simplicity Bias (SB)—the tendency of standard training procedures such as Stochastic Gradient Descent (SGD) to find simple models—to justify why neural networks generalize well [Arpit et al. 2017, Nakkiran et al. 2019, Soudry et al. 2018]. However, the precise notion of simplicity remains vague. Furthermore, previous settings that use SB to theoretically justify why neural networks generalize well do not simultaneously capture the non-robustness of neural networks—a widely observed phenomenon in practice [Goodfellow et al. 2014, Jo and Bengio 2017]. We attempt to reconcile SB and the superior standard generalization of neural networks with the non-robustness observed in practice by designing datasets that (a) incorporate a precise notion of simplicity, (b) comprise multiple predictive features with varying levels of simplicity, and (c) capture the non-robustness of neural networks trained on real data. Through theory and empirics on these datasets, we make four observations: (i) SB of SGD and variants can be extreme: neural networks can exclusively rely on the simplest feature and remain invariant to all predictive complex features. (ii) The extreme aspect of SB could explain why seemingly benign distribution shifts and small adversarial perturbations significantly degrade model performance. (iii) Contrary to conventional wisdom, SB can also hurt generalization on the same data distribution, as SB persists even when the simplest feature has less predictive power than the more complex features. (iv) Common approaches to improve generalization and robustness—ensembles and adversarial training—can fail in mitigating SB and its pitfalls. Given the role of SB in training neural networks, we hope that the proposed datasets and methods serve as an effective testbed to evaluate novel algorithmic approaches aimed at avoiding the pitfalls of SB; code and data available at github.com/harshays/simplicitybiaspitfalls.
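Here is a toy construction in the spirit of the datasets described above, though not the paper's exact setup: both coordinates predict the label, but the first is linearly separable while the second requires a multi-interval decision rule, so one can check whether a trained network leans exclusively on the simple coordinate.

import numpy as np

def simple_plus_complex(n=1000, seed=0):
    rng = np.random.default_rng(seed)
    y = rng.integers(0, 2, size=n) * 2 - 1           # labels in {-1, +1}
    simple = y * (0.5 + rng.random(n))               # sign alone reveals the label
    # "Complex" coordinate: the label is encoded by which of several interleaved
    # slabs the value falls into, so no single threshold recovers it.
    slab = rng.integers(0, 3, size=n)
    complex_ = 2.0 * slab + (y + 1) / 2 + 0.1 * rng.random(n)
    X = np.stack([simple, complex_], axis=1)
    return X, y

X, y = simple_plus_complex()
print(X.shape, y[:5])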
@article{shah2020pitfalls,
title={The Pitfalls of Simplicity Bias in Neural Networks},
author={Shah, Harshay and Tamuly, Kaustav and Raghunathan, Aditi and Jain, Prateek and Netrapalli, Praneeth},
journal={Advances in Neural Information Processing Systems},
volume={33},
year={2020}
}
Growing Attributed Networks through Local Processes
Harshay Shah, Suhansanu Kumar, Hari Sundaram
World Wide Web Conference
(WWW), 2019
This paper proposes an attributed network growth model. Despite the knowledge that individuals use limited resources to form connections to similar others, we lack an understanding of how local and resource-constrained mechanisms explain the emergence of rich structural properties found in real-world networks. We make three contributions. First, we propose a parsimonious and accurate model of attributed network growth that jointly explains the emergence of in-degree distributions, local clustering, clustering-degree relationship and attribute mixing patterns. Second, our model is based on biased random walks and uses local processes to form edges without recourse to global network information. Third, we account for multiple sociological phenomena: bounded rationality, structural constraints, triadic closure, attribute homophily, and preferential attachment. Our experiments indicate that the proposed Attributed Random Walk (ARW) model accurately preserves network structure and attribute mixing patterns of six real-world networks; it improves upon the performance of eight state-of-the-art models by a statistically significant margin of 2.5-10x.
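For intuition only, below is a heavily simplified, undirected toy version of attribute-biased random-walk growth; the parameters and mechanics are invented for illustration and differ from the ARW model in the paper. Each newcomer starts a short walk from a random existing node, steps preferentially to neighbors sharing its attribute, and links to some of the nodes it visits.

import random

def grow_network(n_nodes, p_same_attr=0.7, p_link=0.5, walk_len=4, seed=0):
    random.seed(seed)
    attrs = {0: 0, 1: 1}                    # binary node attributes
    edges = {0: {1}, 1: {0}}                # connected two-node seed graph
    for v in range(2, n_nodes):
        attrs[v] = random.randint(0, 1)
        cur = random.choice(list(edges))    # start the walk at a random existing node
        edges[v] = set()
        for _ in range(walk_len):
            if random.random() < p_link:    # link to nodes encountered on the walk
                edges[v].add(cur)
                edges[cur].add(v)
            candidates = list(edges[cur] - {v})
            if not candidates:
                break
            # Biased step: prefer neighbors sharing the newcomer's attribute.
            same = [u for u in candidates if attrs[u] == attrs[v]]
            pool = same if same and random.random() < p_same_attr else candidates
            cur = random.choice(pool)
        if not edges[v]:                    # guarantee the newcomer attaches somewhere
            edges[v].add(cur)
            edges[cur].add(v)
    return edges, attrs

edges, attrs = grow_network(200)
print(sum(len(nbrs) for nbrs in edges.values()) // 2, "edges among", len(edges), "nodes")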
@inproceedings{shah2019growing,
title={Growing Attributed Networks through Local Processes},
author={Shah, Harshay and Kumar, Suhansanu and Sundaram, Hari},
booktitle={The World Wide Web Conference},
pages={3208--3214},
year={2019},
organization={ACM}
}