
Abstract

Mixture of experts (MoE) layers allow model parameters to grow without a corresponding increase in computational cost by using sparse, dynamic computation across "expert" modules during both inference and training. In this work we study whether these sparse activations of expert modules are semantically meaningful in classification tasks; in particular, we investigate whether experts develop specializations that reveal semantic relationships among classes. We replace the classification heads of selected deep networks with MoE layers and find that the experts specialize in ways that are qualitatively intuitive, and that quantitatively match structural descriptions of the relationships among classes better than the classification heads of the original networks.
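The dissertation itself is not reproduced in this record, but the setup the abstract describes, replacing a dense classification head with a sparsely gated MoE head, might look roughly like the sketch below. The class name `MoEClassificationHead`, the number of experts, and the top-2 routing are illustrative assumptions, not the thesis's actual architecture.

```python
# A minimal sketch (not the thesis's implementation) of a sparse
# mixture-of-experts classification head with top-k gating, in PyTorch.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MoEClassificationHead(nn.Module):
    """Replaces a dense linear classifier with sparsely gated experts."""

    def __init__(self, in_features, num_classes, num_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        # Each expert is a small classifier over the backbone features.
        self.experts = nn.ModuleList(
            [nn.Linear(in_features, num_classes) for _ in range(num_experts)]
        )
        # The gate scores every expert for each input example.
        self.gate = nn.Linear(in_features, num_experts)

    def forward(self, features):
        # features: (batch, in_features) from the backbone network.
        gate_logits = self.gate(features)                    # (batch, num_experts)
        topk_vals, topk_idx = gate_logits.topk(self.top_k, dim=-1)
        # Softmax only over the selected experts -> sparse routing weights.
        topk_weights = F.softmax(topk_vals, dim=-1)          # (batch, top_k)

        # Run every expert (simple but dense), then weight and sum only the
        # top-k outputs per example; a production router would dispatch
        # inputs to experts instead of computing all of them.
        expert_out = torch.stack(
            [e(features) for e in self.experts], dim=1
        )                                                     # (batch, E, classes)
        selected = torch.gather(
            expert_out, 1,
            topk_idx.unsqueeze(-1).expand(-1, -1, expert_out.size(-1)),
        )                                                     # (batch, top_k, classes)
        logits = (topk_weights.unsqueeze(-1) * selected).sum(dim=1)
        # The chosen expert indices are what one would inspect when asking
        # whether experts specialize on semantically related classes.
        return logits, topk_idx


# Usage: attach the head to backbone features, e.g. 512-d ResNet embeddings.
if __name__ == "__main__":
    head = MoEClassificationHead(in_features=512, num_classes=100)
    feats = torch.randn(4, 512)
    logits, chosen_experts = head(feats)
    print(logits.shape, chosen_experts.shape)  # [4, 100] and [4, 2]
```

Returning the per-example expert indices alongside the logits makes it easy to tally which classes each expert handles, which is the kind of evidence the abstract's specialization analysis would rely on.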

Details

Title
Classification with Mixture of Experts Models
Author
Mooney, James Thomas
Publication year
2022
Publisher
ProQuest Dissertations & Theses
ISBN
9798368480794
Source type
Dissertation or Thesis
Language of publication
English
ProQuest document ID
2774407226
Copyright
Database copyright ProQuest LLC; ProQuest does not claim copyright in the individual underlying works.