Risk bounds for transferring representations with and without fine-tuning

Daniel McNamara*, Maria-Florina Balcan

*Corresponding author for this work

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

    7 Citations (Scopus)

    Abstract

    A popular machine learning strategy is the transfer of a representation (i.e. a feature extraction function) learned on a source task to a target task. Examples include the re-use of neural network weights or word embeddings. We develop sufficient conditions for the success of this approach. If the representation learned from the source task is fixed, we identify conditions on how the tasks relate to obtain an upper bound on target task risk via a VC dimension-based argument. We then consider using the representation from the source task to construct a prior, which is fine-tuned using target task data. We give a PAC-Bayes target task risk bound in this setting under suitable conditions. We show examples of our bounds using feedforward neural networks. Our results motivate a practical approach to weight transfer, which we validate with experiments.
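
    To make the two settings in the abstract concrete, here is a minimal PyTorch sketch (not the authors' code; all module names, layer sizes, and hyperparameters are illustrative). It transfers a feedforward representation learned on a source task to a target task, first with the representation held fixed, then as an initialisation that is fine-tuned on target task data, which is the role the paper's PAC-Bayes analysis assigns to the transferred weights.

```python
import torch
import torch.nn as nn

# Feedforward network split into a representation (feature extractor)
# and a task-specific head, as in the paper's neural network examples.
def make_model(d_in=100, d_hidden=50, d_out=2):
    representation = nn.Sequential(nn.Linear(d_in, d_hidden), nn.ReLU())
    head = nn.Linear(d_hidden, d_out)
    return representation, head

# Pretend these weights were learned on the source task.
source_rep, _ = make_model()

# The target model re-uses the source representation's weights.
target_rep, target_head = make_model()
target_rep.load_state_dict(source_rep.state_dict())

# Setting 1: fixed representation -- freeze the transferred weights and
# train only the head on target task data.
for p in target_rep.parameters():
    p.requires_grad = False
fixed_params = list(target_head.parameters())

# Setting 2: fine-tuning -- the transferred weights serve as a starting
# point (informally, a prior) and all parameters are trained on the target task.
for p in target_rep.parameters():
    p.requires_grad = True
finetune_params = list(target_rep.parameters()) + list(target_head.parameters())

optimizer = torch.optim.SGD(finetune_params, lr=1e-2)
```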

    Original language: English
    Title of host publication: 34th International Conference on Machine Learning, ICML 2017
    Publisher: International Machine Learning Society (IMLS)
    Pages: 3676-3684
    Number of pages: 9
    ISBN (Electronic): 9781510855144
    Publication status: Published - 2017
    Event: 34th International Conference on Machine Learning, ICML 2017 - Sydney, Australia
    Duration: 6 Aug 2017 - 11 Aug 2017

    Publication series

    Name: 34th International Conference on Machine Learning, ICML 2017
    Volume: 5

    Conference

    Conference: 34th International Conference on Machine Learning, ICML 2017
    Country/Territory: Australia
    City: Sydney
    Period: 6/08/17 - 11/08/17
