<!DOCTYPE html>
<html>
  <head>

    <meta http-equiv="content-type" content="text/html; charset=UTF-8">
  </head>
  <body>
    Speaker: Özgür Martin - Mimar Sinan Güzel Sanatlar Üniversitesi<br>
    Date: 04.12.2024<br>
    Time: 15:00 - 16:00<br>
    Location: Galatasaray Üniversitesi, Ortaköy, Çırağan Cd. No:36,
    34349 Beşiktaş, H 306<br>
    <br>
    <br>
    <p><b>Title:</b> How to train your large AI model at a lower cost?</p>
    <p><b>Abstract:</b> The stochastic gradient descent (SGD) method and
      its variants constitute the core optimization algorithms used for
      training large-scale machine learning models. These algorithms
      achieve very good convergence rates, especially when they are
      fine-tuned for the application at hand. Unfortunately, this tuning
      process can incur large computational costs. For example, GPT-4
      (the core machinery of ChatGPT) was trained on trillions of words
      of text using many thousands of powerful computer chips, at a
      reported training cost of over $100 million [1].</p>
    <p>Recent work has shown that these costs can be reduced by choosing
      the learning rate adaptively. We propose an alternative approach
      to this problem: a new algorithm based on forward-step model
      building [2], built on top of SGD.</p>
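    <p>For readers less familiar with the setting, the sketch below shows
      the plain mini-batch SGD update that the talk starts from, in
      Python. It is <i>not</i> the speaker's model-building algorithm
      from [2]; the point where an adaptive, model-building step would
      enter is only marked with a comment, and all names, data, and
      parameter values are illustrative assumptions.</p>
    <pre>
# Minimal sketch of plain mini-batch SGD on a least-squares problem.
# This is NOT the model-building method of [2]; it only shows the
# baseline update w = w - lr * g that such methods build upon.
# The data and the fixed learning rate below are illustrative.
import numpy as np

rng = np.random.default_rng(0)
n, d = 1000, 20
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.1 * rng.normal(size=n)

w = np.zeros(d)
lr = 0.01  # fixed learning rate: the hyperparameter that is costly to tune
for epoch in range(20):
    idx = rng.permutation(n)
    for start in range(0, n, 32):
        batch = idx[start:start + 32]
        # stochastic gradient of the least-squares loss on this mini-batch
        g = X[batch].T @ (X[batch] @ w - y[batch]) / len(batch)
        # Plain SGD takes a fixed step along -g; the forward-step
        # model-building idea discussed in the talk would instead use
        # information at a trial point to set the step adaptively.
        w -= lr * g

print("error:", np.linalg.norm(w - w_true))
    </pre>
    <p>In this sketch the fixed learning rate is exactly the quantity
      whose tuning the abstract describes as expensive; the approach of
      [2] replaces this fixed step with one chosen by forward-step model
      building.</p>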
    <p>[1]
<a class="moz-txt-link-freetext" href="https://www.wired.com/story/openai-ceo-sam-altman-the-age-of-giant-ai-models-is-already-over/">https://www.wired.com/story/openai-ceo-sam-altman-the-age-of-giant-ai-models-is-already-over/</a>
<a class="moz-txt-link-rfc2396E" href="https://www.google.com/url?q=https://www.wired.com/story/openai-ceo-sam-altman-the-age-of-giant-ai-models-is-already-over/&sa=D&source=calendar&ust=1733510532993228&usg=AOvVaw2WtDjtx0KyPQTkx9AwnHX9"><https://www.google.com/url?q=https://www.wired.com/story/openai-ceo-sam-altman-the-age-of-giant-ai-models-is-already-over/&sa=D&source=calendar&ust=1733510532993228&usg=AOvVaw2WtDjtx0KyPQTkx9AwnHX9></a> </p>
    <p>[2] Birbil, Ş.İ., Martin, Ö., Onay, G., Öztoprak, F., Bolstering
      stochastic gradient descent with model building. TOP 32, 517–536
      (2024). <a class="moz-txt-link-freetext" href="https://doi.org/10.1007/s11750-024-00673-z">https://doi.org/10.1007/s11750-024-00673-z</a></p>
    <br>
    * To access the complete seminar calendar, please visit this link:<br>
    <br>
    <a class="moz-txt-link-freetext" href="https://matematik.gsu.edu.tr/tr/arastirma/seminerler">https://matematik.gsu.edu.tr/tr/arastirma/seminerler</a><br>
    <br>
    * To add the seminars to your calendar, use the iCal URL below.<br>
    <br>
<a class="moz-txt-link-freetext" href="https://calendar.google.com/calendar/ical/mathseminar%40galatasaray.education/public/basic.ics">https://calendar.google.com/calendar/ical/mathseminar%40galatasaray.education/public/basic.ics</a><br>
    <br>
    * Participants from outside Galatasaray Üniversitesi are kindly<br>
    requested to send an email to <a class="moz-txt-link-abbreviated" href="mailto:mathseminar@galatasaray.education">mathseminar@galatasaray.education</a><br>
    before 13:00 on the day of the seminar.<br>
    <br>
    ---<br>
    Galatasaray Üniversitesi Matematik Bölümü<br>
    <br>
    <br>
    <a class="moz-txt-link-freetext" href="https://matematik.gsu.edu.tr/">https://matematik.gsu.edu.tr/</a><br>
  </body>
</html>