<!DOCTYPE html>

<html>

  <head>

    <meta http-equiv="content-type" content="text/html; charset=UTF-8">

  </head>

  <body>

    Speaker:  Özgür Martin - Mimar Sinan Güzel Sanatlar Üniversitesi<br>

    Date: 04.12.2024<br>

    Time: 15:00 - 16:00<br>

    Location: Galatasaray Üniversitesi, Ortaköy, Çırağan Cd. No:36,

    34349 Beşiktaş, H 306<br>

    <br>

    <br>

    <div class="nBzcnc OcVpRe"

style="width: 432.485px; position: relative; display: flex; min-height: 32px; padding-right: 16px; box-sizing: border-box; -webkit-box-align: center; align-items: center; outline: none;">

      <div class="toUqff vfzv" id="xDetDlgDesc"

style="color: rgb(60, 64, 67); font-size: 14px; font-weight: 400; line-height: 20px; padding-bottom: 6px; padding-top: 6px; -webkit-box-flex: 1; flex-grow: 1; display: block; -webkit-box-align: center; align-items: center; flex-wrap: wrap; overflow: hidden auto; white-space: pre-wrap; overflow-wrap: break-word; font-family: Roboto, Arial, sans-serif; letter-spacing: 0.2px;">Title: How to train your large AI model at a lower cost?

Abstract: The stochastic gradient descent (SGD) method and its variants are the core optimization algorithms used for training large-scale machine learning models. These algorithms achieve very good convergence rates, especially when they are fine-tuned for the application at hand. Unfortunately, this tuning process can incur large computational costs. For example, GPT-4 (the core machinery of ChatGPT) was trained using trillions of words of text and many thousands of powerful computer chips. The process cost over $100 million [1].

Recent work has shown that these costs can be reduced by choosing the learning rate adaptively. We propose an alternative approach to this problem: a new algorithm, built upon SGD, that is based on forward step model building [2].

[1]

<a

href="https://www.wired.com/story/openai-ceo-sam-altman-the-age-of-giant-ai-models-is-already-over/"

      target="_blank"

      style="text-decoration: underline; color: rgb(26, 115, 232);">https://www.wired.com/story/openai-ceo-sam-altman-the-age-of-giant-ai-models-is-already-over/</a>

[2] Birbil, Ş.İ., Martin, Ö., Onay, G., Öztoprak, F., Bolstering stochastic gradient descent with model building. TOP 32, 517–536 (2024).

<a

href="https://doi.org/10.1007/s11750-024-00673-z"

      target="_blank"

      style="text-decoration: underline; color: rgb(26, 115, 232);">https://doi.org/10.1007/s11750-024-00673-z</a></div>

    </div>

    <br>

    * To access the complete seminar calendar, please visit this link:<br>

    <br>

    <a class="moz-txt-link-freetext" href="https://matematik.gsu.edu.tr/tr/arastirma/seminerler">https://matematik.gsu.edu.tr/tr/arastirma/seminerler</a><br>

    <br>

    * To add the seminars to your calendar, use the iCal URL:<br>

    <br>

<a class="moz-txt-link-freetext" href="https://calendar.google.com/calendar/ical/mathseminar%40galatasaray.education/public/basic.ics">https://calendar.google.com/calendar/ical/mathseminar%40galatasaray.education/public/basic.ics</a><br>

    <br>

    * Participants from outside Galatasaray Üniversitesi are kindly<br>

    requested to send an email to <a class="moz-txt-link-abbreviated" href="mailto:mathseminar@galatasaray.education">mathseminar@galatasaray.education</a><br>

    before 13:00 on the day of the seminar.<br>

    <br>

    ---<br>

    Galatasaray Üniversitesi Matematik Bölümü<br>

    <p><br>

    </p>

    <p><a class="moz-txt-link-freetext" href="https://matematik.gsu.edu.tr/">https://matematik.gsu.edu.tr/</a></p>



  </body>

</html>