Director of the Institute of Systematic Problems

Post by tanjimajuha20 »

Experts propose two main options for solving this issue. The first is for private companies to gain access to government data in order to develop large language models. The second is for private companies to share their data.

Alexey Khoroshilov, a leading researcher at the Ivannikov Institute for System Programming of the Russian Academy of Sciences (ISP RAS), was asked by a ComNews correspondent which approach he sees as more pragmatic. He answered that one should not rely on data alone, explaining that neither of these paths will work well, for a number of reasons. "Therefore, the issue should be resolved not at the level of data, but at the level of changing approaches to it," he concluded.

Arutyun Avetisyan, Director of the Institute of Systematic Problems of the Russian Academy of Sciences, agreed that the lack of data for training large language models is a real problem. However, according to him, large language models are being developed in several directions: "One of them allows us to obtain high-quality adjusted data and, with its help, raise the quality of the models themselves. Second, we create small models, from 2 billion to 10 billion parameters, that work in the same way as large models but are more specialized. Accordingly, a window of opportunity arises to train such models both from scratch and by distilling large models, that is, reducing their complexity and size."

Sergey Verentsov, Technical Director of Eora Data Lab LLC, noted that data is no longer readily accessible: "This is due both to government regulation and to the fact that companies are not as open as before. It's not that there isn't enough data, but rather that access to it has become difficult. Companies understand that data is one of their assets. Developers of large and complex models are faced with the task of finding data sources. I believe that the solution to this problem lies in the political and economic sphere."