e.bike.free.fr

The community site for discussing electrically assisted bicycles, copyleft and free of any advertising banner


Announcement

Welcome to e.bike.free.fr, the community forum dedicated to electrically assisted bicycles, free of intrusive advertising. Feel free to share what you know about the models discussed here.

#1 14-02-2025 11:36:39

BrandiEcg5
Member
Registered: 02-02-2025
Posts: 20
Website


Machine-learning models can fail when they try to make predictions for individuals who were underrepresented in the datasets they were trained on.

For instance, a model that predicts the best treatment option for someone with a chronic disease may be trained on a dataset that contains mostly male patients. That model might make incorrect predictions for female patients when deployed in a hospital.

To improve outcomes, engineers can try balancing the training dataset by removing data points until all subgroups are represented equally. While dataset balancing is promising, it often requires removing a large amount of data, which hurts the model's overall performance.
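To make the cost of dataset balancing concrete, here is a minimal sketch that downsamples every subgroup to the size of the smallest one. The function name `balance_by_downsampling` and the 900/100 split are assumptions chosen for illustration; they do not come from the article.

```python
import random
from collections import defaultdict


def balance_by_downsampling(groups, seed=0):
    """Return the indices to keep so each subgroup is as large as the smallest one.

    `groups` holds one subgroup label per training example (hypothetical input).
    """
    by_group = defaultdict(list)
    for idx, label in enumerate(groups):
        by_group[label].append(idx)

    # Every subgroup is cut down to the size of the rarest subgroup.
    target = min(len(indices) for indices in by_group.values())

    rng = random.Random(seed)
    kept = []
    for indices in by_group.values():
        kept.extend(rng.sample(indices, target))
    return sorted(kept)


# Illustrative imbalance: 900 examples from one subgroup, 100 from the other.
groups = ["male"] * 900 + ["female"] * 100
kept = balance_by_downsampling(groups)
removed = len(groups) - len(kept)
print(len(kept), removed)  # 200 800
```

In this toy case, balancing throws away 800 of 1,000 examples, which is the kind of data loss the paragraph above describes.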


MIT researchers developed a new technique that identifies and removes the specific points in a training dataset that contribute most to a model's failures on minority subgroups. By removing far fewer datapoints than other approaches, this technique maintains the overall accuracy of the model while improving its performance on underrepresented groups.


In addition, the technique can identify hidden sources of bias in a training dataset that lacks labels. Unlabeled data are far more prevalent than labeled data for many applications.


This method could also be combined with other approaches to improve the fairness of machine-learning models deployed in high-stakes situations. For example, it might someday help ensure underrepresented patients aren't misdiagnosed because of a biased AI model.


"Many other algorithms that try to address this issue assume each datapoint matters as much as every other datapoint. In this paper, we are showing that assumption is not true. There are specific points in our dataset that are contributing to this bias, and we can find those data points, remove them, and get better performance," says Kimia Hamidieh, an electrical engineering and computer science (EECS) graduate student at MIT and co-lead author of a paper on this technique.


She wrote the paper with co-lead authors Saachi Jain PhD '24 and fellow EECS graduate student Kristian Georgiev; Andrew Ilyas MEng '18, PhD '23, a Stein Fellow at Stanford University; and senior authors Marzyeh Ghassemi, an associate professor in EECS and a member of the Institute of Medical Engineering Sciences and the Laboratory for Information and Decision Systems, and Aleksander Madry, the Cadence Design Systems Professor at MIT. The research will be presented at the Conference on Neural Information Processing Systems.


Removing bad examples


Often, machine-learning models are trained using huge datasets gathered from many sources across the internet. These datasets are far too large to be carefully curated by hand, so they may contain bad examples that hurt model performance.


Researchers also know that some data points affect a model's performance on certain downstream tasks more than others.


The MIT researchers combined these two ideas into an approach that identifies and removes these problematic datapoints. They seek to solve a problem known as worst-group error, which occurs when a model underperforms on minority subgroups in a training dataset.
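As a rough illustration of worst-group error, the sketch below computes per-subgroup accuracy and reports the lowest one. The helper names and the toy labels are assumptions made for this example, not part of the researchers' code.

```python
from collections import defaultdict


def group_accuracies(y_true, y_pred, groups):
    """Accuracy computed separately for each subgroup label."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == pred)
    return {group: correct[group] / total[group] for group in total}


def worst_group_accuracy(y_true, y_pred, groups):
    """Return (subgroup, accuracy) for the subgroup with the lowest accuracy."""
    accuracies = group_accuracies(y_true, y_pred, groups)
    worst = min(accuracies, key=accuracies.get)
    return worst, accuracies[worst]


# Toy data: the model is perfect on majority group "a" and weak on minority group "b".
y_true = [1, 0, 1, 1, 0, 1, 0, 1]
y_pred = [1, 0, 1, 1, 0, 0, 1, 1]
groups = ["a", "a", "a", "a", "a", "b", "b", "b"]

overall = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
worst_group, worst_acc = worst_group_accuracy(y_true, y_pred, groups)
print(overall)                 # 0.75
print(worst_group, worst_acc)  # group "b", accuracy 1/3
```

Overall accuracy looks acceptable here, yet the minority group is right only one time in three; worst-group error is one minus that worst accuracy.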


The researchers' new technique builds on prior work in which they introduced a method, called TRAK, that identifies the most important training examples for a specific model output.


For this new technique, they take incorrect predictions the model made about minority subgroups and use TRAK to identify which training examples contributed the most to each incorrect prediction.


"By aggregating this information across bad test predictions in the right way, we are able to find the specific parts of the training that are driving worst-group accuracy down overall," Ilyas explains.


Then they remove those specific samples and retrain the model on the remaining data.
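The aggregate-then-remove step described above can be sketched as follows. This is only an illustration under stated assumptions: `harm_scores` is a hypothetical matrix of attribution scores (one row per misclassified minority-group test example, one column per training example), the plain column sum is a stand-in for whatever aggregation the paper actually uses, and nothing here calls the real TRAK library.

```python
def select_points_to_remove(harm_scores, k):
    """Pick the k training examples with the largest summed harm score.

    harm_scores[i][j] is a hypothetical score for how much training example j
    contributed to wrong prediction i. Summing over rows is an assumed,
    simplified aggregation.
    """
    n_train = len(harm_scores[0])
    totals = [sum(row[j] for row in harm_scores) for j in range(n_train)]
    ranked = sorted(range(n_train), key=lambda j: totals[j], reverse=True)
    return sorted(ranked[:k]), totals


# Made-up scores for 3 bad test predictions and 5 training examples.
harm_scores = [
    [0.1, 0.9, 0.0, 0.2, 0.8],
    [0.0, 0.7, 0.1, 0.1, 0.9],
    [0.2, 0.8, 0.0, 0.0, 0.6],
]

to_remove, totals = select_points_to_remove(harm_scores, k=2)
removed = set(to_remove)
remaining = [j for j in range(len(totals)) if j not in removed]
print(to_remove)  # [1, 4]
print(remaining)  # [0, 2, 3]
```

The model would then be retrained on the `remaining` indices only. Because just two of five examples are dropped in this toy case, most of the data stays available for training.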


Since having more data usually yields better overall performance, removing just the samples that drive worst-group failures maintains the model's overall accuracy while boosting its performance on minority subgroups.


A more accessible approach


Across three machine-learning datasets, their method outperformed multiple techniques. In one instance, it boosted worst-group accuracy while removing about 20,000 fewer training samples than a conventional data balancing method. Their technique also achieved higher accuracy than methods that require changes to the inner workings of a model.


Because the MIT method involves changing a dataset instead, it would be easier for a practitioner to use and can be applied to many types of models.


It can also be used when bias is unknown because subgroups in a training dataset are not labeled. By identifying datapoints that contribute most to a feature the model is learning, they can understand the variables it is using to make a prediction.


"This is a tool anyone can use when they are training a machine-learning model. They can look at those datapoints and see whether they are aligned with the capability they are trying to teach the model," says Hamidieh.


Using the technique to detect unknown subgroup bias would require intuition about which groups to look for, so the researchers hope to validate it and explore it more fully through future human studies.


They also want to improve the performance and reliability of their technique and ensure the method is accessible and easy to use for practitioners who could someday deploy it in real-world environments.


"When you have tools that let you critically look at the data and figure out which datapoints are going to lead to bias or other undesirable behavior, it gives you a first step toward building models that are going to be more fair and more reliable," Ilyas says.


This work is funded, in part, by the National Science Foundation and the U.S. Defense Advanced Research Projects Agency.



Offline

 

Forum footer

Powered by FluxBB
Translation by FluxBB.fr