Sahara AI partners with Microsoft, delivering high-precision data labeling for 'MathVista'

Uk Jin

Summary

  • Sahara AI said it partnered with Microsoft to build the multimodal AI benchmark MathVista and conducted more than 6,000 high-precision data-labeling tasks.
  • MathVista is used to assess the mathematical reasoning of major generative AI models such as ChatGPT, Bard, Claude, and Gemini, and cumulative downloads have surpassed 270,000 worldwide.
  • Sahara AI said the partnership underscores that high-quality data is key to raising AI performance standards, and it plans to expand collaboration with global partners to continue building AI infrastructure.

Photo=Sahara AI

Sahara AI (SAHARA), a blockchain-based artificial intelligence (AI) data platform, said on the 19th that it partnered with Microsoft (MS) to build 'MathVista (MATHVISTA)', a benchmark that evaluates the reasoning capabilities of multimodal AI (AI that processes multiple types of information, such as images and text, together).

MathVista is an evaluation framework that measures the mathematical reasoning ability of AI models. It is used to validate the performance of major generative AI systems such as ChatGPT, Bard, Claude, and Gemini.

Sahara AI carried out more than 6,000 data-labeling tasks in building MathVista. Data labeling is the process of refining training data so that AI can understand and solve problems. Specifically, labeling covered a range of areas, including arithmetic and algebra, geometry and statistics, advanced STEM logic, time-series numerical reasoning, and numerical commonsense reasoning.

In particular, the project required high-precision data labeling rather than simple classification, which is why Microsoft selected Sahara AI.

Hao Cheng, principal researcher at Microsoft Research, said, "This project required understanding complex instructions, rigorous worker testing, and precise labeling based on logical reasoning," adding that "it was at a level that existing crowdsourcing platforms would find difficult to deliver."

Meanwhile, MathVista has been used by research institutions and companies worldwide since its release. Over the past month, it was downloaded more than 13,000 times, and cumulative downloads have surpassed 270,000.

Results from evaluating major AI models using MathVista showed that the top-performing model, GPT-4V, achieved an accuracy of just 49.9%, about 10 percentage points lower than humans. The company said the results indicate that multimodal AI reasoning performance still has room to improve.
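As a rough illustration of how such a score is read, the sketch below computes benchmark accuracy from labeled answers and contrasts a gap in percentage points with a relative gap. It is not MathVista's actual evaluation code. The function names, the sample answers, and the human baseline of 59.9% (taken here as 49.9% plus a 10-point gap) are assumptions made for illustration only.

```python
def accuracy(predictions, answers):
    """Fraction of items where the model's answer matches the labeled answer."""
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    if not answers:
        raise ValueError("cannot score an empty benchmark")
    correct = sum(p == a for p, a in zip(predictions, answers))
    return correct / len(answers)


def gap_in_points(model_acc, reference_acc):
    """Absolute gap between two accuracies, in percentage points."""
    return (reference_acc - model_acc) * 100


def relative_gap(model_acc, reference_acc):
    """Gap expressed as a fraction of the reference accuracy."""
    return (reference_acc - model_acc) / reference_acc


# Hypothetical toy data: four labeled items, three answered correctly.
toy_score = accuracy(["A", "B", "C", "D"], ["A", "B", "C", "A"])

# 0.499 is the GPT-4V accuracy reported in the article;
# 0.599 is an ASSUMED human baseline used only for this example.
points = gap_in_points(0.499, 0.599)
relative = relative_gap(0.499, 0.599)
print(f"toy accuracy: {toy_score:.2f}")
print(f"gap: {points:.1f} points, relative gap: {relative:.1%}")
```

The two gap measures differ noticeably at these values, which is why it matters whether a reported "10%" is meant in points or in relative terms.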

Sean Ren, co-founder and CEO of Sahara AI, said, "Our collaboration with Microsoft shows that high-quality data plays a key role in raising the bar for AI performance," adding, "We plan to continue building AI infrastructure by expanding cooperation with global partners." The two companies also plan to pursue additional joint projects going forward.

Uk Jin

wook9629@bloomingbit.io