
AI hosting platform surpasses 1 million models for the first time

On Thursday, AI hosting platform Hugging Face surpassed 1 million AI model listings for the first time, marking a milestone in the rapidly expanding field of machine learning. An AI model is a computer program (usually using a neural network) that is trained on data to perform specific tasks or make predictions. The platform, which started as a chatbot application in 2016 and became an open-source hub for artificial intelligence models in 2020, now hosts a wide range of tools for developers and researchers.

The field of machine learning represents a much larger world than just the large language models (LLMs) of the kind that power ChatGPT. In a post on X, Hugging Face CEO Clément Delangue wrote that the company hosts many high-profile AI models, such as “Llama, Gemma, Phi, Flux, Mistral, Starcoder, Qwen, Stable diffusion, Grok, Whisper, Olmo, Command, Zephyr, OpenELM, Jamba, Yi,” as well as “999,984 others.”

Delangue attributes this abundance to customization. “Contrary to the ‘1 model to rule them all’ fallacy,” he wrote, “smaller, customized, optimized models are better for your use case, domain, language, hardware, and constraints in general. In fact, what few people realize is that there are many models on Hugging Face that are private to a single organization, built by companies specifically for their own use cases.”

A chart provided by Hugging Face showing the number of AI models added to the platform month by month over time.

Hugging Face’s evolution into a major AI platform follows the increasing pace of AI research and development across the tech industry. In just a few years, the number of models hosted on the site has grown dramatically along with interest in the field. On X, Hugging Face product engineer Caleb Fahlgren published a chart of the number of models created on the platform each month (with links to other charts), saying, “The models are increasing exponentially every month, and September isn’t even over yet.”

The power of fine-tuning

As Delangue noted above, the sheer number of models on the platform stems from its collaborative nature and the practice of fine-tuning existing models for specific tasks. Fine-tuning means taking an existing model, teaching its neural network new concepts through additional training, and thereby changing the way it produces output. Developers and researchers from around the world contribute their results, leading to a large ecosystem.
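To make the concept concrete, here is a minimal fine-tuning sketch using the Hugging Face transformers and datasets libraries. The base model (distilbert-base-uncased), the dataset (imdb), and the output names are illustrative assumptions, not details from the article.

```python
# A minimal fine-tuning sketch: take an existing pretrained model and give
# it additional training on new data for a specific task.
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

base = "distilbert-base-uncased"  # an existing pretrained model from the Hub
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForSequenceClassification.from_pretrained(base, num_labels=2)

# Additional training data for the new task (sentiment classification here).
train = load_dataset("imdb", split="train[:2000]").map(
    lambda batch: tokenizer(batch["text"], truncation=True, padding="max_length"),
    batched=True,
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="finetuned-sentiment", num_train_epochs=1),
    train_dataset=train,
)
trainer.train()  # updates the base model's weights on the new data

# The resulting variant could then be uploaded back to the Hub, becoming
# one more of the platform's million-plus models:
# model.push_to_hub("my-org/finetuned-sentiment")  # hypothetical repo name
```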

For example, the platform hosts many variations of Meta’s open-weights Llama models, representing different fine-tuned versions of the original base models, each optimized for specific applications.

Hugging Face’s repository includes models for a wide variety of tasks. Browsing the models page, the “Multimodal” section shows categories such as image-to-text, visual question answering, and document question answering. The “Computer Vision” category includes subcategories such as depth estimation, object detection, and image generation. Natural language processing tasks such as text classification and question answering are also represented, along with audio, tabular, and reinforcement learning (RL) models.
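The same catalog can be queried programmatically. Below is a small sketch using the huggingface_hub library’s list_models function; the object-detection filter is one of the computer vision tasks named above, and the exact fields on the returned objects should be treated as an assumption.

```python
# A small sketch of querying the Hub's task categories programmatically.
from huggingface_hub import list_models

# The five most-downloaded object detection models on the Hub.
for m in list_models(filter="object-detection", sort="downloads", limit=5):
    print(m.id, m.downloads)
```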

Screenshot of the Hugging Face models page taken on September 26, 2024.

Hugging Face

When sorted by “most downloads,” the Hugging Face models list reveals which AI models people find most useful. At the top by a wide margin, with 163 million downloads, is the Audio Spectrogram Transformer from MIT, which classifies audio content such as speech, music, and environmental sounds. Next, with 54.2 million downloads, is BERT, an AI language model from Google that learns to understand English by predicting masked words and sentence relationships, allowing it to assist with a variety of language tasks.
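The chart-topping audio classifier can be tried in a few lines through the transformers pipeline API. This is a sketch assuming the checkpoint ID below corresponds to MIT’s AudioSet-finetuned release of the model.

```python
# Sketch: audio classification with the Audio Spectrogram Transformer.
from transformers import pipeline

classifier = pipeline(
    "audio-classification",
    model="MIT/ast-finetuned-audioset-10-10-0.4593",  # assumed checkpoint ID
)
# "sample.wav" is a placeholder path to any local audio file.
print(classifier("sample.wav"))  # e.g. labels like "Speech" or "Music"
```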

Rounding out the top five AI models are all-MiniLM-L6-v2 (which maps sentences and paragraphs into 384-dimensional dense vector representations, useful for semantic search), the Vision Transformer (which processes images as sequences of patches to perform image classification), and OpenAI’s CLIP (which links images and text, allowing visual content to be classified or described using natural language).
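As a sketch of the semantic search use case described above, assuming the sentence-transformers library: all-MiniLM-L6-v2 embeds each text as a 384-dimensional vector, and cosine similarity ranks documents against a query. The example texts are invented for illustration.

```python
# Sketch: semantic search with all-MiniLM-L6-v2 via sentence-transformers.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

docs = [
    "Hugging Face hosts more than a million AI models.",
    "The weather is sunny today.",
]
query = "How many models does the AI hub host?"

# Each text is mapped to a 384-dimensional dense vector.
doc_vecs = model.encode(docs)
query_vec = model.encode(query)

# Cosine similarity ranks documents by semantic closeness to the query.
print(util.cos_sim(query_vec, doc_vecs))  # the first doc should score higher
```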

No matter the model or the task, the platform continues to grow. “Today, a new repository (model, dataset, or space) is created on HF every 10 seconds,” Delangue wrote. “Eventually, there will be as many models as there are code repositories, and we’ll be here for it!”