1
Country: Australia
Appen delivers high-quality datasets that power world-leading AI models. Its Appen AI Data Platform (ADAP) combines automation and human oversight to deliver high-quality data for a wide range of AI modalities and use cases. It streamlines complex workflows, enabling rapid model iteration and the development of advanced AI systems that meet business needs. Appen's network of over a million AI trainers worldwide evaluates datasets for accuracy and bias, adding value through language proficiency, creativity, and adherence to brand guidelines. Appen also provides enterprises with software for collecting, processing, and customizing data, automating tasks traditionally performed by humans.
2
Country: USA | Funding: $1M
Enabled Intelligence specializes in AI-powered data labeling for classified systems. The company collaborates with the Department of Defense, National Geospatial-Intelligence Agency, CSHA, BAE Systems, Vantor and others. In particular, the company provides data labeling services that enable AI and machine learning systems to separate objects in satellite imagery and identify targets of interest. Enabled Intelligence also develops its own AI models for high-stakes applications, creates pre-labeled libraries and datasets, provides LLM fine-tuning and testing services, and annotates audio recordings in various languages.
3
Country: USA
DataAnnotation Tech (a subsidiary of Surge AI) is a platform that specializes in recruiting more or less qualified experts to remotely label data for training AI models. Freelancers perform tasks such as image and video labeling, fact-checking and suggesting the best responses for chatbots, audio transcription, and writing text and code. Pay depends on the complexity of the task and starts at $20 per hour. The platform allows participants to choose their projects and working hours. Before contractors can receive tasks, they must create an account and complete screening tests. The platform is often associated with scams because some freelancers sell their accounts or build networks of subcontractors who work through their accounts.
4
Country: USA | Funding: $15.9B
Scale AI focuses on data annotation for AI training and serves large tech companies such as OpenAI, Google and Microsoft, which are developing large language models. It operates a distributed network of annotators and subcontractors via the online platforms Remotasks and Outlier. The company also develops the "Scale Data Engine" platform for fine-tuning and reinforcement learning from human feedback (RLHF). Scale also runs its own research lab, SEAL (Safety, Evaluations, Alignment Lab), for evaluating and aligning AI models. As output, it produces analytical reports on LLM performance, covering quality metrics, safety and weaknesses. The company is 49% owned by Meta, which often raises concerns about independence, data leaks and client conflicts of interest.
5
Country: USA | Funding: $235.3M
Snorkel is an AI data science lab that develops datasets, benchmarks and data evaluation methods to help AI learn, adapt and operate in enterprise systems. The Snorkel Flow platform connects enterprise data streams with systems that run AI in production, enabling organizations to evaluate, develop and improve models and agents quickly and with high quality. Snorkel also provides data preparation and annotation services, AI model evaluation, agent/RAG diagnostics and dataset creation. The company leverages its network of experts to produce high-quality datasets for specialized, domain-specific tasks. However, because manual labeling and annotation of data is slow and expensive, Snorkel also uses programmatic labeling: experts encode domain knowledge into labeling functions that are applied to the entire dataset at once, rather than one data point at a time. Snorkel Flow then removes the noise and applies the most likely label(s) to each data point.
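The idea of encoding domain knowledge as rules that run over a whole dataset can be illustrated with a minimal sketch. It is not Snorkel Flow's API or its actual denoising method: the rule names, the labels ("billing", "account") and the sample texts are all hypothetical, and a plain majority vote stands in for the statistical model a real system would use to resolve noisy votes.

```python
from collections import Counter

ABSTAIN = None  # a rule returns this when it has no opinion

# Hypothetical rules written by a domain expert: each one inspects a
# text and either votes for a label or abstains.
def lf_refund(text):
    return "billing" if "refund" in text.lower() else ABSTAIN

def lf_invoice(text):
    return "billing" if "invoice" in text.lower() else ABSTAIN

def lf_password(text):
    return "account" if "password" in text.lower() else ABSTAIN

def lf_login(text):
    return "account" if "login" in text.lower() else ABSTAIN

LABELING_FUNCTIONS = [lf_refund, lf_invoice, lf_password, lf_login]

def weak_label(text):
    """Combine rule votes by simple majority (an assumption of this sketch).

    Returns ABSTAIN when no rule fires or when the top labels are tied.
    """
    votes = [lf(text) for lf in LABELING_FUNCTIONS]
    votes = [v for v in votes if v is not ABSTAIN]
    if not votes:
        return ABSTAIN
    (top, top_count), *rest = Counter(votes).most_common()
    if rest and rest[0][1] == top_count:
        return ABSTAIN  # conflicting rules, no clear winner
    return top

def label_dataset(texts):
    # Every rule is applied to every item in one pass over the dataset.
    return [(text, weak_label(text)) for text in texts]
```

Calling `label_dataset` on a list of support tickets would label each ticket in a single pass; items on which the rules disagree or stay silent come back unlabeled and could be routed to human annotators.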
6
Country: Switzerland | Funding: $72M
Toloka is a provider of carefully curated data for developing AI agents and models. The company has its own methodology for ensuring high data quality, combining machine learning technologies with human expertise (it maintains its own global network of experts). Toloka creates environments and training platforms for reinforcement learning, collecting trajectories and graded evaluation signals for training and evaluating AI agents. The company collaborates with each client to define reliable success criteria and then develops reproducible data and virtual environments that integrate with the client's training and evaluation process. Toloka helps improve the following model capabilities: agent skills, AI safety, coding skills, text generation and reasoning skills, and image, video and audio generation.
7
Country: Israel | Funding: $49M
Dataloop develops a data management and annotation platform that streamlines the process of creating quality machine learning and AI-ready datasets from unstructured data. Dataloop's dataset browser enables visualization of unstructured data at any scale, simplifying data exploration and improving decision making. Its data management capabilities support querying, versioning and curation of all types of unstructured data. The platform scales to millions of individual items of video, images, audio and other formats. Dataloop also simplifies the integration of human feedback into the AI development process. Platform use cases include active learning workflows, validating GenAI, building AI agents, building RAG workflows, DataOps and LiDAR data annotation. Dataloop also provides a marketplace of pre-built AI workflows, allowing teams to jumpstart their development with hundreds of pipeline templates, datasets and the latest models.
8
Country: USA | Funding: $41.6M
Micro1 helps AI development companies find and manage contractors for data labeling and training. Micro1 partners with leading AI labs, including Microsoft, that are seeking to improve large language models using post-training and reinforcement learning. With the help of subject-matter experts, the company also helps evaluate enterprise AI agents for internal workflows, operations support, finance and industry-specific tasks. Micro1 also enables robotics pre-training, which requires high-quality, human-generated demonstrations of everyday physical tasks. It is building the world's largest robotics pre-training dataset by collecting demonstrations from hundreds of generalists who record interactions with objects in their homes.
9
Country: UK | Funding: $33.7M
Prolific has built a network of 120,000 human participants to inform and stress-test AI models.
10
Country: USA | Funding: $30M
Heartex offers a data labeling and annotation tool for building accurate and smart AI products.
11
Country: USA | Funding: $25M
Surge AI is the world's most powerful data labeling platform for NLP.
12
Country: USA | Funding: $5.1M
RedBrick AI is a SaaS platform for annotating medical data.
13
Country: Estonia | Funding: $2M
AI-based, gamification-supported, crowdsourced data annotation and enrichment platform.
14
Country: Ukraine | Funding: $1.5M
Annotation Hub offers a curated platform for data annotation freelancers and agencies. Beyond job connections, we equip individuals with sought-after annotation skills, ensuring their evolution from entry-level roles to tech professionals.
15
Country: USA | Funding: $100K
Labellerr's data labeling engine uses automated annotation, advanced analytics and smart QA to process millions of images and thousands of hours of video in just a few weeks.
16
Country: USA
Label Your Data offers secure, high-quality data annotation services for Computer Vision and NLP. Our expertise spans diverse industries (including military) and data types.
17
Country: USA
Shaip is an end-to-end AI training data ecosystem that helps companies launch their most demanding AI initiatives.
18
Country: India
Wisepl is one of the leading image annotation companies, annotating data with an exceptional level of accuracy.
19
Country: India
TagX is an industry-leading data annotation and labeling company that creates high-quality data assets for artificial intelligence, leveraging AI and humans in the loop. By learning from the data, we create AI solutions for industries to maximize profits and reduce downtime.
20
Country: USA
Cogito is the industry leader in data labeling and annotation services that provide training datasets for AI and machine learning model development. All types of AI and ML services require training data for their algorithms, and highly accurate data makes AI possible in diverse fields such as healthcare, gaming, agriculture, retail, automotive, robotics and security surveillance.