Towards high-quality (maybe synthetic) datasets
Podcast: Practical AI: Machine Learning, Data Science, LLM
Published On: Wed Oct 09 2024
Description: As Argilla puts it: "Data quality is what makes or breaks AI." However, what exactly does this mean, and how can AI teams properly collaborate with domain experts towards improved data quality? David Berenstein & Ben Burtenshaw, who are building Argilla & Distilabel at Hugging Face, join us to dig into these topics along with synthetic data generation & AI-generated labeling / feedback.