Data Analysis and Preview
Raw data often does not meet training requirements. For example, it may contain invalid or duplicate samples. To help you improve data quality, ModelArts provides the following capabilities:
Auto Grouping: pre-classifies data through clustering so that you can label data based on the clustering results, which helps keep the number of samples per label the same or nearly the same.
Data Filtering: enables you to filter data based on sample attributes and auto grouping results.
Data Feature Analysis: analyzes data features and labeling results, such as brightness and bounding box distribution, so that you can assess data balance and improve model training results.
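To make the idea of feature analysis more concrete, the following is a minimal sketch of two checks of this kind: a brightness distribution and a label balance ratio. It is not the ModelArts implementation or API. The function names (`mean_brightness`, `brightness_histogram`, `label_balance`), the bin width, and the toy data are all assumptions made for illustration, and images are represented as plain nested lists of grayscale values from 0 to 255.

```python
from collections import Counter
from statistics import mean


def mean_brightness(pixels):
    """Average intensity of a grayscale image given as rows of 0-255 values."""
    return mean(value for row in pixels for value in row)


def brightness_histogram(images, bin_width=64):
    """Count images per brightness bin (bin index = mean brightness // bin_width)."""
    return Counter(int(mean_brightness(img) // bin_width) for img in images)


def label_balance(labels):
    """Return per-label counts and the ratio of the largest to the smallest class."""
    counts = Counter(labels)
    ratio = max(counts.values()) / min(counts.values())
    return counts, ratio


# Hypothetical toy dataset: three 2x2 grayscale images and their labels.
images = [
    [[0, 10], [20, 30]],        # dark image
    [[200, 220], [240, 250]],   # bright image
    [[100, 120], [140, 160]],   # mid-range image
]
labels = ["cat", "cat", "dog"]

histogram = brightness_histogram(images)
counts, ratio = label_balance(labels)

print("Images per brightness bin:", dict(histogram))
print("Samples per label:", dict(counts))
print("Largest-to-smallest class ratio:", ratio)
```

A ratio close to 1 suggests that the labels are roughly balanced, while a histogram concentrated in one bin suggests that the dataset covers only a narrow range of lighting conditions.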