A groundbreaking algorithm exposes how much hidden sugar is lurking in your food—and shows which countries and products meet the mark for healthy carbs.
Study: Predicting carbohydrate quality in a global database of packaged foods.
Carbohydrates contribute approximately 70% of daily energy intake in the average human diet worldwide; yet, the importance of carbohydrate quality is often overshadowed by its quantity. In a recent study published in the journal Frontiers in Nutrition, a European research team developed an algorithm to predict the free sugar content in packaged foods, providing insights into carbohydrate quality on a global scale.
Carbohydrates in the diet
Carbohydrates are a vital energy source and play a crucial role in global nutrition. While discussions on diet often focus on the quantity of carbohydrates, the quality of carbohydrates is equally essential for maintaining good health. Scientific evidence indicates that the quality of carbohydrates affects metabolic function and the risk of chronic diseases.
One tool used to assess carbohydrate quality is the Carbohydrate Quality Ratio (CQR), which evaluates the balance of total carbohydrates, dietary fiber, and free sugars in food products. The ratio requires at least 1 gram of dietary fiber per 10 grams of total carbohydrates, and no more than 2 grams of free sugars per 1 gram of fiber. It helps distinguish nutritionally beneficial foods from those that may contribute to poor health outcomes.
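As a rough illustration of the two conditions above, the check can be sketched in a few lines of Python. The function name and the example products are invented for this sketch and are not taken from the study; all values are assumed to be grams per the same reference amount, such as 100 g.

```python
def meets_cqr(carbs_g: float, fiber_g: float, free_sugars_g: float) -> bool:
    """Return True if a product satisfies both conditions of the ratio.

    How products with zero carbohydrates are handled in the study is not
    stated in the article; here they pass trivially (an assumption).
    """
    # Condition 1: at least 1 g of fiber per 10 g of total carbohydrates.
    enough_fiber = fiber_g * 10 >= carbs_g
    # Condition 2: no more than 2 g of free sugars per 1 g of fiber.
    limited_sugar = free_sugars_g <= 2 * fiber_g
    return enough_fiber and limited_sugar


# Hypothetical products (values per 100 g, invented for illustration).
oat_cereal = meets_cqr(carbs_g=60.0, fiber_g=10.0, free_sugars_g=1.0)
sweet_cereal = meets_cqr(carbs_g=80.0, fiber_g=3.0, free_sugars_g=30.0)
print(oat_cereal, sweet_cereal)
```

Note that a product can pass the fiber condition and still fail on free sugars, which is why an estimate of free sugar content is needed before the ratio can be applied.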
However, accurately determining free sugar content in packaged foods remains a challenge. Few countries require explicit labeling of added sugars, limiting transparency for consumers and researchers. Free sugars, as defined by the World Health Organization (WHO), include added sugars as well as naturally occurring sugars in honey, syrups, and fruit juices, whereas the U.S. Food and Drug Administration (FDA) defines added sugars as only those introduced during processing. Without this information on labels, it is difficult to assess carbohydrate quality, make informed dietary choices, or study the impact of carbohydrate consumption on health.
About the study
In the present study, the researchers developed an algorithm to predict free sugars in packaged foods worldwide, addressing a critical knowledge gap in carbohydrate quality. They used data from the Mintel Global New Products Database (GNPD), which contains extensive information on packaged foods from 86 countries, including nutrient composition and ingredient lists.
Prior to analysis, the team meticulously cleaned and standardized the data to ensure consistency. A crucial step involved manually curating and tagging ingredients using regular expressions to classify them as added or naturally occurring sugars—a distinction that was essential for accurately estimating free sugar content.
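To give a sense of what regex-based ingredient tagging can look like, here is a minimal Python sketch. The patterns, tag names, and sample label are hypothetical and far smaller than any manually curated list; they are not the study's actual rules.

```python
import re

# Hypothetical patterns, invented for illustration only.
ADDED_SUGAR = re.compile(
    r"\b(sugar|syrup|honey|dextrose|fructose|molasses)\b", re.IGNORECASE
)
NATURAL_SUGAR_SOURCE = re.compile(
    r"\b(apple|banana|raisins?|dates?|milk|yogurt)\b", re.IGNORECASE
)


def tag_ingredient(name: str) -> str:
    """Classify one ingredient as 'added_sugar', 'natural_sugar' or 'other'."""
    # Added-sugar patterns are checked first, so "date syrup" counts as added.
    if ADDED_SUGAR.search(name):
        return "added_sugar"
    if NATURAL_SUGAR_SOURCE.search(name):
        return "natural_sugar"
    return "other"


def tag_ingredient_list(ingredients: str) -> list[tuple[str, str]]:
    """Split a comma-separated ingredient label and tag each item in order."""
    names = [part.strip() for part in ingredients.split(",") if part.strip()]
    return [(name, tag_ingredient(name)) for name in names]


label = "Whole grain oats, cane sugar, raisins, salt, corn syrup"
for name, tag in tag_ingredient_list(label):
    print(f"{name}: {tag}")
```

Real ingredient lists are messier than this (nested parentheses, multiple languages, percentages), which is presumably why the tagging required manual curation.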
To build predictive models, the researchers employed machine learning techniques. They trained their models using data from the United States (U.S.), formally tested their performance in 14 selected countries, and applied the models to products from 81 additional countries. The models analyzed product labels, considering the first six ingredients categorized as added sugars, fruits, or dairy, along with detailed nutritional information such as energy content, fats, carbohydrates, fiber, protein, sugars, and sodium.
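One plausible way to turn such a label into model inputs is sketched below in Python. The encoding (one indicator per category and ingredient position, followed by the nutrient values), the field names, and the units are assumptions made for illustration; the article does not describe the study's exact feature layout.

```python
CATEGORIES = ("added_sugar", "fruit", "dairy")
# Hypothetical field names and units for the declared nutrients.
NUTRIENTS = (
    "energy_kcal", "fat_g", "carbs_g", "fiber_g",
    "protein_g", "sugars_g", "sodium_mg",
)


def build_features(
    ingredient_tags: list[str],
    nutrients: dict[str, float],
    n_positions: int = 6,
) -> list[float]:
    """Encode the first six ingredient tags plus nutrients as one flat vector."""
    first = ingredient_tags[:n_positions]
    # Pad short ingredient lists so every product yields the same length.
    first = first + ["other"] * (n_positions - len(first))
    features: list[float] = []
    for tag in first:
        # One 0/1 indicator per category at each ingredient position.
        features.extend(1.0 if tag == cat else 0.0 for cat in CATEGORIES)
    features.extend(float(nutrients[name]) for name in NUTRIENTS)
    return features


# Hypothetical product: sugar is the second ingredient, fruit the third.
tags = ["other", "added_sugar", "fruit"]
facts = {
    "energy_kcal": 380.0, "fat_g": 6.0, "carbs_g": 70.0, "fiber_g": 8.0,
    "protein_g": 10.0, "sugars_g": 18.0, "sodium_mg": 200.0,
}
vector = build_features(tags, facts)
print(len(vector))
```

Keeping the ingredient position matters because labels generally list ingredients in descending order by weight, so sugar in first place suggests more of it than sugar in sixth place.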
The pipeline included three binary classifiers to detect the presence of added sugars and stacked tree-based regression models to estimate their quantity. Predicted added sugar values were then used as estimates of free sugar, except for specific food categories such as juice drinks and sugar confectionery, where total sugars were used directly due to their unique sugar profiles.
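The decision flow described above (classify first, then estimate a quantity, with a category-based exception) can be sketched in Python as follows. The classifier and regressor are passed in as plain functions, and the toy stand-ins below are invented for illustration; they are not the study's trained models. Clipping the estimate to the declared total sugars is also an assumption of this sketch.

```python
from typing import Any, Callable, Mapping

Product = Mapping[str, Any]

# Categories where total sugars stand in for free sugars, per the article.
TOTAL_SUGAR_CATEGORIES = {"juice drinks", "sugar confectionery"}


def estimate_free_sugar(
    product: Product,
    has_added_sugar: Callable[[Product], bool],
    predict_amount: Callable[[Product], float],
) -> float:
    """Two-stage estimate of free sugars in grams per 100 g."""
    total = float(product["sugars_g"])
    if product["category"] in TOTAL_SUGAR_CATEGORIES:
        return total
    # Stage 1: a classifier decides whether any added sugar is present.
    if not has_added_sugar(product):
        return 0.0
    # Stage 2: a regressor estimates the amount. Clipping to the range
    # [0, total sugars] is an assumption made for this sketch.
    return min(max(predict_amount(product), 0.0), total)


# Toy stand-ins for trained models (hypothetical, for illustration only).
def toy_classifier(product: Product) -> bool:
    return "sugar" in product["ingredients"].lower()


def toy_regressor(product: Product) -> float:
    return 0.5 * float(product["sugars_g"])


cookie = {"category": "biscuits", "ingredients": "Wheat flour, sugar, butter", "sugars_g": 25.0}
juice = {"category": "juice drinks", "ingredients": "Water, apple juice", "sugars_g": 10.0}
oats = {"category": "hot cereals", "ingredients": "Rolled oats", "sugars_g": 1.0}

for item in (cookie, juice, oats):
    print(item["category"], estimate_free_sugar(item, toy_classifier, toy_regressor))
```

Separating detection from quantification means products with no added sugar get an exact zero rather than a small nonzero regression output.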
Finally, the models were applied to products without explicit added sugar declarations to predict their carbohydrate composition. Carbohydrate quality was assessed using the predefined ratios of 10:1 for total carbohydrates to fiber and 1:2 for fiber to free sugars.
Key findings
The study found that the machine learning models predicted free sugar content in packaged food products with high accuracy. The mean absolute error on the test set was 0.96 g/100 g, indicating a small average difference between the predicted and declared values.
Furthermore, the model achieved an R² of 0.98 between predicted and declared values and outperformed previous models such as k-nearest neighbors, which showed a much higher error rate. Notably, the model's predictive capabilities were not limited to the U.S. The researchers found that the model performed accurately when formally tested in 14 countries, and they applied it across an additional 81 countries, highlighting its global applicability.
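For readers unfamiliar with the two accuracy metrics quoted above, the following Python sketch shows how mean absolute error and R² are commonly computed. The numbers are invented for illustration, and R² is computed here as the coefficient of determination; the article does not say whether the study used this definition or the squared correlation.

```python
def mean_absolute_error(declared: list[float], predicted: list[float]) -> float:
    """Average absolute gap between declared and predicted values."""
    return sum(abs(d - p) for d, p in zip(declared, predicted)) / len(declared)


def r_squared(declared: list[float], predicted: list[float]) -> float:
    """Coefficient of determination: 1 - residual variation / total variation."""
    mean_declared = sum(declared) / len(declared)
    ss_res = sum((d - p) ** 2 for d, p in zip(declared, predicted))
    ss_tot = sum((d - mean_declared) ** 2 for d in declared)
    return 1.0 - ss_res / ss_tot


# Invented values in g/100 g, not data from the study.
declared = [0.0, 5.0, 10.0, 20.0]
predicted = [1.0, 4.0, 11.0, 19.0]
print(mean_absolute_error(declared, predicted))
print(round(r_squared(declared, predicted), 3))
```

A low mean absolute error says the typical miss is small in grams, while a high R² says the predictions track most of the variation between products; the two are complementary rather than interchangeable.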
The study also examined the proportion of food products that met the target carbohydrate quality ratio, revealing significant variations across both food categories and countries. In the U.S., the share of products meeting the ratio ranged from 60% for hot cereals to 0% for flavored milk and malt beverages, showing how widely carbohydrate quality can vary even within a single country.
When considering all food categories, the percentage of products meeting the target ratio ranged from 67% in the United Kingdom to 9.8% in Malaysia.
Notably, plant-based beverages, unlike most drink categories, showed relatively high adherence to the carbohydrate quality ratio across countries, due to higher fiber content and lower added sugar levels.
However, the researchers acknowledged that small sample sizes may limit the accuracy of predictions for certain countries, which could affect the generalizability of the findings for those regions.
Additionally, the authors performed z-tests comparing predicted and declared free sugar values across 18 food categories in the U.S. and found no statistically significant differences, affirming the model’s robustness.
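The article does not specify which variant of the z-test was used. As a minimal sketch, assuming a two-sample z-test on the difference in means with sample variances, the calculation looks like this in Python; the function name and the example numbers are invented for illustration.

```python
import math


def two_sample_z_test(sample_a: list[float], sample_b: list[float]) -> tuple[float, float]:
    """Return (z statistic, two-sided p-value) for a difference in means."""
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a = sum(sample_a) / n_a
    mean_b = sum(sample_b) / n_b
    # Sample variances (n - 1 in the denominator).
    var_a = sum((x - mean_a) ** 2 for x in sample_a) / (n_a - 1)
    var_b = sum((x - mean_b) ** 2 for x in sample_b) / (n_b - 1)
    standard_error = math.sqrt(var_a / n_a + var_b / n_b)
    z = (mean_a - mean_b) / standard_error
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Invented values in g/100 g; real z-tests need far larger samples.
declared = [4.0, 6.0, 5.0, 7.0, 3.0, 5.0]
predicted = [4.5, 5.5, 5.0, 6.5, 3.5, 5.0]
z, p = two_sample_z_test(declared, predicted)
print(f"z = {z:.3f}, p = {p:.3f}")
```

A p-value above the chosen significance level (commonly 0.05) means the test found no evidence that predicted and declared values differ on average, which is the outcome the authors reported.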
Conclusion
In summary, the study successfully developed and validated a machine-learning-based method for predicting free sugar content in packaged foods using a large-scale global database. This fully automated and scalable approach demonstrated strong accuracy across countries and food categories and may be extended to other databases and nutrient metrics requiring free sugar estimation.
The predicted free sugar values could also enhance nutrient profiling systems such as Nutri-Score, which currently rely on total sugars due to limited labeling requirements.
The approach provides a tool for monitoring and assessing carbohydrate quality in the global food supply, offering insights for public health initiatives and dietary guidance.
Journal reference:
- Scuccimarra, E. A., Arnaud, A., Tassy, M., Lê, K.-A., & Mainardi, F. (2025). Predicting carbohydrate quality in a global database of packaged foods. Frontiers in Nutrition, 12. DOI:10.3389/fnut.2025.1530846, https://www.frontiersin.org/journals/nutrition/articles/10.3389/fnut.2025.1530846/full