The food industry faces major challenges in a global context of population growth, climate crisis, limited resources and water scarcity. Exploiting agro-industrial by-products to obtain food ingredients and materials is a plausible option in line with circular economy policies. However, research in this area has mostly been carried out in an unstructured way, and the results are therefore difficult to scale industrially, because extraction conditions must be optimized for each target compound and each starting biomass source. Artificial intelligence tools offer a great opportunity here: the large amount of information available can be used to develop algorithms that predict optimum extraction conditions, thus boosting the valorization potential of residues. To the best of our knowledge, no available database provides structured and comprehensive information on the initial composition of different residue sources, the extraction conditions, and the yields and properties of the extracted compounds, all of which are crucial for developing valorization tools of industrial interest. There is therefore considerable room for improvement in the databases and bioinformatics protocols currently used for residue valorization.
The overall objective of this project, carried out in collaboration with a mathematics group at the UPV (Prof. Calabuig) with extensive experience in mathematical modelling and artificial intelligence, is to develop mathematical algorithms for valorization purposes. Machine learning techniques (for instance, neural networks) will be used to predict the extraction of two of the most abundant carbohydrates present in residues, cellulose and pectin, from the initial characterization data of different biomass sources.
The candidate will perform the bioinformatics work required to fulfil the objectives of the project. This mainly involves implementing the required methodologies and workflows in modern programming languages, and developing new algorithms that are industrially useful and advance basic knowledge in the area of valorization.
Initially, the candidate will organize the data already available in the group and feed them into an automatic literature-mining technique available in Prof. Calabuig's collaborating group at the UPV. This method will be used to search the whole set of PubMed articles for complementary manuscripts that will help in generating the databases and the subsequent algorithms. R will be used for this purpose: 1) to create the database (for instance, easyPubMed could be used to retrieve the information from PubMed); 2) to select the neural network architecture (using, for instance, kerastuneR) and implement it (with tensorflow); and 3) to publish the results in an interactive dashboard (using shiny). The candidate will also be trained in experimental activities, in order to validate the algorithms developed for cellulose and pectin extraction from residues.
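As an illustration of what a structured database entry might look like before it reaches a neural network, the following minimal sketch defines one record per extraction experiment and flattens a set of records into feature and target rows. It is written in Python as a language-agnostic stand-in (the project itself plans to use R), and every name in it (ExtractionRecord, to_training_rows, the composition and condition keys) is a hypothetical assumption rather than part of the project's actual design.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple


@dataclass
class ExtractionRecord:
    """One extraction experiment taken from the group's data or the literature.

    All field names are hypothetical; the real schema is still to be defined.
    """
    biomass_source: str             # name of the residue the compound is extracted from
    target_compound: str            # "cellulose" or "pectin"
    composition: Dict[str, float]   # initial composition of the biomass
    conditions: Dict[str, float]    # extraction conditions
    yield_percent: float            # reported extraction yield


def to_training_rows(
    records: Sequence[ExtractionRecord],
    composition_keys: Sequence[str],
    condition_keys: Sequence[str],
) -> Tuple[List[List[float]], List[float]]:
    """Flatten records into (features, targets) rows for a regression model.

    Records that lack any requested key are skipped rather than imputed.
    """
    features: List[List[float]] = []
    targets: List[float] = []
    for rec in records:
        try:
            row = [rec.composition[k] for k in composition_keys]
            row += [rec.conditions[k] for k in condition_keys]
        except KeyError:
            continue  # incomplete record: leave it out of the training set
        features.append(row)
        targets.append(rec.yield_percent)
    return features, targets
```

Keeping composition and conditions as separate keyed mappings is one possible design choice: it lets incomplete literature entries be stored as they are and filtered only when a specific model input is assembled.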