Early-stage researchers in a practical data science environment

As Chief Data Scientist of SDG Group, I had the opportunity to supervise, for a short time, two ESRs from the AMVANewPhysics network. This speech is an account of that experience from my point of view, with some indication of the methods, methodologies and techniques proposed and explored by the ESRs.

The first experience was a cursory exploration of a subject that is exciting, at least for me: Topological Data Analysis (TDA), which applies methods from algebraic topology and local geometrical constructs to data. The main topics of TDA are persistent homology, multidimensional persistent homology and related tools on one side, and mapper on the other. The ESR reached a fairly deep understanding of the basic concepts and was able to implement mapper in Python. Once mapper is applied to data, statistics enters the picture: selecting the optimal mapper parameterization, although it draws on algebraic-topological constructs such as the multiscale mapper and the multinerve, will rely on statistical methods, some already proposed and many still to be devised and validated.
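
To make the mapper construction concrete, here is a minimal sketch in Python. It is not the ESR's implementation: the function names (interval_cover, eps_clusters, mapper), the one-dimensional filter, the clustering rule (connected components of an eps-neighbourhood graph) and all parameter values are assumptions chosen for illustration. The steps are the usual ones: cover the range of a filter function with overlapping intervals, cluster the points falling in each interval, and connect two clusters whenever they share a point.

```python
import math
from itertools import combinations


def interval_cover(lo, hi, n_intervals, overlap):
    """Cover [lo, hi] with equal-length intervals; neighbours share a
    fraction `overlap` (0 <= overlap < 1) of their length."""
    length = (hi - lo) / (1 + (n_intervals - 1) * (1 - overlap))
    step = length * (1 - overlap)
    cover = [(lo + i * step, lo + i * step + length) for i in range(n_intervals)]
    cover[-1] = (cover[-1][0], hi)  # guard against floating-point shortfall
    return cover


def eps_clusters(indices, points, eps):
    """Cluster by connected components of the eps-neighbourhood graph
    (an illustrative choice; any clustering method could be plugged in)."""
    remaining = set(indices)
    clusters = []
    while remaining:
        seed = remaining.pop()
        component, frontier = {seed}, [seed]
        while frontier:
            i = frontier.pop()
            near = {j for j in remaining if math.dist(points[i], points[j]) <= eps}
            remaining -= near
            component |= near
            frontier.extend(near)
        clusters.append(frozenset(component))
    return clusters


def mapper(points, filter_values, n_intervals=4, overlap=0.3, eps=0.4):
    """Return (nodes, edges): nodes are sets of point indices, edges join
    nodes that have at least one point in common."""
    lo, hi = min(filter_values), max(filter_values)
    nodes = []
    for a, b in interval_cover(lo, hi, n_intervals, overlap):
        members = [i for i, f in enumerate(filter_values) if a <= f <= b]
        nodes.extend(eps_clusters(members, points, eps))
    edges = [(u, v) for u, v in combinations(range(len(nodes)), 2)
             if nodes[u] & nodes[v]]
    return nodes, edges


# Toy data (hypothetical): 40 points on the unit circle, filtered by x-coordinate.
angles = [2 * math.pi * k / 40 for k in range(40)]
points = [(math.cos(t), math.sin(t)) for t in angles]
filter_values = [p[0] for p in points]
nodes, edges = mapper(points, filter_values)
print(len(nodes), "nodes,", len(edges), "edges")
```

On this toy circle the resulting graph should form a single loop, which is the kind of shape information mapper is meant to recover. Changing n_intervals, overlap or eps changes the graph, which is why the choice of parameterization becomes a statistical question.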

The second experience focused on two streams. The first explored mapper as a tool for improving predictive modeling and, as a parallel task, sought a heuristic for selecting one of the key parameters of mapper, namely the overlap. The second aimed to implement a flexible calendar to support a forecasting model where calendar effects are essential.

Maurizio Sanarico