Knowledge representation and data integration
Scientists modeling complicated phenomena often don’t use explicit (formally specified) models. This is pragmatic given the open-source tools currently available, but informal reasoning ultimately leads to serious challenges in communicating scientific results clearly and in sharing data. A relational database backend is important for a scalable modeling tool, but a SQL-free interface is also crucial: managing database implementation details by hand quickly becomes unmanageable and hard to extend as model complexity increases. I develop a Python EDSL that helps scientists generate relational databases from a natural declaration of scientific facts, and query and publicly communicate their knowledge base just as naturally.
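As a rough illustration of the idea, and not the actual EDSL (whose API is not shown on this page), the sketch below declares a fact type as a Python dataclass and derives the table, inserts, and queries from it with the standard-library sqlite3 module. The names `create_table`, `insert`, `select`, and `Molecule`, along with the energy values, are hypothetical placeholders.

```python
import sqlite3
from dataclasses import dataclass, fields, astuple

# Hypothetical, minimal mapping from Python annotations to SQLite column types.
SQL_TYPES = {int: "INTEGER", float: "REAL", str: "TEXT"}

def create_table(conn, entity):
    """Create one table per declared entity, one column per dataclass field."""
    cols = ", ".join(f"{f.name} {SQL_TYPES[f.type]}" for f in fields(entity))
    conn.execute(f"CREATE TABLE IF NOT EXISTS {entity.__name__} ({cols})")

def insert(conn, record):
    """Store a declared fact; values are passed as bound parameters."""
    marks = ", ".join("?" for _ in fields(record))
    table = type(record).__name__
    conn.execute(f"INSERT INTO {table} VALUES ({marks})", astuple(record))

def select(conn, entity, **equals):
    """Return all records whose fields equal the given keyword values."""
    where = " AND ".join(f"{name} = ?" for name in equals) or "1"
    query = f"SELECT * FROM {entity.__name__} WHERE {where}"
    rows = conn.execute(query, tuple(equals.values()))
    return [entity(*row) for row in rows]

@dataclass
class Molecule:
    formula: str
    n_atoms: int
    energy_ev: float  # placeholder values below, not real data

conn = sqlite3.connect(":memory:")
create_table(conn, Molecule)
insert(conn, Molecule("H2O", 3, -14.2))
insert(conn, Molecule("CO2", 3, -22.9))
insert(conn, Molecule("H2", 2, -6.8))

triatomic = select(conn, Molecule, n_atoms=3)
water = select(conn, Molecule, formula="H2O")
```

The user only writes the dataclass and calls like `select(conn, Molecule, n_atoms=3)`; the SQL stays behind the interface. A real tool would also need keys, relations between entities, and schema evolution, which this sketch leaves out.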
Heterogeneous data integration, using Category Theory. There is little standardization in how data are represented and stored in many scientific fields. However, the schemas used by different researchers overlap significantly in the information they hold, and in data-driven fields it is especially beneficial to be able to switch freely from one frame of reference to another. Categorical Query Language, a tool developed by Conexus AI, allows data migration and integration to be specified declaratively and cleanly. I aim to show the utility of this approach in the field of computational chemistry, in collaboration with Ryan Wisnesky.
Machine learning applied to computational chemistry
A Chemical Analog to the Convolution Operation. Along with Brian Rohr and Michael Statt, I develop models that quickly compute high-level properties of atomic systems from cheap descriptors by leveraging recent advances in ML (e.g. graph-convolutional and message-passing networks). The link goes to our poster, which was awarded 1st place in Andrew Ng’s CS 230: Deep Learning course (Winter 2018).
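To make the message-passing idea concrete, here is a minimal sketch of a single round on a toy molecular graph. It is not the model from the poster: real networks use learned weights and nonlinearities, whereas this version just sums neighbor features. The function names and the one-hot atom encoding are assumptions chosen for illustration.

```python
def message_passing_step(features, neighbors):
    """One round: each atom adds the sum of its neighbors' vectors to its own."""
    updated = []
    for i, own in enumerate(features):
        new = list(own)
        for j in neighbors[i]:
            for k, value in enumerate(features[j]):
                new[k] += value
        updated.append(new)
    return updated

def readout(features):
    """Sum the atom vectors into one fixed-length vector for the whole system."""
    return [sum(column) for column in zip(*features)]

# Water-like toy graph: atom 0 (O) is bonded to atoms 1 and 2 (H).
features = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]  # one-hot element type: [is_O, is_H]
neighbors = {0: [1, 2], 1: [0], 2: [0]}

hidden = message_passing_step(features, neighbors)
descriptor = readout(hidden)
```

After one round each atom's vector encodes its own element and those of its bonded neighbors, which is the sense in which the operation resembles a convolution over a local neighborhood. Because the readout is a sum, the result does not depend on the order in which atoms are listed.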
Development of functionals for Density Functional Theory. Simulation of chemical reactions using first-principles techniques requires a theoretical framework that can describe a wide range of electronic interactions. Under the direction of Johannes Voss, I develop new meta exchange-correlation functionals with a semi-empirical approach, fitting the functional form against benchmark data from higher-level theory and experiment. By using Bayesian statistics, we enable uncertainty estimation for the computed reaction energies.
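The link between Bayesian fitting and error bars can be seen in a toy setting. The sketch below is not the functional-fitting procedure itself: it assumes a one-parameter linear model y = w·x with Gaussian noise and a Gaussian prior on w, for which the posterior is available in closed form. The function names, the precisions, and the data are illustrative assumptions.

```python
import math

def fit_bayesian_slope(xs, ys, prior_precision=1.0, noise_precision=25.0):
    """Posterior mean and variance of w in y = w*x + noise.

    Assumes the prior w ~ N(0, 1/prior_precision) and Gaussian noise
    with variance 1/noise_precision.
    """
    post_precision = prior_precision + noise_precision * sum(x * x for x in xs)
    post_mean = noise_precision * sum(x * y for x, y in zip(xs, ys)) / post_precision
    return post_mean, 1.0 / post_precision

def predict(x_new, post_mean, post_var, noise_precision=25.0):
    """Predictive mean and standard deviation at a new input."""
    mean = post_mean * x_new
    std = math.sqrt(1.0 / noise_precision + x_new * x_new * post_var)
    return mean, std

# Made-up benchmark pairs (descriptor, reference value), for illustration only.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

w_mean, w_var = fit_bayesian_slope(xs, ys)
near_mean, near_std = predict(1.0, w_mean, w_var)
far_mean, far_std = predict(10.0, w_mean, w_var)
```

The fit returns a distribution over the parameter rather than a single value, so every prediction carries a standard deviation, and that deviation grows for inputs far from the benchmark data (`far_std` exceeds `near_std`).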