At the moment, I am a postdoctoral researcher in the Steinbeck group at Friedrich-Schiller University in Jena, Germany. I worked on my PhD thesis under the supervision of Prof. Dr. Christoph Steinbeck and Prof. Dr. Achim Zielesny. Most of my research focuses on using deep learning methodologies for chemical data mining, and on computer vision and natural language processing (NLP) for understanding chemical structure representations. Most of my work is done on the Google Cloud Platform using Google's Tensor Processing Units (TPUs). I also serve as a teaching assistant for cheminformatics and Python courses.
Currently, I am a full-time researcher, a hobbyist photographer, and a foodie, with a great passion for cooking and baking.
A summary of a few of my professional skills
In my lifetime, I have been fortunate enough to live, work and study in multiple countries. Each of them has shaped me into the person I am today.
July 2017 - May 2021
Deep leARning for chemicaL Information processinG
Vast quantities of scientific information are hidden in primary scientific publications and not available as curated data in scientific databases. Making such information publicly available to support open science and open innovation is a challenge that has to be solved. In this dissertation, state-of-the-art deep learning models for optical chemical structure recognition and chemical information processing have been implemented to rediscover this information and retrieve it automatically.
August 2020 - today
STOUT: SMILES TO IUPAC Translator is built on the same concept as neural machine translation (NMT). STOUT was initially trained on subsets downloaded from PubChem containing 30 million and 60 million SMILES, which were converted into SELFIES using the SELFIES package. The same set of SMILES was also converted into IUPAC names using ChemAxon "molconvert", a command-line program in Marvin Suite 20.15 from ChemAxon (https://www.chemaxon.com). The textual data was then converted into TFRecords (binary files) for training on Tensor Processing Units (TPUs).
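To give a rough idea of what preparing SELFIES text for a translation model can look like, here is a minimal sketch in plain Python. It is not the STOUT code: the helper names (`split_selfies_tokens`, `build_vocab`, `encode`), the special tokens, and the two hand-written SELFIES strings are assumptions made for illustration, and the actual pipeline relies on the SELFIES package and TFRecords as described above.

```python
from __future__ import annotations

import re


def split_selfies_tokens(selfies: str) -> list[str]:
    """Split a SELFIES string into its bracketed symbols (hypothetical helper)."""
    return re.findall(r"\[[^\]]*\]|\.", selfies)


def build_vocab(strings: list[str]) -> dict[str, int]:
    """Map every symbol seen in the data to an integer id.

    The three special tokens are an assumption of this sketch.
    """
    vocab = {"<pad>": 0, "<start>": 1, "<end>": 2}
    symbols = sorted({tok for s in strings for tok in split_selfies_tokens(s)})
    for tok in symbols:
        vocab[tok] = len(vocab)
    return vocab


def encode(selfies: str, vocab: dict[str, int], max_len: int) -> list[int]:
    """Turn one SELFIES string into a fixed-length list of token ids."""
    ids = [vocab["<start>"]]
    ids += [vocab[tok] for tok in split_selfies_tokens(selfies)]
    ids.append(vocab["<end>"])
    if len(ids) > max_len:
        raise ValueError("sequence is longer than max_len")
    # Pad on the right so that all sequences share the same length.
    return ids + [vocab["<pad>"]] * (max_len - len(ids))


# Illustrative input strings written by hand for this sketch.
data = ["[C][C][O]", "[C][C][=O]"]
vocab = build_vocab(data)
encoded = encode("[C][C][O]", vocab, max_len=8)
```

Integer sequences like `encoded` are the kind of input a sequence-to-sequence model typically consumes; the IUPAC names on the target side would need their own tokenization, which is not covered here.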
December 2019 - June 2020
Molecule Set Comparator (MSC) is an application that enables versatile and fast comparison of large molecule sets through a unique inter-set, molecule-to-molecule comparison of an original set and a predicted set of molecules obtained by machine learning approaches. The molecule-to-molecule comparison is based on chemical descriptors included in the Chemistry Development Kit (CDK), such as Tanimoto similarities, atom/bond/ring counts, and physicochemical properties like logP. The results are presented graphically and summarized by interactive histograms that can be exported in publication quality.
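As a small illustration of the molecule-to-molecule idea, the sketch below computes Tanimoto similarities between paired fingerprints of an original and a predicted set. It is plain Python and not the MSC implementation, which is a Java application built on the CDK; the function names and the toy fingerprints (sets of "on" bit positions) are assumptions made for this example.

```python
from __future__ import annotations


def tanimoto(a: set[int], b: set[int]) -> float:
    """Tanimoto similarity of two fingerprints given as sets of 'on' bits."""
    if not a and not b:
        return 1.0  # convention chosen in this sketch for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def compare_sets(original: list[set[int]], predicted: list[set[int]]) -> list[float]:
    """Compare molecule i of the original set with molecule i of the predicted set."""
    if len(original) != len(predicted):
        raise ValueError("both sets must contain the same number of molecules")
    return [tanimoto(o, p) for o, p in zip(original, predicted)]


# Toy fingerprints, invented for illustration only.
original = [{1, 4, 7, 9}, {2, 3, 5}]
predicted = [{1, 4, 7}, {2, 3, 5}]
scores = compare_sets(original, predicted)
```

A list of per-pair scores such as `scores` is the kind of data that can then be binned into a histogram to summarize how close a predicted set is to the original.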
July 2020 - today
I was part of the data curation work and designed the logo for the project.
December 2016 - May 2017
A Java-based tool to systematically explore and generate the conformers of a bioactive molecule.
July 2017 - May 2021
In 2021, I completed and defended my Ph.D. with a "summa cum laude" grade.
July 2015 - June 2017
I completed my Master’s degree in Bioinformatics in Pune, India. For my Master’s thesis, I worked primarily in cheminformatics.
September 2011 - November 2014
I completed my Bachelor of Science in Biotechnology in Bangalore, India.
Here are a few of the scientific papers I have published over the course of my scientific career. All of my papers are open-access because I exclusively work on open-source and open-data projects. My code for the projects is available on GitHub.
Rajan, K., Zielesny, A. & Steinbeck, C. DECIMER: towards deep learning for chemical image recognition. J Cheminform 12, 65 (2020). https://doi.org/10.1186/s13321-020-00469-w
Project on GitHub
Rajan, Kohulan; Zielesny, Achim; Steinbeck, Christoph (2021): DECIMER 1.0: Deep Learning for Chemical Image Recognition using Transformers. ChemRxiv. Preprint. https://doi.org/10.26434/chemrxiv.14479287.v1
Project on GitHub
Rajan, K., Brinkhaus, H.O., Sorokina, M. et al. DECIMER-Segmentation: Automated extraction of chemical structure depictions from scientific literature. J Cheminform 13, 20 (2021). https://doi.org/10.1186/s13321-021-00496-1
Project on GitHub
Project Website
Rajan, K., Hein, JM., Steinbeck, C. et al. Molecule Set Comparator (MSC): a CDK-based open rich‐client tool for molecule set similarity evaluations. J Cheminform 13, 5 (2021). https://doi.org/10.1186/s13321-021-00485-4
Project on GitHub
Rajan, K., Brinkhaus, H.O., Zielesny, A. et al. A review of optical chemical structure recognition tools. J Cheminform 12, 60 (2020). https://doi.org/10.1186/s13321-020-00465-0
Project on GitHub
Rajan, K., Zielesny, A. & Steinbeck, C. STOUT: SMILES to IUPAC names using neural machine translation. J Cheminform 13, 34 (2021). https://doi.org/10.1186/s13321-021-00512-4
Project on GitHub
The logos of the projects and the naturalproducts.net website were designed by me.
I take plenty more photographs. To see them all, visit my Instagram, and to download them as wallpapers, check my artist page on walli.
I would love to hear from you, so please feel free to contact me directly or simply follow me on my GitHub and social media accounts.