Chemical Analytics Platform
Chemical Analytics Platform is a freely available Virtual Machine encompassing tools, databases, and KNIME workflows.
Managing and exploiting growing data resources in chemical design
Chemical design, like most scientific disciplines, is becoming increasingly data-intensive and dependent on our capacity to manage and exploit growing data resources. In particular, there is an increasing need for drug-discovery organizations to enable decision making that is informed by the growth of their internally generated data and its integration with external data.
Data-driven chemistry (for drug design, materials science, catalysis, polymers) depends on researchers coping with the growth in data and finding ways to convert these resources into better decisions. Increasing the capacity of chemists to undertake data-driven research has the potential to improve decision making in drug discovery and to ensure that the most benefit is derived from the growth in data. The rapid increase in available data in the so-called Big Data era makes harnessing these resources and optimizing our research processes a prerequisite for future success.
At the core of chemical design is the “design, synthesis, testing and evaluation” cycle. Traditionally, all components of the cycle have been undertaken in the same laboratory under the control of a small team of synthetic chemists and a computational chemist as part of a multidisciplinary team. The most important task of the chemistry team is to evaluate new biological test results in the context of known chemistry rules, general and project-specific models, and any other available information such as protein structures. The key to successful design chemistry is the ability to balance an array of often conflicting properties as each round of design and synthesis improves the overall properties of the compound series (or at least facilitates future improvement). Design chemistry is therefore a data-driven task that requires immediate access to all available data if the results of new testing are to truly influence the next rounds of synthesis.
Chemical data analysis workflow tools, such as KNIME, TAVERNA and PIPELINE PILOT, have been implemented in most pharmaceutical companies, providing user-friendly workbenches for experts and non-experts to undertake complex data analysis tasks including machine learning, analytics and visualization. TAVERNA and KNIME are open source workflow tools with large communities that develop and share new functionality, which helps disseminate methods and subjects them to rigorous community testing. Even the largest commercial software providers, including Schrödinger, Tripos and CCG, now provide tools (nodes and extensions) to the KNIME community.
Despite the user-friendly nature of these workflow tools, they are not trivial to manage, especially when connecting them to database tools or other extensions. For this reason, the Dutch academic community benefits from the Netherlands eScience Center implementing an eScience platform around a workflow tool on its behalf.
This project delivers a local version of such an eScience platform for chemistry. Based on previously described infrastructures, it is supported by open source databases (MySQL and PostgreSQL) and connected to chemistry-specific applications such as RDKit and CDK and to the analytics and visualization capabilities of R.
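As a rough illustration of the database-backed analysis that such a platform supports, the sketch below stores a few assay results in a SQL table, reads them back and summarises them per compound series. It is only a sketch under stated assumptions: Python's built-in sqlite3 module stands in for MySQL or PostgreSQL so that the example is self-contained, and the table name, column names, compound identifiers and pIC50 values are all hypothetical. In the platform itself, steps like these would typically be KNIME nodes, with RDKit or CDK adding chemistry-specific calculations on the SMILES column.

```python
import sqlite3
from statistics import mean

# Hypothetical example data: (compound id, series, SMILES, pIC50).
ROWS = [
    ("CPD-1", "A", "CCO", 5.2),
    ("CPD-2", "A", "CCN", 6.1),
    ("CPD-3", "B", "c1ccccc1O", 7.4),
    ("CPD-4", "B", "c1ccccc1N", 6.8),
]


def build_db():
    """Create an in-memory database holding the example assay table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE assay ("
        "compound_id TEXT PRIMARY KEY, series TEXT, smiles TEXT, pic50 REAL)"
    )
    conn.executemany("INSERT INTO assay VALUES (?, ?, ?, ?)", ROWS)
    conn.commit()
    return conn


def mean_pic50_by_series(conn):
    """Return {series: mean pIC50}, rounded to two decimals."""
    grouped = {}
    for series, pic50 in conn.execute("SELECT series, pic50 FROM assay"):
        grouped.setdefault(series, []).append(pic50)
    return {series: round(mean(values), 2) for series, values in grouped.items()}


def most_potent(conn, n=1):
    """Return the ids of the n compounds with the highest pIC50."""
    cur = conn.execute(
        "SELECT compound_id FROM assay ORDER BY pic50 DESC LIMIT ?", (n,)
    )
    return [row[0] for row in cur]


if __name__ == "__main__":
    connection = build_db()
    print(mean_pic50_by_series(connection))
    print(most_potent(connection, n=2))
```

Swapping the stand-in for a real MySQL or PostgreSQL server would mainly mean changing the connection call and the parameter placeholder style; the query-then-summarise pattern stays the same.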
Such an approach has the potential to support many aspects of data-driven chemistry. Because the central workflow tool, KNIME (like TAVERNA and PIPELINE PILOT), is domain independent, it could also support projects in many other disciplines in the future.
If you are working in the KNIME workflow platform and need data about your favorite kinase receptor ligand interaction, then these nodes are for you.
A node for the KNIME workflow system that allows you to retrieve data about your favorite G protein-coupled receptors from gpcrdb.org.
Want to write your own KNIME node? Then use the KNIME node archetype to generate a node skeleton repository with sample code.
A node for the KNIME workflow system that allows you to compare different binding sites in proteins with each other.
Want to write your own KNIME node wrapping a Python library? Then use the KNIME Python node archetype to generate a node skeleton repository with sample code.
A node for the KNIME workflow system that allows you to use the Silicos-it software to filter or align molecules.