Drug named entity recognition

doi:10.5281/zenodo.10970630

Description

💊 Drug named entity recognition

Developed by Fast Data Science, https://fastdatascience.com

Source code at https://github.com/fastdatascience/drug_named_entity_recognition

Tutorial at https://fastdatascience.com/drug-named-entity-recognition-python-library/

This is a lightweight Python library for finding drug names in a string, a task known as named entity recognition (NER), and for linking them to reference identifiers (named entity linking).

Please note that this library finds only high-confidence drug names and does not currently support misspellings.

It finds only the English names of drugs. Names in other languages are not supported.

It also does not find short code names or abbreviations commonly used in medicine, such as "Ceph" for "Cephradin", because these are highly ambiguous.

💻 Installing the drug named entity recognition Python package

You can install from PyPI.

pip install drug-named-entity-recognition

If you get an error installing, try making a new Python environment in Conda (conda create -n test-env; conda activate test-env) or venv (python -m venv testenv; then source testenv/bin/activate on Linux/macOS, or testenv\Scripts\activate on Windows) and then installing the library.

The library ships with the drug name dictionary, so unless you need to update it you should not have to run any of the download scripts.

If you have problems installing, try our Google Colab walkthrough.

💡 Usage examples

You must first tokenise your input text using a tokeniser of your choice (NLTK, spaCy, etc.).

Then pass the resulting list of token strings to the find_drugs function.
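As a minimal sketch of the tokenisation step, assuming a simple regular-expression tokeniser is good enough for your text: the helper name simple_tokenise is hypothetical and is not part of the library. It uses only the Python standard library and produces the kind of list of strings that find_drugs expects.

```python
import re


def simple_tokenise(text):
    """Split text into word tokens and single punctuation tokens.

    Hypothetical helper for illustration; not part of
    drug_named_entity_recognition.
    """
    # \w+ matches a run of letters, digits or underscores;
    # [^\w\s] matches one punctuation character on its own.
    return re.findall(r"\w+|[^\w\s]", text)


tokens = simple_tokenise("i bought some Prednisone.")
print(tokens)  # ['i', 'bought', 'some', 'Prednisone', '.']
```

Unlike a plain split on spaces, this keeps the trailing full stop out of the drug name token, so "Prednisone." becomes the separate tokens "Prednisone" and ".". For real clinical text a dedicated tokeniser such as NLTK or spaCy is likely to be a better choice.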

Example 1

from drug_named_entity_recognition import find_drugs

find_drugs("i bought some Prednisone".split(" "))

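This outputs a list of tuples, one per drug found. In this example each tuple holds a dictionary of details about the drug followed by two integers, both 3 here, which match the position of "Prednisone" in the token list (counting from 0).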

[({'name': 'Prednisone', 'synonyms': {'Sone', 'Sterapred', 'Deltasone', 'Panafcort', 'Prednidib', 'Cortan', 'Rectodelt', 'Prednisone', 'Cutason', 'Meticorten', 'Panasol', 'Enkortolon', 'Ultracorten', 'Decortin', 'Orasone', 'Winpred', 'Dehydrocortisone', 'Dacortin', 'Cortancyl', 'Encorton', 'Encortone', 'Decortisyl', 'Kortancyl', 'Pronisone', 'Prednisona', 'Predniment', 'Prednisonum', 'Rayos'}, 'medline_plus_id': 'a601102', 'mesh_id': 'D018931', 'drugbank_id': 'DB00635'}, 3, 3)]

You can ignore case with:

find_drugs("i bought some prednisone".split(" "), is_ignore_case=True)
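The result from Example 1 can be unpacked as sketched here. This is an illustration only: the result is copied (abridged) from the output above rather than produced by calling the library, the helper summarise_matches is hypothetical, and reading the two integers as inclusive start and end token indices is an assumption based on that single example.

```python
# Token list and abridged result copied from Example 1.
tokens = "i bought some Prednisone".split(" ")
results = [
    ({"name": "Prednisone", "mesh_id": "D018931", "drugbank_id": "DB00635"}, 3, 3),
]


def summarise_matches(tokens, results):
    """Hypothetical helper: pair each match with the tokens it covers."""
    summary = []
    for info, start, end in results:
        # Assumption: start and end are inclusive token indices.
        matched_text = " ".join(tokens[start:end + 1])
        # .get() returns None if a match has no DrugBank identifier.
        summary.append((matched_text, info["name"], info.get("drugbank_id")))
    return summary


print(summarise_matches(tokens, results))
# [('Prednisone', 'Prednisone', 'DB00635')]
```

Keeping the matched text next to the canonical name is useful when the text contains a synonym, since the two may then differ.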

Keywords

Data Science

Medical text

Natural Language Processing

pharmaceuticals

Programming languages

Jupyter Notebook 67%
Python 33%

License

MIT

Contributors

Thomas A Wood (contact person), Developer, Fast Data Science Ltd, ORCID: 0000-0001-8962-8571
