
hyphe

A research-driven web-crawler aimed at building, curating and categorizing a corpus of web actors and the network graph of hyperlinks connecting them.

43 mentions, 4 contributors, 3308 commits, last commit ≈ 3 weeks ago, 358 stars, 64 forks

Cite this software

DOI: 10.5281/zenodo.6078923

Description

Hyphe is an open-source web crawler that lets researchers build corpora of hyperlinked webpages about a specific topic (for instance, palm oil or coronavirus).

These webpages are selected by researchers and can be grouped into "webentities". A webentity can be a single page, a whole website, a subdomain, part of a website, or a combination of these. Each webentity represents an actor involved in the issue at hand (for instance, a person or an organization).


By crawling these webentities, Hyphe iteratively builds, and helps visualize, a network graph of the relationships between the actors, based on the hyperlinks connecting their webentities.
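The aggregation described above can be illustrated with a minimal sketch. It is not Hyphe's actual implementation or API: the webentity names, the URL prefixes, the `normalize`, `entity_for` and `entity_graph` helpers, and the longest-prefix matching rule are all assumptions made for illustration. It maps each crawled page to a webentity and collapses page-to-page hyperlinks into weighted entity-to-entity edges.

```python
from collections import Counter
from urllib.parse import urlsplit

# Hypothetical webentity definitions: name -> list of "host/path" prefixes.
WEBENTITIES = {
    "Org A": ["orga.example/"],
    "Org A blog": ["blog.orga.example/"],
    "Person B": ["platform.example/users/b/"],
}


def normalize(url):
    """Reduce a URL to 'host/path', dropping scheme, 'www.', query and fragment."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if host.startswith("www."):
        host = host[4:]
    return host + (parts.path or "/")


def entity_for(url, entities=WEBENTITIES):
    """Return the webentity with the longest matching prefix, or None."""
    key = normalize(url)
    best_name, best_len = None, -1
    for name, prefixes in entities.items():
        for prefix in prefixes:
            if key.startswith(prefix) and len(prefix) > best_len:
                best_name, best_len = name, len(prefix)
    return best_name


def entity_graph(page_links, entities=WEBENTITIES):
    """Aggregate page-to-page links into weighted entity-to-entity edges."""
    edges = Counter()
    for source, target in page_links:
        src = entity_for(source, entities)
        dst = entity_for(target, entities)
        # Keep only links between two distinct, known webentities.
        if src and dst and src != dst:
            edges[(src, dst)] += 1
    return edges


links = [
    ("https://www.orga.example/about", "https://platform.example/users/b/"),
    ("https://orga.example/news", "https://platform.example/users/b/posts"),
    ("https://blog.orga.example/post-1", "https://orga.example/"),
    ("https://orga.example/a", "https://orga.example/b"),  # internal link, ignored
]
graph = entity_graph(links)
for (src, dst), weight in sorted(graph.items()):
    print(f"{src} -> {dst} (weight {weight})")
```

In this sketch a subdomain and a profile path on a shared platform are treated as separate webentities, which mirrors the idea that a webentity need not coincide with a whole website.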

New webentities are automatically suggested as they are discovered by crawling each entity's hyperlinks, and researchers can then review them in an iterative and qualitative process.
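As a rough illustration of this discovery step, the sketch below collects link targets that fall outside the current corpus and lists them as candidates for review. It is a simplified assumption, not Hyphe's documented behavior: the `host_of` and `suggest_candidates` helpers are hypothetical, candidates are grouped by host name only, and ranking them by the number of corpus entities that cite them is an illustrative choice.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def host_of(url):
    """Return the host of a URL without a leading 'www.'."""
    host = urlsplit(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def suggest_candidates(crawled_links, corpus_hosts):
    """Rank hosts outside the corpus by how many corpus entities link to them.

    crawled_links: iterable of (source_entity_name, target_url) pairs.
    corpus_hosts: set of hosts already covered by the corpus.
    """
    cited_by = defaultdict(set)
    for source_entity, target_url in crawled_links:
        host = host_of(target_url)
        if host and host not in corpus_hosts:
            cited_by[host].add(source_entity)
    # Most-cited candidates first; ties are broken alphabetically.
    return sorted(cited_by.items(), key=lambda item: (-len(item[1]), item[0]))


corpus_hosts = {"orga.example", "orgb.example"}
crawled = [
    ("Org A", "https://www.newsite.example/report"),
    ("Org B", "https://newsite.example/"),
    ("Org A", "https://other.example/x"),
    ("Org A", "https://orgb.example/"),  # already in the corpus, skipped
]
suggestions = suggest_candidates(crawled, corpus_hosts)
for host, sources in suggestions:
    print(host, "cited by", sorted(sources))
```

The output is only a list of suggestions: whether a candidate is added to the corpus remains the researcher's decision.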


Because it lets researchers manually choose, and then tag, the actors they want to add to their corpus, Hyphe should be considered a quali-quantitative tool.


Keywords
crawling
Networks
python
web

Participating organisations

médialab - Sciences Po

