hyphe

A research-driven web-crawler aimed at building, curating and categorizing a corpus of web actors and the network graph of hyperlinks connecting them.

43 mentions
4 contributors
3308 commits
Last commit ≈ 1 month ago
361 stars
64 forks

Description

Hyphe is an open-source web-crawler that lets researchers build corpora of hyperlinked webpages about a specific topic (for instance, palm oil or coronavirus).

These webpages are selected by researchers and can be grouped into "webentities". A webentity can be a single page, a whole website, a subdomain, a part of a website, or any combination of these. Each webentity represents an actor of the issue at hand (for instance, a person or an organization).
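The grouping described above can be sketched in a few lines of Python. This is only an illustration under a simplifying assumption: a webentity is modelled as a named list of URL prefixes, and a page belongs to the webentity with the longest matching prefix. The names `WebEntity`, `normalize` and `find_entity` are hypothetical and are not part of Hyphe's API.

```python
from dataclasses import dataclass, field
from typing import List, Optional
from urllib.parse import urlsplit


def normalize(url: str) -> str:
    """Reduce a URL to 'host/path', ignoring scheme, a leading 'www.' and a trailing slash."""
    parts = urlsplit(url if "//" in url else "//" + url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    return host + parts.path.rstrip("/")


@dataclass
class WebEntity:
    """Hypothetical model: a named group of URL prefixes (pages, sites, subdomains...)."""
    name: str
    prefixes: List[str] = field(default_factory=list)

    def match_length(self, url: str) -> int:
        """Length of the longest prefix of this entity that covers `url` (0 if none)."""
        target = normalize(url)
        best = 0
        for prefix in self.prefixes:
            p = normalize(prefix)
            # Match only on whole path segments, so 'example.org' does not
            # accidentally cover 'example.organic.com'.
            if target == p or target.startswith(p + "/"):
                best = max(best, len(p))
        return best


def find_entity(url: str, entities: List[WebEntity]) -> Optional[WebEntity]:
    """Return the entity with the most specific matching prefix, or None."""
    best_entity, best_len = None, 0
    for entity in entities:
        n = entity.match_length(url)
        if n > best_len:
            best_entity, best_len = entity, n
    return best_entity


# Usage: a whole site, and a second entity combining a site section and a subdomain.
site = WebEntity("Example", ["https://example.org"])
blog = WebEntity("Example blog", ["https://example.org/blog", "https://blog.example.org"])
owner = find_entity("http://www.example.org/blog/post-1", [site, blog])
print(owner.name if owner else None)
```

The longest-prefix rule is one possible design choice: it lets a section of a website be split off as its own actor while the rest of the site stays in the parent webentity.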

By crawling them, Hyphe iteratively builds, and helps visualize, a network graph of the relationships between these actors, based on the hyperlinks connecting the webentities.

New webentities are automatically suggested as they are discovered while crawling each entity's hyperlinks, and researchers can then review them in an iterative and qualitative process.
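As a rough illustration of this loop, the sketch below aggregates page-level hyperlinks into links between webentities and counts the hosts that are linked to but not yet in the corpus, which a researcher could then review. It assumes, for simplicity, that each webentity is identified by a single host name; the function names and the sample data are hypothetical and do not reflect Hyphe's actual implementation.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple
from urllib.parse import urlsplit


def host_of(url: str) -> str:
    """Lower-cased host name of a URL, without a leading 'www.'."""
    host = urlsplit(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


def entity_graph(
    page_links: Iterable[Tuple[str, str]],
    entity_of_host: Dict[str, str],
) -> Tuple[Counter, Counter]:
    """Aggregate (source_url, target_url) pairs found while crawling.

    Returns:
      edges:       Counter of (source_entity, target_entity) -> number of hyperlinks
      suggestions: Counter of unknown target host -> number of hyperlinks pointing to it
    """
    edges: Counter = Counter()
    suggestions: Counter = Counter()
    for source_url, target_url in page_links:
        src = entity_of_host.get(host_of(source_url))
        if src is None:
            continue  # only pages of entities already in the corpus are crawled
        target_host = host_of(target_url)
        dst = entity_of_host.get(target_host)
        if dst is None:
            suggestions[target_host] += 1  # candidate webentity to review
        elif dst != src:
            edges[(src, dst)] += 1  # ignore links internal to one entity
    return edges, suggestions


# Usage with made-up crawl results.
corpus = {"a.org": "A", "b.org": "B"}
links = [
    ("https://a.org/p1", "https://b.org/x"),
    ("https://a.org/p2", "https://www.b.org/y"),
    ("https://a.org/p1", "https://a.org/p2"),
    ("https://b.org/x", "https://c.net/z"),
    ("https://a.org/p1", "https://c.net/"),
]
edges, suggestions = entity_graph(links, corpus)
print(dict(edges))
print(suggestions.most_common())
```

Once a suggested host is accepted into the corpus and crawled, the same aggregation can be run again, which is what makes the process iterative.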

Because it lets researchers manually choose, and then tag, which actors they want to add to their corpus, Hyphe should be considered a quali-quantitative tool.

Keywords
Programming languages
  • JavaScript 38%
  • Python 31%
  • HTML 26%
  • CSS 3%
  • Shell 2%
License
Source code
Packages
hub.docker.com

Participating organisations

médialab - Sciences Po
