A user profile discovery tool for ActivityPub social servers
In July 2019, the number of registered users across the fediverse reached three million. At the same time, each new user faces a discovery problem. On one hand, he must somehow find users to follow. On the other, he must somehow be found himself.
Even though most servers provide a local user directory to explore and a live feed to look at, and even though several global user directory services exist, a new user is very unlikely to be able to keep track of the profiles he has already seen. Instead, he is likely to end up opening the same profile more than once, wondering why it looks so familiar.
Enter fediscover, a profile discovery tool that seeks to automate this process.
You will need the following packages:
- Python 3.5 or later
- xdg-utils: (Optional) For opening profile URLs in your web browser
- xclip: (Optional) For copying profile URLs to the clipboard
Session and cache files generated by the program will be saved into
Seed the program by passing it a URL to work with, like so:
fediscover crawl URL
The URL can be anything you can get your hands on: an instance's profile directory, a profile URL of a "connected" user (someone who follows and is followed by at least a few other users), an opt-in user directory, such as one of the lists in Trunk, or any other web page that links to any user on any ActivityPub server. The program will crawl the URL and collect any profile URLs it can find. At this point, you can ask for a profile URL, like so:
Look at the profile behind the URL to decide whether the user is worth following. Repeat as necessary. You can be fairly confident that you will always have a new profile to look at, and will never need to look at the same profile again.
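To illustrate the idea of collecting profile links from a page and never handing out the same one twice, here is a minimal sketch. It is not fediscover's actual code: the names `find_profiles` and `ProfileQueue` are hypothetical, and the pattern assumes Mastodon-style profile URLs of the form `https://host/@name`, which not every ActivityPub server uses.

```python
import re
from collections import deque
from html.parser import HTMLParser

# Assumption: profile URLs look like https://host/@name (Mastodon style).
PROFILE_RE = re.compile(r"^https?://[^/\s]+/@[A-Za-z0-9_]+$")


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def find_profiles(html):
    """Return profile-looking links in page order, without duplicates."""
    collector = LinkCollector()
    collector.feed(html)
    found = []
    for link in collector.links:
        if PROFILE_RE.match(link) and link not in found:
            found.append(link)
    return found


class ProfileQueue:
    """Hand out each profile URL at most once (hypothetical helper)."""

    def __init__(self):
        self.seen = set()       # every URL ever queued
        self.pending = deque()  # URLs not yet shown to the user

    def add(self, urls):
        for url in urls:
            if url not in self.seen:
                self.seen.add(url)
                self.pending.append(url)

    def next(self):
        """Return the next unseen profile URL, or None if the queue is empty."""
        return self.pending.popleft() if self.pending else None
```

For the "never look at the same profile again" guarantee to hold between runs, the `seen` set would have to be written to disk and reloaded, which is presumably what the program's session files are for.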
You can speed things up by copying the profile URL to the clipboard with
fediscover next --copy or by opening it in your browser directly with
fediscover next --open.
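For the curious, this is roughly how a Python program can hand a URL to xclip or xdg-open. It is a sketch under assumptions, not the program's actual implementation: the function names are hypothetical, and the `run` parameter exists only so the helpers can be exercised without the external tools installed.

```python
import subprocess


def copy_command():
    # xclip reads the text to copy from standard input.
    return ["xclip", "-selection", "clipboard"]


def open_command(url):
    # xdg-open (part of xdg-utils) opens the URL in the preferred application.
    return ["xdg-open", url]


def copy_to_clipboard(url, run=subprocess.run):
    """Pipe the URL into xclip; raises CalledProcessError if xclip fails."""
    return run(copy_command(), input=url.encode("utf-8"), check=True)


def open_in_browser(url, run=subprocess.run):
    """Ask xdg-open to open the URL; raises CalledProcessError on failure."""
    return run(open_command(url), check=True)
```

Since both packages are optional, a real program would also want to catch `FileNotFoundError` and report that the tool is missing.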
You can blacklist some URL fragments from being crawled and displayed to you with
fediscover blacklist WORD.
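One plausible reading of "URL fragments" is a plain substring match against each URL. The sketch below assumes exactly that, including case-sensitive matching. The function names are hypothetical and the program's real matching rules may differ.

```python
def is_blacklisted(url, blacklist):
    """Return True if any blacklisted fragment occurs anywhere in the URL."""
    return any(word in url for word in blacklist)


def filter_urls(urls, blacklist):
    """Split URLs into those kept and a count of those skipped."""
    kept = []
    skipped = 0
    for url in urls:
        if is_blacklisted(url, blacklist):
            skipped += 1
        else:
            kept.append(url)
    return kept, skipped
```

With this approach, blacklisting a word such as a domain name would skip every URL on that domain, while a shorter fragment could match more URLs than intended.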
For a complete list of recognized command line options, run
This program is free software, released under the Apache License, Version 2.0. See the LICENSE file for more information.
The program's canonical project page resides at https://simonvolpert.com/fediscover/
I gratefully accept appreciation for my work in material form at bitcoincash:qrmc2w3emlhy36tuuy4p7wj6gvdtg3usnu0c4zyfwp.
git clone https://simonvolpert.com/fediscover/
- Remove obsolete comment Simon Volpert 8 months ago
- Unify reporting of blacklisted URLs Simon Volpert 8 months ago
- Remove "skip" option Simon Volpert 8 months ago
- Keep crawling on transient connection failure, retry failed URLs once Simon Volpert 8 months ago
- Update README Simon Volpert 8 months ago
- Keep crawling until there are users in the queue or the URL queue runs out Simon Volpert 8 months ago
- Add a "verbose" option to print every URL being processed Simon Volpert 8 months ago
- Add skipped URL count in the post-crawl report Simon Volpert 8 months ago
- Add blacklisting Simon Volpert 8 months ago
- Remove duplicate matches URLs from pages when scraping Simon Volpert 8 months ago