- Share R compute among friends across the world
⚠️ Security ⚠️
Important warning: Please note that there is nothing preventing a user in your P2P cluster from sending malicious R code to your P2P worker!
For example, a P2P user may submit a future that erases all files on the P2P worker, or a future that attempts to read non-encrypted secret files of yours.
Because of this, it is important that you only join shared P2P clusters that you trust, i.e. where you trust all the P2P users as well as the user who hosts the cluster not to invite non-trusted or unknown users.
There are mechanisms for launching P2P workers in sandboxed environments. For instance, by running P2P workers in a sandboxed virtual machine (VM), in a sandboxed Linux container (e.g. Apptainer, Docker, and Podman), or via dedicated sandboxing tools (e.g. Bubblewrap, Firejail, and macOS sandbox-exec), you can mitigate some of the risk of malicious code accessing the host machine where your personal data lives.
Installation
install.packages('future.p2p', repos = c('https://futureverse.r-universe.dev', 'https://cloud.r-project.org'))
Getting started
In order to join a future P2P cluster, you must:
have an SSH key pair configured, and
have a pico.sh account.
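The first prerequisite can be checked from within R. The following is a minimal sketch that looks for a public key under the conventional OpenSSH file names in ~/.ssh; the file names are an assumption, and a key stored elsewhere or under another name will not be detected.

```r
# Conventional OpenSSH public-key file names (assumption: default location and names)
ssh_dir <- path.expand("~/.ssh")
candidates <- file.path(ssh_dir, c("id_ed25519.pub", "id_ecdsa.pub", "id_rsa.pub"))

# Keep only the candidate files that actually exist on this machine
found <- candidates[file.exists(candidates)]

if (length(found) > 0) {
  message("Found public SSH key(s): ", paste(found, collapse = ", "))
} else {
  message("No public SSH key found under ", ssh_dir, "; you may need to create a key pair first")
}
```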
See the ‘Getting Started’ vignette for how to set this up. In short, if you don’t already have an SSH key pair, create one with ssh-keygen.
With the key pair in place, create a pico.sh account by logging into their server with ssh pico.sh.
Choose your pico.sh username, which will also be your P2P cluster username, and press ENTER. Finally, verify that you have SSH access to pipe.pico.sh (note: ‘pipe’, not ‘pico’).
That’s it!
Set up a shared P2P cluster
Let’s assume P2P users ‘alice’, ‘bob’, ‘carol’, and ‘diana’ decide to share a P2P cluster, and user ‘alice’ agrees to host it. Hosting a P2P cluster only means that you control who has access; it adds no extra load. So, to host, ‘alice’ calls:
A future P2P cluster can be hosted from anywhere in the world, and it does not have to be on a machine where you run your own R analysis.
Parallelize via P2P cluster (all users)
Any user with access to the ‘alice/friends’ cluster can use it. In our example, this means ‘bob’, ‘carol’, ‘diana’, and ‘alice’ may use the P2P cluster at the same time. Just like with any other future backend, we use plan() to specify that we want to parallelize via the P2P cluster.
For example,
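A minimal sketch of the workflow follows. So that it runs anywhere, it uses the built-in sequential backend of the future package as a stand-in; the commented-out plan() line shows how the P2P backend might be selected, but its exact form is an assumption here and should be checked against the package documentation. The rest of the code is the same whichever backend is active.

```r
library(future)

# Stand-in backend so that this sketch runs without a P2P cluster.
# To use the shared cluster, replace it with the P2P backend, for example
# (assumed form, please verify against the future.p2p documentation):
#   plan(future.p2p::cluster, cluster = "alice/friends")
plan(sequential)

# Create a future; with a P2P plan, it would be evaluated on a peer's worker
f <- future(Sys.getpid())

# Wait for the future to resolve and collect its value
v <- value(f)
print(v)
```

With a P2P plan in place, the printed process ID would be that of the R process on whichever P2P worker picked up the future, rather than that of your own R session.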
Share your compute power with your friends (any user)
Without parallel workers, the P2P cluster is useless and will not process any parallel tasks. This is where the peer-to-peer concept comes in, where we contribute our idle compute cycles to the cluster for others to make use of. To contribute your R compute power to the alice/friends cluster, launch a P2P worker as:
This will contribute one parallel worker to the P2P cluster. You can contribute additional workers by repeating the same command one or more times.
Appendix
Connecting to the same pico.sh account from different machines
If you have multiple computers, you can add your public SSH keys for those as well by logging in again by calling ssh pico.sh. Then go to the pubkeys menu, where you have options to add additional public SSH keys of yours. This way, you can use your pico.sh account from multiple computer systems, which can be handy if you want to set up parallel workers on one system and harness their compute power from another.
Troubleshoot Wormhole
If you are behind a firewall with a proxy, wormhole might fail to establish an outbound connection. For example, if you try:
> system2(future.p2p::find_wormhole(), args = c("send", "--text", "hello"))
it might stall forever. If that happens, press Ctrl-C to interrupt and retry by disabling the proxy settings using:
> Sys.unsetenv("http_proxy")
> system2(future.p2p::find_wormhole(), args = c("send", "--text", "hello"))
On the other computer, please run: wormhole receive (or wormhole-william recv)
Wormhole code is: 53-visitor-physique
If the latter works for you, launch R with the environment variable http_proxy unset.
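If you would rather not change how R is launched, the proxy setting can also be dropped temporarily from within a running session. The helper below, with_proxy_unset(), is a hypothetical name introduced for illustration and is not part of future.p2p; it is a sketch that unsets the given environment variables, evaluates an expression, and then restores the previous values.

```r
# Hypothetical helper (not part of future.p2p): evaluate 'expr' with the
# given environment variables unset, then restore their previous values.
with_proxy_unset <- function(expr, vars = "http_proxy") {
  # Record current values; NA marks variables that were not set
  old <- Sys.getenv(vars, unset = NA, names = TRUE)
  on.exit({
    was_set <- !is.na(old)
    if (any(was_set)) do.call(Sys.setenv, as.list(old[was_set]))
  }, add = TRUE)

  Sys.unsetenv(vars)

  # 'expr' is lazily evaluated, so it only runs here, after the unset
  force(expr)
}

# Usage (commented out because it requires future.p2p and a wormhole binary):
# with_proxy_unset(
#   system2(future.p2p::find_wormhole(), args = c("send", "--text", "hello"))
# )
```

Other proxy-related variables can be passed via the vars argument if your environment sets them.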