Data sharing from Comet @SDSC using Globus CLI
Consider a scenario where data from the Comet cluster at SDSC needs to be shared with an arbitrary end user who does not have an account on Comet. While this kind of sharing can be accomplished manually via the Globus web app, this article discusses a scriptable method that can be further utilized in existing workflows or Science Gateways. For sharing we have three cases:
- The end user does not have a Globus account. (This is an easy case: we can provision a Globus account for the user by email, which sends an account invitation to that address with instructions to follow.)
- The end user has a Globus account and we know the email that is linked to their Globus identity. (No problems for this case.)
- The end user has a Globus account, but we only know another email that is not linked to their Globus identity. (In this case we can provision a Globus identity for the email we have and create a URL that prompts the user to associate this new identity with their original Globus account, if any. This only needs to be done once for each email a user wishes to associate with their Globus identity.)
Note: These instructions are ONLY for sharing data from the Comet cluster at SDSC with other end users. If you only wish to access your own data on Comet via Globus, you don't need anything special and this method does not apply.
- Comet cluster @ SDSC
- Python 2.7.x on Comet
- Globus CLI v1.2.0
Comet setup requirements
- Sign in to your Globus account (create an account via your XSEDE or institution credentials)
- Install the Globus command line interface tool on Comet as follows (a more elaborate installation using virtual-env is provided here).
    module load python
    pip install globus_cli
    # or, to install into your home directory instead:
    pip install globus_cli --user
    export PATH=~/.local/bin:$PATH  # add this to your path
Login setup for Globus CLI
We need to bind our command line interface (CLI) to our Globus account. This requires a one-time setup that authorizes the Globus CLI client on the Comet cluster to use our identity. This can be accomplished as follows:
- Log in to your Globus account in a web browser
- Issue the ‘globus login’ command on the Comet cluster, which responds with a long URL and a prompt to enter a code. This code needs to be fetched from a web browser, so copy and paste the URL into a web browser, follow the directions, then paste the code you receive into the command line prompt. Essentially this step binds the CLI to your account. This step is needed only once, as long as at least one CLI command is issued in any six-month period. See discussion about this here
Find Globus end point for XSEDE Comet cluster
Identify the endpoint for the Comet cluster using the following command:
    globus endpoint search "XSEDE Comet"
    #Sample output
    #ID: de463f97-6d04-11e5-ba46-22000b92c6ec
    #Owner: email@example.com
    #Display Name: XSEDE Comet
Note: Make note of Comet's endpoint ID de463f97-6d04-11e5-ba46-22000b92c6ec. This will be used subsequently to create our own new endpoint.
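In a scripted workflow the endpoint ID has to be pulled out of the search result rather than copied by hand. The following is a minimal sketch, assuming the JSON form of the search output (`-F json`) holds a `DATA` list whose entries carry `id` and `display_name` keys. The sample document is hypothetical and much shorter than real output, and `find_endpoint_id` is an illustrative helper, not part of the Globus CLI.

```python
import json

# Hypothetical, trimmed-down sample of what
#   globus endpoint search -F json "XSEDE Comet"
# might return. The field names are an assumption; check them against real output.
SAMPLE_OUTPUT = """
{
  "DATA": [
    {"id": "de463f97-6d04-11e5-ba46-22000b92c6ec",
     "display_name": "XSEDE Comet",
     "owner_string": "email@example.com"},
    {"id": "656d277c-56d6-11e7-befe-22000b9a448b",
     "display_name": "Demo endpoint for sharing on Comet",
     "owner_string": "someone@example.com"}
  ]
}
"""


def find_endpoint_id(search_json, display_name):
    """Return the ID of the single endpoint whose display name matches exactly."""
    results = json.loads(search_json)["DATA"]
    matches = [ep["id"] for ep in results if ep.get("display_name") == display_name]
    if len(matches) != 1:
        # Refuse to guess when the search is ambiguous or empty.
        raise ValueError("expected exactly one match for %r, found %d"
                         % (display_name, len(matches)))
    return matches[0]


comet_id = find_endpoint_id(SAMPLE_OUTPUT, "XSEDE Comet")
print(comet_id)
```

Matching on the exact display name and failing on anything other than one hit is a deliberate choice here, since a search for "XSEDE Comet" can also return shared endpoints whose names merely contain those words.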
Create an endpoint for your sharing needs on Comet
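Data sharing from Comet can only be performed from a specific location, i.e. /oasis/projects/nsf/$project_name/$username/shared/ (with your own project name and username substituted).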
You may create sub folders at this location and share them with others as needed.
Note: If you deviate from this root path, sharing won't work on Comet.
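Because sharing silently fails outside the root path, a workflow may want to check a candidate folder before creating an endpoint on it. This is a small sketch that assumes the root has the form /oasis/projects/nsf/<project>/<username>/shared, as given in the summary script at the end of this article. The helper names are hypothetical, and the check is purely textual (it does not resolve symlinks).

```python
import posixpath


def sharing_root(project_name, username):
    """Build the only location on Comet from which sharing is expected to work."""
    return posixpath.join("/oasis/projects/nsf", project_name, username, "shared")


def is_shareable(path, project_name, username):
    """Return True if path is the sharing root itself or lies underneath it."""
    root = sharing_root(project_name, username)
    normalized = posixpath.normpath(path)  # collapses "..", "." and trailing slashes
    return normalized == root or normalized.startswith(root + "/")


print(is_shareable("/oasis/projects/nsf/sds165/amit/shared/demoshare", "sds165", "amit"))
print(is_shareable("/oasis/projects/nsf/sds165/amit/private", "sds165", "amit"))
```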
Here we will set up sharing at this path /oasis/projects/nsf/sds165/amit/shared/demoshare
    cd /oasis/projects/nsf/sds165/amit/shared/
    mkdir demoshare
    cd demoshare
    pwd
    #Create end point
    globus endpoint create \
        --shared de463f97-6d04-11e5-ba46-22000b92c6ec:/oasis/projects/nsf/sds165/amit/shared/demoshare 'Demo endpoint for sharing on Comet' \
        --description 'Example of an endpoint for sharing purpose on Comet'
    #Sample output
    #Message: Shared endpoint created successfully
    #Endpoint ID: 656d277c-56d6-11e7-befe-22000b9a448b
Make a note of this newly created endpoint 656d277c-56d6-11e7-befe-22000b9a448b. This will be used later to share subfolders at this location with arbitrary users via email.
Activate your endpoint
There are different (arcane/legacy) ways to activate an endpoint, as documented here. The easiest is to issue the following command, which uses the web method: simply paste the URL returned by this command into a web browser and follow the corresponding instructions. Note: If you have recently activated an endpoint on the same resource, activation is automatic and does not require human intervention.
    globus endpoint activate --web 656d277c-56d6-11e7-befe-22000b9a448b
    #Sample output
    #Web activation url: https://www.globus.org/app/endpoints/656d277c-56d6-11e7-befe-22000b9a448b/activate
Data sharing setup
We can share this location or subfolders inside this path with arbitrary users as desired. Let's create a few subfolders and files:
    mkdir /oasis/projects/nsf/sds165/amit/shared/demoshare/foo
    touch /oasis/projects/nsf/sds165/amit/shared/demoshare/foo/123.txt
    touch /oasis/projects/nsf/sds165/amit/shared/demoshare/foo/abc.txt
Share folder with arbitrary user
Share the foo folder with a user at firstname.lastname@example.org with read access
    # use --identity instead of --provision-identity if user already has a globus account
    globus endpoint permission create \
        --permissions r "656d277c-56d6-11e7-befe-22000b9a448b:/foo/" \
        --provision-identity email@example.com
    #Sample output
    #Message: Access rule created successfully.
    #Rule ID: f6b52a08-56d7-11e7-befe-22000b9a448b
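The choice between --identity and --provision-identity follows from the three cases listed at the start of this article. A gateway could encode it roughly as in the sketch here, which only assembles the argument list. The function name and the `email_is_linked` parameter are hypothetical, and the flags are the two used in this article.

```python
def permission_command(endpoint_id, folder, email, email_is_linked, permissions="r"):
    """Assemble the `globus endpoint permission create` call for one sharee.

    email_is_linked: True only when the email is known to be linked to an
    existing Globus identity (case 2); otherwise the identity is provisioned
    (cases 1 and 3).
    """
    flag = "--identity" if email_is_linked else "--provision-identity"
    stripped = folder.strip("/")
    # Paths are relative to the shared endpoint and end in a slash.
    path = "/" + stripped + "/" if stripped else "/"
    return ["globus", "endpoint", "permission", "create",
            "--permissions", permissions,
            "%s:%s" % (endpoint_id, path),
            flag, email]


cmd = permission_command("656d277c-56d6-11e7-befe-22000b9a448b", "foo",
                         "email@example.com", email_is_linked=False)
print(" ".join(cmd))
```

The resulting list can be handed to `subprocess.check_call(cmd)` on a machine where the CLI is installed and logged in. Keeping it as a list avoids shell quoting problems with unusual folder names.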
Create share URL
Globus CLI does not generate the share URL, but it can be crafted as follows (see discussion).
Base url: https://www.globus.org/app/transfer?
Query string parameters
- origin_id=YOUR END POINT ID i.e. 656d277c-56d6-11e7-befe-22000b9a448b
- origin_path=RELATIVE PATH FROM SHARED ENDPOINT i.e. /foo/ as %2Ffoo%2F Note: / (slash) must be percent-encoded as %2F
- add_identity=UID OF SHAREE i.e. the ID returned by globus get-identities firstname.lastname@example.org
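Assembling the URL by hand is error-prone once folder names contain characters other than letters and slashes. Here is a minimal sketch that percent-encodes each parameter with the standard library. `build_share_url` is a hypothetical helper, and the identity ID in the example is a made-up placeholder, not a real Globus identity.

```python
try:
    from urllib.parse import quote  # Python 3
except ImportError:
    from urllib import quote  # Python 2.7, as available on Comet

BASE_URL = "https://www.globus.org/app/transfer"


def build_share_url(endpoint_id, relative_path, identity_id=None):
    """Build the web app URL that opens relative_path on the shared endpoint."""
    params = [("origin_id", endpoint_id), ("origin_path", relative_path)]
    if identity_id:
        # Only needed when the sharee must link a newly provisioned identity.
        params.append(("add_identity", identity_id))
    # safe="" makes quote() encode "/" as %2F as well.
    query = "&".join("%s=%s" % (key, quote(value, safe="")) for key, value in params)
    return BASE_URL + "?" + query


url = build_share_url("656d277c-56d6-11e7-befe-22000b9a448b", "/foo/",
                      "00000000-0000-0000-0000-000000000000")  # placeholder UID
print(url)
```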
Show all shared end points for demoshare on Comet
globus endpoint my-shared-endpoint-list de463f97-6d04-11e5-ba46-22000b92c6ec
Globus imposes various limits on the number of endpoints, ACLs, etc. These limits can be found here; the key ones that may affect sharing are:
- 1,000 endpoints owned by a single user - this total includes both host endpoints and shared endpoints owned by the user. (This likely won't be an issue on Comet)
- 100 effective ACLs per user on an endpoint (This could be a major issue, as Globus sharing will break for this user only, but not for others. As of this writing there is no fix other than trimming the ACLs for the affected user to fewer than 100)
- 1,000 total ACLs per endpoint (This could be an issue for Gateways with large number of users or if different sharing combinations of users need access to several output locations)
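A gateway that creates many access rules could watch for these limits before they bite. The sketch here counts rules per user and per endpoint from a list of rule records. It assumes each record has a `principal` key, roughly as a JSON listing of the endpoint's permissions might provide, and it only counts direct rules, which may undercount what Globus treats as "effective" ACLs. The helper name and the 90% warning threshold are arbitrary choices.

```python
from collections import Counter

MAX_RULES_PER_USER = 100       # limits quoted in the list above
MAX_RULES_PER_ENDPOINT = 1000


def acl_warnings(rules, warn_fraction=0.9):
    """Return messages for every limit that is at or above warn_fraction usage."""
    warnings = []
    if len(rules) >= warn_fraction * MAX_RULES_PER_ENDPOINT:
        warnings.append("endpoint has %d of %d rules"
                        % (len(rules), MAX_RULES_PER_ENDPOINT))
    per_user = Counter(rule["principal"] for rule in rules)
    for principal, count in sorted(per_user.items()):
        if count >= warn_fraction * MAX_RULES_PER_USER:
            warnings.append("%s has %d of %d rules"
                            % (principal, count, MAX_RULES_PER_USER))
    return warnings


# Hypothetical rule records: 95 folders shared with user-a, one with user-b.
rules = [{"principal": "user-a", "path": "/run%d/" % i} for i in range(95)]
rules += [{"principal": "user-b", "path": "/foo/"}]
for message in acl_warnings(rules):
    print(message)
```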
Summary
In summary, there are three main steps: the first two require manual intervention once, while the third is fully automatic.
Authorize globus cli
    #!/bin/bash
    module load python
    # Authorize globus cli
    globus login
Create and activate your endpoint
    #!/bin/bash
    # Change the following
    #------------------------------#
    project_name="sds165"
    username="amit"
    my_endpoint_folder="perapera2"
    share_folder="foo"
    sharee_email="email@example.com"
    #------------------------------#

    # Set up share location
    # Sharing location for Comet : /oasis/projects/nsf/$project_name/$username/shared/
    # NOTE: Sharing from other locations won't work
    # Shared folder will be: /oasis/projects/nsf/$project_name/$username/shared/$my_endpoint_folder/$share_folder
    cd /oasis/projects/nsf/$project_name
    mkdir -p "$username/shared/$my_endpoint_folder"   # the "shared" level is required
    cd "$username/shared/$my_endpoint_folder"
    my_endpoint_path=`pwd`   # Save location for my endpoint (its root, not the shared subfolder)
    mkdir -p "$share_folder"
    touch "$share_folder/123.txt" "$share_folder/abc.txt"

    # Identify Comet's endpoint (DATA is a list of matches; take the first one)
    comet_endpoint=`globus endpoint search -F json "XSEDE Comet" \
        --filter-owner-id "firstname.lastname@example.org" \
        | python -c 'import json, sys; obj=json.load(sys.stdin); print(obj["DATA"][0]["id"])'`
    echo $comet_endpoint

    # Create an endpoint for sharing
    result=`globus endpoint create --shared $comet_endpoint:$my_endpoint_path \
        'Demo endpoint for sharing on Comet' \
        --description 'Example of an endpoint for sharing purpose on Comet' \
        -F json`
    echo $result

    # Extract this newly created endpoint
    my_endpoint=`echo $result | python -c 'import json, sys; obj=json.load(sys.stdin); print(obj["id"])'`
    echo "My endpoint="$my_endpoint

    echo "Now activate my endpoint"
    globus endpoint activate --web $my_endpoint
Share a folder with arbitrary user at your endpoint
    #!/bin/bash
    # Assumes $my_endpoint, $share_folder and $sharee_email are set as in the previous script
    # Share the foo folder with a user at $sharee_email with read access
    # Use --identity instead of --provision-identity if the user already has globus account
    globus endpoint permission create --permissions r "$my_endpoint:/$share_folder/" \
        --provision-identity $sharee_email

    # Create share url
    base_url="https://www.globus.org/app/transfer?"
    origin_id=$my_endpoint
    origin_path=%2F$share_folder%2F   # / (slash) must be percent-encoded as %2F
    add_identity=`globus get-identities $sharee_email`
    share_url="${base_url}origin_id=$origin_id&origin_path=$origin_path&add_identity=$add_identity"
    echo "Share URL="$share_url