Folder migration (recursion)
Overview
A very common problem faced by businesses is the mass migration of an entire company's folder structure (as well as the files themselves) into a new management system.
While this looks like a daunting task on the surface, Tray and the recursive method make it manageable, even if you do not know the intricacies of the file and folder structure in advance.
Please have a quick watch of our introductory video on recursion:
Import and run the pre-built workflows
We have prebuilt and exported these workflows for you so you can import them to your own account:
This example assumes that the root folder in your source Box directory contains only folders and no files.
Notes on importing
You will first need to create 3 empty workflows (using any trigger, as this will be updated upon upload), then import the above workflows one by one.
Assuming you have pre-existing Box and Drive authentications, you will need to set the authentications for your Box and Drive steps in each workflow. Note that you will also need to set the get Parent Folder (http-client-1) step to use your Box auth in both sub-workflows (the HTTP client is used here because the operations available in the Tray Box connector don't quite meet our needs).
Once imported you will need to manually set the correct workflows to be called for each 'Callable workflow' step as these dependencies are lost when importing separate workflows:
Your own version of these workflows will then be ready for testing.
We have also included a Deletion workflow, so that you as a user can re-run the use case provided with the knowledge that your data storage connectors will start "empty" on each run:
While this example will work 'out of the box' for transferring from Box to Google Drive, it should be fairly straightforward to re-purpose it to work the other way, or with other file storage applications such as Dropbox.
Workflows explained
This use case moves a complex folder structure from a Box account into a Google Drive account. The method below explains how this can be automated without knowing the complexity of the folder structure beforehand.
The workflows involved carry out the following tasks:
The Parent workflow loops through all files and folders within the root directory and sends them to the first sub-workflow for processing.
The first sub-workflow processes the file or folder sent from the parent workflow. Then:
if it is a file, it creates it (and its parent folder if it doesn't exist) and finishes
if it is a folder, it creates it, then loops through its contents and sends each item to the second sub-workflow for processing
The second sub-workflow processes the file or folder received from the first sub-workflow in exactly the same way. If it is processing a folder, it loops through each item and sends it back to the first sub-workflow.
In this way, the sub-workflows keep calling each other until all sub-folders and files have been created in Drive.
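The mutual calls described above can be sketched in a few lines of Python. This is only an illustrative model, not the exported Tray workflows: `Item`, `parent_workflow`, `sub_workflow_a`, `sub_workflow_b` and the `created` list are hypothetical names, the source tree is held in memory rather than read from Box, and "creating" an item simply records its destination path.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Item:
    """Hypothetical stand-in for a Box file or folder."""
    name: str
    is_folder: bool = False
    children: list["Item"] = field(default_factory=list)


# Destination paths "created" so far, in the order they were processed.
created: list[str] = []


def _process(item: Item, parent_path: str,
             next_workflow: Callable[[Item, str], None]) -> None:
    """Create the item, then hand each child of a folder to the other sub-workflow."""
    path = f"{parent_path}/{item.name}"
    created.append(path)  # stands in for the Drive create step
    if item.is_folder:
        for child in item.children:
            next_workflow(child, path)


def sub_workflow_a(item: Item, parent_path: str) -> None:
    # First sub-workflow: passes folder contents on to the second.
    _process(item, parent_path, sub_workflow_b)


def sub_workflow_b(item: Item, parent_path: str) -> None:
    # Second sub-workflow: identical, but passes contents back to the first.
    _process(item, parent_path, sub_workflow_a)


def parent_workflow(root: Item) -> None:
    # Parent workflow: sends every top-level item to the first sub-workflow.
    for item in root.children:
        sub_workflow_a(item, "")


root = Item("root", True, [
    Item("mainFolder1", True, [
        Item("a.txt"),
        Item("subMainFolder1.1", True, [Item("b.txt")]),
    ]),
    Item("mainFolder2", True),
])
parent_workflow(root)
```

Because each sub-workflow only ever hands items to the other one, the depth of the folder tree never needs to be known in advance.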
Parent Process (workflow 1):
Sub-workflows 2 & 3 - these are identical:
Key concepts
Lookups
You will notice that there are several steps throughout the workflows which refer to 'folder lookup'.
This is what is used to recreate the folder structure.
The workflows effectively build a 'lookup table' as they go, which maps the id of each Box folder to the id of the corresponding Google Drive folder:
Box (Key) | Drive (Value) |
---|---|
mainFolder1 Box id | mainFolder1 Drive id |
subMainFolder1.1 Box id | subMainFolder1.1 Drive id |
subMainFolder1.2 Box id | subMainFolder1.2 Drive id |
mainFolder2 Box id | mainFolder2 Drive id |
subMainFolder2.1 Box id | subMainFolder2.1 Drive id |
subMainFolder2.2 Box id | subMainFolder2.2 Drive id |
etc. | etc. |
These lookups are stored using account-level data storage, where the Key is the Box id and the Value is the Drive id.
The following section of the workflow illustrates how the lookup for a folder is generated:
A check is carried out to see if this folder has a parent folder (i.e. the Box parent folder id 'key' has a corresponding Drive folder id 'value' in Data Storage)
In this case it is 'True'
Create a new folder in Drive using the value from Data Storage for the Parent Folder (the folder Name is pulled from the workflow trigger)
A Folder lookup then has to be set for the folder just created:
You can see that:
The Key for this lookup is the Box id taken from the workflow trigger
The Value is the Drive id taken from the folder just created
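A minimal sketch of the lookup logic above, assuming a plain dictionary in place of account-level data storage. `create_drive_folder`, `migrate_folder`, the `drive-root` fallback and the generated ids are all hypothetical. In particular, falling back to a destination root when no parent lookup exists is an assumption of this sketch, since the section above only walks through the 'True' branch.

```python
import itertools

# Stand-in for account-level data storage: Box folder id -> Drive folder id.
data_storage: dict[str, str] = {}

# Records which Drive parent each new Drive folder was created under.
drive_parents: dict[str, str] = {}

_drive_ids = itertools.count(1)


def create_drive_folder(name: str, parent_drive_id: str) -> str:
    """Hypothetical stand-in for the Drive 'create folder' step."""
    drive_id = f"drive-{next(_drive_ids)}"
    drive_parents[drive_id] = parent_drive_id
    return drive_id


def migrate_folder(box_id: str, name: str, box_parent_id: str) -> str:
    # Check: does the Box parent id (key) have a Drive id (value) in storage?
    parent_drive_id = data_storage.get(box_parent_id)
    if parent_drive_id is None:
        # Assumption: with no parent lookup, create under a destination root.
        parent_drive_id = "drive-root"
    # Create the new folder in Drive under the looked-up parent.
    drive_id = create_drive_folder(name, parent_drive_id)
    # Set the folder lookup: key = Box id from the trigger, value = new Drive id.
    data_storage[box_id] = drive_id
    return drive_id


migrate_folder("box-1", "mainFolder1", "box-root")
migrate_folder("box-11", "subMainFolder1.1", "box-1")
```

After these two calls, `data_storage` holds one row of the lookup table per migrated folder, and the sub-folder has been created under the Drive folder that corresponds to its Box parent.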
Once a folder is created, its items must be looped through and passed to the other sub-workflow (the first sub-workflow passes items to the second, and the second passes them back to the first).
Preventing a 'Race Condition'
When the items in a folder are being looped through, you will also notice that the loop is delayed with an await Lookup step:
This uses the Await Get Value operation, which forces the loop to pause until a folder lookup has been created for this item.
Without this pause, a delay in creating the first item in a particular subfolder could let the second item create the new folder and its lookup a split second before the first item does. Because the first item has already gone down the 'no lookup exists' path, it will still create a new folder, and we will end up with a duplicate!
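The effect of pausing the loop can be modelled with threads. This is a sketch under stated assumptions, not a description of how Tray implements Await Get Value: `await_get_value`, `sub_workflow`, the polling interval and the timeout are hypothetical, and a dictionary guarded by a lock stands in for data storage.

```python
import threading
import time

# Stand-in for data storage, shared between the loop and the sub-workflows.
data_storage: dict[str, str] = {}
storage_lock = threading.Lock()


def await_get_value(key: str, timeout: float = 5.0,
                    poll_interval: float = 0.01) -> str:
    """Block until a value exists for key (assumed polling behaviour)."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        with storage_lock:
            value = data_storage.get(key)
        if value is not None:
            return value
        time.sleep(poll_interval)
    raise TimeoutError(f"no lookup for {key!r} after {timeout}s")


def sub_workflow(box_id: str, delay: float) -> None:
    """Hypothetical sub-workflow: creates the folder, then sets its lookup."""
    time.sleep(delay)  # simulates a slow create step
    with storage_lock:
        data_storage[box_id] = f"drive-{box_id}"


dispatch_order: list[str] = []
# The first item is deliberately slower than the second.
for box_id, delay in [("box-1", 0.05), ("box-2", 0.0)]:
    threading.Thread(target=sub_workflow, args=(box_id, delay)).start()
    # Pause the loop until this item's lookup exists before moving on.
    await_get_value(box_id)
    dispatch_order.append(box_id)
```

Because the loop does not move on to `box-2` until the lookup for `box-1` exists, the slower first item can no longer be overtaken by the second.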