Collecting Your First Data

Step 1: Click the ‘Create Workflow’ Button

Step 2: Input ‘Workflow Name’

This is the name of your workflow, and how it will appear in the list of all the workflows that exist in your team.

Step 3: Set Criteria

This is where you tell the workflow what data to collect from the websites. If you have already created a set of criteria, you can select it from the dropdown menu titled ‘Criteria’.

Otherwise:

  1. Enter a name for your criteria in the ‘Criteria Name’ box.

  2. Enter a field name in the box labeled ‘field name’. This will be the name of a column in the collected data.

  3. Enter a description in the box labeled ‘description’. This should be a detailed description of exactly what data you want collected. For more information on how to write a good description, see this helpful guide.

  4. If the values in this field should always be unique, click the box labeled ‘unique identifier’.

  5. Press the add button on the right.

You can repeat these steps to add as many fields as you want.
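It can help to draft your fields before typing them into the form. The sketch below is only a planning aid, not part of ApeScrape: the `CriteriaField` class, the `check_fields` helper, and the example field names are all hypothetical, and the rule that field names must be distinct is an assumption rather than documented behavior.

```python
from dataclasses import dataclass


@dataclass
class CriteriaField:
    """One planned field (hypothetical structure, mirrors the form boxes)."""
    name: str                        # goes in the 'field name' box
    description: str                 # goes in the 'description' box
    unique_identifier: bool = False  # the 'unique identifier' checkbox


def check_fields(fields):
    """Return a list of problems found in a draft set of fields."""
    problems = []
    seen = set()
    for field in fields:
        if not field.name.strip():
            problems.append("a field has an empty name")
        elif field.name in seen:
            # Assumption: two columns should not share a name.
            problems.append(f"duplicate field name: {field.name}")
        seen.add(field.name)
        if not field.description.strip():
            problems.append(f"field {field.name!r} has no description")
    return problems


# Example draft with made-up field names.
draft = [
    CriteriaField(
        name="course_title",
        description="The full title of the degree programme as shown on the course page.",
        unique_identifier=True,
    ),
    CriteriaField(
        name="entry_requirements",
        description="The minimum grades or qualifications required for admission.",
    ),
]

print(check_fields(draft))
```

An empty list means the draft has no obvious gaps, and each entry can then be copied into the form one field at a time.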

Step 4: Choose Datasources

This is where you choose the websites from which you want to collect data. If you have already created a datasource, you can select it from the dropdown menu titled ‘Datasource’.

Otherwise:

  1. Enter a name for your datasource in the ‘Datasource Name’ box. This name refers to the whole set of websites you add here, so you can select the same combination again for a different workflow.

  2. Enter the name of the website from which you want to collect data, for example ‘Aberdeen University’ if you are collecting data from the University of Aberdeen website.

  3. Enter the domain from which data should be collected. Enter only the root domain, e.g., www.abdn.ac.uk rather than www.abdn.ac.uk/study/undergraduate/degree-programmes/527/G400/computing-science/.

  4. Press the add button on the right.

You can repeat this as many times as necessary.
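If you are starting from a full page address, the root domain can be extracted before you paste it into the form. This is a minimal sketch using Python's standard `urllib.parse` module. The `root_domain` helper is a hypothetical name and is not part of ApeScrape.

```python
from urllib.parse import urlparse


def root_domain(url: str) -> str:
    """Return only the host part of a URL, dropping scheme, path, and query."""
    # urlparse only recognises the host when a scheme or '//' is present,
    # so prefix '//' for inputs such as 'www.abdn.ac.uk/study/...'.
    if "//" not in url:
        url = "//" + url
    host = urlparse(url).hostname
    if host is None:
        raise ValueError(f"no host found in {url!r}")
    return host


print(root_domain(
    "www.abdn.ac.uk/study/undergraduate/degree-programmes/527/G400/computing-science/"
))
```

The helper keeps the `www.` prefix, matching the example in step 3, and works whether or not the address starts with `https://`.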

Step 5: Create Workflow

Click the button labeled ‘Create Workflow’. This will create the workflow using the criteria and datasources you just added.

Step 6: Navigate to Workflow

From here, you can either press ‘view workflows’ on the workflow creation success page, or click ‘workflows’ on the sidebar.

You can then click on the workflow you just created to view the page for this workflow.

Step 7: Run Workflow

On the workflow page, click the ‘Start New Run’ button in the top right corner. This will start a new run of data collection. ApeScrape will email you when the run is complete.

Step 8: View Your Data!

Congratulations, you just collected your first set of data!