Create a Data Scan Job

To create a Data Scan Job, you will need to:

  1. Go to "Data Scan Jobs" and Create a New Job
  2. Fill out the Form
    a) Describe Your Data Scan Job
    b) Schedule a Job
    c) Choose Data Collections and Data Classes to Scan
    d) Set Scan Options
    e) Select S3 Buckets to Scan
  3. Save and Confirm Your Scan Job

Let's get started.

Step 1. Go to "Data Scan Jobs" and Create a New Job

In "Scans," click "Data Scan Jobs." You will see all the Data Scan Jobs that you've previously created. To create a new Job, click New Data Scan Job.

Step 2. Fill out the Form

A form with all the required fields for the new Scan Job will appear.

a) Describe Your Data Scan Job

Begin by defining a Job Name, Description, and Status.

Keep the Job Description short; use it to add detail about what the Job does.

The Status of a Job determines whether it runs. Enabled Jobs run on their configured schedule; Disabled Jobs do not run.

You can always disable the Job after you save it, either from the previous page ("Data Scan Jobs") or when you revisit and edit this Data Scan Job.

b) Schedule a Job

Use the dropdown menu to schedule your Data Scan Job.

Jobs that are scheduled to Run Once will be deactivated as soon as the data scan is finished.
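
The interaction between Status and schedule can be pictured with a small model. This is only an illustrative sketch, not an Open Raven API: the ScanJob class and its fields are hypothetical, and it assumes that "deactivated" means the Status becomes Disabled.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    ENABLED = "Enabled"
    DISABLED = "Disabled"


@dataclass
class ScanJob:
    # Hypothetical local model of a Data Scan Job; not an Open Raven object.
    name: str
    schedule: str  # e.g. "Run Once"; other schedule names are not listed in this guide
    status: Status = Status.ENABLED

    def should_run(self) -> bool:
        # Only Enabled Jobs run on their configured schedule.
        return self.status is Status.ENABLED

    def on_scan_finished(self) -> None:
        # A Run Once Job is deactivated as soon as its scan finishes.
        # Assumption: "deactivated" means the Status becomes Disabled.
        if self.schedule == "Run Once":
            self.status = Status.DISABLED


job = ScanJob(name="one-off audit", schedule="Run Once")
job.on_scan_finished()
print(job.should_run())  # False
```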

c) Choose Data Collections and Data Classes to Scan

Next, use the dropdown menu to select the data you want Open Raven to look for when it runs your Data Scan Job. You can pick as many Data Collections as you want.

When you select your Data Collections, the Data Classes that belong to them automatically appear as UI chips below.

d) Set Scan Options

In Scan Options, you can:

  • Pick file types you want to exclude.
  • Choose what percentage of each S3 bucket you want to scan.
  • Enter a regular expression to exclude any files that match it.
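
To preview which files the exclusion options might skip, you can try them against a few sample object keys first. The sketch below is a rough approximation under stated assumptions: is_excluded is a hypothetical helper, file types are approximated by file extension, and the pattern is allowed to match anywhere in the key. This guide does not specify the regular expression dialect or matching rules that Open Raven applies, so treat the result as a sanity check only.

```python
import re
from pathlib import PurePosixPath
from typing import Optional, Set


def is_excluded(key: str, excluded_extensions: Set[str], exclude_pattern: Optional[str]) -> bool:
    """Return True if an object key would be skipped under these options."""
    # File-type exclusion, approximated here by the file extension.
    if PurePosixPath(key).suffix.lower() in excluded_extensions:
        return True
    # Regex exclusion. Assumption: the pattern may match anywhere in the key.
    if exclude_pattern and re.search(exclude_pattern, key):
        return True
    return False


keys = ["logs/2021/app.log", "exports/customers.csv", "tmp/cache.bin", "images/logo.png"]
kept = [k for k in keys if not is_excluded(k, {".png"}, r"^(logs|tmp)/")]
print(kept)  # ['exports/customers.csv']
```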

e) Select S3 Buckets to Scan

Now that you’re done choosing what to scan, you can pick where to scan. In the right-hand panel, select the S3 buckets that you want to scan.

Your account is preconfigured with a scan limit of 50 TB or 50 million objects across all the buckets you select.

Your Job will not save if either of those scan limits is reached. Two progress bars track the total size of the selected buckets and the total number of files to help you stay within the limits.
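
The limit check amounts to simple arithmetic over the selected buckets. The following sketch shows the idea; the Bucket record and both functions are hypothetical, it assumes 1 TB means 10**12 bytes, and it assumes a Job is blocked as soon as either total reaches its limit.

```python
from dataclasses import dataclass

# Limits quoted in this guide. Assumption: 1 TB = 10**12 bytes.
MAX_BYTES = 50 * 10**12
MAX_OBJECTS = 50_000_000


@dataclass
class Bucket:
    # Hypothetical record for a selected bucket.
    name: str
    size_bytes: int
    object_count: int


def scan_usage(buckets):
    """Return (size fraction, object fraction) of the limits used, like the two progress bars."""
    total_bytes = sum(b.size_bytes for b in buckets)
    total_objects = sum(b.object_count for b in buckets)
    return total_bytes / MAX_BYTES, total_objects / MAX_OBJECTS


def can_save(buckets):
    # Assumption: the Job is blocked once either total reaches its limit.
    size_used, objects_used = scan_usage(buckets)
    return size_used < 1.0 and objects_used < 1.0


selected = [
    Bucket("archive", 20 * 10**12, 10_000_000),
    Bucket("uploads", 25 * 10**12, 45_000_000),
]
print(can_save(selected))  # False: 55 million objects is over the object limit
```

In this example the size total is within bounds (45 TB of 50 TB), but the object count is not, so deselecting a bucket with many small files would be the quickest way back under the limit.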

If you have a lot of buckets and want to find that needle in the S3 haystack, you can use the filter bars.

The Location filter bar lets you filter your buckets by Accounts and Regions.

The Configuration filter bar lets you filter your buckets by their configuration: whether a bucket is open or closed to the internet, encrypted or unencrypted at rest, and backed up or not backed up natively by AWS.
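
Conceptually, the two filter bars narrow the bucket list by attribute. The sketch below models that behavior on a local list; BucketInfo, its field names, and filter_buckets are hypothetical, the account IDs are made up, and it assumes that multiple filters combine with AND.

```python
from dataclasses import dataclass


@dataclass
class BucketInfo:
    # Hypothetical record; field names are illustrative only.
    name: str
    account: str
    region: str
    public: bool      # open to the internet
    encrypted: bool   # encrypted at rest
    backed_up: bool   # backed up natively by AWS


def filter_buckets(buckets, account=None, region=None, public=None, encrypted=None, backed_up=None):
    """Keep buckets that match every filter that is set (None means 'any')."""
    wanted = {
        "account": account,
        "region": region,
        "public": public,
        "encrypted": encrypted,
        "backed_up": backed_up,
    }
    return [
        b for b in buckets
        if all(value is None or getattr(b, attr) == value for attr, value in wanted.items())
    ]


buckets = [
    BucketInfo("public-assets", "111111111111", "us-east-1", True, False, False),
    BucketInfo("customer-data", "111111111111", "us-west-2", False, True, True),
    BucketInfo("legacy-dump", "222222222222", "us-east-1", True, False, True),
]
risky = filter_buckets(buckets, public=True, encrypted=False)
print([b.name for b in risky])  # ['public-assets', 'legacy-dump']
```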

Step 3. Save and Confirm Your Scan Job

Click Save.

A confirmation screen will show up, giving you an opportunity to review all the details that you’ve just entered. When you’re ready, just press Confirm.

You will receive an email notification when your scan is created, when it is in progress, and as soon as it has completed.