Loading and Retrieving Data in Neptune

45 minutes
  • 3 Learning Objectives

About this Hands-on Lab

In this lab, you will load data from an S3 bucket into an existing Neptune instance using the bulk load feature. This is far more efficient than executing a large number of `INSERT` statements, individual `addVertex` and `addEdge` steps, or other API calls. The Neptune instance will be available when you start the lab. However, you will need to create an IAM role and an S3 bucket, so prior knowledge of the IAM and S3 services is suggested.
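
The sample file is in RDF N-Triples format, which the Neptune bulk loader accepts directly (the load request later in this lab uses "format": "ntriples"). For orientation, an N-Triples statement is a subject, predicate, and object followed by a period. A hypothetical example (not necessarily a triple from this lab's data set):

    <http://example.com/employee/100> <http://example.com/name> "Jane Doe" .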

Learning Objectives

Successfully complete this lab by achieving the following learning objectives:

Create an S3 Bucket and Grant Access
  1. In the AWS Management Console, search for the S3 service and select it.

  2. Click Create bucket:

    • Bucket name: neptune-import09232020
    • Region: US East (N. Virginia) us-east-1
    • Select Next
    • Tags: Key name, Value neptune-import
    • Select Next >> Next >> Next >> Create bucket
  3. Visit this lab’s content repo and download the neptune-data.rdf file to your local machine.

  4. Click on the bucket's name, select Upload, then Add files. Select the neptune-data.rdf file from your machine.

  5. Select Upload.

  6. Click on Services to find and select the IAM service.

  7. Select Roles >> Create role >> S3 >> Next: Permissions.

  8. In the search bar type S3, select AmazonS3ReadOnlyAccess, and select Next: Tags >> Next: Review.

  9. On the Create role page:

    • Role name: neptune-import
    • Click Create role
  10. Search for and select neptune-import.

  11. Click Trust relationships >> Edit trust relationship, and edit the Service line to read "Service": "rds.amazonaws.com". Select Update Trust Policy. (A sample trust policy appears after this list.)

  12. Click on Services to find and select the Neptune service.

  13. Select the cluster’s name, Actions >> Manage IAM roles.

  14. On the Manage IAM roles page, ensure neptune-import is listed under Current IAM roles for this cluster. If it is not, select neptune-import from the drop-down and click Add role. Select Done.
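
For reference, after editing the trust relationship in step 11, the role's trust policy should look similar to this minimal sketch (your policy may contain additional fields):

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Principal": {
            "Service": "rds.amazonaws.com"
          },
          "Action": "sts:AssumeRole"
        }
      ]
    }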

Load the Data
  1. In the AWS Management Console, search for the VPC service and select it.
  2. Click on Endpoints >> Create Endpoint.
  3. On the Create Endpoint page:
    • Service category: Find service by name
    • Service Name: com.amazonaws.us-east-1.s3 and click Verify
    • VPC: select the existing VPC from the drop-down menu
    • Route Table ID: select the ID with the two subnets
    • Select the Create endpoint button
  4. In the AWS Management Console, search for the Neptune service and select it.
  5. Select the name of the cluster, and copy the Cluster endpoint name. (Note the port number, 8182, next to the Cluster endpoint. You will need this info momentarily.)
  6. Search for the IAM service and select it. Click Roles and search for neptune.
  7. Select the neptune-import role and copy its Role ARN. You will need this info momentarily.
  8. Connect to the bastion host using the credentials provided and save the cluster endpoint in a variable:
    export NEPTUNE_ENDPOINT=<endpoint>:8182
  9. Install curl with https support:
    sudo yum install libcurl.x86_64
  10. Use cURL to submit the load request. Be sure to replace <s3_Role_ARN> with the Role ARN you copied earlier:

    curl -X POST -H 'Content-Type: application/json' \
    > https://$NEPTUNE_ENDPOINT/loader -d '
    > {
    > "source": "s3://neptune-import09232020/neptune-data.rdf",
    > "format": "ntriples",
    > "iamRoleArn": "<s3_Role_ARN>",
    > "region": "us-east-1",
    > "failOnError": "FALSE",
    > "parallelism": "MEDIUM",
    > "queueRequest": "TRUE"
    > }'
  11. If successful, a "200 OK" status will appear.
  12. Copy the loadId from the 200 OK response payload and use it to monitor the load (a scripted version of steps 10 through 12 appears after this list):
    curl -G https://$NEPTUNE_ENDPOINT/loader/<loadID>
  13. If successful, a "200 OK" status will also appear.
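
If you prefer to script steps 10 through 12, the sketch below submits the load and polls its status in one pass. It assumes the jq package is installed on the bastion host (for example, via sudo yum install -y jq) and that NEPTUNE_ENDPOINT is exported as above; the loadId and overallStatus fields follow the Neptune loader's documented response layout:

    # Submit the bulk load and capture the loadId from the JSON response.
    LOAD_ID=$(curl -s -X POST -H 'Content-Type: application/json' \
      "https://$NEPTUNE_ENDPOINT/loader" -d '{
        "source": "s3://neptune-import09232020/neptune-data.rdf",
        "format": "ntriples",
        "iamRoleArn": "<s3_Role_ARN>",
        "region": "us-east-1",
        "failOnError": "FALSE",
        "parallelism": "MEDIUM",
        "queueRequest": "TRUE"
      }' | jq -r '.payload.loadId')

    # Poll the status endpoint; LOAD_COMPLETED indicates success.
    curl -sG "https://$NEPTUNE_ENDPOINT/loader/$LOAD_ID" | jq -r '.payload.overallStatus.status'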
Query the Data
  1. Download the RDF4J client:

    git clone https://github.com/linuxacademy/content-aws-database-specialty.git
    cd "content-aws-database-specialty/S06_Additional Database Services/"
  2. Extract the client:

    tar zxvf eclipse-rdf4j-3.4.1-nowar.tgz
    cd eclipse-rdf4j-3.4.1
    bin/console.sh
  3. Create a SPARQL repo. Be sure to replace <neptune_endpoint> with your own cluster endpoint, keeping :8182/sparql on the end:

    > create sparql
    * SPARQL query endpoint: https://<neptune_endpoint>:8182/sparql
    * SPARQL update endpoint: https://<neptune_endpoint>:8182/sparql
    * Local repository ID [endpoint@localhost]: neptune
    * Repository title [sparql endpoint repository@localhost]: Neptune Db Instance
  4. Type yes to overwrite the existing configuration. If successful, a repository created message appears.
  5. Open the repo to view the data loaded from the S3 bucket (a refined query example follows this list):

    > open neptune
    neptune> sparql SELECT * WHERE {?s ?p ?o}
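
Once the repository is open, you can refine the query. Since the vocabulary of neptune-data.rdf is not documented here, this is a generic example that simply caps the number of rows returned:

    neptune> sparql SELECT * WHERE {?s ?p ?o} LIMIT 10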

Additional Resources

You are working as a Database Administrator in charge of the company's Neptune graphing database. The developers have attempted to load a large amount of data using INSERT statements, but this has been unsuccessful. They are looking for a bulk data loading solution and have provided you with a small sample data set.

Please note, we will be using the RDF4J command-line client to query the database. A stripped-down copy of this client is included in the content repository for this lab.

You can find the latest release of the full SDK on the Eclipse RDF4J website.

Note: On the instance, install curl with https support:

    sudo yum install libcurl.x86_64
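
Before submitting the load, you can confirm the bastion host can reach the cluster by querying Neptune's instance status endpoint (this assumes NEPTUNE_ENDPOINT is exported as in the lab steps):

    curl https://$NEPTUNE_ENDPOINT/status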