Lab
A Cloud Guru

Reindex Elasticsearch Documents

Whether you need to change the mapping of an existing index or copy a subset of data from one index to another, the `_reindex` API in Elasticsearch has you covered. In this hands-on lab, you will practice the following:

* Reindex a subset of data from one index to a new index
* Create an ingest node pipeline
* Transform data during the reindexing process

Path Info

Level: Advanced

Duration: 3h 0m

Published: Jan 10, 2020

Challenge

Create the romeo_and_juliet index.

Use the Kibana console tool to execute the following:

PUT romeo_and_juliet
{
  "mappings": {
    "properties": {
      "line_id": {
        "type": "integer"
      },
      "line_number": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      },
      "play_name": {
        "type": "keyword"
      },
      "speaker": {
        "type": "keyword"
      },
      "speech_number": {
        "type": "integer"
      },
      "text_entry": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      },
      "type": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      }
    }
  },
  "settings": {
    "number_of_shards": 4,
    "number_of_replicas": 3
  }
}

Challenge

Create the shakespeare-tokenizer ingest node pipeline.

Use the Kibana console tool to execute the following:

PUT _ingest/pipeline/shakespeare-tokenizer
{
  "description": "Tokenizes the text_entry field into an array. Adds a word_count field. Removes the play_name field.",
  "processors": [
    {
      "split": {
        "field": "text_entry",
        "separator": "\\s+",
        "target_field": "word_array"
      }
    },
    {
      "script": {
        "lang": "painless",
        "source": "ctx.word_count = ctx.word_array.length"
      }
    },
    {
      "remove": {
        "field": "play_name"
      }
    }
  ]
}
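
To see what the three processors do to a single document, here is a minimal local sketch in Python. It is not Elasticsearch code: `apply_shakespeare_tokenizer` is a hypothetical helper that approximates the `split`, `script`, and `remove` processors on a plain dictionary, and the sample document is invented for illustration. Edge cases such as leading or trailing whitespace may split differently in Elasticsearch.

```python
import re


def apply_shakespeare_tokenizer(doc):
    """Approximate the shakespeare-tokenizer pipeline on a copy of doc.

    Hypothetical helper for illustration only; it is not part of any
    Elasticsearch client library.
    """
    out = dict(doc)  # work on a copy so the caller's document is untouched
    # split: break text_entry on runs of whitespace into word_array
    out["word_array"] = re.split(r"\s+", out["text_entry"])
    # script: word_count is the number of elements in word_array
    out["word_count"] = len(out["word_array"])
    # remove: drop play_name (raises KeyError if the field is absent)
    del out["play_name"]
    return out


# Invented sample document; only the field names come from the lab mapping.
sample = {
    "line_id": 1,
    "play_name": "Romeo and Juliet",
    "speaker": "JULIET",
    "text_entry": "O Romeo, Romeo! wherefore art thou Romeo?",
}

result = apply_shakespeare_tokenizer(sample)
print(result["word_array"])
print(result["word_count"])  # 7
```

The transformed document gains `word_array` and `word_count` and no longer carries `play_name`, which is the shape to expect in documents written through the pipeline.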

Challenge

Reindex the play "Romeo and Juliet".

Use the Kibana console tool to execute the following:

POST _reindex
{
  "source": {
    "index": "shakespeare",
    "query": {
      "match": {
        "play_name": "Romeo and Juliet"
      }
    }
  },
  "dest": {
    "index": "romeo_and_juliet",
    "pipeline": "shakespeare-tokenizer"
  }
}
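
Conceptually, this request selects the matching documents, passes each one through the pipeline, and writes the results to the destination, leaving the source index unchanged. A minimal in-memory sketch in Python, under stated assumptions: `reindex_subset` and `count_words` are hypothetical helpers, the two source documents are invented, and the query is modeled as an exact comparison, which is how `match` behaves when `play_name` is a keyword field in the source index.

```python
def reindex_subset(source_docs, play_name, pipeline):
    """Copy documents whose play_name equals play_name, transformed by pipeline.

    Hypothetical helper: models the select, transform, and write steps
    on plain Python lists instead of Elasticsearch indices.
    """
    dest_docs = []
    for doc in source_docs:
        if doc.get("play_name") == play_name:
            # Transform a copy so the source document stays as it was.
            dest_docs.append(pipeline(dict(doc)))
    return dest_docs


def count_words(doc):
    """Tiny stand-in for an ingest pipeline (assumption, not the real one)."""
    doc["word_count"] = len(doc["text_entry"].split())
    doc.pop("play_name")
    return doc


# Invented stand-in for the shakespeare index.
shakespeare = [
    {"play_name": "Hamlet",
     "text_entry": "To be, or not to be: that is the question:"},
    {"play_name": "Romeo and Juliet",
     "text_entry": "But, soft! what light through yonder window breaks?"},
]

romeo_and_juliet = reindex_subset(shakespeare, "Romeo and Juliet", count_words)
print(len(romeo_and_juliet))               # 1
print(romeo_and_juliet[0]["word_count"])   # 8
```

Only the matching document reaches the destination list, and the source list still holds both documents with their `play_name` fields intact.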

Author

A Cloud Guru

The Cloud Content team comprises subject matter experts who are hyper-focused on services offered by the leading cloud vendors (AWS, GCP, and Azure), as well as cloud-related technologies such as Linux and DevOps. The team is thrilled to share their knowledge to help you build modern tech solutions from the ground up, secure and optimize your environments, and so much more!
