
Quickwit on Kubernetes


Installation guide 🦮

Prerequisites

  • Access to a Kubernetes cluster (you can easily create a local cluster by using Minikube or Kind)
  • kubectl: not strictly required to install packages with Glasskube, but it is the recommended way to interact with the cluster, so installing it is highly recommended. Installation instructions are available for macOS, Linux and Windows.

Install Glasskube

If you've already installed Glasskube you can skip this step. If not, install it by following the instructions for your operating system.

For this demo I'll be using macOS:

brew install glasskube/tap/glasskube # install the Glasskube CLI
minikube start # start a Minikube Kubernetes cluster
glasskube bootstrap # install Glasskube on the Minikube cluster
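Before running these commands, it can help to confirm that the three tools are on your PATH. This is a minimal sketch; `require_tools` is a hypothetical helper name, not part of Glasskube or Minikube.

```shell
# Report every command-line tool that is not on PATH.
# require_tools is a hypothetical helper, not part of Glasskube.
require_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}

require_tools kubectl minikube glasskube || echo "Install the missing tools before continuing."
```

The function returns a non-zero status when anything is missing, so it can also guard a longer setup script.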

More installation guides are available here.

Once Glasskube has been installed, open the UI with:

glasskube serve

The dashboard will open up on http://localhost:8580/.

Creating an S3-Compatible Bucket

Before installing Quickwit, you'll need to create an object storage bucket to hold your Quickwit indexes. You can use the cloud provider of your choice, such as Scaleway, AWS S3 or MinIO. Refer to the official Quickwit documentation for storage configuration details.

Here I will be creating an AWS S3 bucket to store the Quickwit indexes.

Steps:

  • Navigate to the AWS management console and create a new S3 bucket.
  • In IAM, generate an access key with S3 permissions and save the 'Access Key Id' and 'Secret Key'. We will need them shortly.
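If you prefer the command line to the console, the bucket can also be created with the AWS CLI. The sketch below assumes the `aws` CLI is installed and configured with credentials that may create buckets. `is_valid_bucket_name` and `make_bucket` are hypothetical helper names, and the name check covers only the basic S3 naming rules.

```shell
# Basic S3 bucket-name check: 3-63 characters, lowercase letters, digits,
# dots and hyphens only, starting and ending with a letter or digit.
is_valid_bucket_name() {
  name=$1
  [ "${#name}" -ge 3 ] && [ "${#name}" -le 63 ] || return 1
  case $name in
    *[!a-z0-9.-]*) return 1 ;;          # character outside the allowed set
    [!a-z0-9]*|*[!a-z0-9]) return 1 ;;  # starts or ends with '.' or '-'
  esac
  return 0
}

# Validate the name, then create the bucket with the AWS CLI.
# make_bucket is a hypothetical wrapper around `aws s3 mb`.
make_bucket() {
  bucket=$1
  region=$2
  is_valid_bucket_name "$bucket" || { echo "invalid bucket name: $bucket" >&2; return 1; }
  aws s3 mb "s3://$bucket" --region "$region"
}
```

For example, `make_bucket quickwit-indexes us-east-1` would create the bucket used in this demo. Bucket names are globally unique, so you will likely need a different name.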

Deploy Quickwit

From the Glasskube dashboard, find the Quickwit package and add your custom configuration parameters.


  • defaultIndexRootUri: for this demo it's s3://quickwit-indexes.
  • metastoreUri: we won't use PostgreSQL so let's pick the same value we used for defaultIndexRootUri.
  • s3AccessKeyId: the "Access Key Id" from AWS we generated before.
  • s3Endpoint: custom endpoint for S3-compatible providers. Not needed when using AWS S3 itself.
  • s3Flavor: we keep the default empty value, which is meant for genuine S3-compatible object storage.
  • s3Region: us-east-1 in my case.
  • s3SecretAccessKey: the "Secret Key" from AWS we generated before.

See the official Quickwit documentation for details on each parameter.

It's also possible to install and configure Quickwit using the Glasskube CLI by running:

glasskube install quickwit
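The parameters listed above can probably be supplied on the command line as well. The sketch below assumes the Glasskube CLI accepts repeated `--value "name=value"` flags; check `glasskube install --help` before relying on it. `install_quickwit` and `QW_INDEX_ROOT_URI` are hypothetical names chosen for this example.

```shell
# Install Quickwit with values read from environment variables, which keeps
# the secret key out of the script itself.
# Assumption: the glasskube CLI accepts repeated --value "name=value" flags;
# confirm with `glasskube install --help`.
install_quickwit() {
  glasskube install quickwit \
    --value "defaultIndexRootUri=$QW_INDEX_ROOT_URI" \
    --value "metastoreUri=$QW_INDEX_ROOT_URI" \
    --value "s3Region=$AWS_REGION" \
    --value "s3AccessKeyId=$AWS_ACCESS_KEY_ID" \
    --value "s3SecretAccessKey=$AWS_SECRET_ACCESS_KEY"
}
```

Export the four variables first (for this demo, `QW_INDEX_ROOT_URI=s3://quickwit-indexes` and `AWS_REGION=us-east-1`), then call `install_quickwit`.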

Once installed, you can see that a quickwit namespace has been created:

default
flux-system
glasskube-system
kube-node-lease
kube-public
kube-system
kubernetes-dashboard
quickwit
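The command behind this listing is not shown. One way to get a names-only listing, assuming kubectl points at the same cluster, is sketched here; both function names are hypothetical.

```shell
# Print namespace names only, one per line.
list_namespaces() {
  kubectl get namespaces --no-headers -o custom-columns=NAME:.metadata.name
}

# Succeed only if the given namespace appears in that listing.
namespace_exists() {
  list_namespaces | grep -qFx -- "$1"
}
```

`namespace_exists quickwit && echo "quickwit namespace found"` is a quick way to confirm the install created it.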

Now, check to see if the pods are running:

NAME                                               READY   STATUS    RESTARTS      AGE
quickwit-quickwit-control-plane-86bd9955f7-bwm2r   1/1     Running   1 (27m ago)   29m
quickwit-quickwit-indexer-0                        1/1     Running   1 (27m ago)   29m
quickwit-quickwit-janitor-9479697ff-x4x2c          1/1     Running   1 (27m ago)   29m
quickwit-quickwit-metastore-56ff74df9f-k6d2g       1/1     Running   0             29m
quickwit-quickwit-searcher-0                       1/1     Running   1 (27m ago)   29m
quickwit-quickwit-searcher-1                       1/1     Running   0             27m
quickwit-quickwit-searcher-2                       1/1     Running   0             27m
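A listing like this is what `kubectl get pods` prints for the quickwit namespace. The sketch below also blocks until every pod reports Ready. The function name is hypothetical and the default 300-second timeout is an arbitrary choice.

```shell
# List the Quickwit pods, then wait until all of them report Ready.
# Optional argument: timeout for the wait (default 300s).
wait_for_quickwit() {
  kubectl get pods -n quickwit &&
    kubectl wait --for=condition=Ready pods --all -n quickwit --timeout="${1:-300s}"
}
```

Calling `wait_for_quickwit 120s` gives up after two minutes instead of five.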

We can access the Quickwit UI by port-forwarding the Quickwit searcher (dashboard) pod:

$ kubectl -n quickwit port-forward pod/quickwit-quickwit-searcher-0 7280

Head over to http://localhost:7280 and you should be ready to go!
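If you would rather check from a terminal, Quickwit exposes a readiness endpoint at /health/readyz. The sketch below polls it through the port-forward. `quickwit_ready` is a hypothetical helper, and the default of ten attempts is an arbitrary choice.

```shell
# Poll Quickwit's readiness endpoint until it answers or attempts run out.
# Arguments: base URL (default http://localhost:7280), attempts (default 10).
quickwit_ready() {
  base_url=${1:-http://localhost:7280}
  attempts=${2:-10}
  while [ "$attempts" -gt 0 ]; do
    if curl -fsS "$base_url/health/readyz" >/dev/null 2>&1; then
      return 0
    fi
    attempts=$((attempts - 1))
    sleep 1
  done
  return 1
}
```

Run it in a second terminal while the port-forward is active, for example `quickwit_ready && echo "Quickwit is ready"`.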


Create your first index

Before adding documents to Quickwit, you need to create an index configured with a YAML config file. This config file notably lets you define how to map your input documents to your index fields and whether these fields should be stored and indexed. See the index config documentation.

Let's create an index configured to receive Stack Overflow posts (questions and answers).

# First, download the stackoverflow dataset config from the Quickwit repository.
curl -o stackoverflow-index-config.yaml https://raw.githubusercontent.com/quickwit-oss/quickwit/main/config/tutorials/stackoverflow/index-config.yaml

The index config defines three fields: title, body and creationDate. title and body are indexed and tokenized, and they are also the default search fields, which means they are searched when a query does not target a specific field. creationDate serves as the timestamp for each record. No other fields are declared explicitly because the default dynamic mode handles them: undeclared fields are still indexed, fast fields are enabled by default to support aggregation queries, and the raw tokenizer is used for text.

And here is the complete config:

# Index config file for stackoverflow dataset.
#
version: 0.7

index_id: stackoverflow

doc_mapping:
  field_mappings:
    - name: title
      type: text
      tokenizer: default
      record: position
      stored: true
    - name: body
      type: text
      tokenizer: default
      record: position
      stored: true
    - name: creationDate
      type: datetime
      fast: true
      input_formats:
        - rfc3339
      fast_precision: seconds
  timestamp_field: creationDate

search_settings:
  default_search_fields: [title, body]

indexing_settings:
  commit_timeout_secs: 30
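To make the mapping concrete, here is a made-up document that would fit this config. The file name, the values and the `tags` field are invented for illustration. `tags` is not declared in the mapping, so it would fall under the dynamic mode.

```shell
# Write one sample document as newline-delimited JSON.
# All values are invented; creationDate uses the rfc3339 input format
# that the config declares.
cat > sample-post.json <<'EOF'
{"title": "How do I port-forward a pod?", "body": "I want to reach a pod from my laptop.", "creationDate": "2024-05-01T12:30:00Z", "tags": ["kubernetes"]}
EOF
```

Each line of such a file is one document, which is the shape Quickwit expects when ingesting JSON.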

Now we can create the index with the command:

./quickwit index create --index-config ./stackoverflow-index-config.yaml

Check that a directory ./qwdata/indexes/stackoverflow has been created. Quickwit will write the index files there, along with a metastore.json file that contains the index metadata. You're now ready to fill the index.
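The quickwit command above assumes a local `quickwit` binary. With the Kubernetes deployment from this guide, one alternative is to post the same YAML file to Quickwit's REST API through the port-forward opened earlier. This is a sketch, not the only way to do it, and `create_index` is a hypothetical helper name.

```shell
# Create an index by posting its YAML config to the Quickwit REST API.
# Arguments: config file, base URL (defaults to the port-forward address).
create_index() {
  config_file=$1
  base_url=${2:-http://localhost:7280}
  curl -fsS -X POST "$base_url/api/v1/indexes" \
    -H "content-type: application/yaml" \
    --data-binary "@$config_file"
}
```

With the port-forward running, `create_index stackoverflow-index-config.yaml` sends the config downloaded earlier. In that setup the index data should end up under the configured defaultIndexRootUri rather than in a local ./qwdata directory.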

Continue on to the Quickwit documentation to add your first documents and execute your first search queries.


If you like this sort of content and would like to see more of it, please consider supporting us by giving us a Star on GitHub 🙏