Troubleshooting EraSearch

Acquisition notice

In October 2022, ServiceNow acquired Era Software. The documentation on this site is no longer maintained and is intended for existing Era Software users only.

To get the latest information about ServiceNow's observability solutions, visit their website and documentation.

This page lists EraSearch errors and issues and how to fix them.

Debugging Helm deployments for self-hosted EraSearch

There are several reasons Helm deployments fail. To start, get deployment details with the commands below, replacing NAMESPACE_NAME and NAME with the namespace and release name you used to install EraSearch.

Review your deployments, checking the READY and AVAILABLE columns:

$ kubectl get deployments -n NAMESPACE_NAME

Get deployment-object and status information:

$ kubectl describe deployment NAME -n NAMESPACE_NAME

Get pod-specific information:

$ kubectl describe pod POD_NAME -n NAMESPACE_NAME

View warnings and other notifications related to EraSearch's namespace:

$ kubectl get events -n NAMESPACE_NAME
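The commands above can be bundled into one helper so a single call captures the deployment details. This is a minimal sketch, not part of EraSearch: the function name era_snapshot is hypothetical, and the --field-selector type=Warning filter is an added assumption that narrows the event list to warnings.

```shell
# era_snapshot NAMESPACE_NAME NAME [POD_NAME]
# Hypothetical helper: runs the kubectl commands from this section in order.
era_snapshot() {
  if [ "$#" -lt 2 ]; then
    echo "usage: era_snapshot NAMESPACE_NAME NAME [POD_NAME]" >&2
    return 2
  fi
  snap_ns=$1
  snap_release=$2
  snap_pod=${3:-}

  echo "== deployments (check READY and AVAILABLE) =="
  kubectl get deployments -n "$snap_ns"

  echo "== deployment $snap_release =="
  kubectl describe deployment "$snap_release" -n "$snap_ns"

  # Pod details are optional because pod names change between rollouts.
  if [ -n "$snap_pod" ]; then
    echo "== pod $snap_pod =="
    kubectl describe pod "$snap_pod" -n "$snap_ns"
  fi

  # Assumption: only Warning events are of interest; drop the selector to see all.
  echo "== warning events =="
  kubectl get events -n "$snap_ns" --field-selector type=Warning
}
```

For example, era_snapshot NAMESPACE_NAME NAME > snapshot.txt saves the output to a file you can attach to a support request.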

Here are some common deployment issues and how to fix them:

  • Bad image pull secrets

    To identify this issue, run kubectl get pods or kubectl describe pod. Then check the output for the ImagePullBackOff pod status.

    To fix this issue, make sure the imagePullSecrets name in values-eradb.yaml matches the secret you created in the EraSearch namespace.

  • Insufficient cluster resources

    To identify this issue, check for Insufficient error events, such as Insufficient CPU. Also, run kubectl get pods or kubectl describe pod to see if pods are stuck in a Pending state.

    This issue suggests there aren't enough Kubernetes cluster resources to support the deployment. To fix it, do one of the following:

    • Update the resources and replicaCount values in values-eradb.yaml to remain under the cluster resource limits. Note that reducing the resources available to EraSearch will decrease overall performance.
    • Increase the available resources to the Kubernetes cluster (or node group) to account for the new EraSearch resources.
  • No available persistent volumes

    To identify this issue, check for errors such as No persistent volumes available for this claim. Also, run kubectl get pods or kubectl describe pod to see if Cache Service pods are stuck in a Pending state.

    This issue suggests you have a misconfigured Kubernetes storage layer or you're using an invalid storage class identifier. To fix this issue, do the following:

    • Review the quarry.persistence.storageClass value in values-eradb.yaml. Make sure it's set to a valid cluster storage class. You can use kubectl get storageclass to see the available classes.
    • Make sure a storage class is available for pod storage.
    • Make sure your cluster has adequate storage for the disk size requested in values-eradb.yaml.
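The three issues above can be told apart by the text that kubectl prints. The sketch below reads that text on standard input and prints a likely cause. The function name classify_pod_issue and the labels it prints are hypothetical, and the ErrImagePull match is an addition (it is the Kubernetes status that usually precedes ImagePullBackOff).

```shell
# classify_pod_issue: reads kubectl output on stdin and prints the likely cause.
# Hypothetical helper; the labels are illustrative, not EraSearch output.
classify_pod_issue() {
  cpi_text=$(cat)
  case $cpi_text in
    # ErrImagePull is an added assumption; the text above names ImagePullBackOff.
    *ImagePullBackOff*|*ErrImagePull*)
      echo "bad image pull secret" ;;
    # Matches events such as "Insufficient cpu" or "Insufficient memory".
    *Insufficient*)
      echo "insufficient cluster resources" ;;
    # Accept either capitalization of the leading "No".
    *[Nn]"o persistent volumes available"*)
      echo "no available persistent volume" ;;
    *)
      echo "unknown"
      return 1 ;;
  esac
}
```

A typical call is kubectl describe pod POD_NAME -n NAMESPACE_NAME | classify_pod_issue. An "unknown" result only means none of the three patterns matched, so review the full output in that case.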

Enabling debug logging for self-hosted EraSearch

Use debug logging to get in-depth database information and troubleshoot issues. To enable debug logging, add the value logLevel: debug to any EraSearch service in your values-eradb.yaml file.

For example, this values-eradb.yaml file enables debug logging for the API and Cache services (also known as quarry):

quarry:
  logLevel: debug # ⭐️
  imagePullSecrets:
    - name: eradb-registry
  replicaCount: 4
  resources:
    cpu: 4
    memory: 8Gi
    disk: 2.5T

To deploy your changes, save the updated values-eradb.yaml file and run the command below, replacing:

  • NAME with the EraSearch database release name (for example, era).
  • X.X.X with the version of the Helm chart you got from Era Software.
  • VALUES_FILE with the path to the updated values-eradb.yaml file.
  • NAMESPACE_NAME with the relevant Kubernetes namespace.

$ helm upgrade NAME ./eradb-X.X.X.tgz \
    --values VALUES_FILE \
    --namespace NAMESPACE_NAME

A successful upgrade returns the message "Release NAME has been upgraded. Happy Helming!" along with other deployment details.
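If you run this upgrade often, a small wrapper can catch a missing values file before Helm is invoked. The sketch below is an illustration under assumptions: the function name era_upgrade is hypothetical, and the logLevel check is a plain text search that only warns, because debug logging is optional.

```shell
# era_upgrade NAME X.X.X VALUES_FILE NAMESPACE_NAME
# Hypothetical wrapper around the helm upgrade command shown above.
era_upgrade() {
  if [ "$#" -ne 4 ]; then
    echo "usage: era_upgrade NAME X.X.X VALUES_FILE NAMESPACE_NAME" >&2
    return 2
  fi
  upg_name=$1
  upg_version=$2
  upg_values=$3
  upg_ns=$4

  # Stop early if the values file path is wrong.
  if [ ! -f "$upg_values" ]; then
    echo "values file not found: $upg_values" >&2
    return 1
  fi

  # Warn (do not fail) when no service has debug logging enabled.
  if ! grep -q 'logLevel: *debug' "$upg_values"; then
    echo "warning: no 'logLevel: debug' entry found in $upg_values" >&2
  fi

  helm upgrade "$upg_name" "./eradb-$upg_version.tgz" \
    --values "$upg_values" \
    --namespace "$upg_ns"
}
```

The wrapper passes the same arguments to helm upgrade as the command above, so its output on success is Helm's own.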

Index naming errors

When writing data to EraSearch, you might get the following index naming errors:

  • Index name is too long, (X > 255)
  • Index name must not be '.' or '..'
  • Index name must not contain the following characters [X]
  • Index name must not start with '_', '-', or '+'

EraSearch returns those errors if your index name breaks these rules:

  • Index names must not be longer than 255 bytes.

  • Index names must not be exactly . or .. (a single or double period).

    ✓ .creektrails, moose..pass
    𝗫 ., ..

  • Index names must not have any of these characters: \, /, *, ?, ", <, >, |, ,, #, or :.

    ✓ sea&rock-path, timber(trail)
    𝗫 stone*pass, "wild-firs"

  • Index names must not start with these characters: _, -, or +.

    ✓ faraway_glades, stumble+route
    𝗫 _undercover_pathway, +steepcourse

Note

For Elasticsearch users, EraSearch lets indexes have uppercase letters, for example, MyEraLogs. Elasticsearch doesn't support uppercase letters in index names.
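To catch naming problems before writing data, the rules above can be checked on the client side. The sketch below is an assumption-laden illustration, not EraSearch's own validation: the function name validate_index_name is hypothetical, the messages only approximate the server errors, and the length limit is read as "at most 255 bytes" from the (X > 255) error text.

```shell
# validate_index_name NAME
# Hypothetical client-side check; returns 0 if NAME passes all four rules.
validate_index_name() {
  vin_name=$1

  # Rule 1: at most 255 bytes (wc -c counts bytes, not characters).
  vin_len=$(printf '%s' "$vin_name" | wc -c | tr -d ' ')
  if [ "$vin_len" -gt 255 ]; then
    echo "Index name is too long, ($vin_len > 255)"
    return 1
  fi

  # Rule 2: must not be exactly "." or "..".
  case $vin_name in
    .|..)
      echo "Index name must not be '.' or '..'"
      return 1 ;;
  esac

  # Rule 3: none of \ / * ? " < > | , # :
  if printf '%s' "$vin_name" | grep -q '[\\/*?"<>|,#:]'; then
    printf '%s\n' 'Index name must not contain any of: \ / * ? " < > | , # :'
    return 1
  fi

  # Rule 4: must not start with _, -, or +.
  case $vin_name in
    _*|-*|+*)
      echo "Index name must not start with '_', '-', or '+'"
      return 1 ;;
  esac

  return 0
}
```

For example, validate_index_name 'moose..pass' succeeds silently, while validate_index_name '+steepcourse' prints the last message and returns 1. Uppercase names such as MyEraLogs pass, in line with the note above.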

Slow write throughput

Slow write throughput can have several different causes. To start:

  • Make sure the CPU-bound task latency metric is under 1s per pod. To improve this metric, add more CPU resources to the API and Cache tiers.
  • Make sure the disk-bound task latency metric is under 5ms per pod. To improve this metric, add more disk resources or a higher number of IOPS to your Cache Service tier.
  • If insertion times are much longer than the maxwell.treasurer.batch_delay_ms setting, reduce the batch_delay_ms setting and increase the monthly_budget setting. Note that this change raises your object storage costs but improves overall system throughput.
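The two latency thresholds above can be checked with simple arithmetic once you have the per-pod values. The sketch below assumes you supply both readings yourself, in milliseconds. The function name check_write_latency is hypothetical, and the sketch does not query any EraSearch metrics endpoint.

```shell
# check_write_latency CPU_MS DISK_MS
# Hypothetical helper: compares per-pod latencies (in ms) to the thresholds above.
check_write_latency() {
  cwl_status=0

  # CPU-bound task latency should be under 1s (1000 ms).
  if awk -v v="${1:-0}" 'BEGIN { exit !(v + 0 < 1000) }'; then
    echo "cpu-bound: ok"
  else
    echo "cpu-bound: high (add CPU to the API and Cache tiers)"
    cwl_status=1
  fi

  # Disk-bound task latency should be under 5 ms.
  if awk -v v="${2:-0}" 'BEGIN { exit !(v + 0 < 5) }'; then
    echo "disk-bound: ok"
  else
    echo "disk-bound: high (add disk resources or IOPS to the Cache Service tier)"
    cwl_status=1
  fi

  return "$cwl_status"
}
```

For example, check_write_latency 250 3.2 reports both metrics as ok, while a disk reading of 5 or more is flagged because the threshold is strict.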

Modifying settings at runtime

Warning

The settings described below can have drastic effects on performance and runtime behavior. Please check with EraDB support before modifying any values.

The EraDB Cache Service supports modification of some settings at runtime through the PUT /_eradb/settings/v1 endpoint. The settings available for modification are:

  • aggregate_concurrency (number) - The number of concurrent threads to use when servicing aggregate queries.
  • search_concurrency (number) - The number of concurrent threads to use when servicing search queries.
  • hydration_concurrency (number) - The number of concurrent threads to use when rehydrating data back from object storage.
  • roots_per_task (number) - The number of roots inspected per compaction / search task.

To update these settings, run the following command from within a running Cache Service pod:

curl -XPUT \
  -H "Content-Type: application/json" \
  "https://localhost:9200/_eradb/settings/v1" \
  --data-binary '{"aggregate_concurrency":12,"search_concurrency":1,"hydration_concurrency":20,"roots_per_task":1}'

Settings not specified in the JSON body of the request keep their current values. All settings are reset to their default / environment values upon restart, so this endpoint should primarily be used for performance investigations or debugging.
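Because unspecified settings keep their current values, you can change one setting at a time during an investigation. The sketch below wraps the curl command above and sends a single key. The function name era_set_setting is hypothetical, and restricting values to non-negative integers is an assumption based on the (number) type listed for each setting.

```shell
# era_set_setting KEY VALUE
# Hypothetical helper: updates one runtime setting from within a Cache Service pod.
era_set_setting() {
  ess_key=${1:-}
  ess_value=${2:-}

  # Only the four settings listed above are accepted.
  case $ess_key in
    aggregate_concurrency|search_concurrency|hydration_concurrency|roots_per_task) ;;
    *)
      echo "unknown setting: $ess_key" >&2
      return 1 ;;
  esac

  # Assumption: values are non-negative integers.
  case $ess_value in
    ''|*[!0-9]*)
      echo "value must be a non-negative integer: $ess_value" >&2
      return 1 ;;
  esac

  curl -XPUT \
    -H "Content-Type: application/json" \
    "https://localhost:9200/_eradb/settings/v1" \
    --data-binary "{\"$ess_key\":$ess_value}"
}
```

For example, era_set_setting search_concurrency 2 sends the body {"search_concurrency":2} and leaves the other three settings unchanged.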


Last update: August 7, 2023