Troubleshooting EraSearch
Acquisition notice
In October 2022, ServiceNow acquired Era Software. The documentation on this site is no longer maintained and is intended for existing Era Software users only.
To get the latest information about ServiceNow's observability solutions, visit their website and documentation.
This page lists EraSearch errors and issues and how to fix them.
Debugging Helm deployments for self-hosted EraSearch
Helm deployments can fail for several reasons. To start, get deployment details with the commands below, replacing NAMESPACE_NAME and NAME with the namespace and release name you used to install EraSearch.
Review your deployments, checking the READY and AVAILABLE columns:
Get deployment-object and status information:
Get pod-specific information:
View warnings and other notifications related to EraSearch's namespace:
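As a rough sketch, the four checks above map onto standard kubectl and helm subcommands such as the ones below. These are assumptions based on common tooling, not commands confirmed by Era Software. The function name era_deployment_details and the KUBECTL and HELM override variables are hypothetical; the overrides only make it easy to preview the commands (for example, with KUBECTL=echo).

```shell
# Sketch only. KUBECTL and HELM are hypothetical override variables;
# by default the real kubectl and helm binaries are used.
KUBECTL="${KUBECTL:-kubectl}"
HELM="${HELM:-helm}"

era_deployment_details() {
  local namespace="$1" release="$2"

  # 1. Review deployments, checking the READY and AVAILABLE columns.
  "$KUBECTL" get deployments -n "$namespace"

  # 2. Deployment-object details, plus the release status tracked by Helm.
  "$KUBECTL" describe deployments -n "$namespace"
  "$HELM" status "$release" -n "$namespace"

  # 3. Pod-specific information (container states, conditions, events).
  "$KUBECTL" describe pods -n "$namespace"

  # 4. Warnings and other notifications in the namespace.
  "$KUBECTL" get events -n "$namespace"
}

# Usage: era_deployment_details NAMESPACE_NAME NAME
```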
Here are some common deployment issues and how to fix them:
- Bad image pull secrets
  To identify this issue, run kubectl get pods or kubectl describe pod. Then check the output for the ImagePullBackOff pod status.
  To fix this issue, make sure the imagePullSecrets name in values-eradb.yaml matches the secret you created in the EraSearch namespace.
- Over-provisioned clusters
  To identify this issue, check for Insufficient error events, such as Insufficient CPU. Also, run kubectl get pods or kubectl describe pod to see if pods are stuck in a Pending state.
  This issue suggests there aren't enough Kubernetes cluster resources to support the deployment. To fix it, do one of the following:
  - Update the resources and replicaCount values in values-eradb.yaml to remain under the cluster resource limits. Note that reducing the resources available to EraSearch will decrease overall performance.
  - Increase the resources available to the Kubernetes cluster (or node group) to account for the new EraSearch resources.
- No available persistent volumes
  To identify this issue, check for errors such as No persistent volumes available for this claim. Also, run kubectl get pods or kubectl describe pod to see if Cache Service pods are stuck in a Pending state.
  This issue suggests you have a misconfigured Kubernetes storage layer or you're using an invalid storage class identifier. To fix this issue, do the following:
  - Review the quarry.persistence.storageClass value in values-eradb.yaml. Make sure it's set to a valid cluster storage class. You can use kubectl get storageclass to see the available classes.
  - Make sure a storage class is available for pod storage.
  - Make sure your cluster has adequate storage for the values set in values-eradb.yaml.
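The checks described for these three issues can be gathered into one pass. The sketch below uses only standard kubectl subcommands; the function name era_check_common_issues and the KUBECTL override variable are hypothetical and exist only for illustration.

```shell
# Sketch only. KUBECTL is a hypothetical override variable (defaults to kubectl).
KUBECTL="${KUBECTL:-kubectl}"

era_check_common_issues() {
  local namespace="$1"

  # Bad image pull secrets: look for ImagePullBackOff in the STATUS column,
  # then compare the secret names with imagePullSecrets in values-eradb.yaml.
  "$KUBECTL" get pods -n "$namespace"
  "$KUBECTL" get secrets -n "$namespace"

  # Over-provisioned clusters and missing volumes: list pods stuck in Pending.
  "$KUBECTL" get pods -n "$namespace" --field-selector=status.phase=Pending

  # No available persistent volumes: list storage classes and volume claims.
  "$KUBECTL" get storageclass
  "$KUBECTL" get pvc -n "$namespace"
}

# Usage: era_check_common_issues NAMESPACE_NAME
```

For any pod that shows up as Pending, kubectl describe pod on that pod lists the events (for example, Insufficient resources or unbound claims) that explain why it has not been scheduled.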
Enabling debug logging for self-hosted EraSearch
Use debug logging to get in-depth database information and troubleshoot issues. To enable debug logging, add the value logLevel: debug to any EraSearch service in your values-eradb.yaml file.
For example, this values-eradb.yaml file enables debug logging for the API and Cache services (also known as quarry):
quarry:
  logLevel: debug # ⭐️
  imagePullSecrets:
    - name: eradb-registry
  replicaCount: 4
  resources:
    cpu: 4
    memory: 8Gi
    disk: 2.5T
To deploy your changes, save the updated Helm chart and enter the command below, replacing:
- NAME with the EraSearch database release name (for example, era).
- x.x.x with the version of the Helm chart you got from Era Software.
- VALUES_FILE with the path to the updated values-eradb.yaml file.
- NAMESPACE_NAME with the relevant Kubernetes namespace.
A successful upgrade returns the following message, along with other deployment details:
Release "NAME" has been upgraded. Happy Helming!
Index naming errors
When writing data to EraSearch, you might get the following index naming errors:
- Index name is too long, (X > 255)
- Index name must not be '.' or '..'
- Index name must not contain the following characters [X]
- Index name must not start with '_', '-', or '+'
EraSearch returns those errors if your index name breaks these rules:
- Index names must be shorter than 255 bytes.
- Index names must not be set to only '.' or '..'.
  ✔ .creektrails, moose..pass
  𝗫 '.', '..'
- Index names must not have any of these characters (the list includes the comma): \ / * ? " < > | , # :
  ✔ sea&rock-path, timber(trail)
  𝗫 stone*pass, "wild-firs"
- Index names must not start with these characters: _, -, or +.
  ✔ faraway_glades, stumble+route
  𝗫 _undercover_pathway, +steepcourse
Note
For Elasticsearch users: EraSearch lets index names contain uppercase letters (for example, MyEraLogs). Elasticsearch doesn't support uppercase letters in index names.
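To catch these errors before writing data, the naming rules can be checked on the client side. The sketch below is a hypothetical helper, not part of EraSearch; it follows the rule text literally, so a name of exactly 255 bytes is rejected even though the error message suggests it might be accepted.

```shell
# Hypothetical helper (not an EraSearch command): returns 0 when the name
# passes the four rules listed above, and 1 otherwise.
valid_index_name() {
  local name="$1"
  local LC_ALL=C   # make ${#name} count bytes rather than characters

  # Rule 1: shorter than 255 bytes, following the rule text.
  [ "${#name}" -lt 255 ] || return 1

  # Rule 2: not exactly "." or "..".
  case "$name" in
    .|..) return 1 ;;
  esac

  # Rule 3: none of \ / * ? " < > | , # :
  if printf '%s' "$name" | grep -q '[\\/*?"<>|,#:]'; then
    return 1
  fi

  # Rule 4: must not start with _, -, or +.
  case "$name" in
    _*|-*|+*) return 1 ;;
  esac

  return 0
}

# Check a few of the example names from the rules.
for candidate in ".creektrails" "sea&rock-path" "stone*pass" "_undercover_pathway"; do
  if valid_index_name "$candidate"; then
    echo "ok:       $candidate"
  else
    echo "rejected: $candidate"
  fi
done
```

Uppercase names such as MyEraLogs pass this check, which matches the note for Elasticsearch users.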
Slow write throughput
Slow write throughput can have several different causes. To start:
- Make sure the CPU-bound task latency metric is under 1s per pod. To improve this metric, add more CPU resources to the API and Cache tiers.
- Make sure the disk-bound task latency metric is under 5ms per pod. To improve this metric, add more disk resources or a higher number of IOPS to your Cache Service tier.
- If insertion times are much larger than the maxwell.treasurer.batch_delay_ms setting, reduce the batch_delay_ms setting and increase the monthly_budget setting. Note that this change increases the financial costs associated with your object storage provider, but it also increases overall system throughput.
Modifying settings at runtime
Warning
The settings described below can have drastic effects on performance and runtime behavior. Please check with EraDB support before modifying any values.
The EraDB Cache Service supports modification of some settings at runtime through the PUT /_eradb/settings/v1 endpoint. The settings available for modification are:
- aggregate_concurrency (number) - The number of concurrent threads to use when servicing aggregate queries.
- search_concurrency (number) - The number of concurrent threads to use when servicing search queries.
- hydration_concurrency (number) - The number of concurrent threads to use when rehydrating data back from object storage.
- roots_per_task (number) - The number of roots inspected per compaction / search task.
To update these settings, run the following command from inside a running Cache Service pod:
curl -XPUT \
-H "Content-Type: application/json" \
"https://localhost:9200/_eradb/settings/v1" \
--data-binary '{"aggregate_concurrency":12,"search_concurrency":1,"hydration_concurrency":20,"roots_per_task":1}'
Settings not specified in the JSON body of the request keep their current values. All settings are reset to their default / environment values upon restart, so this endpoint should primarily be used for performance investigations or debugging.
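Because unspecified settings keep their current values, a request can carry only the settings you want to change. The sketch below builds such a partial JSON body from key=value arguments and rejects names outside the four settings listed above. The function name era_settings_json is hypothetical and is not part of EraSearch.

```shell
# Hypothetical helper: prints a JSON body containing only the given settings.
era_settings_json() {
  local body="" pair key value
  for pair in "$@"; do
    key="${pair%%=*}"
    value="${pair#*=}"

    # Accept only the four settings the endpoint documents.
    case "$key" in
      aggregate_concurrency|search_concurrency|hydration_concurrency|roots_per_task) ;;
      *) echo "unknown setting: $key" >&2; return 1 ;;
    esac

    # All four settings are numbers, so require a non-empty run of digits.
    case "$value" in
      ''|*[!0-9]*) echo "not a number: $value" >&2; return 1 ;;
    esac

    # Append the pair, adding a comma separator after the first entry.
    body="${body:+$body,}\"$key\":$value"
  done
  printf '{%s}\n' "$body"
}

# Usage, from inside a Cache Service pod (changes search_concurrency only):
#   curl -XPUT \
#     -H "Content-Type: application/json" \
#     "https://localhost:9200/_eradb/settings/v1" \
#     --data-binary "$(era_settings_json search_concurrency=4)"
```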