# Configure Elasticsearch

## Configure connections

To configure Elasticsearch, first configure the connections.

You can connect in one of two ways:

- using a [cluster of Elasticsearch nodes](#configure-clustering)
- using [Elasticsearch Cloud](#configure-elasticsearch-cloud)

No matter which option you choose, you have to define the connection settings under the `connections` key. Set a name of the connection:

```
ibexa_elasticsearch:
    connections:
        <connection_name>:
```

A default connection

If you define more than one connection, for example, to create a separate connection for each repository, you must select the one that Ibexa DXP should use with the following setting:

```
ibexa_elasticsearch:
    # ...
    default_connection: <connection_name>
```

Now, you need to decide whether to add a cluster that you administer and manage yourself, or use a cloud solution from Elastic, and configure additional parameters.

If you want to connect by using a cluster, follow the instructions in the [Cluster](#configure-clustering) section below. If you want to use Elasticsearch Cloud, skip to the [Elasticsearch Cloud](#configure-elasticsearch-cloud) section.

## Configure clustering

A cluster consists of nodes. You might start with one node and then add more nodes if you need more processing power.

When you configure a node, you need to set the following parameters:

- `host` - an IP address or domain name of the host. Default value: `localhost`.
- `port` - a port to connect to. Default value: `9200`. If you have several Elasticsearch instances that run on the same host, and want to make them distinct, you can change the default number.
- `scheme` - a protocol used to access the node. Default value: `http`.
- `path` - by default, path isn't used. Default value: `null`. If you have several Elasticsearch instances that run on the same host, and want to make them distinct, you can define a path for each instance.
- `user`/`pass` - credentials, if needed to log in to the host. Default values: `null`.

Next, list the addresses of cluster nodes under the `hosts` key:

```
ibexa_elasticsearch:
    connections:
        <connection_name>:
            hosts:
                - '127.0.0.1:9200'
                # ...
```

You can pass host parameters in several ways. The easiest one is to pass them as a string:

```
- https://<my.elasticsearch.domain>:9200/<path>/
```

You can also pass the host configuration as an object that lists parameter-value pairs, for example, when your authentication settings contain special characters:

```
- { host: '<my.elasticsearch.domain>', scheme: 'http', port: 9200, path: '/', user: '<username>', pass: '<password>' }
```
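As an illustration, both notations can be mixed within a single `hosts` list. The following sketch assumes a hypothetical connection named `mixed` and placeholder host names. The object form spells out every parameter described above, which might be convenient when credentials contain characters that would break a URL string:

```yaml
ibexa_elasticsearch:
    connections:
        # "mixed" is a hypothetical connection name
        mixed:
            hosts:
                # String form: scheme, host, port and path in one value
                - 'https://es1.example.com:9200/'
                # Object form: each parameter listed explicitly
                - host: 'es2.example.com'
                  scheme: 'https'
                  port: 9200
                  path: '/'
                  user: '<username>'
                  pass: '<password>'
```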

Cluster connection configuration should have the following structure:

```
ibexa_elasticsearch:
    connections:
        simple:
            hosts:
                - '127.0.0.1:9200'
                - '127.0.0.1:9201'
                - '127.0.0.1:9202'

        localhost:
            debug: true
            hosts:
                - "127.0.0.1:9200"
                - "b.elasticsearch.loc:9200"
                - "c.elasticsearch.loc:9200"

        intranet:
            debug: true
            hosts:
                - "c.elasticsearch.loc:9200"

    default_connection: simple
```

### Multi-node cluster behavior

When you configure a cluster-based connection, and the cluster consists of many nodes, you can choose strategies that govern how the cluster reacts to changing operating conditions, or how workload is distributed among the nodes.

#### Node pool settings

With these settings you decide how nodes in the cluster are selected and how failed nodes are resurrected. The node pool manages the list of active nodes, which can change over time due to connectivity issues, host malfunction, or when you add new nodes to the cluster to increase performance.

By default, Elasticsearch uses the `SimpleNodePool` algorithm with the `RoundRobin` selector and the `NoResurrect` strategy.

You can customize the node pool behavior with the following settings:

```
<connection_name>:
    # ...
    node_pool_selector: Elastic\Transport\NodePool\Selector\RoundRobin
    node_pool_resurrect: Elastic\Transport\NodePool\Resurrect\NoResurrect
```

For more information and a list of available choices, see [Node pool](https://www.elastic.co/guide/en/elasticsearch/client/php-api/8.19/node_pool.html).
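For instance, a connection might switch to random node selection and to a strategy that tries to bring failed nodes back. This is only a sketch: it assumes that the `Random` selector and the `ElasticsearchResurrect` strategy are available in the installed version of the `elastic/transport` package, so check the linked page before you rely on them.

```yaml
ibexa_elasticsearch:
    connections:
        <connection_name>:
            hosts:
                - '127.0.0.1:9200'
                - '127.0.0.1:9201'
            # Assumption: both classes exist in the installed elastic/transport package
            node_pool_selector: Elastic\Transport\NodePool\Selector\Random
            node_pool_resurrect: Elastic\Transport\NodePool\Resurrect\ElasticsearchResurrect
```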

Load tests recommendation

If you change the node pool settings, it's recommended that you perform load tests to check that the change doesn't negatively impact the performance of your environment.

##### Number of retries

The `retries` setting configures the number of attempts that Ibexa DXP makes to connect to the nodes of the cluster before it throws an exception. By default, `null` is used, which means that the number of retries equals the number of nodes in the cluster.

```
<connection_name>:
    # ...
    retries: null
```

Depending on the node pool settings that you select, Ibexa DXP's reaction to reaching the maximum number of retries might differ.

For more information, see [Set retries](https://www.elastic.co/guide/en/elasticsearch/client/php-api/8.19/set-retries.html).
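As a sketch, the following fragment limits the number of attempts to two for a three-node cluster. The value is arbitrary and chosen only for illustration:

```yaml
ibexa_elasticsearch:
    connections:
        <connection_name>:
            hosts:
                - '127.0.0.1:9200'
                - '127.0.0.1:9201'
                - '127.0.0.1:9202'
            # With the default (null), the number of retries would equal the number of nodes
            retries: 2
```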

## Configure Elasticsearch Cloud

As an alternative to using your own cluster, you can use Elasticsearch Cloud, a commercial SaaS solution. With Elasticsearch Cloud you don't have to build or manage your own Elasticsearch cluster. Also, you do all the configuration and administration in a graphical user interface.

To connect to a cloud solution with Ibexa DXP, you must set the `elastic_cloud_id` parameter by providing an alphanumerical ID string that you get from the cloud's user interface, for example:

```
<connection_name>:
    elastic_cloud_id: 'production:ZWFzdHVzMi5henVyZS5lbGFzdGljLWNsb3VkLmNvbTo5MjQzJGUwZ'
```

With the ID set, you must configure authentication to be able to access the remote environment.

## Configure security

Elasticsearch instances support `basic` and `api_key` authentication methods. You select authentication type and configure the settings under the `authentication` key. By default, authentication is disabled:

```
<connection_name>:
    # ...
    authentication:
        type: null
```

If you connect to Elasticsearch hosts outside of your local network, you might also need to configure SSL encryption.

### Basic authentication

If your Elasticsearch server is protected by HTTP authentication, you must provide Ibexa DXP with the credentials. For basic authentication, pass the following parameters:

```
<connection_name>:
    # ...
    authentication:
        type: basic
        credentials: ['<user_name>', '<password>']
```

For example:

```
ibexa_elasticsearch:
    connections:
        cloud:
            debug: true
            elastic_cloud_id:   'test:ZWFzdHVzMi5henVyZS5lbGFzdGljLWNsb3VkLmNvbTo5MjQzJGUwZ'
            authentication:
                type: basic
                credentials: ['elastic', '1htFY83VvX2JRDw88MOkOejk']
```

### API key authentication

If your Elasticsearch cluster is protected by API keys, you must provide the key and secret in the authentication configuration to connect Ibexa DXP with the cluster. With API key authentication you can define different authorization levels, such as [`create_index` or `index`](https://www.elastic.co/guide/en/elasticsearch/reference/8.19/security-privileges.html#privileges-list-indices). Such an approach proves useful if the cluster is available to the public.

For more information, see [Create API key](https://www.elastic.co/guide/en/elasticsearch/reference/8.19/security-api-create-api-key.html).

When using API key authentication, you must pass the following parameters to authenticate access to the cluster:

```
<connection_name>:
    # ...
    authentication:
        type: api_key
        credentials: ['<api_key>', '<api_key_id>']
```

For example:

```
ibexa_elasticsearch:
    connections:
        cloud:
            debug: true
            elastic_cloud_id:   'test:ZWFzdHVzMi5henVyZS5lbGFzdGljLWNsb3VkLmNvbTo5MjQzJGUwZ'
            authentication:
                type: api_key
                credentials: ['ui2lp2axTNmsyakw9tvNnw', 'VuaCfGcBCdbkQm-e5aOx']
```

Alternatively, pass the encoded API key value, which Elasticsearch also calls "API key credentials":

```
<connection_name>:
    # ...
    authentication:
        type: api_key
        credentials: ['<api_key_encoded>']
```

For example:

```
ibexa_elasticsearch:
    connections:
        cloud:
            debug: true
            elastic_cloud_id:   'test:ZWFzdHVzMi5henVyZS5lbGFzdGljLWNsb3VkLmNvbTo5MjQzJGUwZ'
            authentication:
                type: api_key
                credentials: ['VnVhQ2ZHY0JDZGJrUW0tZTVhT3g6dWkybHAyYXhUTm1zeWFrdzl0dk5udw==']
```

To see the difference between API key, API key id, and encoded API key, refer to the [examples in Elasticsearch documentation](https://www.elastic.co/guide/en/elasticsearch/reference/8.19/security-api-create-api-key.html#security-api-create-api-key-example).
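The values used in the examples above are related: decoding the encoded value from Base64 yields the API key ID and the API key joined by a colon. The sketch below restates the two equivalent notations side by side. Use only one of them in a real configuration. The connection name and the cloud ID placeholder are illustrative.

```yaml
# Values from the examples in this section:
#   API key ID:      VuaCfGcBCdbkQm-e5aOx
#   API key:         ui2lp2axTNmsyakw9tvNnw
#   Encoded API key: Base64 of "<api_key_id>:<api_key>"
ibexa_elasticsearch:
    connections:
        cloud:
            elastic_cloud_id: '<elastic_cloud_id>'
            authentication:
                type: api_key
                # Two-element form: API key first, then API key ID
                credentials: ['ui2lp2axTNmsyakw9tvNnw', 'VuaCfGcBCdbkQm-e5aOx']
                # Single-element form with the encoded value (alternative to the line above):
                # credentials: ['VnVhQ2ZHY0JDZGJrUW0tZTVhT3g6dWkybHAyYXhUTm1zeWFrdzl0dk5udw==']
```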

### SSL

When you need to protect your communication with the Elasticsearch server, you can use SSL encryption. When configuring SSL for your internal infrastructure, you can use your own client certificates signed by a public CA. Configure SSL by passing a path and a password for both the certificate and the certificate key.

For example:

```
ibexa_elasticsearch:
    connections:
        cloud_with_ssl:
            debug: true
            elastic_cloud_id:   'test:ZWFzdHVzMi5henVyZS5lbGFzdGljLWNsb3VkLmNvbTo5MjQzJGUwZ'
            authentication:
                type: api_key
                credentials: ['8Ek5f3IBGQlWj6v4M7zG', 'rmI6IechSnSJymWJ4LZqUw']
            ssl:
                cert:
                    path: '/path/to/cert.pem'
                    pass: ~
                cert_key:
                    path: '/path/to/cert-key'
                    pass: ~
```

If you don't have a client certificate signed by a public certificate authority, but you have a self-signed CA certificate generated by `elasticsearch-certutil` or another tool (for example, for development purposes), use the following `ssl` configuration:

```
ibexa_elasticsearch:
    connections:
        cloud_with_ssl:
            debug: true
            elastic_cloud_id:   'test:ZWFzdHVzMi5henVyZS5lbGFzdGljLWNsb3VkLmNvbTo5MjQzJGUwZ'
            ssl:
                ca_cert:
                    path: '/path/to/ca_cert.pem'
```

If you configure both `ca_cert` and `cert` entries, the `ca_cert` parameter takes precedence over the `cert` parameter.
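The same `ssl` settings can presumably be combined with a self-managed cluster. A minimal sketch, assuming a hypothetical connection named `secure_cluster`, placeholder host names, and nodes that serve HTTPS with a certificate issued by your own CA:

```yaml
ibexa_elasticsearch:
    connections:
        # "secure_cluster" and the host names are hypothetical
        secure_cluster:
            hosts:
                - 'https://es1.example.com:9200'
                - 'https://es2.example.com:9200'
            authentication:
                type: basic
                credentials: ['<user_name>', '<password>']
            ssl:
                ca_cert:
                    path: '/path/to/ca_cert.pem'
```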

After you have configured SSL, you can still disable SSL verification, for example, when the certificates expire, or when you're migrating to a new set of certificates. To do this, pass the following setting under the `ssl` key:

```
verification: false
```
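In context, the setting sits next to the certificate entries. The fragment below is a sketch that reuses the placeholder path from the examples above:

```yaml
ibexa_elasticsearch:
    connections:
        <connection_name>:
            # ...
            ssl:
                ca_cert:
                    path: '/path/to/ca_cert.pem'
                # Temporary measure only, for example during certificate rotation
                verification: false
```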

For more information, see [Elasticsearch: SSL Encryption](https://www.elastic.co/guide/en/elasticsearch/client/php-api/8.19/connecting.html#ssl-encryption).

### Enable debugging

In a staging environment, you can log messages about the status of communication with Elasticsearch. You can then use Symfony Profiler to review the logs.

By default, debugging is disabled. To enable debugging, you can use the following setting:

```
<connection_name>:
    # ...
    debug:  <true/false>
```

- `debug` logs information about requests, including request status and timing

Elasticsearch 7 compatibility

If you're using Elasticsearch 7, you can also use the `trace` setting for additional debugging information. This setting is deprecated and removed in Elasticsearch 8.

Tip

Make sure that you disable debugging in a production environment.

## Define field type mapping templates

Before you can re-index the Ibexa DXP data, so that Elasticsearch can search through its contents, you must define an index template. Templates instruct Elasticsearch to recognize Ibexa DXP fields as specific data types, based on, for example, a field name. They help you prevent Elasticsearch from using the dynamic field mapping feature to create type mappings automatically. You can create several field type mapping templates for each index, for example, to define settings that are specific to different languages. When you establish a relationship between a field mapping template and a connection, you can apply several templates, too.

### Define a template

To define a field mapping template, you must provide settings under the `index_templates` key. The structure of the template is as follows:

```
ibexa_elasticsearch:
    # ...
    index_templates:
        <index_template_name>:
            patterns:
            # ...
            settings:
            # ...
            mappings:
            # ...
```

Set a unique name for the template and configure the following keys:

- `patterns` - A list of wildcards that Elasticsearch uses to match the field mapping template to an index. Index names use the following pattern:

  `<repository>_<document_type>_<language_code>_<content_type_id>`

  By default, the repository name is set to `default`. However, in the context of an Ibexa DXP instance, there can be [several repositories with different names](https://doc.ibexa.co/en/latest/administration/configuration/repository_configuration/#defining-custom-connection). Document type can be either `content` or `location`. In a language code, hyphens are replaced with underscores, and all characters must be lowercase. An index name can therefore look like this:

  `default_content_eng_gb_2`

  You can use the `patterns` setting when your data contains content in different languages. You can create index templates with settings that apply to a specific language only, for example, to eliminate stop words from the index, or to help split compound words. You use patterns to identify index templates that contain settings specific to a given language:

```
ibexa_elasticsearch:
    # ...
    index_templates:
        default_en_us:
            patterns: ['default_*', '*eng_us*']
            # ...
```

- `settings` - Settings under this key control all aspects related to an index.

For more information and a list of available settings, see [Elasticsearch documentation](https://www.elastic.co/guide/en/elasticsearch/reference/8.19/index-modules.html#index-modules-settings).

For example, you can define settings that convert text into a format that is optimized for search, like a normalizer that changes the case of all phrases in the index:

```
ibexa_elasticsearch:
    # ...
    index_templates:
        default:
            # ...
            settings:
                analysis:
                    normalizer:
                        lowercase_normalizer:
                            type: custom
                            char_filter: []
                            filter: [lowercase]
                            # ...
```

- `mappings` - Settings under this key define mapping for fields in the index.

For more information about mappings, see [Elasticsearch documentation](https://www.elastic.co/guide/en/elasticsearch/reference/8.19/mapping.html).

When you create a custom index template, with settings for your own field and document types, make sure that it contains mappings for all searchable fields that are available in Ibexa DXP.

To see the default configuration, go to `vendor/ibexa/elasticsearch/src/bundle/Resources/config/` and open the `default-config.yaml` file.
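Put together, the three keys might look as follows. This is a deliberately shortened sketch with a hypothetical template name (`default_eng_gb`) and a single dynamic template. A real custom template needs mappings for all searchable fields, so start from `default-config.yaml` rather than from this fragment.

```yaml
ibexa_elasticsearch:
    # ...
    index_templates:
        # Hypothetical template name
        default_eng_gb:
            # Matches index names such as default_content_eng_gb_2
            patterns:
                - 'default_*_eng_gb_*'
            settings:
                analysis:
                    normalizer:
                        lowercase_normalizer:
                            type: custom
                            char_filter: []
                            filter:
                                - lowercase
            mappings:
                dynamic_templates:
                    -   ez_string:
                            match: "*_s"
                            mapping:
                                type: keyword
                                normalizer: lowercase_normalizer
                    # ... mappings for the remaining field types
```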

### Fine-tune the search results

Your search results can be adjusted by configuring additional parameters. For a list of available mapping parameters and their usage, see [Elasticsearch documentation](https://www.elastic.co/guide/en/elasticsearch/reference/8.19/mapping-params.html).

For example, you can apply a mapping parameter, in this case, a normalizer, to a specific mapping under the `dynamic_templates` key:

```
ibexa_elasticsearch:
    # ...
    index_templates:
        default:
            # ...
            mappings:
                # ...
                dynamic_templates:
                    - ez_string:
                        match: "*_s"
                        mapping:
                            type: keyword
                            normalizer: lowercase_normalizer
                    # ...
```

You can also set a boosting factor for a specific field. Boosting increases the relevance of hits, for example, by making keywords from the title more relevant than those from other parts of the document. Set the boosting factor under the `properties` key:

```
ibexa_elasticsearch:
    # ...
    index_templates:
        default:
            # ...
            mappings:
                properties:
                    content_name_s:
                        boost: 2.0
                # ...
```

You can even copy the contents of existing fields, process them, and then paste them into another field, which can then be queried:

```
ibexa_elasticsearch:
    # ...
    index_templates:
        default:
            # ...
            mappings:
                properties:
                    user_first_name_s:
                        type: keyword
                        normalizer: lowercase_normalizer
                        copy_to: custom_field
                # ...
```
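The target field can also be declared explicitly, so that it gets its own type. The following sketch is an assumption-laden illustration: the second source field, `user_last_name_s`, is hypothetical, and mapping `custom_field` as `text` is only one possible choice.

```yaml
ibexa_elasticsearch:
    # ...
    index_templates:
        default:
            # ...
            mappings:
                properties:
                    user_first_name_s:
                        type: keyword
                        normalizer: lowercase_normalizer
                        copy_to: custom_field
                    # Hypothetical second source field
                    user_last_name_s:
                        type: keyword
                        normalizer: lowercase_normalizer
                        copy_to: custom_field
                    # Target field that receives the copied values
                    custom_field:
                        type: text
                # ...
```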

### Add language-specific analyzers

You can configure Elasticsearch to perform language-specific analysis like stemming. This way searching for "cars" returns hits with content that contains the word "car". On a multilingual site, you can have different analyzers configured for different languages, something which is typically required because stemming rules are language-specific.

#### Make a copy of the default template

To enable a language-specific analyzer, create a new template for each language in `config/packages/ibexa_elasticsearch.yaml` first. This template should be based on the `default` template found in `vendor/ibexa/elasticsearch/src/bundle/Resources/config/default-config.yaml`. The name of the new template should indicate the language it applies to, for example `eng_gb`, `nor_no` or `fre_fr`.

#### Change match pattern for the new template

The default template matches on `*_location_*` and `*_content_*`. These patterns aren't language-specific and you cannot use them if you plan to use different templates for different languages. In your copy of the default template, change the pattern as follows:

```
        patterns:
-            - '*_location_*'
-            - '*_content_*'
+            - "*_eng_gb*"
```

This pattern matches indexes that contain British English (`eng_gb`) content.

For more information about specifying the pattern for your language, see [Define a template](#define-a-template).
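Other languages follow the same rule. As a sketch, assuming that the repository uses the `fre-FR` language code (which becomes `fre_fr` in index names), a template for French content might start like this:

```yaml
ibexa_elasticsearch:
    index_templates:
        fre_fr:
            patterns:
                - '*_fre_fr*'
            # ... settings and mappings copied from the default template
```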

#### Create config for language-specific analyzer

For information about configuring an analyzer for each specific language, see [Elasticsearch documentation](https://www.elastic.co/guide/en/elasticsearch/reference/8.19/analysis-lang-analyzer.html).

An adaptation of the [`english` analyzer](https://www.elastic.co/guide/en/elasticsearch/reference/8.19/analysis-lang-analyzer.html#english-analyzer) in Ibexa DXP configuration looks like this:

```
ibexa_elasticsearch:
    index_templates:
        english:
            patterns:
                - '*_eng_gb*'
            settings:
                analysis:
                    normalizer:
                        lowercase_normalizer:
                            type: custom
                            char_filter: []
                            filter:
                                - lowercase
                    analyzer:
                        english_analyzer:
                            type: custom
                            tokenizer: lowercase
                            filter:
                                - lowercase
                                - english_stop
                                - english_keywords
                                - english_stemmer
                                - english_possessive_stemmer
                        ibexa_spellcheck_analyzer:
                            type: custom
                            tokenizer: lowercase
                            filter:
                                - lowercase
                                - ibexa_spellcheck_shingle_filter
                        ibexa_spellcheck_raw_analyzer:
                            type: custom
                            tokenizer: standard
                            filter:
                                - lowercase
                                - english_possessive_stemmer
                    filter:
                        ibexa_spellcheck_shingle_filter:
                            type: shingle
                            min_shingle_size: 2
                            max_shingle_size: 3
                        english_stop:
                            type: stop
                            stopwords: '_english_'
                        english_keywords:
                            type: keyword_marker
                            keywords: []
                        english_stemmer:
                            type: stemmer
                            language: light_english
                        english_possessive_stemmer:
                            type: stemmer
                            language: possessive_english
                refresh_interval: "-1"
            mappings:
                dynamic_templates:
                    -   ez_int:
                            match: "*_i"
                            mapping:
                                type: integer
                    -   ez_mint:
                            match: "*_mi"
                            mapping:
                                type: integer
                    -   ez_id:
                            match: "*_id"
                            mapping:
                                type: keyword
                    -   ez_mid:
                            match: "*_mid"
                            mapping:
                                type: keyword
                    -   ez_string:
                            match: "*_s"
                            mapping:
                                type: keyword
                                normalizer: lowercase_normalizer
                    -   ez_mstring:
                            match: "*_ms"
                            mapping:
                                type: keyword
                                normalizer: lowercase_normalizer
                    -   ez_long:
                            match: "*_l"
                            mapping:
                                type: long
                    -   ez_mlong:
                            match: "*_ml"
                            mapping:
                                type: long
                    -   ez_text:
                            match: "*_t"
                            mapping:
                                type: text
                                analyzer: english_analyzer
                    -   ez_text_fulltext:
                            match: "*_fulltext"
                            mapping:
                                type: text
                                analyzer: english_analyzer
                    -   ez_boolean:
                            match: "*_b"
                            mapping:
                                type: boolean
                    -   ez_mboolean:
                            match: "*_mb"
                            mapping:
                                type: boolean
                    -   ez_float:
                            match: "*_f"
                            mapping:
                                type: float
                    -   ez_double:
                            match: "*_d"
                            mapping:
                                type: double
                    -   ez_date:
                            match: "*_dt"
                            mapping:
                                type: date
                    -   ez_geolocation:
                            match: "*_gl"
                            mapping:
                                type: geo_point
                    -   ez_spellcheck:
                            match: "*_spellcheck"
                            mapping:
                                type: text
                                analyzer: ibexa_spellcheck_analyzer
                                fields:
                                    raw:
                                        type: text
                                        analyzer: ibexa_spellcheck_raw_analyzer
```
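For another language, it is mostly the analyzer and filter definitions that change. The fragment below is a sketch for French that shows only the parts that differ. It assumes the `_french_` stop word list and the `light_french` stemmer described in the Elasticsearch language analyzers reference, and it omits the elision filter that the built-in `french` analyzer also uses. The template name and the `french_*` names are hypothetical.

```yaml
ibexa_elasticsearch:
    index_templates:
        # Hypothetical template name
        french:
            patterns:
                - '*_fre_fr*'
            settings:
                analysis:
                    analyzer:
                        french_analyzer:
                            type: custom
                            tokenizer: standard
                            filter:
                                - lowercase
                                - french_stop
                                - french_stemmer
                    filter:
                        french_stop:
                            type: stop
                            stopwords: '_french_'
                        french_stemmer:
                            type: stemmer
                            language: light_french
                    # ... normalizer and spellcheck definitions as in the English template
            mappings:
                dynamic_templates:
                    -   ez_text:
                            match: "*_t"
                            mapping:
                                type: text
                                analyzer: french_analyzer
                    # ... remaining dynamic templates as in the English template
```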

Then, you must bind this language template to your Elasticsearch connection.

## Bind templates with connections

After you create an index template (for example, for specific data types or linguistic analysis), you must link it to an Elasticsearch connection by adding the `index_templates` key to the connection definition.

If your configuration file contains several connection definitions, you can reuse the same template for different connections. If you have several index templates, you can apply different combinations of templates to different connections.

```
ibexa_elasticsearch:
    connections:
        <connection_for_english_only_repository>:
            # ...
            index_templates:
                - eng_gb
        <connection_for_multilingual_repository>:
            # ...
            index_templates:
                - eng_gb
                - fre_fr
                - ger_de
```

For more information about how Elasticsearch handles settings and mappings from multiple templates that match the same index, see [Elasticsearch documentation](https://www.elastic.co/guide/en/elasticsearch/reference/8.19/index-templates.html).
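The names listed under a connection's `index_templates` key presumably have to match the template names defined under the top-level `index_templates` key. A minimal sketch, assuming a hypothetical connection named `main` and the `english` template from the previous section:

```yaml
ibexa_elasticsearch:
    connections:
        # "main" is a hypothetical connection name
        main:
            hosts:
                - '127.0.0.1:9200'
            index_templates:
                # Refers to the template defined below
                - english
    index_templates:
        english:
            patterns:
                - '*_eng_gb*'
            # ... settings and mappings as in the English template
    default_connection: main
```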

# Extend Elasticsearch

To learn how you can create document field mappers, custom Search Criteria, custom Sort Clauses and Aggregations, see [Create custom Search Criterion](https://doc.ibexa.co/en/latest/search/extensibility/create_custom_search_criterion/index.md).
